Referee Performance

Match referees are heavily scrutinized in modern football, but analyzing their performance is difficult, either because data are lacking or because match refereeing is a sensitive subject. In this example we summarize a referee's actions over all the matches they officiated, and break down the cards shown by the fouls that triggered them.

Language

This script is written in Python 2.7 using the official API client for Python. The client simplifies the process of making HTTP requests to the Soccermetrics API and unpacking the responses.

The script is about 35 lines long, of which about 10 are used to display the results.

Concept

This script calls client.referees.get() and client.link.get(), making use of the hyperlinks provided in the referee and match representations (matches, events.penalties, and events.offenses).

Walkthrough

At the very top of the script we import the SoccermetricsRestClient class, which allows us to communicate with the API.

from soccermetrics.rest import SoccermetricsRestClient

In the main routine we create a SoccermetricsRestClient object. You can instantiate it by passing the authentication tokens through the account and api_key variables, but we recommend storing those tokens in environment variables.

if __name__ == "__main__":
    client = SoccermetricsRestClient()
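
As a rough sketch of the environment-variable approach, the snippet below reads two tokens from the environment and builds the account and api_key keyword arguments. The variable names SOCCERMETRICS_APP_ID and SOCCERMETRICS_APP_KEY and the helper load_credentials are assumptions made for illustration; check the client documentation for the names it actually reads.

```python
import os


def load_credentials(environ=None):
    """Return keyword arguments for the client from environment variables.

    The variable names used here are assumptions for illustration only.
    """
    if environ is None:
        environ = os.environ
    try:
        return {
            "account": environ["SOCCERMETRICS_APP_ID"],
            "api_key": environ["SOCCERMETRICS_APP_KEY"],
        }
    except KeyError as err:
        raise RuntimeError("Missing environment variable: %s" % err)


# Example with a stand-in environment (no real tokens involved).
fake_env = {"SOCCERMETRICS_APP_ID": "my-id", "SOCCERMETRICS_APP_KEY": "my-key"}
credentials = load_credentials(fake_env)
print(sorted(credentials.keys()))
```

The resulting dictionary could then be passed on as SoccermetricsRestClient(**credentials) when you prefer to supply the tokens explicitly.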

We retrieve data for a match referee from the API.

referee = client.referees.get(full_name="Howard Webb").data[0]

We’re going to use the links from the representation to access all matches that the referee officiated, sorted by match date. Response data from the API are split into pages, but we can get the full dataset at once with the all() method.

matches = client.link.get(referee.link.matches,sort="match_date").all()
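
The client handles paging for us, but the idea behind all() can be illustrated with a toy stand-in. The functions fetch_page and fetch_all and the page layout below are hypothetical; they are not the client's internals, only a sketch of following "next page" pointers until the data run out.

```python
def fetch_page(records, page, per_page=2):
    """Hypothetical stand-in for one paged API response."""
    start = (page - 1) * per_page
    chunk = records[start:start + per_page]
    has_next = start + per_page < len(records)
    return {"data": chunk, "next": page + 1 if has_next else None}


def fetch_all(records, per_page=2):
    """Follow the 'next' pointer until every page has been collected."""
    results = []
    page = 1
    while page is not None:
        response = fetch_page(records, page, per_page)
        results.extend(response["data"])
        page = response["next"]
    return results


matchdays = [2, 3, 4, 5, 6]
all_matchdays = fetch_all(matchdays)
print(all_matchdays)
```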

If we want to record the amount of time added on by the referee, we first extract the lengths of both halves of the football match and subtract the regulation length of a half (45 minutes) from each, so that added time comes out as a positive number of minutes. We create a list of dictionaries to store the data.

timeon = [dict(first=match.firsthalf_length-45,second=match.secondhalf_length-45)
          for match in matches]
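
A small worked example may make the arithmetic concrete. Here added time is computed as half length minus 45, so added minutes are positive. The Match namedtuple is only a stand-in for the match representations returned by the API, and the three sample matches use half lengths taken from the results table further down (48/46, 45/48, 45/45).

```python
from collections import namedtuple

# Stand-in for the match representations returned by the API.
Match = namedtuple("Match", ["firsthalf_length", "secondhalf_length"])

REGULATION_HALF = 45  # minutes

sample_matches = [Match(48, 46), Match(45, 48), Match(45, 45)]

# Added time per half, in minutes, for every match.
added_time = [dict(first=m.firsthalf_length - REGULATION_HALF,
                   second=m.secondhalf_length - REGULATION_HALF)
              for m in sample_matches]

# Average added time per half across the sample.
mean_first = sum(t["first"] for t in added_time) / float(len(added_time))
mean_second = sum(t["second"] for t in added_time) / float(len(added_time))

print(added_time[0])
print("%.2f %.2f" % (mean_first, mean_second))
```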

Now we iterate over the list of matches.

Just before the loop we create three lists – penalties, yellows, and reds – to store the event data that we will extract.

For each match representation we access penalty and disciplinary data through the events.penalties and events.offenses hyperlinks. We use client.link.get() to access the data (using card_type to filter the different types of cards).

(A quick aside: We could use a method like client.events.offenses.get(match=?) to achieve the same thing, but client.link.get() is more convenient.)

Finally we append those response data to their respective lists.

penalties = []
yellows = []
reds = []
for match in matches:
    match_pens = client.link.get(match.link.events.penalties).all()
    match_yellows = client.link.get(match.link.events.offenses,card_type="Yellow").all()
    match_2ndyellows = client.link.get(match.link.events.offenses,card_type="Yellow/Red").all()
    match_reds = client.link.get(match.link.events.offenses,card_type="Red").all()

    penalties.extend(match_pens)
    yellows.extend(match_yellows)
    reds.extend(match_2ndyellows + match_reds)

    print """Matchday %s: %s v %s: Penalties %d Yellow %d Yellow/Red %d Red %d  1st Half %d  2nd Half %d""" % (match.matchday,
                match.home_team_name, match.away_team_name, len(match_pens),
                len(match_yellows), len(match_2ndyellows), len(match_reds),
                match.firsthalf_length, match.secondhalf_length)

We now have lists of dictionaries (yellows and reds), and we would like to calculate the number of cards shown for specific fouls. To do this we create a small helper function that converts a list of dictionaries to a list of the values stored under a specific key.

dict2list = lambda vec,k: [x[k] for x in vec]

Then we create a unique list of fouls called by the referee.

foul_list = set(dict2list(yellows,'foul_type')+dict2list(reds,'foul_type'))

Finally we print the list of fouls alongside the number of yellow and red cards given for each.

print "Foul Type\tYellows\tReds"
for foul in foul_list:
    print "%30s\t%2d\t%2d" % (foul,
        sum([1 for x in yellows if x['foul_type'] == foul]),
        sum([1 for x in reds if x['foul_type'] == foul]))
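
As an alternative sketch, the same tally can be built with collections.Counter from the standard library, which counts each foul type in a single pass instead of rescanning the lists for every foul. The yellows and reds lists below are made-up stand-in records, not API output; only the foul_type key is taken from the script above.

```python
from collections import Counter

# Stand-in records; the real ones come from the events.offenses responses.
yellows = [{"foul_type": "Reckless challenge"},
           {"foul_type": "Dissent"},
           {"foul_type": "Reckless challenge"}]
reds = [{"foul_type": "Reckless challenge"}]

yellow_counts = Counter(card["foul_type"] for card in yellows)
red_counts = Counter(card["foul_type"] for card in reds)

print("%-30s\t%s\t%s" % ("Foul Type", "Yellows", "Reds"))
for foul in sorted(set(yellow_counts) | set(red_counts)):
    # A Counter returns 0 for keys it has never seen.
    print("%-30s\t%2d\t%2d" % (foul, yellow_counts[foul], red_counts[foul]))
```

Sorting the foul types also makes the output order reproducible, which a plain set does not guarantee.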

Results

Performance by Howard Webb in 2011-12 PL Matches
Matchday	Matchup	Penalties	Yellows	2nd Yellows	Reds	1st Half	2nd Half
2	Sunderland v Newcastle United	0	6	0	1	45	45
3	Manchester United v Arsenal	2	5	1	0	48	46
4	Fulham v Blackburn Rovers	0	4	0	0	45	48
5	Bolton Wanderers v Norwich City	1	4	0	1	46	48
6	Manchester City v Everton	0	6	0	0	45	45
8	Arsenal v Sunderland	0	6	0	0	45	45
10	Tottenham Hotspur v Queens Park Rangers	0	0	0	0	45	45
11	Bolton Wanderers v Stoke City	0	1	0	0	45	47
14	Manchester City v Norwich City	0	1	0	0	45	46
15	Arsenal v Everton	0	4	0	0	45	45
16	Queens Park Rangers v Manchester United	0	3	0	0	45	45
17	Tottenham Hotspur v Chelsea	0	4	0	0	48	45
18	Sunderland v Everton	1	1	0	0	45	45
19	Norwich City v Fulham	0	0	0	0	45	49
20	Newcastle United v Manchester United	0	3	0	0	45	45
21	Liverpool v Stoke City	0	3	0	0	45	48
22	Manchester City v Tottenham Hotspur	1	3	0	0	45	50
24	Chelsea v Manchester United	2	3	0	0	45	45
25	Bolton Wanderers v Wigan Athletic	0	5	0	0	46	49
26	Stoke City v Swansea City	0	1	0	0	45	45
27	Blackburn Rovers v Aston Villa	0	1	0	0	45	46
28	Arsenal v Newcastle United	0	5	0	0	45	52
29	Queens Park Rangers v Liverpool	0	0	0	0	45	46
30	Stoke City v Manchester City	0	3	0	0	45	50
31	Blackburn Rovers v Manchester United	0	2	0	0	45	45
32	Swansea City v Newcastle United	0	2	0	0	45	45
36	Chelsea v Queens Park Rangers	0	1	0	0	45	45
37	Newcastle United v Manchester City	0	7	0	0	45	45
38	Sunderland v Manchester United	0	6	0	0	45	45

Fouls/Cards by Howard Webb in 2011-12 PL Matches
Foul Type	Yellows	Reds
Professional foul	1	0
Persistent infringement	0	1
Delaying restart	1	0
Reckless challenge	73	1
Dissent	5	0
Off-the-ball infraction	5	1
Excessive celebration	1	0
Holding	1	0
Unknown	3	0

Full Script

#!/usr/bin/env python

from soccermetrics.rest import SoccermetricsRestClient

if __name__ == "__main__":

    client = SoccermetricsRestClient()

    referee = client.referees.get(full_name="Howard Webb").data[0]

    matches = client.link.get(referee.link.matches,sort="match_date").all()

    # Added time per half: actual length minus the 45-minute regulation half.
    timeon = [dict(first=match.firsthalf_length-45,second=match.secondhalf_length-45)
              for match in matches]

    penalties = []
    yellows = []
    reds = []
    for match in matches:
        match_pens = client.link.get(match.link.events.penalties).all()
        match_yellows = client.link.get(match.link.events.offenses,card_type="Yellow").all()
        match_2ndyellows = client.link.get(match.link.events.offenses,card_type="Yellow/Red").all()
        match_reds = client.link.get(match.link.events.offenses,card_type="Red").all()

        penalties.extend(match_pens)
        yellows.extend(match_yellows)
        reds.extend(match_2ndyellows + match_reds)

        print """Matchday %s: %s v %s: Penalties %d Yellow %d Yellow/Red %d Red %d  1st Half %d  2nd Half %d""" % (match.matchday,
                    match.home_team_name, match.away_team_name, len(match_pens),
                    len(match_yellows), len(match_2ndyellows), len(match_reds),
                    match.firsthalf_length, match.secondhalf_length)

    dict2list = lambda vec,k: [x[k] for x in vec]

    foul_list = set(dict2list(yellows,'foul_type')+dict2list(reds,'foul_type'))

    print "Foul Type\tYellows\tReds"
    for foul in foul_list:
        print "%30s\t%2d\t%2d" % (foul,
            sum([1 for x in yellows if x['foul_type'] == foul]),
            sum([1 for x in reds if x['foul_type'] == foul]))