A manager’s in-game performance is difficult to quantify because, once a match starts, a manager has very few inputs into the game. He sets the lineup and the match strategy, but evaluating the effectiveness of that strategy isn’t possible without knowing the strategy and having technical/tactical match data.
One action that is possible to quantify is substitution behavior, and we will write a script to do just that.
Language
This script is written in Python 2.7 using the official API client for Python. The client simplifies the process of making HTTP requests to the Soccermetrics API and unpacking the responses.
The script is about 35 lines long, of which about 10 only display the results.
Concept
This script calls managers.get() and link.get(), making use of the hypertext provided with the manager and match.information responses.
Walkthrough
At the very top of the script we import the SoccermetricsRestClient package that will allow us to communicate with the API.
from soccermetrics.rest import SoccermetricsRestClient
In the main routine we create a SoccermetricsRestClient object. You can instantiate it by passing the authentication tokens through the account and api_key variables, but we recommend storing those tokens in environment variables.
if __name__ == "__main__":
    client = SoccermetricsRestClient()
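As a minimal sketch of the environment-variable approach, the snippet below reads the two tokens from a mapping and fails loudly when one is absent. The variable names SOCCERMETRICS_APP_ID and SOCCERMETRICS_APP_KEY and the helper load_credentials are assumptions made for illustration; check the client documentation for the names it actually reads.

```python
import os

# Assumed variable names; the real client may look for different ones.
ID_VAR = "SOCCERMETRICS_APP_ID"
KEY_VAR = "SOCCERMETRICS_APP_KEY"


def load_credentials(env=None):
    """Return (account, api_key) from a mapping, failing loudly if absent."""
    env = os.environ if env is None else env
    missing = [name for name in (ID_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return env[ID_VAR], env[KEY_VAR]


# Demo with a stand-in mapping so the sketch runs without real tokens.
account, api_key = load_credentials({ID_VAR: "demo-id", KEY_VAR: "demo-key"})
print(account + " " + api_key)
```

The resulting pair could then be passed explicitly through the account and api_key arguments mentioned above, which keeps the tokens out of the script itself.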
We retrieve data for a manager from the API.
    manager = client.managers.get(full_name=u'Roberto Mancini').data[0]
We’re going to use the links from the representation to access all matches involving the manager. First we retrieve all the data from the home_matches and away_matches hyperlinks, and then combine and sort the list by match date.
    hmatches = client.link.get(manager.link.home_matches,sort="match_date").all()
    amatches = client.link.get(manager.link.away_matches,sort="match_date").all()
    matches = sorted(hmatches + amatches,key=lambda k: k.match_date)
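The combine-and-sort step can be tried without API access. The sketch below uses a hypothetical Match record with made-up teams and dates, and it assumes that match_date is an ISO-formatted string, which sorts chronologically when compared as text.

```python
from collections import namedtuple

# Hypothetical stand-in for a match record; teams and dates are made up.
Match = namedtuple("Match", ["match_date", "home_team_name", "away_team_name"])

home = [Match("2020-01-05", "Team A", "Team B"),
        Match("2020-01-19", "Team A", "Team C")]
away = [Match("2020-01-12", "Team D", "Team A")]

# Each list is already in date order, but concatenation interleaves
# nothing, so the combined list has to be sorted again.
matches = sorted(home + away, key=lambda m: m.match_date)

for m in matches:
    print(m.match_date + " " + m.home_team_name + " v " + m.away_team_name)
```

Sorting the combined list is what puts the away match between the two home matches; sorting each request on its own is not enough.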
Now we iterate over the list of matches and create a list of substitution timings for all the matches. This is actually a list of dictionaries that contain data for each match.
    sub_list = []
    for match in matches:
We want to obtain match substitutions for the team coached by the manager, so we need to use the proper attribute for the match record.
        team_key = 'home_team_name' if match.home_manager_name == manager.full_name \
            else 'away_team_name'
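To illustrate how the attribute name is chosen and then resolved with getattr, here is a small sketch on a hypothetical Match record that carries only the three fields used above. The helper managed_team and the placeholder "Other Manager" are assumptions for the example, not part of the API.

```python
from collections import namedtuple

# Hypothetical record with only the fields the selection logic needs.
Match = namedtuple("Match", ["home_team_name", "away_team_name",
                             "home_manager_name"])


def managed_team(match, manager_name):
    """Return the name of the team that manager_name coached in match."""
    # If the manager is not the home manager, assume he coached the away side.
    team_key = ('home_team_name' if match.home_manager_name == manager_name
                else 'away_team_name')
    # getattr looks the attribute up by its string name.
    return getattr(match, team_key)


home_game = Match("Manchester City", "Swansea City", "Roberto Mancini")
away_game = Match("Bolton Wanderers", "Manchester City", "Other Manager")

print(managed_team(home_game, "Roberto Mancini"))
print(managed_team(away_game, "Roberto Mancini"))
```

Because the match list was built from the manager's own home and away links, the fallback to the away side is safe here; on an arbitrary match list the same test would misattribute matches the manager was not involved in.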
We use the hyperlinks in the match representation to retrieve the substitution data, filtered by team name so that only the manager’s team is included. The data are sorted by match time and then stoppage time.
        subs = client.link.get(match.link.events.substitutions,
            team_name=getattr(match,team_key),sort="time_mins,stoppage_mins").data
We now create the dictionary of substitution times, using the keys first, second, and third. It’s very convenient that these keys are in alphabetical order! We initialize the dictionary by setting all values of the dictionary to None, which allows us to account for unused subs.
        sub_dict = dict(first=None,second=None,third=None)
        subkeys = sorted(sub_dict.keys())
We load the dictionary of substitution times by looping over the ordered match substitution data.
        for k, sub in enumerate(subs):
            sub_dict[subkeys[k]] = int(sub.time_mins)
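The loop above indexes subkeys by position, so it would raise an IndexError if a team ever recorded more than three substitutions. A minimal sketch of a more defensive variant follows; the helper substitution_slots is hypothetical and takes a plain list of minutes instead of API records.

```python
SUB_KEYS = ("first", "second", "third")


def substitution_slots(times):
    """Map an ordered list of substitution minutes onto the three slots."""
    slots = dict.fromkeys(SUB_KEYS)  # every slot starts as None
    # zip stops at the shorter sequence, so a fourth entry is ignored
    # and unused slots keep their None value.
    for key, minute in zip(SUB_KEYS, times):
        slots[key] = int(minute)
    return slots


two_subs = substitution_slots([64, 75])
print(two_subs["first"], two_subs["second"], two_subs["third"])
```

Listing the keys in an explicit tuple also removes the reliance on their alphabetical order.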
As a diagnostic, we display the matchday, the teams, and the match substitutions made by the manager.
        print "Matchday %2s: %s v %s: " % (match.matchday, match.home_team_name,
            match.away_team_name),
        print sub_dict['first'],sub_dict['second'],sub_dict['third']
Finally we append the substitution dictionary to the list.
        sub_list.append(sub_dict)
We now have a list of dictionaries, and we would like to calculate the average time of the first, second, and third substitutions. To do this we create a temporary function that converts the list of dictionary key/value pairs to a list of values given a specific key. If a value is None, we ignore it.
    dict2list = lambda vec,k: [x[k] for x in vec if x[k]]
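We calculate the average time of each substitution and display the results. Because both the sum and the count are integers, Python 2 performs integer division, so each average is truncated to a whole minute.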
    avg1st = sum(dict2list(sub_list,'first'))/len(dict2list(sub_list,'first'))
    avg2nd = sum(dict2list(sub_list,'second'))/len(dict2list(sub_list,'second'))
    avg3rd = sum(dict2list(sub_list,'third'))/len(dict2list(sub_list,'third'))
    print avg1st, avg2nd, avg3rd
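The averaging step can also be sketched as a named helper. This version differs from the script in two deliberate ways: it tests for None explicitly, so a value of 0 would not be dropped, and it uses float division, so the result is not truncated. The name average_minute is hypothetical, and the sample rows are matchdays 20 to 22 from this script's output.

```python
def average_minute(sub_list, key):
    """Average the values stored under key, skipping unused substitutions."""
    values = [d[key] for d in sub_list if d[key] is not None]
    if not values:
        return None  # nobody used this substitution slot
    # float() forces true division on Python 2 as well as Python 3.
    return float(sum(values)) / len(values)


sample = [
    dict(first=71, second=76, third=None),
    dict(first=74, second=81, third=90),
    dict(first=65, second=None, third=None),
]

for key in ("first", "second", "third"):
    print(key + ": " + str(average_minute(sample, key)))
```

For these three matches the averages are 70.0, 78.5, and 90.0 minutes; each one is computed only over the matches in which that substitution was actually made.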
Running the script produces the following output:
Matchday 1: Manchester City v Swansea City: 59 74 81
Matchday 2: Bolton Wanderers v Manchester City: 68 80 88
Matchday 3: Tottenham Hotspur v Manchester City: 64 75 None
Matchday 4: Manchester City v Wigan Athletic: 60 71 79
Matchday 5: Fulham v Manchester City: 69 81 82
Matchday 6: Manchester City v Everton: 60 78 83
Matchday 7: Blackburn Rovers v Manchester City: 28 79 89
Matchday 8: Manchester City v Aston Villa: 66 66 77
Matchday 9: Manchester United v Manchester City: 70 76 89
Matchday 10: Manchester City v Wolverhampton Wanderers: 63 71 76
Matchday 11: Queens Park Rangers v Manchester City: 68 75 88
Matchday 12: Manchester City v Newcastle United: 69 75 84
Matchday 13: Liverpool v Manchester City: 65 82 90
Matchday 14: Manchester City v Norwich City: 68 72 82
Matchday 15: Chelsea v Manchester City: 64 75 85
Matchday 16: Manchester City v Arsenal: 72 85 85
Matchday 17: Manchester City v Stoke City: 59 68 83
Matchday 18: West Bromwich Albion v Manchester City: 59 74 82
Matchday 19: Sunderland v Manchester City: 46 55 67
Matchday 20: Manchester City v Liverpool: 71 76 None
Matchday 21: Wigan Athletic v Manchester City: 74 81 90
Matchday 22: Manchester City v Tottenham Hotspur: 65 None None
Matchday 23: Everton v Manchester City: 62 68 86
Matchday 24: Manchester City v Fulham: 55 80 90
Matchday 25: Aston Villa v Manchester City: 84 89 90
Matchday 26: Manchester City v Blackburn Rovers: 71 79 86
Matchday 27: Manchester City v Bolton Wanderers: 19 61 84
Matchday 28: Swansea City v Manchester City: 37 84 87
Matchday 29: Manchester City v Chelsea: 46 66 76
Matchday 30: Stoke City v Manchester City: 62 74 84
Matchday 31: Manchester City v Sunderland: 46 58 81
Matchday 32: Arsenal v Manchester City: 17 79 83
Matchday 33: Manchester City v West Bromwich Albion: 63 74 81
Matchday 34: Norwich City v Manchester City: 63 76 81
Matchday 35: Wolverhampton Wanderers v Manchester City: 59 75 86
Matchday 36: Manchester City v Manchester United: 68 82 90
Matchday 37: Newcastle United v Manchester City: 61 70 86
Matchday 38: Manchester City v Queens Park Rangers: 44 69 75
59 74 83
Full Script
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
from soccermetrics.rest import SoccermetricsRestClient

if __name__ == "__main__":
    client = SoccermetricsRestClient()
    manager = client.managers.get(full_name=u'Roberto Mancini').data[0]
    hmatches = client.link.get(manager.link.home_matches,sort="match_date").all()
    amatches = client.link.get(manager.link.away_matches,sort="match_date").all()
    matches = sorted(hmatches + amatches,key=lambda k: k.match_date)
    sub_list = []
    for match in matches:
        team_key = 'home_team_name' if match.home_manager_name == manager.full_name \
            else 'away_team_name'
        subs = client.link.get(match.link.events.substitutions,
            team_name=getattr(match,team_key),sort="time_mins,stoppage_mins").data
        sub_dict = dict(first=None,second=None,third=None)
        subkeys = sorted(sub_dict.keys())
        for k, sub in enumerate(subs):
            sub_dict[subkeys[k]] = int(sub.time_mins)
        print "Matchday %2s: %s v %s: " % (match.matchday, match.home_team_name,
            match.away_team_name),
        print sub_dict['first'],sub_dict['second'],sub_dict['third']
        sub_list.append(sub_dict)
    dict2list = lambda vec,k: [x[k] for x in vec if x[k]]
    avg1st = sum(dict2list(sub_list,'first'))/len(dict2list(sub_list,'first'))
    avg2nd = sum(dict2list(sub_list,'second'))/len(dict2list(sub_list,'second'))
    avg3rd = sum(dict2list(sub_list,'third'))/len(dict2list(sub_list,'third'))
    print avg1st, avg2nd, avg3rd