In my previous post, I looked at the relationship between controlling the pitch (territory) and the ball (possession). When looking at the final plot in that post, you might infer that ‘good’ teams are able to control both territory and possession, while ‘bad’ teams are dominated on both counts. There are also teams that dominate only one metric, which likely relates to their specific tactical make-up.
When I calculated the territory metric, I didn’t account for the volume of passes in each area of the pitch as I just wanted to see how things stacked up in a relative sense. Territory on its own has a pretty woeful relationship with things we care about like points (r²=0.27 for the 2013/14 EPL) and goal difference (r²=0.23 for the 2013/14 EPL).
However, maybe we can do better if we combine territory and possession into one metric.
To start with, I’ve plotted some heat maps (sorry) showing pass completion percentage based on the end point of the pass. The completion percentage is calculated by adding up all of the passes to a particular area on the pitch and comparing that to the number of passes that are successfully received. I’ve done this for the 2013/14 season for the English Premier League, La Liga and the Bundesliga.
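As a rough illustration of that calculation, the sketch below bins passes by their end point on a coarse grid and divides completed passes by attempted passes in each cell. The 0-100 coordinate system, the 10-unit cell size, the `completion_by_zone` name and the toy passes are all assumptions made for illustration; this is not the real Opta data or the code behind the plots.

```python
from collections import defaultdict

def completion_by_zone(passes, cell=10.0):
    """Pass completion percentage per grid cell, keyed by (column, row).

    `passes` is an iterable of (end_x, end_y, completed) tuples, with
    coordinates assumed to run from 0 to 100 along each axis.
    """
    n_cells = int(100 / cell)
    attempted = defaultdict(int)
    completed = defaultdict(int)
    for end_x, end_y, ok in passes:
        # Clamp so that a pass ending exactly on 100 falls in the last cell.
        col = min(int(end_x / cell), n_cells - 1)
        row = min(int(end_y / cell), n_cells - 1)
        attempted[(col, row)] += 1
        if ok:
            completed[(col, row)] += 1
    return {zone: 100.0 * completed[zone] / attempted[zone] for zone in attempted}

# Hypothetical passes: two into a team's own half, three towards the opposition box.
toy_passes = [
    (25.0, 50.0, True),
    (28.0, 55.0, True),
    (92.0, 48.0, True),
    (95.0, 45.0, False),
    (97.0, 41.0, False),
]
rates = completion_by_zone(toy_passes)
```

Here `rates` maps each occupied grid cell to its completion percentage, which is the quantity a heat map like the ones in this post would colour by.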
As you would expect, passes directed to areas closer to the goal are completed at lower rates, while passes within a team’s own half are completed routinely.
Heat map of pass completion percentage based on the target of all passes in the 2013/14 English Premier League, La Liga and Bundesliga. Data via Opta.
What is interesting in these plots is the contrast between England and Germany; in the attacking half of the pitch, pass completion is 5-10% lower in the Bundesliga than in the EPL. La Liga sits in between for the most part but is similar to the Bundesliga within the penalty area. My hunch is that this is a result of the contrasting styles in these leagues:
- Defences often sit deeper in the EPL, particularly when compared to the Bundesliga, which results in their opponents completing passes more easily as they knock the ball around in front of the defence.
- German and Spanish teams tend to press more than their English counterparts, which will make passing more difficult. In Germany, counter-pressing is particularly rife, which will make passing into the attacking midfield zone more challenging.
From the above information, I can construct a model* to judge the difficulty of a pass into each area of the pitch. Given the differences between the leagues, I do this for each league separately.
I can then use this pass difficulty rating along with the frequency of passes into that location to put a value on how ‘dangerous’ a pass is. For example, a completed pass received on the penalty spot in your opponent’s penalty area would be rated more highly than one received by your own goalkeeper in his six-yard box.
Below is the resulting weighting system for each league. Passes that are received in front of the goal within the six-yard box would have a rating close to one, while passes within your own half are given very little weighting as they are relatively easy to complete and are frequent.
There are slight differences between each league, with the largest differences residing in the central zone within the penalty area.
Heat map of pass weighting model for the 2013/14 English Premier League, La Liga and Bundesliga. Data via Opta.
Using this pass weighting scheme, I can assign a score to each pass that a team completes, which ‘rewards’ them for completing more dangerous passes themselves and preventing their opponents from moving the ball into more dangerous areas. For example, a team that maintains possession in and around the opposition penalty area will increase their score. Similarly, if they also prevent their opponent from moving the ball into dangerous areas near their own penalty area, this will also be rewarded.
Below is how this Territorial-Possession Dominance (TPD) metric relates to goal difference. It is calculated by comparing the for and against figures as a ratio and I’ve expressed it as a percentage.
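To make the ‘for and against’ comparison concrete, here is a minimal sketch. The post only says that TPD is a ratio expressed as a percentage; the form for / (for + against) used here is an assumption, chosen by analogy with how Total Shot Ratio is defined. The zone names, weights and pass counts are made up for illustration.

```python
def possession_score(completed_pass_zones, zone_weight):
    """Sum the weights of the zones in which completed passes were received."""
    return sum(zone_weight[zone] for zone in completed_pass_zones)

def tpd(score_for, score_against):
    """Share of the combined weighted score that belongs to the team, in percent.

    Assumption: for / (for + against), by analogy with Total Shot Ratio.
    """
    return 100.0 * score_for / (score_for + score_against)

# Hypothetical weights: easy, frequent passes count for little.
zone_weight = {"own half": 0.05, "final third": 0.3, "penalty area": 0.8}

# Hypothetical completed passes for a team and for its opponents.
team_passes = ["own half"] * 10 + ["final third"] * 5 + ["penalty area"] * 2
opponent_passes = ["own half"] * 12 + ["final third"] * 3 + ["penalty area"] * 1

team_score = possession_score(team_passes, zone_weight)
opponent_score = possession_score(opponent_passes, zone_weight)
team_tpd = tpd(team_score, opponent_score)
```

Under this definition a value above 50% means the team completed more ‘dangerous’ passes than it allowed, and a value below 50% means the opposite.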
Broadly speaking, teams with a higher TPD have a better goal difference (overall r²=0.59) but this varies across the leagues. Unsurprisingly, Barcelona and Bayern Munich are the stand-out teams on this metric as they pin teams in and also prevent them from possessing the ball close to their own goal. Manchester City (the blue dot next to Real Madrid) had the highest TPD in the Premier League.
In Germany, the relationship is much stronger (r²=0.87), which is actually better than both Total Shot Ratio (TSR, r²=0.74) and Michael Caley’s expected goals figures (xGR, r²=0.80). A major caveat here though is that this is just one season in a league with only 18 teams and Bayern Munich’s domination certainly helps to strengthen the relationship.
The relationship is much weaker in Spain (r²=0.35) and is worse than both TSR (r²=0.54) and xGR (r²=0.77). A lot of this is driven by the almost non-existent explanatory power of TPD when compared with goals conceded (r²=0.06). La Liga warrants further investigation.
England sits in between (r²=0.69), which is on a par with TSR (r²=0.72). I don’t have xGR numbers for last season but I believe xGR is usually a few points higher than TSR in the Premier League.
Relationship between goal difference per game and territorial-possession dominance for the 2013/14 English Premier League, La Liga and Bundesliga. Data via Opta.
The relationship between TPD and points (overall r²=0.56) is shown below and is broadly similar to goal difference. The main difference is that the strength of the relationship in Germany is weakened.
Relationship between points per game and territorial-possession dominance for the 2013/14 English Premier League, La Liga and Bundesliga. Data via Opta.
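For readers who want to reproduce this kind of figure, the r² of a straight-line fit between two variables is the square of their Pearson correlation. The sketch below computes it from scratch; the `r_squared` name and the four team values are hypothetical and are not taken from the data behind the plots.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Hypothetical values for four teams: TPD (%) and goal difference per game.
toy_tpd = [62.0, 55.0, 48.0, 41.0]
toy_goal_diff = [1.1, 0.3, -0.2, -0.6]
toy_r2 = r_squared(toy_tpd, toy_goal_diff)
```

With only 18 or 20 teams per league, a single outlying club can move this number a lot, which is the caveat raised above for Bayern Munich.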
Over the summer, I’ll return to these correlations in more detail when I have more data and the relationships are more robust. For now, the metric appears to be useful and I plan to improve it further. Also, I’ll be investigating what it can tell us about a team’s style when combined with other metrics.
----------------------------------------
*For those who are interested in the method, I calculated the relative distance of each pass from the centre of the opposition goal using the distance along the x-axis (the length of the pitch) and the angle relative to a centre line along the length of the pitch.
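One way to read that description is sketched below: convert the end point of a pass into a distance from the opposition goal-line along the length of the pitch, plus an angle away from the centre line. The 0-100 coordinate system, the 105 m by 68 m pitch and the `pass_features` name are assumptions made for illustration, and the exact transformation used for the model may differ.

```python
import math

PITCH_LENGTH_M = 105.0  # assumed pitch length
PITCH_WIDTH_M = 68.0    # assumed pitch width

def pass_features(end_x, end_y):
    """Return (distance_m, angle_deg) for a pass end point.

    Coordinates are assumed to run 0-100 on both axes, with the
    opposition goal centred at (100, 50).
    """
    # Distance to the opposition goal-line along the length of the pitch.
    dx = (100.0 - end_x) / 100.0 * PITCH_LENGTH_M
    # Sideways offset from the centre line running between the two goals.
    dy = abs(end_y - 50.0) / 100.0 * PITCH_WIDTH_M
    # Angle between the centre line and the line from the goal to the end point.
    angle = math.degrees(math.atan2(dy, dx))
    return dx, angle

centre_circle = pass_features(50.0, 50.0)   # on the centre line: angle of zero
corner_flag = pass_features(100.0, 100.0)   # on the goal-line: angle of 90 degrees
```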
I then used logistic regression to calculate the probability of a pass being completed; passes are deemed either successful or unsuccessful, so logistic regression is ideal and avoids putting the passes into location buckets on the pitch.
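A minimal sketch of that step follows, fitting a logistic regression with plain gradient descent on a handful of made-up passes. The post does not say which software or settings were used, so the helper names, the feature scaling, the learning rate and the toy data here are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, row):
    # weights[0] is the intercept; the rest pair up with the features.
    return sigmoid(weights[0] + sum(w * x for w, x in zip(weights[1:], row)))

def fit_logistic(rows, labels, lr=0.5, epochs=3000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            err = predict(weights, row) - y
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Toy data (not real): distance from the goal-line in metres and whether
# the pass was completed, each repeated at two angles from the centre line.
distances_m = [5, 10, 15, 20, 30, 40, 50, 60]
completed = [0, 0, 1, 0, 1, 1, 1, 1]
rows, labels = [], []
for distance, ok in zip(distances_m, completed):
    for angle_deg in (10, 40):
        # Scale both features to roughly 0-1 so one learning rate suits both.
        rows.append([distance / 100.0, angle_deg / 90.0])
        labels.append(ok)

weights = fit_logistic(rows, labels)
p_near_goal = predict(weights, [5 / 100.0, 10 / 90.0])
p_deep = predict(weights, [60 / 100.0, 10 / 90.0])
```

Because each pass contributes its own distance and angle directly, the fitted model returns a completion probability for any point on the pitch without first grouping passes into zones.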
I then weighted the resulting probability according to the frequency of passes received relative to the distance from the opposition goal-line. This gave me a ‘score’ for each pass, which I used to calculate the territory-weighted possession for each team.
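The exact weighting formula isn’t spelled out here, so the sketch below is only one plausible reading and should not be taken as the actual method: multiply how hard a pass is (one minus its completion probability) by how rare passes at that distance are (one minus their share of all passes received), then rescale so the most valuable band scores one. The function name, the distance bands and all of the numbers are hypothetical.

```python
def pass_weights(completion_prob, share_of_passes):
    """Toy weighting: harder and rarer passes score closer to one.

    Assumed form (not confirmed by the post):
        raw = (1 - completion probability) * (1 - share of passes received)
    rescaled so that the largest raw value becomes 1.
    """
    raw = {
        band: (1.0 - completion_prob[band]) * (1.0 - share_of_passes[band])
        for band in completion_prob
    }
    top = max(raw.values())
    return {band: value / top for band, value in raw.items()}

# Hypothetical distance bands from the opposition goal-line.
completion_prob = {"0-6 m": 0.30, "6-35 m": 0.70, "35 m+": 0.90}
share_of_passes = {"0-6 m": 0.05, "6-35 m": 0.35, "35 m+": 0.60}

weights = pass_weights(completion_prob, share_of_passes)
```

In this toy version the band nearest the goal gets a weight of one and the deepest band gets a weight close to zero, which matches the qualitative pattern described for the weighting heat maps.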