With the football season now over across the major European leagues, analysis and discussion turn to reflection on the who, what and why of the past year. With the transfer window soon to do whatever the opposite of slam shut is, thoughts also turn to how such reflections might inform potential transfer acquisitions. As outlined by Gabriele Marcotti today in the Wall Street Journal, strikers are still the centre of attention when it comes to transfers:
The game’s obsession with centerforwards is not new. After all, it’s the glamour role. Little kids generally dream of being the guy banging in the goals, not the one keeping them out.
On the football analytics front, there has been a lot of discussion surrounding the relative merits of various forward players, with an increasing focus on their goal scoring efficiency (or shot conversion rate) and where players are shooting from. There has been a lot of great work produced but a very simple question has been nagging away at me:
Does being ‘good’ one year suggest that you’ll be ‘good’ next year?
We can all point to examples of forwards shining brightly for a short period during which they plunder a large number of goals, only to then fade away as regression to their (much lower) mean skill level ensues. With this in mind, let’s take a look at some data.
I’ve put together data from WhoScored on players who have scored at least 10 goals during a single season over the past two seasons in the top division in either England, Spain, Germany or Italy. Choosing 10 goals is basically arbitrary but I wanted a reasonable number of goals so that calculated conversion rates didn’t oscillate too wildly and 10 seems like a good target for your budding goalscorer. So for example, Gareth Bale is included as he scored 21 goals in 2012/13 and 9 goals in 2011/12 but Nikica Jelavić isn’t as he didn’t pass 10 league goals in either season. Collecting the data is painful so a line had to be drawn somewhere. I could have based it on shots per game but that is prone to the wild shooting of the likes of Adel Taarabt and you end up with big outliers. If a player was transferred to or from a league within the WhoScored database (so including France), I retained the player for analysis but if they left the ‘Big 5’ then they were booted out.
I ended up with 115 players who had scored at least 10 league goals in one of the past two seasons. Only 43 players managed to score 10 league goals in both 2011/12 and 2012/13, with only 6 players not named Lionel Messi or Cristiano Ronaldo able to score 20 or more in both seasons. Below is how they match up when comparing their shot conversion, where their goals are divided by their total shots, across both seasons. The conversion rates are based on all goals and all shots. Ideally you would take out penalties but that takes time to collate and I doubt it would make much difference to the conclusions.
Comparison between shot conversion rates for players in 2011/12 and 2012/13.
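For anyone wanting to reproduce the selection and the conversion calculation, here is a minimal sketch. The function names and the two sample players are made up for illustration (they are not rows from the WhoScored data), and the 10-goal cut-off is the arbitrary threshold described above.

```python
def conversion_rate(goals, shots):
    """Shot conversion: goals divided by total shots (penalties included)."""
    if shots == 0:
        return None  # no shots, so the rate is undefined
    return goals / shots


def qualifies(goals_by_season, threshold=10):
    """True if the player reached the goal threshold in at least one season."""
    return any(goals >= threshold for goals in goals_by_season)


# Hypothetical records: (2011/12, 2012/13) goals and shots, invented numbers.
sample = {
    "Player A": {"goals": (9, 21), "shots": (120, 165)},
    "Player B": {"goals": (9, 7), "shots": (60, 55)},
}

# Keep only players who hit the threshold in either season.
selected = {name: rec for name, rec in sample.items() if qualifies(rec["goals"])}

for name, rec in selected.items():
    rates = [conversion_rate(g, s) for g, s in zip(rec["goals"], rec["shots"])]
    print(name, [f"{rate:.1%}" for rate in rates])
```

Taking out penalties would just mean subtracting penalty goals and penalty attempts before calling conversion_rate.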
If we look at the whole dataset, we get a very weak relationship between shot conversion in 2012/13 and shot conversion in 2011/12. The R^2 here is 0.11, which corresponds to a correlation coefficient (r) of about 0.33. Taking 1 - r as the amount of regression to the mean, this suggests that shot conversion by an individual player shows 67% regression to the mean from one season to the next. The upshot of this is that shot conversion above or below the mean is around two-thirds due to luck and one-third due to skill. Without filtering the data any further, this would suggest that predicting how a player will convert their chances next season based on the last will be very difficult.
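The step from R^2 to a regression-to-the-mean figure is just 1 minus the correlation coefficient, i.e. 1 - sqrt(R^2). A quick sketch of that arithmetic, assuming the season-to-season correlation is positive (the function name is my own):

```python
from math import sqrt


def regression_to_mean(r_squared):
    """Share of a player's deviation from the mean expected to vanish next season.

    Assumes a positive correlation, so r is the positive root of R^2.
    """
    r = sqrt(r_squared)
    return 1 - r


share = regression_to_mean(0.11)
print(f"{share:.0%}")  # prints 67%
```

The remainder (r itself, about 33% here) is the part loosely attributed to skill.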
A potential issue here is the sample size for the number of shots taken by an individual in a season. Dimitar Berbatov’s conversion rate of 44% in 2011/12 is for only 16 shots; he’s good but not that good. If we filter for the number of shots, we can take out some of the outliers and hopefully retain a representative sample. Up to 50 shots, we’re still seeing a 65% regression to the mean and we’ve reduced our sample to 72 players. It is only when we get up to 70 shots and down to 44 players that we see a close to even split between ‘luck’ and ‘skill’ (54% regression to the mean). The problem here is that we’re in danger of ‘over-fitting’ as we rapidly reduce our sample size. If you are happy with a sample of 18 players, then you need to see around 90 shots per season to be able to attribute 80% of shot conversion to ‘skill’.
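The shot-count filter can be sketched along these lines. This is an illustration rather than the exact procedure used for the numbers above: I am assuming the minimum applies to both seasons, the helper names are my own, and the four players are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # no spread in one season, correlation is undefined
    return cov / sqrt(var_x * var_y)


def regression_after_filter(players, min_shots):
    """Return (players kept, 1 - r) after dropping low-shot players.

    Assumes min_shots >= 1 and that the minimum applies to both seasons.
    """
    kept = [p for p in players if min(p["shots"]) >= min_shots]
    if len(kept) < 3:
        return len(kept), None  # too few players to say anything
    first = [p["goals"][0] / p["shots"][0] for p in kept]
    second = [p["goals"][1] / p["shots"][1] for p in kept]
    r = pearson_r(first, second)
    return len(kept), (None if r is None else 1 - r)


# Hypothetical (season one, season two) goals and shots, invented numbers.
players = [
    {"goals": (10, 12), "shots": (80, 90)},
    {"goals": (15, 11), "shots": (100, 95)},
    {"goals": (7, 15), "shots": (16, 70)},  # small-sample outlier in season one
    {"goals": (20, 22), "shots": (150, 160)},
]

for threshold in (1, 50, 70):
    kept, regression = regression_after_filter(players, threshold)
    print(threshold, kept, regression)
```

Each step up in the threshold removes noisy small-sample seasons but also shrinks the number of players, which is the trade-off described above.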
So where does that leave us? Perhaps unsurprisingly, the results here for players are similar to what James Grayson found at the team level, with a 61% regression to the mean from season to season. Mark Taylor found that around 45 shots was where skill overtook luck for assessing goal scoring, so a little lower than what I found above, although I suspect this is due to Mark’s work being based on a larger sample over three seasons in the Premier League.
The above also points to the ongoing importance of sample size when judging players, although I’d want to do some more work on this before being too definitive. Judgements on around half a season of shots appear rather unwise and are about as good as flipping a coin. Really you want around a season for a fuller judgement and even then you might be a little wary of spending too much cash. For something approaching a guarantee, you want some heavy shooting across two seasons, which allied with a good conversion rate can bring you over 20 league goals in a season. I guess that is why the likes of Van Persie, Falcao, Lewandowski, Cavani and Ibrahimovic go for such hefty transfer fees.