Monday, March 4, 2013

Relationship between team age and team strength

I originally thought that I would need individual team age data to get at the question of how much strength a year of age is worth.  However, a fellow statistician suggested that this could be estimated by looking at how well teams did playing at age versus playing up a year.  That approach requires cross-age match data across multiple age groups.  With help from a fellow poster, I compiled match data for the U12, U13, U14 and U15 boys teams in WA and OR.  This analysis required that I had almost all select teams at each age group so that I could assume that the 'median' team meant the same thing in each age group (i.e. that I wasn't biasing my samples up or down in any age group).

This new database allowed a joint analysis of team strength using all U12-U16 select soccer team matches in WA and OR for the 2012-2013 season (up to March 3rd, 2013).  Cross-play between age groups in tournaments and leagues (when teams play up) allowed all age groups to be analyzed together.  The following graph shows a boxplot of team strength against age.  The y-axis has been scaled to the median team in this analysis, but the meaning of relative differences remains the same as in my region-specific strength ratings:

strength difference   0.0   0.5   1.0   1.5   2.0   2.5   3.0   3.5   4.0
win (%)                34    49    63    76    86    94    98    99   100
tie (%)                31    29    24    17    11     5     2     1     0
lose (%)               35    22    13     7     3     1     0     0     0

A difference of 2 in strength is quite large, and the weaker team would rarely win.  A difference of 1 still means a substantial advantage for the stronger team.  The x-axis in the graph is the first date of each age group's bracket; these are US teams, and the age brackets start on August 1st.
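To read the table for strength differences that fall between its columns, here is a minimal Python sketch.  It assumes the rows give the stronger team's chances (my reading of the table) and it uses straight-line interpolation between columns, which is a convenience and not the model behind the ratings.  The function names are made up for illustration.

```python
# Outcome percentages copied from the table above, indexed by strength difference.
DIFFS = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
WIN = [34, 49, 63, 76, 86, 94, 98, 99, 100]
TIE = [31, 29, 24, 17, 11, 5, 2, 1, 0]
LOSE = [35, 22, 13, 7, 3, 1, 0, 0, 0]


def _interp(x, xs, ys):
    """Linear interpolation between table columns (an assumption, not the rating model)."""
    if x <= xs[0]:
        return float(ys[0])
    if x >= xs[-1]:
        # Differences beyond the last column are clamped to the last column.
        return float(ys[-1])
    for i in range(1, len(xs)):
        if x <= xs[i]:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def outcome_percentages(strength_a, strength_b):
    """Return (win, tie, lose) percentages, assumed to be for the stronger team."""
    diff = abs(strength_a - strength_b)
    return (
        _interp(diff, DIFFS, WIN),
        _interp(diff, DIFFS, TIE),
        _interp(diff, DIFFS, LOSE),
    )
```

For example, outcome_percentages(2.0, 0.0) reproduces the 2.0 column of the table, and a difference of 0.25 lands halfway between the 0.0 and 0.5 columns.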

The graph indicates a striking consistency in the impact of age on team strength.  From U12 to U16, each year adds 2 strength points to the median team strength.  Thus a team would rarely beat itself a year later and would be unlikely to beat itself 6 months later (if it could enter a time machine and play itself at an older age).
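The arithmetic behind the time-machine comparison can be sketched in a few lines of Python.  This assumes the 2-points-per-year effect is spread evenly over the 12 months, and the function names are hypothetical.

```python
POINTS_PER_YEAR = 2.0  # from the graph: each year of age adds about 2 strength points


def strength_gap_from_age(months_older):
    """Strength points gained from being `months_older` months older (assumes a linear effect)."""
    return POINTS_PER_YEAR * months_older / 12.0


def relative_strength_playing_up(relative_strength, years_up=1):
    """Convert strength relative to a team's own age-group median into
    strength relative to the median of the age group `years_up` years older."""
    return relative_strength - POINTS_PER_YEAR * years_up
```

So strength_gap_from_age(6) is 1.0 and strength_gap_from_age(12) is 2.0, which are the differences of 1 and 2 discussed above.  A team sitting 1.5 points above its own age-group median would be about 0.5 below the median if it played up a year.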

This suggests that the capacity to increase team strength by selecting older players is substantial.  However, the spread of team strengths within an age group is about 10 (highest team minus lowest team), so age bias is (obviously!) not the only factor affecting team strength.  This also suggests that computing an "age-correction factor" for team strength should be simple:

age adjusted team strength = 
    unadj. team strength - (6.5-mean(birth months of players))/6

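where birth month is coded as Aug=1, Sep=2, Oct=3, etc.  The mean birth month is 6.5, and dividing by 6 converts months into strength points (2 points per year is 1 point per 6 months), so this scales the team to "expected strength if the team were median age".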
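As a worked version of the formula, here is a small Python sketch.  The month coding, the 6.5 midpoint and the divisor of 6 come from the formula above; the function name and the example rosters are hypothetical.

```python
# Birth-month coding from the post: the bracket starts in August, so Aug=1 ... Jul=12.
MONTH_CODE = {
    "Aug": 1, "Sep": 2, "Oct": 3, "Nov": 4, "Dec": 5, "Jan": 6,
    "Feb": 7, "Mar": 8, "Apr": 9, "May": 10, "Jun": 11, "Jul": 12,
}
MEAN_BIRTH_MONTH = 6.5   # midpoint of the codes 1..12
MONTHS_PER_POINT = 6.0   # 2 strength points per year = 1 point per 6 months


def age_adjusted_strength(unadjusted_strength, birth_months):
    """Apply the age correction: subtract the strength attributable to being
    older than the median-age team (or add it back for a younger team)."""
    if not birth_months:
        raise ValueError("need at least one player birth month")
    codes = [MONTH_CODE[m] for m in birth_months]
    mean_code = sum(codes) / len(codes)
    return unadjusted_strength - (MEAN_BIRTH_MONTH - mean_code) / MONTHS_PER_POINT


# Hypothetical rosters, for illustration only.
older_team = age_adjusted_strength(1.0, ["Aug", "Sep", "Oct", "Nov"])   # mean code 2.5
summer_team = age_adjusted_strength(1.0, ["Jun", "Jul"])                # mean code 11.5
```

The older roster is adjusted downward by 4/6 of a point, while the summer roster is adjusted upward by 5/6 of a point, since it is effectively playing up.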

Why is an age-correction factor important?  Well, if you are trying to understand the strength of a team and how factors other than age affect team strength, then you need to adjust for the team age.  If you want your team to play up in a tournament, then you need an idea of what its strength will be at the higher age so you can choose an appropriate level.  Also, if a coach or club happens to have a team made up of summer boys, they might well want a sense of the "adjusted" strength of the team if the team were not essentially "playing up".  Or a club might want to compare the strengths of its A and B teams; a fair comparison needs to adjust for the age difference between the teams.

What is driving this relationship?  Presumably it is the pattern of physical growth in boys.   The growth curve is pretty linear from U12 (age 10-11) to U16 (age 14-15).  I'll be able to get a better sense of this relationship once I get more U17 and U18 teams and see how the strength levels off.

What next?  An interesting aspect of the age-skew is that, on an age-skewed team, the innate talent is not at its maximum because less talented older players would be chosen over more innately talented, but smaller, younger players.  Nonetheless, the strength of the team is higher with the age-skew because age adds greatly to team strength.  As the team ages (U17 and up), one would expect the age-skew to be reduced, as 12 months doesn't confer quite as much advantage.  A question that interests me is how much this age-skew persists in U20+ teams and whether there are other effects of the age-bias, such as positional bias.  Any July boys on an elite (non-academy) team, or Dec boys on an academy team, would have to be unusually talented (on average).  Are those talents directed to particular positions?  Presumably July (or Dec) goalkeepers would be rare, for example.

Another question that interests me is whether Dec-born professional soccer players have higher valuations than Jan-born professional players.  Why would that be the case?  In a system with a January cut-off, more Jan-born players of lesser innate talent are developed.  The only Dec-born players that make it through the system would have to have (on average) higher innate talent.  If innate talent is correlated with valuation, one would expect Dec-born players to have higher valuations (on average).
