Falling for Lazy Analytics: 65 Days Until Kansas Football

We live in a world of advanced analytics, where everyone is looking for that "magic stat" that will give them an edge in evaluating teams for the upcoming season. While Ken Pomeroy wasn't the first, his college basketball rating system has become so mainstream that it's nearly impossible to talk about the sport without some mention of "KenPom".

College football has plenty of these as well, from the traditional Sagarin and Massey Ratings to the newer FPI and S&P+. And each of these is trying to come up with a definitive ranking for the relative strengths of teams.

Ask anyone involved with creating a rating system, and they'll usually be willing to discuss the holes or shortfalls in it. Whether it's a heavy reliance on a particular box score number or a trait they think is important but can't measure accurately, every system has something its creators would like to improve.

This is why basing an evaluation on a single metric is negligent at best and malicious at worst. Which brings us to the latest example of statistical shenanigans:

Big 12 Power Rankings👀 pic.twitter.com/UM2hAUfNAD
— PFF College (@PFF_College) June 25, 2023

There are multiple problems with basing Power Rankings off of a system like Elo. For those who aren't familiar, the Elo system was first developed for chess. It assumes everyone starts at some base number, and the losing player transfers some of their rating points to the winning player after the game. The number of points depends on the difference between the two ratings: beating a much higher-rated opponent earns more than beating a lower-rated one.
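For readers who want to see the mechanics, here is a minimal sketch of a textbook Elo update in Python. The 400-point scale and the K-factor of 32 are conventional chess-style defaults used here as assumptions, and the function names are made up for illustration. Nothing here is taken from the formula behind the graphic above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for A against B on the standard logistic Elo curve."""
    # The 400-point scale is the conventional choice, assumed here.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one decisive game.

    k is the maximum number of points that can change hands (assumed value).
    """
    # The less expected the win, the more points transfer.
    transfer = k * (1.0 - expected_score(winner, loser))
    return winner + transfer, loser - transfer


# Two evenly rated players: the winner takes half of k.
print(elo_update(1500, 1500))  # (1516.0, 1484.0)

# A favorite winning gains less than an underdog pulling the upset.
print(elo_update(1700, 1500))
print(elo_update(1500, 1700))
```

Notice that the only inputs are the two ratings and who won. Everything the system "knows" about a team is baked into that one number.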

The system is best suited for individual competitions, where the skill levels of the competitors change gradually over time. It has been adapted to team sports before, typically by adjusting the formula for calculating the points that transfer to account for things like the location of the game or the margin of victory.
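To make those adjustments concrete, here is one hypothetical way they could look in Python. The 65-point home-field bonus, the K-factor of 20, and the logarithmic margin-of-victory multiplier are illustrative assumptions rather than the settings of any published rating, and `team_elo_update` is a made-up name.

```python
import math


def team_elo_update(home: float, away: float,
                    home_points: int, away_points: int,
                    k: float = 20.0, home_edge: float = 65.0) -> tuple[float, float]:
    """Return new (home, away) ratings after one game.

    home_edge and the log-based margin multiplier are assumed values,
    chosen only to show where such tweaks plug into the formula.
    """
    if home_points == away_points:
        raise ValueError("this sketch only handles games with a winner")

    # Location: treat the home team as if it were rated home_edge points higher.
    expected_home = 1.0 / (1.0 + 10 ** ((away - (home + home_edge)) / 400.0))
    actual_home = 1.0 if home_points > away_points else 0.0

    # Margin of victory: blowouts move ratings more, with diminishing returns.
    margin_multiplier = math.log(abs(home_points - away_points) + 1)

    transfer = k * margin_multiplier * (actual_home - expected_home)
    return home + transfer, away - transfer


# Same matchup, same winner, different margins.
print(team_elo_update(1500, 1500, 24, 21))  # narrow home win
print(team_elo_update(1500, 1500, 42, 14))  # home blowout
```

Even with these tweaks, the inputs are still just the final score and where the game was played.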

But the system isn't able to capture big swings in available talent, changes in coaching staff, or other factors that can have a huge impact on games. With so many moving pieces and the importance of exploiting match-ups, it's impossible for a system with data points made up entirely of game results to accurately capture the relative strengths of every team in college football.

Long story short, a metric like Elo could be used for something like overall program strength, but it is a very poor metric for setting expectations for the upcoming season. If the formula has been tweaked enough to have decent predictive value, then it isn't really an Elo rating anymore.

But more importantly, remember that every metric tells a story. It has a point of view. The creator made an evaluation of what is important when determining how the formulas work. It's just as important to figure out what it isn't telling you as it is to understand what value it brings.

And the next time you see a random statistical graphic on social media, make sure to ask yourself if it tells the story that is being presented.