Fancy Stats Primer: Your guide to hockey's advanced analytics jargon
Be they responsible for splashy front office hires or for rankling old school NHL head coaches like Ted Nolan, advanced hockey statistics couldn't stay out of the news this summer. So what are these “fancy stats” and what use are they to fans?
Unfortunately, the most accessible resource for hockey analytics, Extraskater.com, closed up shop this offseason after its creator Darryl Metcalf joined the suddenly stat-crazy Toronto Maple Leafs organization.
Still, these numbers are available with a little digging. Behindthenet.ca and stats.hockeyanalysis.com, two classics of the genre, still exist, while upstarts like War on Ice and Progressive Hockey fight to replace Extraskater.
Regardless of what site you use, though, the categories and tables can seem daunting. That’s why I’ve compiled this field guide to what these stats mean, and how to interpret them.
Stat: Corsi Rating
What it measures: Shot attempts for vs. shot attempts against when a given player or players are on the ice. It can be expressed as a percentage, i.e. (shots on goal-for + blocked shots-for + missed shots-for) divided by the total of those shot attempts for both teams, or as a per-60-minutes rate: (shot attempts for minus shot attempts against) x 60 / time on ice.
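For anyone who wants to check the arithmetic, here is a minimal sketch of both expressions. It treats the percentage as a team's share of all shot attempts while the player is on the ice. The function names and the sample numbers are made up for illustration and aren't taken from any stats site.

```python
def shot_attempts(on_goal, blocked, missed):
    """Total shot attempts: shots on goal + blocked shots + missed shots."""
    return on_goal + blocked + missed


def corsi_pct(attempts_for, attempts_against):
    """Share of all shot attempts taken by the player's team, as a percentage."""
    total = attempts_for + attempts_against
    if total == 0:
        return None  # no attempts either way, so the share is undefined
    return 100.0 * attempts_for / total


def corsi_per_60(attempts_for, attempts_against, toi_minutes):
    """Shot-attempt differential scaled to 60 minutes of ice time."""
    if toi_minutes <= 0:
        return None
    return (attempts_for - attempts_against) * 60.0 / toi_minutes


# Hypothetical player: every number here is invented for the example.
cf = shot_attempts(on_goal=30, blocked=12, missed=13)  # 55 attempts for
ca = shot_attempts(on_goal=25, blocked=10, missed=10)  # 45 attempts against
print(corsi_pct(cf, ca))            # 55.0
print(corsi_per_60(cf, ca, 100.0))  # 6.0
```

A player above 50 percent (or above zero per 60) was on the ice for more attempts by his team than against it.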
Namesake: Jim Corsi, the goaltending coach (then with the Buffalo Sabres, now with the St. Louis Blues) who developed the statistic to measure his goalies’ workloads.
How it’s used: Corsi rating is the most fundamental advanced stat. Over the course of large samples, shot differential predicts team success better than goal differential. Like Fielding Independent Pitching stats in baseball, the usefulness of a Corsi rating relies on an empirical assumption that many fans find counterintuitive. In sabermetrics, that controversial breakthrough idea was that pitchers have little influence on whether balls are fielded once put in play. For hockey, the equivalent idea is that teams have little influence on the quality of their shots over large sample sizes.
Analysts often refer to Corsi as a possession metric, as it is considered a stand-in for actual time of possession. When players have a demonstrable effect on their teammates’ Corsi ratings, they are said to be good or bad “possession players.” This point can admittedly be a little confusing: Corsi seemingly attempts to be a superior version of plus/minus, free of the variables of shooting percentage, while also serving as an inferior proxy for actual time of possession, a number we wish the NHL measured.
Really it’s both. Puck possession manifests itself in one team outshooting another, and outshooting your opponent is a better predictor of long-term success than goal differential. The duality of the statistic is reflected in its expression. Behindthenet displays a positive or negative number in the same vein as plus/minus, suggesting its role as a replacement. War on Ice, Progressive Hockey, and Stats.HockeyAnalysis.com express Corsi as a percentage, approximating what portion of the game a team had possession of the puck.
Common mistakes: As a team statistic, Corsi is pretty straightforward and illuminating. Typically, the best possession teams rise to the top. As a player statistic, it can be abused, as what it really measures is team performance when that player is on the ice, and it is thus heavily dependent on the quality of a player’s teammates.
One solution to this problem is CorsiRel, a stat that expresses an individual's Corsi rating relative to the rest of the team. This statistic attempts to account for the possibility that a good player could easily accrue a poor rating on a poor team.
And while this stat seems to make it easier to compare players from different teams by reducing the effect of their teammates, use caution. Often, an OK player on a very bad team will sport a better CorsiRel than a good player on a great team. After all, CorsiRel does not change the simple fact that one player’s team controlled the puck while he was on the ice (albeit with the help of his teammates) and the other player’s team did not.
Patrick Kane, for instance, had a slightly worse CorsiRel than Nail Yakupov last season. That doesn’t discredit the statistic; it just reflects the reality that Yakupov was being compared to Jordan Eberle and Ryan Nugent-Hopkins, not Jonathan Toews and Patrick Sharp. With all these “fancy stats,” context matters.
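To see how that can happen, here is a small sketch. It assumes the common formulation of CorsiRel as a player's on-ice Corsi minus his team's Corsi when he is on the bench; individual sites may compute it differently. The two players and their numbers are invented.

```python
def corsi_rel(on_ice_pct, off_ice_pct):
    """Relative Corsi: on-ice Corsi% minus the team's Corsi% without the player.

    Assumption: this on-ice minus off-ice form is one common definition,
    not necessarily the exact formula any particular site uses.
    """
    return on_ice_pct - off_ice_pct


# Invented numbers for two hypothetical players.
players = {
    "good player, great team": {"on_ice": 56.0, "off_ice": 55.0},
    "OK player, bad team": {"on_ice": 48.0, "off_ice": 44.0},
}

for name, p in players.items():
    print(name, corsi_rel(p["on_ice"], p["off_ice"]))
# good player, great team 1.0
# OK player, bad team 4.0
```

The second player wins on CorsiRel even though his team was still outshot with him on the ice, which is exactly the trap described above.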
Stat: Fenwick Rating
What it measures: It’s just a Corsi rating without considering blocked shots. This stat has replaced Corsi in many contexts, as blocks are considered a skill for which teams should receive credit.
Namesake: Matt Fenwick, a blogger who first pointed out this problem with the original Corsi rating.
How it’s used: The same stuff as Corsi. The trade-off here is that you give teams and individuals credit for blocking shots, but you shrink the sample of applicable shots. For that reason, Fenwick is more often used to compare teams, which necessarily produce larger samples, than to compare players.
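The sample-size trade-off is easy to see with numbers. The sketch below computes a Fenwick percentage from unblocked attempts only and compares the event count with the Corsi total. The function name and all the counts are hypothetical.

```python
def fenwick_pct(sog_for, missed_for, sog_against, missed_against):
    """Share of unblocked shot attempts (shots on goal + misses) taken by a team."""
    unblocked_for = sog_for + missed_for
    unblocked_against = sog_against + missed_against
    total = unblocked_for + unblocked_against
    if total == 0:
        return None  # nothing to measure
    return 100.0 * unblocked_for / total


# Invented counts: (shots on goal, blocked, missed) for and against.
sog_f, blk_f, miss_f = 30, 12, 13
sog_a, blk_a, miss_a = 25, 10, 10

corsi_events = sog_f + blk_f + miss_f + sog_a + blk_a + miss_a
fenwick_events = sog_f + miss_f + sog_a + miss_a

print(corsi_events, fenwick_events)  # 100 78
print(round(fenwick_pct(sog_f, miss_f, sog_a, miss_a), 1))  # 55.1
```

Dropping blocks removes 22 of the 100 events here, which matters a lot more over one player's ice time than over a team's full season.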
Common mistakes: See above.
Stat: Zone Start Percentage
What it measures: The percentage of a player’s shifts that start in the offensive zone rather than the defensive zone, i.e. OZS/(OZS+DZS). This number amounts to how a coach chooses to deploy a player. Neutral zone starts are usually not included.
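As a quick sketch of the calculation, assuming the usual convention of offensive zone starts divided by offensive plus defensive zone starts, with neutral zone starts left out (the function name and the counts are invented):

```python
def zone_start_pct(oz_starts, dz_starts):
    """Offensive zone starts as a percentage of offensive + defensive zone starts.

    Neutral zone starts are deliberately ignored.
    """
    total = oz_starts + dz_starts
    if total == 0:
        return None  # the player never started a shift in either end
    return 100.0 * oz_starts / total


# Hypothetical defenseman: 315 offensive zone starts, 185 defensive zone starts.
print(zone_start_pct(315, 185))  # 63.0
```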
Namesake: Not a mystery.
How it’s used: Zone start percentage adds context to other stats. Players who start most of their shifts in the offensive zone (55-70%) tend to score more points and have better Corsi/Fenwick ratings than those who do not.
Common mistakes: An extreme zone start percentage one way or the other doesn’t have a single meaning. Take the New York Rangers for instance. Last season, coach Alain Vigneault used Martin St. Louis (57%) and Rick Nash (55%) in a majority of offensive zone starts, no doubt to give them the most opportunities to utilize their offensive talents. But he also used John Moore (63.1%) similarly because he can’t trust Moore to not screw up in the defensive zone. And he gave Brad Richards (66.6%) the biggest relative share of offensive zone starts, probably for a combination of those aforementioned two reasons.
Stat: PDO
What it measures: Team shooting percentage+save percentage when a given player or players are on the ice. Basically, it accounts for everything outside the scope of Corsi rating, to explain discrepancies between what a player does and his results.
Namesake: It doesn’t stand for anything (I think), and may have originated from someone's internet handle.
How it’s used: PDO attempts to quantify a player’s good or bad luck. A score of over 1000 tends to indicate “good luck,” under 1000, the opposite. If a player has a bad Corsi rating but a great plus/minus, he’s probably being overrated thanks in part to his teammates' scoring abilities and/or his goalie’s performance. Granted, using “luck” is oversimplifying things. Some players we would expect to have a better PDO: those who play defense in front of Tuukka Rask or on a line with Sidney Crosby. But in the small sample size of an individual player’s ice time, PDO’s two component statistics are often very wonky and may lead to some illusory boxcar numbers. Breaking PDO down into its two components—as behindthenet.ca does—can also be illuminating. Players who rack up a lot of assists thanks to their linemates’ high shooting percentages that season can usually expect a regression. Sidney Crosby is the only real exception to this rule.
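Here is a rough sketch of the PDO arithmetic, including the breakdown into its two components. It assumes both percentages are based on shots on goal while the player is on the ice and scales the sum so that 1000 is break-even. The function names and the counts are made up for the example.

```python
def pdo_components(goals_for, shots_for, goals_against, shots_against):
    """Return (on-ice shooting %, on-ice save %) as fractions of shots on goal."""
    if shots_for == 0 or shots_against == 0:
        return None
    sh_pct = goals_for / shots_for
    sv_pct = 1.0 - goals_against / shots_against
    return sh_pct, sv_pct


def pdo(goals_for, shots_for, goals_against, shots_against):
    """Shooting % plus save %, scaled so that 1000 is break-even."""
    parts = pdo_components(goals_for, shots_for, goals_against, shots_against)
    if parts is None:
        return None
    sh_pct, sv_pct = parts
    return round(1000 * (sh_pct + sv_pct))


# Invented on-ice totals: 50 goals on 500 shots for, 29 goals on 500 shots against.
sh, sv = pdo_components(50, 500, 29, 500)
print(round(sh, 3), round(sv, 3))  # 0.1 0.942
print(pdo(50, 500, 29, 500))       # 1042
```

A player with totals like these is getting well-above-average results from both his shooters and his goalies, so some regression would be a reasonable bet.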
Common mistakes: The word “luck” gets thrown around pretty loosely in the context of advanced sports statistics, to the annoyance of pretty much everybody. PDO involves two numbers that players can genuinely influence with their skill. And while shooting percentage varies wildly, teams with good goalies will generally have higher PDOs.
Still, PDO serves as a good gut check when certain stats seem out of whack. For example, Ben Lovejoy’s +21 rating with the Ducks last season probably has more to do with his goalies’ .942 SV% with him on the ice than any sudden breakout in defensive ability at age 29.
Stat: QualComp (aka Corsi QoC)
What it measures: Quality of competition stats are exactly what they say. Like zone starts, they provide more context in terms of how a coach deploys a player. Generally speaking, QualComp measures the Corsi ratings of a player’s opponents.
Namesake: The world’s worst portmanteau.
How it’s used: QualComp is the other half of the deployment puzzle with zone starts. Whether a player faces the other team’s top talent adds perspective to the numbers.
Common mistakes: Unlike zone starts, which have an easily observed effect on Corsi/Fenwick, there’s some dispute among analytics experts as to how much of an impact quality of competition really has. Generally, the players who are good at driving possession will do it against any opponent. QualComp shares some limitations with CorsiRel, too. It works best not as a way to compare players across teams, or even as a specific number, but as a way of ordering players on the same team, from toughest to easiest assignments.
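To make the ordering idea concrete, here is a minimal sketch. It assumes QualComp is an ice-time-weighted average of opponents' Corsi ratings, which is one plausible way to build it and not necessarily how any given site does. The player labels, ratings, and minutes are all invented.

```python
def qual_comp(matchups):
    """Ice-time-weighted average of opponents' Corsi ratings.

    matchups: list of (opponent_corsi, minutes_against) pairs.
    Assumption: weighting by head-to-head minutes is illustrative only.
    """
    total_minutes = sum(minutes for _, minutes in matchups)
    if total_minutes == 0:
        return None
    return sum(corsi * minutes for corsi, minutes in matchups) / total_minutes


# Hypothetical centers on one team, each with two opposing lines faced.
team = {
    "Center A": [(4.0, 30.0), (2.0, 10.0)],
    "Center B": [(-1.0, 20.0), (1.0, 20.0)],
    "Center C": [(-3.0, 10.0), (-1.0, 30.0)],
}

# Rank from toughest to easiest assignments within the same team.
ranked = sorted(team, key=lambda name: qual_comp(team[name]), reverse=True)
for name in ranked:
    print(name, qual_comp(team[name]))
# Center A 3.5
# Center B 0.0
# Center C -1.5
```

The ranking is the useful part here; the raw numbers themselves shouldn't be compared across teams.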