The Metrics System: How MLB's Statcast is creating baseball's new arms race
- New technology has produced massive piles of data about everything that happens on a baseball diamond. For both players and MLB front offices, all kinds of answers are in there—finding an edge is all about asking the right questions.
This story appeared in the Aug. 22, 2016 issue of SPORTS ILLUSTRATED.
One afternoon in May, Diamondbacks third baseman Jake Lamb was engrossed in his daily pregame routine inside the clubhouse when he stopped, suddenly, in front of a TV screen. He had overheard an unlikely topic of conversation on the afternoon MLB Network show: a somewhat obscure second-year player on a middling team who was having an unremarkable season. The talking heads were diving into a list of players whose average exit velocity (the speed at which a ball comes off a bat) had improved the most from the 2015 season, and they were discussing the hitter who'd seen the greatest jump. The player atop the list, to the surprise of the analysts and of Lamb himself, was Jake Lamb.
Lamb does not like to clutter his head with information; the daily team stat sheets are of no use to him. His philosophy is that the categorizing of a hitter's batted balls into "outs" or "hits" on a given night is more or less a dice game: A fielder makes a remarkable play, or happens to be standing in the right spot, and what should be a hit becomes an out. "If I hit the ball hard, I count it as a hit," Lamb says. "If I hit two balls hard, at the end of the night I was 2 for 4, even though on the scorecard I was 0-fer. If you look at the result, you're going to drive yourself crazy."
Over the winter, after a somewhat disappointing 2015 rookie season in which he hit .263 with six home runs in 107 games, Lamb overhauled his approach at the plate; he added a leg kick and lowered his hands so that when he started his swing, they were moving in a straighter plane through the strike zone. Though he felt he was squaring up the ball better early in the '16 season, the results didn't reflect that—that day in May his slugging percentage was .500, and even Lamb was beginning to question the effectiveness of his new style.
But now, standing in front of the TV, he was looking at a number—his 93.7-mph average exit velocity, which ranked higher than that of Bryce Harper, Miguel Cabrera and Lamb's star teammate, Paul Goldschmidt—that told a different story. Lamb was doing precisely what he'd set out to do with his new swing: He was absolutely murdering the baseball. "Here was something telling me, don't change a thing," he says. "I'm doing everything right."
Exit velocity is a measurement that comes from MLB's Statcast system, which deploys radar equipment and high-resolution cameras to track every movement on a baseball field. Statcast information from every ballpark became available in 2015, though teams had been quietly using some of the data it produces in their player evaluations even before that. In the Rays' organization, players sitting in an auditorium on the first day of spring training are told that hitters in that franchise are not measured by batting average but by batted-ball exit velocity. ("It's a term they use exclusively, like nothing else matters," says one former Rays player.) In 2014 the Mets' front office, after agonizing over the choice between Lucas Duda and Ike Davis as the team's first baseman of the future, settled on Duda in large part because of his superior exit velocity. (Duda hit 57 homers over the next two seasons in New York, before going on the DL with a stress fracture in his back this May; Davis has struggled to stick in a major league job.) Scan a ranking of the leaders in exit velocity at any point in a season, and you will find the game's most famous mashers—Giancarlo Stanton, David Ortiz, Miguel Cabrera—but also a smattering of underappreciated players who possess hidden skills.
Players such as, for instance, Lamb. Arizona's third baseman stuck to his new approach, and not only did the hits begin to fall in, but during one stretch in May and June he mashed 16 home runs over a 45-game period. At the All-Star break he was leading the National League in slugging percentage. "I'm trying to barrel up the ball and hit it as hard as I physically can," says the 25-year-old, who was slugging .561 with 24 home runs at week's end with an average exit velocity that ranked high on the Statcast leader board among NL regulars. "Exit velocity tells you what you need to know. I think it's a cool stat, and you're talking to someone who doesn't exactly care for stats."
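For readers who want to see how a list like the one Lamb saw on TV might be assembled, here is a minimal sketch. It assumes each batted ball is stored as a (player, season, mph) record; the players, the numbers and the function names are all made up for illustration and are not real Statcast data or any MLB tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical batted-ball records: (player, season, exit velocity in mph).
# These values are invented for illustration; they are not Statcast data.
BATTED_BALLS = [
    ("Player A", 2015, 88.0), ("Player A", 2015, 90.0),
    ("Player A", 2016, 93.0), ("Player A", 2016, 95.0),
    ("Player B", 2015, 91.0), ("Player B", 2015, 93.0),
    ("Player B", 2016, 92.0), ("Player B", 2016, 94.0),
]


def average_exit_velocity(records):
    """Return {(player, season): mean exit velocity in mph}."""
    grouped = defaultdict(list)
    for player, season, mph in records:
        grouped[(player, season)].append(mph)
    return {key: mean(values) for key, values in grouped.items()}


def biggest_improvers(records, before, after):
    """Rank players by change in average exit velocity between two seasons."""
    averages = average_exit_velocity(records)
    players = {player for player, _ in averages}
    gains = {
        player: averages[(player, after)] - averages[(player, before)]
        for player in players
        # Only compare players who put balls in play in both seasons.
        if (player, before) in averages and (player, after) in averages
    }
    # Largest improvement first.
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


ranking = biggest_improvers(BATTED_BALLS, 2015, 2016)
```

With this toy data, Player A tops the ranking. A real leader board would presumably also require a minimum number of batted balls per season so that a handful of hard-hit balls could not skew the list.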
"Exit velocity is one new metric out of potentially hundreds—it's just scratching the surface of what we expect to do," says Cory Schwartz, MLB.com's vice president of statistics. Schwartz is one of the original employees of Major League Baseball Advanced Media (BAM), which was created under then commissioner Bud Selig in 2000 as the league's tech start-up. BAM now does everything from maintaining team websites to powering instant replay to producing the streaming live video service, MLB TV, which turned BAM into a tech powerhouse. Today it owns the NHL's digital rights; is the streaming service provider for ESPN, HBO and the WWE; and has spun off its tech operation in a deal that would give the new stand-alone company an estimated value of over $3 billion. In many ways Statcast represents BAM's boldest undertaking to date: a big-data initiative that's limited only by the imagination of those turning to it with questions. Every baseball play is tracked to the microsecond and produces an almost infinite stream of data, from the spin of the ball as a pitch hurtles toward home plate (spin rate) to the velocity and angle of the batted ball to the movements of every defender on the field.
On an August afternoon in the BAM offices, in a former cookie factory in lower Manhattan, Schwartz and a group of analysts were in a meeting room deconstructing a single play from the previous night's Yankees-Mets game. On a screen at the front of the room was BAM's internal diagnostic tool, a data-packed user interface which offered a visual representation of outfielder Brett Gardner's failed attempt at an inside-the-park home run to start the game. With everyone gazing up at a large chart with lines tracking the movement of every individual on the field, the room felt like a NASA control center. The data ran across the screen, including the trajectory of the ball (30.3-degree launch angle, 401.4 feet); the time of the exchange and distance on the throw by rightfielder Curtis Granderson (0.87 seconds, 180 feet); cutoff man Neil Walker's exchange time and throw distance (0.63 seconds, 160 feet); and Gardner's home-to-home time (max speed of 20.1 mph, 14.9 seconds around the bases, the fastest home-to-home time of the season, though he was still thrown out). "The fact that Gardner got thrown out, was that a smart decision [by the runner] and just a good play by someone in the field?" asks Schwartz. "That's the part we want to deconstruct."
Two systems merge to create Statcast's complete picture. One, from the Danish company Trackman and based on missile defense technology, uses radar to measure the ball's movement by tracking the speed of the seams at 40,000 frames per second. The other, operated by ChyronHego, a New York-based company, measures the movements of the individuals on the field.
"We want to incorporate all the elements into what we do," says Tom Tango, BAM's senior database architect, looking up at the screen. "The third base coach, when he has to make the decision, he has to figure out what the odds are of [his runner] being thrown out. Now when this outfielder is 180 feet away from the runner and the runner has to decide [whether] to go, we'll be able to construct a chart and say: At that point he's got a certain percentage chance of being safe."
Last season was the first in which this technology was active in all 30 major league stadiums, officially launching what those at BAM loftily refer to as "the Statcast era," an age in which exit velocity will become as ubiquitously cited as pitch velocity (those readings already flash at some ballparks, including Dodger Stadium and Progressive Field) and measurements take out the subjectivity in player evaluations with what statheads call "outcome independent" metrics. It is an era that has been largely mischaracterized by the media. Statcast is not the next Moneyball—in fact it goes in the other direction. The earlier movement was about identifying trends from piles of data in order to exploit market inefficiencies. Statcast, instead, measures individual players down to the fraction of a second, to tell us precisely what we are seeing.
Statcast data is being applied in two main ways: in game broadcasts, targeted at fans and media, and in player evaluation within teams. It has the potential to offer a better picture of what happened on the field, why it happened, and even what will happen—uncovering players who are due for a breakout or bounce-back season, like Lamb or Miami's Marcell Ozuna, who struggled last year but was a top 20 player in exit velocity, and this season was a first-time All-Star. The job of the team of analysts BAM has assembled over the last year, a who's who of all-stars from the world of sabermetric analysis, is to decide what questions to ask of the massive data set. Tango, who previously consulted for teams (most recently the Cubs), has invented a number of metrics, including fielding independent pitching (FIP), and is focusing on what's long been the holy grail of baseball statistics: fielding metrics. He's identified Statcast data that could make popular defensive statistics like Ultimate Zone Rating obsolete. ("We're close," says Tango, who, with info on players' starting position in the field, now has the final piece of the puzzle.) Writer and analyst Mike Petriello was a popular columnist at the analytics site Fangraphs and now writes articles for MLB.com that are driven by Statcast data. He gets queries from front office executives and even players: When veteran outfielder Chris Coghlan was dangling as a free agent, he wanted to know what he could do to improve his defensive Statcast numbers. A third analyst, Daren Willman, a former college player based in Houston, was running the site Baseballsavant.com, which scraped Statcast data from the MLB sites and offered it up, with analysis, to fans. Willman received a cease and desist letter from MLB; instead of shutting him down, however, the league decided to hire him. 
Willman now spends his time creating visualizations with Statcast data, the kind of visualizations that some organizations are beginning to use behind closed doors to show players what adjustments they need to make.
Willman's work gets to that other way the data is being applied: by teams, internally, in player evaluations. One less obvious area is injury prevention. General managers, for instance, can see how a pitcher's stuff has changed over time; in the weeks leading up to the trade deadline, at least one front office balked at trading for Royals closer Wade Davis at the last minute because of his fading spin rate, even though his velocity remained steady. Just days before the deadline Davis ended up on the DL with a flexor strain.
Every MLB organization now has an analytics team in place to try to figure out what to do with all the data that comes from 2,430 games—roughly 750,000 pitches—a season. The 30 teams are using the data in 30 different ways, but they do all share this: an unwillingness to talk about what it is they are doing with it.
"It's an arms race, with all the different areas to explore. As teams find benefits they're gaining a competitive advantage that they want to hold very close," says Greg Cain, BAM's senior director of sports data. "It even colors how we receive requests for information from clubs. A lot of times we'll get a long list to kind of hide what they're looking for."
The challenge is to know what to look for. "It's so massive, it's just about asking the right questions," says Willman. "As far as the answers: The answers are all there."
Cole Figueroa was a perfect fit for the job, and not just because he was the 25th guy on the Pirates' 25-man roster. Figueroa, a Pittsburgh utility infielder through the season's first half, is deeply interested in analytics—he can code and has even run his own studies on how players with his skill set tend to age. And so, at the start of this season, when MLB allowed teams to use iPads in dugouts for the first time, Figueroa was the obvious guy to man those tablets during games. "There is so much information," he says. "Besides your coaches and manager, it's become the best resource."
For the Pirates, one of the most aggressive teams with defensive shifts, it was no longer necessary for players to memorize fielding positions before a game as if they were cramming for a test. Pitchers now watch video of opposing hitters in between innings, and vice versa. "The way a hitter can look at spin rate, for instance," says Figueroa. He would show a teammate, just before his at bat, video of the opposing pitcher. "If a guy has a high spin rate, you're going to know you've got to be thinking something down in the zone. You know that the ball is going to jump on you a little bit more; that's the perceived velocity. You're going to have to be ready a little bit sooner."
Like spin rate, exit velocity and launch angle have become a part of the everyday vernacular in front offices and even in clubhouses. This off-season Cubs third baseman Kris Bryant made changes at the plate with the goal of adjusting his launch angle; he wanted to get his swing flatter, so that he could turn more foul balls into hits. "You get into all these numbers, and I think my launch angle this year has definitely gone down from last year," Bryant, who's emerged as an MVP candidate, told reporters. For players at the other end of the spectrum, the new data could have even more of an impact. "I'm not 6'3", I don't run like a deer; my advantage is trying to find little openings in data, little edges that can help me prolong my career," says Figueroa.
Whether Statcast would become a true game changer—more than a cool toy for baseball nerds, more than glittery window dressing for broadcasts—was always going to depend on how it would be applied behind closed doors and on practice fields. Lamb, for one, couldn't tell you the first thing about launch angles or optimal swing planes, or, until recently, exit velocity. "If I were struggling, yeah, exit velocity is something I'd look to now," Lamb says.
It was late July, and after his midseason explosion, a slump was coming. Lamb went 0 for 24 over one stretch. But ignoring those ugly results, he changed nothing, and the hits began to fall again. A new hot streak—back-to-back home run nights in early August, a 9-for-28 run to start the month—soon began. Lamb, and anyone who'd taken the time to take a closer look at the new data, could have told you. It was only a matter of time.