[mlbvideo id="31405521" width="600" height="336" /]
On Saturday, Major League Baseball Advanced Media provided a small segment of the analytical community with a glimpse into the very near future. At the MIT Sloan Sports Analytics Conference in Boston, MLBAM CEO Bob Bowman presented a new technology that will enable the capture and reliable measurement of every play on the field, opening new doors for research and analysis. Think of it as PITCHf/x on steroids.
Like PITCHf/x, the new system — which does not yet have a name (personally, I'm calling it OMGf/x) — will be able to tell you the velocity, release point, spin rate and path of each pitch, but where that currently-in-place system ends (at least as far as the public is concerned), this one is just getting started. If a batter makes contact, the batted-ball speed, launch angle, distance and hang time will be tracked, as will each fielder's first-step reaction time, speed, acceleration and route to the ball, and each baserunner's speed and route. For an idea of what the system in play will look like, check out the video above of a spectacular, game-ending catch by Jason Heyward with the new tracking data added.
Where PITCHf/x and its lesser-known siblings, HITf/x and FIELDf/x, use a camera-based system from Sportvision, the new technology involves a combination of cameras and radar. It's been developed in partnership with Trackman, whose radar-based software is used for professional golf telecasts and was tested in connection with the expansion of instant replay; the camera portion was developed by a company called ChyronHego. As physicist Alan Nathan explained, "Video is the natural technology for tracking players on the field… Radar is the natural technology for tracking the batted ball."
This new data stream is expected to be available quickly enough for use on broadcast instant replays.
For 2014, the system will be in place in just three ballparks — Citi Field in New York, Miller Park in Milwaukee and Target Field in Minnesota — but the plan is to have it operational in all 30 parks by Opening Day 2015. In order to do that, several obstacles will have to be overcome, as Baseball Prospectus' Ben Lindbergh explained in his graphics-rich summary of the rollout. The main one is processing power and data volume, as the radar system samples the flight path of the ball an incredible 20,000 times per second and the cameras record the position of every player 30 times per second. That results in a whopping seven terabytes (7,000 gigabytes) of data per game.
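To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above, plus one assumption of mine: that each park hosts the standard 81 regular-season home dates, with no postponements, postseason games or data compression. The function names are illustrative and not part of any MLBAM tooling.

```python
# Figures quoted in the article.
RADAR_SAMPLES_PER_SECOND = 20_000   # ball-tracking radar
CAMERA_FRAMES_PER_SECOND = 30       # player-tracking cameras
TERABYTES_PER_GAME = 7
PARKS_2014 = 3
PARKS_2015 = 30

# Assumption (not from the article): 81 regular-season home games per park.
HOME_GAMES_PER_PARK = 81


def season_terabytes(parks, home_games=HOME_GAMES_PER_PARK,
                     tb_per_game=TERABYTES_PER_GAME):
    """Estimate raw data volume, in terabytes, for one regular season."""
    return parks * home_games * tb_per_game


def radar_to_camera_ratio():
    """How many radar samples are taken for every camera frame."""
    return RADAR_SAMPLES_PER_SECOND / CAMERA_FRAMES_PER_SECOND


if __name__ == "__main__":
    print("2014 estimate (TB):", season_terabytes(PARKS_2014))
    print("2015 estimate (TB):", season_terabytes(PARKS_2015))
    print("Radar samples per camera frame:", round(radar_to_camera_ratio()))
```

Under those assumptions, the three-park pilot would generate roughly 1,700 terabytes over a season, and a full 30-park rollout about ten times that, which gives a sense of why processing power and storage top the list of obstacles.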
Currently, the system is still measuring a player's center of mass, not his extremities. "In other words, it can tell you if a runner’s body beat the ball to the bag, but it can’t say whether he got his hand in under a tag," wrote Lindbergh. Occlusion (when two players cross paths) can confuse the system, though when the computer has to guess, it gets it right "80-90 percent of the time," and when it doesn't, a manual operator can correct the record.
Beyond the obvious application of enhancing broadcasts, teams will be able to get more objective evaluations of players that could aid them not only for in-game adjustments to defensive positioning, but also for bigger-picture stuff like more accurately measuring the defensive value of a free agent for contract-setting purposes. Former general manager Jim Duquette, who's now an analyst for MLB.com, described the way that teams might use such data:
"When you look at how scouting has been done in the past, there's a lot of subjectivity to the evaluation," he said. "Some guys I have found have varied, from scout to scout, in terms of their opinion of each player. There are a lot of quality defensive statistics out there, but they're not completely accurate. A lot of them are dependent on somebody charting, whether it's UZR or DIPS or Defensive Runs Saved, and they can only go so far. Some players ... range to their left better, some range better to their right, some come in on ground balls better than others, some have better first-step quickness.
"The exciting thing about this new technology is, you can start to take the subjectivity that is given to you by the scout and blend it with raw data now, and come up with a truer picture of evaluating a player. So when you take that data and compare it to others in the game, you can really find out if that position player is the best at his position. You can measure potential free agents, you can measure current free agents."
Bowman said that the technology will be available "for baseball operations and some fan use for 2014," with baseball front offices and the commissioner's office closely scrutinizing the data this year for accuracy. It's a good sign that fans will be involved, both because it sets a precedent and because it can fuel further innovation. PITCHf/x data has been publicly available since it was rolled out in 2007, and its availability has been a game-changer for the sabermetric movement.
Via PITCHf/x, each pitch can be quantified and described, which lends a richness to game accounts; we can say with reasonable certainty that not only did a batter hit a fastball that was low and away, he hit a 91 mph two-seamer that was three inches off the plate and likely would have been called a ball. Furthermore, each pitcher's repertoire can be aggregated and compared. Breakdowns of such repertoires started appearing at FanGraphs a few years ago, telling us (for example) that Ervin Santana threw his slider an MLB-high 36.9 percent of the time in 2010 with an average velocity of 85.9 mph, or that Justin Verlander's average fastball velocity jumped from 93.6 mph in 2008 to 95.6 in 2009.
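For readers curious about how a repertoire breakdown of that sort comes together, the following is a minimal sketch in Python. It assumes each pitch has already been classified and reduced to a pitch-type label and a velocity; the sample data and the function name are made up for illustration and do not reflect the actual PITCHf/x feed or FanGraphs' methods.

```python
from collections import defaultdict


def summarize_repertoire(pitches):
    """Aggregate (pitch_type, velocity_mph) pairs into usage and average velocity.

    Returns a dict keyed by pitch type, with the share of all pitches thrown
    (percent) and the mean velocity (mph), each rounded to one decimal place.
    """
    velocities = defaultdict(list)
    for pitch_type, mph in pitches:
        velocities[pitch_type].append(mph)

    total = sum(len(v) for v in velocities.values())
    return {
        pitch_type: {
            "usage_pct": round(100 * len(v) / total, 1),
            "avg_mph": round(sum(v) / len(v), 1),
        }
        for pitch_type, v in velocities.items()
    }


# Hypothetical five-pitch sample: four-seamers (FF), sliders (SL), a changeup (CH).
sample = [("FF", 94.0), ("FF", 96.0), ("SL", 85.0), ("SL", 87.0), ("CH", 84.0)]
summary = summarize_repertoire(sample)

for pitch_type, stats in summary.items():
    print(pitch_type, stats)
```

Run over a full season of pitches instead of five, the same grouping produces the kind of usage rates and average velocities cited above.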
Had that been all that PITCHf/x yielded, it would have been a significant step forward, but further study has led to even more interesting applications. At BrooksBaseball.net, Dan Brooks and Harry Pavlidas manually review each pitcher's repertoire (because some ambiguity exists when it comes to the pitch classification algorithms) and log the outcomes for each pitch, enabling the site's viewers to see anything from the start-by-start variations in velocity and release point for a given pitcher to the success that batters had against the pitch when it was located in a given cell of the strike zone. At Baseball Prospectus, Mike Fast made a breakthrough in 2011 when he quantified the impact of catcher framing; shortly afterward, he was hired by the Astros, one of many sabermetricians whose public demonstrations of the value of their work led to jobs in the industry.
Interestingly enough, last week, another catcher-framing analyst at BP, Max Marchi, announced that he had been hired by the Indians. Meanwhile, on Monday, Pavlidas and Brooks unveiled BP's new catcher-framing methodology, which, unlike previous attempts, will be publicly available and updated on a routine basis. (An aside: Holy cow, that stuff is cool.)
As PITCHf/x applications have flourished in the public sphere, fans have gotten only fleeting glimpses of its siblings, which have remained proprietary, available only to teams and a select group of top researchers. For example, before his hiring by the Astros, Fast published fascinating research involving the relationship between quality of contact and batting average on balls in play. Nathan's HITf/x work on comebackers was used in setting standards for the new protective headgear introduced for pitchers.
While also announcing that PITCHf/x's status beyond 2014 is to be determined, MLBAM's representatives were less than definitive about the public availability of the new data. During the Q&A segment of the Sloan presentation, MLBAM Chief Technology Officer Joe Inzerillo said, "If you look at what MLBAM has done historically on this front, we’ve made a lot of data available and we’ve had really good collaboration with the community. I would expect that whatever policy we come up with as far as dissemination, it’s going to live within the boundaries and the guidelines that we normally have done."
However, as Lindbergh added, "[I]t’s also worth noting that releasing nothing to the public—as has been very close to the case with HITf/x—is also within the boundaries of what MLBAM has done. And if they decide to do that again, they’ll be well within their rights." If this new technology is merely something that teams and broadcasters are able to use, it will be a solid hit, enhancing our knowledge somewhat with regard to the best and worst at various facets of the game. However, if this data is available publicly to drive further innovation that informs both the public and the industry, it has the potential to be a grand slam, a Mike Trout-level addition to our body of baseball knowledge.