(The following is being syndicated from The Captain’s Blog.)
Is WAR the new RBI? That was the question asked in a thought-provoking post at IIATMS, which is sure to draw a new battle line in the statistical debate over the value of composite metrics.
At the heart of the author’s argument is the suggestion that WAR, like RBIs, is context-based because so many elements of performance are interconnected. To illustrate this point, Adrian Gonzalez’ higher career OPS with men on base is offered as one of the exhibits. The implication is that Gonzalez’ performance benefits from his teammates getting on base ahead of him (just like with RBIs), so it’s unfair to consider OBP and SLG as strictly individual stats. If we look more closely at Gonzalez’ splits, however, we see that a significant portion of his 50-point OPS increase with men on base stems from the 108 intentional walks he has been given in those situations (he has received only two with no men on). Although one could still argue that those intentional walks are as much attributable to the men on base as to the pitcher’s fear of Gonzalez, the split raises other questions as well. Specifically, one must then also consider to what degree the hitters batting ahead of Gonzalez benefit from his presence in the lineup.
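To make the intentional-walk point concrete, here is a minimal Python sketch that computes OPS for one split twice: once with intentional walks counted (as the standard OBP formula does) and once with them removed. The batting line below is made up purely for illustration and is not Gonzalez’ actual split; the function and variable names are likewise my own.

```python
def obp(h, bb, hbp, ab, sf):
    """On-base percentage; intentional walks are part of bb."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slg(tb, ab):
    """Slugging percentage; walks are not at-bats, so they do not enter here."""
    return tb / ab


def ops(h, bb, hbp, ab, sf, tb):
    """OPS is simply OBP plus SLG."""
    return obp(h, bb, hbp, ab, sf) + slg(tb, ab)


# Hypothetical "men on base" line (NOT real data).
men_on = {"h": 300, "bb": 200, "hbp": 10, "ab": 1000, "sf": 20, "tb": 520}
intentional_walks = 100  # hypothetical number of the walks that were intentional

ops_with_ibb = ops(**men_on)

# Remove the intentional walks from the walk total and recompute.
no_ibb_line = dict(men_on, bb=men_on["bb"] - intentional_walks)
ops_without_ibb = ops(**no_ibb_line)

gap_in_points = round((ops_with_ibb - ops_without_ibb) * 1000)
print(f"OPS with IBB:    {ops_with_ibb:.3f}")
print(f"OPS without IBB: {ops_without_ibb:.3f}")
print(f"Gap: {gap_in_points} points")
```

With these invented inputs, stripping out the intentional walks lowers OPS by about 52 points, all of it through OBP. Slugging is untouched because walks never count as at-bats.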
It should be pretty obvious that everything that happens on a baseball field is interconnected. Not only do players interact with their teammates, but the opposition has a say as well. In particular, the pitcher influences a batter’s outcome as much as any variable present on his own team (other than his individual batting skill). Using the same example, we could posit that Gonzalez has a higher OPS with men on base because stronger pitchers don’t allow baserunners as frequently as weaker ones (and especially not when they are on top of their game). If so, we might then expect hitters in general to have a higher OPS with men on base, and in fact, this is the case. Since Gonzalez broke into the majors, the average OPS gap between these two splits is 34 points.
The undeniable bottom line is that all baseball statistics are context-dependent. As much as sabermetrics tries to neutralize contingencies, I don’t think anyone really believes they can be eliminated altogether. Rather, statistics like wOBA work under the assumption that these more subtle contingencies will cancel out over a season, or, at the very least, not make a significant difference. That’s why the offensive component of WAR is not the “new RBI”.
What about defense? Even the most ardent proponents of UZR admit that it is limited in measuring certain positions, such as first base, and usually requires about three years’ worth of data to be accurate. What’s more, because classifying batted balls involves human intervention, inherent biases and user errors come into play. On that basis alone, the metric seems ill-suited to be combined with the more refined statistics that measure offense.
Aside from these specific limitations, however, UZR has a more philosophical flaw: it treats defense as a zero-sum game. Unlike offense and pitching, which are measured as individual rates of success, defense is calculated in terms of a fielder’s contribution to the team. If a fly ball is caught in left center, for example, the team records an optimal outcome (an out), but one defender is credited at the expense of another. This divergence between team and individual performance is an inherent flaw in UZR, and in any system that treats one teammate’s success as another’s failure (or lack of success). That’s why, if anything, the defensive component of WAR makes it the complete opposite of the RBI, which credits a player for a teammate’s prior success.
One final component of WAR taken to task by the IIATMS post is the concept of replacement value. According to the author, “it’s beyond asinine to conclude that Ellsbury is twice as valuable as Fielder”. Unfortunately, despite the strength of that statement, no evidence is advanced to support it. Based on wOBA alone, Ellsbury and Fielder have had nearly identical seasons, so it stands to reason that Ellsbury’s defense and base running elevate him above Fielder to some degree. Do they add up to make him twice as valuable? (According to FanGraphs’ and Baseball-Reference’s versions of WAR, Ellsbury rates 83% and 49% better, respectively.) Perhaps not, but once you consider the relative scarcity of offense at each player’s respective position, the author’s unsubstantiated blanket statement seems more questionable than the conclusion he deems “asinine”.
So, is WAR the new RBI? Not if you use it properly. Although there are noteworthy flaws, the framework is sound. A player’s true contribution is not measured by a simple number like RBIs, but rather by his performance in every facet of the game. WAR is far from perfect, but it does a much better job of providing a launching point for comparison than standalone statistics like RBIs. In that sense, the only way in which the two metrics are even remotely related is the lazy manner in which many try to use them.
Before concluding, it’s worth taking a moment to circle back to the defensive component of WAR (UZR for fWAR and Total Zone for bWAR). As long as defensive systems are designed as a zero-sum game, they will continue to be flawed. Although such a methodology might be well suited for identifying the very best fielders, it loses track of those who fall away from the margins. The Yankees’ outfield, which ranks third overall in UZR, provides a perfect illustration of this dynamic. On an individual basis, Brett Gardner and Nick Swisher rate highly, but Curtis Granderson does not. Does that mean Swisher and Gardner are picking up the slack for Granderson? Or could the combination of the Yankees’ outfield alignment and UZR’s zero-sum accounting be the reason for his low rating? As long as that doubt exists, UZR will be the subject of legitimate criticism.
Without the use of technological methods (such as FIELDf/x), improving the reliability of defensive metrics will remain a challenge. One possible solution would be to give a fielder credit for a play he could have made (for example, if the center fielder, left fielder and shortstop all converge on a pop-up, then all three would be given credit for a putout). The Fielding Bible’s +/- goes halfway in this approach by not penalizing fielders for balls they could have caught, but it still does not give them credit. Another alternative would be to measure only the balls that an outfielder should have caught but did not, thereby avoiding the conflict that arises when two or more players could have made the same play. Although these adjustments would continue to require subjective inputs, they would at least remove the zero-sum problem from the equation.
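The difference between the two accounting approaches can be sketched in a few lines of Python. This is a deliberately simplified toy model, not how UZR or +/- is actually computed: the play log is invented, the list of fielders who "could have" made each play stands in for the subjective input mentioned above, and both function names are hypothetical.

```python
from collections import Counter

# Invented play log (NOT real data). Each entry is:
# (fielder who made the catch, fielders judged able to make the play)
plays = [
    ("CF", ["CF", "LF"]),        # fly ball to left center
    ("CF", ["CF", "LF", "SS"]),  # pop-up with three fielders converging
    ("LF", ["LF"]),              # routine fly to left
    ("CF", ["CF", "RF"]),        # fly ball to right center
]


def zero_sum_credit(plays):
    """Only the fielder who records the out gets credit."""
    credit = Counter()
    for catcher, _candidates in plays:
        credit[catcher] += 1
    return credit


def shared_credit(plays):
    """Every fielder who could have made the play gets credit."""
    credit = Counter()
    for _catcher, candidates in plays:
        for fielder in candidates:
            credit[fielder] += 1
    return credit


print("Zero-sum:", dict(zero_sum_credit(plays)))
print("Shared:  ", dict(shared_credit(plays)))
```

In this toy log the left fielder gets credit for one play under zero-sum accounting and three under shared credit, even though nothing about his range has changed. That gap is the kind of alignment-driven distortion raised in the Granderson example.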