If I gave you the number series 29, 109, 200, what would you think it meant? You’d have no idea, right? For these numbers to have meaning, you have to give them names, and this is where historical circumstance creates problems. If the technology for investigating statistics wasn’t very good, then there was little verification of their validity, and without verification, we don’t know exactly what the numbers are telling us. Naming those statistics gave the numbers meaning, but the meaning wasn’t always accurate because, again, no one investigated the statistics before introducing them to the public. It wasn’t laziness. It wasn’t stupidity. It was a combination of low stakes (statistics mattered little at the time; their importance has since grown with the introduction of awards and the growing value of players, thus necessitating analysis) and insufficient technology. There really wasn’t anything they could do, and they had to name the statistics to tell people what they meant according to what they understood the statistics to mean. So, when I tell you the numbers at the beginning of the paragraph are Robinson Cano’s home runs, RBI, and hits, you understand the numbers in their context. They now have meaning. But these numbers and statistics are fairly benign. Home runs are home runs. RBI are the number of runners knocked in by a hitter. Hits are the number of hits a player had. There’s nothing controversial about them on the surface, but let’s look at some others whose names imply more.
Batting average is easy enough (the number of hits divided by the number of at-bats), but its history is not. Henry Chadwick developed the box score, and he didn’t think walks were manly or important. He left walks out of the box score and gave no credit to the hitter for getting one. When someone else came along to devise batting average, they used hits over at-bats because that was what was available and perceived as important. So let’s look at the name: batting average. It seems benign enough, but the name carries weight. The word “batting” implies that this average captures all that is important about hitting, but we know now that walks are important, though probably ever-so-slightly less so than hits. However, because no one questioned batting average for decades, it gained implicit acceptance because it was never refuted, thus somewhat unwittingly reaffirming its value. Saberists have asked the question “Is batting average all that is important in hitting?”, and they answered no after investigation and testing. Batting average still plays a role in newer statistics, but newer statistics have adjusted to account for what batting average left out: walks and the difference between singles and extra-base hits. When saberists name their statistics, they try to be more accurate with their naming, but again, those names carry weight, and sometimes more than they can bear. But they’re trying to get better.
Wins have really been in the news lately, and while some have used Felix’s win as the demise of the statistic, I think it’s still alive and well. But let’s take a look at it. Imagine being in our forefathers’ shoes. People want to know how to differentiate between pitchers, but how does one do that? There are no computers and no since-accumulated knowledge. So let’s look at this in a very basic manner. Day by day, the team plays games, and there are eight guys who essentially play every one of those games. But guess who’s different? The pitcher, of course! So if the team around him is the same and the only things that change are the pitcher and the game’s outcome, then the pitcher must be the difference in the outcome! So, you can look at the team’s record in the games that each pitcher pitches, and the records correspond with the quality of the pitchers. Sounds good and logical, right? Well, the correlation was at least stronger when pitchers completed games, but we know that things aren’t equal day-to-day: there are different teams faced, varying levels of offensive output, and different parks. Add the diminishing number of innings pitched by starters and the corresponding increase in bullpen innings pitched, and a lot more goes into the win than simply the starting pitcher. However, the term “win” causes problems when the pitcher is the only one receiving credit for the win (why not give the first baseman a W-L record? the second baseman?). The implication of the term, especially when it is called a pitching statistic, is that the pitcher is responsible for the team’s win, but we know the pitcher is not solely responsible. No one, however, seriously questioned this until a few decades ago, and like batting average, it gained implicit acceptance as a result. If no one calls it out, it must be right, correct? And when the object of the game is to win, it makes the statistic seem so much more important than other pitching statistics.
If only it had been named something else.
I could do this all day, but I think you’ve gotten the point. Look, we all want to point the blame somewhere, but sometimes, stuff happens that is out of our control. Traditional statistics developed problems for a variety of reasons. Sometimes, it was misguided machismo, and sometimes, it was a lack of available technology. Numbers were given names, and those names carried meaning. When no one challenged them, the meaning gained power and authority without anyone granting it, and eventually, people even gave the meaning that tangible power. As saberists have challenged these statistics, their argument encounters the neglected might of language, time, and reinforced belief. Saberists often challenge the names of the traditional statistics, but I don’t know that we delve into why those names have so much power. When these names are confronted, it’s hard to understand how we could have been wrong, or at least misguided, for so long. How could no one have noticed? If there was something wrong, we should have seen it, right? Our apparent ignorance of the problem implies that we are stupid and/or negligent. The thing is that this isn’t what is going on. We believed what we did because A) it was what our forefathers declared, B) it seemed logical at the time, and C) we kept believing it over and over, through generations, reinforcing its power by passing it on. It added up to some misguided beliefs, but there is no shame in it. Things like this happen all the time (Columbus can just sail west and hit India, the rain gods withhold or grant rain based on how happy or unhappy they are with us, etc.), but we eventually figure it out, usually as a result of having the necessary equipment, discovery, or technology. That shouldn’t be a criticism of the human mind. The ability to figure out our mistakes is a testament to human intelligence. It might take a while, but we undo previously held beliefs all the time.
After a while, those challenges become new previously held beliefs, and they may need to be challenged in turn (advanced statistics need to be challenged, albeit to improve them and not to destroy them, and they already are). Don’t be afraid to learn. Embrace it. It’s what makes you human (animals also learn and adapt, but I mean that we learn abstract morality and thought). You’ve always loved statistics and have always used them no matter which side you identify with. Don’t make these arguments about stats versus intangibles because it’s never been about that. It’s always been a power struggle over which stats to use and the credence and authority gained from winning that battle. And I’d argue that it’s, most importantly, a power struggle over the right to use certain words to name those statistics, with the term “win” pushing toward the forefront.