We Need to Dig a Deep Hole, Bury the Old Statistics, and Forget Them Forever

I wanted to wait a little bit before following up on my post about the AL MVP race. A week from today, it looks increasingly likely that Miguel Cabrera will be named the AL MVP over Mike Trout, an objectively wrong decision by any definition of the award. If this is the case, the decision will have been made solely based upon use of three statistics (runs, RBIs, and home runs) that are obsolete, useless, and misleading.

Let’s get something out of the way. This is not a case of “quantitative vs. qualitative” or “scouts vs. nerds” or anything like that. Runs, RBIs, and home runs are statistics. They are aggregated records of events that happened in baseball games, just like on-base percentage, slugging percentage, wOBA, UZR, DRS, or whatever else you pick.

Every single person uses statistics to determine who the better MLB baseball player is. No one watches all 162 games per year of every single baseball club, and even if they did, their brains would not be physically capable of remembering and aggregating the individual details of the games that they watched in order to make a determination of who is better. These are not prospects, where you are trying to look into the crystal ball and predict the future, and where the qualitative stuff that doesn’t show up in the statistics becomes relevant.

The only question is what statistics we use to evaluate players. Any person who is capable of using a human brain to form thoughts can figure out that RBIs, home runs, and batting average produce almost no useful information about a player. I could make those arguments right now, but I shouldn’t have to. Moneyball happened, and thousands of people on and off the internet had those arguments. Batting average tells you a little bit about how a player goes about providing a piece of his value, but requires tons and tons of context and understanding of luck to be useful. Home runs tell you about one way in which a player can add value, but talk to Curtis Granderson or Adam Dunn this season if you think it’s a real measure of value. In Moneyball terms, batting average and home runs are statistics that tell you something but lack the power of language.

RBIs and runs are useless statistics to determine how good a player is. This is a fact, not an opinion. And they are where the real problem is.

A long time ago, we used to use alcohol as anesthesia and primitive bone saws to amputate limbs. Then, we invented power tools, real anesthetics, and penicillin, and we started using those, and the old stuff became obsolete. That’s how things are supposed to work. You figure out objectively better ways to do things, and you stop doing the old stuff.

But that metaphor is actually wrong. Bone saws and booze didn’t do their job well, but they accomplished something before being replaced by better versions of themselves. Think of them as OBP and Slg%, which were eventually replaced by OPS+, which was eventually replaced by wOBA, etc.

In this metaphor, RBIs and runs look like how doctors used to use leeches to bleed the crap out of patients for no good reason. At some point, doctors just woke up and realized that they had been colossally stupid for centuries, and just killed the whole bloodletting idea.

Giving Miguel Cabrera the 2012 AL MVP award because he won the Triple Crown is equivalent to handing out the 2012 Most Valuable Physician award to the guy who was best at bloodletting.

We need to dig a hole, throw the old statistics inside it, and bury them below 10 feet of dirt. We need to just stop using them, ever. They need to get the hell off stadium scoreboards, Fangraphs dashboards, the Yankee Analysts blog, all of it.

If you haven’t read Eder’s post about how PECOTA inventor Nate Silver predicted the 2012 election (and 2008-2010 elections, and 2008 primaries) dead-on, I highly recommend you take the time. The great lesson of the last few weeks–where Silver faced mountains of criticism from all sorts of people who didn’t want to believe or even make the effort to think about basic math–was that rationalizing stuff works.

Math works. Logic works. If we make the choice to ignore math and logic, we’re doing something very stupid. If we use the RBI or runs statistic as anything other than fun, meaningless trivia, we are ignoring objectively true math and logic. The Triple Crown is a meaningless trivia fact, not something that people who aren’t being stupid put any value in.

About EJ Fagan

E.J. Fagan has been blogging about Yankee baseball since 2006. He is a Ph.D. student at the University of Texas at Austin.

41 thoughts on “We Need to Dig a Deep Hole, Bury the Old Statistics, and Forget Them Forever”

  1. ‘Love how you conveniently leave out Cabrera’s slashline.

    HRs, RBI, and runs are not useless and misleading.

    By your ridiculous logic, Robinson Cano wouldn’t get an 8-year, $200M deal from someone other than the Yankees if he posted 50 HR, 145 RBI, and 127 R in 2013, when in fact he would.

    Cabrera was the primary offensive force behind the Detroit Tigers becoming 2012 AL Champions. He had a tremendous year not only with BA/HR/RBI, but with OBP and SLG, too. He had it despite moving back to his old position when he didn’t have to (he moved from 1B to 3B so Prince Fielder could remain a 1B). He posted the numbers he posted as a veteran whom pitchers have faced literally thousands of times.

    “RBIs and runs are useless statistics to determine how good a player is. This is a fact, not an opinion. And they are where the real problem is.”

    Enlighten us as to why HR, RBI, and R are “useless”. I’ve yet to see your argument other than “Miguel Cabrera shouldn’t win the 2012 AL MVP because I say so.”

    • You can read my post from before. There’s no point in re-litigating math.

      If Robinson Cano had a season where he had 50 home runs, 145 RBIs, and 127 runs, actually useful statistics would likely tell the story of how good he was. That said, those statistics do not necessarily mean that a player is the best in the league.

      The simplest way to do this is to actually use your logic in practice:

      Player A hit 50 home runs, 147 RBIs, and scored 127 runs. Player B hit 35 home runs, 120 RBIs, and scored 115 runs. Which is the better player?

      There’s no way to tell, because the statistics are useless. Player A could be a crappy, fat, third baseman hitting in Coors Field, while Player B could be a gold glove shortstop playing in Seattle. Player A could have 10 walks and a .270 OBP while Player B has 100 walks and a .440 OBP.

      This is why we figured out how to do math and come up with better statistics. It’s time to retire the old ones.
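To make the Player A / Player B point concrete, here is a minimal Python sketch. The two stat lines are hypothetical, chosen only so that the on-base percentages land near the .270 and .440 figures mentioned above, and `on_base_percentage` is an illustrative helper, not anything from the original post. It uses the standard OBP definition: times on base divided by at bats plus walks, hit-by-pitches, and sacrifice flies.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / opportunities


# Hypothetical stat lines (assumptions for illustration, not real players).
# Player A: the bigger counting stats from the comment, but almost no walks.
player_a = dict(hits=167, walks=10, hit_by_pitch=0, at_bats=640, sac_flies=5)
# Player B: smaller counting stats, but 100 walks.
player_b = dict(hits=186, walks=100, hit_by_pitch=0, at_bats=545, sac_flies=5)

obp_a = on_base_percentage(**player_a)
obp_b = on_base_percentage(**player_b)

print(f"Player A OBP: {obp_a:.3f}")
print(f"Player B OBP: {obp_b:.3f}")
```

Nothing in the home run, RBI, or run totals would reveal the gap between these two lines, which is the point being made.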

      • Actually, RBIs are very relevant to Cano, who only knocked in 94 despite having the 8th best wOBA in baseball AND the most baserunners when he came to the plate. How did that happen? Cano underperformed with RISP, and most of his production came via walks when facing right-handed pitchers. Because Cano struggled versus lefties all year, that allowed teams to avoid him or attack him when convenient.

        Does that mean RBIs are the basis on which Cano should be judged? Of course not, but to ignore them is to bury more than just old stats. It also runs the risk of burying your head in the sand.

        • If you accept that logic, then you need to use a statistic that represents something like RBI%. Raw RBIs don’t tell you what you just logically spelled out.

          That said, these types of things are significantly overstated in their actual value to winning baseball games. Guys help score runs in a lot of ways that don’t show up in the RBI statistic: you have to give a guy credit for extending a rally by taking a walk 4 batters earlier, or by stealing a base to get in scoring position, or by advancing a runner with a single, or getting his own double, or whatever. RBIs honor the order of events more than the total effect of each event.

          • RBI% (or any rate) is always better than the aggregate, but in most cases we can make common sense assumptions. With Cano, we know he played on a good offense and was healthy, so his lower RBI total likely mirrored a low RBI rate.

            There are many components to winning. I agree that the RBI, by virtue of coming at the end of the chain, isn’t paramount, but it is a part of the picture, and, therefore, burying it, detracts from the entire story.

            Instead of burying the RBI, and other traditional stats, we should be using them like canaries in a coal mine to ferret out trends that encompassing stats like wOBA obscure.

          • Why even use RBIs for this purpose? It seems to me that your best option is to just use the splits from the other stats you are using. “Robinson Cano hit .340/.400/.600 with a .400 wOBA with no one on, but .250/.300/.400 with runners in scoring position,” tells you a lot more than, “Cano had 94 RBIs.”

          • How many at bats did those splits come in? All rates require playing time to have meaning. Splits are not intuitive. If you tell me a top offensive player in the middle of a top offense only had 94 RBIs, it means something. I am not sure why anyone who takes baseball analysis seriously would want to discard a data point, but to each his own I guess. I much prefer to use as many stats, old and new, as well as my observations, as the basis of analysis.

  2. I agree with you that Mike Trout should be MVP. I agree that Runs and RBIs are a function of opportunity and, even though highly correlated with great performances, shouldn’t be used to differentiate between great players. I disagree with you about Batting Average though. You didn’t really flesh out your argument there, but I am assuming that you are driving at the amount of luck involved (unusually high BABIP?). If I’m interpreting where you were headed correctly, then I would agree that it’s important in predicting repeatability/future performance and evaluating someone’s overall talent but irrelevant in determining the value of their season in question and worthiness of a particular award for that season. What matters is their contribution on the field in that year and not whether we think luck played a role in it. That said, BA still isn’t the greatest statistic as it tends to overrate players like Ichiro who never walk and mostly hit singles. I don’t think it’s next to useless though, as you stated. And clearly defense, base running, and positional importance also need to be factored in.

  3. I don’t think homeruns and average are useless at all. RBIs are overrated, but they still at least signify something. As the Yankees proved this season, you can lose a lot of games by not being able to drive in runners in scoring position. It’s obviously a product of opportunity. Let’s be honest though, even if you put Scott Podsednik in the 4th hole of Texas or NY’s lineup, he’s not going to get >100 RBI.

    As for homeruns, yes, alone the statistic is not a great evaluator of player talent. But that’s why Curtis Granderson wasn’t in the running for the MVP. The only thing he did well was hit homeruns. Miguel Cabrera did more than that. Secondly, homeruns are still an important statistic. Hitting homeruns can turn momentum, not to mention the fact that it’s the best hit you can get in the game. You get to touch all four bases when you hit a homerun, and you score at least one run for your team. If you can do that at a high rate year after year, you are going to be valuable to your team. Curtis Granderson is no MVP, but he has excellent value in CF because of his ability to hit for power.

    Finally, there’s average. You could make the argument that OBP is a more comprehensive statistic, but you could also make the argument that hitting for average is more important. When you face a tough pitcher, he’s going to challenge you. If you have a .250 average, then your on-base percentage in an at bat where you’re not going to be walked is .250. This is why the Yankees can’t beat good pitching. Yes, on base is important, but I think average is just as important.

    Miguel Cabrera earned the MVP fair and square in my opinion.

  4. Also, walks are great, and they extend rallies, but eventually someone has to hit the ball to drive in runs. Walks move players up one base at a time. Hits move them up 1, 2, 3, or 4 bases at a time. A walk can only drive in a run with the bases loaded, which is also ironically a scenario where a pitcher is less likely to walk a batter. Despite your assertion that RBI are not important, scoring runs is important if a team plans on winning.

    • This is a key point that gets overlooked with OBP. Walks are not necessarily a skill and they aren’t always beneficial. Context matters for OBP, just like every other stat. If a weak lineup’s sole threat is walked constantly, we can’t simply assume he is a productive player.

      • Walks need to be in the context of OBP. Stats like wOBA exist because there is value in extra base hits as well. When it comes down to it, if you have a team of the top 9 OBP players, and a team of the top 9 RBI leaders, the OBP team is far more likely to win. RBI’s are dependent on your team and luck with clutch hits. The idea of a clutch hitter is a myth.

        This isn’t to say Miguel Cabrera was lucky, because he’s a great hitter because of his OBP and power. When it comes down to it though, I’d rather have his 2011 season (.344/.448/.586 with 30 homeruns and 105 RBI’s) over his 2012 season (.330/.393/.606 with 44 homeruns and 139 RBI’s).

        • That’s an extreme generalization that intuitively fails. A team with the nine best RBI men must necessarily be the highest scoring team (provided another team isn’t scoring a lot of runs on errors or getting absurd production from the bench). If you really want to take the top-9 OBP guys, I’ll take the top RBI men every single time.

          • It’s really not a generalization; it was the first thing instituted in MLB by Sabermetrics. If you can get on base more than your opponent, you have more opportunity to score runs. Not only because you have more runners in scoring position, but because you have more hitters at the plate. Sure, you’d want 9 Barry Bonds up there who can get on base and hit homeruns.

            But the point is that getting on base is much more important than slugging, which is the fundamental flaw in OPS. Aaron Gleeman found that OBP was 80% more valuable.

            RBI’s are often an indication of who’s on base in front of you, and then how much you slug with runners on base. In fact, the implication of RBI is really what a hitter does with RISP, or a clutch player. Again, there is no such thing as a clutch hitter. I just don’t see a single reason why RBI is significant.
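The OBP-versus-SLG weighting argument can be sketched in a few lines of Python. This is a rough illustration, not a published formula: it assumes that "80% more valuable" translates to multiplying OBP by 1.8 before adding SLG, and the function names are made up for the example. The inputs are the two Cabrera slash lines quoted earlier in this thread.

```python
def ops(obp, slg):
    """Plain OPS: on-base percentage plus slugging, weighted equally."""
    return obp + slg


def weighted_ops(obp, slg, obp_weight=1.8):
    """OPS variant that counts OBP more heavily.

    obp_weight=1.8 is an assumption based on the "80% more valuable"
    claim in the comment above, not an official statistic.
    """
    return obp_weight * obp + slg


# (OBP, SLG) pairs taken from the slash lines quoted in this thread.
cabrera_2011 = (0.448, 0.586)
cabrera_2012 = (0.393, 0.606)

for label, (obp, slg) in [("2011", cabrera_2011), ("2012", cabrera_2012)]:
    print(f"{label}: OPS {ops(obp, slg):.3f}, weighted {weighted_ops(obp, slg):.3f}")
```

Under either weighting the 2011 line comes out ahead, and the gap widens once OBP is weighted more heavily, despite the 2012 season having 14 more home runs.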

          • 2 outs, man on second. Who would you rather have up? Nick Swisher or Ichiro Suzuki? I’d rather have Ichiro because he is more likely to do something that results in a run, i.e. get a hit. I say the same thing if it’s 2 outs and a man on third. This is where average becomes more important than OBP. You can’t just keep passing the buck to the next guy. Eventually someone has to get a hit to bring in the runs. OBP is great, but not when it’s backed up by a low average.

            Again, the Yankees’ emphasis on OBP is the reason they can never hit in the playoffs. Funny how guys like Jeter and Ichiro, who are considered free swingers compared to the rest of the Yankees lineup, are both known to be awesome in the playoffs against pitchers who challenge hitters.

          • Sabermetrics instituted things in MLB? You’re starting to lose me.

            How can you possibly say your statement wasn’t an exaggeration when by definition having the top-9 RBI men WOULD make you the highest scoring team? You are taking a valid point to the extreme and obscuring it with an inaccurate exaggeration.

            You are trying to tie neat bows around complex concepts, and they just don’t fit. Does OBP usually correlate to run production better than SLG? Yes, but that doesn’t mean SLG is irrelevant. It also doesn’t mean that, in some environments, the relationship isn’t very different. Every stat has context, and OBP is no different.

            The “clutch players don’t exist” is another exaggerated meme. Clutch can exist (that’s what leverage-based performance is all about), but it’s more a description of performance than a characterization of player. Having said that, there is no reason why a player can’t consistently outperform both himself and the league in high leverage situations. I would define that as being a clutch player.

  5. I think maybe EJ is a bit pissed that Trout will NOT be the AL MVP, and blames this on the overvaluing of ‘certain’ stats. Obviously BA, RBI and HR, as stats, have value. The problem, as EJ pointed out, is that they are not telling enough to use in determining the MVP.

    Obviously Miggy had a killer year. Everybody knows that.

    But Trout just had a better one. A lot better in ways, if you consider his position… and don’t forget that Defense and Baserunning are an important part of the game.

    • I think the important thing to remember is that by any objective standard, Trout was a better hitter than Cabrera this year. That’s before defense, baserunning, or position. The Triple Crown stats make it seem that Cabrera was a better hitter, but he clearly was not.

      Cabrera had a great year, but the Triple Crown stats mislead you into thinking he had his best year ever. Cabrera was actually a much better hitter from 2010-2011 than he was in 2012. His 2011 season (when he hit 30 home runs and 105 RBIs) was much better offensively than his 2012 season (when he hit 44 home runs and 139 RBIs).

      • By “any” objective measure Trout was a better hitter than Cabrera? Is wOBA objective? Cabrera’s rate of .417 was better than Trout’s .409.

        Also, while you can argue Cabrera was a better hitter in 2010 and 2011, it wasn’t by much.

        I would vote for Trout over Cabrera as well, but that doesn’t warrant exaggeration of the difference.

    • Let’s not forget the fact that Trout is fantastic on defence. Even if the two are largely a wash offensively, or if you want to argue that Cabrera is marginally better on offence, you cannot deny the fact that the two are miles apart on defence.

  6. What I find important to value, is the ability to create runs and the ability to take them away. HR’s, RBI’s, and BA do nothing to indicate this value. BA is dependent on fielding, luck, and doesn’t take into account your ability to draw walks. RBI’s are dependent on your teammates, as well as luck in those opportunities. HR’s are the least contingent on other factors, as it’s mostly just ballpark dimensions, but also it doesn’t tell you the full story.

    Curtis Granderson’s 2012 is the perfect example. Although he hit 43 homeruns, you’d much rather have a player with a .400 OBP. Nick Swisher was more productive simply because he was on base more.

    You might say homeruns matter because they drive in runs. What is more important than that is not making outs. If you don’t get on base, it means you’re ending the game quicker, you’re taking the bat out of other players’ hands, and you’re thus creating fewer runs. Homeruns are great, but it’s not a telling stat for a player’s value.

    I agree with you on Trout of course, and that MVP does not mean most valuable hitter. At the same time, I agree with EJ 100%. These are stats that can’t be in discussion if we’re talking about the best player, and it’s not even something I’d bring up when trying to pick the most valuable hitter.

    We’ve come much further in value than OPS, but it’s at least a stat that’s somewhat talked about generally. That’s a better way to pick the best hitter, but if we have free range to create our own sabermetric triple crown, I am going with wOBA/ISO/UZR.

    wOBA gives you a much better indication of your ability to get on base, and values hits much more fairly than BA. ISO is a slightly better stat than slugging; it simply takes singles out of the equation. UZR has its flaws, but it’s the best fielding stat we have, and it’s far more reliable than RBI’s or BA. I couldn’t imagine finding a player that ever led a baseball season in these three categories though.
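For readers unfamiliar with ISO, here is a small Python sketch of why it "takes singles out of the equation". Isolated power is slugging minus batting average, which works out to extra bases per at bat. The helper names and the stat line below are hypothetical, used only to show that the two ways of computing it agree.

```python
def iso_from_rates(slg, avg):
    """Isolated power as slugging percentage minus batting average."""
    return slg - avg


def iso_from_counts(doubles, triples, home_runs, at_bats):
    """Same quantity from raw counts: extra bases per at bat.

    A single adds one base to both AVG and SLG, so it cancels out;
    only the bases beyond first survive the subtraction.
    """
    return (doubles + 2 * triples + 3 * home_runs) / at_bats


# Hypothetical stat line (an assumption for illustration only).
singles, doubles, triples, home_runs, at_bats = 100, 30, 5, 25, 550

hits = singles + doubles + triples + home_runs
total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
avg = hits / at_bats
slg = total_bases / at_bats

print(f"AVG {avg:.3f}, SLG {slg:.3f}")
print(f"ISO from rates:  {iso_from_rates(slg, avg):.3f}")
print(f"ISO from counts: {iso_from_counts(doubles, triples, home_runs, at_bats):.3f}")
```

Adding or removing singles from this line changes AVG and SLG by the same number of bases, so ISO moves only because the at-bat total changes, not because of the singles themselves.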

    • The sabermetric triple crown… ugh, I just vomited in my mouth.

      I love how people always knock average because of the “unaccounted for variables,” but sabermetricians conveniently leave out the fact that UZR does not consider whether a batted ball was a line drive or a fly ball. In reality this makes a huge difference on whether someone can get to a ball or not. No stat is perfect. They all contribute a little bit to the big picture. Miguel Cabrera was the most feared hitter in the league this year for a reason, and took his team to the World Series. Mike Trout was on a team that didn’t make the playoffs. There are some things that stats can’t measure as well. The bottom line is your wOBA, ISO, and UZR are just as flawed as average, RBIs, and Homeruns. None of them are effective on their own, and every stat needs the aid of MANY other stats to tell the whole story about a player.

      • Mike Trout was on a team that didn’t make the playoffs.

        but that won more games than Cabrera’s team in a tougher division.

        Look, I think the argument about the MVP Award is a subjective (and frankly silly) thing. If the biggest problem you have is that you have either Cabrera or Trout on your team but not the other you’re way, way ahead of the game.

    • Really….what a bunch of over analysis…and nonsense.
      Now getting a hit and your batting average going up doesn’t have value because it could have been caught?
      Absurd. Some of you cannot see the forest for the trees.
      Runs do not have enough value because someone else….drove you in?
      The RBI does not have enough value because it was dependent on someone else getting on base and that swing with the bat was based on luck?
      It really gets disturbing how far some people take sabermetrics…Getting a new hat doesn’t mean the old hat loses all value. Really.

  7. Conceivably, you can bat .000, go the whole season without a hit, without scoring a run, without an RBI….but you could walk 120 times and steal 2nd each time you walked…so 120 stolen bases. You can be the best defensive player in MLB, making no errors and catch every ball hit to your position. But, you’d get my vote as MVP because I’m an angry, bitter young man who obviously does not like his father.

    • Brilliant, just brilliant. No one batting .000 is going to play enough to walk 120 times and even if he did why wouldn’t pitchers just heave the ball down the middle on every pitch until he proved he could, you know, hit it?

      I’ll give you a challenge. Find one player, just one, in all of baseball history that has even come within shouting distance of such ridiculous numbers.

      But, you’d get my vote as MVP because I’m an angry, bitter young man who obviously does not like his father.

      Wonderful sarcasm and belittlement of rational analysis by a guy who has demonstrated that he lacks the grade-school arithmetic skills required to calculate a simple percentage.

      • You are wrong, road rider…players batting .000 will get plate appearances when Bill James manages the team…you see, BA and RBIs are not important and they will not be kept as stats anymore, like the writer stated…so first of all, you won’t know he is batting .000…you will just be noting how many times he walks and gushing like a school girl at his stolen bases. Now you say they will just throw the ball down the middle? So? What’s your point? He’ll just foul those pitches off and eventually walk on 8-10 pitches, which will increase your love for him because he works the count so well.

  8. I’d just like to say that your commentary is spot on and honestly has me pumped up. Also, don’t let the illogical comments above get to you. Remember that people generally find change threatening and will lash out against anyone who challenges the status quo. I would just consider it a small taste of what Nate Silver and Bill James faced in the early days of sabermetrics. Keep up the good work.

  9. I find it amusing how you sabermetric guys think you are so relevant. You guys play your role in baseball management, but that is it. Outside of internet sites, no one cares; people just wanna enjoy the games.

  10. Got to take a mea culpa here. I just checked, and you’re right: Fangraphs lists Miguel Cabrera’s wOBA a little bit higher. It was actually the opposite when I wrote the MVP post a month ago. My guess is that some minor revision in ballpark adjustments happened. My bad.

  11. Absolutely fantastic piece of writing and logic. There is simply no valid logical reason to argue that RBI is a useful statistic when you look at player quality. As a scientist that uses statistics on a daily basis you come to realize that quantitative analyses must be accurate and predictive, otherwise you just have a pile of useless numbers.

    You can see this logic in some medical research. There are variables that we know are simply not predictive (say of something like the incidence of heart disease), but we think that they should be, so we rely on them and ignore the fact that, what we really want to know is whether we can predict whether someone will have a heart attack. We can measure all sorts of variables and pretend that they predict the likelihood of heart disease, or we can find methods that are truly predictive and rely on those. This is where RBI fails (and I don’t mean in predicting heart disease).

    • Baseball stats are not exclusively used to predict future outcomes. They are also employed to assess past performance. I think that’s where some people get confused. In the former, RBIs are not very telling, but in the latter, they can be indicative of underlying performance, as I illustrated with the Robinson Cano example.

      The argument isn’t whether RBIs are better than wOBA, but whether RBIs are useless.

  12. I just don’t get these comments. It’s like saying that economists or political pollsters are useless when we can have people simply speculate about what is likely to happen given some crude statistics. The usefulness of statistics is solely determined by their ability to predict some outcome.

    Baseball statistics are largely used because of historical precedent, not logic. Before computers we counted simple things and thought that these statistics might be useful. Then we invented the computer and realized that those things we were counting really were not useful pieces of information.

  13. Sorry to be unclear (or I should say, overly technical here). Prediction is not about the future when we use it in a statistical sense. It is a statement about model fitting – can we use some information to predict some outcome. Like can we use some data from batters to determine the likelihood that a team will score runs or win games. We fit the model using historic data and ask whether we can ‘predict’ from the model the historical outcomes.

    So a good model would be one that allows us to input pitching and hitting statistics and predict how many games a team will win, and then compare those numbers to how many games teams really won. We then say a model is ‘predictive’ if it can reconstruct what actually happened (and presumably, could be useful in assessing future value, but that is not the main goal necessarily).
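The fit-then-reconstruct idea described above can be sketched in plain Python. The numbers below are made up purely for illustration (they are not real team seasons), and the single-variable least-squares model of wins against run differential is an assumed stand-in for whatever richer model an analyst would actually use. The point is only the workflow: fit on historical data, "predict" those same historical outcomes, and measure how well the model reconstructs them.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(ys, predicted):
    """Share of the variance in ys that the predictions account for."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    return 1 - ss_resid / ss_total


# Made-up "historical" team seasons: run differential and wins.
run_diff = [-120, -60, -20, 0, 35, 80, 140]
wins = [68, 75, 79, 81, 85, 89, 96]

intercept, slope = fit_line(run_diff, wins)
predicted_wins = [intercept + slope * x for x in run_diff]
fit_quality = r_squared(wins, predicted_wins)

print(f"wins = {intercept:.1f} + {slope:.3f} * run_diff")
print(f"R^2 on the historical data: {fit_quality:.3f}")
```

A statistic is "predictive" in this sense when swapping it into the model yields a high R² on data that already happened; that is a separate question from forecasting next season.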

  14. Mr. Fagan, your comment that “There’s no point in re-litigating math.” is patronizing. No one posting here cares to “re-litigate math,” but rather to validate the toolset we use to understand and predict the outcomes of a sport we love.

    I appreciate William J’s posts, and particularly his reminder that “context matters.”

    I also appreciate Paco Dooley’s comment that “There are variables that we know are simply not predictive… but we think that they should be, so we rely on them…. We can measure all sorts of variables and pretend that they predict… or we can find methods that are truly predictive and rely on those.”

    We need measurements that both explain and elucidate our observations, and that contribute to the creation of a predictive model of what we might expect going forward.

  15. I agree with you that the introduction and innovation in sabermetrics have introduced stats that can tell you more about a player’s production, and that Trout was the best all-around ballplayer in 2012. But damage numbers (RBI, for instance) are still relevant because they are just that….damage numbers. 80 to 100+ RBI tells me this guy is doing damage, and I’ll take that any day. And though I can argue that the rarity of Triple Crown winners explains the weight of such a season, Miguel Cabrera deserves the MVP because of the fact that if Cabrera is not on the Tigers, the Tigers do not make it to the playoffs.

  16. I totally agree with what William J. is saying in this.
    EJ…you are going too far extreme into the saber world.
    Each number tells a story and has merit.
    Some numbers are better than others but they still tell part of the story.
    The other side is also…the history of baseball lives and breathes…the guys with the highest BA or the league leader in RBI, Runs scored.
    Who is the all time leader in BA? Cobb? Is what he did less meaningful that we know this?
    Who is the all time leader in Runs scored?
    Who is the all time leader of RBI’s?
    These are some of the best players in the history of the game…and now because we have gotten better at math equations we are supposed to throw all of that away?
    I say nonsense to that.
    While I agree there are stats which are better than others (I like looking at WAR and OPS+)…it doesn’t mean we should bury the historical stats we grew up on…just use them…in context.

  17. The lack of humility from sabermetricians is off-putting. The new statistics are very valuable, especially for management but also for diehard fans capable of understanding them. But the bulk of them are not within the reach of the casual fan who isn’t going to expend the effort to understand them, and it is on this fan and his revenue that the MLB’s vitality is grounded. Fans over the age of forty aren’t sitting around a stove or even a water cooler extolling the virtues of the great 1.000 OPS seasons of the past few years. And while runs and RBIs are reliant upon context, they do communicate useful data while being accessible to the layperson. Are there better stats? Sure, and I’d wager a guess that the bulk of MLB front offices, at least the successful ones, put a greater weight on them than they do the stats that appear on the back of baseball cards.

  18. I think the main problem with sabermetrics is PR. Most of the really good models used to evaluate and predict value (e.g. wOBA) require a basic understanding of statistics/regressions to fully understand what these statistics are. RBIs may explain some amount of variance in a given model. What has been discovered by sabermetricians is that other statistics explain the same variance and more while simultaneously reducing error. If you’re looking at a multiple regression involving several offensive statistics, the model can actually become more accurate when you remove certain independent variables (multicollinearity and heteroskedastic error problems). I haven’t run the numbers myself, but given how the sabermetric community has almost unanimously come to the conclusion that RBIs are a nearly useless statistic, my guess is that this is the result of lots of hypothesis tests that almost all say that RBIs aren’t statistically significant. That’s not to say we should stop recording RBIs as a statistic. It just means that we shouldn’t take RBIs into serious consideration during, say, an MVP debate. If everyone in the baseball world had a degree in statistics, there would be relatively little disagreement over the value of RBIs.

  19. I hear what you’re saying about scientific stats, but I come away from these new measures simply caring less about baseball. I mean, how strenuously can you put down “subjective thinking” before acknowledging that baseball is pointless and not a worthy subject for a mature person’s attention? That said, I’m glad I found your blog. As a Sox fan I hear a lot of simplistic ridiculing of the Yankees, so even though I disagree about this MVP issue I will check out your blog again for your serious analysis.