Enjoyment and Analysis

Now, responding to the second question, RBIs do indicate a positive contribution from the batter. To get the runner in, the batter must have done something (a hit, sacrifice fly, groundout, walk, HBP, etc.), and that led to a run for the team, which by definition is a positive contribution to the team. Considering that runs are the ultimate way in which a team wins, it’s understandable that we are attached to this statistic. The problem, however, is how we assign credit. Subconsciously or maybe consciously, we give the batter all the credit for the run (that’s why it’s its own separate statistic), but unless the batter hits a home run, there had to be another party involved. While we realize that, it doesn’t help that we’ve always preferred the “run-producers” to the “table-setters”. The phrase “run-producer” itself gives RBI more power because the term is “active”: the person is making something. “Table-setter”, on the other hand, is a “passive” term: while the person is technically acting, the audience knows there is another action, probably more important, about to happen. But we know the guys on base are actively involved in that run scoring. So while RBIs do indicate a positive offensive contribution by the batter, they also indicate a few other things, such as the work of the runners on base, and those other things are not given credit. Simply put, there are statistics that are much better at measuring a hitter than RBI. Go ahead and use RBI, but use it knowing it’s incomplete and needs supplemental information.

Okay, but what about “sentimental value”? Before I agree, what does “sentimental value” mean? Is it nostalgically valuable in an “Oh, I remember when …” kind of way? Or is it an “It was better in my day when …” kind of way? If you like the first sense, then we’re on the same page. RBIs have significant historical value. You can use them to talk about the invention of box scores, machismo in baseball, the history of statistics in baseball, and the beginning of statistical analysis in baseball, and they will be essential when historians remember this point in baseball history. There’s quite a bit of value in RBIs, and they should never be forgotten.

Another argument is how fans just want to enjoy the game. That’s fine on the surface. People enjoy the game of baseball in various ways, and pretty much all of them are valid. Enjoy the game in whatever way you most enjoy, especially if it’s what drew you to the game. Watch the game in person, only on Sundays, only on TV, a mixture, on MLB.tv, in box scores, playing Strat-o-Matic. I don’t care how you do it, and no one else should either. How exactly RBIs play into that, I’m not sure, but I’m willing to concede that point. If you want RBIs to help you enjoy the game, be my guest.

But be careful when it comes to analysis. Enjoying the game is one thing. You can love watching David Eckstein play, and you can love him because he’s gritty, a hard worker, and short. You can call him your favorite player. You can even call him a good player, but you’d have to be careful how you say that (good for baseball, good for kids, etc.). But saying he’s one of the best second basemen in the league or that he helped the Padres get near the playoffs is wrong. Saying Ryan Howard is one of the best first basemen in the league because he knocks in a lot of runs is wrong. You can love Ryan Howard. You can enjoy watching him hit majestic home runs. He can be your favorite player. But he isn’t one of the best players in baseball, from a performance standpoint.

And here’s where the linguistic disconnect comes in, and it’s the reason we tell our kids to learn new words. Words like “bad” and “good” are vague. Even “valuable” is vague. Well, what do you mean? Are we talking valuable to baseball itself as an ambassador, star attraction, and/or community member? Or are we talking adding wins to a team? Or is it some mixture of both? Sometimes, we just aren’t precise enough with what we’re saying. So we end up arguing over “good” when we’re using different definitions. And we do this with analysis and enjoyment, though the words aren’t as vague.

When analysts criticize a player’s worth, it causes a negative reaction in the fan. In this instance, the fan is confusing their enjoyment of a player with an analysis of the player. One can get significant enjoyment from watching Howard hit while also realizing that he is not a particularly good hitter. It sounds paradoxical, but it’s not. Analysis is directed toward some goal. In this instance, the goal is to figure out what actually contributes to a winning baseball team. In a way, there are certain universal truths to what contributes to a winning team. When it comes to enjoyment, the goals are varied and sometimes not even conscious, but there are no truths, at least no universal ones anyway. Sometimes, those goals overlap, and sometimes they do not. We, in general, are terrible about parsing these details, but it’s essential in this discussion.

Going back to my comment on Chipper, I failed. The comment was meant as an analysis … actually let’s just say its aim was ambiguous. To begin the tweet, I was looking at RBIs from an analytical perspective, but then I compared it to an emotion that pertains to enjoyment. A high number of RBIs tells me a few things: that a hitter was probably hitting in the middle of the order, that he did this for a while, and that he must have been pretty good for a while. It does not tell me how good he was, but it’s a fair indication that he was good. But my tweet failed. I took an analytical concept and applied it to an emotional situation. Congratulating Chipper on his accomplishment wasn’t telling him that he was the best third baseman, switch-hitter, or run-producer ever. It was simply saying, “Congratulations on a long, well-played career.” Here, I failed to distinguish between analysis and enjoyment. Was the accomplishment completely arbitrary? Probably, but that doesn’t mean it wasn’t enjoyable. Jacob Peterson, a writer for Talking Chop (follow him, too, @junkstats after you follow me @Mark_L_Smith … okay, I’m done with my shameless self-promotion; Chip’s actually going to think my ego’s run amok) tweeted, “I don’t like RBIs, but 1,500 RBIs still seems a lot cooler than, say ‘80 career bWAR’. Can’t celebrate a WAR milestone b/c it can go down.”

9 thoughts on “Enjoyment and Analysis”

  1. I feel so bad for David Eckstein. He was actually worth 2 fWAR last year and above average defensively. When he's been used right, he's been a pretty useful role player for good teams. But the cult of personality that wants to make him so much more than that has turned him into a punchline. That kind of sucks, really.

    • His wife's pretty awesome too with her sci-fi work and clothing line! She's also super-nice in person. :)

    • I'm not sold on the defense. The metrics generally had him as a below-average guy for the years leading up to last season, when he was all of a sudden +5 or 6. Seems kinda weird. If you make the adjustment, he's a below-average regular. B-Ref only had him at 1.3 bWAR. No, he's not awful, but he's not exactly a player you need to have either.

      But I see your point. In a similar vein, Howard gets killed because of the contract and expectations. He's not the best guy in the world, but he's definitely a good first baseman.

  2. It depends on the sample sizes you're looking at. If you say Player A is better because he's knocked in 39% of RISP and Player B has only knocked in 25%, you need to be careful. 600 PA over an entire season isn't an incredibly large sample size, and when you start taking away from that, you make the sample size smaller. Over an entire career, you might be able to make an argument, but from the research I've seen, players generally, and not surprisingly, do a little bit better with RISP than without. That's not surprising because pitchers pitch worse out of the stretch. Also, a percentage doesn't take walks into consideration. If a guy puts the ball in play a lot (Jose Guillen), he may end up having a higher percentage because he never walks, whereas a guy who walks (Adam Dunn) may have a smaller percentage because he walks instead of hitting. A walk doesn't necessarily help in that instant, but it can help lead to a bigger inning, making it still very useful.
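
    The sample-size caution above can be put in rough numbers. The sketch below uses the textbook normal-approximation margin of error for a proportion. The opportunity counts (150, 600, 6000) are made-up illustrations, not figures from the comment, and the function name is hypothetical.

```python
import math

def margin_of_error(rate, opportunities, z=1.96):
    """Approximate 95% margin of error for an observed rate (normal approximation)."""
    return z * math.sqrt(rate * (1 - rate) / opportunities)

# Hypothetical opportunity counts: a partial-season split, a full season, a career.
for n in (150, 600, 6000):
    for rate in (0.39, 0.25):
        moe = margin_of_error(rate, n)
        print(f"n={n:5d}  rate={rate:.2f}  +/- {moe:.3f}")
```

    Under these assumptions, a rate measured over 150 opportunities carries a margin of roughly eight percentage points, and it takes four times as many opportunities to cut that margin in half.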

  3. Thanks for the shout-out.

    This was a very interesting read… I think the problem is that milestones are inherently emotional. There's nothing about a round number that makes it much different from the numbers just above or below it, yet those are the ones that attract our attention. So since we're starting with an emotional concept, it doesn't even make sense to try to use an analytical measure for the milestone. What is strange is how we choose which numbers to cling to (like RBI and hits) and which to ignore (like times on base, for instance, which is much more important).

    And then there's the fact that the most important measures of a player's quality are nearly always rate stats–which can and often do decline in the last phase of a player's career. The ones that aren't strictly rate-based, like WAR, can often go down as well (as I mentioned in my tweet). Because of this, if your purpose is to evaluate a player's overall career, then you really have to wait for it to be over.

  4. Mark, I think this area is easier to understand if you think of the reasons WHY we use statistics. I can think of three broad reasons, all related, all of which overlap to a degree.

    Take the bottom of the tenth of last night's game as an example. Teix walks, A-Rod doubles Teix to third, Cano lines out to short, and Swisher scores Teix with a sacrifice fly to right. How do we describe that half-inning statistically?

    There are the narrative statistics, which not surprisingly are the old-line statistics. They're the ones tied into the lede: "Nick Swisher hit a sacrifice fly to right field, scoring Mark Teixeira in the bottom of the tenth inning, leading the Yankees to a 6-5 comeback victory over the Orioles." The traditional stats are embedded in that lede: Teix gets a run scored, Swisher an RBI. That's pretty much the way we'd describe the game in one sentence, if we wanted to tell what happened to someone else. The run and the RBI are both meaningful to our description of what happened during the game.

    A second set of statistics is value statistics. We want to know, who are the most productive players on the Yankees, the ones that do the most to produce runs? Well, at the start of the inning, the Yankees had a 28% chance to score a run. After Teix's walk, the odds went to 42%. With A-Rod's double, the odds went to 86%. Cano lined out, and the odds dropped to 67%. Swisher's sacrifice fly, of course, raised the odds to 100%. So from a value perspective, we clearly see that the most valuable event in the 10th inning was A-Rod's double, a hit that was not captured by the narrative statistics we described earlier. We'd expect that A-Rod's double would be counted higher than anything else in the bottom of the 10th when it comes to value statistics like WAR or wOBA, and I'm sure we'd be right.
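
    The arithmetic behind that comparison can be sketched in a few lines. This takes the odds quoted in the paragraph at face value and simply credits each event with the change it produced. The variable and function names are illustrative, and this is not how WAR or wOBA is actually computed.

```python
# Run-scoring odds (in percent) as quoted in the comment, in batting order.
START_ODDS = 28
events = [
    ("Teixeira walk", 42),
    ("Rodriguez double", 86),
    ("Cano lineout", 67),
    ("Swisher sacrifice fly", 100),
]

def odds_changes(start, events):
    """Return (label, change in percentage points) for each event."""
    changes = []
    previous = start
    for label, odds_after in events:
        changes.append((label, odds_after - previous))
        previous = odds_after
    return changes

changes = odds_changes(START_ODDS, events)
for label, delta in changes:
    print(f"{label}: {delta:+d} points")

# The event that raised the odds the most.
most_valuable = max(changes, key=lambda item: item[1])
print("Largest swing:", most_valuable[0])
```

    On these numbers the double is worth 44 points, the sacrifice fly 33, the walk 14, and the lineout costs 19, which matches the ranking described in the paragraph.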

    But if we had only a sentence to describe what happened in the game, we might not even mention A-Rod's double.

    There's a third set of statistics, the ones that interest me most, which are predictive statistics. What I want to know most of all is who is going to play well tomorrow. These are the statistics I need to know who the Yankees should seek in a trade, and who is going to be a productive free agent. When we need predictive statistics, some instinctively reach for narrative statistics, and those of us who think we're really smart reach for value statistics. But by necessity, narrative statistics and value statistics look to the past, and we reach for these statistics thinking that the past will repeat into the future. Not so.

    It turns out that with baseball, the past does a reasonably poor job of predicting the future. Past basketball statistics predict the future roughly twice as well as do baseball statistics. Also within baseball, hitting statistics are more predictive than pitching statistics. So I find the most interesting work to be that which is trying to improve our statistical ability to predict the baseball future: stats like BABiP, and line drive percentages, and aging curves, and different aging curves based on body types.

    This is worth mentioning because I finally get to re-mention that Cano lined out in the 10th inning. That line drive is interesting, because line drives turn into hits more often than any other way that a player can strike a ball with a bat. So long as Cano continues to hit line drives at a high rate, he's going to be OK. Small sample size to one side, if you want to look at the bottom of the 10th and bet who's going to perform well tonight, Cano would be a good bet. But remember, Cano never entered our narrative, and his line drive had a negative value in the bottom of the 10th.

    Methinks that we tend to fall in love with statistics for their own sake, and forget why the statistics are there. The RBI is still an incredibly useful statistic, in small part because it reflects value in a flawed way, and in large part because it helps us tell the story of the game. Ditto any other statistic we might mention. The statistic is there because it has value; its proper use is up to us.

  5. RBI's are so tangible – like last night, Swish hits the sac fly for the game-winning RBI and gets pied, but shouldn't Tex and Arod get pied for getting on base and moving the runner to third? Which play gets shown on Sportscenter and the local sports news?

    Regardless of how we use RBI's to measure worth analytically, we value them highly based on our predisposition as humans to place more weight on results than process.