Statistics: We Do Not Have All Of The Information

Matt wrote an excellent post this morning about bringing statistics into the mainstream, and I think Chris began to follow through on that with his fascinating post on breaking down UZR. Both posts illustrated that fans now have more information at their fingertips than ever before, and that we can educate ourselves about the essentials of the game. However, an interview with Theo Epstein that I heard this morning on WEEI reminded me that, as fans, we still do not have all of the information:

I think that he (Ellsbury) is an above-average center fielder now, who is going to be a great center fielder. I know there is a certain number we don’t use that is accessible to people online that had him as one of the worst defensive center fielders in baseball last year. I don’t think it’s worth anything. I don’t think that number is legitimate. We do our own stuff and it showed that he is above average.

I think Theo is posturing a bit here, as every method available to fans for statistically evaluating defense had Jacoby as a poor center fielder in 2009. He is likely protecting his player and avoiding the talk radio firestorm that would ensue if he called Ellsbury a poor defender. That said, this did bring into focus the fact that clubs have proprietary systems for determining player value, so fans do not have the same information that the clubs do. While proprietary does not necessarily mean better, these clubs have been attempting to hire people at the cutting edge of the industry, so you would expect them to be at least slightly ahead of the field.

What does this mean? Put simply, it means that the numbers that we use as fans are imperfect, and should be utilized with that in mind. That does not mean that we should not use those numbers to craft our arguments, or that conclusions based on those numbers are faulty. Rather, when the numbers provide shades of grey, it is important to note that they are likely inexact and far from absolute. Furthermore, because the data is imperfect, subjective judgments and evaluations of players should have a place in the discussion. We can argue about how large that place should be, and I would say that it should be minor, but visual observation can occasionally pick up on nuance that is lost in the statistical breakdown.

I recently had the opportunity to talk to the GM of a team that uses sabermetrics extensively, and he told me that the gap between the information the clubs have and the information the fans have is rapidly closing. That said, the data that the clubs use is far from perfect in and of itself, and the information available to us is certainly no better. We need to be prudent in how we use these numbers, and be careful not to depend on them past their level of reliability. If we do, we become just as ignorant as those who choose to deride sabermetrics.

0 thoughts on “Statistics: We Do Not Have All Of The Information”

  1. Unfortunately, a plea for humility will fall on deaf ears among those who need to hear it most. They’re too busy being snarky, too busy trying so hard to be the smartest guy in the chat room, and too busy decrying the old school types for being dismissive, which they then decide to remedy by being dismissive themselves.

    Sabermetrics and old school types have become just another lame, boring partisan debate. Like Republicans and Democrats, Conservatives and Liberals, Keynesians and supply siders, each with their half-truths and enormous blind spots.

    • I think there are some people trying to bridge that gap. For example, Tom Tango is very good at trying to explain the advanced metrics to novices (see the Mike Silva chronicles), and I think there are some others who are trying to allow for the human element as well. I will agree that the debate has become a bit tedious.

  2. Oh God why were you listening to WEEI???

    I know that, personally, I can be really, really snarky. That’s partly because people who aren’t statistically-oriented can be incredibly dismissive (see Murray Chass and whoever came up with that term “VORPies”, which… might have been Murray Chass), and partly because FJM was a MAJOR part of my baseball education, so to speak. Plus, it’s over the internet and all. It’s a lot harder to be snarky to, say, my mom in person when she starts going on about how Joba needs to go in the pen. And a while ago, I was arguing something Yankee-related and got a VERY curt answer and was kind of shocked by how rude it was, so… yeah. I just hope I’ve never been that bad. We could all be more polite.

    That said, it’s important to recognize that while most stats are flawed in some way, some stats are a lot more flawed than others. So I guess I try to convey that in as nice a way as possible.

    • Bingo. The best way to convince people is by making your case in a way that is thoughtful and respectful. What really hits home is when you hear criticism from a friend, from someone who otherwise likes you. Not from someone who you know can’t stand you or is just trying to impress his buddies.