“Squirrelly” Defense

Here, the announcer has identified the situation (measuring defense with errors) and the problem (errors don’t accurately portray defense), but he won’t take that next step (what do we do about this? why is this significant?). This is significant because the problem causes fans to misidentify good fielders. What we do about this is find a better way to measure defense, and what do you know, we have a few in UZR, Dewan’s +/-, and Defensive Efficiency (among others). One might argue that the man in question has never been exposed to these new statistics, but he worked with Boog Sciambi for the past few years. If you don’t know who Sciambi is, he is a wonderful broadcaster who tried to integrate new statistics into a certain team’s telecast, and he did so effectively. The announcer knew about these statistics, but he has apparently ignored them and put them out of his mind.

I’m probably preaching to the choir a little bit here, but I will get to the point in a minute. The announcer does a good job of identifying the problems with errors (weird bounces that aren’t a player’s fault, the blurry line between hits and errors, scorer’s judgment, etc.). The next step is to find a solution. There’s more to defense than errors. It’s more about making plays on balls in play. If Player A and Player B are given 100 identical plays, you want the guy who makes the play most often. The number of errors doesn’t matter because errors are simply counted within the number of plays not made. Let’s say Player A makes 75 of those plays and has 6 errors, and Player B makes 70 plays and has no errors. Which player do you want at that position (assuming they provide equal offensive value)? Player A, because he makes more plays (converts more balls in play into outs) than Player B. The errors do not take away from the number of plays made because, as I said, they simply count as plays not made.
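
To make that comparison concrete, here is a small Python sketch of the arithmetic using the numbers from the example. The `Fielder` class and its field names are made up for illustration; the only point is that an error is already one of the plays not made, so it never subtracts anything further from plays made.

```python
from dataclasses import dataclass


@dataclass
class Fielder:
    """Hypothetical record of one fielder's results on a set of chances."""
    name: str
    chances: int      # balls in play hit to the fielder
    plays_made: int   # chances converted into outs
    errors: int       # a subset of the plays NOT made

    @property
    def plays_not_made(self) -> int:
        # Errors are already inside this number; they are not subtracted again.
        return self.chances - self.plays_made

    @property
    def conversion_rate(self) -> float:
        return self.plays_made / self.chances


# Numbers taken from the Player A / Player B example in the text.
player_a = Fielder("Player A", chances=100, plays_made=75, errors=6)
player_b = Fielder("Player B", chances=100, plays_made=70, errors=0)

# Rank by how often the play gets made, ignoring the error column entirely.
better = max((player_a, player_b), key=lambda f: f.conversion_rate)

for f in (player_a, player_b):
    print(f"{f.name}: {f.plays_made}/{f.chances} made, "
          f"{f.plays_not_made} not made ({f.errors} of them errors)")
print("Prefer:", better.name)
```

Ranking by error count alone would pick Player B, even though he leaves five more balls in play unconverted.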

Of course, the world doesn’t work so neatly for us, but that’s why we have UZR, +/-, etc. They go through and try to track the number of plays made vs. the ones that should have been made or would have been made by an average defender. We can argue over which one to use or if they are perfect, but one cannot argue that errors evaluate defense better than any of them (I guess you can argue that, but I don’t think it’s an argument you’re likely to win). Stats people smarter than I will continue to work out the kinks on the new metrics (Field f/x is pretty exciting), but the newer statistics include errors and go further.
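
As a rough illustration of that idea (this is not the actual UZR or +/- calculation, which this post doesn’t spell out), the Python sketch below compares a fielder’s plays made with what an average defender would be expected to make on the same chances. The difficulty buckets, the league-average rates, and the fielder’s numbers are all invented for the example.

```python
# Hypothetical league-average conversion rates by difficulty bucket (assumed values).
LEAGUE_AVG_RATE = {"routine": 0.95, "medium": 0.60, "hard": 0.20}


def plays_above_average(chances: dict, made: dict) -> float:
    """Plays made minus the plays an average defender would make on the same chances."""
    expected = sum(chances[bucket] * LEAGUE_AVG_RATE[bucket] for bucket in chances)
    actual = sum(made[bucket] for bucket in chances)
    return actual - expected


# Invented season line for one shortstop: chances seen and plays made per bucket.
chances = {"routine": 60, "medium": 30, "hard": 10}
made = {"routine": 56, "medium": 21, "hard": 4}

delta = plays_above_average(chances, made)
print(f"Plays above average: {delta:+.1f}")
```

Notice that the error column never appears: a booted routine grounder just shows up as a routine chance that wasn’t converted.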

Anyway, again, I might be preaching to the choir, so let’s get on with it. In the battle between sabermetricians and traditionalists (and it is a battle, with words, often insulting ones, being flung instead of bombs), I believe a common misconception is that traditionalists don’t understand what’s wrong with regular statistics and that they’re ignorant. I actually think they do see the problems. Fans, announcers, reporters, etc. continually contextualize those stats (errors that didn’t deserve to be errors, line drives that don’t fall in but hurt batting average, bloop hits that score runs and hurt a pitcher’s ERA, etc.), and they then quasi-scientifically compensate (“Well, I’m not sure that was an error, so we’ll just ignore that one”). The problem with that is that it’s highly subjective (what is an error?), it overlooks a few things (would someone else have made that play more easily, for example), and that mental note is often forgotten (“He made 20 errors! He’s horrible,” while forgetting that 4 or 5 of those were “questionable”), which then equates that semi-error with all the other legitimate errors. People try to do the mental math, but the actual math has already been done for them.

And please do not take this as an attack if you happen to be a “traditionalist” (I hate labels like this because it directly associates a person with a group to which they may have little to no allegiance). I am genuinely curious. What causes this disconnect? What is the part of the bridge that hasn’t been built yet? Because I think the problem is more social science than strictly scientific. I think people do understand the logic behind newer statistics, but they still refuse them. Is it fear that they will show your favorite team/player isn’t what they seem? Is it the reluctance to admit you were wrong in the past? Is it the source of the new statistical movement (a man I very much respect has some issues coming to the “Dark Side” because he finds it difficult to get past Keith Law’s prickly nature)? Is it a lack of exposure?

What causes someone to ignore the answer that seemingly stares them in the face? What causes someone to not realize that they’re even asking the question?

18 thoughts on ““Squirrelly” Defense”

  1. Mark,

    Your last comments require HIM to explain something rather than making statements.  Firstly HIM "gets" the newer statistics…Him also gets the traditionalists who don't or refuse to…but HIM and others like HIM prefer not to rely on statistics at all.  Some of us (and I understand that we are in the minority) prefer to watch the game, not keep score, not read any statistics (new or old) before or after the game.  We rely strictly on feel and gut to create our own opinions on the abilities of the players, team staffs, league officers and umpires.  The only statistics we look for are the number of games up or down we are. 

    Statistics are only for those who care for them and we don't.  We are not wrong or right, smart or dumb, young or old… we just want to watch and evaluate it on our own.

    It takes no statistics to know how good Mauer is or Williams (#9) was.  The old statistics are for HOF lovers…and longevity and outside influences, we all know, have diluted the HOFers. The new statistics are more relative and I am sure a necessity for all those in fantasy leagues.

    Yogi has 11 rings with the philosophy, "SEE THE BALL, HIT THE BALL".  No statistics were needed to understand what made him and Campanella and Dickey special.  Mantle was an A-Hole but he could do everything.  Mays could also do it all and he told you he was the best.  "Any time I pound my glove before I get to the ball, I got it", Mays (even in his famous catch he pounded his glove). 

    Jeter will play SS when he is 40, Cano will hit .360…not on statistics, just feel and gut…HIM knows.

  2. HIM, that is a valid way to look at the game. I'm not advocating that everyone has to know everything about every statistic, and like I said, I'm not attacking anyone. I don't even keep score in a game anymore because I prefer to actually watch the game. I'd score a play and miss the next, which isn't the point of going to a game.


    But here's my argument. The announcer, journalist, reporter, blogger, etc. is there to analyze the game. They're writing/talking about the game and trying to explain what is going on, and they make judgments on the performance. In order to make a proper evaluation, you need evidence. When they fail to see the problems with their analysis, they project those problems onto the audience, and a cycle develops. The audience believes the announcer because there's a power structure in place where the fan (especially younger ones) sees the announcer/journalist as an authority figure (he's getting paid to do this, so he must know what's going on). If he can't analyze the situation correctly, it diminishes the chance that future generations will be able to do so.


    And as a fan, I think it's very helpful to know this. If you are simply a fan of baseball, then I wouldn't worry too much about stats. There's something wonderful about simply going to a game and appreciating what it takes to play the game. However, most of the people reading this site are Yankee fans. They critique moves that the manager, player, and GM make. How does it help to talk about how bad the defense is because it made ## of errors even when it was able to turn more balls in play into outs than any other team? How does it help to perpetuate something that is wrong?


    As for Yogi, what made him so great was that when he saw and hit the ball, he was better coordinated and athletic than most other people. And the philosophy has an underlying truth to it. But there was much more to those great Yankee teams than simply seeing the ball and hitting it. Remember, a player can see the ball and hit it, but he might not square it up.

  3. Mark, I Am Not A Sabermetrician But (hereinafter and for all time abbreviated as "IANASB") … the stuff I've read about measures like UZR says that you need to consider these statistics over a long period, say three years, before the measures can be relied upon to tell you whether this person or that person plays good defense.  That's all well and good.  IANASB I understand the importance of sample size.

    Then we have the question of when the shortstop should be given an error on a given play by the official scorer.  There is no sample size involved; it is an analysis made on a single play.  IANASB I don't know of any way to make this judgment except the old-fashioned way.  At which point, some old-timer can make a judgment whether the play should or should not have been judged an error, and how close the decision might have been.

    I beg to differ that errors are not important.  Until the Sabermetricians come up with a better way to analyze a single play, the error is all we have.  UZR is all well and good, but not all of us can wait three years before concluding that the shortstop booted that ground ball!

  4. I like the IANASB. Well done, and much better than "traditionalist". There are so many loaded connotations with that word.


    On UZR, it is best to use a three-year sample for individual players. Combined for the entire team, it is fine to use for one season as there are many more plays to work with. Regardless, Defensive Efficiency can do a decent job for an entire team as well.


    Here's the kicker on errors. You've been taught your entire life that errors are important or at least that they are needed to understand defense. What I would like you (and others) to do is to try to put that aside for a moment. Is identifying an error actually important? This might be the subject for another post, but bear with me real quick. I would argue for ignoring errors altogether. What's important is whether you make the play or not. An error simply counts as a play not made. There's nothing terribly special about an error. It's the result of luck (the pitch and where the hitter happens to hit it) and lack of skill or concentration (the "boot"). Ultimately, there's little difference between an error and a play not made. One can still say the player should have made that particular play, but one particular play is not extremely important in one game or over the course of a season. You've always been taught to pay attention to errors, but that doesn't mean you should. I'd still keep the statistic, but it's relegated to the level of RBI.

  5. Wow.  An errorless game!  You have to like that.

    I was trying to figure it out, and I'm too lazy tonight to go look it up, but aren't errors still factored into all of the new measures of hitting?  If the batter reaches base and the scorer rules it an error, then doesn't that affect OBP, and OPS, and OPS+, and WAR, etc., etc? 

    I agree with much of what you are saying.  But you may go too far if you create a strict equivalence for all plays not made.  I understand that you're implicitly including the concept of a "zone" in your rule (i.e., a double down the first base line is not a "play not made" by the shortstop).   But a "play not made" on a routine grounder to short is not the equivalent of a "play not made" on a line drive up the middle or a sharp grounder into the hole between short and third.  Some plays are simply more difficult to make than others. 

    Yes, I understand that if your sample size is large enough (say, three years), then you can figure that every shortstop will have had the opportunity to play a similar sample of different difficulties of hit balls, and you can judge one shortstop against the other simply by the percentage of plays made by the shortstop regardless of difficulty (as you suggest, once you're into this large sample size, it doesn't matter whether one shortstop is making the easy plays and a different shortstop is making the difficult plays, so long as both are making the same percentage of plays).  But your approach doesn't work if you're judging a smaller sample, and it particularly doesn't work if you're looking at a single play. 

    If you're looking at a small sample size to measure defense, then you have to do something to compensate for the fact that the plays that player A tries to make may be more or less difficult than the opportunities to make plays given to player B.  The only way to judge the two players fairly is to assign a degree of difficulty to each play.  Unless the data available is a lot better than I think it possibly could be, the assignment of degree of difficulty has to be made subjectively.

    Right now, the "error" is our subjective measure of whether a particular play was or was not too difficult for the average player to be expected to make.  At the moment, it's the most accurate fielding statistic we have for a small sample size.  Or do you disagree?

  6. Mark,

    The difference is that the new statistics are a) new, and b) harder to understand.

    As for a), many people simply resist change.  It's been said that in science new ideas take hold not because they win over old scientists – new ideas win over young scientists, and the old scientists die. It is human nature to stick with and trust what is familiar, flaws and all (it's why so many of us stay married ;-P).

    As for difficulty to understand, most obviously there are the calculations themselves. We all get how AVG and ERA are calculated (though I'm not sure most of my students could), and count HRs, RBIs, SBs, etc. It's simple and easy to grasp, more so than wOBA or FiP or BABIP. It's not that they couldn't calculate these stats – it's that the effort exerted simply is not worth the payoff to most people (let alone the time to look up that formula, again!). Some people do get enjoyment from poring over stats – I'm one of them – but the vast majority don't.

    The old stats are also the common language of the fan, and broadcasters aren't there to educate the masses – they are there to communicate to them. The rewards for using Saber stats in the media are few and far between. Unless you are a person who enjoys the statistical minutiae, the added benefit of using the "better" stats simply isn't significant enough to make most fans care. 90%+ of the time, the old stats are fine. Even obvious atrocities like Win-Loss will, more often than not, give an idea of a pitcher's season.  Sure, if you're looking to sign Crawford or Werth in the offseason, you want to dig into these stats for the best analysis possible, but fans aren't doing the hiring. Mostly they just want to watch and enjoy the games, and use statistics to validate what they think they already know.

    In scientific circles, it's long been joked that new theories don't get accepted by persuading old scientists. They get accepted because the old scientists eventually die. I suspect something similar will happen in baseball (and we're seeing it to some degree). Over time, younger people will be brought up with these stats, and will be as comfortable with them as the traditional ones. More of them will get jobs as players, announcers, broadcasters, writers and this will further the spread.

  7. Mark,


    Him understands your points as well as Larry’s, HIM really does and if HIM was an “announcer, journalist, reporter, blogger, etc.”, Him would look to have as much ammo as possible.  But you started, “Let’s set the stage. I’m watching a baseball game…” and perhaps that is HIM’s problem, HIM wants to judge the game and the players and their comparative abilities in a much simpler and less sophisticated or cerebral fashion…more old school with gut passion and instant visual analysis.  HIM would like to see more historical comparisons by the announcers to utilize the sport’s great history, undoubtedly the greatest in all of SPORTS. 


    HIM and the other HIMS just love and appreciate the game at a different level…just a game.  The “announcer, journalist, reporter, blogger, etc.” will do their thing and win or lose discussions, but, HIM thinks too much of the pleasure is being lost on too much statistical analysis only.  How about adding more “he’s as fast as…as strong as…as smooth as…” from the “announcer, journalist, reporter, blogger, etc.”.  They are the experts; people will continue to listen or continue to read them.


    Besides, how do you evaluate or analyze Bo Jackson running up and down the centerfield wall, Jeter diving into the stands or …you guys get HIM’s point, if you didn’t you wouldn’t be here writing as well as you do and sparring with each other so much and enjoying it.  HIM just wants you guys to remember it is a game, the best there ever was and the best there will ever be.  These days the money (and we know that’s what it’s about) and the drugs and the attitudes from all sides are taking away from OUR game.  That pisses HIM off…and it should you as well.  Let Manny be Manny in Boston or LA, they deserve him (not HIM).  Find HIM 9 Paul O’Neills for the Yankees.  The other teams will catch on…no they won’t, IIATMS and only one team wins…and they will have done your analysis because it is relative and correct (over the long run) and it is too expensive to be wrong.


    Meanwhile HIM will be at the 3 games in Anaheim next week to enjoy HIM’s team up close, by HIMself, just watching.

  8. So many good questions. This will take a few comments.


    Let's start with Larry. UZR and the newer statistics do rely on some subjectivity (deciding between a flyball and a line drive, for instance). However, they do assign a degree of difficulty based on the zone and what type of batted ball it is. That's how credit is given or taken away. But the thing about errors is that they still have the same problem you address here. The error equates an easy ground ball right at the guy with a tougher one in the hole that he gets to but is unable to make the play on.


    In regard to small sample size, errors don't do a good job either. Look at the above example. The Braves are in the top 10 in some of the advanced defensive metrics, but they have made almost as many errors as they have played games. You would think that errors help in smaller samples, but they actually don't. You're used to thinking they do, but when you really think about it, it doesn't actually match up. The error count is simply misleading, and it doesn't really count all of the plays. I realize that UZR isn't perfect, but it is better than errors. And with Field f/x coming out, statistics could get extremely accurate very quickly.


    About adding in errors to OBP. How many times a year does a guy get on because of an error? I imagine it really doesn't happen all that often, and it won't affect their stats all that much. Even if a batter who hits in 500 PA gets on 5 times by error, it will only help his OBP by 10 points (I think, someone help me with that). It won't make enough of a difference to matter.

  9. misterd,


    I hear what you're saying, and I definitely think those things apply. But I'm not sure how much more difficult the newer statistics are to grasp. The calculations, sure, are difficult, but the logic behind them isn't. And how many people really go through and calculate ERA? And I realize the older statistics generally give an idea. Here's another situation. I'm a Braves fan, so I hear a lot about Derek Lowe's 15 wins last season despite his 4.67 ERA. The announcers see the disconnect between the two. He shouldn't have won so much. They, then, put it into context. But why? Why not simply realize that wins aren't that important? Then again, maybe it is as simple as people clinging to what they know. I just wish people wouldn't. It's not so hard to open your mind a bit more.

  10. HIM,


    If I was simply watching the game and had no commentary, I would be fine. I wouldn't have brought this up. However, most fans watch on TV, and they hear this poor analysis and believe it. Most of the time, we don't simply watch the game. We have opinions on what should happen. I just think, if you're going to do so, that you should be informed. There's no point in vehemently defending something that is utterly wrong.


    But I hear you. I really do. Statistics don't tell you how graceful a play was or how athletically difficult, and they can't put emotion into it. There is real value to such things. And I will never advocate using only stats. That's not my goal. My goal is that everyone learn the newer stats so that they become as intuitive as older ones. That way, when people talk about the game, it isn't as riddled with errors. But I hear you. I get pissed when I see certain things, and statistics can't justify them. If you want to use adjectives, do so, but all I'm asking is that you use the right connotations.


    Two more things. One, 9 Paul O'Neills would be a great clubhouse, but you'd lose a lot of games due to poor defense ;). Two, I'm totally envious of you. The closest team I have to me is the Reds.

  11. As a middle school baseball coach, I'd like to point out errors are irrelevant to what we're trying to teach the kids.  There are many, many mistakes one can make on the baseball field, on both offense and defense, that are never counted as an error.  We work with the kids to minimize those as much and as often as possible, and the drop in errors will result.  You can play an errorless game, however, and still not know bunt coverage, how to handle a cut four, and many other things.  I fall into the IANASB category, mostly due to ignorance – but I'm learning.  I would guess that some of the new statistics would help chart the things we are working on with our middle schoolers, though probably not all…

    By the way, Mark, I'm a 20 history teacher so if I can help, yell my way…

  12. That's an excellent point. Language is a powerful tool, and its effect is often subconscious (we don't fully realize it, but it colors our view of things). We see the word "error" and we understand "mistake". That is correct in most cases. But only calling those plays "errors" gives them special importance. It makes it seem that those are the only "errors" that could be made on a play. They're not. Advanced stats don't necessarily include those mistakes, either, but they do so more than ascribing "errors" as the only defensive stat. These sorts of things have an effect on kids. Getting "errors" out of their vocabulary might be a good start.

  13. Mark,

    In terms of difficulty to grasp, you're right, it shouldn't be. But we've been conditioned, some for many, many years, to think in terms of traditional statistics.

    It reminds me a lot of the difficulty I have with students over the metric system. As a science teacher, I can appreciate the simple elegance of the metric system, but as an American, I was raised with the imperial system, and I still think in those terms. When I hear "2 meters" I mentally convert it to "6 feet." By upbringing and experience, we understand the imperial system on a level that most of us don't get with metrics. Sure we could get there if we forced ourselves to do it, but there is little benefit to be had, so metrics remains somewhat more abstract in our minds.

    Sabermetrics is similar. Most fans just don't get them on the same level they get RBIs and ERA. If I say "3.45 ERA," every fan knows on a near-intuitive level what that means. But what about a 3.45 FiP? Is that good or bad? Those of us just becoming familiar with Saber stats may understand FiP, but remembering the proper context for the stats and how they translate to performance isn't so simple. In the end, like metrics, we end up trying to match it back to what it means in the old system (oh, that means his ERA is inflated/deflated). And remember FiP isn't alone. There are at least a half dozen stats you need to know to follow a Saber debate, with dozens of more esoteric stats waiting in the wings.

    Now I'm a nerd and I like stats and have long railed against certain stats (wins, RBIs, errors, etc), and so I'm happy to put forward the effort needed to become familiar with these new stats. But for most people, the resistance is understandable. They don't see the need for the stat when "his win total was inflated" gives them what they want to know. Only the nerdliest of us really demand a quantifiable measurement of that inflation.

    Like my students, most of whom never use metrics out of the classroom, the typical fan (and especially the older fan), can get by just fine without them. That won't change until 1) they die off, and 2) the papers and announcers use the saber stats frequently and eventually in preference over the traditional. The former will eventually happen (though they are likely to leave offspring), but the latter?  It'll be a long time.

  14. I definitely understand what you're saying. I guess one of the problems is that you can't put everyone into context. I can put the Braves defense into context, but I haven't seen enough Royals games to do that for them. When it comes down to All-Star Games and Cy Young Awards, people gravitate toward wins and RBI.


    And I know the "it's close enough" argument. But I don't see how it's useful. I'm not going to use a stick to till the garden. I'm using a hoe. Yes, the stick gets the job done, but if I have a hoe at my disposal, I'm using it. Why accept just being useful when you can actually be accurate?


    But I hear you on the impact of socialization. It's just that it's … well … frustrating. I understand it, but at the same time, I think we should still try to find a way to help people move past it.

  15. I've been travelling and away from all things computer-like for the last 24 hours… feels pretty good, I must say. 

    Great conversation here.  I'm enjoying being a spectator.

  16. The only problem with your "hoe" analogy is that not everyone is a gardener. Obviously the professional gardeners should use one, and the amateurs who take the hobby seriously, but some people don't want to grow their own tomatoes, and if they don't own a hoe, they don't see a need to buy one because on the off chance they do need to poke a hole in the ground, a stick will do.

  17. It works, but is it the best way to do it?


    Again, most of this is theoretical. It shouldn't be so hard to come to a better consensus, but due to numerous factors (some of which you mention), it doesn't work that way. What I want to find is a way to figure out how to get past those. Let's make UZR (or whatever metric comes out) the "normal" statistic. Normalcy is what you make of it. It's what you've been taught. And it has to change at some point.