The above graph shows the relationship between the horizontal location of pitches thrown to Cano and his swing rate. The dotted lines represent the borders of the strike zone. The gray bands indicate confidence intervals.
As you can see, the most notable difference in swing rates is on pitches down the middle. And now for a different perspective:
The above graph shows the relationship between the vertical location (y axis) of pitches thrown to Cano and his swing rate (x axis). The dotted lines represent the borders of the strike zone. The gray bands indicate confidence intervals.
This graph suggests the same observation as the previous graph: Cano is swinging at a lot more pitches in the heart of the plate this year than in previous years. And while he does certainly appear to be swinging at more balls, the graphs suggest that he’s only swinging more at pitches that are already close to the zone to begin with.
From the data, we now know that Cano is swinging more at both balls and strikes, with the larger increase coming in his swing rate at strikes.
Are these problems? If so, which one, or both?
If Cano’s increased propensity to chase is an issue, then we should see a corresponding dip in performance when he swings at balls. Conversely, if his performance on balls is the same or improved, then that would suggest that swinging at balls is not an issue. We can measure this performance through a statistic called rv100, which stands for run value per 100 pitches. The stat works by assigning a run value to the result of every single pitch through a method called linear weights. A terrible name for baseball statistics, I know. After these results are quantified, they are averaged and then multiplied by 100 to get a friendlier-looking number. This same statistic is available on all player pages at FanGraphs under the section called “pitch type values.”
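To make the averaging step concrete, here is a minimal Python sketch of an rv100-style calculation. The run values in the table are made-up placeholders chosen only for illustration; they are not the actual linear weights behind the numbers in this post, and the function and outcome names are hypothetical.

```python
from statistics import mean

# Hypothetical run value for each pitch outcome, from the batter's side.
# These numbers are illustrative placeholders, NOT real linear weights.
RUN_VALUES = {
    "ball": 0.05,
    "called_strike": -0.06,
    "swinging_strike": -0.06,
    "foul": -0.03,
    "out": -0.25,
    "single": 0.45,
    "home_run": 1.40,
}


def rv100(outcomes):
    """Average the run value of each pitch, then scale to 100 pitches."""
    if not outcomes:
        raise ValueError("need at least one pitch outcome")
    return 100 * mean(RUN_VALUES[o] for o in outcomes)


# Four swings at pitches outside the zone: a whiff, a foul, an out, a single.
sample = ["swinging_strike", "foul", "out", "single"]
print(round(rv100(sample), 2))
```

A positive result favors the batter and a negative one favors the pitcher, which matches the sign convention used for Cano below.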
Where negative is bad for Cano and positive is good, here are his rv100 values for the past three years on balls that he swings at: -4.8 in 2009, -5.6 in 2010, and -4.0 in 2011. Unfortunately, a problem with these rv100 stats is that they are plagued by the same issues as many full-season numbers. For example, they are very dependent on BABIP, so it’s possible that Cano has simply been a little luckier on balls that he has put into play this year. However, this does not appear to be the case here. When Cano has put chased pitches into play this year, he has hit more line drives and fewer pop-ups than in the previous two years. In fact, in 2011 Cano has a higher BABIP and slugging percentage when he puts pitches that are balls into play than in 2009 or 2010. This data suggests that Cano is swinging at more balls than before, but that it really isn’t hurting his performance. If anything, he’s been more successful when swinging at balls.
Is this really the problem that we were looking at before? Cano’s performance on balls is not an issue, but this does not mean that swinging at balls is a good thing for Cano. As indicated by the rv100, swinging at balls is still a very bad thing for Cano. At his 2011 rate, for every 1000 balls that he swings at, he will cost the Yankees 4 wins (-40 runs, though this would be a ridiculous amount of swings at balls). We need to account for both the rate at which he swings at balls and how he performs when he does so. To do this, we simply multiply his rv100 on balls that he swings at by his O-swing. This tells us the total runs we should expect to lose due to Cano swinging at balls, out of every 100 balls thrown to him. This gives us expected values of -1.7 runs for 2009, -2 runs for 2010, and -1.6 runs for 2011. However, there are other effects that we need to keep in mind. Although his success when thrown balls is not worse in 2011 than before, he is still shortening his at-bats, which affects all of his performance.
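The arithmetic in this paragraph can be checked with a short Python sketch. Two things here are assumptions rather than figures from the data: the conversion of roughly 10 runs per win (a common rule of thumb that matches the -40 runs and 4 wins quoted above), and the O-swing rates, which are not listed in this post and are instead backed out from the rounded expected values, so they are only approximate.

```python
RUNS_PER_WIN = 10.0  # assumption: rule of thumb, consistent with -40 runs = 4 wins


def runs_for_swings(rv100, swings):
    """Total run value over a given number of swings at an rv100 rate."""
    return rv100 * swings / 100.0


def implied_swing_rate(expected_runs, rv100):
    """Back out the swing rate from expected runs = rv100 * swing rate."""
    return expected_runs / rv100


# rv100 on balls swung at, and expected runs per 100 balls seen (from the text).
rv100_on_chases = {2009: -4.8, 2010: -5.6, 2011: -4.0}
expected_runs = {2009: -1.7, 2010: -2.0, 2011: -1.6}

runs_2011 = runs_for_swings(rv100_on_chases[2011], 1000)
wins_2011 = runs_2011 / RUNS_PER_WIN
print(runs_2011, wins_2011)

# Approximate O-swing implied by the rounded figures above.
for year in sorted(rv100_on_chases):
    rate = implied_swing_rate(expected_runs[year], rv100_on_chases[year])
    print(year, round(rate, 2))
```

Because the inputs are rounded to one decimal place, the implied rates should be read as ballpark values, not as Cano's exact O-swing percentages.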
We do not arrive at the same conclusion for Cano’s increased tendency to swing at strikes (Z-swing%), which has increased to 78% this year after hovering around 71% the previous two years. His rv100 on strikes that he swings at is just .45 for 2011, a paltry figure compared to his values of 3.98 in 2010 and 2.41 in 2009. As I stated earlier, we need to consider whether or not luck is playing a role here in these small samples. In 2011, we find that when Cano puts strikes into play, he is hitting about 4% fewer line drives than in 2009 and 2010 and about 5% more groundballs, obviously a poor tradeoff. Although his batted ball distribution in 2011 is worse than before, it’s pretty similar overall, and we should expect his performance on strikes to improve. Indeed, this year Cano has just a .287 BABIP for the season, a much lower mark than his career average of .320.
How much of a problem is this? Using the same method that I used earlier, we simply multiply his rv100 on strikes that he swings at by his Z-swing. This tells us the total runs we should expect to gain due to Cano swinging at strikes, out of every 100 strikes thrown to him. This gives us expected values of 1.7 for 2009, 2.9 for 2010, and .35 for 2011. That’s a pretty big drop-off, and it explains a lot of the performance difference between 2011 and 2010.
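As a rough check of this multiplication, the following Python sketch applies it to the strike numbers quoted above. It assumes a Z-swing of exactly 71% for both 2009 and 2010, which is only the rounded "around 71%" figure from the text; that rounding is presumably why 2010 comes out near 2.8 here rather than the 2.9 reported.

```python
# rv100 on strikes swung at (from the text).
rv100_on_strike_swings = {2009: 2.41, 2010: 3.98, 2011: 0.45}

# Z-swing rates; 0.71 for 2009 and 2010 is an assumption based on "around 71%".
z_swing = {2009: 0.71, 2010: 0.71, 2011: 0.78}


def expected_runs_per_100_strikes(rv100, swing_rate):
    """Expected runs gained per 100 strikes seen: rv100 times swing rate."""
    return rv100 * swing_rate


estimates = {
    year: expected_runs_per_100_strikes(rv100_on_strike_swings[year], z_swing[year])
    for year in sorted(z_swing)
}

for year, runs in estimates.items():
    print(year, round(runs, 2))
```

The 2011 estimate is a small fraction of either earlier year, which is the drop-off described above.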
Concluding thoughts:
Cano’s plate discipline does seem to be an issue. However, it’s not so much his propensity to swing at balls that’s been an issue in 2011. The real problem seems to be an increased aggressiveness on pitches within the zone. His approach seems to be, “Can I hit ball? If so, swing!” It’s important to remember that we can’t really infer causation from this kind of analysis. The best we can do is say that there seems to be a relationship between Cano’s increased swing rate on strikes and his decline in performance on these same pitches. Although he may not ever approach his 2010 plate discipline numbers again, we should still expect him to regress to career numbers going forward. Projection systems like ZiPS agree, expecting Cano to post a 5.8% walk rate for the remainder of the season. Indeed, it is likely that this is just a short-term trend. Now if only he could work on that fielding…
You can follow Josh on Twitter @J__Stock (two underscores).
*Pitch f/x data from MLBAM system via Darrel Zimmerman’s pitch f/x database
*Cano picture from NYDailyNews
*Cano’s FanGraphs player page
Uh oh. Somebody whipped out the ggplot2 skillz!
You claim that the big problem is down the middle, but the percentage increase in swing percent at the inside edge seems to be even higher. Cano goes from 50% swinging at pitches 1 foot inside the middle of the plate in 2010 to 60% in 2011 according to your graph, for a 20% increase (10 percentage points) in swinging on inside pitches.
We see a similar but slightly smaller percentage point increase down the middle, which as a percentage of his previous swing rate is a little lower of an increase. Either way, this is pretty interesting.
His ISO is currently 20 points higher than last year, while his BABIP is 43 points lower. Perhaps he's trading in line drives and walks for longballs on those inside pitches?
Thanks for coming here to comment! And I will forever have a love affair with ggplot2.
And you're right, there is a larger percentage increase on pitches that are 1 foot inside. Though I think if I were to do some calculus and find the difference in the area under the curves, that I would find the largest area increase on pitches that were in the heart of the plate out of any same sized interval. But perhaps I'm wrong about that.
And yea, he's definitely trading walks for hits (or attempting to), though I think that's true for basically everywhere this year for Cano. Though overall his LD rate is actually slightly higher than previous years so him swinging for the fences too much may not be an issue.