The NL Cy Young Race: An extensive statistical perspective


ED. NOTE: Front page'd. Nice read. Grab a cup of coffee and get down on it. - WC

(Be warned, this is pretty long and explains all of the stats presented before analyzing them. For those familiar with the stats used throughout the post, you should skim past most of the explanations and focus mostly on the numbers.)

It's September now, and with the lack of real playoff races in MLB this season, much of the discussion has shifted to award races. For the Phillies, the MVP race is out of reach (unless you count the inevitable 4th place votes that Ryan Howard will receive), but the Cy Young has always seemed like it'd end up in the possession of one of the Phillies' top 3. Between last year's NL Cy Young, Roy Halladay, the indifferent one, Cliff Lee, and the one who doesn't support our troops, Cole Hamels, the Phillies had three pitchers who were capable of being the best pitcher in the NL in 2011. And, to the surprise of absolutely no one, all three have had periods this season where they held on to that title. Currently, however, the narrative has advanced Clayton Kershaw, the Dodgers' dominant young left-hander, above any of the Phillies' top 3. Does the narrative match reality? Let's examine it in detail.




First off, let’s address the most important stat (and the only one that really matters): Win/Loss record. Clayton Kershaw is 17-5. Roy Halladay is 16-5. Cliff Lee is 16-7. Cole Hamels is a mere 14-7. And hell, to throw in another name, Ian Kennedy is 19-4. Clearly, Kennedy and Kershaw have been the two best pitchers in the NL this year. No need to go any further.

But… just in case W/L isn’t fully representative of how well a pitcher has, well, pitched, let’s check out some other statistics. A pitcher is supposed to prevent runs from scoring, so we might as well go and check their ERAs. Kershaw is at 2.45 for the season. Halladay is at 2.49. Lee’s at 2.47. Hamels is at a pathetic 2.60. And Kennedy is at 2.90. So… there’s not exactly a decisive advantage for anyone. In terms of preventing earned runs, there’s very little separation between the Phils’ top 3 and Kershaw. Well, let’s move on to other metrics of run prevention. To begin, let’s consider total runs allowed per nine innings (RA), rather than earned runs. The Phillies’ top 3 have allowed six unearned runs combined, whereas Kershaw has allowed six all by himself. Kershaw’s RA is thus pushed up to 2.74, Halladay’s is 2.66, Lee’s is 2.56, and Hamels’ remains at 2.60. Well, you might say, "That’s not fair to Kershaw. Why should he be blamed for the mistakes his defense has made? Why, the Phillies have the best Fielding Percentage in the majors, so clearly they must be benefiting from their defense." Well, let’s delve a little deeper into the defensive statistics and find out.
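For anyone who wants to check the ERA vs. RA arithmetic, here is a minimal Python sketch. The stat line in it is hypothetical (it is not any of the four pitchers' actual totals), and the helper names are my own; it only shows how a handful of unearned runs opens a gap between the two numbers.

```python
def innings_to_float(ip_notation: str) -> float:
    """Convert box-score innings ('205.2' means 205 and 2/3) to a decimal."""
    whole, _, outs = ip_notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def per_nine(runs: float, innings: float) -> float:
    """Runs allowed per nine innings pitched."""
    return 9 * runs / innings


# Hypothetical season line, chosen only for illustration.
ip = innings_to_float("200.0")
earned_runs = 55
unearned_runs = 6

era = per_nine(earned_runs, ip)                   # earned runs only
ra9 = per_nine(earned_runs + unearned_runs, ip)   # every run counts

print(f"ERA {era:.3f}, RA {ra9:.3f}, gap {ra9 - era:.3f}")
```

Over roughly 200 innings, six unearned runs are worth a bit more than a quarter of a run per nine, which is about the size of the bump applied to Kershaw above.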

To compare the defenses the pitchers play in front of, defensive efficiency (DE) is a fairly basic but informative way of going about it. The stat determines what percentage of balls in play, i.e. batted balls that are not home runs, have been converted into outs. The Phillies have converted 71.3% of BIP into outs, while the Dodgers have converted 71.2%. Not much of a difference there, but it shows that the defenses behind these pitchers have been fairly equivalent. Obviously, there’s a good deal of variability in individual starts, and some pitchers have higher or lower true-talent BABIPs, but on a general level there isn’t a big difference between the two defenses’ effect on these pitchers.
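To put that 0.1-point gap in perspective, here is a small Python sketch of the defensive efficiency calculation as described above. The ball-in-play counts are hypothetical and were picked only so the results land on the 71.3% and 71.2% figures; they are not the teams' real totals.

```python
def defensive_efficiency(balls_in_play: int, outs_on_balls_in_play: int) -> float:
    """Share of balls in play (home runs excluded) that the defense turned into outs."""
    return outs_on_balls_in_play / balls_in_play


# Hypothetical counts, chosen to reproduce the percentages quoted above.
phillies = defensive_efficiency(balls_in_play=4000, outs_on_balls_in_play=2852)
dodgers = defensive_efficiency(balls_in_play=4000, outs_on_balls_in_play=2848)

# How many extra outs the gap is worth per 1,000 balls in play.
extra_outs_per_1000 = (phillies - dodgers) * 1000

print(f"Phillies DE: {phillies:.3f}")
print(f"Dodgers DE:  {dodgers:.3f}")
print(f"Extra outs per 1,000 BIP: {extra_outs_per_1000:.1f}")
```

Under these assumed counts, the gap works out to about one extra out per thousand balls in play, which is why the two defenses look equivalent by this measure.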

More advanced defensive metrics, however, point to a different conclusion. The Phillies’ team UZR/150 games is at -3.5 runs, whereas the Dodgers’ UZR/150 is 1.6 runs. The Defensive Runs Saved metric (DRS) tells the same story, as the Phillies are -30 as a team while the Dodgers are +2. Now, this seems a bit odd when compared with the nearly identical DE ratings. However, the Phillies’ pitchers may simply have a lower true-talent BABIP, by producing either weak groundballs or lots of infield flies. To be honest, I’m not really sure exactly how this difference has occurred, but it certainly indicates that, at minimum, the Phillies’ top 3 haven’t been helped by their defense in comparison to Clayton Kershaw. If advanced metrics are to be believed, the Phillies’ trio are each succeeding despite having a below-average defense behind them.

Okay, so the runs allowed picture is at the very least a fair representation of Kershaw’s standing in the race, which puts him slightly behind the Phillies’ top 3 on a per-inning basis. However, while Lee, Halladay, and Kershaw have all pitched almost exactly the same number of innings (202.2-205.2), Hamels’ recent DL stint has him at 194 IP. Ten innings isn’t all that much, but it’s enough to put Hamels in a slightly lower tier than the other three given the similarity of their runs allowed per nine.

Focusing on runs allowed, however, misses another crucial portion of the pitching equation. The Phillies’ top 3 all pitch in Citizens Bank Park for half their games or so, while Kershaw pitches at Dodger Stadium. As mentioned often here at TGP, Citizens Bank Park is no bandbox, as its .994 Park Factor on runs actually indicates it’s an essentially neutral ballpark. At the same time, Dodger Stadium is significantly lower at .895. This puts it below other parks that are notorious pitcher’s parks, including Safeco Field and the newly opened Target Field. Pitching in Dodger Stadium should certainly lead to fewer runs than Citizens Bank Park. In order to correct for this, we use an adjusted stat, ERA+. ERA+ accounts for the parks in which the player has pitched. League average for ERA+ is 100, and the higher the value, the better. In this metric, Hamels is at 148 ERA+, Halladay is at 156 ERA+, Lee is at 157 ERA+, and Kershaw is at 151 ERA+. Again, remember that the RA vs. ERA effect discussed earlier affects Kershaw’s numbers the most, then Halladay’s, then Lee’s, and doesn’t affect Hamels’ numbers at all. Given the previous information, we can start to see some separation between Halladay/Lee and Kershaw beginning to form. Kershaw’s ERA is right around theirs, but he pitches in a pitcher’s park and probably has a better defense behind him.
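To see how much a park factor alone can move ERA+, here is a rough Python sketch. It uses a common simplified form of the stat (100 times park-adjusted league ERA divided by the pitcher's ERA), which is not necessarily the exact calculation behind the figures quoted above. The league ERA and the shared 2.50 ERA are hypothetical, and blending the home park factor with a neutral 1.000 for road games is my own simplifying assumption.

```python
def blended_park_factor(home_park_factor: float) -> float:
    """Assume half the innings come at home and half in neutral road parks."""
    return (home_park_factor + 1.0) / 2


def era_plus(era: float, league_era: float, park_factor: float) -> float:
    """Simplified ERA+: 100 is league average, higher is better."""
    return 100 * league_era * park_factor / era


LEAGUE_ERA = 3.90  # hypothetical league ERA, for illustration only
SAME_ERA = 2.50    # hypothetical: give both pitchers an identical raw ERA

# Run park factors quoted in the paragraph above.
parks = {"Citizens Bank Park": 0.994, "Dodger Stadium": 0.895}

results = {
    park: era_plus(SAME_ERA, LEAGUE_ERA, blended_park_factor(pf))
    for park, pf in parks.items()
}

for park, value in results.items():
    print(f"{park}: ERA+ {value:.1f}")
```

With identical raw ERAs, the pitcher based in the more pitcher-friendly park comes out several points lower, which is the direction of the adjustment described above.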

Now, given that the stats above already accounted for some of what affects these pitchers’ numbers, we might as well continue and see whether the pitchers have faced the same quality of competition.

The hitters Cole Hamels has faced have hit .260/.326/.408 for the season. When facing Cole, they’ve hit .210/.255/.317.

The hitters Roy Halladay has faced have hit .261/.329/.411 for the season. When facing Roy they’ve hit .246/.272/.319.

The hitters Cliff Lee has faced have hit .266/.332/.422 for the season. When facing Cliff, they’ve hit .227/.269/.330.

The hitters Clayton Kershaw has faced have hit .263/.327/.416. When facing Kershaw, they’ve hit .214/.264/.302.
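One quick way to line up the four sets of numbers above is to boil each slash line down to OPS (on-base plus slugging). OPS is my choice of shorthand here, not something the lines above report directly, so treat this Python sketch as a rough sorting aid rather than a precise measure.

```python
# (AVG, OBP, SLG) copied from the lines above.
opponents = {
    "Hamels": {"season": (0.260, 0.326, 0.408), "against": (0.210, 0.255, 0.317)},
    "Halladay": {"season": (0.261, 0.329, 0.411), "against": (0.246, 0.272, 0.319)},
    "Lee": {"season": (0.266, 0.332, 0.422), "against": (0.227, 0.269, 0.330)},
    "Kershaw": {"season": (0.263, 0.327, 0.416), "against": (0.214, 0.264, 0.302)},
}


def ops(slash_line: tuple) -> float:
    """On-base percentage plus slugging percentage."""
    _avg, obp, slg = slash_line
    return obp + slg


for name, lines in opponents.items():
    print(f"{name}: opponents' season OPS {ops(lines['season']):.3f}, "
          f"OPS against {ops(lines['against']):.3f}")

toughest = max(opponents, key=lambda name: ops(opponents[name]["season"]))
weakest = min(opponents, key=lambda name: ops(opponents[name]["season"]))
print(f"Toughest opposition: {toughest}; weakest opposition: {weakest}")
```

The spread in opponents' season OPS from top to bottom is only about 20 points.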

Overall, there’s very little difference between the batters any of the four have faced, with Hamels facing the weakest of the four and Lee facing the toughest. Again, however, the difference is quite minuscule and doesn’t seem to have made too much of an impact on any of the four. Their lines against look very different, but much of this difference comes not from any particular skill but from differences in batted ball luck. And thus comes the beginning of the DIPS portion of this post, or defense-independent pitching statistics. These attempt to eliminate the effects that defense has upon individual pitchers by assuming a league average BABIP and focusing on the more controllable aspects of pitching (K, BB, HR, GB%, FB%).

Of the DIPS statistics, the most widely used is FIP. FIP only considers three different variables: home run rate, strikeout rate, and walk rate. There are certainly flaws in comparing pitchers by this method, as HR rate is another aspect that is affected by luck, but as an introductory measure it’s fairly effective. Their respective FIPs are as follows: Halladay 2.12, Kershaw 2.44, Cliff Lee 2.64, Cole Hamels 2.72. FIP neutralizes Halladay’s high .310 BABIP against, which suggests that if not for some bad luck with balls in play he’d be having an absolutely dominant season in terms of runs allowed. For reference, no pitcher has had a seasonal FIP that low since Pedro Martinez in his prime. Hamels has clearly had fortunate BABIP luck this season, as his .257 BABIP allowed is far below league average, so he ends up with the highest FIP of the four.
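For those who haven't seen the formula, here is a minimal Python sketch of the standard FIP calculation. The 3.10 constant is an assumption (the real one is reset each season so that league FIP matches league ERA), the standard version also folds hit batters in with walks, and the stat line is hypothetical rather than any of the four pitchers' actual totals.

```python
FIP_CONSTANT = 3.10  # assumed value; the actual constant varies by season


def fip(home_runs: int, walks: int, hit_batters: int,
        strikeouts: int, innings: float) -> float:
    """Fielding Independent Pitching from the events a defense cannot touch."""
    numerator = 13 * home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return numerator / innings + FIP_CONSTANT


# Hypothetical ace-level line, for illustration only.
example_fip = fip(home_runs=10, walks=30, hit_batters=2,
                  strikeouts=200, innings=200.0)
print(f"FIP: {example_fip:.2f}")
```

Note that nothing about hits on balls in play enters the calculation, which is why a high or low BABIP has no effect on the result.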

However, just as ERA+ eliminates the effect of one’s ballpark, another statistic exists for FIP that adjusts for ballpark variation. This is FanGraphs’ minus series of stats, in particular FIP-. FIP- works exactly the same as ERA+ except that its scale is the exact opposite. 100 is league average, but lower values are better. For example, a FIP- of 40 would be 60% lower than league average, whereas a FIP- of 150 would be 50% higher than league average. Thus, the difference between a FIP- of 105 and a FIP- of 95 is nowhere near as large as the difference between a FIP- of 60 and a FIP- of 50. The former is a reduction of about 10%, whereas the latter is a 17% reduction. With that in mind, here are the FIP- values for each of the pitchers: Halladay 54, Kershaw 64, Lee 67, and Hamels 69. Some of the difference between Kershaw, Lee, and Hamels disappears, while Halladay stands out even more.
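The percentage arithmetic in that paragraph can be checked with a few lines of Python. The function name is my own, and the FIP- values are the ones listed above.

```python
def pct_reduction(higher: float, lower: float) -> float:
    """Percentage drop when moving from the higher FIP- to the lower one."""
    return 100 * (higher - lower) / higher


near_average = pct_reduction(105, 95)  # ten points around league average
elite = pct_reduction(60, 50)          # the same ten points at the elite end

print(f"105 -> 95: {near_average:.1f}% reduction")
print(f"60 -> 50:  {elite:.1f}% reduction")

# FIP- values quoted above; 100 minus the value is the gap to league average.
fip_minus = {"Halladay": 54, "Kershaw": 64, "Lee": 67, "Hamels": 69}
for name, value in fip_minus.items():
    print(f"{name}: {100 - value}% better than league average")
```

The same ten-point gap is worth more the further below 100 you go, which is worth remembering when reading the ten points between Halladay and Kershaw.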

To go a step further, however, and truly isolate the pitcher’s performance, one must eliminate the effect of HR/FB variance and the effect of different ballparks. xFIP accomplishes this by regressing each pitcher’s HR/FB to the league average, which is a little above 9% this year. This process effectively eliminates the differences in a pitcher’s numbers due to park differences as well as taking luck out of the equation. Studies have shown that pitchers have little control over their HR/FB rate, so instead of crediting or penalizing pitchers for what they did not truly control, we simply remove the effect entirely. Of course, not every pitcher has exactly the same true-talent HR/FB rate, but that’ll be dealt with later. The xFIPs for the four are as follows: Halladay 2.61, Lee 2.76, Kershaw 2.81, Hamels 2.93. Again, Halladay remains the best of the pack here, although it’s not a complete runaway like the FIP rankings were. Halladay’s HR/FB at 5.1% gave him a major boost in those rankings, as did Kershaw’s 6.6%. Hamels benefited somewhat at 7.5% and Lee was barely below average at 8.5%. Overall, by eliminating the variation in HR/FB rates, Lee jumps slightly ahead of Kershaw, but both remain behind Halladay.
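Here is a rough Python sketch of how xFIP differs from FIP: the pitcher's actual home runs are swapped for the number he would have allowed at a league-average HR/FB rate. The 3.10 constant, the flat 9% league rate, and the stat line are all assumptions made for illustration, not the real 2011 inputs for any of the four.

```python
FIP_CONSTANT = 3.10   # assumed; the actual constant varies by season
LEAGUE_HR_PER_FB = 0.09  # assumed round figure standing in for "a little above 9%"


def fip(home_runs: float, walks: int, hit_batters: int,
        strikeouts: int, innings: float) -> float:
    """Standard FIP using the home runs actually allowed."""
    numerator = 13 * home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return numerator / innings + FIP_CONSTANT


def xfip(fly_balls: int, walks: int, hit_batters: int,
         strikeouts: int, innings: float) -> float:
    """FIP with home runs replaced by fly balls times the league HR/FB rate."""
    expected_home_runs = fly_balls * LEAGUE_HR_PER_FB
    return fip(expected_home_runs, walks, hit_batters, strikeouts, innings)


# Hypothetical pitcher: 10 HR on 200 fly balls is a 5% HR/FB rate.
line = dict(walks=30, hit_batters=2, strikeouts=200, innings=200.0)
actual = fip(home_runs=10, **line)
expected = xfip(fly_balls=200, **line)

print(f"FIP {actual:.2f}, xFIP {expected:.2f}")
```

A pitcher running a HR/FB rate well below league average, as in this made-up line, ends up with an xFIP noticeably higher than his FIP, which is the same pattern Halladay and Kershaw show above.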

Now, throughout this article I’ve mentioned true-talent BABIPs and HR/FB rates, and that brings us to the final DIPS statistic to be used in this post: SIERA (skill-interactive ERA). SIERA accounts for everything that xFIP does, but it also takes into account that certain types of pitchers (i.e. extreme groundball or flyball pitchers) produce weaker contact and thus have lower true-talent BABIPs and HR/FB rates. In addition, it accounts for the interaction between the DIPS categories (K, BB, HR, GB%, FB%): walks hurt groundball pitchers less, as they can be erased by double plays, while home runs hurt high-strikeout pitchers less, as there are likely to be fewer men on base. Their SIERAs are as follows: Roy Halladay 2.54, Cliff Lee 2.67, Cole Hamels 2.82. Once again, the same dynamics appear, with Halladay first, Lee and Kershaw very close to each other, and Hamels last.

Overall, after looking through both traditional and advanced pitching metrics, the answer to the Cy Young question is fairly clear. Roy Halladay is simply a step above anyone else in the NL at the moment. Kershaw’s the media favorite right now, but his numbers only stack up to Halladay’s on a cursory glance at ERA and W/L and nothing else. Kershaw benefits from his ballpark and from his defense, as well as from some batted ball luck, whereas Halladay has pitched to the exact same level without any such assistance. Kershaw’s season is far more similar to Lee’s than it is to Halladay’s. Rather than having the discussion be Halladay, Lee, or Kershaw for Cy Young, it should be over whether Kershaw or Lee is 2nd to Halladay. Lastly, while Hamels has been very, very good, he’s just a step below Lee and Kershaw.

Obviously, none of this takes into account things like Lee’s two Pitcher of the Month awards, or his six shutouts. While these are certainly impressive accomplishments, they shouldn’t be given as a justification to place him over someone who has had the better overall season. Halladay has not exhibited the dominance that Lee’s shown, but he has nonetheless pitched in such a consistently great manner that while he hasn’t matched Lee’s upside, he has also avoided the bad stretches that Lee has had occasionally. Certainly, Lee’s recent stretch can be used to predict that he may surpass Halladay by season’s end, but other than that it doesn’t really matter. The Cy Young award should be given to the pitcher who pitched the best in 2011. Up to this point, it’s been Harry Leroy Halladay.

