As a follow-up to my previous article on hitting metrics, I wanted to take a look at which pitching metrics correlate year-to-year. For this installment, I looked at starting pitchers from 2004-2011 with at least 162 innings pitched in both year one and year two.
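For anyone who wants to try this at home, here is a minimal sketch of the pairing logic in Python. It is not the code behind the numbers in this post: the field names (`name`, `season`, `ip`, `k_pct`) and the toy rows are made up for illustration, and it assumes the correlation in question is a plain Pearson r. Each qualifying season is matched to the same pitcher's next season, and only pairs where both seasons clear the innings cutoff are kept.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def pair_seasons(rows, min_ip=162.0):
    """Match each qualifying pitcher-season to the same pitcher's next season.

    rows: list of dicts with 'name', 'season', 'ip' plus metric keys
    (hypothetical field names, not an actual export format).
    """
    qualified = {(r["name"], r["season"]): r for r in rows if r["ip"] >= min_ip}
    pairs = []
    for (name, season), year_one in sorted(qualified.items()):
        year_two = qualified.get((name, season + 1))
        if year_two is not None:  # both seasons must clear the IP cutoff
            pairs.append((year_one, year_two))
    return pairs


def y2y_correlation(rows, metric, min_ip=162.0):
    """Correlation of a metric in year one with the same metric in year two."""
    pairs = pair_seasons(rows, min_ip)
    return pearson([a[metric] for a, _ in pairs], [b[metric] for _, b in pairs])


# Invented toy data: pitcher C drops out because his 2005 season is too short.
toy_rows = [
    {"name": "A", "season": 2004, "ip": 200.0, "k_pct": 0.20},
    {"name": "A", "season": 2005, "ip": 210.0, "k_pct": 0.21},
    {"name": "B", "season": 2004, "ip": 180.0, "k_pct": 0.15},
    {"name": "B", "season": 2005, "ip": 170.0, "k_pct": 0.16},
    {"name": "C", "season": 2004, "ip": 190.0, "k_pct": 0.25},
    {"name": "C", "season": 2005, "ip": 100.0, "k_pct": 0.10},
]
```

Calling `y2y_correlation(toy_rows, "k_pct")` on a real data set would give one Y2Y number per metric; the toy rows are only there to show the shape of the input.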
As before, this is just a straightforward correlative analysis--nothing fancy. I took a look at a bevy of metrics (courtesy of the fine, upstanding citizens at FanGraphs), and here are the results:
Pitcher repertoire generally has the highest correlation, year-to-year (Y2Y). The distribution of their pitches (i.e. four-seam fastball, cutter, change up, etc.) shows great consistency from one year to the next. Now, there are potentially coding errors in that data, but the consistency of those statistics reflects what I think is generally known--that once a pitcher makes it to the big leagues as a starter they rarely alter their portfolio of pitches. What they likely alter, more regularly, is speed, sequence, and location. But that's just a hypothesis, one that can't be confirmed or rejected with this data.
Moving on.

Outside of repertoire, the most highly correlated statistic for starters is the ratio of ground balls to fly balls they allow (GB/FB), followed closely by K%. Again, this is consistent with previous research that looked into what factors a pitcher generally controls (e.g. Tom Tango and FIP, Matt Swartz and SIERA). We can see that strikeouts (and metrics associated with strikeouts such as Swinging Strike %, Contact %, and Outside of Zone Swing and Contact %), walks, and batted balls outside of line drives are all correlated Y2Y at .67 or higher.
The highest correlated ERA estimator was SIERA (.72), followed by xFIP (.68).
As before, I also put together a correlation matrix for all the year one metrics and the year two metrics. Those correlations between .40 and .69 are shaded blue, and correlations above .70 are shaded green.
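A matrix like this can be sketched roughly as follows. This is an illustration rather than the actual workbook: the metric names and toy pairs are invented, and two details are assumptions on my part, namely that shading is based on the absolute value of the correlation (so a strong negative relationship is shaded too) and that exactly .70 counts as green.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def cross_year_matrix(pairs, metrics):
    """Correlate every year-one metric with every year-two metric.

    pairs: list of (year_one_row, year_two_row) dicts for the same pitcher.
    Returns matrix[year_one_metric][year_two_metric] -> correlation.
    """
    return {
        m1: {
            m2: pearson([a[m1] for a, _ in pairs], [b[m2] for _, b in pairs])
            for m2 in metrics
        }
        for m1 in metrics
    }


def shade(r):
    """Map a correlation to a shading bucket (absolute value is an assumption)."""
    size = abs(r)
    if size >= 0.70:
        return "green"
    if size >= 0.40:
        return "blue"
    return ""


# Invented toy pairs: (year one, year two) for three hypothetical pitchers.
toy_pairs = [
    ({"k_pct": 0.15, "era": 4.5}, {"k_pct": 0.16, "era": 4.4}),
    ({"k_pct": 0.20, "era": 3.8}, {"k_pct": 0.21, "era": 3.9}),
    ({"k_pct": 0.25, "era": 3.2}, {"k_pct": 0.26, "era": 3.4}),
]
toy_matrix = cross_year_matrix(toy_pairs, ["k_pct", "era"])
```

Reading across the `k_pct` row of `toy_matrix` corresponds to asking how well year-one K% lines up with each year-two metric, which is the same left-to-right scan described for the real matrix.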
Scrolling left to right we can quickly see what metrics correlate strongly with, say, next year's Earned Run Average (ERA). ERA itself has a Y2Y correlation of .38. True ERA (tERA) came in at .47, the highest of all the ERA estimators. Fielding Independent Pitching (FIP) had a correlation of .46, followed by SIERA .45 and xFIP .43.
Another interesting finding relates to Win Probability Added (WPA). The most predictive statistics in terms of whether starters will have higher WPA are those related to strikeouts. Again, this jibes with what people have long suggested--the ability to miss bats is key and something that pitchers inherently control to a large degree.
Finally, to further emphasize the point that a starting pitcher's record is not the best way to evaluate their performance, let's look at run support per nine innings (RS/9). The Y2Y correlation of a pitcher's run support is a mere .16. With Wins having a correlation of only .29, it's no surprise.
So, as with hitters, it pays to focus on independent pitcher metrics like SIERA and FIP when trying to get a read on a hurler's true performance and likely performance in the next year. And, like hitters, focusing on how much a pitcher misses bats, gets swings on less hittable balls, and commands the zone is a solid bet as these attributes are some of the most related year-to-year. When we see big changes in these types of metrics it should be a red flag that something might be happening (positive or negative) with a pitcher.
(Special thanks to Matt Swartz for working through some data issues with me)
0 recs | 16 comments
Great work. So, as someone who doesn't follow the developments in ERA estimators THAT closely…
this isn’t the first time I’ve seen SIERA rate best. Why hasn’t that gained the traction of a FIP or xFIP?
adarowski - January 9, 2012
Well, there has been lots of debate about SIERA
I will say this: the analysis above shows that it has the highest Y2Y correlation with itself, but other ERA estimators have a higher Y2Y correlation with ERA.
Also, as far as estimating next year’s ERA, they are all within .01 points of each other, more or less. FIP and xFIP are absurdly easy to calculate compared to SIERA and tERA, so maybe that’s part of what’s driving adoption.
Bill Petti - January 9, 2012
Cool, thanks.
There’s a lot to be said for simplicity.
adarowski - January 9, 2012
I'm not so sure about that
http://www.fangraphs.com/blogs/index.php/are-pitching-projections-better-than-era-estimators/
Lewie Pollis - January 9, 2012
This (via Mike Fast):
Julian Levine - January 9, 2012
As I explained before
The most inaccurate and horribly misleading comment in that entire thing is where Mike claims that I did not test the extra terms. As I summarized here:
http://www.fangraphs.com/blogs/index.php/new-siera-part-two-of-five-unlocking-underrated-pitching-skills/
I spent several months testing a variety of assumptions, some of which were later confirmed by HITf/x data. As others have found, the most important “extra” term is GB^2. This is because, as I showed, BABIP on ground balls is lower for pitchers who get more ground balls. Brian Cartwright later found this in his recent (excellent) piece in the THT Annual: the lower the angle off the bat, the more likely the grounder was an out, and the more ground balls a pitcher gets, the more likely his ground ball came at such an extreme angle. But I also showed that the correlation between BABIP & K%, and the correlation between HR/FB & K% are important and relevant. That explains why the average pitcher would have his SIERA-fied FIP coefficients as about:
SO: 2 FIP→ 2.9 SIERA
BB: 3 FIP→ 2.9 SIERA
FB: 1.3 xFIP → 0.7 SIERA
That’s very important because all I really did was instead of saying
dERA/dSO = a
Perhaps it’s true that:
dERA/dSO = a + b*SO + c*BB + d*GB
and so on. I just allowed each peripheral’s effect to be affected by the level of all three peripherals. It looks complicated, but it’s as simple as just saying “let’s do second order effects, instead of first.”
The proof is in the fact that the RMSE is better than other metrics, out of sample, and the correlation is better for SP&RP combined (and about the same for SP, when you adjust for park effects; you’ll note that FIP- has a lower correlation with next-year ERA than SIERA). Batted ball data is not perfect***, but if you use actual results (i.e. use regression), you can figure out how useful it is.
All I did was take the assumption one step back.
The dismissal of a data source that is twice as accurate as almost any other data source in social science just because we are sabermetricians who are used to data that is four times as accurate is foolish and unnecessary.
Matt Swartz - January 10, 2012
this is wrong:
papality - January 10, 2012
Momentum
Frankly, I think it’s more of a momentum thing. SIERA does relatively better with relievers compared to other ERA estimators than it does with starters, so it does have a distinctly higher correlation when SP & RP are both in there. It’s also got a better RMSE.
I’ve never really understood the issue with it being complicated to calculate. I’ve never calculated an xFIP myself on the fly either. I just go to FanGraphs. The Markov matrices that generated the 3,2,13 coefficients in FIP are easily more complicated math than the multiple regression techniques used in SIERA too, so I think it’s really a matter of xFIP coming first.
Matt Swartz - January 9, 2012
I think you're right about first mover, Matt
As for ease of calculating, maybe it’s my lack of exposure, but I feel like most find 3 stats with 3 multipliers relatively easy. Moreover, people might feel like they have a better sense of what’s ‘inside’ as a result.
Did you republish the equation after updating? The only reference I have is this: http://www.baseballprospectus.com/glossary/index.php?search=SIERA
Bill Petti - January 9, 2012 via iPhone app
Yeah
http://www.fangraphs.com/blogs/index.php/new-siera-part-two-of-five-unlocking-underrated-pitching-skills/
Matt Swartz - January 9, 2012
I believe FIP is based on a multiple regression.
DICE, Clay Dreslough’s independent derivation of FIP, was a linear regression as well.
cwyers - January 9, 2012
Clutch Y2Y correlation = 0
Jack Morris’ face when

Lewie Pollis - January 9, 2012
I demand more Looney Tunes reaction shots in our comments section!
Justin Bopp - January 9, 2012
Excellent work
Excellent job, Bill. I can see that this was a huge amount of work, but the results are valuable. It’s interesting that K/9 is as predictive of next year’s ERA as FIP.
LPanas - January 9, 2012
Very interesting. Looking at the correlation matrix, Y1 SIERA/Y2 ERA comes in at 0.45, and Y1 K%/Y2 ERA comes in at -0.45.
Julian Levine - January 11, 2012
Curious
What SIERA formula was used in this test? Was it a standard formula or did the constants change for each year?
RallyMonkey5 - January 10, 2012 via mobile