Baseball Prospectus today has an analysis of the greatest living pitcher, a question analogous to the ever-popular greatest living hitter debate. They compare the greatest pitchers of all time by normalizing across eras, then compare each pitcher's peak to determine the greatest living pitcher. Their conclusion? Roger Clemens is the greatest living pitcher, a wholly defensible position. The study is fascinating for the way it juxtaposes pitchers from various eras.
I also believe that the study is inherently flawed. As I surveyed the peak values, I was at first surprised by, and ultimately highly suspicious of, the ranking of pitchers from my childhood, especially Steve Carlton and Tom Seaver. I don't want to be guilty of era bias, but I find it hard to believe that five current pitchers (Clemens, Maddux, Martinez, Johnson, and Brown) had better peaks than Seaver.
I then investigated the metrics used. The metric that they based the study on is called normalized runs against (NRA, a rather unfortunate acronym):
NRA: RA (Runs Allowed) normalized for league and park. Measured on a scale like ERA, with 4.50 being average in any given year.
They then convert NRA to a 100-based scale called RA+, on which 100 is average.
Pitchers are ultimately ranked by a stat called Value. Value is "represented by the number of seasons leading the league in IP with a league average RA+." It is determined, as far as I can tell, by a combination of innings pitched and RA+. "For instance, leading the league in IP with an RA half of league average is worth 2.00 VALUE."
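One reading that is consistent with BP's quoted example can be sketched in a few lines. To be clear, this is a guess and not BP's published method: the formulas `ra_plus = 100 * 4.50 / NRA` and `value = (IP / league-leading IP) * (RA+ / 100)`, along with the function and parameter names, are assumptions made only for illustration.

```python
# Hypothetical reconstruction, NOT Baseball Prospectus's actual formula.
# Assumption 1: RA+ rescales NRA so that the 4.50 average maps to 100.
# Assumption 2: Value is the share of the league-leading IP times RA+/100.

AVERAGE_NRA = 4.50  # from BP's definition: 4.50 is average in any year


def ra_plus(nra: float) -> float:
    """Convert NRA to a 100-based scale (assumed form: 100 * 4.50 / NRA)."""
    return 100 * AVERAGE_NRA / nra


def value(ip: float, league_leading_ip: float, ra_plus_score: float) -> float:
    """Assumed Value: fraction of the league-leading IP times RA+/100."""
    return (ip / league_leading_ip) * (ra_plus_score / 100)


# League leader in IP with a league-average RA+ of 100.
print(value(300, 300, ra_plus(4.50)))
# League leader in IP with an NRA half of average (2.25).
print(value(300, 300, ra_plus(2.25)))
```

Under these assumptions the first case comes out to 1.00 and the second to 2.00, which matches the two data points BP gives. Two data points cannot confirm the formula, so treat this only as a way to make the description concrete.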
I have a number of issues with this Value. First (and this may just be me), I do not fully understand how it is derived. Second, from what I do understand, I am not sure what it is measuring. Third, and this is the big one, I think that it is fundamentally flawed.
Here's my argument: BP normalizes the statistics for ballpark and era, but there are era differences that affect these metrics and cannot be normalized away. First, consider the choice of IP (innings pitched) as a key metric. They rank the leaders in IP from 1.00 to .99 and on down to 0. So it's normalized, right? Consider that today there are 30 clubs carrying 10- to 12-man staffs, using five-man rotations and a large number of relief pitchers per game. In the past, staffs were much smaller, relief pitching did not have today's connotation, and four-man rotations were employed on as few as 16 teams. There are so many more pitchers today that ranking by IP would seem unfair to today's pitchers. Or at least that's what I would expect, though it does not show up in the results of the study.
Let's try the other metric, RA+. Is it affected by era even though it is normalized? I believe so. What is different about today's era as opposed to, say, the Sixties? Well, scoring is way up, but if we normalize the data then this should not affect things, right? Let's see. Take the reasons that I mentioned above, couple them with higher scoring, and consider ERA, even normalized. Wouldn't a higher average ERA and a greater number of pitchers cause a greater spread among the ERA values? I think this is part of the nature of the ERA statistic: it is bounded below at zero but unbounded above. Think about it. You could have a zero ERA by not allowing any runs and getting at least one batter out. You could also have an infinite ERA by allowing a run without getting any batters out. So I maintain that in eras with lower average ERAs, the ERAs tend to converge or cluster around the average. As you add more pitchers and raise the average ERA, the values will tend to cluster less and less. Batting average differs from ERA in that it is bounded at both ends, at .000 for a minimum and at 1.000 for a maximum.
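The two boundary cases are easy to check with the standard ERA definition, earned runs per nine innings. The snippet below is only a minimal sketch: the function name `era` and the choice to count outs instead of innings are mine, and returning NaN for a pitcher with no runs and no outs is an assumption about how to treat the undefined case.

```python
import math


def era(earned_runs: int, outs: int) -> float:
    """Earned run average: earned runs allowed per nine innings (27 outs)."""
    if outs == 0:
        # No outs recorded: infinite if any run scored, undefined otherwise.
        return math.inf if earned_runs > 0 else math.nan
    return 27 * earned_runs / outs


print(era(0, 1))  # one batter retired, no runs: the floor of the scale
print(era(1, 0))  # one run, nobody retired: no ceiling
print(era(1, 3))  # one run in one full inning
```

The first call gives 0.0, the second gives infinity, and the third gives 9.0. This shows only that the scale has a floor and no ceiling; it does not by itself show how widely ERAs spread in any particular era.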
Therefore, I do not believe that Randy Johnson posting a 2.50 ERA that is, let's say, half the league norm today is comparable to Bob Gibson posting a 1.12 ERA in 1968 that is, let's say, half of the normalized league average. I believe that this is why the current pitchers cluster unduly at the top of their analysis. Their conclusion may be correct: Roger Clemens may be the greatest living pitcher. But I think their analysis is too flawed to back up that assertion.