Baseball Toaster: Mike's Baseball Rants : Does the Cy Young Matter Anymore?

This is my site with my opinions, but I hope that, like Irish Spring, you like it, too.

About the Toaster

Baseball Toaster was unplugged on February 4, 2009.

Does the Cy Young Matter Anymore?

2005-11-09 09:52

by Mike Carminati

Bartolo Colon is a very good pitcher who had a very good year this past season. I'll accept that.

He is not the best pitcher in the AL nor did he have the best pitching year in the AL this year. But he won the Cy Young award yesterday.

It seems that the sole reason to give out the award is to stir up controversy. Hardly anyone who is a serious student of the game would pick Colon number one, but in the baseball writers' world in which, apparently, wins are king, Colon rules.

I think they should rename it the Lamarr Hoyt award (but they should add a few more "r"'s to his first name first). For those of you who are too young to remember, Hoyt won 24 games for the White Sox back in 1983, when they had a cartoon batter logo on the front of their uniforms and Harold Baines was the mayor of Chicago. Hoyt won the Cy Young despite a 3.66 ERA, 1.24 runs higher than league leader Rick "Don't Call me BJ" Honeycutt (who was traded midseason to the NL), and a 115 adjusted ERA (52 points worse than Honeycutt). Hoyt had just one more season with the Sox and was out of baseball altogether in three seasons.

Colon isn't as bad a choice as Hoyt was in 1983, but he ranks fourth in pitching Win Shares, fifth in BP's VORP (Value over Replacement Player), and 13th in adjusted ERA (i.e., ERA based on the park-adjusted league average, or ERA+). He does, however, rank numero uno in wins, with 21, three more than the next AL pitcher. My assumption is that those voters who looked past win totals split their vote among better candidates like Johan Santana, Mariano Rivera, and Mark Buehrle.

So Colon won. There was the usual gnashing of teeth and wringing of hands, but it all seems to have gotten a bit too de rigueur.

Given that I assume that the voters will be attracted to the wrong candidates, I'm left with minor questions. For example, it's obvious that Colon was not the best candidate, but how bad was he? Was he one of the top five pitchers in the AL? And are the voters solely attracted to the win-total baubles? If so, why did Kevin Millwood receive a vote when he won just nine games to go with his 2.86 ERA?

If you want to defend the award being given to Colon, to quote Jack Nicholson in the woefully inappropriately named "As Good As It Gets", "If you're selling crazy go sell it somewhere else. We're full up here."

Ok, first I took the top ten pitchers in VORP, P WS, and ERA+. Then I added wins, ERA, strikeout-to-walk ratio, Walks plus Hits over Innings Pitched (WHIP), and strikeouts per nine innings. I took the rankings for each of these and averaged them. Here are the results. First the Cy Young voting along with the sabermetrically leaning stats:

Pitcher	1st	2nd	3rd	Pts	Rk	P WS	Rk	VORP	Rk	ERA+	Rk
Johan Santana, MIN	3	8	12	51	3	23.1	2	73.0	1	153	3
Mark Buehrle, CHW	0	0	5	5	5	23.2	1	54.2	2	143	4
Roy Halladay, TOR				0		16.1	11	52.7	3	184	2
Bartolo Colon, LAA	17	11	0	118	1	19.2	4	51.1	5	120	13
Randy Johnson, NYY				0		16.4	10	44.1	11	117	14
Jon Garland, CHW	0	0	1	1	6	21.1	3	50.1	7	127	9
John Lackey, LAA				0		17.2	9	50.3	6	122	12
Mariano Rivera, NYY	8	7	7	68	2	17.3	8	32.3	26	323	1
Kevin Millwood, CLE	0	0	1	1	7	15.3	13	52.3	4	143	4
Jose Contreras, CHW				0		18.2	6	41.5	13	123	11
Cliff Lee, CLE	0	2	2	8	4	14.5	16	39.8	16	108	16
Freddy Garcia, CHW				0		18.6	5	45.6	9	115	15
Joe Blanton, OAK				0		14.4	18	44.3	10	127	9
Kenny Rogers, TEX				0		17.5	7	40.5	15	130	7
Jarrod Washburn, LAA				0		14.7	14	48.8	8	131	6
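
The Pts column is consistent with the ballot weighting in use at the time: five points for a first-place vote, three for second, and one for third. Here is a quick sketch to check that arithmetic. The function name is made up for illustration, and the 5-3-1 weights are inferred from the table rather than stated in the post.

```python
# Vote counts (1st, 2nd, 3rd) copied from the table above.
BALLOTS = {
    "Bartolo Colon": (17, 11, 0),
    "Mariano Rivera": (8, 7, 7),
    "Johan Santana": (3, 8, 12),
    "Cliff Lee": (0, 2, 2),
    "Mark Buehrle": (0, 0, 5),
}

# Assumed weights: 5 points for a first-place vote, 3 for second, 1 for third.
WEIGHTS = (5, 3, 1)


def cy_young_points(votes, weights=WEIGHTS):
    """Weighted sum of a pitcher's first-, second-, and third-place votes."""
    return sum(v * w for v, w in zip(votes, weights))


for name, votes in BALLOTS.items():
    print(f"{name}: {cy_young_points(votes)}")
```

Each total matches the Pts column, so Colon's 17 first-place votes alone (85 points) would have beaten Rivera's whole ballot.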

Next, the more conventional stats (though a number will still tick off Joe Morgan):

Pitcher	W	Rk	ERA	Rk	WHIP	Rk	K/9IP	Rk	K/BB	Rk
Johan Santana, MIN	16	5	2.87	4	0.97	3	9.25	1	5.29	2
Mark Buehrle, CHW	16	5	3.12	5	1.18	9	5.67	22	3.73	9
Roy Halladay, TOR	12	27	2.41	2	0.96	2	6.86	12	6.00	1
Bartolo Colon, LAA	21	1	3.48	10	1.16	5	6.35	18	3.65	10
Randy Johnson, NYY	17	4	3.79	20	1.13	4	8.42	5	4.49	6
Jon Garland, CHW	18	2	3.50	11	1.17	6	4.68	35	2.45	20
John Lackey, LAA	14	13	3.44	8	1.33	29	8.57	3	2.80	16
Mariano Rivera, NYY	7	58	1.38	1	0.87	1	9.19	2	4.44	7
Kevin Millwood, CLE	9	43	2.86	3	1.22	14	6.84	13	2.81	15
Jose Contreras, CHW	15	8	3.61	13	1.23	16	6.77	14	2.05	30
Cliff Lee, CLE	18	2	3.79	19	1.22	13	6.37	17	2.75	17
Freddy Garcia, CHW	14	13	3.87	23	1.25	18	5.76	21	2.43	21
Joe Blanton, OAK	12	27	3.53	12	1.22	12	5.19	28	1.73	42
Kenny Rogers, TEX	14	13	3.46	9	1.32	27	4.01	42	1.64	44
Jarrod Washburn, LAA	8	50	3.20	6	1.33	28	4.77	35	1.84	39

And the final rankings, based on the average of all rankings and then on all rankings but wins:

Pitcher	Avg Rk	Avg Rk (w/o W)
Johan Santana, MIN	2.63	2.29
Mark Buehrle, CHW	7.13	7.43
Roy Halladay, TOR	7.50	4.71
Bartolo Colon, LAA	8.25	9.29
Randy Johnson, NYY	9.25	10.00
Jon Garland, CHW	11.63	13.00
John Lackey, LAA	12.00	11.86
Mariano Rivera, NYY	13.00	6.57
Kevin Millwood, CLE	13.63	9.43
Jose Contreras, CHW	13.88	14.71
Cliff Lee, CLE	14.50	16.29
Freddy Garcia, CHW	15.63	16.00
Joe Blanton, OAK	19.75	18.71
Kenny Rogers, TEX	20.50	21.57
Jarrod Washburn, LAA	23.25	19.43
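
For anyone who wants to check the averaging, here is a minimal sketch for three of the pitchers. The ranks are copied from the two stat tables above; the function and variable names are made up for illustration, and the only assumption is that "Avg Rk" is a plain unweighted mean of the eight ranks.

```python
# Column order of the rank tuples below.
STATS = ("P WS", "VORP", "ERA+", "W", "ERA", "WHIP", "K/9IP", "K/BB")

# Ranks copied from the two stat tables above.
RANKS = {
    "Johan Santana": (2, 1, 3, 5, 4, 3, 1, 2),
    "Bartolo Colon": (4, 5, 13, 1, 10, 5, 18, 10),
    "Mariano Rivera": (8, 26, 1, 58, 1, 1, 2, 7),
}


def average_rank(ranks, skip=None):
    """Unweighted mean of the ranks, optionally leaving one stat out."""
    kept = [r for stat, r in zip(STATS, ranks) if stat != skip]
    return sum(kept) / len(kept)


for name, ranks in RANKS.items():
    overall = average_rank(ranks)
    without_wins = average_rank(ranks, skip="W")
    print(f"{name}: {overall:.3f} overall, {without_wins:.3f} without wins")
```

Dropping wins barely moves Santana, nudges Colon down, and roughly halves Rivera's average, since a closer's rank in wins (58th) drags down everything else.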

Colon comes in fourth overall and fifth if we ignore wins. So it goes to show you that if your major competition is two guys who won 16 games, another who won 12 in an abbreviated season, and a closer who didn't break any records (besides, many voters will tell you closers don't belong in the Cy Young voting, just the MVP vote), that shiny 21-win brass ring is going to attract the idiotic writers. Sorry, "idiotic writers" is a bit redundant.

But wait a second, maybe I am being too mean to the idiots, I mean, writers. Just because the guy with the most wins won the award, I should not just assume that wins are the be-all and end-all. Now that we have all of these data, we can put it to the test.

How well does the actual vote correlate to win totals, or to any of the stats that we have for that matter? Let's see…

I ran the numbers, and though none have any significant correlation to the Cy Young result, some do much better than others. Here are the results. Remember that we want a negative coefficient (because the voting points descend while the ranks ascend) tending toward -1.000.

Stat Rank	Correl to Pts
P WS	-0.343
VORP	0.086
W	-0.016
ERA	-0.246
ERA+	-0.083
WHIP	-0.463
K/9IP	-0.304
K/BB	-0.392
Avg Rk	-0.395
Avg Rk (w/o W)	-0.432
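
The post doesn't say which coefficient was used. As a rough sketch, assuming a plain Pearson correlation between each pitcher's vote points (zero for the pitchers who got no votes) and his rank in a stat, across the 15 pitchers in the tables above, the W and WHIP rows can be recomputed like this. The names below are made up for illustration.

```python
from math import sqrt

# Vote points for the 15 pitchers, in table order (Santana ... Washburn).
POINTS = [51, 5, 0, 118, 0, 1, 0, 68, 1, 0, 8, 0, 0, 0, 0]

# Stat ranks for the same 15 pitchers, copied from the tables above.
RANKS = {
    "W": [5, 5, 27, 1, 4, 2, 13, 58, 43, 8, 2, 13, 27, 13, 50],
    "WHIP": [3, 9, 2, 5, 4, 6, 29, 1, 14, 16, 13, 18, 12, 27, 28],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


for stat, ranks in RANKS.items():
    print(f"{stat}: {pearson(ranks, POINTS):.3f}")
```

Under those assumptions both values should land close to the table's. The near-zero figure for wins owes a lot to Rivera, whose 68 points sit next to a rank of 58th in wins.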

The first thing you might notice is that VORP and Cy Young actually have the worst correlation. The Cy Young vote actually runs slightly counter to VORP.

But the next thing that popped out was how poorly wins did: second to last! And the average ranking without wins did better than the overall average ranking. Apparently, wins alone are not the entire basis of the writers' vote.

Of all the derived SABR stats, pitching Win Shares does best—congrats to Bill James, I guess.

Oddly, the stat that correlated best was WHIP, while the other strikeouts/walks stats did better than most. So, am I left to believe that writers base their vote on strikeouts and walks? I guess. At least it's better than wins. Maybe by 2050 the troglodytic voters will have evolved to the neanderthalic ERA. In the age of Elroy Jetson, I am sure they will be trafficking in Pitching Win Shares and VORP. Then again, I expected everyone to be zooming to work with those jet packs on their backs that we were promised were just around the corner when we were kids.

Comments

2005-11-09 10:08:53

1. Todd S

Is it possible that some of the writers are intentionally lashing out at the more modern metrics? As in "I'll show those damn Moneyballers. I'm going to vote for whoever has the most wins, just to show 'em!" I hope that's not the case, of course, but that mentality seems to be present in the MSM right now. Even if it did factor in, it was probably a minority of the voters, but perhaps a big enough bloc to swing the vote?

2005-11-09 10:45:38

2. nickb

Interestingly, the Bill James "Cy Young Predictor" (http://sports.espn.go.com/mlb/features/cy) ranked the participants in this fashion:

1. Rivera
2. Colon
3. Nathan
4. Buehrle
5. Garland
6. Santana
7. KRod
8. Baez
9. Unit
10. Lackey

2005-11-09 11:59:13

3. YankeeInMichigan

Perhaps the writers are focusing on wins for the #1 spot and applying more open minds to the runner-up positions.

2005-11-09 21:45:25

4. Vince Galloro

"For those of you how are two young to remember, Hoyt won 24 games for the White Sox back in 1983, when they had a cartoon batter logo on the front of their uniforms"

Actually, the Sox were wearing the uniforms with pullover jerseys and the SOX across the front in white on a blue background with red horizontal stripes above and below.

2005-11-10 19:39:20

5. Brent is a Dodger Fan

4 Right you are! The cartoon logo was banished sometime in the early 80s or late 70s, can't recall exactly.

Mike: I want to quibble a bit with your method. Just a bit, I promise. You really shouldn't average the ranking across stats. It's sort of like averaging averages -- a statistical mistake.

The reason is this: Let's say one pitcher had an astonishing lead over the next best pitcher in some stat, like Santana's dominance in VORP (35% more VORP than the next guy!).

Move onto another stat: Win Shares. Buehrle's 23.2 gets him the 1, Santana's 23.1 gets him the 2.

Wow. Less than 1% difference in P WS but 35% in VORP? And it counts the same?

Instead, you need to find a way to normalize all the stats and then add them, then rank the result.

The way I did it was as follows: For each stat, divide each player's stat by the best in category (or, when lower is better, divide the best by the player's stat). This results in a ratio where the best player has 1.0 and all the players with worse stats get some lower ratio, like .94 or .75. Now these numbers are all on the same scale! Sum up the set of ratios. Then rank them.

This method, using all the stats you used, results in Santana, Buehrle, Garland, Colon, then Garcia. I'm not saying it is going to give a different end result, I just hate to see that statistical mistake made.
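
A minimal sketch of the ratio method described in this comment, restricted to five pitchers and five conventional stats from the article's tables. "Best in category" here means best within this small subset rather than best in the league, so the ordering it prints is only an illustration and is not expected to match the one reported above. The names are made up for the example.

```python
# Raw stats copied from the article's tables: (W, ERA, WHIP, K/9IP, K/BB).
STATS = {
    "Santana": (16, 2.87, 0.97, 9.25, 5.29),
    "Buehrle": (16, 3.12, 1.18, 5.67, 3.73),
    "Halladay": (12, 2.41, 0.96, 6.86, 6.00),
    "Colon": (21, 3.48, 1.16, 6.35, 3.65),
    "Garland": (18, 3.50, 1.17, 4.68, 2.45),
}

# For W, K/9IP and K/BB higher is better; for ERA and WHIP lower is better.
HIGHER_IS_BETTER = (True, False, False, True, True)


def ratio_scores(stats, higher_is_better):
    """Sum of each pitcher's per-stat ratio to the best value in that stat."""
    columns = list(zip(*stats.values()))
    best = [
        max(col) if hib else min(col)
        for col, hib in zip(columns, higher_is_better)
    ]
    scores = {}
    for name, values in stats.items():
        scores[name] = sum(
            (v / b) if hib else (b / v)
            for v, b, hib in zip(values, best, higher_is_better)
        )
    return scores


scores = ratio_scores(STATS, HIGHER_IS_BETTER)
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

Because every ratio tops out at 1.0, a narrow lead (Buehrle over Santana in Win Shares) costs the runner-up almost nothing, while a wide one costs a lot, which is the point of the quibble.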

2005-11-11 05:49:07

6. Mike Carminati

Brent,

I realize that it's just averaging rankings. It's not perfect, but it's not like averaging averages, it's like averaging rankings, which is of course what it is. If you want to weight the stats, that's fine.

And I apologizing the honor of these fine unis: http://www.baseballhalloffame.org/exhibits/online_exhibits/dressed_to_the_nines/detail_page.asp?fileName=al_1983_chicago.gif&Entryid=1480

2005-11-11 07:40:29

7. PhillyJ

Would the times (albeit few) that a reliever won the award have any bearing on the wins/CYA correlation?

Is it significant enough that if you removed the relievers the correlation would be higher?

2005-11-11 10:30:38

8. Mike Carminati

"And I apologizing the honor of these fine unis:"

Huh? That's, "I apologize for besmirching the honor of..." (or words to that effect).

2005-11-11 13:27:01

9. Brent is a Dodger Fan

6 Mike: you say tomay-to, I say tomah-to...

I believe averaging rankings is in the same category of statistical error as averaging averages, but I think we can agree that you were attempting a quick approach rather than a purely scientific one. Or, at least, one that is more likely to be used by sportswriters: look at some stats and see what it looks like to you.

More scientific approaches are already available: VORP and P WS are attempts at incorporating multiple stats, like K, BB, IP, and H, and yielding one measure that is predictive of contribution towards winning games. Though I've never seen the research, these stats presumably have more scientific value on their own than attempting to average rankings, or weight performance across individual stats, like the method shown above.

2005-11-11 13:33:17

10. Brent is a Dodger Fan

6 Oh, and it looks like the years with the cartoon Sox on the sleeve were 1971-1975, like here:

http://www.baseballhalloffame.org/exhibits/online_exhibits/dressed_to_the_nines/detail_page.asp?fileName=al_1975_chicago.gif&Entryid=1276

I guess the Sox couldn't really decide what their colors are:
1960: Black and white
1971: Red and white
1976: Black and white again
1982: Red white and blue
1987: Black and white again

Comment status: comments have been closed. Baseball Toaster is now out of business.