Rob Neyer has some interesting comments stemming from an argument in Moneyball over the relative importance of on-base percentage and slugging percentage, and on the validity of OPS (on-base plus slugging).
Here's an abridged version of the Moneyball text:
OPS was the simple addition of on-base and slugging percentages. Crude as it was, it was a much better indicator than any other offensive statistic of the number of runs a team would score. Simply adding the two statistics together, however, implied that they were of equal importance...An extra point of on-base percentage was clearly more valuable than an extra point of slugging percentage -- but by how much? ... In [the resulting] model an extra point of on-base percentage was worth three times an extra point of slugging percentage.
But three-to-one at what point? Clearly, as Neyer opines, they are not saying that a player with a .200 on-base percentage is equal to a .600 slugging hitter. Neyer states that he "came to the conclusion that while OPS ain't bad, a better measure would be the sum of slugging percentage and OBP*1.4 (or thereabouts)... So yes, OPS is a crude tool, a blunt object that shouldn't be used when precision is critical."
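To see what Neyer's 1.4 weight actually does, here's a quick sketch comparing plain OPS with the weighted version for two hypothetical hitters (the slash lines are made-up numbers, not real players):

```python
# Compare plain OPS with Neyer's weighted OPS' for two hypothetical hitters.
# The 1.4 multiplier on OBP is Neyer's rough estimate quoted above.

def ops(obp, slg):
    return obp + slg

def ops_prime(obp, slg, obp_weight=1.4):
    return obp_weight * obp + slg

# Made-up (OBP, SLG) lines: a patient on-base machine vs. a free-swinging slugger.
patient = (0.400, 0.420)
slugger = (0.310, 0.510)

print(ops(*patient), ops(*slugger))              # plain OPS rates them dead even
print(ops_prime(*patient), ops_prime(*slugger))  # OPS' favors the on-base skill
```

Both hitters come out at .820 in plain OPS, but OPS' separates them in favor of the higher on-base percentage, which is exactly the distinction Neyer is after.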
However, we have to use something as a yardstick, or Mario Mendoza would look like Babe Ruth. Well, maybe not. It got me thinking about how well the various batting averages have correlated with runs historically. I compared batting average, on-base percentage, slugging percentage, OPS, and Neyer's modified OPS' (OBP*1.4 + Slug) against runs scored for all major-league teams to determine which correlated best.
Here's what I got. The higher the correlation coefficient the better:
So Neyer's OPS' is best historically, and regular OPS edges out on-base percentage. That all seems to make intuitive sense.
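The comparison above boils down to computing a Pearson correlation coefficient between each team-level stat and team runs. Here's a minimal sketch of that calculation; the team seasons below are made-up numbers for illustration, not the actual historical data:

```python
# Sketch of the stat-vs-runs comparison: Pearson correlation of each
# team-level batting stat against team runs scored.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical team seasons: (AVG, OBP, SLG, runs) -- illustrative only.
teams = [
    (0.270, 0.340, 0.420, 780),
    (0.255, 0.320, 0.390, 700),
    (0.280, 0.355, 0.445, 845),
    (0.248, 0.310, 0.405, 690),
    (0.262, 0.335, 0.430, 770),
]

runs = [t[3] for t in teams]
stats = {
    "AVG":  [t[0] for t in teams],
    "OBP":  [t[1] for t in teams],
    "SLG":  [t[2] for t in teams],
    "OPS":  [t[1] + t[2] for t in teams],
    "OPS'": [1.4 * t[1] + t[2] for t in teams],
}

for name, xs in stats.items():
    print(f"{name:5s} r = {pearson(xs, runs):.3f}")
```

Run this over every team-season in a given period and the stat with the highest r is the best linear predictor of runs for that period.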
I next did the same thing broken down by decades:
Note that initially batting average was the best predictor of runs scored. Then on-base percentage ruled in the 1880s. Ever since then, OPS (or OPS') has shown the best correlation to runs scored.
But it's odd how wildly the correlation coefficients fluctuate. One would think that a stat would predict well from decade to decade, or at least that the relationship would evolve gradually rather than swing wildly back and forth.
I think there is some way to use linear regression to weight the different averages properly based on era, but figuring out what constitutes an era may be the difficult part. The data could be split up by decade, but that's sort of an artificial rule being imposed on the system. Perhaps runs per game could be used as a means to stratify the major-league seasons, thereby chunking them into like groups.
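The regression idea can be sketched directly: fit runs = a + b_obp*OBP + b_slg*SLG by ordinary least squares over the team seasons in an era, and the ratio b_obp/b_slg is the data's own answer to the "how much is a point of OBP worth" question. Here's a self-contained version using the normal equations; the team numbers are toy data I generated from an exact 1.4-to-1 weighting, so the fit should recover it:

```python
# Sketch: fit runs = a + b_obp*OBP + b_slg*SLG by ordinary least squares
# (normal equations solved by small-scale Gaussian elimination), then read
# off the implied OBP-to-SLG weight ratio.

def solve(A, b):
    """Solve the small square system A x = b by Gauss-Jordan elimination."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Toy team seasons (OBP, SLG, runs), generated from runs = 1400*OBP + 1000*SLG - 100
# so the fitted OBP weight should come out near 1.4-to-1.
teams = [
    (0.340, 0.420, 796),
    (0.320, 0.390, 738),
    (0.355, 0.445, 842),
    (0.310, 0.405, 739),
    (0.335, 0.430, 799),
    (0.345, 0.400, 783),
]

# Normal equations: (X^T X) beta = X^T y, with design rows [1, OBP, SLG].
X = [[1.0, obp, slg] for obp, slg, _ in teams]
y = [r for _, _, r in teams]
XtX = [[sum(X[k][i] * X[k][j] for k in range(len(X))) for j in range(3)] for i in range(3)]
Xty = [sum(X[k][i] * y[k] for k in range(len(X))) for i in range(3)]
a, b_obp, b_slg = solve(XtX, Xty)
print(f"OBP weight relative to SLG: {b_obp / b_slg:.2f}")
```

Fit this separately within each era (however the seasons end up being grouped) and the ratio itself becomes a function of the run-scoring environment.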
I'll have to think about this a bit more, but I think it's doable. Maybe I'll wait until after Amazon gets around to sending me my copy of Moneyball.