Baseball Toaster was unplugged on February 4, 2009.
Part I
Baseball Reference’s Situational Records page
This is another installment in our ongoing struggle to understand how teams win. Well, maybe it's not the causal connection to winning that we are chasing but rather the fossil record, the physical evidence that is left behind in the statistical record.
In this leg of that struggle I would like to look at a few items in response to the comments I got at the time as well as some items that are new to the study or had been inadvertently overlooked in the first go-round.
I wasn't satisfied by the lack of correlation between a team's record in one-run games and its overall winning percentage. A one-run game could be a 2-1 pitchers' duel or a 10-9 slugfest. I thought about putting a finer point on what sort of one-run game the study should be concerned with. Maybe teams that won low-scoring, one-run games were more likely to be winning teams.
I divided the one-run game population into three groups: low-scoring games (total score no higher than five runs, designated "LS"), high-scoring games (10 runs or higher in total, designated "HS"), and then everything in between (6-9 total runs, designated the "Rest"). Here are the correlation coefficients for each group based on actual and expected winning percentage:
Decade | 1-R LS Win% | 1-R LS Exp Win% | 1-R HS Win% | 1-R HS Exp Win% | 1-R Rest Win% | 1-R Rest Exp Win% |
Total | 0.056 | 0.058 | 0.017 | 0.018 | 0.026 | 0.027 |
1900s | 0.061 | 0.070 | 0.126 | 0.131 | 0.102 | 0.098 |
1910s | 0.109 | 0.109 | 0.020 | 0.018 | 0.179 | 0.180 |
1920s | -0.008 | -0.012 | 0.077 | 0.069 | -0.095 | -0.092 |
1930s | 0.094 | 0.092 | 0.184 | 0.195 | -0.011 | -0.009 |
1940s | 0.094 | 0.094 | 0.052 | 0.057 | 0.039 | 0.038 |
1950s | 0.006 | 0.004 | -0.031 | -0.033 | 0.083 | 0.081 |
1960s | 0.056 | 0.057 | 0.032 | 0.042 | -0.003 | -0.001 |
1970s | 0.072 | 0.067 | -0.073 | -0.081 | 0.013 | 0.018 |
1980s | 0.048 | 0.053 | -0.062 | -0.059 | -0.038 | -0.037 |
1990s | 0.062 | 0.062 | -0.094 | -0.094 | 0.096 | 0.098 |
2000s | 0.039 | 0.050 | 0.071 | 0.083 | -0.131 | -0.129 |
So what does this tell us? Well, that the splitouts correlate even worse with winning percentage than the overall one-run gestalt. So clearly, team records in one-run games are a dead end.
What about what we called "save situations", games decided by three or fewer runs? Could they give us a clearer picture if we split them out in a similar fashion (i.e., the groups above)? Here are those splitouts:
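For readers who want to try the same split on their own game logs, here is a minimal sketch of the grouping and the correlation calculation. The function names are my own invention for illustration, and I am assuming an ordinary Pearson correlation coefficient; the study does not spell out which coefficient was used.

```python
from math import sqrt


def one_run_group(runs_for, runs_against):
    """Classify a game as 'LS', 'HS', or 'Rest'; return None if it is not a one-run game."""
    if abs(runs_for - runs_against) != 1:
        return None
    total = runs_for + runs_against
    if total <= 5:       # low-scoring: five total runs or fewer
        return "LS"
    if total >= 10:      # high-scoring: ten total runs or more
        return "HS"
    return "Rest"        # everything in between


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

To reproduce a cell in a table like the one above, you would compute each team's winning percentage within a group and pass that list, along with the teams' overall winning percentages, to `pearson`.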
Decade | BR Sv LS Win% | Exp BR Sv LS Win% | BR Sv HS Win% | Exp BR Sv HS Win% |
Total | 0.623 | 0.617 | 0.582 | 0.568 |
1900s | 0.001 | 0.728 | 0.740 | 0.685 |
1910s | -0.001 | 0.798 | 0.770 | 0.558 |
1920s | 0.001 | 0.534 | 0.474 | 0.660 |
1930s | 0.000 | 0.695 | 0.706 | 0.657 |
1940s | 0.686 | 0.696 | 0.678 | 0.652 |
1950s | 0.603 | 0.620 | 0.551 | 0.557 |
1960s | 0.681 | 0.680 | 0.523 | 0.488 |
1970s | 0.583 | 0.582 | 0.565 | 0.575 |
1980s | 0.579 | 0.528 | 0.473 | 0.419 |
1990s | 0.515 | 0.495 | 0.486 | 0.481 |
2000s | 0.502 | 0.494 | 0.603 | 0.595 |
Again, the breakdowns correlate more poorly than the larger group (i.e., "save situations"). This is probably due to the greater variability as the groups get smaller.
The last thing I would like to look at with respect to team situational records is inter-league record, which I referred to earlier today. Are inter-league games complete crapshoots, or does a team's overall winning percentage have some bearing on how well it performs? Unfortunately, we only have seven years' worth of data to look at, but let's see what we get:
Decade | BR Inter-Lg Win% | Exp BR Inter-Lg Win% |
Total | 0.516 | 0.502 |
1990s | 0.345 | 0.372 |
2000s | 0.624 | 0.592 |
As I expected, inter-league record correlates somewhat poorly with overall record. This decade, however, the correlation has improved immensely. Perhaps this is due to changes in how inter-league play is scheduled: it is now completed in the first half of the season, and the rivals now rotate. Or perhaps it's all just luck, the variability that comes with a small sample size.
Next, I would like to turn to some other batting statistics to see how well they correlate with winning percentage. In the first part of the study, we looked at the standard batting ratios. Now, I would like to look at home runs, walks, and strikeouts. Each will be expressed as a rate per team plate appearance, adjusted for the league average.
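As a rough sketch of what such an adjusted rate might look like: the version below divides the team's per-plate-appearance rate by the league's, so that 1.0 means exactly league average. That choice of adjustment (dividing rather than, say, subtracting) is my assumption, and the function name and the numbers in the example are hypothetical.

```python
def league_adjusted_rate(team_count, team_pa, league_count, league_pa):
    """Team rate per plate appearance relative to the league rate (1.0 = league average).

    Assumes "adjusted for the league average" means a simple ratio of rates.
    """
    team_rate = team_count / team_pa
    league_rate = league_count / league_pa
    return team_rate / league_rate


# Hypothetical example: a team with 180 home runs in 6,000 plate appearances
# in a league that hit 2,400 home runs in 96,000 plate appearances.
adjusted_hr = league_adjusted_rate(180, 6000, 2400, 96000)
```

In this made-up case the team homers at 0.030 per plate appearance against a league rate of 0.025, so it comes out about 20 percent above average.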
We'll also examine a new, derived stat that I'll call the "small ball" factor. It's designed to calculate how well a team plays "small ball". This is a rather abstract and subjective evaluation. For example, how can you measure how well a team plays hit-and-run? Or how well it moves a runner from first to second by hitting behind him? My solution was to base the stat on what statistical evidence we do have: bunts, sac flies, stolen bases, caught stealing, and grounding into double plays (avoiding them being evidence of good hit-and-run skills). The "small ball" stat will consist of total bunts, sac flies, and stolen bases minus GIDP and caught stealing. The stat is made more problematic because baseball didn't consistently record all of these stats until relatively recently.
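Spelled out as code, the raw "small ball" total described above is just a sum and a difference. This is only a sketch: the function name is hypothetical, and returning `None` when a component was not recorded for a given season is my own assumption about how to handle the gaps in the historical record.

```python
def small_ball_factor(bunts, sac_flies, stolen_bases, gidp, caught_stealing):
    """Bunts + sac flies + stolen bases, minus GIDP and caught stealing.

    Returns None if any component is missing (passed as None), since the
    total is not comparable across seasons when a stat was not recorded.
    """
    components = (bunts, sac_flies, stolen_bases, gidp, caught_stealing)
    if any(value is None for value in components):
        return None
    return bunts + sac_flies + stolen_bases - gidp - caught_stealing


# Hypothetical team season: 60 bunts, 45 sac flies, 120 steals,
# 110 double plays grounded into, 40 times caught stealing.
example = small_ball_factor(60, 45, 120, 110, 40)
```

The made-up season above works out to 60 + 45 + 120 - 110 - 40 = 75.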
To be continued…