Brad Wilkerson's sequential cycle got me to thinking about the odds of sequential cycles (single, double, triple, and home run in order) and cycles in general. After going over the probabilities, Wilkerson's feat seems even more remarkable to me.
In the world of probabilities, a four-at-bat, sequential cycle is the rarest feat. Why? It's all probability, and probability is our friend. Hitting a sequential cycle in 4 at-bats is like rolling a die and getting 1, 2, 3, and 4 in order, except that the die would have to be a bit odd-sided, since the probability of hitting a single or of not getting a hit at all is much higher than that of getting a triple or home run.
Historically, the odds of getting a hit are not great. Here are the odds for each type of hit based on total plate appearances across major-league history (TPA = AB + BB + HBP + SF + SH):
Single: 17.10%
Double: 3.83%
Triple: 0.88%
Home Run: 1.60%
Today, doubles and home runs are more plentiful, and singles and triples rarer, than the historical average. (In fact, no league has recorded a single or triple percentage as high as the historical average since the mid-Forties.)
So unlike a die, on which the odds of getting a one are the same as the odds of getting any other number, the odds are much, much higher of not getting a hit and then, if a hit is made, of collecting a single. Calculating the odds of a sequential cycle in four at-bats works the same as with dice, except that instead of using 1-in-6 as the odds, you insert the odds above (for a historical average). So whereas a 1-2-3-4 roll would be 1/6 * 1/6 * 1/6 * 1/6 or about 0.00077 (7.7 * 10^-4), the odds of a sequential cycle would be 17.10% * 3.83% * 0.88% * 1.60% or about 0.00000092 (9.2 * 10^-7). Given that there have been 179,277 games in major-league history, each with two teams of nine batters, the expected number of 4-at-bat, sequential cycles in major-league history is just 2.97.
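For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation above. The variable names are mine, and the rates and game count are simply the figures quoted in this post.

```python
# Historical per-plate-appearance rates quoted in the post.
P_SINGLE, P_DOUBLE, P_TRIPLE, P_HOME_RUN = 0.1710, 0.0383, 0.0088, 0.0160

# Four fair-die rolls landing 1-2-3-4 in that order.
p_dice = (1 / 6) ** 4

# Single, double, triple, home run, in that order, over four plate appearances.
p_sequential = P_SINGLE * P_DOUBLE * P_TRIPLE * P_HOME_RUN

# Games in major-league history (figure from the post),
# each with two teams of nine batters.
GAMES = 179_277
batter_games = GAMES * 2 * 9

# Expected number of 4-at-bat sequential cycles, all-time.
expected_sequential = p_sequential * batter_games

print(f"dice 1-2-3-4:      {p_dice:.2e}")
print(f"sequential cycle:  {p_sequential:.2e}")
print(f"expected all-time: {expected_sequential:.2f}")
```

Because the rates are rounded to two decimal places, the last digit of the expected total can differ slightly from the figure in the text.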
The previous sequential cycler was Jose Valentin in 2000, but he did it in a 5-at-bat game. The odds of a 5-at-bat, sequential cycle go up somewhat. The at-bat that was not part of the sequential cycle would have to come before or after the other four hits, and it wouldn't matter what he did in that plate appearance. Using the die example, there are two scenarios: X-1-2-3-4 or 1-2-3-4-X (where X is the mystery at-bat). Therefore, the odds increase two-fold.
I don't think even Retrosheet can help us determine if that 2.97 expected value is in the ballpark or not. We are left with old Sporting News microfilm. I think I'll pass for now.
Next in the cycle food chain, you have the four-at-bat, not-necessarily-sequential variety. Let's revisit the die and assume that we throw a one first (to limit the results). To complete a cycle in 4 tries, here are the possibilities:
1-2-3-4
1-2-4-3
1-3-2-4
1-3-4-2
1-4-2-3
1-4-3-2
If my fancy ciphering works, that's six. Now, given that there are four ways to start that run (1, 2, 3, or 4), there are 24 possibilities (4 * 6 or 4!, four factorial, if you prefer). That means the odds of a four-at-bat cycle historically are 0.0022%, which translates into 59.68 expected cycles all-time, if all games were based on 4 at-bats. But they're not.
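The counting can be double-checked by brute force. This is just a sketch: the labels "1B", "2B", "3B", and "HR" and the variable names are mine, and the rates are the historical averages used earlier in the post.

```python
from itertools import permutations

# Historical per-plate-appearance rates from the post.
RATES = {"1B": 0.1710, "2B": 0.0383, "3B": 0.0088, "HR": 0.0160}

# Every ordering of the four hit types.
orders = list(permutations(RATES))

# Orderings that begin with a single (the "throw a one first" case).
starting_with_single = [o for o in orders if o[0] == "1B"]

# Any one specific ordering has the same probability: the product of the rates.
p_one_order = 1.0
for rate in RATES.values():
    p_one_order *= rate

# A 4-at-bat cycle in any order.
p_cycle_4 = len(orders) * p_one_order

print(len(starting_with_single), len(orders))
print(f"{p_cycle_4:.4%}")
```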
In a 5-at-bat (or 5-plate-appearance, to be exact) game, the odds go up five-fold. Again using the die, if we had five throws to collect 1, 2, 3, and 4 in any order, we would have one "free" throw that could be anything: a hit, an out, an interference call, a Jeffrey Maier catch. Who cares? Using just the first example above, we could have five combinations to achieve the cycle:
X-1-2-3-4
1-X-2-3-4
1-2-X-3-4
1-2-3-X-4
1-2-3-4-X
(Where X is the "free" throw)
So each 4-at-bat combination now propagates to five 5-AB combinations, which means that hitting for the cycle in 5 plate appearances is five times as likely as in four. This means that there are 120 combinations that can result in a cycle (i.e., five times the 24 from 4-ABs, or 5!).
A six-at-bat game is 25 times more likely to result in a cycle than a 4-AB game since there are two "free" throws. Here are the resulting 6-AB combinations from just the first 5-AB combination above:
(Where X is the "free" throw)
Therefore, there are five 6-AB combinations for every 5-AB one and 25 for every 4-AB one. That means that there are 600 combinations in a 6-AB game that can result in a cycle (5 * 5 * 4!).
Also, I should mention that the odds of each combination do not change because the odds of getting anything in the "free" throw are 1. If you throw the die, you have to get something. Such a certainty is assigned the highest probability, one. So one times the combination percentage is still the original percentage.
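Here is a rough sketch of how those combination counts turn into odds, following the counting argument in this post (4! orderings, times five for each "free" throw). The function name `cycle_combinations` is my own invention, and the post only works through 4, 5, and 6 plate appearances, so treat anything beyond that as an extrapolation.

```python
from math import factorial

# Probability of one specific ordering of single, double, triple, home run,
# using the historical rates from the post.
p_one_order = 0.1710 * 0.0383 * 0.0088 * 0.0160


def cycle_combinations(plate_appearances):
    """Combination count per the post's argument: 4! orderings of the hits,
    multiplied by 5 for each plate appearance beyond the fourth."""
    if plate_appearances < 4:
        return 0  # not enough chances to collect all four hits
    return factorial(4) * 5 ** (plate_appearances - 4)


for pa in (4, 5, 6):
    combos = cycle_combinations(pa)
    # Each combination keeps the same probability, since a "free" throw has odds of 1.
    print(pa, combos, f"{combos * p_one_order:.4%}")
```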
So what does this all mean, if anything? It means that we can take the probability of hitting a single, double, triple, and home run for each league-year and, using the number of games, determine the expected occurrences of a cycle. Then we can see how they compare with the actual totals.
At the risk of overkill, I now list the odds of hitting for the cycle and the expected total of cycles per league and year based on 4-, 5-, and 6-plate-appearance games. Also, I include an "Avg Exp" column, which calculates the expected number of cycles based on the actual average of plate appearances per game (usually around 4.25). Lastly, I list the actual number of times a batter hit for the cycle in each league-year. (Note that I could only find NL and AL data. The actual totals are based on Baseball-Almanac.com's cycle data):
[Unfortunately, the new and improved Blogger ate the table.]