I don’t know the name for this phenomenon, but I’m guessing everyone has experienced it at some point. You hear something enough times, and you start to repeat it without really thinking critically about it. My example: the breakeven stolen base rate. I’ve heard that term so many times over the years, often in connection with whether teams were stealing too much or not enough, that I incorporated it into my thought processes like it was my own.
But then someone asked me why the optimal stolen base success rate was around 70%, and I realized that I’d been wrong. It was a bolt-of-inspiration kind of moment – you only need to hear the counter-argument once to re-assess your old, uncritically assumed thought. Why should teams keep stealing so long as they’re successful more than 70% (ish) of the time? I couldn’t explain it to myself using math.
The other side of the coin, the notion that teams should be successful at far better than the breakeven rate in the aggregate, is incredibly easy to understand. There’s a difference between marginal return and total return. Consider a business where you’re making investments. Your first investment makes $10. Your next one makes $8, then $6, and so on. You could keep investing until your business breaks even – until you make a negative $10 investment to offset that first one, more or less ($10+$8+$6+$4+$2+$0-$2-$4-$6-$8-$10). But that’s clearly a bad decision. You should stop when your marginal return stops being positive – when an investment returns $0, you can simply stop and pocket the $30 ($10+$8+$6+$4+$2+$0).
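If you’d rather see the toy example in code, here’s a minimal sketch – the dollar figures are just the made-up diminishing returns from above:

```python
# Hypothetical diminishing returns from the investment example above.
returns = [10, 8, 6, 4, 2, 0, -2, -4, -6, -8, -10]

# Investing all the way to breakeven nets you nothing:
total_at_breakeven = sum(returns)            # $0

# Stopping once the marginal return is no longer positive pockets the gains:
profit = sum(r for r in returns if r > 0)    # $30
```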
When it comes to stolen bases, not every opportunity is created equal. Statcast records caught stealing probabilities that take into account runner speed, distance from second, batter handedness, and all kinds of other variables you’d want to include to get a good estimate of success. In this year’s data set, which doesn’t contain every steal (double steals, steals of home, and failed pickoffs are notable exclusions), there were 644 steals where Statcast estimated a caught stealing probability of 5% or less. That estimate was pretty good! Those base stealers were caught only 1.2% of the time. Those are the easy money steals, the $10 you make on the first investment.
On the other hand, Statcast tabulated 184 steals where the model predicted a caught stealing percentage between 31% and 35%. Again, the model was pretty good – catchers threw out 38.6% of those would-be base stealers. That’s the negative $2 investment in this example. Those steals probably weren’t a good idea.
Now, a stolen base breakeven point still has meaning. Per our play-by-play database, the average successful stolen base event added 0.169 runs to a team’s expected run scoring. The average caught stealing event cost a team 0.394 runs. Do the math, and that means that a 70% success rate has zero expected value. Exclude double steals from the analysis, and it’s about 71%.
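Spelling out that arithmetic – a quick sketch using the run values quoted above:

```python
# Average run-expectancy changes quoted above.
sb_value = 0.169  # runs gained on a successful steal
cs_cost = 0.394   # runs lost on a caught stealing

# An attempt breaks even when p * sb_value - (1 - p) * cs_cost = 0,
# which solves to p = cs_cost / (sb_value + cs_cost).
breakeven = cs_cost / (sb_value + cs_cost)   # ~0.70
```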
Expected value isn’t the only thing determining whether it’s a good time to steal, of course. Who’s batting next matters. Game state matters. Whether the pitcher gets spooked by successful steals probably matters, though definitely not in a way I’d feel comfortable saying we could measure. But in a broad sense, you can think of 70% as a rule of thumb line. You should need a good reason to attempt a steal if you think it’ll be successful less than 70% of the time, and likewise, you should need a good reason not to steal if you’re going to be successful a lot more than 70% of the time.
What does that mean for the league-wide stolen base success rate? Let’s go back to my marginal return example from earlier. The attempts with a caught stealing percentage below 5%? They’re the $10 investment. The steals with a caught stealing rate between 5% and 10%? They’re more like the $8 investment, and so on. I tabulated all that data (see the appendix below for a quick discussion of that) and used that to estimate what the overall stolen base success rate would look like if players only stole when the marginal returns were above zero.
In other words, I took all of the stolen base attempts with an estimated caught stealing percentage of 30% or below and added them together. That’s most of the tracked steals in the database, believe it or not. Statcast estimated probabilities for 3,410 steals in 2024. A full 2,764 of those carried caught stealing probabilities of 30% or lower. Those 2,764 opportunities resulted in 2,397 steals and 367 times caught stealing, an 86.7% success rate.
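Here’s that aggregation as a sketch, with attempt counts taken from the appendix table and the caught-stealing counts backed out from its actual rates (so they’re rounded to whole steals):

```python
# (attempts, times caught) for each bucket with a modeled caught
# stealing probability of 30% or below; CS counts are derived from
# the appendix table's actual rates.
buckets = [
    (644, 8),     # 0-5% bucket
    (527, 26),    # 6-10%
    (543, 70),    # 11-15%
    (447, 100),   # 16-20%
    (356, 86),    # 21-25%
    (247, 77),    # 26-30%
]

attempts = sum(n for n, _ in buckets)           # 2,764
caught = sum(cs for _, cs in buckets)           # 367
success_rate = (attempts - caught) / attempts   # ~86.7%
```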
All the stolen base opportunities with positive marginal value – the ones where the runner is on the right side of breakeven – have an aggregate average success rate of roughly 87%. If the league is below that, there are some bad steals in the mix. Given that the overall success rate in Statcast’s sample is 80.8% (again, it excludes some types of steals), it’s clear that some amount of bad stolen base attempts are bringing the whole sample down.
Here’s another way of thinking about it: Using my average run expectancy changes from up above, the “good steals” added 260 runs of expected scoring to their teams. But if you look at all tracked stolen base attempts as a whole, you get only 207 runs of total value. The “bad steals” cost teams 53 runs, in other words.
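The back-of-the-envelope version of that calculation looks like this – the steal/caught split for the full sample is approximate, backed out from the 80.8% success rate:

```python
sb_value, cs_cost = 0.169, 0.394   # average run values from earlier

# "Good" steals: attempts with a modeled CS probability of 30% or below.
good_runs = 2397 * sb_value - 367 * cs_cost    # ~260 runs added

# All tracked attempts: 3,410 at an 80.8% success rate works out to
# roughly 2,755 steals and 655 caught stealing.
all_runs = 2755 * sb_value - 655 * cs_cost     # ~207 runs added

bad_runs = all_runs - good_runs                # ~-53 runs
```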
Interestingly enough, the “bad steals” were about as bad as the “good steals” were good. The average good steal added 0.090 runs per attempt. The average bad steal cost 0.082 runs per attempt. There were far more good attempts than bad – 81% of steals tracked by Statcast fell on the right side of the breakeven line – but that bottom 19% is dragging down the overall numbers.
That 70% line is hardly a bright dividing line. There are stolen base attempts with a breakeven well below 70%, and ones with a breakeven above it. It’s an aggregate number only, and I won’t claim to have an opinion on every single steal attempt all year. But as a general rule of thumb, it’s fair to say that roughly a fifth of the steals that were attempted this year were negative-expectation undertakings.
Another complication: It’s not like there’s a blinking red light telling you the odds of successfully stealing a base on every play. Tiny fractions of a second separate an 80% chance from a 65% chance. The pitcher throwing a fastball up instead of a changeup down could easily account for it. If you’re willing to end up with a few attempts with marginally negative expected value in exchange for being more aggressive overall, that could change the calculus slightly.
Let’s say teams are fine with stolen base attempts that are only 65% likely to succeed – breakeven plus a margin of error. Add that bucket in to our hypothetical group of good-decision stolen base attempts, and we get an overall success rate of 85.1%, and a total of 252 runs added. That feels closer to a reasonable estimate to me – I’d rather have my baserunners be aggressive with the new rules, personally.
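Extending the earlier aggregation by one bucket shows where those figures come from – again with the caught-stealing count backed out from the table’s 38.6% rate:

```python
sb_value, cs_cost = 0.169, 0.394   # average run values from earlier

# Add the 31-35% bucket (184 attempts, ~71 caught) to the 2,764
# attempts and 367 caught from the 30%-and-under group.
attempts = 2764 + 184        # 2,948
caught = 367 + 71            # 438
steals = attempts - caught   # 2,510

success_rate = steals / attempts                    # ~85.1%
runs_added = steals * sb_value - caught * cs_cost   # ~252 runs
```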
You can quibble with a lot of the particular assumptions here. Maybe the breakeven rate is a bit different from my estimate. Maybe the cost of steals on the player at the plate – taking pitches to give the runner a chance, getting distracted by a shifting defense, and so on – changes the math. Baseball is a lot more complex than my little simplification. But one thing is for sure: If your team is succeeding in its stolen base attempts at the breakeven rate, it’s stealing too often. Don’t focus on getting your overall numbers to breakeven – focus on the marginal breakeven steal, and stop stealing after that.
Appendix: The trend is your friend, except at the end when it bends
Here’s a chart of Statcast’s estimated caught stealing percentage compared to actual rates, bucketed into five-percentage-point groups:
Whoa, the right side is pretty funky, huh? The first half of the graph looks just about perfect, and then things get weird. Is something strange with the numbers?
Not really! There are two things going on here, each of which highlights a limitation of this type of analysis. First, the graph is lying to you. The data follows the trend line right up until around a 50% caught stealing probability, at which point it gets wild. But the left side isn’t half the sample – it’s 94% of the sample. Nearly every steal attempt carries a caught stealing estimate below 50%. Of course it does! That’s how you get an actual caught stealing rate of around 20%. There’s an inordinate amount of noise in the right half of the graph – the entire right half contains only about a third as many observations as the left-most data point alone. Here it is in table form:
Caught Stealing Rates, Modeled vs. Actual
Bucket | Count | Modeled CS Rate | Actual CS Rate |
---|---|---|---|
0-5% | 644 | 2.1% | 1.2% |
6-10% | 527 | 8.1% | 4.9% |
11-15% | 543 | 13.0% | 12.9% |
16-20% | 447 | 17.9% | 22.4% |
21-25% | 356 | 22.9% | 24.2% |
26-30% | 247 | 27.8% | 31.2% |
31-35% | 184 | 32.9% | 38.6% |
36-40% | 122 | 37.7% | 45.1% |
41-45% | 86 | 42.5% | 50.0% |
46-50% | 57 | 48.2% | 50.9% |
51-55% | 54 | 53.1% | 44.4% |
56-60% | 32 | 58.2% | 43.8% |
61-65% | 26 | 62.4% | 53.8% |
66-70% | 22 | 68.1% | 40.9% |
71-75% | 21 | 73.1% | 42.9% |
76-80% | 16 | 77.3% | 56.3% |
81-85% | 13 | 83.3% | 46.2% |
86-90% | 7 | 88.4% | 28.6% |
91-95% | 5 | 92.6% | 40.0% |
96-100% | 1 | 96.0% | 100.0% |
Second, imagine what a play with a 75% caught stealing chance looks like. Maybe it’s a busted hit and run attempt, or maybe the runner fell down. Most likely, though, it’s a delayed steal, and trust me when I say that a model that depends primarily on runner position and speed is going to have trouble with delayed steals, particularly when they’re a tiny part of the sample.
I watched every stolen base attempt with an estimated caught stealing chance of higher than 50%. The vast majority of the weird ones – the 86-90% bucket has seven steals in it, and five were successful – were delayed attempts that preyed on defensive inattention. If the catcher were firing down to second base at full speed every time, I have no doubt that it’d be an out almost every time. In my eyes, this is a classic case of a model that is very good in the general case having trouble with some trend-breaking outliers.