How Strovex calculates home run probability
Every morning before first pitch, the model scores each batter in today's lineups using 13 data-driven factors. Here's exactly what goes in, how much each factor is weighted, and why.
The Big Picture
Start with a base rate
We estimate how likely this specific batter is to hit a home run in any given plate appearance, using their season HR rate blended with the league average through Bayesian shrinkage.
Apply 13 multipliers
Each factor adjusts the base rate up or down. Multipliers above 1.0 boost probability, below 1.0 suppress it. They're organized into three tiers by how much signal they carry.
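As a rough illustration of how multipliers compound, here is a minimal sketch that assumes the factors are simply multiplied together. The function name and the example values are hypothetical and not taken from the model.

```python
from math import prod

def apply_multipliers(base_rate: float, multipliers: list[float]) -> float:
    """Scale the base rate by every factor multiplier in turn."""
    return base_rate * prod(multipliers)

# Hypothetical values, not taken from the model: two boosting
# factors (above 1.0) and one suppressing factor (below 1.0).
adjusted = apply_multipliers(0.04, [1.20, 1.10, 0.95])
print(f"{adjusted:.4f}")
```

Because the factors multiply, two modest boosts of 20% and 10% combine to 32%, not 30%, which is why several aligned factors can move the result quickly.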
Classify by confidence tier
The final probability determines the pick's confidence tier. HIGH picks (≥12%) represent elite matchups where multiple factors align. MEDIUM picks (7% to under 12%) are solid plays worth tracking.
Base Rate — Bayesian Shrinkage
Before any factors are applied
We don't take a batter's April HR rate at face value. If someone hits 3 HRs in their first 30 PA, that's a 10% rate, but it's almost certainly luck: 30 plate appearances is too small a sample to trust.
Instead, we use Bayesian shrinkage — the smaller the sample, the more we blend toward the league average (3.3% HR rate per PA). As a batter accumulates more plate appearances, their observed rate is trusted more and more.
Stabilization points: HR rate 500 PA · Barrel rate 250 PA · Exit velocity 100 PA · ISO 400 PA
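The blend can be sketched as a weighted average. This example assumes a common convention in which the stabilization point acts as the weight of the league prior, so a batter with exactly that many PA is weighted 50/50. The exact formula the model uses is not stated here, and the function and parameter names are illustrative.

```python
LEAGUE_HR_RATE = 0.033  # league HR rate per PA, from the text above

def shrink(observed_rate: float, sample_pa: int,
           league_rate: float, stabilization_pa: int) -> float:
    """Blend an observed rate toward the league average.

    Assumption: weight on the observed rate is PA / (PA + stabilization),
    so small samples stay close to the league rate.
    """
    weight = sample_pa / (sample_pa + stabilization_pa)
    return weight * observed_rate + (1 - weight) * league_rate

# The example from the text: 3 HR in 30 PA (10% observed),
# with the 500 PA stabilization point for HR rate.
early_season = shrink(0.10, 30, LEAGUE_HR_RATE, 500)
print(f"{early_season:.4f}")  # 0.0368
```

Under this assumption the 10% hot start is pulled down to roughly 3.7%, only slightly above the league average.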
Tier 1 — Batter & Pitcher Season Stats
Strongest signal · highest weights · core of the model
Pitcher HR/9
How many home runs does this pitcher give up per 9 innings? The single strongest external factor. A pitcher allowing 2.0 HR/9 is significantly more exploitable than the league average of 1.25.
Barrel Rate
What % of a batter's batted balls are barrels, the hardest, most optimally hit contact? Barrels are the strongest predictor of home run power. Blended with Bayesian shrinkage (stabilizes around 250 PA).
Isolated Power (ISO)
Slugging % minus batting average — a pure measure of extra-base power, independent of singles. Captures how much raw power a batter generates. Needs ~400 PA to stabilize.
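Since slugging minus batting average cancels out singles, ISO can also be computed directly from extra-base hits. The sketch below uses that identity; the stat line is made up for illustration.

```python
def isolated_power(at_bats: int, doubles: int, triples: int,
                   home_runs: int) -> float:
    """ISO = SLG - AVG, which reduces to extra bases per at-bat."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return (doubles + 2 * triples + 3 * home_runs) / at_bats

# Hypothetical season line: 400 AB, 20 doubles, 2 triples, 25 HR.
iso = isolated_power(400, 20, 2, 25)
print(f"{iso:.4f}")  # 0.2475
```

A double counts one extra base, a triple two, and a home run three, so the batter's singles never enter the calculation.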
Exit Velocity
Average speed off the bat for all batted balls. Stabilizes faster than other stats (just 100 PA), making it reliable early in the season. A player consistently hitting 93+ mph has real power.
Pitcher Fly Ball Rate
Pitchers who induce fly balls (rather than ground balls) are more homer-prone, because fly balls are the only batted balls with a realistic chance of leaving the park. The league-average fly ball rate is 37%.
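One plausible way to turn a stat such as HR/9 or fly ball rate into a multiplier is to compare it with the league average and dampen the difference by a weight. The text does not give the actual weights or functional form, so the function below and its `weight` parameter are hypothetical.

```python
def ratio_multiplier(value: float, league_avg: float, weight: float) -> float:
    """Multiplier from a stat's ratio to league average.

    Assumption: `weight` (0 to 1) controls how much of the relative
    difference is passed through; 0 ignores the stat entirely.
    """
    return 1.0 + weight * (value / league_avg - 1.0)

# League averages are from the text; the 0.5 weight is made up.
hr9_mult = ratio_multiplier(2.0, 1.25, weight=0.5)
fb_mult = ratio_multiplier(0.37, 0.37, weight=0.5)
print(round(hr9_mult, 3), round(fb_mult, 3))  # 1.3 1.0
```

With this form, a league-average pitcher always produces a neutral 1.0 multiplier regardless of the weight.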
Tier 2 — Matchup & Context
Who is pitching, where is the game, and how do the styles match up
Platoon Split
Many hitters have dramatic left/right splits — a lefty slugger might hit .290 vs RHP but .190 vs LHP. We look up the batter's actual HR rate against this pitcher's throwing arm, then blend toward season rate based on how many PA of platoon data we have (needs 30+ PA to fully trust).
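A minimal sketch of the platoon blend, assuming trust in the split ramps up linearly and reaches 100% at 30 PA. The linear ramp is an assumption; only the 30 PA threshold comes from the text.

```python
def blend_platoon(split_rate: float, split_pa: int,
                  season_rate: float, full_trust_pa: int = 30) -> float:
    """Blend the batter's HR rate vs. this pitcher hand toward the season rate."""
    trust = min(split_pa / full_trust_pa, 1.0)  # assumed linear ramp
    return trust * split_rate + (1.0 - trust) * season_rate

# Hypothetical batter: 6% HR rate vs. LHP in 15 PA, 4% on the season.
print(f"{blend_platoon(0.06, 15, 0.04):.3f}")  # 0.050
```

At 15 PA the split gets half the weight, so the blended rate lands midway between the split and season rates.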
Park Factor
Ballparks vary enormously. Coors Field (Colorado) inflates HR probability by ~40%. Petco Park (San Diego) suppresses it. We apply the park's historical HR factor directly — it's one of the few factors that's objective and doesn't need blending.
Pitch Arsenal Matchup
We score how well this specific batter matches up against this pitcher's pitch mix. If a pitcher throws 50% sliders and the batter crushes sliders (.420 xwOBA), that's a structural advantage. Weighted by pitch usage — a 5% curveball matters less than a 50% fastball.
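The usage weighting can be illustrated as a weighted average of the batter's xwOBA against each pitch type. The data layout and pitch names below are assumptions, and the sketch expects batter data for every pitch in the arsenal.

```python
def arsenal_score(usage: dict[str, float],
                  batter_xwoba: dict[str, float]) -> float:
    """Average the batter's xwOBA per pitch type, weighted by pitch usage."""
    total_usage = sum(usage.values())
    weighted = sum(share * batter_xwoba[pitch] for pitch, share in usage.items())
    return weighted / total_usage

# Hypothetical pitcher mix and batter results by pitch type.
usage = {"slider": 0.50, "fastball": 0.45, "curveball": 0.05}
xwoba = {"slider": 0.420, "fastball": 0.350, "curveball": 0.250}
print(f"{arsenal_score(usage, xwoba):.3f}")  # 0.380
```

The weak .250 result against the curveball barely moves the score because that pitch is thrown only 5% of the time.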
Tier 3 — Recent Form & Conditions
Short-term signals and game-day environment · dampened weights
Rolling 5-Game Exit Velocity
Is this batter hot right now? We look at average exit velocity over the last 5 games. A player consistently hitting 95+ mph in recent games is in a strong contact phase. EV is the most reliable short-window signal — more stable than HR rate or barrel rate in small samples.
Recent Barrel Bonus
If a batter has barreled balls in the last 5 games on top of high EV, they get a small additional bonus: they're not just hitting the ball hard, they're hitting it at an optimal launch angle too.
Wind
Wind blowing out to center/right adds real carry, so we apply up to +15% for strong tailwinds (20+ mph out). Wind blowing in suppresses probability by up to 12%. Crosswinds and indoor parks are treated as neutral.
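A sketch of the wind adjustment, assuming the effect scales linearly with wind speed and that the inbound effect also maxes out at 20 mph. Both are assumptions; the text only gives the maximum effects and the 20 mph threshold for tailwinds.

```python
def wind_multiplier(speed_mph: float, direction: str,
                    indoor: bool = False) -> float:
    """Return a wind multiplier; direction is 'out', 'in', or anything else."""
    if indoor or direction not in ("out", "in"):
        return 1.0  # crosswinds and indoor parks are neutral
    strength = min(max(speed_mph, 0.0) / 20.0, 1.0)  # assumed linear ramp
    if direction == "out":
        return 1.0 + 0.15 * strength
    return 1.0 - 0.12 * strength

print(wind_multiplier(20, "out"))    # 1.15
print(wind_multiplier(12, "cross"))  # 1.0
```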
Temperature
Warmer air is less dense, so the ball carries slightly farther. We apply a small linear effect of about 0.1% per degree above a 72°F baseline. The effect is real but modest (never more than ±7%).
Confidence Tiers
How we classify each pick
Probability ≥ 12%
Multiple factors aligning strongly — elite power hitter, vulnerable pitcher, favorable park and conditions. Our top picks.
Probability ≥ 7% and < 12%
Solid matchup with meaningful edge over the league average. Worth tracking closely.
Probability < 7%
Below-average conditions or matchup. Listed for completeness — not the picks we lead with.
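The thresholds above map to a simple classification. In this sketch the HIGH and MEDIUM labels come from the text, while "LOW" is a placeholder name for the unnamed third tier.

```python
def confidence_tier(probability: float) -> str:
    """Map a final probability (0 to 1) to a confidence tier."""
    if probability >= 0.12:
        return "HIGH"
    if probability >= 0.07:
        return "MEDIUM"
    return "LOW"  # placeholder label; the text does not name this tier

print(confidence_tier(0.15), confidence_tier(0.09), confidence_tier(0.04))
# HIGH MEDIUM LOW
```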
The 25% Hard Cap
No matter how many factors line up, we cap the final probability at 25%. Live performance data showed our HIGH and MEDIUM picks connecting at 26%+, so the model's outputs are calibrated to reflect that real-world rate. The cap prevents extreme compounding on outlier days.
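The cap itself is a one-line clamp. The sketch below shows it catching a hypothetical outlier day where several large multipliers compound; the input values are made up.

```python
HARD_CAP = 0.25  # the 25% cap described above

def cap_probability(probability: float) -> float:
    """Clamp the final probability to the hard cap."""
    return min(probability, HARD_CAP)

# Hypothetical outlier: four strong boosts stacked on a high starting value.
uncapped = 0.12 * 1.30 * 1.40 * 1.15 * 1.15
print(f"{uncapped:.3f} -> {cap_probability(uncapped):.2f}")  # 0.289 -> 0.25
```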
What the Model Doesn't Know
The model can't see injuries that haven't been announced, pitch tip-offs or sequencing adjustments, lineup changes after 11 AM ET, or player fatigue, travel days, and personal circumstances. It is a probabilistic tool: it identifies edges in aggregate and does not guarantee anything for any single at-bat.