Park factors

What the wind off Lake Michigan does to run scoring at Wrigley

A run projection is a guess about the air as much as the bats. At Wrigley the wind, blowing out toward the lake or in off it, can add or erase the better part of a run before a pitch is thrown, and for a long time the model could not see it.

up to ~1 run: Wind’s swing on a Wrigley projection · windiest days
1 park: Scope · Cubs home games only
Direction · speed: Out adds runs, in subtracts · harder means more
Capped & gated: Bounded, and only above a real wind speed

A run projection is a guess about the air as much as the bats. Most of the time the air is boring and the guess is mostly about the lineups. Wrigley Field is the exception. Depending on whether the wind is blowing out toward Lake Michigan or in off it, the air can add or erase the better part of a run before either team takes a swing, and for a long time the model’s Wrigley projections could not see it. The misses were not random. They pointed straight at the flags above the bleachers.

The tempting assumption is that the run estimates were off because the model misjudged the teams. It did not. It was right about the teams and wrong about the air.

A park that changes by the hour

Wrigley is the most wind-exposed park in baseball, and its wind is not a fixed trait you can fold into a season-long number. It is a daily fact. On an afternoon when the flags blow out toward the lake, the ball carries, warning-track outs become home runs, and the park manufactures runs. On a day the wind comes in off the water, the same park does the opposite: deep drives hang up and die, and run scoring falls. Two different run environments, one address, decided by which way the wind is blowing that day.

A single annual park factor averages those two parks into one number, and that number is wrong on both kinds of day. It captures the building. It cannot capture the morning.
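The arithmetic behind that claim is easy to sketch. The numbers below are invented purely for illustration (they are not measured Wrigley values), and the sketch ignores calm days to keep the point visible: one blended figure sits between the two real run environments and misses both.

```python
# Hypothetical run environments, for illustration only (not measured values).
runs_out_day = 10.0   # assumed average total runs when the wind blows out
runs_in_day = 8.0     # assumed average total runs when the wind blows in
share_out = 0.5       # assumed share of games played with the wind blowing out

# A single season-long number is just the weighted blend of the two.
season_average = share_out * runs_out_day + (1 - share_out) * runs_in_day

# How far the blended number sits from each kind of day.
error_on_out_days = season_average - runs_out_day
error_on_in_days = season_average - runs_in_day

print(season_average, error_on_out_days, error_on_in_days)
```

With these made-up inputs the blended figure is 9.0 runs: one run short on every out day and one run high on every in day.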

How much the wind is worth

This is the part that matters for a run projection: the wind’s effect is not a rounding error. On the windiest days it can move the expected run total by close to a full run, upward when it is blowing out and downward when it is blowing in. A blustery afternoon at Wrigley is, in run terms, a meaningfully different game from a calm one between the same two teams.

So a projection that ignores the wind is not slightly off on those days. It is systematically off, in a predictable direction, every time the flags are snapping. That is the worst kind of error a model can carry, because it is not noise that averages out over a season. It is a bias that shows up again and again in the same place.
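The difference between noise and bias can be shown with a small simulation. Everything here is assumed for the sake of the sketch: the size of the wind effect, the spread of ordinary game-to-game randomness, and an even split of out, in, and calm days. The point is only the pattern. Averaged over all games the misses look harmless, but grouped by wind direction they do not shrink.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

WIND_EFFECT = 0.8  # hypothetical: runs added on "out" days, removed on "in" days

residuals = {"out": [], "in": [], "calm": []}
for _ in range(3000):
    wind = rng.choice(["out", "in", "calm"])
    effect = {"out": WIND_EFFECT, "in": -WIND_EFFECT, "calm": 0.0}[wind]
    noise = rng.gauss(0.0, 3.0)  # ordinary game-to-game randomness
    # A wind-blind projection misses by the wind effect plus the noise.
    residuals[wind].append(effect + noise)

# Average miss within each wind group, and across all games together.
by_wind = {wind: mean(values) for wind, values in residuals.items()}
overall = mean(v for values in residuals.values() for v in values)

print(by_wind, overall)
```

The noise term washes out inside every group as games accumulate. The wind term does not, which is why the out-day and in-day averages stay away from zero while the calm-day and overall averages hover near it.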

Teaching the projection to read the air

The fix is to put the missing physics back into the run total. The model now adds a wind term to its Wrigley projection. Direction sets the sign, so a wind blowing out raises the projected runs and a wind blowing in lowers them, and speed sets the size, so a stiff wind moves the number more than a light one. The adjustment is capped, so even a gale cannot run away with the projection, and it engages only above a real wind speed. On a calm day it does nothing. When there is no wind reading, it does nothing.

It is deliberately narrow. One park, Cubs home games only, and only when there is genuine wind to account for. The exact coefficients are part of the live model and stay behind the paywall, but the shape is simple and physical: out adds runs, in subtracts runs, harder means more, and the whole thing is bounded.
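For readers who think in code, the shape described above can be sketched in a few lines. This is not the live model. The function name, the park label, the three constants, the choice to measure speed from the gate, and the choice to ignore crosswinds are all assumptions made for illustration.

```python
from typing import Optional

# All three numbers are hypothetical placeholders, not the live coefficients.
RUNS_PER_MPH = 0.05    # assumed size of the effect per mph above the gate
MIN_WIND_MPH = 8.0     # assumed gate: at or below this the term does nothing
MAX_ADJUSTMENT = 1.0   # assumed cap, in runs


def wind_adjustment(park: str, direction: Optional[str],
                    speed_mph: Optional[float]) -> float:
    """Return runs to add to a projection; 0.0 whenever the term should not engage."""
    if park != "Wrigley Field":          # one park only
        return 0.0
    if direction is None or speed_mph is None:
        return 0.0                       # no wind reading, no adjustment
    if speed_mph <= MIN_WIND_MPH:
        return 0.0                       # calm day, no adjustment
    # Direction sets the sign. Crosswinds get no adjustment in this sketch.
    sign = {"out": 1.0, "in": -1.0}.get(direction, 0.0)
    # Speed sets the size, and the cap bounds it.
    size = min(RUNS_PER_MPH * (speed_mph - MIN_WIND_MPH), MAX_ADJUSTMENT)
    return sign * size


print(wind_adjustment("Wrigley Field", "out", 18.0))
print(wind_adjustment("Wrigley Field", "in", 40.0))
print(wind_adjustment("Wrigley Field", "out", 5.0))
```

Under these placeholder values a stiff wind blowing out adds half a run, a gale blowing in is held to the cap, and a light breeze changes nothing. Measuring speed from the gate keeps the adjustment from jumping the moment the threshold is crossed, which is one reasonable design among several.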

Did the runs line up?

A correction that merely fits the past is worthless, so the test was written down before it was run, with a set of pass-or-fail checks fixed in advance: every half-season had to hold on its own, both full seasons had to hold, and Wrigley specifically had to improve on its starting point. It cleared all of them. The park that had been the model’s most consistent blind spot became one of its most accurate once the air was in the projection, and the corrected version is live now. The record will keep accumulating from here.
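A test fixed in advance can be as plain as a function that returns pass or fail for each window. The sketch below is a guess at the form, not the actual test: it assumes "hold" means the corrected projection is no worse than the original on mean absolute error, the window names are placeholders, and the error values are made up.

```python
from statistics import mean
from typing import Dict, List

# Window name -> projection errors (projected minus actual runs).
Errors = Dict[str, List[float]]


def mean_abs_error(errors: List[float]) -> float:
    return mean(abs(e) for e in errors)


def run_checks(baseline: Errors, corrected: Errors,
               must_improve: str) -> Dict[str, bool]:
    """Evaluate pass/fail checks whose rules were set before seeing results."""
    results = {}
    for window in baseline:
        before = mean_abs_error(baseline[window])
        after = mean_abs_error(corrected[window])
        if window == must_improve:
            results[window] = after < before    # must strictly improve
        else:
            results[window] = after <= before   # "hold": assumed to mean no worse
    return results


# Made-up errors for illustration only.
baseline = {
    "first_half": [1.2, -0.8, 0.5],
    "second_half": [0.9, -1.1, 0.4],
    "wrigley": [1.5, -1.4, 1.3],
}
corrected = {
    "first_half": [1.1, -0.8, 0.5],
    "second_half": [0.9, -1.0, 0.4],
    "wrigley": [0.6, -0.5, 0.4],
}

checks = run_checks(baseline, corrected, must_improve="wrigley")
all_passed = all(checks.values())
print(checks, all_passed)
```

The value of writing it this way is that the rules cannot drift after the results are in: a single failing window makes the whole test fail.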

Why this one is worth telling

It is the cleanest example of a recurring idea on this site: the model can be right about the teams and wrong about the conditions, and the honest fix is small, narrow, and tested in advance rather than a sweeping new system. The wind off the lake was never noise. It was a real, measurable force on run scoring that the projection finally learned to read.

Analytical and educational content about model performance. Not betting advice.