Model Calibration

Brier Score

0.1815

Lower is better · 0 = perfect · 0.25 = always predicting 50%
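As a rough illustration of how a score like this is computed, here is a minimal sketch of the Brier score for binary outcomes. It assumes each game is scored as 1 if the predicted side won and 0 otherwise; the page does not say how draws enter the score, so that handling is left out. The function name `brier_score` is hypothetical.

```python
def brier_score(predicted, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Assumption: outcomes are binary (1 = predicted side won, 0 = it did not).
    """
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must have the same length")
    if not predicted:
        raise ValueError("at least one game is required")
    return sum((p - o) ** 2 for p, o in zip(predicted, outcomes)) / len(predicted)


# Always predicting 50% scores 0.25 no matter how the games turn out.
coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])

# Predicting every result with full confidence, correctly, scores 0.
perfect = brier_score([1.0, 0.0], [1, 0])
```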

Favorite Win Rate

56.1%

Games where the higher-ELO side won outright

Games Analyzed

64,098

Non-forfeit games across all historical seasons

Calibration Curve

Each dot is a bucket of games grouped by predicted win probability. The diagonal line is perfect calibration: dots above it mean the model underestimates (the side won more often than predicted), and dots below it mean the model overestimates. Dot size reflects the number of games in that bucket.

Game Count by Predicted Probability

How many games fall into each 5% probability bucket — a well-spread distribution means the model uses its full range.
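The bucketing behind both charts can be sketched roughly as follows. This is an assumption about how the data might be grouped, not the site's actual code: games are placed in 5% buckets by predicted probability, and each bucket reports its game count, mean prediction, and observed win rate. The name `calibration_buckets` and the row keys are hypothetical.

```python
from collections import defaultdict

N_BUCKETS = 20  # 5% wide buckets covering 0% to 100%


def calibration_buckets(predicted, outcomes, n_buckets=N_BUCKETS):
    """Group games by predicted probability and summarize each bucket."""
    # Per bucket: [game count, sum of predictions, number of wins]
    totals = defaultdict(lambda: [0, 0.0, 0])
    for p, won in zip(predicted, outcomes):
        # A prediction of exactly 1.0 is folded into the top bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        totals[idx][0] += 1
        totals[idx][1] += p
        totals[idx][2] += won
    rows = []
    for idx in sorted(totals):
        count, p_sum, wins = totals[idx]
        rows.append({
            "low": idx / n_buckets,          # lower edge of the bucket
            "count": count,                  # bar height / dot size
            "mean_predicted": p_sum / count, # x-position of the dot
            "observed": wins / count,        # y-position of the dot
        })
    return rows


# Made-up predictions and results, for illustration only.
rows = calibration_buckets(
    [0.52, 0.53, 0.71, 0.72, 0.74, 0.74],
    [1, 0, 1, 1, 1, 0],
)
```

In this toy input, a dot whose `observed` value exceeds its `mean_predicted` value would sit above the diagonal.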

How It Works

ELO ratings are maintained per (team, age-group) pair. Before each game, each team has a rating; the expected win probability for the home side is:

P(home wins) = 1 / (1 + 10^{(ELO_away − ELO_home) / 400})
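The formula above translates directly into code. The sketch below also shows one way ratings could be stored per (team, age-group) pair; the dictionary layout, team names, and ratings are hypothetical examples, not data from the site.

```python
def expected_home_win(elo_home, elo_away):
    """P(home wins) = 1 / (1 + 10^((ELO_away - ELO_home) / 400))."""
    return 1.0 / (1.0 + 10 ** ((elo_away - elo_home) / 400))


# Hypothetical ratings table keyed by (team, age-group).
ratings = {
    ("Rovers", "U12"): 1700.0,
    ("United", "U12"): 1300.0,
}

# Evenly matched sides: the home side is expected to win half the time.
even = expected_home_win(1500, 1500)

# A 400-point edge gives 1 / (1 + 10^-1) = 10/11, about 0.909.
strong_home = expected_home_win(ratings[("Rovers", "U12")],
                                ratings[("United", "U12")])
```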

K-factor is 32. Ratings regress 40% toward 1500 between seasons to account for roster turnover. The season simulation uses these ratings plus a fixed 22% draw probability to run 10,000 Monte Carlo trials per flight.
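The parameters above could fit together roughly as in the sketch below. Several details are assumptions, because the page does not spell them out: the update uses the standard Elo rule (new = old + K × (actual − expected), with a draw scored as 0.5), and the simulation first rolls the fixed 22% draw chance and then splits the remaining outcomes by the ELO expectation. All function names are hypothetical, and the example simulates a single matchup rather than a full flight schedule.

```python
import random

K_FACTOR = 32
MEAN_RATING = 1500
REGRESSION = 0.40   # fraction of the gap to 1500 removed between seasons
DRAW_PROB = 0.22    # fixed draw probability used by the simulation
N_TRIALS = 10_000


def expected_home_win(elo_home, elo_away):
    return 1.0 / (1.0 + 10 ** ((elo_away - elo_home) / 400))


def update_ratings(elo_home, elo_away, home_score):
    """Assumed standard Elo update; home_score is 1.0 win, 0.5 draw, 0.0 loss."""
    delta = K_FACTOR * (home_score - expected_home_win(elo_home, elo_away))
    return elo_home + delta, elo_away - delta


def regress_between_seasons(elo):
    """Pull a rating 40% of the way back toward 1500."""
    return elo + REGRESSION * (MEAN_RATING - elo)


def simulate_game(elo_home, elo_away, rng):
    # Assumption: roll the draw first, then decide the winner by ELO expectation.
    if rng.random() < DRAW_PROB:
        return "draw"
    if rng.random() < expected_home_win(elo_home, elo_away):
        return "home"
    return "away"


def simulate_matchup(elo_home, elo_away, trials=N_TRIALS, seed=0):
    """Estimate outcome frequencies for one matchup by Monte Carlo."""
    rng = random.Random(seed)
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(trials):
        counts[simulate_game(elo_home, elo_away, rng)] += 1
    return {outcome: n / trials for outcome, n in counts.items()}


# A home win between equal sides moves 16 points from loser to winner.
new_home, new_away = update_ratings(1500, 1500, 1.0)

# A 1600-rated team starts the next season at 1560.
carried_over = regress_between_seasons(1600)

frequencies = simulate_matchup(1500, 1500)
```

A full season simulation would repeat `simulate_game` over every scheduled fixture in a flight and tally standings per trial; the scoring rules for that tally are not given on this page.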