Metrics & Evaluation

Brier Score

The Brier score measures the accuracy of probabilistic predictions. Lower is better, and it is critical for evaluating the calibration of betting models.

📊 The Brier Score Formula

BS = (1/N) × Σ (p_i - o_i)²
  • p_i = Predicted probability
  • o_i = Actual outcome (0 or 1)
  • N = Number of predictions

Interpretation

  • 0.0 = Perfect predictions
  • 0.25 = Random guessing (predicting 50% every time)
  • 1.0 = Always 100% wrong
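
These reference points are easy to check directly. A minimal base-R sketch (the brier helper here is local to this example, not part of the tutorial's code):

brier <- function(p, o) mean((p - o)^2)
brier(c(1, 0), c(1, 0))      # perfect forecasts      -> 0.00
brier(c(0.5, 0.5), c(1, 0))  # always predicting 50%  -> 0.25
brier(c(0, 1), c(1, 0))      # always fully wrong     -> 1.00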

Single Prediction

Worked example: predicted probability 0.7, actual outcome 1.

BS = (0.70 - 1)² = 0.0900

An excellent prediction: confident and correct.

Sample Model

Across 100 predictions, the sample model averages a Brier score of 0.1730, between the "excellent" and "good" benchmarks below.

Calibration Chart

A well-calibrated model has predicted % ≈ actual %: when you predict 70%, the event should happen about 70% of the time. (The plot_calibration function in the R code below draws this chart.)

Brier Score Benchmarks

  • Perfect (0.00): always predicts exactly right
  • Excellent (0.10): tournament-winning
  • Good (0.20): useful for betting
  • Average (0.25): random baseline
  • Poor (0.35): worse than guessing

🔬 Brier Score Decomposition

The score splits into three terms (the Murphy decomposition): BS = Calibration - Resolution + Uncertainty, so a lower calibration term and higher resolution both improve the score.

Calibration

How well predicted probabilities match observed frequencies.

70% predictions should win ~70% of the time.

Resolution

How much predictions vary from the base rate.

Always predicting 50% = no resolution.

Uncertainty

Inherent unpredictability of outcomes.

Can't be reduced; it peaks at a 50% base rate.
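
The decomposition is straightforward to compute by binning forecasts. A minimal sketch, assuming binary outcomes; brier_decompose and its 10-bin scheme are illustrative, and the identity BS = Calibration - Resolution + Uncertainty holds exactly only when all forecasts within a bin are identical:

# Murphy decomposition via binned forecasts (illustrative helper)
brier_decompose <- function(predicted, actual, n_bins = 10) {
  n <- length(predicted)
  bins <- cut(predicted, breaks = seq(0, 1, length.out = n_bins + 1),
              include.lowest = TRUE)
  n_k   <- tapply(actual, bins, length)   # forecasts per bin
  p_bar <- tapply(predicted, bins, mean)  # mean forecast per bin
  o_bar <- tapply(actual, bins, mean)     # observed frequency per bin
  base  <- mean(actual)                   # overall base rate
  keep  <- !is.na(n_k)                    # drop empty bins
  calibration <- sum(n_k[keep] * (p_bar[keep] - o_bar[keep])^2) / n
  resolution  <- sum(n_k[keep] * (o_bar[keep] - base)^2) / n
  uncertainty <- base * (1 - base)
  c(calibration = calibration, resolution = resolution,
    uncertainty = uncertainty,
    brier = calibration - resolution + uncertainty)
}

For a well-calibrated model the calibration term sits near zero, so any skill shows up through resolution.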

๐Ÿ€ Sports Pricing Applications

Model Evaluation

  • โ†’ Compare different projection models
  • โ†’ Track model performance over time
  • โ†’ Identify miscalibrated probability bins

Pricing Validation

  • โ†’ Verify implied probabilities are accurate
  • โ†’ Compare to closing line performance
  • โ†’ Segment by sport/market for tuning
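
These pricing checks reduce to scoring two forecast sets on the same games. A hedged sketch (the odds, model probabilities, and outcomes below are hypothetical): convert closing decimal odds to implied probabilities, then compare Brier scores.

closing_odds <- c(1.55, 2.10, 3.40, 1.80)  # hypothetical decimal closing odds
model_probs  <- c(0.62, 0.45, 0.25, 0.58)  # hypothetical model forecasts
outcomes     <- c(1, 0, 0, 1)              # observed results

implied <- 1 / closing_odds                # raw implied probabilities
# On a full market, normalise the vig away first, e.g. for mutually
# exclusive outcomes: implied <- implied / sum(implied)

mean((model_probs - outcomes)^2)           # model Brier score
mean((implied - outcomes)^2)               # closing-line Brier score

Beating the closing line on this metric over a large sample is a common benchmark for genuine predictive edge.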

R Code Equivalent

library(ggplot2)

# Calculate Brier Score: mean squared error of probability forecasts
brier_score <- function(predicted, actual) {
  mean((predicted - actual)^2)
}

# Calibration plot: bin forecasts, then compare mean forecast to observed rate
plot_calibration <- function(predicted, actual, n_bins = 10) {
  # include.lowest = TRUE so predictions of exactly 0 are not dropped as NA
  bins <- cut(predicted, breaks = seq(0, 1, length.out = n_bins + 1),
              include.lowest = TRUE)

  calibration <- data.frame(
    bin = levels(bins),
    predicted = tapply(predicted, bins, mean),
    actual = tapply(actual, bins, mean)
  )
  calibration <- na.omit(calibration)  # drop bins with no predictions

  ggplot(calibration, aes(x = predicted)) +
    geom_line(aes(y = predicted), linetype = "dashed") +  # y = x reference line
    geom_point(aes(y = actual), color = "green") +
    labs(x = "Predicted", y = "Actual") +
    theme_minimal()
}

# Example: the single prediction worked through above
predicted <- c(0.7)
actual <- c(1)
bs <- brier_score(predicted, actual)
cat(sprintf("Brier Score: %.4f\n", bs))
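
For a fuller check, simulated data exercises the calibration plot; an assumed setup where forecasts are uniform and perfectly calibrated, so the expected Brier score is 1/6 ≈ 0.167:

set.seed(42)
predicted <- runif(1000)                     # forecasts spread across [0, 1]
actual <- rbinom(1000, 1, prob = predicted)  # outcomes drawn at those rates
cat(sprintf("Brier Score: %.4f\n", brier_score(predicted, actual)))
plot_calibration(predicted, actual)          # points should hug the diagonal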

✅ Key Takeaways

  • Brier Score: lower = better (0 = perfect)
  • 0.25 is the random-guessing baseline
  • Penalizes confident wrong predictions heavily
  • Use calibration plots to diagnose issues
  • Decompose into calibration, resolution, and uncertainty
  • Track over time to detect model drift (sketched below)
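
A minimal drift-tracking sketch for the last point (rolling_brier and the window size are illustrative):

# Brier score over a rolling window of the most recent predictions
rolling_brier <- function(predicted, actual, window = 100) {
  n <- length(predicted)
  sapply(window:n, function(i) {
    idx <- (i - window + 1):i
    mean((predicted[idx] - actual[idx])^2)
  })
}
# e.g. plot(rolling_brier(predicted, actual), type = "l")

A sustained upward trend in the rolling score is the drift signal to act on.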
