Difference-in-Differences
Estimate causal treatment effects by comparing changes over time between treatment and control groups. A workhorse method for policy evaluation when randomization isn't possible.
📊 The DiD Framework
Compare the change in the treatment group to the change in the control group. Differencing the two changes removes time trends that affect both groups.
Key Insight: If both groups would have changed equally without treatment, any extra change in the treatment group is the causal effect.
| Group | Pre | Post | Change |
|---|---|---|---|
| Treatment | 66.8 | 110.1 | +43.3 |
| Control | 57.4 | 88.6 | +31.2 |
| **DiD** | | | **+12.1** |
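The 2×2 estimate is just two subtractions, as a minimal Python sketch using the table's values shows:

```python
# DiD from the 2x2 table: subtract the control group's change
# from the treatment group's change
treatment_pre, treatment_post = 66.8, 110.1
control_pre, control_post = 57.4, 88.6

treatment_change = treatment_post - treatment_pre
control_change = control_post - control_pre
did = treatment_change - control_change  # shared time trend cancels out

print(round(treatment_change, 1), round(control_change, 1), round(did, 1))
```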
Figure: Treatment vs Control Over Time. The vertical gap after T5 between the treatment line and the dashed control trend is the treatment effect; DiD estimates it by comparing changes.
The Parallel Trends Assumption
✅ Valid DiD
Pre-treatment trends are parallel. Any difference post-treatment is attributable to the intervention.
❌ Invalid DiD
Pre-treatment trends diverge. The groups were already on different trajectories, so the DiD estimate is biased.
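A quick, informal diagnostic is to fit a linear trend to each group's pre-treatment periods and compare slopes; similar slopes support (but never prove) parallel trends. A sketch using hypothetical pre-period group means:

```python
import numpy as np

# Hypothetical pre-treatment group means for periods T1..T5
periods = np.arange(1, 6)
treatment_pre = np.array([60.1, 61.9, 63.4, 65.2, 66.8])
control_pre = np.array([51.0, 52.7, 54.1, 55.9, 57.4])

# Slope of a least-squares linear trend for each group
slope_t = np.polyfit(periods, treatment_pre, 1)[0]
slope_c = np.polyfit(periods, control_pre, 1)[0]

# Nearly equal slopes are consistent with parallel pre-trends
print(round(slope_t, 2), round(slope_c, 2))
```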
⚠️ Key Assumptions
- **Parallel Trends** (critical): without treatment, the groups would have followed the same trajectory
- **No Spillovers** (critical): the treatment doesn't affect the control group
- **Stable Composition**: group membership doesn't change because of treatment
- **No Anticipation**: behavior doesn't change before treatment begins
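These assumptions can be probed with a placebo test: assign a fake treatment date inside the pre-period and re-estimate. A sizeable placebo "effect" suggests anticipation or diverging trends. A sketch on hypothetical group means:

```python
# Hypothetical group means by period (T1..T8, real treatment after T5)
treatment = [60.1, 61.9, 63.4, 65.2, 66.8, 95.0, 104.3, 110.1]
control   = [51.0, 52.7, 54.1, 55.9, 57.4, 74.2, 82.0, 88.6]

def did_2x2(t, c, cutoff):
    """Mean-change DiD with 'post' defined as periods from index `cutoff` on."""
    mean = lambda xs: sum(xs) / len(xs)
    t_change = mean(t[cutoff:]) - mean(t[:cutoff])
    c_change = mean(c[cutoff:]) - mean(c[:cutoff])
    return t_change - c_change

real = did_2x2(treatment, control, 5)                 # true cutoff
placebo = did_2x2(treatment[:5], control[:5], 3)      # fake cutoff in pre-period

# The placebo estimate should be near zero if the assumptions hold
print(round(real, 2), round(placebo, 2))
```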
💰 Pricing Applications
What to Test
- ✓ New pricing tier impact (launch in one region first)
- ✓ Promotional campaign effectiveness
- ✓ Feature launch impact on engagement
- ✓ Regulatory change effects
When to Use DiD vs A/B
- DiD: Can't randomize, observational data, policy changes
- A/B: Can randomize, want cleanest causal estimate
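The DiD estimate is simply the interaction coefficient in an OLS regression of the outcome on treated, post, and treated×post. A self-contained Python sketch with toy, noise-free data built from the 2×2 cell means (the R code in this section adds fixed effects):

```python
import numpy as np

# Toy balanced panel from the 2x2 cell means above
# (hypothetical: 10 noise-free observations per cell, for illustration)
cells = [(0, 0, 57.4), (0, 1, 88.6), (1, 0, 66.8), (1, 1, 110.1)]
rows = [(t, p, y) for (t, p, y) in cells for _ in range(10)]
data = np.array(rows, dtype=float)
treated, post, y = data[:, 0], data[:, 1], data[:, 2]

# OLS: outcome ~ b0 + b1*treated + b2*post + b3*(treated*post)
X = np.column_stack([np.ones_like(y), treated, post, treated * post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(beta[3], 2))  # b3, the DiD estimate
```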
R Code Equivalent
```r
# Difference-in-Differences regression
library(fixest)

# Prepare panel data (assumes n_units, n_periods, treatment_period, y exist;
# first half of units is the control group, second half is treated)
panel_data <- data.frame(
  id      = rep(1:n_units, each = n_periods),
  time    = rep(1:n_periods, times = n_units),
  treated = rep(c(0, 1), each = n_units / 2 * n_periods),
  outcome = y
)
# Columns inside data.frame() can't reference each other, so add 'post' afterwards
panel_data$post <- as.numeric(panel_data$time > treatment_period)

# DiD regression with unit and time fixed effects
# (the main effects of 'treated' and 'post' are absorbed by the fixed effects)
did_model <- feols(outcome ~ treated * post | id + time, data = panel_data)

# The coefficient on 'treated:post' is the DiD estimate
summary(did_model)

# Event study (test parallel trends)
event_study <- feols(outcome ~ i(time, treated, ref = treatment_period) | id + time,
                     data = panel_data)
iplot(event_study)  # estimated effects should be flat (near zero) pre-treatment
```

Key Takeaways
- DiD estimates causal effects from observational data
- The parallel trends assumption is critical
- Compare changes, not levels
- Use it when randomization isn't possible
- Validate with pre-treatment trend tests
- Works for policy and feature rollouts