Kwangmin Kim - Difference-in-Differences (DiD)

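Source

This post draws on prior knowledge and, indirectly, on KOH Ch.11 (the textbook was not checked directly; the content is based on the agent's pretraining). Key papers cited: Card & Krueger (1994), Goodman-Bacon (2021), Callaway & Sant’Anna (2021).

This is the 13th post in the Phase J series and the first of the four-post J-DID series. It covers Difference-in-Differences, one of the most widely used tools for causal inference from observational data.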

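1 Entry intuition: "comparing the changes in two groups"

The methods covered so far (RCTs, CATE estimation, and so on) either control for baseline confounding or work with cross-sectional data. DiD instead exploits the time dimension:

DiD in one line: subtract the control group's change over the same period from the treated group's change to estimate the pure treatment effect.

In other words, the time trend that affects every group is corrected using the control group's change.

1.1 Analogy: a diet drug

Suppose a diet drug, Drug X, is trialed over one summer. Only one group takes the drug. Body weight is measured in both groups in June and again in September.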

	June	September	Change
Treated	75 kg	70 kg	-5 kg
Control	75 kg	73 kg	-2 kg

The pure drug effect for the treated group is \(-5 - (-2) = -3\) kg. The control group's 2 kg loss is natural summer weight loss, unrelated to the drug.

DiD = -3 kg, the drug's pure effect.

Key assumption: without the drug, the treated group would also have lost 2 kg, just like the control group. This is the Parallel Trends Assumption.
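The arithmetic of the diet example can be written out as a minimal sketch. The dictionary and helper names below are illustrative only; the numbers are the four group means from the table above.

```python
# Group means from the diet example (kg): (June, September)
group_means = {
    "treated": (75.0, 70.0),
    "control": (75.0, 73.0),
}

def change(pre_post):
    """September mean minus June mean."""
    pre, post = pre_post
    return post - pre

treated_change = change(group_means["treated"])
control_change = change(group_means["control"])

# Under parallel trends, the control change stands in for what the
# treated group would have experienced without the drug.
did = treated_change - control_change

print(f"treated change: {treated_change:+.1f} kg")
print(f"control change: {control_change:+.1f} kg")
print(f"DiD estimate:   {did:+.1f} kg")
```

The estimate is only as credible as the assumption in the comment: if the treated group would have lost more or less than 2 kg on its own, the -3 kg figure absorbs that gap.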

2 Definition: Difference-in-Differences

Definition: DiD Estimator

\[ \widehat{ATT} = (\bar{Y}_{T,1} - \bar{Y}_{T,0}) - (\bar{Y}_{C,1} - \bar{Y}_{C,0}) \]

where:

\(\bar{Y}_{T,1}\): treated group, post-treatment mean

\(\bar{Y}_{T,0}\): treated group, pre-treatment mean

\(\bar{Y}_{C,1}\): control group, post-treatment mean

\(\bar{Y}_{C,0}\): control group, pre-treatment mean

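2.1 Equivalent form

\[ \widehat{ATT} = (\bar{Y}_{T,1} - \bar{Y}_{C,1}) - (\bar{Y}_{T,0} - \bar{Y}_{C,0}) \]

This is the post-treatment group difference minus the pre-treatment group difference.

2.2 Equivalence of the two forms

The two forms are different groupings of the same four means and are algebraically identical.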
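Expanding the first form makes the equivalence explicit. The only steps are removing the parentheses and regrouping the same four means:

```latex
\[
\begin{aligned}
(\bar{Y}_{T,1} - \bar{Y}_{T,0}) - (\bar{Y}_{C,1} - \bar{Y}_{C,0})
&= \bar{Y}_{T,1} - \bar{Y}_{T,0} - \bar{Y}_{C,1} + \bar{Y}_{C,0} \\
&= (\bar{Y}_{T,1} - \bar{Y}_{C,1}) - (\bar{Y}_{T,0} - \bar{Y}_{C,0})
\end{aligned}
\]
```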

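Intuition for the formula: a difference of two differences controls for variation along two dimensions. One is the time trend (assumed to be the same across groups); the other is the group difference (assumed to be constant over time).

2.3 A closer look at the two assumptions

Assumption	Meaning	Example violation
Common time trend	Apart from treatment, both groups change over time in the same way	The two groups are hit by different external shocks around the time of treatment
No time-varying group difference	The inherent difference between the two groups is constant over time	The two groups' economic environments evolve differently

Together, these two assumptions make up the Parallel Trends Assumption (covered in depth in the next post).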
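To see why these assumptions matter, here is a sketch of the standard identification argument in potential-outcomes notation. The notation is an assumption added for illustration and does not appear elsewhere in this post: \(Y_t(1)\) and \(Y_t(0)\) are the outcomes at time \(t\) with and without treatment, and \(G \in \{T, C\}\) is the group. The sketch also assumes no anticipation, meaning the treated group's observed pre-period outcome equals \(Y_0(0)\).

```latex
\[
\begin{aligned}
\text{Parallel trends:}\quad
& E[Y_1(0) - Y_0(0) \mid G = T] = E[Y_1(0) - Y_0(0) \mid G = C] \\[4pt]
\text{DiD estimand:}\quad
& \big(E[Y_1 \mid G = T] - E[Y_0 \mid G = T]\big) - \big(E[Y_1 \mid G = C] - E[Y_0 \mid G = C]\big) \\
&= E[Y_1(1) - Y_0(0) \mid G = T] - E[Y_1(0) - Y_0(0) \mid G = C] \\
&= E[Y_1(1) - Y_1(0) \mid G = T]
 + \underbrace{E[Y_1(0) - Y_0(0) \mid G = T] - E[Y_1(0) - Y_0(0) \mid G = C]}_{=\,0 \text{ under parallel trends}} \\
&= ATT
\end{aligned}
\]
```

If the bracketed term is not zero, it is exactly the bias of the DiD estimate.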

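3 Card & Krueger (1994): the classic example

3.1 The setting

In April 1992, New Jersey raised its minimum wage from $4.25 to $5.05 per hour. Pennsylvania's minimum wage did not change.

3.2 Card & Krueger's analysis

They compared employment at fast-food restaurants in New Jersey (treated) and Pennsylvania (control).

DiD: the difference between the two states' employment changes is taken as the employment effect of the minimum wage.

3.3 Result

Standard economic theory predicts that a minimum wage increase reduces employment.

Card & Krueger found no reduction in employment, and if anything a slight increase. The result came as a shock to economics.

3.4 Significance

It is an influential example of a quasi-experimental design, and part of the basis for David Card's 2021 Nobel Prize.

3.5 Criticism

A later reanalysis (Neumark & Wascher 2000) reached different conclusions, pointing to data limitations and doubts about parallel trends. The episode shows why DiD results need robustness checks.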

4 Two-Way Fixed Effects (TWFE) Regression

4.1 Regression form

DiD written as a regression:

\[ Y_{it} = \alpha_i + \gamma_t + \beta \cdot D_{it} + \varepsilon_{it} \]

\(\alpha_i\): unit fixed effect (time-invariant differences between groups or individuals)

\(\gamma_t\): time fixed effect (a time effect common to all units)

\(D_{it}\): treatment indicator (1 if unit i is treated at time t)

\(\beta\): treatment effect (equal to the DiD estimator in the two-group, two-period case)
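As a rough check of the claim that \(\beta\) reproduces the DiD estimator, the sketch below simulates a balanced two-period panel and computes both quantities with NumPy. The data-generating values (200 units, a trend of 2, an effect of 5) are hypothetical, and the double-demeaning step is one standard way to estimate TWFE in a balanced panel, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 200, 2            # balanced two-period panel
treated = np.arange(n_units) < 100     # first 100 units form the treated group

# Outcome = unit effect + time effect + treatment effect (5) + noise
alpha = rng.normal(100, 10, n_units)   # unit fixed effects
gamma = np.array([0.0, 2.0])           # time fixed effects
D = np.zeros((n_units, n_periods))
D[treated, 1] = 1.0                    # treated units in the post period
Y = alpha[:, None] + gamma[None, :] + 5.0 * D + rng.normal(0, 1, D.shape)

# 1) DiD from the four cell means
did = (
    (Y[treated, 1].mean() - Y[treated, 0].mean())
    - (Y[~treated, 1].mean() - Y[~treated, 0].mean())
)

# 2) TWFE via the within (double-demeaning) transformation, which in a
#    balanced panel is equivalent to including unit and time dummies
def double_demean(a):
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

D_tilde, Y_tilde = double_demean(D), double_demean(Y)
beta_twfe = (D_tilde * Y_tilde).sum() / (D_tilde ** 2).sum()

print(f"DiD from means: {did:.4f}")
print(f"TWFE beta:      {beta_twfe:.4f}")
```

In this two-group, two-period setting the two numbers agree up to floating-point error; the regression form earns its keep once there are more periods, more units, or covariates.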

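4.2 Advantages

TWFE generalizes to many units and many time periods. Covariates can be added. Standard errors can be computed, typically cluster-robust ones.

4.3 Limitations (recent research)

Recent work such as Goodman-Bacon (2021) shows that under staggered adoption (units receiving treatment at different times), the TWFE estimate is a weighted average that can be biased. Section 5 gives an overview, and a later post covers it in depth.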

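5 Staggered Adoption: recent research

5.1 The problem

Classic DiD: all treated units receive treatment at the same time.

In practice: treatment timing differs across units (for example, states implementing a policy at different dates).

5.2 The TWFE pitfall

Goodman-Bacon (2021): the TWFE β is a weighted average of all possible 2x2 DiD comparisons. Some of those comparisons use already-treated units as controls, so when effects change over time, some treated observations effectively enter with negative weight and β becomes hard to interpret.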
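The negative-weight problem can be seen in a toy panel small enough to check by hand. The sketch below is a hypothetical example, not taken from any of the cited papers: two units, three periods, no noise, and a treatment effect that grows with time since treatment.

```python
import numpy as np

# Rows = units, columns = periods 0, 1, 2 (hypothetical toy panel).
# "early" is treated from period 1, "late" from period 2.
D = np.array([[0.0, 1.0, 1.0],    # early
              [0.0, 0.0, 1.0]])   # late

# Effect grows with exposure: 1 in the first treated period, 2 in the second.
effect = np.array([[0.0, 1.0, 2.0],
                   [0.0, 0.0, 1.0]])

alpha = np.array([[10.0], [20.0]])     # unit fixed effects
gamma = np.array([[0.0, 3.0, 6.0]])    # common time trend
Y = alpha + gamma + effect             # no noise; parallel trends hold exactly

def double_demean(a):
    """Remove unit means and period means (the TWFE within transformation)."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

D_tilde = double_demean(D)
beta_twfe = (D_tilde * Y).sum() / (D_tilde ** 2).sum()

# Implicit weight that each cell's outcome receives in beta
weights = D_tilde / (D_tilde ** 2).sum()

true_att = effect[D == 1].mean()       # average effect over treated cells

print(f"TWFE beta:        {beta_twfe:.3f}")
print(f"true average ATT: {true_att:.3f}")
print("weights on each (unit, period) cell:")
print(weights.round(3))
```

Here β comes out at 0.5 while the average effect over treated cells is about 1.33, even though parallel trends hold exactly. The early unit's last treated period receives a weight of -0.5, because in period 2 it acts as the control for the late unit.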

5.3 New estimators

Callaway & Sant’Anna (2021), Sun & Abraham (2021), de Chaisemartin & D’Haultfœuille (2020), and others propose event-study or group-time ATT estimators whose results are robust to heterogeneous treatment effects.
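The group-time ATT idea can be sketched as one clean 2x2 DiD per treatment cohort and period. The code below is a simplified illustration in the spirit of Callaway & Sant’Anna (2021), not their estimator as implemented in any package: it uses a hypothetical noise-free panel, never-treated units as the only controls, and no covariates or inference. The function name `att_gt` and the array layout are assumptions made for this example.

```python
import numpy as np

# Hypothetical toy panel with periods 0, 1, 2.
# first_treated[i] is unit i's first treated period (np.inf = never treated).
first_treated = np.array([1, 2, np.inf])       # early, late, never-treated
alpha = np.array([[10.0], [20.0], [30.0]])     # unit fixed effects
gamma = np.array([[0.0, 3.0, 6.0]])            # common time trend
effect = np.array([[0.0, 1.0, 2.0],            # early: effect grows with exposure
                   [0.0, 0.0, 1.0],            # late
                   [0.0, 0.0, 0.0]])           # never treated
Y = alpha + gamma + effect

def att_gt(Y, first_treated, g, t):
    """2x2 DiD for cohort g at period t, with never-treated units as controls
    and period g-1 (cohort g's last untreated period) as the baseline."""
    cohort = first_treated == g
    never = np.isinf(first_treated)
    base = g - 1
    change_cohort = (Y[cohort, t] - Y[cohort, base]).mean()
    change_never = (Y[never, t] - Y[never, base]).mean()
    return change_cohort - change_never

results = {}
for g in (1, 2):
    for t in range(g, Y.shape[1]):
        results[(g, t)] = att_gt(Y, first_treated, g, t)
        print(f"ATT(g={g}, t={t}) = {results[(g, t)]:.1f}")
```

Because already-treated units never serve as controls, each cohort-period effect is recovered separately; how to aggregate them into a single summary is then an explicit choice rather than something the regression decides implicitly.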

Covered in depth in a later post.

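6 Synthetic Control: Abadie and coauthors (2003 onward)

6.1 Motivation

The setting is a single treated unit (for example, one city) and many control units. The idea is to construct the best possible control group directly.

6.2 Method

Build the synthetic control as a weighted average of the control units. The weights are chosen so that the synthetic unit matches the treated unit's pre-treatment outcomes.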
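A minimal sketch of the weight-fitting step follows, assuming the common formulation in which weights are non-negative and sum to one. The data are simulated and hypothetical, the treated unit is constructed as an exact mix of two donors so the target weights are known, and `scipy.optimize.minimize` with SLSQP is just one convenient solver for this small constrained least-squares problem. Matching on covariates, as in the original synthetic control papers, is left out.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n_pre, n_donors = 8, 4

# Hypothetical standardized pre-treatment outcomes, one column per donor unit
donors_pre = rng.normal(0.0, 1.0, size=(n_pre, n_donors))

# For illustration the treated unit is an exact mix of donors 0 and 2,
# so the weights we hope to recover are known.
true_w = np.array([0.6, 0.0, 0.4, 0.0])
treated_pre = donors_pre @ true_w

def pre_period_loss(w):
    """Sum of squared pre-treatment gaps between treated and synthetic unit."""
    return np.sum((treated_pre - donors_pre @ w) ** 2)

result = minimize(
    pre_period_loss,
    x0=np.full(n_donors, 1.0 / n_donors),      # start from equal weights
    method="SLSQP",
    bounds=[(0.0, 1.0)] * n_donors,            # weights are non-negative
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},  # and sum to 1
)
w_hat = result.x
rmse = np.sqrt(pre_period_loss(w_hat) / n_pre)

print("estimated weights:", w_hat.round(3))
print(f"pre-period RMSE:   {rmse:.4f}")
```

With real data the pre-period fit is never exact, and the post-treatment gap between the treated unit and `donors_post @ w_hat` (a hypothetical post-period matrix) would be read as the effect estimate.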

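6.3 Example

Abadie & Gardeazabal (2003): terrorism and the economy in the Basque Country. The actual region is compared with a synthetic Basque Country (a hypothetical Basque Country without the conflict).

6.4 Combination with causal ML

ML-based extensions include Generalized Synthetic Control and Matrix Completion methods.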

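7 DiD vs RCT: comparison

Aspect	RCT	DiD
Data	Randomized assignment	Observational (natural experiment)
Identification	Exchangeability	Parallel Trends
Strength of assumptions	Weak (guaranteed by randomization)	Strong (hard to verify)
Applications	Clinical trials, A/B tests	Policy, economics, natural experiments
Sample size	Can be small	Needs enough time periods and groups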

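7.1 Hernán's position

In paraphrase: DiD is a second-best option for when an RCT is not possible. It relies on the parallel trends assumption, which is not the same as the guarantee that randomization provides, so robustness checks are essential.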

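8 Application areas for DiD

8.1 Economics

Minimum wage, labor policy, tax changes. Card & Krueger (1994) is the standard example.

8.2 Public Health

Smoking policy, vaccine introduction, health insurance. Effects of state-level policies.

8.3 Marketing

Advertising campaigns, price changes. Closely related to geo experiments (Phase J-SWITCH).

8.4 Tech

Gradual feature rollouts, a natural application of staggered DiD.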

9 Guide to the three follow-up posts

9.1 J-DID-1: Parallel Trends + TWFE in depth

A precise definition of the PTA and how to check it (pre-trend tests). How to interpret the TWFE regression estimator.

9.2 J-DID-2: Staggered Adoption (recent research)

Goodman-Bacon's (2021) decomposition of TWFE. Callaway & Sant’Anna's (2021) group-time ATT. Event-study designs.

9.3 J-DID-3: Synthetic Control + Card-Krueger revisited

Abadie's synthetic control. Follow-up analyses of Card-Krueger. The trade-off between DiD and synthetic control.

10 Simulation: simple DiD

import numpy as np

np.random.seed(42)

# Scenario: 2 groups, 2 time points (the same units are observed twice)
n_per_group = 1000

# Group T (treated)
Y_T_pre = 100 + np.random.normal(0, 10, n_per_group)
# post = pre + time trend (2) + treatment effect (5) + noise
Y_T_post = Y_T_pre + 2 + 5 + np.random.normal(0, 5, n_per_group)

# Group C (control)
Y_C_pre = 100 + np.random.normal(0, 10, n_per_group)
# post = pre + time trend (2) + noise (no treatment)
Y_C_post = Y_C_pre + 2 + np.random.normal(0, 5, n_per_group)

print("[Simple DiD simulation]\n")
print("True treatment effect: 5\n")

# The four cell means
Y_T_pre_mean = Y_T_pre.mean()
Y_T_post_mean = Y_T_post.mean()
Y_C_pre_mean = Y_C_pre.mean()
Y_C_post_mean = Y_C_post.mean()

print("Four means:")
print(f"  Y_T,pre  = {Y_T_pre_mean:.2f}")
print(f"  Y_T,post = {Y_T_post_mean:.2f}")
print(f"  Y_C,pre  = {Y_C_pre_mean:.2f}")
print(f"  Y_C,post = {Y_C_post_mean:.2f}\n")

# Naive difference (post-period group comparison): ignores baseline group differences
naive_post = Y_T_post_mean - Y_C_post_mean
print(f"Naive (post-period group difference): {naive_post:.2f}")

# Naive difference (treated group's pre/post change): ignores the time trend
naive_pre_post = Y_T_post_mean - Y_T_pre_mean
print(f"Naive (treated group pre/post change): {naive_pre_post:.2f}")

# DiD
did = (Y_T_post_mean - Y_T_pre_mean) - (Y_C_post_mean - Y_C_pre_mean)
print(f"\nDiD: {did:.2f}")
print(f"  Treated group change: {Y_T_post_mean - Y_T_pre_mean:.2f}")
print(f"  Control group change: {Y_C_post_mean - Y_C_pre_mean:.2f}")
print(f"  Difference: {did:.2f} ≈ true effect 5\n")

# Standard error. Pre and post come from the same units, so they are
# correlated; use the variance of each unit's change rather than treating
# the four cells as independent samples (which would overstate the SE here).
n = n_per_group
delta_T = Y_T_post - Y_T_pre
delta_C = Y_C_post - Y_C_pre
se_did = np.sqrt(delta_T.var(ddof=1) / n + delta_C.var(ddof=1) / n)
ci_low, ci_high = did - 1.96 * se_did, did + 1.96 * se_did
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"True effect 5 inside CI: {ci_low <= 5 <= ci_high}")

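Interpreting the results:

Naive (post-period group difference): ignores baseline differences between the groups. Both groups start from the same mean of 100 in this simulation, so it lands near 5 here, but any baseline gap would bias it.

Naive (treated group's pre/post change): includes the time trend, so it is roughly the true effect + 2 = 7.

DiD: approximately 5, close to the true effect.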
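The simulation above builds parallel trends in by construction. As a complementary sketch, the variation below gives the treated group a steeper underlying trend (a hypothetical value of 5 instead of 2) to show what DiD returns when the assumption fails. It works directly with per-unit changes, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
true_effect = 5.0
trend_control = 2.0
trend_treated = 5.0   # hypothetical: the treated group would have grown faster anyway

# Per-unit changes between the pre and post period
delta_T = trend_treated + true_effect + rng.normal(0, 5, n)
delta_C = trend_control + rng.normal(0, 5, n)

did = delta_T.mean() - delta_C.mean()
trend_gap = trend_treated - trend_control

print(f"true effect:  {true_effect:.2f}")
print(f"DiD estimate: {did:.2f}")
print(f"trend gap:    {trend_gap:.2f}  (DiD is off by roughly this much)")
```

The estimate lands near the true effect plus the trend gap, and nothing in the two-period data signals the problem, which is why pre-trend checks on earlier periods matter.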

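11 Conclusion

DiD is causal inference that exploits the time dimension. The Parallel Trends Assumption is its core. It is a standard tool in policy and economics settings where RCTs are not feasible.

Key messages:

DiD definition: a difference of two differences
Card-Krueger (1994): the classic example, part of the basis for Card's Nobel Prize
TWFE Regression: generalization + covariates
Staggered Adoption: recent research (Goodman-Bacon, Callaway-Sant’Anna)
Synthetic Control: a single treated unit + a synthetic control
Parallel Trends: the most important assumption (next post)

The three follow-up posts cover PTA and TWFE in depth, staggered adoption, and synthetic control.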

12 Related topics

Prerequisites

(Phase D) Hernán Ch.7: Confounding
Effect Modification series

Phase J follow-up posts

Parallel Trends + TWFE (placeholder)
Staggered Adoption (placeholder)
Synthetic Control + Card-Krueger (placeholder)

13 References

Card, D. & Krueger, A. B. (1994). Minimum wages and employment: A case study of the fast-food industry in New Jersey and Pennsylvania. American Economic Review 84, 772-793.
Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. J. Econometrics 225, 254-277.
Callaway, B. & Sant’Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. J. Econometrics 225, 200-230.
Abadie, A. & Gardeazabal, J. (2003). The economic costs of conflict: A case study of the Basque Country. American Economic Review 93, 113-132.
Sun, L. & Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. J. Econometrics 225, 175-199.
de Chaisemartin, C. & D’Haultfœuille, X. (2020). Two-way fixed effects estimators with heterogeneous treatment effects. American Economic Review 110, 2964-2996.
Roth, J., Sant’Anna, P. H. C., Bilinski, A., Poe, J. (2023). What’s trending in difference-in-differences? J. Econometrics.
Neumark, D. & Wascher, W. (2000). Minimum wages and employment: A case study of the fast-food industry in New Jersey and Pennsylvania: Comment. American Economic Review 90, 1362-1396.