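Kwangmin Kim - Joint Interventions and Identifying Interaction

This is the sixth post in the Phase J series and the second in the Hernan Ch.5 series. It takes a closer look at the definition and identification of joint interventions and interaction introduced in the Ch.5 overview (Hernan Ch.5.1, 5.2).

1 Intuition: The Effect of Two Treatments at Once

1.1 The Vitamins + Heart Transplant Example

A single trial randomly assigns heart transplant (\(A\)) and vitamin supplementation (\(E\)) at the same time. There are 4 treatment combinations: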

\(E\)	\(A\)	Meaning
1	1	Vitamins + transplant
1	0	Vitamins only
0	1	Transplant only
0	0	Neither

Each patient has 4 counterfactual outcomes, one for each treatment combination.

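1.2 Hernan's Key Question

Is the effect of transplant when everyone takes vitamins different from the effect of transplant when nobody takes vitamins?

If it is, there is interaction between A and E.

An analogy is drug interaction: taking two drugs together produces an effect that differs from the sum of their individual effects. Some combinations are synergistic (e.g., antibiotic cocktails), while others interact harmfully (e.g., the bleeding risk of warfarin plus aspirin).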

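2 Joint Intervention: Counterfactual Framework

2.1 Counterfactual \(Y^{a, e}\)

The outcome under an intervention that sets \(A=a\) and \(E=e\). With two binary treatments there are 4 such counterfactual outcomes:

\[ Y^{a=1, e=1}, \quad Y^{a=1, e=0}, \quad Y^{a=0, e=1}, \quad Y^{a=0, e=0} \]

2.2 Extending Consistency

Consistency for a single treatment: \(Y = Y^A\).

Consistency for a joint treatment: \(Y = Y^{A, E}\).

That is, the observed outcome is the counterfactual for the treatment combination actually received. The other 3 counterfactual outcomes are contrary to fact and cannot be observed.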
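As a minimal sketch of joint consistency, the toy example below stores all 4 counterfactual outcomes per patient and then reveals only the one that matches the treatment combination actually received. The data, the sample size, and the names `Y_cf`, `A`, `E` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # hypothetical number of patients

# Hypothetical binary counterfactual outcomes Y^{a,e}, one array per combination.
combos = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y_cf = {ae: rng.integers(0, 2, size=n) for ae in combos}

# Treatment combination each patient actually received.
A = rng.integers(0, 2, size=n)
E = rng.integers(0, 2, size=n)

# Consistency: the observed outcome is the counterfactual indexed by (A, E).
Y = np.array([Y_cf[(int(a), int(e))][i] for i, (a, e) in enumerate(zip(A, E))])

for i in range(n):
    print(f"patient {i}: A={A[i]}, E={E[i]}, observed Y={Y[i]}")
```

In real data only `A`, `E`, and `Y` exist; the dictionary `Y_cf` is available here only because the data are simulated.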

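2.3 Recursive Substitution

The single-treatment counterfactual \(Y^a\) can in fact be written as \(Y^a = Y^{a, E}\), with \(E\) left at its observed value (this form assumes \(A\) does not affect \(E\); in general the second index is \(E^a\)).

Hernan's insight: consistency itself is a special case of recursive substitution.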

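3 A Precise Definition of Interaction: Additive Scale

3.1 Definition 1 (the effect of A depends on E)

Additive interaction is present if and only if

\[ \Pr[Y^{a=1, e=1} = 1] - \Pr[Y^{a=0, e=1} = 1] \neq \Pr[Y^{a=1, e=0} = 1] - \Pr[Y^{a=0, e=0} = 1] \]

Left side: the effect of A when everyone is set to \(e=1\). Right side: the effect of A when everyone is set to \(e=0\).

3.2 Definition 2 (the effect of E depends on A)

The equivalent expression given by Hernan:

\[ \Pr[Y^{a=1, e=1} = 1] - \Pr[Y^{a=1, e=0} = 1] \neq \Pr[Y^{a=0, e=1} = 1] - \Pr[Y^{a=0, e=0} = 1] \]

Left side: the effect of E when everyone is set to \(a=1\). Right side: the effect of E when everyone is set to \(a=0\).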

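3.3 Equivalence of the Two Definitions (Hernan Technical Point 5.1)

The two definitions are mathematically equivalent. A direct proof:

Add and subtract \(\Pr[Y^{a=0, e=0} = 1]\) on the left side of Definition 1, writing \(\Pr[Y^{a,e}]\) as shorthand for \(\Pr[Y^{a, e} = 1]\):

\[ \{\Pr[Y^{1,1}] - \Pr[Y^{0,0}]\} - \{\Pr[Y^{0,1}] - \Pr[Y^{0,0}]\} \neq \Pr[Y^{1,0}] - \Pr[Y^{0,0}] \]

Rearranging:

\[ \Pr[Y^{1,1}] - \Pr[Y^{0,0}] \neq \{\Pr[Y^{1,0}] - \Pr[Y^{0,0}]\} + \{\Pr[Y^{0,1}] - \Pr[Y^{0,0}]\} \]

That is, the joint effect differs from the sum of the two individual effects. Applying the same step to Definition 2 yields the identical inequality, so the definition is symmetric: A and E have equal status.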
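The equivalence can also be checked numerically. The sketch below is an illustration rather than part of Hernan's text: it evaluates the three forms of the additive contrast on random risk tables, where the array `p[a, e]` is a hypothetical stand-in for \(\Pr[Y^{a,e} = 1]\).

```python
import numpy as np

def contrast_def1(p):
    """Effect of A under e=1 minus effect of A under e=0."""
    return (p[1, 1] - p[0, 1]) - (p[1, 0] - p[0, 0])

def contrast_def2(p):
    """Effect of E under a=1 minus effect of E under a=0."""
    return (p[1, 1] - p[1, 0]) - (p[0, 1] - p[0, 0])

def contrast_symmetric(p):
    """Joint effect minus the sum of the two individual effects."""
    return (p[1, 1] - p[0, 0]) - ((p[1, 0] - p[0, 0]) + (p[0, 1] - p[0, 0]))

rng = np.random.default_rng(0)
for _ in range(1000):
    p = rng.random((2, 2))  # p[a, e] plays the role of Pr[Y^{a,e} = 1]
    c1, c2, c3 = contrast_def1(p), contrast_def2(p), contrast_symmetric(p)
    assert np.isclose(c1, c2) and np.isclose(c1, c3)
print("all three contrasts agree on 1000 random risk tables")
```

Because the three expressions are the same linear combination of the 4 risks, they agree for every table, not only for the sampled ones.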

3.4 Sign of the Interaction

Joint effect vs. sum	Name	Meaning
\(>\)	Superadditive	Joint > sum (synergy)
\(<\)	Subadditive	Joint < sum (antagonism)
\(=\)	No additive interaction	Joint = sum (the effects simply add)
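The sign rule in the table can be sketched as a small helper. The function name `additive_interaction`, the argument order, and the tolerance `tol` are assumptions made for this illustration; `pAE` stands for \(\Pr[Y^{a=A, e=E} = 1]\).

```python
def additive_interaction(p00, p10, p01, p11, tol=1e-12):
    """Classify additive interaction from four counterfactual risks."""
    joint = p11 - p00                      # joint effect of A and E
    sum_single = (p10 - p00) + (p01 - p00) # sum of the two individual effects
    contrast = joint - sum_single
    if contrast > tol:
        label = "superadditive"
    elif contrast < -tol:
        label = "subadditive"
    else:
        label = "no additive interaction"
    return contrast, label

print(additive_interaction(0.10, 0.20, 0.20, 0.40))  # joint 0.30 vs sum 0.20
print(additive_interaction(0.10, 0.20, 0.20, 0.25))  # joint 0.15 vs sum 0.20
print(additive_interaction(0.10, 0.20, 0.20, 0.30))  # joint 0.20 vs sum 0.20
```

When both treatments are protective (negative risk differences), a joint effect that is more negative than the sum is labeled subadditive by this sign rule even though the protection is stronger, so the label should be read together with the direction of the effects.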

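4 A Precise Definition of Interaction: Multiplicative Scale

4.1 Definition

Multiplicative interaction is present if and only if

\[ \frac{\Pr[Y^{1,1}]}{\Pr[Y^{0,0}]} \neq \frac{\Pr[Y^{1,0}]}{\Pr[Y^{0,0}]} \times \frac{\Pr[Y^{0,1}]}{\Pr[Y^{0,0}]} \]

The joint risk ratio differs from the product of the two individual risk ratios.

4.2 Sign

Joint risk ratio vs. product	Name
\(>\)	Supermultiplicative
\(<\)	Submultiplicative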
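A matching sketch for the multiplicative scale divides the joint risk ratio by the product of the individual risk ratios and compares the result with 1. The helper name, the tolerance, and the example risks are assumptions made for illustration.

```python
def multiplicative_interaction(p00, p10, p01, p11, tol=1e-9):
    """Ratio of the joint risk ratio to the product of the individual risk ratios."""
    if min(p00, p10, p01) <= 0:
        raise ValueError("risk ratios require p00, p10 and p01 to be positive")
    rr_joint = p11 / p00
    rr_product = (p10 / p00) * (p01 / p00)
    ratio = rr_joint / rr_product
    if ratio > 1 + tol:
        label = "supermultiplicative"
    elif ratio < 1 - tol:
        label = "submultiplicative"
    else:
        label = "no multiplicative interaction"
    return ratio, label

print(multiplicative_interaction(0.10, 0.20, 0.20, 0.60))  # RR 6 vs product 4
print(multiplicative_interaction(0.10, 0.20, 0.20, 0.30))  # RR 3 vs product 4
print(multiplicative_interaction(0.10, 0.20, 0.20, 0.40))  # RR 4 vs product 4
```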

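4.3 Additive vs Multiplicative: 4 Possibilities

Similar to the 4 scenarios in the previous post (Ch.4-1):

Additive	Multiplicative	When it occurs
Yes	Yes	Common (includes qualitative interaction)
Yes	No	The individual risk ratios multiply exactly while both treatments have an effect
No	Yes	The individual risk differences add exactly while both treatments have an effect
No	No	Possible only when at least one treatment has no effect on its own

Intuition from the formulas: the same data can lead to different conclusions on the two scales, so it is recommended to always report both.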
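To make the scale dependence concrete, the sketch below evaluates both measures on three hypothetical risk tables (the numbers are illustrative and not taken from Hernan). Each tuple lists the risks in the order (p00, p10, p01, p11).

```python
def additive_contrast(p00, p10, p01, p11):
    """Joint risk difference minus the sum of the individual risk differences."""
    return (p11 - p00) - ((p10 - p00) + (p01 - p00))

def multiplicative_ratio(p00, p10, p01, p11):
    """Joint risk ratio divided by the product of the individual risk ratios."""
    return (p11 / p00) / ((p10 / p00) * (p01 / p00))

# Hypothetical risk tables, chosen for illustration only.
tables = {
    "risk ratios multiply": (0.10, 0.20, 0.20, 0.40),
    "risk differences add": (0.10, 0.20, 0.20, 0.30),
    "E has no effect alone": (0.10, 0.20, 0.10, 0.20),
}

for name, p in tables.items():
    add = round(additive_contrast(*p), 10) + 0.0  # "+ 0.0" turns -0.0 into 0.0
    mult = round(multiplicative_ratio(*p), 10)
    print(f"{name:22s} additive contrast = {add:+.2f}, multiplicative ratio = {mult:.2f}")
```

The first table shows interaction on the additive scale only, the second on the multiplicative scale only, and the third on neither scale, which is possible here because E alone leaves the risk unchanged.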

5 Identifying Interaction (Ch.5.2)

5.1 Key Requirement

Identifying interaction requires identifying all 4 counterfactual risks \(\Pr[Y^{a, e} = 1]\). Each of them needs:

Exchangeability

Positivity

Consistency

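5.2 Setting 1: Marginal Randomization of E

\(E\) is randomized unconditionally. Then:

\[ \Pr[Y^{a=1, e=1} = 1] = \Pr[Y^{a=1} = 1 | E=1] \]

That is, the joint counterfactual risk equals a conditional single-treatment counterfactual risk, which is exactly the quantity used to define effect modification.

Conclusion: when \(E\) is randomly assigned, interaction = effect modification.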

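5.3 Setting 2: Joint Randomization (2x2 Factorial)

Both treatments are randomized, which is the ideal case. Exchangeability holds automatically in all 4 cells.

Example: vitamins and transplant are randomized simultaneously and independently, each with probability 50%, so each of the 4 cells receives 25% of patients in expectation.

Advantage: the statistical analysis is simple. The average risk in each cell is computed directly.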

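5.4 Setting 3: Observational

Both treatments are merely observed, chosen by the patients themselves or by their physicians.

Assumption: conditional exchangeability for the joint treatment, i.e., exchangeability across the 4 cells after adjusting for the measured covariates \(L\).

Procedure: treat the combined treatment AE (4 levels: 11, 10, 01, 00) as a single treatment and analyze it with standardization or IP weighting.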
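The standardization route can be sketched as follows, assuming a single binary confounder `L` and a hypothetical data-generating process in which `L` raises both the chance of treatment and the risk of the outcome. All probabilities and names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process with one binary confounder L.
L = rng.integers(0, 2, size=n)
A = (rng.random(n) < np.where(L == 1, 0.7, 0.3)).astype(int)
E = (rng.random(n) < np.where(L == 1, 0.6, 0.4)).astype(int)

# True counterfactual risks, indexed as base[a, e]; L shifts the risk by +/- 0.1.
base = np.array([[0.50, 0.45],
                 [0.40, 0.20]])
Y = (rng.random(n) < base[A, E] + 0.2 * (L - 0.5)).astype(int)

AE = 2 * A + E  # combined treatment with 4 levels: 0, 1, 2, 3

def naive_risk(level):
    """Unadjusted risk among those who received this treatment level."""
    return Y[AE == level].mean()

def standardized_risk(level):
    """sum over l of Pr[Y=1 | AE=level, L=l] * Pr[L=l]."""
    return sum(Y[(AE == level) & (L == l)].mean() * (L == l).mean() for l in (0, 1))

for a in (0, 1):
    for e in (0, 1):
        level = 2 * a + e
        print(f"a={a}, e={e}: true={base[a, e]:.2f}, "
              f"naive={naive_risk(level):.3f}, standardized={standardized_risk(level):.3f}")
```

Because `L` is binary, the stratum-specific risks are estimated nonparametrically; with continuous or high-dimensional covariates an outcome model would take their place.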

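5.5 Setting 4: Only One Treatment Randomized

\(A\) is randomized and \(E\) is observed, or vice versa.

Challenge: additional assumptions about \(E\) (exchangeability, etc.) are needed. If they cannot be made, interaction cannot be analyzed and only effect modification can.

Hernan's example: a subgroup analysis by baseline \(E\) in a trial that randomizes \(A\). No assumptions are made about \(E\), so only effect modification by \(E\) can be assessed. No claim about interaction can be made.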

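6 The Combined Treatment View

6.1 Key Insight

The two treatments (A, E) can equivalently be viewed as a single treatment AE with 4 levels. The causal inference tools for a single treatment therefore apply unchanged.

6.2 Example

\(AE \in \{00, 01, 10, 11\}\) is a plain treatment with 4 levels. Standardization, IP weighting, and propensity scores (a 4-level multinomial logistic regression) are all available.

6.3 Implication

Identifying interaction does not require new tools; it is an application of the existing causal inference tools. The only extra requirement is that the identifying assumptions hold for all 4 cells.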

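7 Effect Modification ≠ Interaction, Revisited

A deeper look at the key difference covered in the previous post.

7.1 Hernan's Decisive Insight

\(V\) can be an effect modifier for \(A\) without there being interaction between \(V\) and \(A\). Why? \(V\) may not be the true causal variable but only a surrogate for it.

7.2 The Greek/Roman Example (Ch.4)

Nationality \(V\) is an effect modifier for treatment \(A\) (transplant): the effect is 0 among Greeks and +0.3 among Romans.

However, nationality itself has no causal action. The true cause is the quality of surgical technique, and nationality is a surrogate for it.

Conclusion: \(V\) (nationality) is an effect modifier, but only as a proxy for interaction with the true causal treatment (surgical quality).

7.3 “No Action, No Interaction”

Hernan's maxim (Ch.5):

“No action, no interaction.”

That is, if a variable cannot be intervened on directly, no claim about interaction can be made. Only effect modification can be assessed.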

7.4 Practical Implications

Variable type	Available analysis
Treatment (manipulable)	Effect modification + Interaction
Baseline variable (not manipulable)	Effect modification only
Surrogate	Effect modification only; avoid causal claims

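8 A Unified Example of the Two Frameworks

Same data, two analyses:

8.1 Scenario: Vaccine + Mask Trial

A COVID-19 prevention trial. Vaccine (\(A\)) and masks (\(E\)) are both randomized (2x2 factorial).

8.1.1 Effect Modification View

“Does the effect of the vaccine among mask users differ from its effect among non-users?” This is a subgroup analysis.

8.1.2 Interaction View

“Does the joint effect of vaccine and mask differ from the sum of their individual effects?” This is a 2x2 factorial analysis.

Difference: effect modification is asymmetric (the effect of A varies by E). Interaction is symmetric (A and E have equal status).

In a 2x2 factorial trial the two analyses give the same numbers (because E is randomized), but their interpretations differ.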

9 Simulation: Identification Scenarios (Settings 1 to 3)

import numpy as np

np.random.seed(42)

n = 4000

# True risks Pr[Y^{a,e} = 1] for the 4 cells
def true_p_function(a, e):
    return {
        (0, 0): 0.50,
        (1, 0): 0.40,   # A alone: -0.10
        (0, 1): 0.45,   # E alone: -0.05
        # both: -0.30, stronger protection than the sum -0.15; the contrast is
        # -0.15 < 0, i.e. "subadditive" under the sign convention of Section 3.4
        (1, 1): 0.20,
    }[(a, e)]

# Setting 1: Marginal Randomization of E
# (A is also drawn independently here, so the A-E cells are unconfounded by construction)
print("[Setting 1: Marginal Randomization of E]\n")
np.random.seed(1)
E = np.random.choice([0, 1], n, p=[0.5, 0.5])
A = np.random.choice([0, 1], n, p=[0.5, 0.5])   # independent of E
true_p = np.array([true_p_function(a, e) for a, e in zip(A, E)])
Y = (np.random.random(n) < true_p).astype(int)

# Stratified analysis (effect of A within each level of E)
for e in [0, 1]:
    p_T = Y[(E == e) & (A == 1)].mean()
    p_C = Y[(E == e) & (A == 0)].mean()
    print(f"  E={e}: RD(A) = {p_T - p_C:+.3f}")

# Setting 2: Joint Randomization (2x2 factorial)
print("\n[Setting 2: Joint Randomization (2x2 Factorial)]\n")
np.random.seed(2)
A = np.random.choice([0, 1], n, p=[0.5, 0.5])
E = np.random.choice([0, 1], n, p=[0.5, 0.5])
true_p = np.array([true_p_function(a, e) for a, e in zip(A, E)])
Y = (np.random.random(n) < true_p).astype(int)

# 4 cell risks
print("  4 cell risks:")
for a in [0, 1]:
    for e in [0, 1]:
        mask = (A == a) & (E == e)
        risk = Y[mask].mean()
        print(f"    Pr(Y | A={a}, E={e}) = {risk:.3f}")

# Additive interaction
p00 = Y[(A==0)&(E==0)].mean()
p10 = Y[(A==1)&(E==0)].mean()
p01 = Y[(A==0)&(E==1)].mean()
p11 = Y[(A==1)&(E==1)].mean()
print(f"\n  RD(A | E=0) = {p10 - p00:+.3f}")
print(f"  RD(A | E=1) = {p11 - p01:+.3f}")
print(f"  Interaction = {(p11 - p01) - (p10 - p00):+.3f}")
print("  (true interaction = -0.30 - (-0.10) - (-0.05) = -0.15)")

# Setting 3: Observational
print("\n[Setting 3: Observational (confounder L)]\n")
np.random.seed(3)
L = np.random.normal(0, 1, n)
# Treatment assignment depends on L
prob_A = 1/(1 + np.exp(-L))
prob_E = 1/(1 + np.exp(-0.5*L))
A = (np.random.random(n) < prob_A).astype(int)
E = (np.random.random(n) < prob_E).astype(int)
# True outcome risk also depends on L, so L confounds the joint treatment
true_p = np.clip(
    np.array([true_p_function(a, e) for a, e in zip(A, E)]) + 0.1*L,
    0, 1
)
Y = (np.random.random(n) < true_p).astype(int)

# Naive
print("  Naive (unadjusted):")
for a in [0, 1]:
    for e in [0, 1]:
        mask = (A == a) & (E == e)
        if mask.sum() > 10:
            risk = Y[mask].mean()
            print(f"    Pr(Y | A={a}, E={e}) = {risk:.3f} (n={mask.sum()})")

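# IP weighting (treating AE as one combined 4-level treatment)
from sklearn.linear_model import LogisticRegression

# Propensity of each of the 4 cells given L
AE = A * 2 + E   # codes 0, 1, 2, 3 for (a, e) = (0,0), (0,1), (1,0), (1,1)
# With the default lbfgs solver a multiclass target is fit as a multinomial model;
# the explicit multi_class argument is deprecated in recent scikit-learn releases.
ps_model = LogisticRegression(max_iter=1000)
ps_model.fit(L.reshape(-1, 1), AE)
ps = ps_model.predict_proba(L.reshape(-1, 1))   # columns follow the sorted codes 0..3

print("\n  IP Weighted (combined treatment AE):")
for cell, (a, e) in enumerate([(0,0), (0,1), (1,0), (1,1)]):
    weight = 1 / ps[:, cell]
    mask = (AE == cell)
    # Normalized (Hajek) IP-weighted estimate of Pr[Y^{a,e} = 1]
    weighted_risk = (Y[mask] * weight[mask]).sum() / weight[mask].sum()
    print(f"    Pr(Y^(a={a}, e={e}) = 1) = {weighted_risk:.3f}")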

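Interpreting the results:

Setting 1: \(E\) is marginally randomized → an effect modification analysis is sufficient.

Setting 2: joint randomization → compare the 4 cells directly.

Setting 3: observational → the naive estimates are biased; IP weighting on the combined treatment AE corrects them.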
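For the observational setting, the IP-weighted cell risks can be carried one step further to the interaction contrast itself. The sketch below is self-contained and makes simplifying assumptions that differ from the simulation above: the confounder `L` is binary, the propensity of each combined treatment level is estimated nonparametrically from cell frequencies within each stratum of `L`, and all probabilities are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical observational data with one binary confounder L.
L = rng.integers(0, 2, size=n)
A = (rng.random(n) < np.where(L == 1, 0.7, 0.3)).astype(int)
E = (rng.random(n) < np.where(L == 1, 0.6, 0.4)).astype(int)
base = np.array([[0.50, 0.45], [0.40, 0.20]])  # true risks, indexed as base[a, e]
Y = (rng.random(n) < base[A, E] + 0.2 * (L - 0.5)).astype(int)
AE = 2 * A + E  # combined 4-level treatment

# Nonparametric propensity Pr[AE = level | L = l]: cell frequencies within each stratum.
ps = np.empty(n)
for l in (0, 1):
    in_l = (L == l)
    freq = np.bincount(AE[in_l], minlength=4) / in_l.sum()
    ps[in_l] = freq[AE[in_l]]
w = 1 / ps  # weight for the treatment level actually received

def ipw_risk(level):
    """Normalized IP-weighted estimate of the counterfactual risk for one level."""
    m = (AE == level)
    return (w[m] * Y[m]).sum() / w[m].sum()

risk = {(a, e): ipw_risk(2 * a + e) for a in (0, 1) for e in (0, 1)}
contrast = (risk[1, 1] - risk[0, 1]) - (risk[1, 0] - risk[0, 0])
print(f"IP-weighted interaction contrast = {contrast:+.3f} (true value -0.15)")
```

With a saturated propensity model such as this one, IP weighting and standardization give the same point estimates; they can differ once parametric models are introduced.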

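10 Conclusion

Identifying interaction means identifying all 4 cell-specific counterfactual risks. The assumptions and tools differ across the 4 identification scenarios (marginal, joint, observational, partial). No action, no interaction: only genuine causal treatments can interact.

Key messages:

Joint intervention \(Y^{a,e}\): counterfactuals for the 4 treatment combinations
Two equivalent definitions: A and E have equal status (Hernan Tech 5.1)
Additive vs multiplicative: same data, different conclusions
4 identification scenarios: each with its own assumptions and tools
Combined treatment AE: can be treated as a single 4-level treatment
No action, no interaction: surrogate variables support effect modification only

The next post covers counterfactual response types (5.3) and sufficient causes (5.4) in depth.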

11 Related Topics

Prerequisites

Later posts in Phase J

Counterfactual Response Types + Sufficient Causes (placeholder)
Sufficient Cause Interaction (placeholder)

12 References

Hernán, M. A. & Robins, J. M. (2020). Causal Inference: What If, Chapter 5.1, 5.2. Chapman & Hall/CRC.
VanderWeele, T. J. (2009b). On the distinction between interaction and effect modification. Epidemiology 20, 863-871.
VanderWeele, T. J. & Robins, J. M. (2007). Four types of effect modification: a classification based on directed acyclic graphs. Epidemiology 18, 561-568.
Greenland, S. (2009). Interactions in epidemiology: relevance, identification, and estimation. Epidemiology 20, 14-17.
Rothman, K. J., Greenland, S., Walker, A. M. (1980). Concepts of interaction. Am. J. Epidemiol. 112, 467-470.