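1 Definition

Definition: Moderation (effect modification)

Moderation occurs when the effect of one variable depends on the value of another variable (Buisson, 2021, Ch.11).

Formula:

\[ Y = \beta_0 + \beta_X \cdot X + \beta_M \cdot M + \beta_i \cdot (X \cdot M) + \epsilon \]

where:

$X$ = main-effect variable (e.g., PlayArea)
$M$ = moderator (e.g., Children)
$\beta_i$ = coefficient of the interaction term; it measures the strength of the moderation

Interpretation (for a binary $M$):

When $M = 0$, the effect of $X$ = $\beta_X$
When $M = 1$, the effect of $X$ = $\beta_X + \beta_i$

$\beta_i$ is therefore the difference in the effect of $X$ between the two values of $M$.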
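
A short derivation makes this interpretation explicit. Treating $X$ as continuous (an assumption made here only to take a derivative; for a binary $X$ the same expression is the difference between $X = 1$ and $X = 0$), the effect of $X$ in the formula above is:

\[ \frac{\partial Y}{\partial X} = \beta_X + \beta_i \cdot M \]

Setting $M = 0$ gives $\beta_X$ and $M = 1$ gives $\beta_X + \beta_i$, so the gap between the two is exactly $\beta_i$.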

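Intuition: what moderation means

A natural question for an analyst: if “the effect depends on another variable,” can't a plain regression capture that?

Answer: no. This is a limitation of a regression without an interaction term.

Plain regression: Y = β_0 + β_X * X + β_M * M
   The effect of X is β_X for everyone (a constant)

Moderation regression: Y = β_0 + β_X * X + β_M * M + β_i * (X * M)
   M=0: effect of X = β_X
   M=1: effect of X = β_X + β_i

The interaction coefficient $\beta_i$ explicitly models the difference in the effect of X.

Analogy: “the effect of exercise differs from person to person.”

Plain regression: “exercise = 5 kg of weight loss (for everyone)”
Moderation: “exercise = 3 kg for men, 7 kg for women” (sex is the moderator)

→ Moderation is the statistical framework behind personalization.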

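2 Three Types

Classification

The three types are mathematically identical but are named differently depending on the types of the variables:

Type	Type of $X$	Type of $M$	Example
Segmentation	Business action	Personal characteristic (demographic)	PlayArea × Children
Interaction	Business action	Business action (same type as $X$)	PlayArea × LoungeArea
Nonlinearity	The variable itself	The same variable	Emails × Emails (quadratic)

All three types use the same statistical tool (an interaction term).

Intuition: why the names matter

The business decisions differ:

Segmentation: “which customers to apply it to first” (personalization)
Interaction: “the synergy from running two policies together” (policy combination)
Nonlinearity: “the variable's effect gradually decreases or increases” (regime change)

What the analyst reports:

The business meaning differs by variable type
The statistical result is the same, but the recommended business action differs

→ The moderation framework unifies a range of business questions.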

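3 Segmentation: Personal Characteristic Moderator

3.1 The C-Mart PlayArea case

Business problem

C-Mart's store policy decision:

Cost of installing a PlayArea (play space) in a store = $50K
Effect: longer average visit duration → more purchases
Question: which stores should get one first?

Hypothesis:

The PlayArea effect is large for customers who bring children (the children play while the parents shop)
The effect is small for customers without children

CD:

Children (Y/N) ──→ (PlayArea → VisitDuration)
                            ↑
                            moderation acts here

An arrow pointing at an arrow is the notation for moderation.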

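The four scenarios

Children	PlayArea	Mean VisitDuration
0	0	20 min
0	1	24 min
1	0	30 min
1	1	55 min

Analysis:

Main effect of Children: +10 min with children when there is no PlayArea (20 → 30)
Main effect of PlayArea: +4 min with a PlayArea when there are no children (20 → 24)
Interaction: the PlayArea effect is 21 min larger with children (4 → 25); equivalently, the Children effect rises from +10 to +31 with a PlayArea (24 → 55)

Formula:

\[ \text{Duration} = 20 + 4 \cdot \text{PlayArea} + 10 \cdot \text{Children} + 21 \cdot (\text{PlayArea} \cdot \text{Children}) \]

Intuition: what the interaction term means

Business meaning of $\beta_i = 21$:

The PlayArea effect is 21 min larger for customers with children
Customers without children: 4 min effect
Customers with children: 4 + 21 = 25 min effect

Store decision:

Stores with a high share of customers with children = large PlayArea effect
Install in those stores first → highest ROI

This is the business value of segmentation. Deciding on the average effect alone risks installing in the wrong stores.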
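
The arithmetic behind the four-scenario table can be sketched in a few lines of Python. This is a minimal illustration, not Buisson's code: the function names `saturated_coefficients` and `play_area_effect` are hypothetical, and the sketch relies on the fact that with two binary variables and their interaction the four coefficients are just differences of the four cell means.

```python
# Cell means from the four-scenario table: (children, play_area) -> minutes
cell_means = {(0, 0): 20, (0, 1): 24, (1, 0): 30, (1, 1): 55}


def saturated_coefficients(means):
    """Recover intercept, main effects and interaction from the 2x2 cell means."""
    intercept = means[(0, 0)]
    beta_play = means[(0, 1)] - means[(0, 0)]    # PlayArea effect without children
    beta_child = means[(1, 0)] - means[(0, 0)]   # Children effect without PlayArea
    # Interaction = PlayArea effect with children minus PlayArea effect without
    beta_inter = (means[(1, 1)] - means[(1, 0)]) - beta_play
    return intercept, beta_play, beta_child, beta_inter


def play_area_effect(children, beta_play, beta_inter):
    """Effect of PlayArea for a given value of the moderator."""
    return beta_play + beta_inter * children


coefs = saturated_coefficients(cell_means)
print(coefs)                                     # (20, 4, 10, 21)
print(play_area_effect(0, coefs[1], coefs[3]))   # 4
print(play_area_effect(1, coefs[1], coefs[3]))   # 25
```

The 25-minute effect for customers with children, against 4 minutes for everyone else, is the number that drives the store prioritization.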

3.2 Regression code

R / Python

R:

lm(duration ~ play_area * children, data=hist_data)

Python:

import statsmodels.formula.api as smf
smf.ols("duration ~ play_area * children", data=df).fit()

The * notation:

play_area * children → play_area + children + play_area:children
It automatically includes both main effects and the interaction

Output (R, with both variables coded as factors):

                    Estimate
(Intercept)         19.99
play_area1           3.96
children1           10.02
play_area1:children1 20.99

Never drop the main effects

A trap analysts can fall into: “only the interaction matters, so drop the main effects.”

# WRONG
smf.ols("duration ~ play_area:children", data=df).fit()

Problems:

An interaction without its main effects is a misspecified model
The coefficients lose their interpretation
$\beta_i$ is estimated incorrectly, because it absorbs part of the omitted main effects

Rule: always include the main effects, even when they are not statistically significant.

# CORRECT
smf.ols("duration ~ play_area + children + play_area:children", data=df).fit()
# or, equivalently
smf.ols("duration ~ play_area * children", data=df).fit()

→ This is the standard in Hayes (2017). Buisson stresses it as well.
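
How badly the interaction-only model goes wrong can be checked by hand on the C-Mart cell means. The sketch below makes two assumptions that are not in the source: the four cells are equally sized, and there is no noise. It uses the fact that regressing on a single 0/1 dummy returns the mean of the dummy = 0 group as the intercept and the difference in group means as the slope. The function name `interaction_only_fit` is hypothetical.

```python
# Cell means from the C-Mart table: (children, play_area) -> minutes
cell_means = {(0, 0): 20.0, (0, 1): 24.0, (1, 0): 30.0, (1, 1): 55.0}


def interaction_only_fit(means):
    """OLS of duration on the single dummy d = play_area * children.

    Assumes equally sized cells, so each cell mean gets the same weight.
    """
    d0 = [y for (c, p), y in means.items() if c * p == 0]
    d1 = [y for (c, p), y in means.items() if c * p == 1]
    intercept = sum(d0) / len(d0)          # mean of the three cells with d = 0
    slope = sum(d1) / len(d1) - intercept  # difference in group means
    return intercept, slope


# Interaction from the full model (difference-in-differences)
true_interaction = (
    (cell_means[(1, 1)] - cell_means[(1, 0)])
    - (cell_means[(0, 1)] - cell_means[(0, 0)])
)

intercept, slope = interaction_only_fit(cell_means)
print(f"full model interaction:       {true_interaction:.2f}")
print(f"interaction-only coefficient: {slope:.2f}")
```

Under these assumptions the interaction-only coefficient comes out near 30 instead of 21: it lumps the omitted PlayArea and Children main effects into the single remaining term.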

3.3 CD representation

An arrow pointing at an arrow

Children ───→
              ↘
   PlayArea ──→ VisitDuration
              ↗

Or, more explicitly:

   Children
       │
       ▼ (moderation arrow)
   ┌─────────┐
PlayArea ──→ VisitDuration
   └─────────┘

The arrow from Children points at the PlayArea → VisitDuration effect itself.

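3.4 Visualization

Intuition: a plot with two lines

X-axis: Children (0 or 1)
Y-axis: mean VisitDuration
Two lines: PlayArea = 0 and PlayArea = 1

Duration
  55 |                      × (Children=1, PA=1)
     |
  30 |                      × (Children=1, PA=0)
  24 |  × (Children=0, PA=1)
  20 |  × (Children=0, PA=0)
     |________________________
        0                   1   Children

PlayArea = 0 line: 20 → 30 (slope 10)
PlayArea = 1 line: 24 → 55 (slope 31)

The two lines are not parallel → moderation is present.

If they were parallel:

The PlayArea effect would be the same regardless of Children (no moderation)

The difference in slopes is $\beta_i$ (31 − 10 = 21).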

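4 Interaction: Variables of the Same Type

4.1 Definition

Synergy between two business variables

C-Mart's two policies:

PlayArea (play space)
LoungeArea (lounge)

Individual effects:

PlayArea: longer store visits for customers with children
LoungeArea: longer store visits for customers without children

Hypothesis: having both creates synergy (a single customer can use both).

Formula:

\[ \text{Duration} = \beta_0 + \beta_P \cdot P + \beta_L \cdot L + \beta_i \cdot (P \cdot L) \]

Here $\beta_i > 0$ means synergy.

CD:

PlayArea ─────→
              ↘
                VisitDuration
              ↗
LoungeArea ──→

The two arrows merge: a symmetrical interaction.

Intuition: what a symmetrical interaction means

Business decision:

PlayArea only in a store: effect $\beta_P$
LoungeArea only in a store: effect $\beta_L$
Both: $\beta_P + \beta_L + \beta_i$ (including the synergy)

If $\beta_i > 0$ → installing both is worth more than the sum of the parts. If $\beta_i < 0$ → the two are partial substitutes, and the second installation adds less than it would on its own.

→ The interaction term determines the combined effect of the two policies.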
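
The combined-effect logic can be made concrete with a small sketch. All coefficient values below are hypothetical (the source gives no estimates for the LoungeArea model), and the helper names `combined_effect` and `marginal_value_of_lounge` are illustrative only.

```python
def combined_effect(beta_p, beta_l, beta_i, play_area, lounge_area):
    """Added visit duration (minutes) for a store configuration, relative to neither."""
    return (beta_p * play_area
            + beta_l * lounge_area
            + beta_i * play_area * lounge_area)


def marginal_value_of_lounge(beta_l, beta_i, play_area):
    """Extra minutes from adding a LoungeArea, given whether a PlayArea exists."""
    return beta_l + beta_i * play_area


# Hypothetical coefficients, in minutes (not from the source)
beta_p, beta_l = 6, 5

for beta_i in (2, -3):
    both = combined_effect(beta_p, beta_l, beta_i, 1, 1)
    extra = marginal_value_of_lounge(beta_l, beta_i, play_area=1)
    print(f"beta_i={beta_i:+d}: both={both} min, "
          f"lounge adds {extra} min on top of a PlayArea")
```

With these made-up numbers, a negative interaction of -3 still leaves the lounge adding 2 minutes on top of a PlayArea; whether that is worth the cost is the actual decision.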

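4.2 Connection to ML

Intuition: where Random Forest and XGBoost get their power

A deeper insight from Buisson:

“Part of the power of some ML models (random forest, XGBoost, neural networks) comes from capturing interactions automatically.”

Regression vs ML:

Regression: the analyst adds interactions explicitly (X * M)
ML: can learn interactions automatically, without the analyst specifying them

Regression's strength: interpretability (coefficient = effect). ML's strength: automatic and nonlinear.

Analyst's default:

Causal inference and interpretation first → regression with explicit interactions
Predictive accuracy first → ML

In business analysis, regression (with interactions) is the decision-making tool; ML is for cases where prediction alone is the priority.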

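5 Nonlinearity: Self-Moderation

5.1 Quadratic Term

Interaction of a variable with itself

Marketing email example (effect of one more email):

At 1 email per month: large effect
At 5 emails per month: small effect
At 10 emails per month: almost no effect (annoying)

These are diminishing returns. A linear model does not fit this pattern.

Solution: a quadratic term

\[ \text{Purchases} = \beta_0 + \beta_1 \cdot \text{Emails} + \beta_2 \cdot \text{Emails}^2 \]

$\beta_2 < 0$ → concave (diminishing returns). $\beta_2 > 0$ → convex (increasing returns, e.g. a network effect).

Intuition: interaction with itself

Quadratic = $\text{Emails} \cdot \text{Emails}$ = self-interaction.

Interpretation:

$M$ = Emails itself
The number of other emails moderates the effect of one email
“The effect of one more email depends on how many emails have already been sent”

Analogy: a meal.

First piece of bread: very satisfying
Second piece: somewhat satisfying
Third piece: indifferent
Fourth piece: a burden

The effect of each piece depends on how much has already been eaten → self-moderation.

Why this is still “linear regression”

Misconception: “Quadratic = nonlinear regression.”

Correct: linear regression is defined as linear in the coefficients.

$Y = \beta_1 X + \beta_2 X^2$ is nonlinear in X but linear in $\beta_1, \beta_2$.

→ It can be fitted with standard linear regression.

Contrast: $Y = e^{\beta_1 X}$ is nonlinear in its coefficient → as written, it needs nonlinear regression (unless the model is first transformed, e.g. by taking logs).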
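
“Linear in the coefficients” can be checked numerically: a model is linear in its coefficients if predicting with a weighted combination of two coefficient vectors equals the same weighted combination of the two predictions. The sketch below applies that check to the quadratic and the exponential form. The coefficient values and function names are arbitrary illustrations, not from the source.

```python
import math


def quadratic_model(beta, x):
    """Prediction b0 + b1*x + b2*x^2 for a coefficient triple beta."""
    b0, b1, b2 = beta
    return b0 + b1 * x + b2 * x ** 2


def exponential_model(beta_1, x):
    """Prediction exp(b1 * x); the coefficient sits inside the exponential."""
    return math.exp(beta_1 * x)


x = 3.0
beta = (0.5, 0.3, -0.02)    # arbitrary coefficient vectors for the check
gamma = (1.0, -0.1, 0.05)

# Quadratic: f(2*beta + 3*gamma) should equal 2*f(beta) + 3*f(gamma)
combo = tuple(2 * b + 3 * g for b, g in zip(beta, gamma))
quadratic_is_linear = math.isclose(
    quadratic_model(combo, x),
    2 * quadratic_model(beta, x) + 3 * quadratic_model(gamma, x),
)

# Exponential: the same superposition check on the single coefficient
exponential_is_linear = math.isclose(
    exponential_model(2 * 0.3 + 3 * (-0.1), x),
    2 * exponential_model(0.3, x) + 3 * exponential_model(-0.1, x),
)

print("quadratic linear in coefficients:  ", quadratic_is_linear)    # True
print("exponential linear in coefficients:", exponential_is_linear)  # False
```

The quadratic passes because $X^2$ is just another column of data as far as the coefficients are concerned; the exponential fails because its coefficient sits inside a nonlinear function.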

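5.2 R/Python code

Quadratic syntax

R:

lm(Purchases ~ Emails + I(Emails^2), data=df)

Python:

smf.ols("Purchases ~ Emails + I(Emails ** 2)", data=df).fit()

What I() does:

Identity function: it passes the enclosed expression to the regression as-is
It keeps the formula parser from misreading the quadratic term as formula syntax instead of arithmetic

R uses ^ (the caret, Shift+6) for powers; Python uses **.

Intuition: interpreting the result

Hypothetical result:

                Estimate
(Intercept)     0.5
Emails          0.30
I(Emails^2)    -0.02

Interpretation:

Emails coefficient: +0.30 (the marginal effect at 0 emails is +0.30 purchases per email)
Quadratic coefficient: -0.02 (each additional email lowers the marginal effect by 0.04)

Optimal number of emails:

\[ \frac{d \text{Purchases}}{d \text{Emails}} = 0.30 - 0.04 \cdot \text{Emails} = 0 \]

$\text{Emails}^* = 7.5$.

→ Business decision: 7 to 8 emails per month is optimal. Beyond that, each extra email has a negative marginal effect on purchases.

→ The quadratic term makes it possible to identify the optimal point (return analysis).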
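
The same calculation as a small sketch, using the hypothetical coefficients from the table above. The function names are illustrative, and the check on the sign of the quadratic coefficient is an added safeguard: the vertex is a maximum only when that coefficient is negative.

```python
def marginal_effect(beta_1, beta_2, emails):
    """Derivative of the fitted quadratic: effect of one more email."""
    return beta_1 + 2 * beta_2 * emails


def optimal_emails(beta_1, beta_2):
    """Vertex of the quadratic; a maximum only when beta_2 is negative."""
    if beta_2 >= 0:
        raise ValueError("beta_2 must be negative for an interior maximum")
    return -beta_1 / (2 * beta_2)


def predicted_purchases(beta_0, beta_1, beta_2, emails):
    """Fitted value of the quadratic model."""
    return beta_0 + beta_1 * emails + beta_2 * emails ** 2


# Hypothetical estimates from the table above
b0, b1, b2 = 0.5, 0.30, -0.02

best = optimal_emails(b1, b2)
print(f"optimal emails: {best:.2f}")
print(f"marginal effect at 5:  {marginal_effect(b1, b2, 5):+.2f}")
print(f"marginal effect at 10: {marginal_effect(b1, b2, 10):+.2f}")

# Emails are whole numbers, so compare the two integer neighbours of the optimum
for k in (7, 8):
    print(f"predicted purchases at {k}: {predicted_purchases(b0, b1, b2, k):.2f}")
```

Because the optimum sits exactly halfway between 7 and 8, both integer choices give the same predicted purchases under these coefficients.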

6 Uplift Analysis: A Marketing Application of Moderation

6.1 Definition

Uplift = Treatment Effect Heterogeneity

In marketing, “uplift” = the treatment effect for a customer, given that customer's characteristics.

Formula:

\[ \text{Uplift}(i) = E[Y(1) | X_i] - E[Y(0) | X_i] \]

where:

$Y(1)$ = outcome with the treatment
$Y(0)$ = outcome without the treatment
$X_i$ = characteristics of customer i

Apply the treatment first to the segments with the largest uplift → highest ROI.

6.2 Buisson's example

The wrong criterion: high probability

A common trap in marketing:

Group 1 (younger, under 30):
   No email: 20% purchase
   Email: 40% purchase
   → Uplift = 20 pp

Group 2 (older, 60 and over):
   No email: 80% purchase
   Email: 90% purchase
   → Uplift = 10 pp

Wrong decision (based on probability):

“Group 2 buys more → email Group 2”

Right decision (based on uplift):

“The email effect (uplift) is larger in Group 1 → email Group 1”

This difference is the business value of segmentation.
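
The two targeting rules can be compared directly on the numbers above. This is a minimal sketch; the dictionary layout and the `uplift` helper are illustrative choices, not part of the source.

```python
# Purchase rates from the example above
purchase_rates = {
    "Group 1 (under 30)": {"control": 0.20, "treated": 0.40},
    "Group 2 (60 and over)": {"control": 0.80, "treated": 0.90},
}


def uplift(rates):
    """Treated purchase rate minus control purchase rate."""
    return rates["treated"] - rates["control"]


# Rule 1: target the group most likely to buy after the email
by_probability = max(purchase_rates, key=lambda g: purchase_rates[g]["treated"])

# Rule 2: target the group whose behaviour the email changes most
by_uplift = max(purchase_rates, key=lambda g: uplift(purchase_rates[g]))

for group, rates in purchase_rates.items():
    print(f"{group}: uplift = {uplift(rates) * 100:.0f} pp")
print("highest purchase probability:", by_probability)
print("highest uplift:              ", by_uplift)
```

The two rules pick different groups, which is the point of the example: purchase probability describes who buys, while uplift describes whose purchase the email causes.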

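Intuition: measuring uplift

Measuring uplift requires:

A control vs treatment comparison (requires an RCT)
A comparison by subgroup (segmentation)

Moderation regression:

smf.ols("Y ~ Treatment + Age + Treatment:Age", data=df).fit()

The coefficient on Treatment:Age gives:

The difference in the treatment effect across Age
Positive: the effect is larger for older customers
Negative: the effect is larger for younger customers

→ The regression interaction is the estimation tool for uplift.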
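
To see why the interaction coefficient measures the difference in uplift, write out the regression above. The coefficient names $\beta_T$, $\beta_A$, $\beta_{TA}$ are labels assumed here for illustration, and Treatment is assumed to be binary and randomized:

\[ Y = \beta_0 + \beta_T \cdot \text{Treatment} + \beta_A \cdot \text{Age} + \beta_{TA} \cdot (\text{Treatment} \cdot \text{Age}) + \epsilon \]

\[ \text{Uplift}(\text{Age}) = E[Y \mid \text{Treatment} = 1, \text{Age}] - E[Y \mid \text{Treatment} = 0, \text{Age}] = \beta_T + \beta_{TA} \cdot \text{Age} \]

Under this linear specification, each additional year of Age shifts the estimated uplift by $\beta_{TA}$.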

6.3 The essence of personalization

Intuition: what is personalization?

Definition of personalization:

“Sending a specific message to a specific segment, and a different message (or none) to other segments.”

Key points:

A message that works well for everyone = not personalization (just mass marketing)
A message that works especially well for one segment and does not work (or backfires) for others = true personalization

Mathematically:

$\text{Uplift}(\text{segment 1})$ ≠ $\text{Uplift}(\text{segment 2})$
That is, moderation is present

→ A direct application of moderation, and the statistical basis of personalization.

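7 Deciding When to Apply Moderation

7.1 When to look for moderation

Trap: looking for it everywhere

Moderation is a second-order effect (an effect on an effect). Risks:

High false-positive rate in small samples
Analyst motivation: the urge to find that “the overall effect is weak, but the segment effect is large”
Risk of p-hacking

Analyst's default:

First: analyze the average effect (the ITT from Ch.8)
Second: test moderation when there is a strong domain hypothesis
Third: treat exploratory, data-driven moderation with great caution

Trap scenario:

“The overall effect is almost 0, but 30-year-old men living in Kansas who responded on a Tuesday show a +50% effect!”

A result like this is almost certainly noise. Even when the p-value looks very small, a multiple-testing correction is needed.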
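
A rough sense of the multiple-testing problem can be had from a Bonferroni sketch. The source only says that a correction is needed; Bonferroni is swapped in here as the simplest option, the count of 20 segments and the p-values are hypothetical, and the false-positive formula assumes independent tests.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p-values survive the Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


n_segments = 20  # hypothetical number of segments explored
fwer = familywise_error_rate(0.05, n_segments)
threshold = bonferroni_threshold(0.05, n_segments)
print(f"chance of at least one false positive: {fwer:.2f}")
print(f"per-test threshold: {threshold:.4f}")

# Hypothetical p-values: two small ones among 18 unremarkable ones
p_values = [0.01, 0.001] + [0.5] * 18
print(bonferroni_significant(p_values)[:2])   # [False, True]
```

With 20 exploratory segments, a nominal p = 0.01 no longer clears the bar, which is why the Kansas-on-a-Tuesday finding deserves suspicion.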

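7.2 Integrating with the experiment stage

ToC + Moderation

The ToC from E-BUI8-0:

1-click button → booking time ↓ → booking completion ↑

Adding moderation:

                Age (moderator)
                │
                ▼
1-click button ──→ booking time ──→ booking completion
                                 ↑
                                 Age (or another moderator)

Age can moderate both effects:

1-click → time: do younger people adapt faster?
time → completion: are older people more affected by the time it takes?

Identify and test these hypotheses in advance.

Intuition: the value of checking in advance

Check the moderation hypothesis in advance on historical data:

smf.ols("BookingDuration ~ Age + Children + Age:Children", data=hist_data).fit()

If the Age:Children interaction is strong:

The hypothesis is strengthened (the effect in the experiment is likely to differ by Age)
Adjust the experiment design (stratify by Age)
Or target only a narrow Age range

If it is weak:

Lower the priority of moderation in the experiment
Focus the analysis on the average effect

→ Checking in advance makes the experiment more efficient.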

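8 The Difference Between a Mediator and a Moderator

A common confusion between two concepts

	Mediator	Moderator
Role	Transmits the effect of X to Y	Changes the effect of X on Y
CD	X → M → Y (chain)	X → Y, M ↘ (arrow to arrow)
Regression	Controlling for M shrinks the X effect (to 0 under full mediation)	X * M interaction
Meaning	“How”	“For whom”

Examples:

1-click → booking time → booking completion: booking time is the mediator
The 1-click effect differs by Age: Age is the moderator

Intuition: an analogy for the two concepts

Analogy: the effect of a drug.

Mediator: drug → blood concentration → symptom relief (blood concentration transmits the drug's effect)
Moderator: drug → symptoms, differing by body weight (body weight determines the strength of the drug's effect)

Business:

Mediator: TV ad → awareness → purchase (analyzing awareness reveals the ad's mechanism)
Moderator: TV ad → purchase, differing by demographics (segment analysis)

→ Both concepts are analyst tools, and they can be used together.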

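9 Validating Moderation

9.1 Bootstrap CI

Statistical validation of the interaction

For the interaction term ($\beta_i$) of a moderation regression:

Do not decide on the p-value alone
Use a bootstrap CI
Check the business meaning of the effect size

import numpy as np
import statsmodels.formula.api as smf

# Bootstrap CI for one coefficient (percentiles 5 and 95 give a 90% interval)
def bootstrap_interaction(df, formula, target="X1:X2", B=1000):
    coeffs = []
    for _ in range(B):
        # Resample rows with replacement, keeping the original sample size
        sample = df.sample(len(df), replace=True)
        m = smf.ols(formula, data=sample).fit()
        # Index directly so a misspelled target raises instead of silently returning 0
        coeffs.append(m.params[target])
    return np.percentile(coeffs, [5, 95])

Intuition: why prefer the bootstrap CI

Standard CI vs bootstrap CI:

Standard: relies on a normal approximation for the estimator
Bootstrap: does not assume a particular distribution

The sampling distribution of an interaction estimate can be far from normal (for example, skewed when a subgroup is small). The bootstrap is the safer choice in that case.

Default in business analysis = bootstrap.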
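
For readers who want to see the resampling mechanics without any statistics library, here is a standard-library-only sketch. It is not the statsmodels-based function above. It relies on the fact that with two binary variables and their interaction, the interaction coefficient equals the difference-in-differences of the four cell means. The data are simulated with the C-Mart coefficients, and the sample size, seeds, number of resamples and function names are all arbitrary choices made for illustration.

```python
import random


def simulate_rows(n, rng):
    """Simulated visits with a true interaction of 21 minutes."""
    rows = []
    for _ in range(n):
        play_area = 1 if rng.random() < 0.5 else 0
        children = 1 if rng.random() < 0.3 else 0
        duration = (20 + 4 * play_area + 10 * children
                    + 21 * play_area * children + rng.gauss(0, 5))
        rows.append((play_area, children, duration))
    return rows


def interaction_estimate(rows):
    """Difference-in-differences of the four cell means."""
    sums = {(p, c): 0.0 for p in (0, 1) for c in (0, 1)}
    counts = {key: 0 for key in sums}
    for play_area, children, duration in rows:
        sums[(play_area, children)] += duration
        counts[(play_area, children)] += 1
    mean = {key: sums[key] / counts[key] for key in sums}
    # PlayArea effect with children minus PlayArea effect without children
    return (mean[(1, 1)] - mean[(0, 1)]) - (mean[(1, 0)] - mean[(0, 0)])


def bootstrap_ci(rows, n_boot=200, lower=5, upper=95, seed=0):
    """Percentile bootstrap interval using simple order statistics."""
    rng = random.Random(seed)
    estimates = sorted(
        interaction_estimate(rng.choices(rows, k=len(rows)))
        for _ in range(n_boot)
    )
    lo_idx = int(n_boot * lower / 100)
    hi_idx = int(n_boot * upper / 100) - 1
    return estimates[lo_idx], estimates[hi_idx]


rows = simulate_rows(2000, random.Random(42))
low, high = bootstrap_ci(rows)
print(f"90% bootstrap CI for the interaction: [{low:.2f}, {high:.2f}]")
```

An interval that sits well away from 0 and is narrow relative to the 21-minute effect is the kind of evidence the checklist above asks for, alongside the business reading of the effect size.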

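9.2 Checking effect size

Intuition: statistical vs practical significance

Two criteria for an interaction:

Statistical significance (p < 0.05)
Practical significance (business meaning)

Common trap:

In a large sample, even a small interaction is statistically significant
But it has no business meaning (e.g., $\beta_i = 0.001$)

Analyst's default:

The business meaning of the effect size comes first
The p-value is secondary

C-Mart case:

Interaction = 21 min (a large effect with clear business meaning)
p < 0.001
Both criteria are met → proceed with the decision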

10 Code Example: Moderation Simulation

10.1 Generating the C-Mart data

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical C-Mart data
np.random.seed(42)
n = 5000
df = pd.DataFrame({
    "play_area": np.random.binomial(1, 0.5, n),
    "children": np.random.binomial(1, 0.3, n),
})

# True coefficients
beta_0 = 20
beta_p = 4
beta_c = 10
beta_i = 21

# Outcome
df["duration"] = (
    beta_0
    + beta_p * df["play_area"]
    + beta_c * df["children"]
    + beta_i * df["play_area"] * df["children"]
    + np.random.normal(0, 5, n)
)

# Regression
m = smf.ols("duration ~ play_area * children", data=df).fit()
print(m.summary())

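Intuition: checking the estimates

Expected result (approximate; the exact digits vary with the random draw):

                       Estimate
(Intercept)            20.0
play_area              4.0
children              10.0
play_area:children    21.0

Comparing true and estimated values:

$\beta_0$: 20 ≈ 20 ✓
$\beta_p$: 4 ≈ 4 ✓
$\beta_c$: 10 ≈ 10 ✓
$\beta_i$: 21 ≈ 21 ✓

→ The simulation confirms that the regression recovers the true coefficients.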

10.2 Visualization

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 5))

# Means of the four scenarios
for pa in [0, 1]:
    for c in [0, 1]:
        mean_dur = df[(df["play_area"] == pa) & (df["children"] == c)]["duration"].mean()
        ax.scatter(c, mean_dur, s=200, label=f"PlayArea={pa}, Children={c}")

# Lines
for pa in [0, 1]:
    means = [df[(df["play_area"] == pa) & (df["children"] == c)]["duration"].mean() for c in [0, 1]]
    ax.plot([0, 1], means, label=f"PlayArea={pa}")

ax.set_xlabel("Children")
ax.set_ylabel("Mean Duration")
ax.set_xticks([0, 1])
ax.set_title("Moderation Visualization")
ax.legend()
ax.grid(alpha=0.3)
plt.tight_layout()
plt.savefig("moderation_viz.png", dpi=80)
plt.show()

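Intuition: the slopes of the two lines

Comparing the two lines in the plot:

PlayArea = 0 line: slope 10 (20 without children → 30 with children)
PlayArea = 1 line: slope 31 (24 without children → 55 with children)

Difference in slopes = $\beta_i$ = 21.

Not parallel → clear moderation.

This plot is the most effective way to communicate the result to business partners.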

10.3 Self-Moderation (Quadratic)

# Email example
np.random.seed(42)
n = 5000

df_email = pd.DataFrame({
    "emails": np.random.uniform(0, 15, n),
})

# True: decreasing returns
beta_0_e = 0.5
beta_1 = 0.3
beta_2 = -0.02

df_email["purchases"] = (
    beta_0_e
    + beta_1 * df_email["emails"]
    + beta_2 * df_email["emails"] ** 2
    + np.random.normal(0, 0.5, n)
)

# Linear regression (misspecified)
m_linear = smf.ols("purchases ~ emails", data=df_email).fit()
print("Linear (misspecified):")
print(f"  emails coef: {m_linear.params['emails']:.4f}")
print(f"  R²: {m_linear.rsquared:.4f}")

# Quadratic regression (correct)
m_quad = smf.ols("purchases ~ emails + I(emails ** 2)", data=df_email).fit()
print("\nQuadratic:")
print(f"  emails coef: {m_quad.params['emails']:.4f}")
print(f"  emails² coef: {m_quad.params['I(emails ** 2)']:.4f}")
print(f"  R²: {m_quad.rsquared:.4f}")

# Optimal point
beta_1_est = m_quad.params["emails"]
beta_2_est = m_quad.params["I(emails ** 2)"]
optimal = -beta_1_est / (2 * beta_2_est)
print(f"\nOptimal emails: {optimal:.2f}")

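Intuition: business implications of the optimal point

Quadratic estimates (hypothetical):

$\hat{\beta}_1 = 0.30$
$\hat{\beta}_2 = -0.02$
Optimal = -0.30 / (2 × -0.02) = 7.5

Business decision:

“7 to 8 emails per month is optimal. Sending more reduces ROI; sending fewer is sub-optimal.”

This recommendation is a direct product of the quadratic term. A plain linear regression cannot reveal it.

→ Self-moderation is a tool for identifying the optimal policy.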

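11 Summary: The Moderation Decision Tree

Analyst's default

1. Average effect analysis (first)
   - Estimate the treatment effect
   - Ignore moderation at this stage

2. Identify moderation hypotheses (second)
   - Domain intuition
   - Based on personal characteristics
   - Check in advance (historical data)

3. Moderation regression
   - X * M interaction term
   - Include all main effects
   - Bootstrap CI

4. Check effect size and statistical significance
   - The effect size is meaningful for the business
   - Statistically robust
   - Multiple-testing correction (when trying several moderators)

5. Visualization (two-line plot)
   - Communication with business partners

6. Action plan
   - Segmentation: which segment to apply it to?
   - Interaction: decide whether to combine the two policies
   - Nonlinearity: determine the optimal point

This workflow is the standard for moderation analysis.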

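12 Related Topics

12.1 Sibling articles for Ch.11

E-BUI11-1 Segmentation analysis: segmentation in detail
E-BUI11-2 Interaction and nonlinearity: interaction and quadratic terms in detail
E-BUI11-3 Multiple moderators and the bootstrap: bootstrap CI in depth

12.2 Earlier chapter

E-BUI3-2 Chain and fork structures: how mediators differ

12.3 Later chapter

E-BUI12-0 Mediation·IV overview: Ch.12, mediation in detail

12.4 Category entry point

Experimentation learning roadmap: 11 phases × 7 textbooks mapping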