Kwangmin Kim - The 2×3 Within-Subjects Design and Its 7 Effects

1 Model

Definition: 2×3 All-Within Design

Factors \(A\) (2 levels) and \(B\) (3 levels) are both within-subjects. Each subject \(i = 1, \ldots, n\) is measured in all \(2 \times 3 = 6\) cells.

\[ Y_{ijk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \pi_i + (\alpha\pi)_{ij} + (\beta\pi)_{ik} + (\alpha\beta\pi)_{ijk} + \varepsilon_{ijk} \]

Random effects:
- \(\pi_i \sim N(0, \sigma^2_\pi)\): random subject effect.
- \((\alpha\pi)_{ij} \sim N(0, \sigma^2_{\alpha\pi})\): subject × \(A\) interaction.
- \((\beta\pi)_{ik} \sim N(0, \sigma^2_{\beta\pi})\): subject × \(B\) interaction.
- \((\alpha\beta\pi)_{ijk}\): subject × \(AB\) interaction. With one observation per cell it cannot be separated from \(\varepsilon_{ijk}\) and serves as the residual.

The error term (test denominator) for each fixed effect is the random interaction of that effect with subjects.

2 ANOVA Table (balanced design, \(n\) subjects)

Source	\(df\)	Error term
Subjects (\(\pi\))	\(n-1\)	—
\(A\) (within)	\(a-1 = 1\)	\(A \times S\)
\(A \times S\)	\((a-1)(n-1) = n-1\)	—
\(B\) (within)	\(b-1 = 2\)	\(B \times S\)
\(B \times S\)	\((b-1)(n-1) = 2(n-1)\)	—
\(A \times B\)	\((a-1)(b-1) = 2\)	\(A \times B \times S\)
\(A \times B \times S\)	\((a-1)(b-1)(n-1) = 2(n-1)\)	—
Total	\(abn - 1 = 6n - 1\)	—

Key point: the denominator for each fixed effect is that effect's interaction with subjects.
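As a quick check on the table, the degrees of freedom of the seven sources can be tabulated for any \(a\), \(b\), \(n\) and confirmed to sum to \(abn - 1\). This is only a sketch; the helper name `within_df` is hypothetical and not part of any library.

```python
def within_df(a: int, b: int, n: int) -> dict:
    """Degrees of freedom for the seven sources of an a x b all-within design."""
    return {
        "S": n - 1,
        "A": a - 1,
        "AxS": (a - 1) * (n - 1),
        "B": b - 1,
        "BxS": (b - 1) * (n - 1),
        "AxB": (a - 1) * (b - 1),
        "AxBxS": (a - 1) * (b - 1) * (n - 1),
    }


df = within_df(a=2, b=3, n=10)
for source, value in df.items():
    print(f"{source:6s} {value}")
print("total ", sum(df.values()))  # should equal a*b*n - 1
```

With \(a = 2\), \(b = 3\), \(n = 10\), the seven entries add up to \(59 = 6 \cdot 10 - 1\).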

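3 Meaning of the Effects

\(A\) main: the mean difference between the levels of \(A\) (averaged over the levels of \(B\) and over subjects).
\(B\) main: the mean differences among the levels of \(B\) (averaged over the levels of \(A\) and over subjects).
\(A \times B\): whether the effect of \(A\) differs across the levels of \(B\).

4 Effects with 1 df vs. 2 df

Intuition: degrees of freedom and sphericity

\(A\) main (1 df): only one kind of difference score → sphericity holds automatically, and the test is a paired t-test.

\(B\) main (2 df): three kinds of difference score (\(B_2-B_1\), \(B_3-B_1\), \(B_3-B_2\)) → a sphericity check is needed.

\(A \times B\) (2 df): two interaction contrasts → a sphericity check is needed.

→ The test of \(A\) needs no ε adjustment. For \(B\) and \(A \times B\), an ε adjustment is recommended (G-MAX12-3).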

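5 Hypothetical Data: Drug Type × Time

Factor \(A\) = drug type (Drug A, Drug B); factor \(B\) = measurement time (1, 2, 3 hours later). \(n = 10\) patients.

Hypothetical SBP cell means:

	Time 1	Time 2	Time 3
Drug A	130	120	115
Drug B	132	128	125

Interpretation:
- \(A\) main: mean SBP is lower under Drug A than under Drug B (121.7 vs 128.3).
- \(B\) main: SBP falls over time (under both drugs).
- \(A \times B\): SBP falls faster under Drug A (a drop of 15 vs 7 from time 1 to time 3).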
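The marginal means behind this interpretation can be reproduced directly from the cell-mean table. The snippet below is a small sketch using NumPy; the variable names are illustrative.

```python
import numpy as np

# Rows: Drug A, Drug B; columns: time 1, 2, 3 (cell means from the table above)
cell_means = np.array([[130.0, 120.0, 115.0],
                       [132.0, 128.0, 125.0]])

drug_means = cell_means.mean(axis=1)          # marginal means of A
time_means = cell_means.mean(axis=0)          # marginal means of B
drop = cell_means[:, 0] - cell_means[:, 2]    # time 1 minus time 3, per drug

print("drug means:", drug_means.round(2))
print("time means:", time_means)
print("drop from time 1 to time 3:", drop)
```

The drug means come out near 121.67 and 128.33, the time means are 131, 124, 120, and the drops are 15 and 7.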

6 Decomposing the Effects

6.1 \(A\) main

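\[ \hat\psi_A = \bar Y_{A \cdot} - \bar Y_{B \cdot} = 121.67 - 128.33 = -6.67 \]

Here the subscripts \(A\) and \(B\) denote Drug A and Drug B, the two levels of factor \(A\). The contrast has 1 degree of freedom.

For each subject: \[ D_i^{(A)} = \frac{Y_{i,A,1} + Y_{i,A,2} + Y_{i,A,3}}{3} - \frac{Y_{i,B,1} + Y_{i,B,2} + Y_{i,B,3}}{3} \]

With the cell means above, \(\bar D^{(A)} = -6.67\). The effect is tested with a one-sample \(t\)-test on the \(D_i^{(A)}\).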
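The equivalence between this one-sample \(t\)-test and the ANOVA \(F\) test of \(A\) against \(A \times S\) can be checked numerically. The sketch below simulates data around the hypothetical cell means; the subject SD of 6 and the noise SD of 3 are assumptions chosen for illustration, and the seed is arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 10
cell_means = np.array([[130.0, 120.0, 115.0],
                       [132.0, 128.0, 125.0]])

# Y[i, j, k]: subject i, drug j, time k (simulated; SDs 6 and 3 are assumptions)
subject_effect = rng.normal(0, 6, size=(n, 1, 1))
Y = cell_means + subject_effect + rng.normal(0, 3, size=(n, 2, 3))

# Difference score: mean over time under Drug A minus mean over time under Drug B
D_A = Y[:, 0, :].mean(axis=1) - Y[:, 1, :].mean(axis=1)
t_stat, p_t = stats.ttest_1samp(D_A, popmean=0.0)

# The same test written as an F ratio: MS_A / MS_AxS
b = Y.shape[2]
subj_by_A = Y.mean(axis=2)                        # n x 2 subject-by-drug means
A_means = subj_by_A.mean(axis=0)
S_means = subj_by_A.mean(axis=1, keepdims=True)
grand = subj_by_A.mean()
ss_A = n * b * np.sum((A_means - grand) ** 2)
ss_AS = b * np.sum((subj_by_A - A_means - S_means + grand) ** 2)
F_A = (ss_A / 1) / (ss_AS / (n - 1))

print(f"t = {t_stat:.3f}, t^2 = {t_stat**2:.3f}, F = {F_A:.3f}, p = {p_t:.4f}")
```

Whatever the simulated values, \(t^2\) and \(F\) agree up to floating-point error, which is why a 1-df within effect needs no sphericity adjustment.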

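6.2 \(B\) main (a quantitative factor, so it can be decomposed into trends)

The levels of \(B\) are time points (quantitative, equally spaced).

Linear contrast \((-1, 0, +1)\): \[ \hat\psi_B^{\text{lin}} = -1(131) + 0(124) + 1(120) = -11 \]

Quadratic contrast \((+1, -2, +1)\): \[ \hat\psi_B^{\text{quad}} = +1(131) - 2(124) + 1(120) = +3 \]

The linear trend is strong and the quadratic trend is weak: SBP declines almost linearly over time.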
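The two contrast values can be recomputed from the time marginal means (131, 124, 120). Because the linear and quadratic weight vectors have different lengths, the sketch below also divides each estimate by the norm of its weights before comparing magnitudes; that normalization is an addition for illustration, not something the text prescribes.

```python
import numpy as np

time_means = np.array([131.0, 124.0, 120.0])   # marginal means of B from the data table
c_lin = np.array([-1.0, 0.0, 1.0])
c_quad = np.array([1.0, -2.0, 1.0])

psi_lin = c_lin @ time_means
psi_quad = c_quad @ time_means

# Dividing by the contrast norm puts both estimates on a comparable scale
psi_lin_std = psi_lin / np.linalg.norm(c_lin)
psi_quad_std = psi_quad / np.linalg.norm(c_quad)

print("raw:       ", psi_lin, psi_quad)
print("normalized:", round(psi_lin_std, 2), round(psi_quad_std, 2))
print("orthogonal:", c_lin @ c_quad == 0)
```

On the normalized scale the linear component (about -7.78) is several times larger in magnitude than the quadratic one (about 1.22).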

Linear trend score for each subject: \[ D_i^{(B, \text{lin})} = -1 \cdot \bar Y_i^{(B=1)} + 0 \cdot \bar Y_i^{(B=2)} + 1 \cdot \bar Y_i^{(B=3)} \]

The trend is tested with a one-sample \(t\)-test on the mean score \(\bar D^{(B, \text{lin})}\).

6.3 \(A \times B\)

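\(A \times B\) has 2 degrees of freedom (\((2-1)(3-1)\)). It decomposes into two contrasts:
- \(A \times B\)-linear: the product of the \(A\) contrast and the \(B\)-linear contrast.
- \(A \times B\)-quadratic: the product of the \(A\) contrast and the \(B\)-quadratic contrast.

Each is tested separately with 1 degree of freedom.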

6.3.1 \(A \times B\)-linear

\(c^A = (+1, -1)\), \(c^{B,\text{lin}} = (-1, 0, +1)\). Kronecker product:

\(c^{AB,\text{lin}}_{jk}\):
- \((A=A, B=1)\): \((+1)(-1) = -1\)
- \((A=A, B=2)\): \((+1)(0) = 0\)
- \((A=A, B=3)\): \((+1)(+1) = +1\)
- \((A=B, B=1)\): \((-1)(-1) = +1\)
- \((A=B, B=2)\): \((-1)(0) = 0\)
- \((A=B, B=3)\): \((-1)(+1) = -1\)

\(\hat\psi_{A \times B, \text{lin}} = -1(130) + 0 + 1(115) + 1(132) + 0 + (-1)(125) = -130 + 115 + 132 - 125 = -8\)

Interpretation: the linear change under Drug A (\(115 - 130 = -15\)) is compared with the linear change under Drug B (\(125 - 132 = -7\)). The difference is \(-15 - (-7) = -8\), which is exactly what the \(A \times B\)-linear contrast captures.
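The Kronecker-product construction can be written out with `np.kron`, which also gives the \(A \times B\)-quadratic contrast that is not worked by hand here. This is a minimal sketch on the hypothetical cell means; the variable names are illustrative.

```python
import numpy as np

cell_means = np.array([[130.0, 120.0, 115.0],    # Drug A at times 1, 2, 3
                       [132.0, 128.0, 125.0]])   # Drug B at times 1, 2, 3
c_A = np.array([1, -1])
c_B_lin = np.array([-1, 0, 1])
c_B_quad = np.array([1, -2, 1])

# np.kron orders the weights as (Drug A, times 1..3), (Drug B, times 1..3),
# which matches the row-major order of cell_means.ravel()
w_lin = np.kron(c_A, c_B_lin)
w_quad = np.kron(c_A, c_B_quad)

psi_AB_lin = w_lin @ cell_means.ravel()
psi_AB_quad = w_quad @ cell_means.ravel()

print("weights (linear):   ", w_lin)
print("weights (quadratic):", w_quad)
print("A x B linear:   ", psi_AB_lin)    # -8.0
print("A x B quadratic:", psi_AB_quad)   # 4.0
```

The quadratic interaction contrast is \((130 - 240 + 115) - (132 - 256 + 125) = 5 - 1 = 4\), so the two drugs differ far more in slope than in curvature.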

7 ANOVA Results (hypothetical)

Source	\(SS\)	\(df\)	\(MS\)	\(F\)	\(p\)
\(A\)	600	1	600	\(600/15 = 40\) (\(MS_{AS}\))	\(<0.001\)
\(A \times S\)	135	9	15	—	—
\(B\)	800	2	400	\(400/10 = 40\) (\(MS_{BS}\))	\(<0.001\)
\(B \times S\)	180	18	10	—	—
\(A \times B\)	100	2	50	\(50/8 = 6.25\) (\(MS_{ABS}\))	0.009
\(A \times B \times S\)	144	18	8	—	—

All three effects are significant: drug type, time, and their interaction all matter.
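The \(F\) ratios and \(p\)-values in the table follow mechanically from the sums of squares and degrees of freedom. The sketch below recomputes them with `scipy.stats.f.sf`; the numbers are copied from the hypothetical table, and the dictionary layout is just one convenient way to hold them.

```python
from scipy import stats

# (SS_effect, df_effect, SS_error, df_error) copied from the hypothetical table
table = {
    "A":   (600.0, 1, 135.0, 9),
    "B":   (800.0, 2, 180.0, 18),
    "AxB": (100.0, 2, 144.0, 18),
}

results = {}
for name, (ss_eff, df_eff, ss_err, df_err) in table.items():
    F = (ss_eff / df_eff) / (ss_err / df_err)
    p = stats.f.sf(F, df_eff, df_err)    # upper-tail probability of the F distribution
    results[name] = (F, p)
    print(f"{name:4s} F({df_eff}, {df_err}) = {F:.2f}, p = {p:.4g}")
```

Each effect is divided by its own error mean square, so three different denominators (15, 10, 8) appear in one analysis.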

8 Visualization: Profile Plot

[Profile plot: SBP on the vertical axis against time (1, 2, 3) on the horizontal axis, one line per drug. Drug A: 130 → 120 → 115. Drug B: 132 → 128 → 125.]

The two lines have different slopes → \(A \times B\) interaction.

9 Python Code

import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM
from statsmodels.formula.api import mixedlm

np.random.seed(2026)
n = 10
A_levels = ["DrugA", "DrugB"]
B_levels = [1, 2, 3]
cell_means = {
    ("DrugA", 1): 130, ("DrugA", 2): 120, ("DrugA", 3): 115,
    ("DrugB", 1): 132, ("DrugB", 2): 128, ("DrugB", 3): 125,
}

records = []
for subj in range(n):
    pi = np.random.normal(0, 6)
    for a in A_levels:
        for b in B_levels:
            y = cell_means[(a, b)] + pi + np.random.normal(0, 3)
            records.append({"subject": subj, "A": a, "B": b, "Y": y})

data = pd.DataFrame(records)

# 2-way within-subjects ANOVA
aovrm = AnovaRM(data, "Y", "subject", within=["A", "B"]).fit()
print("=== 2-way Within ANOVA ===")
print(aovrm.anova_table)

# Mixed model
md = mixedlm("Y ~ C(A) * C(B)", data=data, groups=data["subject"]).fit()
print("\n=== Mixed Model ===")
print(md.summary().tables[1])

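# Trend codes for B (time): centered linear and quadratic terms
data["B_lin"] = data["B"] - 2  # -1, 0, 1
data["B_quad"] = (data["B"] - 2)**2 - 2/3  # 1/3, -2/3, 1/3, proportional to (1, -2, 1)

# Trend decomposition of B (mixed model with a random intercept per subject)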
md_trend = mixedlm("Y ~ C(A) * B_lin + C(A) * B_quad",
                   data=data, groups=data["subject"]).fit()
print("\n=== Trend Decomposition ===")
print(md_trend.summary().tables[1])

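10 Applying Trend Analysis to Within-Subjects Designs

Intuition: trend decomposition for quantitative time points

When \(B\) is time (quantitative, equally spaced), the 2 degrees of freedom of the \(B\) main effect decompose as follows:

\(B\)-linear (1 df): “Is the mean decline over time linear?”
\(B\)-quadratic (1 df): “Is there curvature beyond the linear trend?”

Each is equivalent to a 1-df one-sample t-test on the per-subject contrast scores. For the linear contrast this is the paired t-test of time 3 against time 1.

Likewise for \(A \times B\) (2 df):
- \(A \times B\)-linear: “Do Drug A and Drug B differ in their linear decline?”
- \(A \times B\)-quadratic: “Do they differ in curvature?”

Trend decomposition is the within-subjects extension of G-MAX6 and is a natural fit for quantitative time points.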
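All four 1-df trend tests in this section can be run the same way: form one contrast score per subject, then apply a one-sample \(t\)-test. The sketch below does this on simulated data; the SDs (6 for subjects, 3 for noise) and the seed are assumptions, and the weights `avg_A` and `diff_A` are illustrative names for "average over drugs" and "Drug A minus Drug B".

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 10
cell_means = np.array([[130.0, 120.0, 115.0],
                       [132.0, 128.0, 125.0]])
# Simulated scores, flattened to n x 6: Drug A times 1..3, then Drug B times 1..3
Y = (cell_means + rng.normal(0, 6, size=(n, 1, 1))
     + rng.normal(0, 3, size=(n, 2, 3))).reshape(n, 6)

avg_A = np.array([0.5, 0.5])     # average over the two drugs
diff_A = np.array([1, -1])       # Drug A minus Drug B
lin = np.array([-1, 0, 1])
quad = np.array([1, -2, 1])

contrasts = {
    "B-linear": np.kron(avg_A, lin),
    "B-quadratic": np.kron(avg_A, quad),
    "AxB-linear": np.kron(diff_A, lin),
    "AxB-quadratic": np.kron(diff_A, quad),
}

tests = {}
for name, w in contrasts.items():
    scores = Y @ w                                   # one contrast score per subject
    t_stat, p = stats.ttest_1samp(scores, popmean=0.0)
    tests[name] = (scores.mean(), t_stat, p)
    print(f"{name:14s} mean = {scores.mean():7.2f}, "
          f"t({n - 1}) = {t_stat:6.2f}, p = {p:.4f}")
```

Because each test uses the variance of its own contrast scores as the error term, none of them depends on the sphericity assumption.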

11 Post Hoc Comparisons: Pairwise Comparison

If the \(B\) main effect is significant, which time points differ? Use pairwise comparisons with a Bonferroni correction:

B1 vs B2: paired t-test → p_raw, p_Bonferroni
B1 vs B3: paired t-test → p_raw, p_Bonferroni
B2 vs B3: paired t-test → p_raw, p_Bonferroni

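Bonferroni: test each of the 3 comparisons at \(\alpha / 3\), or equivalently multiply each raw \(p\)-value by 3 (capped at 1).

Alternatively, use Tukey HSD (within-subjects version).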
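The three Bonferroni-corrected comparisons can be sketched with `scipy.stats.ttest_rel`. The data below are simulated per-subject scores at each time point, already averaged over drug; the SDs and the seed are assumptions for illustration only.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 10
time_means = np.array([131.0, 124.0, 120.0])

# Per-subject scores at each time point, averaged over drug (simulated;
# subject SD 6 and noise SD 3 are assumptions)
scores = time_means + rng.normal(0, 6, size=(n, 1)) + rng.normal(0, 3, size=(n, 3))

pairs = list(combinations(range(3), 2))    # (0, 1), (0, 2), (1, 2)
m = len(pairs)
results = []
for j, k in pairs:
    t_stat, p_raw = stats.ttest_rel(scores[:, j], scores[:, k])
    p_bonf = min(1.0, p_raw * m)           # Bonferroni: multiply by the number of tests
    results.append((j + 1, k + 1, t_stat, p_raw, p_bonf))
    print(f"B{j + 1} vs B{k + 1}: t = {t_stat:.2f}, "
          f"p_raw = {p_raw:.4f}, p_Bonferroni = {p_bonf:.4f}")
```

Each paired test uses only the two time points involved, so the comparisons do not rely on a pooled error term.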

12 Assumptions

12.1 Sphericity (revisited)

Sphericity for the \(B\) main effect: \(\text{Var}(Y_{B=1} - Y_{B=2}) = \text{Var}(Y_{B=1} - Y_{B=3}) = \text{Var}(Y_{B=2} - Y_{B=3})\).

If it is violated, apply an ε adjustment (G-MAX12-3).
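To make the condition concrete, the sketch below computes the variances of the pairwise differences from a covariance matrix, together with the Greenhouse-Geisser estimate, which is one common choice of ε (the details of the adjustment are left to G-MAX12-3). The two covariance matrices are made-up examples, and both function names are hypothetical.

```python
import numpy as np


def gg_epsilon(cov: np.ndarray) -> float:
    """Greenhouse-Geisser epsilon from a k x k covariance matrix of repeated measures."""
    k = cov.shape[0]
    H = np.eye(k) - np.ones((k, k)) / k    # centering matrix
    S = H @ cov @ H                        # double-centered covariance
    return float(np.trace(S) ** 2 / ((k - 1) * np.trace(S @ S)))


def diff_variances(cov: np.ndarray) -> dict:
    """Var(Y_j - Y_l) for every pair of levels, from the covariance matrix."""
    k = cov.shape[0]
    return {(j + 1, l + 1): float(cov[j, j] + cov[l, l] - 2 * cov[j, l])
            for j in range(k) for l in range(j + 1, k)}


# Compound symmetry (equal variances, equal covariances) satisfies sphericity
cs = np.array([[10.0, 4.0, 4.0],
               [4.0, 10.0, 4.0],
               [4.0, 4.0, 10.0]])
# Made-up violation: adjacent time points covary more than distant ones
ar = np.array([[10.0, 6.0, 2.0],
               [6.0, 10.0, 6.0],
               [2.0, 6.0, 10.0]])

for label, cov in [("compound symmetry", cs), ("adjacent-heavy", ar)]:
    print(label, diff_variances(cov), round(gg_epsilon(cov), 3))
```

Under compound symmetry all three difference variances are 12 and ε is 1. In the second matrix they are 8, 16, 8 and ε drops to 0.8. With only two levels there is a single difference score, so ε is always 1.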

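12.2 Missing Data

If a subject is missing even one time point, that subject is dropped entirely (listwise deletion), which can be a large loss. A multilevel model is recommended instead.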
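The cost of listwise deletion is easy to see on a toy table. In the sketch below the values are random placeholders and the three missing cells are chosen arbitrarily; only the counting matters.

```python
import numpy as np
import pandas as pd

# Wide layout: one row per subject, one column per cell (values are made up)
rng = np.random.default_rng(2)
cells = ["A1", "A2", "A3", "B1", "B2", "B3"]
wide = pd.DataFrame(rng.normal(125, 6, size=(10, 6)), columns=cells)

# Remove one measurement for each of three subjects
wide.loc[1, "A2"] = np.nan
wide.loc[4, "B3"] = np.nan
wide.loc[7, "A1"] = np.nan

complete = wide.dropna()    # listwise deletion: any missing cell drops the subject
n_missing = int(wide.isna().sum().sum())
print("missing cells:", n_missing, "of", wide.size)
print("subjects kept:", len(complete), "of", len(wide))
```

Here 3 missing cells out of 60 (5%) remove 3 of 10 subjects (30%), which is the loss a multilevel model avoids by using every observed cell.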

12.3 Normality of the Random Effects

\(\pi_i \sim N(0, \sigma^2_\pi)\). With small \(n\), the tests may not be robust to violations of this assumption.

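13 ML Mapping

Mapping: dataset × hyperparameter × epoch in ML

A multi-factor within design for ML model evaluation:

Dataset (subject): 10 datasets
A (within): learning rate (low, high), 2 levels
B (within): epoch (1, 5, 10), 3 levels

All 6 cells are evaluated on every dataset.

Analysis:
- \(A\) main: “Which learning rate is better on average?”
- \(B\) main: “Does accuracy change with epoch?”
- \(A \times B\): “Does the effect of epoch depend on the learning rate?”

A multilevel Bayesian model is the more principled way to handle this.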
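The layout described in this section can be sketched as a long-format table with one row per dataset, learning rate, and epoch. The dataset names and column names below are hypothetical, and the `accuracy` column is left empty because no results are given here.

```python
from itertools import product

import pandas as pd

datasets = [f"ds{i}" for i in range(1, 11)]    # hypothetical dataset names
learning_rates = ["low", "high"]               # factor A
epochs = [1, 5, 10]                            # factor B

# One row per (dataset, learning rate, epoch); accuracy is to be filled in later
grid = pd.DataFrame(list(product(datasets, learning_rates, epochs)),
                    columns=["dataset", "lr", "epoch"])
grid["accuracy"] = float("nan")

cells_per_dataset = grid.groupby("dataset").size()
print(grid.shape)
print(cells_per_dataset.unique())
```

Once `accuracy` is filled in, this is the same long format that `AnovaRM` takes in the Python code section, with `dataset` as the subject and `lr` and `epoch` as the within factors.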

14 This Series

G-MAX12-0  Overview
G-MAX12-1  The 7 effects of the 2×3 within design  ← this post
G-MAX12-2  Split-Plot Design
G-MAX12-3  Sphericity Extensions

15 Related Topics

Prerequisites

Follow-up topics

Links to other categories

Statistics — LDA Mixed Effects

16 Further Reading

Maxwell, S. E., Delaney, H. D. (2004). “Designing Experiments and Analyzing Data.”
Hedeker, D., Gibbons, R. D. (2006). “Longitudinal Data Analysis.” Wiley.
Davis, C. S. (2002). “Statistical Methods for the Analysis of Repeated Measurements.” Springer.