Kwangmin Kim - Ch.18 § 18.7~18.8 Deep Dive: Bibliography and Exercises + Ch.18 Wrap-up + Part IV Wrap-up

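1 Overview: the final installment of the Ch.18 deep-dive series

Structure of the Ch.18 deep-dive series:

03-18-0: Ch.18 Overview (survey of all 8 sections).
03-18-1: § 18.1~18.3 (Notation, Multiple Imputation, Multivariate Normal/\(t\)).
03-18-2: § 18.4~18.6 (1988 election polls, Counted Data, Slovenia).
03-18-3 (this installment): § 18.7~18.8 + Ch.18 wrap-up + Part IV wrap-up.

This installment is more than the last deep dive: it also closes out all of Part IV (Ch.14~18). After the map of the missing-data literature, the exercise solutions, and the Ch.18 series wrap-up, it completes the summary of the Ch.14~18 “staircase of likelihood extensions” and previews the transition to Part V (nonlinear and nonparametric models).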

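Intuition: why missing data closes Part IV

Ch.14~17 form a staircase of likelihood extensions for the model:

Ch.14: normal likelihood + regression.
Ch.15: normal + hierarchical structure.
Ch.16: non-normal (Poisson, binomial, multinomial).
Ch.17: heavy-tailed (\(t\), NegBin).

Ch.18 extends the observation process instead: it relaxes the very assumption that all of \(y\) was observed. Where the other chapters extended \(p(y | \theta)\), Ch.18 introduces \(p(I | y, \phi)\).

This extension completes the perspective of Part IV; from Part V on, the focus moves to structural flexibility of the model (nonlinear, nonparametric).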
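
The role of \(p(I | y, \phi)\) can be made concrete with the standard ignorability factorization. The short derivation below is a sketch in this note's notation, assuming \(y = (y_{\text{obs}}, y_{\text{mis}})\), MAR in the second step, and (for the closing remark) that \(\theta\) and \(\phi\) are distinct parameters with independent priors:

\[
\begin{aligned}
p(y_{\text{obs}}, I \mid \theta, \phi)
  &= \int p(y_{\text{obs}}, y_{\text{mis}} \mid \theta)\, p(I \mid y_{\text{obs}}, y_{\text{mis}}, \phi)\, dy_{\text{mis}} \\
  &= p(I \mid y_{\text{obs}}, \phi) \int p(y_{\text{obs}}, y_{\text{mis}} \mid \theta)\, dy_{\text{mis}}
     \qquad \text{(MAR: } p(I \mid y, \phi) = p(I \mid y_{\text{obs}}, \phi)\text{)} \\
  &= p(I \mid y_{\text{obs}}, \phi)\, p(y_{\text{obs}} \mid \theta).
\end{aligned}
\]

With distinct parameters the first factor carries no information about \(\theta\), so \(p(\theta \mid y_{\text{obs}}, I) \propto p(\theta)\, p(y_{\text{obs}} \mid \theta)\) and the missingness model can be left out of the analysis.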

2 § 18.7 Bibliographic Note, reorganized by topic

The references in Gelman's Ch.18 are grouped by topic below.

2.1 MAR and ignorability theory

Rubin (1976) Inference and Missing Data. The original paper for the terminology and theory of MAR, OAR, MCAR, and ignorability. One of the most influential papers published in Biometrika.
Skrondal, Rabe-Hesketh (2014): extends the classification of missingness mechanisms to the hierarchical-model setting.
Rubin (1978b): first proposal of multiple imputation.
Rubin (1987a) Multiple Imputation for Nonresponse in Surveys. The standard textbook on multiple imputation.
Rubin (1996): retrospective on multiple imputation after 18+ years.

2.2 Comprehensive textbooks

Little, Rubin (2002) Statistical Analysis with Missing Data (2nd ed.). The standard reference for missing-data statistics. Covers theory, algorithms, and practical examples.
Van Buuren (2012) Flexible Imputation of Missing Data. A modern, computation-focused text centered on MICE (chained equations).
Schafer (1997) Analysis of Incomplete Multivariate Data. Focused on multivariate normal, \(t\), and loglinear models.

2.3 Survey Nonresponse

Kish (1965): classical survey sampling, with less formal handling of missing data.
Madow et al. (1983): pre-1980s missing-data practice.
Groves et al. (2002) (eds.) Survey Nonresponse. Comprehensive review of survey nonresponse.

2.4 Data Augmentation · Computation

Tanner, Wong (1987) The Calculation of Posterior Distributions by Data Augmentation. A landmark JASA paper and the basis of the Gibbs approach in Ch.18.
Liu (1995): data augmentation for multivariate exchangeable models.
Satterthwaite (1946): the original mathematics behind the approximate variance estimate in Rubin's \(T_K\).
Meng, Raghunathan, Rubin (1991), Meng, Rubin (1992): refinements of Rubin's combining rules.
Meng (1994b): congeniality theory for MI (imputation model vs. analysis model).

2.5 Graphical Checks

Abayomi, Gelman, Levy (2008) Diagnostics for Multivariate Imputations. Visual checks of the imputed distributions.

2.6 MICE and Chained Equations

Raghunathan et al. (2001): established the MICE framework.
Gelman, Raghunathan (2001): “Inconsistent Gibbs”, a discussion of the theoretical justification of MICE.
Van Buuren, Boshuizen, Knook (1999): early application of MICE.
Van Buuren, Oudshoorn (2000): MICE software (the mice R package).
Su et al. (2011): modern extension of MICE.

2.7 Nonignorable models

Heitjan, Landis (1994): MNAR + matching for missing medical outcomes.
David et al. (1986): comparison of survey imputation methods.

2.8 Hierarchical Imputation

Clogg et al. (1991), Belin et al. (1993): hierarchical logistic imputation for the U.S. Census.

2.9 Monotone Method

Anderson (1957): the origin of estimation under monotone missing-data patterns.
Rubin (1974a, 1976, 1987a): extensions and computation for monotone patterns.

2.10 Textbook examples

Gelman, King, Liu (1998): original paper for the 51 polls of the 1988 presidential election in § 18.4.
Rubin, Stern, Vehovar (1995): original paper for the Slovenia example in § 18.6.

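3 § 18.8 Exercises: key solutions

3.1 Exercise 18.1: Slovenia reduced to 2×2

Problem: reduce the 3×3×3 Slovenia table of § 18.6 to the 2×2 table Attendance × Independence and reproduce the analysis three ways: EM, SEM, and Gibbs.

The Secession variable is ignored (marginalized out), and DK responses are treated as missing.

3.2 Table reduction

Summing the 3×3×3 table over the Secession dimension gives a 3×3 table (Attendance × Independence):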

Attendance	Ind Yes	Ind No	Ind DK
Yes	\(1191 + 158 + 90 = 1439\)	\(8 + 68 + 2 = 78\)	\(21 + 29 + 109 = 159\)
No	\(8 + 7 + 1 = 16\)	\(0 + 14 + 2 = 16\)	\(4 + 3 + 25 = 32\)
DK	\(107 + 18 + 19 = 144\)	\(3 + 43 + 8 = 54\)	\(9 + 31 + 96 = 136\)
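
The collapse over Secession can be checked mechanically. The sketch below rebuilds the 3×3×3 array from the three summands shown in each cell of the table above, assuming (the table does not say so explicitly) that the summands are ordered Secession = Yes, No, DK; the variable names are illustrative only.

```python
import numpy as np

# Axis order: (secession, attendance, independence); levels are (Yes, No, DK).
# Values are the summands from the collapsed table above; the Secession
# ordering of the summands is an assumption.
counts = np.array([
    [[1191, 8, 21], [8, 0, 4], [107, 3, 9]],     # first summand of each cell
    [[158, 68, 29], [7, 14, 3], [18, 43, 31]],   # second summand
    [[90, 2, 109], [1, 2, 25], [19, 8, 96]],     # third summand
])

collapsed = counts.sum(axis=0)   # 3x3 table: Attendance x Independence
complete = collapsed[:2, :2]     # both answers observed (no DK)

print(collapsed)
print("complete cases:", complete.sum(), "of", collapsed.sum())
```

The top-left 2×2 block of `collapsed` holds the complete cases; the remaining row and column hold the partially classified respondents used in the E-step.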

2×2 target: \((\theta_{00}, \theta_{01}, \theta_{10}, \theta_{11})\) (\(i\) = attendance, \(j\) = independence, 1 = Yes).

Quantity of interest: \(\alpha = \theta_{11} = P(\text{Att=Yes, Ind=Yes})\).

3.3 (a) EM for Posterior Mode

Complete cases: the counts \(m_{ij}\) of respondents with both answers observed, one for each of \((\theta_{00}, \theta_{01}, \theta_{10}, \theta_{11})\).

(Att=No, Ind=No): \(16\).
(Att=No, Ind=Yes): \(16\).
(Att=Yes, Ind=No): \(78\).
(Att=Yes, Ind=Yes): \(1439\).

Total complete: \(1549\).

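Partial cases: 5 of the 9 cells of the 3×3 table involve at least one DK (including the fully DK cell). Each pattern \(p\) has count \(r_p\) and is compatible with the subset \(S_p\) of 2×2 cells: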

Pattern	Subset \(S_p\)	Count \(r_p\)
(Att=Yes, Ind=DK)	\(\{10, 11\}\)	\(159\)
(Att=No, Ind=DK)	\(\{00, 01\}\)	\(32\)
(Att=DK, Ind=Yes)	\(\{01, 11\}\)	\(144\)
(Att=DK, Ind=No)	\(\{00, 10\}\)	\(54\)
(Att=DK, Ind=DK)	\(\{00, 01, 10, 11\}\)	\(136\)

EM E-step:

\[ n_{ij}^{\text{old}} = m_{ij} + \sum_p r_p \cdot \pi_{ij,p} \]

\[ \pi_{ij,p} = \frac{\theta_{ij} \mathbb{1}[ij \in S_p]}{\sum_{i'j' \in S_p} \theta_{i'j'}} \]

EM M-step (Dirichlet(0.1) prior mode):

\[ \theta_{ij}^{\text{new}} = \frac{n_{ij}^{\text{old}} + 0.1 - 1}{\sum n_{i'j'}^{\text{old}} + 4 \cdot (0.1 - 1)} \]

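(In general, with a Dirichlet(\(a_1, \dots, a_J\)) prior the posterior mode is \((n_j + a_j - 1)/(N + \sum_j a_j - J)\), which requires every \(n_j + a_j > 1\); here \(J = 4\) and all \(a_j = 0.1\).)

Initial value: complete-case proportions.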

After convergence, \(\hat\alpha = \hat\theta_{11} \approx 0.89\).

3.4 (b) SEM for Asymptotic Variance

SEM (Supplemented EM, Meng-Rubin 1991):

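Compute the mode \(\hat\theta\) with EM.
Estimate the matrix \(D_M\) of EM convergence rates (the Jacobian of the EM map at \(\hat\theta\)).
Compute the complete-data information \(I_c\) (easy for the Dirichlet-multinomial model).
Observed-data variance:

\[ \mathrm{Var}(\hat\theta) = I_c^{-1} + I_c^{-1} D_M (I - D_M)^{-1} \]

The first term is the complete-data variance and the second is the increase due to missing information; \(I\) in \((I - D_M)\) is the identity matrix.

The asymptotic variance of \(\mathrm{logit}(\alpha)\) then gives a 95% CI (delta method).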
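
The four SEM steps can be sketched numerically for the 2×2 table. This is a minimal illustration rather than a reference implementation: it works in the three free parameters \((\theta_{00}, \theta_{01}, \theta_{10})\), approximates \(D_M\) by central finite differences (the step size `h` is an arbitrary choice), uses the convention \(D_M[i, j] = \partial M_j / \partial \theta_i\), and takes the counts from the collapsed table. All function and variable names are hypothetical.

```python
import numpy as np

PRIOR = 0.1
m = np.array([16.0, 16.0, 78.0, 1439.0])  # complete cells 00, 01, 10, 11
partials = [([2, 3], 159), ([0, 1], 32), ([1, 3], 144),
            ([0, 2], 54), ([0, 1, 2, 3], 136)]


def expected_counts(theta):
    """E-step: complete counts plus partial counts split in proportion to theta."""
    n = m.copy()
    for subset, r in partials:
        n[subset] += r * theta[subset] / theta[subset].sum()
    return n


def em_map(phi):
    """One EM update in the free parameters phi = (theta00, theta01, theta10)."""
    theta = np.append(phi, 1.0 - phi.sum())
    w = expected_counts(theta) + PRIOR - 1.0  # Dirichlet posterior-mode weights
    return (w / w.sum())[:3]


# Step 1: run EM to tight convergence, starting from complete-case proportions.
phi = (m / m.sum())[:3]
for _ in range(5000):
    phi_next = em_map(phi)
    converged = np.max(np.abs(phi_next - phi)) < 1e-13
    phi = phi_next
    if converged:
        break
theta_hat = np.append(phi, 1.0 - phi.sum())

# Step 2: Jacobian of the EM map at the mode by central differences.
h = 1e-6
J = np.column_stack([(em_map(phi + h * e) - em_map(phi - h * e)) / (2 * h)
                     for e in np.eye(3)])   # J[i, j] = dM_i / dphi_j
DM = J.T                                    # DM[i, j] = dM_j / dphi_i

# Step 3: complete-data information in the free parameters, evaluated at the mode.
N_eff = (expected_counts(theta_hat) + PRIOR - 1.0).sum()
I_c = N_eff * (np.diag(1.0 / theta_hat[:3]) + 1.0 / theta_hat[3])

# Step 4: observed-data variance from the SEM identity.
I_c_inv = np.linalg.inv(I_c)
V = I_c_inv + I_c_inv @ DM @ np.linalg.inv(np.eye(3) - DM)

# Delta method for logit(alpha), where alpha = theta11 = 1 - sum(phi).
alpha = theta_hat[3]
var_alpha = np.ones(3) @ V @ np.ones(3)
se_logit = np.sqrt(var_alpha) / (alpha * (1.0 - alpha))
logit = np.log(alpha / (1.0 - alpha))
lo, hi = 1.0 / (1.0 + np.exp(-(logit + np.array([-1.96, 1.96]) * se_logit)))
print(f"alpha_hat = {alpha:.4f}, approx. 95% interval = [{lo:.4f}, {hi:.4f}]")
```

A quick sanity check on the result is that `V` should be symmetric up to numerical error; a visibly asymmetric `V` usually means EM had not converged tightly enough or the Jacobian was transposed.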

3.5 (c) Gibbs Sampler

Direct implementation (same structure as the Slovenia Python example in 03-18-2):

Impute partial counts in cells.
Draw \(\theta | n \sim \text{Dirichlet}(n + 0.1)\).
Repeat.

Starting distribution: complete-case proportions with a small perturbation.

Number of sequences: 3~5 chains, 5000 iterations each, burn-in 1000.

Convergence diagnostics: \(\hat R < 1.01\), ESS > 1000.
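
The \(\hat R\) check on several chains can be sketched as follows. This is a basic split-\(\hat R\) computed from within- and between-sequence variances, written for illustration with a hypothetical function name; it does not compute ESS, and dedicated libraries implement more refined versions. The two test inputs are synthetic, not Slovenia output.

```python
import numpy as np


def split_rhat(chains):
    """Split-R-hat for one scalar quantity; chains has shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    # Split every chain in two so that drift within a chain also shows up.
    halves = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = halves.shape[1]
    W = halves.var(axis=1, ddof=1).mean()      # within-sequence variance
    B = n * halves.mean(axis=1).var(ddof=1)    # between-sequence variance
    var_plus = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_plus / W)


rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 1000))             # four chains from the same target
stuck = mixed + np.arange(4)[:, None]          # four chains stuck at different levels
print(split_rhat(mixed), split_rhat(stuck))
```

Applied to the exercise, each row of `chains` would hold the post-burn-in draws of \(\alpha\) from one Gibbs run with its own seed and perturbed starting point.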

3.6 Combined Python implementation

import numpy as np


def slovenia_2x2_em(m, partials, prior=0.1, max_iter=100, tol=1e-8):
    """EM for 2x2 Slovenia table."""
    # m = [m00, m01, m10, m11]
    # partials = list of (subset_indices, count)
    theta = (np.array(m) + prior) / (sum(m) + 4 * prior)

    for it in range(max_iter):
        # E-step
        n_expected = np.array(m, dtype=float)
        for subset, r in partials:
            sub_probs = theta[subset] / theta[subset].sum()
            for idx, cell in enumerate(subset):
                n_expected[cell] += r * sub_probs[idx]

        # M-step (Dirichlet posterior mode)
        theta_new = (n_expected + prior - 1) / (n_expected.sum() + 4 * (prior - 1))
        theta_new = np.clip(theta_new, 1e-10, 1.0)
        theta_new /= theta_new.sum()

        if np.max(np.abs(theta_new - theta)) < tol:
            theta = theta_new
            break
        theta = theta_new

    return theta, it + 1


def slovenia_2x2_gibbs(m, partials, prior=0.1, n_iter=5000, burn=1000, seed=0):
    """Gibbs sampler for 2x2 Slovenia table."""
    rng = np.random.default_rng(seed)
    theta = (np.array(m) + prior) / (sum(m) + 4 * prior)
    alpha_samples = np.zeros(n_iter)

    for t in range(n_iter):
        n = np.array(m, dtype=int)
        for subset, r in partials:
            probs = theta[subset] / theta[subset].sum()
            imputed = rng.multinomial(r, probs)
            for idx, cell in enumerate(subset):
                n[cell] += imputed[idx]

        theta = rng.dirichlet(n + prior)
        alpha_samples[t] = theta[3]  # theta_11 = Att Yes, Ind Yes

    return alpha_samples[burn:]


# data (2x2 after marginalizing Secession)
m = [16, 16, 78, 1439]  # (att=0, ind=0), (0,1), (1,0), (1,1)
partials = [
    (np.array([2, 3]), 159),   # (Att=Yes, Ind=DK)
    (np.array([0, 1]), 32),    # (Att=No, Ind=DK)
    (np.array([1, 3]), 144),   # (Att=DK, Ind=Yes)
    (np.array([0, 2]), 54),    # (Att=DK, Ind=No)
    (np.array([0, 1, 2, 3]), 136),  # (Att=DK, Ind=DK)
]

# (a) EM
theta_em, iters = slovenia_2x2_em(m, partials)
alpha_em = theta_em[3]
print("EM posterior mode:")
print(f"  theta = {theta_em.round(4)}")
print(f"  alpha = P(Att=Yes, Ind=Yes) = {alpha_em:.4f}  (converged in {iters} iters)")

# (c) Gibbs
alpha_gibbs = slovenia_2x2_gibbs(m, partials, n_iter=5000)
print("\nGibbs posterior:")
print(f"  alpha mean = {alpha_gibbs.mean():.4f}")
print(f"  95% CI: [{np.percentile(alpha_gibbs, 2.5):.4f}, "
      f"{np.percentile(alpha_gibbs, 97.5):.4f}]")

# comparison
print(f"\nActual plebiscite: 0.932 * 0.948 = {0.932*0.948:.4f}")

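Sample output as listed in the original notes (not re-run here; exact figures depend on the seed and the NumPy version):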

EM posterior mode:
  theta = [0.023 0.023 0.047 0.907]
  alpha = P(Att=Yes, Ind=Yes) = 0.9069  (converged in 42 iters)

Gibbs posterior:
  alpha mean = 0.8823
  95% CI: [0.8664, 0.8966]

Actual plebiscite: 0.932 * 0.948 = 0.8835

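Interpretation:

The listing reports an EM mode of about 0.907 and a Gibbs 95% interval of [0.866, 0.897], which contains the actual 0.884.
EM and Gibbs target the same posterior for the same 2×2 model, and with roughly 2,000 respondents the posterior mode and mean should nearly coincide. A mode outside the Gibbs interval therefore cannot be attributed to ignoring Secession; iterating the E- and M-steps by hand from the complete-case proportions settles near 0.89, so the 0.907 figure should be re-checked by running the code.
The Gibbs summary is close to the 0.878 from the full 3×3×3 analysis (03-18-2).

Effect of Secession: reducing to 2×2 discards whatever Secession contributes to the imputation under MAR, so some loss of accuracy is expected. A small gap between the 2×2 and 3×3×3 results is consistent with that, but does not by itself prove it.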

3.7 Exercise 18.2: Monotone pattern comparison

Problem: discard some observations from the Slovenia data to create a monotone pattern, then compare the results.

How to make the pattern monotone:

Drop Secession (3×3×3 → 3×3).
Attendance DK: drop those rows entirely (their Independence responses are discarded).
Independence DK: kept as missing.

Resulting pattern: Attendance is fully observed; only Independence is partly missing.

Benefit: the monotone algorithm of § 18.3 (sequential analytical draws) becomes available.

\(\psi_1\) = marginal of Att (a Bernoulli parameter). \(\psi_2\) = conditional of Ind given Att (two Bernoulli parameters, for Att=0 and Att=1).

Sequential draw:

\(\psi_1 | y_{\text{obs, Att}} \sim \text{Beta}(n_{\text{Att=1}} + 1, n_{\text{Att=0}} + 1)\) (uniform prior).
For each Att value \(a\), \(\psi_{2, a} | y_{\text{obs, Ind | Att=a}} \sim \text{Beta}(n_{1|a} + 1, n_{0|a} + 1)\), where \(n_{1|a}\) and \(n_{0|a}\) count Ind=1 and Ind=0 among respondents with Att=\(a\) and Ind observed.

Very fast: the draws are analytical, with no data augmentation iterations.

Results:

\(\alpha = \psi_1 \cdot \psi_{2, 1}\) (P(Att=1) × P(Ind=1 | Att=1)).
\(\hat\alpha\): compare the monotone approximation with the full analysis.

Discarding data loses information, so a slightly wider CI is expected.
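
The sequential draw can be sketched in a few lines. The counts below come from the collapsed 3×3 table after dropping the Att=DK row; the uniform Beta(1, 1) priors follow the text, while the number of draws, the seed, and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 10_000

# Collapsed 3x3 counts with the Att=DK row discarded (monotone pattern).
att_yes = {"ind_yes": 1439, "ind_no": 78, "ind_dk": 159}
att_no = {"ind_yes": 16, "ind_no": 16, "ind_dk": 32}
n_att1 = sum(att_yes.values())   # everyone with Att=Yes, Ind observed or not
n_att0 = sum(att_no.values())    # everyone with Att=No

# Step 1: psi_1 = P(Att=1), using all retained respondents.
psi1 = rng.beta(n_att1 + 1, n_att0 + 1, size=n_draws)

# Step 2: psi_{2,1} = P(Ind=1 | Att=1), using only rows with Ind observed.
psi2_1 = rng.beta(att_yes["ind_yes"] + 1, att_yes["ind_no"] + 1, size=n_draws)

alpha = psi1 * psi2_1
print(f"alpha mean = {alpha.mean():.4f}")
print("95% interval:", np.percentile(alpha, [2.5, 97.5]).round(4))
```

Because the two Beta draws are independent, no iteration or burn-in is involved; every draw of `alpha` is an exact posterior draw under the monotone model.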

3.8 Exercise 18.3: comparing imputations on the 2010 GSS

Problem: apply MI to a reduced dataset from the 2010 General Social Survey (GSS) and compare three R packages:

mi (Gelman-Hill): Bayesian hierarchical, based on predictive distributions.
aregImpute (Harrell's Hmisc): predictive mean matching + bootstrap.
mice (Van Buuren): chained equations, the most widely used.

Variables: Sex, age, ethnicity (4 levels), urban/suburban/rural, education (5 levels), political ideology (7-point), happiness.

Analysis: logistic regression predicting “not too happy”.

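3.9 Characteristics of each approach

Complete cases:
Simplest, with a risk of bias.
Loses a large share of the sample.

mi package:
Multivariate Bayesian model.
Natural fit for survey-style estimation.
Slow (MCMC).

aregImpute:
Predictive mean matching: a missing value is replaced by an observed value from a case whose predicted mean is closest.
Preserves distributions well.
Efficient on large datasets.

mice:
Chained equations.
Chooses a default imputation model by variable type (pmm for numeric, logreg for binary, polyreg for unordered categorical, polr for ordered categorical).
Most flexible and most popular.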

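3.10 Expected differences

Regression coefficients:

All four approaches give roughly the same direction and magnitude.
SEs can differ:
- Complete cases: usually the widest SE, because incomplete rows are discarded (and the estimate may be biased as well).
- The three MI methods: similar SEs, typically somewhat smaller than the complete-case SE because partially observed rows are used; the between-imputation variance enters through the combining rules.

Differences by variable:

The coefficient SE of the variable with the most missingness (ideology) differs the most.
For categorical variables (ethnicity), the polyreg-based approach is expected to be more accurate than mi or aregImpute.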

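3.11 Practical recommendations

Working recommendations (this note's reading of Gelman's stance):

Default choice: mice (flexibility + community support).
When hierarchical structure matters: mi, or Stan/PyMC directly.
Large dataset + speed: aregImpute.
High missingness rate (> 30%): \(K \geq 20\) imputations recommended.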

4 Ch.18 deep-dive series wrap-up

4.1 Logic map of the installments

[Ch.18 Overview] 03-18-0
    ↓ survey of 8 sections, final gate of Part IV
[§ 18.1~18.3] 03-18-1: Theory·MI·Multivariate Normal
    ↓ full derivation of Eqs. (18.1)~(18.4)
    ↓ MAR·ignorability proof
    ↓ Rubin combining rules
    ↓ EM·data augmentation·monotone
[§ 18.4~18.6] 03-18-2: Applications
    ↓ 1988 polls, Eqs. (18.5)~(18.10)
    ↓ Multinomial Dirichlet counted
    ↓ Slovenia MAR 0.88 ≈ actual 0.884
[§ 18.7~18.8] 03-18-3 (this installment): Bibliography·Exercises·Wrapup
    ↓ literature map (Rubin·Little·Van Buuren)
    ↓ Slovenia 2×2 EM+Gibbs
    ↓ monotone comparison
    ↓ MICE tool comparison
    ↓ Ch.18 wrap-up + Part IV wrap-up

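4.2 Ch.18 wrap-up checklist

Diagnosis (03-18-1)

Compute missingness rates and visualize patterns.
Document the grounds for the MAR assumption.
Plan a sensitivity analysis if MNAR is suspected.

Model design

Define the data model \(p(y | \theta)\) (Ch.14~17).
If ignorable, omit the missingness mechanism.
If MNAR, specify \(p(I | y, \phi)\) explicitly.

Computation

Exploit a monotone pattern (analytical sequential draws).
Data augmentation Gibbs (non-monotone).
EM for MAP (fast initial estimate).
\(K \geq 10\) multiple imputations.

Combining and interpretation

Rubin combining rules (\(\bar\theta_K\), \(T_K\)).
Report the fraction of missing information \(\gamma\).
Sensitivity analysis (MAR vs. MNAR alternatives).
Compare with the actual ground truth (if available).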
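
The combining step can be sketched as a small helper. The formulas are the standard Rubin (1987) rules for a scalar estimand; the function name, the returned keys, and the toy inputs are illustrative, and the fraction of missing information is the simple large-sample ratio \((1 + 1/K) B_K / T_K\) rather than a degrees-of-freedom-corrected version. The sketch assumes the \(K\) estimates are not all identical, so that \(B_K > 0\).

```python
import numpy as np


def rubin_combine(estimates, variances):
    """Combine K completed-data estimates and their variances (scalar estimand)."""
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    K = est.size
    theta_bar = est.mean()                    # combined point estimate
    W = var.mean()                            # within-imputation variance
    B = est.var(ddof=1)                       # between-imputation variance
    T = W + (1.0 + 1.0 / K) * B               # total variance T_K
    df = (K - 1) * (1.0 + W / ((1.0 + 1.0 / K) * B)) ** 2   # t reference df
    gamma = (1.0 + 1.0 / K) * B / T           # approx. fraction of missing information
    return {"theta_bar": theta_bar, "W": W, "B": B, "T": T, "df": df, "gamma": gamma}


# Toy inputs: three imputations, each with unit sampling variance.
result = rubin_combine([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(result)
```

Intervals then use \(\bar\theta_K \pm t_{df} \sqrt{T_K}\); reporting `gamma` alongside them shows how much of the uncertainty is due to missingness.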

Tools

Python: pymc, statsmodels.imputation.mice.MICE.
R: mice (standard), mi (Bayesian), aregImpute (Hmisc).

5 Part IV wrap-up (Ch.14~18)

5.1 Logic map of the five chapters: the staircase of likelihood extensions

Ch.14 Regression          : normal likelihood, regression on X
    ↓ add group structure
Ch.15 Hierarchical Linear : normal + exchangeable batches
    ↓ non-normal response
Ch.16 GLM                 : Poisson, Binomial, Multinomial
    ↓ Heavy tail
Ch.17 Robust Inference    : t, Negative Binomial, Robit
    ↓ relax the observation process
Ch.18 Missing Data        : MAR, MI, imputation

Patterns common to the chapters:

Reuse of earlier computational engines: weighted regression from Ch.14, Gibbs from Ch.15, EM from Ch.13.
Auxiliary variables: \(\beta_j\) (Ch.15), \(V_i\) (Ch.17), \(y_{\text{mis}}\) (Ch.18), all within the same Gibbs/data augmentation frame.
Repeated model validation through posterior predictive checks.

5.2 Part IV key formulas

Chapter	Key formula	Key tools
Ch.14	\(\hat\beta = (X^T X)^{-1} X^T y\), \(V_\beta = (X^T X)^{-1}\)	QR·GLS·LASSO
Ch.15	\(\beta \sim N(1\alpha, \sigma_\beta^2 I)\), \(\rho = \sigma_\beta^2/(\sigma^2+\sigma_\beta^2)\)	Exchangeable batches·ANOVA
Ch.16	\(\log \mu = X\beta\), \(z_i, \sigma_i^2\) IWLS	Canonical link·IWLS·Cauchy(0, 2.5)
Ch.17	\(y_i \mid V_i \sim N(\mu, V_i), V_i \sim \text{Inv-}\chi^2\)	Scale mixture·robit
Ch.18	\(p(y_{\text{obs}}, I \mid \theta, \phi) = p(I \mid y_{\text{obs}}, \phi) p(y_{\text{obs}} \mid \theta)\)	MAR·Rubin rules·data aug
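
The Ch.17 row of the table can be checked by simulation: drawing \(V_i\) from a scaled inverse-\(\chi^2\) and then \(y_i \mid V_i\) from a normal should reproduce a \(t_\nu\) distribution. The sketch below assumes \(\nu = 5\), \(\mu = 0\), \(\sigma = 1\) purely for illustration and compares sample quantiles against direct \(t\) draws.

```python
import numpy as np

rng = np.random.default_rng(0)
nu, mu, sigma, n = 5.0, 0.0, 1.0, 200_000   # illustrative values

# V_i ~ Inv-chi^2(nu, sigma^2), drawn as nu * sigma^2 / chi^2_nu.
V = nu * sigma**2 / rng.chisquare(nu, size=n)

# y_i | V_i ~ N(mu, V_i): the scale-mixture route to a t_nu variable.
y_mix = rng.normal(mu, np.sqrt(V))

# Direct t_nu draws for comparison.
y_t = mu + sigma * rng.standard_t(nu, size=n)

qs = [0.025, 0.25, 0.5, 0.75, 0.975]
q_mix = np.quantile(y_mix, qs)
q_t = np.quantile(y_t, qs)
print(q_mix.round(2))
print(q_t.round(2))
```

The same \(V_i\) serve as the auxiliary variables in the Gibbs sampler for robust regression: conditional on them the model is an ordinary weighted normal regression.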

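5.3 Part IV study roadmap

Starting point: full understanding of Ch.14 § 14.1~14.2. You should be able to derive Eqs. (14.1)~(14.9) yourself.

Intermediate: the 8 schools example in Ch.15 (link to Ch.5) + IWLS for logistic regression in Ch.16.

Advanced: the scale mixture in Ch.17 + the MAR proof in Ch.18 + a MICE implementation.

Practice: combining MRP (Ch.16 § 16.5), hierarchical logistic regression (Ch.15 + Ch.16), and missing data (Ch.18). Together these three are the standard toolkit of modern Bayesian survey and political-science analysis.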

5.4 Part IV wrap-up checklist

Theory

Understand that OLS under a noninformative prior ≡ the Bayesian posterior mean.
Equivalence between exchangeable batches and intraclass correlation.
Exponential-family origin of the canonical link.
How the scale mixture representation produces heavy tails.
The MAR factorization and the conditions for ignorability.

Computation

Efficient regression sampling based on the QR decomposition.
Non-centered parameterization (hierarchical models).
IWLS for GLM.
Parameter expansion.
Data augmentation (auxiliary variables).

Practice

Weakly informative Cauchy prior for logistic regression.
Checking for overdispersion, then random effects or NegBin.
Two-stage MRP (multilevel + poststratification).
Robust regression (automatic downweighting of outliers).
Multiple imputation + Rubin rules.

Validation

Habit of posterior predictive checks.
Sensitivity analysis (prior, likelihood, MNAR).
Interval calibration (frequentist coverage of credible intervals).
Model comparison (WAIC, LOO, Bayes factor).

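6 Part V preview: Nonlinear·Nonparametric Models

Part IV is complete. Part V (Ch.19~23) turns to flexibility of model structure.

6.1 Ch.19 Parametric Nonlinear Models

Topic: models in which predictors and parameters combine nonlinearly.

Examples:

Serial dilution assay (concentration measurement): a form such as \(E(y) = A/(1 + (x/K)^n)\).
Drug pharmacokinetics: one- and two-compartment models.
Chemical reaction rates: the Arrhenius equation.

Difference from Ch.14~18: there is no linear predictor \(X\beta\). Gradient- and Hessian-based optimization together with HMC is typically needed.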

6.2 Ch.20 Basis Function Models: Splines

Topic: the regression function takes the form \(\sum \beta_k B_k(x)\), where the \(B_k\) are basis functions (spline, wavelet, Fourier, etc.).

Applications:

Natural cubic splines for nonlinear trends.
B-splines for function approximation.
P-splines (penalized).

6.3 Ch.21 Gaussian Processes

Topic: a prior placed on the function itself, that is, a Gaussian process prior on \(f(\cdot)\).

Applications:

Spatial statistics (kriging).
Bayesian optimization.
Time series smoothing.

GPs are supported in Stan, PyMC, and NumPyro. Strong similarity to the Ch.17 scale mixture (GP = infinite-dimensional normal).

6.4 Ch.22 Finite Mixture Models

Topic: \(p(y) = \sum_k \lambda_k f_k(y | \theta_k)\).

Applications:

Heterogeneous populations (each subgroup follows a different distribution).
Density estimation.
Latent class models.

The \(t\) model of Ch.17 is a continuous scale mixture of normals; finite mixtures apply the same mixing idea with a finite number of components.

6.5 Ch.23 Dirichlet Process Models

Topic: infinite-dimensional mixtures. The number of clusters is also estimated from the data.

Applications:

Nonparametric density estimation.
Topic modeling.
Survival analysis.

Stick-breaking representation and the Chinese Restaurant Process (CRP) view of clustering.
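
As a small preview of the stick-breaking representation, the sketch below draws truncated Dirichlet process mixture weights. The concentration value, the truncation at 50 atoms, and the function name are arbitrary illustrative choices, not something taken from the text.

```python
import numpy as np


def stick_breaking(concentration, n_atoms, rng):
    """Truncated stick-breaking weights: w_k = v_k * prod_{l<k} (1 - v_l)."""
    v = rng.beta(1.0, concentration, size=n_atoms)            # break fractions
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
    return v * remaining


rng = np.random.default_rng(0)
weights = stick_breaking(concentration=2.0, n_atoms=50, rng=rng)
print(weights[:5].round(3), "total:", weights.sum().round(6))
```

Smaller concentration values put most of the mass on the first few weights (few clusters); larger values spread it over many components.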

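6.6 The logic of the Part IV → V transition

Part IV is Bayes with a given number of parameters. Part V is Bayes in which the number of parameters is itself estimated: from parametric to nonparametric.

Key insight: nonparametric does not mean prior-free. A prior on function space (GP, DP, etc.) is still required. In Gelman's words: “Nonparametric = infinite-dimensional parametric”.

Intuition: a natural extension of Part IV

Ch.15 varying coefficients: finite-dimensional \(\beta_j\) for a finite set of groups \(j = 1, \dots, J\).

Ch.21 Gaussian process: a function \(f(x)\) over a continuous space \(x \in \mathbb{R}\), the infinite-dimensional generalization.

The structure is the same normal prior + conjugacy, with the dimension extended from finite to infinite. The Ch.23 Dirichlet process is likewise the infinite extension of a finite mixture.

Once Part IV is understood, Part V follows naturally as a mathematical generalization.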

7 Final hands-on checklist for the Ch.18 deep dive

Exercises Ex.18.1~18.3

Slovenia 2×2: implement EM + SEM + Gibbs and cross-check them against each other.
Quantify the information lost when reducing to a monotone pattern.
Assess sensitivity by comparing MI tools (mice, mi, aregImpute).

Part IV overall

Ability to derive Eqs. (14.1)~(14.9) of Ch.14 § 14.2.
Varying intercepts/slopes with an LKJ prior (Ch.15).
Resolving separation with a weakly informative prior (Ch.16).
Robust regression using the auxiliary \(V_i\) (Ch.17).
MAR proof + Rubin combining rules (Ch.18).

Combining in practice

MRP (hierarchical logistic + poststratification).
Robust regression + missing data (Ch.17 scale mixture + Ch.18).
Making posterior predictive checks a habit.

8 Related topics

Part V preview

Ch.19 Parametric Nonlinear Models (planned)
Ch.20 Basis Function Models (planned)
Ch.21 Gaussian Processes (planned)
Ch.22 Finite Mixture Models (planned)
Ch.23 Dirichlet Processes (planned)

9 References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian Data Analysis (3rd ed.), Ch.18 § 18.7~18.8. CRC Press.
Rubin, D. B. (1976). Inference and Missing Data. Biometrika, 63, 581-592.
Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley.
Rubin, D. B. (1996). Multiple Imputation after 18+ Years. JASA, 91, 473-489.
Little, R. J. A., & Rubin, D. B. (2002). Statistical Analysis with Missing Data (2nd ed.). Wiley.
Van Buuren, S. (2012). Flexible Imputation of Missing Data. CRC Press.
Tanner, M. A., & Wong, W. H. (1987). The Calculation of Posterior Distributions by Data Augmentation. JASA, 82, 528-540.
Meng, X.-L., & Rubin, D. B. (1991). Using EM to Obtain Asymptotic Variance-Covariance Matrices: The SEM Algorithm. JASA, 86, 899-909.
Raghunathan, T. E., et al. (2001). A Multivariate Technique for Multiply Imputing Missing Values Using a Sequence of Regression Models. Survey Methodology, 27, 85-95.
Abayomi, K., Gelman, A., & Levy, M. (2008). Diagnostics for Multivariate Imputations. Applied Statistics, 57, 273-291.