Kwangmin Kim - Ch.18 § 18.1~18.3 심화 — Notation·Multiple Imputation·Multivariate Normal/\(t\) 결측

1 개요

Ch.18 심화 시리즈의 첫 번째 편. § 18.1~18.3 은 결측 데이터의 이론 기초 + 알고리즘 + 표준 모형이다.

§ 18.1 — MAR/MCAR/ignorability의 정확한 수학적 정의와 factorization.
§ 18.2 — Multiple imputation 의 3-step 절차와 Rubin’s rules 완전 유도.
§ 18.3 — 다변량 정규·\(t\) 모형에서 결측 처리 (EM·data augmentation·monotone).

직관: 세 절의 역할

§ 18.1 에서 “언제 결측 메커니즘을 무시해도 되는가” 의 조건 정립.
§ 18.2 에서 “결측 불확실성을 어떻게 정직하게 반영하는가” — Multiple imputation 프레임워크.
§ 18.3 에서 “표준 모형 (정규/\(t\)) 에 결측이 있을 때 구체적으로 어떻게 계산하나” — EM + Gibbs.

세 절의 공통 기반: \(y_{\text{mis}}\) 를 parameter 처럼 다룸. Bayesian 에서는 자연스러운 접근 — 미지 quantity 에는 posterior distribution이 있다.

2 § 18.1 Notation and Missing Data Mechanisms

2.1 기본 표기 재정리

\(y\) = complete data.
\(y = (y_{\text{obs}}, y_{\text{mis}})\).
\(I\) = inclusion indicator (\(I_{ij} = 1\) if observed, 0 if missing).
\(\theta\) = data model parameters.
\(\phi\) = missingness mechanism parameters.

2.2 결합 분포 분해

가장 일반적 형태:

\[ p(y, I | \theta, \phi) = p(y | \theta) \cdot p(I | y, \phi) \]

두 부분:

Data model \(p(y | \theta)\) — 실제로 모델링하고자 하는 대상 (Ch.14~17).
Missingness mechanism \(p(I | y, \phi)\) — 왜 일부가 관측 안 됐는가.

2.3 관측 데이터의 Likelihood — 식 (18.1) 유도

실제 관측되는 것은 \((y_{\text{obs}}, I)\). \(y_{\text{mis}}\) 는 integrated out:

\[ p(y_{\text{obs}}, I | \theta, \phi) = \int p(y_{\text{obs}}, y_{\text{mis}} | \theta) \cdot p(I | y_{\text{obs}}, y_{\text{mis}}, \phi) \, dy_{\text{mis}} \quad \text{(18.1)} \]

의미: 관측 data likelihood 계산에 결측 변수에 대한 적분 이 필요. 미관측 값의 모든 가능한 값에 대해 평균.

왜 어려운가: \(p(I | y_{\text{obs}}, y_{\text{mis}}, \phi)\) 안에 \(y_{\text{mis}}\) 가 들어있으면 적분이 복잡. 이 의존성이 missingness mechanism의 핵심.

2.4 MAR — 식 (18.2) 완전 유도

정의: 결측 확률이 \(y_{\text{obs}}\) 에만 의존, \(y_{\text{mis}}\) 에 무관:

\[ p(I | y_{\text{obs}}, y_{\text{mis}}, \phi) = p(I | y_{\text{obs}}, \phi) \]

이 가정 하에서 식 (18.1) 을 다시 쓰면:

\[ p(y_{\text{obs}}, I | \theta, \phi) = \int p(y_{\text{obs}}, y_{\text{mis}} | \theta) \cdot p(I | y_{\text{obs}}, \phi) \, dy_{\text{mis}} \]

\(p(I | y_{\text{obs}}, \phi)\) 은 \(y_{\text{mis}}\) 에 무관하므로 적분 밖으로 빠짐:

\[ = p(I | y_{\text{obs}}, \phi) \cdot \int p(y_{\text{obs}}, y_{\text{mis}} | \theta) \, dy_{\text{mis}} \]

\(\int p(y_{\text{obs}}, y_{\text{mis}} | \theta) \, dy_{\text{mis}} = p(y_{\text{obs}} | \theta)\) (marginal).

따라서:

\[ \boxed{ p(y_{\text{obs}}, I | \theta, \phi) = p(I | y_{\text{obs}}, \phi) \cdot p(y_{\text{obs}} | \theta) } \quad \text{(18.2)} \]

결과: \(y_{\text{obs}}\) likelihood 가 \(\theta\) 와 \(\phi\) 로 factorize.

2.5 Ignorability 증명

Bayes:

\[ p(\theta, \phi | y_{\text{obs}}, I) \propto p(\theta, \phi) \cdot p(y_{\text{obs}}, I | \theta, \phi) \]

MAR 하:

\[ = p(\theta, \phi) \cdot p(I | y_{\text{obs}}, \phi) \cdot p(y_{\text{obs}} | \theta) \]

Parameter distinctness 가정 (\(\theta \perp \phi\) in prior): \(p(\theta, \phi) = p(\theta) p(\phi)\).

\[ p(\theta, \phi | y_{\text{obs}}, I) \propto [p(\phi) \cdot p(I | y_{\text{obs}}, \phi)] \cdot [p(\theta) \cdot p(y_{\text{obs}} | \theta)] \]

Factorization! \(\theta\) 와 \(\phi\) 사후가 독립:

\[ p(\theta | y_{\text{obs}}, I) \propto p(\theta) \cdot p(y_{\text{obs}} | \theta) \]

즉 \(\theta\) 추론은 \(p(I)\) 무시 가능. 이것이 ignorability.

Ignorability 의 두 조건

MAR: \(p(I | y, \phi) = p(I | y_{\text{obs}}, \phi)\).
Distinctness: \(\theta \perp \phi\) in prior.

두 조건 모두 필요. 보통 2번은 자연스러움 (다른 부류 parameters), 1번이 결정적.

실무: MAR 은 가정이지 확인 불가능. 그러나 보조 변수 를 많이 포함하면 MAR이 더 그럴듯 해진다. 예: 소득 결측이 나이·교육·직업으로 설명되면 MAR, 설명 안 되면 MNAR.

2.6 MCAR — 식 (18.3)

가장 강한 가정: 결측 확률이 \(y\) 전체에 무관:

\[ p(I | y_{\text{obs}}, y_{\text{mis}}, \phi) = p(I | \phi) \quad \text{(18.3)} \]

MCAR ⟹ MAR (역은 아님).

MCAR 하에서: complete-case analysis 도 편향 없음 (단 효율만 손실). MCAR 이 맞으면 결측 row 제거해도 OK.

실무 드뭄: “완전 무작위 결측”은 매우 제한적. 대부분 상황에서 결측은 최소한 일부 관측 변수에 의존.

2.7 MNAR — Nonignorable

MAR 도 아닌 경우. \(p(I | y, \phi)\) 가 \(y_{\text{mis}}\) 에 의존.

예시 — 소득 결측:

고소득자가 소득 공개 안 할 가능성 높음 → \(p(I = 0 | \text{income}, \phi)\) 가 income 값 (결측 값) 에 의존. MNAR.

처리:

결측 메커니즘 명시 모델링.
도메인 지식 + 민감도 분석.
MAR 가정 하 결과 vs 합리적 MNAR 대안 비교.

2.8 세 수준 비교

수준	수학 조건	실무 빈도	Complete-case 편향?	결측 모형 필요?
MCAR	\(p(I) = p(I)\)	드뭄	편향 없음 (효율만)	No
MAR	\(p(I \\| y_{\text{obs}})\)	흔함	보통 편향	No (ignorable)
MNAR	\(p(I \\| y)\)	현실적	심각한 편향	Yes

2.9 소득 예시 — 3 시나리오

\(y = (\text{age}, \text{income})\). Age 는 모두 관측, income 일부 결측.

시나리오	\(p(I_{\text{income}})\)	분류
“Income 결측 확률 고정 10%”	\(p(I) = 0.9\)	MCAR
“젊을수록 소득 공개 잘 안 함”	\(p(I \\| \text{age})\)	MAR
“소득 높을수록 공개 안 함 (age 통제 후도)”	\(p(I \\| \text{age, income})\)	MNAR

실무 가이드: MAR 가정 근거를 보조 변수 충분성으로 argument. “나이 + 교육 + 지역 + 직업 등을 통제하면 소득 결측이 소득 값에 의존하지 않는다” — 관련 공변량이 많을수록 설득력 증가.

3 § 18.2 Multiple Imputation — 완전 유도

3.1 Single Imputation의 문제

순진한 접근: 결측을 평균·회귀 예측 등으로 한 번 채움 (single imputation).

문제:

Imputed 값을 관측 값처럼 다루면 불확실성 과소 추정.
SE 가 작아짐 → credible interval 너무 좁음.
추론이 과신 (overconfident).

3.2 Multiple Imputation (Rubin 1987)

핵심 아이디어: \(K\) 번 imputation → \(K\) 개 complete datasets → 각각 분석 → 통합.

결측 불확실성을 “\(K\) imputation 간 변동” 으로 정량화.

3.3 3-Step 절차

Step 1 — Imputation:

\[ X_{\text{mis}}^k \sim p(X_{\text{mis}} | X_{\text{obs}}, y), \quad k = 1, \dots, K \]

Joint 모형 \(p(X, y | \theta, \psi)\) 의 사후 predictive에서 draws.

Step 2 — Analysis:

각 \(X^k = (X_{\text{obs}}, X_{\text{mis}}^k)\) 에 대해 원하는 분석 (Ch.14~17):

\[ \hat\theta_k, \hat W_k = \text{analysis}(X^k, y), \quad k = 1, \dots, K \]

\(\hat\theta_k\) = 점 추정, \(\hat W_k\) = 분산 추정 (scalar 단순화; 다차원이면 공분산 행렬).

Step 3 — Combining: Rubin’s rules.

3.4 Rubin’s Combining Rules — 완전 유도

점 추정:

\[ \bar\theta_K = \frac{1}{K} \sum_{k=1}^K \hat\theta_k \]

불확실성 분해:

총 분산 = (각 imputation 내 분산) + (imputation 간 분산).

Within-imputation variance:

\[ \bar W_K = \frac{1}{K} \sum_{k=1}^K \hat W_k \]

“완전 데이터 였다면 있었을 분산의 평균” — 결측과 무관한 고유 불확실성.

Between-imputation variance:

\[ B_K = \frac{1}{K-1} \sum_{k=1}^K (\hat\theta_k - \bar\theta_K)^2 \]

“\(K\) 개 imputation 간 추정 변동” — 결측으로 인한 추가 불확실성.

총 분산:

\[ T_K = \bar W_K + \frac{K+1}{K} B_K \]

3.5 왜 \((K+1)/K\) 인가 — 유도

\(K \to \infty\) 극한에서 “true” MI 분산:

\[ T_\infty = W + B \]

\(W\) = within-imputation true variance, \(B\) = true between-imputation variance.

유한 \(K\): \(\bar W_K\) 는 \(W\) 의 unbiased estimator, \(B_K\) 는 \(B\) 의 unbiased estimator.

그러나: 관심 분산은 \(\bar\theta_K\) 의 sampling variance (MI 절차의 반복에 의한 변동).

\[ \mathrm{Var}(\bar\theta_K) = \mathbb{E}[\text{within}] + \mathrm{Var}[\text{across}] = W + \frac{B}{K} \]

(평균이므로 between-variance가 \(1/K\) 로 감소.)

Total variance (for inference on \(\theta\), not \(\bar\theta_K\)):

\[ \mathrm{Var}(\theta | \text{data}) = W + B = \bar W_K + \frac{K+1}{K} B_K \]

\((K+1)/K\) 는 finite-\(K\) 보정. \(K = 5\) 이면 \(6/5 = 1.2\) 배 (20% 부풀림).

직관: \((K+1)/K\) 의 두 해석

Intuition 1: \(B_K\) 자체가 \(K\) 유한으로 noisy. 이 noise를 보수적으로 보상 → 약간 부풀림.

Intuition 2: \(\bar\theta_K\) 의 sampling variance는 \(W + B/K\). 그러나 inference 대상은 \(\theta\) (future imputation에서 추출될 값) 이므로 추가 \(B\) 가 필요. 합쳐서 \(W + B + B/K = W + (K+1)B/K\).

Rubin 의 공식은 두 해석이 수렴하는 지점.

Fraction of missing information:

\[ \gamma = \frac{r + 2/(df + 3)}{r + 1}, \quad r = \frac{(K+1) B_K / K}{\bar W_K} \]

\(\gamma \approx 0\) 이면 결측 영향 작음, \(\gamma\) 크면 결측이 불확실성에 지배적. 통상 결측률 \(p\) 의 대체 측정치 (큰 표본에서 \(\gamma \approx p\)).

3.6 Degrees of Freedom

\(t\)-interval 자유도 (Rubin-Barnard 1999 완성):

\[ \mathrm{df}_K = (K-1) \left( 1 + \frac{K}{K+1} \frac{\bar W_K}{B_K} \right)^2 \]

\(B_K\) 작으면 \(\mathrm{df}\) 커짐 (결측 거의 영향 없음 → 완전 데이터 자유도에 가까움).

\(B_K\) 크면 \(\mathrm{df}\) 작아짐 (결측 불확실성 증폭 → 더 넓은 interval 필요).

3.7 \(K\) 선택 권장

전통적 (Rubin 1987): \(K = 5\) — 대부분 충분.

현대적 (van Buuren 2018): fraction of missing information 에 비례.

\(\gamma \approx 0.1\): \(K = 5\).
\(\gamma \approx 0.3\): \(K = 20\).
\(\gamma \approx 0.5\): \(K = 50\).

Bayesian 순수 접근: MCMC samples 수천 개 그대로 사용 (암묵적으로 \(K \to \infty\)).

3.8 Data Augmentation — Tanner-Wong (1987)

Iterative MI 를 MCMC 관점으로:

\[ \begin{aligned} y_{\text{mis}}^{s+1} | \theta^s, y_{\text{obs}} &\sim p(y_{\text{mis}} | \theta^s, y_{\text{obs}}) \quad (\text{I-step}) \\ \theta^{s+1} | y_{\text{mis}}^{s+1}, y_{\text{obs}} &\sim p(\theta | y_{\text{obs}}, y_{\text{mis}}^{s+1}) \quad (\text{P-step}) \end{aligned} \]

I-step = Imputation step, P-step = Posterior step.

이것이 Gibbs sampler의 missing data 특수 버전. Steady state 에 수렴하면 joint posterior \(p(\theta, y_{\text{mis}} | y_{\text{obs}})\) 에서 draw.

Ch.17 scale mixture 와의 동형성: \(V_i\) augmentation과 \(y_{\text{mis}}\) augmentation이 수학적으로 동등. 둘 다 “latent variable을 Gibbs 에 포함”.

3.9 EM with Missing Data

Posterior mode 원하면 EM:

E-step: \(y_{\text{mis}}\) 의 conditional expectation (또는 sufficient statistics).

M-step: Imputed data로 정규 MLE.

Ch.13 § 13.4 의 특수 사례.

3.10 Monotone Missing Pattern — 계산 Shortcut

정의: 변수를 순서대로 정렬하면 “뒤로 갈수록 더 많이 결측”.

예: 종단 연구 dropout — 일단 drop 하면 이후 모든 시점 결측.

3.11 Monotone의 이점

Standard 알고리즘: 각 관측의 결측 패턴마다 다른 conditional 계산 → 느림.

Monotone 알고리즘: 패턴이 \(k\) 개 blocks 로 단순화 → 한 번에 처리.

구체적: 변수 \(y_1, \dots, y_d\) 가 monotone (뒤로 갈수록 결측 많음). Likelihood factorize:

\[ \log p(y_{\text{obs}} | \psi) = \log p(y_{\text{obs},1} | \psi_1) + \log p(y_{\text{obs},2} | y_{\text{obs},1}, \psi_2) + \dots \]

\(\psi_j\) = \(y_j\) 의 조건부 분포 모수 (\(y_1, \dots, y_{j-1}\) 주어진 회귀).

Prior 도 factorize 가능하면:

\[ p(\psi) = p(\psi_1) p(\psi_2 | \psi_1) \cdots p(\psi_k | \psi_1, \dots, \psi_{k-1}) \]

순차 사후 추출 가능 (data augmentation 반복 없이):

\(\psi_1 | y_{\text{obs},1}\) 에서 draw.
\(\psi_2 | \psi_1, y_{\text{obs},1}, y_{\text{obs},2}\) 에서 draw.
…

3.12 Near-Monotone Strategy

거의 monotone 이지만 일부 위반 인 경우:

위반 결측값만 (minor deviations) data augmentation 으로 impute → monotone pattern 완성.
이후 순차 draw (analytical).

통상 full data augmentation 보다 훨씬 빠름. 위반 비율이 작을수록 효율 큼.

3.13 변수 순서 결정

“결측 적은 순서” 로 배열이 경험적 방법. 정확히는:

\[ \text{변수 } j \text{의 결측 비율} = \frac{\text{missing in } y_j}{n} \]

오름차순으로 재배열 → 대략 monotone.

4 § 18.3 Missing Data in Multivariate Normal and \(t\)

4.1 Multivariate Normal 모형

\(y_i \sim N_d(\mu, \Sigma)\), \(i = 1, \dots, n\). 일부 성분 결측.

관심 모수: \((\mu, \Sigma)\).

충분통계량:

\[ T_1 = \sum_{i=1}^n y_{ij}, \quad T_2 = \sum_{i=1}^n y_{ij} y_{ik} \]

(for each \(j, k \in \{1, \dots, d\}\)).

4.2 EM 알고리즘

E-step: 결측 성분에 대한 조건부 기댓값과 공분산 으로 충분통계량 업데이트.

Multivariate normal conditional 공식:

\(y_i = (y_{\text{obs}, i}, y_{\text{mis}, i})\), \(\mu = (\mu_{\text{obs}}, \mu_{\text{mis}})\), \(\Sigma\) blocks \(\Sigma_{\text{oo}}, \Sigma_{\text{om}}, \Sigma_{\text{mo}}, \Sigma_{\text{mm}}\).

\[ y_{\text{mis},i} | y_{\text{obs},i}, \mu, \Sigma \sim N\left( \mu_{\text{mis}} + \Sigma_{\text{mo}} \Sigma_{\text{oo}}^{-1} (y_{\text{obs},i} - \mu_{\text{obs}}), \; \Sigma_{\text{mm}} - \Sigma_{\text{mo}} \Sigma_{\text{oo}}^{-1} \Sigma_{\text{om}} \right) \]

E-step:

\[ \mathbb{E}[y_{ij} | y_{\text{obs}}, \theta^{\text{old}}] = \begin{cases} y_{ij} & \text{if observed} \\ \mu_j + \Sigma_{j, \text{obs}} \Sigma_{\text{oo}}^{-1} (y_{\text{obs},i} - \mu_{\text{obs}}) & \text{if missing} \end{cases} \]

\[ \mathbb{E}[y_{ij} y_{ik} | y_{\text{obs}}, \theta^{\text{old}}] = y_{ij}^{\text{old}} y_{ik}^{\text{old}} + c_{ijk}^{\text{old}} \]

\(c_{ijk}\) = conditional covariance (0 if both observed, conditional \(\mathrm{Cov}\) if both missing).

M-step: 업데이트된 \(\mathbb{E}[T_1], \mathbb{E}[T_2]\) 로 정규 MLE:

\[ \mu_j^{\text{new}} = \frac{1}{n} \sum_i y_{ij}^{\text{old}} \]

\[ \sigma_{jk}^{\text{new}} = \frac{1}{n} \sum_i (y_{ij}^{\text{old}} y_{ik}^{\text{old}} + c_{ijk}^{\text{old}}) - \mu_j^{\text{new}} \mu_k^{\text{new}} \]

4.3 Gibbs Sampler (Full Posterior)

\(y_{\text{mis},i} | \mu, \Sigma, y_{\text{obs},i}\): conditional multivariate normal.
\(\mu | y, \Sigma\): normal (충분통계량 이용).
\(\Sigma | y, \mu\): inverse-Wishart (conjugate).

반복.

4.4 초기값

\(\Sigma\) positive definite 중요. 방법:

Complete-case covariance: 관측 row만 사용한 sample \(\hat\Sigma\).
Mean imputation 후 sample covariance.
Regularized (ridge) — 소수 complete cases 로 singular 방지.

4.5 예: 2변수 경우

\(d = 2\), \(y = (y_1, y_2)\). \(y_1\) 완전 관측, \(y_2\) 일부 결측.

조건부 분포:

\[ y_{2,i} | y_{1,i}, \mu, \Sigma \sim N\left( \mu_2 + \frac{\sigma_{12}}{\sigma_{11}}(y_{1,i} - \mu_1), \; \sigma_{22} - \frac{\sigma_{12}^2}{\sigma_{11}} \right) \]

E-step:

\[ y_{2,i}^{\text{old}} = \begin{cases} y_{2,i} & \text{if observed} \\ \mu_2 + \frac{\sigma_{12}}{\sigma_{11}}(y_{1,i} - \mu_1) & \text{if missing} \end{cases} \]

\[ c_{i,22}^{\text{old}} = \begin{cases} 0 & \text{if observed} \\ \sigma_{22} - \sigma_{12}^2/\sigma_{11} & \text{if missing} \end{cases} \]

\(y_1\) 은 완전 관측이므로 \(c_{i,11} = 0\) 항상.

4.6 Monotone Pattern 의 특수 이점

\(y_1\) 완전 관측, \(y_2\) 일부 결측, \(y_3\) 더 많이 결측, … — monotone.

Likelihood factorize:

\[ p(y | \mu, \Sigma) = p(y_1 | \psi_1) \cdot p(y_2 | y_1, \psi_2) \cdot p(y_3 | y_1, y_2, \psi_3) \cdots \]

\(\psi_j\) = \(y_j\) 의 \(y_1, \dots, y_{j-1}\) 에 대한 선형 회귀 (계수 + 잔차 분산).

순차 draw:

\(y_1\) 관측치로 \(\psi_1\) posterior.
\((y_1, y_2)\) 관측치로 \(\psi_2 | \psi_1\) posterior.
…

각 step은 표준 회귀 분석 → analytical. Data augmentation 불필요.

4.7 \(t\) 확장 — 식 (18.4)

Ch.17 의 \(t\) 를 결측 상황에 적용:

\[ y_i | \theta, V_i \sim N_d(\mu, V_i \Sigma), \quad V_i \sim \text{Inv-}\chi^2(\nu, 1) \]

\(V_i\) 도 미관측 — 결측 취급.

\(V_i\) 의 conditional posterior:

\[ p(V_i | y_{\text{obs}}, \theta, \nu) \propto N(y_{\text{obs},i} | \mu_{\text{obs}}, V_i \Sigma_{\text{obs}}) \cdot \text{Inv-}\chi^2(V_i | \nu, 1) \quad \text{(18.4)} \]

(18.4) 는 scaled inverse-\(\chi^2\) — conjugate!

\[ V_i | y_{\text{obs},i}, \theta, \nu \sim \text{Inv-}\chi^2\left( \nu + d_{\text{obs},i}, \frac{\nu + (y_{\text{obs},i} - \mu_{\text{obs}})^T \Sigma_{\text{obs}}^{-1} (y_{\text{obs},i} - \mu_{\text{obs}})}{\nu + d_{\text{obs},i}} \right) \]

\(d_{\text{obs},i}\) = observed 성분 수.

직관: 왜 식 (18.4) 가 scaled inverse-\(\chi^2\) 인가

정규 × Inv-\(\chi^2\) prior는 conjugate — scaled inverse-\(\chi^2\) posterior.

Multivariate 확장: \(y_{\text{obs},i}\) 만 관측 → observed 부분의 quadratic form 만 업데이트에 기여.

\[ \text{Quadratic} = (y_{\text{obs},i} - \mu_{\text{obs}})^T \Sigma_{\text{obs}}^{-1} (y_{\text{obs},i} - \mu_{\text{obs}}) \]

결측 성분은 \(V_i\) posterior에 정보 제공 안 함 (자연스러움 — 관측 안 됐으니).

효과: 관측 성분이 \(\mu_{\text{obs}}\) 에서 멀면 (outlier-like) \(V_i\) 커짐 → 해당 관측을 downweight.

Ch.17 의 \(t\) robust inference와 동일 직관, 단 multivariate/missing 확장.

4.8 Full Gibbs for \(t\) with Missing

\(V_i | y_{\text{obs},i}, \theta, \nu\): 식 (18.4) scaled inverse-\(\chi^2\).
\(y_{\text{mis},i} | y_{\text{obs},i}, \theta, V_i\): multivariate normal (Ch.17 정규 결측과 동일).
\(\theta = (\mu, \Sigma) | y, V\): weighted normal-inverse-Wishart posterior.

(4. 옵션: \(\nu | V\) Metropolis.)

3-step cycle. 표준 multivariate 정규 결측 Gibbs + \(V_i\) 하나 더.

4.9 Nonignorable Extension

MAR 가정이 깨질 때. 결측 메커니즘 parameters \(\phi\) 를 model에 포함: