1 Introduction

Scope of this post:

§ 14.5 introduction: the challenge of MNAR, the two model classes (selection and pattern-mixture), and the value of sensitivity analysis.
§ 14.5.1 Selection models in general: the origin in Heckman (1976), the Diggle-Kenward (1994) extension, and criticisms.
§ 14.5.1.1 Mixed-effects / shared parameter models: the formulation in Eqs. 14.10-14.14.
§ 14.5.1.2 NIMH Schizophrenia example: Eqs. 14.15-14.20, NLMIXED, Table 14.11.

One-line summary

“§ 14.5 and § 14.5.1 present the first framework for handling MNAR: the selection model. Under nonignorable missingness, standard models give biased results, yet ignorability cannot be tested from the data alone (Kenward 1998). Little (1995) distinguishes two classes: selection and pattern-mixture models. The selection idea is $f(y, R) = f(y) \cdot f(R \mid y)$: the response comes first, missingness second. The approach originates with Heckman (1976) in econometrics, followed by the tutorial of Leigh et al. (1993) and the longitudinal extension of Diggle and Kenward (1994). § 14.5.1.1 covers mixed-effects / shared parameter selection models (Wu-Carroll 1988, De Gruttola-Tu 1994, Schluchter 1992, Ten Have et al. 1998): the longitudinal model $f_y(y \mid v)$ and the dropout model $f_D(D \mid v)$ share the random effect $v$. Eq. 14.12 gives the marginal likelihood $\int f_y f_D f(v)\, dv$, estimated with a Cholesky reparameterization and Gauss-Hermite quadrature; $\alpha^* \neq 0$ implies nonignorability. § 14.5.1.2 is the NIMH example: the SWeek MRM of Eq. 14.15 plus the clog-log dropout model of Eq. 14.18 with $\theta_0, \theta_1$ and their Drug interactions. The ordinal equivalence of Eqs. 14.19-14.20 (Engel 1993, Läärä-Matthews 1985) allows a simpler dataset. Table 14.11: the shared model fits better than the separate models (LR $\chi^2_4 = 30.1$, p < .0001). The Drug × slope interaction is $\alpha_5 = -1.638$ (p = .003): on placebo, patients who do not improve drop out; on drug, patients who improve quickly drop out, which agrees with the MeanY interaction found in § 14.4.1. These models are tools for sensitivity analysis, not a definitive answer.”

2 § 14.5 Introduction: Models for Nonignorable Missingness

2.1 The Challenge of MNAR

Not testable from the data

Quoted from the text:

“the observed data provide no information to either confirm or refute ignorability. With that said, assuming a particular model for nonignorability and ignorability, one can test for ignorability, but this test is completely dependent on the proposed models for nonignorability and ignorability (e.g., see Kenward [1998]). Again, the data cannot address this important point independent of assumed models.”

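Key messages:

Ignorability cannot be tested from the data alone.
A specific ignorable model can be compared against a specific nonignorable model.
→ Any such test depends critically on the assumed models.

Kenward (1998) makes the limits of this comparison explicit:

The test result is tied to the specification of the two assumed models.
A different specification can give a different result.
→ There is no universal test of ignorability.

Intuition: why it cannot be tested

The essential difference between MAR and MNAR:

MAR: $R \perp y^M \mid X, y^O$.
MNAR: $R \not\perp y^M \mid X, y^O$.
The two assumptions differ only in the relationship with the unobserved $y^M$, which is not in the data.

Comparison with the NIMH example of § 14.4.1:

Testing MCAR: uses only $y^O$ → testable.
MAR vs MNAR: requires $y^M$ → not testable.

The decisive role of model assumptions:

Selection model: a specific form for $f(R \mid y, X)$.
Pattern-mixture model: specific assumptions about the unobserved distribution.
Both need identifying assumptions.
Different assumptions → different results.

Motivation for sensitivity analysis:

Do not rely on a single nonignorable model.
Examine the results under a range of assumptions.
Consistent results → robust conclusion.
Varying results → clinical judgment plus conservative interpretation.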

2.2 The Two Model Classes of Little (1995)

Selection vs Pattern-Mixture

Quoted from the text:

“Little [1995] described much of these approaches in terms of two broad model classes: selection and pattern-mixture models.”

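Selection models:

$f(y, R) = f(y) \cdot f(R \mid y)$.
The response distribution times the conditional distribution of missingness.
An explicit model for “why does missingness occur?”.

Pattern-mixture models:

$f(y, R) = f(R) \cdot f(y \mid R)$.
A response distribution for each missingness pattern.
“What is the response distribution within each missingness pattern?”.

Key references:

Little (1995): overview.
Glynn et al. (1986): early comparison.
Hogan & Laird (1997b): review of selection vs pattern-mixture models.
Michiels et al. (2002): pattern-mixture sensitivity.
Little (1993, 1994): origin of pattern-mixture models.
Diggle & Kenward (1994): the standard selection model, with discussion.

Intuition: how the two frameworks differ

The selection view (causal ordering):

The response arises first (the natural process).
Missingness then arises depending on the response.
→ A “response → missingness” causal model.

The pattern-mixture view:

Patients are grouped by missingness pattern.
Each group has its own response distribution.
→ A mixture model over patterns.

Statistical equivalence of the two frameworks:

They are two different factorizations of the same joint distribution $f(y, R)$.
In theory they are equivalent.
In practice each factor is given its own parametric form and identifying restrictions, so the specifications differ.
→ The results can differ.

When to use which:

Selection: when there is a clear hypothesis about the missingness mechanism.
Pattern-mixture: for descriptive analysis by pattern, and for sensitivity analysis.
Usually both are tried and the results compared.

A strong warning from the authors:

Quoted from the text: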

“several authors warn against use of a particular nonignorable model as ‘the’ model, because these models make assumptions about the missing data that are essentially impossible to verify with the observed data.”

→ No nonignorable model is “the” model. → Use them as tools for sensitivity analysis.

2.3 The Value of Sensitivity Analysis

Intuition: sensitivity analysis in practice

Quoted from the text:

“Use of nonignorable models can be helpful in conducting a sensitivity analysis; to see how the conclusions might vary as a function of what is assumed about the missing data.”

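Sensitivity procedure:

Default analysis: MAR assumption (MRM/CPM with full likelihood).
Fit a selection model (§ 14.5.1).
Fit a pattern-mixture model (§ 14.5.2).
Compare the three sets of results → check consistency.

Interpretation:

Similar results across the three models → robust conclusion.
Large differences → the missingness mechanism is decisive → be cautious in clinical judgment.

Practical recommendations:

Clinical reports: present both the MAR results and the sensitivity analysis.
State explicitly how the results change when the model assumptions change.
Interpret conservatively.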

3 § 14.5.1: Selection Models

3.1 History of Selection Models

The econometric origin in Heckman (1976)

Quoted from the text:

“The use of selection models for dealing with missing data in longitudinal studies has a relatively long history, being first proposed by Heckman [1976] in the econometric literature. More recently, Leigh et al. [1993] present a useful tutorial article on implementation of this approach.”

The original two-stage procedure:

Stage 1: a predictive model for dropout (logistic).
- Predictors: baseline + time-varying covariates.
- Output: dropout propensity score.
Stage 2: the longitudinal model plus the propensity score.
- The propensity score enters as a covariate.
- Adjusts for the effect of dropout.

Diggle-Kenward (1994) extension:

The dropout model includes past $y_i^O$ and the unobserved $y_i^M$.
→ An explicit MNAR model.
The standard selection model in longitudinal data analysis.

Intuition: the idea behind selection

Heckman's econometric motivation:

Labor economics: a wage equation with selection into employment.
Wages are not observed for everyone (the unemployed are excluded).
→ Correct for the “selection” effect.

Longitudinal application:

Not every occasion is observed (dropout).
→ Correct for the dropout selection effect.

Stage 1 propensity score:

$\pi_i = P(D_i = 1 \mid X_i, y_i^O)$.
The dropout probability of patient $i$.
Estimated by logistic regression.

Stage 2 adjustment:

$y_{ij} = x_{ij}'\beta + \pi_i \gamma + \varepsilon_{ij}$.
The propensity is a covariate.
The dropout effect is separated out explicitly.
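
The two stages can be sketched on simulated data as follows. This is only a simplified, cross-sectional illustration of the propensity-as-covariate idea described here; the variable names (`baseline`, `dropout`, `pi_hat`) and all generating values are assumptions made for the example, and Heckman's original estimator used a probit selection equation rather than the logistic model shown.

```r
set.seed(1)
n        <- 500
baseline <- rnorm(n)                 # hypothetical baseline covariate
drug     <- rbinom(n, 1, 0.5)        # hypothetical treatment indicator

# Stage 1: logistic model for dropout, giving one propensity score per subject
dropout <- rbinom(n, 1, plogis(-1 + 0.8 * baseline - 0.5 * drug))
stage1  <- glm(dropout ~ baseline + drug, family = binomial())
pi_hat  <- fitted(stage1)            # estimated P(D_i = 1 | covariates)

# Stage 2: outcome model with the estimated propensity as an extra covariate
y      <- 5 - 0.7 * drug + 0.5 * baseline + rnorm(n)
stage2 <- lm(y ~ drug + pi_hat)
coef(stage2)
```

The coefficient on `pi_hat` plays the role of $\gamma$ in the Stage 2 equation; in a real longitudinal analysis Stage 2 would be a mixed model rather than `lm`.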

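The key addition of Diggle-Kenward:

Dropout depends on $y^M$ → MNAR.
An assumption about the distribution of $y^M$ is required.
→ The distributional assumption plays a decisive role.

3.2 Criticism of Selection Models

Distributional assumptions cannot be verified

Quoted from the text: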

“Selection models have often been criticized because results can depend greatly on distributional assumptions of the missing data that are impossible to verify [Little, 1995; Little and Rubin, 2002]. To address this, Kenward [1998] describes how the distributional assumptions can be varied, allowing one to assess, to some degree, the sensitivity of the results to the distributional assumptions.”

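Core of the criticism:

A distribution must be assumed for $y^M$ (normal, t, etc.).
$y^M$ is not in the data → the assumption cannot be verified.
Different distributional assumptions → different results.

Kenward's (1998) sensitivity recommendation:

Fit the model under several distributional assumptions.
Assess how much the results change.
→ Distributional sensitivity.

Intuition: the effect of the distributional assumption

Examples:

$y^M \sim N(\mu, \sigma^2)$ (normal).
$y^M \sim t_k$ (t distribution).
$y^M \sim$ skewed normal.

Different assumptions → different results:

Normal: lighter tails → smaller correction for extreme dropouts.
t: heavier tails → larger correction for extreme dropouts.
→ The estimates can differ.

Practical recommendations:

Default: normal.
Sensitivity: try several distributions.
Large changes in the results → interpret with caution.

Advantage of the mixed-effects framework of § 14.5.1.1:

The distributional assumption is confined to the random effects (usually normal).
No separate distribution has to be specified for $y^M$ beyond the model already assumed for $y$.
→ The impact of the assumption is relatively small.

4 § 14.5.1.1: Mixed-Effects / Shared Parameter Selection Models

4.1 The Many Names

Classification of mixed-effects selection models

Quoted from the text: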

“These models have also been called random-coefficient selection models [Little, 1995], random-effects-dependent models [Hogan and Laird, 1997b], and shared parameter models [De Gruttola and Tu, 1994; Wu and Carroll, 1988; Wu and Bailey, 1989; Schluchter, 1992; Ten Have et al., 1998] in the literature.”

The various names:

Name	Source
Random-coefficient selection model	Little (1995)
Random-effects-dependent model	Hogan & Laird (1997b)
Shared parameter model	De Gruttola & Tu (1994), Wu & Carroll (1988), Wu & Bailey (1989), Schluchter (1992), Ten Have et al. (1998)

→ “Shared parameter model” is the most commonly used name.

Difference from Heckman's original model:

Quoted from the text:

“They are is a bit different than Heckman’s original selection model, because the dropout propensity, or a function of this propensity, is not included as a covariate in the longitudinal model. However, they share the property of having two models, one for the longitudinal process and one for dropout, that are linked together.”

Heckman: the propensity score is a covariate in the longitudinal model.
Mixed-effects: the random effects are shared by the two models.

Intuition: the core of “shared parameter”

What sharing the random effect means:

$v_i$: unobserved heterogeneity of patient $i$.
$v_i$ affects both the longitudinal $y$ and the dropout $D$.
→ A “shared” parameter.

Example:

$v_{0i}$ = baseline severity of patient $i$ (unobserved).
$y_i$ ↑ if $v_{0i}$ ↑ (severe patients have higher outcomes).
$D_i$ ↑ if $v_{0i}$ ↑ (severe patients are more likely to drop out).
→ $v_{0i}$ mediates the association between the two.

Why it is nonignorable:

$v_i$ drives both $y_i^O$ and $y_i^M$.
$v_i$ also drives $D_i$.
→ Missingness ($D$) depends on $y^M$ as well.
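
A small simulation can make this dependence visible. The sketch below assumes a single shared random intercept, a clog-log dropout probability with an illustrative $\alpha^* = 1$, and a hypothetical later response `y_future` that would be missing for dropouts; none of these numbers come from the text.

```r
set.seed(2)
n <- 20000
v <- rnorm(n)                                # shared random effect v_i
y_future <- 5 + v + rnorm(n, sd = 0.5)       # later response, unobserved for dropouts
haz <- 1 - exp(-exp(-1 + 1.0 * v))           # clog-log dropout probability (alpha* = 1)
D   <- rbinom(n, 1, haz)                     # 1 = dropped out before y_future

# Difference in the mean of the (never observed) response: dropouts vs completers
gap <- mean(y_future[D == 1]) - mean(y_future[D == 0])
gap
```

A positive `gap` shows that, although dropout was generated from $v_i$ only, it is associated with the response values that were never observed, which is exactly what makes the model nonignorable.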

Use of standard software:

Quoted from the text:

“Appealing aspects of this class of models is that they can be used for nonignorable missingness and can be fit using some standard software.”

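These models can be fit with SAS PROC NLMIXED.
WinBUGS / Stan can be used as well.
→ Better practical accessibility.

4.2 Eq. (14.10): Longitudinal Component

Standard MRM form

Quoted from the text (Eq. 14.10, in the notation of Ten Have et al. 1998):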

\[y_i = X_i \beta + Z_i v_i + \varepsilon_i\]

$f_y(y_i \mid v)$: the longitudinal model conditional on the random effect $v$.

Same structure as the standard MRM (the Ch.4 equation).

4.3 Eq. (14.11): Dropout Component

Random-Effects-Augmented Survival Model

Quoted from the text (Eq. 14.11):

\[\log(-\log(1 - P(D_i = j \mid D_i \geq j))) = W_i \alpha + v_i \alpha^*\]

Components:

$W_i$: covariates predicting dropout (may partly overlap with $X_i$).
$v_i$: the random effects of the longitudinal model (shared).
$\alpha^*$: the effect of the random effects on dropout.

Testing for MNAR:

$\alpha^* = 0$ → ignorable (the random effects do not affect dropout).
$\alpha^* \neq 0$ → nonignorable (the random effects drive dropout).

Intuition: the meaning of $\alpha^*$

Why the random effects drive dropout:

$v_i$ is the unobserved trajectory characteristic of patient $i$.
For example, $v_{0i}$ is the baseline level and $v_{1i}$ the rate of improvement.
These characteristics drive dropout → MNAR.

The intuition of Wu-Carroll (1988):

Quoted from the text:

“similar to Wu and Carroll [1988], we will include them as covariates. The notion is that dropout may be related to a individual’s underlying starting point and time-trend of the longitudinal outcome.”

Random intercept ($v_{0i}$) = baseline level → affects dropout.
Random slope ($v_{1i}$) = trajectory → affects dropout.
→ The patient's individual-specific characteristics mediate dropout.

Dependence on $y^O$ and $y^M$:

Quoted from the text:

“We additionally posit that dropout depends on the random subject effects, $v_i$, which characterize both the unobserved and observed components of the dependent variable vector $y_i$. As such, to the extent that the regression coefficients $\alpha^*$ are nonzero, this is a nonignorable model because missingness, here characterized simply as dropout, is dependent on both $y_i^O$ and $y_i^M$.”

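$v_i$ underlies both $y_i^O$ and $y_i^M$.
$D_i$ depends on $v_i$ → $D_i$ depends indirectly on $y_i^M$ as well.
→ MNAR.

Why the clog-log link:

It is the same grouped-time proportional hazards form (Prentice-Gloeckler 1978) as the dropout model used in § 14.4.
Coefficients have a natural hazard-ratio interpretation.
It fits the discrete-time setting naturally.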
4.4 Eqs. (14.12-14.14): Marginal Likelihood

Joint Likelihood + Conditional Independence

Quoted from the text (Eq. 14.12):

\[f(y_i, D_i) = \int_v f_y(y_i \mid v) f_D(D_i \mid v) f(v) dv\]

Key assumption: conditional on the random effects, $y_i$ and $D_i$ are independent.

→ “$v_i$ absorbs all of the association between the two outcomes”.

Eq. 14.13, the sample log-likelihood:

\[\log L = \sum_{i=1}^N \log f(y_i, D_i)\]

Cholesky reparameterization: $v = S\theta$, $\Sigma_v = SS'$.

Eq. 14.14:

\[f(y_i, D_i) = \int_\theta f_y(y_i \mid \theta) f_D(D_i \mid \theta) f(\theta) d\theta\]

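→ Integrate over standard normal $\theta$ → Gauss-Hermite quadrature.

Intuition: statistical meaning of the joint likelihood

The conditional independence assumption:

$y_i$ and $D_i$ are marginally associated (both are outcomes).
Conditional on $v_i$, however, they are independent.
→ $v_i$ carries all of the dependence.

Why the assumption is plausible:

$v_i$ is assumed to absorb all unobserved heterogeneity of patient $i$.
Variation in $y_i$: the part explained by $X\beta + Zv$ plus $\varepsilon$ (random noise).
Variation in $D_i$: the part explained by $W\alpha + v\alpha^*$ plus interval-specific hazard noise.
→ The assumption holds if $\varepsilon$ and the hazard noise are independent.

Difficulty of the integral:

The dimension of $v_i$ equals the number of random effects.
Usually 2-3 dimensions (intercept + slope).
→ Feasible with Gauss-Hermite quadrature.

Value of the Cholesky reparameterization (same as § 13.2 of Ch.13):

Standardize via $v = S\theta$.
$\theta_i$ is multivariate standard normal.
→ Standard quadrature points can be used.

Computational cost:

$Q^r$ grid points for dimension $r$ with $Q$ quadrature points per dimension.
$r = 2$, $Q = 20$ → $20^2 = 400$ points per subject.
5000 subjects → $2 \times 10^6$ evaluations per iteration.
→ Minutes to hours in SAS PROC NLMIXED.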
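
To make Eq. 14.14 concrete, here is a rough sketch of the marginal likelihood of one subject, evaluated on a tensor-product Gauss-Hermite grid. The helper names `gh_normal` and `joint_lik`, the parameter values, and the omission of the Drug terms are all simplifying assumptions for illustration; the dropout part uses the cumulative clog-log form with hypothetical thresholds `gam`. This is not the NLMIXED implementation.

```r
# Gauss-Hermite nodes and weights for a standard normal density (Golub-Welsch)
gh_normal <- function(Q) {
  J <- matrix(0, Q, Q)
  k <- seq_len(Q - 1)
  J[cbind(k, k + 1)] <- sqrt(k)
  J[cbind(k + 1, k)] <- sqrt(k)
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = e$vectors[1, ]^2)
}

# f(y_i, D_i): integrate f_y * f_D over theta ~ N(0, I), with v = S theta
joint_lik <- function(y, sweek, d, S, beta, sigma_e, gam, alpha, Q = 10) {
  gh <- gh_normal(Q)
  total <- 0
  for (a in seq_len(Q)) {
    for (b in seq_len(Q)) {
      theta <- c(gh$nodes[a], gh$nodes[b])
      v   <- as.vector(S %*% theta)
      mu  <- beta[1] + beta[2] * sweek + v[1] + v[2] * sweek
      f_y <- prod(dnorm(y, mean = mu, sd = sigma_e))      # longitudinal part
      eta <- sum(alpha * theta)
      cum <- c(0, 1 - exp(-exp(gam + eta)), 1)            # P(D <= 0), ..., P(D <= J)
      f_D <- cum[d + 1] - cum[d]                          # P(D = d)
      total <- total + gh$weights[a] * gh$weights[b] * f_y * f_D
    }
  }
  total
}

Sigma_v <- matrix(c(0.36, -0.09, -0.09, 0.25), 2, 2)      # illustrative covariance
S   <- t(chol(Sigma_v))                                   # lower-triangular factor
lik <- joint_lik(y = c(5.5, 5.0), sweek = c(0, 1), d = 2, S = S,
                 beta = c(5.3, -0.3), sigma_e = 0.8,
                 gam = c(-2, -1, -0.3), alpha = c(0.4, 0.9))
lik
```

Summing `log(lik)` over subjects gives the log-likelihood of Eq. 14.13; the double loop is where the $Q^r$ cost comes from.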

5 § 14.5.1.2: NIMH Schizophrenia Example

5.1 Eq. (14.15): Longitudinal Model

SWeek transformation + random intercept + slope

Quoted from the text (Eq. 14.15):

\[IMPS79_{ij} = \beta_0 + \beta_1 Drug_i + \beta_2 SWeek_j + \beta_3 (Drug_i \times SWeek_j) + v_{0i} + v_{1i} SWeek_j + \varepsilon_{ij}\]

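SWeek = $\sqrt{week}$:

The relationship between IMPS79 and time is nonlinear → linearized by the square root.
The standard transformation in the Ch.9 NIMH analysis.

Random effects:

$v_{0i}$: deviation in baseline.
$v_{1i}$: deviation in rate of improvement (per unit of SWeek).
Both normal, with covariance matrix $\Sigma_v$.

Intuition: why the SWeek transformation

Original IMPS79 vs week:

Week 0 → 1: large improvement.
Week 1 → 6: gradual improvement.
→ Nonlinear (decelerating improvement).

After the SWeek transformation:

Week 0 → SWeek 0.
Week 1 → SWeek 1.
Week 4 → SWeek 2.
Week 6 → SWeek 2.45.
→ The fast early change and the slow later change fall on a straight line.

Difference from the analysis of § 14.4.1:

§ 14.4.1: time-point indicators (Week 1, 2, 3, 4 dummies).
§ 14.5.1.2: SWeek as a continuous variable.
→ A simpler model with an explicit trend.

5.2 Eqs. (14.16-14.17): Cholesky Reparameterization

2D Cholesky

Quoted from the text (Eq. 14.16):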

\[S = \begin{pmatrix} s_0 & 0 \\ s_{01} & s_1 \end{pmatrix} = \begin{pmatrix} \sigma_{v_0} & 0 \\ \sigma_{v_{01}}/\sigma_{v_0} & \sqrt{\sigma_{v_1}^2 - \sigma_{v_{01}}^2/\sigma_{v_0}^2} \end{pmatrix}\]

$\Sigma_v = SS'$:

\[SS' = \begin{pmatrix} \sigma_{v_0}^2 & \sigma_{v_{01}} \\ \sigma_{v_{01}} & \sigma_{v_1}^2 \end{pmatrix} = \Sigma_v\] (check).

Eq. 14.17, the reparameterized longitudinal model:

\[IMPS79_{ij} = \beta_0 + \beta_1 Drug_i + \beta_2 SWeek_j + \beta_3 (Drug_i \times SWeek_j)\] \[+ \left(\sigma_{v_0} + \frac{\sigma_{v_{01}}}{\sigma_{v_0}} SWeek_j\right) \theta_{0i} + \left(\sqrt{\sigma_{v_1}^2 - \sigma_{v_{01}}^2/\sigma_{v_0}^2} \cdot SWeek_j\right) \theta_{1i}\]

→ $\theta_{0i}, \theta_{1i}$ are independent standard normal.
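
The factorization of Eq. 14.16 can be verified numerically. The sketch below uses illustrative variance components (the same values as the simulation in Section 7, not estimates from the book).

```r
sigma_v0 <- 0.6; sigma_v1 <- 0.5; rho <- -0.3      # illustrative values
sigma_v01 <- rho * sigma_v0 * sigma_v1
Sigma_v <- matrix(c(sigma_v0^2, sigma_v01, sigma_v01, sigma_v1^2), 2, 2)

# Lower-triangular factor written out as in Eq. 14.16 (filled column by column)
S <- matrix(c(sigma_v0, sigma_v01 / sigma_v0,
              0,        sqrt(sigma_v1^2 - sigma_v01^2 / sigma_v0^2)), 2, 2)

ok_product <- isTRUE(all.equal(S %*% t(S), Sigma_v))    # S S' reproduces Sigma_v
ok_chol    <- isTRUE(all.equal(S, t(chol(Sigma_v))))    # matches R's Cholesky factor

# v = S theta with independent standard normal theta has covariance Sigma_v
set.seed(3)
theta <- matrix(rnorm(2 * 1e5), nrow = 2)
v_sim <- S %*% theta
round(cov(t(v_sim)), 3)
```

`chol()` in R returns the upper-triangular factor, so its transpose is compared with $S$.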

Intuition: what the Cholesky transformation achieves

When $v$ is correlated ($\sigma_{v_{01}} \neq 0$):

Quadrature is awkward (correlated multivariate normal).
→ Standardization is needed.

When $\theta$ is independent standard normal:

Quadrature can be done separately in each dimension.
Quadrature over $\theta_0$ × quadrature over $\theta_1$.
→ Tensor-product quadrature.

Example: 5000 subjects, $Q = 10$:

Dimension $r = 2$ → $10^2 = 100$ points.
5000 subjects × 100 points = $5 \times 10^5$ evaluations per iteration.
A moderate run time (minutes).

Same idea as the 3-level Cholesky of § 13.2.4:

3-level: standardizes both the cluster and the subject random effects.
This model: subject level only (intercept + slope).

5.3 Eq. (14.18): Dropout Component

Clog-Log Dropout with Shared Random Effects

Quoted from the text (Eq. 14.18):

\[\log(-\log(1 - P(D_i = j \mid D_i \geq j))) = \alpha_{0j} + \alpha_1 Drug_i + \alpha_2 \theta_{0i} + \alpha_3 \theta_{1i} + \alpha_4 (Drug_i \times \theta_{0i}) + \alpha_5 (Drug_i \times \theta_{1i})\]

Parameters:

$\alpha_{0j}$: interval-specific intercepts (the baseline hazard).
$\alpha_1$: Drug effect (when the random effects are at their mean of zero).
$\alpha_2, \alpha_3$: effects of the random intercept and slope.
$\alpha_4, \alpha_5$: Drug × random effects interactions.

Testing for MNAR:

$H_0$: $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = 0$.
Rejection → nonignorable.

Intuition: the meaning of the random effect interactions

Why Drug × random effects matters:

NIMH result in § 14.4.1: the Drug × MeanY interaction was decisive.
→ The dropout mechanism differs by drug group.
→ The random effects may also act differently by group.

$\alpha_2$ (random intercept):

Positive: high-baseline (severe) patients drop out more.
Negative: low-baseline (mild) patients drop out more.

$\alpha_3$ (random slope):

Positive: patients who do not improve (more positive slope) drop out more.
Negative: patients who improve quickly (more negative slope) drop out more.

$\alpha_5$ (Drug × random slope):

The difference in the slope effect between groups.
The slope effect on placebo vs the slope effect on drug.

Clinical hypothesis:

Placebo: patients who do not improve → drop out to seek “other treatment”.
Drug: patients who improve → “no further treatment needed” → drop out.
→ Opposite directions in the two groups.

5.4 Eqs. (14.19-14.20): Ordinal Equivalence

Engel (1993), Läärä-Matthews (1985) Equivalence

Quoted from the text:

“Because the above model for dropout does not include any time-varying covariates, besides the intercept terms representing the baseline hazard, we can take advantage of the equivalence of certain models under the clog-log link [Engel, 1993; Läärä and Matthews, 1985]. Namely, the above dichotomous regression model utilizing person-period indicators of dropout is equivalent to the following ordinal regression model:”

Eq. 14.19, the ordinal cumulative model:

\[\log(-\log(1 - P(D_i \leq j))) = \alpha_{0j} + \alpha_1 Drug_i + \alpha_2 \theta_{0i} + \alpha_3 \theta_{1i} + \alpha_4 (Drug_i \times \theta_{0i}) + \alpha_5 (Drug_i \times \theta_{1i})\]

Eq. 14.20, the cumulative probability:

\[P(D_i \leq j) = 1 - \exp(-\exp(\alpha_{0j} + \alpha_1 Drug_i + \cdots))\]

Meaning of the equivalence:

With time-invariant covariates only → equivalent to the ordinal cumulative form.
$\alpha_1$ through $\alpha_5$ are identical in the two forms.
Only the intercepts $\alpha_{0j}$ take a different form: interval-specific baseline hazard in Eq. 14.18, cumulative in Eq. 14.19.

Intuition: the value of the equivalence

Advantage of the ordinal representation:

Quoted from the text:

“This is simpler, from a data analytic perspective, because we do not have to create a person-period dataset. Instead, we have one outcome per person ($D_i$) and several person-level covariates ($Drug_i, \theta_{0i}, \theta_{1i}$).”

No person-period dataset is needed.
$D_i \in \{1, 2, 3, 4, 5, 6\}$ (last week observed).
→ A single outcome per patient $i$.
→ Simpler data preparation.

Discrete-time hazard vs cumulative ordinal:

Eq. 14.18: $P(D_i = j \mid D_i \geq j)$, the hazard at $j$.
Eq. 14.19: $P(D_i \leq j)$, cumulative.
Both have the same covariate parameters (under clog-log with time-invariant covariates).

Same as the cumulative ordinal proportional hazards model of § 10.2.3:

The ordinal cumulative model of Hedeker, Mermelstein and colleagues.
Equivalent to discrete-time PH.
→ The framework of § 10.2.3 applies.

Limitation:

With time-varying covariates the equivalence breaks down.
In that case the person-period form of Eq. 14.18 is required.
The schizophrenia example has no time-varying covariates → the ordinal form is fine.
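
The equivalence follows from $P(D_i > j) = \prod_{k \le j} (1 - h_{ik}) = \exp\{-e^{\eta_i} \sum_{k \le j} e^{\alpha_{0k}}\}$ when the covariate part $\eta_i$ does not depend on $k$. A quick numerical check, with hypothetical intercepts `alpha0` and a hypothetical time-invariant linear predictor `eta`:

```r
alpha0 <- c(-2.0, -1.5, -1.8)   # hypothetical interval-specific intercepts (hazard form)
eta    <- 0.4                   # hypothetical time-invariant part: alpha_1 * Drug + ...

# Hazard form (Eq. 14.18): h_j = P(D = j | D >= j)
h <- 1 - exp(-exp(alpha0 + eta))
cum_from_hazard <- 1 - cumprod(1 - h)       # P(D <= j) built up from the hazards

# Cumulative form (Eq. 14.19): same eta, intercepts log(sum_{k <= j} exp(alpha0_k))
gamma0 <- log(cumsum(exp(alpha0)))
cum_ordinal <- 1 - exp(-exp(gamma0 + eta))

same <- isTRUE(all.equal(cum_from_hazard, cum_ordinal))
same
```

Only the intercepts are transformed between the two forms; the covariate coefficients carry over unchanged, which is why one row per person suffices.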

5.5 SAS PROC NLMIXED Implementation

Structure of the NLMIXED program (Tables 14.9-14.10)

Quoted from the text:

“This model can be estimated using SAS PROC NLMIXED, which is a general program for estimation of many kinds of mixed-effects model. For this, the first step is to create a dataset in which a single vector contains, for each subject, the dependent variable vector $y_i$ and the time to dropout variable $D_i$ as one vector, say $y_i^*$.”

Key structure:

Stacked data: $y_i^*$ = $y_i$ (longitudinal) + $D_i$ (dropout) in one vector.
Indicator $ind$: 0 for a $y$ component, 1 for the $D$ component.
Branching inside NLMIXED:
- $ind = 0$: longitudinal likelihood.
- $ind = 1$: dropout likelihood (clog-log).
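
The stacked layout can be sketched in R as follows. The two toy data frames and the column names (`ystar`, `ind`, `d`) are hypothetical and only illustrate the structure; the book builds this dataset in SAS.

```r
# Hypothetical long-format outcomes for two subjects
long <- data.frame(id     = c(1, 1, 1, 2, 2),
                   week   = c(0, 1, 3, 0, 1),
                   imps79 = c(5.5, 5.0, 4.2, 6.0, 5.8))
# Hypothetical person-level dropout outcome and covariate
drop <- data.frame(id = c(1, 2), d = c(3, 2), drug = c(1, 0))

# Longitudinal rows (ind = 0) and one dropout row per subject (ind = 1)
y_part <- data.frame(id = long$id, ystar = long$imps79,
                     sweek = sqrt(long$week), ind = 0)
d_part <- data.frame(id = drop$id, ystar = drop$d,
                     sweek = NA_real_, ind = 1)

stacked <- rbind(y_part, d_part)
stacked <- merge(stacked, drop[, c("id", "drug")], by = "id")
stacked <- stacked[order(stacked$id, stacked$ind), ]
stacked
```

Each subject contributes its longitudinal rows followed by a single dropout row, and the likelihood code switches on `ind`.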

Intuition: the generality of NLMIXED

Why NLMIXED is needed:

The standard MIXED procedure: linear mixed models only.
GLIMMIX: GLMMs only (single outcome).
NLMIXED: arbitrary nonlinear models with an arbitrary likelihood.

Expressing the joint likelihood:

A normal density for $y$ and an ordinal clog-log probability for $D$.
The product of the two → log-likelihood.
Specified through the GENERAL(ll) form of the NLMIXED MODEL statement.

Sharing the random effects:

RANDOM u1 u2 ~ NORMAL([0,0], [1,0,1]) SUBJECT=id;
Both outcomes use the same $u_1, u_2$ (the standardized $\theta_{0i}, \theta_{1i}$).
→ The core of the shared parameter model.

Software options:

SAS PROC NLMIXED.
R: nlme::nlme (limited), brms (Stan-based, more general).
Python: PyMC, Stan.

Computation time (NIMH example):

437 subjects × ~5 occasions + dropout outcome.
Roughly a few minutes (modern hardware).
Larger data: possibly hours.

5.6 Table 14.11: Separate vs Shared Results

Comparison of results

As reported in the text (Table 14.11):

Parameter	Separate Estimate	Separate SE	Shared Estimate	Shared SE
Outcome
$\beta_0$ Intercept	5.348	.088	5.320	.088
$\beta_1$ Drug	.046 (n.s.)	.101	.088 (n.s.)	.102
$\beta_2$ SWeek	-.336***	.068	-.272***	.073
$\beta_3$ Drug × SWeek	-.641***	.078	-.737***	.083
Dropout
$\alpha_1$ Drug	-.693***	.205	-.703*	.301
$\alpha_2$ Random intercept	—	—	.447 (n.s.)	.333
$\alpha_3$ Random slope	—	—	.891 (marginal)	.467
$\alpha_4$ Drug × intercept	—	—	-.592 (n.s.)	.398
$\alpha_5$ Drug × slope	—	—	-1.638**	.536
Deviance	5380.2		5350.1

LR test: $\chi^2_4 = 5380.2 - 5350.1 = 30.1$, $p < .0001$ → the shared model fits better.
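
The likelihood-ratio statistic and its p-value follow directly from the two deviances in Table 14.11; a short check:

```r
dev_separate <- 5380.2
dev_shared   <- 5350.1

lr   <- dev_separate - dev_shared            # difference in deviance
df   <- 4                                    # alpha_2, alpha_3, alpha_4, alpha_5
pval <- pchisq(lr, df = df, lower.tail = FALSE)
c(LR = lr, df = df, p = pval)
```

The four degrees of freedom are the four dropout coefficients that the separate model fixes at zero.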

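Intuition: a closer comparison of separate and shared

Separate model:

$\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = 0$ fixed.
The two models (longitudinal + dropout) are fitted separately.
MAR assumption (the longitudinal model alone is valid).

Interpreting the separate results:

Placebo slope: $\beta_2 = -.336$.
Drug slope: $\beta_2 + \beta_3 = -.336 + (-.641) = -.977$.
The drug group improves about 3 times as fast as placebo ($.977 / .336 \approx 2.9$).
$\alpha_1 = -.693$, $\exp(-.693) = 0.5$ → the dropout hazard on drug is half that on placebo.

Interpreting the shared results:

Placebo slope: $-.272$ (slightly smaller in magnitude).
Drug slope: $-.272 + (-.737) = -1.009$ (slightly larger in magnitude).
→ Nearly the same conclusion, with a slightly stronger drug effect.

Meaning of the LR test:

Quoted from the text: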

“this is not necessarily a rejection of MAR, but it is a rejection of this particular MAR model in favor of this particular MNAR shared parameter model.”

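$\chi^2_4 = 30.1$, $p < .0001$.
The shared (MNAR) model fits significantly better.
→ This points to limits of the MAR assumption.
It is not, however, a “universal rejection of MAR” (it is model specific).

The key role of $\alpha_5 = -1.638$ (Drug × slope):

Quoted from the text: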

“The marginally significant slope effect indicates that for the placebo group there is a tendency to dropout of the study as the slope increases. Namely, among placebo subjects, those who are not improving, or improving at a slower rate, are more likely to drop out. The significant negative Drug x slope interaction indicates that the slope effect is opposite for the drug group, where drug patients with more negative slopes (i.e., greater improvement) are more likely to drop out.”

Placebo ($\alpha_3 = .891$, marginal): more positive slope (= not improving) → more dropout.
Drug ($\alpha_3 + \alpha_5 = .891 + (-1.638) = -.747$): more negative slope (= improving quickly) → more dropout.
→ The dropout mechanism runs in opposite directions in the two groups.
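
The group-specific slope effects and the Wald tests behind the significance labels can be reproduced from the Table 14.11 estimates. This is only arithmetic on the reported values; the standard error of the sum $\alpha_3 + \alpha_5$ is not computed because it needs the covariance between the two estimates, which the table does not report.

```r
est <- c(alpha_3 = 0.891, alpha_5 = -1.638)   # shared-model estimates (Table 14.11)
se  <- c(alpha_3 = 0.467, alpha_5 = 0.536)

z <- est / se                                 # Wald z statistics
p <- 2 * pnorm(-abs(z))                       # two-sided p-values

slope_effect_placebo <- est[["alpha_3"]]                     # effect of slope, placebo
slope_effect_drug    <- est[["alpha_3"]] + est[["alpha_5"]]  # effect of slope, drug

round(rbind(z = z, p = p), 3)
c(placebo = slope_effect_placebo, drug = slope_effect_drug)
```

The sign change between the two groups is what the quoted passage describes in words.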

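Consistent with the results of § 14.4.1:

§ 14.4.1: a Drug × MeanY interaction was found.
§ 14.5.1.2: a Drug × random slope interaction was found.
→ The same conclusion under two different frameworks.
→ A robust finding.

Clinical implications:

Placebo: no effect → patients who do not improve leave to seek other treatment.
Drug: effective → patients who improve leave because they no longer feel they need treatment.
Opposite directions by group → in line with the MAR(b) simulation scenario (§ 14.3.2).

Sensitivity conclusion:

MAR (separate): drug effect confirmed ($\beta_3 = -.641$).
MNAR (shared): drug effect slightly stronger ($\beta_3 = -.737$).
→ Both models show a drug effect.
→ The conclusion is robust.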

6 Application Areas

Area	Use of the selection model	Notes
Clinical trials (RCT)	Drug × random effects	Different dropout mechanisms by group
Long-term oncology trials	Linked to a survival outcome	Death = informative dropout
Longitudinal psychiatry	Random symptom trajectories	Dropout driven by improvement/worsening
Chronic disease follow-up	Disease progression × dropout	Dropout on worsening (frailty)
Clinical trials with biomarkers	Biomarker × dropout	Biomarker-driven dropout
Substance cessation (e.g., smoking)	Alternative to “Missing = smoking”	A strong MNAR assumption

7 Code Examples

7.1 Step 1: Simulating NIMH-like Data (Simplified)

library(MASS)
library(dplyr)

set.seed(2026)
n_placebo <- 108
n_drug <- 329
n_subjects <- n_placebo + n_drug

# Drug indicator (0 = placebo, 1 = drug)
drug <- c(rep(0, n_placebo), rep(1, n_drug))

# Correlated random intercept and slope, drawn from N(0, Sigma_v)
sigma_v0 <- 0.6
sigma_v1 <- 0.5
rho <- -0.3
Sigma_v <- matrix(c(sigma_v0^2, rho * sigma_v0 * sigma_v1,
                    rho * sigma_v0 * sigma_v1, sigma_v1^2), 2, 2)
v <- mvrnorm(n_subjects, mu = c(0, 0), Sigma = Sigma_v)

# Longitudinal data (one row per subject per occasion)
weeks <- c(0, 1, 3, 6)
df_long <- expand.grid(subject = 1:n_subjects, week = weeks) %>%
  arrange(subject, week)
df_long$drug <- drug[df_long$subject]
df_long$sweek <- sqrt(df_long$week)
df_long$v0 <- v[df_long$subject, 1]
df_long$v1 <- v[df_long$subject, 2]

# Generating model of Eq. 14.15
df_long$imps79 <- 5.32 + 0.09 * df_long$drug - 0.27 * df_long$sweek -
                  0.74 * df_long$drug * df_long$sweek +
                  df_long$v0 + df_long$v1 * df_long$sweek +
                  rnorm(nrow(df_long), 0, 0.8)

# Dropout mechanism (Eq. 14.18, MNAR): depends on the random slope v1.
# The random intercept v0_i is accepted but not used in this simplified version.
generate_dropout_time <- function(drug_i, v0_i, v1_i) {
  alpha_1 <- -0.7  # Drug effect
  alpha_5 <- -1.6  # Drug x slope interaction
  # 4 intervals at risk, with a constant baseline hazard (intercept -2)
  for (j in 1:4) {
    linpred <- -2 + alpha_1 * drug_i + 0.9 * v1_i + alpha_5 * drug_i * v1_i
    haz <- 1 - exp(-exp(linpred))
    if (rbinom(1, 1, haz) == 1) return(j)
  }
  return(5)  # completer
}

dropout_times <- sapply(1:n_subjects, function(i) {
  generate_dropout_time(drug[i], v[i, 1], v[i, 2])
})

# Apply monotone dropout: d_time indexes the last observed occasion.
# d_time = 4 and d_time = 5 both leave all four occasions observed;
# pmin() avoids indexing past the end of `weeks`, which would return NA
# and silently drop every completer in the later filter(observed == 1).
df_long_with_dropout <- df_long %>%
  mutate(d_time = dropout_times[subject],
         observed = ifelse(week <= weeks[pmin(d_time, length(weeks))], 1, 0))

cat("Dropout distribution:\n")
print(table(dropout_times, drug))

Checking the simulation

True generating parameters:

$\beta = (5.32, 0.09, -0.27, -0.74)$.
In order: (Intercept, Drug, SWeek, Drug × SWeek).

Dropout mechanism:

$\alpha_5 = -1.6$ → in the drug group, fast improvers drop out.
The slope effect on placebo is positive (0.9: not improving → dropout).

→ The simulation directly implements the MNAR scenario of Eq. 14.18.

7.2 Step 2: Fitting the Separate Model (MAR)

library(lme4)

# Longitudinal MRM (separate analysis, MAR assumption)
df_observed <- df_long_with_dropout %>% filter(observed == 1)

fit_separate_long <- lmer(imps79 ~ drug + sweek + drug:sweek + (sweek | subject),
                          data = df_observed)
summary(fit_separate_long)

# Dropout model (separate): one row per subject
df_dropout_only <- df_long_with_dropout %>%
  group_by(subject) %>%
  summarise(d_time = first(d_time), drug = first(drug))

# Discrete-time hazard for dropout
library(survival)
# ... (separate logistic regression)

cat("\nSeparate results:\n")
cat("True beta = (5.32, 0.09, -0.27, -0.74)\n")
cat("Estimated beta:", round(fixef(fit_separate_long), 3), "\n")

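Assumptions and limits of the separate analysis

Separate = the two models are fitted separately:

Longitudinal: standard MRM (MAR assumption).
Dropout: discrete-time survival.
The two are estimated independently.

What the MAR assumption means here:

$\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = 0$.
The random effects do not affect dropout.
→ Missingness is ignorable.

When this is risky:

Actual dropout depends on the random effects (as in NIMH).
→ The MAR assumption is violated.
→ Estimates can be biased.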
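7.3 Step 3: Fitting the Shared Parameter Model (MNAR)

# Shared parameter model with brms (Bayesian)
library(brms)

# Data prep: observed rows only; the dropout outcome enters once per subject
df_brms <- df_long_with_dropout %>%
  filter(observed == 1) %>%
  group_by(subject) %>%
  mutate(first_row = row_number() == 1) %>%
  ungroup() %>%
  mutate(censored = ifelse(d_time == 5, 1, 0))  # completer = right-censored

# Two submodels (longitudinal + dropout) whose subject-level random effects
# are linked through the shared ID "p"; simplest form
formula_long <- bf(imps79 ~ drug + sweek + drug:sweek + (sweek | p | subject),
                   family = gaussian())

# subset(first_row) keeps a single dropout record per subject
formula_drop <- bf(d_time | cens(censored) + subset(first_row) ~ drug + (1 | p | subject),
                   family = cox())

# Fit (slow)
# fit_shared <- brm(formula_long + formula_drop + set_rescor(FALSE),
#                   data = df_brms, chains = 2, iter = 2000)

# Alternatively, the JM (joint model) package in R
# library(JM)
# ...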

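Statistical value of the shared parameter model

The (... | p | subject) notation in brms:

p: an identifier that links random effects across formulas.
Random effects carrying the same ID are modeled as jointly normal and correlated across the two submodels.
→ A correlated random effects version of the shared parameter idea; it does not put $\theta_{0i}, \theta_{1i}$ directly into the dropout predictor as Eq. 14.18 does.

The standard SAS PROC NLMIXED implementation:

More efficient (especially on large data).
ML based (not Bayesian).
The standard in this chapter.

The JM (Joint Models) package in R:

Continuous longitudinal outcome + survival.
Cox PH or parametric survival.
Uses standard quadrature.

Model comparison:

LR test of separate vs shared.
$H_0$: $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = 0$.
Rejection → evidence against this particular MAR model (model specific, as noted in § 5.6).

7.4 Step 4: Sensitivity Analysis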

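# Simulation-based sensitivity: for each assumed alpha_5, regenerate dropout
# under that value and refit the MAR (separate) MRM, to see how far the
# Drug x SWeek estimate drifts from the generating value.
# Simplified stand-in: a full analysis would fit the shared parameter model
# itself (NLMIXED or brms) under each assumption.
sensitivity_analysis <- function(df, drug, v, weeks,
                                 alpha_5_values = c(0, -0.5, -1, -1.5)) {
  results <- data.frame(alpha_5 = alpha_5_values, beta_3 = NA_real_)
  n <- length(drug)

  for (i in seq_along(alpha_5_values)) {
    a5 <- alpha_5_values[i]
    # Index of the last observed occasion under this a5 (length(weeks) = completer)
    last_obs <- sapply(seq_len(n), function(s) {
      for (j in seq_len(length(weeks) - 1)) {
        linpred <- -2 - 0.7 * drug[s] + 0.9 * v[s, 2] + a5 * drug[s] * v[s, 2]
        if (rbinom(1, 1, 1 - exp(-exp(linpred))) == 1) return(j)
      }
      length(weeks)
    })
    df_obs <- df %>% filter(week <= weeks[last_obs[subject]])
    fit <- lmer(imps79 ~ drug + sweek + drug:sweek + (sweek | subject),
                data = df_obs)
    results$beta_3[i] <- fixef(fit)[["drug:sweek"]]
  }

  return(results)
}

cat("\n=== Sensitivity Analysis ===\n")
sens_results <- sensitivity_analysis(df_long, drug, v, weeks)
print(sens_results)

# Compare with the true value
cat("\nTrue beta_3 = -0.74\n")

Recommendations for sensitivity analysis

The alpha_5 scenarios:

$\alpha_5 = 0$: no Drug × slope interaction in dropout (in this simulation the slope main effect 0.9 remains, so dropout still depends on the random slope).
$\alpha_5 < 0$: progressively stronger group-specific MNAR.
$\alpha_5 = -1.6$: close to the estimate from the NIMH data.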

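Interpretation:

Small variation in $\beta_3$ → robust conclusion.
Large variation in $\beta_3$ → clinical judgment is needed.

Reporting in practice:

“Under MAR, the drug effect = $X$”.
“Under a range of MNAR assumptions, the drug effect = $X \pm \Delta$”.
“Consistent results → the conclusion is robust”.

Recommended use together with the pattern-mixture models of § 14.5.2:

Selection: explicit assumptions about $f(R \mid y)$.
Pattern-mixture: assumptions about the response distribution within each pattern.
Both are tools for sensitivity analysis.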

8 Related Topics

Prerequisites

Ch.4 Normal longitudinal MRM: the foundation of the longitudinal model
Ch.9 NIMH GLMM: the NIMH schizophrenia data
Ch.10 § 10.2.3 Discrete-time survival: cumulative ordinal + clog-log + PH equivalence
Ch.13 § 13.2 Cholesky reparameterization: Cholesky + Gauss-Hermite quadrature
§ 14.3 Simulations: MAR vs MNAR simulations
§ 14.4 Testing MCAR: Drug × MeanY interaction (parallel finding)

Follow-up topics (Ch.14 sub-posts)

§ 14.5.2: Pattern-mixture model (Little 1993, 1994)
§ 14.6: Summary

Related concepts

Heckman (1976): origin of the econometric selection model (Annals of Economic and Social Measurement)
Leigh, Ward & Fries (1993): selection model tutorial (J Clin Epidemiology)
Diggle & Kenward (1994): longitudinal selection model + discussion (Applied Statistics)
Kenward (1998): sensitivity to nonignorability (Statistics in Medicine)
Little (1993): pattern-mixture (Journal of the American Statistical Association)
Little (1994): pattern-mixture for non-monotone (Biometrika)
Little (1995): selection vs pattern-mixture overview (JASA)
Little & Rubin (2002): missing data textbook (2nd ed.)
Glynn, Laird & Rubin (1986): selection vs mixture comparison
Hogan & Laird (1997b): selection vs mixture review
Michiels, Molenberghs & Lipsitz (2002): pattern-mixture sensitivity
Wu & Carroll (1988): conditional linear model with informative dropout
Wu & Bailey (1989): estimation under random-coefficient regression
Schluchter (1992): methods for analysis of informative censored longitudinal data
De Gruttola & Tu (1994): modelling progression of CD4 with informative dropout
Ten Have, Pulkstenis, Kunselman & Landis (1998): mixed-effects logistic regression with informative dropout
Engel (1993): on equivalence of clog-log models
Läärä & Matthews (1985): equivalence of two link functions for ordered categorical data
Prentice & Gloeckler (1978): grouped-time proportional hazards (clog-log)