1 들어가며 — 12-3 의 Poisson 위에 두 확장

§ 12.4 ~ 12.4.1 (12-3) 에서 mixed-effects Poisson 의 marginal MLE framework 를 다뤘다. 본 sub-post 는 그 위에 두 가지 깊이 있는 확장:

두 확장의 자리

확장	절	핵심
Empirical Bayes	§ 12.4.2	환자별 \(\theta_i\) 추정 (식 12.28-12.29)
Mixed-Effects ZIP	§ 12.4.3	두 random effects 로 ZIP 의 longitudinal 확장
Mixed-Effects ZIP(τ)	§ 12.4.3	ZIP(τ) 의 mixed-effects 변형

한 줄 요약

“§ 12.4.2 = Empirical Bayes 의 식 (12.28) EAP estimator \(\bar\theta_i = E(\theta_i \mid y_i)\) — § 9.6.1 식 (9.48) 의 Poisson 적용. Thomas et al. (1992) 의 hospital mortality ranking 이 표준 응용 + 자살률 county ranking (§ 12.5) 의 토대. Shrinkage 효과로 작은 클러스터의 안정 추정. § 12.4.3 = Mixed-Effects ZIP (Hedeker-Gibbons 2006) 의 두 random effects (\(\sigma_1\) Poisson part + \(\sigma_2\) logistic part) 로 두 차원 환자 이질성 모형화. 식 (12.36-12.41) score 가 12-2 의 ZIP score 에 random effects 항 추가. BHHH (Berndt et al. 1974) 로 information matrix 추정. Mixed-Effects ZIP(τ) (식 12.42-12.51) 에서 식 (12.44) 의 τ 가 두 부분 효과의 비율 — Hall (2000), Hur et al. (2002) 의 부분적 확장의 완전 일반화.”

2 § 12.4.2 — Empirical Bayes 의 깊이

2.1 EAP Estimator — 식 (12.28-12.29)

식 (12.28) — EAP (Expected A Posteriori)

환자 \(i\) 의 random effect 사후 평균:

\[ \bar\theta_i = E(\theta_i \mid y_i) = \frac{1}{h(y_i)} \int_\theta \theta_i \cdot \ell(y_i \mid \theta) \cdot g(\theta) \, d\theta \tag{12.28} \]

표기:

\(h(y_i)\): marginal density (식 12.24).
\(\ell(y_i \mid \theta)\): conditional likelihood (식 12.19).
\(g(\theta) = \mathcal{N}(0, 1)\): random effect 사전 분포.

→ 사후 분포 \(f(\theta \mid y_i)\) 의 평균.

식 (12.29) — 사후 분산

추정의 정밀도 측정:

\[ V(\bar\theta_i \mid y_i) = \frac{1}{h(y_i)} \int_\theta (\theta_i - \bar\theta_i)^2 \cdot \ell(y_i \mid \theta) \cdot g(\theta) \, d\theta \tag{12.29} \]

→ 95% 신뢰 구간: \(\bar\theta_i \pm 1.96 \sqrt{V(\bar\theta_i)}\).

직관 — Bayes 의 정리로부터의 도출

\(\theta_i\) 의 사후 분포 (Bayes):

\[ f(\theta_i \mid y_i) = \frac{\ell(y_i \mid \theta_i) g(\theta_i)}{h(y_i)} = \frac{\text{likelihood} \times \text{prior}}{\text{marginal}} \]

식 (12.28) 의 EAP = 사후 평균:

\[ \bar\theta_i = \int_\theta \theta_i \cdot f(\theta_i \mid y_i) \, d\theta \]

→ Bayes 정리의 직접 적용.

§ 9.6.1 식 (9.48) 와 동일 형태 — 모든 GLMM 의 EB 추정이 같은 framework. 차이는 conditional likelihood \(\ell\) 의 형태:

이항 (§ 9): Bernoulli 곱.
순서형 (§ 10): Cumulative.
명목 (§ 11): Multinomial.
카운트 (§ 12): Poisson 곱.

Bock & Aitkin (1981) 의 명명 — “EAP” (Expected A Posteriori). Item Response Theory (IRT) 의 능력 추정에서 표준.

2.2 적분 근사 — Gauss-Hermite

EAP 의 numerical 근사

식 (12.28) 의 적분은 Gauss-Hermite quadrature 로:

\[ \bar\theta_i \approx \frac{\sum_{q=1}^Q B_q \cdot \ell(y_i \mid B_q) \cdot A(B_q)}{\sum_{q=1}^Q \ell(y_i \mid B_q) \cdot A(B_q)} \]

표기:

\(B_q\): Gauss-Hermite 노드 (표준 정규).
\(A(B_q)\): 가중치.
분자 = \(\int \theta \cdot \ell \cdot g\).
분모 = \(h(y_i)\).

식 (12.29) 의 사후 분산도 같은 방식:

\[ V(\bar\theta_i) \approx \frac{\sum_q (B_q - \bar\theta_i)^2 \cdot \ell(y_i \mid B_q) \cdot A(B_q)}{\sum_q \ell(y_i \mid B_q) \cdot A(B_q)} \]

직관 — Quadrature 의 효율

§ 9.6.3 의 Gauss-Hermite quadrature 가 EB 추정에도 직접 적용:

\(\hat\beta, \hat\sigma\) 추정 시 이미 quadrature 결과 저장.
같은 \(B_q, A(B_q)\) 로 EAP 계산 — 추가 비용 미미.
모든 환자의 EAP 를 한 번에 산출 가능.

\(Q\) 선택:

\(Q = 10\): marginal MLE 에 충분.
\(Q = 30\) 이상: EB 추정의 정확도 ↑ (특히 사후 분포가 사전과 다른 경우).
→ EB 용으로 더 많은 quadrature 점 권고.

Adaptive quadrature 의 가치:

환자별 사후 분포 위치 (mode) 가 다를 수 있음.
Adaptive Gauss-Hermite 는 환자별로 quadrature 노드 재배치.
→ EB 추정 정확도 ↑.

R lme4::ranef(fit) 가 이 절차로 EB 추출.

2.3 Shrinkage 효과

EB 의 자연스러운 정규화

EB 추정의 두 극한:

데이터 많은 환자 (\(n_i\) 큼):

\(\ell(y_i \mid \theta)\) 가 매우 강한 정보 (likelihood 가 좁은 봉우리).
사후 분포가 likelihood 에 dominated.
\(\bar\theta_i\) 가 environment likelihood 의 mode 에 가까움 (사전 영향 미미).

데이터 적은 환자 (\(n_i\) 작음):

\(\ell\) 가 약함 (likelihood 가 넓은 봉우리).
사후 분포가 사전 \(g(\theta) = \mathcal{N}(0, 1)\) 에 dominated.
\(\bar\theta_i \to 0\) (사전 평균).

→ Shrinkage — 작은 표본의 추정이 모집단 평균으로 수축.

직관 — Shrinkage 의 임상적 가치

Hospital mortality ranking 예시:

Case 1 — 큰 hospital (5000 환자):

자체 mortality rate 데이터가 풍부.
모집단 평균보다 더 정확한 추정 가능.
EB 가 거의 raw rate 에 가까움.

Case 2 — 작은 hospital (50 환자):

Random variation 으로 raw mortality rate 가 매우 부정확.
Outlier hospital 처럼 보일 수 있음 (작은 표본의 변동).
EB 가 모집단 평균으로 수축 → “안정된” 추정.

Shrinkage 의 통계적 가치:

Stein’s paradox (1956): 다중 추정 문제에서 shrinkage estimator 가 unbiased 보다 항상 더 정확.
Empirical Bayes 가 Stein-like shrinkage 의 자연 구현.
→ “다 같이 추정” 이 “각자 따로 추정” 보다 정확.

이것이 Thomas et al. (1992) 의 hospital mortality + Gibbons et al. (2005) 의 county suicide rate 분석의 토대.

2.4 Hospital Mortality 응용 — Thomas et al. (1992)

EB random effects 의 ranking 응용

Thomas et al. (1992) 의 분석:

응답: 환자별 사망 여부 (Bernoulli) — Logistic mixed-effects.
Random effect: hospital cluster (\(\theta_i\) for hospital \(i\)).
추정: hospital 별 EAP \(\bar\theta_i\).
해석: “Hospital \(i\) 의 mortality rate 가 전국 평균보다 (logit 척도로) 얼마나 다른가” — covariate 보정 후.

직관 — Quality 평가의 통계적 토대

Hospital quality ranking 의 표준 절차:

Risk adjustment: 환자 covariate (age, comorbidity, severity) 를 fixed effects 로 통제.
Random effect 추출: 잔여 hospital 효과 = \(\theta_i\).
EB 추정: \(\bar\theta_i\) + 95% CI.
Outlier 식별: CI 가 0 을 포함 안 하면 “유의하게 다름”.
Ranking: \(\bar\theta_i\) 순서로 hospital 비교.

Shrinkage 가 핵심:

작은 hospital 의 raw rate 가 outlier 처럼 보여도 — EB 로 적절한 방향 (모집단 평균 쪽) 으로 수축 후 평가.
큰 hospital 의 raw rate 는 거의 그대로 → 정확한 비교.

§ 12.4.2 의 Poisson 적용 — 자살률 county ranking (§ 12.5):

응답: county 별 자살 수 (카운트) — Poisson mixed-effects.
Random effect: county cluster.
EB 추정: county 별 \(\bar\theta_i\) (조정된 자살률 비율).
정책: 위험 county 식별 → 자원 배분.

→ 같은 framework, 다른 응답 형태. Hospital → mortality (Bernoulli), county → 자살 수 (Poisson).

Longford (1994) 의 EB references — 이 분야의 표준 참고.

3 § 12.4.3 — Mixed-Effects ZIP 의 깊이

3.1 모형 정의 — 식 (12.30-12.35)

Hedeker-Gibbons (2006) 의 두 random effects ZIP

기본 ZIP 우도 (식 12.30):

\[ P(y_{ij}) = p(y_{ij}) = (1 - \pi_{ij}) f(y_{ij}) + I(y_{ij}) \pi_{ij} \tag{12.30} \]

with Poisson density (식 12.31):

\[ f(y_{ij}) = \frac{e^{-\lambda_{ij}} \lambda_{ij}^{y_{ij}}}{y_{ij}!} \tag{12.31} \]

두 random effects 추가:

\[ \text{logit}(\pi_{ij}) = w_{ij}^\top \gamma + \nu_i = w_{ij}^\top \gamma + \sigma_2 \theta_{2i} \tag{12.32} \]

\[ \log(\lambda_{ij}) = x_{ij}^\top \beta + \upsilon_i = x_{ij}^\top \beta + \sigma_1 \theta_{1i} \tag{12.33} \]

여기서 \(\upsilon_i \sim \mathcal{N}(0, \sigma_1^2)\), \(\nu_i \sim \mathcal{N}(0, \sigma_2^2)\).

식 (12.34-12.35) — Explicit \(\pi_{ij}, \lambda_{ij}\)

식 (12.32-12.33) 을 inverse link 로 풀면:

\[ \pi_{ij} = \frac{\exp(w_{ij}^\top \gamma + \sigma_2 \theta_{2i})}{1 + \exp(w_{ij}^\top \gamma + \sigma_2 \theta_{2i})} \tag{12.34} \]

\[ \lambda_{ij} = \exp(x_{ij}^\top \beta + \sigma_1 \theta_{1i}) \tag{12.35} \]

직관 — 두 차원 환자 이질성

식 (12.32-12.33) 의 두 random effects 의 의미:

\(\sigma_1 \theta_{1i}\) (Poisson part frailty):

“환자 \(i\) 가 Poisson state 라면 평균보다 얼마나 자주 사건 발생하는가” 의 환자별 차이.
\(\theta_{1i} > 0\): 평균보다 자주 (high-frequency utilizer).
\(\theta_{1i} < 0\): 평균보다 가끔.

\(\sigma_2 \theta_{2i}\) (Logistic part frailty):

“환자 \(i\) 가 perfect state (사건 발생 가능성 없는 그룹) 일 가능성” 의 환자별 차이.
\(\theta_{2i} > 0\): perfect state 가능성 ↑ (구조적 zero 그룹).
\(\theta_{2i} < 0\): Poisson state 가능성 ↑.

→ 두 random effects 가 두 차원의 환자 segmentation.

임상 시나리오 (mental health):

환자 type	\(\theta_{1i}\)	\(\theta_{2i}\)	해석
High utilizer	큰 양수	큰 음수	진료 필요 + 자주 방문
Non-utilizer	(의미 없음)	큰 양수	진료 필요 없음
Occasional	음수	0 근처	진료 필요 + 가끔
Average	0 근처	0 근처	평균

→ 두 random effects 가 정책 결정에 풍부한 정보. 단순 ZIP (random effects 없음) 으로는 환자별 segmentation 불가능.

\(\theta_{1i}\) 와 \(\theta_{2i}\) 의 상관:

독립 가정 가능 (Hedeker-Gibbons 2006 의 가정).
또는 이변량 정규로 상관 모형화 (더 일반).
임상적 가설: “구조적 zero 그룹과 high-frequency 그룹이 구별되는 환자 특성” — 음의 상관 가능.

3.2 Hall (2000) vs Hur (2002) vs Hedeker-Gibbons (2006)

세 framework 의 차이

Hall (2000) — 부분적 mixed-effects ZIP:

Poisson part 에만 random effects (\(\sigma_1 \theta_{1i}\)).
Logistic part 는 fixed (random effects 없음).
EM 알고리즘.

Hur et al. (2002) — 단일 random effect:

두 부분에 같은 random effect (\(\theta_i\)).
\(\theta_{1i} = \theta_{2i}\) 가정 — 두 차원 환자 효과의 완전 상관.
단순하지만 가정 강함.

Hedeker-Gibbons (2006) — 두 random effects:

식 (12.32-12.33) 의 \(\sigma_1 \theta_{1i}, \sigma_2 \theta_{2i}\) 별도.
가장 일반적 — 두 차원 이질성 분리.
Newton-Raphson + BHHH.

직관 — 일반화의 trade-off

세 framework 의 비교:

Framework	Random effects	가정 강도	정확성	추정 비용
Hall (2000)	Poisson 만	logistic 무이질성 가정 (강함)	보통	낮음 (EM)
Hur (2002)	단일 (\(\theta_i\))	두 부분 완전 상관 가정 (강함)	보통	보통
Hedeker-Gibbons (2006)	두 개 (\(\theta_{1i}, \theta_{2i}\))	가정 약함	높음	높음 (이중 적분)

선택 기준:

Logistic part 의 환자 차이가 작을 것으로 예상 → Hall (Poisson 만 random).
두 부분의 효과가 같은 환자 특성에 의해 결정 → Hur (단일 effect).
두 부분의 메커니즘이 다른 환자 특성 → Hedeker-Gibbons (두 개 effects).

현대 표준 — Hedeker-Gibbons:

가장 유연.
Bayesian 구현 (brms) 으로 추정 비용 줄어듦.
사후 비교 (LR test, DIC) 로 단순 모형으로 collapse 가능.

LR 검정 — Hedeker-Gibbons vs Hur:

\(H_0: \theta_{1i} = \theta_{2i}\) 같음.
\(\chi^2\) 검정 — 가정 위반 시 거부.
거부 안 됨 → Hur 의 단순 모형 사용 가능.

3.3 Score 도출 — 식 (12.36-12.41)

식 (12.36-12.37) — Conditional likelihood

환자별 conditional likelihood (random effects \(\theta = (\theta_{1i}, \theta_{2i})\) 조건부):

\[ \ell(y_i \mid \theta) = \prod_{j=1}^{n_i} [(1 - \pi_{ij}) f(y_{ij}) + I(y_{ij}) \pi_{ij}] \tag{12.36} \]

Log:

\[ \log \ell(y_i \mid \theta) = \sum_{j=1}^{n_i} \log[(1 - \pi_{ij}) f(y_{ij}) + I(y_{ij}) \pi_{ij}] \tag{12.37} \]

식 (12.38-12.39) — Score 의 분리

모수 vector \(\eta = (\eta_1, \eta_2) = (\beta, \sigma_1, \gamma, \sigma_2)\).

\(\eta_1 = (\beta, \sigma_1)\) score (Poisson part):

\[ \frac{\partial \log \ell}{\partial \eta_1} = \sum_{j=1}^{n_i} \frac{1}{p(y_{ij})} \left[ (1 - \pi_{ij}) f(y_{ij}) \frac{\partial \log f(y_{ij})}{\partial \eta_1} \right] \tag{12.38} \]

(만약 \(\pi_{ij}\) 가 \(\eta_1\) 무관 — ZIP 의 별도 covariate.)

\(\eta_2 = (\gamma, \sigma_2)\) score (logistic part):

\[ \frac{\partial \log \ell}{\partial \eta_2} = \sum_{j=1}^{n_i} \frac{1}{p(y_{ij})} [I(y_{ij}) - f(y_{ij})] \frac{\partial \pi_{ij}}{\partial \eta_2} \tag{12.39} \]

식 (12.40-12.41) — 모수별 미분

\(\partial \log f(y_{ij}) / \partial \eta_1\) (Poisson part):

\[ \frac{\partial \log f(y_{ij})}{\partial \beta_p} = (y_{ij} - \lambda_{ij}) x_{ijp} \]

\[ \frac{\partial \log f(y_{ij})}{\partial \sigma_1} = (y_{ij} - \lambda_{ij}) \theta_{1i} \tag{12.40} \]

\(\partial \pi_{ij} / \partial \eta_2\) (logistic part):

\[ \frac{\partial \pi_{ij}}{\partial \gamma_s} = \pi_{ij} (1 - \pi_{ij}) w_{ijs} \]

\[ \frac{\partial \pi_{ij}}{\partial \sigma_2} = \pi_{ij} (1 - \pi_{ij}) \theta_{2i} \tag{12.41} \]

직관 — 12-2 ZIP score 의 mixed-effects 일반화

12-2 의 식 (12.15-12.16) (cross-sectional ZIP score) 와 비교:

Cross-sectional: \(\sum_i\) — 환자 단위 합.
Mixed-effects: \(\sum_j\) — 같은 환자의 시점 합 (conditional on \(\theta\)).
\(\sigma_1, \sigma_2\) 에 대한 추가 score 항 (random effects 분산 추정).

\(\eta_1\) vs \(\eta_2\) 의 score 분리:

Poisson part 의 score (식 12.38): \((1 - \pi_{ij}) f(y_{ij}) \cdot \partial \log f / \partial \eta_1\) — Poisson likelihood weight.
Logistic part 의 score (식 12.39): \([I(y) - f(y)] \cdot \partial \pi / \partial \eta_2\) — perfect zero detection.

이것이 EM 알고리즘 의 자연 구조 — 두 부분이 분리되어 별도 GLM 적합 가능 (cross-sectional ZIP 의 EM 과 같은 발상).

Mixed-effects 의 추가 부담:

Score 가 conditional (random effects \(\theta\) 의존).
Marginal score 는 적분 후 — Gauss-Hermite quadrature.
두 random effects → 이중 적분 (\(Q^2\) 점).

Adaptive quadrature 권고:

두 차원에서 \(Q = 10^2 = 100\) 점 — 큰 부담.
Adaptive 로 환자별 효율적 quadrature.
또는 Bayesian (brms, Stan) 으로 우회.

3.4 BHHH Information Matrix

BHHH (Berndt, Hall, Hall, Hausman 1974)

저자 본문 인용:

“The second derivatives of the above log-likelihood are complex, but can be approximated using the BHHH estimator (the sum of the outer product of the first derivatives) to yield the observed information matrix.”

BHHH:

\[ \widehat I_{\text{BHHH}} = \sum_i \frac{\partial \log \ell_i}{\partial \eta} \frac{\partial \log \ell_i}{\partial \eta^\top} \]

이를 Newton-Raphson 의 Hessian 자리에 사용.

직관 — Hessian 회피의 가치

Mixed-effects ZIP 의 Hessian 계산이 매우 복잡:

두 random effects → 이중 적분.
ZIP 의 mixture → 분기 처리.
모수 5 가지 (\(\beta, \sigma_1, \gamma, \sigma_2\)) 의 cross-derivative.

Hessian 을 직접 계산하면:

코드 길이 폭발.
Numerical instability 위험 (negative diagonal 등).
디버깅 어려움.

BHHH 의 장점 (12-2 에서 본 것과 동일):

Score 만 사용 (1 차 도함수).
항상 양정치 (outer product 합).
알고리즘 단순.
Information matrix equality (정칙성 조건 하) 로 정확한 정보 추정.

현대 software:

R glmmTMB(family = "tweedie") 또는 zero-inflated families: BHHH-like quasi-Newton.
brms: NUTS sampler — Hessian 불필요.
SAS PROC NLMIXED: BFGS 또는 Newton-Raphson 옵션.

한계:

BHHH 가 진정한 Hessian 보다 약간 부정확 (작은 표본).
매우 작은 \(n\) → BFGS (Quasi-Newton) 가 더 정확.

4 Mixed-Effects ZIP(τ) — 식 (12.42-12.51)

4.1 모형 정의 — 식 (12.42-12.46)

ZIP(τ) 의 mixed-effects 변형

ZIP(τ) 의 functional 관계 + random effects:

\[ P(y_{ij}) = (1 - \pi_{ij}) f(y_{ij}) + I(y_{ij}) \pi_{ij} \tag{12.42} \]

logit(π) 의 functional 형태 (12-2 의 식 12.10 의 확장):

\[ \text{logit}(\pi_{ij}) = -\tau x_{ij}^\top \beta + \nu_i = -\tau x_{ij}^\top \beta + \sigma_2 \theta_{2i} \tag{12.43} \]

\(\log(\lambda_{ij})\) 는 식 (12.33) 와 동일.

\(\tau\) 의 expectation 표현 (식 12.44):

\[ \tau = -\frac{E[\log \pi_{ij} / (1 - \pi_{ij})]}{E[\log \lambda_{ij}]} \tag{12.44} \]

식 (12.45-12.46) — Explicit 표현

\[ \pi_{ij} = \frac{\exp(-\tau x_{ij}^\top \beta + \sigma_2 \theta_{2i})}{1 + \exp(-\tau x_{ij}^\top \beta + \sigma_2 \theta_{2i})} \tag{12.45} \]

\[ \lambda_{ij} = \exp(x_{ij}^\top \beta + \sigma_1 \theta_{1i}) \tag{12.46} \]

직관 — 식 (12.44) 의 의미

식 (12.44) 의 \(\tau\) 는 두 부분의 효과 비율:

분자: logistic part 의 평균 logit (어떤 covariate 조합에서).
분모: Poisson part 의 평균 log-rate.

\(\tau\) 가 양수 → covariate 가 \(\lambda\) ↑ 일 때 \(\pi\) ↓ (perfect state 비율 감소). \(\tau\) 가 음수 → covariate 가 \(\lambda\) ↑ 일 때 \(\pi\) ↑ (둘 다 증가).

\(\tau\) 의 추정 의미:

두 부분이 같은 covariate 의 효과 — 비율로 표현.
모수 절약: ZIP 의 \(\beta + \gamma\) (두 셋) → ZIP(τ) 의 \(\beta + \tau\) (한 셋 + 1 모수).

Mixed-effects ZIP(τ) 의 가정:

\(\theta_{1i}, \theta_{2i}\) 는 여전히 별도 — 두 random effects 분산 분리.
단지 fixed effects 가 functional 관계.
→ 모수 절약 + 환자 이질성 보존.

4.2 Score — 식 (12.47-12.51)

\(\eta_1 = (\beta, \sigma_1)\) Score (식 12.47)

\[ \frac{\partial \log \ell}{\partial \eta_1} = \sum_j \frac{1}{p(y_{ij})} \left[ (1 - \pi_{ij}) f(y_{ij}) \frac{\partial \log f}{\partial \eta_1} + (I(y_{ij}) - f(y_{ij})) \frac{\partial \pi_{ij}}{\partial \eta_1} \right] \]

→ ZIP 의 식 (12.38) 과 비슷하지만 추가 항 — \(\pi_{ij}\) 가 \(\beta\) 의 함수 (functional 관계).

\(\eta_2 = (\tau, \sigma_2)\) Score (식 12.48)

\[ \frac{\partial \log \ell}{\partial \eta_2} = \sum_j \frac{1}{p(y_{ij})} [I(y_{ij}) - f(y_{ij})] \frac{\partial \pi_{ij}}{\partial \eta_2} \]

식 (12.49-12.51) — \(\pi\) 와 \(f\) 의 미분

\(\partial \log f / \partial \beta_p = (y_{ij} - \lambda_{ij}) x_{ijp}\) (식 12.49 — Poisson part 와 동일).

\(\partial \pi_{ij} / \partial \beta_p\) (식 12.50, functional 관계로 추가 항):

\[ \frac{\partial \pi_{ij}}{\partial \beta_p} = -\tau \pi_{ij} (1 - \pi_{ij}) x_{ijp} \]

(ZIP 에서는 0 이었음 — 별도 covariate.)

\(\partial \pi_{ij} / \partial \tau\) (식 12.51):

\[ \frac{\partial \pi_{ij}}{\partial \tau} = -\pi_{ij} (1 - \pi_{ij}) x_{ij}^\top \beta \]

\(\partial \pi_{ij} / \partial \sigma_2\):

\[ \frac{\partial \pi_{ij}}{\partial \sigma_2} = \pi_{ij} (1 - \pi_{ij}) \theta_{2i} \]

직관 — Functional 관계의 추정 효과

ZIP vs ZIP(τ) 의 score 차이:

ZIP (식 12.38-12.39):

\(\partial \pi / \partial \beta = 0\) (별도 covariate).
두 부분이 score 에서 분리 — EM 자연.

ZIP(τ) (식 12.47-12.51):

\(\partial \pi / \partial \beta = -\tau \pi(1-\pi) x \neq 0\) (functional 관계).
두 부분이 score 에서 결합 — Newton-Raphson 직접 최대화.

\(\tau\) score (식 12.51 의 \(\partial \pi / \partial \tau\)):

\(\tau\) 의 효과가 logistic part 의 logit 에 직접.
“perfect state 비율의 변화율 / Poisson rate 의 변화율” 의 비례 모수.

EM vs Newton-Raphson 선택 (12-2 의 cross-sectional 과 동일):

ZIP (별도 covariate): EM 권고 (두 GLM 분리).
ZIP(τ) (functional 관계): Newton-Raphson 권고 (직접 최대화).

Mixed-effects 의 추가 복잡성:

두 random effects → marginal score 의 이중 적분.
BHHH 로 information matrix 근사 (식 12.18 와 동일 개념).
Adaptive Gauss-Hermite quadrature 권고.

5 응용 분야

분야	EB 활용	Mixed-effects ZIP 활용
Hospital quality	Hospital ranking	(rare — 보통 Poisson 만)
County 자살률	County ranking (§ 12.5)	자살률 ZIP (§ 12.5)
환자별 의료 이용	High utilizer 식별	두 차원 segmentation (필요 + 빈도)
약물 처방 패턴	Provider ranking	비처방자 vs 처방자 + 처방 횟수
학생별 결석	Student / school ranking	결석 안 함 vs 가끔 vs 자주
보험 청구	Provider 별 청구 패턴	청구 안 함 vs 청구 + 빈도
사회 행동	Cluster ranking	행동 안 함 vs 함 + 빈도

6 코드 예시

6.1 Step 1: Empirical Bayes 추정 (R)

library(lme4)


# 12-3 의 mixed-effects Poisson 데이터에 EB 적용
fit_mep <- glmer(y ~ time + x + (1 | subject), data = df, family = poisson,
                 nAGQ = 10)

# Empirical Bayes random effects (식 12.28)
ranef_mep <- ranef(fit_mep, condVar = TRUE)$subject

# 처음 10 환자의 EB 추정
print(head(ranef_mep, 10))

# 사후 분산 (식 12.29)
post_var <- attr(ranef(fit_mep, condVar = TRUE)$subject, "postVar")
post_se <- sqrt(post_var[1, 1, ])

# EB 와 사후 SE 결합
eb_df <- data.frame(
  subject = 1:nrow(ranef_mep),
  theta_hat = ranef_mep[, 1],
  post_se = post_se
)
print(head(eb_df))

# Caterpillar plot — 환자별 EB ± 95% CI
library(ggplot2)
eb_df_sorted <- eb_df[order(eb_df$theta_hat), ]
eb_df_sorted$rank <- 1:nrow(eb_df_sorted)
ggplot(eb_df_sorted, aes(x = rank, y = theta_hat)) +
  geom_point(size = 1.5) +
  geom_errorbar(aes(ymin = theta_hat - 1.96 * post_se,
                    ymax = theta_hat + 1.96 * post_se),
                width = 0.3, alpha = 0.5) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  labs(x = "Subject Rank", y = "EB Estimate (theta_i)",
       title = "Empirical Bayes Random Effects Caterpillar Plot")

Caterpillar Plot 의 정책 활용

Caterpillar plot 의 해석:

y 축: \(\bar\theta_i\) (환자/클러스터별 random effect).
x 축: rank (작은 → 큰).
빨간 점선 (y = 0): 모집단 평균.

Outlier 식별:

CI 가 0 을 포함 안 함 + \(\bar\theta_i > 0\) → 평균보다 유의하게 높음.
CI 가 0 을 포함 안 함 + \(\bar\theta_i < 0\) → 유의하게 낮음.

Hospital quality 응용:

가장 높은 mortality rate 의 hospital — 정책 개입 대상.
가장 낮은 mortality rate — 모범 사례.

자살률 county 응용:

가장 높은 자살률 county — 자원 우선 배분.
가장 낮은 자살률 county — 보호 요인 분석.

Shrinkage 의 시각적 확인:

작은 표본의 CI 가 넓음 — 정확도 낮음.
큰 표본의 CI 가 좁음 — 정확도 높음.

6.2 Step 2: Mixed-Effects ZIP 적합 (R `glmmTMB`)

library(glmmTMB)


# Mental health utilization 시뮬레이션 (mixed-effects ZIP)
set.seed(2026)
n_subjects <- 300
n_times <- 5

df_zip <- expand.grid(subject = 1:n_subjects, time = 1:n_times)
df_zip$age <- rep(rnorm(n_subjects, 50, 10), each = n_times)
df_zip$insurance <- rep(rbinom(n_subjects, 1, 0.6), each = n_times)

# 두 random effects (식 12.32-12.33)
upsilon_1 <- rep(rnorm(n_subjects, 0, 0.6), each = n_times)  # Poisson part
upsilon_2 <- rep(rnorm(n_subjects, 0, 0.8), each = n_times)  # Logistic part

# Logistic part: P(perfect state)
logit_pi <- -1 + 0.5 * df_zip$insurance + upsilon_2
pi <- plogis(logit_pi)

# Poisson part: lambda
log_lam <- 0.3 + 0.5 * df_zip$insurance + 0.1 * df_zip$time + upsilon_1
lam <- exp(log_lam)

# Mixture
is_perfect <- rbinom(nrow(df_zip), 1, pi)
df_zip$y <- ifelse(is_perfect == 1, 0, rpois(nrow(df_zip), lam))

# Mixed-effects ZIP 적합
fit_mzip <- glmmTMB(y ~ time + insurance + (1 | subject),
                    ziformula = ~ insurance + (1 | subject),
                    data = df_zip, family = poisson)
summary(fit_mzip)

R glmmTMB 의 mixed-effects ZIP

glmmTMB 의 syntax:

본 formula: Poisson part.
ziformula: Logistic part (zero-inflation).
각 part 에 별도 random effects 가능 (식 12.32-12.33 의 직접 구현).

대안 패키지:

R pscl::zeroinfl: cross-sectional ZIP 만 (random effects 없음).
R brms::brm(family = zero_inflated_poisson()): Bayesian, 가장 유연.
SAS PROC NLMIXED: 직접 코딩.
Hedeker MIXPREG: 카운트 mixed-effects 의 표준 도구 (단, 옛 software).

Hedeker-Gibbons 2006 framework 의 구현:

glmmTMB 가 식 (12.32-12.33) 의 두 random effects 직접 지원.
두 부분이 다른 covariate + 다른 random effects 분산.
표준오차 자동 산출.

6.3 Step 3: Hall (2000) vs Hur (2002) vs Hedeker-Gibbons 비교

# Hall (2000): Poisson part 만 random effects
fit_hall <- glmmTMB(y ~ time + insurance + (1 | subject),
                    ziformula = ~ insurance,  # logistic part 는 fixed
                    data = df_zip, family = poisson)

# Hur et al. (2002): 같은 random effect (단순화 어려움 — brms 우회)
# 직접 구현 어려움, 본 sub-post 에서는 비교 생략

# Hedeker-Gibbons (2006): 두 random effects (위 fit_mzip)

# AIC 비교
cat("Hall (Poisson only random):\n")
cat("  AIC:", AIC(fit_hall), "BIC:", BIC(fit_hall), "\n")
cat("\nHedeker-Gibbons (두 random):\n")
cat("  AIC:", AIC(fit_mzip), "BIC:", BIC(fit_mzip), "\n")

# LR test (Hall vs Hedeker-Gibbons)
lrt_stat <- 2 * (logLik(fit_mzip) - logLik(fit_hall))
cat("\nLR test (Hall vs Hedeker-Gibbons):\n")
cat("  chi^2 =", round(lrt_stat, 2), "\n")
cat("  df = 1, p =", format.pval(1 - pchisq(lrt_stat, 1), digits = 3), "\n")

모형 선택 절차

LR test 결과:

p < .05: Hedeker-Gibbons 모형이 우세 — 두 random effects 모두 필요.
p ≥ .05: Hall 모형으로 충분 — logistic part 의 환자 이질성 작음.

실무 권고:

가장 일반 모형 (Hedeker-Gibbons) 시작.
단순 모형 (Hall) 과 LR 검정 비교.
차이 미미하면 단순 모형 채택.
차이 큼 → 일반 모형 유지 + 두 random effects 모두 보고.

Bayesian 우회 (brms):

library(brms)
fit_brm <- brm(y ~ time + insurance + (1 | subject),
               data = df_zip,
               family = zero_inflated_poisson(),
               # zi formula 도 random effects 가능
               formula = bf(y ~ time + insurance + (1 | subject),
                            zi ~ insurance + (1 | subject)),
               chains = 2, iter = 2000)
summary(fit_brm)

Bayesian 의 장점:

두 random effects 의 상관 모형화 자연.
Prior 활용 (작은 표본 안정화).
DIC, LOO 등 풍부한 모형 비교.

6.4 Step 4: 두 차원 환자 Segmentation

# Mixed-effects ZIP 의 두 random effects 추출
ranef_mzip <- ranef(fit_mzip)
theta_1 <- ranef_mzip$cond$subject[, 1]  # Poisson part
theta_2 <- ranef_mzip$zi$subject[, 1]    # Logistic part

# 환자 segmentation
segment_df <- data.frame(
  subject = 1:length(theta_1),
  theta_1_hat = theta_1,
  theta_2_hat = theta_2,
  segment = case_when(
    theta_2_hat > 0.5 ~ "Non-utilizer (perfect state likely)",
    theta_1_hat > 0.5 & theta_2_hat <= 0 ~ "High utilizer",
    theta_1_hat < -0.5 & theta_2_hat <= 0 ~ "Occasional utilizer",
    TRUE ~ "Average"
  )
)

# Segment 별 환자 수
print(table(segment_df$segment))

# 시각화 — 두 random effects 의 산점도 + segmentation
library(ggplot2)
ggplot(segment_df, aes(x = theta_1_hat, y = theta_2_hat, color = segment)) +
  geom_point(alpha = 0.6, size = 2) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray") +
  geom_vline(xintercept = 0, linetype = "dashed", color = "gray") +
  labs(x = "Poisson Random Effect (theta_1)",
       y = "Logistic Random Effect (theta_2)",
       title = "Patient Segmentation — Two-Dimensional EB",
       subtitle = "x: 사용 빈도 차이, y: 비사용 가능성 차이") +
  scale_color_manual(values = c("Non-utilizer" = "gray60",
                                  "High utilizer" = "red",
                                  "Occasional utilizer" = "blue",
                                  "Average" = "darkgreen"))

Two-Dimensional Patient Segmentation

이 시각화가 mixed-effects ZIP 의 핵심 임상 가치:

Non-utilizer (오른쪽 상단): \(\theta_2 > 0\) — 진료 필요 없는 환자.
High utilizer (오른쪽 하단): \(\theta_1 > 0, \theta_2 < 0\) — 자주 진료 받는 환자.
Occasional utilizer (왼쪽 하단): \(\theta_1 < 0, \theta_2 < 0\) — 가끔만 진료.
Average (중앙): 평균 환자.

임상 정책 활용:

High utilizer 식별 → 만성 관리 프로그램 대상.
Non-utilizer 식별 → 의료 접근성 캠페인 대상.
Occasional → 예방 의료 강화.
Average → 표준 케어.

단순 ZIP 또는 Poisson 만으로는 절대 못 얻는 정보:

단순 Poisson: 단일 random effect → 한 차원만.
단순 ZIP: random effects 없음 → 환자 segmentation 불가능.
Mixed-effects ZIP (Hedeker-Gibbons 2006): 두 차원 segmentation 가능.

이것이 mixed-effects ZIP 가 단순 모형보다 임상·정책 가치가 큰 핵심 이유.

7 관련 주제

선행 지식

Ch.12 Overview — Counts GLMM 의 큰 그림
§ 12.3 ZIP — Cross-sectional ZIP (식 12.30-12.33 의 토대)
§ 12.4 ~ 12.4.1 — Mixed-effects Poisson (random effects framework)
§ 9.6.1 Empirical Bayes — EAP estimator (식 9.48-9.49)

후속 주제 (Ch.12 sub-posts)

§ 12.5 — Suicide rate illustration (Gibbons et al. 2005) — 본 sub-post 의 직접 응용
§ 12.6 — Ch.12 Summary

관련 개념

Bock & Aitkin (1981) — EAP estimator + Marginal MLE 원전
Thomas et al. (1992) — Hospital mortality EB ranking
Longford (1994) — EB references
Lambert (1992) — ZIP 모형 토대
Hall (2000) — Mixed-effects ZIP (Poisson 만 random)
Hur et al. (2002) — Mixed-effects ZIP (단일 random)
Hedeker & Gibbons (2006, § 12.4.3) — 완전한 두 random effects ZIP
Berndt, Hall, Hall & Hausman (1974) — BHHH information matrix
Greene (1994) — Unified pdf + ZIP estimation
Stein (1956) — Stein’s paradox (shrinkage 의 통계적 토대)
Gibbons et al. (2005) — Suicide rate analysis (§ 12.5 의 응용)
Goldsmith et al. (2002) — Suicide statistics