1 들어가며 — 12-2 의 한계, 12-3 의 해결

§ 12.3 (12-2) 의 마지막에 ZIP 의 본질적 한계 — random effects 부재 — 를 명시했다. 본 sub-post 는 그 한계를 푸는 § 12.4 mixed-effects framework + § 12.4.1 의 mixed-effects Poisson 의 정확한 도출.

한 줄 요약

“§ 12.4 = ZIP/Poisson 의 cross-sectional 한계를 random effects 추가로 해결. 발전사 — Goldstein (1991), Breslow (1984), Siddiqui (1996) 가 1980s-1990s 에 정립. § 12.4.1 = 식 (12.20) \(\lambda_{ij} = \exp(x'\beta + \sigma\theta_i)\) 의 multiplicative random effect 직관 (logistic 의 additive on logit 과 대비). Score (식 12.22) 가 ‘잔차 × 공변량’ 의 § 12.1 와 같은 형태 — random effect \(\theta_i\) 가 추가 항. Marginal likelihood (식 12.24) 가 Gauss-Hermite quadrature 로 \(\sum_q \ell_{iq} A(B_q)\) 근사 — § 9.6 toolkit 의 직접 적용. Normal vs gamma random effects — Siddiqui 1996 비교, Longford 1994 의 normal 권고 (다중 random effects 표현력 + 해석 단순).”

2 § 12.4 — Mixed-Effects Models for Counts 의 자리

2.1 발전사 — 1980s-1990s

핵심 문헌

Mixed-effects Poisson 의 발전 (chronological):

Breslow (1984): Poisson regression with normally distributed random effects. 첫 정식 framework.
Lawless & Willmot (1989): Poisson with inverse Gaussian random effects. 대안 분포 탐색.
Goldstein (1991): Multilevel log-linear model. 다수준 framework 통합.
Thall (1988), Albert (1992): 다양한 응용.
Stukel (1993): 추가 변형.
Siddiqui (1996): Normal vs gamma random effects 비교 — 본 chapter 의 토대.
Hall (2000), Hur et al. (2002): Mixed-effects ZIP 발전 (§ 12.4.3 에서).
Hedeker & Gibbons (2006): 본 chapter 의 종합.

직관 — 왜 1980s 에 발전 시작했는가

Mixed-effects Poisson 의 발전 시점이 mixed-effects logistic (Bock-Aitkin 1981) 과 비슷한 1980s 인 이유:

Bock-Aitkin (1981) 의 marginal MLE + Gauss-Hermite quadrature 가 GLMM 의 일반 framework 정립.
컴퓨팅 발전: numerical integration 이 가능해짐.
응용 수요: 의료 이용, 자살, 사고 등 longitudinal/clustered 카운트 분석 필요성.

왜 logistic 보다 약간 늦게 (Breslow 1984)?:

Logistic mixed-effects 는 이항 응답 — 상대적 단순 (Bernoulli 우도).
Poisson 은 unbounded 응답 — 적분 안정성 약간 더 신경 써야.
ZIP, NB 등 변형이 더 늦게 (1990s).

현재 (2026) 의 표준:

R lme4::glmer(family = poisson), glmmTMB, brms.
SAS PROC NLMIXED.
모두 § 9.6 / § 12.4.1 의 toolkit 사용.

2.2 Independence 가정의 위반

표준 Poisson 의 한계

저자 본문 인용:

“The traditional Poisson regression model for count responses … assume statistical independence of observations. However, in many cases the frequency data are clustered or longitudinal.”

Longitudinal 카운트 데이터의 흔한 형태:

같은 환자의 매년 의료 이용 횟수.
같은 환자의 매주 흡연 횟수.
같은 환자의 매월 발작 횟수.

Clustered 카운트 데이터:

County 별 자살률 (county = cluster).
School 별 학생 결석 횟수.
Hospital 별 사망률.
Family 별 의료 이용 빈도.

이런 데이터의 카운트가 클러스터/시점 무관 독립 일 가능성 거의 0:

같은 환자의 반복 측정 → 환자별 baseline 차이.
같은 county 의 자살 → 사회적 요인 공유.
→ 독립 가정 위반.

직관 — 위반의 결과

표준 Poisson (또는 ZIP) 을 longitudinal 카운트에 적용하면:

표준오차 과소 추정: 독립 가정이 정보량을 부풀려 평가.
Type I error 증가: p-value 가 부풀려져 type I error 률 ↑.
신뢰구간 좁음: 95% CI 가 실제 95% 커버리지 미달.

→ 잘못된 통계적 결론.

해결 — Random effects 추가:

환자별 unobserved heterogeneity 흡수.
같은 환자의 시점 사이 상관 자동 처리 (조건부 독립 가정 위에서).
정확한 표준오차 + 추론.

§ 9.6 의 mixed-effects logistic 와 같은 framework — 응답이 카운트로 바뀐 것뿐.

3 § 12.4.1 — Mixed-Effects Poisson 의 정확한 정의

3.1 모형 — 식 (12.19-12.21)

식 (12.19) — Conditional density

\(N\) 환자, 환자 \(i\) 의 \(n_i\) 시점 응답 \(y_i = (y_{i1}, \ldots, y_{in_i})\).

Random effect \(\upsilon_i \sim \mathcal{N}(0, \sigma^2)\) 조건부 — 같은 환자의 시점이 독립:

\[ f(y_i \mid \theta) = \prod_{j=1}^{n_i} f(y_{ij}; \lambda_{ij}) = \prod_j \frac{e^{-\lambda_{ij}} \lambda_{ij}^{y_{ij}}}{y_{ij}!} \tag{12.19} \]

여기서 \(\theta_i = \upsilon_i / \sigma \sim \mathcal{N}(0, 1)\) — 표준화 random effect.

식 (12.20) — \(\lambda_{ij}\) 정의

\[ \lambda_{ij} = \exp(x_{ij}^\top \beta + \upsilon_i) = \exp(x_{ij}^\top \beta + \sigma \theta_i) \tag{12.20} \]

식 (12.21) — Conditional log-likelihood

\[ \log \ell(y_i \mid \theta) = -\sum_{j=1}^{n_i} [\exp(x_{ij}^\top \beta + \sigma \theta_i) - y_{ij}(x_{ij}^\top \beta + \sigma \theta_i) + \log(y_{ij}!)] \tag{12.21} \]

(constant 항 \(\log(y_{ij}!)\) 는 추정에 무관.)

직관 — Multiplicative Random Effect

식 (12.20) 의 \(\lambda_{ij} = \exp(x_{ij}^\top \beta + \sigma \theta_i)\) 의 의미:

\(\sigma \theta_i\) 가 환자별 “rate shifter” — 평균 대비 비율.
\(\exp(\sigma \theta_i)\) 가 multiplicative factor.

해석:

\(\theta_i = +1\): \(\lambda\) 가 \(\exp(\sigma)\) 배 증가 (환자가 평균보다 자주 사건).
\(\theta_i = -1\): \(\lambda\) 가 \(\exp(-\sigma)\) 배 감소.

§ 9.5 mixed-effects logistic 와의 비교:

Chapter	Link 척도	Random effect 형태	Outcome 척도 효과
§ 9 (Logistic)	logit	\(\sigma \theta_i\) on logit (additive)	Multiplicative on odds, nonlinear on probability
§ 12 (Poisson)	log	\(\sigma \theta_i\) on log (additive)	Multiplicative on rate

→ 두 모형 모두 link 척도에서 additive, 결과 척도에서 multiplicative.

Poisson 의 multiplicative 효과의 자연성:

카운트는 항상 비음수 — multiplicative factor 가 자연.
인구 학적 효과 (예: 성별이 의료 이용 빈도를 1.5 배) 가 multiplicative 표현 직관적.
Poisson regression 의 표준 해석 — exp(coefficient) = rate ratio (RR).

임상 예시:

\(\sigma = 0.3\): 환자별 \(\lambda\) 가 평균 대비 약 \(\exp(0.3) = 1.35\) 배 ~ \(\exp(-0.3) = 0.74\) 배.
즉 환자별 의료 이용 빈도가 평균의 0.74 ~ 1.35 배 사이 (±1 SD 환자).
더 큰 \(\sigma\) → 환자 이질성 더 큼.

3.2 Conditional Independence

\(\theta_i\) 조건부 독립 가정

식 (12.19) 의 곱 형태가 가정하는 것:

같은 환자의 시점이 \(\theta_i\) 조건부로 독립.
즉 환자별 unobserved heterogeneity (\(\theta_i\)) 가 모든 시점 사이 상관을 흡수.
그 외의 시점-시점 상관은 없음.

이 가정이 깨지면:

Random effect 가 모든 within-subject 상관을 못 흡수.
추가 시간적 상관 (예: 자기상관) 필요.
→ § 7 의 MRM-AC 처럼 random effects + autocorrelated errors.

Poisson 에서의 검증 어려움:

Logistic 처럼 단일 응답이 아니라 카운트 — 잔차 분석으로 자기상관 검증 가능.
Pearson 잔차의 시간 lag 별 상관 → 추가 구조 필요성 진단.

직관 — § 9.6 의 conditional independence 와 동일 framework

§ 9.6 (이항) 의 conditional independence 가정 (식 9.39 의 Bernoulli 곱) 과 정확히 같은 발상.

Logistic: \(\prod_j \Psi(z_{ij})^{Y_{ij}} (1-\Psi(z_{ij}))^{1-Y_{ij}}\).
Poisson: \(\prod_j e^{-\lambda_{ij}} \lambda_{ij}^{y_{ij}} / y_{ij}!\).

두 모형이 conditional 우도가 곱 형태 — 같은 framework 위.

이것이 § 9.6 의 모든 toolkit (marginal MLE, Gauss-Hermite quadrature, Fisher scoring, Empirical Bayes) 가 § 12.4.1 에 직접 적용 가능한 이유.

3.3 Score 와 Hessian — 식 (12.22-12.23)

식 (12.22) — Conditional score

\(\beta\) 와 \(\sigma\) 에 대한 conditional score (식 12.21 의 도함수):

\[ \begin{bmatrix} \frac{\partial \log \ell}{\partial \beta} \\ \frac{\partial \log \ell}{\partial \sigma} \end{bmatrix} = \begin{bmatrix} \sum_{j=1}^{n_i} (y_{ij} - \lambda_{ij}) x_{ij} \\ \sum_{j=1}^{n_i} (y_{ij} - \lambda_{ij}) \theta_i \end{bmatrix} \tag{12.22} \]

식 (12.23) — Conditional Hessian

\[ \begin{bmatrix} \frac{\partial^2 \log \ell}{\partial \beta \partial \beta^\top} \\ \frac{\partial^2 \log \ell}{\partial \sigma \partial \sigma} \end{bmatrix} = \begin{bmatrix} -\sum_j \exp(x_{ij}^\top \beta + \sigma \theta_i) x_{ij} x_{ij}^\top \\ -\sum_j \exp(x_{ij}^\top \beta + \sigma \theta_i) \theta_i^2 \end{bmatrix} \tag{12.23} \]

직관 — Score 의 두 항

\(\beta\) score: \(\sum_j (y_{ij} - \lambda_{ij}) x_{ij}\).

§ 12.1 식 (12.5) 의 표준 Poisson score 와 동일한 형태.
“잔차 × 공변량” — canonical link 의 결과.
차이는 \(\lambda_{ij}\) 가 random effect \(\theta_i\) 의존.

\(\sigma\) score: \(\sum_j (y_{ij} - \lambda_{ij}) \theta_i\).

“잔차 × \(\theta_i\)” — random effect 가 효과적으로 또 하나의 covariate.
\(\theta_i\) 가 양수면 score 가 잔차 합에 비례, 음수면 반대 부호.
→ \(\sigma\) 추정이 환자별 잔차 패턴에 의존.

Hessian 의 양정치 보장:

식 (12.23) 의 \(-\sum_j \lambda_{ij} x_{ij} x_{ij}^\top\) — outer product 합에 음의 부호.
\(-(\text{negative definite}) = \text{positive definite}\) → Newton-Raphson 안정.

이것이 canonical link (log) 사용의 알고리즘적 장점 (12-1 에서 본 § 12.1 와 동일).

4 Marginal Likelihood — 식 (12.24-12.27)

4.1 식 (12.24-12.25) — Gauss-Hermite Quadrature

Marginal density (식 12.24)

Random effect 적분 + Gauss-Hermite quadrature 근사:

\[ h(\beta, \sigma, \theta; y_i) = h(y_i) = \int_\theta f(y_i \mid \theta) g(\theta) d\theta \]

\[ \approx \sum_{q=1}^Q \left[\prod_{j=1}^{n_i} \frac{e^{-\lambda_{ijq}} \lambda_{ijq}^{y_{ij}}}{y_{ij}!}\right] A(B_q) \tag{12.24} \]

with

\[ \lambda_{ijq} = \exp(x_{ij}^\top \beta + \sigma B_q) \tag{12.25} \]

\(B_q, A(B_q)\) — Gauss-Hermite 노드와 가중치 (\(q = 1, \ldots, Q\)).

직관 — § 9.6.3 의 Poisson 적용

식 (12.24-12.25) 가 § 9.6.3 의 식 (9.55-9.59) 의 Poisson 버전. 차이는:

§ 9.6: Bernoulli conditional likelihood (\(\Psi(z_{ij})^{Y} (1-\Psi)^{1-Y}\)).
§ 12.4.1: Poisson conditional likelihood (\(e^{-\lambda} \lambda^y / y!\)).

같은 적분 framework, 다른 conditional 형태.

Gauss-Hermite 의 작동 (§ 9.6.3 복습):

Random effect \(\theta\) 의 적분 영역을 \(Q\) 개 노드 \(B_q\) 로 discretize.
각 노드 \(B_q\) 에서 conditional likelihood 계산 (\(\theta = B_q\) 대입).
가중치 \(A(B_q)\) 로 weighted sum.

Q 선택의 기준 (§ 9.6.3 권고):

\(Q = 5\): 충분한 정확도 (Longford 1993, Jansen 1990).
\(Q = 10\): 안전한 default.
\(Q = 20\): 매우 정확 (대부분 응용에서 충분).
다중 random effects (3+) : adaptive Gauss-Hermite.

카운트 데이터의 추가 고려:

\(\lambda\) 가 매우 큰 경우 (\(\lambda > 100\)): factorial \(y!\) 계산이 overflow → log-likelihood 사용.
매우 작은 \(\lambda\): \(\lambda^y\) 가 underflow → log-space 계산.

4.2 Marginal Score — 식 (12.26-12.27)

식 (12.26) — \(\beta\) marginal score

\[ \frac{\partial \log L}{\partial \beta} = \sum_{i=1}^N \frac{1}{h(y_i)} \sum_q \sum_j (y_{ij} - \lambda_{ijq}) x_{ij} \cdot \frac{e^{-\lambda_{ijq}} \lambda_{ijq}^{y_{ij}}}{y_{ij}!} A(B_q) \tag{12.26} \]

식 (12.27) — \(\sigma\) marginal score

\[ \frac{\partial \log L}{\partial \sigma} = \sum_i \frac{1}{h(y_i)} \sum_q \sum_j (y_{ij} - \lambda_{ijq}) \theta_i \cdot \frac{e^{-\lambda_{ijq}} \lambda_{ij}^{y_{ij}}}{y_{ij}} A(B_q) \tag{12.27} \]

(원문 typo 가능성 — \(\lambda_{ijq}\) 와 \(\theta_i\) 의 일관성 확인 필요.)

직관 — Marginal Score 의 구조

식 (12.26-12.27) 의 핵심 구조:

\[ \frac{\partial \log L}{\partial \theta} = \sum_i \frac{1}{h(y_i)} \sum_q [\text{conditional score at } B_q] \cdot \ell(y_i \mid B_q) \cdot A(B_q) \]

해석:

각 노드 \(B_q\) 에서 conditional score 계산.
Conditional likelihood \(\ell\) 와 weight \(A(B_q)\) 로 가중.
\(h(y_i)\) 로 정규화 (사후 분포의 정규화 상수).

→ 사후 분포 (posterior) 에 대한 conditional score 의 가중 평균.

§ 9.6 식 (9.59) 의 Poisson 버전:

형식 동일.
Conditional score 의 형태만 다름 (\(\beta\) score: 잔차 × 공변량, \(\sigma\) score: 잔차 × \(\theta_i\)).
같은 알고리즘 (Fisher scoring + Gauss-Hermite) 으로 추정.

Fisher scoring update (§ 9.6 식 9.47 와 동일):

\[ \Theta^{new} = \Theta^{old} + I^{-1}(\Theta) \cdot \nabla \log L \]

수렴 후 \(I^{-1}\) 가 표준오차 제공.

5 Random Effects 분포 — Normal vs Gamma

5.1 Normal Random Effects (Default)

정규 가정의 표준성

식 (12.20) 의 \(\theta_i \sim \mathcal{N}(0, 1)\) — 표준 가정.

장점:

다변량 확장 자연: \(\mathcal{N}_r(0, \Sigma_v)\) 로 다중 random effects.
Cholesky factorization (§ 9.5.2): 양정치 자동 보장.
Gauss-Hermite quadrature: 정규 분포에 최적화된 노드/가중치.
회귀 계수와 같은 척도: log-rate 척도에서 additive — 해석 단순.

Longford (1994) 의 정규 권고

저자 본문 인용 (§ 12.4.3 에서):

“the normal is the only well-established multivariate distribution with a full range of correlation structures” for models with multiple random effects.

해석:

다변량 정규: 임의의 상관 구조 (\(\Sigma_v\)) 표현 가능.
다변량 gamma, beta 등: 상관 구조 제한 (positive correlation 만, 등).
→ 다중 random effects 모형에서 정규가 사실상 유일한 선택.

Preisler (1989) 의 추가 근거:

“interpretation of parameters is more straightforward because the random and fixed effects are on the same scale.”

→ Random effects 가 fixed effects 와 같은 log 척도 → 직접 비교, 합산 가능.

5.2 Gamma Random Effects (Alternative)

Gamma 의 매력 — Closed Form

\(\upsilon_i \sim \text{Gamma}(\alpha, \beta)\) 가정 시:

Poisson + Gamma → Negative Binomial 의 marginal distribution.
즉 적분 (식 12.24) 이 closed form 으로 계산 가능 (Gauss-Hermite quadrature 불필요).

이것이 표준 NB regression 의 토대 — Poisson + Gamma random effects.

직관 — Gamma vs Normal 의 trade-off

Gamma 의 장점:

Closed form integration → 빠른 추정.
Numerical quadrature 불필요.
NB regression 의 표준 framework.

Gamma 의 단점:

다중 random effects 의 다변량 gamma 정의 어려움.
회귀 계수와 다른 척도 — 해석 단순화 어려움.
Random effect 분산이 NB 의 dispersion 과 confounded.

Siddiqui (1996) 의 비교:

Single random effect: gamma 가 더 빠름 (closed form 이점).
Multiple random effects: normal 만 실용적 (다변량 gamma 어려움).
결론: single → gamma, multiple → normal.

현대 (2026) 의 표준:

R MASS::glm.nb, glmmTMB(family = nbinom1/2): Gamma random effect (NB).
R lme4::glmer(family = poisson) + random effects: Normal.
→ 두 framework 동시 활용 — 모형 비교에 유용.

Inverse Gaussian (Lawless-Willmot 1989):

또 다른 alternative — 매우 right-skewed 효과.
실무에서 거의 안 씀 (특수 응용 외).

6 §9.6 Toolkit 의 Poisson 적용 — 종합

§9, §10, §11, §12 의 통합 framework

§ 9.6 의 toolkit 이 모든 GLMM 에 직접 적용:

Chapter	Conditional likelihood	Marginal MLE	추정
§ 9 (Bernoulli)	식 (9.39): \(\prod \Psi^Y (1-\Psi)^{1-Y}\)	식 (9.40)	Gauss-Hermite
§ 10 (Cumulative ordinal)	식 (10.16): cumulative diff	식 (10.18)	Gauss-Hermite
§ 11 (Multinomial)	식 (11.8): multinomial	식 (11.9)	Gauss-Hermite
§ 12 (Poisson)	식 (12.19): Poisson 곱	식 (12.24)	Gauss-Hermite

→ 모두 같은 framework, conditional likelihood 의 형태만 다름.

Empirical Bayes (§ 12.4.2 다음 sub-post):

모든 chapter 에서 같은 식 (식 9.48-9.49 의 일반화).
환자별 random effect 추정.

Cholesky 미분 (§ 9.6.2):

다중 random effects 의 행렬 미분 동일.
Magnus (1988) elimination matrix 활용.

Adaptive Gauss-Hermite quadrature:

다차원 random effects 의 차원 폭발 처리.
Poisson 에도 직접 적용.

핵심 메시지:

새로운 알고리즘 학습 불필요.
Conditional likelihood 형태만 응답에 맞춰 변경.
같은 toolkit 으로 모든 GLMM 통합 분석.

7 응용 분야

분야	카운트 응답	Random effects
의료 이용	매년 진료 횟수	환자 (longitudinal)
자살 예방	County 별 자살 수	County (clustered)
흡연 행동	매주 흡연 횟수	환자 (longitudinal)
항생제 사용	병원별 처방 횟수	Hospital (clustered)
사고 발생	도로별 사고 수	도로 segment (clustered)
가족 크기	가족별 자녀 수	지역 (clustered)
학업 결석	학생별 결석 일수	학생 + 학교 (3-level)
동물 행동	개체별 행동 빈도	개체 (longitudinal)

8 코드 예시

8.1 Step 1: 시뮬레이션 데이터 생성

import numpy as np
import pandas as pd


def simulate_mixed_poisson(n_subjects: int, n_times: int,
                            beta: np.ndarray, sigma: float,
                            seed: int = 2026) -> pd.DataFrame:
    """식 (12.19-12.20) 의 mixed-effects Poisson 시뮬레이션.

    lambda_ij = exp(x'β + σ θ_i)
    y_ij ~ Poisson(lambda_ij)
    """
    rng = np.random.default_rng(seed)
    rows = []

    for i in range(n_subjects):
        # 환자별 random effect (식 12.20)
        theta_i = rng.normal()
        for j in range(n_times):
            t = j  # 시간
            x_ij = rng.normal()  # 시간 가변 covariate
            X_ij = np.array([1.0, t, x_ij])  # 절편 + 시간 + 공변량
            eta_ij = X_ij @ beta + sigma * theta_i
            lam_ij = np.exp(eta_ij)
            y_ij = rng.poisson(lam_ij)
            rows.append({"subject": i, "time": t, "x": x_ij,
                         "y": y_ij, "lambda": lam_ij,
                         "theta": theta_i})

    return pd.DataFrame(rows)


# 시뮬레이션 — 200 환자, 5 시점, sigma = 0.5
beta_true = np.array([0.5, 0.1, 0.3])  # 절편, 시간, 공변량
df = simulate_mixed_poisson(n_subjects=200, n_times=5, beta=beta_true, sigma=0.5)

print(f"전체 데이터: {len(df)} 행, {df['subject'].nunique()} 환자")
print(f"\n시점별 평균:")
print(df.groupby('time')['y'].mean().round(2))
print(f"\n환자별 평균 (처음 5 환자):")
print(df.groupby('subject')['y'].mean().head().round(2))

시뮬레이션 검증

\(\sigma = 0.5\) → 환자별 \(\lambda\) 가 평균 대비 \(\exp(0.5) \approx 1.65\) 배 ~ \(\exp(-0.5) \approx 0.61\) 배.
시점별 평균이 시간에 따라 \(\exp(0.1)\) 배씩 증가.
환자별 평균 차이가 random effect 의 결과.

이 데이터에 표준 Poisson regression 적용하면:

회귀 계수 추정은 비슷.
표준오차 과소 추정 (clustering 무시).
→ Mixed-effects Poisson 적용 필요.

8.2 Step 2: Mixed-Effects Poisson MLE 직접 구현 (Gauss-Hermite)

import numpy as np
from scipy.stats import poisson
from scipy.optimize import minimize
from numpy.polynomial.hermite_e import hermegauss


def neg_log_lik_mixed_poisson(theta_vec: np.ndarray, df: pd.DataFrame,
                                Q: int = 10) -> float:
    """식 (12.24) 의 marginal log-likelihood 의 negation.

    theta_vec: (beta, log_sigma) — log_sigma 로 양수 보장.
    """
    p = 3  # 절편 + time + x
    beta = theta_vec[:p]
    sigma = np.exp(theta_vec[p])  # 양수 보장

    # Gauss-Hermite quadrature 노드와 가중치 (정규 분포)
    nodes, weights = hermegauss(Q)
    weights = weights / np.sqrt(np.pi)  # 정규 분포 정규화

    log_L = 0.0
    for subject_id in df['subject'].unique():
        df_sub = df[df['subject'] == subject_id]
        X_sub = df_sub[['time', 'x']].values
        X_sub = np.column_stack([np.ones(len(df_sub)), X_sub])
        y_sub = df_sub['y'].values

        # 식 (12.24): marginal likelihood = sum over quadrature
        h_yi = 0.0
        for q in range(Q):
            B_q = np.sqrt(2) * nodes[q]  # 표준 정규로 변환
            # 식 (12.25)
            lam_ijq = np.exp(X_sub @ beta + sigma * B_q)
            # Conditional likelihood
            log_lik_q = np.sum(poisson.logpmf(y_sub, lam_ijq))
            h_yi += np.exp(log_lik_q) * weights[q]
        log_L += np.log(h_yi + 1e-300)

    return -log_L


# 적합
theta0 = np.array([0.0, 0.0, 0.0, np.log(0.5)])  # 초기값
result = minimize(neg_log_lik_mixed_poisson, theta0, args=(df, 10),
                  method='BFGS')

print(f"Converged: {result.success}")
beta_est = result.x[:3]
sigma_est = np.exp(result.x[3])
print(f"Estimated beta:  {beta_est.round(3)}  (true: {beta_true})")
print(f"Estimated sigma: {sigma_est:.3f}  (true: 0.5)")

Marginal MLE 검증

위 코드가 식 (12.24) 의 직접 구현:

각 환자별 marginal likelihood = Gauss-Hermite quadrature.
표본 전체 log-likelihood 합 → BFGS 최소화.
추정값이 진짜 모수에 가까움 → 알고리즘 정확.

계산 효율 향상:

환자별 loop → vectorize (numpy broadcasting).
Adaptive Gauss-Hermite (큰 random effect 분산에서 더 정확).
Multi-core parallelization.

대안 — R 의 표준 패키지:

library(lme4)
fit <- glmer(y ~ time + x + (1 | subject), data = df, family = poisson)
summary(fit)

lme4::glmer 가 위 코드와 같은 marginal MLE 수행. nAGQ 옵션으로 quadrature 점 수 조정.

8.3 Step 3: Empirical Bayes Random Effects 추정 (다음 sub-post 미리보기)

def empirical_bayes_theta(df_sub: pd.DataFrame, beta: np.ndarray,
                           sigma: float, Q: int = 50) -> tuple:
    """§12.4.2 의 식 (12.28-12.29) — EAP estimator.

    환자별 random effect 의 사후 평균 + 분산.
    """
    nodes, weights = hermegauss(Q)
    weights = weights / np.sqrt(np.pi)

    X_sub = df_sub[['time', 'x']].values
    X_sub = np.column_stack([np.ones(len(df_sub)), X_sub])
    y_sub = df_sub['y'].values

    # 분자: ∫ θ ℓ(y|θ) g(θ) dθ
    numer = 0.0
    denom = 0.0
    for q in range(Q):
        B_q = np.sqrt(2) * nodes[q]
        lam_ijq = np.exp(X_sub @ beta + sigma * B_q)
        log_lik_q = np.sum(poisson.logpmf(y_sub, lam_ijq))
        ell_q = np.exp(log_lik_q)
        numer += B_q * ell_q * weights[q]
        denom += ell_q * weights[q]

    theta_hat = numer / denom

    # 사후 분산
    var_numer = 0.0
    for q in range(Q):
        B_q = np.sqrt(2) * nodes[q]
        lam_ijq = np.exp(X_sub @ beta + sigma * B_q)
        log_lik_q = np.sum(poisson.logpmf(y_sub, lam_ijq))
        ell_q = np.exp(log_lik_q)
        var_numer += (B_q - theta_hat) ** 2 * ell_q * weights[q]
    var_theta = var_numer / denom

    return theta_hat, var_theta


# 환자 5 명의 EB 추정
print("환자별 EB random effect 추정 (처음 5 명):")
for sid in df['subject'].unique()[:5]:
    df_sub = df[df['subject'] == sid]
    theta_hat, var_theta = empirical_bayes_theta(df_sub, beta_est, sigma_est)
    theta_true = df_sub['theta'].iloc[0]
    print(f"  Subject {sid}: theta_hat = {theta_hat:+.3f} (true: {theta_true:+.3f}), "
          f"SE = {np.sqrt(var_theta):.3f}")

EB Shrinkage 검증

EB 추정값과 진짜 \(\theta_i\) 비교:

데이터 많은 환자 (시점 많음) → EB 추정이 진짜에 가까움.
데이터 적은 환자 → EB 가 0 (사전 평균) 으로 수축.

이 EB 추정이 § 12.4.2 의 식 (12.28-12.29) 의 직접 구현. 다음 sub-post 에서 깊이 다룸.

활용:

환자별 expected count 예측: \(\hat\lambda_i = \exp(x'\hat\beta + \hat\sigma \hat\theta_i)\).
“high utilizer” 식별 (예: \(\hat\theta_i > 1\)).
임상적 의사결정 지원.

8.4 Step 4: R 의 표준 적합 + Diagnostic

library(lme4)
library(performance)


# 표준 Poisson (random effects 무시)
fit_poisson <- glm(y ~ time + x, data = df, family = poisson)

# Mixed-effects Poisson
fit_mep <- glmer(y ~ time + x + (1 | subject), data = df, family = poisson,
                 nAGQ = 10)  # adaptive Gauss-Hermite, 10 points

# 비교
summary(fit_poisson)
summary(fit_mep)

# LR test (random effects 의 유의성)
# 표준 Poisson 의 deviance vs mixed-effects 의 -2 log L
# Boundary 보정 필요 (sigma = 0 가 boundary)

# Overdispersion 진단
check_overdispersion(fit_poisson)
check_overdispersion(fit_mep)

R lme4::glmer vs 직접 구현

lme4::glmer(family = poisson) 가 위 Python 직접 구현보다:

빠름 (컴파일된 C++ 코드).
안정 (수치 안정성 검증된).
다중 random effects + 복잡한 모형 지원.

옵션:

nAGQ = 1: Laplace approximation (빠르지만 정확도 낮음).
nAGQ > 1: Adaptive Gauss-Hermite quadrature (정확).
단, nAGQ > 1 은 single random effect 만 지원 (다중은 glmmTMB 또는 Bayesian).

대안 패키지:

glmmTMB: 빠름 + 다중 random effects + Laplace, NB, ZIP.
MASS::glmmPQL: 옛 표준, PQL 근사 (편향 가능).
brms: Bayesian, 가장 유연.

Diagnostic:

performance::check_overdispersion: Pearson dispersion 진단.
DHARMa package: 잔차 plot + 시뮬레이션 기반 진단.
lme4::ranef(fit) : EB random effects 추출.

9 관련 주제

선행 지식

Ch.12 Overview — Counts GLMM 의 큰 그림
§ 12.1 ~ 12.2 — Poisson + Modified Poisson family
§ 12.3 — ZIP 의 깊이 (random effects 부재 한계)
§ 9.6 추정 — Marginal MLE + Gauss-Hermite (직접 적용)
§ 9.6.2 Cholesky 미분 — 다중 random effects 표기
§ 9.6.3 Gauss-Hermite quadrature — 적분 근사

후속 주제 (Ch.12 sub-posts)

§ 12.4.2 — Empirical Bayes Random Effects (식 12.28-12.29)
§ 12.4.3 — Mixed-Effects ZIP (식 12.30-12.33)
§ 12.5 — Suicide rate illustration (Gibbons et al. 2005)
§ 12.6 — Ch.12 Summary

관련 개념

Breslow (1984) — Poisson with normal random effects 원전
Lawless & Willmot (1989) — Inverse Gaussian random effects
Goldstein (1991) — Multilevel log-linear
Thall (1988), Stukel (1993), Albert (1992) — 다양한 응용
Siddiqui (1996) — Normal vs gamma 비교
Bock & Aitkin (1981) — Marginal MLE + Gauss-Hermite 원전
Longford (1994) — Normal 권고
Preisler (1989) — Random effects 해석
McCullagh & Nelder (1989) — GLM canonical link
Statistics 음이항 분포 — Poisson + Gamma random effect = NB