Kwangmin Kim - Klein § 3.5~3.6 — Likelihood Construction and Counting Processes

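1 Introduction: Where the Two Frameworks Sit

Where Ch.3 § 3.2~3.4 classify censoring and truncation and build intuition, § 3.5~3.6 present the two frameworks that turn that classification into tools for mathematical inference.

§ 3.5~3.6 in one line

“§ 3.5: every form of censoring and truncation is a special case of the single likelihood (3.5.1). § 3.6: the same likelihood is derived again, more powerfully, within the counting process martingale framework, which handles the asymptotics of KM, NA, Cox, and log-rank in a unified way.”

Framework	Target	Strength	Use in Klein
§ 3.5 Classical likelihood	parametric MLE	intuitive derivation, exponential family	Ch.12
§ 3.6 Counting process	nonparametric/semiparametric	asymptotics, unifies many tools	all of Ch.4~11

Both frameworks lead to the same likelihood, but the counting process framework is far more general. This part works through both in detail.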

2 § 3.5 — Likelihood Construction

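2.1 Key Assumption

Critical Assumption: Independence

The lifetime \(X\) and the censoring time \(C_r\) are independent.

If this assumption fails, everything from (3.5.1) through (3.5.6) is invalid. This is the same problem as the non-identifiability result of Tsiatis (1975) discussed in § 3.2.

Checking: the assumption can only be judged from domain knowledge (why was the observation censored?). It cannot be verified from the observed data.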

2.2 Master Equation (3.5.1): the Unified Likelihood

A single expression for all censoring types

\[ L \propto \underbrace{\prod_{i \in D} f(x_i)}_{\text{exact}} \cdot \underbrace{\prod_{i \in R} S(C_{r,i})}_{\text{right}} \cdot \underbrace{\prod_{i \in L} [1 - S(C_{l,i})]}_{\text{left}} \cdot \underbrace{\prod_{i \in I} [S(L_i) - S(R_i)]}_{\text{interval}} \tag{3.5.1} \]

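Each individual contributes to the likelihood according to how much information its observation carries:

Information type	Contribution	Probabilistic meaning
exact	\(f(x_i)\)	“the event occurs exactly at \(x_i\)”
right cens	\(S(C_{r,i})\)	“still event-free at \(C_{r,i}\)”
left cens	\(1 - S(C_{l,i})\)	“the event occurred somewhere before \(C_{l,i}\)”
interval	\(S(L_i) - S(R_i)\)	“the event occurred somewhere in \((L_i, R_i]\)”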
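The four contributions in the table can be sketched as a single log-likelihood function. This is a minimal illustration that assumes an exponential model, \(S(t) = e^{-\lambda t}\); the function names, the toy data, and the grid search are all hypothetical choices made for the example and do not come from Klein.

```python
import math

def exp_surv(t, lam):
    """Survival function S(t) = exp(-lam * t) of an exponential lifetime."""
    return math.exp(-lam * t)

def log_lik(lam, exact=(), right=(), left=(), interval=()):
    """Log of (3.5.1) under an assumed exponential model.

    exact: event times x_i; right: C_r values; left: C_l values;
    interval: (L_i, R_i) pairs.
    """
    ll = 0.0
    for x in exact:            # log f(x) = log(lam) - lam * x
        ll += math.log(lam) - lam * x
    for c in right:            # log S(C_r)
        ll += -lam * c
    for c in left:             # log[1 - S(C_l)]
        ll += math.log(1.0 - exp_surv(c, lam))
    for lo, hi in interval:    # log[S(L) - S(R)]
        ll += math.log(exp_surv(lo, lam) - exp_surv(hi, lam))
    return ll

# Hypothetical toy data with every kind of observation
data = dict(exact=[2.0, 5.0], right=[6.0], left=[1.0], interval=[(3.0, 4.0)])

# Crude grid search for the maximizer (adequate for a one-parameter sketch)
grid = [k / 1000 for k in range(1, 2000)]
lam_hat = max(grid, key=lambda lam: log_lik(lam, **data))
print(round(lam_hat, 3))
```

With only exact and right-censored observations, the same function is maximized at events divided by total time, which gives a quick sanity check on the sketch.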

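2.3 Adding Truncation

Correcting the denominator

Truncation interval \((Y_{L,i}, Y_{R,i})\), where individual \(i\) enters the sample only if its event time falls in this interval:

\(f(x_i) \to f(x_i) / [S(Y_{L,i}) - S(Y_{R,i})]\)
\(S(C_i) \to S(C_i) / [S(Y_{L,i}) - S(Y_{R,i})]\)

Left truncation only is the case \(Y_{R,i} = \infty\), where the denominator reduces to \(S(Y_{L,i})\).

Right truncation only:

\[ L \propto \prod_i \frac{f(Y_i)}{1 - S(Y_i)} \]

Denominator = the probability that the individual's event falls inside its truncation interval, that is, the probability of being sampled at all. Dividing by it turns each contribution into a conditional probability and corrects the sampling bias.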
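As a small sketch of the denominator correction, the code below applies left truncation (delayed entry) to an exponential model, chosen here only because it gives a closed form. Dividing each contribution by \(S(Y_L) = e^{-\lambda Y_L}\) subtracts the time before entry from the exposure. The data and function names are hypothetical.

```python
import math

def log_lik_left_trunc(lam, times, events, entries):
    """Exponential log-likelihood with delayed entry Y_L (left truncation).

    Each contribution f(t)^delta * S(t)^(1-delta) is divided by S(Y_L).
    """
    ll = 0.0
    for t, d, y in zip(times, events, entries):
        ll += d * math.log(lam) - lam * t   # log of f^delta * S^(1-delta)
        ll -= -lam * y                      # minus log S(Y_L)
    return ll

# Hypothetical data: observed time, event indicator, entry (truncation) time
times = [5.0, 8.0, 3.0, 9.0]
events = [1, 0, 1, 1]
entries = [1.0, 2.0, 0.0, 4.0]

# Closed-form maximizer for this model: events divided by the time
# actually spent under observation, sum of (t_i - Y_L_i)
lam_closed = sum(events) / sum(t - y for t, y in zip(times, entries))
print(round(lam_closed, 4))
```

Ignoring the entry times would use \(\sum t_i\) as exposure and understate the hazard, which is the sampling bias the denominator removes.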

2.4 Explicit Derivation for Type I Censoring (Klein § 3.5)

2.4.1 Contribution when \(\delta = 0\)

\(\delta_i = 0\) means \(X_i > C_r\):

\[ \Pr[T_i = C_r, \delta_i = 0] = \Pr[T_i = C_r \mid \delta_i = 0] \Pr[\delta_i = 0] = \Pr[X_i > C_r] = S(C_r) \]

(Middle equality: when \(\delta_i = 0\), \(T_i = C_r\) holds with probability 1, so the conditional probability equals 1.)

2.4.2 Contribution when \(\delta = 1\)

\(\delta_i = 1\) means \(X_i \leq C_r\):

\[ \begin{aligned} \Pr[T_i = t, \delta_i = 1] &= \Pr[X_i = t \mid X_i \leq C_r] \Pr[X_i \leq C_r] \\ &= \frac{f(t)}{1 - S(C_r)} \cdot [1 - S(C_r)] \\ &= f(t) \end{aligned} \]

2.4.3 Combining the Two Cases

\[ \Pr[t, \delta] = [f(t)]^\delta [S(t)]^{1-\delta} \]

Full likelihood (3.5.3):

\[ L = \prod_{i=1}^n [f(t_i)]^{\delta_i} [S(t_i)]^{1-\delta_i} = \prod_{i=1}^n [h(t_i)]^{\delta_i} \exp[-H(t_i)] \]

(Second form: uses \(f = h \cdot S\) and \(S = e^{-H}\).)

Why the hazard form matters

\[ L = \prod_i h(t_i)^{\delta_i} \exp[-H(t_i)] \]

This form is the starting point for deriving the Cox PH partial likelihood (Klein Ch.8). It makes explicit that the likelihood is a function of the hazard and the cumulative hazard.

The counting process likelihood of § 3.6 is exactly this expression.

2.4.4 Closed Form for the Exponential (3.5.4)

\(f = \lambda e^{-\lambda x}\), \(S = e^{-\lambda x}\):

\[ L_I = \prod_i (\lambda e^{-\lambda t_i})^{\delta_i} (e^{-\lambda t_i})^{1-\delta_i} = \lambda^r \exp[-\lambda S_T] \]

\(r = \sum \delta_i\): number of events.
\(S_T = \sum t_i\): total observed time.

MLE: “number of events / total time”

Log-likelihood: \(\ell = r \log \lambda - \lambda S_T\).

\(\partial \ell / \partial \lambda = r/\lambda - S_T = 0 \Rightarrow \hat{\lambda} = r/S_T\).

Intuition: hazard = event rate = (number of events) / (total time at risk).

Observed information: \(-\partial^2 \ell / \partial \lambda^2 = r/\lambda^2\). Hence \(\text{Var}(\hat{\lambda}) \approx \lambda^2/r\), estimated by \(\hat{\lambda}^2/r\).

Key point: precision is determined by the number of events \(r\), not by the sample size \(n\). This is why clinical trial designs compute the required number of events in advance.
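The closed form above can be checked on a few numbers. The sample below is hypothetical, and the log-scale interval is an added convenience that relies on the delta-method approximation \(\text{Var}(\log \hat{\lambda}) \approx 1/r\); it is not taken from Klein.

```python
import math

# Hypothetical right-censored sample: (time, delta), delta = 1 for an event
sample = [(2.0, 1), (3.5, 0), (1.2, 1), (6.0, 0), (4.1, 1), (0.7, 1)]

r = sum(d for _, d in sample)      # number of events
S_T = sum(t for t, _ in sample)    # total time at risk
lam_hat = r / S_T                  # MLE from the score equation
se = lam_hat / math.sqrt(r)        # sqrt(lam_hat^2 / r)

# Wald interval on the log scale, where Var(log lam_hat) is about 1 / r
lo = lam_hat * math.exp(-1.96 / math.sqrt(r))
hi = lam_hat * math.exp(+1.96 / math.sqrt(r))
print(f"lam_hat={lam_hat:.4f}, se={se:.4f}, 95% CI=({lo:.4f}, {hi:.4f})")
```

The standard error depends on the data only through \(\hat{\lambda}\) and \(r\): adding censored subjects changes \(S_T\) but not the relative precision \(1/\sqrt{r}\).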

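2.5 Type II Censoring (3.5.7)

Joint density of the first \(r\) order statistics:

\[ L_{II,1} = \frac{n!}{(n-r)!} \prod_{i=1}^r f(x_{(i)}) \cdot [S(x_{(r)})]^{n-r} \]

Derivation

\(\prod f(x_{(i)})\): the first \(r\) event times.
\([S(x_{(r)})]^{n-r}\): probability that the remaining \(n-r\) individuals are still alive at \(x_{(r)}\).
\(n!/(n-r)!\): number of ordered ways to choose which of the \(n\) individuals fail first, second, ..., \(r\)-th (a permutation count, not a combination count).

The constant \(n!/(n-r)!\) does not affect inference, so the likelihood is proportional to (3.5.1).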

2.6 Random Censoring (Klein Example 3.10)

2.6.1 Joint Distribution

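Assume \(X \perp C_r\). \(X\) has density \(f\) and survival function \(S\); \(C_r\) has density \(g\) and survival function \(G(t) = \Pr[C_r > t]\).

\[ \Pr[T_i = t, \delta_i = 0] = \frac{d}{dt} \int_0^t \int_v^\infty f(u) g(v)\, du\, dv = \frac{d}{dt} \int_0^t S(v) g(v)\, dv = S(t) g(t) \]

(The inner integral over \(u\) gives \(S(v)\); the derivative then follows from the fundamental theorem of calculus.)

Likewise:

\[ \Pr[T_i = t, \delta_i = 1] = f(t) G(t) \]

2.6.2 Likelihood (3.5.5)

\[ L = \prod_i [f(t_i) G(t_i)]^{\delta_i} [g(t_i) S(t_i)]^{1-\delta_i} \]

2.6.3 Factoring Out Non-informative Censoring

Origin of (3.5.6)

\[ L = \underbrace{\left\{\prod_i G(t_i)^{\delta_i} g(t_i)^{1-\delta_i}\right\}}_{\text{censoring distribution}} \times \underbrace{\left\{\prod_i f(t_i)^{\delta_i} S(t_i)^{1-\delta_i}\right\}}_{\text{part of interest}} \]

If \(G\) does not involve the parameters of \(f\), the first factor is a constant and the likelihood becomes

\[ L \propto \prod_i [f(t_i)]^{\delta_i} [S(t_i)]^{1-\delta_i} \]

This is exactly (3.5.3). In other words, non-informative random censoring leads to the same inference as Type I censoring.

Conditions:

\(X \perp C_r\) (independence).
\(G\) does not depend on the parameters of \(f\).

If independence fails, the standard KM, NA, and Cox estimators are generally biased. If only the second condition fails, dropping the first factor still gives valid inference but discards the information about the parameters carried by the censoring distribution.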
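A short simulation can illustrate the factorization: under independent censoring, \(\hat{\lambda} = r/S_T\) recovers the lifetime hazard without ever modeling \(G\). The rates, the sample size, and the exponential form of the censoring distribution are assumptions made only for this sketch.

```python
import random

random.seed(0)
lam_true, cens_rate, n = 0.5, 0.3, 20000   # assumed values for the illustration

r, S_T = 0, 0.0
for _ in range(n):
    x = random.expovariate(lam_true)     # lifetime X
    c = random.expovariate(cens_rate)    # censoring time C_r, independent of X
    t, delta = min(x, c), int(x <= c)    # observed pair (T, delta)
    r += delta
    S_T += t

# Uses only the f^delta * S^(1-delta) factor; G never enters the estimate
lam_hat = r / S_T
print(round(lam_hat, 3))
```

Changing `cens_rate` changes how many events are observed, and therefore the precision, but not the value the estimate settles around.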

2.6.4 Theoretical Note 3: the Dependent Case

When \(X\) and \(C_r\) are dependent, with joint survival function \(S(x, c) = \Pr[X > x, C_r > c]\):

\[ L_{III} \propto \prod_i \left\{[-\partial S(x, t_i)/\partial x]_{x=t_i}\right\}^{\delta_i} \left\{[-\partial S(t_i, c)/\partial c]_{c=t_i}\right\}^{1-\delta_i} \]

This can differ substantially from (3.5.6). Standard remedies for informative censoring: IPCW (Robins-Rotnitzky 1992) or sensitivity analysis.

2.7 Progressive Type II Censoring (Klein Theoretical Note 2)

The likelihood is more involved: the first \(r_1\) events, then \(n_1\) survivors removed (sacrificed), then the next \(r_2\) events, and so on:

\[ L_{II,2} \propto \prod_{i=1}^{r_1} f(x_{(i)}) \cdot [S(x_{(r_1)})]^{n_1} \cdot \prod_{i=1}^{r_2} f(x^*_{(i)}) \cdot [S(x^*_{(r_2)})]^{n - n_1 - r_1 - r_2} \]

Here \(x^*\) denotes realizations from the truncated distribution (conditional on \(x \geq x_{(r_1)}\)). After the first-stage sacrifice, the remaining sample follows a left-truncated distribution.

2.8 Regression: Individual-Specific Distributions (3.5.2)

Each individual has its own distribution \(f_i, S_i\), depending on the covariate \(Z_i\):

\[ L = \prod_{i \in D} f_i(x_i) \prod_{i \in R} S_i(C_{r,i}) \prod_{i \in L} [1 - S_i(C_{l,i})] \prod_{i \in I} [S_i(L_i) - S_i(R_i)] \]

Origin of the regression models

This expression is the source of the likelihood for the regression models: Cox PH (Klein Ch.8), the Aalen additive model (Ch.10), and parametric AFT (Ch.12).

Cox: \(S_i(t) = S_0(t)^{\exp(\beta'Z_i)}\) → reduces to the partial likelihood.
AFT: \(S_i(t) = S_0(t e^{-\gamma'Z_i})\) → direct MLE.
Aalen: \(h_i(t) = h_0(t) + Z_i'\beta(t)\) → weighted least squares.

All three models are different parameterizations of (3.5.2).

3 § 3.6 — Counting Processes (Aalen 1975)

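3.1 Why a New Framework Is Needed

The likelihood of § 3.5 is powerful for parametric estimation. However:

Nonparametric KM, NA, log-rank: hard to derive from a likelihood.
Proving asymptotic normality: requires order-statistic or Glivenko-Cantelli type tools.
Cox partial likelihood: not a standard likelihood.
Multistate models, time-dependent covariates: cumbersome to handle.

Aalen's 1975 revolution

Stochastic processes + martingale theory + counting processes → one framework for all nonparametric and semiparametric tools.

KM, NA, log-rank: all stochastic integrals.
Cox partial likelihood: a counting process likelihood.
Asymptotic normality: handled uniformly by the martingale CLT.
Time-dependent covariates: handled naturally (\(Z(t)\)).
Multistate models: multivariate extension of the counting process.

This is the mathematical origin of modern survival analysis: the 20-year revolution of 1975~1995.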

3.2 Counting Process \(N(t)\)

Definition

A stochastic process \(N(t), t \geq 0\) is a counting process if:

\(N(0) = 0\)
\(N(t) < \infty\) a.s.
Sample paths: right-continuous, piecewise constant, with jumps of size +1 only (increasing one at a time).

For right-censored data:

\[ N_i(t) = I[T_i \leq t, \delta_i = 1] = \begin{cases} 0 & \text{no event observed for individual } i \text{ by time } t \\ 1 & \text{event observed for individual } i \text{ at } T_i \leq t \end{cases} \]

\[ N(t) = \sum_{i=1}^n N_i(t) \]

The cumulative number of events observed up to time \(t\).

Intuition

\(N(t)\) is a step function observed directly from the data. It jumps by +1 whenever an event occurs.

3.3 History (Filtration) \(\mathcal{F}_t\)

Definition

The σ-algebra of all information known up to time \(t\). If \(s \leq t\) then \(\mathcal{F}_s \subset \mathcal{F}_t\): information only accumulates.

Right-censored data:

\[ \mathcal{F}_t = \sigma\left(\{(T_i, \delta_i) : T_i \leq t\} \cup \{T_i > t : \text{for each } i\} \cup \{Z_i\}\right) \]

The outcomes of individuals whose event or censoring occurred by time \(t\).
The fact that the remaining individuals are still under follow-up at time \(t\).
Baseline covariates \(Z_i\), plus time-dependent covariates \(Z_i(s), s \leq t\).

3.4 At-Risk Process \(Y(t)\)

\[ Y(t) = \sum_{i=1}^n I[T_i \geq t] \]

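Size of the risk set at time \(t\): the number of individuals who have experienced neither the event nor censoring before \(t\).

Intuition

\(Y(0) = n\) (everyone starts at risk).
An event or censoring at time \(t\) lowers \(Y\) by 1 immediately after \(t\).
\(Y(\infty) = 0\) (everyone eventually leaves the risk set).

\(Y(t)\) is a left-continuous, non-increasing step function. Its value is known from \(\mathcal{F}_{t^-}\) (predictable).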
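The definitions of \(N(t)\) and \(Y(t)\) translate directly into code. The five-subject sample below is hypothetical; the point of the sketch is the continuity convention, \(T_i \leq t\) for \(N\) against \(T_i \geq t\) for \(Y\).

```python
# Hypothetical right-censored sample: (T_i, delta_i)
data = [(2, 1), (3, 0), (3, 1), (5, 1), (7, 0)]

def N(t):
    """Observed events up to and including t (right-continuous)."""
    return sum(1 for T, d in data if T <= t and d == 1)

def Y(t):
    """Individuals still under observation just before t (left-continuous)."""
    return sum(1 for T, _ in data if T >= t)

for t in [0, 2, 3, 4, 5, 7, 8]:
    print(t, N(t), Y(t))
```

At \(t = 3\) the subject who fails and the subject who is censored both still count in \(Y(3)\), while the event already counts in \(N(3)\); this is what makes \(Y\) predictable and \(N\) adapted.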

3.5 Intensity Process \(\lambda(t)\): Equation (3.6.2)

Core definition

\[ E[dN(t) \mid \mathcal{F}_{t^-}] = Y(t) h(t)\, dt = \lambda(t)\, dt \]

where:

\(Y(t)\): size of the risk set at time \(t\) (random, but known from \(\mathcal{F}_{t^-}\)).
\(h(t)\): population hazard (the parameter).
\(\lambda(t) = Y(t) h(t)\): intensity of events at time \(t\).

3.5.1 Derivation

Equation (3.6.1):

\[ \Pr[t \leq T_i \leq t + dt, \delta_i = 1 \mid \mathcal{F}_{t^-}] = \begin{cases} h(t)\, dt & \text{if } T_i \geq t \\ 0 & \text{if } T_i < t \end{cases} \]

With \(X \perp C_r\) and \(X\) having hazard \(h\):

\[ \Pr[t \leq X_i \leq t + dt, C_i > t + dt \mid X_i \geq t, C_i \geq t] = h(t)\, dt + o(dt) \]

(The probability that censoring does not occur before \(t + dt\) is \(1 - O(dt)\), so it changes the product only at order \(o(dt)\).)

Summing over individuals:

\[ E[dN(t) \mid \mathcal{F}_{t^-}] = \sum_i I[T_i \geq t] \cdot h(t) dt = Y(t) h(t)\, dt \]

Interpretation

With \(Y(t) = 100\) and \(h(t) = 0.05\)/year, the intensity is \(100 \times 0.05 = 5\) events/year, so the expected number of events in the next short interval \(dt\) is \(5\, dt\).

“If \(Y\) individuals are at risk and the hazard is \(h\), events arrive at rate \(Y h\) in the next instant: here, 5 per year.”

3.6 Compensator \(\Lambda(t)\)

Definition

\[ \Lambda(t) = \int_0^t \lambda(s)\, ds = \int_0^t Y(s) h(s)\, ds \]

The predictable part of \(N\). In increment form, \(E[dN(t) \mid \mathcal{F}_{t^-}] = d\Lambda(t)\), and therefore

\[ E[N(t)] = E[\Lambda(t)] \]

Visualization

\(N(t)\): the observed step function (red, +1 jumps).
\(\Lambda(t)\): continuous and predictable (blue, close to linear).
The two move closely together: for large \(n\), \(N(t) \approx \Lambda(t)\).

The difference \(N(t) - \Lambda(t)\) is noise.

3.7 Martingale \(M(t) = N(t) - \Lambda(t)\): Core Definition

The heart of (3.6.3)

\[ M(t) = N(t) - \Lambda(t) \]

Property 1 (mean-zero increments):

\[ E[dM(t) \mid \mathcal{F}_{t^-}] = E[dN(t) \mid \mathcal{F}_{t^-}] - E[\lambda(t) dt \mid \mathcal{F}_{t^-}] = \lambda(t) dt - \lambda(t) dt = 0 \]

Property 2 (martingale property):

\[ E[M(t) \mid \mathcal{F}_s] = M(s) \quad \text{for } s < t \]

Interpretation: \(M(t)\) is the residual obtained by subtracting the predicted events (\(\Lambda\)) from the observed events (\(N\)), that is, the random noise of the model.

3.7.1 Equivalence of the Two Properties

Proof for (3.6.3):

\[ \begin{aligned} E[M(t) \mid \mathcal{F}_s] - M(s) &= E[M(t) - M(s) \mid \mathcal{F}_s] \\ &= E\!\left[\int_s^t dM(u) \mid \mathcal{F}_s\right] \\ &= \int_s^t E\!\left[E[dM(u) \mid \mathcal{F}_{u^-}] \mid \mathcal{F}_s\right] \\ &= \int_s^t E[0 \mid \mathcal{F}_s] = 0 \end{aligned} \]

(Iterated expectations plus Fubini. The differential is already carried by \(dM(u)\), so no extra \(du\) appears.)

Decomposition into two parts

\[ N(t) = \underbrace{\Lambda(t)}_{\text{smooth, predictable}} + \underbrace{M(t)}_{\text{mean-zero noise}} \]

This decomposition is the starting point for all of the statistical inference that follows.

3.8 Predictable Variation \(\langle M \rangle(t)\)

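Definition

The compensator of \(M^2(t)\); that is, \(M^2(t) - \langle M \rangle(t)\) is a martingale.

\(\text{Var}[dM(t) \mid \mathcal{F}_{t^-}] = d\langle M \rangle(t)\).

Derivation: \(dN(t)\) is a 0/1 random variable equal to 1 with probability \(\lambda(t) dt\). Because \(\lambda(t) dt\) is known given \(\mathcal{F}_{t^-}\), \(dM(t)\) has the same conditional variance as \(dN(t)\):

\[ \text{Var}[dN(t) \mid \mathcal{F}_{t^-}] = \lambda(t) dt [1 - \lambda(t) dt] \approx \lambda(t) dt \]

(For small \(dt\), \(\lambda(t) dt\) is small.)

Therefore

\[ \langle M \rangle(t) = \int_0^t \lambda(s) ds = \Lambda(t) \]

(When the data have no ties.)

A curious fact

The local mean of \(N\) equals its local variance, so near any time \(t\) a counting process behaves like a Poisson process.

When the data contain ties, the \([1 - \lambda(t) dt]\) factor can no longer be dropped, and the Bernoulli variance \(\lambda(1-\lambda)\) is used instead.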
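Both properties, \(E[M(t)] = 0\) and \(\text{Var}[M(t)] = E[\Lambda(t)]\), can be checked by Monte Carlo. The sketch below assumes the simplest setting, exponential lifetimes with constant hazard and no censoring, so that \(\Lambda(t) = h \sum_i \min(X_i, t)\); all numerical values are arbitrary choices for the illustration.

```python
import random

random.seed(1)
h, n, t, reps = 0.4, 30, 2.0, 10000   # assumed values for the illustration

M_vals, Lam_vals = [], []
for _ in range(reps):
    X = [random.expovariate(h) for _ in range(n)]   # no censoring, for simplicity
    N_t = sum(1 for x in X if x <= t)               # N(t)
    Lam_t = h * sum(min(x, t) for x in X)           # Lambda(t) = int_0^t Y(u) h du
    M_vals.append(N_t - Lam_t)
    Lam_vals.append(Lam_t)

mean_M = sum(M_vals) / reps
var_M = sum((m - mean_M) ** 2 for m in M_vals) / (reps - 1)
mean_Lam = sum(Lam_vals) / reps
print(round(mean_M, 3), round(var_M, 2), round(mean_Lam, 2))
```

In this setting \(E[\Lambda(t)] = n(1 - e^{-ht})\), so the last two printed numbers should both land near that value, up to simulation error.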

3.9 Stochastic Integral

Definition

If \(K(t)\) is a predictable process (known from \(\mathcal{F}_{t^-}\)), then

\[ \int_0^t K(u) dM(u) \]

is the integral of the predictable process \(K\) with respect to the martingale \(M\).

Property 1: it is itself a martingale.

Property 2: predictable variation:

\[ \left\langle \int_0^t K(u) dM(u) \right\rangle = \int_0^t K(u)^2\, d\langle M \rangle(u) \]

Equation (3.6.4).

Interpretation

A stochastic integral accumulates the noise \(dM\) with the predictable weight \(K\), that is, a weighted sum of noise increments. The result is still mean-zero noise (the martingale property is preserved).

This tool is the source of all the nonparametric estimators.

3.10 Nelson-Aalen Derived as a Stochastic Integral: Equation (3.6.5)

Key equation

From (3.6.2), \(dN(t) = \lambda(t) dt + dM(t) = Y(t) h(t) dt + dM(t)\).

When \(Y(t)\) is nonzero, divide both sides by \(Y(t)\):

\[ \frac{dN(t)}{Y(t)} = h(t)\, dt + \frac{dM(t)}{Y(t)} \tag{3.6.5} \]

3.10.1 Integrating: the Nelson-Aalen Estimator

\(J(u) = I[Y(u) > 0]\), with the convention \(0/0 = 0\).

\[ \int_0^t \frac{J(u)}{Y(u)}\, dN(u) = \int_0^t J(u) h(u)\, du + \int_0^t \frac{J(u)}{Y(u)}\, dM(u) \]

Meaning of the three terms

Left side: \(\hat{H}(t)\), the Nelson-Aalen estimator. For an observed sample:

\[ \hat{H}(t) = \sum_{t_i \leq t} \frac{d_i}{n_i} \]

where \(d_i\) = number of events at time \(t_i\) and \(n_i = Y(t_i)\) = size of the risk set.

First term on the right: \(H^*(t) = \int_0^t J(u) h(u)\, du\), the cumulative hazard over the range where data exist. It coincides with \(H(t)\) whenever the risk set is nonempty throughout \([0, t]\).
Second term on the right: \(W(t) = \int_0^t J(u)/Y(u)\, dM(u)\), the stochastic integral of a predictable process, hence a mean-zero martingale.

3.10.2 Conclusion

\[ \boxed{\hat{H}(t) - H^*(t) = W(t) \text{ is a martingale}} \]

Statistical properties of Nelson-Aalen, obtained automatically

Unbiased for \(H^*\): \(E[\hat{H}(t)] = E[H^*(t)]\) (a martingale starting at 0 has mean 0).
Variance (3.6.4):

\[ \langle W \rangle(t) = \int_0^t \left[\frac{J(u)}{Y(u)}\right]^2 d\langle M \rangle(u) = \int_0^t \frac{J(u)}{Y(u)} h(u)\, du \]

Discrete estimate (Klein § 4.2):

\[ \hat{\text{Var}}[\hat{H}(t)] = \sum_{t_i \leq t} \frac{d_i}{n_i^2} \]

This is the basis of all Nelson-Aalen confidence intervals.
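The two sums, \(\sum d_i/n_i\) and \(\sum d_i/n_i^2\), are easy to verify by hand on a small sample. The data below are hypothetical, and the variable names are illustrative only.

```python
import math
from collections import Counter

# Hypothetical right-censored sample: (T_i, delta_i)
data = [(2, 1), (3, 0), (3, 1), (5, 1), (7, 0)]

events = Counter(T for T, d in data if d == 1)   # d_i at each event time
H, V = 0.0, 0.0
table = []
for t in sorted(events):
    n_i = sum(1 for T, _ in data if T >= t)      # Y(t_i)
    H += events[t] / n_i                         # increment dN / Y
    V += events[t] / n_i ** 2                    # increment dN / Y^2
    table.append((t, n_i, H, math.sqrt(V)))

for row in table:
    print("t=%d  n=%d  H=%.4f  se=%.4f" % row)
```

The increments are 1/5, 1/4, and 1/2, so the final cumulative hazard is 0.95 with estimated variance 0.04 + 0.0625 + 0.25 = 0.3525.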

3.11 Kaplan-Meier — Product Integral

Continuous distribution: \(S(t) = \exp[-H(t)]\).

Discrete (product-limit) form: \(S(t) = \prod_{s \leq t} [1 - dH(s)]\).

Nonparametric estimate:

\[ \hat{S}(t) = \prod_{s \leq t} [1 - d\hat{H}(s)] = \prod_{t_i \leq t}\left[1 - \frac{dN(t_i)}{Y(t_i)}\right] = \prod_{t_i \leq t}\left[1 - \frac{d_i}{n_i}\right] \]

Origin of KM

This is the Kaplan-Meier estimator (Klein Ch.4). Within the counting process framework, the definition of the estimator, its asymptotic properties, confidence intervals, and confidence bands all follow from the same machinery.

\(\hat{S}(t)/S^*(t) - 1\), with \(S^*(t) = \exp[-H^*(t)]\), is also a stochastic integral, hence a martingale, hence covered by the CLT.

Greenwood's formula (the KM variance) is derived in the same framework.
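The product-limit form can be computed next to \(\exp[-\hat{H}(t)]\) to see how the two relate. This is a sketch on hypothetical data; comparing against \(\exp[-\hat{H}]\) is an added illustration rather than something taken from the text.

```python
import math
from collections import Counter

# Hypothetical right-censored sample: (T_i, delta_i)
data = [(2, 1), (3, 0), (3, 1), (5, 1), (7, 0)]

events = Counter(T for T, d in data if d == 1)
S_km, H_na = 1.0, 0.0
rows = []
for t in sorted(events):
    n_i = sum(1 for T, _ in data if T >= t)
    S_km *= 1 - events[t] / n_i        # product-limit step (1 - dN / Y)
    H_na += events[t] / n_i            # Nelson-Aalen step dN / Y
    rows.append((t, S_km, math.exp(-H_na)))

for t, km, fh in rows:
    print("t=%d  KM=%.4f  exp(-H)=%.4f" % (t, km, fh))
```

Because \(1 - x \leq e^{-x}\), the product-limit value never exceeds \(\exp[-\hat{H}]\); the gap shrinks as the increments \(d_i/n_i\) become small, which is the continuous-time limit \(S = \exp[-H]\).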

3.12 Martingale CLT: Asymptotic Normality

Theorem (final part of Klein § 3.6)

Assume \(Y(t)/n \to y(t)\) (a deterministic limit). For \(Z^{(n)}(t) = \sqrt{n}[\hat{H}(t) - H^*(t)]\):

Equation (3.6.6):

\[ \langle Z^{(n)} \rangle(t) \approx \int_0^t \frac{h(u)}{y(u)}\, du \]

As \(n \to \infty\), \(Z^{(n)}\) converges to a limiting process \(Z^{(\infty)}\) that is Gaussian with mean 0, variance \(\int_0^t h/y\, du\), and independent increments.

3.12.1 Joint distribution

\([Z^{(\infty)}(t_1), \ldots, Z^{(\infty)}(t_k)]\) is multivariate normal with covariance:

\[ \text{Cov}[Z^{(\infty)}(t), Z^{(\infty)}(s)] = \int_0^{\min(s,t)} \frac{h(u)}{y(u)}\, du \]
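As a worked instance of (3.6.6), consider the simplest case, assumed here purely for illustration: exponential lifetimes with constant hazard \(h\) and no censoring, so that \(y(u) = S(u) = e^{-hu}\). The limiting variance and covariance then have closed forms:

```latex
\[
\sigma^2(t) = \int_0^t \frac{h(u)}{y(u)}\, du = \int_0^t h\, e^{hu}\, du = e^{ht} - 1,
\qquad
\text{Cov}[Z^{(\infty)}(s), Z^{(\infty)}(t)] = e^{h \min(s,t)} - 1 .
\]
```

The variance grows exponentially in \(t\) because the risk set thins out. Multiplying by \(S(t)^2 = e^{-2ht}\) gives \(e^{-ht} - e^{-2ht} = S(t)[1 - S(t)]\), the familiar binomial variance of the empirical survival function.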

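Strength of the martingale CLT

The classical CLT assumes an i.i.d. sample. The martingale CLT handles dependence, censoring, and truncation together, so it proves the asymptotic normality of survival estimators in a unified way.

Confidence bands: by (3.6.6), \(\sqrt{n}[\hat{H} - H^*]\) converges weakly, as a process, to a time-transformed Brownian motion, and confidence bands follow from the distribution of its supremum.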

3.13 Counting Process Likelihood

Each individual has a counting process \(N_j(t)\). Given the history just before \(t\), \(dN_j(t)\) is approximately Bernoulli with success probability \(\lambda_j(t) dt\):

\[ \Pr[dN_j(t) = 1 \mid \mathcal{F}_{t^-}] = \lambda_j(t)\, dt \]

Contribution at time \(t\):

\[ [\lambda_j(t)\, dt]^{dN_j(t)} [1 - \lambda_j(t) dt]^{1 - dN_j(t)} \]

Taking the product over all of \([0, \tau]\) (a product integral):

\[ \prod_t [\lambda_j(t)\, dt]^{dN_j(t)} \exp\!\left[-\int_0^\tau \lambda_j(u)\, du\right] \]

Full likelihood (dropping the \(dt\) factors, which do not involve the parameters):

\[ L \propto \left[\prod_{j=1}^n \prod_t \lambda_j(t)^{dN_j(t)}\right] \exp\!\left[-\sum_{j=1}^n \int_0^\tau \lambda_j(u)\, du\right] \]

3.13.1 Reduction for Right-Censored Data

Substituting \(\lambda_j(t) = Y_j(t) h(t)\), with \(Y_j(t) = I[T_j \geq t]\):

\[ L \propto \left[\prod_{j} h(t_j)^{\delta_j}\right] \exp\!\left[-\sum_{j} H(t_j)\right] \]

Equivalence with § 3.5

This is exactly (3.5.3) of § 3.5.

\[ \prod h^{\delta} e^{-H} = \prod h^{\delta} S = \prod (h S)^{\delta} S^{1-\delta} = \prod f^{\delta} S^{1-\delta} \]

One likelihood, two frameworks. The counting process route is a different and more general derivation.
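The identity \(\prod h^{\delta} e^{-H} = \prod f^{\delta} S^{1-\delta}\) can be confirmed numerically for any parametric model. The sketch below uses a Weibull hazard with arbitrarily chosen shape and scale and a hypothetical sample; none of these values come from the text.

```python
import math

# Assumed Weibull model: H(t) = (t / b)^a, h(t) = (a / b) * (t / b)^(a - 1)
a, b = 1.5, 4.0

def H(t):
    return (t / b) ** a

def h(t):
    return (a / b) * (t / b) ** (a - 1)

def S(t):
    return math.exp(-H(t))

def f(t):
    return h(t) * S(t)

# Hypothetical right-censored sample: (t_j, delta_j)
data = [(2.0, 1), (3.0, 0), (3.5, 1), (5.0, 1), (7.0, 0)]

# Counting process form: sum of delta * log h(t) - H(t)
loglik_hazard = sum(d * math.log(h(t)) - H(t) for t, d in data)
# Classical form (3.5.3): sum of delta * log f(t) + (1 - delta) * log S(t)
loglik_density = sum(d * math.log(f(t)) + (1 - d) * math.log(S(t)) for t, d in data)
print(loglik_hazard, loglik_density)
```

The two printed values agree up to floating-point rounding, whatever shape and scale are chosen.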

3.14 Where Counting Processes Are Used

Use throughout Klein's book

Tool	Counting process expression	Klein chapter
Nelson-Aalen	\(\int J/Y\, dN\)	Ch.4
Kaplan-Meier	\(\prod (1 - dN/Y)\)	Ch.4
Smoothed hazard	kernel × \(dN/Y\)	Ch.6
Log-rank test	\(\int K(u)[dN_1 - dN_2]\)	Ch.7
Cox partial likelihood	\(\prod e^{\beta'z_i}/\sum_R e^{\beta'z_j}\)	Ch.8
Aalen additive	\(\hat{B}(t) = \int Y^- dN\)	Ch.10
Schoenfeld residual	martingale residual	Ch.11

Every tool is a stochastic integral with respect to a counting process martingale, so the asymptotic properties all follow from the martingale CLT.

4 R + Python: Comparing the Two Frameworks in Detail

4.1 R: Direct Likelihood + Manual Counting Process Implementation

library(survival)

# Leukemia data (Klein § 1.2): rows 1-21 placebo, rows 22-42 6-MP
leukemia <- data.frame(
  time = c(1, 22, 3, 12, 8, 17, 2, 11, 8, 12, 2, 5, 4, 15, 8, 23, 5, 11, 4, 1, 8,
           10, 7, 32, 23, 22, 6, 16, 34, 32, 25, 11, 20, 19, 6, 17, 35, 6, 13, 9, 6, 10),
  status = c(rep(1, 21), 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0)
)

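# (1) § 3.5 likelihood: exponential MLE (eq. 3.5.4)
neg_loglik <- function(lambda, t, delta) {
  -sum(delta * log(lambda) - lambda * t)
}
r <- sum(leukemia$status)   # number of events
S_T <- sum(leukemia$time)   # total time at risk
lambda_mle <- r / S_T       # closed-form MLE
# Numerical check: minimizing neg_loglik should reproduce r / S_T
lambda_num <- optimize(function(l) neg_loglik(l, leukemia$time, leukemia$status),
                       interval = c(1e-4, 1))$minimum
cat(sprintf("§ 3.5 Likelihood: λ̂ = %d/%d = %.4f (numerical: %.4f)\n",
            r, S_T, lambda_mle, lambda_num))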

# (2) § 3.6 counting process: N(t) and Y(t) computed by hand
times <- sort(unique(leukemia$time))

# At-risk process Y(t)
Y_t <- sapply(times, function(t) sum(leukemia$time >= t))

# Increments dN(t)
dN_t <- sapply(times, function(t) {
  sum(leukemia$status[leukemia$time == t])
})

# Nelson-Aalen: stochastic integral of J/Y with respect to N
NA_increments <- ifelse(Y_t > 0, dN_t / Y_t, 0)
H_NA <- cumsum(NA_increments)

# Variance: integral of J/Y^2 with respect to N
var_increments <- ifelse(Y_t > 0, dN_t / Y_t^2, 0)
Var_H <- cumsum(var_increments)
SE_H <- sqrt(Var_H)

# 95% confidence interval on the log scale (stabilizes the variance)
H_lower <- exp(log(H_NA) - 1.96 * SE_H / H_NA)
H_upper <- exp(log(H_NA) + 1.96 * SE_H / H_NA)

# Compare with the survival package; ctype = 1 requests the Nelson-Aalen cumulative hazard
fit_na <- survfit(Surv(time, status) ~ 1, data = leukemia, ctype = 1)

# Plot
par(mfrow = c(1, 2))
plot(times, H_NA, type = "s", col = "red", lwd = 2,
     xlab = "Time (weeks)", ylab = "Ĥ(t)",
     main = "Nelson-Aalen — manual stochastic integral")
lines(times, H_lower, type = "s", col = "red", lty = 2)
lines(times, H_upper, type = "s", col = "red", lty = 2)
lines(fit_na, fun = "cumhaz", col = "blue", conf.int = FALSE)
legend("topleft", c("Manual ∫ J/Y dN", "survival package"),
       col = c("red", "blue"), lwd = c(2, 1))

# (3) Martingale residuals M_i: a diagnostic for model fit
# Model hazard: h_hat = lambda_mle (constant, exponential assumption)
# M_i = N_i(t_i) - integral of Y_i(u) h_hat du = delta_i - lambda_hat * t_i
M_residuals <- leukemia$status - lambda_mle * leukemia$time
plot(leukemia$time, M_residuals, pch = 19,
     xlab = "Time", ylab = "M_i: martingale residual",
     main = "Exponential fit: martingale residuals")
abline(h = 0, col = "red", lwd = 2)
# The residuals sum to r - lambda_hat * S_T, which is exactly 0 at the MLE
cat(sprintf("Sum of martingale residuals: %.4f (should be 0)\n",
            sum(M_residuals)))

4.2 Python: Four Counting Process Functions + Cox Partial Likelihood

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter, NelsonAalenFitter, CoxPHFitter

leukemia = pd.DataFrame({
    "time": [1, 22, 3, 12, 8, 17, 2, 11, 8, 12, 2, 5, 4, 15, 8, 23, 5, 11, 4, 1, 8,
             10, 7, 32, 23, 22, 6, 16, 34, 32, 25, 11, 20, 19, 6, 17, 35, 6, 13, 9, 6, 10],
    "status": [1]*21 + [1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0],
    "six_mp": [0]*21 + [1]*21,
})

# § 3.6: the four counting process functions, visualized
times = np.sort(leukemia["time"].unique())
N_t = np.array([leukemia[(leukemia["time"] <= t) & (leukemia["status"] == 1)].shape[0]
                for t in times])
Y_t = np.array([leukemia[leukemia["time"] >= t].shape[0] for t in times])

# Compensator Λ(t) under the exponential (constant hazard) assumption
lam_hat = leukemia["status"].sum() / leukemia["time"].sum()
# Λ(t) = ∫ Y(u) ĥ du. Y is constant on (t_{k-1}, t_k] and equals Y(t_k) there,
# so the cumulative sum below is the exact integral at each observed time.
delta_t = np.diff(np.concatenate([[0], times]))
Lambda_t = np.cumsum(Y_t * lam_hat * delta_t)

# Martingale M(t) = N(t) - Λ(t); it ends at 0 because λ̂ = r / S_T
M_t = N_t - Lambda_t

# Plot 4 functions
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

axes[0, 0].step(times, N_t, where="post", color="black", lw=2)
axes[0, 0].set_title("N(t) — Counting process (cumulative events)")
axes[0, 0].set_xlabel("Time (weeks)")
axes[0, 0].set_ylabel("Count")

axes[0, 1].step(times, Y_t, where="pre", color="blue", lw=2)  # Y is left-continuous
axes[0, 1].set_title("Y(t) — At-risk process (predictable)")
axes[0, 1].set_xlabel("Time (weeks)")
axes[0, 1].set_ylabel("# at risk")

# Nelson-Aalen — stochastic integral
NA_inc = []
for t in times:
    d_t = leukemia[(leukemia["time"] == t) & (leukemia["status"] == 1)].shape[0]
    y_t = leukemia[leukemia["time"] >= t].shape[0]
    NA_inc.append(d_t / y_t if y_t > 0 else 0)
H_NA = np.cumsum(NA_inc)

axes[1, 0].step(times, H_NA, where="post", color="red", lw=2)
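# nelson_aalen_smoothing=False keeps the plain d_i / n_i increments,
# so tied event times give the same values as the manual sum
naf = NelsonAalenFitter(nelson_aalen_smoothing=False).fit(leukemia["time"], leukemia["status"])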
naf.cumulative_hazard_.plot(ax=axes[1, 0], style="--", color="green",
                             label="lifelines NA")
axes[1, 0].set_title(r"$\hat{H}(t) = \int_0^t J/Y\,dN$ — stochastic integral")
axes[1, 0].set_xlabel("Time (weeks)")
axes[1, 0].set_ylabel(r"$\hat{H}(t)$")
axes[1, 0].legend(["Manual ∫", "lifelines"])

# KM as a product integral
kmf = KaplanMeierFitter().fit(leukemia["time"], leukemia["status"])
# Manual product over t_i <= t of (1 - d_i / n_i), in time order
KM_manual = np.cumprod(1 - np.array(NA_inc))
axes[1, 1].step(times, kmf.survival_function_at_times(times),
                where="post", color="purple", lw=2, label="lifelines KM")
axes[1, 1].step(times, KM_manual, where="post", color="orange", lw=1,
                linestyle="--", label="Manual product")
axes[1, 1].set_title(r"$\hat{S}(t) = \prod (1 - dN/Y)$: product integral")
axes[1, 1].set_xlabel("Time (weeks)")
axes[1, 1].set_ylabel(r"$\hat{S}(t)$")
axes[1, 1].legend()

plt.tight_layout()
plt.savefig("klein_3_5_6_counting_process.png", dpi=100)

# Cox partial likelihood (six_mp is the only covariate)
cph = CoxPHFitter().fit(leukemia, duration_col="time", event_col="status")
cph.print_summary()
# The reported log-likelihood is the log partial likelihood

# Martingale residuals from Cox
mart_resid = cph.compute_residuals(leukemia, kind="martingale")
# Align by index in case the residuals come back in a different row order
mart = mart_resid["martingale"].reindex(leukemia.index)
plt.figure(figsize=(8, 5))
plt.scatter(leukemia["time"], mart, alpha=0.6)
plt.axhline(0, color="red", lw=2)
plt.xlabel("Time")
plt.ylabel("Martingale residual")
plt.title("Cox PH: martingale residuals (should sum to 0)")
plt.tight_layout()
plt.savefig("klein_3_6_martingale_residuals.png", dpi=100)

print(f"Sum of Cox martingale residuals: {mart.sum():.4f}")
# Key Cox diagnostic: the martingale residuals sum to approximately 0

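Checking the results

Manual NA (\(\int J/Y dN\)) matches the lifelines NA step function when lifelines is fit with nelson_aalen_smoothing=False.
KM = \(\prod (1 - dN/Y)\) matches the lifelines KM.
The martingale residuals sum to approximately 0, consistent with the martingale property of the counting process.
Cox partial likelihood = a specialization of the counting process likelihood.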

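5 Pulling the Intuition Together: How the Two Frameworks Unify

Six key lessons

Equivalence of the two frameworks: § 3.5 and § 3.6 lead to the same likelihood, but the counting process framework is more general and more powerful.
Four likelihood contributions: exact (\(f\)), right (\(S\)), left (\(1-S\)), interval (\(S(L)-S(R)\)). Every censoring scheme is a product of these four kinds of terms.
Truncation denominator: corrects sampling bias. In the counting process framework it is handled naturally through the definition of the risk set \(Y(t)\).
Four counting process tools: \(N\) (observed), \(Y\) (at risk), \(\Lambda\) (predicted), \(M\) (noise). These four are the source of all nonparametric and semiparametric estimation.
The magic of the stochastic integral: every estimator minus its target is the integral of a predictable process against \(dM\), hence a martingale, hence automatically unbiased, with asymptotic normality from the martingale CLT.
One likelihood, all the tools: KM, NA, log-rank, Cox, Aalen, and Schoenfeld residuals are different instantiations of the same framework. This is Aalen's 1975 revolution.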

6 Practical Checklist: § 3.5~3.6

§ 3.5 Likelihood

Master equation (3.5.1): unifies the four censoring types.
Type I derivation (3.5.3): \(L = \prod f^\delta S^{1-\delta}\).
Exponential closed form (3.5.4): \(L = \lambda^r e^{-\lambda S_T}\), MLE \(\hat{\lambda} = r/S_T\).
Type II order statistics (3.5.7).
Random censoring: informative vs non-informative (3.5.6).
Truncation denominator correction.
Regression form (3.5.2).

§ 3.6 Counting Process

\(N(t), Y(t), \mathcal{F}_t, \lambda(t), \Lambda(t), M(t), \langle M \rangle(t)\): the 7 core definitions.
Intensity \(\lambda(t) = Y(t) h(t)\) (3.6.2).
Martingale \(M = N - \Lambda\) (3.6.3): mean-zero noise.
Martingale property of the stochastic integral.
Nelson-Aalen \(\hat{H}(t) = \int J/Y\, dN\) (3.6.5): a stochastic integral.
KM \(\hat{S}(t) = \prod (1 - dN/Y)\): a product integral.
Variance (3.6.6): \(\langle Z^{(\infty)} \rangle = \int h/y\, du\).
Martingale CLT: unified treatment of asymptotic normality.
Counting process likelihood = § 3.5 likelihood (equivalence).

7 Related Topics

Klein series

(Previous) Ch.3 overview
(Previous) § 3.1~3.2 in depth: six forms of right censoring
(Previous) § 3.3~3.4 in depth: left/interval censoring + truncation
(Next) § 3.7: Exercises (planned)
(Next chapter) Ch.4: Nonparametric Estimation (KM, NA, confidence intervals and bands; a direct application of the framework in this part)

8 References

Klein, J. P., & Moeschberger, M. L. (2003). Survival Analysis: Techniques for Censored and Truncated Data (2nd ed.), Ch.3 § 3.5~3.6, pp. 74-87. Springer.
Aalen, O. O. (1975). Statistical inference for a family of counting processes. PhD thesis, University of California, Berkeley.
Aalen, O. O. (1978). Nonparametric inference for a family of counting processes. Annals of Statistics, 6(4), 701-726.
Andersen, P. K., Borgan, Ø., Gill, R. D., & Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer. (Standard reference for the counting process framework.)
Fleming, T. R., & Harrington, D. P. (1991). Counting Processes and Survival Analysis. Wiley.
Gill, R. D. (1980). Censoring and Stochastic Integrals. Mathematical Centre Tracts, 124. Amsterdam.
Doob, J. L. (1953). Stochastic Processes. Wiley. (The classic text on martingale theory.)
Rebolledo, R. (1980). Central limit theorems for local martingales. Zeitschrift für Wahrscheinlichkeitstheorie, 51, 269-286. (Martingale CLT.)
Robins, J. M., & Rotnitzky, A. (1992). Recovery of information and adjustment for dependent censoring using surrogate markers. In AIDS Epidemiology, Birkhäuser. (IPCW.)
Therneau, T. M., & Grambsch, P. M. (2000). Modeling Survival Data: Extending the Cox Model. Springer. (Counting process methods implemented in R.)
Kalbfleisch, J. D., & Prentice, R. L. (2002). The Statistical Analysis of Failure Time Data, 2nd ed. Wiley.