Kwangmin Kim - Klein Ch.1 § 1.15~1.16 Deep Dive: Psychiatric Patients

1 Introduction: Two Left-Truncated Data Sets

Where this post sits in the Klein series:

Post	Topic
Ch.1 Overview (01)	Catalog of 19 examples
… (earlier posts)	…
§ 1.13~1.14 (01-7)	NLSY Pneumonia + Weaning
§ 1.15~1.16 (this post)	Psychiatric + Channing House (Left Truncation)
§ 1.17~1.19 (planned, or skipped)	Marijuana + Breast Cancer + AIDS

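Five questions this post answers

How does left truncation differ fundamentally from right censoring?
How does length-biased sampling arise in retirement-community data such as Channing House?
How is a sample compared with an external standard population through relative and excess mortality?
What statistical framework does the one-sample test provide for comparison against an external standard?
What does the conditional survival \(S(t \mid T > L)\) mean, and what is the mathematics behind the left-truncation correction?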

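2 Left Truncation: The Unifying Theme of This Post

Intuition: What Left Truncation Is

Right censoring (incomplete observation):

The subject is in the sample, but the event time is only partly known.
All we know is “\(T > c\)”.

Left truncation (biased sampling):

A subject enters the sample only after surviving to a certain age or time.
Early deaths are excluded from the sample.
Only subjects with “\(T > L_i\)” are observed.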

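Analogy:

Analyze only the residents of a retirement community at age 80:
- Anyone who died at 30 is excluded automatically.
- The sample is dominated by “people who lived long” → length-biased.
Someone who entered the community at 60 vs. someone who entered at 75:
- Both “survived at least to that age”.
- But entry at 75 implies stronger selection.

Mathematical treatment:

Likelihood contribution of subject \(i\): \(\frac{f(T_i)^{\delta_i} S(T_i)^{1-\delta_i}}{S(L_i)}\)
The denominator \(S(L_i)\) is the truncation correction.
It normalizes by the probability of surviving long enough to enter the sample.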
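To make the role of the denominator \(S(L_i)\) concrete, here is a minimal sketch that evaluates this likelihood for a constant-hazard (exponential) model. The exponential model, the function name `left_truncated_exp_loglik`, and the four toy records are assumptions chosen for illustration; they are not from Klein or from the data sets in this post.

```python
import math

def left_truncated_exp_loglik(rate, entry, exit_, event):
    """Log-likelihood of an exponential model under left truncation.

    Each subject contributes f(T)^d * S(T)^(1-d) / S(L), with S(t) = exp(-rate * t).
    """
    total = 0.0
    for L, T, d in zip(entry, exit_, event):
        log_numerator = (math.log(rate) if d else 0.0) - rate * T  # log f(T) or log S(T)
        log_S_L = -rate * L                                         # log S(L): truncation correction
        total += log_numerator - log_S_L
    return total

# Hypothetical toy data: entry age L, exit age T, death indicator d.
entry = [60.0, 70.0, 75.0, 80.0]
exit_ = [72.0, 85.0, 79.0, 95.0]
event = [1, 0, 1, 1]

# For the exponential model the maximizer has a closed form:
# deaths divided by time at risk, where time at risk is T - L (not T).
rate_hat = sum(event) / sum(T - L for L, T in zip(entry, exit_))

# Ignoring truncation counts exposure from birth and understates the hazard.
naive_rate = sum(event) / sum(exit_)

print(round(rate_hat, 4), round(naive_rate, 4))
```

Dividing by \(S(L_i)\) simply removes the exposure before entry, which is why the corrected estimate uses \(T_i - L_i\) as time at risk.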

Both data sets in this post illustrate left truncation:

§ 1.15 Psychiatric: age at admission = \(L_i\).
§ 1.16 Channing: age at entry into the community = \(L_i\).

3 § 1.15 Psychiatric Patients (Woolson 1981)

3.1 Medical Background: Mental Illness and Mortality

3.1.1 Mortality in Mental Illness

Severe mental illnesses such as schizophrenia and manic-depressive disorder.
Mortality is higher than in the general population:
- Suicide risk.
- Comorbid medical problems (cardiovascular, metabolic).
- Side effects of antipsychotic drugs.
Quantifying excess mortality gives public-health policy a concrete tool.

3.1.2 Tsuang & Woolson Study (1977-1981)

Psychiatric inpatients at the University of Iowa hospitals.
Admission cohort of 1935-1948.
Long-term mortality follow-up.

3.2 Data: Table 1.7

n: 26 (a small sample).
Variables:
- Gender (15 female, 11 male).
- Age at admission (19~58).
- Follow-up time (years from admission to death/censoring).
- Status (1 = death, 0 = censored).

3.2.1 Sample Structure

Gender	Age at Admission	Follow-up (years)
Female	51	1
Female	58	1
Female	55	2
Female	28	22
Male	21	30+
…	…	…

+ = censored.

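Intuition: What Can Be Done with Only 26 Subjects

Internal comparison (within the sample):

Compare male vs. female mortality.
Effect of age at admission.
But with 26 subjects, statistical power is low.

External comparison (against an external standard):

Compare with the Iowa 1959 mortality table.
“How much higher is mortality among psychiatric patients than in the general population?”
One-sample test: \(H_0\): psychiatric mortality = Iowa 1959 mortality.

Why external comparison helps:

The external population's mortality is known very precisely (it rests on statistics for millions of people), so it can be treated as fixed.
Even a small sample (26 subjects) can then be tested for a clear departure from the standard, because only the sample contributes sampling variability.
This is the key contribution of this data set.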

3.3 Where Klein Uses This Data Set

Chapter	Use of this data set
Ch.6.3	Relative mortality + cumulative excess mortality (Iowa 1959 standard)
Ch.7.2	One-sample hypothesis test (sample vs population)
Ch.9	Left truncation in Cox PH model

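3.4 Relative Mortality and Relative Survival (Ch.6.3)

3.4.1 Definitions

Relative mortality compares hazards: with sample hazard \(h(t)\) and standard hazard \(h^*(t)\), the multiplicative model is \(h(t) = \beta(t)\, h^*(t)\), and \(\beta(t)\) is the relative mortality.

The ratio of the sample survival \(S(t)\) to the standard survival \(S^*(t)\) is the relative (corrected) survival:

\[ S_r(t) = \frac{S(t)}{S^*(t)} \]

\(S(t)\): survival of the sample (psychiatric patients).
\(S^*(t)\): survival of the standard (Iowa 1959), matched on age and sex.

Interpretation:

\(S_r(t) = 1\): psychiatric patients and the standard survive equally.
\(S_r(t) < 1\): psychiatric patients die sooner (excess mortality).
\(S_r(t) > 1\): psychiatric patients outlive the standard (rare).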

3.4.2 Cumulative Excess Mortality

\[ H_e(t) = H(t) - H^*(t) \]

\(H(t)\): cumulative hazard of the sample.
\(H^*(t)\): cumulative hazard of the standard.
\(H_e(t)\): the “additional” cumulative hazard associated with mental illness.

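Intuition: Clinical Meaning of Excess Mortality

Example:

\(H_e(t = 10 \text{ years}) = 0.5\).
Interpretation: “over 10 years, psychiatric patients accumulate 0.5 more cumulative hazard than the general population”.
Equivalently, since \(S(t)/S^*(t) = \exp(-H_e(t))\), the 10-year survival probability is \(\exp(-0.5) \approx 0.61\) times that of the standard, i.e. about 39% lower.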
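The link between the cumulative excess hazard and the survival ratio follows from \(S(t) = \exp(-H(t))\), which gives \(S(t)/S^*(t) = \exp(-(H(t) - H^*(t)))\). A small sketch of that arithmetic follows; the function name and the two cumulative-hazard values are hypothetical, picked only so that their difference is 0.5.

```python
import math

def survival_ratio_from_hazards(H_sample, H_standard):
    """Return S(t)/S*(t) given the two cumulative hazards at the same t."""
    H_excess = H_sample - H_standard   # cumulative excess hazard H_e(t)
    return math.exp(-H_excess)

# Hypothetical 10-year cumulative hazards (not taken from any real table).
H_patients_10y = 0.62
H_standard_10y = 0.12

ratio = survival_ratio_from_hazards(H_patients_10y, H_standard_10y)
print(round(ratio, 3))
```

A positive excess hazard always maps to a survival ratio below 1, and the ratio shrinks as \(H_e(t)\) grows.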

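Public-health implications:

Quantifies the mortality burden of mental illness.
Statistical basis for targeted interventions (suicide prevention, stronger medical care).
Comparison across eras of drug treatment (1948 vs 2000 cohort).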

3.5 One-Sample Hypothesis Test (Ch.7.2)

\[ H_0: S(t) = S^*(t) \text{ for all } t \]

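3.5.1 Standardized Test Statistic

\[ Z = \frac{\sum_i (\delta_i - E_i)}{\sqrt{\sum_i E_i}} \]

\(E_i = H^*(T_i) - H^*(L_i)\): expected number of deaths for subject \(i\) under the standard, i.e. the standard cumulative hazard accumulated between entry age \(L_i\) and exit age \(T_i\).
\(\delta_i\): observed event indicator.

\(Z \sim N(0, 1)\) approximately under \(H_0\).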
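The statistic can be sketched in a few lines. This version uses \(\sum_i E_i\) as the variance of \(O - E\), the usual one-sample log-rank form. The Gompertz-type standard hazard and its parameters `a`, `b` are made up for illustration and are not the Iowa 1959 rates; the five records are the rows shown in Section 3.2.1.

```python
import math

def gompertz_cum_hazard(age, a=0.0005, b=0.08):
    """Cumulative hazard H*(age) for hazard a * exp(b * age). Hypothetical standard."""
    return a / b * (math.exp(b * age) - 1.0)

def one_sample_logrank(entry_age, followup, event, cum_hazard):
    """Compare observed deaths with deaths expected under a standard population."""
    observed = sum(event)
    # E_i = H*(L_i + t_i) - H*(L_i): standard hazard accumulated while under observation.
    expected = sum(cum_hazard(L + t) - cum_hazard(L)
                   for L, t in zip(entry_age, followup))
    z = (observed - expected) / math.sqrt(expected)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return observed, expected, z, p_two_sided

# The five rows listed in Section 3.2.1 (age at admission, follow-up, status).
entry_age = [51, 58, 55, 28, 21]
followup = [1, 1, 2, 22, 30]
event = [1, 1, 1, 1, 0]

observed, expected, z, p = one_sample_logrank(entry_age, followup, event,
                                              gompertz_cum_hazard)
print(observed, round(expected, 3), round(z, 2), round(p, 4))
```

With a real life table, `cum_hazard` would be replaced by a lookup that sums age- and sex-specific rates over each subject's observation window.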

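Intuition: Strengths of the One-Sample Test

Two-sample test (within the sample):

Needs a control group (e.g., placebo).
Underpowered when the sample is small.

One-sample test (vs external standard):

The external population plays the role of the “control”.
The external rates are precise (population statistics covering millions of people).
Because the standard contributes essentially no sampling error, even a small sample can yield a usable test.

Conditions:

The external standard must be appropriate (same period and region).
It must be matched on age and sex.

This data set: only 26 subjects, yet a clear departure from the Iowa 1959 standard can still be tested.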

3.6 Left Truncation in Psychiatric (Ch.9)

3.6.1 The Problem

Each patient is already a certain age at admission.
Anyone who died before admission is not in the study.
Analyzing “age at death” therefore involves left truncation.

3.6.2 The Correction

Standard Cox model:

# Wrong (ignores left truncation)
coxph(Surv(age_death, status) ~ gender + age_admit, data = psych)

# Correct (adjusts for left truncation)
coxph(Surv(age_admit, age_death, status) ~ gender, data = psych)
# Surv(start, stop, event): counting process format

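Intuition: How the Risk Sets Differ

Naive (ignores left truncation):

Every patient is in the risk set from time 0 (birth).
Patients are counted as at risk between birth and admission, a period in which they could not have been observed to die.
→ bias.

Corrected (adjusts for left truncation):

A patient enters the risk set at the age of admission.
Comparisons involve only “the other patients under observation and alive at that age”.
The risk set changes dynamically over time.

→ The counting process format expresses this naturally.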
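The difference between the two risk sets is easy to see by counting. The sketch below uses the five rows listed in Section 3.2.1, with exit age taken as admission age plus follow-up. The function name `risk_set_sizes` and the ages at which the sets are evaluated are arbitrary choices for illustration, and the half-open interval \(L < t \leq T\) (the convention R's counting-process format uses) is assumed.

```python
def risk_set_sizes(entry, exit_, ages, left_truncated=True):
    """Number of subjects at risk at each age in `ages`."""
    sizes = []
    for t in ages:
        if left_truncated:
            # At risk only after entry: L < t <= T.
            n = sum(1 for L, T in zip(entry, exit_) if L < t <= T)
        else:
            # Naive: everyone counts as at risk from birth until exit.
            n = sum(1 for T in exit_ if t <= T)
        sizes.append(n)
    return sizes

entry = [51, 58, 55, 28, 21]   # age at admission
exit_ = [52, 59, 57, 50, 51]   # age at admission + follow-up
ages = [25, 40, 52, 57]

naive = risk_set_sizes(entry, exit_, ages, left_truncated=False)
adjusted = risk_set_sizes(entry, exit_, ages, left_truncated=True)
print(naive)     # [5, 5, 3, 2]
print(adjusted)  # [1, 2, 1, 1]
```

At age 25 the naive count treats all five patients as under observation, although only the patient admitted at 21 had entered the study by then.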

4 § 1.16 Channing House (Hyde 1980): The Canonical Left-Truncation Example

4.1 Background: Channing House Retirement Community

Location: Palo Alto, California.
Period: January 1964 - July 1975.
n: 462 (97 male + 365 female).
Event: death (leaving the community, or being alive at the end of the study, is censoring).
Variables: age at entry, age at death or exit, gender.

4.1.1 A Special Feature

“All were covered by a health care program provided by the center which allowed for easy access to medical care without any additional financial burden to the resident.”

→ Access to healthcare is homogeneous, a favorable condition for the analysis.

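4.2 How Left Truncation Arises

General population:
  birth ───●(dies at 50) ✗ excluded from the Channing sample
        ───●(dies at 60) ✗ excluded
        ───────●(dies at 70, before a possible entry) ✗ died before entry
        ──────────●(enters at 80, dies at 90) ✓ in the sample
        ──────────●(enters at 80, dies at 100) ✓ in the sample

Channing House sample:
  only people alive at entry = "people who lived long enough".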

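Intuition: Length-Biased Sampling

Lifetime distribution in the general population:

Spread over the whole age range (the exact shape does not matter here).
Early deaths are possible.

Lifetime distribution in the Channing sample:

Anyone who died before the entry age (typically 70+) is excluded automatically.
→ The sample is biased toward “people who live long”.
Length-biased: the longer the lifetime, the higher the probability of entering the sample.

Bias if ignored:

Naive: “mean age at death of Channing residents = 90” → “lifetime in this group is 90”.
But this is a mean conditional on having entered.
The unconditional lifetime (whole population) is shorter.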
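A short simulation makes the gap between the conditional and unconditional means visible. Everything here is an assumption made for illustration: the normal lifetime distribution, the uniform distribution of potential entry ages, and the sample size. None of it is fitted to Channing House.

```python
import random

random.seed(0)

def simulate(n=100_000):
    """Draw lifetimes T and potential entry ages L; keep only those with T > L."""
    population, residents = [], []
    for _ in range(n):
        T = random.gauss(78.0, 12.0)     # hypothetical age at death
        L = random.uniform(65.0, 90.0)   # hypothetical age of potential entry
        population.append(T)
        if T > L:                        # must still be alive at L to enter
            residents.append(T)
    return population, residents

population, residents = simulate()
mean_population = sum(population) / len(population)
mean_residents = sum(residents) / len(residents)

print(round(mean_population, 1), round(mean_residents, 1), len(residents))
```

The mean age at death among those who entered is well above the population mean, even though every lifetime was drawn from the same distribution: selection on \(T > L\) alone produces the difference.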

Correction:

Estimate the conditional survival \(S(t \mid T > L)\).
Or, when the truncation distribution is known, the unconditional distribution can be recovered.

This data set is the canonical example of left truncation (the motivation for Klein Ch.3.4).

4.3 Conditional Survival Function (Ch.4.6)

4.3.1 Definition

\[ S(t \mid T > L) = \frac{S(t)}{S(L)}, \quad t \geq L \]

\(L\): truncation time (age at entry).
\(S(t \mid T > L)\): “probability of surviving to t given survival to L”.

4.3.2 Truncation-Adjusted KM

Modify the risk set of the standard KM:

Risk set at time \(t\): \(\{i : L_i \leq t \leq T_i\}\).
“Those who have entered by this time and have not yet died or been censored”.

# R survival (L and T stand for the entry-age and exit-age columns)
fit <- survfit(Surv(L, T, status) ~ 1, data = channing)
# Surv(start, stop, event): left truncation
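To see what the adjusted estimator does under the hood, here is a from-scratch sketch of the product-limit estimate with delayed entry. The function name and the four toy records are hypothetical, and the risk set uses the half-open convention \(L < t \leq T\) that R's `(start, stop]` intervals imply.

```python
def km_left_truncated(entry, exit_, event):
    """Product-limit estimate with delayed entry.

    Returns a list of (death_time, n_at_risk, n_deaths, survival).
    The estimate is conditional on survival to the smallest entry time.
    """
    death_times = sorted({T for T, d in zip(exit_, event) if d})
    surv, rows = 1.0, []
    for t in death_times:
        at_risk = sum(1 for L, T in zip(entry, exit_) if L < t <= T)
        deaths = sum(1 for T, d in zip(exit_, event) if d and T == t)
        surv *= 1.0 - deaths / at_risk
        rows.append((t, at_risk, deaths, surv))
    return rows

# Hypothetical toy data: entry age, exit age, death indicator.
entry = [60, 62, 65, 70]
exit_ = [68, 75, 72, 80]
event = [1, 0, 1, 1]

rows = km_left_truncated(entry, exit_, event)
for t, y, d, s in rows:
    print(t, y, d, round(s, 4))
```

In this toy example the subject who enters at 70 is absent from the risk set at age 68 and joins it at age 72; the drop to 0 at age 80 occurs because only one subject is still at risk, which is the small-risk-set instability that delayed-entry estimates are prone to.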

4.4 Where Klein Uses This Data Set

Chapter	Use of this data set
Ch.3.4	Definition of left truncation + derivation of the likelihood
Ch.4.6	Nonparametric estimation of the conditional survival function
Ch.7.3	Log-rank test with left truncation (male vs female)
Ch.9	Cox PH with left truncation

4.5 Two-Sample Test with Left Truncation (Ch.7.3)

4.5.1 Hypothesis

\[ H_0: S_M(t \mid T > L) = S_F(t \mid T > L) \]

(The conditional survival of males and females is the same.)

4.5.2 Modified Risk Set

At each time \(t\):

Male risk set: \(\{i \in M : L_i \leq t \leq T_i\}\).
Female risk set: \(\{i \in F : L_i \leq t \leq T_i\}\).
The log-rank statistic is computed with these risk sets.

Intuition: Truncation-Adjusted Log-Rank

Standard log-rank: tests whether the deaths at time \(t\) are split between the groups in proportion to their risk-set sizes at \(t\).

Truncation adjustment: restrict the risk set to “\(L \leq t \leq T\)”.

“Do males and females alive at the same age have different death rates?”

Females are generally expected to live longer; the test checks whether this data set shows it.
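The modified risk sets drop straight into the usual log-rank formula. Below is a minimal sketch of a two-sample log-rank statistic with delayed entry, using the hypergeometric variance at each death time. The function name, the 0/1 group coding, and the six toy records are assumptions for illustration, and the risk set is taken as \(L < t \leq T\).

```python
import math

def logrank_left_truncated(entry, exit_, event, group):
    """Two-sample log-rank with delayed entry; `group` holds 0/1 labels.

    Returns (O1 - E1, variance, chi-square statistic on 1 df).
    """
    death_times = sorted({T for T, d in zip(exit_, event) if d})
    o_minus_e, var = 0.0, 0.0
    for t in death_times:
        y = y1 = d = d1 = 0
        for L, T, ev, g in zip(entry, exit_, event, group):
            if L < t <= T:                 # in the risk set at age t
                y += 1
                y1 += g
                if ev and T == t:          # dies at age t
                    d += 1
                    d1 += g
        o_minus_e += d1 - d * y1 / y
        if y > 1:
            var += d * (y1 / y) * (1 - y1 / y) * (y - d) / (y - 1)
    chi2 = o_minus_e ** 2 / var if var > 0 else 0.0
    return o_minus_e, var, chi2

# Hypothetical toy data.
entry = [60, 62, 65, 70, 61, 66]
exit_ = [68, 75, 72, 80, 85, 90]
event = [1, 0, 1, 1, 1, 0]
group = [1, 1, 1, 0, 0, 0]

o_minus_e, var, chi2 = logrank_left_truncated(entry, exit_, event, group)
p_value = math.erfc(math.sqrt(chi2 / 2.0))   # upper tail of chi-square with 1 df
print(round(o_minus_e, 3), round(var, 3), round(chi2, 3), round(p_value, 3))
```

The only change from the standard log-rank is the condition `L < t <= T`; setting every entry age below the first death time recovers the usual test.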

5 R + Python EDA — Psychiatric Patients

5.1 R — `survival`

library(survival)
library(survminer)

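# Klein Table 1.7, as transcribed for this post.
# NOTE: this gender vector contains 20 "F" and 6 "M", which does not match the
# 15 female / 11 male split stated in Section 3.2; check it against the book.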
psych <- data.frame(
  gender = c("F", "F", "F", "F", "M", "M", "F", "F", "F", "F",
             "F", "F", "F", "M", "F", "F", "F", "F", "F", "F",
             "M", "M", "F", "F", "F", "M"),
  age_admit = c(51, 58, 55, 28, 21, 19, 25, 48, 47, 25,
                31, 24, 25, 30, 33, 36, 30, 41, 43, 45,
                35, 29, 35, 32, 36, 32),
  followup = c(1, 1, 2, 22, 30, 28, 32, 11, 14, 36,
               31, 33, 33, 37, 35, 25, 31, 22, 26, 24,
               35, 34, 30, 35, 40, 39),
  status = c(1, 1, 1, 1, 0, 1, 1, 1, 1, 0,
             0, 0, 0, 1, 0, 1, 0, 1, 1, 1,
             0, 0, 0, 1, 1, 0)
)

# Age at exit: death or censoring (admission age + follow-up)
psych$age_death <- psych$age_admit + psych$followup

# KM with delayed entry (adjusted for left truncation)
fit <- survfit(Surv(age_admit, age_death, status) ~ gender, data = psych)
# pval = TRUE is omitted: it is computed via survdiff(), which handles
# right-censored data only
ggsurvplot(fit, data = psych,
           xlab = "Age (years)", ylab = "Conditional survival",
           legend.labs = c("Female", "Male"))

# Cox PH with left truncation
cox_fit <- coxph(Surv(age_admit, age_death, status) ~ gender, data = psych)
summary(cox_fit)

# Log-rank with left truncation: survdiff() rejects (start, stop] data,
# so use the score test of the Cox model, which is equivalent to the log-rank
# test on the delayed-entry risk sets (exactly so without tied death times)
summary(cox_fit)$sctest

# One-sample test vs Iowa 1959 standard
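# (a real standard mortality table is needed in practice)
# Stand-in for illustration only; the rates below are made up:
iowa_1959_hazard <- function(age, sex) {
  # Simplification: hazard grows exponentially with age (Gompertz-type)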
  base_rate <- if (sex == "M") 0.012 else 0.008
  base_rate * exp(0.05 * (age - 50))
}

# Expected deaths under standard
psych$expected <- mapply(function(a, s, t) {
  integrate(iowa_1959_hazard, a, a + t, sex = s)$value
}, psych$age_admit, psych$gender, psych$followup)

# One-sample test
observed <- sum(psych$status)
expected <- sum(psych$expected)
SMR <- observed / expected  # Standardized Mortality Ratio
print(paste("Observed:", observed, "Expected:", round(expected, 2),
            "SMR:", round(SMR, 2)))

# Z-test (large sample)
z <- (observed - expected) / sqrt(expected)
p_value <- 2 * pnorm(-abs(z))
print(paste("Z =", round(z, 2), "p =", round(p_value, 4)))

5.2 Python — `lifelines`

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test

psych = pd.DataFrame({
    "gender": ["F", "F", "F", "F", "M", "M", "F", "F", "F", "F",
               "F", "F", "F", "M", "F", "F", "F", "F", "F", "F",
               "M", "M", "F", "F", "F", "M"],
    "age_admit": [51, 58, 55, 28, 21, 19, 25, 48, 47, 25,
                  31, 24, 25, 30, 33, 36, 30, 41, 43, 45,
                  35, 29, 35, 32, 36, 32],
    "followup": [1, 1, 2, 22, 30, 28, 32, 11, 14, 36,
                 31, 33, 33, 37, 35, 25, 31, 22, 26, 24,
                 35, 34, 30, 35, 40, 39],
    "status": [1, 1, 1, 1, 0, 1, 1, 1, 1, 0,
               0, 0, 0, 1, 0, 1, 0, 1, 1, 1,
               0, 0, 0, 1, 1, 0]
})
psych["age_death"] = psych["age_admit"] + psych["followup"]

# KM with left truncation (entry parameter)
fig, ax = plt.subplots(figsize=(9, 6))
for grp, color in [("F", "red"), ("M", "blue")]:
    sub = psych[psych["gender"] == grp]
    kmf = KaplanMeierFitter()
    kmf.fit(durations=sub["age_death"],
            event_observed=sub["status"],
            entry=sub["age_admit"],  # left truncation
            label=f"Gender {grp}")
    kmf.plot_survival_function(ax=ax, color=color)
ax.set_xlabel("Age (years)")
ax.set_ylabel("Conditional survival")
plt.tight_layout()

# Log-rank with left truncation: not supported by lifelines' logrank_test,
# so implement it directly or use R
# Cox with left truncation
psych["male"] = (psych["gender"] == "M").astype(int)
cph = CoxPHFitter()
# CoxPHFitter.fit accepts entry_col for delayed entry (left truncation)
psych_long = psych.rename(columns={"age_admit": "start",
                                     "age_death": "stop",
                                     "status": "event"})
cph.fit(psych_long[["start", "stop", "event", "male"]],
        duration_col="stop", event_col="event",
        entry_col="start")
print(cph.summary)

6 R + Python EDA — Channing House

6.1 R: Handling Left Truncation Correctly

library(survival)

# Simulated stand-in for Channing House (the real data are on the Klein book website)
set.seed(42)
n <- 462
channing <- data.frame(
  gender = c(rep("M", 97), rep("F", 365)),
  age_entry = pmin(110, pmax(60, rnorm(n, 75, 8))) * 12,  # months
  age_death = NA,
  status = rbinom(n, 1, 0.5)  # crude: censoring assigned independently of the times
)
# Exit age = entry age + residual lifetime (mean 200 months, capped at 360 months)
channing$age_death <- channing$age_entry +
  pmin(360, rexp(n, rate = 0.005))  # already in months, so no further * 12

# WRONG: ignores left truncation
fit_wrong <- survfit(Surv(age_death / 12, status) ~ gender, data = channing)

# CORRECT: adjusts for left truncation
fit_correct <- survfit(Surv(age_entry / 12, age_death / 12, status) ~ gender,
                       data = channing)

# Comparison (strata are ordered F, M: female = red, male = blue)
par(mfrow = c(1, 2))
plot(fit_wrong, main = "Wrong (no truncation)",
     xlab = "Age (years)", ylab = "Survival", col = c("red", "blue"))
plot(fit_correct, main = "Correct (left truncation)",
     xlab = "Age (years)", ylab = "Conditional survival",
     col = c("red", "blue"))

# Cox PH with left truncation
cox_correct <- coxph(Surv(age_entry / 12, age_death / 12, status) ~ gender,
                     data = channing)
summary(cox_correct)

# Log-rank with left truncation: survdiff() accepts right-censored data only,
# so report the score test of the Cox model instead
summary(cox_correct)$sctest

6.2 Python — `lifelines`

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter, CoxPHFitter

# Simulated stand-in for Channing House
rng = np.random.default_rng(42)
n_m, n_f = 97, 365
channing = pd.DataFrame({
    "gender": ["M"] * n_m + ["F"] * n_f,
    "age_entry": np.clip(rng.normal(75, 8, n_m + n_f), 60, 110),
    "lifetime": rng.exponential(20, n_m + n_f),
    "status": rng.binomial(1, 0.5, n_m + n_f),
})
channing["age_death"] = channing["age_entry"] + channing["lifetime"]

# KM with left truncation
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# WRONG (no truncation)
for grp, color in [("M", "blue"), ("F", "red")]:
    sub = channing[channing["gender"] == grp]
    kmf = KaplanMeierFitter()
    kmf.fit(durations=sub["age_death"], event_observed=sub["status"], label=grp)
    kmf.plot_survival_function(ax=axes[0], color=color)
axes[0].set_title("Wrong (no truncation)")

# CORRECT (left truncation)
for grp, color in [("M", "blue"), ("F", "red")]:
    sub = channing[channing["gender"] == grp]
    kmf = KaplanMeierFitter()
    kmf.fit(durations=sub["age_death"], event_observed=sub["status"],
            entry=sub["age_entry"], label=grp)
    kmf.plot_survival_function(ax=axes[1], color=color)
axes[1].set_title("Correct (left truncation)")

axes[0].set_xlabel("Age")
axes[1].set_xlabel("Age")
plt.tight_layout()

# Cox with left truncation
channing["male"] = (channing["gender"] == "M").astype(int)
cph = CoxPHFitter()
cph.fit(channing[["age_entry", "age_death", "status", "male"]],
        duration_col="age_death", event_col="status",
        entry_col="age_entry")
print(cph.summary)

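Intuition: Results With vs. Without the Left-Truncation Adjustment

Wrong (ignored):

The survival curve starts at early ages (near birth).
In that region every subject is (automatically) alive → S(t) ≈ 1.
Subjects are counted as at risk before they entered, so the risk sets are too large and survival is overestimated; the curve before age 60 is an artifact (the data are conditional but are presented as unconditional).

Correct (adjusted):

The survival curve starts at the earliest entry age.
The risk set is dynamic (subjects are added as they enter).
The result is a proper conditional survival.

→ The adjustment can change the results substantially; ignoring left truncation can cause large bias.

When \(L\) is small (everyone enters early), the bias is small. When \(L\) is large and variable (as in Channing), the bias is large.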
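The size of the bias can be checked by simulation when the true distribution is known. The sketch below draws lifetimes from a hypothetical normal distribution, keeps only subjects alive at a uniformly distributed entry age, and compares the naive and the truncation-adjusted Kaplan-Meier estimates at age 85 with the true conditional survival \(S(85)/S(65)\). All distributions, the sample size, and the helper names are assumptions for illustration; there is no censoring in this toy setup.

```python
import math
import random

random.seed(1)

def normal_sf(x, mu=78.0, sigma=12.0):
    """Survival function of the (hypothetical) true lifetime distribution."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

# Left-truncated sample: a subject is observed only if T > L.
entry, death = [], []
while len(entry) < 1500:
    T = random.gauss(78.0, 12.0)
    L = random.uniform(65.0, 90.0)
    if T > L:
        entry.append(L)
        death.append(T)

def km_at(age, entry, death, adjust):
    """Kaplan-Meier survival at `age`; every subject dies (no censoring)."""
    surv = 1.0
    for t in sorted(death):
        if t > age:
            break
        if adjust:
            at_risk = sum(1 for L, T in zip(entry, death) if L < t <= T)
        else:
            at_risk = sum(1 for T in death if t <= T)
        surv *= 1.0 - 1.0 / at_risk
    return surv

truth = normal_sf(85.0) / normal_sf(65.0)   # S(85 | T > 65)
naive = km_at(85.0, entry, death, adjust=False)
adjusted = km_at(85.0, entry, death, adjust=True)
print(round(truth, 3), round(naive, 3), round(adjusted, 3))
```

Narrowing the range of entry ages (for example, `random.uniform(65.0, 66.0)`) should shrink the gap between the two estimates, in line with the statement above about small \(L\).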

7 Pedagogical Summary of the Two Data Sets

Aspect	§ 1.15 Psychiatric	§ 1.16 Channing
n	26	462
Truncation	Age at admission	Age at entry
Event	Death	Death
Key tools	One-sample vs standard + relative mortality	Conditional survival + 2-sample with truncation
Klein usage	Ch.6.3, 7.2, 9	Ch.3.4, 4.6, 7.3, 9

Intuition: Two Illustrations of Left Truncation

§ 1.15 Psychiatric:

Entry ages are young (age at admission 19~58), so truncation is mild.
Small sample → external comparison (Iowa standard) is central.
The truncation adjustment is made explicit in the Cox model.

§ 1.16 Channing:

Entry ages are old (60+), so truncation is severe.
Large sample → internal comparison (M vs F) is feasible.
The standard illustration of length-biased sampling.

Complementarity:

Psychiatric: comparison with an external standard (small sample).
Channing: intuition for what truncation is (length bias).
Both data sets need the left-truncation adjustment whenever age is the time scale.

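8 Key Intuitions in One Place

Left truncation = “eligibility for the sample depends on the event time” = selection bias.
Length-biased sampling = “people who live long appear in the sample more often”.
Relative (corrected) survival \(S(t)/S^*(t)\) = comparison with an external standard.
Cumulative excess mortality \(H(t) - H^*(t)\) = the “additional” risk of death.
One-sample test = a framework that pairs a small sample with an external standard.
Conditional survival \(S(t \mid T > L)\) = the natural form of the truncation adjustment.
Counting process format Surv(start, stop, event) = the standard representation in R; lifelines expresses the same thing through entry / entry_col.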

9 Practical Checklist: § 1.15~1.16

§ 1.15 Psychiatric

Identify the external standard mortality (Iowa 1959).
Compute the relative survival \(S_r(t) = S(t)/S^*(t)\).
Compute the cumulative excess mortality \(H_e(t) = H(t) - H^*(t)\).
One-sample test or SMR (Standardized Mortality Ratio).
Cox PH with left truncation.

§ 1.16 Channing House

Recognize the left-truncation mechanism (age at entry).
Build intuition for length-biased sampling.
Compute the conditional survival \(S(t \mid T > L)\).
Compare KM with vs without the truncation adjustment.
Log-rank with left truncation (M vs F).
Cox PH with left truncation.

EDA

Use the Surv(start, stop, event) format.
Visualize how the risk set changes over time.
Compare adjusted vs unadjusted results.

Next steps

§ 1.17 (Marijuana: left/right censoring).
§ 1.18 (Breast Cancer: interval censoring).
§ 1.19 (AIDS: right truncation).

10 Related Topics

Klein series

Ch.1 Overview
(previous) § 1.13~1.14: Pneumonia · Weaning
(next) § 1.17~1.19 (planned, or skipped)

11 References

Klein, J. P., & Moeschberger, M. L. (2003). Survival Analysis: Techniques for Censored and Truncated Data (2nd ed.), Ch.1 § 1.15~1.16. Springer.
Woolson, R. F. (1981). Rank Tests and a One-Sample Logrank Test for Comparing Observed Survival Data to a Standard Population. Biometrics, 37(4), 687-696.
Tsuang, M. T., & Woolson, R. F. (1977). Mortality in Patients with Schizophrenia, Mania, Depression and Surgical Conditions. British Journal of Psychiatry, 130, 162-166.
Hyde, J. (1980). Survival Analysis with Incomplete Observations. In Biostatistics Casebook, ed. R. G. Miller et al., Wiley, 31-46.
Andersen, P. K., Borgan, Ø., Gill, R. D., & Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer. (Theory of the left-truncation likelihood)
Cnaan, A., & Ryan, L. (1989). Survival Analysis in Natural History Studies of Disease. Statistics in Medicine, 8(10), 1255-1268.
Wang, M. C. (1991). Nonparametric Estimation from Cross-Sectional Survival Data. JASA, 86(413), 130-143. (Theory of length-biased sampling)
Lai, T. L., & Ying, Z. (1991). Estimating a Distribution Function with Truncated and Censored Data. Annals of Statistics, 19(1), 417-442.
Therneau, T. M., & Grambsch, P. M. (2000). Modeling Survival Data: Extending the Cox Model. Springer.
Davidson-Pilon, C. (2019). lifelines. JOSS, 4(40), 1317.