# Survival Analysis: Models and Applications

ISBN: 978-1-118-30767-0

Jun 2012

427 pages

$88.99

## Description

Survival analysis concerns sequential occurrences of events governed by probabilistic laws. Recent decades have witnessed many applications of survival analysis in various disciplines. This book introduces both classic survival models and theories along with newly developed techniques. Readers will learn how to perform analysis of survival data by following numerous empirical illustrations in SAS.

*Survival Analysis: Models and Applications*:

- Presents basic techniques before leading on to some of the most advanced topics in survival analysis.
- Assumes only a minimal knowledge of SAS whilst enabling more experienced users to learn new techniques of data input and manipulation.
- Provides numerous examples of SAS code to illustrate each of the methods, along with step-by-step instructions to perform each technique.
- Highlights the strengths and limitations of each technique covered.

Covering a wide scope of survival techniques and methods, from the introductory to the advanced, this book can serve as a reference for planners, researchers, and professors who work in settings involving various lifetime events. Scientists interested in survival analysis should find it a useful guidebook for incorporating survival data and methods into their projects.

Preface xi

**1 Introduction 1**

1.1 What is survival analysis and how is it applied? 1

1.2 The history of survival analysis and its progress 2

1.3 General features of survival data structure 3

1.4 Censoring 4

1.4.1 Mechanisms of right censoring 5

1.4.2 Left censoring, interval censoring, and left truncation 6

1.5 Time scale and the origin of time 7

1.5.1 Observational studies 8

1.5.2 Biomedical studies 9

1.5.3 Health care utilization 9

1.6 Basic lifetime functions 10

1.6.1 Continuous lifetime functions 10

1.6.2 Discrete lifetime functions 12

1.6.3 Basic likelihood functions for right, left, and interval censoring 14

1.7 Organization of the book and data used for illustrations 16

1.8 Criteria for performing survival analysis 17

**2 Descriptive approaches of survival analysis 20**

2.1 The Kaplan–Meier (product-limit) and Nelson–Aalen estimators 21

2.1.1 Kaplan–Meier estimating procedures with or without censoring 21

2.1.2 Formulation of the Kaplan–Meier and Nelson–Aalen estimators 24

2.1.3 Variance and standard error of the survival function 27

2.1.4 Confidence intervals and confidence bands of the survival function 29

2.2 Life table methods 36

2.2.1 Life table indicators 37

2.2.2 Multistate life tables 40

2.2.3 Illustration: Life table estimates for older Americans 44

2.3 Group comparison of survival functions 46

2.3.1 Logrank test for survival curves of two groups 48

2.3.2 The Wilcoxon rank sum test on survival curves of two groups 51

2.3.3 Comparison of survival functions for more than two groups 55

2.3.4 Illustration: Comparison of survival curves between married and unmarried persons 58

2.4 Summary 61

**3 Some popular survival distribution functions 63**

3.1 Exponential survival distribution 63

3.2 The Weibull distribution and extreme value theory 68

3.2.1 Basic specifications of the Weibull distribution 68

3.2.2 The extreme value distribution 72

3.3 Gamma distribution 73

3.4 Lognormal distribution 77

3.5 Log-logistic distribution 80

3.6 Gompertz distribution and Gompertz-type hazard models 83

3.7 Hypergeometric distribution 89

3.8 Other distributions 91

3.9 Summary 92

**4 Parametric regression models of survival analysis 93**

4.1 General specifications and inferences of parametric regression models 94

4.1.1 Specifications of parametric regression models on the hazard function 94

4.1.2 Specifications of accelerated failure time regression models 96

4.1.3 Inferences of parametric regression models and likelihood functions 99

4.1.4 Procedures of maximization and hypothesis testing on ML estimates 101

4.2 Exponential regression models 103

4.2.1 Exponential regression model on the hazard function 103

4.2.2 Exponential accelerated failure time regression model 106

4.2.3 Illustration: Exponential regression model on marital status and survival among older Americans 108

4.3 Weibull regression models 113

4.3.1 Weibull hazard regression model 114

4.3.2 Weibull accelerated failure time regression model 115

4.3.3 Conversion of Weibull proportional hazard and AFT parameters 117

4.3.4 Illustration: A Weibull regression model on marital status and survival among older Americans 121

4.4 Log-logistic regression models 127

4.4.1 Specifications of the log-logistic AFT regression model 127

4.4.2 Retransformation of AFT parameters to untransformed log-logistic parameters 129

4.4.3 Illustration: The log-logistic regression model on marital status and survival among the oldest old Americans 131

4.5 Other parametric regression models 135

4.5.1 The lognormal regression model 136

4.5.2 Gamma distributed regression models 137

4.6 Parametric regression models with interval censoring 138

4.6.1 Inference of parametric regression models with interval censoring 138

4.6.2 Illustration: A parametric survival model with independent interval censoring 139

4.7 Summary 142

**5 The Cox proportional hazard regression model and advances 144**

5.1 The Cox semi-parametric hazard model 145

5.1.1 Basic specifications of the Cox proportional hazard model 145

5.1.2 Partial likelihood 147

5.1.3 Procedures of maximization and hypothesis testing on partial likelihood 150

5.2 Estimation of the Cox hazard model with tied survival times 154

5.2.1 The discrete-time logistic regression model 154

5.2.2 Approximate methods handling ties in the proportional hazard model 155

5.2.3 Illustration on tied survival data: Smoking cigarettes and the mortality of older Americans 157

5.3 Estimation of survival functions from the Cox proportional hazard model 161

5.3.1 The Kalbfleisch–Prentice method 162

5.3.2 The Breslow method 164

5.3.3 Illustration: Comparing survival curves for smokers and nonsmokers among older Americans 165

5.4 The hazard rate model with time-dependent covariates 169

5.4.1 Categorization of time-dependent covariates 169

5.4.2 The hazard rate model with time-dependent covariates 171

5.4.3 Illustration: A hazard model on time-dependent marital status and the mortality of older Americans 173

5.5 Stratified proportional hazard rate model 176

5.5.1 Specifications of the stratified hazard rate model 177

5.5.2 Illustration: Smoking cigarettes and the mortality of older Americans with stratification on three age groups 178

5.6 Left truncation, left censoring, and interval censoring 183

5.6.1 The Cox model with left truncation, left censoring, and interval censoring 184

5.6.2 Illustration: Analyzing left truncated survival data on smoking cigarettes and the mortality of unmarried older Americans 185

5.7 Qualitative factors and local tests 191

5.7.1 Qualitative factors and scaling approaches 191

5.7.2 Local tests 193

5.7.3 Illustration of local tests: Educational attainment and the mortality of older Americans 195

5.8 Summary 199

**6 Counting processes and diagnostics of the Cox model 201**

6.1 Counting processes and the martingale theory 202

6.1.1 Counting processes 202

6.1.2 The martingale theory 204

6.1.3 Stochastic integrated processes as martingale transforms 207

6.1.4 Martingale central limit theorems 208

6.1.5 Counting process formulation for the Cox model 211

6.2 Residuals of the Cox proportional hazard model 213

6.2.1 Cox–Snell residuals 213

6.2.2 Schoenfeld residuals 214

6.2.3 Martingale residuals 216

6.2.4 Score residuals 218

6.2.5 Deviance residuals 219

6.2.6 Illustration: Residual analysis on the Cox model of smoking cigarettes and the mortality of older Americans 220

6.3 Assessment of proportional hazards assumption 222

6.3.1 Checking proportionality by adding a time-dependent variable 225

6.3.2 The Andersen plots for checking proportionality 227

6.3.3 Checking proportionality with scaled Schoenfeld residuals 228

6.3.4 The Arjas plots 229

6.3.5 Checking proportionality with cumulative sums of martingale-based residuals 230

6.3.6 Illustration: Checking the proportionality assumption in the Cox model for the effect of age on the mortality of older Americans 232

6.4 Checking the functional form of a covariate 236

6.4.1 Checking model fit statistics for different link functions 236

6.4.2 Checking the functional form with cumulative sums of martingale-based residuals 237

6.4.3 Illustration: Checking the functional form of age in the Cox model on the mortality of older Americans 239

6.5 Identification of influential observations in the Cox model 243

6.5.1 The likelihood displacement statistic approximation 244

6.5.2 LMAX statistic for identification of influential observations 247

6.5.3 Illustration: Checking influential observations in the Cox model on the mortality of older Americans 248

6.6 Summary 253

**7 Competing risks models and repeated events 255**

7.1 Competing risks hazard rate models 256

7.1.1 Latent failure times of competing risks and model specifications 256

7.1.2 Competing risks models and the likelihood function without covariates 259

7.1.3 Inference for competing risks models with covariates 261

7.1.4 Competing risks model using the multinomial logit regression 263

7.1.5 Competing risks model with dependent failure types 266

7.1.6 Illustration of competing risks models: Smoking cigarettes and the mortality of older Americans from three causes of death 268

7.2 Repeated events 282

7.2.1 Andersen and Gill model (AG) 283

7.2.2 PWP total time and gap time models (PWP-CP and PWP-GT) 286

7.2.3 The WLW model and extensions 288

7.2.4 Proportional rate and mean functions of repeated events 291

7.2.5 Illustration: The effects of a medical treatment on repeated patient visits 294

7.3 Summary 308

**8 Structural hazard rate regression models 310**

8.1 Some thoughts about the structural hazard regression models 310

8.2 Structural hazard rate model with retransformation of random errors 313

8.2.1 Model specification 314

8.2.2 The estimation of the full model 317

8.2.3 The estimation of reduced-form equations 318

8.2.4 Decomposition of causal effects on hazard rates and survival functions 323

8.2.5 Illustration: The effects of veteran status on the mortality of older Americans and its pathways 327

8.3 Summary 344

**9 Special topics 347**

9.1 Informative censoring 347

9.1.1 Selection model 348

9.1.2 Sensitivity analysis models 351

9.1.3 Comments on current models handling informative censoring 352

9.2 Bivariate and multivariate survival functions 352

9.2.1 Inference of the bivariate survival model 353

9.2.2 Estimation of bivariate and multivariate survival models 355

9.2.3 Illustration of marginal models handling multivariate survival data 359

9.3 Frailty models 359

9.3.1 Hazard models with individual frailty 360

9.3.2 The correlated frailty model 364

9.3.3 Illustration of frailty models: The effect of veteran status on the mortality of older Americans revisited 366

9.4 Mortality crossovers and the maximum life span 376

9.4.1 Basic specifications 378

9.4.2 Relative acceleration of the hazard rate and timing of mortality crossing 381

9.4.3 Mathematical conditions for maximum life span and mortality crossover 383

9.5 Survival convergence and the preceding mortality crossover 384

9.5.1 Mathematical proofs for survival convergence and mortality crossovers 385

9.5.2 Simulations 387

9.5.3 Explanations for survival convergence and the preceding mortality crossover 393

9.6 Sample size required and power analysis 398

9.6.1 Calculation of sample size required 399

9.6.2 Illustration: Calculating sample size required 401

9.7 Summary 403

Appendix A The delta method 405

Appendix B Approximation of the variance–covariance matrix for the predicted probabilities from results of the multinomial logit model 407

Appendix C Simulated patient data on treatment of PTSD (n = 255) 410

Appendix D SAS code for derivation of φ estimates in reduced-form equations 417

Appendix E The analytic result of κ*(x) 422

References 424

Index 438