Survival Analysis in R Programming

1. Objective

In this tutorial, we will discuss survival analysis in R. Along the way, we will cover the syntax, usage, and functions of R survival analysis in detail.


2. Introduction to Survival Analysis in R

Survival analysis deals with predicting the time until a specific event occurs. It is also known as time-to-event analysis or, when the event is death, analysis of time to death.

For example:

Predicting the number of days a patient in the last stage of a disease will survive.

We use the R survival package to carry out this analysis.

In the survival package, the function Surv() creates a survival object from the follow-up time and the event status. This object is then used as the response in an R formula passed to survfit(), which estimates survival curves that can be plotted.

What is R Survival Analysis?

  • It models time to an event (especially failure) and is used in medicine, biology, actuarial science, finance, engineering, sociology, etc.
  • It can account for censoring, that is, subjects whose event has not been observed by the end of follow-up.
  • It can compare survival between two or more groups.
  • It can assess the relationship between covariates and survival time.

2.1 Install Package
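
A minimal sketch of this step, assuming the survival package from CRAN. The lung dataset (bundled with the package) and the variable names surv_obj and fit are illustrative choices, not part of the original tutorial:

```r
# Install the package only if it is not already available
if (!requireNamespace("survival", quietly = TRUE)) {
  install.packages("survival", repos = "https://cloud.r-project.org")
}

# Load the package
library(survival)

# Create a survival object from follow-up time and event status
# (assumption: lung is used as example data; status 1 = censored, 2 = dead)
surv_obj <- Surv(time = lung$time, event = lung$status)

# Fit survival curves from a formula; "~ 1" means no grouping variable
fit <- survfit(Surv(time, status) ~ 1, data = lung)
print(fit)
```

Here Surv() receives the time and event arguments, and survfit() receives the formula.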





Description of the parameters used

  • time is the follow-up time until the event occurs.
  • event indicates whether the expected event occurred (the status).
  • formula describes the relationship between the survival object and the predictor variables.


3. Survival Analysis in R

Let us look at the steps for performing survival analysis in R:

  • Load the survival package: library(survival)
  • Create a survival object: Surv
  • Kaplan-Meier estimator: survfit
  • Mantel-Haenszel test: survdiff
  • Cox model: coxph


3.1 Creating the survival object

A survival object in R is created by the Surv function:


>Surv(time, time2, event, type=c('right', 'left', 'interval', 'counting', 'interval2'), origin=0)
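
As a rough illustration of how the type argument changes the object, the sketch below builds a right-censored and an interval-censored survival object. All data values and variable names are hypothetical:

```r
library(survival)

# Hypothetical follow-up data: time in days, status 1 = event, 0 = censored
time   <- c(5, 8, 12, 20)
status <- c(1, 0, 1, 0)

# Right-censored survival object (the default when two arguments are given)
s_right <- Surv(time, status)
print(s_right)   # censored times are printed with a trailing "+"

# Interval-censored object: the event happened between left and right
# (with type = "interval2", NA in the second time means right-censored)
left  <- c(2, 4, 6)
right <- c(3, NA, 9)
s_int <- Surv(left, right, type = "interval2")
print(s_int)
```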

3.2 Kaplan-Meier Estimator

  • Also known as product-limit estimator
  • It is the censored-data version of the empirical survival function
  • It generates a stair-step curve
  • Variance is estimated by Greenwood’s formula
  • It does not account for the effect of other covariates

3.3 Kaplan-Meier Estimator (Cont.)

It is computed by the function survfit:


>survfit (formula, …)
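
To make the product-limit idea concrete, here is a small sketch that compares survfit with the same estimate computed by hand. The five observations are made up for illustration:

```r
library(survival)

# Hypothetical data: five subjects, status 1 = event, 0 = censored
time   <- c(1, 2, 3, 4, 5)
status <- c(1, 0, 1, 1, 0)

# Kaplan-Meier fit with no grouping variable
km <- survfit(Surv(time, status) ~ 1)

# Survival estimates at the event times (t = 1, 3, 4)
km_surv <- summary(km)$surv

# Product-limit by hand: multiply (1 - d/n) over the event times, where
# n is the number still at risk just before each event time
by_hand <- cumprod(c(1 - 1/5, 1 - 1/3, 1 - 1/2))

print(km_surv)
print(by_hand)

# Stair-step curve with confidence limits
plot(km, xlab = "Time", ylab = "Survival probability")
```

The censored subject at time 2 contributes no step of its own, but it leaves the risk set, which is why the denominator at time 3 is 3 rather than 4.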

3.4 Mantel-Haenszel Test

  • It is also known as the log-rank test.
  • It is built from a sequence of 2×2 tables, one per distinct event time.
  • It tests conditional independence between group and event across these tables.
  • It is effective for comparing groups defined by categorical variables, but not continuous ones.

3.5 Mantel-Haenszel Test (Cont.)

Computed by the function: survdiff


>survdiff (formula, data, subset, na.action, rho=0)
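
A short sketch of a two-group comparison with survdiff. The lung dataset bundled with the survival package and the grouping by sex are assumptions chosen only for illustration:

```r
library(survival)

# Compare survival between the two sex groups (rho = 0 gives the log-rank test)
lr <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)
print(lr)

# p-value from the chi-square statistic (df = number of groups - 1)
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)
print(p_value)
```

A small p-value suggests that the survival curves of the groups differ.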

3.6 Cox Model

  • It is also known as the proportional hazards model.
  • Its assumption is quite strong: the effect of each covariate on the hazard is constant over time.

3.7 Cox Model (Cont.)

Computed by the function: coxph


>coxph(formula, data=, weights, subset, na.action, init, control,
       method=c('efron', 'breslow', 'exact'), singular.ok=TRUE, robust=FALSE,
       model=FALSE, x=FALSE, y=TRUE, ...)
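
In practice, most of these arguments can stay at their defaults. A minimal sketch of a Cox fit follows; the lung dataset and the covariates age and sex are illustrative assumptions:

```r
library(survival)

# Cox proportional hazards model with two covariates
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
print(summary(fit))

# Hazard ratios are the exponentiated coefficients
hr <- exp(coef(fit))
print(hr)
```

A hazard ratio below 1 indicates a lower hazard per unit increase in the covariate, and a ratio above 1 a higher hazard.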

3.8 Cox Model (Cont.)

For the baseline survivor function:

… survivor', xlab='Days', ylab= …
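
One possible way to plot a baseline survivor function is sketched below. It relies on basehaz from the survival package; the model, the lung dataset, and the plot labels are assumptions. Note that "baseline" here means all covariates set to zero (age 0 and sex 0), which is an extrapolation beyond the observed data:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Cumulative baseline hazard H0(t) with all covariates set to zero
bh <- basehaz(fit, centered = FALSE)

# Baseline survivor function S0(t) = exp(-H0(t))
base_surv <- exp(-bh$hazard)

plot(bh$time, base_surv, type = "s",
     main = "Baseline survivor function",
     xlab = "Days", ylab = "Survival probability")
```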


3.9 Cox Model (Cont.)

For the survival function at mean covariates:

… 'fitted survival function at mean covariates', xlab='Days', …
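
A sketch of how such a curve could be produced, passing the covariate means explicitly through newdata. The model and the lung dataset are illustrative assumptions:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# A pseudo-subject whose covariates equal the sample means
mean_cov <- data.frame(age = mean(lung$age), sex = mean(lung$sex))

# Predicted survival curve for that pseudo-subject
sf <- survfit(fit, newdata = mean_cov)

plot(sf,
     main = "Fitted survival function at mean covariates",
     xlab = "Days", ylab = "Survival probability")
```

A mean value of a categorical covariate such as sex does not correspond to any real subject, so curves for specific covariate profiles are often easier to interpret.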


3.10 Diagnostic of Cox Model

  • The Cox model is powerful, but its proportional hazards assumption is strong and should be checked.
  • Schoenfeld residuals, among other diagnostics, are used to check it.
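
A minimal sketch of a check based on scaled Schoenfeld residuals, using cox.zph from the survival package. The model and the lung dataset are assumptions chosen for illustration:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Test of the proportional hazards assumption, per covariate and globally
zph <- cox.zph(fit)
print(zph)

# Residuals against time; a clear trend suggests a violation
plot(zph)
```

Small p-values in the printed table indicate evidence against proportional hazards for that covariate.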

4. Conclusion

We have studied survival analysis in R in detail, including its syntax and usage. Most importantly, we have covered the functions Surv, survfit, survdiff, and coxph, which are the building blocks for real-life applications.

