# Summary tables for R survival objects with stargazer or xtable

Last year, I was hired to do some statistical analysis for a group of surgeons. As research tends to go, our initial paper submission was rejected and several revisions were suggested. A particularly annoying problem came up, which was how to create nice tables from survival objects - summaries, not regression tables. The survival package overloads summary() rendering xtable and stargazer unable to build a beautiful table from the custom summary function.

I can illustrate the effect here using data from IDRE:

library(survival)
hmohiv<-read.table(
"http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv",
sep=",",
header = TRUE)
attach(hmohiv)
hmo.surv <- survfit(Surv(time, censor)~ 1)

Calling summary on the survival fit will print out a good deal of information for each time point recorded in your data.

normsum <- summary(hmo.surv)
print(normsum)
## Call: survfit(formula = Surv(time, censor) ~ 1)
##
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     1    100      15   0.8500  0.0357       0.7828        0.923
##     2     83       5   0.7988  0.0402       0.7237        0.882
##     3     73      10   0.6894  0.0473       0.6026        0.789
##     4     61       4   0.6442  0.0493       0.5544        0.748
##     5     56       7   0.5636  0.0517       0.4709        0.675
##     6     49       2   0.5406  0.0521       0.4476        0.653
##     7     46       6   0.4701  0.0526       0.3775        0.586
##     8     39       4   0.4219  0.0525       0.3306        0.538
##     9     35       3   0.3857  0.0520       0.2962        0.502
##    10     32       3   0.3496  0.0511       0.2625        0.466
##    11     28       3   0.3121  0.0500       0.2280        0.427
##    12     25       2   0.2872  0.0490       0.2055        0.401
##    13     21       1   0.2735  0.0486       0.1931        0.387
##    14     20       1   0.2598  0.0480       0.1809        0.373
##    15     19       2   0.2325  0.0467       0.1568        0.345
##    22     16       1   0.2179  0.0460       0.1441        0.330
##    30     14       1   0.2024  0.0453       0.1305        0.314
##    31     13       1   0.1868  0.0444       0.1173        0.298
##    32     12       1   0.1712  0.0433       0.1043        0.281
##    34     11       1   0.1557  0.0421       0.0916        0.264
##    35     10       1   0.1401  0.0407       0.0793        0.247
##    36      9       1   0.1245  0.0390       0.0674        0.230
##    43      8       1   0.1090  0.0371       0.0559        0.212
##    53      7       1   0.0934  0.0349       0.0449        0.194
##    54      6       1   0.0778  0.0324       0.0344        0.176
##    57      4       1   0.0584  0.0296       0.0216        0.157
##    58      3       1   0.0389  0.0253       0.0109        0.139

The survival summary takes additional options, namely a vector specifying the times at which you want to summarize your set.

survsum <- summary(hmo.surv, times = c(1,10,25,58))
print(survsum)
## Call: survfit(formula = Surv(time, censor) ~ 1)
##
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     1    100      15   0.8500  0.0357       0.7828        0.923
##    10     32      44   0.3496  0.0511       0.2625        0.466
##    25     14      10   0.2179  0.0460       0.1441        0.330
##    58      3      11   0.0389  0.0253       0.0109        0.139

Stargazer doesn’t like it (and neither does xtable - it won’t even let me compile the Rmd document).

stargazer(survsum)
##
## % Error: Unrecognized object type.

So, I made a convenience function that turns your survfit object into a data frame that can be passed to a table generator. Feel free to modify it to create the columns that you need.

#' Create a nice data frame from a survival model for automatically generating a table.
#'
#' @param survobj \code{\link[survival]{survfit}} object
#' @param timevec a vector of times for survival summary
#' @param scale a number that scales the timevec, e.g., if your data is recorded in days and you want months displayed, use scale = 30
#' @return a data frame with labeled columns
#' @author Aaron Gonzales
#' \url{http://binaryaaron.github.io/}
#' @export

make_surv_table <- function(survobj,
timevec,
scale = 1,
col_names = NULL
){
# vector of times for summary points, defaults to six month intervals
if(is.null(timevec)){
print('timevec invalid; defaulting to six-month intervals by days')
timevec <- c(0, 180, 360, 540, 720)
}
# here, scale converts the numbers into months
survsum <- summary(survobj, timevec, scale)
# make a data frame out of the summary object
df <- data.frame(survsum$time, survsum$surv,
1 - survsum$surv, survsum$std.err,
cumsum(survsum$n.event), survsum$n.risk
)
# give that data frame names
if(is.null(col_names)){
colnames(df) <- c('Time',
'Survival',
'Deaths',
'Survival SE',
'Cummulative Events',
'Number remaining')
} else{
colnames(df) <- col_names
}
# return the frame
return(df)
}

And here it is in action:

summary_days <- c(0, 10, 25, 58)

stargazer(make_surv_table(hmo.surv, summary_days),
summary = FALSE,
type = c('html'),
title = 'Summary of HMOHIV survival fit',
rownames = FALSE)
 Time Survival Deaths Survival SE Cummulative Events Number remaining 0 1 0 0 0 100 10 0.350 0.650 0.051 59 32 25 0.218 0.782 0.046 69 14 58 0.039 0.961 0.025 80 3

The HTML format leaves a hair to be desired, but they look great in latex. Obviously, change the type = 'html' to latex or text as needed.

Hopefully someone else can make use of this as well.

### Written by Aaron Gonzales

I'm a typically atypical guy.

Updated