Journal of the National Cancer Institute Advance Access originally published online on January 8, 2008
JNCI Journal of the National Cancer Institute 2008 100(2):92-97; doi:10.1093/jnci/djm265
© The Author 2008. Published by Oxford University Press.
COMMENTARY
Visualizing Length of Survival in Time-to-Event Studies: A Complement to Kaplan–Meier Plots
Affiliations of authors: Cancer Group, Medical Research Council Clinical Trials Unit, London, UK (PR, MKBP); Centre for Statistics in Medicine, University of Oxford, Oxford, UK (DGA)
Correspondence to: Patrick Royston, DSc, Cancer Group, MRC Clinical Trials Unit, 222 Euston Rd, London NW1 2DA, UK (e-mail: pr{at}ctu.mrc.ac.uk).
| ABSTRACT |
|---|
Because of censoring, standard methods of plotting individual survival times are invalid. Therefore, graphic display of time-to-event data usually takes the form of a Kaplan–Meier survival plot. Kaplan–Meier plots, however, can make differences between groups seem larger than they really are. To overcome these limitations, we developed a technique for producing scatter plots with survival data and applied it to data from a randomized trial of patients with renal cancer. As of June 21, 2001, 25 of the 347 patients with kidney cancer in the Medical Research Council RE01 randomized treatment trial for whom data were available had been censored, and the remainder had died. Values of the censored survival times were imputed by assuming a log-normal distribution of survival times and by drawing a random sample given that each patient with censored data survived at least to the point of censoring. The combined original and imputed data were then examined by use of dot plots and scatter plots. In the RE01 trial, median survival of patients treated with interferon was 3.0 months (95% confidence interval = 0.3 to 5.5 months) longer than that in patients treated with medroxyprogesterone acetate. The Kaplan–Meier analysis showed clear separation between treatment groups and between prognostic groups. In contrast, comparisons of individual observed and imputed survival times between groups of patients showed considerable overlap and gave a more realistic idea of the modest between-group differences than Kaplan–Meier comparisons. These graphs of the distribution of survival times for individuals in each study group, which are simple to produce, may usefully complement Kaplan–Meier plots.
For each individual in a clinical study, time-to-event data denote the time from a starting point to some event, such as disease recurrence or death. In general, by the end of the follow-up period, not all patients will have experienced the event. The survival time for these patients is said to be censored, that is, the observation period ended before the event occurred. Because of censoring, standard statistical methods, such as scatter plots of survival times or regression analysis, are not valid. Graphic display of time-to-event data is presented almost universally in a Kaplan–Meier plot (1–3), which depicts the proportion of individuals surviving without the event as a function of the length of follow-up. The Kaplan–Meier plot has the key benefits of displaying all the data and of allowing correctly for the censoring of survival times, which is a hallmark of such data. One can easily read percentiles of the survival time distribution in each treatment arm from the Kaplan–Meier plot and obtain rough estimates of the median survival times that correspond to a survival probability of 0.5.
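To make the product-limit idea behind a Kaplan–Meier plot concrete, the following is a minimal Python sketch of the estimator and of reading off the median as the first time the estimated survival probability reaches 0.5. The function names and the eight follow-up times are hypothetical illustrations, not data from the RE01 trial, and the sketch omits confidence intervals.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)           # every patient leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk  # product-limit step at an event time
            curve.append((t, surv))
        at_risk -= leaving[t]          # deaths and censorings both leave the risk set
    return curve

def median_survival(curve):
    """First time at which the estimated survival probability is <= 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 0.5

# Hypothetical follow-up times in months; event 0 marks a censored observation.
times = [2, 3, 3, 5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

Censored patients contribute to the number at risk up to their censoring time but never produce a downward step, which is how the estimator allows for censoring.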
In reports of randomized controlled trials, for example, presentation of a Kaplan–Meier plot of survival against time for each treatment group is the standard way of reporting survival data, regardless of other analyses also used. When interpreted carefully and appropriately, a Kaplan–Meier plot provides useful survival information. However, it tends to conceal one important feature of survival data—that is, the inherent variability in the survival time of individual patients (4)—which is normally of considerable interest in data not subject to censoring. Further, because of the space that often appears between two or more survival curves in a Kaplan–Meier plot, the differences between groups may appear larger than they actually are.
In this article, we use data from the Medical Research Council (MRC) RE01 trial in metastatic renal carcinoma to exemplify a new way to present survival time data from trials and prognostic studies. Between February 1992 and November 30, 1997, 350 patients with metastatic renal carcinoma were recruited to the MRC RE01 trial, which was conducted at 31 centers in the United Kingdom. The trial randomly assigned patients to treatment with interferon alpha or with medroxyprogesterone acetate. A 28% (95% confidence interval [CI] = 6% to 45%) reduction in the hazard rate of death in the interferon alpha group, compared with the medroxyprogesterone acetate group, was reported (5).
Overall survival in the RE01 trial, as presented in a Kaplan–Meier plot (Fig. 1) and accompanied by the relevant statistical analysis, shows improved survival in the experimental interferon alpha arm compared with the standard medroxyprogesterone acetate arm. The times of the events are visible as downward steps or notches on the graph. As shown in Fig. 1, few deaths were observed after a follow-up of 48 months.
In this article, we used imputation to develop a method to complement the information given by Kaplan–Meier plots. The method is intended to be used only for qualitative visual exploration and understanding of survival data, not for more extensive analyses. For the statistical analysis of survival data, many analytic methods are available, including determination of the hazard ratio and its corresponding confidence interval and P value (6) and the Cox proportional hazards regression model (7–8) when adjustment for prognostic factors is required.
Censored survival times may be viewed as a type of missing data. One approach to analyzing missing data is through (multiple) imputation, which was introduced by Rubin (9). Hsu et al. (10) proposed a complex nonparametric method for imputing survival data that assumes proportional hazards effects for prognostic factors, and Gelman et al. (11) proposed the use of imputation for checking a statistical model. We present a much simpler approach to imputation that is based on the log-normal distribution [see also Royston and Parmar (12) for a more sophisticated approach to modeling time-to-event data that may in principle also be used for imputation]. In our approach, each censored survival time is completed or imputed by substituting a value sampled at random from a log-normal distribution. The mean and standard deviation of the distribution are estimated by taking into account the values of prognostic factors for each individual patient. To investigate the usefulness of this approach, we used data from the MRC RE01 trial. We aimed to use scatter plots to display individual survival times when groups of patients are compared. Such plots are intended to give a more realistic impression of the differences in survival time between the groups than is possible with Kaplan–Meier plots.
| Visualizing Censored Observations |
|---|
Our goal was to use the available data from the MRC RE01 trial to develop a plausible model of the distribution of survival times that could be used to make an educated guess (called an imputation) as to the true time to event for each censored patient, had follow-up been prolonged sufficiently. The guesses would necessarily have substantial uncertainty and not be accurate for the individual patient. Nevertheless, over the total cohort of patients, the guesses would give useful information about the distribution of individual survival times. The methodology for such imputation is outlined below and is described in more detail by Royston (13).
Briefly, let m be the log survival time expected for a given patient with an actual survival time of t months. The value of m depends on the patient's prognostic factors (see below). Let s be the residual SD (standard deviation) of the log survival times, that is, the SD of (log t – m) across the sample. The log-normal model predicts that (log t – m)/s has a standard Gaussian distribution. For censored observations, the true (but unknown) survival time always exceeds the censoring time, c. A random draw, u, is made from a standard Gaussian distribution truncated below at (log c – m)/s, so that u is never smaller than this value. The imputed survival time for this patient is calculated as antilog(m + u × s). The process is repeated for all other patients with censored times, giving one complete imputation of the censored observations. The uncensored survival times are not altered.
Note that the method introduces an appropriate amount of random variation to simulate realistic individual survival times by drawing a random sample from the right-hand tail region of the log-normal distribution (ie, later than each censoring time). The sampling therefore allows for the fact that each patient with censored data survived at least to the point of censoring. It also takes into account the estimated mean of the log-normal distribution for the survival of that patient.
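The imputation step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Stata program: the function name, the use of inverse-CDF sampling for the truncated draw, and the example values of c, m, and s are all hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, SD 1

def impute_survival_time(c, m, s, rng=random):
    """Impute a survival time for a patient censored at time c.

    c : censoring time (> 0, same units as the survival times)
    m : predicted mean of log survival time for this patient
    s : residual SD of log survival time (> 0)
    """
    lower = (math.log(c) - m) / s          # truncation point on the standardized scale
    p_lower = STD_NORMAL.cdf(lower)        # probability mass below the truncation point
    # Inverse-CDF sampling restricted to the right-hand tail (p_lower, 1).
    p = p_lower + rng.random() * (1.0 - p_lower)
    p = min(max(p, 1e-12), 1.0 - 1e-12)    # keep inv_cdf away from exactly 0 or 1
    u = max(STD_NORMAL.inv_cdf(p), lower)  # guard against rounding below the cut
    return math.exp(m + u * s)             # antilog(m + u * s)

# Hypothetical patient: censored at 24 months, predicted median 8 months, s = 1.3.
rng = random.Random(2008)
imputed = [impute_survival_time(24.0, math.log(8.0), 1.3, rng) for _ in range(1000)]
```

Every imputed value is at least as large as the censoring time, and repeated draws for the same patient differ, which reflects the uncertainty in any single imputed time.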
Statistical Analysis
The statistical package Stata, version 10 (14), was used for all analyses. The cnreg command in Stata was used to fit a Gaussian (normal) distribution to the logarithm of the survival times (ie, to fit a log-normal distribution). The command allows for censoring and for the influence of prognostic factors. Imputation of censored survival times was done with a specially written Stata program (the program is available upon request from the authors). All statistical tests were two-sided.
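For readers without Stata, the quantity that a censored-normal regression maximizes can be written down directly. The Python sketch below evaluates the log-likelihood of a log-normal model with right censoring: observed deaths contribute the Gaussian density of log t, and censored times contribute the probability of surviving beyond t. The function name and the four example patients are hypothetical, the optimization over m and s is omitted, and the sketch is not a reimplementation of cnreg.

```python
import math
from statistics import NormalDist

def censored_lognormal_loglik(times, events, means, s):
    """Log-likelihood of a log-normal survival model with right censoring.

    times  : observed or censoring time for each patient (> 0)
    events : 1 if death was observed, 0 if the time is censored
    means  : predicted mean log survival time for each patient
    s      : residual SD of log survival time (> 0)
    """
    z_dist = NormalDist()
    total = 0.0
    for t, e, m in zip(times, events, means):
        z = (math.log(t) - m) / s
        if e:
            # Observed death: log density of log t under N(m, s^2)
            total += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(s)
        else:
            # Censored: log P(survival time > t); cdf(-z) equals 1 - cdf(z)
            total += math.log(max(z_dist.cdf(-z), 1e-300))
    return total

# Hypothetical data: three deaths and one patient censored at 24 months.
times = [1.2, 8.0, 38.0, 24.0]
events = [1, 1, 1, 0]
plausible = censored_lognormal_loglik(times, events, [math.log(8.0)] * 4, 1.3)
implausible = censored_lognormal_loglik(times, events, [math.log(800.0)] * 4, 1.3)
```

In a full analysis, each entry of `means` would be a linear combination of the patient's prognostic factors, and the coefficients and s would be chosen to maximize this function.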
| Treatment Effect In Metastatic Renal Carcinoma |
|---|
To analyze the RE01 data, we first constructed a prognostic model from available baseline variables that were determined for each patient when they were randomly assigned to treatment [for details of the available prognostic factors and the selected multivariable model, see table 2 and the results section in Royston and Sauerbrei (15)]. Briefly, the selected variables (using the criterion of a P value of less than .05) were WHO (World Health Organization) performance status, hemoglobin level, white cell count, the logarithm of time from diagnosis of metastasis to randomization, and treatment. All continuous variables were kept continuous in the analysis. The only categoric variables were WHO performance status and treatment.
Second, by using a type of probability plot adapted to allow for censored observations, the log-normal model was found to fit the survival times from the RE01 trial well (data not shown). We fitted a log-normal distribution to the survival times allowing for censoring and for the influence of the prognostic factors (by use of the cnreg command in Stata). By allowing for prognostic factors, we increased the accuracy of imputation. The model enabled us to predict the logarithm of the survival time for each patient from their prognostic variables. The predictions were equivalent to a prognostic index or risk score.
Third, by use of the time to censoring for each patient with a censored survival time, we imputed an estimated time of death according to the fitted log-normal distribution, by taking into account the predicted median from step 2 above [for further technical details, see above and section 3.3 of Royston (13)].
As of June 21, 2001, of the 347 patients with available data, 25 (7%) were censored and so required imputed survival times. In contrast to the Kaplan–Meier analysis (Fig. 1), by plotting individual survival times from observed and imputed data (Fig. 2), the overlap of survival distributions between the treatment arms and the wide spread of survival times in each arm were visualized. Although the Kaplan–Meier plot depicts the same underlying data, with each vertical step corresponding to a death, it is much harder to appreciate the considerable overlap in survival times in the two treatment groups because the Kaplan–Meier plot shows only cumulative survival probabilities at particular times rather than individual survival times.
The original trial result was that interferon alpha treatment reduced the hazard of dying by 28% (95% CI = 6% to 45%) at any time during the analysis, compared with medroxyprogesterone acetate treatment. Median survival in patients treated with interferon was 3.0 months (95% CI = 0.3 to 5.5 months) longer than that in patients treated with medroxyprogesterone acetate. However, the treatment effect at the level of the individual patient must be presumed to be relatively small because the natural variation among survival times of patients was much larger than the treatment effect: the 10th, 50th (ie, median), and 90th centiles of survival time for all patients were 1.2, 8.0, and 38.0 months, respectively. The explained variation (R²) in the logarithm of the survival time that was attributable to treatment was only 2.2% (95% CI = 0.3% to 6.4%). This result means that only approximately 2% of the variation in the length of survival observed among the patients could be attributed to the treatment received. That perspective is better reflected by the dot plot in Fig. 2 than by the Kaplan–Meier plot.
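One simple way to see what a small explained variation means is to compute, for complete (observed plus imputed) log survival times, the share of the total sum of squares that lies between the two group means. The Python sketch below does this for two hypothetical groups; it is an illustration only and may differ from the measure of explained variation that the authors used for censored data.

```python
import math
from statistics import mean

def r_squared_two_groups(log_times_a, log_times_b):
    """Share of variance in log survival time explained by group membership."""
    all_values = list(log_times_a) + list(log_times_b)
    grand = mean(all_values)
    total_ss = sum((x - grand) ** 2 for x in all_values)
    between_ss = (len(log_times_a) * (mean(log_times_a) - grand) ** 2
                  + len(log_times_b) * (mean(log_times_b) - grand) ** 2)
    return between_ss / total_ss

# Hypothetical survival times in months for two small groups.
group_a = [math.log(t) for t in (2.0, 8.0, 30.0)]
group_b = [math.log(t) for t in (1.5, 6.0, 25.0)]
r2 = r_squared_two_groups(group_a, group_b)
```

When the spread of times within each group is wide relative to the shift between group means, as in this made-up example, the resulting value is close to zero even though one group has consistently longer times.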
Prognosis in Metastatic Renal Carcinoma
Most new therapies have at best only modest treatment effects (16); variation in prognosis derived from models including prognostic factors such as stage of disease is usually much larger. We used the prognostic model described above and the observed and imputed survival times to illustrate that the treatment effect among the patients in the RE01 trial was small in absolute terms and in fact was considerably smaller than the effect of prognostic factors. We first computed a continuous prognostic index from the factors WHO performance status, hemoglobin level, white cell count, logarithm of time from diagnosis of metastasis to randomization, and treatment. We split the dataset by use of tertiles of the prognostic index into three approximately equal groups of 115, 116, and 116 patients, corresponding to those with good, moderate, or poor prognosis. The median survival times were approximately 15, 9, and 2 months. A Kaplan–Meier plot of these data is shown in Fig. 3. The distribution of the original and imputed survival times for the three groups is shown on a logarithmic scale in Fig. 4. Figures 3 and 4 are analogous to Figs. 1 and 2, and the differences may be interpreted similarly.
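Splitting patients into three near-equal groups by tertiles of a continuous index can be sketched as follows. The function name and the simulated index values are hypothetical; the real prognostic index comes from the fitted log-normal model, and which end of the index counts as good or poor prognosis depends on its sign convention.

```python
import random

def tertile_groups(index_values):
    """Assign each patient to group 0, 1 or 2 by rank of the prognostic index.

    Group 0 holds the lowest third of index values and group 2 the highest;
    group sizes differ by at most one patient.
    """
    n = len(index_values)
    order = sorted(range(n), key=lambda i: index_values[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        groups[i] = rank * 3 // n
    return groups

# Simulated index values for 347 patients (not the RE01 data).
rng = random.Random(1)
index_values = [rng.gauss(2.0, 1.0) for _ in range(347)]
groups = tertile_groups(index_values)
sizes = [groups.count(g) for g in (0, 1, 2)]
```

With 347 patients this rank-based rule gives groups of 116, 116, and 115, the same near-equal split as in the text, although which tertile receives the smaller group depends on the rule used.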
Kaplan–Meier survival curves for the three prognostic groups showed much more separation between groups than curves for the treatment groups (Fig. 2). None (0%) of the 115 patients with good prognosis, three (3%) of the 116 patients with moderate prognosis, and 25 (22%) of the 116 patients with poor prognosis died within 1 month of being randomly assigned, whereas 74 (64%), 42 (36%), and 12 (10%), respectively, survived longer than 12 months. For comparison, nine (5%) of the 172 patients treated with interferon alpha and 19 (11%) of the 175 patients treated with medroxyprogesterone acetate died within 1 month, and 73 (42%) and 55 (31%), respectively, survived longer than 12 months. The variation in the logarithm of survival time explained by the prognostic index (ie, R²) was 37% (95% CI = 31% to 44%), compared with merely 2% for the treatment effect alone. Note that the prognostic index for an individual patient corresponds to the median of the distribution of log survival times estimated by the log-normal model for that specific patient.
The effect of dividing the patients into three groups by plotting the logarithm of survival time against the prognostic index was clarified further by marking the boundaries (ie, the tertile cut points on the prognostic index) between the three prognosis groups (Fig. 5). As noted above, with a log-normal model, the prognostic index may be expressed as the predicted median logarithm of the survival time. The spread of data from the moderate prognosis group, containing the central one-third of the patients, was relatively narrow, compared with the spread of data from the other two prognostic groups. Furthermore, the arbitrary nature of the grouping and the wide spread of survival times around a given median predicted time are revealed by the plot. The even spread of values above and below the line of equality between observed and predicted survival supported a reasonably good fit between the model predictions and the actual data, although there was a considerable amount of unexplained (apparently random) variation.
| Discussion |
|---|
Kaplan–Meier curves have become the standard method of displaying time-to-event data. We concentrated on two main uses of Kaplan–Meier curves: 1) to display differences between treatment groups and 2) to represent groups with differing prognoses as determined from a prognostic model. Although Kaplan–Meier curves in each situation reveal a lot of information, they also hide a lot. In particular, they give no direct indication of the variability in survival times among individual patients, so that the overlap between the distributions of survival times of different groups is obscured.
We have developed a simple approach to overcoming this difficulty in which the distribution of observed and imputed survival times for individuals in each group is illustrated. The method is surprisingly informative and, we hope, will help physicians and patients to understand more fully the results of clinical trials and the implications of prognostic assessments. One direct consequence of the display is to show the near impossibility of predicting, with any reasonable degree of accuracy, the survival of individual patients, even when they are assigned to groups whose Kaplan–Meier curves are clearly separated from each other (Figs. 4 and 5). The difficulty of predicting outcome for individual patients is the result of the inherent variability in the survival distributions and the degree of overlap of the latter across groups. Lack of ability to "explain" the inherent variability accounts for the inability of models to give good predictions for individual patients [see also Henderson and Keiding (4)].
The approach that we developed to produce scatter plots dealing with censoring is illustrated, in this article, with a small amount (7%) of censoring in data from a trial in metastatic kidney cancer that had fairly short survival. The method involved fitting a log-normal model to the data, allowing for censoring and prognostic factors. If the data are not a good fit to a log-normal distribution then other models may be tried, including exponential and Weibull models (17). If extra flexibility is required, more sophisticated models are also available [eg, that described by Royston and Parmar (12)]. With these approaches, we use actual survival times and impute censored values. Using data from the MRC RE01 trial, we displayed survival distributions to show that a substantial difference between Kaplan–Meier curves for two treatment groups may correspond to a rather small shift in the distribution of survival times.
Our study has several limitations. It is clear that with greater censoring, the uncertainty in the true shape of the survival distribution and the difficulty of producing realistic imputations become greater. In an example (not shown) with 86% censoring, we obtained estimates of extreme survival times that extended beyond the human lifespan. The follow-up time was too short to get a realistic appraisal of longer term survival. Consequently, it may be unwise to use our method when more than approximately 50% of the patients are censored because it is not possible to get a reliable feel for the shape of the survival distribution when more than half of the data are being imputed. Even if the log-normal or other assumed distribution were correct, we would not get accurate information on the parameters that define the distribution (ie, the mean and standard deviation). In all cases, but especially with substantial censoring, the imputed values should be inspected for plausibility. For example, imputed survival times of 200 years would not be plausible. We emphasize that even when the method can probably not be used reliably for a given study, researchers should bear in mind the broad lesson of the present article that the Kaplan–Meier curves do not give a complete representation of the difference between survival distributions and that differences in actual survival time between groups may be considerably smaller than indicated in the Kaplan–Meier curves.
With treatment effects larger than those in the RE01 trial, the graphic separation between the points from the two treatment groups will be greater—in fact, the separation should be more like the pattern in Fig. 4 than that in Fig. 2. Because the effects of cancer treatment are usually modest (much smaller than differences in prognosis), Fig. 2 is more likely to be representative of real-life trial results than Fig. 4. To those who toil for many years to do clinical trials and succeed in showing, as did RE01, a clinically important difference between the new and standard treatments, the results presented as in Fig. 2 may appear disappointing. However, we contend that such graphs confirm visually what physicians already understand—namely, that most new (or even old) treatments have only a modest impact on the population of patients that they treat. It should be noted that this new way of presenting survival data in which a clear difference between Kaplan–Meier curves is converted to a highly overlapping pair of scatter plots does not devalue a new treatment. Displaying the frequency distributions shows the direct impact of the treatments, for example, by showing how the survival distribution changed among patients treated with interferon, compared with that among patients treated with medroxyprogesterone acetate. Because the survival times of individuals are not shown explicitly by Kaplan–Meier curves, the direct impact of the treatments is obscured and, furthermore, is not reflected in summary statistics, such as the hazard ratio and its confidence interval.
Analyses such as those presented in this article also provide a basis for discussion as to why further randomized trials evaluating a new treatment in a given disease are necessary. It is clear that even what appears to be a good prognostic index—that is, one that results in good separation in the Kaplan–Meier plots between curves representing the prognostic groups—may explain only a modest amount of the variation in prognosis of patients. For the RE01 trial, the variation was approximately 37%. The explanation for 63% of the variation remains unknown.
Is the method still useful when the log-normal distribution is a less good fit to the data than it was to the MRC RE01 data? In cancer, certainly, the distribution of survival times is invariably positively skewed with a long right-hand tail. Although there is no a priori reason why the log-normal distribution should be a good model for such a distribution, it also is positively skewed and is at least a good candidate. That the fit may be imperfect is not really important because all we ask is that it provides plausible imputations. In our experience, in which we have used this approach for several datasets, it appears to do a good job in this role. An additional possibility is to use more flexible parametric survival models (12). However, the agreement would not improve markedly by using such a model with data from the trial that we used in this article because the log-normal distribution already fits the data well.
Finally, a note of warning is in order. Although the imputation method that we have described helps one to gain an impression of the survival times of individuals, it does not follow that a parametric survival model, such as the log-normal model that we used for imputation, will necessarily be a satisfactory substantive model for the data, nor will it give valid estimates of quantities such as mean survival times. That is the primary reason why the Cox proportional hazards model, which makes no assumptions about the distribution of survival times, is typically the method of choice for regression analysis of censored data. The imputation methods that we describe in this article should be used only for graphic display but not, for example, for statistical significance testing or other inferential tasks.
In summary, we have illustrated an approach to representing censored and observed survival times in a simple but informative way. Such graphs may be a useful complement to Kaplan–Meier plots.
| NOTES |
|---|
The authors take full responsibility for the design of the study, the collection of the data, the analysis and interpretation of the data, the decision to submit the manuscript for publication, and the writing of the manuscript.
None of the authors has any conflict of interests with respect to the material of this article. No specific funding was required.
| REFERENCES |
|---|
1. Kaplan E, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc (1958) 53:457–481.
2. Bland JM, Altman DG. Survival probabilities (the Kaplan–Meier method). BMJ (1998) 317:1572.
3. Pocock SJ, Clayton TC, Altman DG. Survival plots of time-to-event outcomes in clinical trials: good practice and pitfalls. Lancet (2002) 359:1686–1689.
4. Henderson R, Keiding N. Individual survival time prediction using statistical models. J Med Ethics (2005) 31:703–706.
5. Medical Research Council Renal Cancer Collaborators. Interferon-alpha and survival in metastatic renal carcinoma: early results of a randomised controlled trial. Lancet (1999) 353:14–17.
6. Machin D, Cheung YB, Parmar MKB. Survival analysis. A practical approach (2006) 2nd ed. Chichester, UK: John Wiley & Sons.
7. Cox DR. Regression models and life tables. J R Stat Soc (1972) 34:187–220.
8. Bradburn MJ, Clark TG, Love SB, Altman DG. Survival analysis part II: multivariate data analysis – an introduction to concepts and methods. Br J Cancer (2003) 89:431–436.
9. Rubin DB. Multiple imputation for non-response in surveys (1987) New York: John Wiley and Sons.
10. Hsu CH, Taylor JM, Murray S, Commenges D. Survival analysis using auxiliary variables via non-parametric multiple imputation. Stat Med (2006) 25:3503–3517.
11. Gelman A, Van Mechelen I, Verbeke G, Heitjan DF, Meulders M. Multiple imputation for model checking: completed-data plots with missing and latent data. Biometrics (2005) 61:74–85.
12. Royston P, Parmar MK. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat Med (2002) 21:2175–2197.
13. Royston P. The lognormal distribution as a model for survival time in cancer, with an emphasis on prognostic factors. Stat Neerl (2001) 55:89–104.
14. StataCorp. Stata Statistical Software: Release 10.0 (2007) College Station, TX: Stata Press.
15. Royston P, Sauerbrei W. A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials. Stat Med (2004) 23:2509–2525.
16. Bailar JC III, Gornik HL. Cancer undefeated. N Engl J Med (1997) 336:1569–1574.
17. Hastings NAJ, Peacock B, Evans M. Statistical Distributions (2000) 3rd ed. New York: Wiley.
Manuscript received December 11, 2006; revised October 1, 2007; accepted November 7, 2007.