Journal of the National Cancer Institute Advance Access originally published online on January 8, 2008
JNCI Journal of the National Cancer Institute 2008 100(2):80-81; doi:10.1093/jnci/djm304
© The Author 2008. Published by Oxford University Press.
EDITORIALS
Times to Event: Why Are They Hard to Visualize?
Affiliation of author: Statistics Collaborative, Inc, Washington, DC
Correspondence to: Janet Wittes, PhD, Statistics Collaborative, Inc, 1625 Massachusetts Ave, NW, Ste 600, Washington, DC 20036 (e-mail: janet{at}statcollab.com).
One would think that nearly 350 years of experience with survival curves (1) and 50 years of looking at Kaplan-Meier curves (2) would have trained our brains to interpret correctly what our eyes are seeing. But, to mix metaphors, our brains and our eyes march to different drummers. Our brains understand that the height of the riser in a single step of a Kaplan-Meier curve is inversely related to the sample size: the smaller the number of people eligible to experience an event (ie, make the step), the larger the drop when an event occurs. When two time-to-event curves from a study of, say, 5 years hover together for the first 2 years and then separate, the eye thinks that 2 years of treatment with the better intervention are required for an effect, whereas the brain understands that the benefit (or harm) may not require 2 years of treatment, but the effect may take 2 years to become manifest (3). Alternatively, two Kaplan-Meier curves may track closely for most of a clinical trial and then separate widely at the end, at the right side of the graph. The brain knows that large white space between the two curves does not mean a dramatic benefit (or harm) of treatment if one waits long enough but, rather, likely reflects variability arising from the small sample sizes available for analysis at the end of the study. The eye, however, lights immediately on the area where the curves diverge. All of these examples illustrate problems in interpreting time-to-event curves in the context of populations.
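The inverse relation between step height and the number at risk can be made concrete with a small calculation. The following is a minimal sketch of the product-limit (Kaplan-Meier) estimate in Python; the function name `kaplan_meier` and the toy data are illustrative assumptions and do not come from any study discussed here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate for right-censored data.

    times  -- follow-up time for each subject
    events -- 1 if the subject had the event at that time, 0 if censored
    Returns one tuple (time, n_at_risk, drop, survival) per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        # Gather every subject whose follow-up ends at time t.
        j = i
        d = 0
        while j < len(data) and data[j][0] == t:
            d += data[j][1]
            j += 1
        if d > 0:
            previous = survival
            # Each event multiplies survival by (1 - d / n_at_risk),
            # so the drop grows as the risk set shrinks.
            survival *= 1.0 - d / n_at_risk
            steps.append((t, n_at_risk, previous - survival, survival))
        # Events and censorings at t both leave the risk set.
        n_at_risk -= j - i
        i = j
    return steps


# Hypothetical data: ten subjects, events only at times 1 and 9.
toy_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_events = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
for time, at_risk, drop, surv in kaplan_meier(toy_times, toy_events):
    print(time, at_risk, round(drop, 3), round(surv, 3))
```

In this toy example the first event occurs with 10 people at risk and lowers the curve by 0.10, whereas the second occurs with only 2 at risk and lowers it by 0.45, although each step represents a single event.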
Royston et al. (4) raise still another issue—what do time-to-event curves tell us about the individual patient? The authors point out quite correctly that two time-to-event curves can look very different, have a small log-rank P value, and have a comfortably low hazard ratio but that the actual distributions of times to the events overlap considerably. Therefore, they suggest presenting the actual distributions to show the overlap between the two populations, thus tempering the enthusiasm of physicians and patients who see diverging survival curves. That argument makes eminent sense. After all, when we have continuous data for which the variable is something other than time to event, we not only present the mean and some measure of variability of the mean but we also often display a histogram or box plot to show the distribution of the data themselves.
Why don't we do something analogous in studies comparing mortality? The answer is that, when we try to plot raw data, censoring rears its ugly head. Unless we wait until everyone in a trial has died before analyzing its data, some people will still be alive at the end of the trial and, therefore, their time to death remains unknown. To make the problem more vexing, not all censored observations represent the longest times to expected death. If a high-risk person is censored early because of being a late enrollee or a patient was lost to follow-up early in the study, the projected time to death will be short. So, we statisticians have thrown up our collective hands and have not systematically shown person-specific times. Royston et al. (4) now present us with a very simple and clever solution for dealing with those censored observations: use the time until censoring plus the individual's prognostic factors to simulate, under a log-normal distribution, the expected times and then plot both the observed and simulated times on a histogram [see figs. 2 and 4 in Royston et al. (4)]. When, as in the case they describe, the study has relatively few censored observations, their method allows a direct way of visualizing the time-to-death distributions. They have kindly offered to provide researchers with a computer program that produces their plots; I predict a rush of requests.
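To illustrate the general idea of simulating a time to death for a censored patient, the sketch below draws from a log-normal distribution conditional on survival beyond the censoring time. It is not the program of Royston et al. (4). The function name `impute_censored_time` and the parameters `mu` and `sigma` are assumptions made for illustration, and in practice `mu` would come from a model fitted to the individual's prognostic factors.

```python
import math
import random
from statistics import NormalDist


def impute_censored_time(censor_time, mu, sigma, rng):
    """Draw T with log(T) ~ Normal(mu, sigma), conditional on T > censor_time.

    mu and sigma are hypothetical model parameters on the log-time scale;
    mu would typically depend on the patient's prognostic factors.
    """
    std_normal = NormalDist()
    # Model probability that death would have occurred by the censoring time.
    p_by_censoring = std_normal.cdf((math.log(censor_time) - mu) / sigma)
    # Sample uniformly within the remaining upper tail of the distribution.
    u = p_by_censoring + rng.random() * (1.0 - p_by_censoring)
    # Keep u strictly inside (0, 1) so the inverse CDF is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    simulated = math.exp(mu + sigma * std_normal.inv_cdf(u))
    # Guard against rounding that could place the draw below the censoring time.
    return max(censor_time, simulated)


# Hypothetical patient: censored at 24 months, model median survival 12 months.
rng = random.Random(0)
draws = [impute_censored_time(24.0, math.log(12.0), 1.0, rng) for _ in range(5)]
print([round(d, 1) for d in draws])
```

Every simulated time is at least as long as the observed censoring time, so a high-risk patient censored early still receives a short projected time to death when the model median is short.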
Royston et al. (4) also suggest that their method of presenting data be used as an adjunct to, not a replacement for, Kaplan-Meier curves. The two methods of visualization complement each other, in that Kaplan-Meier curves show effects in the population and their plots show individual times. Their presentation provides a good sense of the variability of the data, but it does not give information on the variability of the estimated proportions surviving at any given time. For that, one needs either a Kaplan-Meier curve with confidence limits (such curves are often very difficult to read) or a table with the estimated proportions and their 95% confidence limits. I would urge that presentations of Kaplan-Meier curves also routinely include ticks to identify the times of censoring (unless, of course, the sample sizes are so large that the ticks badly clutter the curves). Without those ticks, one cannot tell which proportion of the imputed survival times come from the right-hand, or later time, side of the curves and which proportion are the inferentially more problematic cases coming from earlier times.
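A table of estimated proportions surviving with 95% confidence limits can be computed directly. The sketch below uses Greenwood's variance formula with simple symmetric limits truncated to the interval from 0 to 1. That choice, the function name `km_with_greenwood`, and the toy data are assumptions for illustration, and other interval constructions (for example, on the log-log scale) are also in common use.

```python
import math


def km_with_greenwood(times, events, z=1.96):
    """Kaplan-Meier estimates with Greenwood-based confidence limits.

    Returns one tuple (time, n_at_risk, survival, lower, upper)
    per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        # Gather every subject whose follow-up ends at time t.
        j = i
        d = 0
        while j < len(data) and data[j][0] == t:
            d += data[j][1]
            j += 1
        if d > 0:
            survival *= 1.0 - d / n_at_risk
            if n_at_risk > d:
                # Greenwood term: d / (n * (n - d)) at each event time.
                greenwood_sum += d / (n_at_risk * (n_at_risk - d))
            std_err = survival * math.sqrt(greenwood_sum)
            lower = max(0.0, survival - z * std_err)
            upper = min(1.0, survival + z * std_err)
            rows.append((t, n_at_risk, survival, lower, upper))
        n_at_risk -= j - i
        i = j
    return rows


# Hypothetical data: ten subjects, events only at times 1 and 9.
toy_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_events = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
for t, n, s, lo, hi in km_with_greenwood(toy_times, toy_events):
    print(t, n, round(s, 3), round(lo, 3), round(hi, 3))
```

With these toy data the limits at the late event time, where only 2 subjects remain at risk, are far wider than those at the first event time.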
The example the authors use is particularly well suited to their method. The proportion of censored observations is low (7%), with most occurring in the "good" prognosis patients [see fig. 4 in Royston et al. (4)]. They might have chosen an even more extreme case. As mentioned above, many Kaplan-Meier curves in the literature show dramatic-looking differences at the right side of the curve—indicating a very large benefit of treatment—but the actual numbers at risk are very low at exactly the places where the curves are farthest apart. The graphs of Royston et al. (4) might be particularly helpful in such studies.
In other situations, their graphs will be less useful. As the proportion of censored observations increases, the graphs will reflect more and more uncertainty. The authors suggest that their graphs not be used with more than 50% censoring or when the simulations produce biologically inconsistent times of death. My own threshold of comfort is currently well below 50% censoring. However, until users have collective experience with these presentations, everyone using the method should play with data to develop a sense of when the graphs will be useful.
If the outcome under study is something other than mortality, then censoring occurs for a number of reasons. As in studies of mortality, a trial may end before an event has occurred or a person may be lost to follow-up. Other reasons may apply in the nonmortality setting: a person may die without experiencing the outcome of interest or another event may occur that precludes the outcome of interest. In such cases, imputation will be more difficult and more sensitive to assumptions. The authors restrict their discussion to mortality, but their plots will certainly be put to "off-label" use. All of us who experiment with these graphs should be careful in this setting to understand the assumptions being made in simulating time-to-event data.
Royston et al. (4) have persuasively pointed out some important limitations of Kaplan-Meier curves and have proposed a method of presenting mortality data under limited censoring. Those of us who work with time-to-event data should now attempt to extend their method to cases with many censored events or to settings other than survival.
REFERENCES
1. Graunt J. Natural and Political Observations Made Upon the Bills of Mortality (1662) London: Thos. Roycroft.
2. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc. (1958) 53:457–481.
3. Lagakos SW. Time-to-event analyses for long-term treatments—the APPROVe trial. N Engl J Med (2006) 355(2):113–117.
4. Royston P, Parmar MKB, Altman DG. Visualizing length of survival in time-to-event studies: a complement to Kaplan-Meier plots. J Natl Cancer Inst. (2008) 100(2):92–97.