© 2003 by Oxford University Press
Journal of the National Cancer Institute, Vol. 95, No. 9, 634-635,
May 7, 2003
EDITORIAL
Judging New Markers by Their Ability to Improve Predictive Accuracy
Correspondence to: Michael W. Kattan, Ph.D., Health Outcomes Research Group, Department of Epidemiology and Biostatistics and Department of Urology, Memorial Sloan-Kettering Cancer Center, 1275 York Ave., New York, NY 10021 (e-mail: kattanm@mskcc.org).
The man who has recently received a radical prostatectomy to treat his clinically localized prostate cancer now faces an important decision: whether or not adjuvant therapy would be beneficial. Clearly, a major factor in this decision is the likelihood of his disease recurring in the absence of additional therapy. There are at least three well-documented prognostic models for use in this setting, and each predicts the likelihood of biochemical progression (i.e., prostate-specific antigen [PSA]-defined recurrence of prostate cancer). Partin et al. (1) developed an equation they called "Rw"; Blute et al. (2) devised the "GPSM" score (which includes the Gleason score, PSA level, seminal vesicle status, and margin status); and Kattan et al. (3) derived a postoperative nomogram, which was later validated by Graefen et al. (4). Which of these models predicts best for the individual patient? The GPSM score and the postoperative nomogram have been evaluated by the concordance index and have values of 0.76 and 0.80, respectively, suggesting relatively similar performance. The concordance index is the probability that, given two randomly selected patients, the patient with the worse outcome is, in fact, predicted to have a worse outcome (5). This measure, similar to an area under the receiver operating characteristic curve, ranges from 0.5 (i.e., chance or a coin flip) to 1.0 (perfect ability to rank patients).
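To make the definition concrete, the following Python function is a minimal sketch of a pairwise concordance calculation for right-censored follow-up data. It is an illustration under stated assumptions, not the procedure used in any of the cited studies: the function name, the argument names, and the handling of ties (pairs with tied follow-up times are skipped, and tied predictions count as half-concordant) are choices made here for clarity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Pairwise concordance for right-censored outcomes (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if recurrence was observed, 0 if the patient was censored
    risk_scores -- model prediction; a higher score means a worse predicted outcome

    Assumptions made for this sketch: a pair is usable only when the patient
    with the shorter follow-up actually had the event, pairs with tied times
    are skipped, and tied risk scores count as half-concordant.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            # We cannot tell which of the two patients did worse.
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if usable == 0:
        raise ValueError("no usable patient pairs")
    return concordant / usable
```

With this convention, a model that ranks every usable pair correctly scores 1.0, and a model that assigns every patient the same prediction scores 0.5, matching the range described above.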
In this issue of the Journal, Rhodes et al. (6) have found that a novel marker, the E-cadherin and enhancer of zeste homolog 2 (EZH2) status, may provide additional prognostic ability in the postoperative prostate cancer disease setting. They have found that the interaction of E-cadherin and EZH2 is statistically significant in multivariable analysis (P = .003) and has a hazard ratio of 3.19. This association may prove to have important biologic implications. However, from a prediction perspective, an important question should be asked of any new marker: How accurate is the best prediction model that contains the new marker relative to the best model that lacks it? That is, how much does the concordance index improve with knowledge of the patient's novel marker? This increment is a direct gauge of the progress being made in our ability to predict patient outcome.
Analyses that characterize markers by their impact on the predictive accuracy (e.g., as measured by a change in the concordance index) of a model are rare, but beneficial. Begg et al. (7) effectively did this when they compared three rival staging systems in thymoma. As Begg et al. point out, many prognostic factors contain little or no relevant information that is not already available when standard prognostic factors are combined optimally. For this reason, it is important to compare the best (i.e., most accurately predicting) models, with and without the marker of interest. When judging the value of a model containing a new marker, an important question is whether it is possible to achieve an equivalent concordance index by the optimal modeling of all predictors besides the novel marker. If so, the new marker has not improved our ability to predict patient outcome.
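The comparison described above can be sketched in a few lines of Python. This is a hedged illustration rather than a prescribed method: the helper names, the tie handling, and the toy data are all hypothetical, and the sketch assumes that each set of risk scores comes from the best available model (with or without the marker) and was ideally obtained on patients not used to fit that model.

```python
def concordance_index(times, events, scores):
    """Compact pairwise concordance for right-censored data (sketch).

    A pair is usable when the patient with the strictly shorter follow-up
    had the event; tied scores count as half-concordant (an assumption).
    """
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable patient pairs")
    return concordant / usable


def incremental_concordance(times, events, scores_without, scores_with):
    """Return (C without marker, C with marker, increment)."""
    c_without = concordance_index(times, events, scores_without)
    c_with = concordance_index(times, events, scores_with)
    return c_without, c_with, c_with - c_without


# Hypothetical toy data: six patients, higher score = worse predicted outcome.
toy_times = [5, 12, 20, 31, 40, 52]
toy_events = [1, 1, 0, 1, 0, 0]
toy_scores_without = [0.6, 0.7, 0.5, 0.2, 0.4, 0.1]  # established markers only
toy_scores_with = [0.9, 0.7, 0.5, 0.6, 0.4, 0.1]     # established + new marker

c_without, c_with, increment = incremental_concordance(
    toy_times, toy_events, toy_scores_without, toy_scores_with
)
print(f"C without marker: {c_without:.3f}")
print(f"C with marker:    {c_with:.3f}")
print(f"Increment:        {increment:.3f}")
```

The quantity of interest is the increment, not the P value of the marker in either model. An increment near zero would indicate that the marker adds little beyond the optimally combined established predictors, however small its P value.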
Why should we change the way we ordinarily look at markers and instead compare the accuracies of two models? The reasons that the comparison must be model-based and that traditional reporting of P values and hazard ratios from multivariable analysis is inadequate are manifold. First, an individual patient's optimal prediction, in most cases, will come from a multivariable model. Rarely would a single marker, absent any modeling, be ideal for prediction. If a model of markers provides the most accurate prediction, we should be evaluating models of markers. Second, the P value tests whether the association with the marker is 0, which is not testing the question of direct interest: whether a new marker improves our ability to predict. As Simon (8) points out, these are different questions. Third, when examining the P value for a novel marker, this value may depend on how the other variables are considered in the multivariable model. For example, the use of cutoffs or transforms for the established marker(s) can affect the P value of the novel marker. A comparison of the best models with and without a marker of interest provides a more objective alternative, because the emphasis is shifted to the predictive accuracies of the models; the modeling approach used should be the one that provides the most accurate predictions (e.g., maximizes the concordance index), an objective goal. This model comparison conveniently alleviates another problem, that of automated variable selection. Procedures such as backwards elimination tend to reduce the P values of variables that survive elimination (i.e., the P values of remaining variables tend to shrink as other variables are eliminated) (9).
Thus, the concern when a marker has a small P value only after variable selection, and not when judged in the full model before variable selection, is largely resolved because 1) automated variable selection procedures would be used only when they improve a model's predictive ability [which is very rare (9)] and 2) the P value of the marker after variable selection would not be of direct interest.
Interpretation of a novel marker's hazard ratio, in an effort to judge the marker's prognostic value, has similar drawbacks. The hazard ratio is dependent on the measurement scale of the marker, the cutoff(s) used for the novel marker, and the manner in which established variables are modeled.
The following case study illustrates why incremental model predictive accuracy is a valuable metric. A new marker, percentage of biopsy cores positive for prostate cancer, was recently analyzed for its ability to improve preoperative prediction of prostate cancer recurrence after radical prostatectomy (10). When added to a model containing the established markers (pretreatment PSA level, clinical tumor stage, and biopsy Gleason score), the percentage of positive cores was highly statistically significant (P<.001). However, the concordance index of this model (0.75) was identical to that of the best model that lacked this predictor. Thus, percentage of positive cores, despite being statistically significant on multivariable analysis, did not advance our ability to predict individual patient outcomes by any appreciable amount. However, when levels of interleukin 6 and transforming growth factor β1 were added to the model containing the established markers, these new markers were also each highly statistically significant (each P<.001), and indeed the concordance index improved to 0.83. Thus, evaluation of the concordance index provided a bottom-line analysis when marker P value inspection results were inconsistent.
In conclusion, more emphasis on prognostic model predictive accuracy is needed. Markers should be judged on their ability to improve an already optimized prediction model, rather than on their P value in a multivariable analysis. In addition, this incremental predictive accuracy should be quantified. As a measure of a model's predictive ability, the concordance index may not be the perfect metric, and methods of comparing concordance indices need further development, but it is perhaps the best alternative presently available (11). Measurement of predictive accuracy is an active area of research. Nonetheless, continuous measurement of the improvement in our ability to predict outcomes for cancer patients helps us to know when prognostic progress has been made and keeps our focus on the important goal of improving our ability to predict patient outcome.
REFERENCES
1 Partin AW, Piantadosi S, Sanda MG, Epstein JI, Marshall FF, Mohler JL, et al. Selection of men at high risk for disease recurrence for experimental adjuvant therapy following radical prostatectomy. Urology 1995;45:831–8.
2 Blute ML, Bergstralh EJ, Iocca A, Scherer B, Zincke H. Use of Gleason score, prostate specific antigen, seminal vesicle and margin status to predict biochemical failure after radical prostatectomy. J Urol 2001;165:119–25.
3 Kattan MW, Wheeler TM, Scardino PT. Postoperative nomogram for disease recurrence after radical prostatectomy for prostate cancer. J Clin Oncol 1999;17:1499–507.
4 Graefen M, Karakiewicz P, Cagiannos I, Klein EA, Kupelian PA, Quinn D, et al. A validation study of the accuracy of a postoperative nomogram for recurrence after radical prostatectomy for localized prostate cancer. J Clin Oncol 2002;20:951–6.
5 Harrell FE Jr, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA 1982;247:2543–6.
6 Rhodes DR, Sanda MG, Otte AP, Chinnaiyan AM, Rubin MA. Multiplex biomarker approach for determining risk of prostate-specific antigen-defined recurrence of prostate cancer. J Natl Cancer Inst 2003;95:661–9.
7 Begg CB, Cramer LD, Venkatraman ES, Rosai J. Comparing tumor staging and grading systems: a case study and a review of the issues, using thymoma as a model. Stat Med 2000;19:1997–2014.
8 Simon R. Evaluating prognostic factor studies. In: Gospodarowicz MK, editor. Prognostic factors in cancer. 2nd ed. New York (NY): Wiley-Liss, Inc.; 2001. p. 49–56.
9 Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15:361–87.
10 Kattan MW, Scardino P. Prediction of progression: nomograms of clinical utility. Clinical Prostate Cancer 2002;1:90–6.
11 Harrell FE Jr. Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. New York (NY): Springer-Verlag; 2001.