
JNCI Journal of the National Cancer Institute 2006 98(23):1686-1693; doi:10.1093/jnci/djj463

© The Author 2006. Published by Oxford University Press.

ARTICLE

Gail Model for Prediction of Absolute Risk of Invasive Breast Cancer: Independent Evaluation in the Florence–European Prospective Investigation Into Cancer and Nutrition Cohort

Adriano Decarli, Stefano Calza, Giovanna Masala, Claudia Specchia, Domenico Palli, Mitchell H. Gail

Affiliations of authors: Medical Statistics and Biometry Institute, University of Milan, Milan, Italy (AD); Unit of Medical Statistics and Biometry, National Cancer Institute, Milan, Italy (AD); Department of Biomedical Science and Biotechnology, Medical Statistics and Biometry Section, University of Brescia, Brescia, Italy (SC, CS); Molecular and Nutritional Epidemiology Unit, Cancer Research and Prevention Center, Scientific Institute of Tuscany, Florence, Italy (GM, DP); Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD (MHG)

Correspondence to: Adriano Decarli, PhD, Medical Statistics and Biometry Institute, University of Milan, Via Venezian 1, 20133 Milan, Italy (e-mail: adriano.decarli{at}unimi.it).


    ABSTRACT
Background: The Gail model 2 (GM) for predicting the absolute risk of invasive breast cancer has been used for counseling and to design intervention studies. Although the GM has been validated in US populations, its performance in other populations is unclear because of the wide variation in international breast cancer rates.

Methods: We used data from a multicenter case–control study in Italy and from Italian cancer registries to develop a model (IT-GM) that uses the same risk factors as the GM. We evaluated the accuracy of the IT-GM and the GM using independent data from the Florence–European Prospective Investigation Into Cancer and Nutrition (EPIC) cohort. To assess model calibration (i.e., how well the model predicts the observed numbers of events in subsets of the population), we compared the number of expected incident breast cancers (E) predicted by these models with the number of observed incident breast cancers (O), and we computed the concordance statistic to measure discriminatory accuracy.

Results: The overall E/O ratios were 0.96 (95% confidence interval [CI] = 0.84 to 1.11) and 0.93 (95% CI = 0.81 to 1.08) for the IT-GM and the GM, respectively. The IT-GM was somewhat better calibrated than the GM in women younger than 50 years, but the GM was better calibrated when age at first live birth categories were considered (e.g., 20- to 24-year age-at-first-birth category E/O = 0.68, 95% CI = 0.53 to 0.94 for the IT-GM and E/O = 0.75, 95% CI = 0.58 to 1.03 for the GM). The concordance statistic was approximately 59% for both models, with 95% confidence intervals indicating that the models perform statistically significantly better than pure chance (concordance statistic of 50%).

Conclusions: There was no statistically significant evidence of miscalibration overall for either the IT-GM or the GM, and the models had equivalent discriminatory accuracy. The good performance of the IT-GM when applied to the independent data from the Florence–EPIC cohort indicates that the GM can be improved for use in populations other than US populations. Our findings suggest that the Italian data may be useful for revising the GM to include additional risk factors for breast cancer.



    INTRODUCTION
Each year, more than 800 000 new cases of breast cancer are diagnosed worldwide (1). Although breast cancer is the most commonly diagnosed cancer among women in developed countries, there is wide international variation in breast cancer incidence rates (2). For example, according to the GLOBOCAN 2002 database (3), the estimated age-standardized breast cancer incidence rates for 1998 through 2002 were 74.4 per 100 000 women in Italy and 101.1 per 100 000 women in the United States. Thus, it is possible that a breast cancer prediction model that uses a small number of risk factors to estimate the absolute risk of breast cancer, such as the Gail model (4), will be better calibrated for populations in some countries than for those in other countries. The accuracy of the calibration of the Gail model for populations in Italy might depend on the extent to which the prevalence in Italy of the risk factors in the model accounts for the lower breast cancer incidence rates in Italy and on whether other factors, such as differences in the use of screening mammography, might affect these rates. For example, in the United States in 2000, 70.1% of women older than 39 years reported having had a mammogram within the previous 2 years (5). Although comparable data from Italy are not available, it is likely that a smaller proportion of Italian women received mammograms, based on the coverage of available screening programs (6) and on the National Health Service data (7). According to these data, the estimated percentage of Italian women who were older than 49 years and had a mammogram during 2004 was 28.1%.

The original Gail model (4) was based on data from the Breast Cancer Detection Demonstration Project (BCDDP), a program of five annual screening examinations for breast cancer that was conducted at 29 centers in the United States. The model had a multivariate relative risk component that included the woman's age at first live birth, her age at menarche, the number of previous benign breast biopsies, whether or not atypical hyperplasia had been identified on any of the biopsies, and the total number of first-degree relatives with breast cancer. The original Gail model also used baseline age-specific breast cancer risks estimated from the BCDDP to estimate the absolute risk of invasive or in situ breast cancer. Investigators who were designing the Breast Cancer Prevention Trial to determine whether tamoxifen could prevent breast cancer (8) later modified the baseline hazard portion of the original Gail model by using invasive breast cancer rates from the National Cancer Institute's Surveillance, Epidemiology, and End Results program. This modified model, referred to as "Gail model 2" by Costantino et al. (9), is incorporated into the National Cancer Institute's Breast Cancer Risk Assessment Tool (available at http://www.cancer.gov/bcrisktool/).

The Gail model 2 (hereafter referred to as the GM) has been validated using independent data from the United States. Rockhill et al. (10) used data from the Nurses' Health Study for the period from 1992 to 1997 to determine whether the average absolute risk of breast cancer predicted by the GM for specific subgroups of women accurately predicted the observed number of breast cancers in those subgroups (i.e., calibration). Overall, the ratio of expected (E) to observed (O) numbers of cases was 0.94, with values near 0.90 for younger women and near 1.00 for older women. On the basis of these subgroup analyses, Rockhill et al. concluded that the GM was well calibrated, but they noted that the distribution of projected risks among women who developed breast cancer overlapped considerably with that among women who did not develop breast cancer (concordance statistic = 0.58), indicating that the GM had "modest discriminatory accuracy at the individual level." Costantino et al. (9) found an overall E/O ratio of 1.03 for women in the control arm of the Breast Cancer Prevention Trial and also concluded that the GM was well calibrated. Other studies have examined the calibration and relative risk features of the GM in unscreened US populations (11,12), in which the model can overestimate risk; in African American populations (13,14), for which additional validation studies are needed; in women attending high-risk clinics (15), for whom the GM gave predictions that were strongly concordant with those from genetically based models; and in women attending specialized clinics (16–20).

Comparatively few publications have evaluated the GM in non-US populations. Of those that have, some (21,22) were based solely on case–control data, which can be used to investigate relative risk features but not absolute risks. Boyle et al. (23) compared the observed number of breast cancers among 5408 hysterectomized women without benign breast disease in an Italian trial of tamoxifen to prevent breast cancer with the number predicted by the GM. The GM predicted 88.4 events, whereas 79 breast cancers were observed (E/O = 1.12, 95% confidence interval [CI] = 0.92 to 1.28). Amir et al. (24) studied 3170 women in the Family History Clinic at the University Hospital in South Manchester, United Kingdom. In this population, the GM had a high concordance (0.735), but the expected number of breast cancers, 44.30, underestimated the 64 observed breast cancers (E/O = 0.69, 95% CI = 0.54 to 0.90).

We used data from a case–control study that was conducted in six regions of Italy to develop two models for predicting breast cancer risk that were based on the Gail relative risk model. One model used ordinal 1-df coding of the data (IT-GM), as in the original Gail model (4), and the other used categorical coding (IT1-GM). We then used independent data from the Florence cohort of the European Prospective Investigation Into Cancer and Nutrition (EPIC) study (25) to evaluate the three models and to compare the performance of the GM with that of the IT-GM and the IT1-GM.


    SUBJECTS AND METHODS
Italian Multicenter Case–Control Study of Diet and Breast Cancer

We used data from a case–control study of diet and breast cancer that was conducted from June 1991 through April 1994 in six areas in Italy and has been described in detail (26) to develop breast cancer risk prediction models that were based on the Gail relative risk model. Briefly, case subjects (N = 2569) were women aged 23–74 years (median age = 55 years) with no previous history of cancer who were admitted to the major teaching and general hospitals of the study areas with histologically confirmed invasive breast cancer that was diagnosed within the year before the study interview. Control subjects (N = 2588) were women aged 20–74 years (median age = 56 years) who were admitted for acute conditions to hospitals in the same catchment areas as the case subjects. Women admitted for gynecologic, hormonal, or neoplastic diseases or for diseases related to known risk factors for breast cancer were not included. Control subjects were admitted for trauma (mostly fractures and sprains; 22%), nontraumatic orthopedic diseases (33%), surgical conditions (15%), eye diseases (18%), or other conditions (12%), such as ear, nose, throat, skin, or dental conditions. Case and control subjects were not matched but had similar distributions of age and area of residence. Participation rates exceeded 96% for case and control subjects. The interviewers who administered the questionnaires were trained in one center, and the same structured questionnaire and coding manual were used in all study centers. The following information was extracted from the original data files: age, age at menarche, age at first live birth, number of births, menopausal status, age at menopause, family history of breast cancer, and number of breast biopsies. A single file that included these items plus the case–control status and the center of enrollment was created. 
Only four case subjects and four control subjects had missing data for some of these variables; these subjects were excluded when fitting the relative risk models. All study participants signed an informed consent form. The case–control study was approved by the local ethics committee.

Florence–EPIC Cohort Study

We used data from the Florence–EPIC study cohort to validate our breast cancer prediction models. This cohort included women aged 35–64 years who resided in the Italian provinces of Florence and Prato, which are covered by the Florence Cancer Registry, and were recruited into the Florence portion of the EPIC–Italy prospective study on diet and cancer (27,28). EPIC study enrollment strategies included actively inviting women who were attending breast and cervical cancer screening programs, blood donors, members of consumer associations, and employees of private companies and public agencies and enrolling subjects who joined the cohort after being informed about the study by media. Overall, 10 083 women were enrolled. Detailed information on dietary and lifestyle habits, reproductive history, and family history of breast cancer was collected from each woman in the study. Dietary information on the frequency of consumption of more than 120 food and beverage items was obtained from a self-administered food frequency questionnaire that had been validated in a pilot phase of the study (29). Automated and manual procedures were used to check the quality and completeness of the data.

We excluded 30 women who had prevalent breast cancer at the time of recruitment and 12 additional women in whom incident breast cancer was diagnosed within 6 months after recruitment. Follow-up therefore consisted of the period beginning 6 months after recruitment through December 31, 2002, when the cohort follow-up was last updated. Only 10 women were lost to follow-up. A total of 194 women were diagnosed with invasive breast cancer during follow-up among the 10 031 women included in the analysis. All participants signed an informed consent form at enrollment. The Florence–EPIC project was approved by the local ethics committee.

There was no overlap between the women enrolled in the case–control study and those recruited for the Florence–EPIC cohort study.

Statistical Methods

As previously described (4,30), we calculated the absolute risk or probability of developing breast cancer between the ages $a$ and $a + \tau$, for a woman who is in risk factor stratum $i$, as:

$$
P(a, \tau, r_i) = \int_a^{a+\tau} h_1(t)\, r_i(t)\, \exp\left\{-\int_a^t h_1(u)\, r_i(u)\, du\right\} \frac{S_2(t)}{S_2(a)}\, dt,
$$

where subscript 1 refers to the incidence of breast cancer and subscript 2, to all other causes of death. In this equation, $h_1(t)$ is the baseline hazard rate of developing breast cancer at age $t$ in the reference group and $r_i(t)$ is the relative risk of developing breast cancer at an age $t$ for a woman in risk stratum $i$ compared with the group of subjects without known risk factors (baseline or reference group). Note that $r_i(t)$ may depend on age $t$, so that a proportional hazards assumption is not required, although in the GM, this assumption holds separately for women younger than 50 years and those 50 years or older. In addition, $h_2(t)$ is the mortality rate, at age $t$, from all causes of death, except breast cancer, in the population, and

$$
S_2(t) = \exp\left\{-\int_0^t h_2(u)\, du\right\}
$$

is the probability of surviving these competing causes of death to age $t$.
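The absolute risk integral can be evaluated numerically. The following is a minimal sketch, assuming that all hazards are constant within one-year age bands, in which case the integral has a closed form within each band. The function name `absolute_risk` and the toy rates in the usage lines are hypothetical illustrations and are not taken from the study.

```python
import math


def absolute_risk(age, tau, h1, h2, rr):
    """Approximate the probability of breast cancer between age and age + tau.

    h1(t): baseline breast cancer hazard at age t (per woman-year)
    h2(t): competing mortality hazard at age t (per woman-year)
    rr(t): relative risk for the woman's risk stratum at age t
    Assumption: every rate is constant within each one-year age band,
    and tau is a whole number of years.
    """
    risk = 0.0
    surviving = 1.0  # probability of being alive and cancer-free so far
    for k in range(tau):
        t = age + k
        lam1 = h1(t) * rr(t)  # breast cancer hazard for this woman
        lam2 = h2(t)          # hazard of death from other causes
        total = lam1 + lam2
        if total > 0:
            leave_band = 1.0 - math.exp(-total)
            # Share of exits from this band that are breast cancer diagnoses.
            risk += surviving * (lam1 / total) * leave_band
            surviving *= math.exp(-total)
    return risk


# Toy inputs (hypothetical): constant rates, relative risk of 2.
five_year_risk = absolute_risk(
    50, 5,
    h1=lambda t: 0.001,
    h2=lambda t: 0.005,
    rr=lambda t: 2.0,
)
print(f"5-year absolute risk: {five_year_risk:.4%}")
```

Because competing mortality removes women from the at-risk pool, the projected risk is slightly lower than the value obtained by ignoring $h_2(t)$.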

To develop the IT-GM and IT1-GM models, we estimated $r_i(t)$ and age-specific attributable risks $AR(t)$ using data from the Italian Multicenter Case–Control Study of Diet and Breast Cancer (26,31). The logistic model for IT-GM included the same factors and ordinal 1-df codes as the GM, except for the number of breast biopsies, which was coded as 0, 1, or 2 for none, one, or two or more biopsies, respectively, in the GM, and as 0 or 1 for zero or one or more biopsies, respectively, in the IT-GM. The other factors included in IT-GM were age at menarche, which was coded 0, 1, or 2 for 14 years or older, 12–13 years, or younger than 12 years, respectively; number of breast biopsies; age at first live birth, which was coded 0, 1, 2, or 3 for younger than 20 years, 20–24 years, 25–29 years or nulliparous, or 30 years or older, respectively; family history, which was coded 0, 1, or 2 for zero, one, or two or more affected first-degree relatives (mother and/or sister); a variable for the interaction between the number of breast biopsies and age (<50 versus ≥50 years); and a variable for the interaction between age at first birth and family history. The logistic model for IT1-GM included categorical rather than ordinal 1-df codes for all the variables in the model. This means that each k-level variable was coded using k – 1 dummy variables. We used the method described by Bruzzi et al. (32) to estimate the attributable risk $AR(t)$ separately for women younger than 50 years and for those 50 years or older. The composite age-specific invasive breast cancer incidence rate, $h_1^*(t)$, and the mortality rate from competing risks, $h_2(t)$, were estimated from 1989–1993 data from the Florence Cancer Registry (33). To obtain the baseline hazard, $h_1(t)$, we used the formula (4):

$$
h_1(t) = h_1^*(t)\,[1 - AR(t)] = h_1^*(t) \sum_i \frac{\rho_i(t)}{r_i(t)},
$$

where $\rho_i(t)$ is the proportion of cases in the ith risk group. The 95% confidence intervals for estimated absolute risks were obtained by using the delta method described by Benichou and Gail (34), and the estimated variance of AR was obtained as previously described (35). This procedure accounts for the two major sources of uncertainty, namely, variation in the estimated log odds coefficients and the estimated AR, and the covariances among the estimated log odds coefficients and the AR.
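The step from the composite registry rate to the baseline hazard can be sketched in a few lines. The sketch assumes the standard case-based form of the attributable risk, in which 1 – AR equals the sum over risk groups of the proportion of cases divided by the relative risk. The function names and the numbers in the usage lines are hypothetical and are not study estimates.

```python
def one_minus_attributable_risk(case_proportions, relative_risks):
    """Return 1 - AR as the sum over risk groups of rho_i / r_i.

    case_proportions: proportion of cases in each risk group (sums to 1)
    relative_risks:   relative risk of each group versus the reference group
    """
    if abs(sum(case_proportions) - 1.0) > 1e-9:
        raise ValueError("case proportions must sum to 1")
    return sum(rho / r for rho, r in zip(case_proportions, relative_risks))


def baseline_hazard(composite_rate, case_proportions, relative_risks):
    """Scale the composite incidence rate down to the reference-group hazard."""
    return composite_rate * one_minus_attributable_risk(
        case_proportions, relative_risks
    )


# Hypothetical inputs: three risk groups, composite rate of 200 per 100 000.
rho = [0.5, 0.3, 0.2]
rr = [1.0, 2.0, 4.0]
print(1.0 - one_minus_attributable_risk(rho, rr))  # attributable risk
print(baseline_hazard(200 / 100_000, rho, rr))     # baseline hazard per woman-year
```

In the study this calculation was done separately for women younger than 50 years and for those 50 years or older, so in practice the function would be called once per age stratum.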

Validation studies for the GM, the IT-GM, and the IT1-GM were based on independent Florence–EPIC cohort data. To test the calibration of the models, we calculated the expected number of cases of invasive breast cancer (E) in each subgroup of women and compared it with the observed number of cases (O) in the same subgroup by using E/O ratios (36). We evaluated the 95% confidence intervals for the E/O ratios by assuming that E was fixed and that O had a Poisson distribution (9,10).
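As an illustration of the calibration check, the sketch below computes an E/O ratio and an approximate 95% confidence interval, treating E as fixed and O as a Poisson count. It uses Byar's approximation to the Poisson limits, which is an assumption made here for simplicity; the exact procedure used in the cited references may differ. The function name `eo_ratio_ci` is hypothetical, and the inputs are the overall IT-GM values reported in the Results.

```python
import math


def eo_ratio_ci(expected, observed, z=1.959964):
    """Return (E/O, lower limit, upper limit) with E fixed and O Poisson.

    Poisson limits for O use Byar's approximation (an assumption of this
    sketch); dividing E by the upper limit of O gives the lower limit of E/O.
    Requires observed > 0.
    """
    o = float(observed)
    lower_o = o * (1 - 1 / (9 * o) - z / (3 * math.sqrt(o))) ** 3
    upper_o = (o + 1) * (1 - 1 / (9 * (o + 1)) + z / (3 * math.sqrt(o + 1))) ** 3
    return expected / o, expected / upper_o, expected / lower_o


# Overall IT-GM figures from the Results: E = 186.11, O = 194.
ratio, lower, upper = eo_ratio_ci(186.11, 194)
print(f"E/O = {ratio:.2f} (approximate 95% CI = {lower:.2f} to {upper:.2f})")
```

An interval that includes 1.0 gives no statistically significant evidence of miscalibration for that subgroup.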

The chi-square test for goodness of fit [i.e., $(O - E)^2/E$] or the sum of this quantity over risk factor levels was used to compare the observed and expected numbers of breast cancers in the subgroups defined by categories of each risk factor considered. To test for differences among the chi-square goodness-of-fit statistics obtained for the IT-GM, the IT1-GM, and the GM, we used a parametric bootstrap with resampling (N = 10 000) of Poisson observed counts (37) with means equal to the originally observed counts in each category.
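The bootstrap comparison of two goodness-of-fit statistics can be sketched as follows. This is a simplified illustration under stated assumptions: the confidence interval is taken from the 2.5th and 97.5th percentiles of the bootstrap differences, Poisson draws use Knuth's multiplication method, and all counts in the usage lines are invented. None of the function names comes from the study.

```python
import math
import random


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def poisson_draw(rng, mean):
    """Draw one Poisson variate (Knuth's method; fine for moderate means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def bootstrap_chi_square_difference(observed, expected_a, expected_b,
                                    n_boot=10_000, seed=0):
    """Return (difference, lower, upper) for chi2(model A) - chi2(model B).

    Observed counts are resampled as Poisson variates with means equal to
    the original observed counts; the expected counts stay fixed.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resampled = [poisson_draw(rng, o) for o in observed]
        diffs.append(chi_square(resampled, expected_a)
                     - chi_square(resampled, expected_b))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    point = chi_square(observed, expected_a) - chi_square(observed, expected_b)
    return point, lower, upper


# Invented counts for four categories of one risk factor.
obs = [20, 45, 60, 69]
exp_model_a = [25.0, 40.0, 55.0, 60.0]
exp_model_b = [22.0, 44.0, 61.0, 66.0]
print(bootstrap_chi_square_difference(obs, exp_model_a, exp_model_b))
```

Under this reading, two models differ in calibration for a risk factor when the percentile interval for the difference excludes zero.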

We used the concordance statistic, that is, the area under the receiver operating characteristic curve (36), to measure discriminatory accuracy (10). To evaluate the contributions of the risk factors in the model (other than age) to the discriminatory accuracy, we calculated age-specific concordances by computing separate concordances in the following age groups: 30–39 years, 40–49 years, 50–59 years, and 60 years or older. To compute the average age-specific concordance, we used weights proportional to the number of women in each age group. The variance estimate for the average age-specific concordance was the sum, over the age groups, of the weight squared times the estimated variance of the age-specific concordance estimate, which was obtained as previously described (10).
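The concordance calculation and the weighted average across age groups can be sketched as below. The sketch assumes that tied risks count as one half, and it takes the variance of each age-specific concordance as an input rather than estimating it, because that estimator is described only by reference. The function names and all numbers are hypothetical.

```python
def concordance(case_risks, noncase_risks):
    """Proportion of case/non-case pairs in which the case has the higher
    projected risk (ties counted as one half, an assumption of this sketch)."""
    score = 0.0
    for case in case_risks:
        for noncase in noncase_risks:
            if case > noncase:
                score += 1.0
            elif case == noncase:
                score += 0.5
    return score / (len(case_risks) * len(noncase_risks))


def average_age_specific_concordance(groups):
    """Combine age-specific concordances with weights proportional to size.

    groups: list of (number_of_women, concordance, variance) per age group.
    Returns (weighted average, variance of the weighted average), where the
    variance is the sum of squared weights times the group variances.
    """
    total = sum(n for n, _, _ in groups)
    average = sum(n / total * c for n, c, _ in groups)
    variance = sum((n / total) ** 2 * v for n, _, v in groups)
    return average, variance


# Hypothetical projected risks for one age group.
print(concordance([0.03, 0.02], [0.01, 0.02]))

# Hypothetical age groups: (women, concordance, variance).
print(average_age_specific_concordance([(100, 0.60, 0.0004),
                                         (300, 0.56, 0.0001)]))
```

Computing the concordance within age groups removes most of the discrimination that comes from age itself, so the average reflects the other risk factors in the model.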


    RESULTS
Table 1 shows the distribution of the population (N = 10 031), the number of woman-years of exposure during follow-up, and the number of incident breast cancer cases during follow-up (N = 194) for the Florence–EPIC study cohort for categories defined by breast cancer risk factors. The age at first birth was not known for three women. The age at menarche was not known for 60 women; those women were included in the 12- to 13-year age category. The information on risk factors was complete for more than 99% of the subjects.



 
Table 1.  Distribution of the population size, number of woman-years of follow-up, and number of incident invasive breast cancers by categories of breast cancer risk factors in the Florence–EPIC cohort, 1993–2002*

 
Table 2 shows the distributions of the 2565 case subjects and 2584 control subjects from the Italian Multicenter Case–Control Study of Diet and Breast Cancer (26) among categories defined by the risk factors in the GM. Table 2 also includes the corresponding odds ratios (with 95% confidence intervals) for the IT-GM and IT1-GM models that we estimated from the Italian Multicenter Case–Control Study using the logistic models described in "Subjects and Methods," as well as for the GM (4). The odds ratios from the IT-GM, IT1-GM, and GM were similar, considering that their confidence intervals overlap, even though number of biopsies was coded differently (i.e., according to two categories in the IT-GM and IT1-GM versus according to three categories in the GM). The odds ratio from the IT1-GM was statistically significantly lower than that from the IT-GM only in the subgroup that included subjects who were 30 years or older at first live birth and had more than one affected first-degree relative, a category that included only two women with breast cancer.



 
Table 2.  Distributions of case and control subjects in the Italian Multicenter Case–Control Study on Diet and Breast Cancer and odds ratios (ORs), with 95% confidence intervals (CIs) from the IT-GM, IT1-GM, and GM, by categories defined by risk factors in the original Gail model*

 
The estimated attributable risks from the Italian Multicenter Case–Control Study were 0.397 for the IT-GM and 0.399 for the IT1-GM. The attributable risk for the GM was 0.421 (9). That is, the estimated percentages of invasive breast cancer in the population related to the risk factors considered were 39.7, 39.9, and 42.1 for IT-GM, IT1-GM, and GM respectively.

The observed number of invasive breast cancer cases and the numbers of cases expected from the IT-GM, IT1-GM, and GM are shown in Table 3. The IT-GM predicted 186.11 expected invasive breast cancers and the IT1-GM predicted 194.31 breast cancers. The corresponding E/O ratios, based on 194 observed cases, were 0.96 (95% CI = 0.84 to 1.11) and 1.00 (95% CI = 0.88 to 1.16) for the IT-GM and the IT1-GM, respectively. The GM predicted 180.10 expected breast cancers, producing an E/O ratio of 0.93 (95% CI = 0.81 to 1.08). Thus, the calibration was slightly better for the IT-GM and the IT1-GM than for the GM, but these differences in calibration were not statistically significant, based on pairwise bootstrapped differences of $(O - E)^2/E$.



 
Table 3.  Ratios of expected (E) numbers of invasive breast cancers, based on the IT-GM, IT1-GM, and the GM to the observed (O) numbers of invasive breast cancers in the Florence–EPIC cohort*

 
The number of expected breast cancer cases in the Florence–EPIC cohort calculated using only age-specific rates from the Florence Cancer Registry was 143 (E/O ratio = 0.73, 95% CI = 0.64 to 0.86). This number represents a considerable underestimation of the real breast cancer incidence in the Florence–EPIC cohort. By contrast, the IT-GM, IT1-GM, and GM gave much better predictions of the expected breast cancer risk, probably because these models account for the distribution of risk factors in the Florence–EPIC cohort. For example, 9.9% of women in the Florence–EPIC cohort had one or more affected first-degree relatives (Table 1), compared with only 4.7% of the control subjects in the Italian Multicenter Case–Control Study of Diet and Breast Cancer (Table 2). Thus, the Florence–EPIC cohort appears to have had more women with elevated risk factors than the general population, and the IT-GM, IT1-GM, and GM all account for this fact.

For women younger than 50, the agreement between observed (O = 69) and expected numbers of breast cancer cases was better for the IT-GM (E = 51.49; E/O = 0.75, 95% CI = 0.60 to 0.97) and the IT1-GM (E = 53.54; E/O = 0.77, 95% CI = 0.62 to 1.01) than for the GM (E = 42.32; E/O = 0.61, 95% CI = 0.49 to 0.80) (Table 3). The parametric bootstrap procedure showed that the chi-square statistic for goodness of fit was statistically significantly greater for the GM than for the IT-GM (21.02 versus 9.03; difference = 11.99; 95% CI = 3.76 to 23.34) (Table 3). The chi-square statistic for goodness of fit for the GM (8.90) for subgroups defined by age at first live birth categories was statistically significantly different from that for the IT-GM (14.48; difference = –5.58, 95% CI = –10.47 to –1.04), but not from that for the IT1-GM (14.29; difference = –5.37, 95% CI = –11.27 to 0.99) (Table 3). For the category defined by age at first live birth of 30 years or more, the GM had better agreement between observed and expected invasive breast cancer counts (E/O = 1.28, 95% CI = 0.98 to 1.95) than the IT-GM and the IT1-GM.

The average age-specific concordance statistics were modest: 58.6% (95% CI = 54.4% to 62.8%) for the IT-GM, 59.0% (95% CI = 54.8% to 63.2%) for the IT1-GM, and 58.8% (95% CI = 54.6% to 63.1%) for the GM (Table 3). The average age-specific concordances refer to the population-weighted average of the age-specific concordances. The age-specific concordance estimates the probability that the projected risk of breast cancer for a randomly selected woman diagnosed with breast cancer in a given age group will exceed that for a randomly selected woman without breast cancer in that age group. Histograms of the 5-year breast cancer risks for women diagnosed with breast cancer and women without breast cancer in the Florence–EPIC cohort overlapped considerably (data not shown), as was expected on the basis of these concordance statistics.

Because the IT-GM and IT1-GM had similar E/O ratios, we present a detailed comparison of only the IT-GM with the GM by quintiles of expected risk (Table 4). Both the IT-GM and the GM underestimated breast cancer risk for women in the second and third quintiles of risk and overestimated it in the fifth quintile, but neither deviation from 1.0 was statistically significant.



 
Table 4.  Ratios of expected (E) to observed (O) numbers of invasive breast cancers in the Florence–EPIC cohort for categories defined by quintiles of the projected probability of developing breast cancer*

 
Table 5 presents the absolute risk projections from the IT-GM for two hypothetical subjects with different combinations of risk factors, as previously discussed by Gail et al. (4), and for various age intervals. The estimated risks based on the IT-GM were similar to those estimated by Gail et al. (4), but the 95% confidence intervals were wider, probably reflecting the greater precision in the relative risk estimates of the GM (Table 2).



 
Table 5.  Estimated risks (as percentages) of developing breast cancer and 95% confidence intervals (CIs) according to the IT-GM, over different age intervals for two hypothetical subjects with specific combinations of risk factors*

 

    DISCUSSION
We developed two absolute risk models for invasive breast cancer, the IT-GM and the IT1-GM, by using case–control data from Italy and cancer registry data from the Florence and Prato provinces of Italy. Except for one coding difference for number of breast biopsies, IT-GM used the same codes as the GM; the IT1-GM differed from the GM by using categorical codes rather than ordinal codes. We tested all three models on independent data from the Florence–EPIC cohort. The relative risks estimated by the IT-GM and the IT1-GM were similar to those estimated by the GM, and each of the models gave much better calibrated predictions of the expected invasive breast cancers for the Florence–EPIC cohort than were obtained from Florence Cancer Registry age-specific rates alone. Thus, these models seemed to capture the effects of the risk factors in the Florence–EPIC cohort. For women younger than 50 years, the IT-GM and the IT1-GM were better calibrated than the GM, which tended to underestimate risk [see also (10)], but the GM was better calibrated than the IT-GM and the IT1-GM for most subgroups defined by age at first live birth, although the differences were statistically significant only when compared with IT-GM. There was no statistically significant evidence of miscalibration overall for any of the three models. The average age-specific concordance statistic was approximately 59% for each of the models.

Our finding that the GM was well calibrated in the EPIC data contrasts with the findings reported by Amir et al. (24) for a population in South Manchester, United Kingdom. That study indicated a higher concordance (i.e., 73.5%) for the GM but found that the predicted counts from the GM were statistically significantly lower than the observed counts. A small part of this underestimation may reflect the fact that Amir et al. ignored competing risks in their calculations of expected breast cancers; they compounded a 1-year absolute risk, rather than using the integration that accounts for competing risks over longer age intervals [see (4) and the formula in "Statistical Methods"].

Another study conducted in Italy (23), a trial of tamoxifen for breast cancer prevention, found an E/O ratio of 1.12. This slight excess of E over O may reflect preventive effects of tamoxifen in this study. In addition, that study (23) included a cohort of women who had undergone hysterectomy, whose breast cancer risk may be lower than that of the general population.

It is important to consider the effects of screening on absolute risk projections. Members of the Florence–EPIC cohort were followed with mammography every 2 years, which may have contributed to the fact that more breast cancers were observed than predicted from breast cancer registry data from the region of Florence, where mammography screening in the general population is probably less frequent. The relative age-adjusted incidence rate in Italy compared to the United States was equal to 74.4/101.1 = 0.74 (3). The GM could explain this discrepancy if it included all relevant risk factors and if the sole difference in rates arose from differences in the distributions of these factors. In this case, we would expect the ratio $(1 - AR_{\mathrm{Italy}})/(1 - AR_{\mathrm{US}})$ to equal 0.74. In fact, from the estimate of $AR_{\mathrm{Italy}}$ given by the IT-GM, we calculated this ratio to be (1 – 0.397)/(1 – 0.421) = 1.04. Thus the distributions of the GM risk factors in Italy and the United States do not fully explain the differences in countrywide rates, suggesting that other factors such as screening may contribute to these differences. The good calibration of the GM in the Florence–EPIC cohort probably reflects, in part, the similar screening rates in the Florence–EPIC cohort and in the United States.

A limitation of this study is that it focused on risk in the general population and not on women with special risk factors. In particular, the models we studied do not take into account certain risk factors, such as a woman's personal history of breast cancer, previous radiation treatment to her chest, or the possibility that a disease-producing BRCA1 or BRCA2 gene mutation was transmitted in her family. Thus, the use of these models is not recommended in such circumstances. Other methods for predicting breast cancer risk that are based on more detailed genetic and family history modeling may be preferred when the inheritance of a BRCA1 or BRCA2 gene mutation is suspected. However, in a study of women in a specialized breast cancer risk assessment clinic, Euhus et al. (15) found that the Gail model was an appropriate risk assessment tool for most women attending specialized clinics, and they concluded that "this model was felt to accurately assign a general risk level in 87% of women" in such clinics.

A recent workshop on risk prediction models (38) recommended support for mechanisms and resources to validate risk models and to extend existing models by using data sources that cover diverse populations, including those from different geographic regions. Our data provide the opportunity to validate the GM in a Southern European locale and to develop and validate new breast cancer risk models that are based on Italian data. Our results are encouraging with respect to model calibration but reinforce the point, mentioned by others (10,36), that the GM predicts the numbers of observed breast cancers in subgroups of women well but has limited ability to identify precisely who will and who will not develop breast cancer.

This modest discriminatory accuracy is measured by the concordance statistic, which represents the probability that a randomly selected woman diagnosed with breast cancer will have a higher projected risk than a randomly selected woman without breast cancer. The average age-specific concordance statistics presented in Table 3 are the population-weighted averages of age-specific concordance statistics for the age groups 30–39 years, 40–49 years, 50–59 years, and 60 years or older. The age-specific concordances capture the discriminatory accuracy of the risk factors in these models apart from age (except for age variation within the age groups). Typically, concordance estimates from women of all ages are larger than age-restricted concordance estimates because age is a strong risk factor for breast cancer. The discriminatory accuracy of these models can be increased, in principle, by including more powerful risk predictors in the model. For example, adding measurements of mammographic density as a predictor would be expected to improve the discriminatory accuracy of risk prediction models (39), and recent publications indicate this is so (40,41). High discriminatory accuracy is needed for screening applications, but well-calibrated models with modest discriminatory accuracy are useful for allowing a woman to compare her risk of breast cancer with other risks she faces and to make informed decisions about the risks and benefits of an intervention, such as whether or not to take tamoxifen to prevent breast cancer (36,42). Well-calibrated breast cancer risk models with modest discriminatory accuracy are also useful for designing prevention trials, because their statistical power depends on the expected number of incident breast cancers and, hence, on the average absolute risk, and for estimating disease burden in a group or population from the distribution of risk factors in that population.
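As a rough illustration of the two quantities described above, the following sketch computes a concordance statistic and an average of age-specific concordance statistics. It is not the code used in this study. The function names, the handling of ties (counted as one half), the use of the number of women in each age group as the weight, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def concordance(case_risks, noncase_risks):
    """Fraction of (case, non-case) pairs in which the case has the higher
    projected risk. Ties count as half a pair (an assumption)."""
    if not case_risks or not noncase_risks:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for c in case_risks:
        for n in noncase_risks:
            if c > n:
                score += 1.0
            elif c == n:
                score += 0.5
    return score / (len(case_risks) * len(noncase_risks))


def age_group(age):
    """Map an age in years to one of the four age groups named in the text."""
    if age < 30:
        raise ValueError("ages below 30 are outside the groups in the text")
    if age < 40:
        return "30-39"
    if age < 50:
        return "40-49"
    if age < 60:
        return "50-59"
    return "60+"


def average_age_specific_concordance(records):
    """Average of age-specific concordances over (age, risk, is_case) records.

    Weights are the numbers of women in each age group (an assumption).
    Groups lacking cases or non-cases are skipped because the statistic
    is undefined there (also an assumption).
    """
    groups = defaultdict(lambda: ([], []))  # (case risks, non-case risks)
    for age, risk, is_case in records:
        cases, noncases = groups[age_group(age)]
        (cases if is_case else noncases).append(risk)
    weighted_sum = 0.0
    total_weight = 0
    for cases, noncases in groups.values():
        if not cases or not noncases:
            continue
        weight = len(cases) + len(noncases)
        weighted_sum += weight * concordance(cases, noncases)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no age group has both cases and non-cases")
    return weighted_sum / total_weight


# Toy data, invented for illustration: (age, projected risk, developed cancer)
toy_records = [
    (35, 0.010, False), (36, 0.020, True), (38, 0.015, False),
    (45, 0.020, True), (47, 0.030, False), (48, 0.010, False),
]
toy_average = average_age_specific_concordance(toy_records)
print(toy_average)  # 0.75
```

In the toy data the case in the younger group outranks both non-cases (concordance 1.0), whereas the case in the older group outranks only one of two (concordance 0.5); with equal group sizes the weighted average is 0.75.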

Although the good calibration of the GM in the Florence–EPIC cohort suggests that it may be a useful model for predicting risk in other Western European populations where mammographic screening is common, more work is needed to test the applicability of the GM to populations such as those in China or Eastern Europe, where screening is less frequent. The good performance of the IT-GM and the IT1-GM using independent Florence–EPIC data indicates that the data from the Italian Multicenter Case–Control Study of Diet and Breast Cancer might be useful for revising the GM to include additional risk factors, particularly modifiable risk factors, such as dietary consumption patterns (43). Regardless of whether such modifiable factors increased the discriminatory accuracy substantially, models with such factors might be useful for identifying changes in exposures that might reduce the absolute risk of breast cancer and might therefore be used in counseling. In principle, the case–control data used in this study, together with cancer registry data, can be used to construct such models of absolute risk, and the current findings encourage us to do so.


    NOTES
This work was supported by contributions from the Associazione Italiana per la Ricerca sul Cancro and the Italian Ministry of Education (PRIN2003). The study sponsors had no role in the design of the study; the collection, analysis, and interpretation of the data; the writing of the manuscript; or the decision to submit the manuscript for publication.

The authors thank Professor C. La Vecchia (University of Milan) for providing access to the Italian Multicenter Case–Control Study data.


    REFERENCES

(1) Landis SH, Murray T, Bolden S, Wingo PA. Cancer statistics 1998. CA Cancer J Clin 1998;48:6–31.

(2) Althuis MD, Dozier JM, Anderson WF, Devesa SS, Brinton LA. Global trends in breast cancer incidence and mortality 1973–1997. Int J Epidemiol 2005;34:405–12.

(3) GLOBOCAN 2002 Database, Descriptive Epidemiology Group, IARC, Lyon, France. Available at: www-dep.iarc.fr. [Last accessed: October 25, 2006.]

(4) Gail MH, Brinton LA, Byar DP, Corle DK, Green SB, Schairer C, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 1989;81:1879–86.

(5) Swan J, Breen N, Coates RJ, Rimer BK, Lee NC. Progress in cancer screening practices in the United States: results from the 2000 National Health Interview Survey. Cancer 2003;97:1528–40.

(6) The Italian National Center for Screenings Monitoring. IV Report, Florence, Italy, 2005. Available at: www.tumori.net/it/screening.php. [Last accessed: October 25, 2006.]

(7) The Italian National Statistical Institute and Ministry of Health, Health Status and use of Health Services, Rome, Italy, 2005. Available at: www.istat.it/sanita/Health/. [Last accessed: October 25, 2006.]

(8) Fisher B, Costantino JP, Wickerham DL, Redmond CK, Kavanah M, Cronin WM, et al. Tamoxifen for prevention of breast cancer: report of the National Surgical Adjuvant Breast and Bowel Project P-1 Study. J Natl Cancer Inst 1998;90:1371–88.

(9) Costantino JP, Gail MH, Pee D, Anderson S, Redmond CK, Benichou J, et al. Validation studies for models projecting the risk of invasive and total breast cancer incidence. J Natl Cancer Inst 1999;91:1541–8.

(10) Rockhill B, Spiegelman D, Byrne C, Hunter DJ, Colditz GA. Validation of the Gail et al. model of breast cancer risk prediction and implications for chemoprevention. J Natl Cancer Inst 2001;93:358–66.

(11) Bondy ML, Lustbader ED, Halabi S, Ross E, Vogel VG. Validation of a breast cancer risk assessment model in women with a positive family history. J Natl Cancer Inst 1994;86:620–5.

(12) Spiegelman D, Colditz GA, Hunter D, Hertzmark E. Validation of the Gail et al. model for predicting individual breast cancer risk. J Natl Cancer Inst 1994;86:600–7.

(13) Bondy ML, Newman LA. Breast cancer risk assessment models: applicability to African-American women. Cancer 2003;97(Suppl):230–5.

(14) Newman LA, Rockhill B, Bondy ML, Abrams J, Berlin JA, Colditz GA, et al. Validation of the Gail breast cancer risk assessment model in African American women based on a multi-center case-control study of 3,283 African American and 5,974 white American women. Proc Am Soc Clin Oncol 21:2002(abstr 976).

(15) Euhus DM, Leitch AM, Huth JF, Peters GN. Limitations of the Gail model in the specialized breast cancer risk assessment clinic. Breast J 2002;8:23–7.

(16) Kaur JS, Roubidoux MA, Sloan J, Novotny P. Can the Gail model be useful in American Indian and Alaska Native populations? Cancer 2004;100:906–12.

(17) Abu-Rustum NR, Herbolsheimer H. Breast cancer risk assessment in indigent women at a public hospital. Gynecol Oncol 2001;81:287–90.

(18) Miller BE. Breast cancer risk assessment in patients seen in a gynecologic oncology clinic. Int J Gynecol Cancer 2002;12:389–93.

(19) Bernatsky S, Ramsey-Goldman R, Boivin JF, Joseph L, Moore AD, Rajan R, et al. Do traditional Gail model risk factors account for increased breast cancer in women with lupus? J Rheumatol 2003;30:1505–7.

(20) Lewis CL, Kinsinger LS, Harris RP, Schwartz LJ. Breast cancer risk in primary care. Implications for chemoprevention. Arch Intern Med 2004;164:1897–903.

(21) Pastor Climente IP, Morales Suarez-Varela MM, Llopis Gonzalez A, Magraner Gil JF. Application of the Gail method of calculating risk in the population of Valencia. Clin Transl Oncol 2005;7:336–43.

(22) Novotny J, Pecen L, Petruzelka L, Svobodnik A, Dusek L, Danes J, et al. Breast cancer risk assessment in the Czech female population—an adjustment of the original Gail model. Breast Cancer Res Treat 2006;95:29–35.

(23) Boyle P, Mezzetti M, La Vecchia C, Franceschi S, Decarli A, Robertson C. Contribution of three components to individual cancer risk predicting breast cancer risk in Italy. Eur J Cancer Prev 2004;13:183–91.

(24) Amir E, Evans DG, Shenton A, Lalloo F, Moran A, Boggis C, et al. Evaluation of breast cancer risk assessment packages in the family history evaluation and screening programme. J Med Genet 2003;40:807–14.

(25) Palli D, Berrino F, Vineis P, Tumino R, Panico S, Masala G, et al. A molecular epidemiology project on diet and cancer: the EPIC-Italy prospective study. Design and baseline characteristics of participants. Tumori 2003;89:586–93.

(26) Mezzetti M, La Vecchia C, Decarli A, Boyle P, Talamini R, Franceschi S. Population attributable risk for breast cancer: diet, nutrition and physical exercise. J Natl Cancer Inst 1998;90:389–94.

(27) Calza S, Specchia C, Frasca G, Tumino R, Sacerdote C, Fiorini L, et al. EPIC-Italy cohorts and multipurpose national surveys. A comparison of some socio-demographic and life-style characteristics. Tumori 2003;89:615–23.

(28) Masala G, Assed M, Saieva C, Salvini S, Cordopatri G, Ermini I, et al. The Florence city sample: dietary and life style habits of a representative sample of adult residents. A comparison with the EPIC-Florence volunteers. Tumori 2003;89:636–45.

(29) Pisani P, Faggiano F, Krogh V, Palli D, Vineis P, Berrino F. Relative validity and reproducibility of a food frequency dietary questionnaire for use in the Italian EPIC centres. Int J Epidemiol 1997;26 Suppl 1:S152–60.

(30) Benichou J, Gail MH. Methods of inference for estimates of absolute risk derived from population-based case-control studies. Biometrics 1995;51:182–94.

(31) Mezzetti M, Ferraroni M, Decarli A, La Vecchia C, Benichou J. Software for attributable risk and confidence interval estimation in case-control studies. Comput Biomed Res 1996;29:63–75.

(32) Bruzzi P, Green SB, Byar DP, Brinton LA, Schairer C. Estimating the population attributable risk for multiple risk factors using case-control data. Am J Epidemiol 1985;122:904–14.

(33) Florence Cancer Registry. Available at: http://www.cspo.it/registri/registro_rtt/. [Last accessed: October 25, 2006.]

(34) Benichou J, Gail MH. A delta method for implicitly defined random variables. Am Stat 1989;43:41–4.

(35) Benichou J, Gail MH. Variance calculations and confidence intervals for estimates of the attributable risk based on logistic models. Biometrics 1990;46:991–1003.

(36) Gail MH, Pfeiffer RM. On criteria for evaluating models of absolute risk. Biostatistics 2005;6:227–39.

(37) Efron B, Tibshirani RJ. An introduction to the bootstrap. New York (NY): Chapman and Hall; 1993.

(38) Freedman AN, Seminara D, Gail MH, Hartge P, Colditz GA, Ballard-Barbash R, et al. Cancer risk prediction models: a workshop on development, evaluation, and application. J Natl Cancer Inst 2005;97:715–23.

(39) Colditz GA. Epidemiology and prevention of breast cancer. Cancer Epidemiol Biomarkers Prev 2005;14:768–72.

(40) Chen J, Pee D, Ayyagari R, Graubard B, Schairer C, Byrne C, et al. Projecting absolute invasive breast cancer risk in white women with a model that includes mammographic density. J Natl Cancer Inst 2006;98:1215–26.

(41) Barlow WE, White E, Ballard-Barbash R, Vacek PM, Titus-Ernstoff L, Carney PA, et al. Prospective breast cancer risk prediction model for women undergoing screening mammography. J Natl Cancer Inst 2006;98:1204–14.

(42) Gail MH, Costantino JP, Bryant J, Croyle R, Freedman L, Helzlsouer K, et al. Weighing the risks and benefits of tamoxifen treatment for preventing breast cancer. J Natl Cancer Inst 1999;91:1829–46.

(43) Franceschi S, La Vecchia C, Russo A, Negri E, Favero A, Decarli A. Low risk diet for breast cancer in Italy. Cancer Epidemiol Biomarkers Prev 1997;6:875–9.

Manuscript received April 3, 2006; revised September 22, 2006; accepted October 12, 2006.



Editorial about this Article

The Risk of Cancer Risk Prediction: "What Is My Risk of Getting Breast Cancer?"
Joann G. Elmore and Suzanne W. Fletcher
J Natl Cancer Inst 2006 98: 1673-1675. [Extract] [Full Text] [PDF]




