JNCI Journal of the National Cancer Institute 2006 98(4):232-234; doi:10.1093/jnci/djj086

© Oxford University Press 2006.

NEWS

Criticism of Tumor Response Criteria Raises Trial Design Questions

Renee Twombly

When it debuted in 2000, the Response Evaluation Criteria in Solid Tumors (RECIST) was intended to be a simpler way to measure the response of tumors to experimental treatments. (See article, Vol. 92, No. 3, p. 205.) The prior criteria, adopted by the World Health Organization in 1979, involved a complicated formula that required measuring two dimensions of a tumor and multiplying them together with a calculator, if not a computer. RECIST made the job easier by requiring measurement of only the longest dimension of each of several tumors and adding those measurements together.
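The difference between the two measurement approaches described above can be illustrated with a minimal sketch. The function names and the lesion measurements are hypothetical and chosen only for illustration; the sketch covers just the arithmetic mentioned in the article (product of two dimensions versus sum of longest dimensions) and is not a full implementation of either set of criteria.

```python
# Illustrative sketch only: lesion sizes (in mm) are made-up numbers.
# Each lesion is a pair (longest dimension, perpendicular dimension).

def who_burden(lesions):
    """1979-style bidimensional burden: sum of the products of two dimensions."""
    return sum(a * b for a, b in lesions)

def recist_burden(lesions):
    """RECIST-style unidimensional burden: sum of the longest dimensions."""
    return sum(max(a, b) for a, b in lesions)

def percent_change(baseline, follow_up):
    """Percent change from baseline (negative values mean shrinkage)."""
    return 100.0 * (follow_up - baseline) / baseline

# Hypothetical patient with two measurable lesions, before and after treatment.
baseline = [(40, 30), (20, 10)]
follow_up = [(30, 20), (10, 10)]

who_change = percent_change(who_burden(baseline), who_burden(follow_up))
recist_change = percent_change(recist_burden(baseline), recist_burden(follow_up))

print(f"Bidimensional (1979-style) change: {who_change:.1f}%")
print(f"Unidimensional (RECIST-style) change: {recist_change:.1f}%")
```

In this made-up example the same lesions give a different percent change under each approach, so the two figures are not directly interchangeable.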

But in the 5 years since the widespread adoption of RECIST, the measurement tool has drawn some criticism. Critics argue that the RECIST criteria are too narrow: they force researchers to say a drug works or does not work based solely on changes in tumor size. In addition, some researchers say the criteria are not applicable to all cancer types and drug classes and result in too many single-arm phase II studies that are not predictive of a drug's ultimate success.

Although there is room for improvement within some of the criteria, much of the blame directed at RECIST is really misplaced frustration about poor clinical trial design, says Elizabeth Eisenhauer, M.D. Speaking at the European Cancer Conference in Paris in November, she discussed this and other issues that have arisen in the 5 years since the RECIST criteria were published.


Figure 1. Elizabeth Eisenhauer
 
"People confuse the problem related to design and choice of endpoint with how you measure tumors," said Eisenhauer, vice president of the National Cancer Institute of Canada and a co-author of the RECIST criteria. "So they believe that using response criteria means using the same design regardless of the agent and tumor type. They feel they need to show the same degree of response in noncytotoxic drugs that might be seen in traditional cytotoxic drugs, but no part of RECIST states a minimum response rate that is important for declaring interest in a new drug."

In other words, researchers can decide that, for example, a 10% overall response rate or a 50% stable disease rate in a particular trial is meaningful, but to determine what that response rate or stable disease rate is, they still need to measure tumor load—and for that, RECIST criteria are available.

"We don't need a new way of describing what can happen to a tumor," Eisenhauer said. "We need a new way of designing the trial using those categories that signal activity for drugs that might not cause tumor shrinkage."

Shrinking Interest in Tumor Shrinkage?

The RECIST criteria have become a focal point of discussion in the trial design debate because of the underlying and traditional assumption that patients can be categorized into responders and nonresponders based on changes in the size of their tumors. That belief leads to insistence that tumor response be a part of clinical trial testing.

"The contentious point is that some people believe active agents shrink tumors no matter what," said Gwen Fyfe, M.D., vice president of hematology and oncology at Genentech. "It seems less likely that targeted agents will shrink tumors, but clearly some of them do."

To some, this stems from the traditional way that clinical trials test new therapies in patients with tumors—with a phase II clinical trial using a few patients who are treated with a single agent alone. In single-arm trials, they say, tumor response rate has to be used because of the belief that tumor regression is the surest measure of a drug effect.

But several cancers are not easily measurable in such a phase II setting. For example, Howard Scher, M.D., published a study last July in Clinical Cancer Research that found that few patients with metastatic prostate cancer had tumors that were measurable according to RECIST criteria, and that there are no target lesions in patients with rising PSA and localized disease, making these patients ineligible for trials that use RECIST criteria.

In short, most prostate cancer just doesn't spread in a way that allows tumors to be measured, said Scher, chair of urologic oncology at Memorial Sloan-Kettering Cancer Center in New York. Therefore, RECIST "misses the point. What you really want from a clinical trial is a decision on whether an agent worked, and to what degree, and in prostate cancer, I don't see how RECIST can get you there," he said.

Scher agrees with Eisenhauer that the real problem is that "investigators don't state clearly what their expectations are and what outcome of a trial would convince them there is enough signal to go forward." He added, "One size doesn't fit all, and you shouldn't design a trial to serve the criteria but [instead to serve] the endpoint that is based on what you are trying to show."

Controlling Comparisons

In fact, argues Mark Ratain, M.D., from the University of Chicago, "response criteria haven't even worked for our current drugs." Ratain, a well-known critic of oncology clinical trial design, says that "our current criteria are not useful to predict drug approval." In a recent editorial in Clinical Cancer Research, Ratain wrote that "positive phase II trials have not been predictive of phase III success" because very few drugs that go into phase III testing are found to show benefit.


Figure 2. Mark Ratain
 
Conversely, if tumor change were the only criterion used in phase II testing, then effective agents such as Herceptin (trastuzumab), Tarceva (erlotinib), and Avastin (bevacizumab) would never have been approved because of their fairly low response rates of about 10%, he said.

Ratain is pushing for larger phase II studies that use a control arm that can truly detect treatment differences between groups. "Current metrics are all designed with single-arm trials where you have to say how many people responded because you don't have a control group," he said. "...You need a control group to compare apples and apples. You can still use some standard metrics to compare apples and apples, but you can also use any metric you want to compare, such as disease stability, time to progression, or quality of life."

He notes that other disciplines use randomized, controlled phase II trials. "Oncologists are the only ones that use uncontrolled single-arm phase II trials, and we only do it that way because it is our religion, the way we were trained to do it," Ratain said.

Robert Glassman, M.D., a New York oncologist and investment banker, notes that only three to five oncology drugs are approved each year, although 635 drugs are currently in human testing and more than 2,000 are in discovery or preclinical testing, mostly in the United States.

"Phase II should be the place where drugs are filtered out, but most noncontrolled studies, which are filled with biases, have proven not to be predictive," he said. "Only overall survival and quality of life are true, clinically meaningful endpoints. Everything else—response rate, progression free survival, time to progression, etc.—are surrogates."

Progression-Free Survival

Ratain's trial design of choice is a randomized discontinuation design, in which all patients are first treated with an agent and those whose disease is stable are then randomly assigned either to continue the drug or to switch to a placebo. Results in the two groups are then compared. If stable disease is a criterion, "then you can definitely measure between the groups to see if this is meaningful," Ratain said.

Ratain cites tests of two different renal cancer agents, carboxyaminoimidazole (CAI) and sorafenib. Both showed a RECIST response rate of 2%, but the randomized discontinuation design used in both trials showed CAI to be nonactive, whereas sorafenib demonstrated substantially longer progression-free survival. "Sorafenib is a highly active drug that was approved even though it doesn't have much of a response rate," he said. "But how would you find this out if you are using response rate as a screening criterion? Both are either active or inactive depending on what you believe 2% represents."

Response rate is prone to patient selection bias, especially as the drugs are tested in earlier disease, said Genentech's Fyfe, and in the end, response rate "only tells you something about the 20%, say, who might have had some objective measure of tumor shrinkage.

"But it doesn't tell you what happened to the other 80%," she said. "Did the drug affect them and slow the pace of their disease? Did it cause a little bit of tumor shrinkage but not enough to meet RECIST? When you just look at response, it is an arbitrary dichotomous variable that ignores all the people who didn't get an objective response, and those people may or may not have benefited in the pace of their disease."
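Fyfe's point about response being a dichotomous variable can be illustrated with a small sketch. Everything here is hypothetical: the per-patient percent changes are made-up numbers, the function names are invented for illustration, and the 30% shrinkage cutoff is an assumption of this sketch, since the article itself does not state a cutoff.

```python
# Illustrative sketch only: made-up percent changes in tumor burden for
# ten hypothetical patients (negative values mean shrinkage).
changes = [-45.0, -32.0, -20.0, -10.0, -5.0, 0.0, 5.0, 10.0, 15.0, 40.0]

# Assumed cutoff for counting a patient as a "responder" (not from the article).
CUTOFF = -30.0

def response_rate(changes, cutoff=CUTOFF):
    """Fraction of patients whose shrinkage meets or exceeds the cutoff."""
    responders = [c for c in changes if c <= cutoff]
    return len(responders) / len(changes)

def shrank_but_not_responders(changes, cutoff=CUTOFF):
    """Patients with some shrinkage that falls short of the cutoff."""
    return [c for c in changes if cutoff < c < 0]

rate = response_rate(changes)
missed = shrank_but_not_responders(changes)

print(f"Response rate: {rate:.0%}")
print(f"Patients with shrinkage below the cutoff: {len(missed)} of {len(changes)}")
```

With these made-up numbers, the response rate summarizes two of ten patients, while three others had some shrinkage that the single yes-or-no figure does not show.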

Genentech's favored criterion is progression-free survival, because progression in each patient in a trial can be measured against that of other patients, and "each patient is valued equally in a progression analysis," Fyfe said.

Eisenhauer said that the oncology community should have a discussion about these design issues but added that phase II control arm studies require many more patients than the usual single-arm trial, "which raises some issues about feasibility."

Still, she points out that if progression or even tumor stability are chosen endpoints for these studies, it will still be necessary to use RECIST to quantify those variables. "RECIST is just a common language for describing what is happening to patients on trial who have tumor masses at baseline when they start on treatment," she said. "... RECIST brings up all kinds of issues because I think people are really looking for something magical that will tell them for sure what drug will work and what won't work. At this point, we don't have any guaranteed formula for that."

