Lung cancer has the highest death rate of any cancer across all demographic profiles. Because symptoms often do not appear until it is too late for effective medical intervention, people with a history of smoking or coal mining, with genetic markers, or with other lung predispositions should consider occasional screening. Using graphical analysis, this study produces compelling evidence that a new treatment, which appeared in the late 1970s, shows promise on certain types of lung cancer. However, because of variability across the entire spectrum of lung cancers, the treatment effect fails the hypothesis test under the normal model.
The purpose of this study is to compare two treatment methods for lung cancer and determine whether there is any rational justification for the claim that the test treatment yields better results than the standard treatment. Both methods involve chemotherapy but apply different drug combinations: one uses a ‘standard’ mix, which serves as the control, and the other uses an alternate mix, referred to as ‘novel’ throughout the rest of the paper. This data set is so popular in statistical circles that the supporting literature article has become dissociated from the results, even though it was probably a landmark finding in the 1970s. The research front has since moved well beyond this benchmark, as the discussion section reveals, but for now, take a look at the data[1]:
137 obs of 8 var...
V1 = Treatment denotes the type of lung cancer chemotherapy: 1 (standard), 2 (test)
V2 = CellType denotes the type of cell involved: 1 (squamous), 2 (small cell), 3 (adeno), 4 (large)
V3 = Survival is the survival time in days since the treatment
V4 = Status denotes the status of the patient as dead or alive: 1 (dead), 0 (alive)
V5 = Karnofsky is the Karnofsky score: measure of the patient’s functional performance status (0 to 100, higher is better)
V6 = Diag is the time since diagnosis in months
V7 = Age is the age in years
V8 = Therapy denotes any prior therapy: 0 (none), 10 (yes)
Since the ultimate goal of any medical procedure is to extend life, evaluate survival time (V3) by treatment (V1).
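The summaries that follow rely on two subsets, `standard` and `novel`, whose construction is not shown. Here is a minimal sketch of one way to derive them, assuming the data sit in a data frame named `lung` with columns `V1` to `V8`. The helper `split_by_treatment` and the `toy` frame are hypothetical, and the toy values are placeholders rather than study data:

```r
# Hypothetical helper: split a data frame into the two treatment arms
# using the V1 coding given above (1 = standard, 2 = test/novel).
split_by_treatment <- function(df) {
  stopifnot(all(c("V1", "V3") %in% names(df)))
  list(
    standard = df[df$V1 == 1, , drop = FALSE],
    novel    = df[df$V1 == 2, , drop = FALSE]
  )
}

# Placeholder rows for illustration only; NOT values from the study.
toy <- data.frame(V1 = c(1, 1, 1, 2, 2, 2),
                  V3 = c(10, 20, 30, 5, 50, 500))

arms     <- split_by_treatment(toy)
standard <- arms$standard
novel    <- arms$novel

# Same calls as in the report, applied to each arm.
print(summary(standard$V3))
print(summary(novel$V3))
```

With the real data, `split_by_treatment(lung)` would take the place of the toy call.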
summary(lung$V3)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.0 25.0 80.0 121.6 144.0 999.0
summary(standard$V3)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.0 25.0 97.0 115.1 153.0 553.0
summary(novel$V3)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 24.75 52.50 128.21 117.25 999.00
##
## Call:
## lm(formula = y ~ x)
##
## Coefficients:
## (Intercept) x
## -59.445 1.607
n.standard + n.novel = N = 137
n.standard = 69
mean.standard = 115.1
st.dev.standard = 112.7
n.novel = 68
mean.novel = 128.2
st.dev.novel = 193.8
delta = mean.novel - mean.standard = 128.2 - 115.1 = 13.1
hypothesis null: delta = 0
alt: delta != 0
st.err = sqrt(193.8^2/68 + 112.7^2/69) = 27.14
marg.err(95) = 1.96*27.14 = 53.2
pt.est(95) = 13.1 +/- 53.2
conf.int(95) = (-40.1,66.3)
pnorm(-13.1,0,27.14) = 0.315
alpha/2 (95%, two-sided) = 0.025
pnorm > alpha/2 : fail to reject null : no significance, given the variability
check:
Z = (13.1 - 0)/27.14 = 0.483
pnorm(-abs(0.483)) = 0.315
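The arithmetic above can be reproduced from the summary statistics alone. The following is a sketch of a two-sample z-test under the normal model, using only the counts, means, and standard deviations reported above. The function name `two_sample_z` and its argument names are illustrative assumptions, not part of the original analysis.

```r
# Two-sample z-test from summary statistics (normal model, unequal variances).
two_sample_z <- function(mean1, sd1, n1, mean2, sd2, n2, conf = 0.95) {
  delta <- mean2 - mean1                    # difference in means
  se    <- sqrt(sd1^2 / n1 + sd2^2 / n2)    # standard error of the difference
  z     <- delta / se                       # test statistic under delta = 0
  zcrit <- qnorm(1 - (1 - conf) / 2)        # about 1.96 for conf = 0.95
  list(
    delta       = delta,
    se          = se,
    z           = z,
    ci          = c(delta - zcrit * se, delta + zcrit * se),
    p_one_tail  = pnorm(-abs(z)),           # the tail area computed above
    p_two_sided = 2 * pnorm(-abs(z))
  )
}

# Group 1 = standard, group 2 = novel; values taken from the report.
res <- two_sample_z(mean1 = 115.1, sd1 = 112.7, n1 = 69,
                    mean2 = 128.2, sd2 = 193.8, n2 = 68)
print(res)
```

Comparing the one-tail probability with 0.025 is equivalent to comparing the two-sided p-value with 0.05, so both lead to the same decision. Because `qnorm` returns a slightly more precise critical value than 1.96, the interval may differ from the hand calculation in the last digit.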
In the graphical analysis section, the data points of each group were sorted in ascending order and plotted on the same graph to show how one data set compares to the other. Taking this one step further, the novel data were plotted against the standard data, yielding a linear regression slope of 1.61, which suggests, from a graphical perspective, that the novel treatment might be superior to the standard. However, when a normal statistical model is applied to the data, the noise is so great that there is no statistical basis for rejecting the null hypothesis.
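Because the two arms have different sizes (69 and 68 patients), sorted values cannot be paired one-to-one without some adjustment, and the report does not say how this was handled. The sketch below assumes one possible approach: evaluate both samples on a common grid of quantiles, then regress the novel quantiles on the standard quantiles. The function name `sorted_slope`, the grid size, and the simulated inputs are all assumptions for illustration. The simulated values are not study data, so the fitted slope will not match the reported 1.61.

```r
# Regress the quantiles of y on the quantiles of x (a Q-Q style comparison).
sorted_slope <- function(x, y, n_points = min(length(x), length(y))) {
  probs <- ppoints(n_points)                # common probability grid
  qx <- quantile(x, probs, names = FALSE)   # interpolated sorted x values
  qy <- quantile(y, probs, names = FALSE)   # interpolated sorted y values
  fit <- lm(qy ~ qx)
  list(fit = fit, slope = unname(coef(fit)[2]), qx = qx, qy = qy)
}

# Simulated right-skewed samples standing in for the two arms (NOT study data).
set.seed(1)
sim_standard <- rexp(69, rate = 1 / 115)
sim_novel    <- rexp(68, rate = 1 / 128)

out <- sorted_slope(sim_standard, sim_novel)
print(coef(out$fit))
```

A slope above 1 means the upper quantiles of the second sample stretch further than those of the first. The fit can be inspected visually with `plot(out$qx, out$qy)` followed by `abline(out$fit)`. Note that a few very long survival times can dominate such a slope, which is consistent with the large standard deviation in the novel group.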
What could have caused such high variability? Examine V2, described as ‘CellType’, a potential explanatory variable:
table(lung$V1,lung$V2)
##
## 1 2 3 4
## 1 15 30 9 15
## 2 20 18 18 12
barplot(table(lung$V1,lung$V2))
Without access to the original paper published by the Veteran Administration to establish the experimental design, it looks as though the researchers may have pushed the tougher cases toward the novel treatment, dooming its success in the clinical trial. To compare the treatments fairly, each arm should have the same distribution of cancer types, consistent with a stratified sampling method. Perhaps one group would respond to one method while another group would respond to a different method, depending on the cancer mechanism, which would provide a niche application for the novel treatment. With only 8 variables recorded per patient, confounding variables not captured in the data should also be taken into consideration. When a successful outcome occurs, doctors and patients should be interviewed to uncover factors not included in the study.
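One way to check whether the cell-type mix really differs between the two arms is a chi-squared test of independence on the treatment by cell-type table shown above. This test is not part of the original analysis. It is added here as a sketch, with the counts copied from the table and the labels taken from the variable descriptions; the object names are illustrative.

```r
# Counts copied from table(lung$V1, lung$V2) above.
counts <- matrix(c(15, 30,  9, 15,
                   20, 18, 18, 12),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(
                   Treatment = c("standard", "novel"),
                   CellType  = c("squamous", "small cell", "adeno", "large")))

# Share of each cell type within each treatment arm (each row sums to 1).
print(round(prop.table(counts, margin = 1), 2))

# Chi-squared test of independence between treatment and cell type.
test <- chisq.test(counts)
print(test)
```

A small p-value would indicate that the arms differ in cell-type composition, which would support stratifying the survival comparison by V2. A large one would suggest the imbalance is within what chance allocation produces.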
Delving into the current literature for some clues[2]:
Small-cell lung cancer (SCLC), treated as a single category in the initial study, has since branched into 2 groups. Furthermore, SCLC grades: 1, 2, or recurrence after remission, 3, require unique chemotherapy / radiation combinations[3].
Through inferential evidence gathered from literature review, graphic interpretation, and statistical analysis, an experimental design template emerges:
[1] Kalbfleisch, J. and Prentice, R., “Veteran Administration Lung Cancer Study Group Data” in “The Statistical Analysis of Failure Time Data”, pp 223-224, Wiley: New York (1980).
[2] M. Sørensen, M. Pijls-Johannesma, and E. Felip, on behalf of the ESMO Guidelines Working Group, “Small-cell lung cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up”, Annals of Oncology 21, Supplement 5: v120–v125 (2010).
[3] Cristian Rapicetta, Sara Tenconi, Tommaso Ricchetti, Sally Maramotti, and Massimiliano Paci, “Ch 13: Surgery in Small-Cell Lung Cancer: Past, Present and Future” in “Lung Diseases - Selected State of the Art Reviews”, Elvisegran Malcolm Irusen, Ed, InTech: Rijeka, Croatia (2012).