STAT 6227, Final Assignment

Proportional Hazard Models:  Estimation, Prediction and Evaluation

Due Tuesday, Dec 20, 2022

Submit to Blackboard

Part I. Computation Problems (50%):

(You need to provide your code and computer output for each question.)

We consider the semiparametric Cox Proportional Hazard Model (PHM) to analyze the IMRAW-IST dataset studied in Sloand et al. (Journal of Clinical Oncology, 2008, Vol. 26, No. 15, 2505-2511). Let T be the overall survival time (i.e., time to death), {(Ti, δi) : i = 1, . . . , n} be the censored survival observations, and {Zi = (Ai, Si, Ni, Pi, Ii)^T : i = 1, . . . , n} be the covariates to be considered, where, for the ith subject, Ai = age, Si = 0 or 1 if the subject is female or male, respectively, Ni = Neutro (ANC), Pi = Platelets, and Ii = 0 or 1 if the subject is from the IMRAW cohort or treated at NIH, respectively. Based on the Cox PHM, the hazard rate for T is

λi(t|Zi) = λ0(t) exp{β^T Zi},     (1)

where λ0(t) is the unknown baseline hazard rate and β = (β1, . . . , β5)^T is the vector of parameters. In order to define a clinically meaningful λ0(t), we define Ai, Ni and Pi to be the “centered” versions of age, ANC and platelets, i.e., they are obtained by subtracting their mean values from their actual values. We assume that the uncensored time-to-death Ti* and the censoring time Ci are independent, that the subjects are independent, and that Ci has density g(t) and cumulative distribution function G(t).
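As a small illustration of why centering makes λ0(t) interpretable, the sketch below centers three continuous covariates and evaluates the factor exp{β^T Zi} from model (1). The data values, the coefficient vector and the helper names `center` and `relative_hazard` are all hypothetical; nothing here is taken from the IMRAW-IST dataset or from a fitted model.

```python
import math

def center(values):
    """Subtract the sample mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def relative_hazard(beta, z):
    """exp(beta^T z): the factor multiplying the baseline hazard in model (1)."""
    return math.exp(sum(b * x for b, x in zip(beta, z)))

# Hypothetical toy data (NOT from the IMRAW-IST dataset).
age = [50.0, 60.0, 70.0]
anc = [1.0, 2.0, 3.0]
platelets = [100.0, 150.0, 200.0]
sex = [0, 1, 0]   # 0 = female, 1 = male
ist = [0, 0, 1]   # 0 = IMRAW cohort, 1 = treated at NIH

# Centered covariates A, N, P as described in the text.
A, N, P = center(age), center(anc), center(platelets)
Z = [list(row) for row in zip(A, sex, N, P, ist)]

# Hypothetical coefficients, for illustration only.
beta = [0.02, 0.3, -0.1, -0.005, -0.4]
rh = [relative_hazard(beta, z) for z in Z]

# A female IMRAW subject with average age, ANC and platelets has Z = 0,
# so her hazard equals the baseline hazard (relative hazard 1).
reference = relative_hazard(beta, [0.0, 0, 0.0, 0.0, 0])
```

With centering, λ0(t) is the hazard of a subject whose age, ANC and platelets equal the sample means and whose indicators are 0, rather than the hazard of a subject with age 0 and zero counts.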

(1)  (20 points) The first question is whether IST treatment is predictive for overall survival after adjusting for Z = (Ai, Si, Ni, Pi)^T. To do this, we would like to compute the C-statistics for λi(t|Z) (Model 1) and λi(t|Zi) (Model 2) at selected values of τ.

(1.1) For τ = 3 years, what are the estimates, with their corresponding SEs and 95% CIs, of the C-statistics for Model 1 and Model 2, respectively?

(1.2) What are the point estimate and 95% CI for ξ3, the difference between the C-statistics of the two models at τ = 3 years?

(1.3) For τ = 5 years, what are the estimates, with their corresponding SEs and 95% CIs, of the C-statistics for Model 1 and Model 2, respectively?

(1.4) What are the point estimate and 95% CI for ξ5, the difference between the C-statistics of the two models at τ = 5 years?
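To make the role of τ concrete, here is a minimal sketch of a truncated concordance calculation. It is an assumption-laden simplification: it uses an unweighted, Harrell-type count over pairs in which the earlier time is an observed death before τ, whereas an inverse-probability-of-censoring-weighted estimator would additionally reweight each pair using an estimate of the censoring distribution G(t). The function name `truncated_c` and the toy data are hypothetical, and the sketch does not produce SEs or CIs.

```python
def truncated_c(time, event, risk, tau):
    """Unweighted truncated C-statistic (a simplified sketch).

    A pair (i, j) is usable if subject i died (event[i] == 1) before tau
    and time[i] < time[j]. The pair is concordant if the model assigns the
    higher risk score to subject i; tied risk scores count as 1/2.
    """
    num = 0.0
    den = 0
    n = len(time)
    for i in range(n):
        if event[i] != 1 or time[i] >= tau:
            continue  # i must be an observed death before tau
        for j in range(n):
            if time[i] < time[j]:
                den += 1
                if risk[i] > risk[j]:
                    num += 1.0
                elif risk[i] == risk[j]:
                    num += 0.5
    return num / den if den else float("nan")

# Hypothetical toy data: times in years, 1 = death, 0 = censored.
time = [1, 2, 3, 4, 5]
event = [1, 1, 0, 1, 1]
risk = [0.9, 0.5, 0.6, 0.4, 0.1]  # e.g. a model's linear predictor

c_tau_45 = truncated_c(time, event, risk, tau=4.5)
c_tau_3 = truncated_c(time, event, risk, tau=3)
```

In this toy example the only discordant pair is the subject who died at year 2 (risk 0.5) against the subject censored at year 3 (risk 0.6). Computing the statistic once with each model's risk scores gives the two quantities compared in (1.2) and (1.4).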

(2)  (20 points) The second question is whether IST treatment is useful for changing the risk categories of death after adjusting for Z = (Ai, Si, Ni, Pi)^T. To do this, we would like to compute the NRI from Model 1 to Model 2 for selected values of τ.

(2.1) For τ = 3 years, we consider the risk categories [0, 10%) (“Low”), [10%, 30%) (“Medium”) and [30%, 100%) (“High”), and the corresponding NRI from Model 1 to Model 2. What are the estimate of the NRI and its corresponding SE and 95% CI?

(2.2) For τ = 3 years, we consider the risk categories [0, 10%) (“Low”), [10%, 30%) (“Low-Medium”), [30%, 50%) (“Medium-High”) and [50%, 100%) (“High”), and the corresponding NRI from Model 1 to Model 2. What are the estimate of the NRI and its corresponding SE and 95% CI?

(2.3) For τ = 5 years, we consider the risk categories [0, 20%) (“Low”), [20%, 40%) (“Medium”) and [40%, 100%) (“High”), and the corresponding NRI from Model 1 to Model 2. What are the estimate of the NRI and its corresponding SE and 95% CI?

(2.4) For τ = 5 years, we consider the risk categories [0, 20%) (“Low”), [20%, 40%) (“Low-Medium”), [40%, 60%) (“Medium-High”) and [60%, 100%) (“High”), and the corresponding NRI from Model 1 to Model 2. What are the estimate of the NRI and its corresponding SE and 95% CI?
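The category-based NRI can be sketched as follows. This is a simplified illustration that assumes every subject's death status by time τ is known (no censoring before τ); with censored data the event and non-event proportions would instead have to be estimated, for example from Kaplan-Meier curves. The function names `category` and `categorical_nri` and the toy risks are hypothetical.

```python
from bisect import bisect_right

def category(risk, cutpoints):
    """Index of the left-closed risk category containing `risk`.

    `cutpoints` are the interior boundaries, e.g. [0.10, 0.30] for
    [0, 10%), [10%, 30%), [30%, 100%).
    """
    return bisect_right(cutpoints, risk)

def categorical_nri(risk_old, risk_new, event, cutpoints):
    """NRI = [P(up | event) - P(down | event)]
           + [P(down | non-event) - P(up | non-event)]."""
    up_e = down_e = n_e = 0
    up_ne = down_ne = n_ne = 0
    for r_old, r_new, d in zip(risk_old, risk_new, event):
        c_old = category(r_old, cutpoints)
        c_new = category(r_new, cutpoints)
        if d == 1:
            n_e += 1
            up_e += c_new > c_old
            down_e += c_new < c_old
        else:
            n_ne += 1
            up_ne += c_new > c_old
            down_ne += c_new < c_old
    return (up_e - down_e) / n_e + (down_ne - up_ne) / n_ne

# Hypothetical predicted risks of death by tau under Model 1 and Model 2.
risk_model1 = [0.05, 0.20, 0.35, 0.08, 0.25, 0.40]
risk_model2 = [0.15, 0.35, 0.35, 0.05, 0.08, 0.20]
died_by_tau = [1, 1, 1, 0, 0, 0]

nri_3cat = categorical_nri(risk_model1, risk_model2, died_by_tau, [0.10, 0.30])
nri_4cat = categorical_nri(risk_model1, risk_model2, died_by_tau, [0.10, 0.30, 0.50])
```

Changing the cutpoints, as in (2.1) versus (2.2), changes only the `cutpoints` argument; the predicted risks from the two models stay the same.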

(3)  (20 points) The third question is to consider two competing events: “time to AML death” (Event 1) and “time to Non-AML death” (Event 2). Using the competing risk model as described in Hosmer, Lemeshow and May (2008, Ch. 9.6), compute the unadjusted (no covariates other than treatment are included) cumulative incidence curves Fj(t), for j = Event 1 and j = Event 2, for patients in the IST and IMRAW groups.

(3.1) For t = 3 years, what are the estimates and their corresponding SE and 95% CI for F1(3|IST) and F2(3|IMRAW)?

(3.2) For t = 5 years, what are the estimates and their corresponding SE and 95% CI for F1(5|IST) and F2(5|IMRAW)?

(3.3) If the covariates Zi  are considered, what are the effects (i.e., coefficients) of Zi for Event 1?

(3.4) If the covariates Zi  are considered, what are the effects (i.e., coefficients) of Zi for Event 2?
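For the unadjusted curves in (3.1) and (3.2), a standard nonparametric estimate of the cumulative incidence adds, at each event time, the overall survival just before that time multiplied by the cause-specific hazard increment. The sketch below implements that estimator for a single group under stated assumptions: the coding 0 = censored, 1 = Event 1, 2 = Event 2 is my own choice, the function name `cumulative_incidence` and the toy data are hypothetical, and no SE or CI is computed. It does not cover the covariate-adjusted models in (3.3) and (3.4).

```python
def cumulative_incidence(time, cause, t_eval, j):
    """Nonparametric estimate of F_j(t_eval) for one group.

    cause: 0 = censored, 1 = Event 1, 2 = Event 2 (assumed coding).
    F_j(t) = sum over event times t_k <= t of S(t_k-) * d_jk / n_k,
    where S is the Kaplan-Meier estimate for death from any cause.
    """
    data = sorted(zip(time, cause))
    n_at_risk = len(data)
    surv = 1.0  # overall survival just before the current time
    cif = 0.0
    k = 0
    while k < len(data) and data[k][0] <= t_eval:
        t = data[k][0]
        d_all = d_j = n_t = 0
        # Gather all subjects sharing this observed time.
        while k < len(data) and data[k][0] == t:
            c = data[k][1]
            n_t += 1
            if c != 0:
                d_all += 1
            if c == j:
                d_j += 1
            k += 1
        cif += surv * d_j / n_at_risk
        surv *= 1.0 - d_all / n_at_risk
        n_at_risk -= n_t
    return cif

# Hypothetical toy data for a single treatment group (times in years).
time = [1, 2, 3, 4, 5, 6]
cause = [1, 2, 0, 1, 2, 1]

f1 = cumulative_incidence(time, cause, t_eval=4.5, j=1)
f2 = cumulative_incidence(time, cause, t_eval=4.5, j=2)
```

Applying the function separately to the IST and IMRAW subsets gives the group-specific curves. In the toy example, the two cumulative incidences and the overall survival at 4.5 years sum to 1, which is a useful sanity check.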

Part II. Writing Assignment (50% of total points, 60 points):

Write a short report summarizing the findings from all the data analysis results for the IMRAW-IST data in this course. Your report should be single spaced, 9 to 10 pages long (no longer than 10 pages, not counting the references), and should include the following items (10 points each):

(1) Summary;

(2) Introduction: background of the data, objectives and study questions, approaches and major findings;

(3) Methods: description of the statistical methods;

(4) Results: major results (tables, graphs, etc.) and interpretations;

(5) Conclusions;

(6) References: list all the papers and books cited in the main body.

Note that, since we have done a lot of data analysis, you may need to be selective and report only the results that are most relevant to the overall objectives of your report. The report should be similar to a short communication paper in a scientific journal or conference proceedings. For example, submissions to the proceedings of the International Statistical Institute are limited to 10 single-spaced pages.