Homework 2

STAT 5511 (Fall 2022)

The usual formatting rules:

•   Your homework (HW) should be formatted to be easily readable by the grader.

•   You may use knitr or Sweave to produce the code portions of the HW. However, include only the knitr/Sweave output that is necessary to answer the question, not every piece of automatic output that R produces. (You may thus need to avoid default R functions that print too much unnecessary material, and/or make use of invisible() or capture.output().)

–   For example: for output from regression, the main things we would want to see are the estimates for each coefficient (with appropriate labels, of course) together with the computed OLS/linear regression standard errors and p-values. If other output is not needed to answer the question, it should be suppressed!

•   Code snippets that directly answer the questions can be included in your main homework document; ideally these should be preceded by comments or text at least explaining what question they are answering. Extra code can be placed in an appendix.

•   All plots produced in R should have appropriate labels on the axes as well as titles. The accompanying text should clearly explain what each plot shows.

•   Plots and figures should be appropriately sized, meaning they should not be too large, so that the page length is not too long. (The arguments fig.height and fig.width to knitr chunks can achieve this.)

•   Directions for “by-hand” problems: In general, credit is given for (correct) shown work, not for final answers; so show all work for each problem and explain your answer fully.
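
As one illustration of the output rule above, the sketch below fits a regression and prints only the coefficient table. It uses R's built-in cars data set purely as a stand-in (it is not part of this homework), and the object names fit, coef_table and printed are arbitrary choices.

```r
# Illustration only: the built-in `cars` data set stands in for real homework data.
fit <- lm(dist ~ speed, data = cars)

# Keep just the coefficient table: estimates, standard errors, t values, p-values.
coef_table <- coef(summary(fit))   # same matrix as summary(fit)$coefficients
print(round(coef_table, 4))

# invisible() stops a value from being auto-printed at top level...
invisible(summary(fit))

# ...and capture.output() collects printed output as a character vector,
# so that only selected lines need to be shown.
printed <- capture.output(print(summary(fit)))
```

The last column of coef_table holds the p-values, so a single p-value can be pulled out by row and column name rather than by printing the whole summary.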

Questions:

1.  (Prediction using the cross-correlation function) Assume that Y_t = a X_{t−ℓ} + W_t for some number a. The series X_t is said to lead Y_t if ℓ > 0 and to lag Y_t if ℓ < 0. Assume that E(X_t) = E(Y_t) = 0, that {X_t} is stationary, and that W_t ∼ WN(0, σ^2) is uncorrelated with the whole series {X_t}. Let γ_x denote the autocovariance function of {X_t}.

(a) Is Y_t stationary?

(b) Compute the cross-covariance function between Y_t and X_s, for any s and t. (Your answer will depend on γ_x, the autocovariance function of X_t.)

(c) Compute the cross-correlation function between Y_t and X_s, for any s and t. (Your answer will depend on γ_x, the autocovariance function of X_t.)

2. Question 2.3, Shumway and Stoffer, 4th edition. (Note: this question differs somewhat from the version in previous editions.)

3. Question 2.10, Shumway and Stoffer. For (f)(iii), you can analyze the residuals both as you would in a non-time-series context (e.g., with a QQ-plot) and for correlation (using the ACF).

4. Consider the setup of the previous question (Question 2.10, Shumway and Stoffer), and let us focus on just the oil series. One model we might consider for the (untransformed) oil series is the random walk with drift model, X_t = δ_1 + X_{t−1} + W_t, where W_t ∼ WN(0, σ^2). If we let X_0 = δ_0 be a constant “intercept” term, then, as we checked in class, we can write X_t = δ_0 + δ_1 t + Σ_{s=1}^{t} W_s. The mean of X_t is thus δ_0 + δ_1 t. We might be interested in estimating this (linear) regression function.

(a) Use lm() to regress the (untransformed) oil series on time. Print the summary() of the results and plot the data with the regression line. Comment briefly on the statistical significance of the coefficients.

(b) Compute by hand the F-statistic for testing H_0 : δ_1 = 0 against H_A : δ_1 ≠ 0 (i.e., for testing whether there is a drift) and the corresponding p-value by comparing to the appropriate F-distribution (see ?FDist). Check that your result matches the F-statistic reported by summary().

(c) Also compute the regression with the quadratic time term, corresponding to the model E(X_t) = δ_0 + δ_1 t + δ_2 t^2 (this does not correspond to the random walk with drift model, so is a slight digression). Compute the F-statistic and p-value for testing H_0 : δ_2 = 0. Note that your p-value should match the one reported for the quadratic term by summary(). (For each coefficient, summary() reports the result of testing that coefficient to be 0 in a model where all the other coefficients are present. In fact, the square root of your F-statistic should equal the absolute value of the reported t-statistic.)

(d) Now we return to the random walk with drift model and the results of 4b. We want to assess whether the p-values that we computed actually mean anything. We will run a simulation study to assess this, as follows.

Simulate a random walk with no drift, X_t, for t = 1, ..., 545. (You may want to use the cumsum function.) Assume W_t ∼ N(0, 1) (you may also take δ_0 = 0, although the value of δ_0 will not matter here). Run the regression of X_t on time that we ran previously in 4a, E(X_t) = δ_0 + δ_1 t. Get the p-value for testing H_0 : δ_1 = 0. (Note: you can get p-values from summary()$coefficients.) Do this procedure M = 1000 times. Report the proportion of p-values that are smaller than 0.05. Provide a comment explaining what this means for the p-value reported in 4b.
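
The R helpers named in Question 4 (cumsum and the F-distribution functions documented under ?FDist) can be tried out on toy inputs first. The sketch below is not a solution to any part of the question: the values f_stat, df1, df2, the seed, and the series length n are all made up for illustration.

```r
# Illustration only: every number below is invented; none comes from the oil data.

# Upper-tail probability of an F distribution, as documented in ?FDist.
f_stat <- 4.2   # hypothetical F-statistic
df1 <- 1        # hypothetical numerator degrees of freedom
df2 <- 30       # hypothetical denominator degrees of freedom
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

# With one numerator degree of freedom, F = t^2, so the F-test p-value
# equals the two-sided t-test p-value for t = sqrt(F).
p_from_t <- 2 * pt(sqrt(f_stat), df2, lower.tail = FALSE)
stopifnot(isTRUE(all.equal(p_value, p_from_t)))

# cumsum() turns white noise into a driftless random walk:
# x[t] = w[1] + ... + w[t].
set.seed(1)     # arbitrary seed so the example is reproducible
n <- 10         # short toy length
w <- rnorm(n)
x <- cumsum(w)
```

Using lower.tail = FALSE asks pf() for the upper-tail probability directly, which avoids the rounding loss that 1 - pf(...) can suffer when the p-value is very small.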