STAT 5511 (Fall 2021) Homework 2
The usual formatting rules:
• Your homework (HW) should be formatted to be easily readable by the grader.
• You may use knitr or Sweave in general to produce the code portions of the HW. However, the output from knitr/Sweave that you include should be only what is necessary to answer the question, rather than just any automatic output that R produces. (You may thus need to avoid using default R functions if they output too much unnecessary material, and/or should make use of invisible() or capture.output().)
– For example: for output from regression, the main things we would want to see are the estimates for each coefficient (with appropriate labels of course) together with the computed OLS/linear regression standard errors and p-values. If other output is not needed to answer the question, it should be suppressed!
• Code snippets that directly answer the questions can be included in your main homework document; ideally these should be preceded by comments or text at least explaining what question they are answering. Extra code can be placed in an appendix.
• All plots produced in R should have appropriate labels on the axes as well as titles. Any plot should have an explanation of what is being plotted given clearly in the accompanying text.
• Plots and figures should be appropriately sized, meaning they should not be too large, so that the page length is not too long. (The arguments fig.height and fig.width to knitr chunks can achieve this.)
• Directions for “by-hand” problems: In general, credit is given for (correct) shown work, not for final answers; so show all work for each problem and explain your answer fully.
Questions:
1. (Prediction using the cross-correlation function) Assume that $Y_t = aX_{t-\ell} + W_t$ for some number $a$. The series $X_t$ is said to lead $Y_t$ if $\ell > 0$ and to lag $Y_t$ if $\ell < 0$. Assume that $E(X_t) = E(Y_t) = 0$, that $\{X_t\}$ is stationary, and that $W_t \sim \mathrm{WN}(0, \sigma^2)$ is uncorrelated with the whole series $\{X_t\}$. Let $\gamma_x$ denote the autocovariance function of $\{X_t\}$.
(a) Is $Y_t$ stationary?
(b) Compute the cross-covariance function between $Y_t$ and $X_s$, for any $s$ and $t$. (Your answer will depend on $\gamma_x$, the autocovariance function of $X_t$.)
(c) Compute the cross-correlation function between $Y_t$ and $X_s$, for any $s$ and $t$. (Your answer will depend on $\gamma_x$, the autocovariance function of $X_t$.)
Solution:
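(a) Yes. We have $E Y_t = aE X_{t-\ell} + E W_t = 0$ for all $t$. Writing $\sigma_w^2$ for the variance of $W_t$, we have $\mathrm{Var}(Y_t) = a^2\mathrm{Var}(X_{t-\ell}) + \sigma_w^2 = a^2\gamma_x(0) + \sigma_w^2$, because $X_{t-\ell}$ and $W_t$ are uncorrelated and $\{X_t\}$ is stationary. For $h \neq 0$, the cross terms $\mathrm{Cov}(X_{t-\ell}, W_{t-h})$ and $\mathrm{Cov}(W_t, X_{t-\ell-h})$ vanish for the same reason, and $\mathrm{Cov}(W_t, W_{t-h}) = 0$ because $W_t$ is white noise, so $\mathrm{Cov}(Y_t, Y_{t-h}) = \mathrm{Cov}(aX_{t-\ell}, aX_{t-\ell-h}) = a^2\gamma_x(h)$.
Thus $Y_t$ has constant mean, constant variance, and an autocovariance that depends only on the time difference $h$. We conclude that $Y_t$ is indeed (weakly) stationary.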
(b) $\mathrm{Cov}(Y_t, X_s) = \mathrm{Cov}(aX_{t-\ell} + W_t, X_s) = a\,\mathrm{Cov}(X_{t-\ell}, X_s) + \mathrm{Cov}(W_t, X_s) = a\gamma_x(|t - \ell - s|)$, where we have used that $W_t$ is uncorrelated with $X_s$ for all $(t, s)$ and that $\{X_t\}$ is stationary by assumption.
(c) Using the calculation of Var(Yt) from above, the cross-correlation is
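$$\frac{\mathrm{Cov}(Y_t, X_s)}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(X_s)}} = \frac{a\gamma_x(|t - \ell - s|)}{\sqrt{\left(\sigma_w^2 + a^2\gamma_x(0)\right)\gamma_x(0)}}.$$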
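The cross-covariance formula in (b) can be checked by simulation. The sketch below is not part of the original solution: it assumes an AR(1) process for $X_t$ (so that $\gamma_x(h) = \phi^{|h|}/(1-\phi^2)$ with unit innovation variance), and the values of a, ell, phi and n are arbitrary illustrative choices. It compares the sample cross-covariance at $k = t - s$ with $a\gamma_x(|k - \ell|)$.

```r
set.seed(5511)                      # arbitrary seed, for reproducibility only
n   <- 20000                        # long series so sample moments are stable
a   <- 2; ell <- 3; phi <- 0.6      # illustrative values (assumptions)

# Assumed model for X: stationary AR(1) with unit innovation variance,
# so gamma_x(h) = phi^|h| / (1 - phi^2)
x <- as.numeric(arima.sim(model = list(ar = phi), n = n))
w <- rnorm(n)                       # white noise with sigma_w^2 = 1, independent of x

# Y_t = a * X_{t - ell} + W_t, defined for t = ell + 1, ..., n
t_idx <- (ell + 1):n
y     <- a * x[t_idx - ell] + w[t_idx]
x_al  <- x[t_idx]                   # X_t aligned with Y_t

gamma_x <- function(h) phi^abs(h) / (1 - phi^2)

# ccf(y, x) at lag k estimates Cov(Y_{t+k}, X_t), i.e. k = t - s in part (b)
cc <- ccf(y, x_al, lag.max = 6, type = "covariance", plot = FALSE)
k  <- as.vector(cc$lag)
result <- data.frame(k      = k,
                     sample = round(as.vector(cc$acf), 3),
                     theory = round(a * gamma_x(k - ell), 3))
print(result)
```

Under these assumptions the sample and theoretical columns should agree closely, with the largest value at $k = \ell$, which is how the cross-covariance identifies the lead of $X_t$ over $Y_t$.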
2. Question 2.3, Shumway and Stoffer, 4th edition (Note: The question is somewhat different than in previous editions).
Solution:
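(a)
# Four simulated random walks with drift 0.01, each with a fitted line through the origin
par(mfrow=c(2,2),mar=c(2.5,2.5,0,0)+0.5,mgp=c(1.6,0.6,0))
for(i in 1:4){
  x<-ts(cumsum(rnorm(100,0.01,1)))
  model<-lm(x~time(x)+0,na.action = NULL)
  plot(x,ylab='random walk drift',las=1)
  abline(a=0,b=0.01,col=2,lty=2)  # true mean function 0.01*t (dashed)
  abline(model,col=4)             # fitted regression line (solid)
}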
[Figure: 2×2 panels, one per simulated random walk with drift (y-axis "random walk drift"), plotted against Time from 0 to 100.]
The dashed line is the true mean function and the solid line is the fitted one.
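(b)
# Four simulated series of linear trend 0.01*t plus white noise, each with a fitted line
par(mfrow=c(2,2),mar=c(2.5,2.5,0,0)+0.5,mgp=c(1.6,0.6,0))
for(i in 1:4){
  x<-ts(rnorm(100))
  y<-0.01*time(x)+x
  model<-lm(y~time(x)+0,na.action = NULL)
  plot(y,ylab='linear trend plus noise',las=1)  # plot y (trend plus noise), not the noise x
  abline(a=0,b=0.01,col=2,lty=2)  # true mean function 0.01*t (dashed)
  abline(model,col=4)             # fitted regression line (solid)
}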
[Figure: 2×2 panels, one per simulated series (y-axis "linear trend plus noise"), plotted against Time from 0 to 100.]
The dashed line is the true mean function and the solid one is the fitted one.
(c) This question explores two very different models or data generating mechanisms. The estimated line based on the linear trend model does quite well (“is consistent”, we say) whereas based on the random walk it does poorly.
We saw in class the theoretical property that random walks are nonstationary because the variance of a random walk accumulates over time. This simulation shows what it means that the “trend” that we (think we) see in a random walk is actually variance rather than a true trend. (The four different instantiations of the random walk had four different “trends”, whereas the four different simulations in (b) had very similar trends.)
One way to think about this is to think about prediction: predicting future values based on the estimates in part (b) will tend to do well, but in part (a) the estimated line will be useless for prediction.
Another thing to notice is that the series as a whole is much more variable in (a) than in (b). For instance, across the simulations the last observation ($X_{100}$) ranges from around −10 to +4 in (a), whereas in (b) it is always between −2 and 2.
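The contrast described in (c) can also be seen with a small Monte Carlo sketch (not part of the original solution; the number of replications and the seed are arbitrary choices). It repeats both simulations many times and compares the spread of the fitted slopes, using the closed-form least-squares slope through the origin that lm(x ~ time(x) + 0) computes.

```r
set.seed(2021)                       # arbitrary seed
n_rep <- 500; n <- 100; delta <- 0.01
tt <- 1:n

# Least-squares slope for a regression through the origin on time
slope0 <- function(x) sum(tt * x) / sum(tt^2)

# Fitted slopes under each data generating mechanism
slope_rw    <- replicate(n_rep, slope0(cumsum(rnorm(n, delta, 1))))  # random walk with drift
slope_trend <- replicate(n_rep, slope0(delta * tt + rnorm(n)))       # linear trend plus noise

spread <- c(random_walk = sd(slope_rw), trend_plus_noise = sd(slope_trend))
print(round(spread, 4))
```

The standard deviation of the fitted slope should come out far larger for the random walk than for the trend-plus-noise model, which is the sense in which the fitted line is unreliable in (a) and reliable in (b).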
3. Question 2.10, Shumway and Stoffer. For (f)(iii), you can do both analysis of the residuals as you would in a non-time series context (e.g., a QQ-plot) and analysis of the correlation of the residuals (using the ACF).
Solution:
(a)
library(astsa)  # provides the gas and oil series
par(mfrow=c(1,1))
plot.ts(gas,ylab="price",main="gas and oil",ylim=c(25,325),col=1,las=1)
lines(oil, col=2)
legend("topright", legend = c("gas","oil"),lty = 1, col = 1:2, bty = 'n', cex=0.6)
[Figure: "gas and oil", both price series plotted against Time, 2000 to 2010.]
Both series look like random walks, perhaps with drift, so they are not stationary. There is one very large jump visible. Even excluding that one, several other quite large jumps remain, which suggests periods of high volatility or heavy-tailed behavior. Ignoring those, the random walk (with drift) model seems reasonable.
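(b) If $X_{t+1} = (1 + r)X_t$, then $\log(X_{t+1}/X_t) = \log X_{t+1} - \log X_t = \log(1 + r)$, which is approximately $r$ if $r$ is close to 0. So $\nabla \log(x_t)$, the difference of successive log values, is a good approximation to the growth rate (percentage change) $r$.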
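A quick numerical check of this approximation, as a sketch: the growth rates and the starting level of 100 below are arbitrary illustrative values, not taken from the data.

```r
r <- c(0.001, 0.01, 0.05, 0.10)          # illustrative growth rates (assumptions)
approx_err <- abs(log(1 + r) - r)        # error of the approximation log(1 + r) ~ r
print(data.frame(r = r, log1pr = log(1 + r), error = approx_err))

# A toy series built from x_t = (1 + r_t) * x_{t-1}, starting at an arbitrary level 100
x <- 100 * cumprod(1 + r)
growth <- diff(log(c(100, x)))           # differenced logs, one value per step
print(growth)
```

The error is bounded by $r^2/2$ for $r > 0$, so the approximation is very accurate for small week-to-week changes and degrades as the changes grow.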
(c)
# Growth rates: differenced logs of the price series
gas_gr <- diff(log(gas))
oil_gr <- diff(log(oil))
plot.ts(gas_gr,main="oil and gas growth rate",ylab = 'growth rate',col=1,las=1)
lines(oil_gr, col=2)
legend("topright", legend = c("gas growth rate","oil growth rate"),
lty = 1, col = 1:2, bty = 'n', cex=0.6)
[Figure: "oil and gas growth rate", both growth-rate series plotted against Time, 2000 to 2010.]
par(mfrow=c(2,1))
acf(gas_gr,main = 'gas growth rate')
acf(oil_gr,main= 'oil growth rate')
[Figure: sample ACFs of the gas growth rate (top) and the oil growth rate (bottom), plotted against Lag.]
The transformed series look fairly stationary: most of the sample ACF values at nonzero lags lie within the approximate 95% bounds, so there is no sign of the slowly decaying autocorrelation that a random walk would produce.
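For reference, the dashed bounds that acf() draws are $\pm z_{0.975}/\sqrt{n}$, the approximate 95% limits for the sample autocorrelations of white noise. The sketch below uses simulated Gaussian white noise as a stand-in (the series length and seed are arbitrary assumptions) and counts how many lags fall outside the bounds.

```r
set.seed(1)                               # arbitrary seed
n <- 500                                  # illustrative series length (assumption)
z <- rnorm(n)                             # Gaussian white noise stand-in

# Sample autocorrelations at lags 1..20 (lag 0 is always 1, so drop it)
rho_hat <- as.vector(acf(z, lag.max = 20, plot = FALSE)$acf)[-1]

bound <- qnorm(0.975) / sqrt(n)           # the dashed lines drawn by acf()
n_outside <- sum(abs(rho_hat) > bound)
cat("bound:", round(bound, 3), " lags outside:", n_outside, "of", length(rho_hat), "\n")
```

Even for true white noise about 1 lag in 20 is expected to fall outside the bounds, so a few marginal exceedances in the growth-rate ACFs are not by themselves evidence of dependence.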
(d) par(mfrow=c(1,1))
ccf(oil_gr, gas_gr, main = 'gas growth rate & oil growth rate',
ylab = 'CCF',las=1)
[Figure: "gas growth rate & oil growth rate", sample CCF plotted against Lag (axis from −0.4 to 0.4).]
The plot shows the sample cross-correlation $\hat\rho_{\text{oil,gas}}(h)$, which estimates $\mathrm{Corr}(O_{t+h}, G_t)$. (Here $h$ counts weeks; the lag axis of the plot is labeled in years because the series are weekly.) The strongest correlation is at $h = 0$: the two series are strongly contemporaneously correlated. Significant CCF values at lags $h > 0$ indicate that gas leads oil; significant values at lags $h < 0$ indicate that oil leads gas. Oil is used to produce gas, so we would expect, a priori, that oil leads gas, which would show up as significant values at lags $h \le 0$. We do indeed see that oil significantly leads gas at a one-week lag ($h = -1$). (It is debatable whether oil leads gas at three weeks, $h = -3$.) At $h = 3$ the plot also suggests that gas leads oil by three weeks, and perhaps at $h = 1, 2$ as well. As the textbook mentions, this might be considered a feedback loop (e.g., the price of gas is high, so oil sellers realize they could raise the price of oil and gas sellers would still pay it).
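The sign convention used above (the first argument of ccf() leads when the significant values are at negative lags) can be confirmed on synthetic data. In this sketch the names lead_series and follower, the delay d = 2, and the noise level are all hypothetical choices made for illustration.

```r
set.seed(42)                              # arbitrary seed
n <- 2000; d <- 2                         # d = 2 is an arbitrary illustrative delay
lead_series <- rnorm(n)

# follower_t = lead_{t-d} + noise: the follower reacts d steps after the leader
follower <- c(rep(NA, d), lead_series[1:(n - d)]) + rnorm(n, sd = 0.5)

keep <- (d + 1):n                         # drop the first d undefined values
cc <- ccf(lead_series[keep], follower[keep], lag.max = 5, plot = FALSE)

# ccf(x, y) at lag k estimates Corr(x[t + k], y[t]); find where it peaks
peak_lag <- as.vector(cc$lag)[which.max(as.vector(cc$acf))]
print(peak_lag)
```

The peak appears at lag $-d$, matching the reading of the oil and gas plot: with oil as the first argument, significant values at negative lags mean that oil leads gas.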
(e) lag2.plot(oil_gr,gas_gr,3,corr=TRUE,smooth=TRUE)
[Figure: lag2.plot output, scatterplots of the gas growth rate against lagged values of the oil growth rate. Legible in the output: a correlation of 0.66 and an x-axis labeled oil_gr(t−0) running from about −0.2 to 0.2.]
2022-10-20