
STAT 3032 Regression and Correlated Data

Homework 11

Please show your work on each problem for full credit. A correct answer that is not supported by the necessary explanation, R code, or output will receive very little, if any, credit. Your work needs to be organized in a reasonably neat and coherent way and submitted as a PDF file on Canvas.

Please do not share this handout outside the class.

Problem 1

The data file GoogleStockVolume2020.csv contains the daily volume (the number of shares that change hands during a given day) from January 2, 2020 to April 17, 2020 (excluding weekends and holidays). Please note that volume is the second column of the dataset.

(a) [1 pt] Use R to draw a time series plot of the daily volume in the dataset. Make sure that you show the relevant R code and output.

(b) [2 pts] Generate the ACF plot and the PACF plot of the time series of the daily volume. Make sure that you show the relevant R code and output. Based on the plots, which of the following models would you consider fitting to this time series data? Please select the best answer and explain.

A: The white noise model

B: The linear regression model

C: AR(1)

D: AR(3)
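
As a rough illustration of the plotting functions involved, the sketch below draws a time series plot, an ACF plot, and a PACF plot. It uses a simulated series (a hypothetical AR(2) process made up for this example), not the homework data, so the pattern it shows says nothing about the Google volume series.

```r
# Hypothetical data: a simulated AR(2) series, not the homework data.
set.seed(1)
x <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 200)

# Time series plot of the simulated values.
plot(x, type = "l", xlab = "Time", ylab = "Value", main = "Simulated series")

# Sample ACF and PACF; each call draws a plot and returns the estimates.
acf_out  <- acf(x, main = "ACF")
pacf_out <- pacf(x, main = "PACF")

# Approximate 95% bounds, drawn as dashed lines on both plots.
bound <- 1.96 / sqrt(length(x))
bound
```

For an AR(p) process, the theoretical ACF tails off gradually while the PACF cuts off after lag p. Spikes that stay inside the dashed bounds are usually treated as not significantly different from zero.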

(c) [2 pts] Fit the time series model you selected in Part (b) using the arima( ) function. Write down the fitted model. Pay attention to the notation. Please show your work with the relevant R code and output.
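
On the notation point, a minimal sketch of how arima( ) output can be read is given here. It assumes a simulated AR(2) series with mean 10 (hypothetical values chosen only for illustration). The coefficient that arima( ) labels "intercept" is the estimated mean of the series, so the fitted AR(2) model is written as x_t - mu = phi_1 (x_{t-1} - mu) + phi_2 (x_{t-2} - mu) + e_t.

```r
# Hypothetical data: simulated AR(2) series with true mean 10, not the homework data.
set.seed(2)
x <- 10 + arima.sim(model = list(ar = c(0.5, 0.3)), n = 300)

# order = c(p, d, q); an AR(2) model has p = 2, d = 0, q = 0.
fit <- arima(x, order = c(2, 0, 0))
print(fit)

phi    <- coef(fit)[c("ar1", "ar2")]  # estimated AR coefficients
mu     <- coef(fit)["intercept"]      # estimated mean, despite the "intercept" label
sigma2 <- fit$sigma2                  # estimated variance of the white noise term
```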

(d) [2 pts] Produce the ACF plot of the residuals of the fitted model in Part (c). If the model fits the data well, the residuals should behave like white noise. One of the features of white noise is independence. Based on the ACF plot, do the residuals seem independent of each other? Please explain. Hint: use modelName$residuals to get the residuals.
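
The residual check can be sketched as follows, again on a simulated AR(2) series (hypothetical data, with `fit` playing the role of modelName). The Ljung-Box test at the end is an optional numeric supplement that this handout does not ask for.

```r
# Hypothetical data: simulated AR(2) series, not the homework data.
set.seed(3)
x   <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 250)
fit <- arima(x, order = c(2, 0, 0))

# Residuals of the fitted model.
res <- fit$residuals

# ACF of the residuals; the lag-0 spike is always 1 and carries no information.
acf(res, main = "ACF of residuals")

# Optional: Ljung-Box test of no autocorrelation up to lag 10.
# fitdf = 2 accounts for the two estimated AR coefficients.
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 2)
lb$p.value
```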

(e) [1 pt] Predict the volume for April 20, 2020, the next trading day after the last observation. Please use the formula of the fitted model in Part (c). For this part, you are not allowed to use the predict( ) function. Please show your work. Hint: The stock market was closed on April 18 and 19, 2020, since they were a Saturday and a Sunday.

(f) [1 pt] Predict the volume for April 20, 2020. Please use the predict( ) function and show your work. Hint: You should get an answer very similar to that of Part (e). Because the predicted values are large, their last digits before the decimal point could be different.
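
To show why the two approaches in Parts (e) and (f) should agree, here is a small sketch that computes a one-step-ahead forecast by hand and with predict( ). It assumes a simulated AR(2) series with mean 1000 (hypothetical data), so the model order and numbers do not carry over to the homework.

```r
# Hypothetical data: simulated AR(2) series with mean 1000, not the homework data.
set.seed(4)
x   <- 1000 + arima.sim(model = list(ar = c(0.5, 0.3)), n = 300)
fit <- arima(x, order = c(2, 0, 0))

phi <- coef(fit)[c("ar1", "ar2")]
mu  <- coef(fit)["intercept"]  # estimated mean, despite the "intercept" label
n   <- length(x)

# One-step-ahead forecast from the fitted equation:
# x_hat(n+1) = mu + phi1 * (x_n - mu) + phi2 * (x_{n-1} - mu)
manual <- unname(mu + phi[1] * (x[n] - mu) + phi[2] * (x[n - 1] - mu))

# The same forecast from predict(); $pred is the point forecast, $se its standard error.
auto <- as.numeric(predict(fit, n.ahead = 1)$pred)

c(manual = manual, predict = auto)
```

If the coefficients are rounded before being plugged into the formula, the hand-computed value drifts slightly from the predict( ) value, which is the effect the hint in Part (f) describes.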

(g) [1 pt] The actual volume on April 20, 2020 was 1,695,500. (You can find it on the Alphabet Inc. (GOOG) Stock Historical Prices & Data page.) What is the residual of this observation? Hint: the residual is the observed value minus the predicted/estimated value. You can use the predicted value in Part (e) or Part (f).

Problem 2

Download the data file HW11_Problem2_Data_S2021.csv from Canvas. The second column, value, contains the time series data.

(a) [1 pt] How many observations are included in this time series? Please explain how you obtained the answer.
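
A few common ways to count observations in R are sketched below. The data frame `dat` and its `index` column are hypothetical stand-ins made up for this example; with the real file, `dat` would instead come from read.csv( ).

```r
# Hypothetical stand-in for the CSV contents; the values are made up.
dat <- data.frame(index = 1:5, value = c(2.1, 1.8, 2.4, 2.0, 2.2))

n_rows    <- nrow(dat)               # number of rows, one per time point
n_values  <- length(dat$value)       # length of the series itself
n_present <- sum(!is.na(dat$value))  # number of non-missing values

c(rows = n_rows, values = n_values, non_missing = n_present)
```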

(b) [4 pts] Fit an appropriate model to the time series. What is the equation of the fitted model? Please show your work. Hint: You need to first identify the model type and then obtain the model coefficients.

(c) [3 pts] Perform model diagnostics. Please show your work. Hint: use the time series plot and the ACF plot of the residuals.

(d) [2 pts] Predict the next two observations using the model you found in Part (b). Please show your work. You may use the predict( ) function or the formula of the fitted model.
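
For forecasts more than one step ahead, the fitted formula is applied recursively, with each forecast standing in for the value that has not been observed yet. The sketch below assumes a simulated AR(1) series with mean 50 (hypothetical data and a hypothetical model type, not a statement about the homework series).

```r
# Hypothetical data: simulated AR(1) series with mean 50, not the homework data.
set.seed(5)
x   <- 50 + arima.sim(model = list(ar = 0.7), n = 200)
fit <- arima(x, order = c(1, 0, 0))

phi <- unname(coef(fit)["ar1"])
mu  <- unname(coef(fit)["intercept"])  # estimated mean, despite the "intercept" label
n   <- length(x)

# Step 1 uses the last observed value; step 2 plugs in the step-1 forecast.
step1 <- mu + phi * (x[n] - mu)
step2 <- mu + phi * (step1 - mu)

# The same two forecasts from predict(), with their standard errors.
fc <- predict(fit, n.ahead = 2)
cbind(manual = c(step1, step2), predict = as.numeric(fc$pred), se = as.numeric(fc$se))
```

The standard error grows with the forecast horizon, since the second forecast inherits the uncertainty of the first.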