Math 189 Spring 2023 Homework 6
Due: 5/14/23 11:59 PM PT
Save the .html RMarkdown file as a PDF, and submit the PDF to Gradescope. Make sure the PDF file contains both code and results. If the grader cannot open your submission, you will receive a 0. Do not submit the .Rmd file.
Please assign your pages to problem numbers when submitting your PDF to Gradescope. Failure to do so will result in a 1-point deduction.
Conceptual Problems
1. When would you want to use ridge regression instead of a standard linear regression?
2. When would you not want to use ridge regression?
Application Problems
For this homework, you will use the Hitters data in the ISLR2 package.
3. Cross-validation for a linear model.
a. Remove the variables League, Division, and NewLeague from the data set. Then keep only the rows that are complete cases (i.e. no NAs). This can be done using the R function complete.cases. Your new data.frame should be 263 by 17. Verify this by reporting the dim of your data.frame.
b. Split your data into a training and test set, with the training data containing 80% of the observations and the test set containing the other 20%.
c. Fit a linear model with Salary as the response, and all other variables as covariates. Report the summary of this model.
d. Test your model on your test dataset. Report the root mean squared error of your predictions compared to their true values.
e. Compare your result in (d) to the residual standard error reported by R for the model you fit in (c). Which value did you expect to be larger? Why?
4. LASSO model. Use the same dataset as you constructed in 3a for this problem.
a. Use the LASSO for a model with Salary as the response and all other variables as covariates. Give a justification for the value of the regularization parameter λ you use.
b. Report the coefficients for the variables from the LASSO model. What does it mean if a variable gets zeroed out by LASSO?
c. Test the fitted LASSO model on your test dataset. Report the root mean squared error of your predictions compared to their true values.
5. Ridge regression model. Use the same dataset as you constructed in 3a for this problem.
a. Fit a linear model with ridge regression for a model with Salary as the response and all other variables as covariates. Give a justification for the value of the regularization parameter λ you use.
b. Report the coefficients for the variables from the ridge regression model. How do the coefficients compare to the coefficients from the linear model fit in 3c?
c. Test the fitted ridge regression model on your test dataset. Report the root mean squared error of your predictions compared to their true values.
6. Consider the linear model, LASSO, and ridge regression fits in problems 3, 4, and 5. Which model would you recommend if the General Manager of a baseball team is interested in knowing which variables are most important for predicting a player's salary?
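Hint: two operations recur in the problems above, namely keeping only complete rows with complete.cases and computing a root mean squared error. The following is a minimal sketch of both on a small made-up data frame, not on the Hitters data. The names toy, toy_complete, and rmse are hypothetical, chosen only for illustration; rmse is a helper defined here, not a function from base R or ISLR2.

```r
# Toy data frame (hypothetical values, not the Hitters data).
toy <- data.frame(
  x = c(1, 2, NA, 4, 5),
  y = c(2, 4, 6, NA, 10),
  g = c("a", "b", "a", "b", "a")
)

# Drop a column by name, then keep only the rows with no NA values.
toy <- toy[, !(names(toy) %in% "g")]
toy_complete <- toy[complete.cases(toy), ]
dim(toy_complete)  # number of rows, then number of columns

# Hypothetical helper: root mean squared error of predictions
# compared to their true values.
rmse <- function(predicted, actual) {
  sqrt(mean((predicted - actual)^2))
}

# Squared errors are 0, 0, and 4, so the result is sqrt(4 / 3).
rmse(c(1, 2, 3), c(1, 2, 5))
```

The same pattern should carry over to a real data set, with predictions coming from a fitted model evaluated on held-out rows rather than from hand-typed vectors.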
2023-05-14