
EMET8002 Case Studies in Applied Economic Analysis and Econometrics

Semester 2 2023

Computer Lab in Week 3

Question 1: Simple Linear Regression

Download the “states” data from Wattle and open it in Stata. In this question we explore the state-level relationship in the U.S. between SAT (Scholastic Assessment Test) scores and per-pupil expenditure in primary and secondary schools.

(a) Describe the variables of interest (the SAT score, coded as “csat”, and education expenditure, coded as “expense”) individually. Then report their correlation and produce a scatterplot. Are there any outliers?
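
As a starting point for part (a), the commands below sketch one possible way to describe the two variables. The file name states.dta and its location in the current working directory are assumptions; adjust the path to wherever you saved the download from Wattle.

```stata
* Assumption: the Wattle download is saved as states.dta in the working directory
use states, clear

* Detailed summary statistics (percentiles, skewness, kurtosis, extreme values)
summarize csat expense, detail

* Correlation between the two variables
correlate csat expense

* Scatterplot with the outcome variable on the vertical axis
scatter csat expense

* Box plot as a quick visual check for unusually large or small values
graph box expense
```

The smallest and largest values listed by summarize with the detail option, together with the two plots, give a first indication of which observations might be treated as outliers.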

(b) Run a simple linear regression model where “csat” is the dependent (outcome) variable and “expense” is the independent (explanatory) variable. Do this with and without accounting for outliers. What changes? Which model do you prefer?
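
One way to approach part (b) is sketched below. The outlier rule (more than three standard deviations from the mean of “expense”) and the variable name outlier are illustrative assumptions only, not a rule prescribed by the course; you may prefer a rule based on your scatterplot from part (a).

```stata
* Assumption: the Wattle download is saved as states.dta in the working directory
use states, clear

* Regression on the full sample
regress csat expense
estimates store full

* Hypothetical outlier rule: expense more than 3 standard deviations from its mean
summarize expense
generate byte outlier = abs(expense - r(mean)) > 3*r(sd) if !missing(expense)

* Regression excluding the flagged observations
regress csat expense if outlier == 0
estimates store trimmed

* Coefficients, standard errors, sample size and R-squared side by side
estimates table full trimmed, b se stats(N r2)
```

Comparing the two columns shows how much the slope on “expense” and the fit of the model depend on the flagged observations.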

(c) Test whether the distribution of the residuals from your regressions in part (b) follows a normal distribution. Does the normality assumption hold?
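
A possible sketch for part (c), shown for the full-sample regression only; the same steps can be repeated after the regression that accounts for outliers. The residual variable name res_full is an arbitrary choice, and the file name states.dta is an assumption.

```stata
* Assumption: the Wattle download is saved as states.dta in the working directory
use states, clear

* Re-run the regression and store its residuals
regress csat expense
predict res_full, residuals

* Shapiro-Wilk test (null hypothesis: the residuals are normally distributed)
swilk res_full

* Skewness and kurtosis test for normality
sktest res_full

* Quantile-normal plot as a visual check
qnorm res_full
```

A small p-value in either test is evidence against normality of the residuals.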

Question 2: Multiple Linear Regression and Quantile Regression

We continue working with the “states” dataset. In this question we explore the relationship between SAT (Scholastic Assessment Test) scores and the following four variables: (1) per-pupil expenditure in primary and secondary school (“expense”), (2) the percentage of high school graduates taking the SAT (“percent”), (3) median household income in $1,000 (“income”) and (4) the percentage of adults with a college degree (“college”). The data are provided at the state level for the U.S.

(a) Describe the five variables of interest individually as well as their correlations.

(b) Run a multiple linear regression model where “csat” is the dependent (outcome) variable and the other four variables are the independent (explanatory) variables.
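
The regression in part (b) could be run along the following lines. The variance inflation factors at the end are an optional extra, not something the question asks for; they are included only because the explanatory variables may be correlated with one another. The file name states.dta is an assumption.

```stata
* Assumption: the Wattle download is saved as states.dta in the working directory
use states, clear

* Multiple linear regression of csat on the four explanatory variables
regress csat expense percent income college

* Optional: variance inflation factors as a check for multicollinearity
estat vif
```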

(c) Test whether the distribution of the residuals from your regression in part (b) follows a normal distribution. Does the normality assumption hold?

(d) The multiple linear regression in part (b) estimates the conditional mean of the test scores. Instead, run quantile regressions to estimate the conditional median, the 10th percentile (0.1 quantile) and the 90th percentile (0.9 quantile) of the test scores. Use the same dependent and independent variables as in your model from part (b).
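
A minimal sketch of part (d) using Stata's qreg command, assuming the same file name as before. The final sqreg call is an optional alternative that estimates all three quantiles jointly with bootstrapped standard errors; the seed and the number of replications are arbitrary choices.

```stata
* Assumption: the Wattle download is saved as states.dta in the working directory
use states, clear

* Median regression (0.5 quantile)
qreg csat expense percent income college, quantile(.5)

* 0.1 quantile
qreg csat expense percent income college, quantile(.1)

* 0.9 quantile
qreg csat expense percent income college, quantile(.9)

* Optional: simultaneous estimation with bootstrapped standard errors
set seed 12345
sqreg csat expense percent income college, quantiles(.1 .5 .9) reps(100)
```

Comparing the coefficients across the three quantiles with the OLS estimates from part (b) shows whether the explanatory variables relate differently to low-scoring and high-scoring states.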

Question 3: Preparation for the Research Report [not required for problem set]

Last week we discussed some aspects of the research report (worth 45% of your final mark) and we now continue the preparation for the report as well as the research proposal. We strongly recommend starting your work on the project as soon as possible.

(a) Have a look at the section with the research report on Wattle and discuss the structure of the final research report.

(b) What data is required for replicating the papers? What are the data sources? If you need to apply for the data through the Australian Data Archive, we recommend starting the process now.

(c) As part of the project you are required to replicate and extend one of the papers. First of all, explain in your own words what is meant by replicating the main findings of a paper.

(d) Now explain in your own words what is meant by extending the results of a paper.

(e) As an example, consider the possible extension of updating the data (e.g., using new waves of data). Discuss some ideas for how this could be turned into a research question and backed up with economic theory and/or academic literature.