ECON 424/524: Lab 2
Lab 2: Simple Linear Regression
Goals for this lab:
● Continue learning how to explore and manipulate data using Stata
● Learn how to create scatterplots
● Learn how to estimate and interpret a univariate regression
1 Setup
1. Open Stata and open a new do-file.
2. Open the dataset wine.dta.
3. Examine the data using the commands describe, list, and summarize.
4. Note that the variable country is not numeric, but is what is called a “string” variable because it is stored as a string of non-numeric characters. Create a new numeric variable corresponding to country by doing the following:
● Type encode country, generate(country_num)
● Compare the values of country and country_num using the list and summarize commands.
● Type labelbook country_num
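The setup steps above could be collected in a do-file roughly as follows. This is a minimal sketch, assuming wine.dta sits in Stata's current working directory; the nolabel listing is an optional extra, not one of the lab steps.

```stata
* Setup sketch. Assumption: wine.dta is in the current working directory.
clear all
use wine.dta, clear

* Inspect the variables, the raw observations, and summary statistics
describe
list
summarize

* encode assigns an integer code to each distinct string value and
* attaches the original strings as a value label
encode country, generate(country_num)

* list shows the labels by default; nolabel reveals the underlying integers
list country country_num
list country country_num, nolabel

* summarize reports no observations for the string variable country
summarize country country_num

* Show the mapping between integer codes and country names
labelbook country_num
```

Running the do-file from the top with clear all keeps the session reproducible, since encode stops with an error if country_num already exists.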
2 Creating scatterplots
In this lab, we will analyze the relationship between wine consumption, disease and death. We will begin by looking at scatterplots to visualize this relationship.
1. Create a scatterplot of deaths against wine consumption by typing scatter deaths alcohol. You can learn more about where individual countries fall on this graph by typing scatter deaths alcohol, mlabel(country).
2. Create a scatterplot with a fitted regression line through it by typing twoway (scatter deaths alcohol) (lfit deaths alcohol).
3. Suppose instead we want to estimate the elasticity of death with respect to wine consumption. To do this, we generate new variables from the natural logs of the original variables by typing:
● generate ldeaths=ln(deaths)
● generate lalcohol=ln(alcohol)
● Label these variables by typing label variable ldeaths "log deaths per 100,000" and label variable lalcohol "log wine consumption per capita".
4. Create analogous scatterplots to those you created above using the new variables ldeaths and lalcohol.
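The plotting steps above might look like this in a do-file. It is a sketch only, assuming wine.dta is in the working directory and that the variables are named deaths, alcohol, and country as in the commands above.

```stata
* Scatterplot sketch. Assumption: wine.dta is in the current working directory.
use wine.dta, clear

* Levels: raw scatter, scatter with country labels, scatter with fitted line
scatter deaths alcohol
scatter deaths alcohol, mlabel(country)
twoway (scatter deaths alcohol) (lfit deaths alcohol)

* Logs: generate and label the transformed variables
generate ldeaths = ln(deaths)
generate lalcohol = ln(alcohol)
label variable ldeaths "log deaths per 100,000"
label variable lalcohol "log wine consumption per capita"

* Analogous plots in logs
scatter ldeaths lalcohol
scatter ldeaths lalcohol, mlabel(country)
twoway (scatter ldeaths lalcohol) (lfit ldeaths lalcohol)
```

Each graph command replaces the previous one in the Graph window. Adding an option such as name(levels_fit, replace), where levels_fit is an arbitrary name chosen here for illustration, keeps several graphs open at once.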
3 Running regressions
1. Run a regression of deaths on wine consumption by typing regress deaths alcohol.
2. Get the residuals and fitted values from the regression by typing:
● predict uhat, residuals
● predict death_hat, xb
3. Generate an alternate version of the deaths variable by typing generate death_alt = death_hat + uhat. Compare the variables deaths, death_alt and death_hat by typing summarize deaths death_alt death_hat.
4. Investigate the correlation between the alcohol variable and the residuals by typing correlate alcohol uhat.
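5. Repeat each of the steps above (starting with the regression in Step 1) using ldeaths and lalcohol as the dependent and independent variables, respectively. Choose new names for the residuals and fitted values, because predict and generate will not overwrite variables that already exist.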
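One way to script the regression steps is sketched below. Variable names are written with underscores because Stata names cannot contain spaces. The names luhat, ldeath_hat, and ldeath_alt for the log specification are hypothetical choices made for this sketch, and wine.dta is assumed to be in the working directory.

```stata
* Regression sketch. Assumption: wine.dta is in the current working directory.
use wine.dta, clear

* Levels regression
regress deaths alcohol

* predict uses the most recent estimation results
predict uhat, residuals
predict death_hat, xb

* Rebuild deaths from its fitted and residual parts, then compare
generate death_alt = death_hat + uhat
summarize deaths death_alt death_hat
correlate alcohol uhat

* Log-log regression (luhat, ldeath_hat, ldeath_alt are hypothetical names)
generate ldeaths = ln(deaths)
generate lalcohol = ln(alcohol)
regress ldeaths lalcohol
predict luhat, residuals
predict ldeath_hat, xb
generate ldeath_alt = ldeath_hat + luhat
summarize ldeaths ldeath_alt ldeath_hat
correlate lalcohol luhat
```

If the log variables were already generated earlier in the same session, the two generate lines should be dropped, since generate refuses to overwrite an existing variable.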
Name:
Complete and submit the following sheet along with your do-file by the end of the day:
1. What is the maximum per capita wine consumption in the dataset and for which country is the amount reported?
2. Based on the scatterplot of deaths against alcohol, what sign do you predict for the regression coefficient $\hat{\beta}_1$? Do you think the assumption that $E(u \cdot \text{alcohol}) = 0$ is likely to be true in this case?
3. In the regression of deaths on alcohol, what estimates do you obtain for $\hat{\beta}_0$ and $\hat{\beta}_1$? Interpret each of these numbers in words.
4. In the regression of deaths on alcohol, what is the value of $R^2$? Interpret this value in words.
5. Which variable has a higher standard deviation, deaths or death_hat, and why?
6. In the regression of ldeaths on lalcohol, what estimates do you obtain for $\hat{\beta}_0$ and $\hat{\beta}_1$? Interpret $\hat{\beta}_1$ in words.
2022-10-12