
Assignment 3: ECON-UA 266 - Intro to Econometrics

Spring 2023

The solutions to this assignment will be released on Friday, February 17th, 2023. You will work on algebraic properties of the OLS estimators. You are encouraged to discuss your solutions with your colleagues on Piazza. However, you should also work on your own solutions, since it is great practice for your exam. It is also good practice to report regression results neatly, rather than simply copying the tables generated by R. Be clear and concise in your answers.

Question 1

Define the three assumptions to derive the OLS estimator properties. Choose one and discuss whether it is a

plausible assumption. Give a specific example where the assumption might be violated.

Answer

The three assumptions to derive the properties of the OLS estimator are

1. Mean independence: E[εi | Xi] = 0

2. The sample is i.i.d.

3. Finite fourth moments: E[Xi^4] < ∞ and E[Yi^4] < ∞

We also saw one more assumption in class, i.e., we need variation in X . Without variation in X , the β is not well defined as β = Cov (Y, X)/Var(X ) and if Var (X) = 0, then β is undefined. We will focus on the three main assumptions in this question.

In many cases the mean independence assumption is violated.  A particular example is when researchers run regressions of wages on explanatory variables. In particular, suppose the researcher ran the following regression:

Earningsi = α + β EducationYrsi + εi

Where Earningsi corresponds to the weekly earnings of individual i and EducationYrsi corresponds to the number of years individual i spent in education. It is highly likely that the error term includes many components which also determine earnings and which are correlated with EducationYrs (e.g. ability, type of occupation, etc.). As such, the error term and our explanatory variable are likely correlated, violating mean independence.
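To make the omitted-ability story concrete, here is a small simulation (a sketch in Python rather than R, with made-up parameter values) in which unobserved ability raises both education and earnings; regressing earnings on education alone then overstates the return to education:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical parameters: ability raises both education and earnings,
# but ability is unobserved and lives in the error term.
ability = rng.normal(size=n)
educ = 12 + 2 * ability + rng.normal(size=n)
earnings = 100 + 50 * educ + 80 * ability + rng.normal(size=n)  # true return: 50

# OLS slope of earnings on education alone: Cov(Y, X) / Var(X)
beta_hat = np.cov(earnings, educ)[0, 1] / np.var(educ, ddof=1)
print(beta_hat)  # close to 50 + 80 * Cov(ability, educ) / Var(educ) = 82, not 50
```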

Question 2

In your assignment, you are asked to show that σ̂_Z² = (1/(N − 2)) Σ_i Ẑ_i² is an unbiased and consistent estimator for σ_Z².

Answer

In this question, we will accept intuitive answers.

We know from the estimator for the variance of a random variable that we need to divide by N − 1 to get an unbiased estimator of the variance. This is because we lose one degree of freedom: we first need to estimate the population mean in order to estimate the population variance. When N is large, this correction doesn't matter much. But let's show this result formally:

E[Σ_{i=1}^N (Xi − X̄)²] = E[Σ_{i=1}^N Xi² − N X̄²] = Σ_{i=1}^N E[Xi²] − N E[X̄²]

= N(σ_X² + μ_X²) − N(σ_X²/N + μ_X²) = (N − 1)σ_X²

Therefore

Σ_{i=1}^N (Xi − X̄)² / (N − 1)

would be an unbiased estimator of σ_X².

The principle is the same for the variance of Zi = (Xi − X̄)εi. But now we also need to estimate Zi first, as we don't observe εi. We will use Ẑi = (Xi − X̄)ei, with ei = Yi − α̂_OLS − β̂_OLS Xi.

This will make us lose another degree of freedom.
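The degrees-of-freedom correction can be checked numerically. The following Python sketch (illustrative sample size and variance, not from the assignment data) averages both variance estimators over many repeated samples: dividing by N gives roughly (N − 1)/N · σ², while dividing by N − 1 recovers σ²:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5              # small sample, so the degrees-of-freedom bias is visible
sigma2 = 4.0       # true variance (X ~ N(0, 2^2))
reps = 200_000     # number of repeated samples

samples = rng.normal(0, 2, size=(reps, N))
xbar = samples.mean(axis=1, keepdims=True)
ssq = ((samples - xbar) ** 2).sum(axis=1)  # sum of squared deviations per sample

print((ssq / N).mean())        # biased: about (N-1)/N * sigma2 = 3.2
print((ssq / (N - 1)).mean())  # unbiased: about sigma2 = 4.0
```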

Question 3

Show that

SST = SSE + SSR

Answer We need to show that

Σ_i (Yi − Ȳ)² = Σ_i (Ŷi − Ȳ)² + Σ_i (Yi − Ŷi)²

Note that (Yi − Ȳ) = (Yi − Ŷi) + (Ŷi − Ȳ)

So we see that

Σ_i (Yi − Ȳ)² = Σ_i (Ŷi − Ȳ)² + Σ_i (Yi − Ŷi)² + 2 Σ_i (Yi − Ŷi)(Ŷi − Ȳ)

It just remains to show that the last term on the right hand side is zero.

Noting that Ŷi = α̂ + β̂Xi, Ȳ = α̂ + β̂X̄, and that Yi − Ŷi = (Yi − Ȳ) − (Ŷi − Ȳ), we have that the last term is equal to

2 Σ_i [(Yi − Ȳ) − β̂(Xi − X̄)] β̂(Xi − X̄)

This is equal to

2β̂ Σ_i [(Xi − X̄)(Yi − Ȳ) − β̂(Xi − X̄)²]

Substituting the formula β̂ = Σ_j (Xj − X̄)(Yj − Ȳ) / Σ_j (Xj − X̄)² and re-arranging, this reduces to

2β̂ (Σ_i (Xi − X̄)(Yi − Ȳ) − Σ_j (Xj − X̄)(Yj − Ȳ)) = 2β̂ · 0 = 0

There is more than one way to answer this question. I will accept all answers.
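As a numerical sanity check of the decomposition (a Python sketch on simulated data, not part of the formal proof), fitting OLS by hand and computing the three sums shows SST = SSE + SSR up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)  # made-up data-generating process

# OLS fit
beta = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
alpha = y.mean() - beta * x.mean()
yhat = alpha + beta * x

sst = ((y - y.mean()) ** 2).sum()      # total sum of squares
sse = ((yhat - y.mean()) ** 2).sum()   # explained sum of squares
ssr = ((y - yhat) ** 2).sum()          # residual sum of squares
print(sst, sse + ssr)  # the two numbers agree up to floating-point error
```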

Question 4

1. Explain the differences between these three equations:

Ŷ = α̂_OLS + β̂_OLS X

Y = α + βX + ε

Y = α̂_OLS + β̂_OLS X + e

Answer The first equation represents the fitted regression line using OLS estimates of the regression model. The second equation represents the true population regression model, with ε to measure the error between the outcome variable Y and the value predicted by the population regression. The third equation is the OLS regression model. More specifically, this equation represents an OLS estimation of the population regression model.

2. What is the difference between e and ε from the previous question?

Answer We cannot observe the stochastic error ε in equation 2. Instead, we use the residual e to approximate ε .

3. Suppose you consider a model where Y = βX + ε. What is the consequence for the SSR? What about the sample covariance between X and the residuals, S_eX? Explain your answer intuitively or algebraically.

Answer Intuitively, we are setting α = 0, which leaves us less flexibility to minimize the SSR compared to the case with an intercept. Thus we expect the SSR in each sample to be higher, since we have imposed an extra restriction on the model. Moreover, since we arbitrarily set α = 0, the sample covariance between X and the residuals may no longer be 0. In applications, setting α = 0 can sometimes make the estimator appear more significant. However, it does not always make sense to set α = 0. Consider the association between the wheat harvest and the year's precipitation: even without precipitation, we would still expect some wheat to be harvested, making the restriction α = 0 implausible.

Algebraically, we solve a minimization problem for the model without an intercept, just as we did for the OLS model with an intercept:

Yi = β̂Xi + ei

min_β̂ Σ_{i=1}^n (yi − β̂xi)²

Taking the first order condition:

−2 Σ_{i=1}^n (yi − β̂xi) xi = 0

which can also be written as

−2 Σ_{i=1}^n ei xi = 0.

Solving the condition, we get:

β̂ = Σ_{i=1}^n xi yi / Σ_{i=1}^n xi²
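A quick numerical illustration (a Python sketch on simulated data, not part of the assignment): with no intercept, the single first order condition forces Σ ei xi = 0, but the residuals generally do not sum to zero:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(1, 5, size=500)
y = 3.0 + 1.5 * x + rng.normal(size=500)  # true model has a nonzero intercept

# No-intercept OLS slope from the single first order condition
beta_no_int = (x * y).sum() / (x ** 2).sum()
e = y - beta_no_int * x

print((e * x).sum())  # essentially 0: the FOC imposes sum(e_i * x_i) = 0
print(e.sum())        # far from 0: without an intercept, residuals need not sum to zero
```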

Question 5

Suppose you are interested in the relationship between earnings and the number of years of education: Earningsi = α + β × Educi + εi,

a. Explain what εi  is. What is included in εi ?

Answer εi captures any other factor that affects Earnings other than the years of education. It is the unobservable error term, representing the gap between the earnings predicted by years of education and actual earnings.

Many other factors can be included, such as experience, tenure, health condition, IQ, EQ, family connections, age, race, etc.

b. Do you think it is likely that E(εi | Educi) ≠ 0? Explain.

Yes, it is very likely that E(εi | Educi) ≠ 0, because we can easily imagine that those omitted factors are associated with years of education. Thus the conditional expectation of εi is highly unlikely to equal zero. This implies that Cov(Educi, εi) ≠ 0.

c. If the assumption is not satisfied, what is the consequence for the properties of the OLS estimator of α? What about the properties of the OLS estimator of β?

Both α̂ and β̂ are biased. Below are a proof and an example showing that they are biased.

Proof. First, we prove that β̂ is biased.

Recall the OLS estimator

β̂_OLS = Σ_i (Xi − X̄)(Yi − Ȳ) / Σ_i (Xi − X̄)²   (1)

We know that:

Yi = α + βXi + εi   (2)

And, averaging over i:

Ȳ = α + βX̄ + ε̄   (3)

Subtracting (3) from (2):

Yi − Ȳ = β(Xi − X̄) + (εi − ε̄)   (4)

Plugging (4) into (1), and using Σ_i (Xi − X̄)(εi − ε̄) = Σ_i (Xi − X̄)εi, we get:

β̂_OLS = β + Σ_i (Xi − X̄)εi / Σ_i (Xi − X̄)²

By the law of iterated expectations, we have:

E[β̂_OLS] = β + E[Σ_i (Xi − X̄)εi / Σ_i (Xi − X̄)²] = β + E[ E[Σ_i (Xi − X̄)εi / Σ_i (Xi − X̄)² | X] ] = β + E[ Σ_i (Xi − X̄) E[εi | X] / Σ_i (Xi − X̄)² ]

When mean independence fails, E[εi | X] does NOT necessarily equal 0. Thus E[β̂_OLS] ≠ β: the estimator of β is biased.

Next, we prove that α̂_OLS is biased. Since α̂_OLS = Ȳ − β̂_OLS X̄,

E[α̂_OLS] = E[Ȳ] − E[ (β + Σ_i (Xi − X̄)εi / Σ_i (Xi − X̄)²) X̄ ]

E[α̂_OLS] = E[Ȳ] − β E[X̄] − E[ X̄ Σ_i (Xi − X̄)εi / Σ_i (Xi − X̄)² ]

We can see that the last term does NOT necessarily equal 0, since E[εi | X] does NOT necessarily equal 0. Thus α̂_OLS is also biased.

A simple example: Suppose we have a multivariate population regression model Y = α + βX + γZ + ε, and suppose there is an association between Z and X, namely Z = d + fX + u. Then, if we absorb Z into the ε of our original population regression, we get:

Y = α + βX + γ(d + fX + u) + ε = (α + γd) + (β + γf)X + (γu + ε)

This example shows that if we ignore a factor Z that is associated with X, we will get a biased estimator of the coefficient on X. Here, α' = α + γd and β' = β + γf.
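The omitted-variable algebra can be verified by simulation. In this Python sketch the parameter values (β = 2, γ = 3, f = 0.5) are made up for illustration; the regression of Y on X alone recovers approximately β + γf rather than β:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
beta, gamma, d, f = 2.0, 3.0, 1.0, 0.5  # made-up parameter values

x = rng.normal(size=n)
z = d + f * x + rng.normal(size=n)           # Z is correlated with X
y = 0.5 + beta * x + gamma * z + rng.normal(size=n)

# Regress Y on X alone, absorbing Z into the error term
slope = np.cov(y, x)[0, 1] / np.var(x, ddof=1)
print(slope)  # close to beta + gamma * f = 3.5, not beta = 2
```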

Question 6

Consider the following population linear regression model: Yi  = α + βXi + εi .

Give the formula for the OLS estimator of β . Explain how the formula is being derived. What is the intuition behind the OLS estimator?

The OLS estimator of β is

β̂ = Cov(X, Y) / Var(X),

the sample covariance of X and Y divided by the sample variance of X. Intuitively, OLS chooses the line that minimizes the sum of squared vertical distances between the data points and the line, so the slope scales the comovement of X and Y by the variability of X.

Formally, the estimates (α̂, β̂) solve

min_{α̂,β̂} Σ_{i=1}^n (yi − α̂ − β̂xi)²

Let W = Σ_{i=1}^n (yi − α̂ − β̂xi)².

First, we take first order conditions with respect to both α̂ and β̂:

∂W/∂α̂ = −2 Σ_i (yi − α̂ − β̂xi) = 0   (1)

∂W/∂β̂ = −2 Σ_i (yi − α̂ − β̂xi) xi = 0   (2)

Then, we solve these two conditions, equations (1) and (2), for α̂ and β̂.

From equation (1), we have:

Σ_{i=1}^n yi − n α̂ − β̂ Σ_{i=1}^n xi = 0

As Σ_i yi = n ȳ and Σ_i xi = n x̄, we have:

n ȳ − n α̂ − n β̂ x̄ = 0

α̂ = ȳ − β̂ x̄   (3)

Then we plug equation (3) into equation (2):

Σ_{i=1}^n (yi − (ȳ − β̂x̄) − β̂xi) xi = 0

Σ_{i=1}^n (yi − ȳ) xi = β̂ Σ_{i=1}^n (xi − x̄) xi

β̂ = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)² = S_XY / S_XX,

where we use that the sample covariance satisfies

S_XY = Σ_{i=1}^n (xi − x̄)(yi − ȳ) = Σ_{i=1}^n (xi − x̄) yi − ȳ Σ_{i=1}^n (xi − x̄) = Σ_{i=1}^n (xi − x̄) yi + 0.
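As a check on the derived formula (a Python sketch on simulated data, not part of the solution), β̂ = S_XY/S_XX and α̂ = ȳ − β̂x̄ match the slope and intercept returned by a standard least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=300)
y = 4.0 - 1.2 * x + rng.normal(size=300)  # made-up data-generating process

# Closed-form OLS: beta_hat = S_XY / S_XX, alpha_hat = ybar - beta_hat * xbar
sxy = ((x - x.mean()) * (y - y.mean())).sum()
sxx = ((x - x.mean()) ** 2).sum()
beta_hat = sxy / sxx
alpha_hat = y.mean() - beta_hat * x.mean()

# Cross-check against numpy's least-squares polynomial fit
slope, intercept = np.polyfit(x, y, 1)
print(beta_hat - slope, alpha_hat - intercept)  # both differences are ~0
```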

Question 7

Which of the following can cause OLS estimators to be biased?

(i) The variance of the population linear regression model error term depends on X.

(ii) Omitting an important variable.

(iii) A sample where Xi and Xj are not independent.

Solution:

(i): This does not necessarily bias the OLS estimator itself. (However, it can cause problems with the standard errors of the OLS estimator. If the variance of the error term depends on X, this violates the assumption of homoscedasticity, which means that the standard errors of the OLS estimator may be biased. To obtain accurate estimates of the standard errors, we can use methods such as heteroscedasticity-robust standard errors.)

(ii): Yes, this can cause the OLS estimator to be biased by violating the conditional zero mean assumption (A1). Omitting an important variable from the regression model can result in omitted variable bias, which arises when the omitted variable is correlated with both the regressor in the model and the outcome variable. However, if the omitted variable is independent of the regressor, then there won’t be any bias.

(iii): This does not necessarily bias the OLS estimator itself. (As in (i), it can cause problems with the standard errors of the OLS estimator. If there is correlation or clustering in the data, such as when observations are correlated within a school, family, or city, the standard errors of the OLS estimator may be biased if we do not account for this “clustering”.)

As an example, consider a case where we want to estimate the relationship of education on earnings, but education correlates within cities. If the city has no direct effect on earnings beyond its effect on education, then the OLS coefficients will be unbiased. However, if the city has a direct effect on earnings beyond its effect on education, for example through the industries present in the city, then the OLS coefficients for the relationship between education and earnings will be biased. In this case, we need to account for the city-level effect on earnings by, e.g. including city fixed effects in the regression model.

Data Question

Continue with the dataset you downloaded previously, i.e., the 2016 CPS, which contains observations on weekly earnings, sex, age, race, and education for respondents aged 25-64.

a. Define the univariate population regression model formally.

Answer

Given data (Yi, Xi), the univariate population regression model is

Yi = α + βXi + εi

where the population coefficients (α, β) solve the minimization problem

min_{a,b} E[εi²]

with εi = Yi − a − bXi.

b. Run the regression and interpret the economic significance of the coefficient.

Answer

library(tidyverse)

## -- Attaching packages --------------------------------------- tidyverse 1.3.2 --
## v ggplot2 3.4.0
## v tibble  3.1.8
## v tidyr   1.3.0
## v readr   2.1.3
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(foreign)

library(stargazer)

##
## Please cite as:
##
##   Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##   R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

library(haven)

#Importing Data
#Change to your working directory when working on your computer

mydata <- read_dta("morg16.dta")

newdata = filter(mydata, intmonth==3, age>=25, age<=64)

subset_data = select(newdata, earnwke, sex, age, race, grade92)

#Log Income

subset_data[subset_data==0]  <- NA

subset_data = drop_na(subset_data)

subset_data['logincome'] = log(subset_data$earnwke)

#Mapping

newvals_grade92 <- c('31'=0, '32'=3, '33'=6, '34'=8, '35'=9,
                     '36'=10, '37'=11, '38'=12, '39'=12, '40'=14,
                     '41'=14, '42'=14, '43'=16, '44'=17, '45'=20, '46'=22)

subset_data['yrs_ed'] = newvals_grade92[as.character(subset_data$grade92)]

# Elasticity Method

ols  <- lm(logincome ~ yrs_ed, data=subset_data)

stargazer(ols, type = 'text')

##
## ================================================
##                         Dependent variable:
##                    ----------------------------
##                             logincome
## ------------------------------------------------
## yrs_ed                        0.103
##
## Constant
##
## ------------------------------------------------
## Observations                  11,025
## R2                             0.130
## Adjusted R2                    0.130
## Residual Std. Error      0.719 (df = 11023)
## F Statistic        1,642.860*** (df = 1; 11023)
## ================================================
## Note:              *p<0.1; **p<0.05; ***p<0.01

#Compute Mean of X and Y
sapply(subset_data, mean)

##     earnwke         sex         age        race     grade92   logincome      yrs_ed
##  992.340149    1.493515   43.487710    1.406803   41.027302    6.654382   14.251429

0.103*14.251429/6.654985

## [1] 0.2205711

Economic Significance: The elasticity evaluated at the means is η_YX = β̂ · X̄/Ȳ = 0.103 × 14.251429/6.654985 = 0.2205711. A one percent change in years of education is associated with roughly a 0.22 percent change in weekly earnings.

# Standardization Method
sapply(subset_data, sd)

##      earnwke          sex          age         race      grade92    logincome       yrs_ed
##  679.0229892    0.4999806   11.1073180    1.2335452    2.5411844    0.7710541    2.6844348

0.103*2.6844348/0.7710541

## [1] 0.3585958

Economic Significance: A one standard deviation increase in years of education is associated with a 0.36 standard deviation increase in log weekly earnings.
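Both back-of-the-envelope calculations can be reproduced directly (a quick Python sketch; 0.103 is the estimated coefficient on yrs_ed reported above, and the means and standard deviations are taken from the sapply() output):

```python
# beta_hat = 0.103 is the estimated coefficient on yrs_ed reported above;
# the means and standard deviations come from the sapply() output.
beta_hat = 0.103

# Elasticity method: beta_hat * mean(X) / mean(Y)
elasticity = beta_hat * 14.251429 / 6.654985
print(round(elasticity, 7))    # 0.2205711

# Standardization method: beta_hat * sd(X) / sd(Y)
standardized = beta_hat * 2.6844348 / 0.7710541
print(round(standardized, 7))  # 0.3585958
```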