ECMT2150 INTERMEDIATE ECONOMETRICS

Week 11 Tutorial - IV/2SLS Estimation & Policy Analysis with Pooled Cross Sections

Stata 1 (Wooldridge Chp 15 Computer Exercise 3 & Computer Exercise 5)

Use the data in CARD.dta for this exercise. The data comprise a sample of men from 1976.

a)       In example 15.4 the following equation is estimated:

$$\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 exper^2 + \beta_4 black + \beta_5 smsa + \beta_6 south + \dots + u$$

Note that the model also includes a full set of regional dummies (the variables reg662-reg669) and an SMSA dummy (smsa66) for whether each man was living in an SMSA in 1966. An SMSA is a "standard metropolitan statistical area", so the dummy is essentially an indicator of whether an individual lives in an urban area.

Reproduce the estimates in Table 15.1. That is, estimate the model above using OLS and using IV with nearc4 as the instrument for educ. Also estimate the first stage and check the identification/relevance condition.

b)       In order for IV to be consistent, the IV for educ, nearc4, must be uncorrelated with u. Could nearc4 be correlated with things in the error term, such as unobserved ability? Explain.

c)       For a subsample of the men in the data set, an IQ score is available. Regress IQ on nearc4 to check whether average IQ scores vary by whether the man grew up near a four-year college. What do you conclude?

d)       Now, regress IQ on nearc4, smsa66, and the 1966 regional dummy variables reg662, ..., reg669. Are IQ and nearc4 related after the geographic dummy variables have been partialled out? Reconcile this with your findings from part (c).

e)       From parts (c) and (d), what do you conclude about the importance of controlling for smsa66 and the 1966 regional dummies in the log(wage) equation?

f)        In a) above (and Table 15.1 in the textbook), notice that the difference between the IV and OLS estimates of the return to education is economically important.

Test whether educ is endogenous; that is, determine whether the difference between OLS and IV is statistically significant. To do so, you will need the residuals from the reduced form regression of educ on all the exogenous variables, including the instrument nearc4. Call these $\hat{v}_2$ (see eqn 15.49-15.51 in the textbook).

g)       We could also use nearc2 as an additional instrument. Estimate the first stage using nearc4 and nearc2. Then estimate the structural equation by 2SLS, using nearc4 and nearc2 as the instruments for educ. Does the coefficient on educ change much?
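The mechanics behind parts (a) and (f) can be sketched on simulated data. This is only an illustration, not a solution: it does not use CARD.dta, it is written in Python with NumPy rather than Stata, and every name and number in it (`ability`, `z`, `x`, `y`, the true slope of 0.10) is a made-up assumption. It compares OLS with 2SLS, checks instrument relevance in the first stage, and runs the regression-based endogeneity test by adding the first-stage residuals to the structural equation.

```python
import numpy as np

def ols(y, X):
    """OLS of y on X (X must already contain a constant column).
    Returns (coefficients, conventional standard errors, residuals)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta, se, resid

# Simulated data (NOT the CARD sample; all parameter values are hypothetical).
rng = np.random.default_rng(0)
n = 20_000
ability = rng.normal(size=n)                      # unobserved, ends up in u
z = rng.binomial(1, 0.5, size=n).astype(float)    # instrument, independent of ability
x = 12 + 1.0 * z + ability + rng.normal(size=n)   # endogenous regressor
u = 0.5 * ability + 0.5 * rng.normal(size=n)      # error correlated with x via ability
y = 1.0 + 0.10 * x + u                            # true slope on x is 0.10

const = np.ones(n)

# OLS: biased upward here because x and u share the ability term.
b_ols, _, _ = ols(y, np.column_stack([const, x]))

# First stage: regress x on the instrument; a large t statistic supports relevance.
pi_hat, pi_se, v_hat = ols(x, np.column_stack([const, z]))
first_stage_t = pi_hat[1] / pi_se[1]

# 2SLS point estimate: replace x by its first-stage fitted values.
# (Standard errors from this manual second stage are not valid 2SLS standard errors.)
x_hat = x - v_hat
b_2sls, _, _ = ols(y, np.column_stack([const, x_hat]))

# Endogeneity test: add the first-stage residuals to the structural equation
# and look at their t statistic.
b_cf, se_cf, _ = ols(y, np.column_stack([const, x, v_hat]))
endog_t = b_cf[2] / se_cf[2]

print("OLS slope:         ", b_ols[1])
print("2SLS slope:        ", b_2sls[1])
print("first-stage t on z:", first_stage_t)
print("t on v_hat:        ", endog_t)
```

In this sketch the coefficient on `x` in the last regression is numerically identical to the 2SLS estimate, and a significant t statistic on `v_hat` is evidence that `x` is endogenous. With real data, a packaged IV routine should be used so that the reported standard errors are correct.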

Q1. Wooldridge Chp 15 Q8

Suppose you want to test whether girls who attend a girls’ high school do better in math than girls who attend co-ed schools. You have a random sample of senior high school girls from a state in the United States, and score is the score on a standardized math test. Let girlhs be a dummy variable indicating whether a student attends a girls’ high school.

a)   What other factors would you control for in the equation? (You should be able to reasonably collect data on these factors.)

b)   Write an equation relating score to girlhs and the other factors you listed in part (a).

c)   Suppose that parental support and motivation are unmeasured factors in the error term in part (b). Are these likely to be correlated with girlhs? Explain.

d)   Discuss the assumptions needed for the number of girls’ high schools within a 20-mile radius of a girl’s home to be a valid IV for girlhs.

Pooled Cross Sections

Stata 2 (Wooldridge Chp 13 Computer Exercise 3)

Use the data in KIELMC.dta for this exercise.

a)    The variable dist is the distance from each home to the incinerator site, in feet. Consider the model

$$\log(price) = \beta_0 + \delta_0 y81 + \beta_1 \log(dist) + \delta_1 y81 \cdot \log(dist) + u$$

If building the incinerator reduces the value of homes closer to the site, what is the sign of $\delta_1$? What does it mean if $\beta_1 > 0$?

b)    Estimate the model from part (a) and report the results in the usual form. Interpret the coefficient on $y81 \cdot \log(dist)$. What do you conclude?

c)    Add $age$, $age^2$, rooms, baths, log(intst), log(land), and log(area) to the equation. Now, what do you conclude about the effect of the incinerator on housing values?

d)    Why is the coefficient on log(dist) positive and statistically significant in part (b) but not in part (c)? What does this say about the controls used in part (c)?
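The role of the interaction term in a pooled cross-section model like the one in part (a) can be illustrated with simulated data. This is a sketch under stated assumptions, not a solution: it does not use KIELMC.dta, it is written in Python with NumPy rather than Stata, and the parameter values and variable names (`log_dist`, `log_price`, and the numbers assigned to the coefficients) are invented for illustration. It shows that the coefficient on the year-by-distance interaction equals the change between the two years in the slope of log price with respect to log distance.

```python
import numpy as np

# Simulated pooled cross sections (NOT the KIELMC sample; all values are hypothetical).
rng = np.random.default_rng(1)
n = 10_000
y81 = rng.binomial(1, 0.5, size=n).astype(float)    # 1 if the sale is in the later year
log_dist = rng.normal(loc=9.8, scale=0.5, size=n)   # made-up log distance

# Made-up "true" parameters: intercept, year effect, distance slope, interaction.
b0, d0, b1, d1 = 8.0, 0.3, 0.2, 0.1
log_price = (b0 + d0 * y81 + b1 * log_dist
             + d1 * y81 * log_dist + rng.normal(scale=0.3, size=n))

# Pooled OLS with a year dummy, log distance, and their interaction.
X = np.column_stack([np.ones(n), y81, log_dist, y81 * log_dist])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)

# Equivalent view: fit a separate line in each year and difference the slopes.
later = y81 == 1
slope_later = np.polyfit(log_dist[later], log_price[later], 1)[0]
slope_earlier = np.polyfit(log_dist[~later], log_price[~later], 1)[0]

print("interaction coefficient:", coef[3])
print("difference in slopes:   ", slope_later - slope_earlier)
```

Because the pooled model is fully interacted with the year dummy, the two printed numbers coincide up to rounding: the interaction coefficient is the change in the distance slope between the two years, while the coefficient on `log_dist` alone is the slope in the earlier year.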

Q2. Wooldridge Chp 13 Q3

Why can we not use first differences when we have independent cross sections in two years (as opposed to panel data)?