
ECMT2150 INTERMEDIATE ECONOMETRICS

Week 4 Tutorial: Asymptotics, Dummy Variables, LPM

Stata 1 Based on Wooldridge Computer Exercise 5.C1

Use the data in WAGE2.dta that we used last week for this exercise. Start a new do file or extend the one you wrote last week.

a)   Estimate the wage equation using wages in levels:

$$\mathit{wage} = \beta_0 + \beta_1\,\mathit{age} + \beta_2\,\mathit{educ} + \beta_3\,\mathit{exper} + \beta_4\,\mathit{tenure} + u$$

Save the residuals. Stata will save them in a new variable for you.

Label your new residual variable, giving it a sensible label, for example, something like "Residuals from level-level wage equation".

Make and save a histogram of your residuals, adding a normal density plot to the histogram. Notice the usefulness of having labelled your variable when you plot your histogram.

b)   Repeat part a), but now use the natural log of the wage as the dependent variable.

c)   Take a look at your two histograms side by side. Would you say that Assumption MLR.6 is closer to being satisfied for the level-level model or the log-level model?

Stata 2 Based on Wooldridge Computer Exercise 7.C13

Here we will use the data in APPLE.dta to answer a series of questions.

a)   Define and create a new binary variable as ecobuy = 1 if ecolbs > 0 and ecobuy = 0 if ecolbs = 0. In other words, ecobuy indicates whether, at the prices given, a family would buy any ecologically friendly apples. Label your new variable with a useful description.

b)   Use the command tab (and confirm with the command summarize) to find out what fraction of families claim they would buy eco-labeled apples.

c)     Estimate the linear probability model:

$$\mathit{ecobuy} = \beta_0 + \beta_1\,\mathit{ecoprc} + \beta_2\,\mathit{regprc} + \beta_3\,\mathit{faminc} + \beta_4\,\mathit{hhsize} + \beta_5\,\mathit{educ} + \beta_6\,\mathit{age} + u$$

and report the results in equation form or in a table using esttab. Carefully interpret the coefficients on the price variables, ecoprc and regprc.

d)   Are the non-price variables jointly significant in the LPM? (Use the usual F statistic, even though it is not valid when there is heteroskedasticity; we will come back to this issue in Week 7.) Which explanatory variable other than the price variables seems to have the most important effect on the decision to buy eco-labeled apples? Does this make sense to you?

e)    In the model from part c), replace faminc with log(faminc). Which model fits the data better, using faminc or log(faminc)? Interpret the coefficient on log(faminc).

f)     Using the estimated model from part e), how many estimated probabilities are negative? How many are bigger than one? Should you be concerned?

g)   For the estimation in part e), compute the percent correctly predicted for each outcome, ecobuy = 0 and ecobuy = 1. Which outcome is best predicted by the model? Recall that we generally use a standard prediction rule: if $\widehat{\mathit{ecobuy}} \ge 0.5$ we predict that the consumer does buy eco-labeled apples, and if $\widehat{\mathit{ecobuy}} < 0.5$ we predict that the consumer does not buy eco-labeled apples.
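The prediction rule in part g) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Stata workflow the tutorial asks for: the function name `percent_correct` and the eight observations below are hypothetical and are not taken from APPLE.dta.

```python
def percent_correct(actual, fitted, threshold=0.5):
    """Percent correctly predicted, computed separately for outcome 0 and outcome 1.

    Assumes both outcomes occur at least once in `actual`.
    """
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for y, p in zip(actual, fitted):
        predicted = 1 if p >= threshold else 0  # the standard 0.5 rule
        totals[y] += 1
        if predicted == y:
            hits[y] += 1
    return {outcome: 100.0 * hits[outcome] / totals[outcome] for outcome in (0, 1)}


# Hypothetical outcomes and fitted probabilities, for illustration only
ecobuy = [1, 1, 1, 1, 0, 0, 0, 1]
ecobuy_hat = [0.72, 0.55, 0.48, 0.61, 0.52, 0.35, 0.41, 0.50]

result = percent_correct(ecobuy, ecobuy_hat)
print(result)
```

Reporting the two percentages separately matters because a model can have a high overall hit rate while predicting one of the two outcomes poorly.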

Q1. Imagine you are studying smoking behavior among adults in Australia. You collect a random representative sample of single adults from Australia and have data on whether or not they smoke cigarettes, how many per day and other demographics. The variable cigs is the self-reported usual or average number of cigarettes smoked per day. Do you think the variable cigs has a normal distribution in the Australian adult population? Explain.

Q2. Wooldridge Problem 3.12

The following equation represents the effects of tax revenue mix on subsequent employment growth for the population of counties in the United States:

$$\mathit{growth} = \beta_0 + \beta_1\,\mathit{share}_P + \beta_2\,\mathit{share}_I + \beta_3\,\mathit{share}_S + \text{other factors},$$

where growth is the percentage change in employment from 1980 to 1990, $\mathit{share}_P$ is the share of property taxes in total tax revenue, $\mathit{share}_I$ is the share of income tax revenues, and $\mathit{share}_S$ is the share of sales tax revenues. All of these variables are measured in 1980. The omitted share, $\mathit{share}_F$, includes fees and miscellaneous taxes. By definition the four shares add up to one. Other factors would include expenditures on education, infrastructure, and so on (all measured in 1980).

a)   Why must we omit one of the tax share variables from the equation?

b)   Give a careful interpretation of $\beta_1$.
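One way to approach both parts is to write down a hypothetical model that includes all four shares and then substitute the adding-up constraint. The coefficients $\alpha_0, \dots, \alpha_4$ below are notation introduced only for this sketch; they do not appear in the textbook problem.

```latex
\begin{aligned}
\mathit{growth} &= \alpha_0 + \alpha_1\,\mathit{share}_P + \alpha_2\,\mathit{share}_I
                 + \alpha_3\,\mathit{share}_S + \alpha_4\,\mathit{share}_F + \text{other factors} \\
% substitute share_F = 1 - share_P - share_I - share_S
&= (\alpha_0 + \alpha_4) + (\alpha_1 - \alpha_4)\,\mathit{share}_P
 + (\alpha_2 - \alpha_4)\,\mathit{share}_I + (\alpha_3 - \alpha_4)\,\mathit{share}_S
 + \text{other factors}
\end{aligned}
```

Only four combinations of the five $\alpha$ parameters appear in the second line, which suggests why all four shares cannot be included together with an intercept. Matching terms with the original equation gives $\beta_1 = \alpha_1 - \alpha_4$, so the coefficient on $\mathit{share}_P$ is measured relative to the omitted share.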

Q3. Wooldridge Question 7.8

Suppose you collect data from a survey on wages, education, experience and gender. In addition, you ask for information about marijuana usage. The original question is: ‘On how many separate occasions last month did you smoke marijuana?’

a)   Write an equation that would allow you to estimate the effects of marijuana usage on wage, while controlling for other factors. You should be able to make statements such as: ‘smoking marijuana five more times per month is estimated to change wage by X%’.

b)   Write a model that would allow you to test whether drug usage has different effects on wages for men and women. How would you test that there are no differences in the effects of drug usage for men and women?

c)   Suppose you think it is better to measure marijuana usage by putting people into one of four categories: non-user, light user (1 to 5 times per month), moderate user (6 to 10 times per month) and heavy user (more than 10 times per month). Now write a model that allows you to estimate the effects of these different levels of marijuana usage on the wage.

d)   Using the model in (c), explain in detail how to test the null hypothesis that marijuana usage has no effect on wage. Be very specific and include a careful listing of degrees of freedom.

e)   What are some potential problems with drawing causal inference using the survey data that you collected?
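The four usage categories in part c) can be coded as dummy variables along the following lines. This is a minimal Python sketch of the coding step only, assuming non-users are chosen as the base group; the function name `usage_dummies` and the example counts are hypothetical.

```python
def usage_dummies(times_per_month):
    """Map a monthly usage count to (light, moderate, heavy) dummies.

    Non-users (count of 0) are the assumed base group, so they get (0, 0, 0).
    """
    light = 1 if 1 <= times_per_month <= 5 else 0
    moderate = 1 if 6 <= times_per_month <= 10 else 0
    heavy = 1 if times_per_month > 10 else 0
    return light, moderate, heavy


# Hypothetical survey responses, for illustration only
counts = [0, 3, 6, 10, 11]
dummies = [usage_dummies(c) for c in counts]
print(dummies)
```

Three dummies are enough for four categories: including a fourth dummy alongside an intercept would make the regressors perfectly collinear.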

Q4. Wooldridge Problem 5.2

Suppose that the model

$$\mathit{pctstck} = \beta_0 + \beta_1\,\mathit{funds} + \beta_2\,\mathit{risktol} + u$$

satisfies the first four Gauss-Markov (GM) assumptions, where pctstck is the percentage of a worker’s pension/superannuation invested in the stock market, funds is the number of mutual funds that the worker can choose from, and risktol is some measure of risk tolerance (larger risktol means that a person has a higher tolerance for risk). If funds and risktol are positively correlated, and we estimate the simple model

$$\mathit{pctstck} = \tilde{\beta}_0 + \tilde{\beta}_1\,\mathit{funds} + u$$

what is the inconsistency in $\tilde{\beta}_1$, the slope coefficient?
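As a starting point, the standard omitted-variable result for probability limits can be written as below. The notation is an assumption of this sketch: $\tilde{\beta}_1$ denotes the slope from the simple regression of pctstck on funds, and $\delta_1$ is a symbol introduced here for the population slope from regressing risktol on funds.

```latex
\operatorname{plim}\,\tilde{\beta}_1 = \beta_1 + \beta_2\,\delta_1,
\qquad
\delta_1 = \frac{\operatorname{Cov}(\mathit{funds},\,\mathit{risktol})}{\operatorname{Var}(\mathit{funds})}
```

The inconsistency is the term $\beta_2\,\delta_1$. A positive correlation between funds and risktol implies $\delta_1 > 0$, so the direction of the inconsistency depends on the sign you expect for $\beta_2$.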

Extra problem (more practice if you would like it)

1. Wooldridge Question 7.4

An equation explaining chief executive officer salary is

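$$
\begin{aligned}
\widehat{\log(\mathit{salary})} = {} & \underset{(0.30)}{4.59} + \underset{(0.032)}{0.257}\,\log(\mathit{sales}) + \underset{(0.004)}{0.011}\,\mathit{roe} + \underset{(0.089)}{0.158}\,\mathit{finance} \\
& + \underset{(0.085)}{0.181}\,\mathit{consprod} - \underset{(0.099)}{0.283}\,\mathit{utility}
\end{aligned}
$$

$$n = 209, \quad R^2 = 0.357$$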

The data used are in CEOSAL1.RAW, where finance, consprod, and utility are binary variables indicating the financial, consumer products and utilities industries. The omitted industry is transportation.

a) Compute the approximate percentage difference in estimated salary between the utility and transportation industries, holding sales and roe fixed. Is the difference statistically significant at the 1% level?

b) Use Equation (7.10) to obtain the exact percentage difference in estimated salary between the utility and transportation industries and compare this with the answer obtained in part a).

c) What is the approximate percentage difference in estimated salary between the consumer products and finance industries? Write an equation that would allow you to test whether the difference is statistically significant.
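The arithmetic behind parts a) and b) can be checked with a short Python sketch. It assumes that Equation (7.10) is the usual exact formula for a dummy coefficient in a log model, 100·[exp(b) − 1]; the variable names are hypothetical, and the coefficient and standard error are those reported for utility in the equation above.

```python
import math

# Coefficient and standard error on utility, as reported in the estimated equation
b_utility = -0.283
se_utility = 0.099

# Approximate percentage difference: 100 times the coefficient
approx_pct = 100 * b_utility

# Exact percentage difference, assuming Equation (7.10) is 100 * [exp(b) - 1]
exact_pct = 100 * (math.exp(b_utility) - 1)

# t statistic for testing that the utility coefficient is zero
t_stat = b_utility / se_utility

print(f"approximate difference: {approx_pct:.1f}%")
print(f"exact difference:       {exact_pct:.1f}%")
print(f"t statistic:            {t_stat:.2f}")
```

The exact difference is smaller in magnitude than the approximate one, since the log approximation becomes less accurate as the coefficient moves away from zero. The t statistic still has to be compared with the appropriate 1% critical value to answer the significance question in part a).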