ECOM90001: Basic Econometrics 2022
SEMESTER 1 ASSESSMENT, 2022
ECOM90001: Basic Econometrics
Question 1 [20 marks]
a) [5 marks] Suppose you are investigating the effects of traffic congestion and traffic-related air pollution on infant health outcomes. This research question is examined by analysing the impact of the introduction of an electronic toll collection system along an extended stretch of a local freeway. This new electronic toll collection system reduced delays at toll collection booths, as well as air pollution caused by idling, accelerating, and decelerating. This electronic toll collection system was implemented at different times along this extended stretch of freeway. Consider the following timing of events for a particular section of the local freeway:
period0 = Before: Observation of infant health outcomes
period1 = Introduction of electronic toll collection
period2 = After: Observation of infant health outcomes
A relevant treatment group would be mothers residing within 2 kilometres of a toll collection booth. Consider the following econometric model:
\[
\text{low}_{it} = \alpha_0 + \alpha_1\,\text{near}_i + \alpha_2\,\text{after}_t + \beta_1\,(\text{near}_i \times \text{after}_t) + \gamma X_{it} + e_{it}
\]
with:
lowit = 1 if the child born to mother i has low birth weight (less than 2,500 grams), 0 otherwise
neari = 1 if mother i resides within a 2 km radius of a toll collection point, 0 otherwise
aftert = 1 for the period after the introduction of electronic toll collection, 0 otherwise
Xit = characteristics of mother i and child that are relevant to infant birth weight, such as mother's education, mother's age, birth order, child gender, whether the mother smoked during pregnancy, and whether the birth was a multiple birth
This model was estimated using Ordinary Least Squares (OLS) using panel data for births within 36 months before or after the introduction of electronic toll collection, for 98 toll collection points along the freeway. The results are presented below:
Variable                  Estimate   Std. Error   Pr(>|t|)
Near*After                 -0.0093       0.0028     0.0009
Observations               409,673
Additional Controls (X)        Yes
i) What is the interpretation of the population parameter β1 ?
ii) Test the hypothesis that the introduction of electronic toll collection had a (statistically) significant impact on infant birth weight. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.
Note: For this question, you may use the sample size as an approximation to the model degrees of freedom.
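The arithmetic behind part ii) can be sketched as follows. This is a minimal illustration, assuming the large-sample normal approximation that the note above permits; the variable and function names are illustrative and are not part of the exam.

```python
import math

# Values read from the results table above.
estimate = -0.0093   # coefficient on Near*After
std_error = 0.0028   # its standard error

# t-ratio for H0: beta1 = 0 against H1: beta1 != 0.
t_stat = estimate / std_error


def standard_normal_cdf(z: float) -> float:
    """CDF of a standard normal random variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Two-sided p-value. With roughly 409,673 degrees of freedom the
# t distribution is treated here as standard normal (an approximation).
p_value = 2.0 * (1.0 - standard_normal_cdf(abs(t_stat)))

print(f"t statistic: {t_stat:.3f}")
print(f"two-sided p-value (normal approximation): {p_value:.4f}")
```

The decision follows from comparing the absolute t-ratio with the two-sided 5% critical value of about 1.96, or equivalently from comparing the p-value with 0.05.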
b) [5 marks] Consider the following econometric model:
\[
\Delta \text{audusd}_t = \beta_0 + \beta_1\,\text{audusd}_{t-1} + \beta_2\,\Delta \text{audusd}_{t-1} + \varepsilon_t
\]
Explain how you would test whether the series audusd is stationary or not. Your answer should clearly state the null and alternative hypotheses, the test statistic and its distribution. Using the results in Figure 1, what is the value of the Augmented Dickey-Fuller test statistic? At the 5% level of significance, explain whether the sample evidence is consistent with the null hypothesis. Based upon the estimation results presented in Figure 1, the p-value associated with the Augmented Dickey-Fuller test (intercept, no trend) is 0.7489.
c) [5 marks] Let yi denote the number of doctor visits over a 12-month period by individual i. The (probability) density of yi follows a Poisson distribution
\[
f(y_i \mid \lambda_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!}, \qquad i = 1, 2, \ldots, N
\]
with E[yi] = λi and VAR[yi] = λi.
Assume the conditional expectation of the number of doctor visits is given by:
\[
E[\text{docvisits}_i \mid X_i] = \exp\left(\beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{educ}_i + \beta_3\,\text{income}_i + \beta_4\,\text{chronic}_i + \beta_5\,\text{private}_i\right)
\]
where:
yi = docvisits = number of doctor visits over a 12-month period for individual i
age = age of individual i, in years
educ = education of individual i, in years
income = annual income of individual i, in thousands of dollars
chronic = 1 if individual i reports having a chronic condition, 0 otherwise
private = 1 if individual i reports having private insurance, 0 otherwise
i) What is meant by the term count data? What are the important characteristics of count data?
ii) What is the interpretation of the estimate for β2 ?
iii) What is the interpretation of the estimate for β5 ?
d) [5 marks] Consider the following demand function for the quantity demanded of product Qt as a function of its market price Pt:
\[
\ln Q_t = \alpha_0 + \alpha_1 \ln P_t + \alpha_2 X_t + \varepsilon_{Dt}
\]
Consider also the following supply function for the quantity supplied of product Qt as a function of its market price Pt:
\[
\ln Q_t = \beta_0 + \beta_1 \ln P_t + \beta_2 Z_t + \varepsilon_{St}
\]
with:
\[
COV(X_t, \varepsilon_{Dt}) = 0, \quad COV(X_t, \varepsilon_{St}) = 0, \quad COV(Z_t, \varepsilon_{Dt}) = 0, \quad COV(Z_t, \varepsilon_{St}) = 0, \quad \text{and} \quad COV(\varepsilon_{Dt}, \varepsilon_{St}) = 0.
\]
Write out an expression for the reduced form for ln Pt . Does the demand equation satisfy the necessary condition for identification? Briefly explain how you would test that the demand function satisfies the necessary condition(s) for identification. Your answer should clearly state the null and alternative hypotheses, the test statistic and its distribution.
Question 2 [20 marks]
In recent years workplace smoking policies have become increasingly prevalent and restrictive. Consider the following econometric model to investigate whether these workplace policies reduce smoking:
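\[
\begin{aligned}
\text{smoke}_i^{*} = {} & \beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{age}_i^{2} + \beta_3\,\text{income}_i \\
& + \beta_4\,\text{hsonly}_i + \beta_5\,\text{someps}_i + \beta_6\,\text{univ}_i + \beta_7\,\text{workban}_i + \varepsilon_i
\end{aligned}
\tag{1}
\]
where \( \varepsilon_i \mid X_i \sim N(0, \sigma^2) \). Note that: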
smoke*i = a latent variable determining whether individual i is a current smoker
agei = age of individual i, in years
incomei = income of individual i, in thousands of dollars
nohs = 1 if individual i did not finish high school, 0 otherwise
hsonly = 1 if individual i finished high school, no further study, 0 otherwise
someps = 1 if individual i has completed some post-secondary study, 0 otherwise
univ = 1 if individual i has completed university level qualifications, 0 otherwise
workban = 1 if workplace of individual i has a complete smoking ban, 0 otherwise
Note that nohs is the omitted education category.
Suppose you have a sample of observations for 14,817 employed individuals which also contains the following indicator variable:
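\[
\text{smoke}_i =
\begin{cases}
1 & \text{if } \text{smoke}_i^{*} > 0 \\
0 & \text{if } \text{smoke}_i^{*} \leq 0
\end{cases}
\]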
This suggests the following model:
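\[
\begin{aligned}
\text{smoke}_i = {} & \beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{age}_i^{2} + \beta_3\,\text{income}_i \\
& + \beta_4\,\text{hsonly}_i + \beta_5\,\text{someps}_i + \beta_6\,\text{univ}_i + \beta_7\,\text{workban}_i + \varepsilon_i
\end{aligned}
\tag{2}
\]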
a) [1 mark] What is the interpretation of the population parameter β3 in the linear probability model (2)?
b) [2 marks] Suppose model (2) is estimated by the method of Ordinary Least Squares (OLS) with smoke as the dependent variable. Outline an issue that might arise with the predicted values. Do you think that the standard errors are valid? Clearly explain why or why not.
c) The parameters of model (2) were estimated as a Probit model and the results are presented in Figure 3.
i) [2 marks] Let \(\hat{p}_i\) represent the predicted probability that an individual is a current smoker, based upon their observed characteristics. Consider the following decision rule: if \(\hat{p}_i \geq 0.5\), predict that \(\widehat{\text{smoke}}_i = 1\); otherwise predict that \(\widehat{\text{smoke}}_i = 0\).
Based on the information in Table 1, calculate the percentage of outcomes that are correctly predicted. Using Table 1, comment on the usefulness of the model in predicting smoke = 1 or smoke = 0.
true    predicted    frequency    correctly predicted
0       0               11,079    yes
1       0                3,612    no
0       1                   49    no
1       1                   77    yes
Total                   14,817
Table 1: Predicted Probability Threshold \(\hat{p}_i \geq 0.5\)
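The calculation asked for in part i) can be sketched as below. This is only an illustration of the arithmetic using the frequencies in Table 1; the dictionary layout and variable names are assumptions made for the example.

```python
# Frequencies from Table 1, keyed by (true outcome, predicted outcome).
counts = {
    (0, 0): 11079,  # non-smoker, predicted non-smoker (correct)
    (1, 0): 3612,   # smoker, predicted non-smoker (incorrect)
    (0, 1): 49,     # non-smoker, predicted smoker (incorrect)
    (1, 1): 77,     # smoker, predicted smoker (correct)
}

total = sum(counts.values())
correct = counts[(0, 0)] + counts[(1, 1)]

# Overall percentage of outcomes correctly predicted.
pct_correct = 100.0 * correct / total

# Percentage correctly predicted within each true outcome.
pct_correct_nonsmokers = 100.0 * counts[(0, 0)] / (counts[(0, 0)] + counts[(0, 1)])
pct_correct_smokers = 100.0 * counts[(1, 1)] / (counts[(1, 1)] + counts[(1, 0)])

print(f"overall: {pct_correct:.2f}%")
print(f"true non-smokers: {pct_correct_nonsmokers:.2f}%")
print(f"true smokers: {pct_correct_smokers:.2f}%")
```

Splitting the hit rate by true outcome is one way to judge usefulness, because an overall rate can be dominated by the more common outcome.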
ii) [5 marks] Using the estimation results provided in Figure 3, calculate the marginal effect for the variable age for an individual who is currently 40 years of age, who has only completed high school, with an annual income of $38,000, employed at a workplace with a workplace smoking ban. Provide a clear interpretation of your calculated marginal effect.
Note: The probability density function for a standard normal random variable Z is given by:
\[
\phi(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{Z^2}{2}\right)
\]
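The structure of the calculation in part ii) can be sketched as follows. Because age enters the index both linearly and as a square, the marginal effect is the standard normal density at the index multiplied by (β1 + 2 β2 age). Figure 3 is not reproduced here, so every coefficient value below is hypothetical and only illustrates the mechanics; the function names are likewise assumptions.

```python
import math


def standard_normal_pdf(z: float) -> float:
    """Density of a standard normal random variable."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_index(beta, age, income, hsonly, someps, univ, workban):
    """Linear index x'beta of the Probit model, including the age-squared term."""
    return (beta["const"] + beta["age"] * age + beta["age_sq"] * age ** 2
            + beta["income"] * income + beta["hsonly"] * hsonly
            + beta["someps"] * someps + beta["univ"] * univ
            + beta["workban"] * workban)


def probit_age_marginal_effect(beta, age, income, hsonly, someps, univ, workban):
    """dP(smoke = 1)/d(age) = phi(x'beta) * (beta_age + 2 * beta_age_sq * age)."""
    index = probit_index(beta, age, income, hsonly, someps, univ, workban)
    return standard_normal_pdf(index) * (beta["age"] + 2.0 * beta["age_sq"] * age)


# HYPOTHETICAL coefficients, NOT the estimates from Figure 3.
hypothetical_beta = {
    "const": -0.5, "age": 0.03, "age_sq": -0.0004, "income": -0.005,
    "hsonly": -0.1, "someps": -0.2, "univ": -0.5, "workban": -0.15,
}

# Individual described in the question: age 40, high school only,
# income of $38,000 (income is measured in thousands), workplace ban.
me = probit_age_marginal_effect(hypothetical_beta, age=40, income=38,
                                hsonly=1, someps=0, univ=0, workban=1)
print(f"marginal effect of age (hypothetical coefficients): {me:.6f}")
```

The result is read as the approximate change in the probability of being a current smoker from one additional year of age, holding the other characteristics fixed.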
iii) [5 marks] Explain how you would calculate the average marginal effect (AME) for the variable workban (workplace smoking ban). You have not been provided with enough information to actually calculate this marginal effect so do not attempt to calculate the marginal effect. Instead, your answer should clearly explain how you would calculate this marginal effect.
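One common way to compute an average marginal effect for a binary regressor is sketched below: for each individual, evaluate the predicted probability with workban set to 1 and again with workban set to 0, holding the other characteristics at their observed values, then average the differences over the sample. The coefficients and the three-person sample are hypothetical and stand in for the Figure 3 estimates and the 14,817 observations; all names are assumptions.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """CDF of a standard normal random variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_index(beta: dict, person: dict) -> float:
    """Linear index x'beta for one individual, including age squared."""
    return (beta["const"] + beta["age"] * person["age"]
            + beta["age_sq"] * person["age"] ** 2
            + beta["income"] * person["income"]
            + beta["hsonly"] * person["hsonly"]
            + beta["someps"] * person["someps"]
            + beta["univ"] * person["univ"]
            + beta["workban"] * person["workban"])


def average_marginal_effect_workban(beta: dict, sample: list) -> float:
    """Average over individuals of P(smoke=1 | workban=1) - P(smoke=1 | workban=0)."""
    differences = []
    for person in sample:
        with_ban = {**person, "workban": 1}
        without_ban = {**person, "workban": 0}
        differences.append(standard_normal_cdf(probit_index(beta, with_ban))
                           - standard_normal_cdf(probit_index(beta, without_ban)))
    return sum(differences) / len(differences)


# HYPOTHETICAL coefficients and data, for illustration only.
hypothetical_beta = {
    "const": -0.5, "age": 0.03, "age_sq": -0.0004, "income": -0.005,
    "hsonly": -0.1, "someps": -0.2, "univ": -0.5, "workban": -0.15,
}
hypothetical_sample = [
    {"age": 25, "income": 30, "hsonly": 1, "someps": 0, "univ": 0, "workban": 0},
    {"age": 40, "income": 38, "hsonly": 1, "someps": 0, "univ": 0, "workban": 1},
    {"age": 55, "income": 80, "hsonly": 0, "someps": 0, "univ": 1, "workban": 1},
]

ame_workban = average_marginal_effect_workban(hypothetical_beta, hypothetical_sample)
print(f"AME of workban (hypothetical inputs): {ame_workban:.4f}")
```

A discrete difference is used rather than a derivative because workban only takes the values 0 and 1.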
d) Suppose you are considering estimating model (2) using the method of Ordinary Least Squares (OLS). However, it is suspected that the variable workban might be endogenous (that is, workban is potentially correlated with the random error). The available data also contains the following variable:
sizei = 1 if the employer of individual i has at least 50 employees, 0 otherwise
Based on previous studies, there is some evidence that larger establishments are more likely to implement establishment-wide smoking bans.
i) [2 marks] Provide an explanation as to why it might be suspected that the variable workban could be potentially correlated with the random error. Although there are several explanations here, you are only required to provide one of these.
ii) [3 marks] Clearly explain the two conditions that must be satisfied for the variable size to be a valid instrumental variable. Do you think that these two conditions are likely to be satisfied? Why or why not?
Question 3 [20 marks]
Consider a large state that is divided into smaller geographic areas called counties. For the purposes of analysis this large state can also be divided into 3 regions (Region 1, Region 2, and Region 3).
Consider the following econometric model for the crime rate (crimes committed per person) in county i in time period t in a large state divided into three smaller regions:
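\[
\begin{aligned}
\ln \text{crmte}_{it} = {} & \beta_0 + \beta_1 \ln \text{prbarr}_{it} + \beta_2 \ln \text{prbconv}_{it} + \beta_3 \ln \text{prbpris}_{it} + \beta_4 \ln \text{avgsen}_{it} \\
& + \beta_5 \ln \text{polpc}_{it} + \beta_6 \ln \text{wmfg}_{it} + \beta_7\,\text{region1}_i + \beta_8\,\text{region2}_i + \beta_9\,\text{urban}_i + \varepsilon_{it}
\end{aligned}
\tag{3}
\]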
where ln X denotes the natural logarithm of variable X and:
prbarr = probability of arrest in county i in time t
prbconv = probability of conviction in county i in time t
prbpris = probability of prison sentence in county i in time t
avgsen = average prison sentence in days in county i in time t
polpc = police per capita in county i in time t
wmfg = average weekly wage, manufacturing industry, in county i in time t
region1 = 1 if county i is located in region 1, 0 otherwise
region2 = 1 if county i is located in region 2, 0 otherwise
urban = 1 if county i is located in an urban area, 0 otherwise
Note that region3 is the omitted region.
Suppose you have a balanced panel data set with observations on 90 counties over seven (7) years, providing a total of 630 observations.
a) [1 mark] What is the interpretation of the population parameter β1 ?
b) [2 marks] Suppose you estimate the econometric model (3) by Ordinary Least Squares (OLS). Do you think that the standard errors are valid? Clearly explain why or why not.
c) Consider the following alternative econometric model:
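\[
\begin{aligned}
\ln \text{crmte}_{it} = {} & \beta_0 + \beta_1 \ln \text{prbarr}_{it} + \beta_2 \ln \text{prbconv}_{it} + \beta_3 \ln \text{prbpris}_{it} + \beta_4 \ln \text{avgsen}_{it} \\
& + \beta_5 \ln \text{polpc}_{it} + \beta_6 \ln \text{wmfg}_{it} + \beta_7\,\text{region1}_i + \beta_8\,\text{region2}_i + \beta_9\,\text{urban}_i + \upsilon_i + \varepsilon_{it}
\end{aligned}
\tag{4}
\]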
where υi represents an unobserved time-invariant random variable:
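\[
VAR(\upsilon_i \mid X_i) = \sigma_\upsilon^2, \qquad COV(\upsilon_i, \upsilon_j) = 0 \ \text{for } i \neq j, \qquad COV(\upsilon_i, \varepsilon_{it}) = 0
\]
and \( VAR(\varepsilon_{it} \mid X_{it}, \upsilon_i) = \sigma_e^2 \).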
i) [4 marks] Consider the econometric model (4). Do you think that the condition COV(Xikt, υi) = 0 for each Xikt is likely to be satisfied? Clearly explain why or why not. In the context of the present application, explain, with an example, whether you think COV(Xikt, υi) = 0 is a reasonable assumption.
ii) [3 marks] The estimation results for model (3) using the pooled OLS estimator (column 1) and the estimation results for model (4) using the Fixed Effects (FE) estimator (column 2) are presented in Figure 4. Both models are estimated using a cluster-robust variance estimator. Compare and contrast the estimates for the (log) probability of arrest for the pooled OLS model (column 1) relative to those for model (4) estimated using the Fixed Effects (FE) estimator (column 2). What do you conclude about the deterrent effect of the probability of arrest? Based on this comparison comment on the likely sign of any omitted variable bias in the pooled OLS model (3).
iii) [2 marks] Consider model (4). Define ωit = (υi + εit ). For the Random Effects (RE) model, the intraclass correlation of the (composite) error is given by:
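\[
\rho = corr(\omega_{it}, \omega_{is}) = \frac{\sigma_\upsilon^2}{\sigma_\upsilon^2 + \sigma_e^2} \qquad \text{for } t \neq s
\]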
Using the results in Figure 4 for the RE model (column 3), calculate the estimated value for this intraclass correlation \(\hat{\rho}\). Provide an interpretation for this estimated intraclass correlation \(\hat{\rho}\).
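The intraclass correlation is the share of the composite error variance that comes from the county-specific component: the variance of υi divided by the sum of the variances of υi and εit. A minimal sketch of that calculation follows. Figure 4 is not reproduced here, so the two standard deviations are hypothetical placeholders and the function name is an assumption.

```python
def intraclass_correlation(sigma_u: float, sigma_e: float) -> float:
    """rho = sigma_u^2 / (sigma_u^2 + sigma_e^2).

    sigma_u: standard deviation of the county-specific effect
    sigma_e: standard deviation of the idiosyncratic error
    """
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


# HYPOTHETICAL values, NOT the estimates reported in Figure 4.
hypothetical_sigma_u = 0.30
hypothetical_sigma_e = 0.15

rho_hat = intraclass_correlation(hypothetical_sigma_u, hypothetical_sigma_e)
print(f"estimated intraclass correlation (hypothetical inputs): {rho_hat:.3f}")
```

If the output being read reports variances rather than standard deviations, the squaring step should be dropped.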
iv) [4 marks] Clearly outline the important differences between the Random Effects (RE) estimator and the Fixed Effects (FE) estimator. Your answer should clearly explain the variation in the data that is used to identify the parameters of interest.
d) Consider the following Correlated Random Effects (CRE) model:
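\[
\begin{aligned}
\ln \text{crmte}_{it} = {} & \beta_0 + \beta_1 \ln \text{prbarr}_{it} + \beta_2 \ln \text{prbconv}_{it} + \beta_3 \ln \text{prbpris}_{it} + \beta_4 \ln \text{avgsen}_{it} \\
& + \beta_5 \ln \text{polpc}_{it} + \beta_6 \ln \text{wmfg}_{it} + \beta_7\,\text{region1}_i + \beta_8\,\text{region2}_i + \beta_9\,\text{urban}_i \\
& + \beta_{10}\,\overline{\ln \text{prbarr}}_i + \beta_{11}\,\overline{\ln \text{prbconv}}_i + \beta_{12}\,\overline{\ln \text{prbpris}}_i + \beta_{13}\,\overline{\ln \text{avgsen}}_i \\
& + \beta_{14}\,\overline{\ln \text{polpc}}_i + \beta_{15}\,\overline{\ln \text{wmfg}}_i + \eta_i + \varepsilon_{it}
\end{aligned}
\tag{5}
\]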
where ln X denotes the natural logarithm of variable X and ηi represents an unobserved time-invariant random variable.
prbarr = probability of arrest in county i in time t
prbconv = probability of conviction in county i in time t
prbpris = probability of prison sentence in county i in time t
avgsen = average prison sentence in days in county i in time t
polpc = police per capita in county i in time t
wmfg = average weekly wage, manufacturing industry, in county i in time t
region1 = 1 if county i is located in region 1, 0 otherwise
region2 = 1 if county i is located in region 2, 0 otherwise
urban = 1 if county i is located in an urban area, 0 otherwise
and:
\(\overline{\ln \text{prbarr}}_i\) = sample mean of (log) probability of arrest in county i
\(\overline{\ln \text{prbconv}}_i\) = sample mean of (log) probability of conviction in county i
\(\overline{\ln \text{prbpris}}_i\) = sample mean of (log) probability of prison sentence in county i
\(\overline{\ln \text{avgsen}}_i\) = sample mean of (log) average prison sentence in days in county i
\(\overline{\ln \text{polpc}}_i\) = sample mean of (log) police per capita in county i
\(\overline{\ln \text{wmfg}}_i\) = sample mean of (log) average weekly wage, manufacturing industry, in county i
where ηi represents an unobserved time-invariant random variable which satisfies the restriction COV(ηi , Xit ) = 0 for each of the explanatory variables.
i) [1 mark] Briefly explain why the Correlated Random Effects (CRE) model (5) does not include a variable \(\overline{\text{urban}}_i\) representing the sample mean of the variable urban for county i over time.
ii) [3 marks] The estimation results for the CRE model (5), using the cluster-robust variance estimator, are presented in Figure 5. Using the results in either Figure 6 or Figure 7 test the hypothesis that the Random Effects (RE) estimator is the most appropriate model, at the 5% level of significance. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.
2022-08-16