ECOM30002 / ECOM90002 Econometrics 2 Semester 1 Assessment, 2022
Semester 1 Assessment, 2022
Faculty of Business and Economics / Department of Economics
ECOM30002 / ECOM90002
Econometrics 2
Question 1.
This question focuses on the relationship between the settlement of Syrian refugees in Greece in 2015 and support for far-right politics in a sample of 90 Greek municipalities, indexed by i = 1, 2, . . . , n. Of particular interest is the question of whether the extent of refugee settlement in a municipality affects the vote share of the right-wing anti-immigrant and anti-asylum-seeker ‘Golden Dawn’ party. The following variables are available:
RefPerCapi: the number of refugees arriving in the ith municipality, per capita
GDVotei: the change in the share of the vote won by the Golden Dawn party between January 2015 and September 2015
Disti: the distance of the ith municipality from the Turkish coast, in kilometers
BorderPatroli: the number of Border Patrol officers stationed in the ith municipality, per capita
You may find it useful to note that a common route by which Syrian refugees entered Greece at this time was by boat via Turkey.
Throughout this question, statistical inference should be conducted at the 5% level of significance.
(a) (7 marks) Consider the following simplified causal representation of the factors affecting the Golden Dawn vote share:
GDVotei = β0 + β1RefPerCapi + β2 Inclusivityi + Vi , (1)
where Vi is an i.i.d. disturbance term with mean zero and where Inclusivityi measures the degree of social inclusivity in the ith municipality. We will assume that Inclusivityi is unobserved.
Now consider the following population regression function (PRF), which omits Inclusivityi:
E(GDVotei | RefPerCapi) = δ0 + δ1RefPerCapi . (2)
Explain the conditions under which the omission of Inclusivityi from equation (2) would induce omitted variables bias such that δ1 ≠ β1. In general, do you expect that δ1 = β1, δ1 > β1 or δ1 < β1? Explain your answer.
(b) With the aim of estimating β1, a Two-Stage Least Squares (2SLS) approach is adopted using Disti as an instrumental variable (IV) for RefPerCapi. The estimation results are provided in Table 1.1 on page 4.
(i) (3 marks) Explain why RefPerCapi is likely to depend on Disti . What does this imply about the use of Disti as an IV for RefPerCapi ?
(ii) (2 marks) Figure 1.1 on page 5 plots RefPerCapi on a political map of the region. Based on Figure 1.1, does it appear that Disti is a relevant IV for RefPerCapi? Explain your answer.
(iii) (3 marks) A range of estimation results are reported in Table 1.1 on page 4. Use the table to evaluate the relevance of Disti as an IV for RefPerCapi in this 2SLS procedure. Clearly indicate which value(s) from which column(s) of the table you are referring to in your answer.
(c) (5 marks) Interpret the sign, statistical significance and magnitude of the estimated coefficients on RefPerCapi in both columns (A) and (D) of Table 1.1 on page 4. Do the results suggest that β1 = δ1? Explain your answer.
(d) A new 2SLS approach is adopted using both Disti and BorderPatroli as IVs for RefPerCapi . The first stage and second stage estimation results are provided in Table 1.1 on page 4.
(i) (3 marks) Explain conceptually whether BorderPatroli is likely to be a relevant instrument for RefPerCapi.
(ii) (4 marks) Use the tabulated estimation results to evaluate the relevance of Disti and BorderPatroli as IVs for RefPerCapi in this 2SLS procedure. Clearly indicate which value(s) from which column(s) of the table you are referring to in each step of your answer. Please comment on any limitations of the testing procedure that you have used.
(iii) (6 marks) Explain briefly how you would test for the exogeneity of the instruments using the J-test of over-identifying restrictions. Please write down the relevant test equation, the null and alternative hypotheses and the distribution (including the degrees of freedom) against which the resulting test statistic should be compared to compute the corresponding p-value.
(iv) (2 marks) The J-test statistic takes a value of 8.274 with a p-value of 0.004. What do you conclude about the exogeneity of the instruments based on this information?
(v) (5 marks) Considering all of the evidence on the validity of both Disti and BorderPatroli as IVs for RefPerCapi , discuss whether the 2SLS parameter estimate on RefPerCapi reported in column (E) of the table is consistent for the population parameter β1 in equation (1).
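The p-value quoted in part (d)(iv) can be checked with a few lines of code. The sketch below assumes that the J-statistic is compared against a chi-squared distribution with m − k = 2 − 1 = 1 degree of freedom (two instruments, one endogenous regressor), and uses the identity P(χ²(1) > x) = erfc(√(x/2)). The function name is hypothetical.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-squared(1) variable evaluated at `stat`.

    Uses P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(stat / 2.0))


j_stat = 8.274  # J-test statistic reported in part (d)(iv)
p_value = chi2_1df_pvalue(j_stat)  # assumes 1 degree of freedom (see lead-in)
print(f"J = {j_stat}, p-value = {p_value:.3f}")
```

The helper is specific to one degree of freedom; a different number of over-identifying restrictions would need a general chi-squared tail function.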
Table 1.1
| | (A) | (B) | (C) | (D) | (E) |
|---|---|---|---|---|---|
| Dependent variable | GDVotei | RefPerCapi | RefPerCapi | GDVotei | GDVotei |
| RefPerCapi | 0.610*** (0.210) | | | 0.483* (0.243) | 0.706*** (0.230) |
| Disti | | −0.004*** (0.001) | −0.004*** (0.001) | | |
| BorderPatroli | | | 118.292* (66.025) | | |
| Intercept | 1.324*** (0.098) | 1.102*** (0.287) | −0.073 (0.658) | 1.369*** (0.113) | 1.291*** (0.112) |
* p<0.1; ** p<0.05; *** p<0.01
Notes:
(A) OLS estimates of a regression of GDVotei on an intercept and RefPerCapi .
(B) OLS estimates of the parameters of the PRF:
E(RefPerCapi | Disti) = π0 + π1Disti . (3)
(C) OLS estimates of the parameters of the PRF:
E(RefPerCapi |Disti , BorderPatroli ) = θ0 + θ1Disti + θ2BorderPatroli . (4)
(D) 2SLS estimation results for the regression of GDVotei on an intercept and RefPerCapi , using Disti as an IV for RefPerCapi .
(E) 2SLS estimation results for the regression of GDVotei on an intercept and RefPerCapi , using Disti and BorderPatroli as IVs for RefPerCapi .
Each column of the table shows the coefficient estimates, with heteroskedasticity-consistent (HC) standard errors in brackets underneath.
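The first-stage entries in Table 1.1 can be converted into t-ratios directly. The sketch below is illustrative only and uses hypothetical variable names. It relies on the fact that, with a single instrument as in column (B), the first-stage F-statistic for instrument relevance equals the squared t-ratio on that instrument; a commonly used rule of thumb treats a first-stage F above 10 as evidence against a weak instrument. The inputs are the rounded table entries, so the results are approximate. A joint F-statistic for Disti and BorderPatroli in column (C) cannot be recovered from the individual coefficients and standard errors alone, so only the individual t-ratios are computed there.

```python
# (coefficient, HC standard error) pairs copied from the first-stage
# columns (B) and (C) of Table 1.1; the values are rounded as reported.
first_stage = {
    ("B", "Dist"): (-0.004, 0.001),
    ("C", "Dist"): (-0.004, 0.001),
    ("C", "BorderPatrol"): (118.292, 66.025),
}

# t-ratio = coefficient / standard error
t_ratios = {key: coef / se for key, (coef, se) in first_stage.items()}

# With one instrument, the first-stage F-statistic is the squared t-ratio.
f_stat_b = t_ratios[("B", "Dist")] ** 2

for (col, var), t in t_ratios.items():
    print(f"Column ({col}), {var}: t = {t:.2f}")
print(f"Column (B) first-stage F (single instrument) = {f_stat_b:.1f}")
```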
Figure 1.1: Refugee Arrivals and Distance from the Turkish Coast
Note: the following statement provides an alternative summary of the information in Figure 1.1 that does not rely on the use of coloured shading. The map shows that municipalities with higher values of RefPerCapi generally lie closer to the Turkish coast than municipalities with the lowest values of RefPerCapi .
Question 2.
Consider the following Monte Carlo experiment, in which 10,000 repeated samples of n = 50 observations each are obtained for three independently and identically distributed (i.i.d.) random variables, Xi , Wi , and Zi , with distribution:
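\[
\begin{pmatrix} X_i \\ W_i \\ Z_i \end{pmatrix}
\sim N\left(
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
1 & \rho_{XW} & \rho_{XZ} \\
\rho_{XW} & 1 & \rho_{WZ} \\
\rho_{XZ} & \rho_{WZ} & 1
\end{pmatrix}
\right),
\]
where ρjk denotes the correlation between variables j and k, with j, k = Xi, Wi, or Zi and j ≠ k.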
An i.i.d. disturbance term, Ui , is drawn independently of Xi , Wi and Zi from the standard normal distribution. Finally, a variable, Yi , is generated according to:
Yi = β0 + β1Xi + β2 Wi + Ui (5)
for i = 1, 2, . . . , n. The population values of the parameters in this experiment are:
ρXW = 0.8, ρXZ = 0.5, ρWZ = 0, β0 = 0 and β1 = β2 = 0.5. For each of the repeated samples, four different models are estimated, as follows:
Model (A): OLS regression of Yi on an intercept, Xi and Wi.
Model (B): Two-Stage Least Squares (2SLS) regression of Yi on an intercept and Xi , using Zi as an Instrumental Variable (IV) for Xi .
Model (C): OLS regression of Yi on an intercept, Xi , Wi and Zi .
Model (D): OLS regression of Yi on an intercept and Zi .
For Models (A), (B) and (C), the estimated parameter on Xi is denoted β̂1. For each of these three models, the null hypothesis that the coefficient on Xi is zero is tested against the two-sided alternative that it is not zero. The hypothesis test is conducted at the 5% level of significance using a t-test based on homoskedasticity-only standard errors.
A selection of simulation results for Models (A), (B) and (C) is reported in Table 2.1 on this page. The rows labelled “Mean(β̂1)” and “S.D.(β̂1)” show the mean and standard deviation (S.D.) of β̂1 for Models (A), (B) and (C) across all 10,000 repeated samples. The row labelled “Reject H0” reports the proportion of repeated samples in which the null hypothesis is rejected for Models (A), (B) and (C).
Table 2.1: Simulation Results for Models (A), (B) and (C)
| | Model (A) | Model (B) | Model (C) |
|---|---|---|---|
| Mean(β̂1) | 0.503 | 0.457 | 0.504 |
| S.D.(β̂1) | 0.244 | 0.459 | 0.449 |
| Reject H0 | 0.528 | 0.380 | 0.201 |
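The experiment described above can be sketched in code. The following is a minimal illustration, not the code used to produce Table 2.1. It assumes a smaller number of replications (2,000 rather than 10,000) to keep the run short, uses only the Python standard library, and covers Models (A), (B) and (D) through closed-form covariance expressions; Model (C) is omitted for brevity. All function and variable names are hypothetical. Because ρWZ = 0, Xi can be built as 0.8Wi + 0.5Zi plus independent noise with variance 0.11, which reproduces the stated correlations and a unit variance for Xi.

```python
import random
import statistics


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(reps=2000, n=50, seed=12345):
    """Return {model: (mean, s.d.)} of the slope estimates across replications."""
    rng = random.Random(seed)
    b0, b1, b2 = 0.0, 0.5, 0.5
    rho_xw, rho_xz = 0.8, 0.5  # rho_wz = 0, so W and Z are drawn independently
    noise_sd = (1 - rho_xw**2 - rho_xz**2) ** 0.5  # gives Var(X) = 1
    est = {"A": [], "B": [], "D": []}
    for _ in range(reps):
        w = [rng.gauss(0, 1) for _ in range(n)]
        z = [rng.gauss(0, 1) for _ in range(n)]
        x = [rho_xw * wi + rho_xz * zi + noise_sd * rng.gauss(0, 1)
             for wi, zi in zip(w, z)]
        y = [b0 + b1 * xi + b2 * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]
        sxx, sww, sxw = cov(x, x), cov(w, w), cov(x, w)
        sxy, swy = cov(x, y), cov(w, y)
        szy, szx, szz = cov(z, y), cov(z, x), cov(z, z)
        # Model (A): OLS coefficient on X when W is also included.
        est["A"].append((sxy * sww - sxw * swy) / (sxx * sww - sxw**2))
        # Model (B): 2SLS with one instrument reduces to cov(Z, Y) / cov(Z, X).
        est["B"].append(szy / szx)
        # Model (D): OLS slope from regressing Y on Z alone.
        est["D"].append(szy / szz)
    return {m: (statistics.fmean(v), statistics.stdev(v)) for m, v in est.items()}


results = simulate()
for model, (mean, sd) in results.items():
    print(f"Model ({model}): mean = {mean:.3f}, s.d. = {sd:.3f}")
```

The simulated mean and standard deviation for Model (A), and the mean slope for Model (D), can be compared with the values reported in Table 2.1 and in the text; exact figures will differ with the seed and the number of replications.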
(a) (6 marks) For each of Models (A), (B) and (C), explain whether or not the parameter estimate, β̂1, is consistent for the population parameter, β1.
(b) (5 marks) Based on the simulation results reported in Table 2.1 on page 6, which model, (A), (B) or (C), gives the most precise estimate of the population parameter β1? Use your knowledge of the simulation set-up to explain why the model that you have identified yields a more precise estimate than the other two models.
(c) (5 marks) Refer to Table 2.1 on page 6. Explain whether the rejection frequencies reported in the row labelled “Reject H0 ” measure the size or the power of the t-test for each of Models (A), (B) and (C). Based on your analysis, interpret the value of “Reject H0 ” for each model.
Now consider Model (D). The specification of this model allows us to evaluate whether the 2SLS procedure conducted in Model (B) can be bypassed by directly regressing Yi on an intercept and Zi using OLS. The mean of the coefficient on Zi in Model (D) across all 10,000 repeated samples is 0.251.
(d) (3 marks) Using the properties that E(Xi |Zi ) = ρXZ Zi and E(Wi |Zi ) = ρWZ Zi , derive the form of the following Population Regression Function (PRF):
E(Yi |Zi ) = α0 + α1 Zi , (6)
and give definitions of α0 and α1 in terms of the population parameters.
(e) Consider the asymptotic properties of α̂1 in Model (D) and answer the following:
(i) (2 marks) Deduce the numerical value that α̂1 is consistent for.
(ii) (2 marks) Does the reported simulated mean of α̂1 support your answer to Question 2(e)(i)? Explain your answer.
(f) (4 marks) A colleague asserts that α̂1 in Model (D) is not consistent for the population parameter β1 because of the omission of Wi from Model (D). Do you agree with this claim? Explain your answer.
(g) (3 marks) Would it be preferable to use heteroskedasticity-consistent standard errors rather than homoskedasticity-only standard errors in this experiment? Explain your answer.
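For part (d), the PRF in equation (6) follows from taking conditional expectations of equation (5). The derivation below is a sketch that assumes E(Ui | Zi) = 0, which holds in this experiment because Ui is drawn independently of Zi; the numerical line substitutes the population values stated in the set-up.

```latex
\begin{aligned}
E(Y_i \mid Z_i)
  &= \beta_0 + \beta_1 E(X_i \mid Z_i) + \beta_2 E(W_i \mid Z_i) + E(U_i \mid Z_i) \\
  &= \beta_0 + \beta_1 \rho_{XZ} Z_i + \beta_2 \rho_{WZ} Z_i + 0 \\
  &= \underbrace{\beta_0}_{\alpha_0}
     + \underbrace{(\beta_1 \rho_{XZ} + \beta_2 \rho_{WZ})}_{\alpha_1} Z_i , \\
\alpha_1 &= 0.5 \times 0.5 + 0.5 \times 0 = 0.25 .
\end{aligned}
```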
Question 3.
This question focuses on student evaluations of teaching quality and the overall quality of their educational experience at a sample of n = 84 Australian higher education institutions observed over T = 4 years from 2017 to 2020, inclusive. Higher education institutions are indexed by i = 1, . . . , n and time periods are indexed by t = 1, . . . , T. The following variables are available:
overalli,t: the percentage of student respondents satisfied with their overall experience
teachingi,t: the percentage of student respondents satisfied with the quality of teaching
unii: a dummy variable equal to 1 if the institution is a university and 0 otherwise
The following models are estimated:
• A pooled model estimated by OLS:
overalli,t = β0 + β1teachingi,t + β2unii + Ui,t (7)
• A model with entity fixed effects (where the word ‘entity’ denotes a higher education institution), estimated using the within estimator:
overalli,t = αi + β1teachingi,t + β2unii + Ui,t (8)
• A model with time fixed effects, estimated using the within estimator:
overalli,t = λt + β1teachingi,t + β2unii + Ui,t (9)
• A model with entity and time fixed effects, estimated using the within estimator:
overalli,t = αi + λt + β1teachingi,t + β2unii + Ui,t (10)
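The within estimator for a model with entity fixed effects works by subtracting each entity's own time average from every variable before running OLS. The sketch below illustrates only that demeaning step, assuming the standard entity-demeaning form of the within transformation. The toy panel (three institutions, T = 4) and all names are hypothetical and are not drawn from the data behind Table 3.1.

```python
from statistics import fmean

# Hypothetical toy panel: 3 institutions observed over T = 4 years.
# All values are made up for illustration.
panel = {
    "inst_1": {"teaching": [80.0, 82.0, 81.0, 79.0], "uni": [1, 1, 1, 1]},
    "inst_2": {"teaching": [70.0, 75.0, 73.0, 74.0], "uni": [0, 0, 0, 0]},
    "inst_3": {"teaching": [88.0, 85.0, 90.0, 89.0], "uni": [1, 1, 1, 1]},
}


def entity_demean(values):
    """Subtract the entity's own time average from each observation."""
    mean = fmean(values)
    return [v - mean for v in values]


# Apply the within (entity-demeaning) transformation to every variable.
demeaned = {
    name: {var: entity_demean(vals) for var, vals in data.items()}
    for name, data in panel.items()
}

for name, data in demeaned.items():
    print(name, data)
```

In this toy panel the dummy never changes within an institution, so its demeaned values are all zero, while the demeaned teaching scores retain their within-institution variation.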
Table 3.1 on page 9 reports coefficient estimates for all four models. In addition, for each model, the table reports robust standard errors clustered on entities (the standard errors are shown in brackets below each parameter estimate).
(a) (2 marks) How is the estimated intercept in equation (7) interpreted?
(b) (4 marks) Interpret the sign, magnitude and statistical significance (at the 5% level) of the estimate of β1 from models (7) and (10). What explains the difference between these two estimates?
(c) (4 marks) Using an example, explain why the errors in equation (9) may be autocorrelated. Assuming that the errors are, in fact, autocorrelated, what are the implications for statistical inference on the parameters of equation (9) based on the results reported in Table 3.1 on page 9? Explain your answer.
(d) (4 marks) Which of the four models, if any, are likely to control for the effect of the COVID-19 pandemic on student evaluations of their overall experience? Explain your answer, clearly stating and justifying any assumptions that you are making regarding the effect of the COVID-19 pandemic on student evaluations of their overall experience.
(e) (4 marks) Give an example of a time-invariant omitted factor that varies across individual institutions. Justify your chosen example. Which of the four models, if any, control for the effect of this type of omitted variable on student evaluations of their overall experience? Explain your answer.
(f) (4 marks) Student evaluations of their overall experience may be influenced by the level of federal funding received by each institution, which we will assume varies across institutions and also over time. Which of the four models, if any, are likely to control for the omission of information on federal funding? Explain your answer.
(g) (3 marks) Based on the estimation results reported in Table 3.1 on this page, can you infer whether any higher education institutions switch between NUHEI (non-university higher education institution) and University status during our sample period? Explain your answer.
2022-07-13