ECON0019: QUANTITATIVE ECONOMICS AND ECONOMETRICS
SUMMER TERM 2022
CENTRALLY-MANAGED ONLINE EXAMINATION
PART A
Answer all questions from this section.
A.1 You wish to measure the effect of hiring more teachers on student performance. To this end
you randomly select 300 schools; for each school you collect data on the number of students per teacher (str) and the average exam score (score) for final year students in 2021. Suppose that in the population the following equation holds:
score = β0 + β1 str + β2 ability + β3 (str × ability) + u, (1)
where ability is the average level of student ability in a given school. This equation satisfies MLR.3–MLR.4 in Wooldridge’s textbook.
(a) You do not have data on ability and so decide to estimate β1 by regressing score on str.
Derive the probability limit of the estimator.
ANSWER:
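$$
\begin{aligned}
\tilde\beta_1 &= \frac{\sum_i (\text{str}_i - \overline{\text{str}})\,\text{score}_i}{\sum_i (\text{str}_i - \overline{\text{str}})^2} \\
&= \frac{\sum_i (\text{str}_i - \overline{\text{str}})\,\{\beta_0 + \beta_1 \text{str}_i + \beta_2 \text{ability}_i + \beta_3 (\text{str}_i \times \text{ability}_i) + u_i\}}{\sum_i (\text{str}_i - \overline{\text{str}})^2} \\
&= \beta_1 + \beta_2 \frac{\sum_i (\text{str}_i - \overline{\text{str}})\,\text{ability}_i}{\sum_i (\text{str}_i - \overline{\text{str}})^2} + \beta_3 \frac{\sum_i (\text{str}_i - \overline{\text{str}})(\text{str}_i \times \text{ability}_i)}{\sum_i (\text{str}_i - \overline{\text{str}})^2} + \frac{\sum_i (\text{str}_i - \overline{\text{str}})\,u_i}{\sum_i (\text{str}_i - \overline{\text{str}})^2}.
\end{aligned}
$$
By the LLN in conjunction with MLR.3–MLR.4,
$$
\begin{aligned}
\frac{1}{n}\sum_i (\text{str}_i - \overline{\text{str}})^2 &\to_p \operatorname{Var}(\text{str}) > 0, \\
\frac{1}{n}\sum_i (\text{str}_i - \overline{\text{str}})\,\text{ability}_i &\to_p \operatorname{Cov}(\text{str}, \text{ability}), \\
\frac{1}{n}\sum_i (\text{str}_i - \overline{\text{str}})(\text{str}_i \times \text{ability}_i) &\to_p \operatorname{Cov}(\text{str}, \text{str} \times \text{ability}), \\
\frac{1}{n}\sum_i (\text{str}_i - \overline{\text{str}})\,u_i &\to_p \operatorname{Cov}(\text{str}, u) = 0.
\end{aligned}
$$
Therefore,
$$
\tilde\beta_1 \to_p \beta_1 + \beta_2 \frac{\operatorname{Cov}(\text{str}, \text{ability})}{\operatorname{Var}(\text{str})} + \beta_3 \frac{\operatorname{Cov}(\text{str}, \text{str} \times \text{ability})}{\operatorname{Var}(\text{str})}. \tag{2}
$$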
...
(b) A colleague of yours conjectures that Cov (str, ability) < 0, Cov (str, str × ability) > 0,
β2 > 0 and β3 < 0. Do these sign restrictions seem plausible to you? Explain. Supposing they hold, is it possible to determine the sign of the asymptotic bias of the estimator in (a)?
ANSWER: Rewrite the model as
score = β0 + β1 str + (β2 + β3 str) ability + u. (3)
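It seems plausible that the baseline effect of ability is positive (β2 > 0), that ability has a stronger effect if class size is small (β3 < 0), and that better students are more likely to attend schools with small class sizes (Cov(str, ability) < 0). However, the sign of Cov(str, str × ability) could go either way. Supposing the restrictions hold, both terms in the numerator of the asymptotic bias are negative and Var(str) > 0, so the asymptotic bias is negative:
$$
\text{AsBias} = \operatorname{plim} \tilde\beta_1 - \beta_1 = \frac{\underbrace{\beta_2 \operatorname{Cov}(\text{str}, \text{ability})}_{<0} + \underbrace{\beta_3 \operatorname{Cov}(\text{str}, \text{str} \times \text{ability})}_{<0}}{\operatorname{Var}(\text{str})} < 0.
$$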
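The sign argument can be checked numerically. The following is a minimal Python sketch, not part of the original solution: the function name `asymptotic_bias` and all moment values are hypothetical, chosen only so that they satisfy the colleague's sign restrictions.

```python
def asymptotic_bias(beta2, beta3, cov_str_ability, cov_str_interaction, var_str):
    """Asymptotic bias of the slope from regressing score on str alone:
    (beta2*Cov(str, ability) + beta3*Cov(str, str x ability)) / Var(str)."""
    if var_str <= 0:
        raise ValueError("Var(str) must be positive")
    return (beta2 * cov_str_ability + beta3 * cov_str_interaction) / var_str

# Hypothetical values obeying the sign restrictions in (b):
# beta2 > 0, beta3 < 0, Cov(str, ability) < 0, Cov(str, str x ability) > 0.
bias = asymptotic_bias(beta2=5.0, beta3=-0.5,
                       cov_str_ability=-2.0, cov_str_interaction=10.0,
                       var_str=25.0)
print(bias)
```

Any other values with the same signs give a negative result as well, since each product in the numerator is negative and the denominator is positive.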
(c) You speculate that the joint population distribution of ability and str is such that
E[ability | str] = E[ability]. (4)
Interpret the restriction in (4). Do you think it is likely to hold?
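ANSWER: (4) says that ability is mean-independent of str: the average ability in a school does not vary with its student–teacher ratio. In particular, it implies that ability and str are uncorrelated:
$$
\begin{aligned}
\operatorname{Cov}(\text{ability}, \text{str}) &= E[\text{ability} \times \text{str}] - E[\text{ability}]\,E[\text{str}] \\
&= E\big[E[\text{ability} \mid \text{str}] \times \text{str}\big] - E[\text{ability}]\,E[\text{str}] \\
&= E[\text{ability}]\,E[\text{str}] - E[\text{ability}]\,E[\text{str}] \\
&= 0.
\end{aligned}
$$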
This is a strong assumption which is unlikely to hold: we would expect students with strong ability to be more likely to attend academically strong schools. Such schools are more likely to have relatively small str and so it would seem more plausible that Cov(ability, str) < 0.
(d) Assuming that (4) holds, demonstrate that the probability limit of the OLS estimator in (a) equals β1 + β3 E[ability]. Interpret the probability limit. In particular, is it a meaningful measure of the effect of class size on student performance?
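ANSWER: We know that Cov(str, ability) = 0 under (4). Similarly,
$$
\begin{aligned}
\operatorname{Cov}(\text{str}, \text{str} \times \text{ability}) &= E[\text{ability} \times \text{str}^2] - E[\text{ability} \times \text{str}]\,E[\text{str}] \\
&= E\big[E[\text{ability} \mid \text{str}] \times \text{str}^2\big] - E[\text{ability}]\,E[\text{str}]^2 \\
&= E[\text{ability}]\,E[\text{str}^2] - E[\text{ability}]\,E[\text{str}]^2 \\
&= E[\text{ability}] \operatorname{Var}(\text{str}).
\end{aligned}
$$
Plugging these two identities into (2),
$$
\begin{aligned}
\operatorname{plim} \tilde\beta_1 &= \beta_1 + \frac{\beta_2 \operatorname{Cov}(\text{str}, \text{ability}) + \beta_3 \operatorname{Cov}(\text{str}, \text{str} \times \text{ability})}{\operatorname{Var}(\text{str})} \\
&= \beta_1 + \beta_3 E[\text{ability}].
\end{aligned}
$$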
In model (1) the marginal effect of str on score is β1 + β3 ability, so this limit is the effect of increasing class size averaged over the distribution of ability in the population. This is a useful policy measure if the policy maker only cares about the average effect. If the policy maker is concerned about the separate effects on low and high ability students, for example, it is not useful.
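The result in (d) can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not part of the original solution: all parameter values (β0 = 700, β1 = −2, β2 = 5, β3 = −0.5, E[ability] = 3) and the distributions of str, ability and u are hypothetical. Ability is drawn independently of str, so restriction (4) holds by construction.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx

def simulate_short_regression(n=100_000, seed=0):
    # Hypothetical population values, chosen only for illustration.
    beta0, beta1, beta2, beta3 = 700.0, -2.0, 5.0, -0.5
    mean_ability = 3.0
    rng = random.Random(seed)
    str_ = [rng.uniform(10, 30) for _ in range(n)]
    # ability is drawn independently of str, so restriction (4) holds.
    ability = [rng.gauss(mean_ability, 1.0) for _ in range(n)]
    u = [rng.gauss(0.0, 5.0) for _ in range(n)]
    score = [beta0 + beta1 * s + beta2 * a + beta3 * s * a + e
             for s, a, e in zip(str_, ability, u)]
    slope = ols_slope(str_, score)      # regression of score on str only
    target = beta1 + beta3 * mean_ability
    return slope, target

slope, target = simulate_short_regression()
print(f"OLS slope: {slope:.3f}, beta1 + beta3*E[ability]: {target:.3f}")
```

With a large sample the slope from the short regression should be close to β1 + β3 E[ability] rather than to β1 itself.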
(e) You decide to collect additional data on the average mark that the final–year students
earned in their first year at each school. With mark denoting this new variable, you hypothesise that, for some unknown coefficients θ, the following two conditions hold:
E[ability | str, mark] = E[ability | mark] = θ mark. (6)
Interpret the two conditions and compare them to (4). Do they seem reasonable?
ANSWER: The first condition is the usual redundancy condition which is required for a proxy variable. It says that once str and ability are included in the regression there is no need for mark. The second condition states that, once we control for mark, str does not help explain ability. They are more plausible than (4): they allow for str and ability to be correlated, but in such a way that mark is a valid proxy for student ability.
(f) You run the following regression,
$$
\widehat{\text{score}} = \hat\beta_0 + \hat\beta_1 \text{str} + \hat\beta_2 \text{mark} + \hat\beta_3 (\text{str} \times \text{mark})
$$
ANSWER: Taking conditional expectations on both sides of (1),
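$$
\begin{aligned}
E[\text{score} \mid \text{str}, \text{mark}] &= \beta_0 + \beta_1 \text{str} + \beta_2 E[\text{ability} \mid \text{str}, \text{mark}] + \beta_3 \big(\text{str} \times E[\text{ability} \mid \text{str}, \text{mark}]\big) + E[u \mid \text{str}, \text{mark}] \\
&= \beta_0 + \beta_1 \text{str} + \beta_2 \theta\, \text{mark} + \beta_3 \theta\, (\text{str} \times \text{mark}),
\end{aligned}
$$
where we have used that
$$
E[u \mid \text{str}, \text{mark}] = E\big[E[u \mid \text{str}, \text{ability}, \text{mark}] \mid \text{str}, \text{mark}\big] = 0.
$$
The OLS estimators will consistently estimate the coefficients in the above equation. In particular, plim β̂1 = β1.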
A.2 You are interested in the relationship between sales, profits and research & development (R&D).
For that purpose you obtain the following regression based on data collected from a sample of 45 firms in the UK concrete industry in 2016,
$$
\widehat{rd} = \underset{(1.369)}{.42} + \underset{(.116)}{.21}\,\log(\text{sales}) + \underset{(.046)}{.07}\,\text{profit}, \qquad R^2 = .079, \tag{7}
$$
where rd is expenditures on R&D of a firm as percentage of its annual sales, sales is the firm’s annual sales (in millions GBP) and profit is its annual profits as percentage of sales. Robust standard errors are reported in parentheses.
(a) Interpret the coefficient on log (sales). If sales increases by 10% what is the exact estimated
percentage point change in rd? Is this an economically large effect?
ANSWER: Assuming that MLR.1–MLR.4 are satisfied, the coefficient measures the ceteris paribus effect on rd of a change in log(sales):
$$
\Delta\widehat{rd} = .21\,\Delta\log(\text{sales}).
$$
Thus, the expected change in rd from a percentage change in sales is given by
$$
\Delta\widehat{rd} = .21\,\Delta\log(\text{sales}) = .21 \log\left(1 + \frac{\%\Delta\text{sales}}{100}\right). \tag{8}
$$
In particular, %Δsales = 10 leads to Δrd = .21 log(1 + 10/100) = .0200. That is, we expect rd to increase by about .02 percentage points.
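The arithmetic in (8) can be reproduced in a few lines of Python. This is a small sketch, with a hypothetical function name; it also shows the usual approximation .21 × (%Δsales/100) for comparison.

```python
import math

def exact_change_in_rd(coef_log_sales, pct_change_sales):
    """Exact change in rd implied by a level-log model, as in (8)."""
    return coef_log_sales * math.log(1 + pct_change_sales / 100)

exact = exact_change_in_rd(0.21, 10)   # .21 * log(1.10)
approx = 0.21 * 10 / 100               # first-order approximation
print(round(exact, 4), round(approx, 4))
```

The two figures differ only in the third decimal place, which is why the approximation is commonly used for small percentage changes.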
(b) Test the hypothesis that rd does not change with sales against the alternative that it does
increase with sales. Perform the test at the 5% and 10% level. What is the p-value of the test? Conclude.
ANSWER: We test H0: βsales = 0 vs. H1: βsales > 0. The t-statistic is
$$
t = \frac{.21}{.116} = 1.81,
$$
which we then compare with 1.282 (10% critical value) and 1.645 (5% critical value): we reject H0 at both the 5% and 10% levels. The p-value is
$$
p = \Pr(t > 1.81) = 1 - \Pr(t < 1.81) = 3.51\%.
$$
We conclude that there is strong statistical evidence that rd increases with sales.
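The test can be reproduced with the standard normal distribution. The sketch below is illustrative and assumes the coefficient .21 and robust standard error .116 read off regression (7); the function names are hypothetical, and the normal CDF is built from `math.erf` so that no external library is needed.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def one_sided_test(estimate, std_error):
    """t-statistic and upper-tail p-value for H0: beta = 0 vs H1: beta > 0."""
    t_stat = estimate / std_error
    p_value = 1 - normal_cdf(t_stat)
    return t_stat, p_value

t_stat, p_value = one_sided_test(0.21, 0.116)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("reject at 5%:", t_stat > 1.645, "| reject at 10%:", t_stat > 1.282)
```

The p-value here uses the unrounded t-statistic, so it can differ from the figure in the answer in the last reported digit.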
(c) You compute the F-test statistic of the hypothesis that sales and profit are jointly insignificant and obtain F = 4.12. Do you accept or reject the null at the 5% level? Explain.
ANSWER: This is a test of the joint hypothesis H0: βsales = βprofit = 0 and so the F-statistic imposes two restrictions. The critical value at the 5% level can be chosen as 3.00 (n = ∞) or 3.2 (n = 45). In either case, we reject the null at the 5% level since F = 4.12 is bigger than both.
(d) Do you trust the critical values that you used in (b) and (c) and the p-value that you computed in (b)? Are they valid? What do you conclude about the reported test results?
ANSWER: The critical values used in (b) and (c) are based on a large-sample approximation of the finite-sample distributions of the t and F statistics. These are only a good approximation when the sample size n is big. Here, n = 45 and so in this application they are probably a poor approximation of the actual critical values. The same goes for the p-value. Alternatively, one can use critical values from the exact finite-sample distributions, assuming the regression error is normally distributed. This assumption is clearly violated in this application since the dependent variable rd is bounded and positive. In conclusion, the test results are highly debatable and should not be trusted. More data is needed.
(e) You estimate the following alternative regression model for rd,
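$$
\widehat{rd} = \underset{(1.245)}{\hat\beta_0} + \underset{(.014)}{.030}\,\text{sales} - \underset{(.00000038)}{.0000070}\,\text{sales}^2 + \underset{(.047)}{\hat\beta_3}\,\text{profit}. \tag{9}
$$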
At what point does the estimated marginal effect of sales on rd become negative in this model?
ANSWER: The estimated marginal effect of sales is .030 − .0000140 sales, which becomes negative when
$$
\text{sales} > \frac{.030}{.0000140} = 2142.86.
$$
That is, when sales exceed 2.14 billion GBP, the marginal effect on R&D becomes negative.
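The turning point follows from setting the derivative of the quadratic to zero. A minimal sketch, assuming the coefficients .030 on sales and −.0000070 on sales² that are implied by the marginal effect stated in the answer; the function name is hypothetical.

```python
def turning_point(coef_sales, coef_sales_sq):
    """Sales level where the marginal effect
    coef_sales + 2 * coef_sales_sq * sales equals zero."""
    if coef_sales_sq == 0:
        raise ValueError("no turning point without a quadratic term")
    return -coef_sales / (2 * coef_sales_sq)

sales_star = turning_point(0.030, -0.0000070)   # in millions of GBP
print(round(sales_star, 2))
```

Because sales is measured in millions of GBP, the result corresponds to roughly 2.14 billion GBP.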
(f) Write up a composite model that would allow you to test (7) and (9), respectively, against
the composite model. Would the outcomes of these two tests be able to determine which of the two models, (7) and (9), is the preferred one? Explain.
ANSWER:
rd = β0 + β1 sales + β2 sales2 + β3 log (sales) + β4profit + u. (10)
We would then test the two nulls of H1: β1 = β2 = 0 and H2: β3 = 0 using the corresponding F-statistic (with 2 restrictions) and t-statistic, respectively. If the outcomes of the tests are Reject-Reject or Accept-Accept, we would not be able to select one model over the other.
(g) You collect data on annual R&D, sales and profits of the same 45 firms in 2017 and re-estimate (9) by running a pooled regression across the two years, 2016 and 2017. A colleague tells you that you should rather estimate the model using the first-difference estimator. Is your colleague right? Explain.
ANSWER: This depends. (With only two years of data, the first-difference and fixed effects estimators coincide.) On the one hand, the pooled regression estimator enjoys a bigger sample size (n = 90) relative to the fixed effects estimator (n = 45), and so we expect to be able to draw stronger conclusions based on the pooled regression. This assumes that the errors are not too strongly positively autocorrelated over time; if they are, the variance of the pooled OLS estimator might actually be bigger. On the other hand, the fixed effects estimator controls for time-invariant firm-specific omitted variables, a source of endogeneity that the pooled OLS estimator does not control for. Thus, the fixed effects estimator is less likely to be biased. If you are concerned about this potential endogeneity bias, the fixed effects estimator is preferred. If not, the pooled OLS estimator is generally the better one.
PART B
Answer ONE question from this section.
B.1 In “Rainfall and Conflict: A Cautionary Tale” (Journal of Development Economics, 2015),
Heather Sarsons studies whether lower income can lead to more violent conflict among religious groups in India. She studies a sample of 142 districts in the country’s 28 states. Simplifying things a bit, the baseline equation of interest is:
$$
C_i = \beta Y_i + \sum_{s=1}^{28} \gamma_s 1[S_i = s] + \varepsilon_i,
$$
where Ci (“conflict”) is the number of riots in district i in a particular year, Yi (“income”) is income per capita, Si is the state in which district i is located, 1 [Si = s] indicates a dummy variable which takes the value of one when Si = s, and εi is the error term.
(a) Why may the OLS estimate of β be inconsistent? Provide at least one economic justification.
ANSWER: E.g. reverse causality: conflict may itself reduce income.
Sarsons proceeds to use an instrumental variable strategy: she instruments income with two measures of rainfall in the district. The first one, R1i, is the amount of rainfall in district i in the year of study minus its typical value (across many years) for the district. The second measure, R2i, is a dummy variable indicating that R1i is below its 20th percentile. The idea is that agricultural production is a key source of income in much of India and it relies on sufficient rainfall.
(b) Explain in detail (step by step) how her instrumental variable estimate βˆ is constructed from data on (Ci, Yi, R1i, R2i, Si). Then write down the formal conditions under which βˆ is consistent for β. Which of them can be tested? For those which can, describe the testing procedure. For those which cannot, explain why not.
ANSWER: 2SLS. First, regress Yi on R1i, R2i and all the state dummies and take the fitted values, Ŷi. Then regress Ci on Ŷi and the state dummies and take the coefficient on Ŷi.
There are two conditions. Exogeneity: E[R1iεi] = E[R2iεi] = 0. Relevance: in the first stage, at least one of the coefficients on R1i and R2i is non-zero. Both can be tested here. Exogeneity can be tested by the overidentification test, which rejects if the IV estimates using each instrument separately are statistically different. Relevance can be tested by the F-test on the coefficients on the instruments in the first-stage regression.
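The two-stage logic can be sketched on simulated data. This is a deliberately simplified illustration, not Sarsons's specification: it uses a single rainfall instrument and no state dummies, and every name and parameter value (true β = −0.5, the unobserved `shock` that moves both income and conflict) is hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(x, y):
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / len(x)

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    return cov(x, y) / cov(x, x)

def two_stage_slope(instrument, income, conflict):
    # Stage 1: regress income on the instrument and form fitted values.
    pi_hat = ols_slope(instrument, income)
    z_bar, y_bar = mean(instrument), mean(income)
    income_hat = [y_bar + pi_hat * (z - z_bar) for z in instrument]
    # Stage 2: regress conflict on fitted income.
    return ols_slope(income_hat, conflict)

def simulate(n=100_000, beta=-0.5, seed=1):
    rng = random.Random(seed)
    rain = [rng.gauss(0, 1) for _ in range(n)]
    # Unobserved shock that raises income and lowers conflict (endogeneity).
    shock = [rng.gauss(0, 1) for _ in range(n)]
    income = [r + s + rng.gauss(0, 1) for r, s in zip(rain, shock)]
    conflict = [beta * y - s + rng.gauss(0, 0.5)
                for y, s in zip(income, shock)]
    return rain, income, conflict

rain, income, conflict = simulate()
beta_ols = ols_slope(income, conflict)
beta_iv = two_stage_slope(rain, income, conflict)
print(f"OLS: {beta_ols:.3f}, 2SLS: {beta_iv:.3f}, true beta: -0.5")
```

With one instrument the second-stage slope equals the ratio Cov(R, C)/Cov(R, Y), so the two-stage procedure and the simple IV formula give the same number. Standard errors from a manually run second stage are not valid and are omitted here.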
(c) Why does Sarsons subtract the typical rainfall in the district when constructing R1i? Which condition or conditions from part (b) would be more likely violated if she did not do this? Give an economic justification.
ANSWER: Exogeneity would be more likely violated. While the deviation between the current and the typical rainfall is essentially randomly assigned, the realized rainfall is not. For instance, in regions with systematically less rainfall, we may expect to see different crops grown. Some crops may require larger farms and more economic inequality, leading to more conflict regardless of income.
For simplicity, drop R2i and keep R1i as the single instrument in the final parts of the question. Sarsons observes that in some Indian districts there are dams on local rivers, and thus reservoirs of water which do not dry out even in low-rainfall years. This should make income less dependent on weather. Another issue is that the causal effects of income on conflict can be heterogeneous across regions.
(d) With these complications, can the instrumental variable estimate βˆ still be interpreted as some average of causal effects of income on conflict and, if so, what kind of average? Which condition or conditions would have to be added, compared to part (b), for such an interpretation to be valid? Is it (or are they) plausible?
ANSWER: Dams imply that the first-stage effects π1i of rainfall on income are heterogeneous. In this case, the IV coefficient has a LATE interpretation, as the average of causal effects weighted by π1i. In particular, it would put lower weights on regions with dams. This interpretation requires monotonicity: that more rainfall is always good for income. This can be violated if there are floods, which are also bad for agricultural productivity. (Discussions of independence and exclusion are welcome but optional since they are roughly covered by the exogeneity assumption from before.)
(e) Sarsons finds that in the districts with a dam the reduced-form coefficient is significantly different from zero, but the first-stage coefficient is not. She concludes that rainfall may not be an exogenous instrument. Explain intuitively and formally how she makes this conclusion.
ANSWER: Focus on the districts with a dam, for which the instrument is not relevant. Intuitively, if exogeneity holds, then conflict can only be correlated with rainfall through the causal chain rainfall → income → conflict. However, since rainfall does not affect income, this correlation should be zero, and the reduced-form coefficient should be zero. Formally, we have
$$
Y_i = \pi_{1i} R_{1i} + \sum_{s=1}^{28} \delta_s 1[S_i = s] + u_i,
$$
where π1i = 0 and Cov[R1i, ui] = 0. Then
$$
\begin{aligned}
C_i &= \beta Y_i + \sum_{s=1}^{28} \gamma_s 1[S_i = s] + \varepsilon_i \\
&= \sum_{s=1}^{28} (\gamma_s + \beta\delta_s)\, 1[S_i = s] + \beta u_i + \varepsilon_i.
\end{aligned}
$$
Exogeneity implies Cov [R1i, βui + εi] = 0, and thus a zero reduced-form coefficient.
2022-08-18