ECONOMETRIC METHODS – COURSEWORK 2022
The answers to the questions must be typewritten. Symbols and equations should preferably be inserted into the document using the equation editor in Word. Alternatively, they can be scanned and inserted as an image (provided it is clear and readable). Maximum 1,500 words, excluding any Stata output and commands.
The coursework comprises two questions, the second of which is a short Stata assignment. Questions 1 and 2 carry equal weight, and the marks shown within each question indicate the weighting given to its component sections. All calculations must show full workings; otherwise full marks will not be awarded.
1. a. In the regression model $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$ (where $i$ denotes the unit of observation), suppose that the two independent variables $x_1$ and $x_2$ are highly collinear:
i) provide an algebraic expression for the correlation coefficient between the two independent variables; [5 marks]
ii) explain, using the appropriate formula, the effect of high collinearity on the standard errors of the parameter estimates and on the t-statistics. [5 marks]
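As a numerical illustration of part (a), the sketch below computes the sample correlation between two regressors and the factor 1/(1 − r²) by which collinearity scales the variance of each slope estimate in a two-regressor model. It assumes the textbook result Var(β̂_j) = σ² / [Σ(x_ji − x̄_j)² (1 − r₁₂²)]; the data and the function names `corr` and `variance_inflation` are made up for illustration.

```python
import math

def corr(x1, x2):
    """Sample correlation coefficient between two regressors."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    return s12 / math.sqrt(s11 * s22)

def variance_inflation(r12):
    """Factor multiplying Var(beta_hat_j) relative to the r12 = 0 case."""
    return 1.0 / (1.0 - r12 ** 2)

# Hypothetical, nearly collinear regressors (not from the coursework).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]

r12 = corr(x1, x2)
vif = variance_inflation(r12)
se_factor = math.sqrt(vif)  # standard errors scale with the square root

print(f"r12 = {r12:.4f}, variance inflation = {vif:.1f}, "
      f"s.e. multiplied by about {se_factor:.1f}")
```

Because the t-statistic is the coefficient estimate divided by its standard error, the same factor that inflates the standard error shrinks the t-statistic, other things equal.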
b. The following sums were obtained from a sample of 240 time series observations (i.e. t=1,2,…,240) on the variables y and x.
$\sum y_t = 144$, $\sum x_t = 216$, $\sum y_t^2 = 888$, $\sum x_t^2 = 2160$, $\sum x_t y_t = 1080$
i) Calculate the least squares estimates of the intercept and slope parameters in the regression model $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$. [15 marks]
ii) Briefly explain the assumption of no autocorrelation in the context of the error term $\varepsilon_t$. [5 marks]
iii) Explain the consequences of $\mathrm{corr}(x_t, \varepsilon_t) \neq 0$. [5 marks]
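The arithmetic for part (b)(i) can be checked with a short sketch. It assumes the standard simple-regression formulas, slope = (Σxy − n·x̄·ȳ) / (Σx² − n·x̄²) and intercept = ȳ − slope·x̄, and reads the fourth sum in the list as the sum of squared x. The variable names are illustrative.

```python
# Sums given in the question (n = 240 time series observations).
n = 240
sum_y, sum_x = 144.0, 216.0
sum_y2, sum_x2, sum_xy = 888.0, 2160.0, 1080.0

x_bar = sum_x / n
y_bar = sum_y / n

# Sums of cross-products and squares in deviation-from-mean form.
s_xy = sum_xy - n * x_bar * y_bar
s_xx = sum_x2 - n * x_bar ** 2

beta1_hat = s_xy / s_xx                  # slope estimate
beta0_hat = y_bar - beta1_hat * x_bar    # intercept estimate

print(f"x_bar = {x_bar:.3f}, y_bar = {y_bar:.3f}")
print(f"S_xy = {s_xy:.1f}, S_xx = {s_xx:.1f}")
print(f"beta1_hat = {beta1_hat:.6f}, beta0_hat = {beta0_hat:.6f}")
```

The sum of squared y is not needed for the point estimates; it would enter only when computing the residual variance or R².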
c. Using Chinese data over the period 2006 quarter 1 to 2012 quarter 4, sales are modelled as a function of lagged sales, disposable income, consumer confidence, and seasonal effects:
$\text{sales}_t = \beta_0 + \beta_1 \text{sales}_{t-1} + \beta_2 \log(Y)_t + \beta_3 (1/cc_t) + \sum_{k=2}^{4} \delta_k d_{kt} + \varepsilon_t$
Variable Definitions
sales     = nominal sales (in ¥ million)
log(Y)    = natural logarithm of nominal income
recip_cc  = 1 / [consumer confidence, cc] (%)
d2        = 1 if second quarter of year; 0 otherwise
d3        = 1 if third quarter of year; 0 otherwise
d4        = 1 if fourth quarter of year; 0 otherwise
After undertaking auxiliary regressions the following ANOVA results were obtained in Stata. ‘L’ denotes the lag operator.
regress L.sales logY recip_cc d2 d3 d4
Source | SS df MS
-------------+----------------------------------
Model | 10605.7128
Residual | 1884.14964
-------------+----------------------------------
Total |
regress logY L.sales recip_cc d2 d3 d4
Source | SS df MS
-------------+----------------------------------
Model | .05355609
Residual |
-------------+----------------------------------
Total | .102314625
regress recip_cc L.sales logY d2 d3 d4
Source | SS df MS
-------------+----------------------------------
Model |
Residual | .000022837
-------------+----------------------------------
Total | .000045554
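The text of part (c) does not state what should be computed from these tables, so the following is only a sketch under an assumption: that the auxiliary regressions are meant to gauge collinearity through each regressor's auxiliary R² and variance inflation factor, VIF = 1/(1 − R²) = TSS/RSS. The missing entry in each ANOVA table is recovered from Total = Model + Residual. The helper name `aux_r2_and_vif` is hypothetical.

```python
def aux_r2_and_vif(model_ss=None, resid_ss=None, total_ss=None):
    """Fill in the missing sum of squares, then return (R^2, VIF)."""
    if total_ss is None:
        total_ss = model_ss + resid_ss
    if resid_ss is None:
        resid_ss = total_ss - model_ss
    r2 = 1.0 - resid_ss / total_ss
    vif = total_ss / resid_ss  # identical to 1 / (1 - R^2)
    return r2, vif

# Sums of squares copied from the three auxiliary regressions.
aux = {
    "L.sales": aux_r2_and_vif(model_ss=10605.7128, resid_ss=1884.14964),
    "logY": aux_r2_and_vif(model_ss=0.05355609, total_ss=0.102314625),
    "recip_cc": aux_r2_and_vif(resid_ss=0.000022837, total_ss=0.000045554),
}

for name, (r2, vif) in aux.items():
    print(f"{name:>9}: auxiliary R^2 = {r2:.4f}, VIF = {vif:.3f}")
```

Whether a given VIF counts as problematic depends on the rule of thumb adopted; the sketch only produces the numbers.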
d. The following Stata output shows the results of estimating the model from part (c) and sample means of continuous variables.
i) Calculate the slope and elasticity associated with income and consumer confidence, evaluated at the sample mean. [10 marks]
ii) Explain why a reciprocal functional form is used. [5 marks]
iii) What does the estimate on the lagged dependent variable imply? [5 marks]
iv) Test for autocorrelation at the 5% level. [20 marks]
v) Interpret the seasonal (quarterly) effects. Rewrite the model in part (c) to allow for a concurrent regression and explain in detail how this could be tested. [15 marks]
regress sales L.sales logY recip_cc d2 d3 d4
Source | SS df MS
-------------+----------------------------------
Model | 11816.1851 6 1969.36419
Residual | 1195.78871 20 59.7894355
-------------+----------------------------------
Total | 13011.9738 26 500.460532
sales | Coefficient Std. err. t P>|t|
-------------+----------------------------------------
sales |
L1. | .220576 1.24
|
logY | 98.99456 35.01764 2.83 0.010
recip_cc | -4616.62 1618.058 -2.85 0.010
d2 | 23.94257 10.42623 2.30 0.033
d3 | 32.59669 8.305721 3.92 0.001
d4 | 63.50859 6.105048 10.40 0.000
_cons | -371.7605 144.322 -2.58 0.018
------------------------------------------------------
Durbin–Watson d-statistic( 7, 27) = 1.929705
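Because the model contains a lagged dependent variable, the Durbin-Watson d statistic is biased towards 2, and one common alternative for part (d)(iv) is Durbin's h test. The sketch below assumes that approach, with h = ρ̂·sqrt(n / (1 − n·Var(β̂₁))) and ρ̂ ≈ 1 − d/2, compared against the standard normal distribution. The standard error of the lagged-sales coefficient is not printed in the output above, so the code approximates it as coefficient / t, which is limited by the rounding of the reported t-ratio.

```python
import math

n = 27            # observations used in the regression
dw = 1.929705     # Durbin-Watson d from the output
b_lag = 0.220576  # coefficient on L.sales
t_lag = 1.24      # reported t-ratio for L.sales

# Assumption: the unprinted standard error is approximated by coefficient / t.
se_lag = b_lag / t_lag
var_lag = se_lag ** 2

rho_hat = 1.0 - dw / 2.0  # implied first-order autocorrelation estimate
denom = 1.0 - n * var_lag
if denom <= 0:
    raise ValueError("Durbin's h is undefined when n * Var(b_lag) >= 1")

h = rho_hat * math.sqrt(n / denom)

crit = 1.96  # two-sided 5% critical value of the standard normal
reject = abs(h) > crit
print(f"se = {se_lag:.4f}, rho_hat = {rho_hat:.4f}, h = {h:.3f}, "
      f"reject H0 of no autocorrelation: {reject}")
```

If n·Var(β̂₁) were 1 or more, h could not be computed, and a Breusch-Godfrey type LM test would be the usual fallback.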
sum sales L.sales logY cc recip_cc
    Variable |   Obs        Mean    Std. dev.
-------------+------------------------------------
       sales |
         --. |    28    98.12636    23.61535
         L1. |    27    96.28344    21.91756
        logY |    28    4.532284    .0645629
          cc |    28    160.7179    26.71612
    recip_cc |    28    .0064294    .0013157
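For part (d)(i), the slopes and elasticities follow from differentiating the fitted equation. This sketch assumes the usual results for these functional forms: for a log regressor, ∂sales/∂Y = β₂/Y and elasticity = β₂/sales; for a reciprocal regressor, ∂sales/∂cc = −β₃/cc² and elasticity = −β₃/(cc·sales). The level of Y is not reported, so it is approximated by exp(mean of logY), which is an assumption. Evaluating at the mean of cc rather than at the mean of recip_cc is also a choice, and the two give slightly different figures.

```python
import math

# Coefficients and sample means copied from the Stata output.
b_logy = 98.99456
b_recip = -4616.62
mean_sales = 98.12636
mean_logy = 4.532284
mean_cc = 160.7179

# Income enters as log(Y).
y_at_mean = math.exp(mean_logy)  # assumption: Y evaluated at exp(mean logY)
slope_income = b_logy / y_at_mean
elas_income = b_logy / mean_sales

# Consumer confidence enters as 1/cc.
slope_cc = -b_recip / mean_cc ** 2
elas_cc = slope_cc * mean_cc / mean_sales

print(f"income:     slope = {slope_income:.4f}, elasticity = {elas_income:.4f}")
print(f"confidence: slope = {slope_cc:.4f}, elasticity = {elas_cc:.4f}")
```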
STATA ASSIGNMENT
2. The data set “wages.dta” is a cross-section of 2,220 individuals in the U.S. in 2020. The variables in the data are:
wage      = hourly wage rate in cents
educ      = years of schooling of the individual
fatheduc  = father’s years of schooling
motheduc  = mother’s years of schooling
black     = dummy variable (0 white, 1 black)
IQ        = intelligence score
married   = dummy variable (0 unmarried, 1 married)
exper     = years of labour market experience
Load the data into Stata. Then type the following commands:
set seed 200212232
replace wage=wage*abs(rnormal(0,1))
where the number after "set seed" is your student registration number, e.g. 200212232 (this ensures that each student has unique data). Next, save your data as “ECN6540_Assignment_mydata.dta”. It is important that you work with this file if you close and reopen Stata at a later date.
a. Load your unique data from the file “ECN6540_Assignment_mydata.dta”. Using a semi-log wage specification, estimate a wage equation where YOU choose the independent variables BUT THESE MUST include “black”, “married”, “educ”, “fatheduc” and “motheduc” at a minimum. [5 marks]
b. Interpret the estimated parameters of your model. [10 marks]
c. Test whether the individual parameters estimated are individually statistically significant and jointly statistically significant BY HAND and then compare with the Stata output. [15 marks]
d. Test your estimated model for heteroscedasticity using the WHITE test BY HAND (without using any inbuilt Stata test commands). [20 marks]
e. Use tsset id to set “id” as the time series identifier (although note that the data are cross-sectional). Test whether the model estimated in part (a) exhibits autocorrelation at the 5% level. What does this result imply? [5 marks]
f. Test whether the parameters associated with “fatheduc” and “motheduc” in part (a) are equal to unity at the 5% level BY HAND (without using any inbuilt Stata test commands). Use Stata to construct the appropriate RSS. [15 marks]
g. Using your initial model from part (a) test whether “black” and “married” individuals exhibit different returns to education (“educ”) at the 1% level BY HAND (without using any inbuilt Stata test commands). Use Stata to construct the appropriate RSS. [20 marks]
h. At the end of your document provide the text from your Stata *.do file. [10 marks]
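Parts (d), (f) and (g) ask for test statistics assembled by hand from regression output. As a cross-check on the arithmetic only, here is a small sketch written in Python rather than Stata. It assumes the usual forms: F = [(RSS_r − RSS_u)/q] / [RSS_u/(n − k)] for q linear restrictions, and LM = n·R² from the auxiliary regression of squared residuals on the regressors, their squares and cross-products for White's test. The function names are hypothetical, and every input value below is a made-up placeholder, not a result from wages.dta.

```python
def f_stat_from_rss(rss_restricted, rss_unrestricted, q, n, k):
    """F statistic for q restrictions; k counts all unrestricted
    parameters, including the intercept."""
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator

def white_lm_stat(n, r2_aux):
    """White LM statistic: n times R^2 of the auxiliary regression."""
    return n * r2_aux

# Placeholder inputs for illustration only.
n_obs = 2220
f_value = f_stat_from_rss(rss_restricted=1500.0, rss_unrestricted=1480.0,
                          q=2, n=n_obs, k=6)
lm_value = white_lm_stat(n=n_obs, r2_aux=0.01)

# Compare f_value with the F(q, n - k) critical value and lm_value with
# the chi-squared critical value (df = number of auxiliary regressors),
# both taken from statistical tables at the required significance level.
print(f"F = {f_value:.4f}, White LM = {lm_value:.2f}")
```

For part (f), the restricted RSS would come from a regression that imposes both coefficients equal to one, for example by moving fatheduc and motheduc to the left-hand side, so q = 2 in that case.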
2023-07-27