ECOM20001: Econometrics 1 Assignment 1
Getting Started
Please create an Assignment1 folder on your computer, then go to the Canvas site for ECOM20001 and download the following data file into that folder:
• as1_voucher.csv
This dataset comes from a survey of 1,680 individuals who are given a $1,000 shopping voucher by the government. The voucher is divisible, which means that individuals can split it and spend it on different goods.
Individuals are free to spend the voucher on any goods. For example, they may use it to cover their usual daily expenses (e.g., groceries), or they may use it on goods that they would not have purchased without the voucher (e.g., a new tablet). In the former case, the voucher does not stimulate consumption. In the latter case, it does.
The government is interested in understanding how much the voucher can stimulate consumption.
This dataset contains the following variables:
• id: anonymous identification number for an individual
• newc: new consumption expenditure ($). That is, the voucher amount spent on goods that would not have been purchased by the individual if there were no vouchers.
• age: age (year) of the individual.
• gender: =1 if the individual is female, =2 if male.
• income: the individual’s annual income (in thousands of $).
Prior to answering the questions, do some exploratory work such as browsing the observations and looking at the summary statistics.
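The exploratory step might look something like the sketch below. It uses a tiny made-up data frame with the same column names as the dataset (the values are invented for illustration and are not from as1_voucher.csv); the commented-out read.csv line assumes the file sits in your working directory.

```r
# Hypothetical stand-in for as1_voucher.csv: the values below are invented.
mydata <- data.frame(
  id     = 1:6,
  newc   = c(120, 0, 450, 300, 80, 1000),
  age    = c(23, 45, 31, 19, 60, 37),
  gender = c(1, 2, 2, 1, 1, 2),
  income = c(35, 80, 52, 12, 44, 95)
)
# With the real file you would instead run something like:
# mydata <- read.csv("as1_voucher.csv")

print(head(mydata))     # browse the first few observations
str(mydata)             # check variable names and types
print(summary(mydata))  # min, quartiles, mean and max for every column
n_obs <- nrow(mydata)   # number of observations
```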
Hint: Be as concise as possible in your answers. A short, crisp answer is better than a long, unclear answer. Longer answers do not necessarily attract higher marks. Minor rounding errors (accuracy up to 3 significant figures) are allowed.
Questions
1. (2 marks) Using the gender variable, create a binary variable called male that equals 1 if the individual is male and 0 otherwise. Report summary statistics (mean, std. dev., min, max) for newc, income, and male. Interpret each of the sample means in plain language.
(Hint: for illustration, to generate a binary variable, use the command “mydata$female=1*(mydata$gender==1)”, where mydata is your original dataset.)
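As a rough sketch of the mechanics in this question, the code below builds the male indicator from gender (coded 2 for male, as in the variable list) and collects the four summary statistics in one table. The data frame is invented for illustration; only the column names match the assignment data.

```r
# Invented example data; only the column names match the assignment dataset.
mydata <- data.frame(
  gender = c(1, 2, 2, 1, 1, 2),
  newc   = c(120, 0, 450, 300, 80, 1000),
  income = c(35, 80, 52, 12, 44, 95)
)

# gender == 2 means male, so the logical comparison times 1 gives a 0/1 variable
mydata$male <- 1 * (mydata$gender == 2)

# Mean, standard deviation, min and max for each variable of interest
vars  <- c("newc", "income", "male")
stats <- sapply(mydata[vars], function(x) {
  c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))
})
print(round(stats, 3))
```

The mean of a 0/1 variable is the share of observations equal to 1, which is worth keeping in mind when interpreting the mean of male.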
2. (2 marks) Compute the 95% confidence interval for income. Interpret the findings in plain language.
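One way to compute such an interval by hand is sketched below, assuming the large-sample normal approximation (critical value of about 1.96); your lecture notes may instead use a t critical value. The income values are invented for illustration.

```r
# Invented income values (in thousands of $), for illustration only.
income <- c(35, 80, 52, 12, 44, 95)

n    <- length(income)
xbar <- mean(income)            # sample mean
se   <- sd(income) / sqrt(n)    # standard error of the sample mean
z    <- qnorm(0.975)            # about 1.96 (normal approximation, an assumption)

ci <- c(lower = xbar - z * se, upper = xbar + z * se)
print(ci)

# t.test(income)$conf.int gives the t-based interval, which is slightly wider
# in small samples and nearly identical in large ones.
```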
3. (2 marks) Display two separate densities within the same graph: one for newc where male=1 and one for newc where male=0. Briefly describe and compare the distributions in plain language.
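A minimal sketch of overlaying two kernel densities in base R follows. The two vectors are simulated stand-ins for newc in each group, and the colours, line types, and titles are arbitrary choices.

```r
# Simulated stand-ins for newc in each group (not the assignment data).
set.seed(42)
newc_male   <- pmin(pmax(rnorm(200, mean = 400, sd = 150), 0), 1000)
newc_female <- pmin(pmax(rnorm(200, mean = 500, sd = 200), 0), 1000)

d1 <- density(newc_male)     # kernel density estimate where male = 1
d0 <- density(newc_female)   # kernel density estimate where male = 0

# Draw the first density, sizing the axes so that both curves fit
plot(d1, col = "blue", main = "Density of newc by gender", xlab = "newc ($)",
     xlim = range(d1$x, d0$x), ylim = c(0, max(d1$y, d0$y)))
lines(d0, col = "red", lty = 2)   # add the second density to the same graph
legend("topright", legend = c("male = 1", "male = 0"),
       col = c("blue", "red"), lty = c(1, 2))
```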
4. (3 marks) Using the newc variable test the following difference in means:
- H0: mean(newc if male=1) = mean(newc if male=0)
- H1: mean(newc if male=1) != mean(newc if male=0)
where the symbol “!=” means “not equals.” For the test, report the difference in means, the t-statistic, the 95% confidence interval for the difference in means, and the p-value. Explain whether the test rejects the null at the 5% level of significance. Provide a brief interpretation of your findings.
(Hint: You may apply the formulas in Lecture Note 3. To generate a new dataset consisting of a subset of observations, use the command “mydata_male=subset(mydata, male==1)”, for example.)
(Hint: Alternatively, you may use the command “t.test”. Tutorial 4 contains details on how you can do this.)
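Both routes mentioned in the hints can be sketched as follows on simulated data. The manual version assumes the unequal-variance standard error and normal critical values; check Lecture Note 3 for the exact formulas used in the course. t.test with its default settings runs Welch's test, which uses the same t-statistic but a t distribution for the p-value and interval.

```r
# Simulated data for illustration only (not the assignment dataset).
set.seed(7)
mydata <- data.frame(
  male = rep(c(1, 0), each = 100),
  newc = c(rnorm(100, mean = 420, sd = 150), rnorm(100, mean = 480, sd = 160))
)
mydata_male   <- subset(mydata, male == 1)
mydata_female <- subset(mydata, male == 0)

# Manual calculation (large-sample normal approximation, an assumption)
d_means <- mean(mydata_male$newc) - mean(mydata_female$newc)
se      <- sqrt(var(mydata_male$newc) / nrow(mydata_male) +
                var(mydata_female$newc) / nrow(mydata_female))
t_stat  <- d_means / se
p_val   <- 2 * pnorm(-abs(t_stat))                 # two-sided p-value
ci      <- d_means + c(-1, 1) * qnorm(0.975) * se  # 95% confidence interval

# Built-in alternative: Welch two-sample t-test
tt <- t.test(mydata_male$newc, mydata_female$newc)
print(c(diff = d_means, t = t_stat, p = p_val))
print(tt)
```

The two t-statistics coincide; the p-values and intervals differ only slightly because t.test uses a t distribution rather than the normal.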
5. (2 marks) Construct a scatter plot of newc vs. income where newc is on the vertical axis and income is on the horizontal axis. Visually, does there appear to be a positive or negative relationship between newc and income? Also compute and report the correlation coefficient, corr(newc, income).
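The plotting and correlation calls could look like the sketch below. The data are simulated, so the pattern shown says nothing about the real dataset.

```r
# Simulated data for illustration only.
set.seed(3)
income <- runif(150, min = 10, max = 120)   # in thousands of $
newc   <- pmin(pmax(200 + 3 * income + rnorm(150, sd = 80), 0), 1000)

# First argument goes on the horizontal axis, second on the vertical axis
plot(income, newc,
     xlab = "income (thousand $)", ylab = "newc ($)",
     main = "newc vs. income")

r <- cor(newc, income)   # sample correlation coefficient
print(r)
```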
6. (2 marks) Run a single linear regression where the dependent variable is newc and the independent variable is income. Discuss your results by:
- Interpreting the magnitude of the regression intercept estimate.
- Interpreting the magnitude of the predicted change in newc corresponding to a $20,000 increase in annual income.
(Hint: To run a regression of variable y on variable x, use the command “reg1=lm(y~x, data=mydata)”, where mydata is your original dataset. Then, display the regression results using the command “summary(reg1)”. You may find more details in Tutorial 4.)
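A short sketch of the regression workflow on simulated data is given below. Note that lm takes the dataset through its data argument. Because income is measured in thousands of dollars, a $20,000 increase corresponds to a 20-unit change in the regressor.

```r
# Simulated data for illustration only; the true slope here is set to 2.5.
set.seed(5)
mydata <- data.frame(income = runif(200, min = 10, max = 120))
mydata$newc <- 150 + 2.5 * mydata$income + rnorm(200, sd = 60)

reg1 <- lm(newc ~ income, data = mydata)   # regress newc on income
print(summary(reg1))                       # estimates, std. errors, R-squared

b <- coef(reg1)                      # named vector: (Intercept), income
pred_change <- 20 * b["income"]      # predicted change in newc for +$20,000 income
print(pred_change)
```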
7. (2 marks) Suppose someone on the data analytics team tells you to run regressions of newc on income using male and female subsamples separately. What do you find in the regressions? Briefly describe your findings and explain whether the regressions provide any insights.
(Hint: to generate a new dataset consisting of a subset of observations, use the command “mydata_male=subset(mydata, male==1)”, where mydata is your original dataset.)
8. (3 marks) Suppose someone on the data analytics team tells you to run regressions of newc on income using the following subsamples separately: (i) individuals aged 20 or younger; (ii) individuals aged 18 or younger. What do you find and how would you explain the findings to the team?
(Hint: to generate a new dataset consisting of a subset of observations, use the command “mydata_agele20=subset(mydata, age<=20)”, where mydata is your original dataset.)
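The subsample regressions in questions 7 and 8 follow the same pattern: subset first, then pass the subset to lm. The sketch below uses simulated data, so its output carries no information about the real dataset. Printing the subsample size next to the estimates is a suggestion, not a requirement of the assignment.

```r
# Simulated data for illustration only.
set.seed(11)
mydata <- data.frame(
  age    = sample(16:70, 300, replace = TRUE),
  male   = rbinom(300, size = 1, prob = 0.5),
  income = runif(300, min = 10, max = 120)
)
mydata$newc <- 150 + 2.5 * mydata$income + rnorm(300, sd = 60)

# Build the subsamples
mydata_male    <- subset(mydata, male == 1)
mydata_agele20 <- subset(mydata, age <= 20)

# Same regression, different subsamples
reg_male    <- lm(newc ~ income, data = mydata_male)
reg_agele20 <- lm(newc ~ income, data = mydata_agele20)

cat("n (male):", nrow(mydata_male), "\n")
print(coef(summary(reg_male)))
cat("n (age <= 20):", nrow(mydata_agele20), "\n")
print(coef(summary(reg_agele20)))
```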
9. (2 marks) R-code: we will review and mark your R code according to the following scheme:
• 2/2 if R code is correct and organised and commented like the solution code for the assignment.
• 1/2 if R code is correct, but hard to follow or not well commented.
• 0/2 if R code is incorrect and/or a complete mess, or not submitted.
2022-06-09