ECOM20001: Econometrics 1

Assignment 1 

Getting Started

Please create an Assignment1 folder on your computer, then go to the Canvas site for ECOM20001 and download the following data file into the Assignment1 folder:

•  as1_voucher.csv

This dataset comes from a survey of 1,680 individuals, each of whom was given a $1,000 shopping voucher by the government. The voucher is divisible, which means that individuals can split it and spend it on different goods.

Individuals are free to spend the voucher on any goods. For example, they may use it to cover their usual daily expenses (e.g., groceries), or they may use it on goods which they would not have purchased if there were no vouchers (e.g., a new tablet). In the former case, the voucher does not stimulate consumption. In the latter case, the voucher stimulates consumption.

The government is interested in understanding how much the voucher can stimulate consumption.

This dataset contains the following variables:

•  id: anonymous identification number for an individual

•  newc: new consumption expenditure ($). That is, the voucher amount spent on goods that would not have been purchased by the individual if there were no vouchers.

•  age: age (year) of the individual.

•  gender: =1 if the individual is female, =2 if male.

•  income: the individual’s annual income (in thousand $)

Prior to answering the questions, do some exploratory work such as browsing the observations and looking at the summary statistics.


Hint: Be as concise as possible in your answers. A short, crisp answer is better than a long, unclear answer. Longer answers do not necessarily attract higher marks. Minor rounding errors (accuracy up to 3 significant figures) are allowed.

 

Questions

1.   (2 marks) Using the gender variable, create a binary variable called male that equals 1 if the individual is male, and equals 0 otherwise. Report summary statistics (mean, std. dev, min, max) for newc, income, and male. Interpret each of the sample means in plain language.

 

(Hint: for illustration, to generate a binary variable, use command “mydata$female=1*(mydata$gender==1)”, where mydata is your original dataset)
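As a minimal sketch of the hint above, the following creates the binary variable and tabulates the four summary statistics. The data frame here is simulated purely for illustration (its values are assumptions, not the contents of as1_voucher.csv), and the names vars and stats are hypothetical.

```r
# Simulated stand-in for as1_voucher.csv (values are made up for illustration)
set.seed(1)
mydata <- data.frame(
  newc   = runif(200, 0, 1000),
  gender = sample(1:2, 200, replace = TRUE),
  income = runif(200, 20, 150)
)

# male equals 1 when gender == 2 (male) and 0 otherwise
mydata$male <- 1 * (mydata$gender == 2)

# Mean, std. dev, min and max for each variable of interest
vars  <- c("newc", "income", "male")
stats <- sapply(mydata[vars], function(x) {
  c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))
})

# One row per variable, one column per statistic
print(round(t(stats), 3))
```

With the real data, mydata would instead come from something like read.csv("as1_voucher.csv"). Note that the mean of a 0/1 variable is the share of observations equal to 1.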

 

2.   (2 marks) Compute the 95% confidence interval for income. Interpret the findings in plain language.
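One possible way to build such an interval is sketched below on simulated income values (an assumption, not the assignment data). It shows the large-sample interval based on the normal critical value alongside the interval reported by t.test, which uses a t critical value instead. Which of the two your course expects is not stated here, so check the lecture notes.

```r
# Simulated stand-in for the income variable (in thousand $)
set.seed(1)
income <- runif(200, 20, 150)

n    <- length(income)
xbar <- mean(income)              # sample mean
se   <- sd(income) / sqrt(n)      # standard error of the sample mean

# Large-sample 95% CI: mean +/- 1.96 standard errors
z         <- qnorm(0.975)
ci_normal <- c(lower = xbar - z * se, upper = xbar + z * se)

# t.test() reports a 95% CI using the t distribution with n - 1 d.o.f.
ci_t <- t.test(income)$conf.int

print(ci_normal)
print(ci_t)
```

In large samples the two intervals are nearly identical, because the t critical value approaches 1.96 as n grows.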

 

3.   (2 marks) Display 2 separate densities for newc where male=1 and for newc where male=0 within the same graph. Briefly describe and compare the distributions in plain language.
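A minimal sketch of overlaying two kernel densities with base R graphics follows. The data are simulated for illustration only, and the object names d_male and d_female, the colours and the title are arbitrary choices rather than requirements.

```r
# Simulated stand-in data (not the assignment data)
set.seed(1)
mydata <- data.frame(newc = runif(200, 0, 1000),
                     male = rbinom(200, 1, 0.5))

# Kernel density estimate of newc for each group
d_male   <- density(mydata$newc[mydata$male == 1])
d_female <- density(mydata$newc[mydata$male == 0])

# Draw the first density, then add the second to the same axes;
# ylim is set so that neither curve is cut off
plot(d_male, col = "blue", main = "Density of newc by gender",
     xlab = "newc ($)", ylim = range(d_male$y, d_female$y))
lines(d_female, col = "red")
legend("topright", legend = c("male = 1", "male = 0"),
       col = c("blue", "red"), lty = 1)
```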

 

4.   (3 marks) Using the newc variable test the following difference in means:

-  H0: mean(newc if male=1) = mean(newc if male=0)

-  H1: mean(newc if male=1) != mean(newc if male=0)

where the symbol “!=” means “not equals.” For the test, report the difference in means, the t-statistic, 95% confidence interval for the difference in means, and p-value. Explain whether the test rejects the null at the 5% level of significance. Provide a brief interpretation of your findings.

 

(Hint: You may apply the formulas in Lecture Note 3. To generate a new dataset consisting of a subset of observations, use the command “mydata_male=subset(mydata, male==1)”, for example.)

 

(Hint: Alternatively, you may use the command “t.test”. Tutorial 4 contains details on how you can do this.)
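The two routes in the hints can be compared side by side. The sketch below uses simulated data and assumes the standard large-sample formulas for a difference in means with unequal variances; whether these match Lecture Note 3 exactly is an assumption. The names dmean, se, tstat, ci, pval and tt are hypothetical.

```r
# Simulated stand-in data (not the assignment data)
set.seed(1)
mydata <- data.frame(newc = runif(200, 0, 1000),
                     male = rbinom(200, 1, 0.5))

mydata_male   <- subset(mydata, male == 1)
mydata_female <- subset(mydata, male == 0)

# Manual calculation: difference in means and its standard error
dmean <- mean(mydata_male$newc) - mean(mydata_female$newc)
se    <- sqrt(var(mydata_male$newc) / nrow(mydata_male) +
              var(mydata_female$newc) / nrow(mydata_female))
tstat <- dmean / se
ci    <- c(dmean - 1.96 * se, dmean + 1.96 * se)  # large-sample 95% CI
pval  <- 2 * pnorm(-abs(tstat))                   # two-sided, normal approx.

# Built-in alternative: Welch two-sample t-test (unequal variances)
tt <- t.test(mydata_male$newc, mydata_female$newc)

print(c(diff = dmean, t = tstat, p = pval))
print(ci)
print(tt)
```

The t-statistic from t.test matches the manual one, while its confidence interval and p-value differ slightly because it uses a t distribution rather than the normal approximation.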

 

5.   (2 marks) Construct a scatter plot of newc vs. income where newc is on the vertical axis and income is on the horizontal axis. Visually, does there appear to be a positive or negative relationship between newc and income? Also compute and report the correlation coefficient, corr(newc, income).
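For illustration, a scatter plot and sample correlation can be produced as below. The data are simulated, and the positive relationship built into them is an assumption made only so the plot has something to show; it says nothing about the real data.

```r
# Simulated stand-in data with an assumed (made-up) positive relationship
set.seed(1)
income <- runif(200, 20, 150)
newc   <- 300 + 2 * income + rnorm(200, sd = 100)

# First argument goes on the horizontal axis, second on the vertical axis
plot(income, newc,
     xlab = "income (thousand $)", ylab = "newc ($)",
     main = "newc vs. income")

# Sample (Pearson) correlation coefficient
r <- cor(newc, income)
print(r)
```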


6.   (2 marks) Run a single linear regression where the dependent variable is newc and the independent variable is income. Discuss your results by:

-  Interpreting the magnitude of the regression intercept estimate.

-  Interpreting the magnitude of the predicted change in newc corresponding to a $20,000 increase in annual income.

 

(Hint: To run a regression of variable y on variable x, use command “reg1=lm(y~x, data=mydata)”, where mydata is your original dataset. Then, display the regression results using command “summary(reg1)”. You may find more details in Tutorial 4.)
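The sketch below runs the regression from the hint on simulated data (the coefficients used to generate the data are arbitrary assumptions). Because income is measured in thousands of dollars, a $20,000 increase corresponds to a 20-unit change in income, so the predicted change in newc is 20 times the slope estimate. The name change_20k is hypothetical.

```r
# Simulated stand-in data (not the assignment data)
set.seed(1)
mydata <- data.frame(income = runif(200, 20, 150))
mydata$newc <- 300 + 2 * mydata$income + rnorm(200, sd = 100)

# Regress newc on income and display the results
reg1 <- lm(newc ~ income, data = mydata)
print(summary(reg1))

# Predicted change in newc for a $20,000 (= 20 units) rise in income
b          <- coef(reg1)
change_20k <- 20 * b["income"]
print(change_20k)

# Cross-check: difference between predictions 20 income units apart
p <- predict(reg1, newdata = data.frame(income = c(50, 70)))
print(p[2] - p[1])
```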

 

7.    (2 marks) Suppose someone on the data analytics team tells you to run regressions of newc on income using male and female subsamples separately. What do you find in the regressions? Briefly describe your findings and explain whether the regressions provide any insights.

 

(Hint: to generate a new dataset consisting of a subset of observations, use the command “mydata_male=subset(mydata, male==1)”, where mydata is your original dataset)
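A minimal sketch of the subsample approach in the hint, again on simulated data: the same regression is fitted to each subset and the coefficients are placed side by side for comparison. The names reg_male and reg_female are hypothetical.

```r
# Simulated stand-in data (not the assignment data)
set.seed(1)
mydata <- data.frame(income = runif(200, 20, 150),
                     male   = rbinom(200, 1, 0.5))
mydata$newc <- 300 + 2 * mydata$income + rnorm(200, sd = 100)

# Split the sample by gender
mydata_male   <- subset(mydata, male == 1)
mydata_female <- subset(mydata, male == 0)

# Fit the same regression on each subsample
reg_male   <- lm(newc ~ income, data = mydata_male)
reg_female <- lm(newc ~ income, data = mydata_female)

# Intercept and slope estimates side by side
print(rbind(male = coef(reg_male), female = coef(reg_female)))
```

Calling summary() on each fitted model also gives the standard errors, which help judge whether any gap between the two slopes is larger than sampling noise.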

 

8.   (3 marks) Suppose someone on the data analytics team tells you to run regressions of newc on income using the following subsamples separately: (i) individuals aged 20 years or younger; (ii) individuals aged 18 years or younger. What do you find and how would you explain the findings to the team?

 

(Hint: to generate a new dataset consisting of a subset of observations, use the command “mydata_agele20=subset(mydata, age<=20)”, where mydata is your original dataset)
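When regressing on narrow subsamples, it can help to check how many observations each subset contains before reading the output. The sketch below does this on simulated data in which every age from 18 to 67 appears exactly four times; that age distribution is an assumption and the real data may look quite different. The name mydata_agele18 is hypothetical and simply mirrors the hint.

```r
# Simulated stand-in data: each age from 18 to 67 appears four times
set.seed(1)
mydata <- data.frame(age    = rep(18:67, times = 4),
                     income = runif(200, 20, 150))
mydata$newc <- 300 + 2 * mydata$income + rnorm(200, sd = 100)

# Age-based subsamples
mydata_agele20 <- subset(mydata, age <= 20)
mydata_agele18 <- subset(mydata, age <= 18)

# Check the subsample sizes before running any regression
cat("age <= 20:", nrow(mydata_agele20), "observations\n")
cat("age <= 18:", nrow(mydata_agele18), "observations\n")

# Same regression on each subsample
reg_le20 <- lm(newc ~ income, data = mydata_agele20)
reg_le18 <- lm(newc ~ income, data = mydata_agele18)

# Compare estimates and standard errors across the two subsamples
print(coef(summary(reg_le20)))
print(coef(summary(reg_le18)))
```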

 

9.    (2 marks) R-code: we will review and mark your R code according to the following scheme:

•  2/2 if R code is correct and organised and commented like the solution code for the assignment.

•  1/2 if R code is correct, but hard to follow or not well commented.

•  0/2 if R code is incorrect and/or a complete mess, or not submitted.