ETC1000 / ETW1000/MCD2080 Business and Economic Statistics 2016
Semester One 2016
Examination Period
Faculty of Business & Economics
ETC1000 / ETW1000/MCD2080
Business and Economic Statistics
Question 1 (25 marks)
First, let us look at the quantity of coffee produced by these households. The table below gives descriptive statistics for the kilograms of coffee produced by households in the last 12 months.
Kilograms of coffee produced

| Statistic | Value |
| --- | --- |
| Mean | 763.5037 |
| Standard Error | 38.33008 |
| Median | 800 |
| Mode | 1000 |
| Standard Deviation | 447.0017 |
| Sample Variance | 199810.5 |
| Kurtosis | 1.797512 |
| Skewness | 0.837526 |
| Range | 2480 |
| Minimum | 20 |
| Maximum | 2500 |
| Sum | 103836.5 |
| Count | 136 |
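Several entries in this output are tied together by simple identities, which gives a quick way to confirm that each figure belongs to the label it is read against. The sketch below is only a consistency check, written in Python rather than Excel; the variable names are illustrative, and the numbers are copied from the output above.

```python
import math

# Values copied from the descriptive statistics for kilograms of coffee produced.
mean = 763.5037
standard_error = 38.33008
standard_deviation = 447.0017
sample_variance = 199810.5
value_range = 2480
minimum, maximum = 20, 2500
total = 103836.5
count = 136

# Identities that link the entries to one another.
mean_from_sum = total / count                        # Mean = Sum / Count
se_from_sd = standard_deviation / math.sqrt(count)   # Standard Error = SD / sqrt(n)
variance_from_sd = standard_deviation ** 2           # Sample Variance = SD squared
range_from_extremes = maximum - minimum              # Range = Maximum - Minimum

print(f"Mean from Sum/Count:       {mean_from_sum:.4f} (reported {mean})")
print(f"Standard Error from SD:    {se_from_sd:.5f} (reported {standard_error})")
print(f"Sample Variance from SD:   {variance_from_sd:.1f} (reported {sample_variance})")
print(f"Range from Max - Min:      {range_from_extremes} (reported {value_range})")
```

Each recomputed value agrees with the reported one up to rounding in the displayed digits.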
(a) Interpret the values for the Mean, Median and Mode. What do these three values tell you
about the shape of the distribution for coffee production?
(5 marks)
(b) Interpret the Standard Deviation. Would you say this is large? Explain your reasoning.
(3 marks)
(c) There are actually 187 households in this sample, but only 136 of them grow coffee. Households that do not produce coffee have a blank for this variable, so the Excel output above omits them from the analysis (notice the Count is 136). In some data sets, these households might have had a “0” recorded instead of a blank. If that were the case here, so that these households were included in the descriptive statistics with production = 0 kilograms, what, if anything, would you expect to happen to each of the Mean, Median, Mode and Standard Deviation? Explain your reasoning.
(6 marks)
A common standardisation used in measuring agricultural output is productivity relative to land area used. That is, a household’s productivity in coffee production can be captured by:
$$
\text{Yield} = \frac{\text{quantity of coffee produced in kilograms}}{\text{coffee land area in hectares}}
$$
(d) What does this standardisation allow us to compare? Give an example comparing 2 coffee-producing households to illustrate.
(3 marks)
(e) The following regression output estimates mean yield (kilograms per hectare).
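SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.9359 |
| R Square | 0.875909 |
| Adjusted R Square | 0.868502 |
| Standard Error | 152.4773 |
| Observations | 136 |

ANOVA

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 22154640 | 22154640 | 952.9151 | 9.05E-63 |
| Residual | 135 | 3138660 | 23249.33 | | |
| Total | 136 | 25293300 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | 0 | | | | | |
| Mean | 403.6109 | 13.07482 | | | 377.7529 | |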
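The Lower 95% and Upper 95% columns in this kind of output are formed as estimate ± (critical value × standard error), so the two bounds sit symmetrically around the coefficient. Only the lower bound for Mean appears in the output above, so the sketch below simply illustrates that arithmetic: it backs out the multiplier implied by the lower bound and mirrors the same half-width above the estimate. The function name is hypothetical, and the mirrored upper bound is an approximation limited by the rounding of the printed figures, not a value read from the output.

```python
import math


def mirrored_upper_bound(estimate, lower):
    """Reflect a lower confidence bound around the point estimate."""
    half_width = estimate - lower
    return estimate + half_width


# Values copied from the regression output for mean yield.
coefficient = 403.6109            # coefficient for Mean
coefficient_se = 13.07482         # its standard error
lower_95 = 377.7529               # Lower 95%
regression_se = 152.4773          # Standard Error under Regression Statistics
n = 136                           # Observations

half_width = coefficient - lower_95
implied_multiplier = half_width / coefficient_se   # close to 2, as expected for 95%
upper_95 = mirrored_upper_bound(coefficient, lower_95)

# The standard error of the mean is the regression standard error over sqrt(n).
se_check = regression_se / math.sqrt(n)

print(f"Half-width of the interval: {half_width:.4f}")
print(f"Implied multiplier:         {implied_multiplier:.4f}")
print(f"Mirrored upper bound:       {upper_95:.4f}")
print(f"SE recomputed from s and n: {se_check:.5f}")
```

Because the half-width is the multiplier times s/√n, the same arithmetic shows how the interval narrows as n grows and widens as the standard deviation grows.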
(i) Interpret the values for the Mean under the Lower 95% and Upper 95% columns.
(3 marks)
(ii) In neighbouring South-East Asian countries, coffee yield averages around 1000
kilograms per hectare. What does your answer to (i) tell you about the average yield of coffee-producing households in Timor-Leste compared to other countries?
(2 marks)
(iii) The confidence interval you discussed in (i) is quite a wide interval. Explain
intuitively the role of sample size (n) and standard deviation (σ) in determining the width of a confidence interval.
(3 marks)
Question 2 (15 marks)
Surveyed households were asked about their sources of income in the last 12 months. Responses are shown in the bar chart below.
(a) Provide an interpretation of the 73% and the 37% bars in this chart; that is, explain what
these values are measuring.
(2 marks)
(b) What can you say about the income dependency of households in this district on coffee?
Explain how you drew your conclusion.
(2 marks)
(c) Explain why a pie chart would be an inappropriate way to display this information.
(2 marks)
Many households in Timor-Leste suffer from a shortage of food. It has been argued that growing coffee does not help with this, because households that grow coffee are using their land for a crop that is not food instead of for food crops. The following table shows the number of households that grow food crops, by coffee-growing status.
| | Grows coffee | Does not grow coffee | Total |
| --- | --- | --- | --- |
| Grows food crops | 70 | 18 | 88 |
| Does not grow food crops | 66 | 33 | 99 |
| Total households | 136 | 51 | 187 |
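Before working with a two-way table like this one, it is worth confirming that the row and column totals agree with the individual cells. The sketch below does only that check; the dictionary layout and variable names are illustrative, and the counts are copied from the table.

```python
# Cell counts copied from the table: rows are food-crop status,
# columns are coffee-growing status.
counts = {
    "Grows food crops": {"Grows coffee": 70, "Does not grow coffee": 18},
    "Does not grow food crops": {"Grows coffee": 66, "Does not grow coffee": 33},
}

# Row totals: sum across the coffee-status columns.
row_totals = {row: sum(cells.values()) for row, cells in counts.items()}

# Column totals: sum down the food-crop rows.
column_totals = {}
for cells in counts.values():
    for column, value in cells.items():
        column_totals[column] = column_totals.get(column, 0) + value

grand_total = sum(row_totals.values())

print("Row totals:   ", row_totals)
print("Column totals:", column_totals)
print("Grand total:  ", grand_total)
```

The recomputed margins match the Total row and Total column, and the grand total matches the 187 households in the sample.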
(d) What is the probability a household does not grow food crops?
(2 marks)
(e) What is the probability a household does not grow food crops but grows coffee?
(2 marks)
(f) What is the probability a coffee-growing household grows food crops?
(2 marks)
(g) What does the table suggest about whether the growing of coffee restricts the growing of
food crops? Explain how you drew this conclusion.
(3 marks)
Question 3 (30 marks)
A regression model was estimated to understand why some households have better coffee yields than others. Variables are defined as follows:
Dependent variable: Coffee yield in kilograms per hectare

Explanatory variables:

Age of trees = age of household’s coffee trees, in years
Maintains trees = 1 if the household regularly prunes and maintains their coffee trees; = 0 otherwise
Zone 1 = 1 if the household is located in zone 1 within the district; = 0 otherwise
Zone 2 = 1 if the household is located in zone 2 within the district; = 0 otherwise
Zone 3 = 1 if the household is located in zone 3 within the district; = 0 otherwise
N.B. This coffee-growing district is divided into 3 geographical zones.
Regression output follows.
SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.336666 |
| R Square | 0.113344 |
| Adjusted R Square | 0.08627 |
| Standard Error | 145.7519 |
| Observations | 136 |

ANOVA

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 4 | 355747.6 | 88936.89 | 4.186525 | 0.003188 |
| Residual | 131 | 2782912 | 21243.61 | | |
| Total | 135 | 3138660 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | 336.4716 | | | | 271.5169 | 401.4263 |
| Age of trees | -0.73468 | 0.510922 | | | -1.7454 | 0.27605 |
| Maintains trees | 77.81872 | 31.03179 | | | 16.43044 | 139.207 |
| Zone 2 | 90.79857 | 35.06785 | | | 21.426 | 160.1712 |
| Zone 3 | 36.37211 | 29.88624 | | | -22.75 | 95.49422 |
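The summary figures in this output are linked through the ANOVA table: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, R Square is the regression share of the total sum of squares, and the Standard Error is the square root of the residual mean square. The sketch below recomputes them as a consistency check on the printed numbers. It is an illustration in Python rather than the Excel procedure that produced the output, and the variable names are illustrative.

```python
import math

# Values copied from the regression output for coffee yield.
n = 136                  # Observations
k = 4                    # explanatory variables in the model (Regression df)
ss_regression = 355747.6
ss_residual = 2782912
ss_total = 3138660

df_regression = k
df_residual = n - k - 1  # 131, matching the Residual df

# Mean squares and the F statistic.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Goodness-of-fit measures.
r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
standard_error = math.sqrt(ms_residual)

print(f"MS Regression:     {ms_regression:.2f}")
print(f"MS Residual:       {ms_residual:.2f}")
print(f"F:                 {f_stat:.4f}")
print(f"R Square:          {r_square:.6f}")
print(f"Adjusted R Square: {adjusted_r_square:.5f}")
print(f"Standard Error:    {standard_error:.4f}")
```

The recomputed values agree with the reported ones up to rounding, which confirms how the flattened ANOVA figures map onto rows and columns.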
(a) Interpret the estimated coefficient for the intercept and Age of trees. Explain whether these
values make sense.
(4 marks)
(b) Consider the coefficient for Maintains trees.
(i) Interpret the estimated coefficient.
(2 marks)
(ii) Perform a hypothesis test to see whether households that maintain their trees
experience better yields. Use a critical value approach: the value from the Student’s t distribution you need is 1.66.
(5 marks)
(iii) Currently few coffee-growing households in this district prune and maintain their coffee trees (around 20%), and it has been suggested that a program is needed to address this. From a practical point of view, is maintaining trees a key to substantially increased yields? Explain your reasoning.
(2 marks)
(c) The critical value of 1.66 you used in part (b) above is different to the critical value you would have obtained from a Standard Normal distribution.
(i) Why do we use Student’s t critical value? Intuitively, why would you expect the Student’s t critical value to be larger than the Normal distribution value?
(3 marks)
(ii) Under what circumstances would values from the Student’s t and Normal distributions
be virtually the same?
(1 mark)
(iii) For the test in (b), even though we may not know the appropriate critical value for the Normal distribution, we can tell whether the outcome of the test would be any different if the Normal critical value was used. Explain how we can tell in this case, and whether the outcome would change.
(2 marks)
(d) Next consider the Zone coefficients.
(i) Interpret the estimated coefficients for Zone 2 and Zone 3.
(4 marks)
(ii) What do the p-values for the Zone dummies tell you about differences in coffee yields
in different locations?
(3 marks)
(iii) Suggest a reason why we might see differences in coffee yield by location.
(2 marks)
(e) Agricultural scientists assure us that coffee production is strongly related to the age of the tree; however, Age of trees is not significant in the model. Suggest a possible explanation for why the model does not find this effect.
(2 marks)
Question 4 (16 marks)
Coffee is a tree crop that is harvested once per year. This puts a significant labour burden on households in harvest months of the year. It also means that coffee income is only earned for some months of the year.
The survey recorded the coffee harvesting activities of households on a monthly basis over the last 3 years (2012-2015). In any particular month of the last 3 years, therefore, we know the proportion of households in the district that were harvesting coffee.
The regression output below estimates the proportion of households engaged in coffee harvest as a function of the year and month:
Time = 1 in January 2012
= 2 in February 2012
= 3 in March 2012
...
= 36 in December 2015

= 1 if the month is January
= 1 if the month is February
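A model of this form combines a running time index with one 0/1 dummy variable per month. The ANOVA table in the output that follows reports 12 regression degrees of freedom for 36 observations, which is consistent with Time plus 11 monthly dummies and one month left out as the base category. That reading is an inference from the output, and the choice of December as the base month below is purely an assumption for illustration, as are the names `MONTHS`, `BASE_MONTH` and `design_row`.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
BASE_MONTH = "December"  # assumption: the omitted (base) month is not stated


def design_row(time):
    """Return [Time, dummy_1, ..., dummy_11] for a 1-based monthly index."""
    month = MONTHS[(time - 1) % 12]  # Time = 1 is a January
    dummies = [1 if month == name else 0 for name in MONTHS if name != BASE_MONTH]
    return [time] + dummies


# One row per month over three years of monthly observations.
rows = [design_row(t) for t in range(1, 37)]

print("Observations:", len(rows))
print("Regressors per row:", len(rows[0]))
print("Time = 1 :", rows[0])
print("Time = 2 :", rows[1])
print("Time = 36:", rows[35])
```

With this layout, the Time coefficient captures the trend across the 36 months, while each dummy coefficient measures how that month differs from the base month.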
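SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.963212 |
| R Square | 0.927777 |
| Adjusted R Square | 0.890095 |
| Standard Error | 0.089332 |
| Observations | 36 |

ANOVA

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 12 | 2.357792 | 0.196483 | 24.62138 | 2.77E-10 |
| Residual | 23 | 0.183544 | 0.00798 | | |
| Total | 35 | 2.541336 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |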