ETF1100 BUSINESS STATISTICS 2019
Semester Two 2019
ETF1100
BUSINESS STATISTICS – PAPER 1 OF 1
Question 1 (18 marks)
The Census data reports information about weekly incomes of households in the format shown in the table below. This is data for single-parent households in Melbourne in 2016.
Income Range | Number of Households | Cumulative Number of Households | Cumulative % of Households
Negative/Nil income | 4,979 | 4,979 | 3.23%
$1-$149 | 1,921 | 6,900 | 4.47%
$150-$299 | 5,254 | 12,154 | 7.88%
$300-$399 | 5,733 | 17,887 | 11.59%
$400-$499 | 8,952 | 26,839 | 17.39%
$500-$649 | 14,475 | 41,314 | 26.77%
$650-$799 | 15,715 | 57,029 | 36.96%
$800-$999 | 17,688 | 74,717 | 48.42%
$1,000-$1,499 | 33,628 | 108,345 | 70.21%
$1,500-$1,999 | 20,354 | 128,699 | 83.40%
$2,000-$2,499 | 12,629 | 141,328 | 91.59%
$2,500-$2,999 | 5,312 | 146,640 | 95.03%
$3,000-$3,999 | 5,014 | 151,654 | 98.28%
$4,000 or more | 2,654 | 154,308 | 100.00%
Grand Total | 154,308 | |
(a). If you want to calculate the mean weekly income for these households, what are some of the problems you encounter when income is presented as a range? What approximations would you need to make? (2 marks)
(b). Now we want to calculate the median income for these households.
i. What is the definition of the median? (1 mark)
ii. Based on the table above, describe how you would calculate the median income for these households. (2 marks)
iii. Give an approximate estimate for the median using the method outlined in the previous question. (1 mark)
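As a minimal sketch of the kind of calculation part (b) is asking about, the Python below finds the income class containing the middle household and interpolates linearly within it. The counts come from the table above. The assumption that incomes are spread evenly within a class, the placeholder bounds for the first and last rows, and the names `classes` and `grouped_median` are illustrative choices, not a prescribed method.

```python
# Each entry: (lower bound in $, class width in $, number of households).
# Counts are taken from the table; bounds for the first and last rows are
# placeholders (assumption), since those classes have no clear numeric range.
classes = [
    (0, 1, 4979),        # Negative/Nil income (placeholder bounds)
    (1, 149, 1921),
    (150, 150, 5254),
    (300, 100, 5733),
    (400, 100, 8952),
    (500, 150, 14475),
    (650, 150, 15715),
    (800, 200, 17688),
    (1000, 500, 33628),
    (1500, 500, 20354),
    (2000, 500, 12629),
    (2500, 500, 5312),
    (3000, 1000, 5014),
    (4000, 1000, 2654),  # "$4,000 or more" (placeholder width)
]


def grouped_median(classes):
    """Estimate the median by linear interpolation inside the median class."""
    total = sum(count for _, _, count in classes)
    target = total / 2  # position of the middle household
    cumulative = 0
    for lower, width, count in classes:
        if cumulative + count >= target:
            # Assume households are spread evenly across the class.
            fraction = (target - cumulative) / count
            return lower + fraction * width
        cumulative += count
    raise ValueError("no classes supplied")


estimate = grouped_median(classes)
print(f"Estimated median weekly income: ${estimate:,.0f}")
```

The cumulative percentage passes 50% between the $800-$999 row (48.42%) and the $1,000-$1,499 row (70.21%), so the estimate lands in the $1,000-$1,499 class. A different within-class assumption would shift the figure slightly.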
(c). The mean for this data has been found to be approximately $1,302 per week. The median you found in part (b) should be somewhat smaller than the mean of $1,302. What does this tell us about the shape of the distribution of the data set? Give some intuition for why this shape produces a mean that is larger than the median. (3 marks)
(d). The sample standard deviation of income across households has been calculated for this set of data and is $924.
i. Write down the formula for the sample standard deviation. (2 marks)
ii. Explain the sample standard deviation formula in words and how it measures the spread of the data. (3 marks)
iii. Another measure of the spread/dispersion of the data is the range. Explain how the range is calculated and why the standard deviation is generally a better measure of spread/dispersion than the range. (2 marks)
iv. It is common to construct a confidence interval based on +/- 2 standard deviations either side of the mean. This usually covers 95% of the data if the data is normally distributed. For our data, with a mean of $1,302, a standard deviation of $924 and the median you estimated earlier, what is problematic about this approach? (2 marks)
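To make part iv concrete, the short Python sketch below computes the mean plus or minus two standard deviations using only the figures quoted in the question ($1,302 and $924). The variable names are illustrative.

```python
mean_income = 1302  # mean weekly income quoted in the question ($)
std_dev = 924       # sample standard deviation quoted in the question ($)

# Interval of two standard deviations either side of the mean.
lower = mean_income - 2 * std_dev
upper = mean_income + 2 * std_dev

print(f"Interval: {lower:,} to {upper:,} dollars per week")
print("Lower bound is negative:", lower < 0)
```

Comparing the two bounds with the income ranges in the table is one way to start thinking about whether a symmetric interval suits this data.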
Question 2 (18 marks)
In order to analyse the data on poverty rates amongst households in the Melbourne region in 2016 we have created the data set shown in the snapshot below.
The snapshot shows the first 10 data points. The variables are defined as follows:
• Household Type = either “Single Parent” or “Two Parents”.
• Number of Children = the number of children in the household.
• Number of Households = the number of households in the Melbourne region in 2016 that fall into the category.
• Poverty Status = “Yes” means the household is in poverty and “No” means they are not in poverty.
(a). From the data above we have produced two pivot tables.
The first pivot table, shown below, illustrates the number of Melbourne households in total in each category (household type and number of children) in 2016.
The second pivot table, shown below, shows the number of Melbourne households which are in poverty in each category (household type and number of children) in 2016.
i. Using these two pivot tables, show how you could calculate the poverty rate in this period for Melbourne single-parent households with two children. (2 marks)
ii. Using these two pivot tables, show how you could calculate the overall poverty rate in Melbourne in this period. (2 marks)
iii. Below we report certain probabilities:
P( Poverty = Yes | Household Type = Single Parent ∩ Number of Children = 1 ) = 0.36
P( Poverty = Yes | Household Type = Single Parent ∩ Number of Children = 4 ) = 0.85
P( Poverty = Yes | Household Type = Two Parents ∩ Number of Children = 1 ) = 0.13
P( Poverty = Yes | Household Type = Two Parents ∩ Number of Children = 4 ) = 0.66
Interpret these probabilities and discuss how they vary with the number of children and the household type. Give some intuition for these patterns. (4 marks)
(b). The pivot table below focuses just on households in poverty and is presented as “% of Total”.
i. What does the value 11.94% mean in the pivot table? (1 mark)
ii. What is the probability that a household in poverty has two parents and one child? (1 mark)
iii. What is the probability that a household in poverty has more than two children? (2 marks)
(c). The pivot table below focuses just on households in poverty and is presented as “% of Column”.
i. What does the value 57.70% mean in the pivot table? (1 mark)
ii. What does the value 22.76% mean in the pivot table? (1 mark)
iii. Suppose you are interested in investigating whether the number of children is independent of whether you have 1 or 2 parents using the data in the pivot table above. If these two variables were independent then write down a probability statement that you would expect to hold. (2 marks)
iv. Use data from the pivot table above to show that number of parents and number of children are NOT independent. (2 marks)
Question 3 (20 marks)
(a). The following table shows trends in the composition of households in Melbourne over the 5-yearly censuses from 1981 to 2016.
Census Year | Number of One-Parent Households | Number of Two-Parent Households | Total Number of Households | % of One-Parent Households
1981 | 83,968 | 276,580 | 360,549 | 24.88
1986 | 94,668 | 308,105 | 402,773 | 24.43
1991 | 106,145 | 339,477 | 445,622 | 24.54
1996 | 117,683 | 375,428 | 493,111 | 24.80
2001 | 131,898 | 413,677 | 545,576 | 24.61
2006 | 146,220 | 462,517 | 608,737 | 24.02
2011 | 161,212 | 505,943 | 667,155 | 24.16
2016 | 174,769 | 563,155 | 737,924 | 23.68
i. Show how to calculate the average annual growth in the number of families in Melbourne between 1981 and 2016. (2 marks)
ii. Calculate the percentage of two-parent households in Melbourne in 2016. (1 mark)
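One possible way to set up the two calculations in part (a) is sketched below in Python, using the 1981 and 2016 rows of the table. It assumes that "average annual growth" means the compound annual growth rate over the 35 years; a simple arithmetic average of the total change would give a different figure. Variable names are illustrative.

```python
total_1981 = 360_549       # total households, 1981 census
total_2016 = 737_924       # total households, 2016 census
two_parent_2016 = 563_155  # two-parent households, 2016 census
years = 2016 - 1981

# Compound annual growth rate (assumed reading of "average annual growth").
cagr = (total_2016 / total_1981) ** (1 / years) - 1

# Share of two-parent households in 2016.
two_parent_share = two_parent_2016 / total_2016

print(f"Average annual growth: {cagr:.2%}")
print(f"Two-parent share in 2016: {two_parent_share:.2%}")
```

The two-parent share can also be cross-checked as 100 minus the "% of One-Parent Households" entry for 2016.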
(b). It is claimed that there has been a fall in the number of one-parent families in Melbourne since 1981. In order to investigate this we have estimated a regression model with a linear time trend in “Census Year” with the dependent variable being “% of One-Parent Households”. The results are shown below:
i. Should seasonal dummy variables have been included in this time series regression model? (2 marks)
ii. Interpret the coefficient for the intercept in this regression model and discuss whether it is meaningful. (2 marks)
iii. Interpret the coefficient for Census Year in this regression model. Is the sign consistent with the claim in the question? (2 marks)
iv. Undertake a hypothesis test at the 5% significance level of whether there has been a change in the proportion of one-parent households over the time span examined. (4 marks)
v. How would your conclusion have differed if you had chosen a 1% significance level? (2 marks)
vi. Show how to calculate the predicted value for the proportion of one-parent households in 2050. (2 marks)
vii. How reliable do you think your prediction in the previous question will be? Justify your answer. (3 marks)
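The regression output referred to in part (b) is not reproduced in this copy of the paper. As a rough illustration of how a linear time trend and a 2050 prediction can be obtained, the Python sketch below refits the trend by ordinary least squares on the eight "% of One-Parent Households" values in the table. This is an assumption about how the model was estimated, so the coefficients may not match the exam's output exactly. All names are illustrative.

```python
# Census years and "% of One-Parent Households" from the table in part (a).
years = [1981, 1986, 1991, 1996, 2001, 2006, 2011, 2016]
pct_one_parent = [24.88, 24.43, 24.54, 24.80, 24.61, 24.02, 24.16, 23.68]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(pct_one_parent) / n

# Ordinary least squares with a single regressor:
# slope = S_xy / S_xx, intercept = mean_y - slope * mean_x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, pct_one_parent))
s_xx = sum((x - mean_x) ** 2 for x in years)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Predicted value: plug the target year into the fitted line.
prediction_2050 = intercept + slope * 2050

print(f"Slope per year: {slope:.4f}")
print(f"Intercept: {intercept:.2f}")
print(f"Predicted % of one-parent households in 2050: {prediction_2050:.2f}")
```

The intercept here is the fitted value at year 0, and 2050 lies 34 years beyond the last census in the sample, both of which are relevant to parts ii and vii.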