
Semester One 2020

Exam - Alternative Assessment Task

ETF1100

Business Statistics

Question 1 (18 marks)

Exhibit 1 presents descriptive statistics for the age of the people who died in accidents. The first set covers the whole period from January 1989 to February 2020, the second set covers the period from January 1989 to December 2003, and the last column is for the period January 2004 to February 2020.

Exhibit 1

Statistic             Full Sample   1989-2003      2004-2020
Mean                  39.542616     37.64256177    42.07816773
Standard Error        0.0963247     0.126064411    0.14748453
Median                34            31             38
Mode                  18            18             18
Standard Deviation    21.796212     21.56438527    21.8456404
Sample Variance       475.07485     465.022712     477.2320043
Kurtosis              -0.619564     -0.524647304   -0.694967519
Skewness              0.5732953     0.659271235    0.473556606
Range                 110           108            110
Minimum               -9            -9             -9
Maximum               101           99             101
Sum                   2024661       1101459        923195
Count                 51202         29261          21940

(a)  First, we will focus on the mean, median and mode over the three periods:

(i) Compare the mean and median for the full sample. What do they suggest about the shape of the distribution of ages? Explain how you draw that conclusion about the shape based on how the mean and median are calculated. (3 marks)

In the full sample the mean of 39.54 is larger than the median of 34. [1 mark]

This suggests that the distribution of Age is somewhat positively skewed. [1 mark]

The median is the middle value, while the mean is the sum of the values divided by the number of observations. The mean is therefore more affected by very large or very small values, whereas the median is less affected by outliers. In this case it seems there is a small number of large values pulling the mean above the median. [1 mark]
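As a rough illustration of this point, the following Python sketch uses a small made-up list of ages (an assumption for illustration, not the exam data) to show that a single large value moves the mean much more than it moves the median.

```python
import statistics

# Hypothetical ages for illustration only; these are not the exam data.
ages = [18, 18, 22, 25, 30, 34, 40, 55, 70, 95]

mean_all = statistics.mean(ages)      # sum of the values / number of values
median_all = statistics.median(ages)  # middle value of the sorted data

# Remove the single largest age and recompute both measures.
trimmed = sorted(ages)[:-1]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print("with the largest value:   ", mean_all, median_all)
print("without the largest value:", mean_trimmed, median_trimmed)
```

In this made-up sample the mean sits above the median, and dropping the largest age shifts the mean by more than it shifts the median, which is the pattern associated with positive skew.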

(ii) Compare the means across the three periods. What does this suggest about trends over time? (2 marks)

The mean age is 37.64 in the time span from 1989-2003, it is 42.08 in the second time span from 2004-2020 and it is 39.54 over the entire sample. [1 mark]

The rise in the mean indicates that the age of people killed in car accidents is increasing over time. [1 mark]

(iii) What does the mode measure? Explain whether you think it is an informative measure in our case. (2 marks)

The mode is the most common value. [1 mark]

The mode tends not to be useful when the variable for which it is being calculated takes a large number of values. In the case of age, this variable takes quite a number of values (this can be seen by the range). This means that the mode might be quite unstable. But the fact that it is stable across each of the samples is instructive. It indicates that the most common death on our roads is a young person aged 18. [1 mark]

(b)  Look at the standard deviation in the table of summary statistics above.

(i) Describe in words how the standard deviation is calculated from the mean and the data values. (1 mark)

The standard deviation is the square root of the average of the squared deviations of the data values from the mean. [1 mark]
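The calculation described above can be sketched in Python as follows. This is a minimal sketch under two assumptions: the ages are made up for illustration, and the sum of squared deviations is divided by n - 1, which is what the "Sample Variance" label in Exhibit 1 suggests but the exam does not state explicitly. The function name `sample_std` is hypothetical.

```python
import math
import statistics

def sample_std(values):
    """Square root of the (n - 1)-averaged squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = [(x - mean) ** 2 for x in values]
    variance = sum(squared_devs) / (n - 1)  # assumption: sample variance
    return math.sqrt(variance)

# Hypothetical ages for illustration only; these are not the exam data.
ages = [18, 25, 34, 40, 63]

by_hand = sample_std(ages)
library = statistics.stdev(ages)  # standard-library sample standard deviation
print(by_hand, library)
```

With more than 51,000 observations, dividing by n or by n - 1 makes almost no difference to the result.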

(ii) Using the full sample, interpret what the value of the standard deviation tells you about the spread of ages. (1 mark)

The standard deviation for the full sample tells us that the age of someone who died typically differed from the mean age by around 21.80 years. [1 mark]

(iii) If the data was normally distributed, you would expect approximately 95% of the data to be within two standard deviations of the mean. Calculate that range for the full sample. Comment on the values you find and whether they are likely to be accurate. If not, why not? (3 marks)

In this case the range is 39.54 +/- 2 * 21.80 = 39.54 +/- 43.60, which runs from about -4 to 83 years of age. [1 mark]

This interval includes a negative age which is clearly not possible. [1 mark]

This range is unlikely to be accurate because the data is probably not normally distributed. In particular it is not symmetric around the mean as the mean is different from the median. [1 mark]
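The interval calculation can be checked with a few lines of Python. This sketch only uses the full-sample mean and standard deviation from Exhibit 1; the variable names are chosen here for illustration.

```python
# Full-sample values taken from Exhibit 1.
mean = 39.542616
sd = 21.796212

# Mean plus or minus two standard deviations.
lower = mean - 2 * sd
upper = mean + 2 * sd

print(round(lower, 2), round(upper, 2))

# A negative lower bound is impossible for an age, which is one sign that
# the normal approximation does not fit this data well.
print("lower bound is negative:", lower < 0)
```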

(c)   Refer back to Exhibit 1 and answer the following questions.

(i) The minimum value in Exhibit 1 for each of the samples is -9. This was not an actual age, that is not possible, but an indicator for a missing value of age. How would including this in our calculations have changed the mean, median and the mode? (2 marks)

It would lower the mean and the median. [1 mark]

It would not change the mode. [1 mark]

(ii) In Exhibit 1 it appears that the average age of road fatalities is increasing over time. How could a different proportion of missing values, i.e. -9, over the two periods examined contribute to this result? (2 marks)

If there are more missing values in the first period from 1989-2003 than in the second period 2004-2020 then this could pull the average down more in the first period than the second period. [2 marks]

(iii) We reexamine the data and find that there are only 85 observations with -9 for Age. Is including these values likely to have a large impact on the mean, median or mode? Provide some justification for your answer. (2 marks)

No. [1 mark]

The data includes 51,202 observations overall and 85 observations is a very small proportion of that total number. [1 mark]
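The size of the effect on the mean can be estimated from the Sum and Count rows of Exhibit 1. The sketch below assumes that all 85 missing-value codes are included in the full-sample Sum and Count, which the question implies but does not state. The median and mode cannot be recomputed from summary statistics alone, so only the mean is checked.

```python
# Full-sample values taken from Exhibit 1.
total_sum = 2024661
total_count = 51202

# From part (c)(iii): 85 observations are coded -9 for a missing age.
n_missing = 85
missing_code = -9

mean_with_codes = total_sum / total_count

# Remove the 85 codes from both the sum and the count.
clean_sum = total_sum - n_missing * missing_code
clean_count = total_count - n_missing
mean_without_codes = clean_sum / clean_count

shift = mean_without_codes - mean_with_codes
share_missing = n_missing / total_count

print(f"mean with codes:    {mean_with_codes:.4f}")
print(f"mean without codes: {mean_without_codes:.4f}")
print(f"shift in the mean:  {shift:.4f} years")
print(f"share of sample:    {share_missing:.4%}")
```

Under that assumption the codes make up less than 0.2% of the sample and lower the mean by less than a tenth of a year, which supports the answer above.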

Question 2 (16 marks)

(a)  Exhibit 2 shows the age distribution of fatalities each year from 1989 to 2019.

Exhibit 2


(i) What does the 11.0% in the first row of the table mean? (1 mark)

This means that 11.0% of those who died in car accidents in 1989 were 65 years or older. [1 mark]

(ii) Comment on the main trends for each age range that are shown in this table. (3 marks)

The proportion of those killed in road accidents who were 17-25 years old has declined over time. It was 34.1% in 1989 and 18.2% in 2019. [1 mark]

The proportion who were 26-64 years old has generally remained fairly steady. It was 54.9% in 1989 and 58.4% in 2019. [1 mark]

The proportion who were 65 or older has increased over time. It was 11.0% in 1989 and 23.4% in 2019. [1 mark]