PAST FINAL EXAMINATION 2

STAT7055 Introductory Statistics for Business and Finance

Question 1 [11 marks]


A random sample of 500 car batteries was taken and the life of each battery was measured. Letting X denote battery life in years, suppose that the sample revealed the following distribution of battery life:

Life (in years)    Frequency
X ≤ 1                     12
1 < X ≤ 2                 94
2 < X ≤ 3                170
3 < X ≤ 4                188
4 < X ≤ 5                 28
5 < X                      8
Total                    500

Based on these data, test whether battery life follows a normal distribution with µ = 2.8 and σ² = 1.12. Clearly state your hypotheses and use a significance level of α = 5%.
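The arithmetic for this goodness-of-fit test could be set up along the following lines. This is only a sketch: it assumes the class boundaries are the integers 1 to 5 with open-ended first and last classes, it takes the variance exactly as printed (1.12), and the helper names `expected_counts` and `chi_square_statistic` are hypothetical, not part of the exam. The critical value still has to be read from a chi-square table.

```python
from statistics import NormalDist


def expected_counts(upper_bounds, mean, variance, n):
    """Expected class counts under N(mean, variance); the last class is open-ended."""
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    cdf = [dist.cdf(b) for b in upper_bounds] + [1.0]
    probs = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))]
    return [n * p for p in probs]


def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [12, 94, 170, 188, 28, 8]
expected = expected_counts([1, 2, 3, 4, 5], mean=2.8, variance=1.12, n=sum(observed))
statistic = chi_square_statistic(observed, expected)

# Both parameters are specified in the hypothesis rather than estimated,
# so the degrees of freedom are (number of classes - 1).
df = len(observed) - 1
```

Because µ and σ² are given rather than estimated from the sample, no further degrees of freedom are subtracted. It is also worth checking that every expected count is at least 5 before relying on the chi-square approximation.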

Question 2 [15 marks]

Let X₁, X₂, …, Xₙ denote a random sample from a population with mean µ and variance σ². That is, the Xᵢ's are independent and each Xᵢ has E(Xᵢ) = µ and V(Xᵢ) = σ². Consider the following three estimators for µ:


µ̂₁ = (1/2)(X₁ + X₂),

µ̂₂ = (1/4)X₁ + (X₂ + … + Xₙ₋₁) / (2(n − 2)) + (1/4)Xₙ,

µ̂₃ = X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ.

(a) [6 marks] Show that each of the three estimators is unbiased. Show all working.

(b) [6 marks] Calculate the variance of each estimator. Show all working.

(c) [3 marks] Based on your answers to part (b), determine whether any of the three estimators is consistent and give an explanation.
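A general identity may help organise the working for parts (a) and (b). The weights a_i below are notation introduced here for illustration, not part of the exam: any estimator that is a fixed-weight linear combination of the sample satisfies

```latex
\[
\hat{\mu} = \sum_{i=1}^{n} a_i X_i
\quad\Longrightarrow\quad
E(\hat{\mu}) = \mu \sum_{i=1}^{n} a_i ,
\qquad
V(\hat{\mu}) = \sigma^2 \sum_{i=1}^{n} a_i^2 .
\]
```

The variance formula relies on the independence of the Xᵢ's. Such an estimator is unbiased exactly when its weights sum to 1, and for an unbiased estimator a variance that tends to 0 as n → ∞ is sufficient for consistency.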


Question 3 [26 marks]

A plant manager, in deciding whether to purchase a machine of design A or design B, checks the times for completing a certain task on each machine. Eight random times were recorded for each machine and the times (in seconds) are displayed in the table below.

Machine   Times (in seconds)
           1    2    3    4    5    6    7    8
A         32   40   42   26   35   29   45   22
B         30   39   42   23   36   27   41   21

(a) [5 marks] Suppose the population variances for the completion times for machines A and B are known to be σA² = 65 and σB² = 69. Test whether there is a significant difference in the completion times of the two machines. Clearly state your hypotheses and use a significance level of α = 5%.

(b) [4 marks] If the true value of µA − µB is 2.3, calculate the probability of making a type II error for the test in part (a) at a significance level of α = 5%.

For parts (c), (d) and (e), suppose now that the true population variances for the completion times for machines A and B were actually unknown.

(c) [5 marks] Test whether the two population variances are equal. Clearly state your hypotheses and use a significance level of α = 1%.

(d) [4 marks] Test whether machine A is not slower than machine B. Clearly state your hypotheses and use a significance level of α = 5%.

(e) [2 marks] Calculate an 80% confidence interval for the difference in mean completion times, µA − µB.

Suppose now that the 8 random times recorded for each machine were not collected independently. Specifically, for each column in the table of completion times, the time for machine A and the time for machine B were recorded by the same technician.

(f) [6 marks] Based on this new information regarding how the samples were collected, test whether machine A is not slower than machine B. Clearly state your hypotheses and use a significance level of α = 5%.
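The test statistics behind parts (a), (c) and (f) can be computed directly from the table of times. The sketch below is illustrative only: the variable names are hypothetical, the known variances 65 and 69 are assumed to belong to machines A and B respectively, and the decision rules (critical values, direction of each alternative) are left to the reader.

```python
import math
from statistics import mean, variance

times_a = [32, 40, 42, 26, 35, 29, 45, 22]
times_b = [30, 39, 42, 23, 36, 27, 41, 21]
n = len(times_a)

# Difference in sample means, used in parts (a), (d) and (e).
mean_diff = mean(times_a) - mean(times_b)

# Part (a): z statistic with known population variances (assumed A = 65, B = 69).
z_stat = mean_diff / math.sqrt(65 / n + 69 / n)

# Part (c): ratio of sample variances (larger over smaller) for the F test.
s2_a = variance(times_a)
s2_b = variance(times_b)
f_ratio = max(s2_a, s2_b) / min(s2_a, s2_b)

# Part (f): paired (matched pairs) t statistic on the differences A - B.
diffs = [a - b for a, b in zip(times_a, times_b)]
d_bar = mean(diffs)
s_d = math.sqrt(variance(diffs))
t_paired = d_bar / (s_d / math.sqrt(n))
df_paired = n - 1
```

Pairing removes the technician-to-technician variation from the standard error, which is why the paired statistic can differ sharply from the independent-samples statistic even though the difference in sample means is the same.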


Question 4 [18 marks]

Four groups of students were subjected to different teaching techniques and tested at the end of a specified period of time. The test scores for each student are listed in the table below. The total sum of squares for all test scores was calculated to be 1939.8333.

Student Group
 1    2    3    4
65   75   59   94
87   69   78   89
73   83   67   80
79   81   62   88
81   72   83   83
69   79   76   90

(a) [4 marks] Calculate the sum of squares for treatment for a one-way ANOVA applied to this data with teaching technique as the factor.

(b) [4 marks] Using a one-way ANOVA, test whether there is a difference in mean test scores for the four teaching techniques. Clearly state your hypotheses and use a significance level of α = 2.5%.

It turns out that in the table above, the first 3 test scores for each teaching technique were from students who were taught in first semester and the last 3 test scores for each teaching technique were from students who were taught in second semester. A two-way ANOVA was applied to this data and the ANOVA table is displayed below.

Source               Sum of squares   Degrees of freedom   Mean squares   F
Teaching Technique   ?                ?                    ?              ?
Semester             24               ?                    ?              ?
Interaction          ?                ?                    ?              ?
Error                1011.3333        ?                    ?
Total                1939.8333        ?

(c) [4 marks] Test whether there is an interaction between the teaching technique and the semester in which it was taught. Clearly state your hypotheses and use a significance level of α = 5%.

(d) [3 marks] Test whether there is a difference in mean test scores for the four teaching techniques. Clearly state your hypotheses and use a significance level of α = 5%.

(e) [3 marks] Test whether there is a difference in mean test scores for the two semesters. Clearly state your hypotheses and use a significance level of α = 5%.
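The sums of squares in this question can be computed from group, semester and cell totals. The sketch below assumes the score table is read as six rows (students) by four columns (teaching-technique groups), with the first three rows taken in first semester and the last three in second semester. All variable names are hypothetical. Under that reading the totals reproduce the stated figures 1939.8333, 24 and 1011.3333.

```python
# Rows are students, columns are teaching-technique groups 1 to 4.
# Assumption: rows 1-3 are first semester, rows 4-6 are second semester.
scores = [
    [65, 75, 59, 94],
    [87, 69, 78, 89],
    [73, 83, 67, 80],
    [79, 81, 62, 88],
    [81, 72, 83, 83],
    [69, 79, 76, 90],
]
semesters = [scores[:3], scores[3:]]

values = [x for row in scores for x in row]
n_total = len(values)
correction = sum(values) ** 2 / n_total  # (grand total)^2 / N


def ss_from_totals(totals, obs_per_total):
    """Sum of squares for a set of totals, each based on obs_per_total scores."""
    return sum(t * t for t in totals) / obs_per_total - correction


ss_total = sum(x * x for x in values) - correction

technique_totals = [sum(col) for col in zip(*scores)]
semester_totals = [sum(x for row in block for x in row) for block in semesters]
cell_totals = [sum(col) for block in semesters for col in zip(*block)]

ss_technique = ss_from_totals(technique_totals, 6)   # 6 scores per technique
ss_semester = ss_from_totals(semester_totals, 12)    # 12 scores per semester
ss_cells = ss_from_totals(cell_totals, 3)            # 3 scores per cell
ss_interaction = ss_cells - ss_technique - ss_semester
ss_error = ss_total - ss_cells
```

Dividing each sum of squares by its degrees of freedom gives the mean squares, and each F statistic is the corresponding mean square over the error mean square.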



Question 5 [33 marks]

A study was conducted to determine the relationship between store profits of supermarkets and the number of customers. Specifically, the average daily store profit and the average number of customers who entered the store per day were recorded for 17 supermarkets in Canberra. The data are summarised in the table below. Note that the average daily store profit was measured in hundreds of dollars.

Avg Num Cust Enter   Avg Profit   Avg Num Cust Enter   Avg Profit
196.07               130.27       201.92               138.03
193.86               130.20       204.91               138.96
196.91               134.14       203.81               140.80
198.97               135.76       211.88               147.85
197.28               136.08       207.90               144.12
200.15               138.04       211.30               146.24
200.29               138.32       211.41               148.40
200.81               137.65       212.04               147.88
202.27               137.77

Let X denote Average Number of Customers Entering and Y denote Average Profit. The following sample variances and sample correlation coefficient are provided: sX² = 35.5029, sY² = 32.8781 and rXY = 0.9766906.

(a) [5 marks] Fit the regression model Y = β₀ + β₁X + ε. That is, calculate the estimates β̂₀ and β̂₁.

(b) [4 marks] We have been told that the sum of squares for regression is SSR = 501.8117. Calculate the standard error of estimate, sε, and the coefficient of determination, R².

(c) [4 marks] Test whether there is a linear relationship between Average Profit and Average Number of Customers Entering. Clearly state your hypotheses and use a significance level of α = 5%.
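For parts (a) and (b), the estimates follow from the summary statistics once the sample means are computed from the data table. The sketch below is illustrative only: it takes 35.5029 as the sample variance of X and 32.8781 as that of Y, reads the data table as 17 (X, Y) pairs, and uses the hypothetical helper name `fit_summary`.

```python
import math
from statistics import mean

# (X, Y) pairs read from the data table: customers entering, profit.
x = [196.07, 193.86, 196.91, 198.97, 197.28, 200.15, 200.29, 200.81, 202.27,
     201.92, 204.91, 203.81, 211.88, 207.90, 211.30, 211.41, 212.04]
y = [130.27, 130.20, 134.14, 135.76, 136.08, 138.04, 138.32, 137.65, 137.77,
     138.03, 138.96, 140.80, 147.85, 144.12, 146.24, 148.40, 147.88]
n = len(x)

# Summary statistics as provided (assignment to X and Y is an assumption).
s2_x, s2_y, r_xy = 35.5029, 32.8781, 0.9766906

# Least-squares slope and intercept: b1 = r * s_Y / s_X, b0 = ybar - b1 * xbar.
b1 = r_xy * math.sqrt(s2_y / s2_x)
b0 = mean(y) - b1 * mean(x)


def fit_summary(ssr, s2_response, n_obs):
    """Return (standard error of estimate, R^2) for a simple linear regression."""
    sst = (n_obs - 1) * s2_response   # total sum of squares
    sse = sst - ssr                   # error sum of squares
    return math.sqrt(sse / (n_obs - 2)), ssr / sst


s_eps, r_squared = fit_summary(501.8117, s2_y, n)
```

The same helper applies unchanged to the second model in part (d), since the response variable (and hence the total sum of squares) is the same and only SSR differs.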

In the study, the average number of customers per day who actually purchased something was also recorded for each supermarket. A new simple linear regression model with Average Profit as the dependent variable and Average Number of Customers Purchasing as the independent variable was fitted to the data. The results of the regression model (not shown) indicated that there was a significant linear relationship between Average Profit and Average Number of Customers Purchasing. Also, the sum of squares for regression was calculated to be SSR = 502.9175.

(d) [4 marks] Calculate the standard error of estimate, sε, and the coefficient of determination, R², for this new model.

(e) [2 marks] Comparing your answers to parts (b) and (d), if you were asked to choose one model, would you choose the model with Average Number of Customers Entering as the independent variable or the model with Average Number of Customers Purchasing as the independent variable? Why?

A multiple regression model with both Average Number of Customers Entering and Average Number of Customers Purchasing as independent variables was fitted to the data. Let W denote Average Number of Customers Purchasing. The output from the regression analysis is displayed below:

Predictor    Coef       SE Coef    T      p-value
Intercept    68.6007    15.1850    4.52   0.0005
X            4.5806     3.5632     1.29   0.2195
W            5.5239     3.5650     1.55   0.1436

(f) [2 marks] Given what we already know about the linear relationship between Average Profit and Average Number of Customers Entering and the linear relationship between Average Profit and Average Number of Customers Purchasing, explain why the p-values for these two independent variables are large when they are both included in the model.

The following information was also recorded for each supermarket: an indicator variable (MALL) that equals 1 if the supermarket was located inside a shopping mall; and an indicator variable (SUN) that equals 1 if the supermarket was open on Sundays. With this additional information, the following model was also fitted to the data:

Y = β₀ + β₁W + β₂MALL + β₃SUN + β₄(W × SUN) + β₅(MALL × SUN) + ε

The sum of squares for regression and sum of squares for error were calculated to be SSR = 509.7169 and SSE = 16.3327. Some output from the regression analysis is also displayed below:

Predictor     Coef        SE Coef    T       p-value
Intercept     138.0343    53.2884    2.59    0.0251
W             1.3292      0.2477     5.37    0.0002
MALL          6.0161      3.9196     1.53    0.1531
SUN           65.9656     58.0108    1.14    0.2796
W × SUN       −0.3057     0.2707     −1.13   0.2827
MALL × SUN    4.3472      4.0946     1.06    0.3111

(g) [4 marks] Test the overall significance of the model. Clearly state your hypotheses and use a significance level of α = 5%.

(h) [2 marks] What do you conclude about the relationship between Average Profit and Average Number of Customers Purchasing? Clearly state your hypotheses and use a significance level of α = 5%.

(i) [2 marks] If possible, test whether a different intercept is needed for supermarkets that are located inside a shopping mall. Clearly state your hypotheses and use a significance level of α = 5%.

(j) [2 marks] If possible, test whether a different coefficient parameter for Average Number of Customers Purchasing is needed for supermarkets that are open on Sundays. Clearly state your hypotheses and use a significance level of α = 5%.

(k) [2 marks] If possible, test whether a different coefficient parameter for Average Number of Customers Purchasing is needed for supermarkets that are located inside a shopping mall. Clearly state your hypotheses and use a significance level of α = 5%.
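For part (g), the overall F statistic depends only on SSR, SSE, the sample size and the number of predictors. A minimal sketch follows, assuming n = 17 supermarkets and k = 5 predictors as in the model above. The function names `overall_f` and `t_ratio` are hypothetical, and the critical values must still be taken from F and t tables.

```python
def overall_f(ssr, sse, n, k):
    """F statistic for overall significance: (SSR / k) / (SSE / (n - k - 1))."""
    return (ssr / k) / (sse / (n - k - 1))


def t_ratio(coef, se_coef):
    """t statistic for a single coefficient: estimate over its standard error."""
    return coef / se_coef


n, k = 17, 5
f_stat = overall_f(509.7169, 16.3327, n=n, k=k)
df_numerator, df_denominator = k, n - k - 1

# Consistency check against the printed output for the W x SUN row.
t_w_sun = t_ratio(-0.3057, 0.2707)
```

The individual t tests in parts (h) to (k) use the error degrees of freedom n − k − 1. A test is only possible when the model contains the term in question.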