STAT1008 – QUANTITATIVE RESEARCH METHODS 2018
EXAMINATION
Semester 1 – Final Examination, 2018
STAT1008 – QUANTITATIVE RESEARCH METHODS
Question 1
a. Suppose that a 95% confidence interval for $\mu$ is (54.8, 60.8). Which of the following is most likely the p-value for the test of $H_0: \mu = 5$[digit illegible] versus $H_a: \mu \neq$ [value illegible]? [1 mark]
A. 0.031 B. 0.001 C. 0.016 D. 0.231
b. Decreasing the significance level of a hypothesis test (say, from 5% to 1%) will cause the p-value of an observed test statistic to: [1 mark]
A. Increase B. Decrease C. Stay the same
Use the following to answer parts c to e.
[symbol illegible] = 28 and $\bar{x}_2$ = [?]8.[?], $s_2$ = 8.1[?] with $n_2$ = 24.
c. What is the test statistic for this test? [1 mark]
d. What are the degrees of freedom for this test?
e. Would you reject the null hypothesis for the above test at a significance level of 5%?
f. Among international applicants to an Australian university, the average TOEFL score was 269, the SD was about 11, and the highest score was 285. Do you think the TOEFL scores follow a normal distribution?
It is generally believed that the heights of adult males in Australia are approximately normally distributed with mean 70 inches and standard deviation 3 inches and that the heights of adult females in Australia are also approximately normally distributed with mean 64 inches and standard deviation 2.5 inches. ANU is considering custom ordering beds for their dorm rooms. Answer the following questions about the lengths of beds in dorm rooms at ANU.
g. The beds that the university currently purchases are 75 inches long. What proportion of males will be able to fit on the bed while lying perfectly straight?
[Figure: Male Height distribution.] Thus, the proportion of males who will fit in the current beds is 0.9525, or 95.25%.
h. Should ANU be concerned that females will not fit in the 75-inch beds? Numerically justify your answer.
i. ANU plans on ordering custom sized beds such that 99% of male students are expected to fit in them when lying perfectly straight. What length beds should they order? Round your answer to the nearest inch.
j. ANU decides it is too expensive to replace all the beds. Suppose ANU has 2,150 beds all of which are 75 inches long. How many beds should they replace? You may assume that only those males taller than 75 inches will receive the longer beds and that females make up half of the population that will need a dorm room bed.
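The bed-length parts (g to j) all reduce to normal-distribution lookups. The following is a minimal sketch of that arithmetic, assuming the stated means and standard deviations and using Python's standard-library `NormalDist`; the variable names are illustrative, not part of the exam. Note that the exact value for part g comes out near 0.9522, whereas a table lookup with z rounded to 1.67 gives 0.9525.

```python
from statistics import NormalDist

# Height models as stated in the question (inches).
male = NormalDist(mu=70, sigma=3)
female = NormalDist(mu=64, sigma=2.5)
bed_length = 75

# g / h: proportion of each group no taller than the current bed.
p_male_fits = male.cdf(bed_length)
p_female_fits = female.cdf(bed_length)

# i: bed length that accommodates 99% of males (99th percentile).
length_99 = male.inv_cdf(0.99)

# j: expected number of beds assigned to males taller than 75 inches,
# assuming (as the question says) that males are half of the residents.
total_beds = 2150
beds_to_replace = total_beds * 0.5 * (1 - p_male_fits)

print(f"P(male fits)   = {p_male_fits:.4f}")
print(f"P(female fits) = {p_female_fits:.6f}")
print(f"99% bed length = {length_99:.2f} in (rounds to {round(length_99)})")
print(f"Beds to replace ~ {beds_to_replace:.1f}")
```

Whether the count in part j is rounded up or down is a judgment call the sketch leaves open.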
Question 2
A survey was conducted amongst 438 students in an introductory statistics class. The questions asked were whether the student has ever smoked, whether they have ever consumed alcohol and their gender. The responses to the survey are given in the tables below.
|            | Female | Male |
| No Alcohol | 58     | 39   |
| Alcohol    | 140    | 201  |
a. Construct a 95% confidence interval for the difference in proportion of females and males who have responded that they have smoked in the past.
b. Test at 10% level of significance, if there is evidence that the proportion of females who have never consumed alcohol differ significantly from the proportion of males who have never consumed alcohol.
c. If a random student is picked from the sample, what is the probability that this student is a male smoker?
d. If a random student is picked from the sample, what is the probability that the student is a smoker and has consumed alcohol?
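Part b can be approached as a two-sided two-proportion z-test using the alcohol table alone. The sketch below assumes a pooled standard error under the null hypothesis of equal proportions, which is one common textbook choice; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the alcohol table.
no_alc_f, alc_f = 58, 140
no_alc_m, alc_m = 39, 201

n_f = no_alc_f + alc_f          # 198 females
n_m = no_alc_m + alc_m          # 240 males
p_f = no_alc_f / n_f            # proportion of females who never drank
p_m = no_alc_m / n_m            # proportion of males who never drank

# Pooled proportion under H0: p_f = p_m.
p_pool = (no_alc_f + no_alc_m) / (n_f + n_m)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_f + 1 / n_m))

z = (p_f - p_m) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided alternative

print(f"p_f = {p_f:.4f}, p_m = {p_m:.4f}, z = {z:.3f}, p-value = {p_value:.4f}")
```

The p-value is then compared with the 10% significance level stated in the question.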
In the same survey information was collected on each student’s height and weight. This information was used to evaluate the BMI (body mass index) for each student. The BMI can be used to assess, however imperfectly, the health of an individual. The table below gives us summary information for the BMI variable for the survey respondents.
|            | $n$ | $\bar{x}$ | $s$  |
| Non-Smoker | 334 | 22.03     | 3.52 |
| Smoker     | 104 | 22.73     | 3.32 |
e. What effect will increasing the sample size have on the center and spread of the BMI variable?
f. It is suspected that the people who smoke tend to have a lower weight, hence they would also have a lower BMI compared to non-smokers with the same height. Setup and carry out an appropriate hypothesis test to verify this claim.
g. What assumptions are you making in carrying out the above test?
h. One of the researchers suggests that gender may also be a contributing factor to the difference in BMI. If you were to look at data on males and females separately, would you expect to see a result different from the one derived in part f? Please explain why or why not.
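For part f, one way to set up the test is a one-sided two-sample comparison of means with unpooled variances, H0: mu_smoker = mu_non_smoker against Ha: mu_smoker < mu_non_smoker. The sketch below computes the test statistic from the BMI summary table. As an assumption made to stay within the standard library, the p-value uses a normal approximation rather than a t distribution; with samples of 334 and 104 the difference is small.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the BMI table.
n_ns, mean_ns, sd_ns = 334, 22.03, 3.52   # non-smokers
n_s, mean_s, sd_s = 104, 22.73, 3.32      # smokers

# Unpooled standard error of the difference in sample means.
se = sqrt(sd_s**2 / n_s + sd_ns**2 / n_ns)

# Test statistic for (smoker mean - non-smoker mean).
t_stat = (mean_s - mean_ns) / se

# Lower-tail p-value (Ha: smokers have LOWER mean BMI),
# ASSUMPTION: normal approximation to the t distribution.
p_value = NormalDist().cdf(t_stat)

print(f"SE = {se:.4f}, t = {t_stat:.3f}, one-sided p-value ~ {p_value:.3f}")
```

Because the smokers' sample mean is higher than the non-smokers', the statistic is positive and the lower-tail p-value is large.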
Question 3
A multiple regression model ‘Model 1’ is fit to assess the ‘Time’ taken to commute to work using various forms of public transport. Two predictors, ‘Distance’ of travel in kms and ‘Age’ of the individual in years, were used in the regression model and the output from the model fitting is given below.
Summary of Model 1
Predictor Coef SE Coef T P
Constant 5.08731 1.21706 4.18 0.000
Distance 1.09934 0.03306 33.258 0.000
Age 0.03190 0.02575 1.239 0.216
S = 7.937 R-Sq = 69.0% R-Sq(adj) = 68.9%
Analysis of Variance
Source DF SS MS F P
Regression 2 69770 34885 553.78 0.000
Residual Error 497 31308 63
A second model ‘Model 2’ is fit where Time is only regressed on Distance. The output from the model fit is given below.
Summary of Model 2
Predictor SE Coef T P
Constant 0.58764 10.90 0.000
Distance 0.03307 33.24 0.000
S = 7.941 R-Sq = 68.9% R-Sq(adj) = 68.9%
Analysis of Variance
Source DF SS MS F P
Regression 1 69673 69673 1104.8 0.000
Residual Error 498 31405 63
Mean(Distance) = 14.156 SD(Distance) = 10.75
a. Is ‘Age’ a significant predictor in this multiple regression model? What information have you used to come to your conclusion?
b. Which model should you choose between the two models fit? Justify your choice.
c. Construct an appropriate 95% interval for the average time taken to commute to work for individuals whose distance of travel is 20 km. Comment on whether this is a prediction or a confidence interval.
d. Construct an appropriate 95% interval for the time taken to commute to work for an individual whose distance of travel is 10 km. Comment on whether this is a prediction or a confidence interval.
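Parts c and d use the standard simple-regression interval formulas with the Model 2 output. The sketch below rests on two loudly stated assumptions: the Model 2 coefficients do not appear in the output, so they are back-computed here as T × SE Coef (and therefore carry rounding error), and the critical value is the normal 1.96 rather than the t value on 498 degrees of freedom. The function name `interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Values read from the Model 2 output.
n = 498 + 2                 # residual DF plus two estimated coefficients
s = 7.941                   # residual standard error
mean_x, sd_x = 14.156, 10.75

# ASSUMPTION: coefficients recovered as T * SE Coef (not shown in the output).
b0 = 10.90 * 0.58764
b1 = 33.24 * 0.03307

sxx = (n - 1) * sd_x**2                  # sum of squared deviations of Distance
z_star = NormalDist().inv_cdf(0.975)     # ASSUMPTION: normal approx. to t(498)

def interval(x_star, predict):
    """95% interval at x_star: prediction interval if predict, else CI for the mean."""
    fit = b0 + b1 * x_star
    var_factor = 1 / n + (x_star - mean_x) ** 2 / sxx
    if predict:
        var_factor += 1                  # extra variability of a single response
    half_width = z_star * s * sqrt(var_factor)
    return fit - half_width, fit + half_width

ci_20 = interval(20, predict=False)      # part c: mean response at 20 km
pi_10 = interval(10, predict=True)       # part d: single individual at 10 km
print("CI for mean time at 20 km:", tuple(round(v, 2) for v in ci_20))
print("PI for one commuter at 10 km:", tuple(round(v, 2) for v in pi_10))
```

The prediction interval is much wider because it adds the residual variance of a single observation to the uncertainty in the fitted line.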
Question 4
Output for a model to predict the GPAs of students at a small university based on their Math scores, Verbal scores, and the number of hours spent watching television in a typical week is provided.
Summary of Regression Model
Predictor Coef SE Coef T P
Constant 1.8015 0.1842 9.78 0.000
Math 0.0010442 0.0002500 4.18 ???
Verbal 0.0014182 0.0002398 5.91 ???
TV -0.014708 0.003269 -4.50 ???
S = ??? R-Sq = ???? R-Sq(adj) = 19.0%
Analysis of Variance
Source DF SS MS F P
Regression ??? ??? 4.8295 35.90 0.000
Residual Error ??? 59.7304 ???
Total 447 ???
a. Interpret the coefficient of TV in context.
b. Use the output to determine how many students were included in the sample.
Some of the information in the ANOVA table is missing. Evaluate the missing values to be able to answer the following questions.
c. How many degrees of freedom should appear in the "Regression" row of the table?
d. How many degrees of freedom should be listed in the "Residual Error" row?
e. At the 1% significance level, is the model effective according to the ANOVA test? Include all details of the test (i.e., null and alternative hypotheses, test statistic and p-value).
f. Which predictors are significant at the 5% level? What are their p-values?
g. What is the standard error of the residuals?
h. The R2 for this model is missing in the provided output. Use the available information to compute (round to three decimal places) and interpret R2 for this model.
i. A dotplot of the residuals and a scatterplot of the residuals versus the predicted values are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
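The missing ANOVA entries follow from the identities DF(total) = n − 1, DF(regression) = number of predictors, MS = SS / DF, F = MS(regression) / MS(residual), and R-Sq = SS(regression) / SS(total). The sketch below applies them to the values that are printed in the output; the variable names are illustrative. Recomputing F and adjusted R-Sq gives a check against the printed 35.90 and 19.0%.

```python
from math import sqrt

# Values given in the output.
df_total = 447
num_predictors = 3          # Math, Verbal, TV
ms_reg = 4.8295
ss_resid = 59.7304

n = df_total + 1                        # part b: sample size
df_reg = num_predictors                 # part c
df_resid = df_total - df_reg            # part d

ss_reg = ms_reg * df_reg
ms_resid = ss_resid / df_resid
ss_total = ss_reg + ss_resid

f_stat = ms_reg / ms_resid              # should reproduce the printed F
s = sqrt(ms_resid)                      # part g: residual standard error
r2 = ss_reg / ss_total                  # part h
r2_adj = 1 - ms_resid / (ss_total / df_total)

print(f"n = {n}, DF = ({df_reg}, {df_resid})")
print(f"SS_reg = {ss_reg:.4f}, MS_resid = {ms_resid:.4f}, SS_total = {ss_total:.4f}")
print(f"F = {f_stat:.2f}, S = {s:.4f}, R-Sq = {r2:.3f}, R-Sq(adj) = {r2_adj:.3f}")
```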
Question 5
About 36% of all students in ANU are international students. Suppose we take a random sample of 200 ANU students. Let X represent the number of students in this sample that are international.
Round all answers to three decimal places and represent proportions as percentages.
a. Explain why X is a binomial random variable.
b. What is the probability that exactly 50 people in the sample are international?
c. What is the probability that there are fewer than 3 international students in the sample?
d. What are the mean and standard deviation of the random variable X?
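Parts b to d can be evaluated directly from the binomial probability mass function with n = 200 and p = 0.36. The sketch below is one way to do that with the standard library; the helper name `pmf` is hypothetical.

```python
from math import comb, sqrt

n, p = 200, 0.36   # sample size and proportion of international students

def pmf(k):
    """Binomial probability P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_exactly_50 = pmf(50)                          # part b
p_less_than_3 = sum(pmf(k) for k in range(3))   # part c: P(X = 0, 1 or 2)

mean = n * p                                    # part d
sd = sqrt(n * p * (1 - p))

print(f"P(X = 50) = {p_exactly_50:.6f}")
print(f"P(X < 3)  = {p_less_than_3:.3e}")
print(f"mean = {mean:.3f}, sd = {sd:.3f}")
```

At three decimal places the probability in part c rounds to zero, since 3 is roughly ten standard deviations below the mean.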
A manufacturing firm uses three machines, A, B and C, to produce computer chips. Machine A produces 50% of all chips, machine B produces 30% of all chips and machine C produces the remaining chips. 1% of all chips produced by machine A are defective, whereas 2% of machine B chips are defective and 1.5% of machine C chips are defective.
e. Draw a tree diagram to evaluate the probability that a random chip picked is not defective.
f. If a chip picked at random turns out to be defective, what is the probability that it was manufactured by machine B?
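The tree diagram in parts e and f corresponds to the law of total probability followed by Bayes' rule. The sketch below encodes the branches as dictionaries, taking machine C's share to be the remaining 20%; the names are illustrative.

```python
# Branch probabilities from the question; C produces the remaining 20%.
share = {"A": 0.50, "B": 0.30, "C": 0.20}
defect_rate = {"A": 0.01, "B": 0.02, "C": 0.015}

# Law of total probability: sum over the three branches of the tree.
p_defective = sum(share[m] * defect_rate[m] for m in share)
p_not_defective = 1 - p_defective               # part e

# Bayes' rule: P(B | defective) = P(B and defective) / P(defective).
p_b_given_defective = share["B"] * defect_rate["B"] / p_defective   # part f

print(f"P(not defective) = {p_not_defective:.3f}")
print(f"P(B | defective) = {p_b_given_defective:.3f}")
```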
2022-06-06