ETF2121-ETF5912 Data Analysis in Business Semester 2, 2022
Past and Practice Exam Questions
Multiple Choice Questions
Question 1
A Type I error occurs when we:
(a) reject a false null hypothesis.
(b) reject a true null hypothesis.
(c) do not reject a false null hypothesis.
(d) do not reject a true null hypothesis.
(e) none of the above.
Question 2
If an estimated regression line has a y-intercept of 10 and a slope of 4, then when x = 2, the actual value of y is:
(a) 18
(b) 15
(c) 14
(d) 13
Question 3
Pr(Y = 1 | X) = Φ(β̂0 + β̂1X)
(a) 0.5
(b) 0.9972
(c) 0.4772
(d) 0.0228
(e) none of the above
Question 4
The residual is defined as the difference between:
(a) the actual value of y and the predicted value of y.
(b) the actual value of x and the predicted value of x.
(c) the actual value of y and the predicted value of x.
(d) the actual value of x and the predicted value of y.
(e) none of the above.
Question 5
In a sign test, the following information is given: number of zero differences = 3, number of positive differences = 20, and number of negative differences = 5. The value of the test statistic is:
(a) 5
(b) 4
(c) 3
(d) 2
(e) none of the above
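The arithmetic behind Question 5 can be sketched as follows, assuming the large-sample normal approximation to the sign test, in which zero differences are discarded and z = (x - 0.5n) / (0.5√n), with x the number of positive differences and n the number of non-zero differences. The function name is illustrative only and is not taken from the unit materials.

```python
import math


def sign_test_z(n_positive, n_negative):
    """Normal-approximation sign test statistic (zero differences excluded)."""
    n = n_positive + n_negative          # non-zero differences only
    return (n_positive - 0.5 * n) / (0.5 * math.sqrt(n))


# Values from Question 5: 20 positive, 5 negative, 3 zero (ignored).
z = sign_test_z(20, 5)
print(z)
```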
Question 6
A statistical method to compare two populations, when the samples are independent and the data are ranked, is the:
(a) Wilcoxon signed rank sum test
(b) Sign test
(c) Wilcoxon rank sum test
(d) Matched pairs t-test
(e) none of the above
Question 7
Begin Question 7 on a new page. Clearly label this question number on the new page.
(7.a) Suppose you want to conduct a survey of small business owners in Victoria. The state Directory of Businesses gives a list of 1000 registered business owners, and hence the size of the population is 1000. You want to obtain a sample size of 30 business owners. There are two sampling techniques that can be used. Indicate which of these sampling techniques are described below.
(i) Group the businesses according to 5 business types (retailers, agriculture, manufacturing, financial services and advertising) and then randomly select a sample of 6 business owners from each business type.
(ii) Assign a number to each registered business in the state Directory of Businesses, and then use a random number generator to select the business owners to be included in the sample of size 30.
(7.b) When conducting an analysis of the equality of means of more than two populations (k > 2), what is the major advantage of conducting a parametric F-statistic (ANOVA) instead of doing multiple t-tests of each population mean against all the other population means separately? Briefly explain your answer.
(7.c) A regression analysis was performed to study the relationship between a dependent variable and five independent variables. The following information was obtained from the regression analysis:
SSE = 2400, SSR = 9600, n = 40
Determine the F-statistic.
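A minimal sketch of the calculation in (7.c), assuming the usual overall-significance statistic F = (SSR / k) / (SSE / (n - k - 1)) with k = 5 independent variables. The function name is hypothetical.

```python
def overall_f_statistic(ssr, sse, n, k):
    """F-statistic for the overall significance of a regression with k regressors."""
    msr = ssr / k                # mean square for regression
    mse = sse / (n - k - 1)      # mean square for error
    return msr / mse


# Values from (7.c): SSE = 2400, SSR = 9600, n = 40, five independent variables.
f_stat = overall_f_statistic(ssr=9600, sse=2400, n=40, k=5)
print(round(f_stat, 2))
```

The numerator and denominator degrees of freedom under this formula are k = 5 and n - k - 1 = 34.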
(7.d) In a simple regression model, when every observation is on the regression line, the sum of squares for error SSE = 0, the standard error of estimate s = 0, and the coefficient of determination R² = 1. True or false? Give a brief explanation.
(7.e) Specify the test statistic and the decision rule for each of the following Wilcoxon Rank Sum tests.
(i)
H0 : The two population locations are the same
HA : The location of population A is to the left of the location of population B
nA = 4, nB = 6, α = 0.025
(ii)
H0 : The two population locations are the same
HA : The location of population A is different from the location of population B
nA = 20, nB = 25, α = 0.05
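For sample sizes like those in part (ii), the rank sum T of sample A is commonly treated as approximately normal, with E(T) = nA(nA + nB + 1) / 2 and standard deviation √(nA nB (nA + nB + 1) / 12). The sketch below assumes that approximation applies here; the helper names are illustrative, and the observed rank sum is not given in the question, so it is left as an input.

```python
import math
from statistics import NormalDist


def rank_sum_normal_approx(n_a, n_b):
    """Mean and standard deviation of the rank sum T of sample A under H0."""
    mean_t = n_a * (n_a + n_b + 1) / 2
    sd_t = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return mean_t, sd_t


# Part (ii): nA = 20, nB = 25, two-tailed test at alpha = 0.05.
mean_t, sd_t = rank_sum_normal_approx(20, 25)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed critical value


def z_statistic(t_observed):
    """Standardised rank sum; reject H0 when |z| exceeds z_crit."""
    return (t_observed - mean_t) / sd_t
```

For the small samples in part (i), critical values would normally be read from a Wilcoxon rank sum table instead of this approximation.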
(7.f) What is one important potential benefit of using a matched pairs experiment to test the difference between two populations rather than independent samples? Briefly explain your answer with an example.
(7.g) A Gallup Organisation poll of randomly selected American adults in July 2002 found that 55% of those surveyed felt that their weight was about right. The sampling error for the survey was given as 3%.
(i) Find a 95% confidence interval estimate of the percentage of American adults who think their weight is about right.
(ii) Based on the interval computed, explain whether it is reasonable to say that more than 50% of American adults think their weight is about right.
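A short sketch of the interval arithmetic in (7.g), assuming the reported 3% sampling error is the margin of error at the 95% confidence level, so the interval is the point estimate plus or minus that margin. Working in whole percentage points avoids floating-point rounding; the function name is hypothetical.

```python
def interval_from_margin(point_estimate, margin):
    """Confidence interval as point estimate plus or minus the margin of error."""
    return point_estimate - margin, point_estimate + margin


# Values from (7.g), in percentage points.
lower, upper = interval_from_margin(55, 3)
print(lower, upper)

# Part (ii) turns on whether the whole interval lies above 50%.
entirely_above_half = lower > 50
```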
Question 8
An importer is considering whether to import and sell a new hair care product here in Australia. To begin with, he wants to try to sell the product in one large metropolitan market. He wants to choose the market where people spend the most on hair care products currently. To find this out, he randomly samples 210 adults who live in Melbourne and another 250 adults who live in Sydney and asks each one about their total spending in dollars on hair care products over the past year. The unknown population variances are assumed to be equal.
Using this information, answer the following questions.
Begin Question 8 on a new page. Clearly label this question number on the new page.
(8.a) What is the name of the test statistic that you may be able to use to test whether Melbourne or Sydney people spend more, on average, on hair care products using the sample data? Briefly explain why this test statistic may be appropriate.
(8.b) If you had all the data on total spending on hair care products for each individual in the two samples, what could you do to determine what the distribution of the data is? Explain your answer.
(8.c) Write down the appropriate null and alternative hypotheses for the test statistic in part (8.a) if you want to test whether spending on hair care products is higher in Sydney (population 2 or B) than in Melbourne (population 1 or A).
(8.d) What are the steps involved in constructing the test statistic in part (8.a)?
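One way to lay out the computation for Question 8 is sketched below, assuming the intended statistic is the equal-variances (pooled) t-statistic for the difference between two means. The spending figures are made-up illustrative values, not data from the question, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_1, sample_2):
    """Equal-variances t-statistic for H0: mu1 - mu2 = 0, with its degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    # Step 1: sample means and sample variances (variance() uses the n - 1 divisor).
    mean_1, mean_2 = mean(sample_1), mean(sample_2)
    var_1, var_2 = variance(sample_1), variance(sample_2)
    # Step 2: pooled variance estimate.
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2)
    # Step 3: standard error of the difference in sample means.
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Step 4: test statistic and degrees of freedom.
    return (mean_1 - mean_2) / std_error, n1 + n2 - 2


# Hypothetical annual spending (dollars); the real samples have 210 and 250 adults.
melbourne = [120.0, 95.0, 140.0, 110.0, 105.0]
sydney = [150.0, 130.0, 125.0, 160.0, 145.0, 135.0]
t_stat, df = pooled_t_statistic(melbourne, sydney)
```

With the sample sizes in the question, the degrees of freedom would be 210 + 250 - 2 = 458.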
Question 9
The government is concerned about the rate of smoking in the population. It is considering whether to raise the tax on cigarettes to reduce the number of people smoking. To see if such a policy will have an effect, the government undertakes a study of the determinants of smoking in the community. The government's chief statistical analyst conducts a random survey of 807 adults in the country's population (from all states) and collects the following variables.
EDUC = years of schooling of the person
CIGPRIC = average price of cigarettes including taxes, in dollars per pack, in the state where the person lives
AGE = age of the person in years
RESTAURN = 1 if the state where the person lives has government-imposed restaurant smoking restrictions, 0 otherwise
Using this information, the analyst constructed a binary variable SMOKE indicating whether the individual smoked or not; that is,
SMOKE = 1 if an individual smokes, 0 otherwise
The analyst wanted to identify the main determinants of the probability of an individual person smoking using this information. To do this, the analyst estimated the Logit model. The results of this estimation, using EViews, are provided below.
Dependent Variable: SMOKE
Method: ML - Binary Logit (Quadratic hill climbing / EViews legacy)
Sample: 1 807
Included observations: 807
Convergence achieved after 4 iterations
Covariance matrix computed using second derivatives
Variable        Coefficient    Std. Error    z-Statistic    Prob.
C                  1.656162      1.001650       1.653434    0.0982
AGE               -0.016164      0.004514      -3.581026    0.0003
EDUC              -0.111039      0.026820      -4.140104    0.0000
CIGPRIC           -0.003519      0.015581      -0.225849    0.8213
RESTAURN          -0.465569      0.180173      -2.584019    0.0098

Mean dependent var       0.384139    S.D. dependent var       0.486693
S.E. of regression       0.478753    Akaike info criterion    1.305724
Sum squared resid        183.5927    Schwarz criterion        1.340619
Log likelihood          -520.8597    Hannan-Quinn criter.     1.319124
Restr. log likelihood   -537.5055    Avg. log likelihood     -0.645427
LR statistic             33.29162

Obs with Dep=0    497    Total obs    807
Obs with Dep=1    310
Begin Question 9 on a new page. Clearly label this question number on the new page.
(9.a) Write down the estimated logit model. Report the results to 3 decimal places.
(9.b) Interpret the sign of the estimated coefficients on the AGE and EDUC variables.
(9.c) Interpret the sign of the estimated coefficient on CIGPRIC. Test whether the coefficient on the CIGPRIC variable is less than zero at α = 0.05. Do the six steps of the test. What do you conclude from this test about the potential effectiveness of a policy of increasing the tax on cigarettes to reduce smoking?
(9.d) Suppose Ms. A has the following characteristics:
EDUC = 13 years
CIGPRIC = $60
AGE = 40 years old
Using the estimated logit model, and assuming that the state where Ms. A lives has government-imposed restaurant smoking restrictions, calculate the probability that Ms. A smokes.
(9.e) Suppose Ms. B has the following characteristics:
EDUC = 13 years
CIGPRIC = $60
AGE = 40 years old
Using the estimated logit model, and assuming that the state where Ms. B lives has no government-imposed restaurant smoking restrictions, calculate the probability that Ms. B smokes.
(9.f) Based on the answers to parts (9.d) and (9.e), do you think that restricting smoking in restaurants can reduce the probability of smoking?
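The probability calculations in parts (9.d) and (9.e) can be sketched as follows, assuming the standard logit form P(SMOKE = 1) = 1 / (1 + e^(-z)), where z is the linear index built from the coefficients reported in the EViews output. The function and dictionary names are illustrative only.

```python
import math

# Coefficients as reported in the EViews logit output for SMOKE.
COEFFICIENTS = {
    "C": 1.656162,
    "AGE": -0.016164,
    "EDUC": -0.111039,
    "CIGPRIC": -0.003519,
    "RESTAURN": -0.465569,
}


def smoking_probability(age, educ, cigpric, restaurn):
    """Predicted probability of smoking from the estimated logit model."""
    index = (
        COEFFICIENTS["C"]
        + COEFFICIENTS["AGE"] * age
        + COEFFICIENTS["EDUC"] * educ
        + COEFFICIENTS["CIGPRIC"] * cigpric
        + COEFFICIENTS["RESTAURN"] * restaurn
    )
    return 1 / (1 + math.exp(-index))   # logistic function of the linear index


# Ms. A (restrictions in place) and Ms. B (no restrictions), otherwise identical.
p_a = smoking_probability(age=40, educ=13, cigpric=60, restaurn=1)
p_b = smoking_probability(age=40, educ=13, cigpric=60, restaurn=0)
print(round(p_a, 3), round(p_b, 3))
```

Because the two individuals differ only in RESTAURN, the gap between the two probabilities isolates the estimated effect of the restaurant restriction at these characteristics.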
Question 10
Quarterly household spending on clothing, denoted yt (in millions of dollars), from the first quarter of 1975 to the first quarter of 2005 is depicted in the line graph below.
An excerpt of these data is shown in the following table.
Year    yt      t      Q1    Q2    Q3    Q4
1975    4818    1      1     0     0     0
        4800    2      0     1     0     0
        4866    3      0     0     1     0
        5139    4      0     0     0     1
1976    4810    5      1     0     0     0
..      ..      ..     ..    ..    ..    ..
..      ..      ..     ..    ..    ..    ..
..      ..      ..     ..    ..    ..    ..
2005    9138    121    1     0     0     0
Note:
(i) The time variable t equals 1 in the first quarter of 1975;
(ii) Q1, Q2, Q3 and Q4 are the dummy variables defined, respectively, as
Q1 = 1 if 1st quarter (January to March), 0 otherwise
Q2 = 1 if 2nd quarter (April to June), 0 otherwise
Q3 = 1 if 3rd quarter (July to September), 0 otherwise
Q4 = 1 if 4th quarter (October to December), 0 otherwise
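The coding of the time and dummy variables in the note above can be sketched as a small helper. It assumes only what the note states: t = 1 corresponds to the first quarter of 1975 and the quarters follow in order. The function name is hypothetical.

```python
def quarter_dummies(t):
    """Return (Q1, Q2, Q3, Q4) for time index t, where t = 1 is 1975 Q1."""
    quarter = (t - 1) % 4 + 1            # 1, 2, 3, 4, 1, 2, ...
    return tuple(1 if quarter == q else 0 for q in (1, 2, 3, 4))


# Rows matching the table excerpt: t = 1..5 and t = 121 (2005 Q1).
for t in (1, 2, 3, 4, 5, 121):
    print(t, quarter_dummies(t))
```

Exactly one dummy equals 1 for each observation, so the four dummies always sum to 1; a regression with an intercept would therefore normally include at most three of them.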
Begin Question 10 on a new page. Clearly label this question number on the new page.
(10.a) Consider the following estimated linear trend model with dummy variables.
Dependent Variable: Y
Method: Least Squares
Sample: 1975Q1 2005Q1
Included observations: 121
Variable |
Coefficient |
2022-11-02