Final Exam, Stat 473/573
May 3rd, 7:30AM – 9:30AM, Spring 2021
| $z_{.005} = 2.576$ | $z_{.025} = 1.96$ | $z_{.05} = 1.645$ |
Label each part of your answers clearly on your blank paper!!!
1. [23 pts] A manufacturer wants to estimate total sales for its 20 product categories in 2002. The company selects an SRS of 6 categories from a list of the 20 product categories. The divisions responsible for each of the selected categories were asked to provide early sales figures for 2002. Last year's (2001) sales figures are available for all 20 product categories, and the total sales of the 20 product categories in 2001 was $640 billion. The manufacturer expects the sales in 2001 and 2002 to be correlated, so he wants to use the information from the 2001 sales to help estimate the total sales in 2002. The 2001 and 2002 sales for each of the selected product categories are provided below, in billions of dollars.
| Product Category | 2001 Sales (billions of dollars) | 2002 Sales (billions of dollars) |
|---|---|---|
| Paper towels and toilet paper | 21 | 26 |
| Diapers | 63 | 91 |
| Laundry soaps | 35 | 47 |
| Household cleaning products | 60 | 70 |
| Baked foods | 16 | 17 |
| Snack foods | 50 | 76 |
a. Define the following variables and quantities for this problem. [4 pts]
(i) yi = _____________ (choose one from below and fill in blank)
• Sale amount from product category i in 2002
• Sale amount from product category i in 2001
(ii) xi = _____________ (choose one from below and fill in blank)
• Sale amount from product category i in 2002
• Sale amount from product category i in 2001
(iii) N = ____________________ (fill in blank)
(iv) n = _____________________ (fill in blank)
b. What is the population mean of x, i.e. $\bar{x}_U$, for this problem? Be careful because a wrong answer in this part will lead to a wrong answer in the next part. [3 pts]
c. Suppose the manufacturer runs a regression using the data in the table and gets the estimated slope and intercept, $\hat{B}_1 = 1.431$ and $\hat{B}_0 = -3.885$. Use these results to obtain the regression estimator of total sales for its 20 product categories in 2002, i.e. calculate $\hat{t}_{\text{reg}}$. [6 pts]
d. Estimate the variance of the estimated total sales in (c), i.e. calculate $\hat{V}(\hat{t}_{\text{reg}})$. Assume $s_e^2 = 60$. [6 pts]
e. Calculate the 99% confidence interval for the total sales for its 20 product categories in 2002. [4 pts]
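The arithmetic behind parts (b) through (e) can be sketched as follows. This is a minimal sketch, not an official solution: it assumes the usual textbook regression-estimator formulas for an SRS, namely an estimated mean of $\hat{B}_0 + \hat{B}_1 \bar{x}_U$, a variance estimate of $N^2 (1 - n/N) s_e^2 / n$ for the total, and a normal-based interval using the critical values printed at the top of the exam. The variable names are illustrative.

```python
import math

# Quantities stated in the problem
N, n = 20, 6               # population size and sample size
t_x = 640.0                # 2001 total sales over all 20 categories (billions)
b0, b1 = -3.885, 1.431     # fitted intercept and slope given in part (c)
s2_e = 60.0                # residual variance given in part (d)
z_005 = 2.576              # critical value for a 99% interval

# (b) population mean of the auxiliary variable x
x_bar_U = t_x / N

# (c) regression estimate of the 2002 mean, scaled up to a total
y_bar_reg = b0 + b1 * x_bar_U
t_hat_reg = N * y_bar_reg

# (d) assumed variance formula: N^2 * (1 - n/N) * s_e^2 / n
v_hat = N**2 * (1 - n / N) * s2_e / n
se = math.sqrt(v_hat)

# (e) normal-based 99% confidence interval for the total
ci = (t_hat_reg - z_005 * se, t_hat_reg + z_005 * se)

print(x_bar_U, t_hat_reg, v_hat, ci)
```

Under these assumptions the sketch gives a population mean of 32 for x, an estimated 2002 total of about 838.1 billion dollars, and an estimated variance of 2800.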
2. [21 pts] Mary wants to know how many pets live in her community. There are 72 households in the community. Mary selected an SRSWOR of 6 households and visited them to find out how many pets each household has. The data are below.
| Sample Household ID # | Number of Pets ($y_i$) |
|---|---|
| 1 | 0 |
| 2 | 3 |
| 3 | 0 |
| 4 | 1 |
| 5 | 0 |
| 6 | 1 |
Suppose Mary wants to estimate the mean number of pets per household for the households that have at least one pet.
a. Define the domain $U_d$ of interest in words. [2pts]
b. Define the domain population parameter that you want to estimate. Be specific about your mathematical notation, including the definition of $y_i$. [2pts]
c. Record the values for the following terms. [3pts]
(i) N = ____________________ (fill in blank)
(ii) n = ____________________ (fill in blank)
(iii) $n_d$ = ____________________ (fill in blank)
d. Estimate the mean number of pets per household for the households that have at least one pet. Use the second approach (i.e. the domain sample approach). [3pts]
e. Estimate the variance of the estimate in part (d). [4pts]
f. Now Mary wants to estimate the total number of pets for the households that have at least one pet. But she does not know the number of households that have at least one pet in her community. So she
(i) First define the following variable u. [1pt]
$u_i = \{$ ____________________
(ii) Fill in the blanks in the column for the variable u in the table below. If you are using blank paper to write down your answer, please attach the household ID#s to your individual answers in the blank cells. [3pts]
| Sample Household ID # | Number of Pets ($y_i$) | $u_i$ |
|---|---|---|
| 1 | 0 | |
| 2 | 3 | |
| 3 | 0 | |
| 4 | 1 | |
| 5 | 0 | |
| 6 | 1 | |
(iii) Then estimate the total number of pets for the households that have at least one pet. [3pts]
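The domain calculations in parts (c) through (f) can be sketched as below. This is a hedged sketch rather than an official solution. It assumes that the "domain sample approach" means treating the $n_d$ sampled domain households as the working sample, with the approximate variance $(1 - n/N) s_{yd}^2 / n_d$, and that $u_i$ equals $y_i$ for households in the domain and 0 otherwise. The variable names are illustrative.

```python
from statistics import mean, variance

N = 72                         # households in the community
y = [0, 3, 0, 1, 0, 1]         # pets in sampled households 1..6
n = len(y)

# Domain: households with at least one pet
in_domain = [yi >= 1 for yi in y]
y_d = [yi for yi, d in zip(y, in_domain) if d]
n_d = len(y_d)

# (d) domain mean from the domain sample
y_bar_d = mean(y_d)

# (e) assumed approximation: (1 - n/N) * s_yd^2 / n_d,
#     where s_yd^2 is the sample variance within the domain (divisor n_d - 1)
s2_d = variance(y_d)
v_hat_mean = (1 - n / N) * s2_d / n_d

# (f) u_i = y_i inside the domain, 0 outside; total estimated as N * u_bar
u = [yi if d else 0 for yi, d in zip(y, in_domain)]
t_hat_d = N * mean(u)

print(n_d, y_bar_d, v_hat_mean, u, t_hat_d)
```

Because households outside the domain have no pets, $u_i$ coincides with $y_i$ for this particular sample. Working through $u$ avoids needing the unknown number of pet-owning households in the community.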
3. [10 pts] A researcher wants to estimate the number of dental cavities in a small community. A simple random sample without replacement (SRSWOR) of 3 households was selected from the 50 households in the community. Two people in each sampled household were selected at random using SRSWOR. Each sampled person in the household was given a dental examination and the number of dental cavities was recorded. The table below shows summary information on each sampled household.
| Household | Number of persons in household ($M_i$) | Average number of cavities per person in household ($\bar{y}_i$) | Sample variance of cavities in household ($s_i^2$) |
|---|---|---|---|
| 1 | 4 | 2 | 1 |
| 2 | 3 | 1 | 4 |
| 3 | 3 | 3 | 4 |
a. What is the design? Be specific about each stage of sampling, including the selection method, the sample unit and the sample size at each stage. [5pts]
b. Estimate the total number of dental cavities in the population using the unbiased estimator, i.e. using $\hat{t}_{\text{unb}}$. [3pts]
c. What is the joint inclusion probability for a single person $j$ in household 3, i.e. what is $\pi_{3j}$, the probability that a person in household 3 is selected? [2pts]
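A sketch of the calculations in parts (b) and (c), offered as an illustration rather than an official solution. It assumes the standard two-stage estimator $\hat{t}_{\text{unb}} = (N/n) \sum_i M_i \bar{y}_i$ with SRSWOR at both stages, and that a person's inclusion probability is the product of the stage probabilities, $(n/N)(m_i/M_i)$. The variable names are illustrative.

```python
N, n = 50, 3            # households in the community and in the sample
M = [4, 3, 3]           # persons in each sampled household (M_i)
y_bar = [2, 1, 3]       # average cavities per sampled person (y-bar_i)
m = 2                   # persons examined in each sampled household

# (b) estimated household totals M_i * y-bar_i, expanded by N/n
t_hat_i = [M_i * yb for M_i, yb in zip(M, y_bar)]
t_hat_unb = (N / n) * sum(t_hat_i)

# (c) person-level inclusion probability in household 3:
#     P(household selected) * P(person selected | household selected)
pi_3j = (n / N) * (m / M[2])

print(t_hat_i, t_hat_unb, pi_3j)
```

The within-household sample variances $s_i^2$ are not needed for the point estimate. They would enter only through a variance estimate for $\hat{t}_{\text{unb}}$.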
4. [26 pts] Below is a small population that has been arranged in six clusters. For each cluster, the data values of the elements are listed, along with the cluster size and the total for the data value in each cluster. We will select a cluster sample using the PPSWR design with the cluster size as the size variable.
| PSU ($i$) | $M_i$ | Element values $y_{ij}$ | Cluster totals $t_i$ |
|---|---|---|---|
| 1 | 5 | 3, 5, 4, 6, 2 | 20 |
| 2 | 4 | 7, 4, 7, 7 | 25 |
| 3 | 8 | 7, 2, 9, 4, 5, 3, 2, 6 | 38 |
| 4 | 5 | 2, 5, 3, 6, 8 | 24 |
| 5 | | | |
| 6 | 3 | 9, 7, 5 | 21 |
| Sum | 27 | | |
Assume that we use the PPSWR design described in this problem to select n = 2 clusters, and clusters 3 and 6 were selected. For each cluster selected, 2 elements were selected using SRSWOR. Suppose the sample consists of elements 2 and 5 in cluster 3 and elements 1 and 3 in cluster 6, i.e.
Cluster 3 (i = 3): we observe $y_{i1} = 2$ and $y_{i2} = 5$
Cluster 6 (i = 6): we observe $y_{i1} = 9$ and $y_{i2} = 5$
a. What is this design? Be specific. [5pts]
b. Estimate the population total based on this sample. [6pts]
c. Calculate the variance of the estimated total in part (b). [4pts]
d. Calculate the weight for element $j$ in cluster $i = 3$. [3pts]
e. Estimate the population mean based on this sample. [3pts]
f. Calculate the standard error for the mean estimator in part (e). [3pts]
g. Is this a self-weighting design? Check one. [2pts]
Yes ________ No __________
Why or why not?
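The calculations in parts (b) through (g) can be sketched as follows. This is an illustrative sketch under stated assumptions, not an official solution. It assumes the usual with-replacement estimator $\hat{t}_\psi = \frac{1}{n} \sum_i \hat{t}_i / \psi_i$ with $\psi_i = M_i / 27$, the variance estimate $\frac{1}{n(n-1)} \sum_i (\hat{t}_i/\psi_i - \hat{t}_\psi)^2$, and element weights $\frac{1}{n \psi_i} \cdot \frac{M_i}{m_i}$. The observed values for cluster 3 (2 and 5) are read from positions 2 and 5 of that cluster's row in the table. The variable names are illustrative.

```python
import math

M0 = 27                        # total number of elements (sum of M_i)
sample = {                     # cluster id -> (M_i, observed element values)
    3: (8, [2, 5]),            # elements 2 and 5 of cluster 3 (read from the table)
    6: (3, [9, 5]),            # elements 1 and 3 of cluster 6
}
n = len(sample)                # number of PSU draws

ratios = []                    # t_hat_i / psi_i for each sampled cluster
weights = {}                   # element-level sampling weight per cluster
for i, (M_i, ys) in sample.items():
    m_i = len(ys)
    psi_i = M_i / M0                       # draw probability proportional to size
    t_hat_i = M_i * sum(ys) / m_i          # estimated cluster total
    ratios.append(t_hat_i / psi_i)
    weights[i] = (1 / (n * psi_i)) * (M_i / m_i)

# (b) estimated population total
t_hat = sum(ratios) / n
# (c) assumed with-replacement variance estimate
v_hat = sum((r - t_hat) ** 2 for r in ratios) / (n * (n - 1))
se_total = math.sqrt(v_hat)
# (e), (f) mean and its standard error, dividing by the known M0
y_bar_hat = t_hat / M0
se_mean = se_total / M0

print(t_hat, v_hat, weights, y_bar_hat, se_mean)
```

In this sketch every sampled element ends up with the same weight, because the draw probabilities are proportional to $M_i$ and the same number of elements is taken from each selected cluster. Equal weights are what part (g) is asking about.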
5. [20 pts (2pts for each)] True and False
[T F] Under a stratification design, a best sample allocation for estimating subpopulation parameters generally leads to less precise whole population estimates.
[T F] An STS design is a self-weighting sample design if proportional allocation is applied.
[T F] In an STS design, the most effective stratification is the stratification such that the units are ve