Final Exam, Stat 473/573

May 3rd, 7:30 AM to 9:30 AM, Spring 2021

\(z_{.005} = 2.576\)

\(z_{.025} = 1.96\)

\(z_{.05} = 1.645\)

Label each part of your answers clearly on your blank paper!

1.   [23 pts] A manufacturer wants to estimate total sales for its 20 product categories in 2002. The company selects an SRS of 6 categories from a list of the 20 product categories. The divisions responsible for each selected category were asked to provide early sales figures for 2002. Sales figures for the previous year (2001) are available for all 20 product categories, and the total 2001 sales of the 20 product categories is $640 billion. The manufacturer expects sales in 2001 and 2002 to be correlated, so he wants to use the 2001 sales information to help estimate total sales in 2002. The 2001 and 2002 sales for each selected product category are given below, in billions of dollars.

 

 

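| Product Category | 2001 Sales (billions of dollars) | 2002 Sales (billions of dollars) |
| --- | --- | --- |
| Paper towels and toilet paper | 21 | 26 |
| Diapers | 63 | 91 |
| Laundry soaps | 35 | 47 |
| Household cleaning products | 60 | 70 |
| Baked foods | 16 | 17 |
| Snack foods | 50 | 76 |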

a.   Define the following variables and quantities for this problem. [4 pts]

(i)   \(y_i\) = _____________ (choose one from below and fill in the blank)

•    Sales amount for product category i in 2002

•    Sales amount for product category i in 2001

(ii)   \(x_i\) = _____________ (choose one from below and fill in the blank)

•    Sales amount for product category i in 2002

•    Sales amount for product category i in 2001

(iii)   N = ____________________ (fill in the blank)

(iv)   n = ____________________ (fill in the blank)

b.   What is the population mean of x, i.e. \(\bar{x}_U\), for this problem? Be careful, because a wrong answer in this part will lead to a wrong answer in the next part. [3 pts]

c.   Suppose the manufacturer runs a regression using the data in the table and gets the estimated slope and intercept, \(\hat{B}_1 = 1.431\) and \(\hat{B}_0 = -3.885\). Use these results to obtain the regression estimator of total sales for the 20 product categories in 2002, i.e. calculate \(\hat{t}_{reg}\). [6 pts]

d.   Estimate the variance of the estimated total sales in (c), i.e. calculate \(\hat{V}(\hat{t}_{reg})\). Assume \(s_e^2 = 60\). [6 pts]

e.   Calculate the 99% confidence interval for the total sales of the 20 product categories in 2002. [4 pts]
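
The arithmetic behind parts (b) through (e) can be sketched as follows. This is only an illustration under stated assumptions: it assumes the standard SRS regression-estimator formulas, \(\hat{t}_{reg} = N(\hat{B}_0 + \hat{B}_1 \bar{x}_U)\) and \(\hat{V}(\hat{t}_{reg}) = N^2 (1 - n/N)\, s_e^2 / n\), and a normal-based interval using the \(z_{.005}\) value listed at the top of the exam. The variable names are hypothetical, and the course may expect a different variance formula or a t critical value.

```python
import math

# Quantities stated in the problem
N = 20        # product categories in the population
n = 6         # categories in the SRS
t_x = 640.0   # total 2001 sales over all N categories, in billions of dollars
b0 = -3.885   # estimated intercept
b1 = 1.431    # estimated slope
s2_e = 60.0   # residual variance s_e^2, as given in part (d)
z = 2.576     # z_{.005}, for a 99% interval (assumed normal approximation)

# (b) population mean of x
x_bar_U = t_x / N

# (c) regression estimator of the 2002 total (assumed standard formula)
y_bar_reg = b0 + b1 * x_bar_U
t_reg = N * y_bar_reg

# (d) estimated variance with the finite population correction
var_t_reg = N**2 * (1 - n / N) * s2_e / n
se_t_reg = math.sqrt(var_t_reg)

# (e) 99% confidence interval
ci = (t_reg - z * se_t_reg, t_reg + z * se_t_reg)

print(f"x_bar_U = {x_bar_U:.3f}")
print(f"t_reg = {t_reg:.3f}, V(t_reg) = {var_t_reg:.3f}")
print(f"99% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```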

2.   [21 pts] Mary wants to know how many pets live in her community. There are 72 households in the community. Mary selected an SRSWOR sample of 6 households and visited each of them to find out how many pets it has. The data are shown below.

| Sample Household ID # | Number of Pets (\(y_i\)) |
| --- | --- |
| 1 | 0 |
| 2 | 3 |
| 3 | 0 |
| 4 | 1 |
| 5 | 0 |
| 6 | 1 |

Suppose Mary wants to estimate the mean number of pets per household for the households that have at least one pet.

a.   Define the domain \(U_d\) of interest in words. [2pts]

b.   Define the domain population parameter that you want to estimate. Be specific about your mathematical notation, including the definition of \(y_i\). [2pts]

c.   Record the values for the following terms. [3pts]

(i)   N = ____________________ (fill in the blank)

(ii)   n = ____________________ (fill in the blank)

(iii)   \(n_d\) = ____________________ (fill in the blank)

d.   Estimate the mean number of pets per household for the households that have at least one pet. Use the second approach (i.e. the domain sample approach). [3pts]

e.   Estimate the variance of the estimate in part (d). [4pts]

f.   Now Mary wants to estimate the total number of pets for the households that have at least one pet, but she does not know the number of households in her community that have at least one pet. So she proceeds as follows.

(i)   First define the following variable u. [1pt]

\(u_i\) = { ____________________
          { ____________________

(ii)   Fill in the blanks in the column for the variable u in the table below. If you are using blank paper to write down your answer, please attach the household ID #s to your individual answers in the blank cells. [3pts]

| Sample Household ID # | Number of Pets (\(y_i\)) | \(u_i\) |
| --- | --- | --- |
| 1 | 0 |  |
| 2 | 3 |  |
| 3 | 0 |  |
| 4 | 1 |  |
| 5 | 0 |  |
| 6 | 1 |  |

(iii)      Then estimate the total number of pets for the households that have at least one pet. [3pts]
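
The domain calculations in parts (c) through (f) can be sketched as follows. This is a rough illustration, not the official solution: it assumes the domain-sample variance approximation \(\hat{V}(\bar{y}_d) \approx (1 - n/N)\, s_{yd}^2 / n_d\), and it assumes \(u_i\) equals \(y_i\) for households in the domain and 0 otherwise. The variable names are hypothetical.

```python
from statistics import mean, variance

N = 72                    # households in the community
y = [0, 3, 0, 1, 0, 1]    # pets in sampled households 1..6
n = len(y)

# Domain membership: households with at least one pet
in_domain = [yi >= 1 for yi in y]
y_d = [yi for yi, d in zip(y, in_domain) if d]
n_d = len(y_d)

# (d) domain mean from the domain sample
y_bar_d = mean(y_d)

# (e) estimated variance, using the assumed approximation above
s2_yd = variance(y_d)     # sample variance within the domain
var_y_bar_d = (1 - n / N) * s2_yd / n_d

# (f) u_i = y_i inside the domain, 0 outside (assumed definition)
u = [yi if d else 0 for yi, d in zip(y, in_domain)]
t_d_hat = N * mean(u)

print(f"n_d = {n_d}, y_bar_d = {y_bar_d:.4f}, V(y_bar_d) = {var_y_bar_d:.4f}")
print(f"u = {u}, estimated domain total = {t_d_hat:.2f}")
```

Because households outside the domain have zero pets by definition, u coincides with y in this particular sample; that would not hold for a domain defined on some other characteristic.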

3.   [10 pts] A researcher wants to estimate the number of dental cavities in a small community. A simple random sample without replacement (SRSWOR) of 3 households was selected from the 50 households in the community. Two people in each sampled household were selected at random using SRSWOR. Each sampled person in the household was given a dental examination, and the number of dental cavities was recorded. The table below shows summary information on each sampled household.

 

 

 

| Household | Number of persons in household (\(M_i\)) | Average number of cavities per person in household (\(\bar{y}_i\)) | Sample variance of cavities in household (\(s_i^2\)) |
| --- | --- | --- | --- |
| 1 | 4 | 2 | 1 |
| 2 | 3 | 1 | 4 |
| 3 | 3 | 3 | 4 |

a.   What is the design? Be specific about each stage of sampling, including the selection method, the sampling unit, and the sample size at each stage. [5pts]

b.   Estimate the total number of dental cavities in the population using the unbiased estimator, i.e. using \(\hat{t}_{unb}\). [3pts]

c.   What is the joint inclusion probability for a single person (j) in household 3 (i.e. what is \(\pi_{3j}\), the probability that a person in household 3 is selected)? [2pts]
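
Parts (b) and (c) can be sketched as follows. This is a sketch assuming the usual two-stage formulas: \(\hat{t}_{unb} = (N/n) \sum_i M_i \bar{y}_i\) for the total, and \(\pi_{ij} = (n/N)(m_i/M_i)\) for the probability that person j in household i ends up in the sample. The variable names are hypothetical.

```python
N = 50               # households in the community
n = 3                # sampled households
m = 2                # persons sampled per household
M = [4, 3, 3]        # household sizes M_i for households 1..3
y_bar = [2, 1, 3]    # average cavities per sampled person, by household

# Estimated household totals: t_hat_i = M_i * y_bar_i
t_hat = [Mi * yb for Mi, yb in zip(M, y_bar)]

# (b) unbiased estimator of the population total (assumed formula)
t_unb = (N / n) * sum(t_hat)

# (c) inclusion probability for a person in household 3:
# P(household selected) * P(person selected | household selected)
pi_3j = (n / N) * (m / M[2])

print(f"t_hat_i = {t_hat}, t_unb = {t_unb:.2f}")
print(f"pi_3j = {pi_3j:.4f}")
```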

4.   [26 pts] Below is a small population that has been arranged in six clusters. For each cluster, the data values of the elements are listed, along with the cluster size and the total of the data values in the cluster. We will select a cluster sample using the PPSWR design with the cluster size as the size variable.

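| PSU (i) | \(M_i\) | Element values \(y_{ij}\) | Cluster total \(t_i\) |
| --- | --- | --- | --- |
| 1 | 5 | 3, 5, 4, 6, 2 | 20 |
| 2 | 4 | 7, 4, 7, 7 | 25 |
| 3 | 8 | 7, 2, 9, 4, 5, 3, 2, 6 | 38 |
| 4 | 5 | 2, 5, 3, 6, 8 | 24 |
| 5 | 2 | 3, 7 | 10 |
| 6 | 3 | 9, 7, 5 | 21 |
| Sum | 27 |  |  |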

 

 

Assume that we use the PPSWR design described in this problem to select n = 2 clusters, and that clusters 3 and 6 were selected. For each selected cluster, 2 elements were selected using SRSWOR. Suppose the sample consists of elements 2 and 5 in cluster 3 and elements 1 and 3 in cluster 6, i.e.

Cluster 3 (i = 3): we observe \(y_{i1} = 2\) and \(y_{i2} = 5\)

Cluster 6 (i = 6): we observe \(y_{i1} = 9\) and \(y_{i2} = 5\)

a.   What is this design? Be specific. [5pts]

b.   Estimate the population total based on this sample. [6pts]

c.   Calculate the variance of the estimated total in part (b). [4pts]

d.   Calculate the weight for element j in cluster i = 3. [3pts]

e.   Estimate the population mean based on this sample. [3pts]

f.   Calculate the standard error for the mean estimator in part (e). [3pts]

g.   Is this a self-weighting design? Check one. [2pts]

YES  ________                       NO __________

Why or why not?
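
The calculations in parts (b) through (g) can be sketched as follows. This is an illustration under assumptions, not the official solution: it takes the selection probabilities to be \(\psi_i = M_i / \sum_i M_i\), uses the with-replacement estimator \(\hat{t}_\psi = \frac{1}{n} \sum_i \hat{t}_i / \psi_i\) with \(\hat{V}(\hat{t}_\psi) = \frac{1}{n(n-1)} \sum_i (\hat{t}_i/\psi_i - \hat{t}_\psi)^2\), and computes element weights as \(w_{ij} = \frac{1}{n \psi_i} \cdot \frac{M_i}{m_i}\). The sampled values are looked up from the population table by element position; the variable names are hypothetical.

```python
import math

# Cluster sizes M_i and element values for the two selected clusters
M = {1: 5, 2: 4, 3: 8, 4: 5, 5: 2, 6: 3}
values = {3: [7, 2, 9, 4, 5, 3, 2, 6], 6: [9, 7, 5]}

# Sampled element positions (1-based) within each selected cluster
sampled = {3: [2, 5], 6: [1, 3]}

M0 = sum(M.values())   # total number of elements (27)
n = len(sampled)       # number of PPSWR draws

ratios = []            # t_hat_i / psi_i for each draw
weights = {}           # sampling weight of an element in cluster i
for i, positions in sampled.items():
    y = [values[i][j - 1] for j in positions]
    m_i = len(y)
    psi_i = M[i] / M0                  # draw probability (assumed size-based)
    t_hat_i = M[i] * sum(y) / m_i      # estimated cluster total
    ratios.append(t_hat_i / psi_i)
    weights[i] = (1 / (n * psi_i)) * (M[i] / m_i)

# (b), (c) estimated total and its estimated variance
t_psi = sum(ratios) / n
var_t_psi = sum((r - t_psi) ** 2 for r in ratios) / (n * (n - 1))
se_t_psi = math.sqrt(var_t_psi)

# (e), (f) estimated mean and its standard error
y_bar_psi = t_psi / M0
se_y_bar = se_t_psi / M0

print(f"t_psi = {t_psi:.2f}, V = {var_t_psi:.4f}, SE = {se_t_psi:.2f}")
print(f"mean = {y_bar_psi:.2f}, SE(mean) = {se_y_bar:.2f}")
print(f"weights = {weights}")
```

Under these assumptions the weight simplifies to \(M_0 / (n m_i)\), which does not depend on the cluster when every selected cluster contributes the same number of elements; comparing the entries of `weights` is one way to check part (g).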

5.   [20 pts (2pts for each)] True or False

[T     F]  Under a stratified design, the best sample allocation for estimating subpopulation parameters generally leads to less precise whole-population estimates.

[T     F]  An STS design is a self-weighting sample design if proportional allocation is applied.

[T     F]  In an STS design, the most effective stratification is the stratification such that the units are ve