Stat 2118 – Midterm Examination II

Spring 2023

1.  At a certain company, there have been complaints of gender discrimination, so the following regression model is fitted, where y is the annual salary, x1 is the years of experience, and x2 = 0 if female and 1 if male:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1} x_{i2} + \varepsilon_i$$

The following is the partial R output when the above model is fit for a sample of employees at the company:

             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   50318.3     7356.47     6.84    0.0001
X1             3079.3     1691.92     1.82    0.0777
X2              773.7      432.25     1.79    0.0825
X1X2           1969.0      859.83     2.29    0.0284

a)  What is the regression line for the female employees? What is the expected salary increase for female employees for each additional year of experience?

b)  What is the regression line for the group of male employees? What is the expected salary increase for male employees for each additional year of experience?

c)  Test whether there is an interaction between experience and gender using α = 0.05. Explain precisely what interaction means in the context of this problem.

d)  Is the gender discrimination claim valid? Explain your answer.
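For parts a) and b), substituting x2 = 0 or x2 = 1 into the interaction model collapses it to a separate straight line in x1 for each group: the intercept becomes β0 + β2·x2 and the slope becomes β1 + β3·x2. The following Python sketch only illustrates that substitution with the estimates printed in the R output. The helper name `fitted_line` is hypothetical, and the sketch is not a model answer.

```python
# Coefficient estimates copied from the "Estimate" column of the R output.
b0, b1, b2, b3 = 50318.3, 3079.3, 773.7, 1969.0


def fitted_line(x2):
    """Return (intercept, slope in x1) of the fitted line for a group.

    x2 = 0 codes female employees, x2 = 1 codes male employees.
    """
    intercept = b0 + b2 * x2
    slope = b1 + b3 * x2
    return intercept, slope


female = fitted_line(0)
male = fitted_line(1)

print("female: intercept %.1f, slope %.1f" % female)
print("male:   intercept %.1f, slope %.1f" % male)
```

The slope of each line is the estimated salary change per additional year of experience for that group, and the difference between the two slopes is the interaction estimate for X1X2.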

2.  Clean-Teeth, Inc. is a company selling toothpaste in the U.S. Clean-Teeth has the largest share of the U.S. market, with one main competitor in the market. It is widely known that the higher the unit price of toothpaste, the lower the market share of Clean-Teeth. But some marketing experts believe that there is no relationship between Clean-Teeth's market share (y) and the unit price of toothpaste (x1) it charges if the main competitor's unit price (x2) is taken into account.

To investigate this issue, the market share of Clean-Teeth was analyzed over a period of 123 weeks. For each week in the sample, Clean-Teeth's market share (y), Clean-Teeth's unit price (x1), and the main competitor's unit price (x2) were recorded. A multiple regression analysis was carried out using these data and the model:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$$

Part of the resulting R output is given below. Note that some parts of the output have been deleted on purpose and replaced by "?".

a)  Fill in all results in the output denoted by "?".

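ANOVA

| Source     | Sum of Squares | DF  | Mean Square | F | Sig.   |
|------------|----------------|-----|-------------|---|--------|
| Regression | 0.049271       | ?   | ?           | ? | 0.0000 |
| Residual   | 0.194145       | ?   | ?           |   |        |
| Total      | 0.243416       | 122 |             |   |        |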

Parameter Estimates

| Model       | Beta      | Std. Error | t |
|-------------|-----------|------------|---|
| (Intercept) | 0.771059  | 0.158934   | ? |
| X1          | -0.238899 | 0.047594   | ? |
| X2          | -0.071115 | 0.075431   | ? |

b)  What are the hypotheses associated with the F statistic in the above ANOVA table? Perform this test and discuss your conclusion.

c)  Test the null hypothesis that β1 = 0 at the 5% significance level. Using the results given, what can you say about the experts' claim?

d)  Obtain a 95% confidence interval for β2.

e)  Obtain $R^2$ and the adjusted $R^2$.

f)  What is the estimate of the standard deviation of the error term?
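The quantities requested in this problem all follow from the sums of squares, the sample size, and the coefficient table. The Python sketch below shows one way to carry out the arithmetic under the usual definitions (regression DF = number of predictors, residual DF = n − k − 1, t = estimate / standard error). The critical value 1.980 for t with 120 degrees of freedom is taken from a standard t table and is an assumption of this sketch, not something given in the output.

```python
import math

n = 123  # weeks in the sample
k = 2    # predictors: x1 and x2

# Sums of squares copied from the ANOVA table.
ssr, sse, sst = 0.049271, 0.194145, 0.243416

# Degrees of freedom, mean squares, and the overall F statistic.
df_reg = k
df_res = n - k - 1
msr = ssr / df_reg
mse = sse / df_res
f_stat = msr / mse

# (estimate, standard error) pairs copied from the Parameter Estimates table.
estimates = {
    "(Intercept)": (0.771059, 0.158934),
    "X1": (-0.238899, 0.047594),
    "X2": (-0.071115, 0.075431),
}
t_values = {name: b / se for name, (b, se) in estimates.items()}

# 95% confidence interval for beta2.
# ASSUMPTION: t(0.025; 120) = 1.980, read from a standard t table.
t_crit = 1.980
b2, se2 = estimates["X2"]
ci_beta2 = (b2 - t_crit * se2, b2 + t_crit * se2)

# Goodness of fit and the estimated error standard deviation.
r2 = ssr / sst
adj_r2 = 1 - mse / (sst / (n - 1))
s = math.sqrt(mse)

print("DF:", df_reg, df_res)
print("MSR = %.6f, MSE = %.6f, F = %.3f" % (msr, mse, f_stat))
for name, t in t_values.items():
    print("t(%s) = %.3f" % (name, t))
print("95%% CI for beta2: (%.4f, %.4f)" % ci_beta2)
print("R^2 = %.4f, adjusted R^2 = %.4f, s = %.4f" % (r2, adj_r2, s))
```

Comparing each |t| with the critical value, and checking whether the interval for β2 contains zero, gives the decisions asked for in parts c) and d).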

3.  A sample of 25 brands of cigarettes was tested to relate the carbon monoxide content (y) to tar (x1), nicotine (x2), and weight (x3). The following are the R ANOVA tables for four models fitted to the data:

lm(formula = Carbon ~ Tar + Nicotine + Weight, data = cig)

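ANOVA

| Source     | Sum of Squares | DF | Mean Square | F      | Sig.   |
|------------|----------------|----|-------------|--------|--------|
| Regression | 495.258        | 3  | 165.086     | 78.984 | 0.0000 |
| Residual   | 43.893         | 21 | 2.090       |        |        |
| Total      | 539.150        | 24 |             |        |        |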

lm(formula = Carbon ~ Tar, data = cig)

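ANOVA

| Source     | Sum of Squares | DF | Mean Square | F      | Sig.   |
|------------|----------------|----|-------------|--------|--------|
| Regression | 494.281        | 1  | 494.281     | 253.37 | 0.0000 |
| Residual   | 44.869         | 23 | 1.951       |        |        |
| Total      | 539.150        | 24 |             |        |        |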

lm(formula = Carbon ~ Nicotine, data = cig)

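ANOVA

| Source     | Sum of Squares | DF | Mean Square | F       | Sig.   |
|------------|----------------|----|-------------|---------|--------|
| Regression | 462.256        | 1  | 462.256     | 138.266 | 0.0000 |
| Residual   | 76.894         | 23 | 3.343       |         |        |
| Total      | 539.150        | 24 |             |         |        |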

lm(formula = Carbon ~ Weight, data = cig)

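ANOVA

| Source     | Sum of Squares | DF | Mean Square | F     | Sig.   |
|------------|----------------|----|-------------|-------|--------|
| Regression | 116.057        | 1  | 116.057     | 6.309 | 0.0195 |
| Residual   | 423.094        | 23 | 18.395      |       |        |
| Total      | 539.150        | 24 |             |       |        |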

a)  Which of the variables individually contribute significantly to the prediction of the carbon monoxide content?

b)  Find SSE(X1), SSR(X2, X3 | X1) and MSR(X2, X3 | X1).

c)  Use the nested models test to see if the variables nicotine and weight can be removed from the complete first-order model. Specify the null and alternative hypotheses, the value of the test statistic, and a decision at α = 0.05.

d)  Calculate $R^2$ and $R_a^2$ for the first two models (the full model and the model with just tar). Do these values agree with your conclusion in part c)? Explain.
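The extra sum of squares and the nested (partial) F statistic in parts b) and c) come from comparing the residual sums of squares of the tar-only model and the full model. The Python sketch below works through that comparison and the two pairs of $R^2$ values from part d). The critical value 3.47 for F with (2, 21) degrees of freedom at the 0.05 level is taken from a standard F table and is an assumption of this sketch. The variable names are illustrative only.

```python
n = 25  # brands in the sample

# Values copied from the ANOVA tables above.
sst = 539.150
sse_full, df_full = 43.893, 21      # Carbon ~ Tar + Nicotine + Weight
sse_reduced, df_reduced = 44.869, 23  # Carbon ~ Tar, so SSE(X1) = 44.869

# Extra sum of squares for adding nicotine and weight to the tar-only model.
ssr_extra = sse_reduced - sse_full       # SSR(X2, X3 | X1)
df_extra = df_reduced - df_full          # number of coefficients tested
msr_extra = ssr_extra / df_extra         # MSR(X2, X3 | X1)

# Nested-model F statistic: H0 is beta2 = beta3 = 0,
# Ha is that at least one of beta2, beta3 is nonzero.
mse_full = sse_full / df_full
f_stat = msr_extra / mse_full

# ASSUMPTION: F(0.05; 2, 21) = 3.47, read from a standard F table.
f_crit = 3.47
reject_h0 = f_stat > f_crit


def r_squared(sse, df_res):
    """Return (R^2, adjusted R^2) for a model with the given residual SS and DF."""
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / df_res) / (sst / (n - 1))
    return r2, adj_r2


r2_full, adj_r2_full = r_squared(sse_full, df_full)
r2_tar, adj_r2_tar = r_squared(sse_reduced, df_reduced)

print("SSR(X2, X3 | X1) = %.3f, MSR = %.3f" % (ssr_extra, msr_extra))
print("F = %.4f, reject H0: %s" % (f_stat, reject_h0))
print("full model: R^2 = %.4f, adjusted R^2 = %.4f" % (r2_full, adj_r2_full))
print("tar only:   R^2 = %.4f, adjusted R^2 = %.4f" % (r2_tar, adj_r2_tar))
```

Adjusted $R^2$ penalizes the two extra predictors, so it can be compared across the two models in a way that plain $R^2$ cannot.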