
STAT 207

Spring 2022

Test #2

1.    Consider the scatterplot below between the respondent’s highest year of school completed (educ), and their spouse’s highest year of school completed (speduc). Using ONLY the graph, what if any relationship do you see between these two variables? Describe its shape and direction if possible.

 

There is a positive, linear relationship between these two variables.

2.   Assume that it has been determined that a linear relationship exists between the variables sibs and childs. What is the appropriate correlation coefficient to report from among the four types we learned about in Chapter 5, and what is its value? Give both the name, and the value TO THREE DECIMAL PLACES OF ACCURACY.

Spearman’s r = 0.199 (Pearson’s is incorrect, r = 0.193)

3.    For the pair of variables race and income, just determine which of the four types of correlation coefficient might be appropriate to report, or answer “none” if no correlation coefficient would be appropriate.

None. Race is a nominal variable with more than two values.

4.    Consider the following plot for the variables sibs and race and indicate, based on the plot alone, whether there is a relationship between the variables.  If not, briefly explain why not.  If there is, briefly describe it.

 

There IS a relationship. Black respondents tend to have more siblings, as indicated by the median, than respondents of other races.

5.   This question does not require SPSS. Suppose two variables in a study are called hours and annincome. The first variable indicates the hours spent at work in a typical week, and the second variable is the person’s annual income. Suppose a linear relationship is suspected to exist between them and the correlation coefficient value is 0.21. The BEST contextual interpretation of this correlation coefficient is

i.    Increasing your hours worked per week increases your annual income.

ii.   There is a weak tendency for persons who work more hours per week to have a higher annual income.

iii.   A person’s annual income tends to increase by .21 percent for each extra hour of work per week.

iv.   There is no actual relationship between these variables.

6.   This problem is NOT from the GSS91-Social data set and does NOT require SPSS. Consider the clustered bar graph given below. The two variables are whether or not a person owns a gun (0 = “no”, 1 = “yes”), and whether or not a person is in a gang (0 = “no”, 1 = “yes”).

 

a.    What percentage of those who are not in a gang are gun owners?

49/(292+49) *100% = 14.4 %

b.    By computing another percentage associated with this graph, determine whether the variables have a positive or negative relationship. You must justify your answer with an appropriate calculation for credit.

The % of gun owners among those who are in a gang is 40/(70+40) *100% = 36.4%, which is higher than the 14.4% among those who are not in a gang, so the relationship between these two variables is positive.
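The two conditional percentages in question 6 can be checked with a short Python sketch. The counts are the ones used in the answers above; the dictionary layout and the function name are illustrative choices only.

```python
# Counts read off the clustered bar graph, as used in the answers above.
counts = {
    "not_in_gang": {"no_gun": 292, "gun": 49},
    "in_gang": {"no_gun": 70, "gun": 40},
}

def pct_gun_owners(group):
    """Percentage of gun owners within one gang-membership group."""
    row = counts[group]
    return 100 * row["gun"] / (row["gun"] + row["no_gun"])

pct_not_in_gang = pct_gun_owners("not_in_gang")
pct_in_gang = pct_gun_owners("in_gang")

# A higher gun-ownership rate in the "in gang" group indicates a positive relationship.
direction = "positive" if pct_in_gang > pct_not_in_gang else "negative or none"
print(round(pct_not_in_gang, 1), round(pct_in_gang, 1), direction)
```

Comparing the rate of gun ownership within each gang-membership group, rather than the raw counts, is what makes the two groups comparable despite their different sizes.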

7.    Back to our data set… Assume that there is a linear relationship between whether or not a person is married (married), and the number of hours of tv they watch per day (tvhours). What is the equation of the regression line that allows you to predict the number of hours of tv a respondent watches per day from their marital status?

predicted tvhours = 0.323(married) + 2.643

8.   Assuming a linear relationship IS appropriate for the variables maeduc (highest year of school completed, mother) and paeduc (highest year of school completed, father), the equation of the regression line that allows you to predict the respondent’s mother’s highest year of school completed from the father’s highest year of school completed is predicted maeduc = 0.558(paeduc) + 4.887.

a.    Would it be appropriate to use this equation to predict the value of maeduc for someone whose paeduc value was 22?  If yes, do so.  If no, explain why not.

No.  The largest value of paeduc in the data set is 20, so using the regression equation to predict maeduc for a paeduc value of 22 would not be appropriate.

b.    Is the y-intercept meaningful in this case? You must give a brief reason for credit.

Yes.  A paeduc value of 0 is within the range of paeduc values recorded in the data set.
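As a rough illustration of why extrapolation is refused in part (a), the sketch below wraps the regression equation in a function that only predicts inside the observed paeduc range. The range of 0 to 20 is taken from the two answers above; the function name and the error message are hypothetical.

```python
SLOPE, INTERCEPT = 0.558, 4.887

# Observed paeduc range, taken from the answers above (0 is observed, 20 is the maximum).
PAEDUC_MIN, PAEDUC_MAX = 0, 20

def predict_maeduc(paeduc):
    """Predict maeduc from paeduc, refusing to extrapolate beyond the data."""
    if not PAEDUC_MIN <= paeduc <= PAEDUC_MAX:
        raise ValueError(f"paeduc={paeduc} is outside the observed range")
    return SLOPE * paeduc + INTERCEPT

print(round(predict_maeduc(0), 3))   # the y-intercept, meaningful because 0 is in range
print(round(predict_maeduc(12), 3))  # an in-range prediction

try:
    predict_maeduc(22)
except ValueError as err:
    print("No prediction:", err)
```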

SPSS IS NOT NEEDED FOR THE REST OF THE EXAM.

9.    Suppose data was collected for two scale variables, religious which measures the level of religious feeling a person has as indicated by a numerical score from a survey, and alonetime which indicates the amount of hours spent alone in a typical week. The variables are found to have a linear relationship. Suppose a regression equation is found that allows one to predict a person’s alonetime from their religious score, and is given by predicted alonetime = 0.075(religious) + 2.25.

a.    If a person in the data set had a religious score of 70, and an actual alonetime value of 8 hours, find the value of this person’s residual.

Residual = actual Y − predicted Y = 8 − (0.075(70) + 2.25) = 8 − 7.5 = 0.50

b.    In a scatterplot of this data, would this person’s “dot” in the scatterplot be above the regression line, below the regression line, or on the regression line?

Above, since the residual is positive.

c.    Properly interpret the value of the y-intercept associated with the above regression equation in the context of this study. Use language that your little brother would understand. Don’t worry about whether or not it’s meaningful, just give the interpretation.

A person with a 0 religious score is predicted to spend about 2.25 hours alone per week on average.

d.   The correct interpretation of the slope in the above regression equation is (choose the one best response from below):

i.   For every one point higher in a person’s religious score, we expect on average a 2.25 increase in a person’s alonetime.

ii.   For every one hour more in a person’s alonetime, we expect on average a 2.25 increase in a person’s religious score.

iii.   For every one point higher in a person’s religious score, we expect on average a 0.075 hour increase in a person’s alonetime.

iv.   For every one hour more in a person’s alonetime, we expect on average a 0.075 increase in their religious score.
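The residual in part (a), the above/below reading in part (b), and the intercept and slope values in parts (c) and (d) can all be reproduced from the regression equation with a few lines of Python. This is only a sketch; the function names are illustrative.

```python
def predict_alonetime(religious):
    """Predicted hours alone per week from the regression equation in question 9."""
    return 0.075 * religious + 2.25

def residual(actual_alonetime, religious):
    """Residual = actual Y minus predicted Y."""
    return actual_alonetime - predict_alonetime(religious)

r = residual(8, 70)

# A positive residual puts the point above the line, a negative one below it.
position = "above" if r > 0 else "below" if r < 0 else "on"
print(round(r, 2), position)

# The intercept is the prediction at a religious score of 0.
print(round(predict_alonetime(0), 2))

# The slope is the change in predicted alonetime per one-point increase in religious.
print(round(predict_alonetime(71) - predict_alonetime(70), 3))
```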

10. Consider the following table that shows the results of 85 people surveyed about their preferred breakfast beverage.

Coffee    Tea    Milk    Juice    Water    Other
35        22     12      8        3        5

a.    What fraction of those surveyed prefer coffee with their breakfast?

35/85 = 0.41

b.    If a person is chosen randomly from those surveyed, what is the probability that they do not prefer milk?

1 − 12/85 = 0.86
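Both answers to question 10 follow directly from the table counts, as this small Python sketch shows. The variable names are illustrative.

```python
# Survey counts from the breakfast beverage table.
counts = {"Coffee": 35, "Tea": 22, "Milk": 12, "Juice": 8, "Water": 3, "Other": 5}

total = sum(counts.values())

# Part (a): fraction preferring coffee.
p_coffee = counts["Coffee"] / total

# Part (b): complement rule, P(not milk) = 1 - P(milk).
p_not_milk = 1 - counts["Milk"] / total

print(total, round(p_coffee, 2), round(p_not_milk, 2))
```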

11. Suppose the temperature in March in a particular city is normally distributed with a mean of 75 degrees and a standard deviation of 18 degrees.  Properly use the Excel calculator to answer the following questions.

a.    If you choose a random day in March to measure the temperature, how likely is it that the temperature you record is greater than 80 degrees?

You want the area to the right in the “X diagram,” which corresponds to the area to the right of z = (80 − 75)/18 = 0.28 in the “Z diagram.” The “forwards” calculator gives a probability of 38.97%.

b.   What temperature in March in this particular city is greater than 30% of the possible recorded temperatures in that city in March?

The score we seek must be to the left of the mean (the mean would be greater than 50% of the temperatures… we want a temperature that is greater than a smaller percentage than that). This temperature has 30% of the possible temperatures to its left. Entering 30% into the “backwards” calculator gives a z-score of −0.52.

This z-score is converted to a raw score temperature of x = (−0.52)(18) + 75 = 65.64 degrees.
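The “forwards” and “backwards” calculator steps in question 11 can be imitated with Python’s standard-library `statistics.NormalDist`. This is a sketch of the same arithmetic, not the Excel calculator used in the course. It rounds z to two decimal places as the answers above do, and also shows the unrounded results, which differ slightly.

```python
from statistics import NormalDist

MEAN, SD = 75, 18
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Part (a): "forwards" direction, from a temperature to a probability.
z_a = round((80 - MEAN) / SD, 2)        # z-score rounded to two decimals
p_above_80 = 1 - std_normal.cdf(z_a)    # area to the right of z

# Part (b): "backwards" direction, from a left-tail area to a temperature.
z_b = round(std_normal.inv_cdf(0.30), 2)
temp_30th = z_b * SD + MEAN             # convert the z-score back to degrees

# The same two answers without rounding z along the way.
temps = NormalDist(MEAN, SD)
p_exact = 1 - temps.cdf(80)
temp_exact = temps.inv_cdf(0.30)

print(z_a, round(p_above_80, 4))
print(z_b, round(temp_30th, 2))
print(round(p_exact, 4), round(temp_exact, 2))
```

With z rounded to 0.28 and −0.52 the results agree with the 38.97% and (−0.52)18 + 75 calculations above; the unrounded versions come out at roughly 39.1% and 65.6 degrees.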

12. Read the following excerpt from the article “Your coffee habit might be lowering your GPA,” published on usatoday.com in August, 2017, and answer the questions that follow.

Students who drink a cup of coffee or more have lower grade point averages than those who don't, a new survey suggests.

Survey data showed students who drank one cup of coffee a day had a GPA of 3.41, compared to students who drank two cups a day at 3.39. Those who drank five or more cups of coffee had an average GPA of 3.28, according to the survey.

a.   Assume that the survey data included variables cups (number of cups of coffee drunk per day) and gpa. Which ONE of the following could be the correlation coefficient between the two variables?

i.   0

ii.   0.18

iii.   −1.24

iv.   1.144

v.   0.54

b.   Should students drink less coffee to improve their GPA?  Why or why not?  A BRIEF (one sentence) explanation is necessary for credit.

No.  Correlation does not necessarily imply causation.