
POLS0010 Guided Marking exercise 2022-23

In a previous assessment for POLS0010, students were asked to complete the following task:

1.   Multiple Linear Regression (60 Points)

This question uses the ca_indresp_w_POLS0010.dta dataset. You have been asked to write a short report on the relationship between the amount of time spent on childcare or home-schooling during the UK’s coronavirus lockdown and psychological distress. You should create a categorical variable of time spent on childcare or home-schooling with up to four levels and use this variable to fit a multiple linear regression model(s) predicting psychological distress. You may choose to add additional explanatory variables to your model that may explain the relationship between time spent on childcare and home schooling and psychological distress. You may choose to recode or transform some variables in your data. You should report any decisions you take to adjust for individual non-response in your data and to take account of any complex survey design. These decisions should form an introduction that also includes a description of your dataset, subsetting of respondents, descriptive statistics and your research hypothesis. Briefly explain any limitations to your analysis in a concluding section that also summarises your main substantive finding.

The three appended submissions were marked against the following criteria:

Description of the dataset

Description of measures used

Presentation of descriptive statistics

Multiple regression output

Regression interpretation

Conclusions

Limitations

You are required to score each paper on these criteria on the following scale:

Fail (<40%)

40-49

50-59

60-69

70-79

80+

Post your marks on Mentimeter for the three papers before 12th December 2022. We will discuss your grading at the final POLS0010 Term 1 online Q&A on 12th December at 9am.

Paper 1

This part of the assessment analyzes the relationship between the amount of time spent on childcare or home-schooling during the UK’s coronavirus lockdown and individual psychological distress by using data from the Understanding Society COVID-19 study in April 2020. The research hypothesis is that psychological distress tends to increase as the amount of time spent on childcare or home schooling increases. The null hypothesis is that there is no relationship between psychological distress and time spent on childcare or homeschooling. A multiple linear regression is constructed by using subjective well-being (GHQ) as the dependent variable, with time spent on childcare or home schooling, sex and how often one feels lonely as three predictor variables. Subjective well-being is used to reflect the levels of psychological distress with larger scores indicating higher distress levels. Sex and the frequency of loneliness feeling are selected as two control variables. The interaction effects between sex and time on childcare variables are taken into account since it is found that the effects of childcare time on the well-being levels of mothers and fathers are different (Musick et al., 2016). Therefore, if the regression estimates between subjective well-being and time spent on childcare are positive, it can be concluded that while controlling for all other variables, as the amount of time spent on childcare or homeschooling increases, scores on subjective well-being also increase, indicating a rise in the psychological distress felt by people.

Before reporting the results of the analysis, a few modifications to the data need to be outlined. First, respondents without children aged under 18 are filtered out of the dataset because it is assumed that households without children do not spend time on child-caring or homeschooling. However, after removing observations without children, there are still 14% data missing on the childcare/homeschooling time variable, around 80% of which is made up of inapplicable answers, indicated by the value of -8. I decide to remove these rows because it is inconvenient to work with them in the later analysis. This results in a final sample size of 5413. Values of the time spent on childcare or homeschooling are then divided into four categories based on the minimum value, lower quartile value, median and the upper quartile value in order to ensure a similar spread of samples in each category and make the regression prediction more accurate. The categories created are less than 1 hour, 1 to 6 hours, 7 to 19 hours and 20 hours or more, each of which contains a sample size of around 1000. In terms of the dependent variable, sequential hot-deck imputation is used to adjust for item non-response by using how often one feels lonely as domain variable as it is assumed that people who feel lonely may have similar levels of subjective well-being. Moreover, since the original data is clustered and stratified probability samples of postal addresses in the UK, a survey design object is created to adjust for differential non-response and unequal selection probabilities occurred in the two sampling methods.
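The four-way split of childcare time described above can be sketched in a few lines. This is a minimal Python illustration rather than the code used for the submission: the cut points 1, 7 and 20 are read off the category labels in the text, and the names `CUT_POINTS`, `LABELS` and `hours_category` are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the second, third and fourth categories,
# taken from the labels reported in the text (assumption: whole hours).
CUT_POINTS = [1, 7, 20]
LABELS = [
    "less than 1 hour",
    "1 to 6 hours",
    "7 to 19 hours",
    "20 hours or more",
]

def hours_category(hours):
    """Map weekly childcare/home-schooling hours to one of four labels."""
    # bisect_right counts how many cut points are <= hours,
    # which is exactly the index of the matching category.
    return LABELS[bisect_right(CUT_POINTS, hours)]

for h in (0, 6, 7, 19, 20):
    print(h, "->", hours_category(h))
```

With quartile-based cut points, each category should hold roughly a quarter of the respondents, which is the "similar spread" the paragraph refers to.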

Descriptive statistical analysis is performed based on the survey design object created. As shown by figure 1, there are 3319 female and 2094 male respondents in my final model. The mean time spent on child-caring or homeschooling is 17.15 hours. The standard error at 0.60 indicates a relatively wide spread of data, implying that the amount of time spent on child-caring or homeschooling varies largely among respondents. Subjective well-being has a range of 0-36, and larger values represent higher distress levels. The mean score of subjective well-being is 13.21 and standard error is 0.16.

 

Among the four categories of the time on childcare and homeschooling variable, the first category, less than 1 hour, is taken as the reference category with others comparing their regression coefficients against it. As shown in figure 2, it is found in the sex variable that, with a p value smaller than the alpha level (0.1), men generally enjoy 0.23 unit lower score in subjective well-being and thus lower distress levels than women do, controlling for all other variables. By further taking the interaction effects between sex and time into account, men and women show contrasting relationships between time spent on childcare and respective well-being scores. While a positive relationship is observed for female, male respondents tend to have lower well-being and thus lower psychological distress levels as the amount of time spent on childcare and homeschooling increases. For instance, women spending between 1 and 6 hours, 7 to 19 hours and 20 hours or more on childcare are predicted to have 0.44, 0.77 and 1.13 unit higher well-being scores than those spending zero hour, keeping all other variables constant. However, men are shown to score 1.19 unit lower on the well-being variable if they increase the time to more than 20 hours. Despite this, it should be noted that the p values of male’s time on childcare are greater than the alpha level (0.1), which means there is a failure of rejecting the null hypothesis and the regression estimates of male’s childcare time are not statistically significant. The high standard errors (around 0.8) also make the observed relationships less reliable as the well-being scores vary greatly among individuals. Thus, it may be inaccurate to infer that men experience decreasing distress as their time on childcare or homeschooling increases. On the other hand, it can be concluded that there is a statistically significant relationship between subjective well-being and women’s time on childcare as the p values are smaller than 0.1. In other words, the psychological distress of women increases as they spend more time on childcare or homeschooling. Additionally, women also tend to experience higher psychological distress than men do, given that they spend the same amount of time spent on childcare and fall under the same categories in the loneliness variables.

 

Last but not least, a few limitations of my analysis should be stated. First, there are only two controlling variables in the final regression model; yet variables not included in the model may also affect the relationship between individual well-being and time spent on childcare. The uncertainty in whether subjective well-being of male decreases with the increase in time on childcare may be explained by the failure of controlling for other explanatory variables that affect the relationship. Second, samples with zero weights are kept in the dataset, which may lead to an increase in standard error in the newly created survey design object, further undermining the reliability of the results (Fotini et al., 2013:9).

Paper 2

Introduction

This report analyses the ‘Understanding Society COVID-19 study, April 2020’ and discusses the relationship between time spent on childcare/home-schooling and psychological distress. I have added additional explanatory variables to explain this relationship: age, sex, how often someone feels lonely and if they smoke.

To subset the data, I removed respondents that did not live with children as they did not seem appropriate to consider in the analysis. I decided to do a complete case analysis due to there being too much information missing for hot-deck imputation to be used, so I removed respondents with subjective wellbeing values and/or time spent on childcare/home-schooling below 0 as these indicated missing, inapplicable or impractical data.

To consider clustering and stratification, I created a survey design object. Before doing this, I removed any non-credible weights (<0). I set ‘id’ to ‘psu’ (primary sampling units) and specified the stratifying variable as ‘strata’. To account for scenarios where a stratum has one PSU, I used options(survey.lonely.psu = "adjust").

My research hypothesis was that the more time spent on childcare/home-schooling, the less psychologically distressed they would be because spending time with family is thought to relieve stress. However, I thought that there would be a stronger relationship between how often one feels lonely and psychological distress because it seems logical that someone who feels lonely more has poorer mental health.

Explanation of variables:

Why I chose to include it in the analysis/What impact it may have

Psychological distress

Results

Multiple linear regression model: psychological distress = b0 (intercept) + b1 * time spent on childcare/home-schooling + b2 * frequency of loneliness + b3 * age + b4 * sex + b5 * whether they smoke

                   Coefficient    Standard    t-value      p-value
                   Estimates      Error
[Intercept]        11.598         1.134       10.225       <2e-16**
11-20 (hours)      0.015          0.375       0.040        0.968
21-50 (hours)      0.395, 1.330, 0.184 (one of the four values is missing in the source)
51-144 (hours)     -0.083         0.507       -0.164       0.870
Sometimes          5.003          0.333       15.501       <2e-16**
Often              12.211         0.789       15.679       <2e-16**
[Age]              -0.015         0.019       -0.757       0.450
Female             1.115          0.305       3.662        0.0003**
[Smoking]          -0.853         0.567       [missing]    0.133

Multiple R-Squared    Adjusted R-Squared    F-statistic               p-value (for F-Statistic)
0.324                 0.322                 156.9 on 8 and 2622 DF    <2e-16

** significance level: 0.05

Analysis

I will start by analysis of the intercept, 11.598. This suggests that when a 17-year-old male smoker who hardly/never feels lonely spends 1-10 hours on childcare/home-schooling, their subjective wellbeing score is 11.598, meaning they don’t face substantial psychological stress. The p-value is under the significance level 0.05 so the coefficient is significant.

The p-values for childcare/home-schooling are larger than 0.05, meaning the coefficients are insignificant, and so it could be argued that the null hypothesis, that there is no relationship between time spent on childcare/home-schooling and psychological distress, is true. For example, when time spent on childcare is 11-20 hours, it appears that psychological distress is 0.015 units higher than if 1-10 hours was spent; however, a p-value of 0.968 implies the result is insignificant, as it means, if the null hypothesis (no relationship) is true, there is 96.8% chance of getting this result. The p-values for age and whether someone smokes are also above 0.05 meaning there is no significant relationship with psychological distress.
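As a rough check on how the reported p-values relate to the t-values in the results table, the sketch below converts a t-value into a two-sided p-value. It assumes a standard normal approximation, which is very close to the t distribution at the 2622 residual degrees of freedom reported for this model; the function name `two_sided_p` is hypothetical, and this is not the code used for the submission.

```python
from math import erfc, sqrt

def two_sided_p(t_value):
    """Two-sided p-value for a test statistic, using the standard normal
    distribution as an approximation to the t distribution."""
    # P(|Z| >= |t|) for a standard normal Z equals erfc(|t| / sqrt(2)).
    return erfc(abs(t_value) / sqrt(2))

# t-values for the childcare categories as reported in the results table.
for label, t in (("11-20 hours", 0.040), ("51-144 hours", -0.164)):
    print(label, round(two_sided_p(t), 3))
```

The value returned is the probability, if the null hypothesis were true, of a test statistic at least this far from zero.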

How often someone feels lonely is a significant variable in explaining psychological distress as p<2e-16; this means that this relationship is reflected in the population and that it is not by chance. Someone who feels lonely often has a subjective wellbeing score that is 12.211 units higher than someone who hardly ever or never feels lonely, meaning they are considerably more psychologically distressed.

Sex is another significant explanatory variable, however it has less explanatory power than frequency of loneliness. Females have a score 1.115 units higher than men, suggesting they are slightly more distressed. Controlling other variables, a woman who is often lonely would have a subjective wellbeing score of 11.598 + 12.211 + 1.115 = 24.924, as opposed to a man who never feels lonely, who would have a score of 11.598; there is a difference of 13.326.
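The predicted scores in this comparison follow directly from adding the relevant coefficients to the intercept. A minimal Python sketch of that arithmetic, using only the estimates quoted in the text (the variable names are hypothetical, and all other predictors are assumed to sit at their reference values):

```python
# Coefficient estimates quoted in the text.
intercept = 11.598      # reference respondent
often_lonely = 12.211   # "often" lonely vs "hardly ever or never"
female = 1.115          # female vs male

# Predicted score for a woman who is often lonely.
woman_often_lonely = intercept + often_lonely + female

# Predicted score for a man who never feels lonely (reference categories).
man_never_lonely = intercept

difference = woman_often_lonely - man_never_lonely

print(round(woman_often_lonely, 3), round(difference, 3))
```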

The adjusted R-squared gives a value of 0.322, meaning 32.2% of the variance is explained by the model. This suggests that there are other variables explaining why someone is psychologically distressed. The F-statistic of 156.9 has a p-value under 0.05, telling us that the model has at least one significant independent variable and that the model has some explanatory power (demonstrated above).
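The adjusted R-squared can be checked against the multiple R-squared with the standard textbook formula. The sketch below is a Python illustration, not the submission's code; it assumes 2631 observations (the figure given in the Limitations section) and 8 estimated slope coefficients (the numerator degrees of freedom of the F-statistic), and the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared: penalises R-squared for the number of predictors."""
    residual_df = n_obs - n_predictors - 1
    return 1 - (1 - r_squared) * (n_obs - 1) / residual_df

# 2631 observations and 8 predictors leave 2622 residual degrees of freedom,
# matching the "8 and 2622 DF" reported with the F-statistic.
adj = adjusted_r_squared(0.324, 2631, 8)
print(round(adj, 3))
```

Because the sample is large relative to the number of predictors, the adjustment changes the value only in the third decimal place.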

I decided not to include interactive terms in my final model because the results appeared to be insignificant. For example, I thought that age and sex could be considered as a multiplicative term when predicting psychological distress. However, after performing a partial F-test, F=0.324 (p-value=0.569), meaning we accept the null hypothesis - the coefficient for the interactive term would be 0.

Results using anova() function comparing a model with the interactive term (age and sex) and one without:


I calculated the relative importance of each independent variable to see what variable contributed most to the model’s total explanatory value. As seen below, how often one feels lonely has the most contribution to the model’s total explanatory value, over 80%. Time spent on childcare and whether one smokes contribute to R² the least.

 

Limitations

A limitation to this model is that many values were removed, meaning the number of observations analysed was significantly smaller than the original number (17452 to 2631), meaning it is difficult to determine whether these results would be replicated in the sample population. I took the decision not to impute missing values because, with more than 10% of the data missing, mean imputation or hot-deck imputation did not seem suitable as they would cause the inferences to be incorrect due to distortion of variation and the general distribution, and multiple imputation was too complex.

Other limitations to the model include the lack of transforming variables. With greater inspection, there may be a suitable transformation to achieve a more linear relationship between the variables.

Conclusion

It appears that there is not a significant relationship between childcare and psychological distress. There does, however, appear to be an evident relationship suggesting that the more frequently one feels lonely, the greater psychological distress they face. Whilst there is also a relationship between sex and psychological distress, this is not as substantial in terms of its explanatory power.


Paper 3

Introduction

This study aims to investigate the relationship between weekly time spent on childcare or home-schooling during the UK’s coronavirus lockdown and psychological distress. The method is a multiple regression model between childcare/homeschooling time, subjective wellbeing, and two other explanatory variables, namely sex and couple (which notes whether individual is living with a couple).

Data is collected from the “Understanding Society” study on the UK Data Service, which is a clustered and stratified probability sample of UK postal addresses. This study first samples the individuals who have at least one child by subsetting the individuals with at least one above-zero value in the three variables that denote the number of children in the household aged 0-4, 5-15, and 16-18, respectively. The next step is cleaning the invalid responses on each included variable. Table 1 shows the frequencies of each type of response.

Attempting to reduce individual non-response bias, this study weights the data by the given measurement of weight that are designed to correctly reflect individuals’ possibility of being selected. This study cleans the minus weights which indicate inapplicableness. Table 2 summarises the sample size after each data subsetting procedure reported above. The order of each step would not affect the final sample size of 3849.

Table 1

 

Table 2

 

This study identifies the cluster and strata used by the survey before fitting the regression model. According to the user manual from “Understanding Society” (Kaminska et al., 2019), failure to account for complex survey design would not affect the estimates in analysis but would affect associated standard errors. This study discovers one stratum with only one Primary Sampling Unit and chooses to utilise the variance estimator that uses residuals from the population mean since it is considered a conservative approach. Multi-stage design is ignored because it generally has little influence and is not emphasised by the user manual.

Table 3 compares the descriptive statistics on the samples before and after adjusting for weights and complex survey design. This study converts the highly right-skewed childcare/homeschooling time to a 4-level categorical variable based on the variable’s statistics and general knowledge. Categorisation is presented in Table 4. “0 hour” is set alone as the first category because 1) it consists of a large number of observations and 2) no time spent on childcare or homeschooling is assumed by this study as distinguishing from spending at least some time. The second category is “1-24 hours,” meaning the total time spent is at most a full day’s time. The third category is “25-40 hours,” meaning up to 10 hours per day is spent. The fourth category is the remaining “41-144 hours.” Each category includes at least 10% of the total weight.
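The statement that each category holds at least 10% of the total weight can be checked by summing survey weights within categories. The Python sketch below illustrates the idea only: the category bounds come from the text, but the (hours, weight) pairs are hypothetical stand-ins for the survey data, and the names `category` and `weighted_shares` are assumptions.

```python
from bisect import bisect_right
from collections import defaultdict

# Lower bounds of categories 2-4, read off the labels in the text.
CUT_POINTS = [1, 25, 41]
LABELS = ["0 hour", "1-24 hours", "25-40 hours", "41-144 hours"]

def category(hours):
    """Return the category label for a weekly hours value."""
    return LABELS[bisect_right(CUT_POINTS, hours)]

def weighted_shares(records):
    """Share of the total survey weight that falls in each category."""
    totals = defaultdict(float)
    for hours, weight in records:
        totals[category(hours)] += weight
    grand_total = sum(totals.values())
    return {label: totals[label] / grand_total for label in LABELS}

# Hypothetical (hours, weight) pairs, for illustration only.
records = [(0, 1.2), (0, 0.8), (10, 1.0), (24, 1.5),
           (30, 0.7), (40, 0.9), (50, 0.6), (144, 0.8)]

shares = weighted_shares(records)
for label, share in shares.items():
    print(label, round(share, 3))
```

Using weighted rather than raw counts matters here because the regression is fitted on the weighted sample.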

Table 3

 

The dependent variable is General Health Questionnaire score, which is used in attempt to measure subjective wellbeing. The score range is 0-36, where 0 represents least distress and 36 represents most distress. This research hypothesises that psychological distress would be larger for the groups reporting more childcare/homeschooling time.

Table 5 presents the results of the regression model. No serious problem on multicollinearity has been found. Women with couples and zero time spent on childcare/homeschooling have a mean subjective wellbeing score of 11.148. The mean score of men is 1.581 points higher than that of women, holding couple and childcare/homeschooling time constant. The mean score of people without couple is 2.232 points higher than that of people with couple (holding other variables constant). Since all of the above-mentioned estimates have significant p-value, it could be substantively concluded that for those who have children, male has slightly higher level of psychological distress than female, and people living with a couple has slightly less distress than those who don’t.

Table 5

 

According to the model, people spending 1-24 hours on childcare/homeschooling have a mean score 0.835 points higher compared with those spending no time (holding other variables constant). The difference is 0.963 for those spending 25-40 hours and the p-value isn’t significant. These estimates are small in size, so there is minor difference on distress level among people spending at most 40 hours per week on childcare/homeschooling. Those spending 41-144 hours score on average 1.579 points higher than those spending no time (holding other variables constant). For people spending more than 40 hours on childcare/homeschooling, their level of psychological distress is slightly higher than those spending zero or less time.
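The coefficients quoted in the two paragraphs above combine additively into predicted mean scores. The following Python sketch shows that calculation under the assumption that the model contains no interaction terms; the estimates are those quoted in the text, while the function and dictionary names are hypothetical.

```python
# Baseline: women living with a partner who spend zero hours (from the text).
INTERCEPT = 11.148

# Differences from the baseline, as quoted in the text.
MALE = 1.581
NO_PARTNER = 2.232
HOURS_EFFECT = {
    "0 hour": 0.0,
    "1-24 hours": 0.835,
    "25-40 hours": 0.963,
    "41-144 hours": 1.579,
}

def predicted_score(male, no_partner, hours_band):
    """Predicted mean wellbeing score for one combination of predictors."""
    score = INTERCEPT + HOURS_EFFECT[hours_band]
    if male:
        score += MALE
    if no_partner:
        score += NO_PARTNER
    return score

# Example: a man without a partner spending 41-144 hours per week.
print(round(predicted_score(True, True, "41-144 hours"), 3))
```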

In response to potential bias introduced by categorising continuous variable, this study tries different categorisations and summarises the corresponding regression results in Table 6. The substantive findings are generally in line with what’s discussed above, with higher stress in above-73hrs and -82hrs groups (they have small sample size). R² is low in all models.

Table 6

 

Conclusions

In examining the relationship between hours spent on childcare/homeschooling during UK’s coronavirus lockdown and psychological distress, this study finds only minor difference among people spending no time or at most standard working time (40hrs/week). People spending even higher amount of time show higher level of distress. Additionally, female shows less distress than male and people living with a couple shows less distress than those who don’t. Only little variation in the outcome variable could be explained by the models.

Besides the above-mentioned limitation resulting from categorising continuous variable, this study acknowledges the limitation of self-reported childcare/homeschooling time, which could lead to random measurement error. However, the large sample size would make this issue less problematic.