Analyzing Fertility Data in STATA or R

Open the dataset SNIR61FL.dta in STATA or in R (in R you will first need to import it). The dataset comes from the Demographic and Health Surveys (DHS) Program; the documentation is available at dhsprogram.com. Download the codebook to help you understand the dataset. I have cleaned the dataset to make it easier to use, but I have kept the original variable names so that you can look up their meaning in the codebook. Relevant material can be found here:

http://dhsprogram.com/publications/publication-search.cfm?type=35

INSTRUCTIONS

Save all commands that you use in a do-file or R-script.

Start the do-file/R-script by writing your name, the date and the assignment.

You will need to submit this file for grading. (5p)

1. BASIC INFORMATION FROM THE DATASET (5p)

a. What country is the dataset from?

b. What is the year of survey?

c. What is the unit of observation?

d. What is n?

e. What is the survey round?

 

2. SUMMARY STATISTICS (10p)

a. Create a summary statistics table for age, education, average fertility and ideal fertility. (Include the mean, maximum, minimum and number of observations.)

Note: When looking up these variables in the codebook, check which code is used for missing information (sometimes it is 9, 99 or 999). You will need to exclude missing values before creating the summary statistics table.
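
The note above can be sketched in R as follows. This is a minimal illustration on made-up data, not the assignment dataset: the column names (age, educ, children, ideal) and the missing codes 99 and 999 are hypothetical stand-ins, so replace them with the variable names and codes given in the codebook.

```r
# Toy data standing in for the survey; all names and codes are hypothetical.
toy <- data.frame(
  age      = c(23, 35, 41, 19, 99),   # 99 = assumed missing code
  educ     = c(0, 6, 99, 12, 3),      # 99 = assumed missing code
  children = c(1, 4, 6, 0, 2),
  ideal    = c(3, 5, 999, 2, 4)       # 999 = assumed missing code
)

# Recode the assumed missing codes to NA so they drop out of the statistics.
missing_codes <- list(age = 99, educ = 99, ideal = 999)
for (v in names(missing_codes)) {
  toy[[v]][toy[[v]] %in% missing_codes[[v]]] <- NA
}

# One row per variable: mean, minimum, maximum and non-missing observations.
summary_table <- data.frame(
  mean = sapply(toy, mean, na.rm = TRUE),
  min  = sapply(toy, min,  na.rm = TRUE),
  max  = sapply(toy, max,  na.rm = TRUE),
  n    = sapply(toy, function(x) sum(!is.na(x)))
)
print(summary_table)
```

Recoding to NA first, rather than dropping whole rows, keeps each variable's observation count separate, which is why the table reports n per variable.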

b. Create a bar chart of the age distribution in the sample.

c. Create a bar chart of the age distribution using 5-year bins.
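
A possible way to build both charts in base R is sketched below. The ages are invented, and the 15-49 range used for the bins is an assumption; check the actual age range in the data before choosing the breaks.

```r
# Hypothetical ages; in the assignment this would be the age variable from the codebook.
age <- c(15, 17, 22, 24, 24, 31, 38, 44, 49)

# Bar chart of single years of age: one bar per distinct age.
barplot(table(age), xlab = "Age", ylab = "Number of respondents")

# Group ages into 5-year bins: [15,20), [20,25), ..., [45,50).
age_group <- cut(age, breaks = seq(15, 50, by = 5), right = FALSE,
                 labels = c("15-19", "20-24", "25-29", "30-34",
                            "35-39", "40-44", "45-49"))
bin_counts <- table(age_group)
barplot(bin_counts, xlab = "Age group", ylab = "Number of respondents")
```

Setting right = FALSE makes each bin include its lower bound and exclude its upper bound, so a respondent aged 20 falls in the 20-24 group rather than 15-19.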

3. BASIC CHARACTERISTICS (10p)

a. In what years were the respondents born? (Give the range.)

b. What is the average age of the respondents?

c. What share of respondents have ever been married? (Note that you need to remove missing values. There are several similar variables. Search for “marital” rather than “married”.)

d. What is the average number of years of education? (Use v106)

e. What are the minimum and maximum levels of education?

f. Describe the households’ average living standards (TV, electricity, latrine etc.). Create suitable statistics and describe them in words.

g. What is the gender distribution of the household head?

h. How many children has a woman had on average?

i. How many living children does a woman have on average?

j. What is the average fertility among women who have completed their fertility? (Use the age variable to determine the subset of the sample that has likely completed its fertility.)

k. What is a woman’s ideal fertility (her desired number of children) and how does it relate to her actual fertility?

4. REGRESSION ANALYSIS (30p)

a. Fit a model that describes what factors predict the number of children a woman has given birth to. Write the equation for the model. (20p)

Note: Think about important variables that determine a woman’s fertility. Commonly included variables are age, education, urban/rural locality, and whether she is married or has a partner.
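
As an illustration of what such an equation might look like, one possible linear specification using only the variables listed in the note is given below. It is a sketch under the assumption of a linear model, not the expected answer, and the variable labels are illustrative rather than codebook names.

```latex
\[
\text{children}_i = \beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{education}_i
  + \beta_3\,\text{urban}_i + \beta_4\,\text{partner}_i + \varepsilon_i
\]
```

Here i indexes women, urban and partner are 0/1 indicators, and the error term collects everything that affects fertility but is not in the model.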

b. Interpret the coefficients. Are the results statistically significant? Please include a regression result table. (10p)
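
A regression result table can be produced in base R roughly as follows. The data are simulated purely for illustration, so the variable names, the data-generating process and the resulting numbers are all hypothetical and say nothing about the real dataset.

```r
# Simulated data for illustration only; names and values are hypothetical.
set.seed(1)
n <- 200
toy <- data.frame(
  age     = sample(15:49, n, replace = TRUE),
  educ    = sample(0:12, n, replace = TRUE),
  urban   = rbinom(n, 1, 0.4),
  partner = rbinom(n, 1, 0.6)
)
# Made-up data-generating process so the regression has something to recover.
toy$children <- 0.15 * toy$age - 0.10 * toy$educ - 0.30 * toy$urban +
  0.80 * toy$partner + rnorm(n)

# Ordinary least squares fit of the number of children on the four regressors.
fit <- lm(children ~ age + educ + urban + partner, data = toy)

# Coefficient table: estimate, standard error, t value and p-value per term.
coef_table <- summary(fit)$coefficients
print(round(coef_table, 3))
```

Each estimate is the difference in the predicted number of children associated with a one-unit difference in that variable, holding the others fixed. With survey data like these, such estimates describe associations and should not be read as causal effects.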

5. HYPOTHESIS TESTING (40p)

a. Come up with a hypothesis that can be tested using the data. Explain briefly why this is an interesting question. Point to recent economics literature on the topic. (20p)

b. Test this hypothesis using the dataset. Explain your findings. Are the results statistically significant? Please include any graphs or tables. (20p)

6. SUBMIT YOUR CLEAN DO-FILE/R-SCRIPT (5p)

GRADING KEY

Grading will be based on the following:

▪ Clarity of presented results. (Are the tables easy to read? Are variable names understandable or clearly defined? Are variable descriptions included?)

▪ Correct presentation of results. (Right type of graphs, tables)

▪ Correct interpretation of results. (Magnitude of result, direction of result, significance of coefficients)

▪ The language used clearly distinguishes between correlation and causation.