L1022: Project 100% for A1 2022-23

Maximum project length: 3,000 words

A Cross-Country Analysis of the Determinants of Renewable Electricity Production

In light of climate change, producing clean renewable energy is becoming increasingly important and urgent. The international need for renewable energy is acknowledged in the Sustainable Development Goals (SDGs). Target 7.2 of the SDGs is to “by 2030, increase substantially the share of renewable energy in the global energy mix”.

What determines how much renewable energy is produced in a country? This project focuses particularly on the production of electricity from renewable sources, excluding hydropower. (While hydropower is a valuable source of clean renewable energy, its production depends on particular factors and geographic considerations that are beyond the scope of this project.)

For this project, you are given data from the year 2015 for a random sample of countries. Your dataset includes the following variables:

· “Renewable electricity production, excluding hydroelectric (MWh)” (“RE_MWh” for short): This is the total annual electricity production from renewable sources other than hydroelectric (geothermal, solar, tidal, wind, biomass, and biofuels), measured in megawatt hours.

· “GDP per capita (constant 2015 US$)” (“GDP_pc” for short): GDP per capita is gross domestic product divided by midyear population, measured in 2015 US dollars. It can be considered a proxy for the average annual income per person.

· “Research and development expenditure (% of GDP)” (“R&D” for short): Expenditures on research and development (covering basic research, applied research, and experimental development) expressed as a percent of GDP.

· “Rural population (% of total population)” (“Rural_pop” for short): Rural population (as defined by national statistical offices) expressed as a percent of the total population.

· “Population, total” (“Pop” for short): Total population, counting all residents regardless of legal status or citizenship.

There is also a dummy variable that I created to help you with step 3 of the analysis below: “Small” takes the value 1 if the country has a population of fewer than 10 million inhabitants, and the value 0 if the country is large, with 10 million or more inhabitants. (You do not need to include this variable in the descriptive statistics and discussions of the dataset.)
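As a minimal sketch of how this dummy is defined, the Python snippet below rebuilds it from population figures. The three population values and the lowercase names `pop` and `small` are made up for illustration and are not taken from the project dataset; the project itself does not require Python.

```python
import numpy as np

# Hypothetical populations, NOT values from the project dataset.
pop = np.array([4_500_000, 10_000_000, 67_000_000])

# Small = 1 if the population is strictly below 10 million, otherwise 0.
small = (pop < 10_000_000).astype(int)

print(small)
```

Note that a country with exactly 10 million inhabitants counts as large, because the threshold is strict.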

For your data analysis, perform the following steps, and write up the results. (Use the marking scheme document to help you with the write-up.)

1. Describe the data, using summary statistics and graphs, as appropriate.

2. Calculate the Pearson correlation coefficients between RE_MWh, R&D, and Pop (in all possible combinations – so this should give you 3 coefficients). Test the statistical significance for each of the coefficients and comment on your results.

3. Test whether renewable electricity production (RE_MWh) is lower in small countries than in large countries. Consider a country small if it has fewer than 10 million inhabitants (you can use the variable Small here). Are your results in line with your expectations?

4. Generate a new variable that captures renewable electricity production per capita: RE_pc = RE_MWh/Pop. Why is this variable useful?

5. Calculate the Pearson correlation coefficients between the new variable RE_pc, GDP_pc, R&D, and Rural_pop. Do all pairwise correlations – this should give you 6 correlation coefficients. Comment on the results. (There is no need to test statistical significance here.)

6. Estimate a regression model of the form:

$$\mathrm{RE\_pc}_i = \alpha + \beta\,\mathrm{R\&D}_i + \varepsilon_i$$

where the i subscript corresponds to country i. Interpret the α and β coefficients that you obtain, and comment on their economic significance. Formally test the statistical significance of the β coefficient.

7. Predict the production of renewable electricity per capita in a country that spends 2% of its GDP on R&D.

8. Generate a new variable that is the natural logarithm of RE_pc: Ln_RE_pc = ln(RE_pc). You will get missing values for some observations when doing this – why? Remove the observations with the missing values from your dataset before proceeding to step 9. (Delete the entire row – but make sure you keep a copy of your dataset with ALL observations in case you need to revisit any of the analysis in questions 1 to 7.)

9. Using your new logarithmic variable, estimate a regression model of the form:

$$\mathrm{Ln\_RE\_pc}_i = \alpha + \beta\,\mathrm{R\&D}_i + \varepsilon_i$$

where the i subscript corresponds to country i. Interpret the β coefficient that you obtain, and comment on its economic and statistical significance. Which of the two regressions you have run so far is better? Why?

10. Estimate a regression model of the form:

$$\mathrm{Ln\_RE\_pc}_i = \alpha + \beta_1\,\mathrm{R\&D}_i + \beta_2\,\mathrm{Rural\_pop}_i + \varepsilon_i$$

where the i subscript corresponds to country i. Interpret the coefficients that you obtain, and comment on their economic and statistical significance. Compare the results to those you obtained in part 9. Which model is better?

11. Interpret the R-squared from this regression and test its statistical significance.

12. Why did the coefficient on R&D change from the model in part 9 to the model in part 10?
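The mechanics of steps 2 to 11 could be sketched in Python roughly as follows. This is only one possible workflow, assuming NumPy and SciPy are available; the brief does not prescribe any software. All data below are synthetic stand-ins generated with a fixed seed, so the numbers it prints say nothing about the real dataset, and the lowercase variable names are illustrative choices rather than names from the project files.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 60

# Synthetic stand-ins for the dataset columns (NOT the real project data).
rd = rng.uniform(0.1, 3.5, n)          # R&D (% of GDP)
rural = rng.uniform(5.0, 80.0, n)      # Rural_pop (% of total population)
gdp_pc = rng.uniform(1e3, 6e4, n)      # GDP_pc (constant 2015 US$)
pop = rng.uniform(0.5e6, 20e6, n)      # Pop
re_mwh = pop * np.exp(-3 + 0.8 * rd - 0.02 * rural + rng.normal(0, 0.5, n))
re_mwh[:3] = 0.0                       # a few countries with zero production
small = (pop < 10_000_000).astype(int)

# Step 2: Pearson correlation with a two-sided p-value (one of the three pairs).
r_re_rd, p_re_rd = stats.pearsonr(re_mwh, rd)

# Step 3: one-sided Welch t-test; H1 is that mean RE_MWh is lower when Small = 1.
t_small, p_small = stats.ttest_ind(
    re_mwh[small == 1], re_mwh[small == 0],
    equal_var=False, alternative="less",
)

# Step 4: production per capita.
re_pc = re_mwh / pop

# Step 5: 4x4 correlation matrix; its upper triangle holds the 6 pairwise values.
corr = np.corrcoef([re_pc, gdp_pc, rd, rural])

# Steps 6 and 7: simple regression of RE_pc on R&D, then a prediction at R&D = 2.
simple = stats.linregress(rd, re_pc)   # simple.pvalue tests H0: slope = 0
pred_2pct = simple.intercept + simple.slope * 2.0

# Step 8: keep only strictly positive values before taking logs.
# A mask leaves the full arrays intact for steps 1 to 7.
keep = re_pc > 0
ln_re_pc = np.log(re_pc[keep])

# Step 9: log-level regression on the reduced sample.
log_simple = stats.linregress(rd[keep], ln_re_pc)

# Step 10: multiple regression by ordinary least squares.
X = np.column_stack([np.ones(keep.sum()), rd[keep], rural[keep]])
beta, *_ = np.linalg.lstsq(X, ln_re_pc, rcond=None)

# Step 11: R-squared and the overall F-test (k = 2 slope coefficients).
resid = ln_re_pc - X @ beta
r2 = 1 - resid @ resid / np.sum((ln_re_pc - ln_re_pc.mean()) ** 2)
k, m = 2, int(keep.sum())
f_stat = (r2 / k) / ((1 - r2) / (m - k - 1))
p_f = stats.f.sf(f_stat, k, m - k - 1)

print("simple regression:", simple.slope, simple.pvalue, "prediction:", pred_2pct)
print("log regression:", log_simple.slope, log_simple.rvalue ** 2)
print("multiple regression:", beta, "R2:", r2, "F p-value:", p_f)
```

The sketch filters with a boolean mask instead of deleting rows, which is one way to keep a copy of the full sample as step 8 asks. Interpreting the coefficients and judging which model is better remain part of the write-up and are not something the code settles.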