
Expectations, Advice and Guidelines for the Group Project

ECON 452, W-2023

Your group will submit:

1.   A self-contained 10–12-page paper that follows the guidelines below

2.   A single do file which runs all the analyses starting from your original dataset, cleaning it, producing the tables and graphs, and then running the regressions.

3.   A log file which shows all the analyses of your do file

Expectations:

Every empirical paper has its limitations, no matter how sophisticated the methods you use. A good portion of the discussion of your results should be spent on the limitations of your paper in answering your question. This process is essential for the evolution of quality research. You should think deeply about why your methods may fail to deliver consistent estimates. This often involves discussing how credible your assumptions are in the context that you are analysing.

The credit depends holistically on the method used, the explanations, the execution of the methods and the critique. For the same quality of execution, explanation and critique, using methods more advanced than a simple OLS will get more credit. However, the credit does not depend only on the method used. For example, a well-executed logit that clearly explains what we learn and what the limitations are will get more credit than a poorly executed IV without a credible explanation. For the same empirical strategy, the project that is better executed, better explained and better critiqued will get more credit.

Advice:

Writing an empirical project with the objective of answering a causal question involves several steps, which take different amounts of time to complete and then revise.

You need a causal question, i.e. you need to frame your primary question of interest in terms of: “In this project we want to estimate the causal impact of <X> on <Y>”. There will be other regressors (other Xs) that you will control for in answering your question, but there will be a primary variable or policy treatment that is your key object of interest. The biggest hurdle, and the most time-consuming part, is getting hold of a data set which can answer your causal question of interest. The reason is that when people write an empirical project for the first time, they often underestimate the complexity of accessing and understanding real-world data. It is rare that the first dataset you find will be the eventual dataset with which you answer your causal question in the project. There have been many cases where students have come to me with a causal question in mind, but the data that they have cannot answer the question. Some non-exhaustive examples are:

i.    The data is not at the right level. It is easier to find aggregate data, but causal questions are often framed such that one needs micro-level or otherwise disaggregated data.

ii.   The data is not in the correct time frame. Often students are interested in estimating the causal effect of a recent policy change that they have been hearing about in the news. While the question is almost always quite interesting, getting data which covers recent years, let alone the current year, is much harder.

iii.  The data does not contain the variables that they would need to answer their question.

This leads to two paths depending on time. Either students search for different datasets, or they change the question to be more in line with the dataset that they can access and analyse in a reasonable time frame. So, once you submit your research plan, you do not have to stick to it. I understand that students may need to change the question and/or the data over time.

Please do not plan to use datasets like:

a.   Example data sets in assignments [Most of them were from an existing source, modified by me for the purposes of the assignment]

b.   Example data sets in Stata or Wooldridge [These often have very small sample sizes, and their purpose is to help you get used to coding]

c.   Time series data such as macro data sets which have aggregate data over time, e.g. country-level GDP or stock market data. [This is a course in causal inference where our focus is on questions in micro-econometrics]

When in doubt, please ask the TAs or me whether the dataset you are considering is adequate for answering your question of interest.

Hence, before starting an empirical project for the first time, one needs to be aware that progress in an empirical project with respect to time is highly non-linear. Once the required data is there, the other parts go relatively faster.

1.   Your start will be slow, as you are mostly trying to find a data set with which you can answer a question, or reframing your question depending on what data you can get within a reasonable time frame.

2.   The next step, once you have the data, is analysing it, which will go relatively faster. Within this step, too, the time spent varies considerably across tasks. In particular,

a.   the biggest chunk of your time will be spent understanding patterns in the data and cleaning the dataset to get it to the point where you can run estimations.

i.   Cleaning datasets takes time because you will rarely get the data in the exact format in which you can implement your estimations, as you have been getting in the assignments. This could include handling missing values properly, defining new variables from existing variables, merging variables correctly from multiple datasets if applicable, etc.

ii.   You will understand patterns in the data by summarizing the data set in detail, graphing scatter plots, histograms, etc. Not all of them may end up in your final project, but they will help you understand the underlying patterns.

b.   Once your data is clean and you have a good understanding of the data, you will have a “cleaned sample” on which you will run a bunch of estimations with different/flexible specifications. This will be relatively fast.

c.   Then comes interpretation and linking your results to the big-picture question. Often in this step, depending on the results, you may need some modifications or additional estimations to tightly knit a story that is self-consistent and contained within your empirical analyses. This often means revisiting step 2.b.

3.   Finally comes writing the research paper and making it self-contained. Writing well is a skill and it comes with practice.
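
The cleaning tasks described in step 2.a above (handling missing values, defining new variables, merging datasets) can be sketched as follows. This is only a minimal illustration in Python, not the Stata do file the project requires. The records, the variable names and the helper `clean` are all hypothetical, and dropping incomplete rows is just one of several possible ways to handle missing values.

```python
import math
import statistics

# Hypothetical raw records; None marks a missing value.
people = [
    {"id": 1, "wage": 20.0, "educ": 12},
    {"id": 2, "wage": None, "educ": 16},    # missing wage
    {"id": 3, "wage": 35.0, "educ": 16},
    {"id": 4, "wage": 27.5, "educ": None},  # missing education
]
# Hypothetical second dataset keyed by the same id (id 4 has no match).
regions = {1: "East", 2: "West", 3: "West"}


def clean(records, region_by_id):
    """Drop incomplete rows, define log wage, and merge in region by id."""
    cleaned = []
    for row in records:
        # Handle missing values explicitly instead of letting them pass silently.
        if row["wage"] is None or row["educ"] is None:
            continue
        # Keep only rows that also appear in the second dataset (an inner merge).
        if row["id"] not in region_by_id:
            continue
        new_row = dict(row)
        new_row["log_wage"] = math.log(row["wage"])   # new variable from an existing one
        new_row["region"] = region_by_id[row["id"]]   # merged variable
        cleaned.append(new_row)
    return cleaned


sample = clean(people, regions)
print(f"kept {len(sample)} of {len(people)} observations")

# Basic descriptive statistics on the cleaned sample.
wages = [r["wage"] for r in sample]
print(f"mean wage = {statistics.mean(wages):.3f}, sd = {statistics.stdev(wages):.3f}")
```

Printing how many observations survive each cleaning step, as above, makes it easier to explain the final estimation sample in the data section.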

Guidelines on writing the paper:

Most of the following guidelines are adapted with permission from a similar course in which I was a TA during my PhD.

In reading the papers, we pay considerable attention to style (correct spelling and grammar, clear exposition, good organization). So, too, do most people: reports that are difficult to read routinely get ignored, even if they contain good ideas. Thus, it will pay to develop the habit of working hard to craft a clear explanation of your ideas.

Many of the following suggestions are standard good practice.

1.   In writing up a research report, one should have an audience in mind. Take the audience to be your fellow students in this course. They'll know most of the relevant economic and econometric theory but won't know your data set or your model.

2.   Include a cover page with the following information: Title; date; your names; your e-mail addresses; the word “Abstract”; an abstract of 100 words or fewer. If you have acknowledgments to make (thanking a fellow student for helpful comments, for example), put these on the bottom of the cover page. The text of the paper begins on the next page.

3.   Your paper should be divided into sections, to help guide the reader. Recall that the basic structure of a typical paper is:

i.    Introduction: motivate your question, present an overview of your paper, discuss related literature with adequate citations, and summarize your findings.

ii.   Data: explain the data you have, the sample, its descriptive characteristics, the patterns in the data, your final estimation sample, and the hypotheses you plan to test.

iii.  Model: specification of the model, economic intuition of what you may find

iv.  Results: present and interpret your results.

v.   Discussion: explain where your work is applicable and where it may not be, i.e. critique your work.

vi.  Conclusion: summarize your findings and give suggestions for future research linked to the previous section

4.   Number the pages.

5.   Be explicit about your data set in the data section. State the sample size. State the units of measurement. For example, if “income” is a variable, state whether it is measured in current dollars or constant 2008 dollars, and if it is per capita, say so. For data in logs or log differences, you will usually want to multiply by 100 so that the units will be percent or percent change. Also explain the choice of the sample: why it starts in a given year, why you use only a cross-section from a given year instead of a panel, and so on.

6.   Include plots and a table with basic statistics (means, standard deviations) of the data.

7.   Number the equations. You can number the equations by section, if you have sections in the paper. That is, the third equation in section 2 can be numbered 2-3, the second equation in section 4 can be numbered 4-2, etc., if you prefer doing this to numbering sequentially through the paper.

8.   Tables:

a.    Number the tables, and on each include a descriptive header (“Means and Standard Deviations of Data,” or “Variance Decompositions,” for example).

b.    Tables may appear in the text in the appropriate place, or at the end of the paper.

c.    Tables should not run over page boundaries, unless they are too long to fit on a single page. That is, if you include a table in the text, you should ensure that you place it so that it does not run from one page to the next.

d.   Make every effort to make each table self-contained, even though this will require you to redundantly present information that is also stated in the body of the paper itself. This is now the standard in the profession, and you should look at a paper published recently to see how much detail is included in tables.

i.      In notes at the bottom of each table, define the symbols that are in the table. It is not adequate to simply state “definitions are in the paper” or “see section 2 of the paper for definitions”. Instead say something like “Variable definitions: y=log of income in 2012 dollars, educ=years of education,” and so on.

ii.      In tables that present regression results, include a note that describes the estimation technique (“The probit was estimated by maximum likelihood, assuming normality,” for example). (You will also present such information in the text itself.)

iii.      Variables should be self-explanatory in tables. They should not have any abbreviations.

iv.      Be sure to include the name of the dependent variable somewhere in the table.

v.      In the text, use words to describe the variables in your model. For example, if years of schooling is a regressor in your model, write out “Years of schooling”, not “YRSCH”, even if that is the name of the variable in your statistical software.

e.    In all but the simplest tables, number the rows and columns. When the text references a result in the table, cite the row and column: “the t-statistic is 2.12 (row (2), column (4)).”

10. Figures:

b.   Number the figures, and on each include a descriptive title (“Parental Income versus SAT Score,” for example).

c.    Figures may appear in the text in the appropriate place, or at the end of the paper, but either way must be referenced.

d.   Figures should not run over page boundaries and must always fit on a single page. That is, if you include a figure in the text, you should ensure that you place it so that it does not run from one page to the next.

11. Reporting of estimates:

a.    Do not report more than 3 digits. Example: report 0.412, not 0.4117678.

b.   Avoid long strings of zeroes at the beginning of a number. You can always rescale variables and hence the coefficients.

c.    Report standard errors, not t-statistics. Standard errors belong in parentheses under the coefficients. Example: Report

0.412

(0.146)

not

0.412 0.146

12. In estimating equations in the paper, avoid the use of elaborate acronyms to denote variables (like WEEKEARNING_12 for weekly earnings in 2012 dollars). They are rarely helpful to the reader. A single letter, usually with a subscript, ordinarily suffices and is easier to read when used in equations. Simply explain what that letter denotes in the text that follows.

13. References:

a.    All references cited in the paper should be listed in a bibliography at the end of the paper. Within the text, refer to them as, for example, Angrist and Imbens (1994).

b.   When you reference a specific result, such as a point estimate of a parameter, or a theorem that establishes a particular claim, give the page number, such as Walter (2015, p. 361). When you reference a general result, for example noting other papers that have studied topics like yours, no page number is needed.

14. Computer code: You do not need to include your programs in the paper. We should be able to figure out what you did without seeing it explicitly.

15. Miscellaneous reminders on terminology:

a. Hypotheses (not tests) may be accepted or rejected.

b. Hypotheses refer to the magnitudes of population parameters, not estimates, and not to statistical significance. The word “significant” should not appear in the statement of a hypothesis.

16. Papers can take as their starting point a working paper or published paper by professional economists. Your own paper should be self-contained (even though we may ask you to turn in a copy of or link to the basis paper). You also need to be crystal clear about what you have done versus what is in the published paper. If you obtain your data from the authors of the published paper, for example, you must explicitly say so, even though you will also need to describe the source used by the authors of that paper.

17. ACADEMIC INTEGRITY: It is a violation of scholarly ethics to repeat a passage, even a sentence, from another source without putting the passage in quotes and citing the source: the usual publication details in case of printed matter, the URL and date in the case of web-only material. This rule applies even when you are describing dreary facts: if you repeat a description from another paper of how data were collected, or the steps in computing an estimate, you must put the passage in quotation marks and cite the original source.
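
As an illustration of the reporting conventions in guidelines 5 and 11 above, the short sketch below formats an estimate with its standard error in parentheses underneath, rescales a coefficient to avoid leading zeroes, and converts a log difference to percent units. It is written in Python purely for illustration. The function names are hypothetical, and reading “3 digits” as three decimal places is an assumption taken from the 0.412 example.

```python
import math


def format_estimate(coef, se, digits=3):
    """Return the coefficient with its standard error in parentheses on the next line."""
    return f"{coef:.{digits}f}\n({se:.{digits}f})"


def rescale_coefficient(coef, divisor):
    """Coefficient on x/divisor, given the coefficient on x.

    Since y = b*x = (b*divisor)*(x/divisor), dividing a regressor by
    `divisor` multiplies its coefficient by `divisor`.
    """
    return coef * divisor


def log_points_to_percent(log_diff):
    """Multiply a log difference by 100 so it reads as an approximate percent change."""
    return 100.0 * log_diff


# The example from guideline 11: 0.4117678 is reported as 0.412, with the
# standard error in parentheses below it.
print(format_estimate(0.4117678, 0.1459))

# A coefficient on income in dollars, re-expressed for income in thousands of dollars.
print(f"{rescale_coefficient(0.0000412, 1000):.3f}")

# A log difference of log(1.05), expressed in percent units.
print(f"{log_points_to_percent(math.log(1.05)):.3f}")
```

Whatever rescaling is applied, the units should be stated in the table notes so the reader can interpret the coefficient.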