
EMET3007/8012 Computer Lab 1

July 21, 2023

1    Introduction

A particularly important class of forecasting models is the class of autoregressive (AR) models. In this lab, we explore the properties of two AR models. Like an OLS model, an AR model contains coefficients that are usually unknown and have to be estimated from data. In this lab, we do something different: we study two AR models with known coefficients to understand how the qualitative properties of an AR model depend on its coefficients.

An AR model is specified by the following equation.

y_t = a_1 y_{t-1} + a_2 y_{t-2} + ... + a_p y_{t-p} + e_t.    (1)

In this equation, the y_t's are our data, such as the real GDP of an economy, while e_t is some unobservable noise. It is usually assumed that the e_t's are independent and identically distributed (i.i.d.). It is not necessary to assume that they are normally distributed, although simulation engines usually make such an assumption. This i.i.d. assumption can be relaxed somewhat without changing the theoretical results, but we do not dwell on such technical details in this applied course. The positive integer p is called the maximum lag of the process, and a_1, ..., a_p are coefficients. An AR process with maximum lag p is called an AR(p) process.
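
To make Eq. (1) concrete, the recursion can be written out directly in NumPy. The sketch below is only illustrative: the function name simulate_ar, the zero pre-sample values, the standard normal noise, the use of np.random.default_rng, and the AR(2) coefficients are all assumptions made for this example, not part of the lab.

```python
import numpy as np


def simulate_ar(coeffs, n, seed=0):
    """Simulate n periods of y_t = a_1 y_{t-1} + ... + a_p y_{t-p} + e_t.

    Assumptions (not from the lab text): the p pre-sample values are zero
    and the noise e_t is i.i.d. standard normal.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    p = len(coeffs)
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    y = np.zeros(n + p)  # the first p entries are the zero pre-sample values
    for t in range(n):
        # y[t:t + p] holds y_{t-p}, ..., y_{t-1}; reverse it to line up with a_1, ..., a_p
        lags = y[t:t + p][::-1]
        y[t + p] = coeffs @ lags + noise[t]
    return y[p:]


# Illustrative AR(2) with made-up coefficients a_1 = 0.5, a_2 = 0.2
path = simulate_ar([0.5, 0.2], n=200, seed=42)
print(path[:5])
```

Each new value depends only on the previous p values and one fresh noise draw, which is all that Eq. (1) says.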

It turns out that the qualitative behavior of an AR(p) process is determined by its AR polynomial

f(z) = 1 - a_1 z - a_2 z^2 - ... - a_p z^p.    (2)

I denote the indeterminate of the polynomial by z to indicate that it may be a complex number. Other authors prefer to use L instead of z because it represents a lag by one period. If 1 is a root of the polynomial f(z), the AR process is said to possess a unit root, which causes all kinds of complications in time series analysis. Even though the roots of f(z) can in theory be any complex numbers, in the AR processes encountered in economics they are usually either 1 or numbers with absolute value (“modulus”) greater than 1.
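
Setting z = 1 in Eq. (2) gives f(1) = 1 - (a_1 + ... + a_p), so a unit root is present exactly when the coefficients sum to 1. The following sketch checks this numerically; the helper names ar_polynomial and has_unit_root and the AR(2) coefficients are made up for illustration.

```python
import numpy as np


def ar_polynomial(coeffs, z):
    """Evaluate f(z) = 1 - a_1 z - a_2 z^2 - ... - a_p z^p for coeffs = [a_1, ..., a_p]."""
    powers = np.arange(1, len(coeffs) + 1)
    return 1 - np.sum(np.asarray(coeffs) * z ** powers)


def has_unit_root(coeffs):
    """Return True when z = 1 is a root, i.e. f(1) = 1 - (a_1 + ... + a_p) = 0."""
    return bool(np.isclose(ar_polynomial(coeffs, 1.0), 0.0))


# Made-up AR(2) examples
print(has_unit_root([0.5, 0.5]))  # coefficients sum to 1
print(has_unit_root([0.5, 0.3]))  # coefficients sum to 0.8
```

The comparison uses np.isclose rather than == because sums of decimal coefficients are rarely exact in floating point.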

In this lab, we examine the behavior of one AR(3) process without a unit root and one AR(3) process with a unit root.

2    Tasks

All tasks are to be performed for both of these AR(3) processes:

y_t = 0.4 y_{t-1} + 0.3 y_{t-2} + 0.3 y_{t-3} + e_t.    (4)

2.1    Task 1

Using only methods from NumPy, compute the roots of the AR polynomials of the two processes.
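
One possible way to do this, sketched here for Eq. (4) only, is np.roots. The choice of np.roots is an assumption, since other NumPy routines would also work. Note that np.roots expects the coefficients ordered from the highest power down.

```python
import numpy as np

# Eq. (4): y_t = 0.4 y_{t-1} + 0.3 y_{t-2} + 0.3 y_{t-3} + e_t
a1, a2, a3 = 0.4, 0.3, 0.3

# np.roots expects coefficients from the highest power down:
# f(z) = -a3 z^3 - a2 z^2 - a1 z + 1
roots = np.roots([-a3, -a2, -a1, 1.0])
print(roots)
```

The roots come back as complex numbers, and real roots may carry a tiny imaginary part or rounding error.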

2.2    Task 2 (Hurdle)

Simulate a sample path of each of these AR processes as follows. First, we create a theoretical model of the AR process using the Python library statsmodels.tsa.

import statsmodels.api as sm

# a_1, a_2, a_3 stand for the AR coefficients of the process being modelled
Lab1_AR_process = sm.tsa.ArmaProcess([1, -a_1, -a_2, -a_3])

The library statsmodels.tsa provides the class ArmaProcess to create an ARMA process, which is a generalization of AR processes. In this lab, we will not use the MA (moving average) part. ArmaProcess takes two main parameters, the AR polynomial and the MA polynomial. Since our process does not have an MA part, we only provide the first. Note that the representation of the AR polynomial differs from the one used by NumPy routines such as np.roots: here the coefficients are listed from the constant term upwards, whereas np.roots expects them from the highest power down. The right-hand side of the assignment creates an object which is the theoretical AR process, and then we give this object a name. You can replace my choice of name with anything you like.

Recall that an object is some toy we can play with. We mainly care about how to play with the toy or what the toy can do; we do not care much about how the toy does it. A theoretical AR process is just like an equation such as Eq. (3). It allows us to compute various theoretical properties of the AR process but not much more than that. In this task, we will use the generate_sample method of the theoretical process, which will generate some data (called a “sample path”) according to the definition Eq. (1).

An important concept in random simulation is that nothing is truly random in Python. It turns out that the sequence of random numbers generated depends on an integer called the “seed”. If you are curious about what it does, you can use your search engine to find more information about pseudorandom numbers and seeds. Long story short, if you use the same seed, running your code twice should generate identical results.
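
A minimal sketch of this behavior, using a made-up seed value rather than a real UID:

```python
import numpy as np

seed = 1234567  # hypothetical value; the lab asks for your own UID

np.random.seed(seed)
first_draw = np.random.standard_normal(5)

np.random.seed(seed)  # resetting the seed restarts the same sequence
second_draw = np.random.standard_normal(5)

print(np.array_equal(first_draw, second_draw))
```

Without the second call to np.random.seed, the second draw would continue the sequence instead of repeating it.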

IMPORTANT. In this lab, you are required to use your UID as the random seed. (You do not need to reset the seed before generating the sample path of the second AR(3) process, Eq. (4).)

The following code will reset the random seed and then simulate a sample path of n periods. By default, the distribution of e_t is standard normal; the generate_sample method allows us to use different random number generators. We use n = 200 for this lab.

import numpy as np

np.random.seed(UID)  # replace UID with your own UID, written as an integer
Lab1_AR_sample = Lab1_AR_process.generate_sample(200)

Now you can check that you have generated a sequence of 200 numbers following the AR process. As the hurdle task of this lab, use matplotlib.pyplot to plot the sample path you generated. You need to show your tutor the figure you generated. [1]

2.3    Task 3

Actually, the theoretical ARMA process (Lab1_AR_process in the code above, not the sample path you generated) has an attribute arroots which holds the complex roots of the AR polynomial. Inspect it and verify that the result agrees with what you obtained in Task 1. Compute the absolute values (called “moduli” in mathematics) of the roots.

2.4    Task 4

Because unit roots cause many complications in time series analysis, it is usually desirable to know whether a given time series possesses a unit root. The standard way to check this is to run the augmented Dickey-Fuller (or ADF) test:

H0: The process has a unit root.

H1: All AR roots of the process have absolute values greater than 1.

Of course, we already know whether each of our AR processes possesses a unit root. However, it is instructive to run the ADF test on the sample paths you generated in Task 2. This test is provided by the function statsmodels.tsa.stattools.adfuller. Even though it takes several parameters, you only need to supply one: the data (the sample path you generated). Read the manual page of the function and figure out how to read out the p-value of the test. Report the p-value of the ADF test for each of the two time series.

Remark. We also treat the ADF test as a toy to play with. We do not care about how the test statistic is calculated or what its distribution is under the null hypothesis. We only care about what the test is used for and how to interpret its p-value.

2.5    Task 5

This task is intended as a review of basic econometrics. In your lab report, explain the following terms briefly.

1.  The null hypothesis.

2.  The alternative hypothesis.

3.  Type I and Type II errors.

4.  The p-value.

Now interpret the two p-values you obtained in Task 4: what do they say about whether the underlying process has a unit root?

Finally, answer the following question: why might it not be a good idea to have a test that uses “not having a unit root” as the null hypothesis and “having a unit root” as the alternative hypothesis?

[1] Note that the tutor may check whether the sample path you generated is “correct”, because they should be able to obtain a path identical to yours by using your UID as the random seed.