STAT 466/866, Fall 2019

Final Exam

Instructions:

• Go to the Assessments/Assignments/Final folder in onQ and copy the contents of start.txt into your SAS program editor. Also download the MAP.sas7bdat file and upload this file to your SAS session.

• When you are finished, submit your SAS program file to this Final folder.

• You must provide the SAS syntax for all work.

• Try to make the contents of your SAS program file executable.

• Use comments if you need to provide any explanations.

• Save your SAS program file frequently.

• If you have a technical problem or a question, raise your hand; if the TA or I don’t notice you after a minute, come get one of us.

• You will get part marks for code that contains some correct elements even if it does not run, so budget your time wisely.

Rules:

This is an open book exam so you may use any reference material, except other people.

If you communicate with other people electronically or otherwise, you will receive 0 on the exam.

1. Put your name in a main title that will appear on all pages of the output from this exam. For each of the other questions, have a secondary title identify the question number. (2)
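A main title plus a per-question secondary title could be set up roughly as below. This is a minimal sketch, not a model answer; the title text and the name are placeholders.

```sas
/* TITLE1 stays in effect for all later output until it is changed or cleared */
title1 "STAT 466/866 Final Exam - Student Name";   /* placeholder name */

/* Re-issue TITLE2 at the start of each question to show its number */
title2 "Question 2";
```

Because TITLE statements are global, issuing a new TITLE2 replaces the previous secondary title and leaves TITLE1 untouched.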

2. Performing a simulation: (STAT466-18, STAT866-23)

The standard formula for a 95% confidence interval of a sample mean is

$$\bar{x} \pm t(.975,\, n-1) \cdot \frac{sd}{\sqrt{n}}$$

where n is the sample size, x-bar is the sample average, sd is the sample standard deviation, and t(.975, n − 1) is the 97.5th percentile of the t-distribution with n − 1 degrees of freedom. This formula is used by PROC MEANS and other SAS procedures to provide confidence limits. Perform 10,000 simulations to confirm that this formula provides 95% coverage for a sample size as small as 2 if the sample is drawn from a standard normal distribution. You do not need to use arrays or macros for this question, and you don’t need to make the code generalizable to other sample sizes or distributions.

You can do this question one of two ways: 1) You can generate 10,000 simulations of sample size 2 (i.e. 20,000 rows) and then use PROC MEANS to obtain 10,000 confidence limits, or 2) you could generate two random variables for 10,000 simulations and then use SAS functions to implement the standard formula for a 95% confidence interval provided above. Provide the code for one of these 2 approaches if you are in STAT466. Provide the code for both approaches if you are in STAT866.
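The two approaches described above might be sketched as follows. This is one possible layout under stated assumptions, not a model answer: the seed 466, the data set names long, cl and sim, and all variable names are illustrative choices that do not come from the exam.

```sas
/* Approach 1: 10,000 samples of size 2 in long form, limits from PROC MEANS */
data long;
   call streaminit(466);              /* arbitrary seed (assumption) */
   do rep = 1 to 10000;
      do i = 1 to 2;
         x = rand('normal');          /* standard normal draw */
         output;
      end;
   end;
run;

proc means data=long noprint;
   by rep;                            /* data are already ordered by rep */
   var x;
   output out=cl lclm=lower uclm=upper;   /* 95% limits (default alpha=0.05) */
run;

data cl;
   set cl;
   covered = (lower <= 0 <= upper);   /* 1 if the interval contains the true mean 0 */
run;

proc means data=cl mean;              /* mean of covered = estimated coverage */
   var covered;
run;

/* Approach 2: one row per simulation, formula applied with SAS functions */
data sim;
   call streaminit(466);
   tcrit = tinv(0.975, 1);            /* 97.5th percentile of t with n - 1 = 1 df */
   do rep = 1 to 10000;
      x1 = rand('normal');
      x2 = rand('normal');
      xbar  = mean(x1, x2);
      sd    = std(x1, x2);
      lower = xbar - tcrit * sd / sqrt(2);
      upper = xbar + tcrit * sd / sqrt(2);
      covered = (lower <= 0 <= upper);
      output;
   end;
run;

proc means data=sim mean;
   var covered;
run;
```

In both cases the mean of the 0/1 variable covered estimates the coverage probability, which should come out close to 0.95 if the formula holds for n = 2.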

3. Reading, analyzing and tabulating data (27)

This question uses data from a fictional clinical trial measuring mean arterial blood pressure (MAP) before and after a month of treatment with either a Diuretic or a Calcium Channel Blocker. I’ve provided these data to you in two formats: 1) the MAP SAS dataset, and 2) the raw data included in the start.txt file.

a) Have SAS list the attributes of the MAP SAS dataset that you uploaded from onQ. (2)

b) Use the raw data provided in start.txt as in-stream data to generate a temporary dataset that is identical (including all attributes) to the MAP dataset you uploaded from onQ. (7)

c) Run a procedure to compare the MAP SAS dataset provided and the MAP dataset you generated in part b. (2)

For the remaining questions, you can use the MAP dataset provided or the temporary MAP dataset created in part b.

d) Run the Change variable through a procedure to get a detailed description of its distribution separately for each Med group. Turn on two options to help you assess whether the distribution is approximately normal. In a comment, answer the following:

i. Do the data appear to grossly violate the normality assumption? Explain.

ii. Does there appear to be a statistically significant decrease in blood pressure with either medicine? Why or why not? (5)

e) Perform a two-sample t-test to compare the change in MAP between medication groups. Which t-test method should we use for these data? Why? Do these data suggest that one medication significantly decreases blood pressure more than the other? (3)

f) Run an analysis of covariance with Change as the dependent variable and Pre and Med as independent variables. For full points, try to do this using a procedure that does not require you to create a dummy variable for Med. What is the p-value for comparing the Change between Med groups after controlling for Pre? (2)

g) Produce a summary table that looks exactly like the following. Don’t worry about colours, shading or wrapping. (6)

MAP in mmHg		Calcium Channel Blocker	Diuretic
Baseline MAP	N	18	20
	Mean	109	108
	Std	8	11
	Min	98	94
	Max	126	138
Post-treatment MAP	N	18	19
	Mean	89	85
	Std	9	11
	Min	71	68
	Max	104	110
MAP Change	N	18	19
	Mean	-20	-23
	Std	8	17
	Min	-35	-52
	Max	-5	8
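For question 3, the procedure calls for parts a and c through g might be laid out roughly as follows. This is a sketch under assumptions, not a model answer: the libref final and its path are hypothetical, the part-b copy is assumed to be stored as work.MAP, and the variable names Pre, Post, Change and Med are taken from the question text. Part b is left out because the raw data in start.txt are not reproduced here.

```sas
/* Hypothetical libref and path: point this at the folder holding MAP.sas7bdat */
libname final "/path/to/final";

/* a) list the attributes of the supplied data set */
proc contents data=final.MAP;
run;

/* c) compare the supplied data set with the part-b copy (assumed to be work.MAP) */
proc compare base=final.MAP compare=work.MAP;
run;

/* d) distribution of Change by Med; NORMAL requests normality tests, PLOT requests basic plots */
proc univariate data=final.MAP normal plot;
   class Med;
   var Change;
run;

/* e) two-sample t-test; the output reports pooled and Satterthwaite results */
proc ttest data=final.MAP;
   class Med;
   var Change;
run;

/* f) ANCOVA; the CLASS statement lets GLM handle Med without a dummy variable */
proc glm data=final.MAP;
   class Med;
   model Change = Pre Med;
run;
quit;

/* g) starting point for the summary table; labels and formats still need work */
proc tabulate data=final.MAP;
   class Med;
   var Pre Post Change;
   table (Pre Post Change)*(n mean std min max), Med;
run;
```

The TABULATE step only reproduces the row and column structure of the target table; matching its labels and rounding exactly would need additional label and format specifications.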

4. Making a generic macro: (13)

Create a macro called replace that allows you to replace the occurrences of a particular value in a data set with a different value. The replace macro will have two positional parameters and two keyword parameters. The positional parameters from and to will tell the macro that the value of from should be replaced with the value of to. The keyword macro parameters are vars and data. Vars will name the variable(s) on which to perform the replacements. It should default to work on all numeric variables in the data set. Data will specify which data set to update. It should default to the last data set created.

This macro should overwrite the original data set with a new version of the data set that has the values replaced. The macro should be designed so that it can work on any number of variables within a single data set. The macro should not add any new variables to the data set. When the macro is run, put a message in the log that says which value is being replaced by what. For example, if you ran %replace(., 999, vars=Post Change, data=MAP) you would recode missing values in the Post and Change variables to 999, and the following message would be written to the log: MESSAGE: Replacing . to 999 in the MAP dataset for the following variable(s): Post Change. Note the period at the end of the list of variables.
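One way the replace macro could be structured is sketched below. It is a hedged illustration rather than a model answer: the array name _v and index _i are arbitrary helper names (and would clash with a data set that already has a variable called _i), the DATA= default is implemented by leaving the parameter empty and falling back to the automatic macro variable SYSLAST, and the sketch assumes every variable listed in VARS= is numeric.

```sas
%macro replace(from, to, vars=_numeric_, data=);
   /* Fall back to the most recently created data set when DATA= is omitted */
   %if %length(&data) = 0 %then %let data = &syslast;

   /* The doubled period ends the macro variable reference and prints a literal period */
   %put MESSAGE: Replacing &from to &to in the &data dataset for the following variable(s): &vars..;

   data &data;                       /* overwrite the original data set */
      set &data;
      array _v{*} &vars;             /* array over the target variables */
      do _i = 1 to dim(_v);
         if _v{_i} = &from then _v{_i} = &to;
      end;
      drop _i;                       /* keep the loop index out of the output */
   run;
%mend replace;

/* Usage from the question */
%replace(., 999, vars=Post Change, data=MAP)
```

When DATA= is omitted, the log message shows whatever SYSLAST holds, typically a two-level name such as WORK.MAP.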