
STA303H1S Assignment

Due on February 26, 2024 11:59 PM in Crowdmark

All relevant work must be shown for credit.

Note: All questions in this assignment should be completed in R. You can submit your results in a .pdf file generated using R Markdown. For help, please check the .rmd code files posted on Quercus. Assume that the reader is not familiar with R output, and explain all findings, quoting the necessary values from your output. You don’t need to show all the output generated by R unless you are explaining it. For example, in 1(b), you are generating 100,000 confidence intervals, but you don’t need to show them all; just report the proportion of intervals containing the true value. However, all R code should be shown in the .pdf file.

Please note that academic integrity is fundamental to learning and scholarship. You may discuss questions with other students; however, the work you submit should be your own. If I have concerns about any assignment (e.g., if your work doesn’t appear to be consistent with what we have discussed in class), I will not mark the assignment. Instead, I will ask you to present your work in my office, and your grade will be assigned based on your presentation. The assignment should be submitted via Crowdmark.

1. Let Y ∼ Bin(n = 30, π = 0.9). Y can be interpreted as the number of successes in n = 30 independent trials, each with probability of success π = 0.9.

(a) Suppose the observed number of successes after 30 trials is y = 27. Calculate the Wald and score (Wilson) 95% confidence intervals for π. Interpret the confidence intervals.   [5 Marks]
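As a starting point, the two intervals can be computed directly from their closed-form expressions. The sketch below is only illustrative: the variable names (p_hat, se_wald, centre, half_width) are choices made here, not names from the course material, and the formulas are the standard textbook forms of the Wald and Wilson intervals.

```r
# Sketch: Wald and score (Wilson) 95% intervals for a binomial proportion.
y <- 27
n <- 30
z <- qnorm(0.975)          # two-sided 95% critical value
p_hat <- y / n             # sample proportion

# Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
se_wald <- sqrt(p_hat * (1 - p_hat) / n)
wald <- c(lower = p_hat - z * se_wald, upper = p_hat + z * se_wald)

# Wilson (score) interval: obtained by inverting the score test
centre <- (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
half_width <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
wilson <- c(lower = centre - half_width, upper = centre + half_width)

print(round(wald, 4))
print(round(wilson, 4))
```

Note that the Wald upper limit here exceeds 1, which is outside the parameter space, whereas the Wilson interval always stays within [0, 1]. The Wilson limits can be cross-checked against prop.test(27, 30, correct = FALSE)$conf.int.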

(b) Simulate N = 100,000 observations of Y using the R function rbinom(). Calculate the Wald and score 95% confidence intervals for each observation. This means you are calculating 100,000 confidence intervals of each type. Calculate the proportion of the Wald intervals that contain 0.9 (the true value of π). Also calculate the proportion of score intervals that contain 0.9. Compare the results and comment on your findings. Which one do you feel is the more reliable CI?         [10 Marks]

Note: R cannot generate truly random numbers; it only generates “pseudo” random numbers. Thus a seed needs to be provided to reproduce the results. One can fix the seed in R using the set.seed() command. The seed you are going to use is your student ID. Thus you have to start the code with set.seed(Your student ID). If you don’t provide the seed you will lose 3 Marks.
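One possible way to organise the coverage simulation is to work with vectors rather than a loop, since all the interval formulas are vectorised in R. This is a minimal sketch: the seed 12345 is a placeholder (not a real student ID), and the names wald_cover and wilson_cover are made up for illustration.

```r
# Sketch: empirical coverage of Wald and Wilson 95% intervals.
set.seed(12345)            # placeholder seed; replace with your student ID

N <- 100000                # number of simulated observations of Y
n <- 30
pi_true <- 0.9
z <- qnorm(0.975)

y <- rbinom(N, size = n, prob = pi_true)
p_hat <- y / n

# Wald limits for every simulated observation
se_wald <- sqrt(p_hat * (1 - p_hat) / n)
wald_cover <- (p_hat - z * se_wald <= pi_true) & (pi_true <= p_hat + z * se_wald)

# Wilson limits for every simulated observation
centre <- (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
half_width <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
wilson_cover <- (centre - half_width <= pi_true) & (pi_true <= centre + half_width)

# Proportion of intervals containing the true value
coverage <- c(wald = mean(wald_cover), wilson = mean(wilson_cover))
print(coverage)
```

Because the seed is fixed, rerunning the script reproduces exactly the same proportions; changing the seed changes them slightly.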

2. As in the previous question, let Y ∼ Bin(n = 30, π) and y = 27. This time we don’t know the true value of π.

(a) Find the likelihood (ℓ(π)) and log-likelihood function (L(π)).       [2 Marks]

(b) Using R, find the maximum likelihood estimate (MLE) of π using the optimize function and plot ℓ(π) and L(π) over the values of π. Show the MLE in the plot using a vertical line.       [3 Marks]
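A minimal sketch of how optimize() might be used here follows. It maximises the log-likelihood via dbinom(..., log = TRUE), which includes the binomial coefficient; the function name loglik, the plotting grid, and the plot styling are illustrative choices only. Plotting the likelihood itself works the same way with log = FALSE.

```r
# Sketch: numerical MLE of pi with optimize(), plus a log-likelihood plot.
y <- 27
n <- 30

# Binomial log-likelihood as a function of pi
loglik <- function(p) dbinom(y, size = n, prob = p, log = TRUE)

# optimize() minimises by default, so request a maximum explicitly
fit <- optimize(loglik, interval = c(0, 1), maximum = TRUE)
pi_mle <- fit$maximum
print(pi_mle)

# Plot over a grid that avoids the boundary values 0 and 1
pi_grid <- seq(0.5, 0.999, length.out = 400)
plot(pi_grid, loglik(pi_grid), type = "l",
     xlab = expression(pi), ylab = "log-likelihood")
abline(v = pi_mle, lty = 2)   # vertical line at the MLE
```

The numerical maximiser should agree with the analytical estimate y/n up to the tolerance of optimize().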

(c) Test H0 : π = 0.5 vs Ha : π ≠ 0.5 using the likelihood ratio test. Interpret the results from the test.    [5 Marks]
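The likelihood ratio statistic can be sketched as follows, assuming a two-sided alternative and the usual chi-squared approximation with 1 degree of freedom. The names G2 and p_value are illustrative.

```r
# Sketch: likelihood ratio test of H0: pi = 0.5 for y = 27, n = 30.
y <- 27
n <- 30
pi0 <- 0.5                 # value of pi under H0
pi_hat <- y / n            # unrestricted MLE

loglik <- function(p) dbinom(y, size = n, prob = p, log = TRUE)

# Test statistic: twice the gap between the maximised and the null log-likelihood
G2 <- 2 * (loglik(pi_hat) - loglik(pi0))

# Large-sample reference distribution: chi-squared with 1 degree of freedom
p_value <- pchisq(G2, df = 1, lower.tail = FALSE)

print(c(G2 = G2, p_value = p_value))
```

The binomial coefficient cancels in the difference of log-likelihoods, so the statistic depends on π only through the kernel of the likelihood.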

(d) Using R, calculate the 95% likelihood ratio confidence interval for π and interpret.    [5 Marks]
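One way to obtain a likelihood ratio interval numerically is to find the two values of π at which the likelihood ratio statistic equals the chi-squared critical value. The sketch below uses uniroot() for this; the helper g, the search brackets 0.01 and 0.999, and the tolerance are assumptions chosen for illustration.

```r
# Sketch: 95% likelihood ratio confidence interval via root finding.
y <- 27
n <- 30
pi_hat <- y / n

loglik <- function(p) dbinom(y, size = n, prob = p, log = TRUE)
cutoff <- qchisq(0.95, df = 1)

# g(p) is zero where the LR statistic equals the critical value;
# the interval is the set of p with g(p) <= 0
g <- function(p) 2 * (loglik(pi_hat) - loglik(p)) - cutoff

# One root on each side of the MLE
lower <- uniroot(g, interval = c(0.01, pi_hat), tol = 1e-8)$root
upper <- uniroot(g, interval = c(pi_hat, 0.999), tol = 1e-8)$root

print(c(lower = lower, upper = upper))
```

Unlike the Wald interval, the resulting interval is not symmetric about the MLE and cannot extend beyond [0, 1].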

3. (a) Perform the following simulation (for this, please set the seed to your student ID):

• Generate 500 random values from X1 ∼ Uniform[−10, 10], X2 ∼ N(0, 4) and X3 ∼ Bernoulli(0.7)

• Set β = (−0.8, 0.1, 0.2, 0.3)

• Simulate Yi ∼ Poisson(µi), where µi = exp(∑j xij βj)

[10 Marks]
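The simulation steps above might be sketched as below. Two assumptions are made and should be checked against the lectures: N(0, 4) is read as variance 4 (so sd = 2 in rnorm()), and the first element of β is treated as an intercept, so the design matrix gets a leading column of ones. The seed 12345 is a placeholder.

```r
# Sketch: simulate covariates and a Poisson response with a log link.
set.seed(12345)            # placeholder seed; replace with your student ID
n <- 500

x1 <- runif(n, min = -10, max = 10)
x2 <- rnorm(n, mean = 0, sd = 2)      # ASSUMPTION: N(0, 4) means variance 4
x3 <- rbinom(n, size = 1, prob = 0.7) # Bernoulli(0.7)

beta <- c(-0.8, 0.1, 0.2, 0.3)

# ASSUMPTION: beta[1] is an intercept, hence the column of ones
X <- cbind(intercept = 1, x1 = x1, x2 = x2, x3 = x3)

eta <- as.vector(X %*% beta)          # linear predictor
mu <- exp(eta)                        # Poisson mean for each observation
y <- rpois(n, lambda = mu)

print(summary(y))
```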

(b) Estimate the βs using the Iteratively Reweighted Least Squares (IRLS) method by writing your own function. State the link function and state the W matrix as defined in the lecture 4 slides. Compare the results with those from the glm() function in R.         [15 Marks]
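For orientation, a generic IRLS loop for a Poisson model with the canonical log link is sketched below. It uses the common textbook form in which W = diag(µi) and the working response is zi = ηi + (yi − µi)/µi; whether this matches the W defined in the lecture 4 slides is an assumption. The function name irls_poisson, the starting values, the placeholder seed, and the reading of N(0, 4) as variance 4 are all illustrative choices.

```r
# Sketch: IRLS for Poisson regression with a log link, compared with glm().
set.seed(12345)            # placeholder seed; replace with your student ID
n <- 500
dat <- data.frame(x1 = runif(n, -10, 10),
                  x2 = rnorm(n, 0, 2),      # ASSUMPTION: variance 4
                  x3 = rbinom(n, 1, 0.7))
X <- cbind(intercept = 1, as.matrix(dat))   # ASSUMPTION: first beta is an intercept
dat$y <- rpois(n, exp(as.vector(X %*% c(-0.8, 0.1, 0.2, 0.3))))

irls_poisson <- function(X, y, tol = 1e-10, max_iter = 50) {
  eta <- log(y + 0.1)                 # starting linear predictor (avoids log(0))
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    mu <- exp(eta)                    # inverse of the log link
    w <- mu                           # diagonal of W for the Poisson/log case
    z <- eta + (y - mu) / mu          # working response
    # Weighted least squares step: solve (X'WX) beta = X'Wz
    beta_new <- as.vector(solve(crossprod(X, w * X), crossprod(X, w * z)))
    eta <- as.vector(X %*% beta_new)
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  setNames(beta, colnames(X))
}

beta_irls <- irls_poisson(X, dat$y)
fit_glm <- glm(y ~ x1 + x2 + x3, family = poisson(link = "log"), data = dat)

print(cbind(IRLS = beta_irls, glm = coef(fit_glm)))
```

Since glm() itself fits the model by IRLS, the two columns should agree to several decimal places; both estimate, rather than reproduce exactly, the β used in the simulation.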