
QBUS1040 Final Exam

Semester 2, 2022

The exam is open book, open notes. You may refer to the documentation of Python (python.org) and Python packages such as NumPy (docs.scipy.org). However, you cannot discuss this exam with anyone until the exam is over. This means you cannot use online forums, social media platforms, or any other means of communication.

Throughout this exam, we use standard mathematical notation; in particular, we do not use (and you may not use) notation from any computer language or from any strange or non-standard mathematical dialect (e.g., physics). Also, you may not use concepts or material from other classes (say, a linear algebra class you may have taken) in your solutions. All the problems can be done using only the material from this class, and we will deduct points from solutions that refer to outside material.

This exam consists of eight problems that require you to submit a written response. You should submit a PDF to Gradescope and match each page with the question it answers. You can find detailed instructions on how to scan and submit your assignments on Canvas. If you fail to match the pages to the corresponding questions, the marker will not be able to view your response, and thus, you will be awarded a mark of 0 for the question.

Your solutions should be neat and complete. We will deduct points from poorly written or incomplete/unjustified solutions even if they are correct.

Good luck!

Question:   1    2    3    4    5    6    7    8   Total
Points:    10   10   10   10   10   10   10   20      90

1. True or False?

(a) (2 points) A wide matrix can be invertible.

(b) (2 points) If C is a right inverse of A, then Cᵀ is a left inverse of Aᵀ.

(c) (2 points) If A has linearly independent rows and x = A†b, then Ax = b.

(d) (2 points) Consider the following regularized regression problem

minimize ‖Xᵀβ + ν1 − y‖² + λ‖β‖².

Increasing λ will force the predictions on the training set to converge to 0.

(e) (2 points) Adding basis functions to a model cannot increase the test error.

2. (10 points) Estimate the time it takes to compute each of the following operations. Which operation can be completed faster?

(A) Performing QR factorization on a 10,000 × 1,000 matrix on a computer that can compute 10 Gflop per second.

(B) Solving

Rx = b

where R is an upper triangular matrix with 20,000 rows on a computer that can compute 2 Gflop per second.
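
As a sanity check on such estimates, the standard flop counts from the course (about 2mn² flops for QR factorization of an m × n matrix, and about n² flops for back substitution with an n × n triangular R) can be evaluated numerically; the machine speeds below are the ones stated in the problem:

```python
# Rough timing estimates from flop counts (assumed costs: ~2*m*n^2
# flops for QR factorization of an m x n matrix, ~n^2 flops for back
# substitution on an n x n upper triangular system).
m, n = 10_000, 1_000
qr_flops = 2 * m * n**2           # total flops for QR
qr_time = qr_flops / 10e9         # machine (A): 10 Gflop/s

n_tri = 20_000
bs_flops = n_tri**2               # total flops for back substitution
bs_time = bs_flops / 2e9          # machine (B): 2 Gflop/s

print(qr_time, bs_time)           # 2.0 0.2 (seconds)
```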

3. Suppose B is a 4 × 2 matrix and assume that the matrices in the following operation conform.

[ A  B ] [ 1₃ ]   [ a ]
[ C  D ] [ x  ] = [ b ]

(a) (2 points)  Can the dimension of x be determined?  If yes, state what it is.

(b) (2 points)  Can the dimension of a be determined? If yes, state what it is.

(c) (3 points)  Can the dimension of A be determined?  If yes, state what it is.

(d) (3 points) Can the dimension of D be determined? If yes, state what it is.

4. (10 points) Calculate the pseudo-inverse of the following matrix by hand. Show all necessary steps. Present your answer in fractional form.

      [ 1  4 ]
A  =  [ 2  2 ]
      [ 4  1 ]
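
A hand computation of the pseudo-inverse can be cross-checked in NumPy. The sketch below uses A† = (AᵀA)⁻¹Aᵀ, which is valid here because A is tall with linearly independent columns:

```python
import numpy as np

# Cross-check of a hand-computed pseudo-inverse. Since A is tall with
# linearly independent columns, its pseudo-inverse is (A^T A)^{-1} A^T.
A = np.array([[1.0, 4.0],
              [2.0, 2.0],
              [4.0, 1.0]])

A_pinv = np.linalg.inv(A.T @ A) @ A.T

print(np.allclose(A_pinv, np.linalg.pinv(A)))  # True: matches NumPy's SVD-based pinv
print(np.allclose(A_pinv @ A, np.eye(2)))      # True: A† is a left inverse of A
```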

5. Suppose you have been given a 3 × N feature matrix X with rows x₁ᵀ, x₂ᵀ, x₃ᵀ, and you want to build a regression model to predict y. You solve the problem

minimize ‖[1  x₁  x₂  x₃] β − y‖²    (1)

and obtain β̂.

Now suppose you want to solve the problem

minimize ‖[1  x₁  x₂  x₃  (α₁x₂ + α₂1)] θ − y‖²,    (2)

where α₁ and α₂ are given positive scalars.

(a) (5 points) Would you expect the RMS error of (2) to be the same as, higher than, or lower than that of (1)?

Explain why.

(b) (5 points) Would you expect any issues when solving (2) directly? Why or why not?

6. (10 points) Consider the problem

minimize (9x₁ + 3x₂ − 6)² + (2x₁ − 7x₂ + 2)² + (2x₁ + 7x₂ − 4)²,

and let x̂ be its solution.

Show that x̂ can be obtained by solving a system of linear equations Cx̂ = d. Give the entries of the matrix C and the vector d, as well as their dimensions. Is the system under-determined, square, or over-determined?

Hint. This can all be calculated by hand. You do not need to invert any matrix or use any sophisticated algorithm.
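
A hand solution can be cross-checked numerically by writing the objective as ‖Ax − b‖², with one row of A and one entry of b per squared term; the sketch below assumes that reading of the three terms:

```python
import numpy as np

# Numerical cross-check: write the objective as ||A x - b||^2, one row
# of A and one entry of b per squared term (signs as in the problem).
A = np.array([[9.0, 3.0],
              [2.0, -7.0],
              [2.0, 7.0]])
b = np.array([6.0, -2.0, 4.0])

xhat = np.linalg.lstsq(A, b, rcond=None)[0]

# The least squares solution satisfies the normal equations (A^T A) x = A^T b.
print(np.allclose(A.T @ A @ xhat, A.T @ b))  # True
```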

7. (10 points) A least squares classifier ŷ = sign(xᵀβ + ν) achieves a false positive rate of 0.23 and a true positive rate of 0.95 on the training set, and a false positive rate of 0.26 and a true positive rate of 0.91 on a test set.

You now consider adjusting the decision boundary by considering

ŷ = sign(xᵀβ + ν − α),

where α is a scalar.

To reduce the false positive rate, how should we choose α? Should it be positive or negative? Explain your answer.
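
The effect of the shift α on the false positive rate can be explored on synthetic scores; everything below (labels, score model, seed) is illustrative and not part of the exam data:

```python
import numpy as np

# Illustrative only: how shifting the threshold by alpha changes the
# false positive rate. Labels, score model and seed are synthetic.
rng = np.random.default_rng(1)
y = np.where(rng.random(1000) < 0.5, 1, -1)    # true labels in {-1, +1}
s = y + rng.normal(0.0, 1.2, size=1000)        # stand-in for x^T beta + nu

def fpr(alpha):
    yhat = np.where(s - alpha > 0, 1, -1)      # sign(x^T beta + nu - alpha)
    negatives = y == -1
    return np.mean(yhat[negatives] == 1)       # fraction of negatives flagged

print(fpr(0.0), fpr(1.0))                      # raising alpha can only lower the rate
```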

8. The data provided in the Jupyter Notebook file QBUS1040-2022S2-FinalExam.ipynb contains six NumPy arrays. These include train and test variants of three variables:

• volume: This is the volume of timber produced by a tree, and our outcome variable.

• girth: This is the diameter of a tree.

• height: This is the height of a tree.

Suppose that x is a feature vector of length 2, with the first element encoding the diameter of the tree and the second element encoding the height.

(a) (4 points) We first consider a simple model

f̂(x) = β₁x₁ + β₂x₂. (3)

State the basis functions in this model.

(b) (6 points) Fit model (3) on the data and report the estimated coefficients. Also report the RMS error of your model on the training and test data. Round your answers to 3 decimal places.

(c) (6 points) We now consider the model

f̂(x) = β₁x₁ + β₂x₂ + β₃x₁². (4)

Fit model (4) on the data and report the estimated coefficients. Also report the RMS error of your model on the training and test data. Round your answers to 3 decimal places.

(d) (4 points) Which model is better? Briefly explain why.
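
For reference, the fits in parts (b) and (c) can be set up as below. The arrays are synthetic stand-ins for the six notebook arrays, and the extra basis function in model (4) is taken to be x₁² for illustration:

```python
import numpy as np

# Sketch of the fits in (b) and (c). Synthetic stand-ins for the six
# notebook arrays (girth_train, height_train, volume_train, ...).
rng = np.random.default_rng(0)
girth_train = rng.uniform(8.0, 20.0, size=25)
height_train = rng.uniform(60.0, 90.0, size=25)
volume_train = 0.002 * girth_train**2 * height_train + rng.normal(0.0, 1.0, 25)

# Model (3): basis functions f1(x) = x1 and f2(x) = x2.
A3 = np.column_stack([girth_train, height_train])
beta3 = np.linalg.lstsq(A3, volume_train, rcond=None)[0]
rms3 = np.sqrt(np.mean((A3 @ beta3 - volume_train) ** 2))

# Model (4): one extra basis function (taken here to be x1^2).
A4 = np.column_stack([girth_train, height_train, girth_train**2])
beta4 = np.linalg.lstsq(A4, volume_train, rcond=None)[0]
rms4 = np.sqrt(np.mean((A4 @ beta4 - volume_train) ** 2))

print(beta3.shape, beta4.shape)  # (2,) (3,)
```

Adding a column to the design matrix can only reduce the training RMS error, which is the kind of comparison part (d) asks you to reason about.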