
QBUS1040 Final Exam

Semester 2, 2022

The exam is open book, open notes. You may refer to the documentation of Python (python.org) and Python packages such as NumPy (docs.scipy.org). However, you cannot discuss this exam with anyone until the exam is over. This means you cannot use online forums, social media platforms, or any other means of communication.

Throughout this exam, we use standard mathematical notation; in particular, we do not use (and you may not use) notation from any computer language or from any strange or non-standard mathematical dialect (e.g., physics). Also, you may not use concepts or material from other classes (say, a linear algebra class you may have taken) in your solutions. All the problems can be done using only the material from this class, and we will deduct points from solutions that refer to outside material.

This exam consists of eight problems that require you to submit a written response. You should submit a PDF to Gradescope and match each page with the question it answers. You can find detailed instructions on how to scan and submit your assignments on Canvas. If you fail to match the pages to the corresponding questions, the marker will not be able to view your response, and thus, you will be awarded a mark of 0 for the question.

Your solutions should be neat and complete. We will deduct points from poorly written or incomplete/unjustified solutions even if they are correct.

Good luck!

Question:   1    2    3    4    5    6    7    8   Total
Points:    10   10   10   10   10   10   10   20      90

1. True or False?

(a) (2 points) A wide matrix can be invertible.

(b) (2 points) If C is a right inverse of A, then Cᵀ is a left inverse of Aᵀ.

(c) (2 points) If A has linearly independent rows and x = A†b, then Ax = b.

(d) (2 points) Consider the following regularized regression problem

minimize ‖Xᵀβ + ν1 − y‖² + λ‖β‖².

Increasing λ will force the predictions on the training set to converge to 0.

(e) (2 points) Adding basis functions to a model cannot increase the test error.

2. (10 points) Estimate the time it takes to compute each of the following operations. Which operation can be completed faster?

(A) Performing QR factorization on a 10,000 × 1,000 matrix on a computer that can compute 10 Gflop per second.

(B) Solving

Rx = b

where R is an upper triangular matrix with 20,000 rows on a computer that can compute 2 Gflop per second.
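
As a sanity check on such estimates, the standard flop counts from the course (about 2mn² flops for QR factorization of an m × n matrix, and about n² flops for back substitution with an n × n triangular R) can be evaluated numerically; the machine speeds below are the ones stated in the problem:

```python
# Rough timing estimates from flop counts (assumed costs: ~2*m*n^2
# flops for QR factorization of an m x n matrix, ~n^2 flops for back
# substitution on an n x n upper triangular system).
m, n = 10_000, 1_000
qr_flops = 2 * m * n**2           # total flops for QR
qr_time = qr_flops / 10e9         # machine (A): 10 Gflop/s

n_tri = 20_000
bs_flops = n_tri**2               # total flops for back substitution
bs_time = bs_flops / 2e9          # machine (B): 2 Gflop/s

print(qr_time, bs_time)           # 2.0 0.2 (seconds)
```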

3. Suppose B is a 4 × 2 matrix and assume that the matrices in the following operation conform.

[ A  B ] [ 1₃ ]   [ a ]
[ C  D ] [ x  ] = [ b ]

(a) (2 points)  Can the dimension of x be determined?  If yes, state what it is.

(b) (2 points)  Can the dimension of a be determined? If yes, state what it is.

(c) (3 points)  Can the dimension of A be determined?  If yes, state what it is.

(d) (3 points) Can the dimension of D be determined? If yes, state what it is.

4. (10 points) Calculate the pseudo-inverse of the following matrix by hand. Show all necessary steps. Present your answer in fractional form.

      [ 1  4 ]
A  =  [ 2  2 ]
      [ 4  1 ]
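
A hand computation of the pseudo-inverse can be cross-checked in NumPy. The sketch below uses A† = (AᵀA)⁻¹Aᵀ, which is valid here because A is tall with linearly independent columns:

```python
import numpy as np

# Cross-check of a hand-computed pseudo-inverse. Since A is tall with
# linearly independent columns, its pseudo-inverse is (A^T A)^{-1} A^T.
A = np.array([[1.0, 4.0],
              [2.0, 2.0],
              [4.0, 1.0]])

A_pinv = np.linalg.inv(A.T @ A) @ A.T

print(np.allclose(A_pinv, np.linalg.pinv(A)))  # True: matches NumPy's SVD-based pinv
print(np.allclose(A_pinv @ A, np.eye(2)))      # True: A† is a left inverse of A
```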

5. Suppose you have been given a 3 × N feature matrix X with rows x₁ᵀ, x₂ᵀ, x₃ᵀ, and you want to build a regression model to predict y. You solve the problem

minimize ‖[1  x₁  x₂  x₃] β − y‖²    (1)

and obtain β̂.

Now suppose you want to solve the problem

minimize ‖[1  x₁  x₂  x₃  (α₁x₂ + α₂1)] θ − y‖²,    (2)

where α₁ and α₂ are given positive scalars.

(a) (5 points) Would you expect the RMS error of (2) to be the same as, higher than, or lower than that of (1)?

Explain why.

(b) (5 points) Would you expect any issues when solving (2) directly? Why or why not?

6. (10 points) Consider the problem

minimize (9x₁ + 3x₂ − 6)² + (2x₁ − 7x₂ + 2)² + (2x₁ + 7x₂ − 4)²,

and let x̂ be its solution.

Show that x̂ can be obtained by solving a system of linear equations Cx̂ = d. Give the entries of the matrix C and the vector d, as well as their dimensions. Is the system under-determined, square, or over-determined?

Hint. This can all be calculated by hand. You do not need to invert any matrix or use any sophisticated algorithm.
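
A hand solution can be cross-checked numerically by writing the objective as ‖Ax − b‖², with one row of A and one entry of b per squared term; the sketch below assumes that reading of the three terms:

```python
import numpy as np

# Numerical cross-check: write the objective as ||A x - b||^2, one row
# of A and one entry of b per squared term (signs as in the problem).
A = np.array([[9.0, 3.0],
              [2.0, -7.0],
              [2.0, 7.0]])
b = np.array([6.0, -2.0, 4.0])

xhat = np.linalg.lstsq(A, b, rcond=None)[0]

# The least squares solution satisfies the normal equations (A^T A) x = A^T b.
print(np.allclose(A.T @ A @ xhat, A.T @ b))  # True
```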

7. (10 points) A least squares classifier ŷ = sign(xᵀβ + ν) achieves a false positive rate of 0.23 and a true positive rate of 0.95 on the training set, and a false positive rate of 0.26 and a true positive rate of 0.91 on a test set.

You now consider adjusting the decision boundary by considering

ŷ = sign(xᵀβ + ν − α),

where α is a scalar.

To reduce the false positive rate, how should we choose α? Should it be positive or negative? Explain your answer.
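
The effect of the shift α on the false positive rate can be explored on synthetic scores; everything below (labels, score model, seed) is illustrative and not part of the exam data:

```python
import numpy as np

# Illustrative only: how shifting the threshold by alpha changes the
# false positive rate. Labels, score model and seed are synthetic.
rng = np.random.default_rng(1)
y = np.where(rng.random(1000) < 0.5, 1, -1)    # true labels in {-1, +1}
s = y + rng.normal(0.0, 1.2, size=1000)        # stand-in for x^T beta + nu

def fpr(alpha):
    yhat = np.where(s - alpha > 0, 1, -1)      # sign(x^T beta + nu - alpha)
    negatives = y == -1
    return np.mean(yhat[negatives] == 1)       # fraction of negatives flagged

print(fpr(0.0), fpr(1.0))                      # raising alpha can only lower the rate
```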

8. The data provided in the Jupyter Notebook file QBUS1040-2022S2-FinalExam.ipynb contains six NumPy arrays. These include train and test variants of three variables:

• volume: This is the volume of timber produced by a tree, and our outcome variable.

• girth: This is the diameter of a tree.

• height: This is the height of a tree.

Suppose that x is a feature vector of length 2, with the first element encoding the diameter of the tree and the second element encoding the height.

(a) (4 points) We first consider a simple model

f̂(x) = β₁x₁ + β₂x₂. (3)

State the basis functions in this model.

(b) (6 points) Fit model (3) on the data and report the estimated coefficients. Also report the RMS error of your model on the training and test data. Round your answers to 3 decimal places.

(c) (6 points) We now consider the model

f̂(x) = β₁x₁ + β₂x₂ + β₃x₁². (4)

Fit model (4) on the data and report the estimated coefficients. Also report the RMS error of your model on the training and test data. Round your answers to 3 decimal places.

(d) (4 points) Which model is better? Briefly explain why.
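
For reference, the fits in parts (b) and (c) can be set up as below. The arrays are synthetic stand-ins for the six notebook arrays, and the extra basis function in model (4) is taken to be x₁² for illustration:

```python
import numpy as np

# Sketch of the fits in (b) and (c). Synthetic stand-ins for the six
# notebook arrays (girth_train, height_train, volume_train, ...).
rng = np.random.default_rng(0)
girth_train = rng.uniform(8.0, 20.0, size=25)
height_train = rng.uniform(60.0, 90.0, size=25)
volume_train = 0.002 * girth_train**2 * height_train + rng.normal(0.0, 1.0, 25)

# Model (3): basis functions f1(x) = x1 and f2(x) = x2.
A3 = np.column_stack([girth_train, height_train])
beta3 = np.linalg.lstsq(A3, volume_train, rcond=None)[0]
rms3 = np.sqrt(np.mean((A3 @ beta3 - volume_train) ** 2))

# Model (4): one extra basis function (taken here to be x1^2).
A4 = np.column_stack([girth_train, height_train, girth_train**2])
beta4 = np.linalg.lstsq(A4, volume_train, rcond=None)[0]
rms4 = np.sqrt(np.mean((A4 @ beta4 - volume_train) ** 2))

print(beta3.shape, beta4.shape)  # (2,) (3,)
```

Adding a column to the design matrix can only reduce the training RMS error, which is the kind of comparison part (d) asks you to reason about.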