FIT3152 Mock eExam with brief Answers/Marking Guide
R Coding (10 Marks)
eExam Q1 (4 Marks)
The DunHumby (DH) data frame records the Date a Customer shops at a store, the number of Days since their last shopping visit, and amount Spent for 20 customers. The first 4 rows are shown below.
Describe the action and output(s) of the R code.
Extract customer spend data pre 1/1/2011 [1 Mark]
Calculate the amount spent by each customer [1 Mark]
Find the 12 customers who spent the most [1 Mark]
Extract the data for the top 12 spending customers [1 Mark]
Save data as a csv file [1 Mark]
Draw a histogram of the data for each customer [1 Mark]
Up to a total of 4 Marks
eExam Q2 (6 Marks)
Describe the function performed by each line of code or code fragment.
(a) DHY = DH[as.Date(DH$visit_date,"%d-%m-%y") < as.Date("01-01-11","%d-%m-%y"),]
Create a new data frame consisting of observations (sales) earlier than 1/1/2011. [1 Mark]
(b) CustSpend = as.table(by(DHY$visit_spend, DHY$customer_id, sum))
Make a table of the total sales for (by) each customer [1 Mark]
(c) CustSpend = sort(CustSpend, decreasing = TRUE)
Sort the total sales table from highest to lowest [1 Mark]
(d) CustSpend = head(CustSpend, 12)
Keep the top 12 records: the 12 customers who have spent the most [1 Mark]
(e) DHYZ = DHY[(DHY$customer_id %in% CustSpend$customer_id),]
Extract the customer data for the top 12 spending customers from the main data (DHY) data frame [1 Mark]
(f) ... + facet_wrap(~ customer_id, nrow = 3)
Draw the individual plots as a grid of panels, one per customer, wrapped into 3 rows [1 Mark]
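Steps (a) to (e) can be traced end to end on a small made-up data set. The sketch below is illustrative only: the DH data frame here is synthetic (20 customers, random dates and spends), and the column names are taken from the code fragments above. One assumption is flagged in the comments: `$` is not defined for a table object, so line (e) as printed presumably relies on a conversion step that is not shown here; this sketch reads the customer ids from the names of the table instead.

```r
# Synthetic stand-in for the DH data frame (values are made up for illustration).
set.seed(1)
DH <- data.frame(
  customer_id = rep(1:20, each = 5),
  visit_date  = format(
    sample(seq(as.Date("2010-01-01"), as.Date("2011-12-31"), by = "day"),
           100, replace = TRUE),
    "%d-%m-%y"),
  visit_spend = round(runif(100, 5, 200), 2)
)

# (a) Keep only visits earlier than 1/1/2011.
DHY <- DH[as.Date(DH$visit_date, "%d-%m-%y") < as.Date("01-01-11", "%d-%m-%y"), ]

# (b) Total spend per customer, stored as a one-dimensional table.
CustSpend <- as.table(by(DHY$visit_spend, DHY$customer_id, sum))

# (c) Sort from highest to lowest total spend.
CustSpend <- sort(CustSpend, decreasing = TRUE)

# (d) Keep the 12 highest-spending customers.
CustSpend <- head(CustSpend, 12)

# (e) Assumption: customer ids are recovered from the table's names,
#     because CustSpend is a table rather than a data frame.
top_ids <- names(CustSpend)
DHYZ <- DHY[DHY$customer_id %in% top_ids, ]

print(CustSpend)
print(head(DHYZ))
```

DHYZ then holds only the pre-2011 visits of the 12 top spenders, which is the data a faceted plot such as the one in (f) would be drawn from.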
Regression (10 Marks)
A subset of the ‘diamonds’ data set from the R package ‘ggplot2’ was created. The data set reports price, size (carat) and quality (cut, color and clarity) information as well as specific measurements (x, y and z). The first 6 rows are printed below.
The least squares regression of log(price) on log(size) and color is given below. Note that ‘log’ in this context means the natural logarithm, log_e(X). Based on this output, answer the following questions.
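A model of this form can be fitted as in the minimal sketch below. It makes two assumptions that are not stated in the exam: it uses the full ‘diamonds’ data set, because the exam's subset is not specified, and it converts color to an unordered factor so that lm() reports one coefficient per color level relative to the baseline level (D). The resulting estimates will therefore not match the exam's output.

```r
library(ggplot2)  # provides the 'diamonds' data set

# Assumption: the full data set is used; the exam's subset is not known.
d <- as.data.frame(diamonds)

# Assumption: treat color as an unordered factor, so each color level gets
# its own coefficient relative to the baseline level D.
d$color <- factor(d$color, ordered = FALSE)

# Regress log(price) on log(carat) and color; log() is the natural logarithm.
fit <- lm(log(price) ~ log(carat) + color, data = d)
print(summary(fit))

# The log(carat) coefficient is an elasticity: a 1% increase in carat is
# associated with roughly that many percent change in price, color held fixed.
elasticity <- coef(fit)["log(carat)"]
print(elasticity)

# A color coefficient b is a difference in log(price) relative to color D,
# so exp(b) - 1 is the proportional price difference at the same carat.
rel_diff_J <- exp(coef(fit)["colorJ"]) - 1
print(rel_diff_J)
```

Because both price and size are logged, the slope is read as a percentage change, and the color terms are read as multiplicative shifts in price.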
2022-06-02