Stat 419 Introduction to Multivariate Statistics
Extra Credit (Assignment 5)
Submission deadline: 12 Dec 2022
Total Points: (+) 2 with total
The “crime.us” data is available in several R packages, including the “VGAMdata” package. The following code
gives you access to the data.
R code to access data:
install.packages("VGAMdata")
library(VGAMdata)
data(crime.us)
dim(crime.us)
names(crime.us)
head(crime.us)
The list of variables and their entries (observations) shows that this multivariate dataset contains data on different types of crime along with related information, e.g. population and state names. The state information lets us apply different multivariate methods and examine differences and similarities between states.
We create a new variable called “crime” by taking a subset of the dataset. The last variable (the 22nd) identifies the states, so we store it separately under the name “label”. Here is the R code:
crime = crime.us[, 13:20]
label = crime.us[, 22]
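As a quick sanity check on a positional subset like the one above, it can help to confirm that the selected columns are all numeric before passing them to cor() or prcomp(). The sketch below is only an illustration: it uses the USArrests data that ships with base R as a stand-in (an assumption, not the crime.us data), and the names arrests, rates and state_label are hypothetical.

```r
# Stand-in data: USArrests ships with base R; it is NOT the crime.us data.
arrests <- USArrests
arrests$State <- rownames(USArrests)  # hypothetical label column, added for illustration

# Positional subset, mirroring crime.us[, 13:20] and crime.us[, 22].
rates       <- arrests[, 1:4]
state_label <- arrests[, 5]

# Check what the positions actually picked up.
names(rates)
all_numeric <- all(sapply(rates, is.numeric))
all_numeric
head(state_label)
```

If the same check returns FALSE on your own subset, inspect names() and adjust the column range.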
Now, answer the following questions (all you need to do is run the provided code and comment on the output):
1. Perform a correlation analysis and report on your results. The following code should help:
crime.corr = cor(crime)
round(crime.corr, 3)
# Plot
library(ggplot2)
library(GGally)
ggpairs(crime)
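When reporting on a correlation matrix, it may be easier to read the variable pairs ranked by the absolute value of their correlation. The sketch below shows one possible way to do this; it uses USArrests from base R as a stand-in (an assumption, not the crime.us data), and pair_tab is a hypothetical name.

```r
# Stand-in data: USArrests from base R; it is NOT the crime.us data.
corr <- cor(USArrests)

# Keep each pair once by blanking the diagonal and the lower triangle.
corr[lower.tri(corr, diag = TRUE)] <- NA

# Reshape to a long table of (var1, var2, r) and sort by |r|, largest first.
pair_tab <- as.data.frame(as.table(corr), stringsAsFactors = FALSE)
names(pair_tab) <- c("var1", "var2", "r")
pair_tab <- pair_tab[!is.na(pair_tab$r), ]
pair_tab <- pair_tab[order(-abs(pair_tab$r)), ]
print(pair_tab, row.names = FALSE)
```

The same steps should apply to crime.corr once it has been computed.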
2. Perform a multivariate regression and comment on the important findings. You may run the following code:
library(car)
library(MVN)
# MV model fit
crime.mv = lm(cbind(ViolentCrimeTotal, PropertyCrimeTotal) ~ MurderRate + RapeRate + RobberyRate + AssaultRate + BurglaryRate + LarcenyTheftRate + MotorVehicleTheftRate + Population, data = crime.us)
# ANOVA table for each of the two response variables
summary.aov(crime.mv)
# Summary results
summary(crime.mv)
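The per-response tables can be complemented by a joint test of each predictor across both responses. A minimal sketch, assuming base R only: calling anova() on a multi-response lm fit reports multivariate tests (Pillai's trace by default). USArrests is used as a stand-in (an assumption, not the crime.us data), and fit and mv_tests are hypothetical names.

```r
# Stand-in data: USArrests from base R; it is NOT the crime.us data.
# Two responses (Murder, Rape) regressed on two predictors.
fit <- lm(cbind(Murder, Rape) ~ Assault + UrbanPop, data = USArrests)

# anova() on a multi-response fit gives sequential multivariate tests,
# one row per model term (Pillai's trace is the default statistic).
mv_tests <- anova(fit)
print(mv_tests)

# Coefficient matrix: one column per response, one row per term.
coef(fit)
```

A small p-value for a term here suggests that the predictor is related to at least one of the responses, which is a different question from the separate per-response tables.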
3. Perform a Principal Component Analysis (PCA) on the crime data. Make a short summary of your findings. You may use the following code:
# Run PCA
crime.pca = prcomp(crime)
# Plot the PCs
library(ggbiplot)
ggbiplot(crime.pca, labels = label)
# Summary results
summary(crime.pca)
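Note that prcomp() centers the variables but does not rescale them by default, so variables with large variances can dominate the leading components. The sketch below compares the unscaled and scaled versions; it uses USArrests from base R as a stand-in (an assumption, not the crime.us data), and pve is a hypothetical helper.

```r
# Stand-in data: USArrests from base R; it is NOT the crime.us data.
pca_raw    <- prcomp(USArrests)                 # covariance-based (unscaled)
pca_scaled <- prcomp(USArrests, scale. = TRUE)  # correlation-based (scaled)

# Proportion of variance explained by each component.
pve <- function(p) p$sdev^2 / sum(p$sdev^2)
pve_raw    <- pve(pca_raw)
pve_scaled <- pve(pca_scaled)
round(rbind(raw = pve_raw, scaled = pve_scaled), 3)

# Loadings of the first component under each choice.
round(cbind(raw = pca_raw$rotation[, 1], scaled = pca_scaled$rotation[, 1]), 3)
```

Whether scaling is appropriate depends on whether the variables are measured on comparable scales; it is worth stating which version your summary is based on.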
4. Run a factor analysis model and summarize the results. The following code might help:
# Fit the FA model
crime.fa = factanal(crime, factors = 2, rotation = "varimax")
# Print the factor loadings
print(loadings(crime.fa), cutoff = 0.5)
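When summarizing a factor model, the communality of each variable (the sum of its squared loadings) and its uniqueness are often reported alongside the loadings. The sketch below illustrates this on simulated data with a known two-factor structure; the data, the seed and all names (sim, fa, L, communality) are assumptions for illustration, not the crime.us data.

```r
# Simulated data (NOT the crime.us data): x1-x3 load on f1, x4-x6 on f2.
set.seed(419)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
sim <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# Two-factor model with varimax rotation, as in the handout's call.
fa <- factanal(sim, factors = 2, rotation = "varimax")

# Loadings as a plain matrix, then the communality of each variable.
L <- unclass(loadings(fa))
communality <- rowSums(L^2)
round(cbind(L, communality, uniqueness = fa$uniquenesses), 3)
```

Variables with a low communality are poorly explained by the chosen factors. Keep in mind that the cutoff argument of print() only hides small loadings in the display; it does not change the fitted model.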
2022-12-12