R Programming

This homework assignment asks students to: 1) Perform principal component analysis (PCA) and clustering on personality data and interpret the results. 2) Examine factor loadings and eigenvalues to determine the number of factors to retain from the PCA. 3) Compare k-means and hierarchical clustering results to interpret personality clusters. 4) Analyze and interpret the factor and cluster solutions to understand patterns in the personality data.

Uploaded by

whereareyoudear

Introduction to Computational Statistics (DSSH 6301),

Homework 11, Assignments

For all problems, please show all your work. As described in the Homework Guidelines, use RMarkdown to
write up your work as a .Rmd file, knit the result to a PDF file, and submit that PDF file to Blackboard.
(You can also knit to an HTML or Word document and save that as a PDF, as described in the Homework
Guidelines.) Be sure to use R code for all your calculations, and the LaTeX equation format to write up
any math. See the Homework Guidelines in Course Resources on Blackboard for more formatting details.
NOTES:
- Include the question number and text in your solutions.
- Provide solutions in the same order as given in the assignment.
- Some problems may have multiple solutions.
- If you did not have points taken off for a question, we considered your solution correct.
- Be concise.
- Use the minimum necessary number of significant digits; round() and signif() are your friends.
- Follow Google's R Style Guide.
Resources: regex sandbox
R: r-project, R-intro, short-intro, refcard, formatR, RStudio, RMarkdown, RevolutionR, swirl
LaTeX: latex-project, wiki, sheet, sandbox, hostmath, knitr::kable, tables generator, detexify, tcolorbox
Please perform a principal component analysis (PCA) and clustering using the bfi data set in the psych package
(25 personality items thought to boil down to a few core personality types) and interpret the results. You
can load the data using data(bfi) after loading the psych package; you may need to clean it a bit first
with na.omit() to remove the observations with NA items, or else impute those missing items. It might also
help to use scale() on your dataset before analysis. scale() takes all your variables (columns) and rescales
them to have a mean of 0 and an sd of 1, so that you can more easily compare all your principal components
or clusters to see which are larger or smaller.
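The cleaning steps described above can be sketched as follows. This is a minimal illustration, not a required solution: prepare_items() and the toy data frame are hypothetical names introduced here so the code runs without the psych package, and the commented lines assume that bfi is available and that its first 25 columns are the personality items.

```r
# Hypothetical helper: drop incomplete rows, then standardize each column.
prepare_items <- function(df) {
  complete <- na.omit(df)    # remove observations with any NA item
  scaled <- scale(complete)  # rescale each column to mean 0, sd 1
  as.data.frame(scaled)
}

# Toy stand-in data (not bfi) so the sketch is runnable on its own.
toy <- data.frame(a = c(1, 2, NA, 4, 5),
                  b = c(2, 4, 6, 8, 10))
toy_clean <- prepare_items(toy)
print(toy_clean)

# With the real data (assumption: bfi loads via psych; in some recent
# releases it may ship in the companion psychTools package instead):
# library(psych)
# data(bfi)
# items <- prepare_items(bfi[, 1:25])  # assumes columns 1-25 are the items
```

Dropping rows with na.omit() is the simpler of the two options the assignment mentions; imputation would keep more observations but needs its own justification.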
1. Examine the factor eigenvalues or variances (or the sdev or standard deviations as reported by prcomp or
princomp, which you then need to square to get the variances). Plot these in a scree plot and use the elbow
test to guess how many factors should be retained. What proportion of the total variance does your subset of
variables explain?
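One possible way to approach question 1 is sketched below using prcomp() from base R. The data are simulated stand-ins (six items driven by two latent traits), not bfi, and n_keep is a hypothetical value you would choose by reading the elbow in your own scree plot.

```r
set.seed(1)

# Simulated stand-in data: 200 observations, 6 items, 2 latent traits.
n <- 200
latent <- matrix(rnorm(n * 2), ncol = 2)
X <- latent[, rep(1:2, times = c(4, 2))] +
  matrix(rnorm(n * 6, sd = 0.5), ncol = 6)
colnames(X) <- paste0("item", 1:6)

# PCA on standardized variables.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# prcomp reports standard deviations; square them to get variances
# (the eigenvalues of the correlation matrix).
eig <- pca$sdev^2
prop_var <- eig / sum(eig)  # proportion of total variance per component
cum_var <- cumsum(prop_var) # cumulative proportion

# Scree plot: look for the elbow where the curve flattens.
plot(eig, type = "b", xlab = "Component", ylab = "Eigenvalue (variance)",
     main = "Scree plot")

# Assumption for illustration: the elbow suggests keeping 2 components.
n_keep <- 2
print(round(cum_var[n_keep], 3))
```

Because the variables are standardized, the eigenvalues sum to the number of variables, so each proportion is just the eigenvalue divided by that count.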
2. Examine the loadings of the PCs on the variables (sometimes called the rotation in the function
output), i.e., the projection of the principal components on the variables, focusing on just the first one or
two PCs. Sort the variables by their loadings, and try to interpret what the first one or two PCs (factors)
mean. This may require looking more carefully into the dataset to understand exactly what each of the
variables was measuring. You can find more about the data in the psych package using ?psych or by visiting
http://personality-project.org
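Sorting the loadings, as question 2 asks, might look like the sketch below. It again uses simulated stand-in data rather than bfi, and relies only on the rotation matrix returned by prcomp(). Note that the sign of a principal component is arbitrary, so only the relative pattern of loadings is interpretable.

```r
set.seed(1)

# Simulated stand-in data: 200 observations, 6 items, 2 latent traits.
n <- 200
latent <- matrix(rnorm(n * 2), ncol = 2)
X <- latent[, rep(1:2, times = c(4, 2))] +
  matrix(rnorm(n * 6, sd = 0.5), ncol = 6)
colnames(X) <- paste0("item", 1:6)

pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Loadings of the first two PCs on the original variables.
load1 <- pca$rotation[, "PC1"]
load2 <- pca$rotation[, "PC2"]

# Sort the variables by loading to see which items drive each PC.
sorted1 <- sort(load1, decreasing = TRUE)
sorted2 <- sort(load2, decreasing = TRUE)
print(round(sorted1, 2))
print(round(sorted2, 2))
```

With the real items, reading the sorted names against the item wording (for example through the data set's help page) is what turns the numbers into an interpretation.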
3. First use k-means and examine the centers of the first two or three clusters. How are they similar to and
different from the factor loadings of the first couple factors?
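A minimal sketch of the k-means step in question 3, using kmeans() from base R on simulated stand-in data (not bfi). The choice of three centers and nstart = 25 are assumptions made for illustration; the assignment leaves the number of clusters to you.

```r
set.seed(1)

# Simulated stand-in data: 200 observations, 6 items, 2 latent traits.
n <- 200
latent <- matrix(rnorm(n * 2), ncol = 2)
X <- latent[, rep(1:2, times = c(4, 2))] +
  matrix(rnorm(n * 6, sd = 0.5), ncol = 6)
colnames(X) <- paste0("item", 1:6)
X <- scale(X)  # standardize so centers are comparable across items

# k-means with several random starts to avoid a poor local optimum.
km <- kmeans(X, centers = 3, nstart = 25)

# Each row is a cluster center: the mean of every item within that cluster.
print(round(km$centers, 2))
print(km$size)

# For comparison with the PCA view: loadings of the first two PCs.
pca <- prcomp(X)
print(round(pca$rotation[, 1:2], 2))
```

On standardized data a center far from zero on a group of items marks a cluster that is high or low on those items, which is the pattern to compare against the groups of items that load together on a PC.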
4. Next use hierarchical clustering. Print the dendrogram, and use that to guide your choice of the number
of clusters. Use cutree to generate a list of which cluster each observation belongs to. Aggregate the data
by cluster and then examine those centers (the aggregate means) as you did in (3). Can you interpret all of
them meaningfully using the methods from (3) to look at the centers?
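The hierarchical clustering workflow in question 4 could be sketched as below, with hclust(), cutree(), and aggregate() from base R. The data are simulated stand-ins, and both the Ward linkage ("ward.D2") and the cut at k = 3 are assumptions for illustration; the assignment does not specify a linkage method or a number of clusters.

```r
set.seed(1)

# Simulated stand-in data: 200 observations, 6 items, 2 latent traits.
n <- 200
latent <- matrix(rnorm(n * 2), ncol = 2)
X <- latent[, rep(1:2, times = c(4, 2))] +
  matrix(rnorm(n * 6, sd = 0.5), ncol = 6)
colnames(X) <- paste0("item", 1:6)
X <- scale(X)

# Pairwise Euclidean distances, then agglomerative clustering.
d <- dist(X)
hc <- hclust(d, method = "ward.D2")  # linkage choice is an assumption

# Dendrogram: large vertical gaps suggest where to cut.
plot(hc, labels = FALSE, main = "Dendrogram")

# Cut the tree into k clusters (k = 3 assumed here for illustration).
groups <- cutree(hc, k = 3)

# Aggregate the data by cluster to get the cluster centers (means).
centers <- aggregate(as.data.frame(X), by = list(cluster = groups),
                     FUN = mean)
print(round(centers, 2))
print(table(groups))
```

The aggregated means play the same role as km$centers from k-means, so they can be read with the same item-by-item comparison.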
5. From the factor and cluster analysis, what can you say more generally about what you have learned about
your data?
