#PART 1a) : "vqv/ggbiplot"
require(ggplot2)
library(devtools)
install_github("vqv/ggbiplot")
## Skipping install of 'ggbiplot' from a github remote, the SHA1 (7325e880) has not changed since last install.
##   Use `force = TRUE` to force installation
require(ggbiplot)
dim(mpg)
## [1] 234 11
hwymedian=setNames(aggregate(mpg$hwy,list(mpg$manufacturer),median),c("manufacturer","hwy"))
ggplot(hwymedian, aes(x = reorder(manufacturer, -hwy), y = hwy)) + geom_point() +
  ylab("Miles Per Gallon(hwy)") + xlab("Manufacturers") + ggtitle("Fuel Efficiency vs Manufacturer")
[Figure: "Fuel Efficiency vs Manufacturer" -- median Miles Per Gallon(hwy) by manufacturer, ordered from most to least efficient: honda, volkswagen, hyundai, audi, nissan, pontiac, subaru, toyota, chevrolet, jeep, ford, mercury, dodge, lincoln, land rover]
# We can see from the plot that the most fuel-efficient manufacturer is Honda and the least is Land Rover, as it lies at the bottom of the ordering.
(b). Write code that displays a graph which plots in the order of decreasing medians of the vehicle’s miles-
per-gallon on highway (hwy) against the type of car (class). Plot the graph and list the classes of vehicle in
the order of their fuel efficiency.
#PART 1b)
classmedian=setNames(aggregate(mpg$hwy,list(mpg$class),median),c("class","hwy"))
ggplot(classmedian, aes(x = reorder(class, -hwy), y = hwy)) + geom_point() +
  ylab("Miles Per Gallon(hwy)") + xlab("Class") + ggtitle("Fuel Efficiency vs Class")
[Figure: "Fuel Efficiency vs Class" -- median Miles Per Gallon(hwy) by vehicle class, ordered from most to least efficient]
# The most efficient classes are compact and midsize, while the least efficient are SUV and pickup.
(c). Draw a bar chart of manufacturers in terms of numbers of different types of cars manufactured. Based
on this, comment on classes of vehicles manufactured by the companies producing the most and the least
fuel efficient vehicles and possible reason(s) for highest/lowest fuel efficiency.
# PART 1c)
# Stacked bars: hwy values summed within each manufacturer, split (filled) by vehicle class
ggplot(mpg, aes(x = manufacturer)) +
  geom_col(aes(y = hwy, fill = class))
[Figure: stacked bar chart of summed hwy by manufacturer, filled by class (2seater, compact, midsize, minivan, pickup, subcompact, suv)]
# From the bar chart we can see that the compact and midsize classes of car are fuel efficient, while the SUV and pickup classes are not.
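The chart above stacks summed hwy values rather than counts; if the aim is simply the number of cars of each class per manufacturer, as the question wording suggests, a count-based variant is a small change (an alternative sketch, not part of the original answer):
ggplot(mpg, aes(x = manufacturer, fill = class)) +
  geom_bar()   # bar height = number of cars per manufacturer, stacked by class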
Exercise 2. The diamonds dataset within R’s ggplot2 contains 10 columns (price, carat, cut, color, clarity,
length(x), width(y), depth(z), depth percentage, top width) for 53940 different diamonds. Using this dataset,
carry out the following tasks.
(a). Write code to plot histograms for carat and price. Plot these graphs and comment on their shapes.
#PART 2a)
library(ggplot2)
ggplot(data = diamonds, aes(x = price)) +
geom_histogram(binwidth = 500,colour="blue") + xlab('Price') +
ylab('Frequency')
[Figure: histogram of diamond Price (binwidth 500), frequency vs price]
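The carat histogram shown next appears without its producing code in the document; a minimal sketch of the analogous call (the binwidth of 0.1 is an assumption):
ggplot(data = diamonds, aes(x = carat)) +
  geom_histogram(binwidth = 0.1, colour = "blue") +  # binwidth chosen for illustration
  xlab('Carat') + ylab('Frequency')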
[Figure: histogram of Carat (frequency vs carat, 0 to 5)]
(b). Write code to plot bar charts of cut proportioned in terms of color and again bar charts of cuts
proportioned in terms of clarity. Comment on how proportions of diamonds change in terms of clarity and
colour under different cut categories.
#PART 2b)
ggplot(diamonds,
aes(x = clarity,
fill =cut)) +
geom_bar(position = "fill")+ylab("Proportions")
[Figure: bar chart of cut proportions (Fair, Good, Very Good, Premium, Ideal) within each clarity category]
ggplot(diamonds,
aes(x = color,
fill = cut)) +
geom_bar(position = "fill")+ylab("Proportions")
[Figure: bar chart of cut proportions within each color category (D, E, F, G, H, I, J)]
# According to the charts, fair-cut diamonds form the highest proportion within the I1 clarity category and much smaller proportions within the clearer categories.
# On the basis of colour, we can see that all colour categories have roughly equal proportions of the different cut qualities.
(c). Write code to display an appropriate graph that facilitates the investigation of a three-way relationship
between cut, carat and price. Plot the graph. What inferences can you draw regarding the three way
relationship?
#PART 2c)
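The plotting call for this part does not survive in the document; judging from the figure below (price against carat, coloured by cut), a minimal sketch of a call that would produce such a plot (the alpha value is an assumption to reduce overplotting):
ggplot(diamonds, aes(x = carat, y = price, colour = cut)) +
  geom_point(alpha = 0.3)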
[Figure: scatter plot of price vs carat, points coloured by cut (Fair, Good, Very Good, Premium, Ideal)]
# From the plot we can see that most of the diamonds have an ideal cut. There is not much variation across cuts; the price increases with carat for all cut categories.
Exercise 3. Before deciding about selecting a particular machine learning technique for a data science prob-
lem, it is important to study the data distribution particularly through visualization. However, visualizing a
multivariate data with two or more variables is difficult in a two dimensional plot. In this exercise, you are
required to study the R’s iris dataset which is a multivariate data consisting of four features or properties
(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) characterizing three species of iris flower (setosa,
versicolor, and virginica). The principal component analysis (PCA) is a technique that can help facilitate
visualization of a multivariate data distribution. The first two principal components (PC1 and PC2) ob-
tained after applying PCA, can explain the majority of variations in the data. In order to study the data
variability in iris data-set, perform the following tasks.
(a). Write code to obtain PC scores.
#PART 3a)
library(ggplot2)
log.ir <- log(iris[, 1:4])
ir.species <- iris[, 5]
pcairis <- prcomp(log.ir,
center = TRUE,
scale. = TRUE)
print(pcairis)
## Standard deviations (1, .., p=4):
## [1] 1.7124583 0.9523797 0.3647029 0.1656840
##
## Rotation (n x k) = (4 x 4):
## PC1 PC2 PC3 PC4
## Sepal.Length 0.5038236 -0.45499872 0.7088547 0.19147575
## Sepal.Width -0.3023682 -0.88914419 -0.3311628 -0.09125405
## Petal.Length 0.5767881 -0.03378802 -0.2192793 -0.78618732
## Petal.Width 0.5674952 -0.03545628 -0.5829003 0.58044745
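The proportion of variance explained by each component (the percentages quoted on the biplot axes in part (b)) can also be read off directly; a small addition for reference:
summary(pcairis)   # PC1 ~ 73% and PC2 ~ 23% of the variance, per the biplot labels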
(b). Write code to obtain a scatter plot representing PC1 vs. PC2, wherein data clusters corresponding to
three flower types are clearly marked using possibly an ellipsoid.
#PART 3b
ggbiplot(pcairis,choice=c(1,2), groups=iris$Species, ellipse=TRUE,
scale=0,var.scale=0.2, colour="blue", varname.size=3)+
ggtitle("PCA visualization",subtitle="ggbiplot()") +
theme(plot.title =element_text(size=15, face="bold", hjust=0.5,
colour = "red"),plot.subtitle =element_text(size=10,
face="bold.italic", hjus
[Figure: "PCA visualization" (ggbiplot) -- PC1 (73.3% explained var.) vs PC2 (22.7% explained var.), with setosa, versicolor and virginica groups marked by ellipses and loading arrows for the four features]
(c). Run the codes to make the scatter plot, mark flowers using ellipsoids and comment on the feature
distribution.
#PART 3c
ggplot(iris, aes(x=Petal.Length, y=Petal.Width, colour=Species)) +
geom_point() +
stat_ellipse()
[Figure: scatter plot of Petal.Width vs Petal.Length coloured by Species (setosa, versicolor, virginica), each group marked with an ellipse]
Exercise 4. In this task, you are required to analyze the Animals dataset from the MASS package. This dataset contains brain weight (in grams) and body weight (in kilograms) for 28 different animal species. The three largest animals are dinosaurs, whose measurements are obviously the result of scientific modeling rather than precise measurements.
A scatter plot given below fails to describe any obvious relationship between brain weight and body weight
variables. You are required to apply appropriate power transformations to the variables to obtain more
interpretable plot and describe the obtained relationship. To this end, undertake the following tasks.
library(ggplot2)
library(MASS)
data(Animals)
qplot(brain, body, data = Animals)
[Figure: scatter plot of body against brain for the Animals data; no obvious relationship is visible on the raw scale]
Task-1. Check whether each of the variables has normal distribution. Your response should be based on an
appropriate statistical test as well as smoothed histogram plots.
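The calls producing the test output below are not shown in the document; a minimal sketch:
shapiro.test(Animals$body)
shapiro.test(Animals$brain)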
##
## Shapiro-Wilk normality test
##
## data: Animals$body
## W = 0.27831, p-value = 1.115e-10
##
## Shapiro-Wilk normality test
##
## data: Animals$brain
## W = 0.45173, p-value = 3.763e-09
# Using density plots and checking the difference between median and mean
plot(density(Animals$body))
[Figure: density.default(x = Animals$body), N = 28, Bandwidth = 164.1]
plot(density(Animals$brain))
[Figure: density.default(x = Animals$brain), N = 28, Bandwidth = 137.2]
hist(Animals$brain,30)
[Figure: Histogram of Animals$brain, 30 bins]
hist(Animals$body,30)
[Figure: Histogram of Animals$body, 30 bins]
# Through the density plots we can see a big difference between mean and median, which shows that the data are heavily skewed and not normally distributed.
Task-2. A power transformation of a variable X consists of raising X to the power lambda. Using an
appropriate statistical test and/or plot, find best lambda values needed for transforming each of the variables
requiring power transformation.
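The transformed variables plotted below (brain_sqrt, brain_cub, brain_log, and so on) are never defined in the document; a minimal sketch of the definitions they appear to assume, where square-root, cube-root and log transforms are inferred from the names:
brain_sqrt <- sqrt(Animals$brain)     # lambda = 1/2
brain_cub  <- Animals$brain^(1/3)     # lambda = 1/3 (cube root)
brain_log  <- log(Animals$brain)      # lambda = 0 (log)
body_sqrt  <- sqrt(Animals$body)
body_cub   <- Animals$body^(1/3)
body_log   <- log(Animals$body)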
plot(density(brain_sqrt))
[Figure: density.default(x = brain_sqrt), N = 28, Bandwidth = 5.46]
plot(density(brain_cub))
[Figure: density.default(x = brain_cub), N = 28, Bandwidth = 2.196]
plot(density(brain_log))
[Figure: density.default(x = brain_log), N = 28, Bandwidth = 1.03]
plot(density(body_sqrt))
[Figure: density.default(x = body_sqrt), N = 28, Bandwidth = 6.94]
plot(density(body_cub))
[Figure: density.default(x = body_cub), N = 28, Bandwidth = 2.196]
plot(density(body_log))
[Figure: density.default(x = body_log), N = 28, Bandwidth = 1.74]
# Our results show that the best transformation is the log. We will confirm this with a Box-Cox transformation.
animal_body=lm(body~.,data=Animals)
animal_brain=lm(brain~.,data=Animals)
boxcox(animal_body,plotit = TRUE)
[Figure: Box-Cox profile log-likelihood for animal_body over lambda in (-2, 2), with the 95% interval marked]
boxcox(animal_brain,plotit = TRUE)
[Figure: Box-Cox profile log-likelihood for animal_brain over lambda in (-2, 2), with the 95% interval marked]
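As a numerical cross-check of the two Box-Cox plots, the lambda maximising the profile log-likelihood can be extracted directly; a small sketch (these plotit = FALSE calls are an addition, not part of the original answer):
bc_body  <- boxcox(animal_body,  plotit = FALSE)
bc_brain <- boxcox(animal_brain, plotit = FALSE)
bc_body$x[which.max(bc_body$y)]    # lambda with maximum log-likelihood for body
bc_brain$x[which.max(bc_brain$y)]  # lambda with maximum log-likelihood for brain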
Task-3. Apply power transformation and verify whether transformed variables have a normal distribution
through statistical test as well as smoothed histogram plots.
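The code applying the transformation is not included in the document; since the test output below still refers to Animals$brain and Animals$body, the columns were presumably overwritten with their logs, along the lines of this assumed sketch:
Animals$brain <- log(Animals$brain)   # assumed in-place log transform
Animals$body  <- log(Animals$body)
shapiro.test(Animals$brain)
shapiro.test(Animals$body)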
##
## Shapiro-Wilk normality test
##
## data: Animals$brain
## W = 0.95787, p-value = 0.31
##
## Shapiro-Wilk normality test
##
## data: Animals$body
## W = 0.98465, p-value = 0.9433
plot(density(Animals$brain))
[Figure: density.default(x = Animals$brain) after the log transform, N = 28, Bandwidth = 1.03]
plot(density(Animals$body))
[Figure: density.default(x = Animals$body) after the log transform, N = 28, Bandwidth = 1.74]
hist(Animals$brain,5)
[Figure: Histogram of Animals$brain (log scale), 5 bins]
hist(Animals$body,30)
[Figure: Histogram of Animals$body (log scale), 30 bins]
Task-4. Create a scatter plot of the transformed data. Based on the visual inspection of the plot, provide
your interpretation of the relationship between brain weight and body weight variables. You may like to add
an appropriate smoothed line curve to your plot to help in interpretation.
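The code for the plot below is not included in the document; a minimal sketch that reproduces a log-log scatter with a smoothed trend line from the untransformed MASS data (the linear smoother is an assumption):
library(ggplot2)
library(MASS)
data(Animals)                                  # reload the untransformed data
ggplot(Animals, aes(x = log(body), y = log(brain))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +     # smoothed linear trend line
  xlab("Log(Body Weight)") + ylab("Log(Brain Weight)") +
  ggtitle("Plot of Body Wt. to Brain Wt.")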
[Figure: "Plot of Body Wt. to Brain Wt." -- Log(Brain Weight) against Log(Body Weight)]
# We have used a logarithmic transformation, which compresses the large values and stretches the small ones.
# There is a strong positive relationship between these weights, on the grounds that a large body might well need a large brain to control it.