Unit 4 DVTTT

This document discusses customizing 3D scatterplots in R. It shows how to create a 3D scatterplot of automobile mileage vs engine displacement vs weight using the mtcars dataset. Several customizations are demonstrated, including replacing points with filled blue circles, adding drop lines to the x-y plane, and labeling points. Cylinder information is also added through coloring points. Finally, the document introduces biplots, showing how to create one for mtcars to visualize relationships between observations and variables in 2D space.

Uploaded by

Saimahima

Unit 4

Customizing Graphs

3-D Scatterplot
▹ The ggplot2 package and its extensions can’t create a 3-D plot.
▹ However, you can create a 3-D scatterplot with the scatterplot3d function in the scatterplot3d package.
▹ Let’s plot automobile mileage vs. engine displacement vs. car weight using the data in the mtcars data frame.
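
The original slide's code is not preserved here, so the following is only a minimal sketch of such a plot. The axis assignment (displacement on x, weight on y, mileage on z) and the title are assumptions.

```r
# Basic 3-D scatterplot of the built-in mtcars data.
# Assumed axis assignment: x = displacement, y = weight, z = mileage.
library(scatterplot3d)

with(mtcars, {
  scatterplot3d(x = disp,
                y = wt,
                z = mpg,
                main = "3-D Scatterplot Example 1")  # title is illustrative
})
```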

Now let’s modify the graph by replacing the points with filled blue circles, adding drop lines to the x-y plane, and creating more meaningful labels.
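
One way these three changes might be written is sketched below. The arguments color, pch, type, xlab, ylab and zlab are existing scatterplot3d arguments; the label wording is an assumption.

```r
# Filled blue circles, drop lines to the x-y plane, and descriptive labels.
library(scatterplot3d)

with(mtcars, {
  scatterplot3d(x = disp, y = wt, z = mpg,
                color = "blue",   # point color
                pch = 19,         # filled circle symbol
                type = "h",       # vertical lines from each point to the x-y plane
                main = "3-D Scatterplot Example 2",   # illustrative title
                xlab = "Displacement (cu. in.)",      # label text is assumed
                ylab = "Weight (1000 lbs)",
                zlab = "Miles per Gallon")
})
```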

▹ Next, label the points.
▹ This is done by saving the result of the scatterplot3d function to an object, using the xyz.convert function to convert coordinates from 3-D (x, y, z) to 2-D projections (x, y), and applying the text function to add labels to the graph.
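
The three steps above can be sketched as follows. The object names s3d and coords and the text size are illustrative choices, not taken from the slides.

```r
# Label each point with the car name.
library(scatterplot3d)

# Step 1: keep the object returned by scatterplot3d()
s3d <- scatterplot3d(x = mtcars$disp, y = mtcars$wt, z = mtcars$mpg,
                     color = "blue", pch = 19, type = "h",
                     main = "3-D Scatterplot Example 3")  # illustrative title

# Step 2: project the 3-D coordinates onto the 2-D plotting surface
coords <- s3d$xyz.convert(mtcars$disp, mtcars$wt, mtcars$mpg)

# Step 3: write the row names next to the projected points
text(coords$x, coords$y,
     labels = rownames(mtcars),
     cex = 0.5,   # small text to limit overlap
     pos = 4)     # place each label to the right of its point
```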

▹ As a final step, we will add information on the number of cylinders in each car.
▹ To do this, we’ll add a column to the mtcars data frame indicating the color for each point.
▹ For good measure, we will shorten the y-axis, change the drop lines to dashed lines, and add a legend.
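
A possible version of this final step is sketched below. The color choices, the copy named cars, the column name pcolor and the legend placement are all assumptions. scale.y and lty.hplot are existing scatterplot3d arguments.

```r
# Color points by number of cylinders, shorten the y-axis,
# use dashed drop lines, and add a legend.
library(scatterplot3d)

cars <- mtcars   # work on a copy (hypothetical name)
cyl_colors <- c("4" = "red", "6" = "blue", "8" = "darkgreen")  # assumed palette
cars$pcolor <- unname(cyl_colors[as.character(cars$cyl)])

with(cars, {
  s3d <- scatterplot3d(x = disp, y = wt, z = mpg,
                       color = pcolor, pch = 19,
                       type = "h",
                       lty.hplot = 2,   # dashed drop lines
                       scale.y = 0.75,  # shorten the y-axis
                       main = "3-D Scatterplot Example 4")
  coords <- s3d$xyz.convert(disp, wt, mpg)
  text(coords$x, coords$y, labels = rownames(cars), cex = 0.5, pos = 4)
  legend("topleft",
         legend = names(cyl_colors), fill = cyl_colors,
         title = "Number of Cylinders", bty = "n", cex = 0.7)
})
```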


▹ We can now easily see that the car with the highest mileage (Toyota Corolla) has low engine displacement, low weight, and 4 cylinders.

Biplots
▹ A biplot is a specialized graph that attempts to represent the relationships between observations, between variables, and between observations and variables, in a low-dimensional (usually two-dimensional) space.
▹ Let’s create a biplot for the mtcars dataset, using the fviz_pca function from the factoextra package.
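
A minimal sketch of such a biplot follows. It assumes the principal components are computed with prcomp on standardized variables. The title, the label size and the use of repel = TRUE are illustrative choices.

```r
# Biplot of the mtcars data from a principal components fit.
library(ggplot2)
library(factoextra)

# Assumption: PCA on centered and scaled variables
fit <- prcomp(mtcars, center = TRUE, scale. = TRUE)

p <- fviz_pca(fit,
              repel = TRUE,    # try to keep text labels from overlapping
              labelsize = 3) +
  theme_bw() +
  labs(title = "Biplot of mtcars data")   # illustrative title

print(p)
```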

▹ The fviz_pca function produces a ggplot2
graph.
▹ Dim1 and Dim2 are the first two principal
components - linear combinations of the
original p variables.
▹ $PC_1 = \beta_{10} + \beta_{11}x_1 + \beta_{12}x_2 + \beta_{13}x_3 + \cdots + \beta_{1p}x_p$
▹ $PC_2 = \beta_{20} + \beta_{21}x_1 + \beta_{22}x_2 + \beta_{23}x_3 + \cdots + \beta_{2p}x_p$
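
To make the linear-combination idea concrete, the sketch below rebuilds the first component's scores by hand. It assumes a prcomp fit on standardized variables. In that setup the constant term is absorbed by the centering step, so only the weights on the standardized variables appear.

```r
# PC scores are weighted sums of the (standardized) original variables.
fit <- prcomp(mtcars, center = TRUE, scale. = TRUE)

z <- scale(mtcars)    # standardized variables, one column per x_j
w <- fit$rotation     # weights: one column per principal component

# First principal component computed by hand as a weighted sum
pc1_manual <- as.vector(z %*% w[, "PC1"])

# Compare with the scores stored in the fitted object
all.equal(pc1_manual, unname(fit$x[, "PC1"]))
```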

▹ The weights of these linear combinations (the $\beta_{ij}$s) are chosen to maximize the variance accounted for in the original variables.
▹ Additionally, the principal components
(PCs) are constrained to be
uncorrelated with each other.
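
Both properties can be checked numerically. This is a small sketch assuming the same prcomp fit on standardized mtcars variables. The variable names are illustrative.

```r
# Check that the component scores are mutually uncorrelated and that
# each successive component accounts for no more variance than the last.
fit <- prcomp(mtcars, center = TRUE, scale. = TRUE)

r <- cor(fit$x)                             # correlations among PC scores
max_off_diag <- max(abs(r[upper.tri(r)]))   # should be numerically zero

variances <- fit$sdev^2                     # variance of each component

max_off_diag
variances
```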

▹ In this graph, the first PC accounts for
60% of the variability in the original
data.
▹ The second PC accounts for 24%.
Together, they account for 84% of the
variability in the original p = 11
variables.
▹ As you can see, both the observations
(cars) and variables (car
characteristics) are plotted in the same
graph.
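
The percentages quoted above can be recomputed from the component variances. This sketch assumes the biplot was built from a PCA of the standardized variables, which is consistent with the 60% and 24% figures.

```r
# Proportion of total variance accounted for by each component.
fit <- prcomp(mtcars, center = TRUE, scale. = TRUE)

prop_var <- fit$sdev^2 / sum(fit$sdev^2)

round(100 * prop_var[1:2])        # first and second component, in percent
round(100 * sum(prop_var[1:2]))   # the two together, in percent
```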
▹ Points represent observations. Smaller
distances between points suggest similar
values on the original set of variables.
▹ For example, the Toyota Corolla and Honda Civic are similar to each other, as are the Chrysler Imperial and Lincoln Continental.
▹ However, the Toyota Corolla is very
different from the Lincoln Continental.
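
One way to check these similarity claims against the full data is to compare Euclidean distances between cars on the standardized variables. This is a sketch for verification only and is not part of the biplot itself.

```r
# Distances between cars across all 11 standardized variables.
z <- scale(mtcars)
d <- as.matrix(dist(z))   # Euclidean distance between every pair of cars

d["Toyota Corolla", "Honda Civic"]              # similar pair
d["Chrysler Imperial", "Lincoln Continental"]   # similar pair
d["Toyota Corolla", "Lincoln Continental"]      # dissimilar pair
```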

▹ The observations that are farthest along the direction of a variable’s vector have the highest values on that variable.
▹ For example, the Toyota Corolla and Honda Civic have higher values on mpg. The Toyota Corona has a higher qsec. The Duster 360 has more cylinders.
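
These readings can be compared with the raw values in mtcars. The name checks is an illustrative choice.

```r
# Look up the original values for the cars mentioned above.
checks <- mtcars[c("Toyota Corolla", "Honda Civic",
                   "Toyota Corona", "Duster 360"),
                 c("mpg", "qsec", "cyl")]
checks

# Which car has the highest mileage overall?
rownames(mtcars)[which.max(mtcars$mpg)]
```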

▹ Care must be taken in interpreting
biplots.
▹ They are only accurate when the
percentage of variance accounted for is
high.
▹ Always check your conclusion with the
original data. See the article by Forrest
Young to learn more about interpreting
biplots correctly.
Thanks!
