Plotting with RStudio

Objective of Exercise:

This lab introduces you to plotting in R with ggplot2 and GGally. GGally is an extension of ggplot2.

Exercise:

1. Click the plus symbol on the top left and click R Script to create a new R script, if you don’t have one open already.

2. You will use the iris dataset. If you don’t have it loaded, copy and paste the following into your R script file.

    library(datasets)
    data(iris)
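If you want to confirm that the dataset loaded as expected before plotting, a quick check along these lines may help. This is only a sketch: the variable names n_rows and species_counts are chosen here for illustration and are not part of the lab.

```r
library(datasets)
data(iris)

# Show the column names and types (four numeric columns plus the Species factor).
str(iris)

# Illustrative names, not part of the lab: row count and rows per species.
n_rows <- nrow(iris)
species_counts <- table(iris$Species)

print(n_rows)
print(species_counts)
```

The Species column printed by str(iris) is the one the plotting command in the next step uses for colouring.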

3. In the previous lab, you installed the libraries needed to create plots. Now run the following commands:

    library(GGally)
    ggpairs(iris, mapping = ggplot2::aes(colour = Species))
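If library(GGally) fails with a "there is no package called" error, the packages from the previous lab are probably not installed in your current R session. The sketch below only reports which of the two packages are missing and does not install anything. The names needed and missing_pkgs are illustrative, and the package list is an assumption based on the commands in this lab.

```r
# Assumption: this lab needs ggplot2 and GGally, as used in the commands above.
needed <- c("ggplot2", "GGally")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
is_available <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_available]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "),
          ". Try install.packages() for these, then rerun the commands.")
} else {
  message("All required packages are available.")
}
```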

4. Select the commands and click Run on the top. You’ll see the following plot in the Plots window:

5. Click the Zoom icon in the Plots window to open a larger view of the plot.

6. This single line of code gives you a lot of information. On the diagonal, you can see the distribution of each column, split by species. The tiles to the
left of the diagonal show all the pair-wise scatter plots, again coloured by species. It is obvious, for example, that a straight line can be drawn to separate
setosa from versicolor and virginica. In later courses, you will also learn how the overlapping species can be separated, which is done with supervised machine
learning using non-linear classifiers. The tiles to the right of the diagonal show the correlation between each pair of columns, overall and per species. A
correlation close to 1 or -1 signifies a strong linear relationship between the two columns, whereas a value close to zero signifies a weak one. In several tiles,
the setosa value differs noticeably from the versicolor and virginica values, which is consistent with setosa being more different, and hence easier to
distinguish, than the other two. The remaining plots in the right-hand column are called box plots, and the ones in the bottom row are called histograms, but
you will learn about these in a more advanced course in this series.
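The two observations above, that setosa can be separated with a straight line and that correlations differ by species, can also be checked numerically. The following is a minimal sketch using base R only. The threshold of 2.5 on Petal.Length is an assumption read off the scatter plots by eye, and the variable names are illustrative.

```r
library(datasets)
data(iris)

# Assumption: a cut at Petal.Length = 2.5, chosen by eye from the scatter plots.
threshold <- 2.5
is_setosa <- iris$Species == "setosa"

# TRUE if every setosa flower falls below the cut and every other flower above it.
setosa_below <- all(iris$Petal.Length[is_setosa] < threshold)
others_above <- all(iris$Petal.Length[!is_setosa] > threshold)
print(c(setosa_below = setosa_below, others_above = others_above))

# Correlation matrix of the four numeric columns, all species pooled.
overall_cor <- cor(iris[, 1:4])
print(round(overall_cor, 2))

# The same matrix computed separately for each species.
per_species_cor <- lapply(split(iris[, 1:4], iris$Species), cor)
print(lapply(per_species_cor, round, 2))
```

Comparing the pooled matrix with the per-species matrices shows why the colour mapping is useful: a relationship that looks strong or weak across all flowers can look quite different within a single species.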

Author(s)
Romeo

Other Contributor(s)

Lavanya

Change log
Date        Version  Changed by     Change Description
2022-12-30  1.2      Steve Hord     QA pass edits
2020-12-10  1.1      Aije           Created simplified version of the lab
2020-12-10  1.0      Malika Singla  Migrated lab to Markdown

© IBM Corporation 2020. All rights reserved.
