DSR - Unit 2 - 3.3 LineGraphs
Sailaja Thota
AY:2021-22
OUTLINE
Recap of previous Lecture
Topic for the Lecture
Lecture Discussion
Loop Functions
Data Science with R
RECAP OF PREVIOUS LECTURE
Lecture Outcome
Outline the creation and use of graphs in R
LINE GRAPHS
• Line graphs are typically used for visualizing how one continuous variable, on the y-axis, changes in relation to another continuous variable, on the x-axis.
• Often the x variable represents time, but it may also represent some other continuous quantity; for example, the amount of a drug administered to experimental subjects.
• Line graphs can also be used with a discrete variable on the x-axis. This is appropriate when the variable is ordered (e.g., “small,” “medium,” “large”), but not when the variable is unordered (e.g., “cow,” “goose,” “pig”).
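As a minimal sketch of both cases, the block below draws one line graph with a continuous x variable and one with an ordered discrete x variable. It assumes the ggplot2 package is installed and uses BOD, a small data set that ships with R; the object names p_cont and p_disc are illustrative only.

```r
library(ggplot2)

# BOD is a built-in data set with two numeric columns: Time and demand.
# Continuous x: Time is treated as a number on a continuous scale.
p_cont <- ggplot(BOD, aes(x = Time, y = demand)) +
  geom_line()

# Ordered discrete x: Time is converted to a factor. With a factor on the
# x-axis, group = 1 tells ggplot that all points belong to a single line.
p_disc <- ggplot(BOD, aes(x = factor(Time), y = demand, group = 1)) +
  geom_line()

print(p_cont)
print(p_disc)
```

In the second plot the categories are evenly spaced, so any gap in the Time values is no longer visible.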
BASIC LINE GRAPH
• You want to make a line graph with more than one line.
• Solution
• In addition to the variables mapped to the x- and y-axes, map another (discrete) variable to colour or linetype, as shown in Figure 4-6:
• The tg data has three columns, including the factor supp, which we mapped to colour and linetype:
tg
#>   supp dose length
#> 1   OJ  0.5  13.23
#> 2   OJ  1.0  22.70
#> 3   OJ  2.0  26.06
#> 4   VC  0.5   7.98
#> 5   VC  1.0  16.77
#> 6   VC  2.0  26.14
• If the x variable is a factor, you must also tell ggplot to group by that same variable, as described next.
• Line graphs can be used with a continuous or categorical variable on the x-axis. Sometimes the variable mapped to the x-axis is conceived of as being categorical, even when it’s stored as a number. In the example here, there are three values of dose: 0.5, 1.0, and 2.0. You may want to treat these as categories rather than values on a continuous scale. To do this, convert dose to a factor (Figure 4-7):
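The mappings described in this section can be sketched as follows. This is an illustration under stated assumptions, not the slides' original code: tg is rebuilt by hand from the six rows listed for it, ggplot2 is assumed to be installed, and the plot object names are hypothetical.

```r
library(ggplot2)

# Assumption: tg is rebuilt from the six rows listed in the slides.
tg <- data.frame(
  supp   = factor(c("OJ", "OJ", "OJ", "VC", "VC", "VC")),
  dose   = c(0.5, 1.0, 2.0, 0.5, 1.0, 2.0),
  length = c(13.23, 22.70, 26.06, 7.98, 16.77, 26.14)
)

# Continuous x: mapping the factor supp to colour or linetype
# also splits the data into one line per supp level.
p_colour <- ggplot(tg, aes(x = dose, y = length, colour = supp)) +
  geom_line()
p_linetype <- ggplot(tg, aes(x = dose, y = length, linetype = supp)) +
  geom_line()

# Categorical x: dose is converted to a factor, so the grouping
# variable is stated explicitly with group = supp.
p_factor <- ggplot(tg, aes(x = factor(dose), y = length,
                           colour = supp, group = supp)) +
  geom_line()

print(p_colour)
print(p_linetype)
print(p_factor)
```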
Notice the use of group = supp. Without this statement, ggplot won’t
know how to group the data together to draw the lines, and it will give
an error:
ggplot(tg, aes(x = factor(dose), y = length, colour = supp)) +
  geom_line()
#> geom_path: Each group consists of only one observation. Do you need to
#> adjust the group aesthetic?
Another common problem when the incorrect grouping is used is that you will see a jagged sawtooth pattern, as in Figure 4-8:
ggplot(tg, aes(x = dose, y = length)) +
  geom_line()
This happens because there are multiple data points at each x location, and ggplot thinks they’re all in one group. The data points for each group are connected with a single line, leading to the sawtooth pattern. If any discrete variables are mapped to aesthetics like colour or linetype, they are automatically used as grouping variables. But if you want to use other variables for grouping (that aren’t mapped to an aesthetic), they should be used with group.
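A short sketch of the difference, assuming ggplot2 is installed and tg is rebuilt by hand from the six rows listed earlier (the object names are illustrative): the first plot has no grouping and shows the sawtooth, while the second uses group = supp without mapping supp to any visible aesthetic.

```r
library(ggplot2)

# Assumption: tg is rebuilt from the six rows listed in the slides.
tg <- data.frame(
  supp   = factor(c("OJ", "OJ", "OJ", "VC", "VC", "VC")),
  dose   = c(0.5, 1.0, 2.0, 0.5, 1.0, 2.0),
  length = c(13.23, 22.70, 26.06, 7.98, 16.77, 26.14)
)

# No grouping: all six points are joined by one line (sawtooth).
p_sawtooth <- ggplot(tg, aes(x = dose, y = length)) +
  geom_line()

# Explicit grouping: one line per supp level, both drawn in the
# same colour and line type because supp is only used for grouping.
p_grouped <- ggplot(tg, aes(x = dose, y = length, group = supp)) +
  geom_line()

print(p_sawtooth)
print(p_grouped)
```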
• Sometimes points will overlap. In these cases, you may want to dodge them, which means their positions will be adjusted left and right (Figure 4-10). When doing so, you must also dodge the lines, or else only the points will move and they will be misaligned. You must also specify how far they should move when dodged:
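Dodging points and lines together might look like the sketch below. It assumes ggplot2 is installed and tg is rebuilt by hand from the six rows listed earlier; the dodge width of 0.2 and the point size of 4 are illustrative choices, not values given in the slides.

```r
library(ggplot2)

# Assumption: tg is rebuilt from the six rows listed in the slides.
tg <- data.frame(
  supp   = factor(c("OJ", "OJ", "OJ", "VC", "VC", "VC")),
  dose   = c(0.5, 1.0, 2.0, 0.5, 1.0, 2.0),
  length = c(13.23, 22.70, 26.06, 7.98, 16.77, 26.14)
)

# Use the same dodge for both layers so points stay on their lines.
# The width 0.2 is an assumed value chosen for illustration.
pd <- position_dodge(0.2)

p_dodge <- ggplot(tg, aes(x = dose, y = length, shape = supp)) +
  geom_line(position = pd) +
  geom_point(position = pd, size = 4)

print(p_dodge)
```

Storing the position in one object (pd here) keeps the two layers in sync if the width is changed later.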
CHANGING THE APPEARANCE OF LINES
• https://www.r-project.org/about.html
• https://cran.r-project.org/mirrors.html
• https://rstudio.com/products/rstudio/download/
DISCUSSION
• 5 MINUTES
Functions in R