Edar M-1
Introduction to R Programming
Reading and Getting Data into R, Viewing Named Objects, Types of Data Items, The
Structure of Data Items, Working with History Commands, Saving your Work in R.
Control Statements, Arithmetic and Boolean Operators, Functions, Return Values,
Environment and Scope Issues, Recursion.
What is R Language?
R is a programming language.
R is often used for statistical computing and graphical presentation to analyze and
visualize data.
Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data
miners and statisticians for data analysis and developing statistical software.
The official R software environment is free, open-source software that is part of
the GNU Project and is available under the GNU General Public License.
It is written primarily in C, Fortran, and R itself (partially self-hosting).
Precompiled executables are provided for various operating systems.
R has a command line interface. Multiple third-party graphical user interfaces are
also available, such as RStudio, an integrated development environment,
and Jupyter, a notebook interface.
According to user surveys and studies of scholarly literature databases, R is one
of the most commonly used programming languages in data mining. As of
March 2022, R ranks 11th in the TIOBE index, a measure of programming
language popularity.
Note:
R Programming Paradigm: Multi-paradigm i.e.,
Procedural
Object oriented
Functional
Reflective
Imperative
Array
Designed by : Ross Ihaka and Robert Gentleman
Developer : R Core Team
First Appeared : August 1993; 28 years ago
Stable Release : 4.1.3 / 10 March 2022; 27 days ago
Typing Discipline : Dynamic
Filename extensions :
.r
.rdata
.rds
.rda
Features of R Language
R is a domain-specific programming language aimed at data analysis.
It has some unique features that make it very powerful, arguably the most
important being the notion of vectors.
Vectors allow us to perform a complex operation on a set of values in a
single command.
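As a minimal sketch of this idea (the variable names and values below are illustrative assumptions, not part of the original notes), one command can operate on every element of a vector at once:

```r
# Hypothetical sample values, used only for illustration
heights_cm <- c(150, 160, 170, 180)

# One command converts every element; no explicit loop is needed
heights_m <- heights_cm / 100

# Functions such as mean() also take the whole vector in one call
avg_height <- mean(heights_cm)

print(heights_m)
print(avg_height)
```

The same division written in a language without vectorized arithmetic would need a loop over the four values.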
There are the following features of R programming:
1. It is a simple and effective programming language which has been well
developed.
2. It is data analysis software.
3. It is a well-designed, easy, and effective language which has user-defined
functions, loops, conditionals, and various I/O facilities.
4. It has a consistent and incorporated set of tools which are used for data
analysis.
5. For different types of calculation on arrays, lists and vectors, R contains a
suite of operators.
6. It provides effective data handling and storage facility.
7. It is an open-source, powerful, and highly extensible software.
8. It provides highly extensible graphical techniques.
9. It allows us to perform multiple calculations using vectors.
10. R is an interpreted language.
History of R Language
The history of R goes back about 30 years.
R was developed by Ross Ihaka and Robert Gentleman at the University of
Auckland, New Zealand, and the R Development Core Team currently develops
it.
The name of the language is taken from the first letter of both developers' first names.
The project was first conceived in 1992.
The initial version was released in 1995, and in 2000 a stable version (1.0.0) was
released.
The following table shows the release date, version, and description of R
language:
Version   Release date   Description
2.11      2010-04-22     Support for Windows 64-bit systems.
2.12.2    2011-02-25     Last version to support Windows 2000.
2.13      2011-04-14     Added a new compiler function that allows speeding up functions by converting them to bytecode.
2.14      2011-10-31     Added mandatory namespaces for packages. Added a new parallel package.
2.15      2012-03-30     New load balancing functions. Improved serialisation speed for long vectors.
3.0.0     2013-04-03     Support for numeric index values 2^31 and larger on 64-bit systems.
3.3.3     2017-03-06     Last version to support Microsoft Windows XP.
3.4.0     2017-04-21     Just-in-time compilation (JIT) of functions and loops to byte-code enabled by default.
3.5.0     2018-04-23     Packages byte-compiled on installation by default. Compact internal representation of integer sequences. Added a new serialisation format to support compact internal representations.
3.6.0     2019-04-26     Improved sampling from a discrete uniform distribution, which was noticeably non-uniform on large populations. New serialisation format supported since 3.5.0 becomes the default.
4.0.0     2020-04-24     R now uses a stringsAsFactors = FALSE default, and hence by default no longer converts strings to factors in calls to data.frame( ) and read.table( ). Reference counting is used for tracking object sharing, which reduces the need for copying objects. New syntax for raw string constants.
4.1.0     2021-05-18     Introduced |> as the pipe operator for base R syntax (similar to the %>% operator of the magrittr package) and the anonymous function shortcut syntax \(x) x+1
Comparison of R and Python
Specialties for data science
R: R packages offer advanced techniques which are very useful for statistical work. The CRAN Task Views list many useful R packages. These packages cover everything from psychometrics to genetics to finance.
Python: For finding outliers in a data set, R and Python are equally good. But for developing a web service that allows people to upload datasets and find outliers, Python is better.
Functionalities
R: For data analysis, R has inbuilt functionalities.
Python: Most of the data analysis functionalities are not inbuilt. They are available through packages like NumPy and pandas.
Key domains of application
R: Data visualization is a key aspect of analysis. R packages such as ggplot2, ggvis, lattice, etc. make data visualization easier.
Python: Python is better for deep learning because packages such as Caffe, Keras, OpenNN, etc. allow the development of deep neural networks in a very simple way.
Availability of packages
R: There are hundreds of packages and ways to accomplish the required data science tasks.
Python: Python relies on a few main packages, such as pandas for data analysis and scikit-learn for machine learning.
Advantages:
1) Open Source
An open-source language is a language on which we can work without any need
for a license or a fee.
R is an open-source language. We can contribute to the development of R by
optimizing our packages, developing new ones, and resolving issues.
2) Platform Independent
R is a platform-independent language or cross-platform programming language
which means its code can run on all operating systems.
R enables programmers to develop software for several competing platforms by
writing a program only once.
R can run quite easily on Windows, Linux, and Mac.
3) Machine Learning Operations
R allows us to do various machine learning operations such as classification and
regression.
For this purpose, R provides various packages and features for developing the
artificial neural network. R is used by the best data scientists in the world.
4) Exemplary support for data wrangling
R allows us to perform data wrangling.
R provides packages such as dplyr, readr which are capable of transforming
messy data into a structured form.
5) Quality plotting and graphing
R simplifies quality plotting and graphing.
R libraries such as ggplot2 and plotly produce visually appealing and
aesthetic graphs, which sets R apart from other programming languages.
6) The array of packages
R has a rich set of packages. R has over 10,000 packages in the CRAN repository
which are constantly growing.
R provides packages for data science and machine learning operations.
7) Statistics
R is mainly known as the language of statistics.
This is the main reason why R is more widely used than other programming languages for
the development of statistical tools.
8) Continuously Growing
R is a constantly evolving programming language.
Constantly evolving means when something evolves, it changes or develops over
time, like our taste in music and clothes, which evolve as we get older.
R is a state-of-the-art language that receives updates whenever new features are added.
Disadvantages
1) Data Handling
In R, objects are stored in physical memory.
It is in contrast with other programming languages like Python.
R utilizes more memory as compared to Python.
It requires the entire data in one single place which is in the memory.
It is not an ideal option when we deal with Big Data.
2) Basic Security
R lacks basic security features, which are an essential part of most programming
languages such as Python.
Because of this, R is subject to many restrictions; for example, it is difficult to
embed in a web application.
3) Complicated Language
R is a very complicated language, and it has a steep learning curve.
The people who don't have prior knowledge or programming experience may find
it difficult to learn R.
4) Weak Origin
A main disadvantage of R is that base R has limited support for dynamic or 3D
graphics.
The reason behind this is its origin. It shares its origin with a much older
programming language "S."
5) Lesser Speed
R programming language is much slower than other programming languages such
as MATLAB and Python.
In comparison to other programming languages, R packages are much slower.
In R, algorithms are spread across different packages.
The programmers who have no prior knowledge of packages may find it difficult
to implement algorithms.
Applications of R
R is used in practice by many organizations. Some of the well-known
users are as follows:
1. Facebook
2. Google
3. Twitter
4. HRDAG
5. Sunlight Foundation
6. RealClimate
7. NDAA
8. XBOX ONE
9. ANZ
10. FDA
Comments
A comment is a description of a statement or program.
It can be used to make the code more readable.
It can also be used to prevent execution when testing alternative code.
Comments start with a #.
When executing code, R ignores everything from the # to the end of the line.
This example uses a comment before a line of code:
Example:
# This is a comment
"Hello World!"
Output:
[1] "Hello World!"
This example uses a comment at the end of a line of code:
Example:
"Hello World!" # This is a comment
Output:
[1] "Hello World!"
Comments do not have to be text that explains the code; they can also be used to
prevent R from executing a line of code:
Example 1:
# "Good morning!"
"Good night!"
Output:
[1] "Good night!"
Example 2:
"Good morning!"
#"Good night!"
Output:
[1] "Good morning!"
Multiline Comments:
Unlike other programming languages, such as C, there is no syntax in R for
multiline comments. However, we can just insert a # at the start of each line to create
multiline comments:
Example:
# This is a comment
# written in
# more than just one line
"Hello World!"
Output:
[1] "Hello World!"
Reading and Getting Data into R
We usually have sets of data to examine (that is, samples) and will want to create
more complex series of numbers to work on.
We cannot perform any analyses if we do not have any data so getting data into R
is a very important task.
This section focuses on ways to create these complex samples and get data into R,
where we are able to undertake further analyses.
Previously the named objects contained single values (the result of some
mathematical calculation). We can create a sample of several values with the c( )
command, which combines the values it is given:
> data1 = c(3, 5, 7, 5, 3, 2, 6, 8, 5, 6, 9)
Here the named object data1 contains several values, forming a sample. When the
sample is displayed, the [1] at the beginning shows us that the line begins with the
first item (the number 3).
When we get larger samples and more values, the display may well take up more
than one line, and R provides a number at the beginning of each
row so we can see “how far along” we are. In the following example we can see
that there are 41 values in the sample:
[1] 582 132 716 515 158 80 757 529 335 497 3369 746 201 277 593
[16] 361 905 1513 744 507 622 347 244 116 463 453 751 540 1950 520
[31] 179 624 448 844 1233 176 308 299 531 71 717
The second row starts with [16], which tells us that the first value in that row is
the 16th in the sample. This simple index system makes it a bit easier to pick out
specific items.
We can incorporate existing data objects with values to make new ones simply by
incorporating them as if they were values themselves (which of course they are).
In this example we take the numerical sample that we made earlier and
incorporate it into a larger sample:
> data1
[1] 3 5 7 5 3 2 6 8 5 6 9
> data2 = c(data1, 4, 5, 7, 3, 4)
> data2
[1] 3 5 7 5 3 2 6 8 5 6 9 4 5 7 3 4
Here we take the first data1 object and add some extra values to create a new
(larger) sample. In this case we create a new item called data2, but we can
also overwrite the original as part of the process:
> data1 = c(6, 7, 6, 4, 8, data1)
> data1
[1] 6 7 6 4 8 3 5 7 5 3 2 6 8 5 6 9
Now adding extra values at the beginning has modified the original sample.
Text items must be enclosed in quotes, and either single or double quotes can be used.
In practice though, it is a good habit to stick to one sort of quote; single quote
marks are easier to type.
The following example shows a simple text sample comprising days of the
week:
> day1 = c('Mon', 'Tue', 'Wed', 'Thu')
> day1
[1] "Mon" "Tue" "Wed" "Thu"
We can combine other text objects in the same way as we did for the numeric
objects previously, like so:
> day1 = c(day1, 'Fri')
> day1
[1] "Mon" "Tue" "Wed" "Thu" "Fri"
If we mix text and numbers, the entire data object becomes a text variable and the
numbers are converted to text, shown in the following. We can see that the items
are text because R encloses each item in quotes:
> mix = c(data1, day1)
> mix
[1] "3" "5" "7" "5" "3" "2" "6" "8" "5" "6" "9" "Mon"
[13] "Tue" "Wed" "Thu" "Fri"
Perform the following steps to practice storing data using the scan( ) command.
1. Begin the data entry process with the scan( ) command:
> data3 = scan( )
2. Now type some numerical values, separated by spaces, as follows:
1: 6 7 8 7 6 3 8 9 10 7
3. Now press the Enter key and type some more numbers on the fresh line:
11: 6 9
4. Press the Enter key once again to create a new line:
13:
5. Press the Enter key once more to finish the data entry:
13:
Read 12 items
6. Type the name of the object:
> data3
[1] 6 7 8 7 6 3 8 9 10 7 6 9
Typing the name of the object we just created displays the data, and we can see
that they are indeed text items and the quotes are there:
> day2
[1] "Mon" "Tue" "Wed" "Thu"
If the data are text, we add the what = 'character' instruction to the scan()
command as before.
At this point, if we can open the file in a spreadsheet, proceed with the
aforementioned four steps.
If the file opens in a text editor or word processor, we must look to see how the
data items are separated before continuing.
If the data are separated with simple spaces, we can simply copy and paste. If the
data are separated with some other character, we need to tell R which character is
used as the separator. For example, a common file type is CSV (comma-separated
values), which uses commas to separate the data items.
To tell R we are using this separator, simply add an extra part to the command
like so:
scan(sep = ',')
In this example R is told to expect a comma; note that we need to enclose the
separator in quotes. Here are some comma-separated numerical data:
To get these into R, use the scan( ) command like so:
> data4 = scan(sep = ',')
1: 23, 17, 12.5, 11, 17, 12, 14.5, 9
9: 11, 9, 12.5, 14.5, 17, 8, 21
16:
Read 15 items
> data4
[1] 23.0 17.0 12.5 11.0 17.0 12.0 14.5 9.0 11.0 9.0 12.5 14.5 17.0 8.0 21.0
Note that we have to press the Enter key on a blank line to finish the data entry.
Note also that some of the original data had decimal points (for example, 14.5); R
displays all the values with the same number of decimal places. If the
data are separated by tab stops, we can use sep = '\t' to tell R that this is the case.
If the data are text, we simply add what = 'character' and proceed as before. Here
are some text data contained in a CSV text file:
"Jan", "Feb", "Mar", "Apr", "May", "Jun"
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
To get these data entered into R, perform the following steps:
1. Open the data file; in this case it has opened in a text editor and we see the
quotes and the comma separators.
2. Highlight the data required.
3. Copy to the clipboard.
4. Switch to R and type in the scan() command.
5. Paste the contents of the clipboard.
6. Press Enter on a blank line to end the data entry (this means that we have to
press Enter twice, once after the paste operation and once on the blank line).
7. Type the name of the data object created to view the entered data.
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
In this example both the sep = and what = instructions are used. Additionally, the
scan( ) command allows us to create data items from the keyboard or from
clipboard entries, thus enabling us to move data from other applications quite
easily. It is also possible to get the scan( ) command to read a file directly, as
described in the following:
In this example the data file is called test data.txt, which is plain text, and the
numerical values are separated by spaces. Note that the filename must be
enclosed in quotes (single or double). Of course we can use the what = and sep =
instructions as appropriate.
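Reading directly from a file can be sketched as follows. This is a minimal, non-interactive illustration: the files are written to R's temporary directory first so the example is self-contained, and the object names (txt_path, data5, csv_path, data_csv, days_path, days) are assumptions made for the sketch.

```r
# Write a small space-separated file (illustrative values)
txt_path <- file.path(tempdir(), "test data.txt")
writeLines("6 7 8 7 6 3 8 9 10 7 6 9", txt_path)

# Numeric values separated by spaces: no extra instructions needed
data5 <- scan(file = txt_path)

# Comma-separated numbers: tell scan() about the separator
csv_path <- file.path(tempdir(), "values.csv")
writeLines("23, 17, 12.5, 11", csv_path)
data_csv <- scan(file = csv_path, sep = ",")

# Space-separated text: tell scan() to expect character data
days_path <- file.path(tempdir(), "days.txt")
writeLines("Mon Tue Wed Thu", days_path)
days <- scan(file = days_path, what = "character")
```

Each call reports how many items were read, just as the keyboard version does.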
R looks for the data file in the default directory. We can find the default
directory by using the getwd( ) command like so:
> getwd( ) # The first example shows the default for a Windows XP machine
[1] "C:/Documents and Settings/Administrator/My Documents"
> getwd( ) # The second example is for a Macintosh OS X system
[1] "/Users/markgardener"
> getwd( ) # The final example is for Linux (Ubuntu 10.10)
[1] "/home/mark"
If the file is somewhere else we must type its name and location in full. The
location is relative to the default directory; in the preceding example the file was
on the desktop, so the command ought to have been:
> data6 = scan(file = 'Desktop/test data.txt')
The filename and directories are all case sensitive. We can also type in a URL
and link to a file over the Internet directly; once again the full URL is required.
It may be easier to point permanently at a directory so that the files can be loaded
simply by typing their names. We can alter the working directory using the
setwd() command:
setwd('pathname')
When using this command, replace the pathname part with the location of the
target directory. A relative location is interpreted relative to the current working
directory (a full path can also be given), so to set the working directory to the
Desktop we use the following:
> setwd('Desktop')
> getwd( )
[1] "/Users/markgardener/Desktop"
We can look at a directory and see which files/folders are within it using the dir( )
or list.files( ) command:
dir( )
list.files( )
The default is to show the files and folders in the current working directory, but
we can type in a path (in single quote marks) to list files in any directory. For
example:
dir('Desktop')
dir('Documents')
dir('Documents/Excel files')
Note that the listing is in alphabetical order; files are shown with their extensions
and folders simply display the name.
If we have files that do not have extensions (for example: .txt, .doc), it is harder
to work out which are folders and which are files.
Invisible files are not shown by default, but we can choose to see them by adding
an extra instruction to the command like so:
dir(all.files = TRUE)
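The directory commands above can be tried safely with a short sketch. It works inside R's temporary directory; the folder name demo_wd and the two file names are hypothetical and exist only for this illustration.

```r
old_wd <- getwd()                      # remember where we started

# Hypothetical folder and files inside R's temporary directory
demo_dir <- file.path(tempdir(), "demo_wd")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("a.txt", ".hidden")))

setwd(demo_dir)                        # point R at the new folder
visible <- dir()                       # same result as list.files()
everything <- dir(all.files = TRUE)    # also lists names starting with a dot

setwd(old_wd)                          # restore the original directory

print(visible)
print(everything)
```

Saving the old working directory and restoring it afterwards keeps later commands from looking for files in the wrong place.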
Note:
So far the data items that we have created are simple; they contain either a single
value (the result of a mathematical calculation) or several items.
A list of data items is called a vector. If we only have a single value, the vector
contains only one item, that is, it has a length of 1.
If we have multiple values, the vector is longer. When we display the list R
provides an index to help us see how many items there are and how far along any
particular item is. Think of a vector as a one-dimensional data object; most of the
time we will deal with larger datasets than single vectors of values.
Some simple example data are shown in Table 1 below. Here we can see two
columns; each one is a variable. The first column is labeled abund; this is the
abundance of some water-living organism.
The second column is labeled flow and represents the flow of water where the
organism was found.
Table 1: Simple Data From a Two Column Spreadsheet
ABUND FLOW
9 2
25 3
15 5
2 9
14 14
25 24
24 29
47 34
In this case there are only two columns and it would not take too long to use the
scan( ) command to transfer the data into R.
However, it makes sense to keep the two columns together and import them to R
as a single entity. To do so, perform the following steps:
1. If we have a file saved in a proprietary format (for example, XLS), save the data
as a CSV file instead.
2. Now assign the file a sensible name and use the read.csv() command as follows:
> fw = read.csv(file.choose( ))
3. Select the file from the browser window. If we are using Linux, the filename
must be typed in full. Because the read.csv( ) command expects the data to
be separated with commas, we do not need to specify that. The data have headings,
and because this is also the default, we do not need to tell R anything else.
4. To see the data, type its name like so:
> fw
abund flow
1 9 2
2 25 3
3 15 5
4 2 9
5 14 14
6 25 24
7 24 29
8 47 34
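The same import can be sketched without the interactive file chooser. In this minimal example the contents of Table 1 are first written to a temporary CSV file (the path csv_path is an assumption standing in for the file saved from the spreadsheet), and str( ) is used to inspect the structure of the result.

```r
# Table 1 written to a temporary CSV file (stand-in for the real file)
csv_path <- file.path(tempdir(), "fw.csv")
writeLines(c("abund,flow", "9,2", "25,3", "15,5", "2,9",
             "14,14", "25,24", "24,29", "47,34"), csv_path)

# header = TRUE and sep = "," are the defaults of read.csv()
fw <- read.csv(csv_path)

str(fw)            # structure: a data frame with 8 rows and 2 columns
print(fw$abund)    # a single column, extracted as a vector
```

Keeping both columns in one data frame means each abundance value stays paired with its flow value.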
Viewing Named Objects
In a general way we “make” new items by providing a name followed by the
instruction that creates it.
R is object oriented, which means that it expects to find named things to deal with
in some way.
For example, if we are conducting an experiment and collecting data from several
samples, we want to create several named data objects in R in order to work on
them and do our analyses later on.
As a reminder, the following examples show a few of the different ways we have
seen thus far to create named items:
answer1 = 23 + 17 / 2 + pi / 4
my.data = read.csv(file.choose( ))
sample1 = c(2, 5, 7, 3, 9, 4, 5)
Now to learn how to view these items in R and remove them as necessary.
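A minimal sketch of listing and removing objects with ls( ) and rm( ) follows. The object my.data is a small stand-in data frame here (an assumption for the sketch), because read.csv(file.choose( )) needs an interactive file picker.

```r
# Create three named objects
answer1 <- 23 + 17 / 2 + pi / 4
sample1 <- c(2, 5, 7, 3, 9, 4, 5)
my.data <- data.frame(abund = c(9, 25), flow = c(2, 3))  # stand-in data

print(ls())     # list the names of the objects in the workspace

rm(answer1)     # remove one object by name
print(ls())     # answer1 is no longer listed
```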
This example contains three objects. The objects are listed in alphabetical order
(with all the uppercase before the lowercase); if we have a lot of objects, the
display will run to more lines like so:
[1] "A" "A.r" "B" "CI"
[5] "CI.1" "CI.dn" "CI.up" "Ell.F"
[9] "F" "F1" "area" "az"
[13] "bare" "beetle.cca" "beta" "bf"
[17] "bf.beta" "bf.lm" "biol" "biol.cca"
[21] "biomass" "bird" "bp" "bs"
[25] "bss" "but" "but.lm" "c3"
Here there are 28 objects. At the beginning of each new row the display shows
you an index number relating to “how far along” the list of items you are. For
example the bare data object is the 13th item along (alphabetically).
If we do not have any named objects at all, we get the following “result”:
> ls( )
character(0)
We can limit the listing to object names that match a pattern by using the
pattern = instruction. For example:
> ls(pattern = 'b')
[1] "bare" "beetle.cca" "beta" "bf" "bf.beta"
[6] "bf.lm" "biol" "biol.cca" "biomass" "bird"
[11] "bp" "bs" "bss" "but" "but.lm"
[16] "cbh" "cbh.glm" "cbh.sf" "food.b" "nectar.b"
[21] "pred.prob" "prob2odd" "tab.est" "tab1" "tab2"
Here the pattern looks for everything containing a “b”. This is pretty broad so we
can refine it by adding more characters:
> ls(pattern = 'be')
[1] "beetle.cca" "beta" "bf.beta"
Now the pattern picks up objects with “be” in the name. If we want to search for
objects beginning with a certain letter, we use the ^ character like so:
> ls(pattern = '^b')
[1] "bare" "beetle.cca" "beta" "bf" "bf.beta"
[6] "bf.lm" "biol" "biol.cca" "biomass" "bird"
[11] "bp" "bs" "bss" "but" "but.lm"
Compare the following search listings. In the first case the pattern matches
objects beginning with “be”, but in the second case, ls(pattern = '^[be]'), the
letters are enclosed in square brackets:
> ls(pattern = '^be')
[1] "beetle.cca" "beta"
The effect of the square brackets is to isolate the letters; each is treated as a
separate item, hence objects beginning with “b” or “e” are matched. We can
get the same result using a slightly different approach as well:
ls(pattern = '^b|^e')
The vertical bar (sometimes called a pipe) character stands for or, that is, we
want to search for objects beginning with “b” or beginning with “e”. The pattern
must not contain spaces around the bar, because a space would be treated as part
of the text to match.
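The pattern = instruction of ls( ) takes a regular expression, so the same patterns can be tried on any character vector with grep( ). The sketch below uses a handful of object names taken from the listings above; the variable names are illustrative.

```r
# A few object names taken from the listings above
nms <- c("area", "bare", "beetle.cca", "beta", "bf.beta", "biol", "bird")

starts_be  <- grep("^be", nms, value = TRUE)    # begins with "be"
starts_b_e <- grep("^[be]", nms, value = TRUE)  # begins with "b" or "e"
same_thing <- grep("^b|^e", nms, value = TRUE)  # alternation, no spaces

print(starts_be)
print(starts_b_e)
print(identical(starts_b_e, same_thing))
```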
To find objects ending with a specific character you use a dollar sign at the end
like so:
> ls(pattern = 'm$')
[1] "bf.lm" "but.lm" "cbh.glm" "dep.pm"
[5] "dm" "frit.glm" "frit.lm" "frit.sum"
[9] "hlm" "mf.lm" "mr.lm" "n.glm"
[13] "newt.glm" "newt.test.glm" "sales.lm" "sm"
[17] "t.glm" "test.glm" "test.lm" "test1.glm"
[21] "tt.glm" "worm.pm"
We can use the period as a wildcard and R will match any character:
> ls(pattern = 'a.e')
[1] "area" "bare" "date" "sales" "sales.frame"
[6] "sales.lm" "sales.ts" "water"
> ls(pattern = 'a..e')
[1] "tab.est" "treatment"
In the first example a single wildcard was used but in the second there are two.
This pattern matching uses more or less the same conventions as standard
Regular Expressions.
Number Data:
Values that are whole numbers can be stored as integer values, whereas values that
contain decimals are numeric.
The distinction is fairly minor, but if we have a list of values that contains both
integers and decimals, R will regard the entire sample as numeric.
> data3
[1] 6 7 8 7 6 3 8 9 10 7 6 9
> data7
[1] 23.0 17.0 12.5 11.0 17.0 12.0 14.5 9.0 11.0 9.0 12.5 14.5 17.0 8.0 21.0
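The class( ) command reports how R stores a value. As a small sketch (the variable names are illustrative), note that a whole number typed at the console is stored as numeric unless the L suffix or as.integer( ) is used:

```r
plain_number <- class(6)            # "numeric": stored as a double
with_suffix  <- class(6L)           # "integer": the L suffix requests an integer
all_integer  <- class(c(6L, 7L))    # "integer"
mixed        <- class(c(6L, 12.5))  # "numeric": mixing promotes the whole vector

print(c(plain_number, with_suffix, all_integer, mixed))
```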
Text Items:
If the data are not numbers, they must be text. R recognizes two sorts of text
data items.
We can think of the first kind as plain text labels; R calls these character values.
> data8
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
These items display as plain text and have the quote marks. However, another
type of non-numeric data is called a factor:
> cut
[1] mow mow mow mow mow unmow unmow unmow unmow
Levels: mow unmow
Here the data are text but they are not in quotes.
When they are displayed the text appears plain without quote marks, but with an
additional line showing the different levels (the distinct values) in the list.
[1] mow mow mow mow mow unmow unmow unmow unmow
Levels: mow unmow
In this case new data objects were created but the original object could be
overwritten with the new one.
We can do a similar thing with numbers. If we begin with data that contain
decimals, that is, numeric, we can convert to integers using the as.integer( )
command.
We can convert integer values to numeric using the as.numeric( ) command:
> data7
[1] 23.0 17.0 12.5 11.0 17.0 12.0 14.5 9.0 11.0 9.0 12.5 14.5 17.0 8.0 21.0
> data7i = as.integer(data7)
> data7i
[1] 23 17 12 11 17 12 14 9 11 9 12 14 17 8 21
> data7n = as.numeric(data7i)
> data7n
[1] 23 17 12 11 17 12 14 9 11 9 12 14 17 8 21
We can also convert numbers to text using as.character( ):
> data7c = as.character(data7)
> data7c
[1] "23" "17" "12.5" "11" "17" "12" "14.5" "9" "11" "9" "12.5"
[12] "14.5" "17" "8" "21"
This works out fine if the text is sensible; in the preceding example the text
values were originally numbers.
Now see what happens if you try this on a factor:
> cut
[1] mow mow mow mow mow unmow unmow unmow unmow
> cut.n = as.numeric(cut)
> cut.n
[1] 1 1 1 1 1 2 2 2 2
Here we get a surprising (but potentially useful) result; the numbers relate
directly to the levels of the factor (mow is level 1 and unmow is level 2).
If we try to convert something that really is not going to work, R gives a warning
like so:
> data8
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
> data8n = as.numeric(data8)
Warning message:
NAs introduced by coercion
> data8n
[1] NA NA NA NA NA NA NA NA NA NA NA NA
In this case the data is plain text and cannot be forced into any sensible number,
so we end up with a string of NAs.
If we were to convert the plain text to a factor first and then to a number, that
would be a different story:
> data8c = as.numeric(as.factor(data8))
> data8c
[1] 5 4 8 1 9 7 6 2 12 11 10 3
Here one command is nested inside the other. R evaluated the as.factor( ) part
first and then converted that into numbers.
We started with twelve months and can see that they have been assigned numbers;
notice how R has indexed them alphabetically.
Control Statements
Classification of Statements:
1. Selection Statements
2. Looping Statements
3. Unconditional Statements
1. Selection Statements:
Based upon the outcome of a particular condition, selection statements transfer
control from one point to another.
Selection statements select a statement to be executed among a set of various
statements.
The selection statements available in R are as follows:
a. if statement
b. if-else statement
c. nested if-else statement
d. if-else-if ladder statement
e. switch statement
a) if statement
This control structure checks whether the expression provided in parentheses is true.
If it is true, the statements in braces {} are executed; otherwise they are skipped.
Syntax:
if(expression){
statements
....
....
}
Flow Diagram: if statement
Example: # To implement if statement in R
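A minimal sketch of an if statement follows; the value of x and the variable msg are assumptions chosen for illustration.

```r
x <- 12      # hypothetical test value
msg <- NULL

# The body runs only when the condition is TRUE
if (x > 10) {
  msg <- paste(x, "is greater than 10")
  print(msg)
}
```

If x were 10 or less, the body would be skipped and nothing would be printed.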
b) if-else statement
It is similar to the if statement, but when the test expression in the if condition is
false, the statements in the else block are executed.
Syntax:
if(expression){
statements
....
....
}
else{
statements
....
....
}
Flow Diagram: if-else statement
x <- 5
# Check whether the value is greater than 10
if (x > 10) {
  print(paste(x, "is greater than 10"))
} else {
  print(paste(x, "is not greater than 10"))
}
Output:
[1] "5 is not greater than 10"
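c) nested if-else statement
An if-else statement can be placed inside another if or else block to test a further condition.
# Read three numbers from the keyboard
print("Enter the a:")
a <- scan()
print("Enter the b:")
b <- scan()
print("Enter the c:")
c <- scan()
# Find the largest of the three values
if (a > b) {
  if (a > c) {
    print("a is greater")
  } else {
    print("c is greater")
  }
} else {
  if (b > c) {
    print("b is greater")
  } else {
    print("c is greater")
  }
}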
Input:
[1] "Enter the a:"
1: 40
2:
Read 1 item
[1] "Enter the b:"
1: 20
2:
Read 1 item
[1] "Enter the c:"
1: 10
2:
Read 1 item
Output:
[1] "a is greater"
d) if-else-if ladder
It is similar to the if-else statement; the only difference is that a further if
statement is attached to the else.
If the condition provided to the if block is true, the statement within that
block is executed; otherwise the next condition is checked, and if it is
true the statement within that block is executed. If no condition is true, the
final else block runs.
Syntax:
if(condition 1 is true) {
execute this statement
} else if(condition 2 is true) {
execute this statement
} else {
execute this statement
}
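The ladder can be sketched with a small grading example; the variable score, the cut-off values and the grade labels are hypothetical.

```r
score <- 72  # hypothetical input

# Conditions are tested from top to bottom; the first TRUE one wins
if (score >= 80) {
  grade <- "A"
} else if (score >= 60) {
  grade <- "B"
} else {
  grade <- "C"
}
print(grade)
```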
e) switch statement
A switch statement selects one of several alternatives based on the value of an expression.
Syntax:
switch(expression, case1, case2, case3, ..., caseN)
Flow Diagram: switch Statement
Example: # switch statement
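A minimal sketch of switch( ) follows; the fruit names, the index value and the second example with animal sounds are assumptions made for illustration.

```r
fruit_index <- 2  # hypothetical input

# With a number, switch() returns the alternative at that position
fruit <- switch(fruit_index, "Apple", "Banana", "Cherry")
print(fruit)

# With a string, switch() matches the argument names instead;
# the final unnamed value acts as the default
sound <- switch("dog", cat = "meow", dog = "woof", "unknown")
```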
Input:
Output:
[1] "Banana"
2. Looping Statements
In R programming, we require a control structure to run a block of code
multiple times.
Loops are among the most fundamental and powerful programming
concepts.
A loop is a control statement that allows multiple executions of a statement or a
set of statements. The word ‘looping’ means cycling or iterating.
There are two components of a loop, the control statement, and the loop body.
The control statement controls the execution of statements depending on the
condition and the loop body consists of the set of statements to be executed.
In order to execute the identical lines of code numerous times in a program, a
programmer can simply use a loop.
There are three types of loop in R programming:
a. for Loop
b. while Loop
c. repeat Loop
a) for loop statement:
It is a type of control statement that makes it easy to construct a loop that
runs a statement or a set of statements multiple times.
A for loop is commonly used to iterate over the items of a sequence.
It is an entry-controlled loop: the test condition is checked first and then the
body of the loop is executed; the loop body is not executed if the test
condition is false.
Syntax:
for (value in sequence)
{
statement
}
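A minimal sketch of a for loop that iterates over the items of a vector; the values and variable names are illustrative.

```r
values <- c(3, 5, 7)   # hypothetical sequence
total <- 0

# v takes each value of the sequence in turn
for (v in values) {
  total <- total + v
  print(total)         # running total after each item
}
```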
b) while loop
It is a type of control statement which runs a statement or a set of statements
repeatedly until the given condition becomes false.
It is also an entry-controlled loop: the test condition is checked first and
then the body of the loop is executed; the loop body is not executed if
the test condition is false.
Syntax:
while ( condition )
{
statement
}
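A minimal sketch of a while loop that counts from 1 to 5; the counter name i is an illustrative choice.

```r
i <- 1

# The condition is checked before every pass through the body
while (i <= 5) {
  print(i)
  i <- i + 1   # without this update the loop would never end
}
```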
c) repeat loop
A repeat loop runs its body again and again and has no test condition of its own.
To terminate the repeat loop, we use a jump statement, namely the break keyword.
Below is a program to illustrate the use of a repeat loop in R programming.
i=1
repeat{
print(i)
i<-i+1
if(i>5){
break
}
}
Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
Nested loops:
A nested for-loop has a for-loop inside of another for-loop.
For each iteration of the outer for-loop, the inner loop is executed
until the final condition is met for the inner loop.
Once an inner for-loop is executed for a particular outer iteration then the outer
for-loop goes for the next iteration and now the inner loop will be executed for
this iteration.
This process repeats itself till the final condition is met for the outer for-loop.
Syntax:
for (element1 in sequence1) {
for(element2 in sequence2){
# body
}
}
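As a minimal sketch of nested loops, the following builds a small multiplication table; the matrix name tab and its size are assumptions for illustration.

```r
# Build a 3 x 3 multiplication table
tab <- matrix(0, nrow = 3, ncol = 3)

for (i in 1:3) {          # outer loop: one pass per row
  for (j in 1:3) {        # inner loop: runs fully for every value of i
    tab[i, j] <- i * j
  }
}
print(tab)
```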
3. Unconditional Statements
The unconditional statements available in R are as follows:
a. break statement
b. return statement
c. next Statement
a) break statement:
A break statement is used to stop the iteration of a loop, that is, to terminate it based
on a condition and move the flow of control to the next statement just after the loop.
It can be used in for, while and repeat loops.
If there are nested loops, this statement terminates only the innermost loop that contains it.
Syntax:
...
if (condition) {
break
}
...
Flow Diagram: break statement
for (x in 1:10) {
  if (x == 5) {
    break
  }
  print(x)
}
Output:
[1] 1
[1] 2
[1] 3
[1] 4
b) return statement:
The return statement returns the result of a function and hands control back to
the caller.
Syntax:
return(expression)
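The definition of func is not shown in these notes. One hypothetical definition that is consistent with the calls and output that follow is sketched here:

```r
# Hypothetical definition: classify a number by its sign
func <- function(x) {
  if (x > 0) {
    return("Positive")
  } else if (x < 0) {
    return("Negative")
  } else {
    return("Zero")
  }
}
```

Each return( ) ends the function immediately, so at most one of the three values is returned per call.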
func(1)
func(0)
func(-1)
Output:
[1] "Positive"
[1] "Zero"
[1] "Negative"
c) next Statement:
The next statement skips the rest of the current iteration and continues with
the next iteration, without terminating the loop.
Syntax:
loop(condition){
if (condition) {
next
}
expression statement
}
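A minimal sketch of next, assuming a loop over 1 to 10 that skips the odd numbers; the vector evens is added only to record which values were printed.

```r
evens <- integer(0)

for (x in 1:10) {
  if (x %% 2 != 0) {
    next                  # odd number: skip the rest of this iteration
  }
  print(x)
  evens <- c(evens, x)    # reached only for even numbers
}
```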
Output:
[1] 2
[1] 4
[1] 6
[1] 8
[1] 10
The factorial of 4 is 24. The factorial of any positive whole number is the product
of all whole numbers from 1 to that number. The factorial of 0 is defined as 1, and
negative numbers do not have a factorial.
To understand factorial see this example:
4! = 1*2*3*4 = 24
Symbol
The symbol for factorial is "!" (the exclamation mark), which is put after the
number, as in 4!.
Output:
findfactorial(0)
[1] 1
findfactorial(3)
[1] 6
In this simple example we have defined a function findfactorial which takes one
argument, the number whose factorial we want to find.
A variable named factorial is defined and initialized to 1, because the smallest
possible factorial is 1.
After that we use an if-else structure. If the number is 0 or 1, the factorial is 1;
this is the if condition.
In the else block the number is greater than 1, which means the factorial can be
calculated by multiplying all numbers from 1 to that number.
We use a variable i which goes from 1 to the number, and in each iteration
of the for loop the product is multiplied by i.
At the end a return statement is used to return the factorial from this function.
Of course this can be done without a function, but with a function the logic is
clearer. We also learn to develop functions, which are the basic units of R
programming and can be called again and again.
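The function described above can be sketched as follows. This is a reconstruction from the description, not the original listing, and it assumes the argument is a non-negative whole number.

```r
findfactorial <- function(number) {
  factorial <- 1                      # smallest possible result
  if (number == 0 || number == 1) {
    factorial <- 1
  } else {
    for (i in 1:number) {
      factorial <- factorial * i      # running product 1 * 2 * ... * number
    }
  }
  return(factorial)
}

print(findfactorial(4))
```

Negative input is not checked in this sketch, so the result for a negative number is not meaningful.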
After grasping the logic of factorial in R code, we can write another
R program that finds the factorial of a number taken as input from the user. Here is
another program in RStudio:
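The program itself is not shown in these notes. A hypothetical version of findfact that is consistent with the calls and messages in the output below could look like this; it takes the number as an argument rather than prompting for it.

```r
# Hypothetical reconstruction: returns a message describing the result
findfact <- function(n) {
  if (n < 0) {
    return("Factorial of negative numbers is not possible")
  }
  result <- 1
  if (n > 1) {
    for (i in 2:n) {
      result <- result * i
    }
  }
  paste("Factorial of", n, "is", result)
}
```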
Output:
>findfact(-3)
[1] "Factorial of negative numbers is not possible"
> findfact(0)
[1] "Factorial of 0 is 1"
> findfact(5)
[1] "Factorial of 5 is 120"
We may call this function with any positive integer, and it will return the
factorial of that number.
> factorial(5)
[1] 120
> factorial(4)
[1] 24
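Since recursion is part of this unit, the factorial can also be sketched as a recursive function, that is, a function that calls itself. The name rec_factorial is hypothetical and is chosen so that it does not mask R's built-in factorial( ).

```r
rec_factorial <- function(n) {
  stopifnot(n >= 0)              # factorial is undefined for negative numbers
  if (n <= 1) {
    return(1)                    # base case: 0! = 1! = 1
  }
  n * rec_factorial(n - 1)       # recursive case: n! = n * (n - 1)!
}

print(rec_factorial(5))
```

Every recursive function needs a base case like the one above; without it the function would keep calling itself until R stops with an error.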