R Programming for students


R Programming – UNIT I

S. S. SARAVANA KUMAR
ASSISTANT PROFESSOR
Department of Computer Applications
Sri Ramakrishna College of Arts and Science
Coimbatore - 641 006
Tamil Nadu, India
R
• R is a programming language and software environment for statistical
computing, data analysis, and graphics.

• It is widely used among statisticians, data scientists, and researchers for
performing various tasks such as statistical modeling, data visualization, and
machine learning.

• R is open-source, flexible, and has a large community of users and developers
who continuously contribute to its development and maintenance.
R
• Front ends for R such as RStudio can be very helpful, because R's command
line interface might be challenging for users who are accustomed to
GUI-focused tools like SPSS and SAS.

• In addition, R is an open-source tool that is available for free.

• R integrates with other languages like C and C++, facilitating interaction with
various data sources and statistical tools.
R
• R is one of the most popular languages in the world of Data Science.
• It is heavily used in analyzing data that is both structured and unstructured.
• R allows various features that set it apart from other Data Science languages.
• R is a programming language for statistical computing and data visualization.
• It has been adopted in the fields of data mining, bioinformatics, and data analysis.
• Designed by: Ross Ihaka and Robert Gentleman
• Developer: R Core Team
• Filename extensions: .r .rdata .rhistory .rds .rda
• First appeared: August 1993
• License: GPL-2.0-or-later
• Paradigms: Multi-paradigm: procedural, object-oriented, functional, reflective,
imperative, array
• Platform: arm64 and x86-64
R
• R is a language and environment for statistical computing and graphics.
• R programming is a leading tool for machine learning, statistics, and data
analysis, allowing for the easy creation of objects, functions, and packages.
• R is a free software environment for statistical computing and graphics.
• R is used to analyze statistical data and to produce graphical
representations of it.
• R allows us to do modular programming.
Where Is R Programming Used?
• R programming is used for data visualization and statistical analysis.
• It is applied in multiple fields, such as:
• Data Science & Analytics – R programming is used for statistical analysis, data
visualization, and machine learning.
• Scientific Research – Statistical modelling and simulations in fields such as
biology, medicine, and psychology.
• Business – Data analysis and decision support in areas such as marketing
and finance.
• Education – Teaching statistics and data analysis in universities and schools.
• Government – Data analysis for policy-making in public health and the social sciences.
• Finance – Portfolio optimization, risk management, and financial modelling.
• Marketing – Customer segmentation, market research, and predictive
modelling.
Programming Features of R
• Data Inputs and Data Management
• Data input covers data types, importing data, and keyboard entry.
• Data management covers variables and operators.

• Distributed Computing and R Packages
• Distributed Computing – Open-source, high-performance platforms for
distributed computing are available for the R language. They split tasks
between multiple processing nodes to reduce execution time and analyze large
datasets.
• R Packages – R packages are a collection of R functions, compiled code
and sample data. By default, R installs a set of packages during
installation.
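As a minimal sketch of the point about default packages: the snippet below lists the packages that a standard R session attaches at start-up (read from the `defaultPackages` option) and then calls two functions from the `stats` package. The vector `x` is made-up illustration data.

```r
# Packages that a standard R session attaches at start-up.
default_pkgs <- getOption("defaultPackages")
print(default_pkgs)

# stats is normally attached already; library() makes the dependency explicit.
library(stats)

x <- c(2, 4, 4, 5, 9)      # hypothetical sample data
med <- median(x)           # function provided by the stats package
spread <- stats::sd(x)     # the pkg::function form names the package explicitly

print(med)
print(spread)
```

Packages that are not installed by default are added with install.packages() and then attached with library() in the same way.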
Programming Features of R
• R is a comprehensive programming language that provides support for
procedural programming involving functions as well as object-oriented
programming with generic functions.
• There are more than 10,000 packages in CRAN, the main R package repository.
With these packages, one can make use of ready-made functions to facilitate
easier programming.
• Because R is an interpreted language, R code is machine-independent and
portable.
• It facilitates easy debugging of errors in the code.
• R facilitates complex operations with vectors, arrays, data frames as well as
other data objects that have varying sizes.
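The following is a small sketch of what operations on vectors, arrays, and data frames can look like in practice. All variable names and values are invented for illustration.

```r
# Element-wise arithmetic on a numeric vector; no explicit loop is needed.
heights_cm <- c(150, 165, 180)          # hypothetical data
heights_m <- heights_cm / 100

# A 2 x 3 matrix (a two-dimensional array), filled column by column.
m <- matrix(1:6, nrow = 2)
col_totals <- colSums(m)

# A data frame holds equal-length columns that may differ in type.
df <- data.frame(name = c("a", "b", "c"), height = heights_m)
tall <- df[df$height > 1.6, ]           # keep rows that satisfy a condition

print(heights_m)
print(col_totals)
print(tall)
```

The same arithmetic and subsetting syntax works regardless of how many elements each object holds.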
Programming Features of R
• R can be easily integrated with many other technologies and frameworks like
Hadoop and HDFS.
• It can also integrate with other programming languages like C, C++, Python,
Java, FORTRAN, and JavaScript.
• R provides robust facilities for data handling and storage.
• R is cross-platform compatible.
• Most R packages can be installed and used on any supported OS without any
changes.
Advantages and Disadvantages of R
• Pros of R Language
• R is one of the most comprehensive statistical analysis packages,
and new techniques and ideas often appear first in R.

• R is open source, so you can run it anywhere, at any time,
and even sell it under the conditions of the license.

• It is cross-platform and runs on many operating
systems, notably GNU/Linux and Microsoft Windows.

• Everyone is welcome to contribute bug fixes, code
enhancements, and new packages.
Advantages and Disadvantages of R
• Cons of R Language
• The quality of some packages in R is less than
perfect.

• There is no official customer support for R, so
there is no vendor to turn to if something doesn't work.

• R commands give little consideration to memory
management, so R can consume all the
available memory.
R Scripts
• In order to perform scripting in R, you can simply import packages and then
use the provided functions to achieve results with minimal lines of code.

• There are several editors and IDEs that facilitate GUI features for executing R
scripts.

• Some of the useful editors that support the R programming language are:

• RGui (R Graphical User Interface)

• RStudio – It is a comprehensive environment for R scripting and has more
features than RGui.
R Graphical User Interface (R GUI)
• R GUI is the standard GUI platform for working in R.
• The R Console Window forms an essential part of the R GUI.
• In this window, we input various instructions, scripts and several other
important operations.
• This console window has several tools embedded in it to facilitate ease of
operations.
• This console appears whenever we access the R GUI.
• In the main panel of R GUI, go to the 'File' menu and select the 'New Script'
option. This will create a new script in R.
• To quit the active R session, type the following after the R prompt '>':
> q()
RStudio
• RStudio is a comprehensive Integrated Development Environment (IDE)
for R.
• It supports extensive code editing and development, and offers various
features that make R easier to work with.
• Features of RStudio
• RStudio provides various tools and features that allow you to boost your code
productivity.
• It can also be accessed over the web and is cross-platform in nature.
• It facilitates automatic checking of updates so that you don’t have to check for
them manually.
• It provides support for recovery in case of file loss.
• With RStudio, you can manage the data more efficiently.
RStudio
• Components of RStudio
• Source – In the top left corner of the screen is the text editor in which you
write source scripts. You can enter multiple lines here.
Furthermore, users can save R scripts to files that are stored on the local
disk.

• Console – This is in the bottom left corner of the main window of
RStudio. It facilitates interactive scripting in R.

• Workspace and History – In the top right corner, you will find the R workspace
and the history window. This will give you the list of all the variables that were
created in the environment session. Furthermore, you can also view the list of
past commands that were executed by R.
RStudio
• The Files, Plots, Packages, and Help tabs in the bottom right corner give
access to the following tools:

• Files – A user can browse the various files and folders on a computer.

• Plots – Plots produced by the user are displayed here.

• Packages – Here, we can view the list of all the installed packages.

• Help – We can browse the built-in help system of R in this tab.
Companies Using R
• Some of the companies that are using R programming are as follows:
• Facebook
• Google
• LinkedIn
• IBM
• Twitter
• Uber
• Airbnb
• Ford Motor Company
• Microsoft
Installing RStudio
• Why use RStudio?
• Integrated Development Environment (IDE)
• Project Management
• Data Visualization
• Package Management
• Markdown and R Markdown Support
• Collaboration and Sharing
Installing RStudio
• How to download RStudio for Windows?
• Visit the official RStudio website.
• Go to the download section.
• Select the version of RStudio suitable for your Windows version (32-bit or 64-bit).
• Click on the download button to start the download.
• After downloading, run the installer and follow the on-screen instructions to
complete the installation.
Installing RStudio
• How to Download R and RStudio?
• To install R and RStudio on Windows, first download both applications
using the following steps.
• Step 1: First, set up an R environment on your local machine. You
can download R from r-project.org.

• Download both applications: install base R first, and then install
RStudio. Clicking the link to install R opens a new download page.
Installing RStudio
• Steps to Install R and RStudio
Step 1: After downloading R for the Windows platform, install it by
double-clicking the installer.
Step 2: Download RStudio from its official page. Note: It is free of cost (under
AGPL licensing).
Step 3: After downloading, you will find a file named “RStudio-1.x.xxxx.exe” in
your Downloads folder.
Step 4: Double-click the installer and install the software.
Step 5: Test the RStudio installation.
Step 6: Your installation is successful.
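One possible way to carry out the test in Step 5 is to type a couple of lines into the RStudio console. This is only a sketch; the slides do not prescribe a particular check, and the variable name `result` is arbitrary.

```r
# Show which R installation RStudio is connected to.
print(R.version.string)

# A trivial computation confirms that the interpreter evaluates code.
result <- sum(1:10)
print(result)
```

If a version string and the value 55 appear, R and RStudio are working together.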
Scripting in R
• We will create a script that prints “Hello World” in R.
• In a script, some expressions must be enclosed in print() to produce the same
output that they give on the command line.

• Type the command below, which passes “Hello World” to print():

print("Hello World") # Author: saravanakumar


Sourcing a Script in R
• In order to execute a selected line of code:
• Select the line(s) of code, then press Ctrl + R in R GUI and Ctrl + Enter in RStudio.
• For example, we have two lines of code as follows:
print("Hello")
print("Data Stream")
• In the above code, if you only want to print “Hello”, then select only the first line
and press Ctrl + Enter in RStudio.
• In order to execute the entire script:
• In R GUI, go to Edit, and then click Run All.
• In RStudio, press Ctrl+Shift+Enter.
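Besides the keyboard shortcuts above, a whole script file can also be run from code with the base R function source(). The sketch below writes a tiny script to a temporary file and then sources it; the file content and the variable name `greeting` are made up for illustration.

```r
# Write a two-line script to a temporary file (hypothetical content).
script_path <- tempfile(fileext = ".R")
writeLines(c('greeting <- "Hello"', 'print(greeting)'), script_path)

# source() evaluates every expression in the file, in order.
source(script_path)

# Objects created by the script are now available in the session.
print(exists("greeting"))
```

In everyday use, the temporary file would be replaced by the path of a saved script.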
R "Hello World" Program
• A simple program to display "Hello World!" on the screen using the print()
function.
> # We can use the print() function
> print("Hello World!")
[1] "Hello World!"
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
> # Quotes can be suppressed in the output
> print("Hello World!", quote = FALSE)
[1] Hello World!
> # If there is more than one item, we can concatenate using paste()
> print(paste("How","are","you?"))
[1] "How are you?"
R Variables and Constants
• Variables are used to store data whose value can be changed according to our
need.
• The unique name given to a variable (or to a function or other object) is an identifier.

• Rules for writing Identifiers in R

• Identifiers can be a combination of letters, digits, period (.) and underscore (_).
• It must start with a letter or a period.
• If it starts with a period, it cannot be followed by a digit.
• Reserved words in R cannot be used as identifiers.
R Variables and Constants
• A variable provides us with named storage that our programs can manipulate.
• A variable in R can store an atomic vector, a group of atomic vectors, or a
combination of many R objects.
• A valid variable name consists of letters, numbers, and the dot or underscore
characters.
• A variable name starts with a letter, or with a dot that is not followed by a number.
R Variables and Constants
• Valid identifiers in R
• Some of the examples of valid identifiers are:

• total, Sum, .fine.with.dot, this_is_acceptable, Number5

• Invalid identifiers in R
• Some of the invalid identifiers are:

• tot@l, 5um, _fine, TRUE, .0ne
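The examples above can be checked programmatically. A rough sketch: the base R function make.names() rewrites any string into a syntactically valid name, so a string that comes back unchanged was already a valid identifier. The helper `is_valid_name` is a hypothetical name introduced only for this illustration.

```r
# Hypothetical helper: TRUE when the string is already a valid identifier,
# because make.names() leaves valid names untouched.
is_valid_name <- function(x) make.names(x) == x

valid   <- c("total", "Sum", ".fine.with.dot", "this_is_acceptable", "Number5")
invalid <- c("tot@l", "5um", "_fine", "TRUE", ".0ne")

print(is_valid_name(valid))     # every element is TRUE
print(is_valid_name(invalid))   # every element is FALSE

# How R would repair the invalid names.
print(make.names(invalid))
```

Reserved words such as TRUE are also rejected, which matches the last rule in the list of identifier rules.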


Variable Assignment
• Variables can be assigned values using the leftward (<-), rightward (->), and equals (=) operators.
• The values of the variables can be printed using print() or cat() function.
• The cat() function combines multiple items into a continuous print output.
# Assignment using the equals operator.
var.1 = c(0, 1, 2, 3)
# Assignment using the leftward operator.
var.2 <- c("learn", "R")
# Assignment using the rightward operator.
c(TRUE, 1) -> var.3
print(var.1)
# cat() already separates its arguments with a space.
cat("var.1 is", var.1, "\n")
cat("var.2 is", var.2, "\n")
cat("var.3 is", var.3, "\n")
Variable Assignment
• When we execute the above code, it produces the following result:

[1] 0 1 2 3
var.1 is 0 1 2 3
var.2 is learn R
var.3 is 1 1
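The last output line shows var.3 as 1 1 rather than TRUE 1. This happens because c() builds an atomic vector, and all elements of an atomic vector must share one type, so the logical TRUE is coerced to the number 1. A short sketch follows; the variable `mixed` is an extra illustration that is not part of the slides.

```r
# Logical and numeric values combined: everything becomes numeric (double).
var.3 <- c(TRUE, 1)
print(typeof(var.3))
print(var.3)

# Adding a character value coerces every element to character.
mixed <- c(TRUE, 1, "a")
print(typeof(mixed))
print(mixed)
```

The coercion order is logical, then numeric, then character: the most general type present in the vector wins.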
Constants in R
• Constants, as the name suggests, are entities whose value cannot be altered.
• Basic types of constants are numeric constants and character constants.
• Numeric Constants
• All numbers fall under this category.
• They can be of type integer, double or complex.
• The type can be checked with the typeof() function.
• Numeric constants followed by L are regarded as integer and those followed
by i are regarded as complex.
• typeof(5)
• typeof(5L)
• typeof(5i)
Constants in R
• Output
[1] "double"
[1] "integer"
[1] "complex"
• Numeric constants preceded by 0x or 0X are interpreted as hexadecimal
numbers.
• 0xff
• 0XF + 1
• Output
[1] 255
[1] 16
Constants in R
• Character Constants
• Character constants can be represented using either single quotes (') or double
quotes (") as delimiters.
• 'example'
• typeof("5")
• Output
[1] "example"
[1] "character"
Constants in R
• Built-in Constants
• Some of the built-in constants defined in R along with their values are shown below.
• LETTERS
• letters
• pi
• month.name
• month.abb
• Output

[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
[1] 3.141593
[1] "January" "February" "March" "April" "May" "June"
[7] "July" "August" "September" "October" "November" "December"
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
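The built-in constants are ordinary vectors, so they can be indexed and used in expressions. The sketch below is illustrative only and its variable names are invented. It also shows that, unlike literal constants such as 5 or "a", a built-in name like pi can be masked by a user assignment, while the original stays reachable as base::pi.

```r
# Index into the built-in vectors.
first_five <- LETTERS[1:5]
third_month <- month.name[3]

# Use pi in a computation: area of a circle with radius 2.
area <- pi * 2^2

print(first_five)
print(third_month)
print(area)

# Assigning to pi creates a new variable that masks the built-in one.
pi <- 3
print(pi)        # the user-defined value
print(base::pi)  # the built-in value is unchanged
rm(pi)           # remove the mask again
```

Because of this masking behavior, it is safest not to reuse the names of built-in constants for your own variables.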
Classes in R
• Every variable in R has a class, which tells you what kind of object it is.
• The class() function in R returns the value of the class attribute of an
R object.
• Syntax
• The syntax of the function is:
• class(x)
• Parameters
• The class() function takes one parameter:
• x: This represents the R object whose class attribute is to be determined.
• Return value
• The class() function returns the class attribute of an R object.
Classes in R
# Creating R objects
mydate <- as.Date('2015-03-12')
myfunction <- function(x) { x * x }
myname <- "Theo"
mydf <- data.frame(c1 = 1:2, c2 = letters[1:2])
# Getting their class attributes using the class() function
class(mydate)
class(myfunction)
class(myname)
class(mydf)
• Output
[1] "Date"
[1] "function"
[1] "character"
[1] "data.frame"
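It can help to compare class() with the typeof() function introduced under Constants in R: class() reports how an object behaves, while typeof() reports how it is stored internally. The following is a small sketch of that difference, reusing the data frame from the example above.

```r
# For plain numbers, class() says "numeric" while typeof() says "double".
print(class(5))
print(typeof(5))

# The L suffix gives an integer; logical values have class "logical".
print(class(5L))
print(class(TRUE))

# A data frame is stored as a list but carries the class "data.frame".
mydf <- data.frame(c1 = 1:2, c2 = letters[1:2])
print(typeof(mydf))
print(inherits(mydf, "data.frame"))   # tests class membership
```

inherits() is often a more robust test than comparing class(x) to a string, because an object may have more than one class.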
