Unit 1.1
and R Programming
Introduction
• R is an open-source programming language that is widely
used for statistical computing and data analysis.
• R generally comes with a command-line interface.
• R is available across widely used platforms like Windows,
Linux, and macOS.
• R is also actively developed and is widely regarded as a
cutting-edge tool for data analysis.
• R was designed by Ross Ihaka and Robert
Gentleman at the University of Auckland, New Zealand,
and is currently developed by the R Core Team
(formerly the R Development Core Team).
• The R programming language is an implementation of the S
programming language.
• It combines S with lexical scoping semantics inspired by
Scheme. The project was conceived in 1992, with an
initial version released in 1995 and a stable beta version in
2000.
Why R Programming Language?
Features of R Programming
Language
Statistical Features of R:
• Basic Statistics
• Static Graphs
• Probability Distribution
• Data Analysis
Programming Features of R:
• R Packages
• Distributed Computing
Distributed computing is a model in which the components
of a software system are shared among multiple computers
to improve efficiency and performance. Two packages used
for distributed programming in R, ddR and multidplyr,
were released in November 2015.
R for Basic Math
• All common arithmetic operations and mathematical
functionality are ready to use at the console prompt.
• You can perform addition, subtraction, multiplication, and
division with the symbols +, -, *, and / respectively.
• You can create exponents (also referred to as powers or
indices) using ^, and you control the order of the
calculations in a single command using parentheses, ().
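A minimal sketch of these operators as they might be typed at the console prompt. The numbers are illustrative only; the comments show the value of each expression.

```r
# Basic arithmetic at the console prompt (illustrative values)
2 + 3         # addition: 5
10 - 4        # subtraction: 6
6 * 7         # multiplication: 42
14 / 4        # division: 3.5
2^3           # exponent: 8
(3 + 4) * 2   # parentheses are evaluated first: 14
```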
Arithmetic
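• In R, standard mathematical
rules apply throughout and
follow the usual order of
operations: parentheses,
exponents, multiplication and
division, addition and
subtraction (PEMDAS).
• Operators of equal precedence
are evaluated left to right,
except exponentiation, which is
evaluated right to left.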
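The precedence rules can be checked directly at the prompt. The expressions below are illustrative; note that exponentiation groups right to left and binds more tightly than a leading minus sign.

```r
# Order of operations in R (illustrative expressions)
3 + 4 * 2     # multiplication before addition: 11
(3 + 4) * 2   # parentheses override precedence: 14
10 - 4 - 3    # equal precedence, left to right: 3
2^3^2         # exponentiation groups right to left, 2^(3^2): 512
(2^3)^2       # forcing left-to-right grouping: 64
-2^2          # exponent applied before the unary minus: -4
```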
Logarithms and Exponentials
• You’ll often see or read about researchers performing a
log transformation on certain data.
• This refers to rescaling numbers according to the
logarithm.
• When supplied a given number x and a value referred to
as a base, the logarithm calculates the power to which you
must raise the base to get to x.
• Ex: the logarithm of x = 243 to base 3 is 5, because 3^5 = 243.
• In R, log transformation is achieved with the log function.
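A short sketch of the log function using the example above. The arguments x and base are those of R's built-in log; log10 and log2 are built-in shortcuts for two common bases.

```r
# Logarithm of 243 to base 3
log(x = 243, base = 3)   # prints 5

# Built-in shortcuts for base 10 and base 2
log10(100)               # 2
log2(8)                  # 3
```

Because the result is computed in floating point, compare it with all.equal rather than == when testing for an exact value.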
Logarithms
Exponentials
• There’s a particular kind of log transformation often used in
mathematics called the natural log, which fixes the base at a special
mathematical number – Euler’s number.
• This is conventionally written as e and is approximately equal to 2.718.
• Euler’s number gives rise to the exponential function, defined as e
raised to the power of x, where x can be any number.
• The exponential function f(x) = e^x is often written as exp(x) and
represents the inverse of the natural log, such that
exp(log_e(x)) = log_e(exp(x)) = x.
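The relationship between exp and the natural log can be sketched as follows. In R, log uses base e when no base is supplied; the input values are illustrative.

```r
# Euler's number and the exponential function
exp(1)         # e itself, prints 2.718282
exp(3)         # e raised to the power 3

# log() with no base argument is the natural log
log(20)

# exp and the natural log undo each other (up to floating-point rounding)
exp(log(20))   # recovers 20
log(exp(3))    # recovers 3
```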
E-Notation
• When R prints large or small numbers beyond a certain
threshold of significant figures, set at 7 by default, the
numbers are displayed using the classic scientific
e-notation.
• In e-notation, any number x can be expressed as xey,
which represents exactly x × 10^y.
• Consider the number 2,342,151,012,900.
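A brief sketch of how this number appears at the console, assuming R's default print settings (7 significant digits). The small number in the second example is illustrative and not from the slides.

```r
# A large number is printed in e-notation under the default settings
2342151012900     # prints 2.342151e+12

# The same applies to very small numbers (illustrative value)
0.0000002533      # prints 2.533e-07

# e-notation can also be typed as input
2.5e3             # the same value as 2500
```

Only the printed form is shortened; R still stores the number at full precision.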
Assigning Objects
• If you want to save the results and
perform further operations, you
need to be able to assign the
results of a given computation to
an object (variable) in the
current workspace.
• You can specify an assignment in
R in two ways: using arrow
notation (<-) and using a single
equal sign (=).
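Both forms of assignment can be sketched as below. The object names x, mynumber and y are hypothetical and chosen only for illustration.

```r
# Arrow notation
x <- -5
x                    # prints -5

# Single equal sign; here the object is overwritten with a new value
x = x + 1
x                    # prints -4

# Stored objects can be reused in later computations
mynumber = 45.2
y <- mynumber * x
y                    # prints -180.8

# List the objects currently in the workspace
ls()
```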
Vectors
• Often you’ll want to perform the same calculations or comparisons upon
multiple entities, for example if you’re rescaling measurements in a data
set.
• R provides an efficient solution to do this with vectors.
• The vector is the essential building block for handling multiple items in R.
• A vector is an ordered collection of basic data types of a given length. The
key requirement is that all the elements of a vector must be of the identical
data type, i.e., vectors are homogeneous data structures. Vectors are
one-dimensional data structures.
• More complicated data structures may consist of several vectors.
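A minimal sketch of both points, using the c function covered under Creating Vectors. The object names and measurements are hypothetical.

```r
# Rescale several measurements in one step (hypothetical data, in cm)
heights_cm <- c(150, 165, 180)
heights_m  <- heights_cm / 100   # the division is applied to every element
heights_m                        # prints 1.50 1.65 1.80

# Vectors are homogeneous: mixed inputs are coerced to a single type
mixed <- c(1, "a", TRUE)
mixed                            # prints "1" "a" "TRUE"
class(mixed)                     # prints "character"
```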
Creating Vectors
• The function for creating a vector is the single letter c,
with the desired entries in parentheses separated by
commas.
R> myvec <- c(1,3,1,42)
R> myvec
[1] 1 3 1 42
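Entries passed to c can also be expressions or existing vectors. The sketch below extends the example above; myvec2 is a hypothetical name.

```r
# Recreate the vector from the example above
myvec <- c(1, 3, 1, 42)

# Entries may be computed values or whole vectors
myvec2 <- c(myvec, 10, 2^3)
myvec2           # prints 1 3 1 42 10 8
length(myvec2)   # prints 6
```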
Sequences, Repetition, Sorting
and Lengths
• Useful functions associated with R vectors: seq, rep, sort and
length.
• You can create an equally spaced sequence of increasing or
decreasing numeric values.
• The easiest way to create such a sequence, with numeric values
separated by intervals of 1, is to use the colon operator.
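A short sketch of the colon operator alongside seq, rep, sort and length. All values are illustrative.

```r
# Colon operator: steps of 1, increasing or decreasing
3:27                                   # 3 4 5 ... 27
5:1                                    # 5 4 3 2 1

# seq allows a different step size
seq(from = 3, to = 27, by = 3)         # 3 6 9 12 15 18 21 24 27

# rep repeats a value or a whole vector
rep(x = 1, times = 4)                  # 1 1 1 1
rep(x = c(3, 62), times = 2)           # 3 62 3 62

# sort orders the elements; length counts them
sort(c(2.5, -1, 10, 3))                    # -1.0 2.5 3.0 10.0
sort(c(2.5, -1, 10, 3), decreasing = TRUE) # 10.0 3.0 2.5 -1.0
length(3:27)                               # 25
```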