Unit 1.1

The document provides an overview of R programming, highlighting its features, statistical capabilities, and mathematical functionalities. It covers essential concepts such as vectors, matrices, and arrays, along with operations like addition, multiplication, and subsetting. Additionally, it explains how to create and manipulate these data structures in R for statistical computing and data analysis.

Uploaded by

Chaya Anu

Statistical Computing and R Programming
Introduction
• R is an open-source programming language that is widely
used as a statistical software and data analysis tool.
• R generally comes with the Command-line interface.
• R is available across widely used platforms like Windows,
Linux, and macOS.
• The R programming language is also considered a cutting-edge
tool for data analysis.
Introduction
• It was designed by Ross Ihaka and Robert
Gentleman at the University of Auckland, New Zealand,
and is currently developed by the R Development Core
Team.
• R programming language is an implementation of the S
programming language.
• It combines S with lexical scoping semantics inspired by
Scheme. The project was conceived in 1992, with an
initial version released in 1995 and a stable beta version in
2000.
Why R Programming Language?
Features of R Programming
Language
Statistical Features of R:
• Basic Statistics
• Static Graphs
• Probability Distribution
• Data Analysis
Features of R Programming
Language
Programming Features of R:
• R Packages
• Distributed Computing
Distributed computing is a model in which components
of a software system are shared among multiple computers
to improve efficiency and performance. Two packages
for distributed programming in R, ddR and multidplyr,
were released in November 2015.
R for Basic Math
• All common arithmetic operations and mathematical
functionality are ready to use at the console prompt.
• You can perform addition, subtraction, multiplication, and
division with the symbols +, -, *, and / respectively.
• You can create exponents (also referred to as powers or
indices) using ^, and you can control the order of the
calculations in a single command using parentheses, ().
Arithmetic
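• In R, standard mathematical rules apply throughout and
follow the usual order of operations: parentheses, exponents,
multiplication and division, addition and subtraction (PEMDAS).
• Multiplication and division share one precedence level, as do
addition and subtraction; within a level, evaluation runs from
left to right.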
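A minimal sketch of these precedence rules at the console. The numbers and object names are arbitrary illustrations, not examples taken from the slides.

```r
# Parentheses first, then exponents, then * and /, then + and -
a <- 2 + 3 * 4          # multiplication before addition: 14
b <- (2 + 3) * 4        # parentheses force the addition first: 20
c_val <- 10 - 4 / 2^2   # exponent, then division, then subtraction: 9
d <- 8 / 4 * 2          # equal precedence, evaluated left to right: 4
print(c(a, b, c_val, d))
```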
Logarithms and Exponentials
• You’ll often see or read about researchers performing a
log transformation on certain data.
• This refers to rescaling numbers according to the
logarithm.
• When supplied a given number x and a value referred to
as a base, the logarithm calculates the power to which you
must raise the base to get to x.
• Ex: the logarithm of x = 243 to base 3 is 5, because 3^5 = 243.
• In R, log transformation is achieved with the log function.
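The base-3 example above can be checked directly. This is a small sketch; the object name p is only illustrative.

```r
# log(x, base) returns the power to which base must be raised to give x
p <- log(x = 243, base = 3)
print(p)      # 5, up to floating-point rounding
print(3^5)    # 243, confirming the result
```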
Logarithms
Exponentials
• There’s a particular kind of log transformation often used in
mathematics called the natural log, which fixes the base at a special
mathematical number – Euler’s number.
• This is conventionally written as e and is approximately equal to 2.718.
• Euler’s number gives rise to the exponential function, defined as e
raised to the power of x, where x can be any number.
• The exponential function f(x) = e^x is often written as exp(x) and
represents the inverse of the natural log, such that
exp(log_e x) = log_e exp(x) = x.
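A short sketch of the inverse relationship between exp and the natural log. It relies on the fact that log uses base e when no base argument is given; the value 3 is arbitrary.

```r
e <- exp(1)           # Euler's number, approximately 2.718
x <- 3
y <- exp(x)           # e raised to the power 3
print(round(e, 3))
# log() without a base argument is the natural log, so it undoes exp()
print(log(y))         # recovers 3
print(exp(log(x)))    # also recovers 3
```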
E - Notation
• When R prints large or small numbers beyond a certain
threshold of significant figures, set at 7 by default, the
numbers are displayed using the classic scientific e-notation.
• In e-notation, any number x can be expressed as xey,
which represents exactly x × 10^y.
• Consider the number 2,342,151,012,900.
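As a sketch, entering the number above at the console shows the e-notation display. The small value z is an added illustration, and the printed form assumes R's default settings (7 significant digits).

```r
x <- 2342151012900
print(x)              # with default settings: [1] 2.342151e+12
y <- 2.3421510129e12  # e-notation can also be typed as input
print(x == y)         # TRUE: both literals denote the same number
z <- 0.0000002533     # small numbers use a negative exponent
print(z)
```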
Assigning Objects
• If you want to save the results and perform further operations,
you need to be able to assign the results of a given computation
to an object (variable) in the current workspace.
• You can specify an assignment in R in two ways: using arrow
notation (<-) and using a single equal sign (=).
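Both assignment forms in a brief sketch; the object names and values are illustrative only.

```r
x <- -5        # arrow notation
y = 7          # single equal sign
z <- x * y     # store the result of a computation: -35
z <- z + 1     # an existing object can be overwritten: -34
print(z)
```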
Vectors
• Often you’ll want to perform the same calculations or comparisons upon
multiple entities, for example if you’re rescaling measurements in a data
set.
• R provides an efficient solution to do this with vectors.
• The vector is the essential building block for handling multiple items in R.
• A vector is an ordered collection of basic data types of a given length. The
key constraint is that all elements of a vector must be of the same data
type, i.e. vectors are homogeneous data structures. Vectors are
one-dimensional data structures.
• More complicated data structures may consist of several vectors.
Creating Vectors
• The function for creating a vector is the single letter c,
with the desired entries in parentheses separated by
commas.
R> myvec <- c(1,3,1,42)
R> myvec
[1] 1 3 1 42
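The following sketch extends the example above. It shows that c also accepts existing objects and expressions, and that mixed entries are coerced to a single type because a vector is homogeneous. The names foo, myvec2 and mixed are hypothetical.

```r
myvec <- c(1, 3, 1, 42)
foo <- 32.1
# c() can combine existing vectors, stored objects and expressions
myvec2 <- c(myvec, foo, 2 * 3)
print(length(myvec2))      # 6 entries
# Mixed types are coerced to one common type, here character
mixed <- c(1, "a", TRUE)
print(class(mixed))        # "character"
print(mixed)
```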
Sequences, Repetition, Sorting
and Lengths
• Useful functions associated with R vectors: seq, rep, sort and
length.
• You can create an equally spaced sequence of increasing or
decreasing numeric values.
• The easiest way to create such a sequence, with numeric values
separated by interval of 1, is to use the colon operator.

• An expression such as 3:27 should be read as “from 3 to 27 (by 1)”.
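A minimal sketch of the colon operator for the "from 3 to 27" case, with a decreasing sequence added as an illustration.

```r
s <- 3:27            # from 3 to 27, in steps of 1
print(s)
print(length(s))     # 25 values
down <- 27:3         # the same operator also counts downwards
print(down)
```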


Sequences
Sequences with seq
• You can use the seq function to create a sequence as a vector.
• Its main parameters are the from value, the to value and the by value.
• Such a sequence always starts at the from value, but does not
necessarily end at the to value.
R> seq(from=3, to=27, by=5)
[1] 3 8 13 18 23
Sequences with seq
• Instead of providing a by value, however, you can specify
a length.out value to produce a vector with that many
numbers, evenly spaced between the from and to value.
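A sketch of length.out, reusing the from and to values of the earlier seq call. The decreasing sequence is an added illustration of a negative by value.

```r
# Five evenly spaced values between 3 and 27 (inclusive)
s <- seq(from = 3, to = 27, length.out = 5)
print(s)             # 3 9 15 21 27
# A negative by value gives a decreasing sequence
d <- seq(from = 10, to = 1, by = -3)
print(d)             # 10 7 4 1
```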
Repetition with rep
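The slide content for rep did not survive extraction, so the following is only a sketch of typical usage with the times and each arguments of the base R rep function. The input values are arbitrary.

```r
r1 <- rep(x = 1, times = 4)               # one value repeated four times
r2 <- rep(x = c(3, 62, 8.3), times = 2)   # whole vector repeated twice
r3 <- rep(x = c(3, 62, 8.3), each = 2)    # each entry repeated twice in place
print(r1)
print(r2)
print(r3)
```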
Sorting with sort
• The sort function sorts a vector in increasing or decreasing order.
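A short sketch of both sorting directions, using the decreasing argument of sort. The vector v is an arbitrary illustration.

```r
v <- c(2.5, -1, -10, 3.44)
up <- sort(x = v)                       # increasing order is the default
down <- sort(x = v, decreasing = TRUE)  # largest value first
print(up)
print(down)
```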
Vector length with length
• This determines how many entries exist in a vector given as the
argument x.
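For instance, as a minimal sketch with arbitrary vectors:

```r
n1 <- length(x = c(3, 2, 8, 1))   # four entries
n2 <- length(x = 5:13)            # the sequence 5, 6, ..., 13 has nine entries
print(c(n1, n2))
```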
Subsetting and element
extraction
• In all the results you have seen printed to the console
screen so far, you may have noticed a curious feature.
• Immediately to the left of the output there is a
square-bracketed [1].
• When the output is a long vector that spans the width of
the console and wraps onto the following line, another
square-bracketed number appears to the left of the new
line.
• These numbers represent the index of the entry directly to
the right.
Subsetting and element
extraction
• Quite simply, the index corresponds to the position of a value
within a vector, which is precisely why the first value always has
a [1] next to it.
• These indexes allow you to retrieve specific elements from a
vector, which is known as subsetting.
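A sketch of index-based extraction with square brackets. The vector contents are arbitrary.

```r
myvec <- c(5, -2.3, 4, 4, 4, 6, 8, 10, 40221, -8)
first <- myvec[1]               # first element
last <- myvec[length(myvec)]    # last element, whatever the length
some <- myvec[c(1, 3, 5)]       # several positions at once
run <- myvec[2:4]               # a run of consecutive positions
print(first)
print(last)
print(some)
print(run)
```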
Subsetting and element
extraction
• To delete individual elements, use negative versions of the
indexes.
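A sketch of negative indexing. Note that the result omits the named positions but the original vector is unchanged unless the result is assigned back.

```r
myvec <- c(5, -2.3, 4, 6)
no_first <- myvec[-1]           # everything except the first element
no_ends <- myvec[-c(1, 4)]      # drop the first and fourth elements
print(no_first)
print(no_ends)
print(myvec)                    # still has all four entries
```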
Vector Oriented Behaviour
• Vectors are so useful because they allow R to carry out
operations on multiple elements simultaneously with speed and
efficiency.
• This vector-oriented, vectorized or element-wise behaviour is a
key feature of the language.
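A minimal sketch of element-wise behaviour. The last line illustrates recycling, where R repeats a shorter vector to match a longer one; the values are arbitrary.

```r
x <- c(1, 2, 3, 4)
doubled <- x * 2                  # every element multiplied by 2
summed <- x + c(10, 20, 30, 40)   # element-by-element addition
squared <- x^2                    # every element squared
flipped <- x * c(1, -1)           # shorter vector is recycled: 1 -2 3 -4
print(doubled)
print(summed)
print(squared)
print(flipped)
```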
Matrices and Arrays
Matrix and Arrays
• A matrix is simply several vectors stored together.
• Whereas the size of a vector is described by its length, the
size of a matrix is specified by a number of rows and a
number of columns.
• Higher-dimensional structures are referred to as
arrays.
Defining a Matrix
• The matrix is an important mathematical construct, and
it’s essential to many statistical methods.
• Matrix A is described as an m × n matrix; that is, A will have
exactly m rows and n columns.
• This means A will have a total of mn entries, with each
entry a_{i,j} having a unique position given by its specific
row (i = 1, 2, …, m) and column (j = 1, 2, …, n).
Creating a Matrix
• To create a matrix, use the matrix command, providing the
entries of the matrix to the data argument as a vector.
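A sketch of a 2 × 2 matrix built with the data, nrow and ncol arguments. The entries are illustrative.

```r
A <- matrix(data = c(-3, 2, 893, 0.17), nrow = 2, ncol = 2)
print(A)
# The first two entries of data form column 1, the next two column 2
print(A[1, 2])    # 893
```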
Creating a Matrix
• You can elect not to supply nrow and ncol when calling
matrix, in which case R’s default behaviour is to return a
single-column matrix of the entries in data.
• Ex: matrix(data=c(-3,2,893,0.17)) would be identical to
matrix(data=c(-3,2,893,0.17), nrow=4, ncol=1)
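The equivalence stated above can be checked with identical, as a quick sketch:

```r
m1 <- matrix(data = c(-3, 2, 893, 0.17))
m2 <- matrix(data = c(-3, 2, 893, 0.17), nrow = 4, ncol = 1)
print(dim(m1))              # 4 rows, 1 column
print(identical(m1, m2))    # TRUE
```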
Filling Direction
• By default, R fills a matrix column by column, reading the
data entries from left to right.
• You can control how R fills in data using the argument
byrow.
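A sketch contrasting the default column-wise fill with byrow=TRUE, using the entries 1 to 6 as an illustration.

```r
by_col <- matrix(data = 1:6, nrow = 2, ncol = 3)                # default fill
by_row <- matrix(data = 1:6, nrow = 2, ncol = 3, byrow = TRUE)  # row-wise fill
print(by_col)    # first row: 1 3 5
print(by_row)    # first row: 1 2 3
```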
Row and Column Bindings
• If you have multiple vectors of equal length, you can
quickly build a matrix by binding together these vectors
using the built-in R functions, rbind and cbind.
• You can treat each vector as a row (using the command
rbind) or treat each vector as a column (using the
command cbind).
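A sketch that builds the same 2 × 3 matrix both ways; the vectors are illustrative.

```r
# Each vector becomes one row
by_rows <- rbind(c(1, 2, 3), c(4, 5, 6))
# Each vector becomes one column
by_cols <- cbind(c(1, 4), c(2, 5), c(3, 6))
print(by_rows)
print(by_cols)
print(all(by_rows == by_cols))    # TRUE: same entries in the same positions
```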
Matrix Dimensions
• The built-in function dim returns the dimensions of a stored
matrix.
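A sketch of dim on a 4 × 3 matrix, together with the related base R functions nrow and ncol. The matrix entries are arbitrary.

```r
A <- rbind(c(1, 3, 4), 5:3, c(100, 20, 90), 11:13)
d <- dim(A)       # number of rows, then number of columns
print(d)          # 4 3
print(nrow(A))    # 4
print(ncol(A))    # 3
print(d[2])       # dim returns a vector, so it can be subset: 3
```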
Subsetting
• Extracting and subsetting elements from matrices in R
is much like extracting elements from vectors.
• The only complication is that you now have an additional
dimension.
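A sketch of extracting a single element by row and column position. The 3 × 3 matrix is an illustrative example.

```r
A <- matrix(c(0.3, 4.5, 55.3, 91, 0.1, 105.5, -4.2, 8.2, 27.9),
            nrow = 3, ncol = 3)
print(A)
el <- A[3, 2]     # row 3, column 2
print(el)         # 105.5
```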
Row, Column and Diagonal
Extraction
• To extract an entire row or column from a matrix, you
simply specify the desired row or column number and
leave the other value blank.
• It’s important to include the comma that separates the
row and column numbers.
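A sketch of whole-row, whole-column and diagonal extraction on an illustrative 3 × 3 matrix. The diagonal is taken with the base R function diag.

```r
A <- matrix(c(0.3, 4.5, 55.3, 91, 0.1, 105.5, -4.2, 8.2, 27.9),
            nrow = 3, ncol = 3)
col2 <- A[, 2]     # blank row position: the entire second column
row1 <- A[1, ]     # blank column position: the entire first row
dg <- diag(x = A)  # entries a_{1,1}, a_{2,2}, a_{3,3}
print(col2)
print(row1)
print(dg)
```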
Row, Column and Diagonal
Extraction
• A command such as A[2:3,] returns the second and third rows of A.
• A command such as A[,c(3,1)] returns the third and first columns of A.
• A command such as A[c(3,1),2:3] accesses the third and first rows of A,
in that order, and from those rows it returns the second and third
column elements.
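The three extractions described above, sketched on an illustrative 3 × 3 matrix. The original slide images are not available, so the matrix entries here are an assumption.

```r
A <- matrix(c(0.3, 4.5, 55.3, 91, 0.1, 105.5, -4.2, 8.2, 27.9),
            nrow = 3, ncol = 3)
rows23 <- A[2:3, ]            # second and third rows
cols31 <- A[, c(3, 1)]        # third column, then first column
block <- A[c(3, 1), 2:3]      # rows 3 and 1, restricted to columns 2 and 3
print(rows23)
print(cols31)
print(block)
```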
Omitting and Overwriting
• To delete or omit elements from a matrix, you again use
square brackets, but this time with negative indexes.
Omitting and Overwriting
• A command such as A[-1,3:2] removes the first row from A and
retrieves the third and second column values, in that order, from
the remaining two rows.
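A sketch of combining a negative row index with a positive column selection, on an illustrative 3 × 3 matrix (the entries are assumed, not taken from the slides).

```r
A <- matrix(c(0.3, 4.5, 55.3, 91, 0.1, 105.5, -4.2, 8.2, 27.9),
            nrow = 3, ncol = 3)
# Drop row 1, then take column 3 followed by column 2
sub <- A[-1, 3:2]
print(sub)
```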
Omitting and Overwriting
• Print A without its first row and second column, for example with
A[-1,-2].
• Delete the first row and then delete the second and third
columns from the result, for example with A[-1,-c(2,3)].
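Both omissions as a sketch on an illustrative 3 × 3 matrix. Negative indexes only affect what is returned; A itself is not altered. When a single column remains, R returns a plain vector by default.

```r
A <- matrix(c(0.3, 4.5, 55.3, 91, 0.1, 105.5, -4.2, 8.2, 27.9),
            nrow = 3, ncol = 3)
no_r1_c2 <- A[-1, -2]          # omit row 1 and column 2
no_r1_c23 <- A[-1, -c(2, 3)]   # omit row 1, then columns 2 and 3
print(no_r1_c2)
print(no_r1_c23)               # one column left, returned as a vector
print(dim(A))                  # A is unchanged: 3 3
```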
Replacement
Diagonal Element
Replacement
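The replacement slides did not survive extraction. The following is only a sketch of how overwriting is commonly done, by assigning to a subset; the matrix B and the replacement values are hypothetical.

```r
B <- matrix(data = 1:9, nrow = 3, ncol = 3)
B[2, ] <- c(0, 0, 0)             # overwrite the entire second row
B[1, 3] <- -1                    # overwrite a single element
diag(B) <- rep(10, times = 3)    # overwrite the diagonal entries
print(B)
```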
Matrix Operations and
Algebra
• Matrix Transpose
• Identity Matrix
• Scalar Multiple of a matrix
• Matrix Addition and Subtraction
• Matrix Multiplication
• Matrix Inversion
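Several of the operations listed above can be sketched as follows, assuming the base R operators diag (identity matrix), * (scalar multiple), + and - (element-wise addition and subtraction) and %*% (matrix multiplication). The matrices A and B are illustrative.

```r
A <- rbind(c(2, 5, 2), c(6, 1, 4))      # 2 x 3
B <- cbind(c(3, -1, 1), c(-3, 1, 5))    # 3 x 2
I3 <- diag(x = 3)        # 3 x 3 identity matrix
twoA <- 2 * A            # scalar multiple: every entry doubled
S <- A + twoA            # addition needs matrices of the same size
D <- twoA - A            # subtraction likewise
P <- A %*% B             # (2 x 3) times (3 x 2) gives a 2 x 2 matrix
print(P)
print(all(A %*% I3 == A))   # multiplying by the identity leaves A unchanged
```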
Matrix Transpose

• In R, the transpose of a matrix is found with the function t.
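A minimal sketch of t on an illustrative 2 × 3 matrix: rows become columns, and transposing twice returns the original.

```r
A <- rbind(c(2, 5, 2), c(6, 1, 4))   # 2 x 3
At <- t(A)                           # 3 x 2
print(At)
print(dim(At))
print(all(t(At) == A))               # TRUE
```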


Identity Matrix
Scalar Multiple of a matrix
Matrix Addition and
Subtraction
Matrix Multiplication
Matrix Inversion
• Matrices that are not invertible are referred to as
singular.
• Inverting a matrix is often necessary when solving
equations with matrices and has important practical
ramifications.
• The R function solve() inverts a matrix.
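A sketch of solve on an illustrative invertible 2 × 2 matrix. The second call, with a right-hand side b, solves the linear system A x = b directly; b is an added illustration.

```r
A <- matrix(data = c(3, 4, 1, 2), nrow = 2, ncol = 2)
Ainv <- solve(A)           # inverse of A
print(Ainv)
print(A %*% Ainv)          # identity matrix, up to rounding
# With a second argument, solve() solves A x = b for x
b <- c(5, 8)
x_sol <- solve(A, b)
print(x_sol)               # 1 2, since 3*1 + 1*2 = 5 and 4*1 + 2*2 = 8
```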
Multidimensional Arrays
• Just as a matrix (a rectangle of elements) is the result of
increasing the dimension of a vector (a line of elements), the
dimension of a matrix can be increased to get more complex data
structures.
• In R, vectors and matrices can be considered special cases of the
more general array.
Definition
• To create these multidimensional data structures (arrays) in
R, use the array function and specify the individual
elements in the data argument as a vector.
• Then specify the size in the dim argument as another vector
with a length corresponding to the number
of dimensions.
• Note that array fills the entries of each layer with the
elements in data in a strict column-wise fashion, starting
with the first layer.
Arrays
• The order of the dimensions supplied to dim is c(rows,
columns, layers).
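A sketch of a three-dimensional array with 3 rows, 4 columns and 2 layers, filled with the values 1 to 24 as an illustration. Each layer is filled column by column, starting with the first layer.

```r
AR <- array(data = 1:24, dim = c(3, 4, 2))
print(AR)
print(dim(AR))         # 3 4 2
print(AR[1, 1, 2])     # first entry of the second layer: 13
```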
Subsets, Extraction, and
Replacements
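The slide content for this topic is missing, so this is only a sketch of how matrix-style indexing extends to arrays with one index position per dimension. The array AR and the replacement values are hypothetical.

```r
AR <- array(data = 1:24, dim = c(3, 4, 2))
r <- AR[2, , 2]            # second row of the second layer
two <- AR[2, c(3, 1), 2]   # columns 3 and 1 of that row, in that order
firsts <- AR[1, , ]        # first row of every layer, as a 4 x 2 matrix
AR[, , 2] <- 0             # replace the whole second layer
print(r)
print(two)
print(firsts)
print(sum(AR))             # only layer 1 remains non-zero: 1 + ... + 12 = 78
```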

Assignment: Other ways of extraction and replacement in arrays