
R: Introduction: Kedar Kelkar

R is used for statistical analysis and graphing and derives from the S language, which was developed at Bell Labs. It has a simple syntax and holds objects in memory, and commands can be run interactively. The base R environment includes many classical techniques, and additional functionality is available through packages.

Uploaded by

Naomi Saha

R: Introduction

Kedar Kelkar
History of R
• R can be regarded as an implementation of the S language which was
developed at Bell Laboratories by Rick Becker, John Chambers and
Allan Wilks.

• R is an environment within which many classical and modern statistical techniques have been implemented.

• Some of these techniques are built directly into the base R environment, but many are supplied as packages.

• R creates a wide variety of graphs

• Completely customizable in terms of graph type, colors, legends, size, background, scale, grids, etc.
R Programming not for non-programmers?
• R is an interpreted language, not a compiled one
• Commands can be run directly, without writing the entire program.
• Compiled languages include C

• Syntax is very simple
• Function names say what they do: sum() computes a sum, mean() a mean, median() a median.
• Defining variables and using functions is very intuitive
• For more complex functions, we need to install packages. These packages are free and are stored in an R library on the local system
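As a quick illustration of how closely function names follow what they compute, here is a minimal sketch. The vector x and its values are made up for this example, and the package named in the commented lines is only an illustration.

```r
# Hypothetical sample data, chosen only for illustration
x <- c(4, 8, 15, 16, 23, 42)

total  <- sum(x)     # adds all the elements
avg    <- mean(x)    # arithmetic mean
middle <- median(x)  # middle value of the sorted data

print(c(total = total, mean = avg, median = middle))

# Packages are installed once and then loaded in each session.
# Left commented out because installation needs an internet connection:
# install.packages("ggplot2")
# library(ggplot2)
```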
R Programming not for non-programmers?
• Created objects are held in the memory
• You can access all the objects using the command ls()

• The collection of these objects is called the workspace.

• This workspace is not saved on disk unless you tell R to do so. This means
that your objects are lost when you close R.

• If you save the workspace then all the objects in your current R session are
saved in a file .RData.
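The workspace behaviour described above can be sketched as follows. The object names and the temporary file path are assumptions made for this example; save() is used here so that only the two example objects are written, whereas save.image() would write the whole workspace.

```r
# Two example objects (names are arbitrary)
x <- 10
y <- c(1, 2, 3)

ls()  # lists the names of the objects currently held in memory

# Save the two objects to a file in a temporary directory;
# save.image() would save every object in the workspace instead
workspace_file <- file.path(tempdir(), "demo.RData")
save(x, y, file = workspace_file)

rm(x, y)              # remove the objects from memory
load(workspace_file)  # restore them from the file
print(x)
print(y)
```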
Installing R and RStudio
• Visit the following website:
• https://cran.r-project.org/bin/windows/base/
• Right-click the installer file and RUN as administrator.
• Follow the instructions for installation.

• Downloading RStudio
• https://www.rstudio.com/products/rstudio/download/#download
Interactive Mode in R
• Interactive mode is the most basic way of working in R.
• You type a command and immediately get its result.

• Using R as a calculator

• On the console, simply start typing mathematical operations to get results.
• R follows BODMAS (brackets, orders, division/multiplication, addition/subtraction)
Computation in R
• Addition: 2+3
• Subtraction: 2-3
• Multiplication: 2*3
• Division: 2/3
• Exponentiation: 2**3 or 2^3
• Remainder: 2%%3

R follows BODMAS, so these two expressions give the same result:
1) 3+5+7*7-50/5
2) (3+5) + (7*7) - (50/5)
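The operators above can be tried directly on the console. The following is a small sketch; the variable names a and b are introduced only for this example.

```r
2 + 3    # 5
2 - 3    # -1
2 * 3    # 6
2 / 3    # 0.6666667
2^3      # 8
2 ** 3   # 8, the parser treats ** as ^
2 %% 3   # 2, the remainder of 2 divided by 3

# Operator precedence: multiplication and division are evaluated
# before addition and subtraction, so the brackets change nothing here
a <- 3 + 5 + 7 * 7 - 50 / 5
b <- (3 + 5) + (7 * 7) - (50 / 5)
print(a)       # 47
print(b)       # 47
print(a == b)  # TRUE
```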
Computation in R
• Range
• The range operator is the colon (:)
• 1:5 gives 1, 2, 3, 4, 5

• Addition and range can be used together
• The colon is evaluated before the addition, so 3+1:5 means 3+c(1, 2, 3, 4, 5)
• R returns the vector 4 5 6 7 8
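A short sketch of the range operator and its precedence follows. The names r1, r2 and r3 are made up for this example; the bracketed variant is added only to show the contrast.

```r
r1 <- 1:5        # the sequence 1 2 3 4 5
r2 <- 3 + 1:5    # colon binds tighter than +, giving 4 5 6 7 8
r3 <- (3 + 1):5  # brackets force the addition first, giving 4 5

print(r1)
print(r2)
print(r3)
```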
Defining Variables
• In R, data objects can be assigned names.
• These names are case sensitive and may contain alphanumeric characters (a-z, A-Z, 0-9), the dot (.) and the underscore (_).

• Names that start with a digit or an underscore (e.g. 1a), that look like numerical expressions (e.g. .11), or that contain dashes ('-') or spaces can only be used when they are quoted with backticks: `1a` and `.11`.

• All other combinations of alphanumeric characters, dots and underscores can be used freely without the need for backticks.

• There is no practical limit on the length of a name

Defining Variables
• Variables are defined using the assignment operator (<-)

Note:

1) The assignment symbol
2) The case sensitivity of R
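The naming and assignment rules can be sketched like this. All variable names and values here are invented for the illustration.

```r
# Assignment with <-
score <- 42

# Names are case sensitive: Score is a different object from score
Score <- 7
print(score == Score)  # FALSE

# Dots and underscores are allowed inside ordinary names
gdp.growth <- 6.1
savings_rate <- 0.3

# Names that start with a digit or contain spaces need backticks
`1a` <- "quoted name"
`my var` <- 3.5
print(`1a`)
print(`my var`)
```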
Data Structures
• Vectors
• A vector is a sequence of elements
• A vector contains elements of only one class
Numerical Vector: v1 <- c(1, 2, 3, 4, 5)
String Vector: v2 <- c("a", "b", "c")

• Matrix
• One can define matrices of any number of rows and columns
matrix1 <- matrix(data=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), nrow = 4, ncol = 3)
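A minimal sketch of the vector and matrix definitions above. The vector named mixed is an extra, made-up example showing what happens when classes are combined, and 1:12 is used as shorthand for the twelve values listed.

```r
v1 <- c(1, 2, 3, 4, 5)
v2 <- c("a", "b", "c")
class(v1)  # "numeric"
class(v2)  # "character"

# A vector holds one class only, so mixed input is coerced
mixed <- c(1, "a", TRUE)
class(mixed)  # "character"

# matrix() fills column by column unless byrow = TRUE is given
matrix1 <- matrix(data = 1:12, nrow = 4, ncol = 3)
print(matrix1)
dim(matrix1)   # 4 3
matrix1[2, 3]  # 10, the element in row 2, column 3
```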
Data Structures
• Lists
• A list is an object that can contain elements of different types, such as strings, numbers, vectors and even another list.

• Data Frames
• A data frame is used for storing data tables. It is a list of vectors of equal length.
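Lists and data frames can be sketched as follows. The names record and df, and all the values in them, are hypothetical and exist only for this example.

```r
# A list can mix types, including another list
record <- list(
  name   = "Asha",
  scores = c(90, 85),
  nested = list(passed = TRUE)
)
record$scores[2]      # 85
record$nested$passed  # TRUE

# A data frame is a list of equal-length vectors, shown as a table
df <- data.frame(
  id   = 1:3,
  city = c("Pune", "Delhi", "Mumbai"),
  temp = c(31.5, 35.2, 33.0)
)
print(df)
is.list(df)  # TRUE
nrow(df)     # 3
ncol(df)     # 3
```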
Matrix and List
Data Frames
mpg   Miles/(US) gallon
cyl   Number of cylinders
disp  Displacement (cu.in.)
hp    Gross horsepower
drat  Rear axle ratio
wt    Weight (1000 lbs)
qsec  1/4 mile time
vs    Engine (0 = V-shaped, 1 = straight)
am    Transmission (0 = automatic, 1 = manual)
gear  Number of forward gears
carb  Number of carburettors
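The variables listed above match the columns of mtcars, a data frame that ships with base R. Assuming that is the data set the slide refers to, it can be explored like this:

```r
# mtcars is available without loading any package
str(mtcars)  # column names, types and first values

# First three rows of a few selected columns
head(mtcars[, c("mpg", "cyl", "hp", "wt")], 3)

# A single column is accessed with $
mean(mtcars$mpg)  # average miles per gallon

# Count of cars by transmission type (0 = automatic, 1 = manual)
table(mtcars$am)
```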


Practice Question 1
• Step 1: Create two 2x2 matrices and store them in variables m1 and m2.

• Step 2: Find the sum of the two and store the result in m3

• Step 3: Multiply m1 by 3

• Step 4: Find the value of 3m1-m2 and store it in m4

• Step 5: Multiply m4 and m1 (Tip: Use m4 %*% m1)
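One possible way to work through these steps is sketched below. The matrix values are arbitrary choices for this example, and the last line is added only to contrast the matrix product with element-wise multiplication.

```r
# Step 1: two 2x2 matrices (values chosen arbitrarily)
m1 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
m2 <- matrix(c(5, 6, 7, 8), nrow = 2, ncol = 2)

# Step 2: element-wise sum
m3 <- m1 + m2

# Step 3: scalar multiplication
3 * m1

# Step 4: 3*m1 - m2
m4 <- 3 * m1 - m2

# Step 5: matrix product with %*%
product <- m4 %*% m1
print(product)

# For contrast, * multiplies element by element
elementwise <- m4 * m1
print(elementwise)
```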


Practice Question 2
• Create a list of 5 variables

• V1 should be a string vector containing the names Year, GDP Growth, Employment Growth, Savings Rate, Inflation Rate

• V2 should be a matrix containing values assigned to them (use any numbers you like)

• Create a list using V1 and V2


Practice Question 3
• Define a vector vec1 with 6 numerical elements

• Addition:
• Add 3 to vec1
• Multiplication:
• Multiply the vector by 5

• Define another vector vec2 with 5 elements. Add vec1 and vec2.

• Add the 3rd element of vec1 and the 4th element of vec2
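A sketch of these steps, with made-up values, is given below. It also shows what happens when vectors of different lengths are added: R recycles the shorter vector and warns that the longer length is not a multiple of the shorter one. The warning is suppressed here only to keep the output clean.

```r
vec1 <- c(10, 20, 30, 40, 50, 60)  # 6 elements (arbitrary values)
vec2 <- c(1, 2, 3, 4, 5)           # 5 elements (arbitrary values)

vec1 + 3  # adds 3 to every element
vec1 * 5  # multiplies every element by 5

# Lengths 6 and 5 differ, so vec2 is recycled: the 6th element
# of vec1 is paired with the 1st element of vec2
recycled <- suppressWarnings(vec1 + vec2)
print(recycled)  # 11 22 33 44 55 61

# Indexing starts at 1
picked <- vec1[3] + vec2[4]
print(picked)  # 34
```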
Break Time
Measures of Central Tendency
• Arithmetic Mean:
• The sum of all observations divided by the number of observations
• function mean(x)

• Median
• The positional center of the data set
• function median(x)

• Mode
• The observation that occurs with the highest frequency
• R has no built-in function for the statistical mode (the built-in mode() returns an object's storage mode instead)
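The three measures can be sketched as follows. Since base R offers no function for the statistical mode, stat_mode() below is a hypothetical helper written for this example, and the data values are made up.

```r
x <- c(2, 3, 3, 5, 7, 3, 8)  # arbitrary sample data

mean(x)    # sum of the observations divided by their count
median(x)  # 3, the middle value of the sorted data

# Hypothetical helper: returns the most frequent value(s) of a numeric vector
stat_mode <- function(v) {
  counts <- table(v)  # frequency of each distinct value
  as.numeric(names(counts)[counts == max(counts)])
}

stat_mode(x)  # 3
```

If several values tie for the highest frequency, this helper returns all of them.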
Measures of Central Tendency
• Geometric Mean
• Unlike arithmetic mean, geometric mean uses products to find the mean.
• No in-built function
• prod(x)^(1/length(x))

• Harmonic Mean
• Reciprocal of the arithmetic mean of the reciprocals of x
• No in-built function.
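Both means can be written in one line each. In this sketch, geo_mean() and harm_mean() are hypothetical helper names, and the input is assumed to contain only positive numbers.

```r
# Geometric mean: n-th root of the product of n values
geo_mean <- function(x) prod(x)^(1 / length(x))

# Harmonic mean: reciprocal of the arithmetic mean of the reciprocals
harm_mean <- function(x) 1 / mean(1 / x)

x <- c(1, 2, 4)  # arbitrary positive values
geo_mean(x)      # cube root of 8
harm_mean(x)     # 3 / (1 + 1/2 + 1/4)

# For long vectors prod(x) can overflow; an equivalent form
# for positive data uses logarithms
exp(mean(log(x)))
```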
Practice Question 4
• Discuss the pseudocode for the harmonic mean.

• Verify the relationship between the three types of means

• Prove: AM * HM = (GM)^2 (this identity holds for two positive numbers)
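A numerical check can complement the proof. The sketch below uses made-up values and hypothetical helper names; it shows that AM * HM = (GM)^2 is exact for a pair of positive numbers, while for three or more numbers it does not hold in general, although AM >= GM >= HM still does.

```r
am <- function(x) mean(x)
gm <- function(x) prod(x)^(1 / length(x))
hm <- function(x) 1 / mean(1 / x)

# Two positive numbers: the identity holds
pair <- c(4, 9)
lhs_pair <- am(pair) * hm(pair)
rhs_pair <- gm(pair)^2
print(all.equal(lhs_pair, rhs_pair))  # TRUE

# Three numbers: the identity fails for this example
triple <- c(1, 2, 3)
lhs_triple <- am(triple) * hm(triple)
rhs_triple <- gm(triple)^2
print(isTRUE(all.equal(lhs_triple, rhs_triple)))  # FALSE

# The ordering AM >= GM >= HM still holds
print(am(triple) >= gm(triple) && gm(triple) >= hm(triple))  # TRUE
```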
Reading csv files in R
• data <- read.csv(file.choose())

• Homework
Find out how to compute the median when classes and frequencies are given (grouped data).
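file.choose() opens a file dialog, so it only works in an interactive session. As a non-interactive sketch of read.csv(), the example below first writes a small made-up CSV file to a temporary directory and then reads it back; the file name and its contents are assumptions for this illustration.

```r
# Create a small example file (contents are made up)
csv_path <- file.path(tempdir(), "example.csv")
writeLines(c("name,score", "A,10", "B,20", "C,30"), csv_path)

# Read it into a data frame; the first line is used as the header
data <- read.csv(csv_path)
str(data)

mean(data$score)  # 20
```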
R Help
• ?topic opens the help page for a known function or topic, e.g. ?mean
• ??keyword searches all installed documentation for a keyword, e.g. ??regression
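The two operators are shortcuts for the functions help() and help.search(). A brief sketch follows; the interactive forms are left as comments because they open a help viewer, and apropos() is shown as a related way to find functions by name.

```r
# Interactive shortcuts (commented out because they open a viewer):
# ?mean           # same as help("mean")
# ??"regression"  # same as help.search("regression")

# help() returns an object describing where the help page was found
h <- help("mean")
class(h)

# apropos() lists objects whose names contain a given string
matches <- apropos("median")
print(matches)
```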
