Course Title: Introduction To R in Business Applications

This document provides an overview of a session on applying functions in R. It discusses the apply family of functions, including apply(), lapply(), sapply(), tapply(), and mapply(), and the use of the dplyr package to manage data frames. Case studies on US crime data and iris flower data are presented. The objective is to write efficient R programs using the apply functions. Feedback is requested to improve future sessions; the next session covers graphics with GGplot.

Uploaded by

rakesh

Course Title: Introduction to R in Business Applications
Ram Mohan Dhara
IMTG / PGDM / Term-V / 2019-2021
Session 3: Intermediate R / Apply functions
• Split your screen: one window for log-in and the other for hands-on practice.
• Your profs are rationally bounded! Q&A session only in the last 15 minutes.
• Stay alert! You might have multiple quizzes in a session, and they are evaluative.
• Please share feedback on today's session. Help your prof make the sessions better.
• After every 2-3 sessions there is an assignment to be submitted by the due date, and it is evaluative.
• You can't present; your video and audio are in mute mode. You are not supposed to mute or remove your prof in session.
Session objectives
After completing this session, you will be able to write programs in R using:
• Apply functions: a very efficient way of programming in R
Case Example - US crime (x77crime dataset)
The data is available in the R environment (source: U.S. Department of Commerce, Bureau of the Census (1977), Statistical Abstract of the United States). The variables are:

1. State: the 50 states of the US
2. Population: population estimate as of July 1, 1975
3. Income: per capita income (1974)
4. Illiteracy: illiteracy (1970, percent of population)
5. Life Exp: life expectancy in years (1969–71)
6. HS Grad: percent high-school graduates (1970)
7. Frost: mean number of days with minimum temperature below freezing in capital or large city
8. Area: land area in square miles
9. Crime: rate per 100,000 population (1976)
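The x77crime data itself is not reproduced in these slides. As a rough sketch, the code below assumes that R's built-in state.x77 matrix can stand in for it: state.x77 comes from the same Census source and has the same kind of columns, but its crime column is named Murder rather than Crime. The name x77 is chosen here for illustration only.

```r
# Assumption: the built-in state.x77 matrix stands in for the course's x77crime data
x77 <- as.data.frame(state.x77)
x77$State <- rownames(state.x77)   # state names are stored as row names

str(x77)                # structure: 50 observations, one per state
summary(x77$Income)     # five-number summary of per capita income

# Five states with the highest murder rate per 100,000 population
head(x77[order(-x77$Murder), c("State", "Murder")], 5)
```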
Case Example - Iris flower
• The Iris flower data set or Fisher's Iris data set is a multivariate data set introduced by the
British statistician and biologist Ronald Fisher in his 1936 paper on linear discriminant
analysis.
• The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris
virginica and Iris versicolor).
• Four features were measured from each sample: the length and the width of the sepals
and petals, in centimetres.
• Based on the combination of these four features, Fisher developed a linear discriminant
model to distinguish the species from each other.
• This data set became a typical test case for many statistical classification techniques in
machine learning.
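The iris data set ships with base R, so the description above can be checked directly. A minimal sketch:

```r
data(iris)            # loads the built-in data frame (also available without this call)

str(iris)             # 150 observations: 4 numeric measurements plus Species
table(iris$Species)   # number of samples per species
summary(iris[, 1:4])  # summary of the four measurements (in centimetres)
```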
[Figure: Parts of a flower]
[Figure: Three species of Iris: Iris setosa, Iris versicolor, Iris virginica]
Apply family in R
• The apply family consists of functions that apply another function across the elements of a data structure, so you can avoid writing explicit loops. Below are the most common forms of apply functions.
• apply()
• lapply()
• sapply()
• tapply()
• mapply()
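mapply() is the multivariate member of the family: it calls a function with the first elements of several vectors, then with the second elements, and so on. A minimal sketch, with input values made up purely for illustration:

```r
# Hypothetical inputs, chosen only for illustration
lengths_cm <- c(2, 3, 4)
widths_cm  <- c(5, 6, 7)

# Calls the function with (2, 5), then (3, 6), then (4, 7)
areas <- mapply(function(l, w) l * w, lengths_cm, widths_cm)
print(areas)

# rep() is called as rep(1, 3), rep(2, 2), rep(3, 1); the results differ
# in length, so mapply() returns a list instead of a vector
reps <- mapply(rep, 1:3, 3:1)
print(reps)
```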
apply()
• The apply() function is used to apply a function to the rows or columns of matrices, arrays and data frames.
• It assembles the returned values into a vector (or into a matrix or list when the function returns more than one value per row or column) and returns the result.
• If you want to apply a function on a data frame, make sure that the data frame is homogeneous (i.e. either all numeric values or all character strings), because apply() first converts the data frame to a matrix.

REF : https://www.learnbyexample.org/
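A short sketch of apply() on a small matrix with made-up values. The second argument (MARGIN) selects rows (1) or columns (2). The last lines illustrate why a mixed data frame is a problem.

```r
# A small matrix with made-up values: 3 rows, 2 columns
m <- matrix(1:6, nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b")))

row_totals <- apply(m, 1, sum)    # MARGIN = 1: one result per row
col_means  <- apply(m, 2, mean)   # MARGIN = 2: one result per column
col_ranges <- apply(m, 2, range)  # two values per column -> 2 x 2 matrix

print(row_totals)
print(col_means)
print(col_ranges)

# A data frame is converted to a matrix first, so a single character
# column turns every value into a string
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
print(apply(df, 2, class))
```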
lapply()
• The lapply() function is used to apply a function to each element of a list or vector.
• It collects the returned values into a list, and then returns that list.

REF : https://www.learnbyexample.org/
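A minimal sketch of lapply(); the list scores and its values are hypothetical and exist only for this example.

```r
# A list of numeric vectors with made-up values
scores <- list(quiz = c(7, 9, 8),
               assignment = c(15, 18),
               exam = c(62, 71, 55, 80))

# One mean per list element; the result is again a list with the same names
means <- lapply(scores, mean)
str(means)

# An anonymous function works the same way
spreads <- lapply(scores, function(x) max(x) - min(x))
str(spreads)
```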
sapply()
• sapply() and lapply() work basically the same.
• The only difference is that lapply() always returns a list, whereas sapply() simplifies the result to a vector or matrix when possible (and returns a list otherwise).

REF : https://www.learnbyexample.org/
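The difference in return type can be seen side by side. A small sketch on a hypothetical list of made-up values:

```r
# A list of numeric vectors with made-up values
scores <- list(quiz = c(7, 9, 8),
               assignment = c(15, 18),
               exam = c(62, 71, 55, 80))

as_list   <- lapply(scores, mean)   # list of three numbers
as_vector <- sapply(scores, mean)   # named numeric vector
as_matrix <- sapply(scores, range)  # 2 x 3 matrix, one column per element

print(as_vector)
print(as_matrix)

# Results of different lengths cannot be simplified, so sapply() returns a list
fallback <- sapply(scores, function(x) x[x > 8])
str(fallback)
```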
tapply()

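• The tapply() function is used to apply a function to subsets of a vector, where the subsets are defined by the levels of a factor (or a combination of factors).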

REF : https://www.learnbyexample.org/
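tapply() splits a vector into groups given by a factor and applies a function to each group. A minimal sketch on the built-in iris data, computing one value per species:

```r
# Mean petal length for each of the three species
mean_petal <- tapply(iris$Petal.Length, iris$Species, mean)
print(mean_petal)

# Any summary function can be used, for example the largest sepal width
max_sepal_width <- tapply(iris$Sepal.Width, iris$Species, max)
print(max_sepal_width)
```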
dplyr package in R
1. select() - used to select columns of a data frame
2. filter() - used to filter a subset of rows from a data frame; filtering can be done using multiple conditions
3. arrange() - used to order the rows of a data frame by one or more columns, say, by date
4. rename() - used to rename variables
5. mutate() - used to add new variables to the dataset
6. sample_n() - used to select random rows from a data frame (slice_sample() in dplyr 1.0 and later)
7. count() - used to count the number of rows at each level of a factor; similar to the table() function
8. group_by() - used to group data by one or more variables
9. summarise() - used to summarise variables; the most powerful function for exploratory data analysis (EDA)
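The dplyr verbs are usually chained with the pipe operator %>%. The sketch below assumes the dplyr package is installed and uses the built-in iris data; the new column names (sepal_len, sepal_wid, sepal_ratio) and the filter thresholds are made up for illustration.

```r
library(dplyr)

iris_summary <- iris %>%
  select(Species, Sepal.Length, Sepal.Width) %>%     # keep three columns
  filter(Sepal.Length > 5, Sepal.Width < 4) %>%      # two conditions at once
  rename(sepal_len = Sepal.Length,
         sepal_wid = Sepal.Width) %>%                # new_name = old_name
  mutate(sepal_ratio = sepal_len / sepal_wid) %>%    # add a new variable
  group_by(Species) %>%                              # one group per species
  summarise(n = n(), mean_ratio = mean(sepal_ratio)) %>%
  arrange(desc(mean_ratio))                          # largest ratio first

print(iris_summary)

count(iris, Species)   # rows per level, similar to table(iris$Species)
sample_n(iris, 5)      # five random rows
```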
Summary : what we have learnt
• How to write programs more efficiently in R using -
1. apply()
2. lapply()
3. sapply()
4. tapply()
5. mapply()
• How to manage a data frame using functions of the dplyr package
This concludes the session: Introduction to R

Next session: Graphics with GGplot