Module 5 Introduction To R Programming

Uploaded by

Hareesh bly


Module - V

Introduction to R Programming

R Programming:

1. Basics of R
2. Installation of RStudio
3. Vectors
4. Matrices
5. Data types
6. Importing files
7. Writing files
8. Merging Files
9. Data Manipulation
10. Creation and Deletion of New Variables
11. Sorting of Data
12. Functions
13. Graphical Presentation and Descriptive Statistics

What is R programming?
• R is a programming language and software environment designed for statistical
computing and graphics.
• R is an interpreted language: code is executed line by line, without a separate
compilation step.
• R is mainly used in data analysis and research.
• R supports procedural programming with functions and, for some functions,
object-oriented programming with generic functions.
• It is widely used as statistical software and as a data analysis tool.

Reasons to Learn R Programming

R is an open-source programming language and software environment widely
used for statistical analysis, data visualization, and machine learning. Its
user-friendly interface and extensive libraries make it an excellent choice for
researchers, statisticians, and data scientists. Here’s why R is preferred:
• Free and Open-Source: R is free for everyone to use, and users can modify,
share, and redistribute it freely.
• Designed for Data: R is built for data analysis, offering a comprehensive set of
tools for statistical computing and graphics.
• Large Package Repository: The Comprehensive R Archive Network
(CRAN) offers thousands of add-on packages for specialized tasks.
• Cross-Platform Compatibility: R runs on Windows, macOS, and Linux
operating systems.
• Great for Visualization: With packages like ggplot2, R makes it easy to create
informative charts and plots; add-on packages such as plotly can make them
interactive.

History of ‘R’ Programming:

Ross Ihaka and Robert Gentleman

• 1991 - Ross Ihaka and Robert Gentleman begin work on a new dialect of S as a
research project for the Department of Statistics at the University of Auckland.
• 1993 - The first announcement of R reaches the public via the data archive
StatLib and the s-news mailing list.
• 1995 - Fellow statistician Martin Mächler convinces R’s inventors to release the
language under the GNU General Public License, making R both free to use and
open-source.
• 1996 - Ihaka and Gentleman publish their seminal paper introducing R to the
world.
• 1997 - The R Core Team is formed. This group is the only one with write access
to the R source code; its members review and enact any suggested changes to the
language.
• 1997 - The Comprehensive R Archive Network (CRAN) is founded. This repository
of open-source R software packages, which extend the language itself, helps
professionals with myriad tasks.
• 2000 - R version 1.0.0 is released to the public.
• 2003 - The R Foundation is formed to hold and administer the R software
copyright and to provide support for the R language project.
• 2004 - R version 2.0.0 is released.
• 2009 - The R Journal, an open-access journal for statistical computing and
research, is established.
• 2013 - R version 3.0.0 is released.
• 2020 - R version 4.0.0 is released.
• June 2023 - R version 4.3.1 is released; it was the current version when these
notes were written.

Features of R Programming language

1. Statistical Analysis: R provides a wide array of statistical techniques,
including linear and nonlinear modeling, time-series analysis, and clustering,
making it a robust tool for data analysis.

2. Data Visualization: One of R's standout features is its ability to create
high-quality graphics and visualizations. Packages like ggplot2 allow users to
produce complex and aesthetically pleasing plots with ease.

3. Cross-Platform Compatibility: R runs smoothly on Windows, macOS, and Linux.

4. Reproducibility: Tools like R Markdown facilitate combining code, output,
and text for fully reproducible research and reports.

5. Open Source: Being a GNU project, R is open-source, which means it is free
to use and has a large community contributing to its development. This
fosters collaboration and innovation within the user community.

6. Data Handling: R is designed to handle and manipulate large datasets
efficiently, although it can be memory-intensive with very large datasets.

7. Integration with Other Languages: R can be integrated with other
programming languages like C, C++, and Python, allowing users to leverage
the strengths of multiple languages in their projects.

8. User-Friendly Syntax: While R has a learning curve, its syntax is designed
to be user-friendly, especially for those familiar with statistical concepts,
making it accessible for statisticians and data analysts.

Applications of R
R is used in a variety of fields, including:
 Data Science and Machine Learning: R is widely used for data analysis,
statistical modeling and machine learning tasks.
 Finance: Financial analysts use R for quantitative modeling and risk analysis.
 Healthcare: In clinical research, R helps analyze medical data and test
hypotheses.
 Academia: Researchers and statisticians use R for data analysis and
publishing reproducible research.

Advantages of R Programming
• Comprehensive Statistical Tools: R includes many statistical functions and
models, making it a strong choice for data analysis.
• Customizable Visualizations: R’s visualization tools allow customization of
anything from a simple bar chart to a detailed heatmap.
• Extensive Community Support: R has a large user base, and there are
countless resources, forums, and tutorials available.
• Highly Extendable: The availability of over 15,000 R packages means we can
extend R's functionality to suit most projects and needs.

Disadvantages of R Programming
 Memory Intensive: R can be slow with very large datasets, consuming a lot
of memory.
 Limited Support for Error Handling: Unlike some other programming
languages, R has less robust error handling features.
 Steeper Learning Curve: Beginners might face challenges with some of R’s
complex features and syntax.
 Performance: R’s performance can lag behind languages like Python or C++
when it comes to speed, especially for large-scale operations.

Installation of RStudio

https://teacherscollege.screenstepslive.com/a/1108074-install-r-and-rstudio-for-windows

Array:
An array is a linear data structure in which all elements are arranged
sequentially. It is a collection of elements of the same data type stored at
contiguous memory locations.

Note: Contiguous memory allocation is a memory allocation technique in which a
single continuous block of memory is allotted, so the stored elements sit next to
one another.
Vectors
In R programming, a vector is one of the most fundamental data structures. It is a
one-dimensional array that can hold elements of the same type, such as numeric,
character, or logical values. Vectors are used extensively in R for data manipulation
and analysis.

Characteristics of Vectors
 Homogeneous: All elements in a vector must be of the same type (e.g., all
numeric or all character).
 One-dimensional: Vectors are essentially one-dimensional arrays.
 Indexing: Elements in a vector can be accessed using indices, which start
from 1 in R.
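
The characteristics above can be checked with a short sketch. The vector names
(scores, mixed) are illustrative and do not come from the original notes.

```r
# Indexing starts at 1, not 0
scores <- c(85, 90, 78)
first <- scores[1]               # 85
last <- scores[length(scores)]   # 78
picked <- scores[c(1, 3)]        # 85 78

# Homogeneous: mixing types coerces every element to one common type
mixed <- c(1, "a", TRUE)
class(mixed)                     # "character"
print(mixed)                     # "1" "a" "TRUE"
```

Because a vector can hold only one type, R silently converts the number and the
logical value to character strings instead of raising an error.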

Creating Vectors
There are several ways to create vectors in R:
1. Using the c() function: The most common method, where c() stands for
"combine."

numeric_vector = c(1, 2, 3, 4, 5)
character_vector = c("apple", "banana", "cherry")
logical_vector = c(TRUE, FALSE, TRUE)

2. Using the seq() function: For creating sequences.

seq_vector = seq(1, 10, by = 2)
# Creates a sequence from 1 to 10 with a step of 2, Output: 1 3 5 7 9

3. Using the rep() function: To replicate values.

rep_vector = rep(1:3, times = 3)
# Replicates the sequence 1, 2, 3 three times, Output: 1 2 3 1 2 3 1 2 3

Matrix in R:

In R programming, a matrix is a two-dimensional, homogeneous data structure: it
holds elements of the same data type, organized in rows and columns. Rows run
horizontally and columns run vertically. Matrices are particularly useful for
mathematical computations and data analysis, especially in linear algebra and
statistical modeling.

Characteristics of Matrices

 Two-dimensional: Matrices consist of rows and columns.

 Homogeneous: All elements in a matrix must be of the same type (numeric,
character, etc.).
 Accessing Elements: Elements can be accessed using row and column
indices.

Creating Matrices

You can create matrices in R using the matrix() function or by binding vectors
together. Here are a few methods to create matrices:

1. Using the matrix() function:

You specify the data, the number of rows and columns, and the fill order (by row
or by column).

Syntax:
matrix(data, nrow, ncol, byrow = FALSE)

• data : the values to place in the matrix
• nrow : number of rows
• ncol : number of columns
• byrow : logical; if TRUE, the matrix is filled row by row. The default, FALSE,
fills it column by column.

Example:
data = 1:6
my_matrix = matrix(data, nrow = 2, ncol = 3)
print(my_matrix)

Output:
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
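
To see what the byrow argument changes, and how row and column indices access
elements, here is a minimal sketch. The names by_col and by_row are chosen for
illustration only.

```r
by_col <- matrix(1:6, nrow = 2, ncol = 3)                # default: fill column by column
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)  # fill row by row

by_col[1, ]    # first row of the column-filled matrix: 1 3 5
by_row[1, ]    # first row of the row-filled matrix:    1 2 3
by_row[, 2]    # second column: 2 5
by_row[2, 3]   # element in row 2, column 3: 6
dim(by_row)    # number of rows and columns: 2 3
```

Leaving an index empty, as in by_row[1, ], selects the whole row or column.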

2. Using the rbind() and cbind() functions:

These functions bind vectors together as the rows (rbind) or columns (cbind) of a
matrix.

Example:

row1 = c(1, 2, 3)
row2 = c(4, 5, 6)
my_matrix2 = rbind(row1, row2) # Bind rows
print(my_matrix2)

Output:
     [,1] [,2] [,3]
row1    1    2    3
row2    4    5    6

Example:

col1 <- c(1, 4)
col2 <- c(2, 5)
my_matrix3 = cbind(col1, col2) # Bind columns
print(my_matrix3)

Output:
     col1 col2
[1,]    1    2
[2,]    4    5

Data Types in R Programming

Data types in R define the kind of values that variables can hold. Choosing
the right data type helps optimize memory usage and computation. Unlike some
languages, R does not require explicit data type declarations, and a variable can
change its type dynamically during execution.
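
As a small illustration of dynamic typing, the same variable can hold values of
different types over time. The variable name value is an arbitrary choice for this
sketch.

```r
value <- 42L           # starts as an integer
class(value)           # "integer"

value <- "forty-two"   # the same name now holds a character string
class(value)           # "character"

value <- 3 > 2         # and now the result of a comparison
class(value)           # "logical"
```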

R has the following basic data types. The table shows each data type, the values
it can take, and an example.

Basic Data Type   Values                                 Example
Numeric           Set of all real numbers                numeric_value = 3.14
Integer           Set of all integers, Z                 integer_value = 42L
Logical           TRUE and FALSE                         logical_value = TRUE
Complex           Set of complex numbers                 complex_value = 1 + 2i
Character         "a", "b", "c", ..., "@", "#", "$",     character_value = "Hello Geeks"
                  ..., "1", "2", ... etc.
Raw               Bytes, created with as.raw()           single_raw = as.raw(255)

1. Numeric Data type in R

Decimal values are called numeric in R, and numeric is the default data type for
numbers. If we assign a number to a variable x as follows, x will be of numeric
type, even when the value has no fractional part. R stores numeric values as
double-precision floating-point numbers.

Example:
x = 5
print(class(x))    # Output: [1] "numeric"
print(typeof(x))   # Output: [1] "double"

2. Integer Data type in R

R supports the integer data type, which covers the set of all integers. We can
create an integer, or convert a value into one, using the as.integer() function.
We can also append the capital 'L' suffix to a number to denote that it is an
integer.

Example:
x = as.integer(5)
print(class(x))    # Output: [1] "integer"
print(typeof(x))   # Output: [1] "integer"

y = 5L
print(class(y))    # Output: [1] "integer"
print(typeof(y))   # Output: [1] "integer"

3. Logical Data type in R

R has a logical data type that takes one of two values, TRUE or FALSE. A logical
value is often created by comparing variables.

Example:
x = 4
y = 3
z = x > y
print(z)           # Output: [1] TRUE
print(class(z))    # Output: [1] "logical"
print(typeof(z))   # Output: [1] "logical"

4. Complex Data type in R :

R supports the complex data type, which covers the set of all complex numbers. It
is used to store numbers with an imaginary component.

Example:
x = 4 + 3i
print(class(x))    # Output: [1] "complex"
print(typeof(x))   # Output: [1] "complex"

5. Character Data type in R :

R supports the character data type, which stores character values or strings.
Strings in R can contain letters, numbers, and symbols.
The easiest way to denote that a value is of character type is to wrap it in
single or double quotation marks.

Example:
char = "Geeksforgeeks"
print(class(char))    # Output: [1] "character"
print(typeof(char))   # Output: [1] "character"

Common tasks with R data types include finding the type of a value, verifying
that a value has a particular type, and converting a value from one type to
another.
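
A minimal sketch of checking and converting types with base R functions follows.
The variable names are illustrative; is.character(), is.numeric(), as.numeric(),
as.integer(), and as.logical() are standard base R functions.

```r
x <- "123"
is.character(x)        # TRUE
is.numeric(x)          # FALSE

n <- as.numeric(x)     # convert character to numeric
class(n)               # "numeric"
n + 1                  # 124

i <- as.integer(3.9)   # conversion to integer drops the fractional part
i                      # 3

flags <- as.logical(c("TRUE", "false", "yes"))
flags                  # TRUE FALSE NA ("yes" is not a recognized logical string)
```

A conversion that cannot be carried out yields NA rather than stopping the
program, so it is worth checking converted values for missing entries.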

6. Raw data type in R :

To store and work with data at the byte level in R, use the raw data type. By
representing data as a sequence of unprocessed bytes, it enables low-level
operations on binary data. The following example creates a raw vector:

Example:
x = as.raw(c(0x1, 0x2, 0x3, 0x4, 0x5))
print(x)

Output: [1] 01 02 03 04 05

Importing files
Importing files in R programming is the process of loading external datasets—
such as those stored in CSV, Excel, text, or database files—into R’s working
environment for analysis and manipulation. There are several methods and
functions available, depending on the file type and your workflow.
1. Importing CSV Files
CSV (Comma-Separated Values) files are widely used for storing tabular data. You
can use the read.csv() function to import these files.

Example:
# Import a CSV file
data_csv = read.csv("path/to/your/file.csv")
# Display the first few rows of the data
head(data_csv)
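
The path in the example is a placeholder. For a version that runs without any
existing file, the sketch below first writes a small CSV to a temporary file and
then imports it. The file contents and the name csv_path are invented for
illustration.

```r
# Write a tiny CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Alice,85", "2,Bob,90"), csv_path)

# header = TRUE treats the first line as column names
data_csv <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

str(data_csv)     # structure: a data frame with 2 rows and 3 columns
names(data_csv)   # "id" "name" "score"
nrow(data_csv)    # 2
```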

2. Importing Excel Files

To import Excel files (.xls or .xlsx formats), you can utilize the readxl package. If you
haven't installed this package yet, you can do so using the following command:
install.packages("readxl")
After installation, you can use the read_excel() function.

Example:
library(readxl)
# Import an Excel file
data_excel = read_excel("path/to/your/file.xlsx")
# Display the first few rows of the data
head(data_excel)

3. Importing Text Files

For text files, such as tab-delimited files, you can use the read.table() function or
the read.delim() function, which is specifically designed for tab-delimited files.

Example:
# Import a tab-delimited text file
data_text = read.delim("path/to/your/file.txt")
# Display the first few rows of the data
head(data_text)
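
When a text file uses a delimiter other than a tab, read.table() accepts the
delimiter through its sep argument. The sketch below assumes a semicolon-delimited
file created on the fly; the data and the name txt_path are hypothetical.

```r
# Create a small semicolon-delimited text file in a temporary location
txt_path <- tempfile(fileext = ".txt")
writeLines(c("id;city", "1;Pune", "2;Delhi"), txt_path)

# sep sets the field delimiter; header = TRUE reads the first line as column names
data_text <- read.table(txt_path, header = TRUE, sep = ";",
                        stringsAsFactors = FALSE)

data_text$city   # "Pune" "Delhi"
```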

4. Importing Data from Other Formats

R also supports importing data from other formats like JSON and XML. For JSON files,
you can use the jsonlite package.
Example:
install.packages("jsonlite")
library(jsonlite)
# Import a JSON file
data_json = fromJSON("path/to/your/file.json")
# Display the structure of the data
str(data_json)
Writing files
R is a powerful language used for data analytics in many fields. Data analysis
involves reading data from, and writing data to, various kinds of files such as
Excel, CSV, and text files. This section covers ways of writing data to different
types of files in R.

1. Writing Data to CSV files in R Programming Language

CSV stands for Comma-Separated Values. These files are commonly used to store
tabular data. The syntax for writing to a CSV file is:

Syntax:
write.csv(my_data, file = "my_data.csv")
write.csv2(my_data, file = "my_data.csv")
Here,
write.csv() and write.csv2() are base R functions.
• write.csv() uses "." for the decimal point and a comma (",") for the
separator.
• write.csv2() uses a comma (",") for the decimal point and a semicolon
(";") for the separator.
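
As a quick check that a written file can be read back unchanged, here is a small
round-trip sketch. The data frame and the temporary path out_path are made up for
the example; row.names = FALSE is a standard write.csv() argument that stops R
from writing the row numbers as an extra column.

```r
my_data <- data.frame(id = 1:3, score = c(85.5, 90, 78.25))

# Write to a temporary file instead of the working directory
out_path <- tempfile(fileext = ".csv")
write.csv(my_data, file = out_path, row.names = FALSE)

# Read the file back and compare it with the original data frame
check <- read.csv(out_path)
all.equal(my_data, check)   # TRUE
```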

2. Writing Data to text files

Text files are used in almost every application. Writing to .txt files is very
similar to writing CSV files. The syntax for writing to a text file is:

Syntax:
write.table(my_data, file = "my_data.txt", sep = "\t")

Here, sep sets the field separator; "\t" writes a tab-delimited file. An empty
string (sep = "") would write the values of each row with nothing between them.

3. Writing Data to Excel files:

To write data to Excel, we need to install the xlsx package, a Java-based
solution for reading, writing, and committing changes to Excel files. Because it
is Java-based, it needs a Java runtime on the machine. It can be installed as
follows:
install.packages("xlsx")
It can then be loaded and used as follows:
library("xlsx")

write.xlsx(my_data, file = "result.xlsx",
           sheetName = "my_data", append = FALSE)

Merging Files:

Merging files refers to the process of combining the contents of two or more files
into a single file. This is commonly done in programming, data analysis, and version
control to consolidate information, resolve differences, or create a unified dataset.

Merging files in R typically involves combining two or more data frames based on
common columns. This is a common task in data analysis, allowing you to create
richer datasets by integrating information from different sources. R provides several
methods for merging data, including the base merge() function, the dplyr package,
and the data.table package.

1. Using the merge() Function

The merge() function in base R is a straightforward way to combine data frames. It
supports various types of joins, such as inner join, left join, right join, and full outer
join.

Example:

# Creating two data frames

df1 = data.frame(ID = c(1, 2, 3, 4), Name = c("A", "B", "C", "D"),
                 Age = c(25, 30, 35, 40))
df2 = data.frame(ID = c(2, 3, 4, 5),
                 Occupation = c("Engineer", "Teacher", "Doctor", "Lawyer"),
                 Salary = c(5000, 4000, 6000, 7000))

# Inner join (default behavior): keeps only the IDs present in both data frames

inner_join_result = merge(df1, df2, by = "ID")
print(inner_join_result)
Output:

  ID Name Age Occupation Salary
1  2    B  30   Engineer   5000
2  3    C  35    Teacher   4000
3  4    D  40     Doctor   6000
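
The other join types mentioned above are selected in merge() with the all.x,
all.y, and all arguments. The sketch below uses cut-down versions of the two data
frames, and the result names are illustrative.

```r
df1 <- data.frame(ID = c(1, 2, 3, 4), Name = c("A", "B", "C", "D"))
df2 <- data.frame(ID = c(2, 3, 4, 5), Salary = c(5000, 4000, 6000, 7000))

left_result  <- merge(df1, df2, by = "ID", all.x = TRUE)  # keep every row of df1
right_result <- merge(df1, df2, by = "ID", all.y = TRUE)  # keep every row of df2
full_result  <- merge(df1, df2, by = "ID", all = TRUE)    # keep rows from both

left_result$ID       # 1 2 3 4
right_result$ID      # 2 3 4 5
nrow(full_result)    # 5
left_result$Salary   # NA 5000 4000 6000 (ID 1 has no match in df2)
```

Rows without a match in the other data frame are kept and padded with NA.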

2. Using the dplyr Package

The dplyr package provides a more intuitive syntax for merging data frames, using
functions like inner_join(), left_join(), right_join(), and full_join().

Example:

# Load dplyr package

library(dplyr)

# Creating the same data frames

df1 <- data.frame(ID = c(1, 2, 3, 4), Name = c("A", "B", "C", "D"),
                  Age = c(25, 30, 35, 40))
df2 <- data.frame(ID = c(2, 3, 4, 5),
                  Occupation = c("Engineer", "Teacher", "Doctor", "Lawyer"),
                  Salary = c(5000, 4000, 6000, 7000))

# Left join: keeps every row of df1
left_join_result <- left_join(df1, df2, by = "ID")
print(left_join_result)

Output:

  ID Name Age Occupation Salary
1  1    A  25       <NA>     NA
2  2    B  30   Engineer   5000
3  3    C  35    Teacher   4000
4  4    D  40     Doctor   6000

3. Using the data.table Package

The data.table package is optimized for speed and memory efficiency, especially
with large datasets. Its merge() method uses the same syntax as the base R merge()
function but is generally faster.

Example:

# Load data.table package

library(data.table)

# Creating the same data tables

dt1 <- data.table(ID = c(1, 2, 3, 4), Name = c("A", "B", "C", "D"),
                  Age = c(25, 30, 35, 40))
dt2 <- data.table(ID = c(2, 3, 4, 5),
                  Occupation = c("Engineer", "Teacher", "Doctor", "Lawyer"),
                  Salary = c(5000, 4000, 6000, 7000))

# Full outer join: keeps the rows of both tables

full_join_result <- merge(dt1, dt2, by = "ID", all = TRUE)
print(full_join_result)

Output (row labels omitted):

ID Name Age Occupation Salary
 1    A  25       <NA>     NA
 2    B  30   Engineer   5000
 3    C  35    Teacher   4000
 4    D  40     Doctor   6000
 5 <NA>  NA     Lawyer   7000

Data manipulation
Data manipulation in R programming refers to the process of transforming,
organizing, and preparing data for analysis. It involves tasks such as selecting
specific data, filtering rows, modifying data, summarizing information, and
reshaping datasets. R provides a variety of functions and packages (like dplyr, tidyr,
and base R functions) to perform efficient data manipulation.

Key Concepts of Data Manipulation in R:

1. Selecting Data: Choosing specific columns or rows.

2. Filtering Data: Subsetting data based on conditions.
3. Mutating Data: Creating new variables or modifying existing ones.
4. Summarizing Data: Aggregating data to find summaries.
5. Reshaping Data: Changing data structure between wide and long formats.
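
Reshaping is the one concept in this list that is easy to misread without an
example, so here is a minimal sketch using the base R reshape() function. The data
frame wide and its values are hypothetical; packages such as tidyr offer
alternative functions for the same task.

```r
# Wide format: one row per student, one column per subject
wide <- data.frame(
  Name = c("Alice", "Bob"),
  Math = c(85, 70),
  Science = c(90, 95)
)

# Wide to long: one row per student-subject pair
long <- reshape(
  wide,
  direction = "long",
  varying = c("Math", "Science"),  # columns to stack
  v.names = "Score",               # name of the stacked value column
  timevar = "Subject",             # column recording which subject a score came from
  times = c("Math", "Science"),    # labels written into the Subject column
  idvar = "Name"                   # column that identifies each student
)
rownames(long) <- NULL
print(long)
```

The long table has four rows (two students times two subjects) and the columns
Name, Subject, and Score.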

Example: Data Manipulation Using Base R and dplyr

Suppose we have a dataset about students’ marks:

# Sample data frame

students <- data.frame(
StudentID = 1:5,
Name = c("Alice", "Bob", "Charlie", "David", "Eva"),
Subject = c("Math", "Science", "Math", "History", "Science"),
Score = c(85, 90, 78, 88, 92),
Pass = c(TRUE, TRUE, FALSE, TRUE, TRUE)
)

StudentID  Name     Subject  Score  Pass
1          Alice    Math     85     TRUE
2          Bob      Science  90     TRUE
3          Charlie  Math     78     FALSE
4          David    History  88     TRUE
5          Eva      Science  92     TRUE

Tasks:

1. Select only Name and Score columns.

2. Filter students who passed.
3. Create a new column indicating whether the student scored above 80.
4. Find the average score of students in Science.

R Code for Data Manipulation:

Using base R:

# 1. Select Name and Score columns

selected <- students[, c("Name", "Score")]

# 2. Filter students who passed

passed_students <- students[students$Pass == TRUE, ]

# 3. Create a new column 'HighScore' indicating if Score > 80

students$HighScore <- students$Score > 80

# 4. Calculate average Score for Science subject

science_scores <- students$Score[students$Subject == "Science"]
avg_science_score <- mean(science_scores)

print(selected)
print(passed_students)
print(students)
print(paste("Average Science Score:", avg_science_score))
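
Task 4 averages a single subject. To summarize every subject at once, base R
offers aggregate(). The sketch below uses a separate data frame named marks (a
hypothetical name) that holds the same Subject and Score values, so it does not
overwrite the students data frame.

```r
marks <- data.frame(
  Subject = c("Math", "Science", "Math", "History", "Science"),
  Score = c(85, 90, 78, 88, 92)
)

# Mean Score for each Subject; the formula reads "Score grouped by Subject"
avg_by_subject <- aggregate(Score ~ Subject, data = marks, FUN = mean)
print(avg_by_subject)
#   Subject Score
# 1 History  88.0
# 2    Math  81.5
# 3 Science  91.0
```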

Using dplyr package (more readable):

library(dplyr)

# 1. Select Name and Score

selected <- students %>%
  select(Name, Score)

# 2. Filter students who passed

passed_students <- students %>%
  filter(Pass == TRUE)

# 3. Add 'HighScore' column

students <- students %>%
  mutate(HighScore = Score > 80)

# 4. Compute average score in Science

avg_science_score <- students %>%
  filter(Subject == "Science") %>%
  summarise(AverageScore = mean(Score))

print(selected)
print(passed_students)
print(students)
print(avg_science_score)

Output of print(selected), the selected columns (Name and Score):

     Name Score
1   Alice    85
2     Bob    90
3 Charlie    78
4   David    88
5     Eva    92

https://www.geeksforgeeks.org/r-language/data-manipulation-in-r-with-dplyr-package/

Creation and Deletion of New Variables

