R Project

This document provides an index and overview of topics covered in an R programming assignment, including: 1) An introduction to R programming, outlining its origins and key features such as static graphics, probability distributions, data analysis, and distributed computing. 2) Descriptions of different data types in R including numeric, integer, logical, complex, and character. 3) An overview of vectors as basic building blocks in R, including numeric, logical, and character vectors and ways to create them. 4) A discussion of different types of operators in R like arithmetic, logical, relational, assignment, and miscellaneous operators. 5) An explanation of functions in R, distinguishing between built

Uploaded by

Varob
R PROGRAMMING ASSIGNMENT

Submitted By: Ritik Gujre
Course: BCA (AI and DS) 3rd Sem
Roll Number: 2101060010
Submitted To: Chandan Sir
INDEX

S.No Topic Sign


1. Introduction to R programming
2. Different data types in R programming
3. Vectors in R programming
4. Operators in R programming
5. Functions in R programming
6. List, Matrices and Data Frame in R programming
7. Subsetting in R programming
8. Importing CSV file
9. Package installation
10. Types of graphs
11. Linear Regression
1. INTRODUCTION TO R PROGRAMMING
R is an open-source programming language that is widely used for statistical computing
and data analysis. R is usually run from a command-line interface and is
available on widely used platforms such as Windows, Linux, and macOS.
It was designed by Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand, and is currently
developed by the R Core Team. R is an
implementation of the S programming language combined with lexical scoping
semantics inspired by Scheme. The project was conceived in 1992, with an
initial version released in 1995 and a stable version (1.0.0) in 2000.
Features of R Programming:
1. Static graphics: R has rich facilities for creating static graphics. It supports
many plot types, including graphic maps, mosaic plots, and biplots.
2. Probability distributions: Probability distributions play a vital role in statistics, and
R makes it easy to work with many of them, such as the
Binomial Distribution, Normal Distribution, and Chi-squared Distribution.
3. Data analysis: R provides a large, coherent, and integrated collection of tools for
data analysis.
4. R Packages: One of the major features of R is the wide availability of libraries.
R has CRAN (Comprehensive R Archive Network), a repository holding
more than 10,000 packages.
5. Distributed Computing: Distributed computing is a model in which components
of a software system are shared among multiple computers to improve efficiency
and performance. Two packages for distributed programming in R, ddR and
multidplyr, were released in November 2015.
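As a small illustration of the probability distribution support mentioned in the list above, the following sketch uses the base R functions dbinom(), pnorm(), and qchisq(). The variable names and the parameter values are assumptions chosen only for this example.

```r
# Probability of exactly 3 heads in 10 fair coin tosses (binomial density)
p_three_heads <- dbinom(3, size = 10, prob = 0.5)

# Probability that a standard normal value is below 1.96 (cumulative distribution)
p_below <- pnorm(1.96)

# 95th percentile of a chi-squared distribution with 2 degrees of freedom
q_chisq <- qchisq(0.95, df = 2)

print(round(c(p_three_heads, p_below, q_chisq), 4))
```

Each distribution in R follows the same naming pattern: a d prefix for the density, p for the cumulative probability, q for quantiles, and r for random draws.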
2. DIFFERENT DATA TYPES:

1. Numeric: Consists of all real numbers.
Examples: 1, 1.2, -3, etc.
Code:
x <- 3.5
print(class(x))
output:
[1] "numeric"

2. Integer: Consists of all positive and negative whole numbers. The suffix L is used to
distinguish an integer from the numeric data type.
Examples: 1L, -1L, etc.
Code:
x <- 3L
print(class(x))
output:
[1] "integer"

3. Logical: Consists of TRUE and FALSE as values.
Examples: TRUE, FALSE
Code:
x <- FALSE
print(class(x))
output:
[1] "logical"

4. Complex: Consists of complex numbers, which have a real part and an imaginary part.
Examples: 1 + 3i, 2 + 4i, etc.
Code:
x <- 3 + 5i
print(class(x))
output:
[1] "complex"

5. Character: Consists of letters, digits, and special characters. A character value is
always enclosed within quotation marks.
Examples: "a", "%", "FALSE", "343", etc.
Code:
x <- "2341"
print(class(x))
output:
[1] "character"
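The data types above can also be tested and converted at run time. The following is a minimal sketch using the base R functions class(), is.integer(), as.integer(), and as.numeric(); the variable names are assumptions made for illustration.

```r
# A number typed without the L suffix is stored as numeric (double)
a <- 7
b <- 7L

print(class(a))          # "numeric"
print(class(b))          # "integer"

# is.*() functions test a type; as.*() functions convert between types
print(is.integer(a))     # FALSE
converted <- as.integer("2341")
print(class(converted))  # "integer"

# Logical values convert to 1 (TRUE) and 0 (FALSE)
print(as.numeric(TRUE))  # 1
```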
3. VECTORS
A vector is one of the basic building blocks of R objects. It contains primitive values of the same
type. A vector can be a group of numbers, text strings, TRUE/FALSE values, or values of some other type.
Several types of vectors exist in R. The most commonly used vectors are:
1. numeric vectors
2. logical vectors
3. character vectors

1. Numeric Vector: It is the most commonly used vector. It is a type of vector which stores
numeric values.
The following are the ways to create a numeric vector:
1. Using the assignment operators: equal sign (=), leftward arrow (<-), and rightward arrow (->).
Ex:
# equal operator
x = 1.5
print(x)
[1] 1.5
# leftward operator
y <- 2.5
print(y)
[1] 2.5
# rightward operator
3.5 -> z
print(z)
[1] 3.5
2. Using the combine function c():
Ex:
x <- c(1,2,3,4)
print(x)
[1] 1 2 3 4
3. Using the ":" operator
Ex:
x <- c(1:5)
print(x)
[1] 1 2 3 4 5
4. Using the seq() function (start, end, step):
Ex:
x <- seq(1, 10, 2)
print(x)
[1] 1 3 5 7 9

2. Logical Vector: It is a type of vector which consists of only the logical values
TRUE and FALSE. Logical vectors are commonly produced by relational and logical operators.
Ex:
x <- TRUE
print(x)
[1] TRUE
1>2
[1] FALSE
c(1, 2) > 2
[1] FALSE FALSE
1 %in% c(1,2,3)
[1] TRUE

3. Character Vector: This type of vector consists of only text values, which may contain
letters, digits, and special characters.
Ex:
"Hello World"
[1] "Hello World"
c("Hello", "World")
[1] "Hello" "World"
cat("I am studying 'R Programming'")
I am studying 'R Programming'
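Because a vector holds values of a single type, R converts mixed input to one common type when a vector is built. The sketch below illustrates this coercion with c(); the variable names are assumptions for this example.

```r
# Mixing numbers and text: every element becomes character
mixed <- c(1, "a", TRUE)
print(class(mixed))   # "character"

# Mixing logical and numeric: TRUE becomes 1, FALSE becomes 0
nums <- c(TRUE, FALSE, 2)
print(nums)           # 1 0 2

# length() reports how many elements a vector holds
print(length(mixed))  # 3
```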
4. OPERATORS
Operators are symbols that direct the R interpreter to perform various kinds of operations
on operands. They cover the mathematical, logical, and decision
operations performed on complex numbers, integers, and numeric values as input
operands.

1. Arithmetic: These operators are used to perform arithmetic calculations.
Examples: +, -, ^, *, etc.
Code:
v <- 1+2
print(v)
output:
[1] 3

2. Logical: These operators perform element-wise decision operations on the operands,
based on the specified operator.
Examples: &, |, !, etc.
Code:
v <- TRUE & FALSE
print(v)
output:
[1] FALSE

3. Relational: These operators compare operands and return the result as either TRUE or FALSE.
Examples: <, >, ==, etc.
Code:
v <- 12 > 32
print(v)
output:
[1] FALSE

4. Assignment: These operators assign values to variables.
Examples: <-, =, etc.
Code:
a <- 123
print(a)
output:
[1] 123

5. Miscellaneous: These are special-purpose operators, for example : for generating
sequences and %in% for testing whether a value belongs to a vector.
Examples: %in%, :, etc.
Code:
a <- 1 %in% c(1,2,3)
print(a)
output:
[1] TRUE
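Most of the operators above work element by element when their operands are vectors. The following sketch shows this behaviour; the vectors a and b are assumptions chosen for the example.

```r
a <- c(10, 7, 3)
b <- c(3, 2, 3)

print(a + b)          # element-wise addition: 13 9 6
print(a %% b)         # remainder: 1 1 0
print(a %/% b)        # integer division: 3 3 1
print(a == b)         # relational comparison: FALSE FALSE TRUE
print(a > 5 & b < 3)  # element-wise AND: FALSE TRUE FALSE
```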
5. FUNCTIONS
Functions are useful when you want to perform a certain task multiple times. A function accepts
input arguments and produces output by executing the R commands inside the
function. In R, the function name and
the file in which the function is defined need not be the same, and a single R file can contain one or
more function definitions.
There are mainly two types of functions:
1. In-built Functions:
The functions which are already defined in the language are
known as built-in functions. R has a rich set of built-in functions that cover
many common tasks.
Some of the in-built functions are sum(), max(), min()
example:
print(sum(4:6))
output: [1] 15
print(max(4:6))
output: [1] 6
print(min(4:6))
output: [1] 4

2. User-defined Functions:
A function is a block of code that performs a specific task. R allows you to define
functions according to your need. These functions are known as user-defined functions.
User-defined functions can be created with the help of the "function" keyword.
example:
expo <- function(a, b) {
  # return a raised to the power b
  return(a^b)
}
expo(2, 3)
output: [1] 8
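User-defined functions can also declare default values for their arguments and can be called with named arguments. The sketch below shows both; power_of is a hypothetical helper name introduced only for this illustration.

```r
# Hypothetical helper: raise 'base' to 'power'; 'power' defaults to 2
power_of <- function(base, power = 2) {
  base^power
}

print(power_of(4))                    # uses the default power: 16
print(power_of(2, 5))                 # positional arguments: 32
print(power_of(power = 3, base = 2))  # named arguments in any order: 8
print(power_of(1:3))                  # works on a whole vector: 1 4 9
```

When a function has no explicit return() call, R returns the value of the last evaluated expression, which is why the body here is a single line.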
6. LIST, MATRIX AND DATA FRAMES
LISTS:
A list in R is a generic object consisting of an ordered collection of objects. Lists are
one-dimensional, heterogeneous data structures. A list can hold vectors, matrices,
characters, functions, and so on.
A list is a vector with heterogeneous elements. R allows accessing elements of a list
by index value. In R, list indexing starts at 1 rather than 0 as in many other
programming languages.
Creating a list:
To create a list in R, use the list() function. A list is a mutable object,
which means that we can make
changes to it even after its creation.
Example:
l <- list(1,2,3,4)
print(l)
output:
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
Accessing an element of a list:
print(l[1])
output:
[[1]]
[1] 1
Changing an element of the list:
l[1] <- "A"
print(l[1])
output:
[[1]]
[1] "A"
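To illustrate that a list can hold elements of different types, the following sketch builds a named list containing a number, a character vector, and a function. The list name student and its element names are assumptions made for this example.

```r
# A list can mix types: a number, a character vector, and a function
student <- list(id = 1, subjects = c("R", "Statistics"), avg = mean)

# Append a new element by assigning to a new name
student$passed <- TRUE

print(names(student))           # "id" "subjects" "avg" "passed"
print(length(student))          # 4
print(student$avg(c(2, 4, 6)))  # calls the stored function: 4
```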

MATRIX:
A matrix is a rectangular arrangement of numbers in rows and columns. Rows run
horizontally and columns run vertically. In R
programming, matrices are two-dimensional, homogeneous data structures. A matrix can be
created with the help of the matrix() function.
Creating a matrix:
m <- matrix(
c(1,2,3,4),
nrow = 2,
ncol = 2,
byrow = TRUE
)
print(m)
output:
     [,1] [,2]
[1,]    1    2
[2,]    3    4
Accessing the first row of a matrix:
print(m[1, ])
output:
[1] 1 2
Changing element in the matrix:
m[1, 1] <- 5
print(m[1, 1])
output:
[1] 5
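Beyond creating and indexing, base R provides common matrix operations. The sketch below rebuilds the original 2 x 2 matrix (before the element change) and shows dim(), t(), scalar multiplication, and matrix multiplication with %*%.

```r
m <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

print(dim(m))    # number of rows and columns: 2 2
print(t(m))      # transpose: rows become columns
print(m * 2)     # element-wise multiplication by a scalar
print(m %*% m)   # matrix multiplication
print(m[2, 1])   # single element in row 2, column 1: 3
```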

DATAFRAME
Data frames in R are generic data objects used to store tabular
data. A data frame can be thought of as a matrix in which each column can have
a different data type. A data frame is made up of three principal components: the data, the rows,
and the columns.
Creating a dataframe:
df <- data.frame(
df_id = c(1:3),
df_name = c("A", "B", "C")
)
print(df)
output:
  df_id df_name
1     1       A
2     2       B
3     3       C

Accessing data in a data frame:

r <- data.frame(df$df_name)
print(r)
output:
  df.df_name
1          A
2          B
3          C

Adding a column to a data frame:

df$num <- c(5:7)
print(df)
output:
  df_id df_name num
1     1       A   5
2     2       B   6
3     3       C   7
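Once a data frame exists, its size and column types can be inspected, and rows can be filtered with a logical condition. The following is a minimal sketch that rebuilds the data frame from this section; the name big is an assumption for the example.

```r
df <- data.frame(
  df_id = 1:3,
  df_name = c("A", "B", "C"),
  num = 5:7,
  stringsAsFactors = FALSE  # keep df_name as character in all R versions
)

print(nrow(df))  # 3 rows
print(ncol(df))  # 3 columns
str(df)          # compact summary of the column types

# Keep only the rows where num is greater than 5
big <- df[df$num > 5, ]
print(big$df_name)  # "B" "C"
```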
7. SUBSETTING
Subsetting allows the user to access elements from an object. It takes out a portion of the
object based on the condition provided. There are several ways of subsetting in R programming;
three operator-based methods are described below. The choice
of method depends on the user's need and the type of object. For example, if
there is a data frame with many columns such as states, country, and population and suppose
the user wants to extract states from it, then subsetting is used to do this operation.
Method 1: Subsetting in R Using [ ] Operator
Using the '[ ]' operator, elements of vectors and observations from data frames can be
accessed.
# Create vector
x <- 1:15
# Print vector
cat("Original vector: ", x, "\n")
# Subsetting vector
cat("First 5 values of vector: ", x[1:5], "\n")
cat("Without values present at index 1, 2 and 3: ",
x[-c(1, 2, 3)], "\n")
Output:
Original vector: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
First 5 values of vector: 1 2 3 4 5
Without values present at index 1, 2 and 3: 4 5 6 7 8 9 10 11 12 13 14 15
Method 2: Subsetting in R Using [[ ]] Operator
The [[ ]] operator is used for subsetting list objects. It is similar to the [ ] operator, but
[[ ]] selects exactly one element and returns the element itself, whereas [ ], applied to a list,
can select more than one element in a single command and returns a list.
# Create list
ls <- list(a = 1, b = 2, c = 10, d = 20)
# Print list
cat("Original List: \n")
print(ls)

# Select first element of list


cat("First element of list: ", ls[[1]], "\n")
output:
Original List:
$a
[1] 1

$b
[1] 2

$c
[1] 10

$d
[1] 20

First element of list: 1
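The difference between the two operators is easiest to see by checking the class of what each returns. The sketch below reuses the list from this method; the names single, sub, and several are assumptions for the example.

```r
ls <- list(a = 1, b = 2, c = 10, d = 20)

single  <- ls[[1]]          # the element itself
sub     <- ls[1]            # a list of length 1 that contains the element
several <- ls[c("a", "c")]  # [ ] can pick several elements at once

print(class(single))   # "numeric"
print(class(sub))      # "list"
print(names(several))  # "a" "c"
```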


Method 3: Subsetting in R Using $ Operator
The $ operator can be used for lists and data frames in R. Unlike the [ ] operator, it selects only a single
element at a time. It can be used to access an element in a named list or a column in a data
frame. The $ operator is only applicable to recursive (list-like) objects.
# Create list
ls <- list(a = 1, b = 2, c = "Hello", d = "GFG")
# Print list
cat("Original list:\n")
print(ls)
# Print "GFG" using $ operator
cat("Using $ operator:\n")
print(ls$d)
output:
Original list:
$a
[1] 1

$b
[1] 2

$c
[1] "Hello"

$d
[1] "GFG"

Using $ operator:
[1] "GFG"
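Besides the operators above, base R also provides the subset() function, which filters rows by a condition and selects columns by name. The sketch below applies it to the kind of states and population data frame mentioned at the start of this section; the state labels and population values are made up for illustration.

```r
df <- data.frame(
  state = c("S1", "S2", "S3"),   # hypothetical state labels
  population = c(10, 25, 40),    # made-up values
  stringsAsFactors = FALSE
)

# Keep rows matching a condition and only the 'state' column
large <- subset(df, population > 20, select = state)
print(large)

# The same selection written with the [ ] operator
same <- df[df$population > 20, "state", drop = FALSE]
print(same)
```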
8. IMPORTING CSV FILE
A CSV file stores data in a tabular format, organized in
rows and columns. The column values in each row are separated by a delimiter, usually a comma. CSV
files can be loaded into the workspace using either built-in functions or external
packages.
Method 1: Using read.csv() method
The read.csv() function in base R loads a .csv file into the current session.
The contents of the file can be stored in a variable and further manipulated. Multiple
files can be loaded into different variables. The result is returned in the form of a data
frame, where rows are numbered with integers beginning with 1.
example:
path <- "/Users/xyz/Desktop/abc.csv"
content <- read.csv(path)
print (content)

Method 2: Using read_csv()


The "readr" package in R is used to read large flat files into the workspace with increased
speed and efficiency. The data
is read in the form of a tibble with the same dimensions as the table stored
in the .csv file. When a tibble is printed, only the first ten rows are displayed by default,
which keeps the output of large files readable.
example:
library("readr")
path <- "/Users/xyz/Desktop/abc.csv"
content <- read_csv(path, col_names = TRUE)
print (content)
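Since the paths in the examples above are placeholders, the following self-contained sketch first writes a small CSV file to a temporary location with write.csv() and then reads it back with read.csv(). The data frame contents and variable names are assumptions for this example.

```r
# Build a small data frame and write it to a temporary CSV file
original <- data.frame(id = 1:3, name = c("A", "B", "C"),
                       stringsAsFactors = FALSE)
path <- tempfile(fileext = ".csv")
write.csv(original, path, row.names = FALSE)

# Read the file back; the result is a data frame
content <- read.csv(path, stringsAsFactors = FALSE)
print(content)
print(class(content))  # "data.frame"

# Remove the temporary file
unlink(path)
```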
9. PACKAGE INSTALLATION
R does not come with all packages installed; additional packages need to be
installed explicitly. The R
ecosystem offers a large number of packages that cater to different needs of the
user. To use one of these packages, we first need to install it.
Installing packages in R:
To install a package, pass its name as a string argument to the install.packages()
function. After installing the package, load it with library() in order to use its
functionality.
Example:
install.packages("ggplot2")
library("ggplot2")
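Calling install.packages() every time a script runs downloads the package again. One common pattern, sketched below, is to install only when the package is missing, using the base R function requireNamespace(). The helper name ensure_package is hypothetical and is introduced only for this illustration.

```r
# Hypothetical helper: install a package only when it is missing, then report
# whether it is available
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs an internet connection and a CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call does not trigger a download
print(ensure_package("stats"))  # TRUE
```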
10. TYPES OF GRAPHS

Types of graphs in R programming.

· Graphics are a widely used feature of R for creating various types of
graphs and charts for visualization.

· R offers a rich set of packages and functions for creating graphs from
an input data set for data analytics.

· The most commonly used graphs in R are scatter plots, box plots, line
graphs, pie charts, histograms, and bar charts.

· R graphs support both two-dimensional and three-dimensional plots for exploratory
data analysis. R functions such as plot(), barplot(), and pie() are used to draw graphs, and
packages such as ggplot2 support advanced graphics functionality.

1. Histogram
A histogram is a graphical tool that works on a single variable. The variable's values are grouped
into bins, and the number of values in each bin, termed the frequency, is counted. These counts are then used
to plot frequency bars for the respective bins. The height of a bar represents the frequency.

Code:

hist(trees$Height, breaks = 10, col = "orange", main = "Histogram of Tree heights", xlab = "Height Bin")

Output: [figure: histogram of tree heights]
2. Scatterplot
This is a simple chart type, but a very useful one. It gives
an idea of the correlation between two variables and is a handy tool in exploratory analysis.

Code:

# make the columns of the built-in trees dataset available by name
attach(trees)

plot(Girth, Height, main = "Scatterplot of Girth vs Height", xlab = "Tree Girth", ylab = "Tree Height")

# add the fitted regression line
abline(lm(Height ~ Girth), col = "blue", lwd = 2)

Output: [figure: scatterplot of Girth vs Height with regression line]

3. Boxplot
A boxplot is a way of visualizing data through boxes and whiskers. The variable
values are first sorted in ascending order, and the data is then divided into quarters (quartiles).

Code:

boxplot(trees, col = c("yellow", "red", "cyan"), main = "Boxplot for trees dataset")

Output: [figure: boxplots of Girth, Height, and Volume]
4. Line Chart
Line charts are useful when comparing multiple variables. They help us see the relationship between multiple
variables in a single plot. In the following illustration, we try to understand the trend of three tree
features. As shown in the code below, the line chart for Girth is first plotted using the plot()
function. Then line charts for Height and Volume are added to the same plot using the lines() function.

Code:

# assumes attach(trees) has been run, as in the scatterplot example
plot(Girth, type = "o", col = "red", ylab = "", ylim = c(0, 110),
main = "Comparison amongst Girth, Height, and Volume of trees")

lines(Height, type = "o", col = "blue")

lines(Volume, type = "o", col = "green")

legend(1, 110, legend = c("Girth", "Height", "Volume"),
col = c("red", "blue", "green"), lty = 1, cex = 0.9)

Output: [figure: line chart comparing Girth, Height, and Volume]

5. Dot plot
This visualization tool is useful when we want to compare multiple categories against a certain measure.

Code:

# make the columns of the built-in mtcars dataset available by name
attach(mtcars)

dotchart(disp, labels = row.names(mtcars), cex = 0.75,
main = "Displacement for various Car Models", xlab = "Displacement in Cubic Inches")

Output: [figure: dot plot of displacement by car model]
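Bar charts and pie charts are named among the commonly used graphs in this section but are not demonstrated. The following sketch, which assumes the built-in mtcars dataset, draws both with the base R functions barplot() and pie(); the colours, titles, and labels are choices made for this example.

```r
# Count how many cars in mtcars have 4, 6, and 8 cylinders
counts <- table(mtcars$cyl)
print(counts)

# Bar chart of the counts
barplot(counts, col = "steelblue", main = "Cars by Number of Cylinders",
        xlab = "Cylinders", ylab = "Number of Cars")

# Pie chart of the same counts
pie(counts, labels = paste(names(counts), "cyl"),
    main = "Share of Cars by Cylinders")
```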
11. LINEAR REGRESSION

In linear regression, the predictor variable and the response variable are related through an
equation in which the exponent (power) of both variables is 1. Mathematically, a linear relationship
represents a straight line when plotted as a graph. A non-linear relationship, where the exponent
of any variable is not equal to 1, creates a curve.

Example:

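# x holds heights in cm, y holds weights in kg
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)

y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# fit weight as a linear function of height
relation <- lm(y~x)

# plot weight on the horizontal axis and height on the vertical axis
plot(y,x,col = "blue",main = "Height & Weight Regression",
cex = 1.3,pch = 16,xlab = "Weight in Kg",ylab = "Height in cm")

# add the fitted line for height as a function of weight
abline(lm(x~y))

# close the graphics device (needed when the plot is written to a file)
dev.off()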
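The fitted model can also be inspected and used for prediction. The following sketch reuses the same data and reads the coefficients with coef() and predicts a new value with predict(); the height of 170 cm and the name new_data are assumptions for this example.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)  # heights in cm
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)            # weights in kg

relation <- lm(y ~ x)

# Intercept and slope of the fitted line (about -38.46 and 0.675)
print(coef(relation))

# Predicted weight for a person who is 170 cm tall (about 76.2 kg)
new_data <- data.frame(x = 170)
print(predict(relation, new_data))
```

The column name in new_data must match the predictor name used in the model formula, here x.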
