R Project
output:
[1] "numeric"
output:
[1] "logical"
output:
[1] "complex"
1. Numeric Vector: This is the most commonly used type of vector. It stores numeric
values.
A numeric vector can be created in the following ways:
1. Using the equal sign, the leftward arrow, or the rightward arrow.
Ex:
# equal operator
x = 1.5
print(x)
[1] 1.5
# leftward operator
y <- 2.5
print(y)
[1] 2.5
# rightward operator
3.5 -> z
print(z)
[1] 3.5
2. Using the combine function c():
Ex:
x <- c(1,2,3,4)
print(x)
[1] 1 2 3 4
3. Using the ":" operator:
Ex:
x <- c(1:5)
print(x)
[1] 1 2 3 4 5
4. Using the seq() function:
Ex:
x <- seq(1, 10, 2)
print(x)
[1] 1 3 5 7 9
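As a brief sketch of how the creation methods above relate to each other, the snippet below builds the same values with c() and with seq(), and checks the class of the result. The variable names are illustrative only.

```r
# Two ways to build the values 1, 3, 5, 7, 9
a <- c(1, 3, 5, 7, 9)                 # combine explicit values
b <- seq(1, 10, by = 2)               # start at 1, step by 2, stop before exceeding 10
c_vec <- seq(1, 9, length.out = 5)    # five evenly spaced values from 1 to 9

print(all(a == b))                    # element-wise comparison of the two vectors
print(isTRUE(all.equal(a, c_vec)))    # comparison with a numeric tolerance
print(class(a))                       # class of a vector built from decimal literals
print(class(1:5))                     # the ":" operator produces integers
```

Note that 1:5 has class "integer" while c(1, 2, 3) has class "numeric"; both behave as numeric vectors in arithmetic.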
2. Logical Vector: This type of vector contains only the logical values TRUE and
FALSE. Comparison and logical operators also return logical vectors.
Ex:
x <- TRUE
print(x)
[1] TRUE
1>2
[1] FALSE
c(1, 2) > 2
[1] FALSE FALSE
1 %in% c(1,2,3)
[1] TRUE
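A common use of logical vectors is to select or count elements that satisfy a condition. The following is a minimal sketch; the scores vector is a hypothetical example, not data from this document.

```r
scores <- c(3, 8, 1, 6)      # hypothetical values
above <- scores > 4          # element-wise comparison gives a logical vector

print(above)
print(scores[above])         # keep only the elements where the condition is TRUE
print(sum(above))            # TRUE counts as 1, so this counts the matches
print(any(above))            # is at least one element TRUE?
print(all(above))            # are all elements TRUE?
```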
3. Character Vector: This type of vector stores text, that is, strings made up of letters,
digits, and special characters.
Ex:
"Hello World"
[1] "Hello World"
c("Hello", "World")
[1] "Hello" "World"
cat("I am studying 'R Programming'")
I am studying 'R Programming'
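A few base R functions are often used with character vectors. The sketch below assumes only base R; the variable names are illustrative.

```r
words <- c("Hello", "World")

sentence <- paste(words, collapse = " ")   # join the elements with a space
print(sentence)
print(nchar(words))                        # number of characters in each element
print(toupper(words))                      # convert each element to upper case
print(class(words))                        # class of a vector of strings
```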
4. OPERATORS
Operators are symbols that direct the R interpreter to perform a particular operation on
its operands. They cover the mathematical, logical, and decision (comparison) operations
and accept complex, integer, and numeric values as input operands.
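The mathematical and logical operations mentioned above can be sketched as follows. The values of a, b, p and q are arbitrary and chosen only for illustration.

```r
a <- 7
b <- 2

print(a + b)     # addition
print(a %/% b)   # integer division
print(a %% b)    # remainder (modulus)
print(a ^ b)     # exponentiation

p <- c(TRUE, FALSE)
q <- c(TRUE, TRUE)

print(p & q)     # element-wise AND
print(p | q)     # element-wise OR
print(!p)        # negation
```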
3. Relational: These operators return the result as either TRUE or FALSE.
Operators: <, >, ==, etc.
Ex:
v <- 12 > 32
print(v)
output:
[1] FALSE
output:
[1] 123
5. Miscellaneous: These are the mixed operators that simulate the printing of sequences
and the assignment of vectors, either left or right-handed.
Operators: %in%, :, etc.
Ex:
a <- 1 %in% c(1,2,3)
print(a)
output:
[1] TRUE
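As a short, hedged illustration of the miscellaneous operators, the snippet below combines ":" with %in% on arbitrary example values.

```r
s <- 2:6                    # ":" builds the integer sequence 2, 3, 4, 5, 6
print(s)

hits <- c(3, 7) %in% s      # membership test for each element on the left
print(hits)
```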
5. FUNCTIONS
Functions are useful when you want to perform a certain task multiple times. A function accepts
input arguments and produces output by executing the valid R commands inside it. In R, the
function name and the name of the file in which the function is defined need not be the same,
and a single R file can contain one or more function definitions.
There are mainly two types of functions:
1. Built-in Functions:
Functions that are already defined in the programming framework are known as
built-in functions. R has a rich set of built-in functions that cover most
common tasks.
Some of the built-in functions are sum(), max(), and min().
example:
print(sum(4:6))
output: [1] 15
print(max(4:6))
output: [1] 6
print(min(4:6))
output: [1] 4
2. User-defined Functions:
A function is a block of code that performs a specific task. R allows you to define
functions according to your needs. These functions are known as user-defined functions.
User-defined functions are created with the "function" keyword.
example:
expo <- function(a, b){
return(a^b)}
expo(2, 3)
output: [1] 8
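User-defined functions can also declare default argument values. The following is a minimal sketch; the function name power and its default of 2 are hypothetical choices made for illustration.

```r
# Hypothetical helper: raise a to the power b, squaring by default
power <- function(a, b = 2) {
  a^b    # the value of the last expression is returned
}

print(power(3))            # uses the default b = 2
print(power(2, b = 3))     # overrides the default
print(sapply(1:4, power))  # applies the function to each element of 1:4
```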
6. LIST, MATRIX AND DATA FRAMES
LISTS:
A list in R is a generic object consisting of an ordered collection of objects. Lists are
one-dimensional, heterogeneous data structures. A list can contain vectors, matrices,
characters, functions, and so on.
In other words, a list is a generic vector whose elements can be of different types. R allows
accessing the elements of a list by index value. Indexing of a list starts at 1, not at 0 as in
many other programming languages.
Creating a list:
To create a list in R, use the function list(). A list is a mutable object, which means that we
can make changes to it even after its creation.
Example:
l <- list(1,2,3,4)
print(l)
output:
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
Accessing element from a list:
print(l[1])
output:
[[1]]
[1] 1
Changing an element of the list:
l[1] <- "A"
print(l[1])
output:
[[1]]
[1] "A"
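Single brackets return a sub-list, while double brackets (or the $ operator on a named list) return the element itself. The sketch below uses a hypothetical named list to show the difference.

```r
# Hypothetical named list with elements of different types
person <- list(name = "A", scores = c(90, 85), passed = TRUE)

print(class(person["name"]))     # single brackets return a list
print(class(person[["name"]]))   # double brackets return the element itself
print(person$scores)             # $ accesses an element by name

person$scores[2] <- 88           # modify part of an element in place
person$city <- "X"               # assigning to a new name appends an element
print(length(person))
```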
MATRIX:
A matrix is a rectangular arrangement of numbers in rows and columns. Rows run
horizontally and columns run vertically. In R, matrices are two-dimensional,
homogeneous data structures. A matrix can be created with the matrix() function.
Creating a matrix:
m <- matrix(
c(1,2,3,4),
nrow = 2,
ncol = 2,
byrow = TRUE
)
print(m)
output:
     [,1] [,2]
[1,]    1    2
[2,]    3    4
Accessing element in a matrix:
print(m[1, ])
output:
[1] 1 2
Changing element in the matrix:
m[1, 1] <- 5
print(m[1, 1])
output:
[1] 5
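Beyond element access, a few basic matrix operations are sketched below with base R only. The matrix is rebuilt here so that the snippet is self-contained.

```r
m <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

print(dim(m))       # number of rows and columns
print(m[, 2])       # second column
print(t(m))         # transpose
print(m * m)        # element-wise multiplication
prod_m <- m %*% m   # matrix multiplication
print(prod_m)
```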
DATAFRAME
Data frames in R are generic data objects used to store tabular data. A data frame can be
thought of as a matrix in which each column can have a different data type. A data frame
is made up of three principal components: the data, the rows, and the columns.
Creating a dataframe:
df <- data.frame(
df_id = c(1, 2, 3),
df_name = c("A", "B", "C")
)
print(df)
output:
  df_id df_name
1     1       A
2     2       B
3     3       C
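Typical operations on a data frame include selecting a column, filtering rows, and adding a column. The sketch below rebuilds the data frame from the example; the df_score column is a hypothetical addition.

```r
df <- data.frame(df_id = c(1, 2, 3), df_name = c("A", "B", "C"))

print(df$df_name)                 # select a column by name
subset_df <- df[df$df_id > 1, ]   # keep the rows where df_id is greater than 1
print(subset_df)

df$df_score <- c(10, 20, 30)      # hypothetical new column
print(nrow(df))
print(ncol(df))
str(df)                           # compact summary of the column types
```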
$b
[1] 2
$c
[1] 10
$d
[1] 20
$b
[1] 2
$c
[1] "Hello"
$d
[1] "GFG"
Using $ operator:
[1] "GFG"
8. IMPORTING CSV FILE
A CSV file stores data in a tabular format, organized in rows and columns. The column
values in each row are separated by a delimiter, usually a comma. CSV files can be loaded
into the workspace and processed using either built-in functions or external packages.
Method 1: Using read.csv() method
The read.csv() function in base R loads a .csv file into the current script. The contents of
the file can be stored in a variable and manipulated further. Multiple files can be read into
different variables. The result is returned in the form of a data frame, whose rows are
numbered with integers beginning at 1.
example:
path <- "/Users/xyz/Desktop/abc.csv"
content <- read.csv(path)
print(content)
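Because the path above refers to a file on a particular machine, the following self-contained sketch first writes a small hypothetical data frame to a temporary CSV file and then reads it back with read.csv().

```r
# Write a small hypothetical table to a temporary CSV file
tmp <- tempfile(fileext = ".csv")
sample_df <- data.frame(id = 1:3, name = c("A", "B", "C"))
write.csv(sample_df, tmp, row.names = FALSE)

# Read it back; the result is a data frame
content <- read.csv(tmp)
print(content)
print(class(content))
print(dim(content))

unlink(tmp)   # remove the temporary file
```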
· R supports a rich set of packages and functions for creating graphs from an input data
set for data analytics.
· The most commonly used graphs in R are scatter plots, box plots, line graphs, pie charts,
histograms, and bar charts.
· R graphs support both two-dimensional and three-dimensional plots for exploratory
data analysis. Functions such as plot(), barplot(), and pie() are used to draw graphs in R.
Packages such as ggplot2 provide more advanced graphing functionality.
1. Histogram
A histogram is a graphical tool that works on a single variable. The values of the variable are grouped
into bins, and the number of values in each bin, termed the frequency, is counted. These counts are then
used to plot frequency bars over the respective bins. The height of each bar represents the frequency.
Code:
Output:
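The code for this example is not shown above. A minimal sketch, assuming the built-in trees dataset used in the later examples and an arbitrary choice of title, label, and colour, might look like this:

```r
# Histogram of tree heights from the built-in trees dataset
# (title, axis label, colour, and number of breaks are assumptions)
h <- hist(trees$Height, breaks = 5, col = "lightblue",
          main = "Histogram of Tree Height", xlab = "Height (ft)")

print(h$breaks)   # bin boundaries chosen by hist()
print(h$counts)   # frequency in each bin
```

hist() returns its bins and counts invisibly, which is convenient for checking the grouping without reading values off the plot.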
2. Scatterplot
A scatterplot is a simple chart type, but an important one. It gives an idea of the correlation
between two variables and is a handy tool in exploratory analysis.
Code:
attach(trees)
plot(Girth, Height, main = "Scatterplot of Girth vs Height", xlab = "Tree Girth", ylab = "Tree Height")
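The visual impression of correlation can be complemented with a number. As a brief sketch, cor() computes the Pearson correlation coefficient for the same two columns of the built-in trees dataset.

```r
# Pearson correlation between tree girth and tree height
r <- cor(trees$Girth, trees$Height)
print(r)
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.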
3. Boxplot
A boxplot is a way of visualizing data through boxes and whiskers. The variable
values are first sorted in ascending order, and the data is then divided into quarters (quartiles).
Code:
boxplot(trees, col = c("yellow", "red", "cyan"), main = "Boxplot for trees dataset")
Output:
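The values that a boxplot is built from can be inspected directly. The following sketch uses fivenum() on one column of the built-in trees dataset; the choice of the Height column is arbitrary.

```r
# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(trees$Height)
print(five)
```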
4. Line Chart
Line charts are useful when comparing multiple variables. They help us see the relationship between
multiple variables in a single plot. In the following illustration, we will try to understand the trend of
three tree features. As shown in the code below, the line chart for Girth is plotted first using the plot()
function. Line charts for Height and Volume are then added to the same plot using the lines() function.
Code:
plot(Girth, type = "o", col = "red", ylab = "", ylim = c(0, 110))
Output:
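The steps described in this section can be sketched in full as follows. The title, the colours for Height and Volume, and the legend are assumptions, since they are not given in the text; columns are referenced as trees$... so that the snippet does not depend on attach().

```r
# Girth first, with a y-axis range wide enough for all three features
plot(trees$Girth, type = "o", col = "red", ylab = "", ylim = c(0, 110),
     main = "Girth, Height and Volume of trees")   # title is an assumption

# Add Height and Volume to the same plot (colours are assumptions)
lines(trees$Height, type = "o", col = "blue")
lines(trees$Volume, type = "o", col = "green")

legend("topleft", legend = c("Girth", "Height", "Volume"),
       col = c("red", "blue", "green"), lty = 1, pch = 1)
```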
5. Dot plot
This visualization tool is useful if we want to compare multiple categories against a certain measure.
Code:
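attach(mtcars)
# Assumed opening of the call (lost in the source): dotchart() on disp, labelled by car model
dotchart(disp, labels = row.names(mtcars),
main = "Displacement for various Car Models", xlab = "Displacement in Cubic Inches")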
Output:
11. LINEAR REGRESSION
In linear regression, two variables (a predictor and a response) are related through an equation
in which the exponent (power) of both variables is 1. Mathematically, a linear relationship
represents a straight line when plotted as a graph. A non-linear relationship, where the exponent
of any variable is not equal to 1, creates a curve.
Example:
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
dev.off()
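The model-fitting step is not shown in the example above. A minimal sketch, assuming x is the predictor and y the response, fits the line with lm() and predicts y for a hypothetical new value of x:

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)   # predictor (assumed)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)             # response (assumed)

relation <- lm(y ~ x)     # fit y = intercept + slope * x by least squares
print(coef(relation))     # estimated intercept and slope

new_point <- data.frame(x = 170)       # hypothetical new observation
predicted <- predict(relation, new_point)
print(predicted)
```

The column name in new_point must match the predictor name used in the formula, otherwise predict() cannot find the variable.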