Basics of R

The document provides an introduction to R, a programming language for statistical computing and data visualization, created in 1992. It outlines the benefits of using R, including its open-source nature, extensive community support, and a wide range of statistical techniques and packages. Additionally, it describes various data types and structures in R, such as vectors, lists, matrices, data frames, arrays, and factors.

Uploaded by

kalvapushpa97

Introduction to R

Dr.V.Sree Ramani
C.B.I.T.
What is R?
R is a popular programming language for statistical computing and graphical
presentation. Its most common use is to analyze and visualize data. R was created
in 1992 by the statisticians Ross Ihaka and Robert Gentleman (who is also a
bioinformatician) at the University of Auckland, on the basis of the programming
language S. The first official stable version (1.0) was released in 2000.

Why Use R?
• It is a great resource for data analysis, data visualization, data science and machine
learning
• It provides many statistical techniques (such as statistical tests, classification, clustering
and data reduction)
• It is easy to draw graphs in R, like pie charts, histograms, box plots, scatter plots, etc.
• It works on different platforms (Windows, Mac, Linux)
• It is open-source and free
• It has large community support
• It has many packages (libraries of functions) that can be used to solve different problems
Features of R Programming
• Basic Statistics: The most common basic statistics terms are the mean, mode,
and median. These are all known as “Measures of Central Tendency.” So using
the R language we can measure central tendency very easily.
• Static graphics: R is rich with facilities for creating and developing interesting
static graphics. R contains functionality for many plot types including graphic
maps, mosaic plots, biplots, and the list goes on.
• Probability distributions: Probability distributions play a vital role in statistics
and by using R we can easily handle various types of probability distribution
such as Binomial Distribution, Normal Distribution, Chi-squared Distribution
and many more.
• Data analysis: It provides a large, coherent and integrated collection of tools
for data analysis.
• R Packages: One of the major features of R is the wide availability of
libraries. R has CRAN (Comprehensive R Archive Network), which is a repository
holding more than 10,000 packages.
• Distributed Computing: Distributed computing is a model in which
components of a software system are shared among multiple computers to
improve efficiency and performance.
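The basic statistics and probability distribution features listed above can be tried directly at the console. The following is a minimal sketch, not taken from the slides: the vector `scores` and the helper `stat_mode` are illustrative names. Base R has no built-in function for the statistical mode (the built-in `mode()` reports a storage type instead), so the helper here is one assumed way to compute it.

```r
# Illustrative data (assumed for this example)
scores <- c(4, 8, 8, 15, 16, 23, 42)

# Measures of central tendency
avg <- mean(scores)     # arithmetic mean
mid <- median(scores)   # middle value: 15

# Hypothetical helper: most frequent value of a numeric vector
stat_mode <- function(v) {
  counts <- table(v)                            # frequency of each distinct value
  as.numeric(names(counts)[which.max(counts)])  # value with the highest count
}
most_common <- stat_mode(scores)                # 8

# Probability distributions
p_two_heads  <- dbinom(2, size = 4, prob = 0.5) # P(exactly 2 heads in 4 fair tosses) = 0.375
p_below_mean <- pnorm(0)                        # P(Z <= 0) for a standard normal = 0.5

print(c(avg, mid, most_common))
print(c(p_two_heads, p_below_mean))
```

The `d`/`p` prefixes (`dbinom`, `pnorm`) follow base R's naming scheme for density and cumulative probability functions; other distributions such as the chi-squared follow the same pattern (`dchisq`, `pchisq`).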
General arithmetic operations in R

Example 1
"Hello World!"
Example 2
5
10
25
Example 3
5 + 5
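Beyond addition, R supports the usual arithmetic operators. The sketch below is an illustration rather than part of the original slides; the variable names `a` and `b` are assumptions chosen for the example.

```r
a <- 7
b <- 2

total      <- a + b    # addition
difference <- a - b    # subtraction
product    <- a * b    # multiplication
quotient   <- a / b    # division: 3.5
power      <- a ^ b    # exponentiation: 49
remainder  <- a %% b   # modulus (remainder): 1
int_div    <- a %/% b  # integer division: 3

print(c(total, difference, product, quotient, power, remainder, int_div))
```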
Basic Data Types

Basic data types in R can be divided into the following types:

• numeric - (10.5, 55, 787)
• integer - (1L, 55L, 100L, where the letter "L" declares this as an integer)
• complex - (9 + 3i, where "i" marks the imaginary part)
• character (a.k.a. string) - ("k", "R is exciting", "FALSE", "11.5")
• logical (a.k.a. boolean) - (TRUE or FALSE)

Example:
x <- 10.5
class(x)
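To check each of the basic types listed above, class() can be applied to one value of every kind. This is a small sketch with assumed variable names (`values`, `types`), not code from the slides.

```r
# One value of each basic type, kept in a list so nothing is coerced
values <- list(10.5, 55L, 9 + 3i, "R is exciting", TRUE)

# Ask R for the class of every element
types <- sapply(values, class)
print(types)

# Quoted values are character data, even when they look like numbers or logicals
quoted_number  <- class("11.5")    # "character"
quoted_logical <- class("FALSE")   # "character"
```

Note that R is case-sensitive: class(x) works, while Class(x) or a variable X in place of x produces an error.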
Data Structures in R
R has six basic data structures. We can organize these data structures
according to their dimensions (1d, 2d, nd). We can also classify them as
homogeneous or heterogeneous (whether or not their contents can be of different types).
R has the following basic data structures:
1. Vector
2. List
3. Matrix
4. Data frame
5. Array
6. Factor
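1. Vector: Vectors are single-dimensional, homogeneous data structures. To
create a vector, use the c() function.

2. List: Lists are heterogeneous data structures. They are very similar to
vectors except they can store data of different types. To create a list, we use
the list() function.
Note:
1. When values of different types are combined in a vector, they are coerced to a
single type in the order character > numeric > logical.
2. Functions such as as.integer() and as.logical() convert values explicitly.
3. assign("v2", list(1, 2, 3, "a", 12+3i, TRUE)) creates a list named v2.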
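The coercion order and the difference between vectors and lists can be seen in a short experiment. The sketch below uses assumed variable names (`mixed_num`, `mixed_chr`, `as_int`, `as_lgl`) and is only an illustration of the notes above.

```r
# Vectors are homogeneous: mixed inputs are coerced to one type
mixed_num <- c(1, TRUE, FALSE)   # logical -> numeric: 1 1 0
mixed_chr <- c(1, "a", TRUE)     # everything -> character: "1" "a" "TRUE"
print(class(mixed_num))
print(class(mixed_chr))

# Explicit conversion
as_int <- as.integer("42")       # 42L
as_lgl <- as.logical("TRUE")     # TRUE

# Lists are heterogeneous: each element keeps its own type
v2 <- list(1, 2, 3, "a", 12+3i, TRUE)
print(sapply(v2, class))
```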
3. Matrix: Matrices are two-dimensional, homogeneous data structures. This means
that all values in a matrix have to be of the same type. Coercion takes place if there is
more than one data type. They have rows and columns. By default, matrices are filled
in column-wise order. The basic syntax to create a matrix is:
matrix(data, nrow, ncol, byrow, dimnames)
data is the vector of values that fills the matrix,
nrow is the number of rows, ncol is the number of columns,
byrow is a logical which tells the function to fill the matrix row-wise; by default it
is set to FALSE,
dimnames is a list of the names of the rows/columns created.
Example:
row_names <- c("row1", "row2", "row3")
col_names <- c("col1", "col2", "col3")
test_matrix2 <- matrix(c(1:9), ncol = 3, dimnames = list(row_names, col_names))
test_matrix2
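The effect of the byrow argument is easiest to see by building the same data both ways. This is a minimal sketch with assumed names (`by_col`, `by_row`, `chr_mat`), not part of the original slides.

```r
# Same data, two fill orders
by_col <- matrix(1:6, nrow = 2)                # default: filled column by column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled row by row

print(by_col)
print(by_row)

# Element in row 1, column 2 differs between the two
by_col[1, 2]   # 3
by_row[1, 2]   # 2

# Mixing types triggers coercion: the whole matrix becomes character
chr_mat <- matrix(c(1, "a", 3, 4), nrow = 2)
print(class(chr_mat[1, 1]))
```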
# R program to access components of a list
# (this example belongs to the List structure described above)

# Creating a list by naming all its components
empId <- c(1, 2, 3, 4)
empName <- c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp <- 4
empList <- list("ID" = empId, "Names" = empName, "Total Staff" = numberOfEmp)
print(empList)

# Accessing components by names
cat("Accessing name components using $ command\n")
print(empList$Names)
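Besides the $ operator, list components can also be reached by position or by quoted name. The sketch below rebuilds the same employee list so that it runs on its own; the variable names `second`, `sub`, `total` and `third_name` are assumptions for illustration.

```r
empList <- list("ID" = c(1, 2, 3, 4),
                "Names" = c("Debi", "Sandeep", "Subham", "Shiba"),
                "Total Staff" = 4)

# [[ ]] extracts the component itself
second <- empList[[2]]               # the Names vector
total  <- empList[["Total Staff"]]   # names containing spaces need [[ ]] or backticks

# [ ] keeps the result wrapped in a list
sub <- empList["Names"]              # a list of length 1

# Indexing inside an extracted component
third_name <- empList$Names[3]       # "Subham"

print(second)
print(class(sub))
print(third_name)
```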
4. Data Frame: Data frames are two-dimensional, heterogeneous
data structures. They are lists of vectors of equal lengths.
Data frames have the following constraints placed upon them:
1. A data frame must have column names, and each row should have
a unique name.
2. Each column should have the same number of items.
3. Each item in a single column should be of the same type.
4. Different columns can have different data types.
To create a data frame, use the data.frame() function.

Example:
student_id <- c(1:5)
student_name <- c("raj", "jacob", "iqbal", "shawn", "hitesh")
student_rank <- c("third", "fifth", "second", "fourth", "first")
student.data <- data.frame(student_id, student_name, student_rank)
student.data
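Once a data frame exists, its structure and contents can be inspected column by column. The following sketch rebuilds the student data so that it is self-contained; the variable names `n_rows_cols`, `id_class` and `top_student` are assumptions, and stringsAsFactors = FALSE is set explicitly so the text columns stay character in every R version.

```r
student.data <- data.frame(
  student_id   = 1:5,
  student_name = c("raj", "jacob", "iqbal", "shawn", "hitesh"),
  student_rank = c("third", "fifth", "second", "fourth", "first"),
  stringsAsFactors = FALSE
)

# Overall structure: one line per column with its type
str(student.data)

# Dimensions: rows, then columns
n_rows_cols <- dim(student.data)

# Columns may have different types
id_class <- class(student.data$student_id)

# Filter rows with a logical condition, then pick one column
top_student <- student.data[student.data$student_rank == "first", "student_name"]
print(top_student)
```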
5. Array: Arrays are multi-dimensional (n-dimensional), homogeneous data
structures. A three-dimensional array is a collection of matrices stacked one
on top of the other in layers. We can create an array using the array()
function. The following is the syntax of it:
array_name <- array(data, dim, dimnames)
data is the data that is filled inside the array,
dim is a vector containing the dimensions of the array, and
dimnames is a list containing the names of the rows, columns, and
matrices inside the array.

Example:
arr1 <- array(c(1:18), dim = c(2, 3, 3))
arr1
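Array elements are addressed by row, column and layer. The sketch below repeats the array from the example and adds dimnames; the names "r1", "c1", "m1" and so on, and the variables `one_value` and `layer_two`, are assumptions for illustration.

```r
arr1 <- array(1:18,
              dim = c(2, 3, 3),
              dimnames = list(c("r1", "r2"),
                              c("c1", "c2", "c3"),
                              c("m1", "m2", "m3")))

# One element: row 1, column 2 of the third matrix
one_value <- arr1[1, 2, 3]       # 15

# A whole layer: leaving an index empty selects everything along it
layer_two <- arr1[, , "m2"]      # a 2 x 3 matrix holding 7..12

print(dim(arr1))
print(layer_two)
```

Like matrices, arrays are filled column-wise, one layer after another, which is why the third layer starts at 13.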
6. Factors: Factors are vectors that can only store predefined values.
They are useful for storing categorical data. Factors have two attributes:
• Class – which has a value of “factor”; it makes the factor behave differently
from a normal vector.
• Levels – which is the set of allowed values.
You can create a factor using the factor() function.
For example:
fac <- factor(c("a", "b", "a", "b", "b"))
fac
Note: Factors can be built from both strings and integers. They are useful to
categorize unique values in columns like “TRUE” or “FALSE”, or
“MALE” or “FEMALE”.

Input:
# Creating a vector
x <- c("female", "male", "male", "female")
print(x)
# Converting the vector x into a factor
# named gender
gender <- factor(x)
print(gender)

Output:
[1] "female" "male"   "male"   "female"
[1] female male   male   female
Levels: female male
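The two factor attributes, and the restriction to predefined values, can be checked with a few calls. This is a minimal sketch built on the gender example; the variable names `lv`, `codes`, `counts` and `strict` are assumptions.

```r
gender <- factor(c("female", "male", "male", "female"))

# Levels: the set of allowed values (sorted alphabetically by default)
lv <- levels(gender)

# Internally each value is stored as an integer code pointing at a level
codes <- as.integer(gender)      # 1 2 2 1

# Factors make counting categories straightforward
counts <- table(gender)
print(counts)

# Assigning a value that is not a level produces NA (with a warning)
strict <- gender
suppressWarnings(strict[2] <- "other")
print(strict)
```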
