DS 3

This document provides an overview of fundamental concepts in data science, focusing on vectors, matrices, arrays, and factors in R programming. It explains how to create and manipulate these data structures, including functions for accessing, sorting, and subsetting data. Additionally, it covers the importance of categorical variables and how to create and summarize factors.

Uploaded by

harshithamutyala730


Fundamentals of Data Science

Unit 3

Prepared by
Dr. P. Sasikumar
Associate Professor, AIML Dept.

Vectors
A vector is an ordered collection of items that all have the same type.

• To combine items into a vector, use the c() function and separate the items with commas.

• Vectors play the role that one-dimensional arrays play in other languages: they hold multiple data values of the same type.

• One key point is that in R the indexing of a vector starts from 1 and not from 0.

Six types of atomic vectors in R
Only the first four (double, integer, logical, and character) are discussed and used in this book.

     Type                 Example                 Comment
1    double (or numeric)  -0.5, 120.9, 5.0        Floating point numbers with double precision
2    integer              -1L, 121L, 5L           “Long” integers
3    logical              TRUE, FALSE             Boolean
4    character            "R", "5" or 'R', '5'    Text
5    complex              -5+11i, 3+2i, 0+4i      Real+imaginary numbers
6    raw                  01, ff                  Raw bytes (as hexadecimal)

Important vector functions
• In programming, functions are used to perform a specific task, e.g., to manipulate an object, calculate a derived quantity, or investigate existing objects.

A few of the most important ones for creating and investigating simple vectors are:

• c(): Combines multiple elements into one atomic vector.

• length(): Returns the length (number of elements) of an object.

• class(): Returns the class of an object.

• typeof(): Returns the type of an object.

• attributes(): Returns further metadata of arbitrary type.
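
The functions listed above can be tried on a small vector. This is a minimal sketch; the vector x and its names are made up for illustration and do not come from the original slides.

```r
# Hypothetical example vector (not from the original slides)
x <- c(1.5, 2, 3.25)

length(x)      # number of elements: 3
class(x)       # "numeric"
typeof(x)      # "double"
attributes(x)  # NULL, because a plain vector carries no attributes

# Adding names attaches metadata that attributes() then reports
names(x) <- c("a", "b", "c")
attributes(x)
```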

Create a vector variable called fruits that combines strings:

# Vector of characters/strings
fruits <- c("banana", "apple", "orange")

# Print fruits
fruits

Output:
[1] "banana" "apple"  "orange"

Create a vector that combines numerical values:

# Vector of numerical values
numbers <- c(1, 2, 3)

# Print numbers
numbers

Output:
[1] 1 2 3

Numerical values
• To create a vector with numerical values in a sequence, use the : operator:

# Vector with numerical values in a sequence
numbers <- 1:10

# Print numbers
numbers

Output:
[1]  1  2  3  4  5  6  7  8  9 10

Values in a sequence
• You can also create a sequence of numerical values with decimals. Note that if the last element does not belong to the sequence, it is not used.
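
A short sketch of this behaviour, with values chosen here only for illustration: the : operator steps by 1 from the start value and stops before it would pass the end value.

```r
# Both endpoints lie on the sequence, so both are included
numbers1 <- 1.5:6.5
numbers1   # 1.5 2.5 3.5 4.5 5.5 6.5

# 6.3 cannot be reached from 1.5 in steps of 1, so the sequence stops at 5.5
numbers2 <- 1.5:6.3
numbers2   # 1.5 2.5 3.5 4.5 5.5
```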

Create a vector of logical values
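
The code for this slide is not in the extracted text; a minimal stand-in, with the name log_values assumed for illustration, could look like this:

```r
# Vector of logical values
log_values <- c(TRUE, FALSE, TRUE, FALSE)

log_values
class(log_values)   # "logical"
```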

Vector Length
To find out how many items a vector has, use the length() function.
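
For example, reusing the fruits vector from the earlier slide:

```r
fruits <- c("banana", "apple", "orange")

# Number of items in the vector
n_fruits <- length(fruits)
n_fruits   # 3
```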

Sort a Vector
• To sort items in a vector alphabetically or numerically, use the sort() function.
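
A minimal sketch of sort() on a character vector and a numeric vector. The sample values are assumptions made for illustration; the decreasing argument is part of base R's sort().

```r
fruits <- c("banana", "apple", "orange", "mango", "lemon")
numbers <- c(13, 3, 5, 7, 20, 2)

sorted_fruits <- sort(fruits)     # alphabetical order
sorted_numbers <- sort(numbers)   # increasing numerical order
sorted_fruits
sorted_numbers

# decreasing = TRUE reverses the order
sort(numbers, decreasing = TRUE)
```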

Access Vectors
• You can access the vector items by referring to their index number inside brackets [].

• The first item has index 1, the second item has index 2, and so on.

• You can also access multiple elements by referring to different index positions with the c() function.
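
The two forms of access can be sketched as follows, using an assumed fruits vector:

```r
fruits <- c("banana", "apple", "orange", "mango", "lemon")

# Access the first item (indexing starts at 1)
first <- fruits[1]
first          # "banana"

# Access the first and third items by passing a vector of indexes
picked <- fruits[c(1, 3)]
picked         # "banana" "orange"
```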

Change an Item

• To change the value of a specific item, assign a new value to its index number.
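
A minimal sketch, again with an assumed fruits vector:

```r
fruits <- c("banana", "apple", "orange", "mango", "lemon")

# Replace the first item by assigning to its index
fruits[1] <- "pear"

fruits   # "pear" "apple" "orange" "mango" "lemon"
```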

Repeat Vectors
• To repeat vectors, use the rep() function.

• The times argument repeats the whole sequence of the vector.

• The each argument repeats each value independently.
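
Both ways of repeating can be sketched with rep() and its times and each arguments; the input values are chosen for illustration.

```r
# Repeat the whole sequence three times
repeat_times <- rep(c(1, 2, 3), times = 3)
repeat_times   # 1 2 3 1 2 3 1 2 3

# Repeat each value three times before moving to the next
repeat_each <- rep(c(1, 2, 3), each = 3)
repeat_each    # 1 1 1 2 2 2 3 3 3
```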

Matrices
• A matrix is a rectangular arrangement of numbers in rows and columns.

• Rows run horizontally and columns run vertically.

• In R, a matrix is a two-dimensional, homogeneous data structure: all of its elements have the same type.

[Figure: examples of matrices]

Creating a Matrix in R
• To create a matrix in R, use the matrix() function.

• Its first argument is the vector of elements that fill the matrix.

• You also pass the number of rows and the number of columns you want the matrix to have.

Syntax to Create R-Matrix

• matrix(data, nrow, ncol, byrow, dimnames)

Parameters:

• data – values you want to enter

• nrow – no. of rows

• ncol – no. of columns

• byrow – logical; if TRUE, values are assigned row by row (the default, FALSE, fills column by column)

• dimnames – a list holding the names of the rows and columns
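
A minimal sketch that uses all five parameters. The data values and the row and column names are assumptions made for illustration.

```r
m <- matrix(
  data = 1:6,
  nrow = 2,
  ncol = 3,
  byrow = TRUE,   # fill row by row instead of column by column
  dimnames = list(c("r1", "r2"), c("a", "b", "c"))
)

m
#    a b c
# r1 1 2 3
# r2 4 5 6
```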

Creating Special Matrices in R
• R allows the creation of various types of matrices through the arguments passed to the matrix() and diag() functions.

1. Matrix where all rows and columns are filled by a single constant ‘k’:

• To create such an R matrix, use the following syntax:

• Syntax: matrix(k, m, n)

Parameters:

• k: the constant

• m: no of rows

• n: no of columns
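
For instance, a 3 x 3 matrix filled with the constant 5 (values chosen only for illustration):

```r
# k = 5, m = 3 rows, n = 3 columns
k_matrix <- matrix(5, 3, 3)
k_matrix
```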

2. Diagonal matrix:
• A diagonal matrix is a matrix in which the entries outside the main diagonal are all zero.

• To create such an R matrix, use the following syntax:

• Syntax: diag(k, m, n)

• Parameters:

• k: the constant or vector of values placed on the diagonal

• m: no. of rows

• n: no. of columns
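
A minimal sketch with an assumed vector of diagonal values:

```r
# Diagonal entries 5, 3, 3 in a 3 x 3 matrix; everything else is 0
d_matrix <- diag(c(5, 3, 3), 3, 3)
d_matrix
```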

3. Identity matrix:
• An identity matrix is a square matrix in which all the elements of the principal diagonal are ones and all other elements are zeros.

• To create such an R matrix, use the following syntax:

• Syntax: diag(k, m, n)

• Parameters:

• k: 1

• m: no. of rows

• n: no. of columns
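
For example, a 3 x 3 identity matrix:

```r
# k = 1 on the diagonal of a 3 x 3 matrix
id_matrix <- diag(1, 3, 3)
id_matrix
```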

R-Matrix
• We can access elements of an R matrix using the same convention that is followed in data frames: write the matrix name followed by square brackets with a comma inside, as in m[row, column].

• The value before the comma selects rows and the value after the comma selects columns.
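
The original code for this slide is not in the extracted text. A minimal stand-in, with an assumed 3 x 3 matrix m, could look like this:

```r
m <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
m

# Row index before the comma, column index after it
element <- m[2, 3]
element   # 6
```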

Subsetting in R Programming
• To subset specific elements of an R matrix, use bracket notation []. With it you can select a single element, multiple elements, a range, or elements chosen from a list of indices.

• A matrix can be created from vectors by specifying the number of rows and the number of columns.
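
One way to build such a matrix, assuming three made-up vectors v1, v2, and v3 that become its columns:

```r
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
v3 <- c(7, 8, 9)

# matrix() fills column by column, so v1, v2, v3 become columns 1, 2, 3
m <- matrix(c(v1, v2, v3), nrow = 3, ncol = 3)
m
```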

Subset a specific element
• Given a matrix, pass the row index and the column index of the element you want into bracket notation [ ].

• This returns the single element at that location.

Subset an R Matrix by a Specific Row
• To subset a matrix by a specific row, use bracket notation ([]).

• Specify the row index on the left side of the comma, and the matrix will be subsetted to the corresponding row.
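
A minimal sketch with an assumed 3 x 4 matrix. The drop = FALSE argument is a base R option that keeps the result as a matrix; it is not mentioned in the slides.

```r
m <- matrix(1:12, nrow = 3, ncol = 4)

# Second row, returned as a plain vector
row2 <- m[2, ]
row2   # 2 5 8 11

# Same row, kept as a 1 x 4 matrix
row2_matrix <- m[2, , drop = FALSE]
dim(row2_matrix)
```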

Subset a Matrix by a Specific Column
• Alternatively, you can subset a matrix by a specific column using bracket notation ([]).

• This time, specify the column index on the right side of the comma, and the matrix will be subsetted to the corresponding column.
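
The same idea for columns, again with an assumed 3 x 4 matrix:

```r
m <- matrix(1:12, nrow = 3, ncol = 4)

# Third column, returned as a plain vector
col3 <- m[, 3]
col3   # 7 8 9

# Several columns at once: columns 1 and 4
cols14 <- m[, c(1, 4)]
cols14
```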

Subset an R Matrix by Logical Condition
• A matrix can be subsetted by rows based on a logical condition.

• For instance, you can select the rows that meet a specific condition, and only those rows will be included in the result.
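
A sketch of row selection by condition. The matrix and the condition (first column greater than 1) are assumptions chosen for illustration.

```r
m <- matrix(1:12, nrow = 3, ncol = 4)

# Logical vector with one entry per row: FALSE TRUE TRUE
keep <- m[, 1] > 1

# Only the rows where the condition holds are returned
selected <- m[keep, ]
selected
```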

Subset an R Matrix Using the subset() Function
• So far, we have subsetted matrices using bracket notation ([]).

• The subset() function provides a concise way to filter data frames or matrices based on conditions.

• Here it is used to subset a matrix by rows based on a condition.
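
A minimal sketch, assuming a matrix with made-up column names. For a matrix, the condition passed to subset() must be a logical vector with one entry per row.

```r
m <- matrix(1:12, nrow = 3, ncol = 4,
            dimnames = list(NULL, c("a", "b", "c", "d")))

# Keep the rows whose value in column "a" is greater than 1
result <- subset(m, m[, "a"] > 1)
result
```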

Subset a Matrix by Name
• To subset a matrix by name in R, you can use the row and column names along with square
brackets.

• Let’s create a matrix with customized column names and row names and use these names to subset
a matrix by names.
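
A short sketch with assumed row and column names:

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("row1", "row2"), c("x", "y", "z")))

# Single element by row name and column name
value <- m["row1", "y"]
value          # 3

# A whole row, and a selection of columns, by name
m["row2", ]
xz <- m[, c("x", "z")]
xz
```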

R – Array
• Arrays are data storage structures defined by a fixed number of dimensions. Arrays are used for the allocation of space at contiguous memory locations.
• In R, uni-dimensional arrays are called vectors, with the length being their only dimension. Two-dimensional arrays are called matrices, consisting of fixed numbers of rows and columns. All elements of an R array have the same data type. Vectors are supplied as input to the array() function, which creates an array based on the number of dimensions.
• Creating an Array
• An R array can be created with the array() function. A vector of elements is passed to array() along with the required dimensions.
• array(data, dim = c(nrow, ncol, nmat), dimnames = names)
• where
• nrow: Number of rows
• ncol: Number of columns
• nmat: Number of matrices of dimensions nrow * ncol
• dimnames: Default value = NULL. Otherwise, a list has to be specified which has a name for each component of the dimension. Each component is either NULL or a vector of length equal to the dim value of that corresponding dimension.

Uni-Dimensional Array
• A vector is a uni-dimensional array, which is specified by a single dimension, length.

• A vector can be created using the c() function. The values are passed to c() to create the vector.

Multi-Dimensional Array
• A two-dimensional matrix is an array specified by a fixed number of rows and columns, with every element of the same data type.

• A matrix, or an array with more dimensions, is created by passing the values and the dimensions to the array() function.
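
A minimal sketch of array() with three dimensions. The data and the dimension names are assumptions made for illustration.

```r
# Two matrices, each with 2 rows and 3 columns
arr <- array(
  data = 1:12,
  dim = c(2, 3, 2),
  dimnames = list(c("r1", "r2"), c("c1", "c2", "c3"), c("m1", "m2"))
)

dim(arr)             # 2 3 2
arr[2, 3, 1]         # 6: row 2, column 3 of the first matrix
second <- arr[, , "m2"]   # the second matrix, holding 7 to 12
second
```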

Class in R
• A class is the blueprint that helps to create an object; it contains the object's member variables along with its attributes. As discussed in the previous section, R has two main class systems, S3 and S4.

• S3 Class

• The S3 class system is somewhat primitive in nature. It lacks a formal definition and structure, and an object of an S3 class can be created simply by adding a class attribute to it.

• This simplicity accounts for the fact that S3 is the simplest yet the most popular OOP system in R. In fact, most of R's built-in classes are of this type.
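
The original example is not in the extracted text. A minimal sketch of the idea follows; the class name "student", its fields, and the print method are all hypothetical.

```r
# An ordinary list becomes an S3 object once a class attribute is attached
student <- list(name = "Asha", roll_no = 21)
class(student) <- "student"

# A print method for the (hypothetical) class "student"
print.student <- function(x, ...) {
  cat("Student:", x$name, "| Roll no:", x$roll_no, "\n")
  invisible(x)
}

print(student)   # dispatches to print.student
```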

Introduction to Factors
• A factor is a statistical data type used to store categorical variables.

• The difference between a categorical variable and a continuous variable is that a categorical
variable can belong to a limited number of categories.

• A continuous variable, on the other hand, can correspond to an infinite number of values.

• It is important that R knows whether it is dealing with a continuous or a categorical variable, as the statistical models you will develop in the future treat both types differently.

• A good example of a categorical variable is Gender.

• In many circumstances you can limit the Gender categories to “Male” or “Female”.

• In the above example, all the possible cases are known beforehand and are predefined.

• These distinct values are known as levels.

• After a factor is created it only consists of levels that are by default sorted alphabetically.

Creating a Factor in R
• The command used to create or modify a factor in R is factor(), with a vector as input.

The two steps to creating an R factor:

1. Creating a vector

2. Converting the vector into a factor using the function factor()
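
The two steps can be sketched as follows; the vector name gender_vector and its values are assumptions made for illustration.

```r
# Step 1: create a vector
gender_vector <- c("Male", "Female", "Female", "Male", "Male")

# Step 2: convert the vector into a factor
factor_gender_vector <- factor(gender_vector)

factor_gender_vector
levels(factor_gender_vector)   # "Female" "Male" (sorted alphabetically)
```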

Factor levels
• When you first get a dataset, you will often notice that it contains factors with specific factor levels. However, sometimes you will want to change the names of these levels for clarity or other reasons. R allows you to do this with the function levels():

levels(factor_vector) <- c("name1", "name2",...)

• A good illustration is the raw data provided to you by a survey. A common question in every questionnaire is the gender of the respondent. Here, for simplicity, just two categories were recorded, “M” and “F”.

survey_vector <- c("M", "F", "F", "M", "M")

• Recording the gender with the abbreviations “M” and “F” can be convenient if you are collecting data with pen and paper, but it can introduce confusion when analyzing the data. At that point, you will often want to change the factor levels to “Male” and “Female” instead of “M” and “F” for clarity.

• Watch out: the order in which you assign the levels is important. If you type levels(factor_survey_vector), where factor_survey_vector is the factor created from survey_vector, you’ll see that it outputs [1] "F" "M". If you don’t specify the levels of the factor when creating it, R will automatically assign them alphabetically.

• To correctly map "F" to "Female" and "M" to "Male", the levels should be set to c("Female", "Male"), in this order.
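
Putting these steps together gives the following sketch. It assumes that factor_survey_vector is simply factor(survey_vector), which the extracted text implies but does not show.

```r
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)

# Levels are assigned alphabetically: "F" first, then "M"
levels(factor_survey_vector)

# Rename the levels in that same order
levels(factor_survey_vector) <- c("Female", "Male")

factor_survey_vector
```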

Summarizing a factor
• One of your favorite functions in R will be summary(). This will give you a quick overview of the contents of a variable:

summary(my_var)

• Going back to our survey, you would like to know how many “Male” responses you have in your study, and how many “Female” responses. The summary() function gives you the answer to this question.

• Because you identified “Male” and “Female” as factor levels in factor_survey_vector, R can show the number of elements for each category.
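
A minimal sketch comparing summary() on the character vector and on the factor, assuming the survey data shown earlier:

```r
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")

# For a character vector, summary() only reports length, class, and mode
summary(survey_vector)

# For a factor, summary() counts the elements in each level
counts <- summary(factor_survey_vector)
counts   # Female: 2, Male: 3
```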

Ordered factors
• Since “Male” and “Female” are unordered (or nominal) factor levels, comparing two elements of factor_survey_vector with the greater than operator makes R return a warning message, telling you that the operator is not meaningful for such factors.

• As seen before, R attaches an equal value to the levels of such factors.

• But this is not always the case! Sometimes you will also deal with factors that do have a natural ordering between their categories.

• If this is the case, we have to make sure that we pass this information to R.

• Let us say that you are leading a research team of five data analysts and that you want to evaluate their performance.

• To do this, you track their speed, evaluate each analyst as “slow”, “medium” or “fast”, and save the results in speed_vector.

Instructions
• As a first step, assign speed_vector a vector with 5 entries, one for each analyst.

• Each entry should be either “slow”, “medium”, or “fast”. Use the list below:

• Analyst 1 is medium,

• Analyst 2 is slow,

• Analyst 3 is slow,

• Analyst 4 is medium and

• Analyst 5 is fast.

• No need to specify these are factors yet.

# Create speed_vector
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
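
The step that turns speed_vector into the ordered factor factor_speed_vector is not in the extracted text. One way to do it, sketched here with the ordered and levels arguments of factor(), is:

```r
speed_vector <- c("medium", "slow", "slow", "medium", "fast")

# ordered = TRUE marks the factor as ordered;
# levels lists the categories from lowest to highest
factor_speed_vector <- factor(
  speed_vector,
  ordered = TRUE,
  levels = c("slow", "medium", "fast")
)

factor_speed_vector
```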

Comparing ordered factors
• Having a bad day at work, ‘data analyst number two’ enters your office and starts complaining that
‘data analyst number five’ is slowing down the entire project.

• Since you know that ‘data analyst number two’ has the reputation of being a smarty-pants, you first
decide to check if his statement is true.

• The fact that factor_speed_vector is now ordered enables us to compare different elements (the data
analysts in this case).

• You can simply do this by using the well-known operators.

Instructions

• Use [2] to select from factor_speed_vector the factor value for the second data analyst. Store it as da2.

• Use [5] to select the factor_speed_vector factor value for the fifth data analyst. Store it as da5.

• Check if da2 is greater than da5; simply print out the result. Remember that you can use the >
operator to check whether one element is larger than the other.
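
The instructions above can be sketched as follows, assuming factor_speed_vector was created as an ordered factor with the levels "slow" < "medium" < "fast":

```r
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
factor_speed_vector <- factor(speed_vector, ordered = TRUE,
                              levels = c("slow", "medium", "fast"))

# Factor value for the second data analyst
da2 <- factor_speed_vector[2]

# Factor value for the fifth data analyst
da5 <- factor_speed_vector[5]

# Is data analyst 2 faster than data analyst 5?
da2 > da5   # FALSE, because "slow" ranks below "fast"
```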

Data Frames

• A data frame is data displayed in a table format.

• A data frame can hold different types of data: while the first column can be character, the second and third can be numeric or logical.

• However, all values within a single column must have the same type.

• Data frames in R are generic data objects used to store tabular data.

• A data frame can also be seen as a matrix in which each column may have a different data type.

• An R data frame is made up of three principal components: the data, the rows, and the columns.

• Use the data.frame() function to create a data frame.

R Data Frames Structure
[Figure: structure of a data frame]

• The data is presented in tabular form, which makes it easier to operate on and understand.

Create Data frame in R
• To create an R data frame, use the data.frame() function and pass each of the vectors you have created as arguments to the function.
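
A minimal sketch with three made-up vectors of different types. stringsAsFactors = FALSE is set explicitly so that the character column stays character in every R version.

```r
# Hypothetical vectors, one per column
name <- c("Asha", "Ravi", "Meena")
age <- c(21, 23, 22)
passed <- c(TRUE, FALSE, TRUE)

students <- data.frame(name, age, passed, stringsAsFactors = FALSE)

students
str(students)   # one type per column: character, numeric, logical
```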

Create Subsets of a Data frame
• The subset() function in R is used to create subsets of a data frame. It can also be used to drop columns from a data frame.

Syntax: subset(df, expr)

• Parameters:

• df: Data frame used

• expr: Condition for subset

Example 1: Basic example of the subset() function
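
The code for this example is not in the extracted text. A stand-in sketch, with a hypothetical data frame df and column names chosen for illustration, could be:

```r
df <- data.frame(id = 1:3, score = c(72, 85, 91))

# Keep only the rows that satisfy the condition
df_sub <- subset(df, score > 90)
df_sub   # the single row with id 3

# The select argument can drop columns: here everything except score
df_no_score <- subset(df, select = -score)

df       # the original data frame is unchanged
```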

In the code above, the original data frame remains intact, while a new data frame is created that holds the selected row from the original data frame.

Extending Data Frames in R
• The expand.grid() function in R creates a data frame with all the combinations of the values of the vectors or factors passed to it as arguments.

• Syntax: expand.grid(...)

• Parameters:

• ...: Vector1, Vector2, Vector3, …

R program to expand a data frame
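
The program itself is missing from the extracted text. A minimal sketch with two made-up vectors shows the idea:

```r
x <- c("a", "b")
y <- 1:3

# One row for every combination of a value of x with a value of y
grid <- expand.grid(x = x, y = y)

grid
nrow(grid)   # 2 * 3 = 6 combinations
```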

How to Sort a Data Frame in R?
• In R, a data frame is a two-dimensional tabular data structure that consists of rows and columns.

• Sorting a data frame reorders its rows based on the values in one or more columns. This can be useful for various purposes, such as organizing data for analysis or presentation.

• Method to sort a data frame:

• order() function (increasing and decreasing order)

Method 1: Using order() function
• order() returns the index positions that would sort a column; indexing the rows of the data frame with those positions sorts the data frame by that column.

• Syntax: order(dataframe$column_name, decreasing = TRUE)

where

• dataframe is the input data frame

• column_name is the column by which the data frame is sorted

• decreasing specifies the sorting order: if it is TRUE, the data frame is sorted in descending order; otherwise, in increasing order

• Return type: index positions of the elements

Example
• An R program that creates a data frame with 2 columns and sorts it by a particular column in decreasing order: first by subjects, then by roll no.
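
The program is not in the extracted text. A sketch under assumed data (the column names rollno and subjects and their values are made up) could be:

```r
data <- data.frame(
  rollno = c(1, 5, 4, 2, 3),
  subjects = c("java", "python", "php", "sql", "c"),
  stringsAsFactors = FALSE
)

# order() returns row positions; using them as the row index sorts the rows
by_subjects <- data[order(data$subjects, decreasing = TRUE), ]
by_subjects

by_rollno <- data[order(data$rollno, decreasing = TRUE), ]
by_rollno
```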

A List in R programming

• A list in R is a generic object consisting of an ordered collection of objects.

• A list is a one-dimensional, heterogeneous data structure.

• A list can hold elements of different types, such as numbers, strings, vectors, matrices, functions, and even other lists.

• A list is created using the list() function.

• In R, the indexing of a list starts with 1 instead of 0.

Creating a List
• A list can contain strings, numbers, vectors, and logical values together.
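
A minimal sketch of such a list; the element values are assumptions made for illustration.

```r
# A string, a number, a numeric vector, and a logical value in one list
list_data <- list("Red", 51.3, c(21, 32, 11), TRUE)

length(list_data)   # 4
str(list_data)      # shows the type of each element
```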

Naming List Components
• Naming list components makes it easier to access them.
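
A short sketch using names() to label the components; the component names and values are hypothetical.

```r
list_data <- list(c("Jan", "Feb", "Mar"),
                  matrix(1:6, nrow = 2),
                  list("green", 12.3))

# Give a name to each component of the list
names(list_data) <- c("months", "a_matrix", "inner_list")

names(list_data)
list_data$months
```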

Accessing List Elements
• Elements of a list can be accessed by their index in the list.

• In a named list, elements can also be accessed by name.

• All the components of a list can be named, and those names can be used to access the components with the $ operator.
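
The three ways of access can be sketched as follows, using a hypothetical named list emp:

```r
emp <- list(name = "Asha", salary = 5000, skills = c("R", "SQL"))

emp[[1]]          # by index: "Asha"
emp[["salary"]]   # by name: 5000
emp$skills        # by name with the $ operator: "R" "SQL"

# Single brackets return a sub-list rather than the element itself
sub <- emp[1]
class(sub)        # "list"
```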

Manipulating List Elements
• We can add, delete and update list elements.

• New elements are most simply added at the end of a list, and any element can be deleted by assigning NULL to it.

• We can also update any element.
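
A minimal sketch of the three operations on a hypothetical list emp:

```r
emp <- list(name = "Asha", salary = 5000, skills = c("R", "SQL"))

# Add a new element at the end of the list
emp[["dept"]] <- "AIML"

# Update an existing element
emp$salary <- 5500

# Delete an element by assigning NULL to it
emp[["skills"]] <- NULL

names(emp)   # "name" "salary" "dept"
```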

Merging Lists
• You can merge many lists into one list by passing all of them to a single c() call. Placing them inside list() instead creates a nested list that contains the original lists as elements.
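
A short sketch contrasting c() with list() on two made-up lists, to show which call produces a single flat list:

```r
list1 <- list(1, 2, 3)
list2 <- list("Sun", "Mon", "Tue")

# c() concatenates the elements of both lists into one flat list
merged <- c(list1, list2)
length(merged)   # 6

# list() keeps each input list as one element of a nested list
nested <- list(list1, list2)
length(nested)   # 2
```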

Converting List to Vector
• A list can be converted to a vector so that the elements of the vector can be used for further
manipulation.

• All the arithmetic operations on vectors can be applied after the list is converted into vectors.

• To do this conversion, we use the unlist() function. It takes the list as input and produces a vector.
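
A minimal sketch with two assumed lists of numbers:

```r
list1 <- list(1:5)
list2 <- list(10:14)

# Convert the lists to vectors
v1 <- unlist(list1)
v2 <- unlist(list2)

# Arithmetic now works element by element
result <- v1 + v2
result   # 11 13 15 17 19
```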
