Directorate of Online Education

SRM Institute of Science and Technology

SRM Nagar, Kattankulathur-603203

LAB MANUAL

Course : MCA

SEMESTER
II

SUBJECT TITLE
DATA ANALYSIS USING R

SUBJECT CODE
V20PCA207

Prepared By
S. Mythreyi Koppur
Asst. Professor, DOE,
SRM IST

OBJECTIVES

1. To introduce students to the basics of R programming.

2. To develop students' skills in writing programs and solving problems.

3. To introduce concepts such as installation, loading datasets, and working
with data frames in R.

COURSE OUTCOME

1. Understand the logic for a given problem.
2. Write the process steps (algorithm) for a given problem.
3. Apply the concepts covered in the theory course.
4. Evaluate the methodology for a given problem.
5. Recognize and understand the syntax and construction of programming code.
6. Know the steps involved in compiling and linking code.
7. Know alternative ways of solving a given problem.

INTRODUCTION ABOUT R PROGRAMMING LAB

The R Programming content is prepared by the course coordinator of the subject
concerned to help students with their practical understanding and development
of programming skills, and may be used as a base reference during the practical
assignments. The model lab programs and the list of exercise assignments
prepared by staff members will be uploaded to the LMS. Students must submit
their lab exercises through the LMS, in the Assignment section, in a separate
folder for the subject concerned. The course coordinator will evaluate the work
for the end-semester sessional examination after students have submitted all
program assignments. The lab program reports follow the prescribed format
(Appendix-I), and the list of lab exercises given as assignments follows the
prescribed format (Appendix-II).

VIRTUAL LAB CONTENTS (STUDENT)

This manual is intended for first-year students of the MCA programme in the
subject Data Analysis using R. It contains practical/lab sessions on the
programming language, covering various aspects of the subject to enhance
understanding.
Although the syllabus prescribes R programs, we have made an effort to cover
various aspects of software development languages.
Students are advised to go through this manual thoroughly rather than only the
topics mentioned in the syllabus, as practical aspects are the key to
understanding and to conceptual visualization of the theoretical aspects
covered in the online contents.
Guidelines
1. Students are instructed to perform their lab exercises/assignments on their
own systems from their respective residences.
2. Write and edit the program on your system, then compile and execute it and
save the output.
3. Students are also advised to submit completed lab assignments in the
prescribed format (Appendix-I) in the LMS.
4. Students are advised to complete the weekly activities/assignments well in
time.
5. The submitted lab assignments will be evaluated for the end-semester
practical examination.
6. Students must get their completed lab assignments evaluated by the course
coordinator concerned through the LMS, failing which the lab assignments for
that week will be treated as incomplete.
7. At least TEN (10) such timely completed lab assignments are compulsory,
failing which students will not be allowed to appear in the final end-semester
lab examination.

APPENDIX-I
1. Title: Assignment on implementing data types in R

2. Process Steps/Description

1. Understanding the types of data.
2. Gaining a better understanding of working with data vectors.
3. Working with vectors and missing (NA) values.
4. Understanding data frames.
5. Understanding the assignment functions.
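
The process steps above can be sketched in a few lines of base R. This is only an illustrative sketch: the names marks and student_marks, the column names, and all values are hypothetical and are not part of the prescribed program.

```r
# Hypothetical data for illustration only.
marks <- c(74, 122, NA, 111)      # numeric vector with one missing value

# Step 1: inspect the type of data held by a vector.
class(marks)                      # numeric
class(c("a", "b"))                # character
class(marks > 100)                # logical

# Steps 2-3: NA propagates unless it is removed explicitly.
mean(marks)                       # NA
mean(marks, na.rm = TRUE)         # mean of the three known values

# Step 4: a data frame stores equal-length vectors as columns.
student_marks <- data.frame(
  name  = c("Alice", "Bob", "Shirley", "Dev"),
  marks = marks
)
str(student_marks)

# Step 5: assignment (replacement) functions modify part of an object.
names(student_marks)[2] <- "score"   # rename the second column
student_marks$score[3] <- 98         # fill the missing mark (made-up value)
student_marks
```

The same replacement pattern, names(x) <- value, appears in the sample coding when names are attached to test_scores.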

3. Methodology

R platform

4. Sample coding

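# Packages loaded in the original session; the base R code below does not depend on them
library(dplyr)
library(ggplot2)

# Numeric vector and basic summaries
class10 <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79)
length(class10)
sum(class10)
sum(class10)/length(class10)
mean(class10)

# Vectorized arithmetic: each operation is applied element by element
class10 - mean(class10)
(class10)^2 / length(class10)
sqrt(class10)

# Missing values: NA propagates unless na.rm = TRUE is given
fees <- c(10500, 45000, 74100, NA, 83500, 86000, 38200, NA,
44300, 12500, 55700, 43900, 71900, NA, 62000)
sum(fees)
sum(fees, na.rm = TRUE)
mean(fees, na.rm=TRUE)

# Inbuilt named vector: inspect the first values, then sort in descending order
datasets::precip
head(precip)
head( sort(precip, decreasing=TRUE) )
head(names(precip))

# Attaching names with the names<- assignment function
test_scores <- c(87, 782, 99)
names(test_scores) <- c("Alice", "Bob", "Shirley")
test_scores

# Mixed types in one vector are coerced to character
x <- c(1, "two", "III")
x

# Generating sequences and repeated values
1:5
1:length(class10)
0:(length(class10) - 1)
seq(0, 100, by=10)
seq(0, 100, length.out=11)
rep(5, times=10)
rep(1:3, times=4)
rep(c(1,2,3), times=c(3,2,1))

# Indexing: positive indices select, negative indices drop,
# and an index beyond the length returns NA
class10
class10[1]
class10[4]
class10[13]
class10[c(1,3,5,7,9)]
class10[-1]
class10[-5]
class10[c(-5,-7,-9)]

# Logical comparisons, combined with | (or) and & (and)
class10 > 100
class10 == 111
class10 == 500
class10 < 100 | class10 > 200
class10 > 100 & class10 < 200
any(class10 > 300)
any(fees > 100000,na.rm = TRUE)
any(fees > 20000,na.rm = TRUE)

# Indexing and comparing a vector that contains NA
fees[2]
fees[4]
fees[c(1,3,4,6,7)]
fees > 10000
1 : length(fees)
all(class10 > 80)

# Keep only the non-missing values and count them
fees[ !is.na(fees) ]
length(fees[ !is.na(fees) ])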

5. Sample Output:

> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Warning message:
package ‘dplyr’ was built under R version 4.1.3
> library(ggplot2)
Warning message:
package ‘ggplot2’ was built under R version 4.1.3
> class10 <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79)
> length(class10)
[1] 10
> sum(class10)
[1] 1524
> sum(class10)/length(class10)
[1] 152.4
> mean(class10)
[1] 152.4
> class10 - mean(class10)
[1] -78.4 -30.4 82.6 -41.4 139.6 -41.4 58.6 -19.4 3.6 -73.4
> (class10)^2 / length(class10)
[1] 547.6 1488.4 5522.5 1232.1 8526.4 1232.1 4452.1 1768.9
[9] 2433.6 624.1
> sqrt(class10)
[1] 8.602325 11.045361 15.329710 10.535654 17.088007 10.535654
[7] 14.525839 11.532563 12.489996 8.888194
> fees <- c(10500, 45000, 74100, NA, 83500, 86000, 38200, NA,
+ 44300, 12500, 55700, 43900, 71900, NA, 62000)
> sum(fees)
[1] NA
> sum(fees, na.rm = TRUE)
[1] 627600
> mean(fees, na.rm=TRUE)
[1] 52300
> datasets::precip
Mobile Juneau Phoenix
67.0 54.7 7.0
Little Rock Los Angeles Sacramento
48.5 14.0 17.2
San Francisco Denver Hartford
20.7 13.0 43.4
Wilmington Washington Jacksonville
40.2 38.9 54.5
Miami Atlanta Honolulu
59.8 48.3 22.9
Boise Chicago Peoria
11.5 34.4 35.1
Indianapolis Des Moines Wichita
38.7 30.8 30.6
Louisville New Orleans Portland
43.1 56.8 40.8
Baltimore Boston Detroit
41.8 42.5 31.0
Sault Ste. Marie Duluth Minneapolis/St Paul
31.7 30.2 25.9
Jackson Kansas City St Louis
49.2 37.0 35.9
Great Falls Omaha Reno
15.0 30.2 7.2
Concord Atlantic City Albuquerque
36.2 45.5 7.8
Albany Buffalo New York
33.4 36.1 40.2
Charlotte Raleigh Bismark
42.7 42.5 16.2
Cincinnati Cleveland Columbus
39.0 35.0 37.0
Oklahoma City Portland Philadelphia
31.4 37.6 39.9
Pittsburg Providence Columbia
36.2 42.8 46.4
Sioux Falls Memphis Nashville
24.7 49.1 46.0
Dallas El Paso Houston
35.9 7.8 48.2
Salt Lake City Burlington Norfolk
15.2 32.5 44.7
Richmond Seattle Tacoma Spokane
42.6 38.8 17.4
Charleston Milwaukee Cheyenne
40.8 29.1 14.6
San Juan
59.2
> head(precip)
Mobile Juneau Phoenix Little Rock Los Angeles
67.0 54.7 7.0 48.5 14.0
Sacramento
17.2
> head( sort(precip, decreasing=TRUE) )
Mobile Miami San Juan New Orleans
67.0 59.8 59.2 56.8
Juneau Jacksonville
54.7 54.5
> head(names(precip))
[1] "Mobile" "Juneau" "Phoenix" "Little Rock"
[5] "Los Angeles" "Sacramento"
> test_scores <- c(87, 782, 99)
> names(test_scores) <- c("Alice", "Bob", "Shirley")
> test_scores
Alice Bob Shirley
87 782 99
> x <- c(1, "two", "III")
> x
[1] "1" "two" "III"
> 1:5
[1] 1 2 3 4 5
> 1:length(class10)
[1] 1 2 3 4 5 6 7 8 9 10
> 0:(length(class10) - 1)
[1] 0 1 2 3 4 5 6 7 8 9
> seq(0, 100, by=10)
[1] 0 10 20 30 40 50 60 70 80 90 100
> seq(0, 100, length.out=11)
[1] 0 10 20 30 40 50 60 70 80 90 100
> rep(5, times=10)
[1] 5 5 5 5 5 5 5 5 5 5
> rep(1:3, times=4)
[1] 1 2 3 1 2 3 1 2 3 1 2 3
> rep(c(1,2,3), times=c(3,2,1))
[1] 1 1 1 2 2 3
> class10
[1] 74 122 235 111 292 111 211 133 156 79
> class10[1]
[1] 74
> class10[4]
[1] 111
> class10[13]
[1] NA
> class10[c(1,3,5,7,9)]
[1] 74 235 292 211 156
> class10[-1]
[1] 122 235 111 292 111 211 133 156 79
> class10[-5]
[1] 74 122 235 111 111 211 133 156 79
> class10[c(-5,-7,-9)]
[1] 74 122 235 111 111 133 79
> class10 > 100
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
> class10 == 111
[1] FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
> class10 == 500
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> class10 < 100 | class10 > 200
[1] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE FALSE TRUE
> class10 > 100 & class10 < 200
[1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE FALSE
> any(class10 > 300)
[1] FALSE
> any(fees > 100000,na.rm = TRUE)
[1] FALSE
> any(fees > 20000,na.rm = TRUE)
[1] TRUE
> fees[2]
[1] 45000
> fees[4]
[1] NA
> fees[c(1,3,4,6,7)]
[1] 10500 74100 NA 86000 38200
> fees > 10000
[1] TRUE TRUE TRUE NA TRUE TRUE TRUE NA TRUE TRUE TRUE TRUE
[13] TRUE NA TRUE
> 1 : length(fees)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> all(class10 > 80)
[1] FALSE
> fees[ !is.na(fees) ]
[1] 10500 45000 74100 83500 86000 38200 44300 12500 55700 43900
[11] 71900 62000
> length(fees[ !is.na(fees) ])
[1] 12

6. Result/inference:

Thus the code was executed successfully and the output was verified.

APPENDIX-II
LIST OF EXPERIMENTS - ASSIGNMENTS - LMS

Assignment No.   Title of Program
1                Assign a set of values to a data vector, then find its
                 length, sum, and square root.
2                Assign a set of values to a variable with some missing
                 values (NA), then calculate its sum and mean.
3                From an inbuilt dataset in R, list the first 6 rows using
                 the head function, then sort them in descending order.
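
As a hint for Assignment 3, the following minimal sketch shows one way to take the first six rows of an inbuilt data frame and sort them in descending order. The choice of the mtcars dataset and of the mpg column is an assumption made here for illustration; any inbuilt dataset may be used. For a named vector such as precip, sort() alone is sufficient.

```r
# mtcars ships with R's datasets package; the dataset and the
# sort column (mpg) are assumptions chosen for illustration.
first_six <- head(mtcars)                       # first 6 rows

# order() returns the row positions that sort the column;
# decreasing = TRUE gives descending order.
sorted_six <- first_six[order(first_six$mpg, decreasing = TRUE), ]
sorted_six

# For a named vector, sort() works directly.
top_precip <- head(sort(precip, decreasing = TRUE))
top_precip
```

A data frame cannot be passed to sort() directly, which is why the sketch indexes the rows with order().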
