DADS301 MBA Sem 3: Programming in DS

MBA answer key.

Uploaded by

Varun Asthana


Varun Asthana

Roll No. 2114501153

Program – Online MBA
Course – Programming in Data Science
Directorate of Online Education
ASSIGNMENT

SESSION: 2023
PROGRAM: MASTER OF BUSINESS ADMINISTRATION (MBA)
SEMESTER: III
COURSE CODE & NAME: DADS301 – PROGRAMMING IN DATA SCIENCE
CREDITS: 4
NUMBER OF ASSIGNMENTS & MARKS: 02, 30 marks each

Note: Answer all questions. Answers to 10-mark questions should be approximately 400-450
words. Each question is followed by an evaluation scheme.

Assignment Set – 1 (10 marks per question)

1. Create a vector that contains a sequence of integers between 0 and 9,
plus a sequence of 50 numbers between 10 and 45.
2. What do you mean by descriptive statistics. Write 5 important functions
used for calculating the descriptive stats.
3. Create three vectors of the same length, two of them with discrete
random values and one with continuous random values. Next, create a
data frame with these vectors.

Assignment Set – 2 (10 marks per question)

1. Create a for loop that goes through a numeric sequence, computes e to
the power of each value and if the result is greater than 1000 it stores
this result in another vector.
2. Create a box and whisker plot for mpg variable present in mtcars data
sets.
3. What do you mean by outlier, describe some methodology to treat an
outlier.
Q1) Create a vector that contains a sequence of integers between 0 and 9, plus a sequence of 50
numbers between 10 and 45.

A1) The input code below carries its explanations as comments, marked with ‘#’.
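
The original listing did not survive extraction. As a minimal sketch in Python (the language of the code in A4), and assuming the 50 numbers are meant to be evenly spaced from 10 to 45 inclusive, the vector could be built as follows. The names first_part, second_part and combined are illustrative only. In R the same result can be obtained with c(0:9, seq(10, 45, length.out = 50)).

```python
# Integers 0 through 9
first_part = list(range(10))

# 50 evenly spaced numbers from 10 to 45 inclusive (assumed interpretation)
count = 50
step = (45 - 10) / (count - 1)
second_part = [10 + i * step for i in range(count)]

# Concatenate both parts into one vector of 60 elements
combined = first_part + second_part
print(len(combined))
print(combined[:12])
```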

Output

Q2) What do you mean by descriptive statistics? Write 5 important functions used for calculating
the descriptive stats.

A2) Descriptive statistics summarize, describe and present the values in a dataset; here they are
computed in R. They give a clear overview of the data and are a good first step before a deeper
analysis, especially a root cause analysis (RCA). Many measures summarize a dataset. They fall
into two types:
1. Location measures – they describe the central tendency of the data.
2. Dispersion measures – they describe the spread of the data.

The important functions for calculating descriptive statistics are as follows.

We will work on the iris dataset, so we load it first.

The common calculations are:

1# Minimum and maximum can be found with min() and max().

2# Median can be calculated with median().

3# Mean can be calculated with mean().

4# Quartiles can be calculated with quantile().

5# Standard deviation can be calculated with sd().
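
The R listings for these five calculations are missing from the source. As a rough Python analogue, assuming a small made-up sample in place of an iris column, the same measures can be computed with the standard library. The variable names and the sample values are hypothetical.

```python
import statistics

# Hypothetical sample standing in for one numeric column of a dataset
values = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4]

minimum = min(values)
maximum = max(values)
median = statistics.median(values)
mean = statistics.mean(values)
# Three cut points (Q1, Q2, Q3), interpolated within the observed range
quartiles = statistics.quantiles(values, n=4, method="inclusive")
# Sample standard deviation (n - 1 in the denominator)
std_dev = statistics.stdev(values)

print(minimum, maximum, median, mean, quartiles, std_dev)
```

Quartile conventions differ between tools, so the cut points may not match another package's defaults exactly.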

Common functions to obtain descriptive statistics are as follows:

1. The summary() function gives several measures tabulated in one summary:

2. The by() function – in the example below, the summary is requested per species.

3. The describe() function from the Hmisc package can be used.

4. The describe() function from the psych package also returns descriptive statistics.

5. The summaryBy() function from the doBy package can help obtain descriptive statistics by group.
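
The idea behind the group-wise functions in this list (a summary per species) can be sketched in Python as well. This is an illustration only, not the author's code: the records are made up rather than taken from the real iris data, and the names records, groups and summary are hypothetical.

```python
import statistics
from collections import defaultdict

# Hypothetical (species, sepal_length) records; not the real iris values
records = [
    ("setosa", 5.0), ("setosa", 5.2),
    ("versicolor", 6.0), ("versicolor", 6.4),
    ("virginica", 6.5), ("virginica", 7.1),
]

# Collect the measurements that belong to each group
groups = defaultdict(list)
for species, length in records:
    groups[species].append(length)

# Summarise each group separately
summary = {
    species: {"min": min(vals), "mean": statistics.mean(vals), "max": max(vals)}
    for species, vals in groups.items()
}
for species, stats in summary.items():
    print(species, stats)
```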

Q3) Create three vectors of the same length, two of them with discrete random values and one
with continuous random values. Next, create a data frame with these vectors.
A3) set.seed() in R is used to make results reproducible when the code creates random variables.
Setting the seed ensures that the same random values are produced every time the code is run.

Then we create the random variables:

The 1st discrete variable has length 10 and is drawn from the numbers 1 to 5.

The 2nd discrete variable has length 10 and is drawn from the letters a to e.

The continuous variable is created using rnorm.

Then the data frame is defined from the three variables.

Lastly, the data frame is printed.
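
The R listing for these steps is not present in the source. A minimal Python sketch of the same steps follows, using only the standard library. The seed value 42, the column names and the plain dictionary used as a stand-in for a data frame are all assumptions made for illustration.

```python
import random

random.seed(42)  # fixed seed so every run produces the same values
n = 10

# Two discrete variables and one continuous variable, all of length n
discrete_numbers = [random.randint(1, 5) for _ in range(n)]    # integers 1..5
discrete_letters = [random.choice("abcde") for _ in range(n)]  # letters a..e
continuous = [random.gauss(0, 1) for _ in range(n)]            # normal draws

# Column-oriented table: a plain-dict stand-in for a data frame
data_frame = {
    "numbers": discrete_numbers,
    "letters": discrete_letters,
    "continuous": continuous,
}

# Print the table row by row
for row in zip(*data_frame.values()):
    print(row)
```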

Q4) Create a for loop that goes through a numeric sequence, computes e to the power of each
value and if the result is greater than 1000 it stores this result in another vector.

A4) The code is as follows:

# The math module provides exp() for computing e to the power of a value
import math

# Define the numeric sequence to loop over
sequence = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Empty list that collects every result greater than 1000
result_vector = []

# Compute e to the power of each value and keep the results above 1000
for value in sequence:
    e_power = math.exp(value)
    if e_power > 1000:
        result_vector.append(e_power)

# Print the results that met the criterion
print(result_vector)
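Output: the printed list holds four values, e^7 through e^10, because e^6 ≈ 403.43 is below 1000 while e^7 ≈ 1096.63 is above it.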
Q5) Create a box and whisker plot for mpg variable present in mtcars data sets.

A5) To create a box and whisker plot in R, we first load the mtcars dataset, for example with
data(mtcars).

Next, the boxplot() function is used to plot the ‘mpg’ variable, for example boxplot(mtcars$mpg).
This draws a single box for ‘mpg’, with the mpg values on the y-axis.

The code above produces the following plot:

The box in the middle of the plot represents the interquartile range (IQR), which runs from the
first quartile Q1 (25th percentile) to the third quartile Q3 (75th percentile) of the mpg values.

The line inside the box marks the median (50th percentile). The whiskers extend from the box to
the most extreme values that lie within 1.5 times the IQR of the box edges; any point beyond that
range is treated as an outlier and plotted individually.
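
The quantities that a box and whisker plot draws can be computed by hand. The sketch below does this in Python for a made-up sample (not the mtcars mpg values). It assumes the common 1.5 × IQR rule and the "inclusive" quartile method; plotting tools may use slightly different quartile conventions.

```python
import statistics

# Hypothetical sample; 40 is deliberately far from the rest
data = sorted([10, 12, 13, 15, 16, 18, 19, 21, 22, 40])

# Box: first quartile, median, third quartile
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the box edges
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme points still inside the fences
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Points beyond the fences are drawn individually as outliers
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, median, q3, whisker_low, whisker_high, outliers)
```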

Another way to draw a box and whisker plot is with geom_boxplot() from the ggplot2 package.

Q6) What do you mean by outlier, describe some methodology to treat an outlier.

A6) An outlier is a value that lies far outside most of the other values in a dataset. Such values
can also be exceptions that stand apart from the rest of the sampled population. To count as an
outlier, a value must differ significantly from the rest, and this should be confirmed by
calculation. One graphical representation of an outlier is shown below:

Outliers can distort measured parameters such as the mean and standard deviation and can cause
errors in the analysis. Some common methods to treat outliers in data analysis are:
Identify and remove outliers: This is the simplest approach. Outliers are identified against a
threshold and then removed from the dataset. The threshold can be set from domain knowledge or
with statistical methods such as the Z-score or the interquartile range (IQR). Identification can
be done by applying filters to the dataset. In most cases the outliers are not deleted from the
source data; they are only excluded from the final dataset used for analysis.
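
A minimal Python sketch of threshold-based filtering with Z-scores follows. The sample and the threshold of 2.0 are assumptions chosen for illustration; a cutoff of 3 is common for larger samples. The source list is left untouched and only a filtered copy is produced.

```python
import statistics

# Hypothetical sample; the source list itself is never modified
values = [10, 12, 13, 15, 16, 18, 19, 21, 22, 40]

mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample standard deviation
threshold = 2.0                # illustrative cutoff on |z|

# Z-score: distance from the mean in units of standard deviation
z_scores = [(v - mean) / sd for v in values]

# Keep values within the threshold; set the rest aside rather than deleting them
kept = [v for v, z in zip(values, z_scores) if abs(z) <= threshold]
excluded = [v for v, z in zip(values, z_scores) if abs(z) > threshold]

print(kept)
print(excluded)
```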

Winsorization: In this method, the extreme values are replaced with less extreme values, usually the
closest non-outlying values. This approach keeps every observation in the dataset while reducing
the impact of outliers. It is a widely used method.
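
As a sketch of the idea, the Python code below replaces each outlying value with the closest non-outlying value. It assumes outliers are defined by the 1.5 × IQR rule; winsorization is also often defined by fixed percentiles instead. The sample and variable names are hypothetical.

```python
import statistics

# Hypothetical sample in its original order
values = [15, 40, 12, 18, 10, 21, 13, 22, 16, 19]

# Fences from the 1.5 * IQR rule (assumed outlier definition)
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Closest non-outlying values at each end
inside = [v for v in values if lower_fence <= v <= upper_fence]
low_cap, high_cap = min(inside), max(inside)

# Clamp every value into [low_cap, high_cap]; order and length are preserved
winsorized = [min(max(v, low_cap), high_cap) for v in values]

print(winsorized)
```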

Transformation: Data transformation can be used to reduce the effect of outliers. Common
transformations include logarithmic, square root, or inverse transformation. This approach is useful
when the data is highly skewed, and the transformation can make the distribution more
symmetrical.
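
A small Python sketch of a logarithmic transformation is given below, assuming a made-up, strongly right-skewed sample. Note that the logarithm requires strictly positive values.

```python
import math

# Hypothetical right-skewed sample; all values must be positive for a log transform
values = [1, 10, 100, 1000, 10000]

# Base-10 logarithm compresses the large values far more than the small ones
log_values = [math.log10(v) for v in values]

# Spread before and after the transformation
range_before = max(values) - min(values)
range_after = max(log_values) - min(log_values)

print(log_values)
print(range_before, range_after)
```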

Imputation: Outliers can also be treated by replacing their values using statistical methods such
as mean, median, or mode imputation. This method preserves the sample size, although replacing
many values can reduce the variability of the data.

Robust statistical methods: Robust statistical methods are designed to be less sensitive to outliers.
They include the median, the trimmed mean, and M-estimators. They can provide more reliable
estimates and reduce the impact of outliers on the results.
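
To illustrate why robust estimators help, the Python sketch below compares the ordinary mean with the median and a trimmed mean on a made-up sample containing one extreme value. The trimming of one value from each end (10% here) is an arbitrary choice for the example.

```python
import statistics

# Hypothetical sample with one extreme value (40)
values = sorted([10, 12, 13, 15, 16, 18, 19, 21, 22, 40])

# The ordinary mean is pulled upward by the extreme value
mean = statistics.mean(values)

# The median depends only on the middle of the sorted data
median = statistics.median(values)

# Trimmed mean: drop k values from each end, then average the rest
k = 1
trimmed = values[k:len(values) - k]
trimmed_mean = statistics.mean(trimmed)

print(mean, median, trimmed_mean)
```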

It is important to note that the choice of method should be based on the nature and extent of
outliers, the type of data, and the research question. Additionally, it is always a good practice to
report the methods used to treat outliers, and to perform sensitivity analyses to assess the
robustness of the results to the different methods.
