DS Practical (BSC CS)

Data science practicals

DATA SCIENCE

T.Y.B.Sc.CS (COMPUTER SCIENCE)

Name: MALI KRISHNA VINOD

Seat No: 1110192

N.B. MEHTA (VALWADA) SCIENCE COLLEGE BORDI

MAHARASHTRA – 401701

DEPARTMENT OF INFORMATION TECHNOLOGY

T.Y.B.Sc.CS (COMPUTER SCIENCE)

Semester-6

Academic year 2023-24

CERTIFICATE

Class: B.Sc. Computer Science (Semester 6)

Year: 2023-2024

This is to certify that the work entered in this journal is the work of Shri MALI
KRISHNA VINOD of T.Y.B.Sc.CS, division Computer Science, Roll No. ____, Uni. Exam No. ____,
who has satisfactorily completed the required number of practicals and worked for the 2nd
term of the year 2023-24 in the college laboratory as laid down by the university.

____________          ____________          ____________

Head of the Department          External Examiner          Internal Examiner / Subject teacher

Date: / / 2024 Department of IT-CS

T.Y.B.Sc.(Computer Science) DATA SCIENCE

INDEX

Sr. No.    Title    Date of Experiment    Date of Submission    Sign

1. Introduction to Excel.

2. Data Frames and Basic Data Pre-processing.

3. Feature Scaling and Dummification.

4. Hypothesis Testing.

5. ANOVA (Analysis of Variance).

6. Regression and Its Types.

7. Logistic Regression and Decision Tree.

8. K-Means Clustering.

9. Principal Component Analysis (PCA).

N.B. Mehta College Page |1

Practical No. 01

Aim: - Introduction to Excel.

• Perform conditional formatting on a dataset using various criteria.

We will use the following marksheet dataset to demonstrate conditional formatting.

1. Highlight Cells Rules.

Step 1: Select the ‘Percentage’ column in your Excel sheet, click on ‘Conditional
Formatting’, then on ‘Highlight Cells Rules’, and then on ‘Greater Than’.

Step 2: A dialog box will appear. Enter the value, select a formatting type, and then click
on OK.

The cells whose values are greater than the value we entered are now highlighted with the
formatting type we selected.

2. Top/Bottom Rules.

Step 1: Select the ‘Total’ column in your Excel sheet, click on ‘Conditional Formatting’,
then on ‘Top/Bottom Rules’, and then on ‘Top 10 %’.

Step 2: A dialog box will appear. Enter the percentage of top values you want to highlight
(20 in this example), select a formatting type, and then click on OK.

The cells in the top 20 percent of all values in the column are now highlighted with the
formatting type we selected.

3. Data Bars.

Step 1: Select the ‘Sub5’ column in your Excel sheet, click on ‘Conditional Formatting’,
then on ‘Data Bars’, and then select whichever style you like.

Each cell now shows a data bar whose length reflects its value, as shown above.

4. Color Scales.

Step 1: Select the ‘Sub4’ column in your Excel sheet, click on ‘Conditional Formatting’,
then on ‘Color Scales’, and then select whichever style you like.

This applies a color gradient to the selected range of cells, as shown above.

5. Icon Sets.

Step 1: Select the ‘Sub3’ column in your Excel sheet, click on ‘Conditional Formatting’,
then on ‘Icon Sets’, and then select whichever style you like.

In the above figure, the marks of Sub3 are rated using a 4-bar ratings icon set.

• Create a pivot table to analyze and summarize data.

To create a pivot table, we are going to use the following dataset of product orders.

Step 1: Select the rows and columns of the dataset, go to the ‘Insert’ tab, and click on
‘PivotTable’ in the Tables group.

Step 2: The dialog box for creating the pivot table will appear. Click on OK.

This creates a blank pivot table along with the "PivotTable Field List" pane.

Let us rank the top 5 products by revenue.

Step 1: Drag the "Product" field to the Rows area and the "Price" field to the Values area.
This gives the total revenue generated by each product.

Step 2: Right-click on any cell in the ‘Price’ column within the pivot table, select ‘Show
Values As’ from the context menu, and then select ‘Rank Largest to Smallest’.

Step 3: A dialog box will appear asking for the base field. ‘Product’ is the default; keep
it as it is and click on ‘OK’.

As we can see, our data is now ranked by revenue.

Step 4: To see the top 5 ranked products, click on the drop-down arrow next to ‘Row
Labels’ in the pivot table and select ‘Value Filters’ from the drop-down menu. Choose ‘Top
10...’ and enter ‘5’, since we want to display the top 5 selling products. Click ‘OK’.

The top 5 products are shown below.

To see the sum for these products again, right-click on any cell of the ‘Sum of Price’
column, select ‘Show Values As’, and click on ‘No Calculation’.

As we can see, we now have the sum of price for the top 5 products.
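
The same ranking can be sketched in pandas. This is only an illustration under stated assumptions: the order rows below are hypothetical stand-ins (the sheet's actual data is not reproduced here), and only the "Product" and "Price" field names come from the steps above.

```python
import pandas as pd

# Hypothetical order data; only the column names come from the text.
orders = pd.DataFrame({
    "Product": ["Pen", "Book", "Bag", "Pen", "Lamp", "Desk", "Book", "Chair", "Bag"],
    "Price":   [10, 120, 450, 15, 300, 2500, 80, 1200, 500],
})

# Product in Rows, Price in Values (Sum of Price).
revenue = orders.pivot_table(index="Product", values="Price", aggfunc="sum")

# Counterpart of 'Rank Largest to Smallest'.
revenue["Rank"] = revenue["Price"].rank(ascending=False).astype(int)

# Counterpart of the 'Top 10...' value filter with the count set to 5.
top5 = revenue.nlargest(5, "Price")
print(top5)
```

Here `nlargest` plays the role of the value filter, while the `Rank` column mirrors the 'Show Values As' step.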

• Use the VLOOKUP function to retrieve information from a different worksheet or table.

The dataset we are going to use for the VLOOKUP function is as follows; it gives some
stats of football players.

To set up the VLOOKUP function, we first copy the column headers as follows.

We enter any one of the available IDs. Then we click on the cell under the Player
column and click on Insert Function.

Select the ‘VLOOKUP’ function and click on ‘OK’.

The new dialog box has 4 fields to fill in. In ‘Lookup_value’, select the cell in the ID
column where you enter the ID of the player you wish to see. In ‘Table_array’, select the
complete table of players. In ‘Col_index_num’, enter the index of the column to return;
the index of the ‘Player’ column is 2. Set ‘Range_lookup’ to ‘FALSE’ to fetch an exact
match. Then click on ‘OK’.

Repeat this for all the other columns as well.

As we can see, VLOOKUP lets us fetch a player's data from their ID.
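
To make the exact-match behaviour concrete, here is a minimal Python sketch of the same lookup logic. The `vlookup` helper and the player rows are hypothetical illustrations, not Excel's implementation or the sheet's real data.

```python
import pandas as pd

# Hypothetical player table; the first column holds the lookup key (ID).
players = pd.DataFrame({
    "ID": [101, 102, 103],
    "Player": ["Player A", "Player B", "Player C"],
    "Goals": [12, 7, 20],
})

def vlookup(lookup_value, table, col_index_num):
    """Exact-match lookup (Range_lookup = FALSE) on the first column of `table`."""
    matches = table[table.iloc[:, 0] == lookup_value]
    if matches.empty:
        return "#N/A"  # Excel reports #N/A when no exact match exists
    # col_index_num is 1-based, as in Excel, so subtract 1 for pandas.
    return matches.iloc[0, col_index_num - 1]

print(vlookup(102, players, 2))  # Player column
print(vlookup(102, players, 3))  # Goals column
print(vlookup(999, players, 2))  # unknown ID
```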

• Perform what-if analysis using Goal Seek to determine input values for desired output.

We are going to perform what-if analysis on the following dataset.

Select the cell on which you want to perform Goal Seek analysis. Go to the Data tab,
click on What-If Analysis in the Forecast section, and then click on Goal Seek.

Enter the value you want to achieve, select the cell that should be changed to achieve
that value, and click on OK.

The status box shows whether the goal was reached. Click on OK.

As you can see, the input cell has been changed and the goal is reached.
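
The idea behind Goal Seek, searching for the input that makes a formula hit a target, can be sketched in Python. This sketch uses plain bisection as a stand-in; it is not a description of Excel's internal method. The `goal_seek` helper, the `total_marks` formula and all the numbers are hypothetical.

```python
def goal_seek(formula, target, low, high, tol=1e-6, max_iter=200):
    """Find x in [low, high] with formula(x) close to target, by bisection.

    Assumes formula(x) - target changes sign between low and high.
    """
    f_low = formula(low) - target
    if f_low == 0:
        return low
    if f_low * (formula(high) - target) > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = formula(mid) - target
        if abs(f_mid) < tol:
            return mid
        if f_low * f_mid < 0:
            high = mid              # the root lies in the lower half
        else:
            low, f_low = mid, f_mid  # the root lies in the upper half
    return (low + high) / 2

# Hypothetical sheet: four fixed subject marks plus one changeable cell.
def total_marks(sub5):
    return 70 + 65 + 80 + 75 + sub5

needed = goal_seek(total_marks, target=360, low=0, high=100)
print(round(needed, 3))
```

In this sketch `target` corresponds to the value entered in the dialog and the argument of `total_marks` to the cell being changed.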

Practical No. 02

Aim: - Data Frames and Basic Data Pre-processing.

• Read data from CSV and JSON files into a data frame.

1. Reading Data from CSV file.

To read data from a CSV file, we first need a CSV file. We are going to use the following
‘employee_data.csv’ file and read its data.

We are using Google Colab to read the above CSV file, so we upload the
‘employee_data.csv’ file to Google Colab.

Python Code for Reading the ‘employee_data.csv’ File.
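
A minimal sketch of this step, assuming pandas. The column names and rows are hypothetical stand-ins for the contents of ‘employee_data.csv’, and an in-memory buffer replaces the uploaded file so that the snippet runs on its own.

```python
import io
import pandas as pd

# Hypothetical contents standing in for 'employee_data.csv'.
csv_text = """emp_id,name,department,salary
1,Asha,IT,52000
2,Ravi,HR,43000
3,Meena,IT,61000
"""

# With the file uploaded to Colab, this would be pd.read_csv("employee_data.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())    # first rows of the data frame
print(df.dtypes)    # column types inferred by pandas
```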

2. Reading Data from JSON file.

To read data from a JSON file, we first need a JSON file. We are going to use the following
‘employee_data.json’ file and read its data.

We are using Google Colab to read the above JSON file, so we upload the
‘employee_data.json’ file to Google Colab.

Python Code for Reading the ‘employee_data.json’ File.
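
A minimal sketch of this step, assuming pandas and a JSON file that holds a list of records. The field names and values are hypothetical stand-ins for ‘employee_data.json’.

```python
import io
import pandas as pd

# Hypothetical contents standing in for 'employee_data.json' (a list of records).
json_text = """[
  {"emp_id": 1, "name": "Asha", "department": "IT", "salary": 52000},
  {"emp_id": 2, "name": "Ravi", "department": "HR", "salary": 43000},
  {"emp_id": 3, "name": "Meena", "department": "IT", "salary": 61000}
]"""

# With the file uploaded to Colab, this would be pd.read_json("employee_data.json").
df = pd.read_json(io.StringIO(json_text))

print(df.head())
print(df.info())
```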

• Perform basic data pre-processing tasks such as handling missing values and outliers.
• Manipulate and transform data using functions like filtering, sorting, and grouping.

We use the following CSV and JSON files to perform the above tasks.

CSV file: -

JSON file: -

We are using Google Colab to perform the above tasks, so we upload the
‘sales_data.csv’ and ‘sales_data.json’ files to Google Colab.

Python Code for the above tasks: -
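
A minimal sketch of these tasks, assuming pandas. The sales rows and column names are hypothetical stand-ins for ‘sales_data.csv’. Median imputation and the 1.5 × IQR rule are one common choice among several, not necessarily the ones used in the practical.

```python
import io
import pandas as pd

# Hypothetical rows standing in for 'sales_data.csv' (two gaps, one extreme value).
csv_text = """order_id,region,product,units,price
1,North,Pen,10,5.0
2,South,Book,,120.0
3,North,Bag,3,450.0
4,East,Pen,500,5.0
5,South,Book,4,
6,East,Bag,2,450.0
"""
sales = pd.read_csv(io.StringIO(csv_text))

# Missing values: fill numeric gaps with the column median.
sales["units"] = sales["units"].fillna(sales["units"].median())
sales["price"] = sales["price"].fillna(sales["price"].median())

# Outliers: keep rows whose 'units' lie within 1.5 * IQR of the quartiles.
q1, q3 = sales["units"].quantile([0.25, 0.75])
iqr = q3 - q1
clean = sales[sales["units"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()

# Transform, filter, sort and group.
clean["revenue"] = clean["units"] * clean["price"]
north = clean[clean["region"] == "North"]                  # filtering
by_revenue = clean.sort_values("revenue", ascending=False)  # sorting
per_region = clean.groupby("region")["revenue"].sum()       # grouping

print(per_region)
```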

Practical No. 03

Aim: - Feature Scaling and Dummification

• Apply feature-scaling techniques like standardization and normalization to numerical features.

To apply feature-scaling techniques like standardization and normalization to numerical
features, we need a dataset. Hence, we are going to use the ‘wine.csv’ dataset.

Wine.csv: -

Python Code to perform Feature-Scaling: -
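
A minimal sketch of both techniques, assuming scikit-learn. The two columns and their values are hypothetical; the real ‘wine.csv’ columns may differ.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical numeric features standing in for columns of 'wine.csv'.
wine = pd.DataFrame({
    "alcohol": [12.8, 13.2, 14.1, 12.4, 13.7],
    "malic_acid": [1.7, 2.4, 4.0, 1.1, 3.2],
})

# Standardization: each column is rescaled to mean 0 and unit variance.
standardized = pd.DataFrame(
    StandardScaler().fit_transform(wine), columns=wine.columns
)

# Normalization (min-max): each column is rescaled to the range [0, 1].
normalized = pd.DataFrame(
    MinMaxScaler().fit_transform(wine), columns=wine.columns
)

print(standardized.round(3))
print(normalized.round(3))
```

Standardization computes z = (x − mean) / standard deviation per column, while min-max normalization computes (x − min) / (max − min).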

• Perform feature dummification to convert categorical variables into numerical representations.

To perform feature dummification, which converts categorical variables into numerical
representations, we need a dataset. Hence, we are going to use the ‘iris.csv’ dataset.

Iris.csv: -

Python Code to perform Dummification: -
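
A minimal sketch of dummification, assuming pandas. The rows and the column names `sepal_length` and `species` are hypothetical stand-ins for ‘iris.csv’.

```python
import pandas as pd

# Hypothetical rows standing in for 'iris.csv'.
iris = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3, 4.9],
    "species": ["setosa", "versicolor", "virginica", "setosa"],
})

# One 0/1 indicator column is created per category of 'species'.
dummies = pd.get_dummies(iris, columns=["species"], dtype=int)

print(dummies)
```

Passing `drop_first=True` to `pd.get_dummies` would drop one indicator column, which is often done before fitting a linear model to avoid redundant columns.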

Practical No. 04

Aim: - Hypothesis Testing

• Formulate null and alternative hypotheses for a given problem.
• Conduct a hypothesis test using appropriate statistical tests (e.g., t-test, chi-square test).
• Interpret the results and draw conclusions based on the test outcomes.

1. T-test.

Description: -

The aim of the program is to demonstrate the process of conducting a two-sample t-test and
drawing conclusions based on the results. Specifically, it aims to compare the means of two
samples (sample1 and sample2) drawn from normal distributions with different means but the
same standard deviation.

Detailed Breakdown: -

a) Generate Samples: The program generates two samples, each representing a different
population or group. These samples are generated from normal distributions with
means of 10 and 12, and a standard deviation of 2.
b) Perform Hypothesis Test: The program conducts a two-sample t-test to determine
whether there is a statistically significant difference between the means of the two
samples.
c) Set Significance Level: It sets a significance level (alpha) at 0.05, which is a common
threshold used in hypothesis testing.
d) Visualize Distributions: The program plots histograms of the two samples to visualize
their distributions and compare their means visually.

e) Highlight Critical Region: If the p-value from the t-test is less than the significance
level, the program highlights the critical region on the plot to indicate where the
observed difference in means is statistically significant.
f) Draw Conclusions: Based on the results of the t-test, the program draws conclusions
about whether there is significant evidence to reject the null hypothesis (i.e., the
means of the two populations are equal) and provides interpretations of the findings
based on the direction of the difference in means, if applicable.

Python Code to perform T-test: -
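
A minimal sketch of the test described above, assuming NumPy and SciPy. The means (10 and 12), the standard deviation (2) and alpha (0.05) come from the description. The sample size of 100 and the random seed are assumptions, and the histogram plotting is left out to keep the sketch short.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # seed chosen only for reproducibility

# Two samples from normal distributions with means 10 and 12 and std 2.
sample1 = rng.normal(loc=10, scale=2, size=100)  # size is an assumption
sample2 = rng.normal(loc=12, scale=2, size=100)

# H0: the two population means are equal. H1: they differ.
t_stat, p_value = stats.ttest_ind(sample1, sample2)

alpha = 0.05
if p_value < alpha:
    direction = "higher" if sample1.mean() > sample2.mean() else "lower"
    conclusion = f"Reject H0: the mean of Sample 1 is significantly {direction}."
else:
    conclusion = "Fail to reject H0: no significant difference in means."

print(f"t = {t_stat:.3f}, p = {p_value:.4g}")
print(conclusion)
```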

Conclusion: -

Based on the results of the two-sample t-test:

1. If p-value < alpha (0.05):
   • If the mean of Sample 1 is higher than that of Sample 2:
      - Conclusion: There is significant evidence to reject the null hypothesis.
      - Interpretation: The mean of Sample 1 is significantly higher than that of Sample 2.
   • If the mean of Sample 2 is higher than that of Sample 1:
      - Conclusion: There is significant evidence to reject the null hypothesis.
      - Interpretation: The mean of Sample 2 is significantly higher than that of Sample 1.
2. If p-value ≥ alpha (0.05):
   • Conclusion: Fail to reject the null hypothesis.
   • Interpretation: There is not enough evidence to claim a significant difference
     between the means of the two samples.

These conclusions are drawn by comparing the p-value with the chosen significance level
(alpha). They indicate whether there is significant evidence to support the alternative
hypothesis, which posits a difference between the means of the two populations.

2. Chi-square Test.

Description: -

We apply the chi-square test to check whether there is an association between two given
categorical variables.

Assumptions: -

• Observations in each sample are independent and identically distributed.
• Expected frequencies should be at least 5 for the majority (80%) of the cells.
• Both variables are categorical.

We are going to perform Chi-square Test on the following ‘mpg.csv’ dataset.

Python Program to perform Chi-square Test: -
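
A minimal sketch of the test, assuming pandas and SciPy. The column names `horsepower_new` and `modelyear_new` come from the conclusion of this section. The category labels and counts are hypothetical and are not derived from ‘mpg.csv’.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical binned categories; the counts are illustrative only.
cars = pd.DataFrame({
    "horsepower_new": ["low"] * 30 + ["high"] * 30,
    "modelyear_new": ["old"] * 22 + ["new"] * 8 + ["old"] * 9 + ["new"] * 21,
})

# Contingency table of observed frequencies.
table = pd.crosstab(cars["horsepower_new"], cars["modelyear_new"])

# H0: the two categorical variables are independent.
chi2, p_value, dof, expected = chi2_contingency(table)

alpha = 0.05
print(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.4g}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The `expected` array can be inspected to check the assumption that expected frequencies are at least 5 in most cells.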

Conclusion: -

There is sufficient evidence to reject the null hypothesis, indicating that there is a significant
association between 'horsepower_new' and 'modelyear_new' categories.

Practical No. 05

Aim: - ANOVA (Analysis of Variance)

• Perform one-way ANOVA to compare means across multiple groups.
• Conduct post-hoc tests to identify significant differences between group means.

Python Program to perform ANOVA (Analysis of Variance): -
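
A minimal sketch of both steps, assuming NumPy and SciPy 1.8 or later (for `scipy.stats.tukey_hsd`). The three groups are synthetic and their means, spread and size are hypothetical. Tukey's HSD is used as the post-hoc test here, which is one common choice; the practical may have used another.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# Three hypothetical groups with different population means.
group_a = rng.normal(50, 5, 30)
group_b = rng.normal(55, 5, 30)
group_c = rng.normal(65, 5, 30)

# One-way ANOVA. H0: all group means are equal.
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.4g}")

# Post-hoc test: Tukey's HSD compares every pair of groups.
tukey = stats.tukey_hsd(group_a, group_b, group_c)
print(tukey)  # tukey.pvalue[i, j] is the p-value for group i vs group j
```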

Practical No. 06

Aim: - Regression and Its Types

• Implement simple linear regression using a dataset.
• Explore and interpret the regression model coefficients and goodness-of-fit measures.
• Extend the analysis to multiple linear regression and assess the impact of additional predictors.

Python Program to perform Regression: -
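
A minimal sketch of simple and multiple linear regression, assuming NumPy and scikit-learn. The dataset is synthetic: the variables `area`, `rooms` and `price` and the coefficients used to generate them are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)  # seed chosen only for reproducibility
n = 200

# Hypothetical predictors and a response generated from a known linear rule.
area = rng.uniform(50, 200, n)
rooms = rng.integers(1, 6, n)
price = 3.0 * area + 20.0 * rooms + 10.0 + rng.normal(0, 15, n)

# Simple linear regression: price explained by area only.
X_simple = area.reshape(-1, 1)
simple = LinearRegression().fit(X_simple, price)
r2_simple = simple.score(X_simple, price)

# Multiple linear regression: add rooms as a second predictor.
X_multi = np.column_stack([area, rooms])
multi = LinearRegression().fit(X_multi, price)
r2_multi = multi.score(X_multi, price)

print(f"simple:   slope = {simple.coef_[0]:.3f}, R^2 = {r2_simple:.4f}")
print(f"multiple: coefs = {multi.coef_.round(3)}, R^2 = {r2_multi:.4f}")
```

Comparing the two R² values shows the impact of the extra predictor. In-sample R² cannot decrease when a predictor is added, so adjusted R² or a held-out test set gives a fairer comparison.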

Practical No. 07

Aim: - Logistic Regression and Decision Tree

• Build a logistic regression model to predict a binary outcome.
• Evaluate the model's performance using classification metrics (e.g., accuracy, precision, recall).
• Construct a decision tree model and interpret the decision rules for classification.

Python Program: -
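
A minimal sketch of the three tasks, assuming scikit-learn. The text does not name a dataset, so the breast-cancer dataset bundled with scikit-learn is used as a stand-in. The 75/25 split, the scaling step and the tree depth of 3 are likewise assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in binary classification dataset (an assumption, see above).
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0, stratify=data.target
)

# Logistic regression on standardized features.
logit = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
logit.fit(X_train, y_train)
pred = logit.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "precision": precision_score(y_test, pred),
    "recall": recall_score(y_test, pred),
}
print(metrics)

# A shallow decision tree keeps the printed rules readable.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
rules = export_text(tree, feature_names=list(data.feature_names))
print(rules)
```

Each line printed by `export_text` is one split of the tree, so a path from the root to a leaf reads as an if/else decision rule.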

Practical No. 08

Aim: - K-Means Clustering

• Apply the K-Means algorithm to group similar data points into clusters.
• Determine the optimal number of clusters using elbow method or silhouette analysis.
• Visualize the clustering results and analyze the cluster characteristics.

We are going to perform K-Means Clustering on the following ‘Wholesale.csv’ dataset.

Python Program to perform K-Means Clustering: -
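
A minimal sketch of K-Means with the elbow and silhouette checks, assuming scikit-learn. Synthetic blobs stand in for ‘Wholesale.csv’, whose columns are not reproduced here. The cluster centres and the range of k values tried are hypothetical, and plotting is left out.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with three well-separated groups (an assumption).
X, _ = make_blobs(
    n_samples=300, centers=[[0, 0], [6, 6], [-6, 6]], cluster_std=1.0, random_state=0
)
X = StandardScaler().fit_transform(X)  # K-Means is sensitive to feature scale

inertias, silhouettes = {}, {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_                         # used for the elbow plot
    silhouettes[k] = silhouette_score(X, model.labels_)  # higher is better

# Pick the k with the highest silhouette score and refit.
best_k = max(silhouettes, key=silhouettes.get)
final = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)

print("inertia by k:", {k: round(v, 1) for k, v in inertias.items()})
print("best k by silhouette:", best_k)
print("cluster centres:\n", final.cluster_centers_.round(2))
```

Plotting `inertias` against k gives the elbow curve, and a scatter plot of `X` coloured by `final.labels_` gives the cluster visualization.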

Practical No. 09

Aim: - Principal Component Analysis (PCA)

• Perform PCA on a dataset to reduce dimensionality.
• Evaluate the explained variance and select the appropriate number of principal components.
• Visualize the data in the reduced-dimensional space.

Python Program for PCA: -
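
A minimal sketch of PCA, assuming NumPy and scikit-learn. The text does not name a dataset, so the iris dataset bundled with scikit-learn is used as a stand-in. The 95% explained-variance threshold is also an assumption, and plotting is left out.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (an assumption); standardize so every feature counts equally.
X = StandardScaler().fit_transform(load_iris().data)

# Fit PCA with all components to inspect the explained variance.
pca_full = PCA().fit(X)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
print("cumulative explained variance:", cumulative.round(4))

# Keep the smallest number of components reaching the 95% threshold.
n_components = int(np.searchsorted(cumulative, 0.95) + 1)

# Project the data onto the selected principal components.
X_reduced = PCA(n_components=n_components).fit_transform(X)
print("components kept:", n_components, "reduced shape:", X_reduced.shape)
```

A scatter plot of the first two columns of `X_reduced` shows the data in the reduced-dimensional space.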
