Guidelines DAVP

The document outlines the guidelines for a course on Data Analysis and Visualization using Python for the DSE Semester III and B.A. Programme II Semester, effective from the academic year 2024-25. It includes a detailed syllabus covering topics such as basic statistics, data manipulation with NumPy and Pandas, data visualization using Matplotlib, and practical exercises. Additionally, it lists essential readings and suggests practical projects to apply the concepts learned in the course.

Uploaded by

ritikadaga508


Guidelines of DSE Semester III /

B.A. Programme II Semester / GE Semester II (NEP-UGCF 2022)

Data Analysis and Visualization using Python
DSE/A2/GE2a
(Effective from Academic Year 2024-25)

TOPICS/UNITS with chapter references (bracketed numbers refer to the readings listed below)

Week 1 to 3, Unit 1: Introduction to basic statistics and analysis
Topics: Fundamentals of Data Analysis, Statistical foundations for Data Analysis, Types of data, Descriptive Statistics, Correlation and covariance, Linear Regression, Statistical Hypothesis Generation and Testing. Python Libraries: NumPy, Pandas, Matplotlib
Chapter: Ch 1: pg 11-24, pg 29-35, pg 37-38 [2]; Ch 1: 1.3 (pg 4-6) [1]

Week 4 to 6, Unit 2: Array manipulation using NumPy
Topics: NumPy array: Creating NumPy arrays, various data types of NumPy arrays, Indexing and slicing, swapping axes, transposing arrays, data processing using NumPy arrays
Chapter: Ch 4: 4.1. Usage of rand(), randn() and randint() functions of NumPy [1]

Week 7 to 10, Unit 3: Data Manipulation using Pandas
Topics: Data Structures in Pandas: Series, Data Frame, Index objects, loading data into a Pandas data frame. Working with Data Frames: Arithmetics, Statistics, Binning, Indexing, Reindexing, Filtering, Handling missing data, Hierarchical indexing. Data wrangling: Data cleaning, transforming, merging and reshaping
Chapter: Ch 5: 5.1, 5.2 excluding Arithmetic and data alignment, axis indexes with duplicate labels, 5.3; Ch 6: 6.1 (pg 177-181, 184); Ch 7: 7.1, 7.2 till binning (pg 203-217); Ch 8: 8.1 (pg 247-253), 8.2 (pg 253-258), 8.3 (pg 270-273) [1]

Week 11 to 13, Unit 4: Plotting and Visualization
Topics: Using Matplotlib to plot data: figures, subplots, markings, color and line styles, labels and legends. Plotting functions in Pandas: Lines, bar, Scatter plots, histograms, stacked bars, Heatmap, 3D Plotting, interactive plotting using Bokeh and Plotly
Chapter: Ch 9: 9.1 (pg 281-296), 9.2 (pg 298-313), 9.3 [1]; Ch 5: pg 281-282 [2]

Week 14 to 15: Data Aggregation and Group operations
Topics: Group by mechanics, Data aggregation, General split-apply-combine, Pivot tables and cross tabulation
Chapter: Chapter 10: 10.1, 10.2, 10.3 (till pg 337), 10.5 [1]

Essential/recommended readings
1. McKinney W., Python for Data Analysis: Data Wrangling with Pandas, NumPy and IPython, 3rd edition, O’Reilly Media, 2022.
2. Molin S., Hands-On Data Analysis with Pandas, Packt Publishing, 2019.
3. Gupta S.C., Kapoor V.K., Fundamentals of Mathematical Statistics, Sultan Chand & Sons, 2020.

Suggested Practical List (If any): (30 Hours)

Practical exercises such as

Use a dataset of your choice from the Open Data Portal (https://data.gov.in/) or the UCI repository, or
load one from the scikit-learn or seaborn library, for the following exercises to practice the concepts learnt.

1. Load a selected dataset into a Pandas dataframe. Identify and count the missing values
in the dataframe. Clean the data by removing noise as follows:
a) Drop duplicate rows.
b) Detect the outliers and remove the rows having more than two outliers identified using boxplot.
c) Identify the most positively correlated attributes and the most negatively correlated
attributes.
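A minimal sketch of how exercise 1 might be approached. The toy data frame and its column names are hypothetical stand-ins for a dataset of your choice, and the 1.5 x IQR fences are assumed here as the usual boxplot rule for flagging outliers.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a dataset of your choice.
df = pd.DataFrame({
    "height": [150.0, 152.0, 151.0, 153.0, 149.0, 150.0, 250.0, 152.0],
    "weight": [50.0, 52.0, 51.0, np.nan, 49.0, 50.0, 51.0, 52.0],
    "age": [20, 21, 22, 23, 24, 20, 25, 21],
})

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# (a) Drop duplicate rows.
deduped = df.drop_duplicates()

# (b) Flag outliers with the boxplot rule (assumed: 1.5 * IQR fences),
# then keep only the rows that have at most two flagged values.
numeric = deduped.select_dtypes(include="number")
q1 = numeric.quantile(0.25)
q3 = numeric.quantile(0.75)
iqr = q3 - q1
is_outlier = (numeric < q1 - 1.5 * iqr) | (numeric > q3 + 1.5 * iqr)
outliers_per_row = is_outlier.sum(axis=1)
cleaned = deduped[outliers_per_row <= 2]

# (c) Most positively and most negatively correlated attribute pairs.
# The upper-triangle mask keeps each pair once and skips the diagonal.
corr = cleaned.select_dtypes(include="number").corr()
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).unstack().dropna()
most_positive = pairs.idxmax()
most_negative = pairs.idxmin()

print(missing_per_column)
print("Most positively correlated:", most_positive)
print("Most negatively correlated:", most_negative)
```

Calling `deduped.boxplot()` (which needs Matplotlib) draws the boxplots that the outlier rule is based on.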

2. Import the iris data from sklearn.datasets, or download it from
https://archive.ics.uci.edu/ml/datasets/iris
a. Compute mean, mode, median, standard deviation, confidence interval and standard
error for each feature.
b. Compute correlation coefficients between each pair of features and plot a heatmap.
c. Find the covariance between the length of sepal and petal.
d. Build a contingency table for the class feature.
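The statistics in exercise 2 could be sketched as below. To keep the example self-contained, a few hand-typed rows with hypothetical column names stand in for the full iris data. The 95% confidence interval uses the normal approximation (mean ± 1.96 × standard error), which is an assumption; for small samples a t-based interval would be more appropriate.

```python
import pandas as pd

# Hypothetical stand-in rows; replace with the full iris data frame.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9],
    "petal_length": [1.4, 1.4, 1.3, 4.7, 4.5, 4.9],
    "species": ["setosa"] * 3 + ["versicolor"] * 3,
})
features = df[["sepal_length", "petal_length"]]

# (a) Descriptive statistics per feature.
summary = pd.DataFrame({
    "mean": features.mean(),
    "median": features.median(),
    "mode": features.mode().iloc[0],  # first mode when there are ties
    "std": features.std(),            # sample standard deviation (ddof=1)
    "sem": features.sem(),            # standard error of the mean
})
# Assumed: 95% interval via the normal approximation.
summary["ci_low"] = summary["mean"] - 1.96 * summary["sem"]
summary["ci_high"] = summary["mean"] + 1.96 * summary["sem"]

# (b) Pairwise correlation coefficients (the matrix a heatmap would show).
corr = features.corr()

# (c) Covariance between sepal length and petal length.
cov_sepal_petal = df["sepal_length"].cov(df["petal_length"])

# (d) Contingency (frequency) table for the class feature.
contingency = pd.crosstab(df["species"], columns="count")

print(summary)
print(corr)
print(cov_sepal_petal)
print(contingency)
```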

3. Load the Titanic data from the sklearn library and plot the following with proper legend and axis
labels:
a. Plot bar chart to show the frequency of survivors and non-survivors for male
and female passengers separately
b. Draw a scatter plot for any two selected features
c. Compare density distribution for features age and passenger fare
d. Use a pair plot to show pairwise bivariate distribution

4. Using Titanic dataset, do the following

a. Find total number of passengers with age less than 30
b. Find total fare paid by passengers of first class
c. Compare number of survivors of each passenger class

5. Download any dataset and do the following

a. Count number of categorical and numeric features
b. Remove one correlated attribute (if any)
c. Display five-number summary of each attribute and show it visually
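One possible sketch for exercise 5, using a hypothetical toy data frame. The 0.9 correlation threshold for calling two attributes "correlated" is an assumption chosen for illustration, not something the exercise specifies.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a downloaded dataset.
df = pd.DataFrame({
    "city": ["Delhi", "Mumbai", "Pune", "Delhi", "Pune"],
    "score": [10, 20, 30, 40, 50],
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# (a) Count numeric and categorical (non-numeric) features.
numeric_cols = df.select_dtypes(include="number").columns
categorical_cols = df.select_dtypes(exclude="number").columns

# (b) Remove one attribute from a highly correlated pair.
# Assumed threshold: absolute correlation above 0.9.
corr = df[numeric_cols].corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
reduced = df.drop(columns=to_drop[:1])

# (c) Five-number summary: min, Q1, median, Q3, max.
five_number = df[numeric_cols].describe().loc[["min", "25%", "50%", "75%", "max"]]

print(len(numeric_cols), "numeric,", len(categorical_cols), "categorical")
print("Dropped:", to_drop[:1])
print(five_number)
```

A boxplot, for example `df[numeric_cols].boxplot()` with Matplotlib installed, shows the same five-number summary visually.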

Project: Students are encouraged to work on a good dataset in consultation with their faculty
and apply the concepts learned in the course.

Additional Practice Exercises:

1. Write programs in Python using NumPy library to do the following:

a. Compute the mean, standard deviation, and variance of a two dimensional random integer array along
the second axis.
b. Create a 2-dimensional array of size m x n integer elements, also print the shape, type and data type of
the array and then reshape it into an n x m array, where n and m are user inputs given at the run time.
c. Test whether the elements of a given 1D array are zero, non-zero and NaN. Record the indices of these
elements in three separate arrays.
d. Create three random arrays of the same size: Array1, Array2 and Array3. Subtract Array2 from Array3
and store the result in Array4. Create another array, Array5, having two times the values in Array1. Find the
covariance and correlation of Array1 with Array4 and Array5 respectively.
e. Create two random arrays of the same size 10: Array1, and Array2. Find the sum of the first half of both
the arrays and product of the second half of both the arrays.
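Parts (a), (c) and (d) of the NumPy exercise might be sketched as follows. The array sizes, value ranges and seed are arbitrary choices for illustration, and `np.random.default_rng` is used here in place of the `rand()`/`randint()` functions named in the syllabus.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so that runs are repeatable

# (a) Mean, standard deviation and variance along the second axis (axis=1)
# of a two-dimensional random integer array.
arr = rng.integers(low=0, high=100, size=(3, 4))
row_mean = arr.mean(axis=1)
row_std = arr.std(axis=1)
row_var = arr.var(axis=1)

# (c) Indices of zero, non-zero and NaN elements of a 1D array.
values = np.array([0.0, 3.5, np.nan, 0.0, -2.0, np.nan])
nan_idx = np.flatnonzero(np.isnan(values))
zero_idx = np.flatnonzero(values == 0)
# NaN != 0 evaluates to True, so NaN entries are excluded explicitly.
nonzero_idx = np.flatnonzero((values != 0) & ~np.isnan(values))

# (d) Covariance and correlation between derived arrays.
array1, array2, array3 = rng.random(10), rng.random(10), rng.random(10)
array4 = array3 - array2
array5 = 2 * array1
cov_1_4 = np.cov(array1, array4)[0, 1]
corr_1_4 = np.corrcoef(array1, array4)[0, 1]
cov_1_5 = np.cov(array1, array5)[0, 1]
corr_1_5 = np.corrcoef(array1, array5)[0, 1]

print(row_mean, row_std, row_var)
print(zero_idx, nonzero_idx, nan_idx)
print(cov_1_4, corr_1_4, cov_1_5, corr_1_5)
```

Because Array5 is an exact multiple of Array1, their correlation comes out as 1 (up to floating-point rounding), which makes a handy sanity check.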

2. Consider two data files (in CSV format) having attendance of two workshops. Each file has three fields, ‘Name’,
‘Date’ and ‘Duration’ (in minutes), where names are unique within a file. Note that duration may take only one of
three values (30, 40, 50). Import the data into two data frames and do the following:
a. Perform merging of the two data frames to find the names of students who had attended both
workshops.
b. Find names of all students who have attended a single workshop only.
c. Merge two data frames row-wise and find the total number of records in the data frame.
d. Merge two data frames row-wise and use two columns viz. names and dates as multi-row indexes.
Generate descriptive statistics for this hierarchical data frame.
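The merging steps in exercise 2 could look roughly like this. The two small data frames, including all names and dates, are hypothetical stand-ins for the CSV files; in practice they would be loaded with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical attendance records standing in for the two CSV files.
ws1 = pd.DataFrame({
    "Name": ["Asha", "Bharat", "Chitra"],
    "Date": ["2024-08-01"] * 3,
    "Duration": [30, 40, 50],
})
ws2 = pd.DataFrame({
    "Name": ["Bharat", "Chitra", "Dev"],
    "Date": ["2024-08-08"] * 3,
    "Duration": [50, 30, 40],
})

# (a) Students who attended both workshops: inner merge on Name.
both = pd.merge(ws1, ws2, on="Name", suffixes=("_ws1", "_ws2"))

# (b) Students who attended a single workshop only: an outer merge with
# indicator=True marks each row as left_only, right_only or both.
outer = pd.merge(ws1, ws2, on="Name", how="outer", indicator=True)
single = outer.loc[outer["_merge"] != "both", "Name"]

# (c) Row-wise merge (concatenation) and total number of records.
stacked = pd.concat([ws1, ws2], ignore_index=True)
total_records = len(stacked)

# (d) Use Name and Date as a hierarchical row index, then describe.
indexed = stacked.set_index(["Name", "Date"]).sort_index()
stats = indexed.describe()

print(sorted(both["Name"]))
print(sorted(single))
print(total_records)
print(stats)
```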

3. Consider the following data frame containing a family name, gender of the family member and her/his monthly
income in each record.
Name     Gender    MonthlyIncome (Rs.)
Shah     Male      114000.00
Vats     Male      65000.00
Vats     Female    43150.00
Kumar    Female    69500.00
Vats     Female    155000.00
Kumar    Male      103000.00
Shah     Male      55000.00
Shah     Female    112400.00
Kumar    Female    81030.00
Vats     Male      71900.00
Write a program in Python using Pandas to perform the following:
a. Calculate and display familywise gross monthly income.
b. Display the highest and lowest monthly income for each family name
c. Calculate and display monthly income of all members earning income less than Rs. 80000.00.
d. Display total number of females along with their average monthly income
e. Delete rows with Monthly income less than the average income of all members
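A minimal sketch of exercise 3 using the data frame given above. The variable names are illustrative, and part (e) is read here as dropping the rows whose income is below the overall mean.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Shah", "Vats", "Vats", "Kumar", "Vats",
             "Kumar", "Shah", "Shah", "Kumar", "Vats"],
    "Gender": ["Male", "Male", "Female", "Female", "Female",
               "Male", "Male", "Female", "Female", "Male"],
    "MonthlyIncome": [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
                      103000.00, 55000.00, 112400.00, 81030.00, 71900.00],
})

# (a) Family-wise gross monthly income.
family_total = df.groupby("Name")["MonthlyIncome"].sum()

# (b) Highest and lowest monthly income for each family name.
family_range = df.groupby("Name")["MonthlyIncome"].agg(["max", "min"])

# (c) Members earning less than Rs. 80000.00.
below_80k = df[df["MonthlyIncome"] < 80000.00]

# (d) Number of females and their average monthly income.
females = df[df["Gender"] == "Female"]
female_count = len(females)
female_mean = females["MonthlyIncome"].mean()

# (e) Delete rows with income below the average of all members.
average_income = df["MonthlyIncome"].mean()
kept = df.drop(df[df["MonthlyIncome"] < average_income].index)

print(family_total)
print(family_range)
print(below_80k)
print(female_count, female_mean)
print(kept)
```

For the data above, the family totals are Rs. 253530 (Kumar), Rs. 281400 (Shah) and Rs. 335050 (Vats), and four rows remain after step (e).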
