
CSL 410 L05

Uploaded by

rpschauhan2003

Program: B.Tech (CSE), IV Semester, II Year

CSL-410: Data Science using Python


Unit No. 1
Tools and Techniques for Data Science

Lecture No. 05

Dr. Sanjay Jain


Associate Professor, CSA/SOET
Outline
• Tools for Data Science
• Advantages of Using Python
• Python Libraries for Data Science
• References
Course Outcomes
• CO.1: Understanding: Basic concepts of data science, application areas
and tools for data science.
• CO.2: Applying: Implementation of NumPy for handling numerical data
and pandas for handling data from data files.
• CO.3: Analyzing: Analyze the domain of the data; clean and prepare the
data for data science.
• CO.4: Evaluating: Evaluate and summarize the data using statistical and
visualization tools.
• CO.5: Creating: Create datasets for machine learning models.
Tools for Data Science

<CO: 1> <Reference No.: R1,R3,R4>


Advantages of Using Python

Python Libraries for Data Science


Many popular Python toolboxes/libraries:


– NumPy
– SciPy
– Pandas
– SciKit-Learn

Visualization libraries
– matplotlib
– Seaborn

and many more …


Python Libraries for Data Science

NumPy:
• introduces objects for multidimensional arrays and matrices, as well
as functions that make it easy to perform advanced mathematical and
statistical operations on those objects

• provides vectorization of mathematical operations on arrays and
matrices, which significantly improves performance

• many other Python libraries are built on NumPy

Link: http://www.numpy.org/
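
A minimal sketch of the NumPy points above: a multidimensional array, vectorized arithmetic without an explicit loop, and a few statistical and matrix operations. The array values and variable names are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

# A 2-D array (matrix) with illustrative values
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# Vectorized arithmetic: applied to every element, no explicit Python loop
scaled = matrix * 2.0 + 1.0

# Statistical operations, over the whole array or along one axis
column_means = matrix.mean(axis=0)  # mean of each column
overall_std = matrix.std()          # population standard deviation

# Matrix product of the 2x3 matrix with its 3x2 transpose
gram = matrix @ matrix.T

print(scaled)
print(column_means, overall_std)
print(gram)
```

The same element-wise update written as a nested Python loop would typically be much slower on large arrays, which is the performance benefit the slide refers to.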

Python Libraries for Data Science

SciPy:
• collection of algorithms for linear algebra, differential equations,
numerical integration, optimization, statistics and more

• part of the SciPy Stack

• built on NumPy

Link: https://www.scipy.org/scipylib/
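
A short sketch of three of the SciPy areas listed above (numerical integration, optimization, linear algebra). The functions being integrated and minimized, and the small linear system, are illustrative assumptions chosen so the exact answers are easy to check by hand.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Numerical integration: integral of x^2 over [0, 1] (exact value is 1/3)
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# Optimization: minimum of (x - 2)^2, which lies at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Linear algebra: solve A x = b, i.e. 3x + y = 9 and x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

print(area, result.x, x)
```

Note that `A`, `b` and `x` are ordinary NumPy arrays, which is what "built on NumPy" means in practice.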

Python Libraries for Data Science

Pandas:
• adds data structures and tools designed to work with table-like data
(Series and DataFrame objects, similar to vectors and data frames in R)

• provides tools for data manipulation: reshaping, merging, sorting,
slicing, aggregation etc.

• allows handling missing data

Link: http://pandas.pydata.org/
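
The manipulation tools listed above can be sketched on a small table. The column names and values below are made up for illustration; filling the missing mark with the column mean is only one possible way to handle missing data.

```python
import numpy as np
import pandas as pd

# Illustrative table-like data with one missing value
scores = pd.DataFrame({
    "student": ["A", "B", "C", "D"],
    "dept": ["CSE", "CSE", "ECE", "ECE"],
    "marks": [78.0, np.nan, 64.0, 90.0],
})
depts = pd.DataFrame({"dept": ["CSE", "ECE"], "block": ["North", "South"]})

# Handling missing data: fill the missing mark with the column mean
filled = scores.fillna({"marks": scores["marks"].mean()})

# Merging: attach department details to each student row
merged = filled.merge(depts, on="dept", how="left")

# Sorting and slicing: the two highest marks
ranked = merged.sort_values("marks", ascending=False)
top_two = ranked.iloc[:2]

# Aggregation: mean marks per department
dept_mean = merged.groupby("dept")["marks"].mean()

# Reshaping: one row per student, one column per department
wide = merged.pivot(index="student", columns="dept", values="marks")

print(top_two)
print(dept_mean)
print(wide)
```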

Python Libraries for Data Science
SciKit-Learn:
• provides machine learning algorithms: classification, regression,
clustering, model validation etc.

• built on NumPy, SciPy and matplotlib

Link: http://scikit-learn.org/
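
A minimal sketch of classification and model validation with scikit-learn. The choice of the bundled Iris dataset, logistic regression, a 75/25 split and 5-fold cross-validation are all assumptions made for illustration; the lecture does not prescribe a particular model or dataset.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

# Small example dataset that ships with scikit-learn
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Classification: fit on the training rows, predict the held-out rows
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

# Model validation: 5-fold cross-validation on the full dataset
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

print(f"hold-out accuracy: {accuracy:.3f}")
print(f"cross-validation mean accuracy: {cv_scores.mean():.3f}")
```

Other estimators follow the same `fit` / `predict` pattern, so swapping in a different classifier usually changes only the line that constructs `model`.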

Python Libraries for Data Science

matplotlib:
• Python 2D plotting library that produces publication-quality
figures in a variety of hardcopy formats

• a set of functionalities similar to those of MATLAB

• line plots, scatter plots, bar charts, histograms, pie charts etc.

• relatively low-level; some effort is needed to create advanced
visualizations

Link: https://matplotlib.org/
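
A sketch of four of the plot types named above, drawn on one figure and exported as PNG. The data, labels and figure size are illustrative assumptions; the figure is written to an in-memory buffer here, and passing a file name such as "plots.pdf" to `savefig` instead would produce a hardcopy file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data
x = np.linspace(0, 2 * np.pi, 50)
rng = np.random.default_rng(0)
samples = rng.normal(size=200)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, np.sin(x))
axes[0, 0].set_title("Line plot")

axes[0, 1].scatter(x, np.cos(x))
axes[0, 1].set_title("Scatter plot")

axes[1, 0].bar(["A", "B", "C"], [3, 5, 2])
axes[1, 0].set_title("Bar chart")

axes[1, 1].hist(samples, bins=20)
axes[1, 1].set_title("Histogram")

fig.tight_layout()

# Export the figure as PNG into an in-memory buffer
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
```

Each panel needs its own calls for titles and layout, which illustrates the "relatively low-level" point on the slide.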

Python Libraries for Data Science
Seaborn:
• based on matplotlib

• provides a high-level interface for drawing attractive statistical graphics

• similar (in style) to the popular ggplot2 library in R

Link: https://seaborn.pydata.org/
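
A minimal sketch of the high-level interface: one call draws a statistical plot directly from a DataFrame and labels the axes from the column names. The synthetic data and column names are assumptions for illustration, and `sns.set_theme` assumes seaborn 0.11 or later.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: two groups with different means
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["A", "B"], 50),
    "value": np.concatenate([rng.normal(0, 1, 50), rng.normal(1, 1, 50)]),
})

sns.set_theme(style="whitegrid")  # apply a seaborn style to matplotlib figures

# One call produces a box plot per group on a regular matplotlib Axes
fig, ax = plt.subplots()
sns.boxplot(data=df, x="group", y="value", ax=ax)
ax.set_title("Distribution of value by group")
```

Because the result is a regular matplotlib `Axes`, it can still be customized or saved with the usual matplotlib calls.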

Learning Outcomes

Students have learned and understood the following:


•Tools for Data Science
•Advantages of Using Python
•Python Libraries for Data Science
References
• Data Science with Python by Aaron England, Mohamed Noordeen Alaudeen, and
Rohan Chopra. Packt Publishing, July 2019.
• https://intellipaat.com/blog/what-is-data-science/
• https://onlinecourses.nptel.ac.in/noc20_cs36/
Thank You
