Lab - Manual FDS

The document describes the contents and objectives of a data science laboratory course: 1. The course objectives are to understand Python libraries for data science such as NumPy, Pandas, and Matplotlib, and to learn statistical analysis, data visualization, and machine learning techniques. 2. The list of experiments covers downloading and using data science packages, working with NumPy arrays, reading and exploring data, and applying techniques such as regression, correlation, and plotting. 3. Students will learn to use Python libraries for data science, apply statistical measures, perform descriptive analytics, and present data visually.

Uploaded by

Prabha K

DATA SCIENCE LABORATORY

DATA SCIENCE
LAB MANUAL
(Under Revision)

Prepared & Consolidated

by
Vignesh L S


CS3362 DATA SCIENCE LABORATORY (Under Revision)

COURSE OBJECTIVES:
• To understand the Python libraries for data science.
• To understand the basic statistical and probability measures for data science.
• To learn descriptive analytics on the benchmark data sets.
• To apply correlation and regression analytics on standard data sets.
• To present and interpret data using visualization packages in Python.
LIST OF EXPERIMENTS:
1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and
Pandas packages.
2. Working with NumPy arrays
3. Working with Pandas data frames
4. Reading data from text files, Excel and the web and exploring various commands for
doing descriptive analytics on the Iris data set.
5. Use the diabetes data set from UCI and Pima Indians Diabetes data set for performing the
following:
• Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation,
Skewness and Kurtosis.
• Bivariate analysis: Linear and logistic regression modeling
• Multiple Regression analysis
• Also compare the results of the above analysis for the two data sets.
6. Apply and explore various plotting functions on UCI data sets.
• Normal curves
• Density and contour plots
• Correlation and scatter plots
• Histograms
• Three dimensional plotting
7. Visualizing Geographic Data with Basemap

LIST OF EQUIPMENT: (30 Students per Batch)

Tools: Python, NumPy, SciPy, Matplotlib, Pandas, statsmodels, seaborn, plotly, bokeh
Note: Example data sets include UCI, Iris, and Pima Indians Diabetes.


COURSE OUTCOMES:
At the end of this course, the students will be able to:
• CO1: Make use of the Python libraries for data science.
• CO2: Make use of the basic statistical and probability measures for data science.
• CO3: Perform descriptive analytics on the benchmark data sets.
• CO4: Perform correlation and regression analytics on standard data sets.
• CO5: Present and interpret data using visualization packages in Python.


1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels
and Pandas packages.
Aim
To download and install Python and its packages using pip.
Procedure
Install Python Data Science Packages
Python is a high-level, general-purpose programming language with a large collection of data
science and machine learning packages. As a first step, install Python for Windows, macOS, or
Linux.
Python Packages
The power of Python lies in the packages that are available through the pip or conda
package managers. This section gives an overview of some widely used packages for machine
learning and data science and shows how to install them.
You may need to install the packages from the terminal, Anaconda prompt, command prompt,
or from within a Jupyter Notebook. If you have multiple versions of Python or have specific
dependencies, use an environment manager such as pyenv. For most users, a single
installation is typically sufficient. The Python package manager pip provides all of the
packages (such as gekko) that we need for this course. If there is an administrative access
error, install to the local profile with the --user flag.

pip install gekko

Gekko
Gekko provides an interface to gradient-based solvers for machine learning and
optimization of mixed-integer, differential algebraic equations, and time series models.
Gekko provides exact first and second derivatives through automatic differentiation and
discretization with simultaneous or sequential methods.

pip install gekko

Keras
Keras provides an interface for artificial neural networks. Multiple backend packages were
supported until version 2.4, after which TensorFlow was the only backend for the remaining
2.x releases. Keras 3 again supports several backends (TensorFlow, JAX, and PyTorch). The
backend is installed separately, for example with pip install tensorflow.

pip install keras


Matplotlib
The package matplotlib generates plots in Python.

pip install matplotlib

NumPy
NumPy is a numerical computing package for mathematics, science, and engineering. Many
data science packages use NumPy as a dependency.

pip install numpy

OpenCV
OpenCV (Open Source Computer Vision Library) is a package for real-time computer vision.
It was originally developed with support from Intel Research.

pip install opencv-python

Pandas
Pandas visualizes and manipulates data tables. There are many functions that allow efficient
manipulation for the preliminary steps of data analysis problems.

pip install pandas

Plotly
Plotly renders interactive plots with HTML and JavaScript. Plotly Express is included with
Plotly.

pip install plotly

PyTorch
PyTorch enables deep learning, computer vision, and natural language processing.
It was originally developed by Facebook's AI Research lab (FAIR).

pip install torch

Scikit-Learn
Scikit-Learn (or sklearn) includes a wide variety of classification, regression and clustering
algorithms including neural network, support vector machine, random forest, gradient
boosting, k-means clustering, and other supervised or unsupervised learning methods.

pip install scikit-learn

SciPy
SciPy is a general-purpose package for mathematics, science, and engineering and extends
the base capabilities of NumPy.

pip install scipy


Seaborn
Seaborn is built on matplotlib and produces detailed plots in a few lines of code.

pip install seaborn

Statsmodels
Statsmodels is a package for exploring data, estimating statistical models, and performing
statistical tests. It includes descriptive statistics, statistical tests, plotting functions, and
result statistics.

pip install statsmodels

TensorFlow
TensorFlow is an open source machine learning platform with a particular focus on training
and inference of deep neural networks. It was developed by the Google Brain team.

pip install tensorflow
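
After installation, it can help to confirm which packages Python can actually find. The following is a minimal sketch and not part of the original manual: the helper name installed_versions and the package list are illustrative assumptions. It relies only on the standard library module importlib.metadata, so it runs even when none of the listed packages are present.

```python
from importlib import metadata

# Distribution names as used with pip (illustrative list, adjust as needed).
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "statsmodels", "seaborn"]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:12s} {version or 'not installed'}")
```

A package reported as not installed can then be added with the matching pip install command from the list above.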


2. Working with NumPy arrays

CREATE A NUMPY NDARRAY OBJECT
NumPy is used to work with arrays. The array object in NumPy is called ndarray.

Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))

To create an ndarray, we can pass a list, tuple or any array-like object to the array()
function, and it will be converted into an ndarray:

Example
Use a tuple to create a NumPy array:
import numpy as np
arr = np.array((1, 2, 3, 4, 5))
print(arr)
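
As a small illustrative sketch (the variable names are assumptions, not taken from the manual), the conversion can be checked by building the same array from a list, a tuple, and a range, and by observing how NumPy picks a common element type when the input mixes integers and floats:

```python
import numpy as np

from_list = np.array([1, 2, 3])
from_tuple = np.array((1, 2, 3))
from_range = np.array(range(1, 4))

# All three inputs produce equal ndarrays.
print(np.array_equal(from_list, from_tuple))
print(np.array_equal(from_list, from_range))

# Mixing integers and floats upcasts every element to float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)      # float64
```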

Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays).
0-D Arrays
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

Example
Create a 0-D array with value 42
import numpy as np
arr = np.array(42)
print(arr)

1-D Arrays
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.
These are the most common and basic arrays.

Example
Create a 1-D array containing the values 1,2,3,4,5:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)

2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array. These are often used to
represent matrices or 2nd-order tensors.

Example
Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

3-D arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
These are often used to represent a 3rd-order tensor.

Example
Create a 3-D array with two 2-D arrays, both containing two arrays with the values
1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)

Check Number of Dimensions

NumPy arrays provide the ndim attribute, which returns an integer giving the number of
dimensions of the array.

Example
Check how many dimensions the arrays have:
import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
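
The ndim value is closely tied to the shape attribute. The sketch below is an addition to the manual's example. It reuses the same four arrays and prints both attributes to show that ndim is always the length of the shape tuple.

```python
import numpy as np

# The same four arrays as in the example above.
arrays = {
    "a": np.array(42),
    "b": np.array([1, 2, 3, 4, 5]),
    "c": np.array([[1, 2, 3], [4, 5, 6]]),
    "d": np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]]),
}

for name, arr in arrays.items():
    # ndim equals the number of entries in the shape tuple.
    print(name, "ndim =", arr.ndim, "shape =", arr.shape)
```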

Higher Dimensional Arrays

An array can have any number of dimensions.
When the array is created, you can define the number of dimensions by using the ndmin
argument.

Example
Create an array with 5 dimensions and verify that it has 5 dimensions:
import numpy as np
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('number of dimensions :', arr.ndim)

In this array the innermost dimension (5th dim) has 4 elements, the 4th dim has 1 element,
which is the vector, the 3rd dim has 1 element, which is the matrix containing the vector, the
2nd dim has 1 element, which is a 3-D array, and the 1st dim has 1 element, which is a 4-D array.
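
The nesting described here can be verified through the shape attribute. This is a short sketch added for illustration: ndmin pads the shape with leading axes of length 1, and indexing 0 along each of those axes recovers the original vector.

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

# ndmin prepends axes of length 1 until the array has 5 dimensions.
print(arr.shape)        # (1, 1, 1, 1, 4)

# Indexing 0 along each of the four outer axes recovers the original vector.
inner = arr[0, 0, 0, 0]
print(inner)            # [1 2 3 4]
```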


3. Working with Pandas data frames

A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a
table with rows and columns.

Example
Create a simple Pandas DataFrame:
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
# Load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)

Result
   calories  duration
0       420        50
1       380        40
2       390        45

Locate Row
As the result above shows, the DataFrame is like a table with rows and columns.
Pandas uses the loc attribute to return one or more specified rows.

Example
Return row 0:
#refer to the row index:

print(df.loc[0])

Result
calories    420
duration     50
Name: 0, dtype: int64
Note: This example returns a Pandas Series.

Example
Return row 0 and 1:
#use a list of indexes:
print(df.loc[[0, 1]])

Result
   calories  duration
0       420        50
1       380        40
Note: When a list of labels is passed inside [], the result is a Pandas DataFrame.
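
The difference between the two return types can be checked directly. The following sketch is an illustration rather than part of the original manual. It rebuilds the same DataFrame and inspects the type returned by loc for a single label, a list of labels, and a one-element list.

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

one_row = df.loc[0]         # single label -> Series
two_rows = df.loc[[0, 1]]   # list of labels -> DataFrame
one_row_df = df.loc[[0]]    # one-element list -> DataFrame with a single row

print(type(one_row).__name__)
print(type(two_rows).__name__)
print(one_row_df.shape)
```

Passing a one-element list is a convenient way to keep a single row in tabular form.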

Named Indexes
With the index argument, you can name your own indexes.

Example
Add a list of names to give each row a name:
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])
print(df)

Result
      calories  duration
day1       420        50
day2       380        40
day3       390        45

Locate Named Indexes

Use the named index in the loc attribute to return the specified row(s).

Example
Return "day2":
#refer to the named index:
print(df.loc["day2"])

Result
calories    380
duration     40
Name: day2, dtype: int64
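
Label-based selection with loc can also address a column or a range of rows. The sketch below goes slightly beyond the manual's example and is offered only as an illustration. It shows a row and column lookup, a label slice, and the positional counterpart iloc.

```python
import pandas as pd

data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# A single row label and a single column label return one value.
print(df.loc["day2", "calories"])      # 380

# A label slice includes both end points, unlike positional slicing.
print(df.loc["day1":"day2"])

# iloc selects by position instead of by label.
print(df.iloc[1].name)                 # day2
```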

Load Files Into a DataFrame

If your data sets are stored in a file, Pandas can load them into a DataFrame.

Example
Load a comma separated file (CSV file) into a DataFrame:

import pandas as pd
df = pd.read_csv('data.csv')
print(df)
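
Because data.csv is not supplied with the manual, the sketch below uses a small in-memory CSV as a stand-in. The column names and values are hypothetical and simply mirror the earlier DataFrame examples. read_csv accepts a file-like object as well as a path, so the same call pattern applies to a real file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for data.csv.
csv_text = """day,calories,duration
day1,420,50
day2,380,40
day3,390,45
"""

# read_csv accepts a path or any file-like object; index_col picks the row labels.
df = pd.read_csv(io.StringIO(csv_text), index_col="day")

print(df.head())       # first rows of the table
print(df.describe())   # count, mean, std, min, quartiles, max per numeric column
```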
