EX-02-Data manipulation pandas matplot

The document outlines data manipulation techniques using NumPy and Pandas, as well as data visualization methods with Matplotlib. It includes various programs demonstrating array creation, reshaping, copying, concatenating, and handling missing data, along with aggregation functions in Pandas. Additionally, it covers plotting techniques like line graphs, scatter plots, and bar charts using Matplotlib.

Uploaded by

keerthivasank.22cse


AIM:

To perform data manipulation using numpy and pandas and data visualization using matplotlib.

Program:
A. Data Manipulation using Numpy

NumPy stands for Numerical Python. It is a Python library for working with arrays.

array(): Creates a NumPy ndarray.

copy(): Returns a new array that owns its data. Changes made to the copy do not affect the original array, and vice versa.
view(): Returns a view of the original array. The view does not own the data, so changes made to the view affect the original array, and vice versa.
reshape(): Changes the dimensions (shape) of an array without changing its data.
concatenate(): Joins two or more arrays into a single array.
split(): The reverse of joining. The array and the number of splits are passed as arguments, and the array must divide evenly into that many parts.
where(): Searches an array for elements that match a condition and returns their indices.
sort(): Returns the elements arranged in an ordered sequence.
random: Module for working with random numbers.
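
The random module is listed above but is not used in the programs that follow. A minimal sketch of how it might be used is given here. The seed value 0 and the variable names are arbitrary choices for illustration, not part of the original exercise.

```python
import numpy as np

# Seeded generator so the numbers are reproducible (the seed 0 is arbitrary)
rng = np.random.default_rng(0)

ints = rng.integers(low=0, high=10, size=5)  # 5 integers in the range [0, 10)
floats = rng.random(3)                       # 3 floats in the range [0.0, 1.0)
shuffled = rng.permutation(np.arange(6))     # a shuffled copy of 0..5

print(ints)
print(floats)
print(shuffled)
```

The exact numbers depend on the seed and the NumPy version, so no output is shown.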

1. Program to create array and perform reshape:

import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
x = arr.reshape(2, 3)
print(x)

Output:
[[1 2 3]
 [4 5 6]]

2. Program to demonstrate copy and view

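import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()   # x owns its own data
arr[0] = 42
print(arr)
print(x)         # unchanged by the write to arr

arr = np.array([11, 12, 13, 14, 15])
y = arr.view()   # y shares data with arr
arr[0] = 42
print(arr)
print(y)         # reflects the write to arr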

Output:
[42 2 3 4 5]
[1 2 3 4 5]

[42 12 13 14 15]
[42 12 13 14 15]
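
One way to check whether an array owns its data is the `base` attribute of the ndarray. The sketch below is an illustration added for clarity. The names `c` and `v` are hypothetical.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
c = arr.copy()   # independent data
v = arr.view()   # shares data with arr

print(c.base is None)  # True: the copy owns its data
print(v.base is arr)   # True: the view borrows its data from arr

v[0] = 99              # writing through the view...
print(arr[0], c[0])    # ...changes arr but not the copy: 99 1
```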

3. Program to demonstrate concatenate and split

import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)
print(np.split(arr,2))

Output:
[1 2 3 4 5 6]
[array([1, 2, 3]), array([4, 5, 6])]
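
np.split() requires the array to divide evenly into the requested number of parts. When it does not, np.array_split() can be used instead. The short sketch below uses a 7-element array chosen only for illustration.

```python
import numpy as np

arr = np.arange(1, 8)           # [1 2 3 4 5 6 7], 7 elements
parts = np.array_split(arr, 3)  # np.split(arr, 3) would raise ValueError here
for p in parts:
    print(p)
# [1 2 3]
# [4 5]
# [6 7]
```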

4. Program to demonstrate sorting and searching

import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
print(np.where(arr==2))   # returns a tuple of index arrays, one per dimension

Output:
[0 1 2 3]
(array([1]),)
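
np.where() returns a tuple of index arrays, so the indices for a one-dimensional array are usually taken from element 0 of the result. The sketch below shows this, together with np.argsort() and the three-argument form of np.where(). Both are additions for illustration and are not part of the original program.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

idx = np.where(arr == 2)[0]  # index array for the first (only) dimension
print(idx)                   # [1]

order = np.argsort(arr)      # indices that would sort arr
print(order)                 # [2 3 1 0]
print(arr[order])            # [0 1 2 3]

# Three-argument form: choose element-wise between two values
labels = np.where(arr > 1, "big", "small")
print(labels)                # ['big' 'big' 'small' 'small']
```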

B. Data Manipulation using Pandas

Pandas is a Python library that allows us to analyze large datasets and draw conclusions based on statistical theory.
read_csv(file name): Loads a dataset from a CSV file.
dropna(): Removes rows or columns that contain missing values.
fillna(): Replaces NaN values with other values.
mean(): Returns the mean of the values along the requested axis.
sum(): Returns the sum of the values along the requested axis.
count(): Counts the number of non-NA/null observations along the given axis.

1. Program for handling missing data:

import numpy as np
import pandas as pd

# Load Dataset
dframe = pd.read_csv('Data.csv')
print(dframe)

# Dropping missing data (returns a new DataFrame; dframe itself is unchanged)
print(dframe.dropna())

# Filling missing data with the column mean
dframe.fillna(value = dframe.mean(), inplace = True)
print(dframe)

Output:
    a     b    c
0  23  10.0  0.0
1  24  12.0  NaN
2  22   NaN  NaN

    a     b    c
0  23  10.0  0.0

    a     b    c
0  23  10.0  0.0
1  24  12.0  0.0
2  22  11.0  0.0
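
dropna() and fillna() accept further options that are often useful. The sketch below rebuilds the values printed above as an inline DataFrame, so that Data.csv is not needed, and shows a few of these options. The choice of the median for column b and 0.0 for column c is an assumption made only for illustration.

```python
import numpy as np
import pandas as pd

# Same values as the printed dataset, built inline so no file is needed
dframe = pd.DataFrame({"a": [23, 24, 22],
                       "b": [10.0, 12.0, np.nan],
                       "c": [0.0, np.nan, np.nan]})

missing = dframe.isna().sum()      # missing values per column: a 0, b 1, c 2
cols_kept = dframe.dropna(axis=1)  # drop columns with any NaN, only 'a' remains
rows_kept = dframe.dropna(thresh=2)  # keep rows with at least 2 non-missing values
filled = dframe.fillna({"b": dframe["b"].median(), "c": 0.0})  # per-column fill values

print(missing)
print(cols_kept)
print(rows_kept)
print(filled)
```

None of these calls modifies dframe, because inplace=True is not passed.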

2. Program for merge and join of data frames using pandas:

import numpy as np
import pandas as pd

left = pd.DataFrame({'Key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'Key': ['K0', 'K1', 'K2', 'K3'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})

# Merging the dataframes on the shared 'Key' column
print(pd.merge(left, right, how ='inner', on ='Key'))

# Joining the dataframes on their index
left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']},
                    index= ['K0', 'K1', 'K2', 'K3'])

right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']},
                     index= ['K0', 'K1', 'K2', 'K3'])

print(left.join(right))

Output:
  Key   A   B   C   D
0  K0  A0  B0  C0  D0
1  K1  A1  B1  C1  D1
2  K2  A2  B2  C2  D2
3  K3  A3  B3  C3  D3

     A   B   C   D
K0  A0  B0  C0  D0
K1  A1  B1  C1  D1
K2  A2  B2  C2  D2
K3  A3  B3  C3  D3
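
In the program above both frames contain exactly the same keys, so every kind of join gives the same rows. The sketch below uses two small frames with only partly overlapping keys (made-up values for illustration) to show how how='inner' and how='outer' differ.

```python
import pandas as pd

left = pd.DataFrame({"Key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"Key": ["K1", "K2", "K3"], "C": ["C1", "C2", "C3"]})

# Inner merge keeps only the keys present in both frames (K1, K2)
inner = pd.merge(left, right, how="inner", on="Key")

# Outer merge keeps every key; indicator=True adds a '_merge' column
# recording whether each row came from the left, the right, or both
outer = pd.merge(left, right, how="outer", on="Key", indicator=True)

print(inner)
print(outer)
```

Rows that exist on one side only get NaN in the columns that come from the other side.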

3. Program for aggregation in pandas:

import numpy as np
import pandas as pd

df = pd.read_csv('data.csv')
print(df)

# Mean
print(df.mean())

# Count
print(df.count())

# Median
print(df.median())

# Sum
print(df.sum())

# Mad (Mean Absolute Deviation)
# df.mad() was removed in pandas 2.0, so it is computed directly here
print((df - df.mean()).abs().mean())

# Std (Standard deviation)
print(df.std())

# Var (Variance)
print(df.var())

Output:
          A         B
0  0.374540  0.155995
1  0.950714  0.058084
2  0.731994  0.866176
3  0.598658  0.601115
4  0.156019  0.708073

A    0.562385
B    0.477888
dtype: float64

A 5
B 5
dtype: int64

A 0.598658
B 0.601115
dtype: float64

A 2.811925
B 2.389442
dtype: float64

A 0.237685
B 0.296679
dtype: float64

A 0.308748
B 0.353125
dtype: float64

A 0.095325
B 0.124697
dtype: float64
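
Several aggregations can also be requested in one call with DataFrame.agg(). The sketch below rebuilds the printed values as an inline DataFrame (an assumption, so that data.csv is not needed) and collects four of the statistics above into a single table.

```python
import pandas as pd

# Values copied from the printed dataset above
df = pd.DataFrame({"A": [0.374540, 0.950714, 0.731994, 0.598658, 0.156019],
                   "B": [0.155995, 0.058084, 0.866176, 0.601115, 0.708073]})

# One row per statistic, one column per variable
summary = df.agg(["mean", "median", "std", "var"])
print(summary)
```

Each row of summary matches the corresponding individual result printed above.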

C. Data Visualization using Matplotlib

Matplotlib is a multi-platform data visualization library built on NumPy arrays. It supports several plot types, such as line, bar, scatter, and histogram plots.
figure(): Creates a figure, a single container that holds all the objects representing axes, graphics, text, and labels.
axes(): Creates an axes, a bounding box with ticks and labels that will eventually contain the plot elements making up the visualization.
xlim(), ylim(): Set the x and y limits of the axes.
axis([xmin, xmax, ymin, ymax]): Sets the x and y limits in a single call.
xlabel(), ylabel(): Set the labels of the axes.
legend(): Displays a legend built from the label of each plotted line.
plt.axis('equal'): Uses equal scaling on both axes, so one unit in x has the same length as one unit in y.
plt.axis('tight'): Sets the limits just large enough to show all the data.
plt.savefig(): Saves the graph to a file in the format given by the file extension.
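
A minimal sketch of how these calls fit together is given below. The data values and the 'Agg' backend (chosen so that the script runs without a display) are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so no display is needed
import matplotlib.pyplot as plt

fig = plt.figure()           # container for everything drawn
ax = plt.axes()              # bounding box with ticks and labels
ax.plot([1, 2, 3], [2, 4, 6], label="y = 2x")

plt.axis([0, 4, 0, 8])       # xmin, xmax, ymin, ymax in one call
plt.xlabel("x")
plt.ylabel("y")
plt.legend()

limits = (plt.xlim(), plt.ylim())  # calling xlim()/ylim() with no arguments reads the limits
print(limits)
```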

1. Line Drawing
Program:
import matplotlib.pyplot as plt
import numpy as np

# This style is named 'seaborn-whitegrid' in Matplotlib versions before 3.6
plt.style.use('seaborn-v0_8-whitegrid')
fig = plt.figure()
plt.xlim(1,10)
plt.ylim(1,20)
plt.axis('tight')
plt.title('Sin and Cos curves')
plt.xlabel('x')
plt.ylabel('Y')
x=np.linspace(1,20,10)
# Four straight lines, one per line style
plt.plot(x, x + 0, linestyle='solid', linewidth=5,label='solid')
plt.plot(x, x + 1, linestyle='dashed',label='dashed')
plt.plot(x, x + 2, linestyle='dashdot',label='dash dot')
plt.plot(x, x + 3, linestyle='dotted',label='dotted')
plt.legend()
plt.savefig('sincos.png') # Output graph saved in PNG format in the current folder

Output:
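
The program above is titled for sine and cosine curves but plots four straight lines to show the line styles. A sketch that plots actual sine and cosine curves might look like the following. It uses the object-oriented interface and an in-memory buffer instead of a file, both of which are choices made for this illustration only.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), linestyle="solid", label="sin(x)")
ax.plot(x, np.cos(x), linestyle="dashed", label="cos(x)")
ax.set_title("Sin and Cos curves")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_xlim(0, 2 * np.pi)
ax.set_ylim(-1.5, 1.5)
ax.legend()

buf = io.BytesIO()              # in-memory target; a filename such as 'sincos.png' also works
fig.savefig(buf, format="png")
```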

2. Scatter Plot
Program:
import matplotlib.pyplot as plt
import numpy as np
plt.xlim(1,10)
plt.ylim(0,10)
plt.title('Scatter plots')
plt.xlabel('x axis')
plt.ylabel('Y axis')
x = np.linspace(0, 10, 10)
y = x+2
plt.plot(x, y, 'o', color='green') # scatter plot using plt.plot()
# OR, equivalently:
# plt.scatter(x, y, marker='o', color='green') # scatter plot using plt.scatter()
plt.savefig('scatter.pdf') # Output graph saved in PDF format in the current folder

Output:
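
Unlike plt.plot(), plt.scatter() can give each point its own size and color. The sketch below reuses the x and y values from the program above. The size formula, the 'viridis' colormap, and the colorbar label are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 10)
y = x + 2

fig, ax = plt.subplots()
# s sets the marker area per point, c sets the value mapped to a color per point
points = ax.scatter(x, y, s=20 + 10 * x, c=y, cmap="viridis", marker="o")
fig.colorbar(points, ax=ax, label="y value")
ax.set_title("Scatter plot with per-point size and color")
```
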
3. Bar chart
Program:
import matplotlib.pyplot as plt

# x-coordinates of the bars (bar centers, by default)
left = [1, 2, 3, 4, 5]

# heights of bars
height = [10, 24, 36, 40, 5]

# labels for bars
tick_label = ['one', 'two', 'three', 'four', 'five']

# plotting a bar chart; the two colors repeat across the five bars
plt.bar(left, height, tick_label = tick_label,
        width = 0.8, color = ['red', 'green'])

# naming the x-axis
plt.xlabel('x - axis')
# naming the y-axis
plt.ylabel('y - axis')

# plot title
plt.title('My bar chart!')

# display the chart
plt.show()

Output: