Unit 1 Pandas - Charts

Uploaded by

Bhavya Bhatt

Data Visualization

"A picture is worth a thousand words." Most of us are familiar with this expression. Data visualization plays an essential role
in the presentation of both small and large-scale data, and it matters most when explaining the analysis of increasingly
large datasets.
Data visualization is the discipline of placing data in a visual context so that it can be understood. Its main goal is to
condense large datasets into graphics that make complex relationships within the data easy to grasp. Several data
visualization libraries are available in Python, including Matplotlib, Seaborn and Folium.
Purpose of data visualization
• Better analysis and quick action
• Identifying patterns and finding errors
• Understanding the story, exploring business insights and grasping the latest trends
Plotting library
Matplotlib is a Python package (library) used to create 2D graphs and plots from Python scripts. Pyplot is a module in
matplotlib that supports a wide variety of graphs and plots, such as histograms, bar charts, power spectra and error charts.
Used together with NumPy, it provides a MATLAB-like plotting environment. The statement import matplotlib.pyplot as plt
makes pyplot available under the short name plt, which every chart in these notes uses.
Pyplot provides the state-machine interface to the plotting library in matplotlib. This means that figures and axes are created
implicitly and automatically as needed. For example, calling plot from pyplot creates the necessary figure and axes if none
exist, and a later call to title sets the title on the current axes object. The pyplot interface is convenient for short scripts
and interactive work; for larger programs, matplotlib's explicit object-oriented interface is generally recommended.
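The implicit behaviour described above can be sketched as follows. This is a minimal illustration, not part of the original notes: the Agg backend and the sample values are assumptions chosen so that the snippet runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

plt.close("all")                          # start from a clean pyplot state
figures_before = len(plt.get_fignums())   # no figure exists yet

# plot() finds no current figure, so pyplot creates a figure and an axes implicitly.
plt.plot([1, 2, 3], [5, 4, 6])
figures_after = len(plt.get_fignums())

# title() acts on the "current" axes, which is the one plot() just created.
plt.title("Implicit axes")
current_title = plt.gca().get_title()

print(figures_before, figures_after, current_title)
```

Here plt.gca() ("get current axes") is only used to inspect the state that pyplot keeps behind the scenes.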
The matplotlib library provides the following features for data visualization.
• Drawing - plots are drawn from the data passed to specific functions.
• Customization - plots can be customized through the arguments of those functions: color, style (dashed, dotted) and
width can be set, and labels, a title and a legend can be added.
• Saving - after drawing and customization, plots can be saved in formats such as .pdf, .png and .eps for future use.
Customizing / adding details to plots

[Figure: a sample plot annotated with its title, legend, x label, y label, x limit range and y limit range]

Function                 Purpose
1. title()               Adds a title to the plot.
2. xticks() / yticks()   Sets the ticks on the x-axis / y-axis.
3. xlim() / ylim()       Sets the x limit / y limit.
4. xlabel() / ylabel()   Sets the x-axis label / y-axis label.
5. legend()              Adds a legend to the plot. The legend shows which color identifies which set of data plotted on the chart.
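Notes - linewidth, markersize and numeric font sizes (fontsize) are measured in points. rwidth is different: it is the relative width of the histogram bars as a fraction of the bin width (for example, rwidth=0.9). Its default is None, in which case matplotlib chooses the width automatically.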
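As a minimal sketch of the customization functions listed above, the snippet below sets a title, axis labels, a y limit, y ticks and a legend on one small chart and then reads the settings back. The data values and the Agg backend are assumptions for illustration. It also passes label= to plot() and calls legend() without a list, which is an alternative to the legend([...]) form used elsewhere in these notes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

months = ["jan", "feb", "mar"]   # illustrative values
sales = [5, 4, 6]

plt.close("all")
plt.plot(months, sales, color="g", linewidth=2, label="Sales")
plt.title("Monthly sales")       # title()
plt.xlabel("Months")             # xlabel()
plt.ylabel("Sales")              # ylabel()
plt.ylim(0, 10)                  # ylim(): visible range of the y-axis
plt.yticks([0, 5, 10])           # yticks(): where tick marks are drawn
plt.legend(loc="best")           # legend(): uses the label= given to plot()

ax = plt.gca()
y_limits = ax.get_ylim()
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(y_limits, legend_labels)
```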
Line chart - A line chart or line graph is a type of chart that displays information as a series of data points, called
'markers', connected by straight line segments.
plt.plot(x-axis column, y-axis column, color, marker, markersize, markeredgecolor, linestyle, linewidth)
Data for Examples 1 and 2:
Month   Sales
Jan     5
Feb     4
Mar     6
Apr     2
May     7
June    8

Data for Example 3:
Month   Sales1  Sales2
Jan     5       8
Feb     4       7
Mar     6       2
Apr     2       6
May     7       4
June    8       5

Example 1 - simple line chart
import pandas as pd
import matplotlib.pyplot as plt
x = pd.DataFrame({
    'month' : ['jan', 'feb', 'mar', 'apr', 'may', 'june'],
    'sales1' : [5, 4, 6, 2, 7, 8]})
plt.plot(x['month'], x['sales1'], color='g')
plt.title('Line Chart - Monthwise Sales of 2016')
plt.xlabel('Months')
plt.ylabel('Sales')
plt.legend(['Sales Values'], loc='best')
# raw string, so the backslash in the Windows path is not read as an escape character
plt.savefig(r'd:\linechart.pdf')
plt.show()

Example 2 - line chart with marker and line style (same program as Example 1, except for the plot call)
plt.plot(x['month'], x['sales1'], color='g', marker='X', markersize=15,
         markeredgecolor='blue', linestyle='dashdot', linewidth=5)

Example 3 - two lines on one chart
import pandas as pd
import matplotlib.pyplot as plt
x = pd.DataFrame({
    'month' : ['jan', 'feb', 'mar', 'apr', 'may', 'june'],
    'sales1' : [5, 4, 6, 2, 7, 8],
    'sales2' : [8, 7, 2, 6, 4, 5]})
plt.plot(x['month'], x['sales1'], color='g')
plt.plot(x['month'], x['sales2'], color='b')
plt.title('Line Chart - Monthwise Sales of 2016')
plt.xlabel('Months')
plt.ylabel('Sales')
plt.legend(['Sales1', 'Sales2'], loc='best')
plt.savefig(r'd:\linechart.pdf')
plt.show()
marker = 'x' or 'X' or '+' or 'D' or 'o' (note: capital 'O' is not a valid marker)
linestyle = 'solid' or 'dashdot' or 'dotted' or 'dashed'
Bar chart - A bar chart or bar graph is a chart that represents categorical data with rectangular bars whose heights or
lengths are proportional to the values they represent. The bars can be plotted vertically or horizontally.
plt.bar(x-axis column, y-axis column, color, width) for vertical bars; plt.barh(y-axis column, x-axis column, color) for horizontal bars
Data for Example 1 (first three months only):
Month   Sales
Jan     5
Feb     4
Mar     6

Data for Example 2:
Month   Sales
Jan     5
Feb     4
Mar     6
Apr     2
May     7
June    8

Data for Example 3:
Month   Sales1  Sales2
Jan     5       8
Feb     4       7
Mar     6       2
Apr     2       6
May     7       4
June    8       5

Example 1 - vertical bars with a separate color and width for each bar
import pandas as pd
import matplotlib.pyplot as plt
x = pd.DataFrame({
    'month' : ['jan', 'feb', 'mar'],
    'sales1' : [5, 4, 6]})
plt.bar(x['month'], x['sales1'], color=['g', 'r', 'b'], width=[0.2, 0.5, 0.9])
plt.title('Bar Chart - Monthwise Sales of 2016')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.legend(['year 2016'], loc='best')
# raw string: in a normal string, '\b' would be read as a backspace character
plt.savefig(r'd:\barchart.pdf')
plt.show()

Example 2 - horizontal bars
import pandas as pd
import matplotlib.pyplot as plt
x = pd.DataFrame({
    'month' : ['jan', 'feb', 'mar', 'apr', 'may', 'june'],
    'sales1' : [5, 4, 6, 2, 7, 8]})
plt.barh(x['month'], x['sales1'], color='g')
plt.title('Bar Chart - Monthwise Sales of 2016')
plt.xlabel('Sales')      # with barh, the values run along the x-axis
plt.ylabel('Month')
plt.legend(['year 2016'], loc='best')
plt.savefig(r'd:\barchart.pdf')
plt.show()

Example 3 - multiple (grouped) bars
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
x = pd.DataFrame({
    'month' : ['jan', 'feb', 'mar', 'apr', 'may', 'june'],
    'sales1' : [5, 4, 6, 2, 7, 8],
    'sales2' : [8, 7, 2, 6, 4, 5]})
r = np.arange(len(x.month))      # bar positions 0, 1, 2, ...
plt.bar(r, x['sales1'], color='g', width=0.35)
plt.bar(r + 0.35, x['sales2'], color='b', width=0.35)   # shifted right by one bar width
plt.title('Bar Chart - Monthwise Sales of 2016')
plt.xticks(r, x['month'])        # replace 0, 1, 2, ... with the month names
plt.xlabel('Months')
plt.ylabel('Sales')
plt.legend(['Sales1', 'Sales2'], loc='best')
plt.savefig(r'd:\barchart.pdf')
plt.show()
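In a grouped bar chart, each series is shifted right by one bar width, which is why the second bar() call above uses r + 0.35. The sketch below generalises that arithmetic to any number of series and also centres each tick label under its group. The helper grouped_bar_positions is hypothetical (it is not a matplotlib function), and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

def grouped_bar_positions(n_groups, n_series, bar_width):
    """Hypothetical helper: x positions for each series, plus one tick per group."""
    base = np.arange(n_groups)
    # Series i is shifted right by i bar widths.
    bar_positions = [base + i * bar_width for i in range(n_series)]
    # The tick sits at the centre of the group of bars.
    tick_positions = base + bar_width * (n_series - 1) / 2
    return bar_positions, tick_positions

months = ["jan", "feb", "mar", "apr", "may", "june"]
series = {"Sales1": [5, 4, 6, 2, 7, 8], "Sales2": [8, 7, 2, 6, 4, 5]}
width = 0.35

bar_positions, tick_positions = grouped_bar_positions(len(months), len(series), width)

plt.close("all")
for positions, (name, values) in zip(bar_positions, series.items()):
    plt.bar(positions, values, width=width, label=name)
plt.xticks(tick_positions, months)
plt.legend(loc="best")
print(tick_positions[:3])
```

With two series of width 0.35, the ticks land at 0.175, 1.175, and so on, halfway between the two bars of each month.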
Histogram chart - A histogram summarises discrete or continuous numerical data such as weights or heights. It gives a
visual interpretation of the data by showing the number of data points that fall within each specified range of values
(called a bin). It resembles a vertical bar graph, but unlike a vertical bar graph, a histogram shows no gaps between
the bars.
plt.hist(column name, bins=list of bin edges, orientation='horizontal'/'vertical', rwidth, histtype='step'/'stepfilled', cumulative=True/False)
Data:
Month   Sales
Jan     5
Feb     4
Mar     6
Apr     2
May     7
June    8

Bins with an interval of 3 (Example 1):
Bins    Frequency
2-5     2
5-8     4

Bins with an interval of 2 (Examples 2 and 3):
Bins    Frequency
2-4     1
4-6     2
6-8     3

Common setup for all three examples
import pandas as pd
import matplotlib.pyplot as plt
x = pd.DataFrame({
    'month' : ['jan', 'feb', 'mar', 'apr', 'may', 'june'],
    'sales' : [5, 4, 6, 2, 7, 8]})

Example 1 - simple histogram, bins of width 3
plt.hist(x['sales'], bins=[2, 5, 8])
plt.title('Simple Histogram Chart of Sales of 2016')
plt.xlabel('Sales Bins or Interval by 3')
plt.ylabel('Sales Frequency values')
plt.legend(['Sales Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')   # raw string keeps the backslash as a path separator
plt.show()

Example 2 - histogram, bins of width 2
plt.hist(x['sales'], bins=[2, 4, 6, 8])
plt.title('Histogram Chart of Sales of 2016')
plt.xlabel('Sales Bins or Interval by 2')
plt.ylabel('Sales Frequency values')
plt.legend(['Sales Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Example 3 - horizontal histogram (the axis labels swap)
plt.hist(x['sales'], bins=[2, 4, 6, 8], orientation='horizontal')
plt.title('Horizontal Histogram Chart of Sales of 2016')
plt.xlabel('Sales Frequency values')
plt.ylabel('Sales Bins or Interval by 2')
plt.legend(['Sales Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()
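The bin frequencies in the tables above can be checked without drawing anything. The sketch below uses NumPy's histogram function, which applies the same bin-edge rule as plt.hist: every bin includes its left edge and excludes its right edge, except the last bin, which includes both. That rule is why the value 4 is counted in the 4-6 bin and the value 8 is still counted in the 6-8 bin.

```python
import numpy as np

sales = [5, 4, 6, 2, 7, 8]

# Bins of width 2: [2, 4), [4, 6), [6, 8]
counts_by_2, edges_by_2 = np.histogram(sales, bins=[2, 4, 6, 8])

# Bins of width 3: [2, 5), [5, 8]
counts_by_3, edges_by_3 = np.histogram(sales, bins=[2, 5, 8])

print(counts_by_2.tolist())  # [1, 2, 3]
print(counts_by_3.tolist())  # [2, 4]
```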
Cumulative histogram chart and frequency polygon chart - A cumulative histogram shows, for each bin, the count of that
bin added to the counts of all bins before it. A frequency polygon is a type of frequency distribution graph: the number of
observations in each interval is marked with a single point at the midpoint of the interval, and straight lines then connect
the points. This makes it easy to compare two or more distributions on the same set of axes.
plt.hist(column name, bins=list of bin edges, orientation='horizontal'/'vertical', rwidth, histtype='step'/'stepfilled', cumulative=True/False)
Data:
Month   Sales
Jan     5
Feb     4
Mar     6
Apr     2
May     7
June    8

Bins with an interval of 2:
Bins    Frequency   Cumulative
2-4     1           1
4-6     2           1+2=3
6-8     3           3+3=6

Common setup for all three examples
import pandas as pd
import matplotlib.pyplot as plt
x = pd.DataFrame({
    'month' : ['jan', 'feb', 'mar', 'apr', 'may', 'june'],
    'sales' : [5, 4, 6, 2, 7, 8]})

Example 1 - cumulative histogram
plt.hist(x['sales'], bins=[2, 4, 6, 8], cumulative=True)
plt.title('Cumulative Histogram Chart of Sales of 2016')
plt.xlabel('Sales Bins or Interval')
plt.ylabel('Sales Frequency values')
plt.legend(['Sales Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')   # raw string keeps the backslash as a path separator
plt.show()

Example 2 - step histogram (used in these notes as the frequency polygon chart)
plt.hist(x['sales'], bins=[2, 4, 6, 8], histtype='step')
plt.title('Frequency Polygon Chart of Sales of 2016')
plt.xlabel('Sales Bins or Interval by 2')
plt.ylabel('Sales Frequency values')
plt.legend(['Sales Frequencies'], loc='upper left')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Example 3 - histogram with rwidth (bar graph style, with gaps between the bars)
plt.hist(x['sales'], bins=[2, 4, 6, 8], rwidth=0.9)
plt.title('Histogram Chart of Sales using rwidth - bar graph style')
plt.xlabel('Sales Bins or Interval by 2')
plt.ylabel('Sales Frequency values')
plt.legend(['Sales Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()
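The definition of a frequency polygon given above (one point at the midpoint of each interval, joined by straight lines) can also be drawn directly with plot(). The following is a sketch of that alternative, not the histtype='step' approach these notes use. It computes the bin counts with NumPy and also shows how the cumulative column of the table is obtained. The Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

sales = [5, 4, 6, 2, 7, 8]
edges = np.array([2, 4, 6, 8])

counts, _ = np.histogram(sales, bins=edges)
midpoints = (edges[:-1] + edges[1:]) / 2   # centre of each bin
cumulative = np.cumsum(counts)             # running total, as in the table

plt.close("all")
plt.hist(sales, bins=edges, color="lightgray", label="Histogram")
plt.plot(midpoints, counts, color="b", marker="o", label="Frequency polygon")
plt.xlabel("Sales Bins or Interval by 2")
plt.ylabel("Sales Frequency values")
plt.legend(loc="best")

print(midpoints.tolist(), counts.tolist(), cumulative.tolist())
```

Drawing the polygon over a light histogram of the same data makes it easy to see that each point sits at the top centre of its bar.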
Different color codes
Character   Color
'b'         blue
'g'         green
'r'         red
'm'         magenta
'y'         yellow
'k'         black
'c'         cyan
'w'         white

linestyle values
'solid', 'dashdot', 'dotted', 'dashed'

Marker types for plotting
Marker      Description
'.'         point marker
','         pixel marker
'o'         circle marker
'+'         plus marker
'x'         x marker
'D'         diamond marker
'd'         thin diamond marker
's'         square marker
'p'         pentagon marker
'*'         star marker
'h'         hexagon1 marker
'H'         hexagon2 marker
'1'         tri down marker
'2'         tri up marker
'3'         tri left marker
'4'         tri right marker
'v'         triangle down marker
'^'         triangle up marker
'<'         triangle left marker
'>'         triangle right marker
'|', '_'    vline, hline markers

loc values (legend position)
'best', 'upper right', 'upper left', 'lower left', 'lower right', 'right', 'center left', 'center right', 'lower center',
'upper center', 'center'
Difference between bar graph and histogram

[Figure: a bar graph] In a bar graph, there are gaps between the bars.

[Figure: a histogram] In a histogram, there are no gaps between the bars.

Difference between histogram and frequency polygon

Histogram
1. The number of observations is not marked with a single point at the midpoint of an interval.
2. The insides of the bars are always filled with color.

Frequency polygon
1. The number of observations is marked with a single point at the midpoint of an interval.
2. Nothing is filled with color; only the boundaries are shown.
3. The parameter histtype='step' is used.

Changing the default limit values of a chart using xlim() / ylim()
Without xlim() / ylim(): the chart uses default x limit and y limit values taken from the data.
Default - x-axis 40 to 45, y-axis 10 to 15
With xlim() / ylim(): we can set the x limit and y limit values of a chart to values of our choice.
User defined - x-axis 35 to 50, y-axis 5 to 20

import matplotlib.pyplot as plt
import pandas as pd
x = pd.DataFrame({
    'temp' : [40, 42, 45],
    'sales' : [10, 12, 15]})
plt.scatter(x['temp'], x['sales'], color='b', marker='x')
# leave out the next two lines to keep the default limits
plt.xlim(35, 50)
plt.ylim(5, 20)
plt.xlabel('temperature', fontsize=16)
plt.ylabel('Sales', fontsize=16)
plt.title('scatter plot - temperature vs sales', fontsize=20)
plt.legend(['Sales'], loc='best')
plt.savefig(r'd:\scatterchart.pdf')   # raw string keeps the backslash as a path separator
plt.show()
Changing the default tick labels of a chart using xticks() / yticks()
Tick positions only: xticks() / yticks() with one argument place the ticks at the given values and label them with those values.
x-axis ticks at 40, 42, 45; y-axis ticks at 10, 12, 15
Tick positions with labels: a second argument replaces the numbers with labels of our choice.
x-axis 40 - t1, 42 - t2, 45 - t3; y-axis 10 - s1, 12 - s2, 15 - s3

import matplotlib.pyplot as plt
import pandas as pd
x = pd.DataFrame({
    'temp' : [40, 42, 45],
    'sales' : [10, 12, 15]})
plt.scatter(x['temp'], x['sales'], color='b', marker='x')
# tick positions only:
#   plt.xticks(x['temp'])
#   plt.yticks(x['sales'])
# tick positions with custom labels:
plt.xticks(x['temp'], ['t1', 't2', 't3'])
plt.yticks(x['sales'], ['s1', 's2', 's3'])
plt.xlabel('temperature', fontsize=16)
plt.ylabel('Sales', fontsize=16)
plt.title('scatter plot - temperature vs sales', fontsize=20)
plt.legend(['Sales'], loc='best')
plt.savefig(r'd:\scatterchart.pdf')   # raw string keeps the backslash as a path separator
plt.show()
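Axis limits and tick labels are independent settings, so the two techniques above can be combined on one chart. The sketch below applies both to the temperature and sales data and then reads the resulting limits and tick positions back from the axes. The Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

temp = [40, 42, 45]
sales = [10, 12, 15]

plt.close("all")
plt.scatter(temp, sales, color="b", marker="x")

# Widen the visible range beyond the data.
plt.xlim(35, 50)
plt.ylim(5, 20)

# Put ticks only at the data values and give them custom labels.
plt.xticks(temp, ["t1", "t2", "t3"])
plt.yticks(sales, ["s1", "s2", "s3"])

ax = plt.gca()
x_limits = ax.get_xlim()
x_ticks = ax.get_xticks().tolist()
print(x_limits, x_ticks)
```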
Sin, cos, log and exp line charts
The four programs are identical except for the function applied to a and the word (sin, cos, log or exp) used in the title,
labels and legend.

Chart             Function line
Sin line chart    b = np.sin(a)
Cos line chart    b = np.cos(a)
Log line chart    b = np.log(a)
Exp line chart    b = np.exp(a)

Program for the sin line chart (for the other charts, change the function line and replace 'sin' in the text strings)
import numpy as np
import matplotlib.pyplot as plt
a = np.arange(1, 10)
b = np.sin(a)
plt.plot(a, b)
plt.title('Simple Plot Chart of sin values')
plt.xlabel('sin range')
plt.ylabel('sin values')
plt.legend(['sin values'], loc='best')
plt.savefig(r'd:\chart.pdf')   # raw string keeps the backslash as a path separator
plt.show()
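Because np.arange(1, 10) gives only nine x values, the curves above are drawn as a few straight segments. As a sketch of one way to get smoother curves (an addition to the notes, with the range 0 to 2π and the count of 200 points chosen only for illustration), np.linspace can generate many evenly spaced x values. The Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# 200 evenly spaced points between 0 and 2*pi (illustrative choices)
a = np.linspace(0, 2 * np.pi, 200)
b_sin = np.sin(a)
b_cos = np.cos(a)

plt.close("all")
plt.plot(a, b_sin, color="g", label="sin values")
plt.plot(a, b_cos, color="b", linestyle="dashed", label="cos values")
plt.title("Sin and cos values on one chart")
plt.xlabel("angle in radians")
plt.ylabel("function values")
plt.legend(loc="best")

line_count = len(plt.gca().get_lines())
print(a.size, line_count)
```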
1. Consider the rainfall data given below and write code to:
(a). Create bar charts to see the distribution of rainfall from Jan - Dec for all the zones.
(b). Create a line chart to observe any trends from Jan to Dec.

Rainfall in mm
Zones    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
North    140  130  130  190  160  200  150  170  190  170  150  120
South    160  200  130  200  200  170  110  160  130  140  170  200
East     140  180  150  170  190  140  170  180  190  150  140  170
West     180  150  200  120  180  140  110  130  150  190  110  140
Central  110  160  130  110  120  170  130  200  150  160  170  130

Common setup for (a) and (b)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'Zones' : ['North', 'South', 'East', 'West', 'Central'],
    'Jan' : [140, 160, 140, 180, 110],
    'Feb' : [130, 200, 180, 150, 160],
    'Mar' : [130, 130, 150, 200, 130],
    'Apr' : [190, 200, 170, 120, 110],
    'May' : [160, 200, 190, 180, 120],
    'Jun' : [200, 170, 140, 140, 170],
    'Jul' : [150, 110, 170, 110, 130],
    'Aug' : [170, 160, 180, 130, 200],
    'Sep' : [190, 130, 190, 150, 150],
    'Oct' : [170, 140, 150, 190, 160],
    'Nov' : [150, 170, 140, 110, 170],
    'Dec' : [120, 200, 170, 140, 130]})
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

Answer (a) - multiple bar chart
x = np.arange(len(df.Zones))
# one bar per month in each zone; each month is shifted right by one bar width (0.05)
for i, month in enumerate(months):
    plt.bar(x + i * 0.05, df[month], width=0.05)
plt.xticks(x, df.Zones)
plt.title('Multiple Bar Chart Jan to Dec Rainfall, with different Zones')
plt.xlabel('Names of Zones')
plt.ylabel('Monthwise Rainfall in mm - Jan to Dec values')
plt.legend(months, loc='best')      # a flat list of labels, one per bar() call
plt.savefig(r'd:\chart.pdf')        # raw string keeps the backslash as a path separator
plt.show()

Answer (b) - multiple line chart
# one line per month, drawn across the zones
for month in months:
    plt.plot(df['Zones'], df[month])
plt.title('Multiple Line Chart Jan to Dec Rainfall, with different Zones')
plt.xlabel('Names of Zones')
plt.ylabel('Monthwise Rainfall in mm - Jan to Dec values')
plt.legend(months, loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()
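The line chart in the answer to part (b) puts the zones on the x-axis and draws one line per month. Another reading of "trends from Jan to Dec" is to put the months on the x-axis and draw one line per zone, so that each line shows how a zone's rainfall changes through the year. The sketch below takes that alternative approach; it is an assumption about the intent of the question, not the answer given in these notes. The Agg backend is also an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Rainfall in mm, one row per zone, copied from the table in the question.
rainfall = {
    "North":   [140, 130, 130, 190, 160, 200, 150, 170, 190, 170, 150, 120],
    "South":   [160, 200, 130, 200, 200, 170, 110, 160, 130, 140, 170, 200],
    "East":    [140, 180, 150, 170, 190, 140, 170, 180, 190, 150, 140, 170],
    "West":    [180, 150, 200, 120, 180, 140, 110, 130, 150, 190, 110, 140],
    "Central": [110, 160, 130, 110, 120, 170, 130, 200, 150, 160, 170, 130],
}

plt.close("all")
for zone, values in rainfall.items():
    plt.plot(months, values, marker="o", label=zone)   # one line per zone
plt.title("Zonewise Rainfall Trend, Jan to Dec")
plt.xlabel("Months")
plt.ylabel("Rainfall in mm")
plt.legend(loc="best")

line_count = len(plt.gca().get_lines())
print(line_count)
```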
2. Consider the data given below and write code to create:
(a). A line chart depicting the prices of the apps.
(b). A bar chart depicting the downloads of the apps.
(c). Create the Est Downloads sequence, which has each download value divided by 1000. Now create a bar chart that
plots multiple bars for prices as well as est downloads.

App Name         App Price in Rs   Total Downloads
Angry Bird       75                197000
Teen Titan       120               209000
Marvel Comics    190               414000
ColorMe          245               196000
Fun Run          550               272000
Crazy Taxi       55                311000
Igram Pro        175               213000
Wapp Pro         75                455000
Maths formulas   140               278000

Common setup for (a), (b) and (c)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'App Name' : ['Angry Bird', 'Teen Titan', 'Marvel Comics', 'ColorMe', 'Fun Run',
                  'Crazy Taxi', 'Igram Pro', 'Wapp Pro', 'Maths formulas'],
    'App Price in Rs' : [75, 120, 190, 245, 550, 55, 175, 75, 140],
    'Total Downloads' : [197000, 209000, 414000, 196000, 272000, 311000, 213000, 455000, 278000]})

Answer (a) - line chart of prices
plt.plot(df['App Name'], df['App Price in Rs'])
plt.title('Simple Line Chart')
plt.xlabel('App Name')
plt.ylabel('App Price in Rs')
plt.legend(['App Price in Rs'], loc='best')
plt.savefig(r'd:\chart.pdf')   # raw string keeps the backslash as a path separator
plt.show()

Answer (b) - bar chart of downloads
plt.bar(df['App Name'], df['Total Downloads'])
plt.title('Simple Bar Chart')
plt.xlabel('App Name')
plt.ylabel('Total Downloads')
plt.legend(['Total Downloads'], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()

Answer (c) - multiple bar chart of prices and est downloads
df['Est Downloads'] = df['Total Downloads'] / 1000
x = np.arange(len(df['App Name']))
plt.bar(x, df['App Price in Rs'], width=.25)
plt.bar(x + 0.25, df['Est Downloads'], width=.25)
plt.xticks(x, df['App Name'])
plt.title('Multiple Bar Chart')
plt.xlabel('App Name')
plt.ylabel('App Price in Rs and Est Downloads')
plt.legend(['App Price in Rs', 'Est Downloads'], loc='best')   # a flat list of labels
plt.savefig(r'd:\chart.pdf')
plt.show()
3. Given a data frame df1 as shown below:
     1990   2000   2010
a    52     340    890
b    64     480    560
c    78     688    1102
d    94     766    889
Write code to create:
(a). A scatter chart from the 1990 and 2010 columns of dataframe df1
(b). A line chart from the 1990 and 2010 columns of dataframe df1
(c). A bar chart plotting the three columns of dataframe df1

Common setup for (a), (b) and (c)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({1990 : [52, 64, 78, 94],
                   2000 : [340, 480, 688, 766],
                   2010 : [890, 560, 1102, 889]}, index=['a', 'b', 'c', 'd'])

Answer (a) - scatter chart
plt.scatter(df[1990], df[2010])    # the question asks for the 1990 and 2010 columns
plt.title('Simple Scatter Chart')
plt.xlabel('1990 values')
plt.ylabel('2010 values')
plt.legend(['1990 and 2010'], loc='best')
plt.savefig(r'd:\chart.pdf')       # raw string keeps the backslash as a path separator
plt.show()

Answer (b) - multiple line chart
x = np.arange(len(df.index))
plt.plot(x, df[1990])
plt.plot(x, df[2010])
plt.xticks(x, df.index)
plt.title('Multiple Line Chart')
plt.xlabel('Index')
plt.ylabel('1990, 2010 values')
plt.legend(['1990', '2010'], loc='best')   # a flat list of labels
plt.savefig(r'd:\chart.pdf')
plt.show()

Answer (c) - multiple bar chart
x = np.arange(len(df.index))
plt.bar(x, df[1990], width=0.25)
plt.bar(x + 0.25, df[2000], width=0.25)
plt.bar(x + 0.50, df[2010], width=0.25)
plt.xticks(x, df.index)
plt.title('Multiple Bar Chart')
plt.xlabel('Index')
plt.ylabel('1990, 2000, 2010 values')
plt.legend(['1990', '2000', '2010'], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()
5. Given the following set of data:
Weight measurements for 16 small orders of French fries (in grams).
78 72 69 81 63 67 65 75
79 74 71 83 71 79 80 69
(a). Create a simple histogram from the above data.
(b). Create a horizontal histogram from the above data.
(c). Create a step type of histogram (frequency polygon) from the above data.
(d). Create a cumulative histogram from the above data.

Common setup for (a) to (d)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
arr = np.array([78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69])
# the last bin edge is 85 so that the values 81 and 83 are counted;
# with bins ending at 80 they would fall outside every bin and be left out
bins = [60, 65, 70, 75, 80, 85]

Answer (a) - simple histogram
plt.hist(arr, bins=bins)
plt.title('Simple Histogram Chart of weight')
plt.xlabel('weight Bins or Interval')
plt.ylabel('weight Frequency values')
plt.legend(['weight Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')   # raw string keeps the backslash as a path separator
plt.show()

Answer (b) - horizontal histogram (the axis labels swap)
plt.hist(arr, bins=bins, orientation='horizontal')
plt.title('Simple Histogram Chart of weight in horizontal form')
plt.xlabel('weight Frequency values')
plt.ylabel('weight Bins or Interval')
plt.legend(['weight Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Answer (c) - step histogram (frequency polygon)
plt.hist(arr, bins=bins, histtype='step')
plt.title('Simple Histogram Chart of weight in frequency polygon')
plt.xlabel('weight Bins or Interval')
plt.ylabel('weight Frequency values')
plt.legend(['weight Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Answer (d) - cumulative histogram
plt.hist(arr, bins=bins, cumulative=True)
plt.title('Simple Histogram Chart of weight in cumulative')
plt.xlabel('weight Bins or Interval')
plt.ylabel('weight Frequency values')
plt.legend(['weight Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Board sample questions and answers

Question 1. Mr. Sanjay wants to plot a bar graph for the given set of values of subject on x-axis and number of
students who opted for that subject on y-axis. Complete the code to perform the following:
(i) To plot the bar graph in statement 1
(ii) To display the graph in statement 2
import matplotlib.pyplot as plt
x=['Hindi', 'English', 'Science', 'SST']
y=[10,20,30,40]
_____________________ Statement 1
_____________________ Statement 2
Answer:
(i) plt.bar(x,y)
(ii) plt.show()

Question 2. Mr. Harry wants to draw a line chart using a list of elements named LIST. Complete the code to perform the
following operations:
(i) To plot a line chart using the given LIST,
(ii) To give a y-axis label to the line chart named "Sample Numbers".
import matplotlib.pyplot as PLINE
LIST=[10,20,30,40,50,60]
_____________________ Statement 1
_____________________ Statement 2
PLINE.show()
Answer:
(i) PLINE.plot(LIST)
(ii) PLINE.ylabel("Sample Numbers")

Question 3. Write a code to plot the speed of a passenger train as shown in the figure given below:
[Figure not reproduced]
Answer:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(1, 5)
plt.plot(x, x*1.5, label='Normal')
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
plt.legend()
plt.show()
Syntax and examples of various Pandas charts

Common to all charts
import matplotlib.pyplot as plt
plt.title('Simple Histogram Chart of weight in frequency polygon')
plt.xlabel('weight Bins or Interval')
plt.ylabel('weight Frequency values')
plt.legend(['weight Frequencies'], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Line
plt.plot(x['month'], x['sales1'], color='g', marker='X', markersize=15, markeredgecolor='blue',
         linestyle='dashdot', linewidth=5)

Column / barh
r = np.arange(len(x.month))
plt.bar(r, x['sales1'], color='g', width=0.35)
plt.bar(r + 0.35, x['sales2'], color='b', width=0.35)
plt.xticks(r, x['month'])

Histogram
plt.hist(numeric column name, bins=list of bin edges, orientation='horizontal'/'vertical', rwidth=0.5,
         histtype='step'/'stepfilled'/'bar'/'barstacked', cumulative=True/False)

Frequency polygon
plt.hist(numeric column name, bins=list of bin edges, histtype='step')

XII IP CH 3 Plotting With Pyplot
No ratings yet
XII IP CH 3 Plotting With Pyplot
52 pages
Data Visualization Python Tutorial
100% (1)
Data Visualization Python Tutorial
9 pages
Unit 1 - Chap 2 - Data Visualisation
No ratings yet
Unit 1 - Chap 2 - Data Visualisation
29 pages
Data Visualisation Using Python Matplotlib Codes For Class 12th Ip
No ratings yet
Data Visualisation Using Python Matplotlib Codes For Class 12th Ip
13 pages
Line Chart
No ratings yet
Line Chart
33 pages
Matplotlib in Python
No ratings yet
Matplotlib in Python
43 pages
Assignment: Master in Business Administration
No ratings yet
Assignment: Master in Business Administration
18 pages
Suyash Singh Class 12 A5 Info Practice Practical File
No ratings yet
Suyash Singh Class 12 A5 Info Practice Practical File
64 pages
Data Visulation
No ratings yet
Data Visulation
8 pages
Unit I: Data Handling Using Pandas and Data Visualization: Marks:25
No ratings yet
Unit I: Data Handling Using Pandas and Data Visualization: Marks:25
97 pages
Visualisation All
0% (1)
Visualisation All
70 pages
Cap 793
No ratings yet
Cap 793
17 pages
Data Visualization Using Matplotlib in Python
No ratings yet
Data Visualization Using Matplotlib in Python
15 pages
Notebook Visualizing Data Book
No ratings yet
Notebook Visualizing Data Book
19 pages
Python Plots
No ratings yet
Python Plots
47 pages
Ramos Residence 01.10.24 1
100% (2)
Ramos Residence 01.10.24 1
18 pages
Plot Final
No ratings yet
Plot Final
10 pages
Session 13, Data Visualization
No ratings yet
Session 13, Data Visualization
13 pages
MCA - S3 - Data Visualisation - U5
No ratings yet
MCA - S3 - Data Visualisation - U5
19 pages
DVPD Final Lab Word PDF
No ratings yet
DVPD Final Lab Word PDF
93 pages
Unit 6 Data Visualization-1
No ratings yet
Unit 6 Data Visualization-1
30 pages
Matplotlib Functions
No ratings yet
Matplotlib Functions
32 pages
Data Visualization Using Matplotlib and Seaborn
No ratings yet
Data Visualization Using Matplotlib and Seaborn
28 pages
Ex1 - Plotting and Visualization Using Numpy and Pandas
No ratings yet
Ex1 - Plotting and Visualization Using Numpy and Pandas
14 pages
Data Visualization
No ratings yet
Data Visualization
48 pages
Practical Graph
No ratings yet
Practical Graph
8 pages
Unit 4 Python
No ratings yet
Unit 4 Python
12 pages
Atomic Habits by James Clear
100% (1)
Atomic Habits by James Clear
23 pages
Wa0029.
No ratings yet
Wa0029.
16 pages
Data Visulization
No ratings yet
Data Visulization
2 pages
Data Visualization Part Notes - 1
No ratings yet
Data Visualization Part Notes - 1
9 pages
Study Material For XII Computer Science On: Data Visualization Using Pyplot
No ratings yet
Study Material For XII Computer Science On: Data Visualization Using Pyplot
22 pages
Matplotlib
No ratings yet
Matplotlib
13 pages
Chapter 4 Plotting Data Using Matplotlib
No ratings yet
Chapter 4 Plotting Data Using Matplotlib
11 pages
Unit 4
No ratings yet
Unit 4
27 pages
Program16 0
No ratings yet
Program16 0
6 pages
Data Visualization With Python
No ratings yet
Data Visualization With Python
34 pages
Matplotlib
No ratings yet
Matplotlib
30 pages
Data Visualization
No ratings yet
Data Visualization
28 pages
Unit Iv Notes Class 12
No ratings yet
Unit Iv Notes Class 12
22 pages
Matplotlib Pandas Guide
No ratings yet
Matplotlib Pandas Guide
9 pages
Matplotlib Pandas Guide
No ratings yet
Matplotlib Pandas Guide
7 pages
Notes9 - Class - 10 - Data Visualization Using MatPlotlib Notes
No ratings yet
Notes9 - Class - 10 - Data Visualization Using MatPlotlib Notes
5 pages
XII DataVisualization
No ratings yet
XII DataVisualization
34 pages
BTech 5 CSE Data Analytics Using Python Unit 5 Notes
No ratings yet
BTech 5 CSE Data Analytics Using Python Unit 5 Notes
9 pages
Mat Plot Lib
No ratings yet
Mat Plot Lib
21 pages
Graphs Using Matplotlib
No ratings yet
Graphs Using Matplotlib
23 pages
Data Visualization With Matplotlib
No ratings yet
Data Visualization With Matplotlib
20 pages
Assignment 4 On Visualization On Graph With Solution
No ratings yet
Assignment 4 On Visualization On Graph With Solution
14 pages
Mat Plot Lib
No ratings yet
Mat Plot Lib
24 pages
CHAPTER-2 Data Visualization
No ratings yet
CHAPTER-2 Data Visualization
4 pages
Bar Chart
No ratings yet
Bar Chart
18 pages
Data Visualizations in Python With Matplotlib: Sidita Duli, PHD
No ratings yet
Data Visualizations in Python With Matplotlib: Sidita Duli, PHD
6 pages
Shivansh Exp5
No ratings yet
Shivansh Exp5
4 pages
Unit 1 Chap 2 Data Visualisation - PDF - 20250716 - 182739 - 0000
No ratings yet


X label X limit range

1. title(): adds a title to the plot.

import pandas as pd
x=pd.DataFrame({

plt.hist(x['sales'], bins=[2,5,8])
plt.hist(x['sales'], bins=[2,4,6,8])
plt.hist(x['sales'], bins=[2,4,6,8], orientation='horizontal')

plt.savefig(r'd:\histchart.pdf')
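The histogram calls above can be tried with a minimal sketch like the following. The contents of the `sales` column are hypothetical, since the DataFrame definition is not shown in full in the notes, and the `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: the original DataFrame contents are not given in the notes
x = pd.DataFrame({"sales": [2, 3, 3, 4, 5, 5, 6, 7, 7, 8]})

# bins=[2,4,6,8] defines three intervals: [2,4), [4,6) and [6,8]
counts, edges, patches = plt.hist(x["sales"], bins=[2, 4, 6, 8])
plt.title("Sales histogram")
plt.xlabel("sales")
plt.ylabel("frequency")
```

`plt.hist` returns the count per bin and the bin edges, which is a convenient way to check what the chart is drawing. Every bin is half-open except the last one, which also includes its right edge.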

Color codes

Character  Color      Character  Color      Character  Color
'b'        blue       'm'        magenta    'c'        cyan
'g'        green      'y'        yellow     'w'        white
'r'        red        'k'        black

Line styles: 'solid', 'dashed', 'dashdot', 'dotted'
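As a rough illustration of how the color characters and line styles are passed to `plot`, here is a small sketch. The arrays `a` and `b` are hypothetical sample data, and the `Agg` backend is an assumption made so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

a = np.arange(1, 10)
b = a ** 2  # hypothetical y values

# Keyword form: color character plus a named line style
line1, = plt.plot(a, b, color="r", linestyle="dashed")

# Format-string form: 'g' (green) combined with ':' (dotted)
line2, = plt.plot(a, 2 * b, "g:")

color1 = line1.get_color()
color2 = line2.get_color()
```

Both forms set the same properties. The format string is shorter, while the keywords are easier to read in longer scripts.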

Marker types for plotting

Marker  Description

Legend location (loc): 'best'

In the bar graph above, there are gaps between the bars.

In the histogram above, there are no gaps between the bars.

Histogram:
1. In a histogram, the number of observations is not marked with a single point at the midpoint of an interval.

Frequency polygon:
1. In a frequency polygon, the number of observations is marked with a single point at the midpoint of an interval.

blanks. Only boundaries are shown.

3. For a frequency polygon, the histtype='step' parameter is used.
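The comparison above can be sketched as follows. The weights in `arr` are hypothetical values chosen to fall inside the bins used elsewhere in these notes, and joining the midpoints with `plot` is one possible way to draw the polygon itself, not necessarily the one the notes intend.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical weights; the original array is not shown in the notes
arr = np.array([61, 63, 66, 67, 68, 71, 72, 73, 74, 78])
bins = [60, 65, 70, 75, 80]

# histtype='step' draws only the outline, leaving the bars unfilled
counts, edges, patches = plt.hist(arr, bins=bins, histtype="step")

# A frequency polygon marks each count at the midpoint of its interval
midpoints = (edges[:-1] + edges[1:]) / 2
plt.plot(midpoints, counts, marker="o")
```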

Default: x axis from 40 to 45. User defined: x axis from 35 to 50.
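A user-defined x-axis range like the one above is usually set with `plt.xlim`. The sketch below assumes hypothetical data spanning 40 to 45 and uses the `Agg` backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

# Hypothetical data whose x values run from 40 to 45
x = [40, 41, 42, 43, 44, 45]
y = [3, 5, 4, 6, 7, 5]

plt.figure()
plt.plot(x, y)

# Widen the visible x range to 35..50 instead of the automatic range
plt.xlim(35, 50)

# Calling xlim() with no arguments returns the current limits
left, right = plt.xlim()
```

`plt.ylim` works the same way for the y axis.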

import numpy as np

a = np.arange(1,10)

plt.plot(a,b)

import pandas as pd

x=np.arange(len(df.Zones))

plt.plot(df['Zones'],df['Jan'])

plt.scatter(df[1990],df[2000])

x=np.arange(len(df.index))

Default:
plt.hist(arr, bins=[60,65,70,75,80])

Horizontal form:
plt.hist(arr, bins=[60,65,70,75,80], orientation='horizontal')

Frequency polygon:
plt.hist(arr, bins=[60,65,70,75,80], histtype='step')

Cumulative:
plt.hist(arr, bins=[60,65,70,75,80], cumulative=True)

loc='best')

plt.savefig(r'd:\histchart.pdf')

plt.show()
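To see what the `cumulative` and `orientation` options change, a short sketch with hypothetical weights may help. The array `arr` is an assumption, since the original data is not shown, and the `Agg` backend is used only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical weights; the original array is not shown in the notes
arr = np.array([61, 63, 66, 67, 68, 71, 72, 73, 74, 78])
bins = [60, 65, 70, 75, 80]

# cumulative=True makes each bar include the counts of all earlier bins
plt.figure()
cum_counts, _, _ = plt.hist(arr, bins=bins, cumulative=True)
plt.title("Cumulative histogram")

# orientation='horizontal' swaps the axes; the counts themselves are unchanged
plt.figure()
h_counts, _, _ = plt.hist(arr, bins=bins, orientation="horizontal")
plt.title("Horizontal histogram")
```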

(i) plt.bar(x,y)
(i) PLINE.plot(LIST)

import matplotlib.pyplot as plt
plt.plot(x, x*1.5, label='Normal')
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
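The three labelled lines above only appear in a legend once `plt.legend` is called. A minimal runnable version might look like this, where the range used for `x` is an assumption because the notes do not show how `x` is defined.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 5)  # hypothetical x values

plt.figure()
plt.plot(x, x * 1.5, label="Normal")
plt.plot(x, x * 3.0, label="Fast")
plt.plot(x, x / 3.0, label="Slow")

# loc='best' lets matplotlib pick the position that overlaps the lines least
legend = plt.legend(loc="best")
labels = [text.get_text() for text in legend.get_texts()]
```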

Column / barh

r=np.arange(len(x.month))
