Unit 1 Pandas - Charts

Uploaded by

Bhavya Bhatt
Data Visualization

"A picture is worth a thousand words." Most of us are familiar with this expression. Data visualization plays an essential role in the presentation of both small and large-scale data, and it is especially useful when explaining the analysis of increasingly large datasets.
Data visualization is the discipline of understanding data by placing it in a visual context. Its main goal is to condense large datasets into visual graphics so that complex relationships within the data are easy to understand. Several data visualization libraries are available in Python, including Matplotlib, Seaborn and Folium.
Purpose of data visualization
• Better analysis and quick action
• Identifying patterns and finding errors
• Understanding the story, exploring business insights and grasping the latest trends
Plotting library
Matplotlib is a Python library used to create 2D graphs and plots from Python scripts. Pyplot is a module in matplotlib that supports a wide variety of graphs and plots, such as histograms, bar charts, power spectra and error charts. Used along with NumPy, it provides a MATLAB-like plotting environment. It is conventionally imported with: import matplotlib.pyplot as plt
Pyplot provides the state-machine interface to the plotting library in matplotlib. This means that figures and axes are created implicitly and automatically to achieve the desired plot. For example, calling plot from pyplot automatically creates the necessary figure and axes. Setting a title then applies that title to the current axes object. The pyplot interface is generally preferred for non-interactive plotting (i.e., scripting).
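To make the implicit behaviour concrete, here is a minimal sketch. The data values are invented for illustration, and the Agg backend is selected only so that the snippet runs without a display. It suggests that a single plt.plot call is enough to create a figure and an axes, and that plt.title then applies to those current axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt

plt.close("all")                 # start from a clean pyplot state
plt.plot([1, 2, 3], [5, 4, 6])   # implicitly creates a figure and one axes
plt.title("Implicit axes")       # applies to the current axes

fig = plt.gcf()                  # the figure pyplot created for us
ax = plt.gca()                   # the current axes inside that figure
n_axes = len(fig.axes)
title_text = ax.get_title()
print(n_axes, title_text)
```

plt.gcf() and plt.gca() are the pyplot calls that return the current figure and current axes, which is how the state machine can be inspected.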
The matplotlib library provides the following features for data visualization.
• Drawing - plots can be drawn from the passed data through specific functions.
• Customization - plots can be customized as required through the arguments of the functions: color, style (dashed, dotted) and width can be set, and labels, a title and a legend can be added.
• Saving - after drawing and customization, plots can be saved in formats such as .pdf, .png and .eps for future use.
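The three features above can be sketched in a few lines. This is an illustrative example only: the data, the label text and the file name demo_chart.png are assumptions, and the file is written to the system temporary directory rather than to a fixed drive.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt

plt.close("all")

# Drawing: pass the data to a plotting function
plt.plot([1, 2, 3], [2, 4, 8],
         color="r", linestyle="dashed", linewidth=2, label="demo")

# Customization: title and legend
plt.title("Draw, customize, save")
plt.legend(loc="best")

# Saving: the file extension selects the output format
out_path = os.path.join(tempfile.gettempdir(), "demo_chart.png")
plt.savefig(out_path)
saved = os.path.exists(out_path)
print(saved)
```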
Customizing / Adding details of the plots

[Figure: a sample plot annotated with its title, legend, x label, y label, x limit range and y limit range]
1. title() - adds a title to the plot.
2. xticks() / yticks() - set the tick positions (and optionally the tick labels) on the x-axis / y-axis.
3. xlim() / ylim() - set the x limit / y limit.
4. xlabel() / ylabel() - set the x-axis label / y-axis label.
5. legend() - adds a legend to the plot, the key that identifies the different sets of data plotted (for example by color).
Note: font size, linewidth and markersize are measured in points. rwidth is different: it is the relative width of the histogram bars as a fraction of the bin width (for example rwidth=0.9), and by default the bars fill the whole bin.
Line chart - A line chart (or line graph) displays information as a series of data points, called 'markers', connected by straight line segments.
plot(x axis column, y axis column, color, marker, markersize, markeredgecolor, linestyle, linewidth)

Data for Examples 1 and 2:
Month  Sales
Jan    5
Feb    4
Mar    6
Apr    2
May    7
June   8

Data for Example 3:
Month  Sales1  Sales2
Jan    5       8
Feb    4       7
Mar    6       2
Apr    2       6
May    7       4
June   8       5

Example 1 - simple line chart
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales1' : [5, 4, 6, 2, 7, 8 ]})
plt.plot( x['month'], x['sales1'], color = 'g')
plt.title('Line Chart- Monthwise Sales of 2016 ')
plt.xlabel( 'Months' )
plt.ylabel( 'Sales' )
plt.legend([ 'Sales Values' ], loc='best')
plt.savefig(r'd:\linechart.pdf')   # raw string keeps the backslash as-is
plt.show()

Example 2 - line chart with marker and line style
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales1' : [5, 4, 6, 2, 7, 8 ] })
plt.plot( x['month'], x['sales1'], color = 'g' , marker='X',
markersize=15, markeredgecolor='blue', linestyle='dashdot',
linewidth=5)
plt.title('Line Chart- Monthwise Sales of 2016 ')
plt.xlabel( 'Months' )
plt.ylabel( 'Sales' )
plt.legend([ 'Sales Values' ], loc='best')
plt.savefig(r'd:\linechart.pdf')
plt.show()

Example 3 - two lines on the same chart
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales1' : [5, 4, 6, 2, 7, 8 ],
'sales2' : [8, 7, 2, 6, 4, 5 ] })
plt.plot( x['month'], x['sales1'], color = 'g')
plt.plot( x['month'], x['sales2'], color = 'b')
plt.title('Line Chart- Monthwise Sales of 2016 ')
plt.xlabel( 'Months' )
plt.ylabel( 'Sales' )
plt.legend([ 'Sales1', 'Sales2' ], loc='best')
plt.savefig(r'd:\linechart.pdf')
plt.show()

marker = 'x' or 'X' or '+' or 'D' or 'o'
linestyle = 'solid' or 'dashdot' or 'dotted' or 'dashed'
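Since these notes are about pandas charts, it may help to see that a DataFrame can also draw the same two-line chart through its own plot method, which uses matplotlib underneath. The following is a sketch under the same sample data; the title text and the choice of the Agg backend are assumptions made so the snippet runs anywhere.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

x = pd.DataFrame({
    'month': ['jan', 'feb', 'mar', 'apr', 'may', 'june'],
    'sales1': [5, 4, 6, 2, 7, 8],
    'sales2': [8, 7, 2, 6, 4, 5]})

# DataFrame.plot returns the matplotlib axes it drew on
ax = x.plot(x='month', y=['sales1', 'sales2'], kind='line',
            color=['g', 'b'], title='Line Chart- Monthwise Sales of 2016')
ax.set_xlabel('Months')
ax.set_ylabel('Sales')

n_lines = len(ax.get_lines())   # one line per plotted column
print(n_lines)
# plt.show() would display the chart in an interactive session
```

With this form the legend entries come from the column names, so no separate plt.legend call is needed.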
Bar chart - A bar chart (or bar graph) represents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent. The bars can be plotted vertically or horizontally.
bar(x axis column, y axis column, color, width)

Data for Example 1:
Month  Sales
Jan    5
Feb    4
Mar    6

Data for Example 2:
Month  Sales
Jan    5
Feb    4
Mar    6
Apr    2
May    7
June   8

Data for Example 3:
Month  Sales1  Sales2
Jan    5       8
Feb    4       7
Mar    6       2
Apr    2       6
May    7       4
June   8       5

Example 1 - vertical bars with individual colors and widths
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar'],
'sales1' : [5, 4, 6 ]})
plt.bar( x['month'], x['sales1'], color = ['g', 'r', 'b'],
width=[0.2, 0.5, 0.9])
plt.title('Bar Chart- Monthwise Sales of 2016 ')
plt.xlabel( 'Month' )
plt.ylabel( 'Sales' )
plt.legend([ 'year 2016' ], loc='best')
plt.savefig(r'd:\barchart.pdf')   # raw string: a plain '\b' would be read as a backspace
plt.show()

Example 2 - horizontal bars
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales1' : [5, 4, 6, 2, 7, 8 ] })
plt.barh( x['month'], x['sales1'], color = 'g')
plt.title('Bar Chart- Monthwise Sales of 2016 ')
plt.xlabel( 'Sales' )
plt.ylabel( 'Month' )
plt.legend([ 'year 2016' ], loc='best')
plt.savefig(r'd:\barchart.pdf')
plt.show()

Example 3 - multiple (grouped) bars
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales1' : [5, 4, 6, 2, 7, 8 ],
'sales2' : [8, 7, 2, 6, 4, 5 ] })
r=np.arange( len( x.month ) )
plt.bar( r, x['sales1'], color = 'g', width=0.35)
plt.bar( r+0.35, x['sales2'], color = 'b', width=0.35)
plt.title('Bar Chart- Monthwise Sales of 2016 ')
plt.xticks( r, x['month'] )
plt.xlabel( 'Months' )
plt.ylabel( 'Sales' )
plt.legend([ 'Sales1', 'Sales2' ], loc='best')
plt.savefig(r'd:\barchart.pdf')
plt.show()
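The grouped-bar technique relies on simple position arithmetic: the first series is drawn at positions r and the second at r + width, so the two bars of a month sit side by side. The sketch below repeats that idea on a three-month subset and, as an optional variation that is not in the notes above, places each tick at r + width/2 so the label is centred under the pair. The Agg backend line is an assumption for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

months = ['jan', 'feb', 'mar']
sales1 = [5, 4, 6]
sales2 = [8, 7, 2]

r = np.arange(len(months))   # base positions 0, 1, 2
width = 0.35

fig, ax = plt.subplots()
ax.bar(r, sales1, color='g', width=width, label='Sales1')          # bars centred on r
ax.bar(r + width, sales2, color='b', width=width, label='Sales2')  # shifted right by one bar width

# Variation: centre each tick between the two bars of its group
ax.set_xticks(r + width / 2)
ax.set_xticklabels(months)
ax.legend(loc='best')

tick_positions = ax.get_xticks()
print(tick_positions)
```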
Histogram chart - A histogram is a summary of discrete or continuous data such as weights or heights. It provides a visual interpretation of numerical data by showing the number of data points that fall within each specified range of values (called a bin). It is similar to a vertical bar graph. However, unlike a vertical bar graph, a histogram shows no gaps between the bars.
hist(column name, bins=range, orientation='horizontal'/'vertical', rwidth, histtype='step'/'stepfilled', cumulative=True/False)

Data for all three examples:
Month  Sales
Jan    5
Feb    4
Mar    6
Apr    2
May    7
June   8

Example 1 bins (interval of 3):
Bins  Frequency
2-5   2
5-8   4

Examples 2 and 3 bins (interval of 2):
Bins  Frequency
2-4   1
4-6   2
6-8   3

Example 1 - simple histogram, bins by 3
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales' : [5, 4, 6, 2, 7, 8 ] })
plt.hist(x['sales'], bins=[2,5,8])
plt.title('Simple Histogram Chart of Sales of 2016 ')
plt.xlabel( 'Sales Bins or Interval by 3' )
plt.ylabel( 'Sales Frequency values' )
plt.legend([ 'Sales Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Example 2 - histogram, bins by 2
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales' : [5, 4, 6, 2, 7, 8 ] })
plt.hist(x['sales'], bins=[2,4,6,8])
plt.title('Histogram Chart of Sales of 2016 ')
plt.xlabel( 'Sales Bins or Interval by 2' )
plt.ylabel( 'Sales Frequency values' )
plt.legend([ 'Sales Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Example 3 - horizontal histogram
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales' : [5, 4, 6, 2, 7, 8 ] })
plt.hist(x['sales'], bins=[2,4,6,8], orientation = 'horizontal')
plt.title('Horizontal Histogram Chart of Sales of 2016 ')
plt.xlabel( 'Sales Frequency values' )
plt.ylabel( 'Sales Bins or Interval by 2' )
plt.legend([ 'Sales Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()
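The bin frequencies in the tables above can be checked without drawing anything. The sketch below uses numpy.histogram, which follows the same binning rule as plt.hist: each bin includes its left edge and excludes its right edge, except the last bin, which includes both. The variable names are illustrative.

```python
import numpy as np

sales = [5, 4, 6, 2, 7, 8]

# Bins by 3: [2, 5) and [5, 8]
counts_by_3, edges_by_3 = np.histogram(sales, bins=[2, 5, 8])

# Bins by 2: [2, 4), [4, 6) and [6, 8]
counts_by_2, edges_by_2 = np.histogram(sales, bins=[2, 4, 6, 8])

print(counts_by_3.tolist())   # [2, 4]
print(counts_by_2.tolist())   # [1, 2, 3]
```

Note that the value 5 is counted in the 5-8 bin, not in 2-5, and the value 8 is counted because the last bin is closed on the right.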
Cumulative Histogram chart and Frequency Polygon chart - A cumulative histogram shows, for each bin, the running total of the frequencies up to and including that bin. A frequency polygon is a type of frequency distribution graph. In a frequency polygon, the number of observations is marked with a single point at the midpoint of each interval, and straight lines then connect the points. It makes it easy to compare two or more distributions on the same set of axes.
hist(column name, bins=range, orientation='horizontal'/'vertical', rwidth, histtype='step'/'stepfilled', cumulative=True/False)

Data for all three examples:
Month  Sales
Jan    5
Feb    4
Mar    6
Apr    2
May    7
June   8

Bins (interval of 2):
Bins  Frequency  Cumulative
2-4   1          1
4-6   2          1+2=3
6-8   3          3+3=6

Example 1 - cumulative histogram
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales' : [5, 4, 6, 2, 7, 8 ] })
plt.hist(x['sales'], bins=[2,4,6,8], cumulative=True)
plt.title('Cumulative Histogram Chart of Sales of 2016 ')
plt.xlabel( 'Sales Bins or Interval' )
plt.ylabel( 'Sales Frequency values' )
plt.legend([ 'Sales Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Example 2 - frequency polygon (step outline)
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales' : [5, 4, 6, 2, 7, 8 ] })
plt.hist(x['sales'], bins=[2,4,6,8], histtype='step')
plt.title('Frequency Polygon Chart of Sales of 2016 ')
plt.xlabel( 'Sales Bins or Interval by 2' )
plt.ylabel( 'Sales Frequency values' )
plt.legend([ 'Sales Frequencies' ], loc='upper left')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Example 3 - histogram with rwidth (bar graph style)
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({
'month' : ['jan','feb','mar','apr','may','june'],
'sales' : [5, 4, 6, 2, 7, 8 ] })
plt.hist(x['sales'], bins=[2,4,6,8], rwidth = 0.9)
plt.title('Histogram Chart of Sales using rwidth - bar graph style ')
plt.xlabel( 'Sales Bins or Interval by 2' )
plt.ylabel( 'Sales Frequency values' )
plt.legend([ 'Sales Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()
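The definition above (one point at the midpoint of each interval, joined by straight lines) can also be drawn directly. histtype='step' gives an unfilled outline of the bars, so the following is offered as an alternative sketch rather than as the method of these notes: it computes the bin frequencies with numpy.histogram, takes the midpoint of each bin, and plots those points with plt.plot. The same counts are used to reproduce the cumulative column of the table. The Agg backend line is an assumption for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

sales = [5, 4, 6, 2, 7, 8]
counts, edges = np.histogram(sales, bins=[2, 4, 6, 8])

midpoints = (edges[:-1] + edges[1:]) / 2   # centre of each bin: 3, 5, 7
cumulative = np.cumsum(counts)             # running totals: 1, 3, 6

plt.close("all")
plt.plot(midpoints, counts, color='g', marker='o')   # frequency polygon
plt.title('Frequency Polygon of Sales (midpoint method)')
plt.xlabel('Sales Bins or Interval by 2')
plt.ylabel('Sales Frequency values')

print(midpoints.tolist(), cumulative.tolist())
```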
Different color codes
Character  Color
'b'        blue
'g'        green
'r'        red
'm'        magenta
'y'        yellow
'k'        black
'c'        cyan
'w'        white

linestyle values
'solid', 'dashdot', 'dotted', 'dashed'

Marker types for plotting
Marker  Description
'.'     point marker
','     pixel marker
'o'     circle marker
'+'     plus marker
'x'     x marker
'D'     diamond marker
'd'     thin diamond marker
's'     square marker
'p'     pentagon marker
'*'     star marker
'h'     hexagon1 marker
'H'     hexagon2 marker
'1'     tri down marker
'2'     tri up marker
'3'     tri left marker
'4'     tri right marker
'v'     triangle down marker
'^'     triangle up marker
'<'     triangle left marker
'>'     triangle right marker
'|'     vline marker
'_'     hline marker

loc values (legend position)
'best', 'upper right', 'upper left', 'lower left', 'lower right', 'right', 'center left', 'center right', 'lower center', 'upper center', 'center'
Difference between bar graph and histogram
In a bar graph, there are gaps between the bars.
In a histogram, there are no gaps between the bars.

Difference between histogram and frequency polygon
Histogram
1. The number of observations is not marked with a single point at the midpoint of an interval.
2. The bars are filled with color.
Frequency polygon
1. The number of observations is marked with a single point at the midpoint of an interval.
2. Nothing is filled with color; only the boundaries are shown.
3. In these notes, a frequency polygon is drawn by passing the histtype='step' parameter to hist().

Changing the default limit values of a chart using xlim() / ylim()
Without xlim() / ylim(): the default x limit and y limit values are used. They are chosen automatically from the data (here about 40 to 45 on the x axis and about 10 to 15 on the y axis).
With xlim() / ylim(): we can change the x limit and y limit values to values of our choice (here 35 to 50 on the x axis and 5 to 20 on the y axis).

Without xlim() / ylim()
import matplotlib.pyplot as plt
import pandas as pd
x=pd.DataFrame({
'temp' : [40, 42, 45],
'sales' : [10,12, 15]})
plt.scatter( x['temp'], x['sales'], color='b', marker='x')
plt.xlabel('temperature', fontsize=16)
plt.ylabel('Sales', fontsize=16)
plt.title('scatter plot - temperature vs sales', fontsize=20)
plt.legend([ 'Sales ' ], loc='best')
plt.savefig(r'd:\scatterchart.pdf')
plt.show()

With xlim() / ylim()
import matplotlib.pyplot as plt
import pandas as pd
x=pd.DataFrame({
'temp' : [40, 42, 45],
'sales' : [10,12, 15]})
plt.scatter( x['temp'], x['sales'], color='b', marker='x')
plt.xlim (35, 50)
plt.ylim (5, 20)
plt.xlabel('temperature', fontsize=16)
plt.ylabel('Sales', fontsize=16)
plt.title('scatter plot - temperature vs sales', fontsize=20)
plt.legend([ 'Sales ' ], loc='best')
plt.savefig(r'd:\scatterchart.pdf')
plt.show()
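The effect of xlim() and ylim() can be confirmed numerically, because calling either function without arguments returns the current limits. The sketch below reads the automatic limits first and then the user-defined ones; the variable names and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt

plt.close("all")
plt.scatter([40, 42, 45], [10, 12, 15], color='b', marker='x')

default_xlim = plt.xlim()   # limits chosen automatically from the data

plt.xlim(35, 50)            # user-defined x limits
plt.ylim(5, 20)             # user-defined y limits

new_xlim = plt.xlim()
new_ylim = plt.ylim()
print(default_xlim, new_xlim, new_ylim)
```

The automatic limits extend slightly beyond the smallest and largest data values, which is why the default range is only approximately 40 to 45.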
Changing the default tick label sequence of a chart using xticks() / yticks()
Without tick labels: xticks() / yticks() place the ticks at the given values and show the values themselves (x axis 40, 42, 45; y axis 10, 12, 15).
With tick labels: we can replace the label shown at each tick with text of our choice (x axis 40 - t1, 42 - t2, 45 - t3; y axis 10 - s1, 12 - s2, 15 - s3).

Without tick labels
import matplotlib.pyplot as plt
import pandas as pd
x=pd.DataFrame({
'temp' : [40, 42, 45],
'sales' : [10,12, 15]})
plt.scatter( x['temp'], x['sales'], color='b', marker='x')
plt.xticks(x['temp'])
plt.yticks(x['sales'])
plt.xlabel('temperature', fontsize=16)
plt.ylabel('Sales', fontsize=16)
plt.title('scatter plot - temperature vs sales', fontsize=20)
plt.legend([ 'Sales ' ], loc='best')
plt.savefig(r'd:\scatterchart.pdf')
plt.show()

With tick labels
import matplotlib.pyplot as plt
import pandas as pd
x=pd.DataFrame({
'temp' : [40, 42, 45],
'sales' : [10,12, 15]})
plt.scatter( x['temp'], x['sales'], color='b', marker='x')
plt.xticks(x['temp'], ['t1','t2','t3'])
plt.yticks(x['sales'], ['s1','s2','s3'])
plt.xlabel('temperature', fontsize=16)
plt.ylabel('Sales', fontsize=16)
plt.title('scatter plot - temperature vs sales', fontsize=20)
plt.legend([ 'Sales ' ], loc='best')
plt.savefig(r'd:\scatterchart.pdf')
plt.show()
Sin, cos, log and exp line charts

Sin line chart
import numpy as np
import matplotlib.pyplot as plt
a = np.arange(1,10)
b = np.sin(a)
plt.plot(a,b)
plt.title('Simple Plot Chart of sin values ')
plt.xlabel( 'sin range' )
plt.ylabel( 'sin values' )
plt.legend([ 'sin values' ], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()

The cos, log and exp line charts use the same program. Only the function call and the text of the title, labels and legend change:
Cos line chart: b = np.cos(a), with 'cos' in place of 'sin' in the title, xlabel, ylabel and legend.
Log line chart: b = np.log(a), with 'log' in place of 'sin' in the title, xlabel, ylabel and legend.
Exp line chart: b = np.exp(a), with 'exp' in place of 'sin' in the title, xlabel, ylabel and legend.
1. Consider the data given below. Create the sequences required from the data, then write code to:
(a). Create bar charts to see the distribution of rainfall from Jan - Dec for all the zones.
(b). Create a line chart to observe any trends from Jan to Dec.

Rainfall in mm
Zones    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
North    140  130  130  190  160  200  150  170  190  170  150  120
South    160  200  130  200  200  170  110  160  130  140  170  200
East     140  180  150  170  190  140  170  180  190  150  140  170
West     180  150  200  120  180  140  110  130  150  190  110  140
Central  110  160  130  110  120  170  130  200  150  160  170  130

Common code for (a) and (b)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df=pd.DataFrame({'Zones':['North','South','East','West','Central'],
'Jan':[140,160,140,180,110],
'Feb':[130,200,180,150,160],
'Mar':[130,130,150,200,130],
'Apr':[190,200,170,120,110],
'May':[160,200,190,180,120],
'Jun':[200,170,140,140,170],
'Jul':[150,110,170,110,130],
'Aug':[170,160,180,130,200],
'Sep':[190,130,190,150,150],
'Oct':[170,140,150,190,160],
'Nov':[150,170,140,110,170],
'Dec':[120,200,170,140,130]})

Answer (a)
x=np.arange(len(df.Zones))
plt.bar(x, df['Jan'], width=0.05)
plt.bar(x+0.05, df['Feb'], width=0.05)
plt.bar(x+0.1, df['Mar'], width=0.05)
plt.bar(x+0.15, df['Apr'], width=0.05)
plt.bar(x+0.20, df['May'], width=0.05)
plt.bar(x+0.25, df['Jun'], width=0.05)
plt.bar(x+0.3, df['Jul'], width=0.05)
plt.bar(x+0.35, df['Aug'], width=0.05)
plt.bar(x+0.4, df['Sep'], width=0.05)
plt.bar(x+0.45, df['Oct'], width=0.05)
plt.bar(x+0.5, df['Nov'], width=0.05)
plt.bar(x+0.55, df['Dec'], width=0.05)
plt.xticks(x,df.Zones)
plt.title('Multiple Bar Chart Jan to Dec Rainfall, with different Zones')
plt.xlabel( 'Names of Zones' )
plt.ylabel( 'Monthwise Rainfall in mm- Jan to Dec values' )
# legend labels are given as a flat list of strings, one per bar series
plt.legend(['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()

Answer (b)
plt.plot(df['Zones'],df['Jan'])
plt.plot(df['Zones'],df['Feb'])
plt.plot(df['Zones'],df['Mar'])
plt.plot(df['Zones'],df['Apr'])
plt.plot(df['Zones'],df['May'])
plt.plot(df['Zones'],df['Jun'])
plt.plot(df['Zones'],df['Jul'])
plt.plot(df['Zones'],df['Aug'])
plt.plot(df['Zones'],df['Sep'])
plt.plot(df['Zones'],df['Oct'])
plt.plot(df['Zones'],df['Nov'])
plt.plot(df['Zones'],df['Dec'])
plt.title('Multiple Line Chart Jan to Dec Rainfall, with different Zones')
plt.xlabel( 'Names of Zones' )
plt.ylabel( 'Monthwise Rainfall in mm- Jan to Dec values' )
plt.legend(['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()
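The twelve near-identical plt.bar calls in answer (a) follow one pattern: month number i is drawn at x + i*0.05. As an alternative sketch, not the form expected in the answer above, the same chart can be produced with a loop over the month columns. Passing label= to each call lets legend() pick up the names automatically. The Agg backend line is an assumption for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Zones': ['North', 'South', 'East', 'West', 'Central'],
    'Jan': [140, 160, 140, 180, 110], 'Feb': [130, 200, 180, 150, 160],
    'Mar': [130, 130, 150, 200, 130], 'Apr': [190, 200, 170, 120, 110],
    'May': [160, 200, 190, 180, 120], 'Jun': [200, 170, 140, 140, 170],
    'Jul': [150, 110, 170, 110, 130], 'Aug': [170, 160, 180, 130, 200],
    'Sep': [190, 130, 190, 150, 150], 'Oct': [170, 140, 150, 190, 160],
    'Nov': [150, 170, 140, 110, 170], 'Dec': [120, 200, 170, 140, 130]})

months = [c for c in df.columns if c != 'Zones']   # 'Jan' ... 'Dec'
x = np.arange(len(df))                             # one base position per zone
width = 0.05

fig, ax = plt.subplots()
for i, month in enumerate(months):
    # each month is shifted right by i bar widths within the zone's group
    ax.bar(x + i * width, df[month], width=width, label=month)

ax.set_xticks(x)
ax.set_xticklabels(df['Zones'])
ax.set_xlabel('Names of Zones')
ax.set_ylabel('Monthwise Rainfall in mm- Jan to Dec values')
ax.legend(loc='best')

n_series = len(ax.containers)   # one bar container per month
print(n_series)
```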
2. Consider the data given below:
App Name        App Price in Rs  Total Downloads
Angry Bird      75               197000
Teen Titan      120              209000
Marvel Comics   190              414000
ColorMe         245              196000
Fun Run         550              272000
Crazy Taxi      55               311000
Igram Pro       175              213000
Wapp Pro        75               455000
Maths formulas  140              278000

Write code to create:
(a). A line chart depicting the prices of the apps.
(b). A bar chart depicting the downloads of the apps.
(c). Create the Est Downloads sequence, in which each download value is divided by 1000. Now create a bar chart that plots multiple bars for prices as well as est downloads.

Common code for (a), (b) and (c)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df=pd.DataFrame({'App Name' : ['Angry Bird','Teen Titan','Marvel Comics','ColorMe','Fun Run','Crazy Taxi','Igram Pro','Wapp Pro','Maths formulas'],
'App Price in Rs' : [75,120,190,245,550,55,175,75,140],
'Total Downloads' : [197000,209000,414000,196000,272000,311000,213000,455000,278000]})

Answer (a)
plt.plot(df['App Name'], df['App Price in Rs' ])
plt.title('Simple Line Chart ')
plt.xlabel( 'App Name' )
plt.ylabel( 'App Price in Rs' )
plt.legend([ 'App Price in Rs' ], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()

Answer (b)
plt.bar(df['App Name'], df['Total Downloads' ])
plt.title('Simple Bar Chart ')
plt.xlabel( 'App Name' )
plt.ylabel( 'Total Downloads' )
plt.legend([ 'Total Downloads' ], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()

Answer (c)
df['Est Downloads' ] = df['Total Downloads' ] / 1000
x=np.arange(len(df['App Name']))
plt.bar(x, df['App Price in Rs' ], width=.25)
plt.bar(x+0.25, df['Est Downloads' ], width=.25)
plt.xticks(x,df['App Name'])
plt.title('Multiple Bar Chart ')
plt.xlabel( 'App Name' )
plt.ylabel( 'App Price in Rs and Est Downloads (in thousands)' )
plt.legend([ 'App Price in Rs', 'Est Downloads' ], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()
3. Given a data frame df1 as shown below:
   1990  2000  2010
a  52    340   890
b  64    480   560
c  78    688   1102
d  94    766   889

Write code to create:
(a). A scatter chart from the 1990 and 2010 columns of dataframe df1
(b). A line chart from the 1990 and 2010 columns of dataframe df1
(c). A bar chart plotting the three columns of dataframe df1

Common code for (a), (b) and (c)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df1=pd.DataFrame({1990:[52,64,78,94],
2000:[340,480,688,766],
2010:[890,560,1102,889]},index=['a','b','c','d'])

Answer (a)
plt.scatter(df1[1990],df1[2010])   # 2010 column, as the question asks
plt.title('Simple Scatter Chart ')
plt.xlabel( '1990 values' )
plt.ylabel( '2010 values' )
plt.legend([ '1990 and 2010' ], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()

Answer (b)
x=np.arange(len(df1.index))
plt.plot(x,df1[1990])
plt.plot(x,df1[2010])
plt.xticks(x,df1.index)
plt.title('Multiple Line Chart ')
plt.xlabel( 'sales person name' )
plt.ylabel( '1990 ,2010 values' )
plt.legend(['1990','2010'], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()

Answer (c)
x=np.arange(len(df1.index))
plt.bar(x, df1[1990], width=0.25)
plt.bar(x+0.25,df1[2000], width=0.25)
plt.bar(x+0.50,df1[2010], width=0.25)
plt.xticks(x,df1.index)
plt.title('Multiple Bar Chart ')
plt.xlabel( 'sales person name' )
plt.ylabel( '1990,2000,2010 values' )
plt.legend(['1990','2000','2010'], loc='best')
plt.savefig(r'd:\chart.pdf')
plt.show()
5. Given the following set of data:
Weight measurements for 16 small orders of French fries (in grams).
78 72 69 81 63 67 65 75
79 74 71 83 71 79 80 69
(a). Create a simple histogram from above data.
(b). Create a horizontal histogram from above data.
(c). Create a step type of histogram (frequency polygon) from above data.
(d). Create a cumulative histogram from above data.

Common code for (a) to (d)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
arr=np.array([78,72,69,81,63,67,65,75,79,74,71,83,71,79,80,69])
# The last bin edge is 85 so that the values 81 and 83 are counted;
# with bins ending at 80 they would be left out of the histogram.

Answer (a)
plt.hist(arr, bins=[60,65,70,75,80,85])
plt.title('Simple Histogram Chart of weight ')
plt.xlabel( 'weight Bins or Interval' )
plt.ylabel( 'weight Frequency values' )
plt.legend([ 'weight Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Answer (b)
plt.hist(arr, bins=[60,65,70,75,80,85], orientation='horizontal')
plt.title('Simple Histogram Chart of weight in horizontal form')
plt.xlabel( 'weight Frequency values' )
plt.ylabel( 'weight Bins or Interval' )
plt.legend([ 'weight Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Answer (c)
plt.hist(arr, bins=[60,65,70,75,80,85], histtype='step')
plt.title('Simple Histogram Chart of weight in frequency polygon')
plt.xlabel( 'weight Bins or Interval' )
plt.ylabel( 'weight Frequency values' )
plt.legend([ 'weight Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Answer (d)
plt.hist(arr, bins=[60,65,70,75,80,85], cumulative=True)
plt.title('Simple Histogram Chart of weight in cumulative')
plt.xlabel( 'weight Bins or Interval' )
plt.ylabel( 'weight Frequency values' )
plt.legend([ 'weight Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Board sample question and answers

Question 1. Mr. Sanjay wants to plot a bar graph for the given set of values of subject on x-axis and number of students who opted for that subject on y-axis. Complete the code to perform the following:
(i) To plot the bar graph in statement 1
(ii) To display the graph in statement 2
import matplotlib.pyplot as plt
x=['Hindi', 'English', 'Science', 'SST']
y=[10,20,30,40]
_____________________ Statement 1
_____________________ Statement 2
Answer:
(i) plt.bar(x,y)
(ii) plt.show()

Question 2. Mr. Harry wants to draw a line chart using a list of elements named LIST. Complete the code to perform the following operations:
(i) To plot a line chart using the given LIST,
(ii) To give a y-axis label to the line chart named "Sample Numbers".
import matplotlib.pyplot as PLINE
LIST=[10,20,30,40,50,60]
_____________________ Statement 1
_____________________ Statement 2
PLINE.show()
Answer:
(i) PLINE.plot(LIST)
(ii) PLINE.ylabel("Sample Numbers")

Question 3. Write a code to plot the speed of a passenger train as shown in the figure given below:
[Figure: line chart with three lines labelled Normal, Fast and Slow]
Answer:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(1, 5)
plt.plot(x, x*1.5, label='Normal')
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
plt.legend()
plt.show()
Syntax and examples of various Pandas charts

Common statements
import matplotlib.pyplot as plt
plt.title('Simple Histogram Chart of weight in frequency polygon')
plt.xlabel( 'weight Bins or Interval' )
plt.ylabel( 'weight Frequency values' )
plt.legend([ 'weight Frequencies' ], loc='best')
plt.savefig(r'd:\histchart.pdf')
plt.show()

Line
plt.plot( x['month'], x['sales1'], color = 'g' , marker = 'X', markersize = 15, markeredgecolor = 'blue', linestyle = 'dashdot', linewidth = 5)

Column / barh
r=np.arange( len( x.month ) )
plt.bar( r, x['sales1'], color = 'g', width = 0.35)
plt.bar( r+0.35, x['sales2'], color = 'b', width = 0.35)
plt.xticks( r, x['month'] )

Histogram
plt.hist(numeric column name, bins = range, orientation = 'horizontal'/'vertical', rwidth = 0.5, histtype = 'step'/'stepfilled'/'bar'/'barstacked', cumulative = True/False)

Frequency polygon
plt.hist(numeric column name, bins = range, histtype = 'step')
