Data Visualization

Unit 3: Data Visualization

• Visualization Design Principles
• Tables
• Univariate Data Visualization, Multivariate Data Visualization
• Visualizing Groups, Dynamic Techniques
• General Matplotlib Tips
• Two Interfaces for the Price of One
• Simple Line Plots
• Visualizing Errors, Density and Contour Plots
• Histograms, Binning, and Density
• Customizing Plot Legends, Customizing Colorbars
• Multiple Subplots, Text and Annotation, Customizing Matplotlib
Visualizing Groups, Dynamic Techniques, General Matplotlib Tips
Data visualization
• We can combine the groupby and aggregate functions of pandas
with Matplotlib and seaborn plotting functions to build clear
graphical summaries of our data.
• Groupby Function
• The groupby function splits a dataset into smaller datasets
based on the values in a column: every distinct value in that
column gets its own group, containing the rows that carry
that value.
• Say we have a dataset with a column named “Region,” and we
want one smaller dataset per region.
• We can use the following code:

test = sales.groupby(by="Region")
• Here, the main dataset is named sales and has a column
named “Region.” We can use the get_group function to
grab specific groups, as follows:

test.get_group("East")
test.get_group("West")
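The calls above can be tried end to end with a minimal sketch. The `sales` table below is hypothetical: the `Amount` column and all values are invented for illustration, and only the `Region` column name comes from the slide.

```python
import pandas as pd

# Hypothetical sales data; the values are made up for illustration.
sales = pd.DataFrame({
    "Region": ["East", "West", "East", "North", "West"],
    "Amount": [250, 400, 150, 300, 100],
})

# Group the rows by the values in the "Region" column.
test = sales.groupby(by="Region")

# get_group returns the sub-DataFrame holding the rows of one group.
east = test.get_group("East")
west = test.get_group("West")

print(east)
print(west)
```

Grouping by the column name as a plain string (rather than a one-element list) lets get_group take the plain value "East" as its key.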
Understanding Pandas Groupby for Data
Aggregation

• What if I told you that we could derive effective and
impactful insights from our dataset in just a few lines of code?
• The GroupBy function in Pandas saves us a lot of effort by
delivering these results quickly.
• Learning Objectives
• Understand the syntax and functionality of the groupby() method
for efficient data grouping.
• Know the aggregation functions available in pandas, including
sum(), mean(), count(), max(), and min(), which are needed for
effective data analysis.
• Apply aggregation functions to grouped data to extract useful
insights from large data sets.
• Aggregation
• Aggregate functions perform statistical operations on a set of data. Here are
the aggregate functions in the Pandas package:
• count() – Number of non-null observations
• sum() – Sum of values
• mean() – Mean of values
• median() – Arithmetic median of values
• min() – Minimum
• max() – Maximum
• mode() – Mode
• std() – Standard deviation
• var() – Variance
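As a rough sketch of how these aggregates are applied to grouped data, the example below groups a small table and computes several statistics per group. The `scores` table, its column names, and its values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
scores = pd.DataFrame({
    "team": ["A", "A", "B", "B", "B"],
    "points": [10, 20, 5, 15, 10],
})

# Select the column to aggregate within each group.
grouped = scores.groupby("team")["points"]

# One aggregate per call.
totals = grouped.sum()
averages = grouped.mean()

# Several aggregates at once with agg().
summary = grouped.agg(["count", "min", "max", "median"])
print(summary)
```

Passing a list of function names to agg() returns one column per aggregate, which is often the most convenient shape for plotting afterwards.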
Problem 1
1. Write a Pandas program to split the following dataframe into groups based on school code.
Also check the type of the GroupBy object.
Test Data:
    school_code class name           date_Of_Birth age height weight address
S1  s001        V     Alberto Franco 15/05/2002    12  173    35     street1
S2  s002        V     Gino Mcneill   17/05/2002    12  192    32     street2
S3  s003        VI    Ryan Parkes    16/02/1999    13  186    33     street3
S4  s001        VI    Eesha Hinton   25/09/1998    13  167    30     street1
S5  s002        V     Gino Mcneill   11/05/2002    14  151    31     street2
S6  s004        VI    David Parkes   15/09/1997    12  159    32     street4
# 1. Split the dataframe into groups based on school code.
# Also check the type of the GroupBy object.
import pandas as pd
pd.set_option('display.max_rows', None)
# pd.set_option('display.max_columns', None)
student_data = pd.DataFrame({
    'school_code': ['s001', 's002', 's003', 's001', 's002', 's004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'date_Of_Birth': ['15/05/2002', '17/05/2002', '16/02/1999', '25/09/1998', '11/05/2002', '15/09/1997'],
    'age': [12, 12, 13, 13, 14, 12],
    'height': [173, 192, 186, 167, 151, 159],
    'weight': [35, 32, 33, 30, 31, 32],
    'address': ['street1', 'street2', 'street3', 'street1', 'street2', 'street4']},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])

print("Original DataFrame:")
print(student_data)
print('\nSplit the said data on school_code wise:')
# Grouping by a single column name gives plain strings as group names
result = student_data.groupby('school_code')
for name, group in result:
    print("\nGroup:")
    print(name)
    print(group)
print("\nType of the object:")
print(type(result))
# Split the given dataframe into groups based on school code and class.
import pandas as pd
pd.set_option('display.max_rows', None)
# pd.set_option('display.max_columns', None)
student_data = pd.DataFrame({
    'school_code': ['s001', 's002', 's003', 's001', 's002', 's004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'date_Of_Birth': ['15/05/2002', '17/05/2002', '16/02/1999', '25/09/1998', '11/05/2002', '15/09/1997'],
    'age': [12, 12, 13, 13, 14, 12],
    'height': [173, 192, 186, 167, 151, 159],
    'weight': [35, 32, 33, 30, 31, 32],
    'address': ['street1', 'street2', 'street3', 'street1', 'street2', 'street4']},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])
print("Original DataFrame:")
print(student_data)
print('\nSplit the said data on school_code, class wise:')
# Each group name is a (school_code, class) tuple
result = student_data.groupby(['school_code', 'class'])
for name, group in result:
    print("\nGroup:")
    print(name)
    print(group)
# Split the given dataframe into groups based on school code
# and cast the grouping as a list.
import pandas as pd
pd.set_option('display.max_rows', None)
# pd.set_option('display.max_columns', None)
student_data = pd.DataFrame({
    'school_code': ['s001', 's002', 's003', 's001', 's002', 's004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'date_Of_Birth': ['15/05/2002', '17/05/2002', '16/02/1999', '25/09/1998', '11/05/2002', '15/09/1997'],
    'age': [12, 12, 13, 13, 14, 12],
    'height': [173, 192, 186, 167, 151, 159],
    'weight': [35, 32, 33, 30, 31, 32],
    'address': ['street1', 'street2', 'street3', 'street1', 'street2', 'street4']},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])
print("Original DataFrame:")
print(student_data)
print('\nCast grouping as a list:')
result = student_data.groupby('school_code')
# list() yields one (group name, sub-DataFrame) pair per group
print(list(result))
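To connect grouping with visualization, a grouped aggregate can be passed straight to a plotting call. The sketch below is one possible way to do it, not the only one: it uses a reduced copy of the student table from the problems above, and the choice of a bar chart of mean height is an assumption for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# A reduced copy of the student table used in the problems above.
student_data = pd.DataFrame({
    "school_code": ["s001", "s002", "s003", "s001", "s002", "s004"],
    "height": [173, 192, 186, 167, 151, 159],
})

# Aggregate first: mean height per school.
mean_height = student_data.groupby("school_code")["height"].mean()

# Then plot the aggregated values, one bar per group.
fig, ax = plt.subplots()
ax.bar(mean_height.index, mean_height.values)
ax.set_xlabel("School code")
ax.set_ylabel("Mean height")
ax.set_title("Mean height per school")
# Call plt.show() to display the figure in an interactive session.
```

Aggregating before plotting keeps the chart to one mark per group, which is usually easier to read than plotting every row.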
How to visualize data with the help of the
Matplotlib library of Python?
• Matplotlib
• Matplotlib is a low-level Python library for data visualization.
It is easy to use and emulates MATLAB-like graphs and
visualizations. The library is built on top of NumPy arrays and
offers several plot types, such as line charts, bar charts, and
histograms. It provides a lot of flexibility, but at the cost of
writing more code.
• To install Matplotlib, type the command below in the terminal.
pip install matplotlib
Pyplot
• Pyplot is a Matplotlib module that provides a MATLAB-like interface.
• Matplotlib is designed to be as usable as MATLAB, with the ability to
use Python and the advantage of being free and open-source.
• Each pyplot function makes some change to a figure: e.g., creates a
figure, creates a plotting area in a figure, plots some lines in a plotting
area, decorates the plot with labels, etc.
• The various plots we can utilize using Pyplot are Line Plot, Histogram,
Scatter, 3D Plot, Image, Contour, and Polar.
import matplotlib.pyplot as plt
# initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# plotting the data
plt.plot(x, y)
plt.show()
• Adding Title
• The title() function of the matplotlib.pyplot module sets
the title of the plot. Its optional parameters, such as
fontsize and color, control how the title is displayed.
import matplotlib.pyplot as plt
# initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# plotting the data
plt.plot(x, y)
# Adding title to the plot
plt.title("Linear graph")
plt.show()
# We can also change the appearance of
# the title by using the parameters of this function
import matplotlib.pyplot as plt
# initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# plotting the data
plt.plot(x, y)
# Adding title to the plot
plt.title("Linear graph", fontsize=25,
          color="green")
plt.show()
• Adding X Label and Y Label
• In layman’s terms, the X label and the Y label are the titles
given to X-axis and Y-axis respectively. These can be added
to the graph by using the xlabel() and ylabel() methods.
import matplotlib.pyplot as plt
# initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# plotting the data
plt.plot(x, y)
# Adding title to the plot
plt.title("Linear graph", fontsize=25, color="green")
# Adding label on the y-axis
plt.ylabel('Y-Axis')
# Adding label on the x-axis
plt.xlabel('X-Axis')
plt.show()
• Setting Limits and Tick labels
• Matplotlib sets the limits and tick positions of the X and Y axes
automatically, but both can be set manually. The xlim() and ylim()
functions set the limits of the X-axis and Y-axis respectively.
Similarly, the xticks() and yticks() functions set the tick positions
and tick labels.
import matplotlib.pyplot as plt
# initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# plotting the data
plt.plot(x, y)
# Adding title to the plot
plt.title("Linear graph", fontsize=25, color="green")
# Adding label on the y-axis
plt.ylabel('Y-Axis')
# Adding label on the x-axis
plt.xlabel('X-Axis')
# Setting the limit of y-axis
plt.ylim(0, 80)
# setting the labels of x-axis
plt.xticks(x, labels=["one", "two", "three", "four"])
plt.show()
• Adding Legends
• A legend is an area describing the elements of the graph. In simple
terms, it tells the reader which plotted data each color or marker
represents. It generally appears as a box containing a small sample
of each color on the graph and a short description of what that
data means.
• The bbox_to_anchor=(x, y) parameter of the legend() function
specifies the coordinates at which the legend is anchored, and the
ncol parameter sets the number of columns in the legend. Its
default value is 1.
import matplotlib.pyplot as plt
# initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# plotting the data
plt.plot(x, y)
# Adding title to the plot
plt.title("Linear graph", fontsize=25, color="green")
# Adding label on the y-axis
plt.ylabel('Y-Axis')
# Adding label on the x-axis
plt.xlabel('X-Axis')
# Setting the limit of y-axis
plt.ylim(0, 80)
# setting the labels of x-axis
plt.xticks(x, labels=["one", "two", "three", "four"])
# Adding legends
plt.legend(["GFG"])
plt.show()
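The bbox_to_anchor and ncol parameters described above are not used in the preceding program, so here is a small sketch of them. The second data series and the label texts are assumptions added for illustration.

```python
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]

fig, ax = plt.subplots()
# The label given to each line is the text shown in the legend.
ax.plot(x, [20, 25, 35, 55], label="series A")
ax.plot(x, [10, 30, 40, 45], label="series B")  # hypothetical second series

# Anchor the legend's lower-center point just above the axes
# (coordinates are fractions of the axes) and use two columns.
legend = ax.legend(loc="lower center", bbox_to_anchor=(0.5, 1.0), ncol=2)
# Call plt.show() to display the figure in an interactive session.
```

Labelling each line in the plot call keeps the legend text next to the data it describes, instead of relying on the order of a separate list.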
• Figure class
• Think of the Figure class as the overall window or page on which
everything is drawn. It is a top-level container that holds one or
more axes. A figure can be created using the figure() function.
# Python program to show the Figure class
import matplotlib.pyplot as plt
# initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# Creating a new figure with width = 7 inches
# and height = 5 inches with face color as
# green, edgecolor as blue and the line width of the edge as 7
fig = plt.figure(figsize=(7, 5), facecolor='g',
                 edgecolor='b', linewidth=7)
# Creating a new axes for the figure; the list is
# [left, bottom, width, height] as fractions of the figure
ax = fig.add_axes([0.15, 0.15, 0.75, 0.7])
# Adding the data to be plotted
ax.plot(x, y)
# Adding title to the plot
plt.title("Linear graph", fontsize=25, color="yellow")
# Adding label on the y-axis
plt.ylabel('Y-Axis')
# Adding label on the x-axis
plt.xlabel('X-Axis')
# Setting the limit of y-axis
plt.ylim(0, 80)
# setting the labels of x-axis
plt.xticks(x, labels=["one", "two", "three", "four"])
# Adding legends
plt.legend(["GFG"])
plt.show()
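Because a figure is a container for axes, the same plot can also be built through Matplotlib's object-oriented interface, calling methods on the figure and axes objects instead of pyplot functions. The sketch below shows one figure holding two axes; the bar chart in the second panel and the panel titles are assumptions for illustration.

```python
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [20, 25, 35, 55]

# One figure holding two axes side by side.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(7, 3))

# Each axes object has its own title, labels and limits.
ax_left.plot(x, y)
ax_left.set_title("Line")
ax_left.set_xlabel("X-Axis")
ax_left.set_ylabel("Y-Axis")

ax_right.bar(x, y, width=5)
ax_right.set_title("Bars")
ax_right.set_ylim(0, 80)

# Adjust spacing so titles and labels do not overlap.
fig.tight_layout()
# Call plt.show() to display the figure in an interactive session.
```

The pyplot functions act on whichever axes is current, while the methods here name their target explicitly, which tends to be clearer once a figure has more than one plot.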
Function Overloading
Finding Gross Pay
• There are three types of employees in Indian Railways:
regular, daily-wage and consolidated employees.
Gross pay for each type is calculated as follows:
• Regular employees – basic + HRA + (DA percentage × basic)
• Daily wages – wages per hour × number of hours
• Consolidated – fixed amount


PAC - Finding Gross Pay
Input: Components for calculating gross pay
Output: Gross pay
Logic Involved: Based on the type of employee, calculate gross pay
Writing Functions
• Same function name for all three types of employees
• More meaningful and elegant way of doing things
• I prefer the name calculate_Gross_Pay for all types of employees
Look alike but exhibit different characters
Polymorphism
• Refers to ‘one name having many forms’,
‘one interface doing multiple actions’.
• In C++, polymorphism can be either
• static polymorphism or
• dynamic polymorphism.
• C++ implements static polymorphism through
• overloaded functions
• overloaded operators
Polymorphism
• Derived from the Greek for ‘many forms’

• Single name can be used for different purposes

• Different ways of achieving the polymorphism:

1. Function overloading

2. Operator overloading

3. Dynamic binding
Overloading
• Overloading – a name having two or more
distinct meanings
• Overloaded function – a function having
more than one distinct meaning
• Overloaded operator – an operator for which
two or more distinct meanings are defined
Overloading
• Operator overloading is built into C and C++.
• ‘-’ can be unary as well as binary
• ‘*’ is used for multiplication as well as
pointers
• ‘<<’, ‘>>’ are used as bitwise shift operators as well as
insertion and extraction operators (in C++)
• All arithmetic operators work with any
built-in numeric type
Function Overloading
• C++ enables several functions of the same name to be
defined, as long as they have different signatures.
• This is called function overloading.
• The C++ compiler selects the proper function to call by
examining the number, types and order of the
arguments in the call.
• Overloaded functions are distinguished by their signatures
• Signature – combination of a function’s name and its
parameter types (in order)
• C++ compilers encode each function identifier with the
number and types of its parameters (sometimes referred to
as name mangling or name decoration) to enable type-safe
linkage.
Signature of a Function
• A function’s argument list (i.e., the number and types of
its arguments) is known as the function’s signature.
• Functions with the same signature – two
functions with the same number and types of
arguments in the same order
• Variable names don’t matter. For
instance, the following two functions have
the same signature.
void squar (int a, float b); //function 1
void squar (int x, float y); //function 2

The following code fragment overloads the
function name prnsqr( ).

void prnsqr (int i); //overloaded for integer #1
void prnsqr (char c); //overloaded for character #2
void prnsqr (float f); //overloaded for floats #3
void prnsqr (double d); //overloaded for double floats #4

#include <iostream>
using namespace std;

void prnsqr (int i)
{
    cout << "Integer " << i << "'s square is " << i*i << "\n";
}
void prnsqr (char c)
{
    cout << "No square for characters" << "\n";
}
void prnsqr (float f)
{
    cout << "Float " << f << "'s square is " << f*f << "\n";
}
void prnsqr (double d)
{
    cout << "Double float " << d << "'s square is " << d*d << "\n";
}
Where does Python belong?
• Python supports both function-oriented and object-oriented programming
approaches.
• FEATURES OF OOPS:
• Let’s understand the common terms in OOPS.
1. Class
2. Object
3. Encapsulation
4. Abstraction
5. Inheritance
6. Polymorphism
7. Abstract class
class Employee:
    def display(self):
        print("Hello my name is Prachi")

emp_obj = Employee()
emp_obj.display()

Hello my name is Prachi

class Employee:
    def display(self):
        print("Hello my name is Rahul")

emp_obj1 = Employee()
emp_obj1.display()

emp_obj2 = Employee()
emp_obj2.display()

Hello my name is Rahul
Hello my name is Rahul
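Python does not overload functions by signature the way C++ does: a second def with the same name simply replaces the first. One way to keep a single name for all three employee types from the gross pay problem is polymorphism through classes. The sketch below is only an illustration of that idea; the class names, attribute names and sample figures are all assumptions.

```python
class RegularEmployee:
    def __init__(self, basic, hra, da_percent):
        self.basic = basic
        self.hra = hra
        self.da_percent = da_percent

    def calculate_gross_pay(self):
        # basic + HRA + (DA percentage of basic)
        return self.basic + self.hra + self.basic * self.da_percent / 100


class DailyWageEmployee:
    def __init__(self, wage_per_hour, hours):
        self.wage_per_hour = wage_per_hour
        self.hours = hours

    def calculate_gross_pay(self):
        # wages per hour * number of hours
        return self.wage_per_hour * self.hours


class ConsolidatedEmployee:
    def __init__(self, fixed_amount):
        self.fixed_amount = fixed_amount

    def calculate_gross_pay(self):
        # fixed amount
        return self.fixed_amount


# Hypothetical sample figures, made up for illustration.
employees = [
    RegularEmployee(basic=20000, hra=5000, da_percent=10),
    DailyWageEmployee(wage_per_hour=50, hours=8),
    ConsolidatedEmployee(fixed_amount=15000),
]

# The same method name works for every employee type.
pays = [emp.calculate_gross_pay() for emp in employees]
print(pays)
```

The caller never checks the employee type; each object supplies its own version of calculate_gross_pay, which is the "one name, many forms" idea from the polymorphism slides.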
