
PML Ex3

The document discusses implementing Matplotlib in Python. It describes Matplotlib as a comprehensive library for creating static, animated, and interactive visualizations. Key Matplotlib functions covered include plot(), scatter(), bar(), hist(), and implementations of linear and polynomial regression. Examples shown include plotting age vs weight data, sales of cars by manufacturer over time, reading real-world CSV data and creating box plots, histograms, scatter plots and bubble charts to analyze features of the data. Polynomial regression is demonstrated by fitting a polynomial model to salary data vs position level.

Uploaded by

Jasmitha B
Copyright
© © All Rights Reserved

Ex No: 3 MATPLOTLIB IN PYTHON

DATE:

Aim:

To implement Matplotlib using Python programming.

Description:
MATPLOTLIB:

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in
Python. It can produce publication-quality plots as well as interactive figures that zoom, pan, and update.

Pyplot:

Most of the Matplotlib utilities lie under the pyplot submodule, which is usually imported under
the plt alias:

import matplotlib.pyplot as plt

Plot():

The plot() function draws points (markers) in a diagram. By default, it draws a line from point to point.
The function takes parameters specifying the points in the diagram: parameter 1 is an array containing
the points on the x-axis, and parameter 2 is an array containing the points on the y-axis.
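
As a minimal sketch of the two behaviours described above, the snippet below draws the same points once with the default connecting line and once with markers only. The x and y values are illustrative and are not taken from the exercise data.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative values only (not from the exercise data).
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 1, 5, 3])

fig, (ax_line, ax_markers) = plt.subplots(1, 2, figsize=(8, 3))

# Default behaviour: consecutive points are joined by a line.
(line,) = ax_line.plot(x, y)
ax_line.set_title("plot(x, y)")

# The format string 'o' draws circle markers only, with no connecting line.
(markers,) = ax_markers.plot(x, y, "o")
ax_markers.set_title("plot(x, y, 'o')")

plt.show()
```

Both calls receive the x-axis array first and the y-axis array second; only the optional format string differs.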

scatter():

The scatter() function plots one dot for each observation. It needs two arrays of the same length, one for
the values on the x-axis and one for the values on the y-axis.

bar():

The bar() function takes arguments that describe the layout of the bars.

The categories and their values are passed as arrays in the first and second arguments.

plt.bar(x, y)
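
The call above can be expanded into a small runnable sketch. The category names and values here are hypothetical and serve only to show which argument is which.

```python
import matplotlib.pyplot as plt

# Hypothetical categories and values, chosen only for illustration.
categories = ["A", "B", "C", "D"]
values = [3, 8, 1, 10]

# First argument: bar positions (categories); second argument: bar heights.
bars = plt.bar(categories, values, color="steelblue")
plt.xlabel("category")
plt.ylabel("value")
plt.title("bar() example")
plt.show()
```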

hist():

A histogram is a graph showing frequency distributions.

It is a graph showing the number of observations within each given interval.

The hist() function creates a histogram from an array of numbers, which is passed to the function
as an argument.
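
A short sketch of hist() follows. The data are synthetic (250 normally distributed values generated with a fixed seed), and the choice of 10 bins is an assumption made for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: 250 values drawn from a normal distribution (assumed for illustration).
rng = np.random.default_rng(0)
heights = rng.normal(loc=170, scale=10, size=250)

# hist() returns the count per interval and the edges of the intervals.
counts, bin_edges, _ = plt.hist(heights, bins=10, edgecolor="black")
plt.xlabel("value")
plt.ylabel("number of observations")
plt.title("hist() example")
plt.show()

print("observations per interval:", counts)
```

Each entry of counts is the number of observations that fall in the corresponding interval, so the counts add up to the size of the input array.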

Linear Regression:

Linear regression uses the relationship between the data points to fit a straight line that passes as close as possible to all of them.

This line can be used to predict future values.
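
One way to sketch this is with NumPy's least-squares polynomial fit of degree 1. The arrays are the age and weight values from the first implementation task; the helper name predict_weight and the query age of 35 are introduced here only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Age and weight values from the first implementation task.
age = np.array([23, 24, 25, 26, 27, 28, 29, 30, 31, 32])
weight = np.array([55, 50, 70, 80, 57, 78, 79, 75, 74, 90])

# Least-squares fit of weight = slope * age + intercept.
slope, intercept = np.polyfit(age, weight, deg=1)


def predict_weight(a):
    """Evaluate the fitted line at age a (hypothetical helper)."""
    return slope * a + intercept


plt.scatter(age, weight, label="observations")
plt.plot(age, predict_weight(age), color="red", label="fitted line")
plt.xlabel("age")
plt.ylabel("weight")
plt.legend()
plt.show()

# Extrapolating beyond the observed ages is only as reliable as the linear assumption.
print("predicted weight at age 35:", predict_weight(35))
```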

Polynomial Regression:

If your data points clearly do not fit a straight line, polynomial regression might be a better
choice.

Polynomial regression, like linear regression, uses the relationship between the variables x and y to find
the best-fitting curve through the data points.
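
The difference can be sketched by fitting both a line and a quadratic to the same points and comparing the squared error. The data are synthetic and follow y = 2x^2 - 3x + 5 exactly, which is an assumption made so that the outcome is easy to check.

```python
import numpy as np

# Synthetic data that follows y = 2x^2 - 3x + 5 exactly (assumed for illustration).
x = np.arange(10, dtype=float)
y = 2 * x**2 - 3 * x + 5

# Fit a straight line and a degree-2 polynomial to the same points.
linear_fit = np.poly1d(np.polyfit(x, y, deg=1))
quadratic_fit = np.poly1d(np.polyfit(x, y, deg=2))


def sum_squared_error(model):
    """Sum of squared differences between the model and the data."""
    return float(np.sum((model(x) - y) ** 2))


print("linear SSE:   ", sum_squared_error(linear_fit))
print("quadratic SSE:", sum_squared_error(quadratic_fit))
print("quadratic coefficients (highest power first):", quadratic_fit.coeffs)
```

The straight line leaves a large residual on curved data, while the quadratic recovers the generating coefficients up to floating-point error.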

IMPLEMENTATION:

1. Plot Age against Weight using Matplotlib. Age and Weight are 1D arrays of 10 members
each. Plot them on the X and Y axes using the plot() function.

import matplotlib.pyplot as plt
import numpy as np
age=np.array([23,24,25,26,27,28,29,30,31,32])
weight=np.array([55,50,70,80,57,78,79,75,74,90])
plt.plot(age,weight,'o')
plt.xlabel('age')
plt.ylabel('weight')
plt.title('AGE WITH WEIGHT')
plt.show()

2. Plot a graph of car sales by Maruti for each year from 2015 to 2022. Fix the size
of the graph and use a specific line color.

import matplotlib.pyplot as plt

years = [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
sales = [700000, 600000, 400000, 500000, 900000, 800000, 1000000, 1200000]
# Set the figure size before plotting; calling plt.figure() afterwards opens a new, empty figure.
plt.figure(figsize=(8, 6))
plt.plot(years, sales, color='red')
# Axis labels must be strings, not the data lists themselves.
plt.xlabel('year')
plt.ylabel('sales')
plt.title('Sales of cars by Maruti')
plt.show()

3. Plot the sales of cars by Audi for the same time period on the previous graph, using a
different line color and style, with a legend entry for each line [Hint: use legend()].
Add a title to the graph.

import numpy as np
from google.colab import files
sp=files.upload()

Saving student-mat.csv to student-mat.csv
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("student-mat.csv")
plt.scatter(data['age'], data['traveltime'])
plt.title("Scatter Plot")
plt.xlabel('age')
plt.ylabel('traveltime')
plt.show()
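
The code above plots a scatter of the student dataset rather than the two sales lines the task describes. A minimal sketch of the task as worded is given below. The Maruti figures are those from task 2; the Audi figures are placeholders invented for illustration, because the original gives no Audi data.

```python
import matplotlib.pyplot as plt

years = [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
# Maruti sales as given in task 2.
maruti_sales = [700000, 600000, 400000, 500000, 900000, 800000, 1000000, 1200000]
# Placeholder values: the original does not provide Audi sales figures.
audi_sales = [300000, 350000, 450000, 400000, 600000, 550000, 700000, 900000]

plt.figure(figsize=(8, 6))
# A different color and line style for each manufacturer; label feeds the legend.
plt.plot(years, maruti_sales, color="red", linestyle="-", marker="o", label="Maruti")
plt.plot(years, audi_sales, color="blue", linestyle="--", marker="s", label="Audi")
plt.xlabel("year")
plt.ylabel("sales")
plt.title("Car sales by manufacturer, 2015-2022")
legend = plt.legend()
plt.show()
```

The label passed to each plot() call is what legend() displays, so no separate list of names is needed.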

4. Read real-world data in CSV form [Iris, Toy, Car, etc.] and analyze its features.
(i) Find the median and outliers using a box plot, for a single continuous feature.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
arr = np.random.randint(1, 20, size=30)
arr1 = np.append(arr, [27, 30])
print('Thus the array becomes{}'.format(arr1))
q1 = np.quantile(arr1, 0.25)
q3 = np.quantile(arr1, 0.75)
med = np.median(arr1)
iqr = q3-q1
upper_bound = q3+(1.5*iqr)
lower_bound = q1-(1.5*iqr)
print(iqr, upper_bound, lower_bound)

Thus the array becomes[19 5 7 12 9 5 17 11 19 10 7 14 10 13 16 3 16 18 7 19 18 9 11 4 15 19 18 17 17 4 27 30]
9.5 32.25 -5.75

# Create the figure first so the box plot is drawn at the requested size.
fig = plt.figure(figsize=(10, 7))
plt.boxplot(arr1)
plt.show()

# Values strictly outside the bounds are outliers; values on a bound are kept below.
outliers = arr1[(arr1 < lower_bound) | (arr1 > upper_bound)]
print('The following are the outliers in the boxplot:{}'.format(outliers))

The following are the outliers in the boxplot:[]

arr2 = arr1[(arr1 >= lower_bound) & (arr1 <= upper_bound)]
plt.figure(figsize=(12, 7))
plt.boxplot(arr2)
plt.show()
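
The 1.5 * IQR rule used above can be wrapped in a small helper, sketched here under the assumption that strict inequalities define an outlier. The function name iqr_outliers is hypothetical. The array is the 32 values printed in the run above, and the extra value 40 is an invented observation added only to show the rule flagging something.

```python
import numpy as np


def iqr_outliers(values):
    """Return (lower_bound, upper_bound, outliers) using the 1.5 * IQR rule."""
    values = np.asarray(values)
    q1, q3 = np.quantile(values, [0.25, 0.75])
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return lower, upper, values[(values < lower) | (values > upper)]


# The 32 values printed in the run above.
arr1 = np.array([19, 5, 7, 12, 9, 5, 17, 11, 19, 10, 7, 14, 10, 13, 16, 3,
                 16, 18, 7, 19, 18, 9, 11, 4, 15, 19, 18, 17, 17, 4, 27, 30])

lower, upper, outliers = iqr_outliers(arr1)
print(lower, upper, outliers)

# Hypothetical extra observation (not in the original data) that lies far from the rest.
_, _, outliers_with_40 = iqr_outliers(np.append(arr1, 40))
print(outliers_with_40)
```

Because the bounds depend on the quartiles, adding an extreme value shifts them slightly, which is why the helper recomputes everything from the array it is given.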
import numpy as np
from  google.colab import files
sp=files.upload()

Saving tips.csv to tips.csv

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
print(sns.get_dataset_names())

['anagrams', 'anscombe', 'attention', 'brain_networks', 'car_crashes', 'diamonds', 'dots', 'dowjones', 'exercise', 'flights', 'fmri', 'geyser', 'glue', 'healthexp', 'iris', 'mpg', 'penguins', 'planets', 'seaice', 'taxis', 'tips', 'titanic']

tips_df=sns.load_dataset('tips')
print(tips_df)
sns.lineplot(x="sex", y="total_bill", data=tips_df)
plt.title('Title using Matplotlib Function')
  
plt.show()

     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]

BOX PLOT:

sns.boxplot(x='day', y='total_bill', data=tips_df, hue='sex', palette='afmhot')
plt.legend(loc=0)
plt.show()

(ii) Finding distribution using bar plot and histogram – Two features [categorical or
grouped].
BARPLOT:

sns.barplot(x='day', y='tip', data=tips_df,
            hue='sex')
plt.show()

HISTOGRAM:

sns.histplot(x='total_bill', data=tips_df,kde=True, hue='sex')
  
plt.show()

(iii) Finding distribution across feature using scatter plot and Bubble chart – 3 or
more features [continuous/ categorical]
SCATTERPLOT:

sns.scatterplot(x='day', y='tip', data=tips_df)
plt.show()

sns.scatterplot(x='day', y='tip', data=tips_df,
               hue='sex')
plt.show()

BUBBLE CHART:

import plotly.graph_objects as go

fig = go.Figure(data=[go.Scatter(
    x=[1, 2, 3, 4], y=[10, 11, 12, 13],
    mode='markers',
    marker=dict(
        color=['rgb(93, 164, 214)', 'rgb(255, 144, 14)',
               'rgb(44, 160, 101)', 'rgb(255, 65, 54)'],
        opacity=[1, 0.8, 0.6, 0.4],
        size=[40, 60, 80, 100],
    )
)])

fig.show()
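
The Plotly example above uses hard-coded coordinates. As an alternative sketch in Matplotlib, a bubble chart of the dataset itself can be drawn with scatter(), mapping a third feature to marker area and a fourth to color. The values below are the ten rows of the tips data visible in the printed output earlier; the scale factor of 60 and the two colors are arbitrary choices for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

# The ten rows of the tips dataset visible in the printed output earlier.
total_bill = np.array([16.99, 10.34, 21.01, 23.68, 24.59, 29.03, 27.18, 22.67, 17.82, 18.78])
tip = np.array([1.01, 1.66, 3.50, 3.31, 3.61, 5.92, 2.00, 2.00, 1.75, 3.00])
party_size = np.array([2, 3, 3, 2, 4, 3, 2, 2, 2, 2])
sex = ["Female", "Male", "Male", "Male", "Female", "Male", "Female", "Male", "Male", "Female"]

# Arbitrary color per category and arbitrary scale factor for the marker area.
color_by_sex = {"Female": "tab:orange", "Male": "tab:blue"}
colors = [color_by_sex[s] for s in sex]
bubble_area = party_size * 60

fig, ax = plt.subplots(figsize=(8, 6))
# x and y carry two continuous features; s and c carry the third and fourth.
bubbles = ax.scatter(total_bill, tip, s=bubble_area, c=colors, alpha=0.6, edgecolors="black")
ax.set_xlabel("total_bill")
ax.set_ylabel("tip")
ax.set_title("Bubble chart: tip vs total_bill (area = party size, color = sex)")
plt.show()
```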

5. Plot any two features from the dataset in scatter plot and find linear regression
between the features and plot the linear fit model

import numpy as np
from  google.colab import files
sp=files.upload()

Saving student_scores.csv to student_scores.csv


import numpy as np 

import pandas as pd 

from matplotlib import pyplot as plt 

import seaborn as sns 

from sklearn.linear_model import LinearRegression 
score_df = pd.read_csv('student_scores.csv') 

score_df.head()

score_df.describe()

X = score_df.iloc[:, :-1].values 

y = score_df.iloc[:, 1].values 
print(y)
[30 90 80 45 67]
from sklearn.model_selection import train_test_split 

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0) 

from sklearn.linear_model import LinearRegression 

regressor = LinearRegression() 
regressor.fit(X_train, y_train) 

y_pred = regressor.predict(X_test)
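plt.scatter(X_train, y_train, color='g')
# Draw the fitted line over the training range; with only five rows the test split
# holds a single point, which cannot form a visible line.
plt.plot(X_train, regressor.predict(X_train), color='k')
plt.show()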

6. Plot any two features from the dataset in scatter plot and find polynomial
regression between the features and plot the polynomial model

import numpy as np
from  google.colab import files
sp=files.upload()

Saving salary_data.csv to salary_data.csv


import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('https://s3.us-west-2.amazonaws.com/public.gamelab.fun/dataset/position_salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)
def viz_linear():
    plt.scatter(X, y, color='red')
    plt.plot(X, lin_reg.predict(X), color='blue')
    plt.title('Truth or Bluff (Linear Regression)')
    plt.xlabel('Position level')
    plt.ylabel('Salary')
    plt.show()
    return
viz_linear()

from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree=4)
X_poly = poly_reg.fit_transform(X)
pol_reg = LinearRegression()
pol_reg.fit(X_poly, y)
def viz_polynomial():
    plt.scatter(X, y, color='red')
    # Reuse the already-fitted transformer; transform() is enough for new inputs.
    plt.plot(X, pol_reg.predict(poly_reg.transform(X)), color='blue')
    plt.title('Truth or Bluff (Polynomial Regression)')
    plt.xlabel('Position level')
    plt.ylabel('Salary')
    plt.show()
viz_polynomial()
# Predict the salary for position level 5.5 with both models.
lin_reg.predict([[5.5]])
pol_reg.predict(poly_reg.transform([[5.5]]))

array([132148.43750002])
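
The transform-then-fit steps can also be chained in a scikit-learn pipeline, so that the same polynomial expansion is applied automatically at prediction time. This is a sketch, not the code used in the exercise. The level and salary values are an assumption: the original never prints the CSV contents, and the numbers below are those of the commonly distributed position-salaries dataset. If that assumption holds, the degree-4 prediction should be close to the value printed above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Assumed contents of position_salaries.csv (not shown in the original).
levels = np.arange(1, 11).reshape(-1, 1)
salaries = np.array([45000, 50000, 60000, 80000, 110000,
                     150000, 200000, 300000, 500000, 1000000])

# Plain linear model for comparison.
linear_model = LinearRegression().fit(levels, salaries)

# The pipeline expands the input to degree-4 polynomial features, then fits a linear model.
poly_model = make_pipeline(PolynomialFeatures(degree=4), LinearRegression())
poly_model.fit(levels, salaries)

query = [[5.5]]
linear_prediction = linear_model.predict(query)[0]
poly_prediction = poly_model.predict(query)[0]
print("linear prediction at level 5.5:    ", linear_prediction)
print("polynomial prediction at level 5.5:", poly_prediction)
```

With a pipeline there is no separate transformer object to remember at prediction time, which removes the risk of calling fit_transform() on new data by mistake.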

Problem Understanding | Implementation | Time Management | Viva | Total

RESULT:

Thus Matplotlib was implemented using Python programming and executed successfully.
