Unit 3

NumPy is an open-source Python library used for scientific and engineering applications, providing multidimensional array data structures and efficient functions. It allows for array manipulation, including initialization, indexing, slicing, and reshaping, while Matplotlib is a graph plotting library that enables visualization through various types of plots, including line plots, scatter plots, and bar graphs. The document also covers creating histograms, pie charts, and performing multiple regression analysis using scikit-learn.

Uploaded by

Mahesh Kumar

NUMPY

NumPy (Numerical Python) is an open source Python library that’s widely used in science
and engineering. The NumPy library contains multidimensional array data structures, such as
the homogeneous, N-dimensional ndarray, and a large library of functions that operate
efficiently on these data structures.

How to import NumPy

import numpy as np

This widespread convention allows access to NumPy features with a short, recognizable
prefix (np.) while distinguishing NumPy features from others that have the same name.

Array fundamentals

One way to initialize an array is using a Python sequence, such as a list. For example:

>>> a = np.array([1, 2, 3, 4, 5, 6])


>>> a
array([1, 2, 3, 4, 5, 6])

As with built-in Python sequences, NumPy arrays are “0-indexed”: the first element of the
array is accessed using index 0, not 1.

Like the original list, the array is mutable.

>>> a[0] = 10
>>> a
array([10,  2,  3,  4,  5,  6])

Also like the original list, Python slice notation can be used for indexing.

>>> a[:3]
array([10,  2,  3])

One major difference is that slice indexing of a list copies the elements into a new list, but
slicing an array returns a view: an object that refers to the data in the original array. The
original array can be mutated using the view.

>>> b = a[3:]
>>> b
array([4, 5, 6])
>>> b[0] = 40
>>> a
array([10,  2,  3, 40,  5,  6])
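
The difference between a list slice (a copy) and an array slice (a view) can be checked directly. The following is a minimal sketch; the variable names are chosen for illustration only, and np.shares_memory() and .copy() are standard NumPy calls that the notes above do not otherwise mention.

```python
import numpy as np

# Slicing a list copies the elements, so the original list is unaffected.
lst = [1, 2, 3, 4, 5, 6]
lst_tail = lst[3:]
lst_tail[0] = 40
print(lst)                            # [1, 2, 3, 4, 5, 6]

# Slicing an array returns a view that shares memory with the original.
arr = np.array([1, 2, 3, 4, 5, 6])
view = arr[3:]
view[0] = 40
print(arr)                            # [ 1  2  3 40  5  6]
print(np.shares_memory(arr, view))    # True

# An explicit copy owns its data, so changes do not propagate back.
independent = arr[3:].copy()
independent[1] = 50
print(arr)                            # [ 1  2  3 40  5  6]
```

If the original array must stay unchanged, taking an explicit copy of the slice is the usual way to get list-like behaviour.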

Adding, removing, and sorting elements

This section covers np.sort() and np.concatenate().

Sorting an array is simple with np.sort(). You can specify the axis, kind, and order when
you call the function.
If you start with this array:
>>> arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])
You can quickly sort the numbers in ascending order with:
>>> np.sort(arr)
array([1, 2, 3, 4, 5, 6, 7, 8])
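
The section names np.concatenate() without showing it, so here is a short sketch of both functions. The arrays a and b are made up for illustration. It also shows that np.sort() returns a sorted copy and leaves its input untouched.

```python
import numpy as np

arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

# np.sort returns a sorted copy; the input array keeps its original order.
sorted_arr = np.sort(arr)
print(arr)         # [2 1 5 3 7 4 6 8]
print(sorted_arr)  # [1 2 3 4 5 6 7 8]

# np.concatenate joins a sequence of arrays along an existing axis.
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
joined = np.concatenate((a, b))
print(joined)      # [1 2 3 4 5 6 7 8]
```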

ARRAY RESHAPE
arr.reshape() will give a new shape to an array without changing the data. Just remember that
when you use the reshape method, the array you want to produce needs to have the same
number of elements as the original array. If you start with an array with 12 elements, you’ll
need to make sure that your new array also has a total of 12 elements.
If you start with this array:
>>> a = np.arange(6)
>>> print(a)
[0 1 2 3 4 5]
You can use reshape() to reshape your array. For example, you can reshape this array to an
array with three rows and two columns:
>>> b = a.reshape(3, 2)
>>> print(b)
[[0 1]
 [2 3]
 [4 5]]
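With np.reshape, you can specify a few optional parameters:
>>> np.reshape(a, (1, 6), order='C')
array([[0, 1, 2, 3, 4, 5]])
a is the array to be reshaped.
The second argument is the new shape you want. Older NumPy releases call this parameter
newshape; recent releases call it shape and deprecate the newshape keyword, so it is passed
by position here. You can specify an integer or a tuple of integers. If you specify an integer,
the result will be a 1-D array of that length. The shape must be compatible with the original
shape.
order='C' reads and writes the elements in C-like (row-major) index order, which is the
default.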
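
To make "compatible with the original shape" concrete, the sketch below tries one shape whose element count matches and one whose count does not. The flag name mismatch_rejected is only there for illustration.

```python
import numpy as np

a = np.arange(6)

# 2 * 3 == 6, so this shape is compatible with the 6 original elements.
print(a.reshape(2, 3).shape)   # (2, 3)

# 4 * 2 == 8, which does not match 6, so NumPy raises ValueError.
try:
    a.reshape(4, 2)
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print("reshape failed:", err)
```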

Indexing and slicing

You can index and slice NumPy arrays in the same ways you can slice Python lists.

>>> data = np.array([1, 2, 3])

>>> data[1]
2
>>> data[0:2]
array([1, 2])
>>> data[1:]
array([2, 3])
>>> data[-2:]
array([2, 3])
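
MATPLOTLIB

Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.

Matplotlib was created by John D. Hunter.

Matplotlib is open source and we can use it freely.

Matplotlib is mostly written in Python; a few segments are written in C, Objective-C, and
JavaScript for platform compatibility.

Most Matplotlib utilities live in the pyplot submodule, which is usually imported under the
alias plt. The following example draws a line from position (0, 0) to position (6, 250):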

import matplotlib.pyplot as plt


import numpy as np
xpoints = np.array([0,6])
ypoints = np.array([0, 250])
plt.plot(xpoints, ypoints)
plt.show()
Plotting x and y points
The plot() function is used to draw points (markers) in a diagram.

By default, the plot() function draws a line from point to point.

The function takes parameters for specifying points in the diagram.

Parameter 1 is an array containing the points on the x-axis.

Parameter 2 is an array containing the points on the y-axis.

If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays
[1, 8] and [3, 10] to the plot function.

import matplotlib.pyplot as plt


import numpy as np
xpoints = np.array([1, 8])
ypoints = np.array([3, 10])
plt.plot(xpoints, ypoints)
plt.show()
Multiple Points
You can plot as many points as you like; just make sure both arrays contain the
same number of points.

Example
Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and
finally to position (8, 10):

import matplotlib.pyplot as plt


import numpy as np

xpoints = np.array([1, 2, 6, 8])


ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()
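
As a quick check of the same-length rule, the sketch below passes arrays of different lengths to plot() and catches the resulting error. It assumes the non-interactive "Agg" backend so that it runs without a display; that choice is not part of the original examples.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so draw off-screen
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1])   # one value short on purpose

try:
    plt.plot(xpoints, ypoints)
    rejected = False
except ValueError as err:
    # Matplotlib refuses to pair 4 x-values with 3 y-values.
    rejected = True
    print("plot failed:", err)
finally:
    plt.close()
```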
Markers
You can use the keyword argument marker to emphasize each point with a specified marker:

import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o')
plt.show()
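
The marker examples pass only one array to plot(). In that case Matplotlib treats it as the y values and uses the indices 0, 1, 2, ... as the x values. The sketch below inspects the Line2D object that plot() returns to confirm this; the "Agg" backend is an assumption made so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so draw off-screen
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# plot() returns a list of Line2D objects; here it holds a single line.
(line,) = plt.plot(ypoints, marker='o')

print(line.get_xdata())  # the default x values: the indices 0, 1, 2, 3
print(line.get_ydata())  # the y values that were passed in
plt.close()
```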

Marker Size
You can use the keyword argument markersize, or the shorter version ms, to
set the size of the markers:

import matplotlib.pyplot as plt


import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o', ms = 20)
plt.show()
Marker Color
You can use the keyword argument markeredgecolor, or the shorter mec, to
set the color of the edge of the markers. The keyword argument markerfacecolor,
or the shorter mfc, sets the color inside the markers:

import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r')
# Alternatives that set both the edge and the face color,
# using a hexadecimal color value or a named color:
# plt.plot(ypoints, marker = 'o', ms = 20, mec = '#4CAF50', mfc = '#4CAF50')
# plt.plot(ypoints, marker = 'o', ms = 20, mec = 'hotpink', mfc = 'hotpink')
plt.show()
Create Labels for a Plot
With Pyplot, you can use the xlabel() and ylabel() functions to set labels
for the x- and y-axes, and the title() function to set a title for the plot:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
plt.plot(x, y)
plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")
plt.show()
Set Font Properties for Title and Labels
You can use the fontdict parameter in xlabel(), ylabel(), and title() to set font properties for
the title and labels.

Example

Set font properties for the title and labels:

import numpy as np

import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])

y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

font1 = {'family':'serif','color':'blue','size':20}

font2 = {'family':'serif','color':'darkred','size':15}

plt.title("Sports Watch Data", fontdict = font1)

plt.xlabel("Average Pulse", fontdict = font2)


plt.ylabel("Calorie Burnage", fontdict = font2)

plt.plot(x, y)

plt.show()

Matplotlib Scatter
With Pyplot, you can use the scatter() function to draw a scatter plot.

The scatter() function plots one dot for each observation. It needs two arrays
of the same length, one for the values of the x-axis, and one for values on
the y-axis:

import matplotlib.pyplot as plt


import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()

ColorMap
The Matplotlib module has a number of available colormaps.

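A colormap is like a list of colors, where each color corresponds to a value.
In the example below the values range from 0 to 100, and Matplotlib scales them
to the range of the colormap. The c argument supplies the value for each point,
cmap selects the colormap, and colorbar() adds the color scale to the plot.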

import matplotlib.pyplot as plt


import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()
plt.show()

Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:

import matplotlib.pyplot as plt


import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x,y)
plt.show()

The bar() function takes the keyword argument color to set the color of the bars:

import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x, y, color = "red")
plt.show()

Histogram
A histogram is a graph showing frequency distributions.

It is a graph showing the number of observations within each given interval.

In Matplotlib, we use the hist() function to create histograms.

The hist() function takes an array of numbers as its argument and uses it to
create the histogram. The example below draws 250 values from a normal
distribution with mean 170 and standard deviation 10:

import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(170, 10, 250)
plt.hist(x)
plt.show()
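
To see the "number of observations within each interval" as numbers rather than bars, the sketch below uses np.histogram(), which performs the same kind of binning and, like hist(), uses 10 equal-width bins by default. The seeded generator is an assumption added so that repeated runs give the same values.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded so the run is repeatable
x = rng.normal(170, 10, 250)          # mean 170, standard deviation 10, 250 values

# counts[i] is the number of values between edges[i] and edges[i + 1].
counts, edges = np.histogram(x)       # default: 10 equal-width bins

for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.1f} to {right:6.1f}: {n}")
```

Ten bins need eleven edges, and the counts always add up to the number of values passed in.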

Creating Pie Charts


With Pyplot, you can use the pie() function to draw pie charts:

import matplotlib.pyplot as plt


import numpy as np
y = np.array([35, 25, 25, 15])
plt.pie(y)
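plt.show()

Add labels to the pie chart with the labels parameter, which must be a list with
one label for each wedge:

import matplotlib.pyplot as plt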
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.show()
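
The size of each wedge is the value divided by the sum of all values. The sketch below computes those fractions for the data used above; it is only an arithmetic illustration and does not draw anything.

```python
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

# Each wedge covers value / sum(values) of the full circle.
fractions = y / y.sum()

for label, fraction in zip(mylabels, fractions):
    print(f"{label}: {fraction:.0%} of the circle")
```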

Explode
The explode parameter, if specified and not None, must be an array with one
value for each wedge.

Each value represents how far from the center the wedge is displayed, as a
fraction of the radius:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()

Legend
To add a list of explanations for each wedge, use the legend() function:

import matplotlib.pyplot as plt


import numpy as np

y = np.array([35, 25, 25, 15])


mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.legend()
plt.show()

CASE STUDY
Multiple regression is like linear regression, but with more than one independent variable,
meaning that we try to predict a value based on two or more variables.
The example below reads a data set (data.csv) that contains information about cars, then
predicts the CO2 emission of a car from its weight and volume.

import pandas
from sklearn import linear_model

df = pandas.read_csv("data.csv")

X = df[['Weight', 'Volume']]
y = df['CO2']
regr = linear_model.LinearRegression()
regr.fit(X, y)

# Predict the CO2 emission of a car where the weight is 2300 kg
# and the volume is 1300 cm3:
predictedCO2 = regr.predict([[2300, 1300]])

print(predictedCO2)

Output
Predicted Value
[107.2087328]
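
Since data.csv is not included here, the following sketch shows the same idea on a small made-up data set (the numbers are NOT taken from data.csv). It fits an ordinary least squares model with np.linalg.lstsq(), which is the kind of fit LinearRegression performs, instead of calling scikit-learn. The CO2 values are constructed from a known linear rule so that the recovered coefficients are easy to check.

```python
import numpy as np

# Made-up illustration data, not the contents of data.csv.
weight = np.array([790.0, 1160.0, 929.0, 865.0, 1140.0, 1365.0])    # kg
volume = np.array([1000.0, 1200.0, 1000.0, 900.0, 1500.0, 1600.0])  # cm3

# CO2 follows an exact linear rule here, so the fit should recover it.
co2 = 80.0 + 0.005 * weight + 0.008 * volume

# Design matrix: one column per independent variable, plus a column of
# ones so the model can learn an intercept.
A = np.column_stack([weight, volume, np.ones_like(weight)])
coef, *_ = np.linalg.lstsq(A, co2, rcond=None)
w_coef, v_coef, intercept = coef

# Predict for a car with weight 2300 kg and volume 1300 cm3.
predicted = w_coef * 2300 + v_coef * 1300 + intercept
print(w_coef, v_coef, intercept)
print(predicted)
```

Each coefficient says how much the predicted CO2 changes when that variable increases by one unit while the other stays fixed.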
