Unit 3
NumPy (Numerical Python) is an open source Python library that’s widely used in science
and engineering. The NumPy library contains multidimensional array data structures, such as
the homogeneous, N-dimensional ndarray, and a large library of functions that operate
efficiently on these data structures.
To use NumPy, import it in your Python code:
import numpy as np
This widespread convention allows access to NumPy features with a short, recognizable
prefix (np.) while distinguishing NumPy features from others that have the same name.
Array fundamentals
One way to initialize an array is to pass a Python sequence, such as a list, to np.array().
As with built-in Python sequences, NumPy arrays are “0-indexed”: the first element of the
array is accessed using index 0, not 1.
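As a minimal sketch of both points, the snippet below builds an array from a list and reads elements by index. The variable name a and the values are illustrative and do not come from the original notes.

```python
import numpy as np

# Build a one-dimensional array from a Python list (illustrative values).
a = np.array([1, 2, 3, 4, 5, 6])

first = a[0]    # index 0 is the first element
last = a[-1]    # negative indices count from the end, as with lists

print(first, last)  # prints: 1 6
```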
Sorting the elements of an array is simple with np.sort(). You can specify the axis, kind, and order when
you call the function.
If you start with this array:
>>> arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])
You can quickly sort the numbers in ascending order with:
>>> np.sort(arr)
array([1, 2, 3, 4, 5, 6, 7, 8])
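The axis and kind arguments mentioned above matter mostly for multidimensional arrays. The following sketch uses a small made-up 2-D array (not from the original notes) to show how the result changes with axis. Note that np.sort() returns a sorted copy and leaves the input unchanged.

```python
import numpy as np

# Illustrative 2-D array; the values are made up for this sketch.
m = np.array([[9, 1, 8],
              [3, 7, 2]])

rows_sorted = np.sort(m, axis=1)   # sort the values within each row
cols_sorted = np.sort(m, axis=0)   # sort the values within each column
flat_sorted = np.sort(m, axis=None, kind='stable')  # flatten, then sort

print(rows_sorted)
print(cols_sorted)
print(flat_sorted)
```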
ARRAY RESHAPE
arr.reshape() will give a new shape to an array without changing the data. Just remember that
when you use the reshape method, the array you want to produce needs to have the same
number of elements as the original array. If you start with an array with 12 elements, you’ll
need to make sure that your new array also has a total of 12 elements.
If you start with this array:
>>> a = np.arange(6)
>>> print(a)
[0 1 2 3 4 5]
You can use reshape() to reshape your array. For example, you can reshape this array to an
array with three rows and two columns:
>>> b = a.reshape(3, 2)
>>> print(b)
[[0 1]
 [2 3]
 [4 5]]
With np.reshape, you can specify a few optional parameters:
>>> np.reshape(a, newshape=(1, 6), order='C')
array([[0, 1, 2, 3, 4, 5]])
a is the array to be reshaped.
newshape is the new shape you want. You can specify an integer or a tuple of integers. If you
specify an integer, the result will be an array of that length. The shape should be compatible
with the original shape.
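The sketch below illustrates the order parameter and the compatibility rule. It passes the shape positionally, because the keyword name for the shape argument has changed across NumPy versions. The target shapes are chosen only for illustration.

```python
import numpy as np

a = np.arange(6)

c_order = np.reshape(a, (2, 3), order='C')  # fill row by row (the default)
f_order = np.reshape(a, (2, 3), order='F')  # fill column by column

print(c_order)
print(f_order)

# A shape whose element count differs from the original raises ValueError.
try:
    a.reshape(4, 2)          # 8 slots for 6 elements
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```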
You can index and slice NumPy arrays in the same ways you can slice Python lists.
>>> data = np.array([1, 2, 3])
>>> data[1]
2
>>> data[0:2]
array([1, 2])
>>> data[1:]
array([2, 3])
>>> data[-2:]
array([2, 3])
MATPLOTLIB
Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.
Matplotlib is mostly written in Python; a few segments are written in C, Objective-C, and
JavaScript for platform compatibility.
If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays
[1, 8] and [3, 10] to the plot function.
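A minimal sketch of that two-point case follows. The first array holds the x-coordinates and the second holds the y-coordinates. The variable names are chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])    # x-coordinates of the start and end points
ypoints = np.array([3, 10])   # y-coordinates of the start and end points

# plot() returns a list of line objects; here there is exactly one.
line, = plt.plot(xpoints, ypoints)
plt.show()
```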
Example
Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and
finally to position (8, 10):
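import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])
plt.plot(xpoints, ypoints)
plt.show()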
Markers
You can use the keyword argument marker to emphasize each point with a specified marker:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o')
plt.show()
Marker Size
You can use the keyword argument markersize or the shorter version, ms to
set the size of the markers:
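A minimal sketch of the ms shorthand, reusing the ypoints values from the Markers example. The size 20 is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# ms is the short alias for markersize; the value is given in points.
line, = plt.plot(ypoints, marker='o', ms=20)
plt.show()
```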
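Labels and Title
With Pyplot, you can use the xlabel() and ylabel() functions to set axis labels, and the title()
function to set a title for the plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
plt.plot(x, y)
plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")
plt.show()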
Set Font Properties for Title and Labels
You can use the fontdict parameter in xlabel(), ylabel(), and title() to set font properties for
the title and labels.
Example
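import matplotlib.pyplot as plt
import numpy as np
x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
font1 = {'family':'serif','color':'blue','size':20}
font2 = {'family':'serif','color':'darkred','size':15}
plt.plot(x, y)
# Apply the font dictionaries to the title and the axis labels
plt.title("Sports Watch Data", fontdict = font1)
plt.xlabel("Average Pulse", fontdict = font2)
plt.ylabel("Calorie Burnage", fontdict = font2)
plt.show()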
Matplotlib Scatter
With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays
of the same length, one for the values of the x-axis, and one for values on
the y-axis:
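A minimal sketch of a scatter plot. The x and y values are hypothetical observations invented for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical observations: x and y must have the same length.
x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 11])
y = np.array([99, 86, 87, 88, 111, 86, 103, 87, 94, 78])

points = plt.scatter(x, y)   # one dot per (x, y) pair
plt.show()
```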
ColorMap
The Matplotlib module has a number of available colormaps.
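A colormap is like a list of colors, where each color corresponds to a value. In the examples
here the values range from 0 to 100; by default Matplotlib scales the values you pass onto the colormap.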
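As a sketch of how a colormap is applied, the snippet below gives each point of a scatter plot a value through the c argument and selects the built-in 'viridis' colormap with cmap. The data values are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: one color value per point, spanning 0 to 100.
x = np.array([5, 7, 8, 7, 2, 17])
y = np.array([99, 86, 87, 88, 111, 86])
colors = np.array([0, 20, 40, 60, 80, 100])

# c supplies the values to map; cmap names the colormap to use.
points = plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar(points)   # draw the value-to-color scale beside the plot
plt.show()
```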
Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:
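A minimal sketch of a bar graph. The category labels and heights are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])   # category labels (hypothetical)
y = np.array([3, 8, 1, 10])          # bar heights (hypothetical)

bars = plt.bar(x, y)   # first argument: categories, second: heights
plt.show()
```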
Histogram
A histogram is a graph showing frequency distributions.
The hist() function uses an array of numbers to create a histogram; the
array is passed to the function as an argument.
import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(170, 10, 250)
plt.hist(x)
plt.show()
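To make the "frequency" part concrete, the sketch below captures the values that hist() returns. It uses a seeded generator so the run is repeatable, and a bin count of 10 chosen for illustration. Both are assumptions, not part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)    # seeded so the run is repeatable
x = rng.normal(170, 10, 250)      # 250 values, mean 170, std. dev. 10

# hist() returns the per-bin counts, the bin edges, and the drawn patches.
counts, edges, patches = plt.hist(x, bins=10)

print(counts.sum())   # every value falls into exactly one bin
print(len(edges))     # 10 bins are delimited by 11 edges
plt.show()
```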
Explode
The explode parameter of pie(), if specified and not None, must be an array with one
value for each wedge.
Each value represents how far from the center each wedge is displayed:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()
Legend
To add a list of explanations for each wedge, use the legend() function:
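A minimal sketch that reuses the pie data from the explode example. The legend title is optional and its text is chosen here only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels=mylabels)
# legend() picks up the wedge labels; title is an optional heading.
legend = plt.legend(title="Four Fruits:")
plt.show()
```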
Case Study
Multiple regression is like linear regression, but with more than one independent variable,
meaning that we try to predict a value based on two or more variables.
The data set used below, data.csv, contains information about cars, including the Weight, Volume, and CO2 columns.
import pandas
from sklearn import linear_model
df = pandas.read_csv("data.csv")
X = df[['Weight', 'Volume']]
y = df['CO2']
regr = linear_model.LinearRegression()
regr.fit(X, y)
# Predict the CO2 emission of a car where the weight is 2300 kg and the
# volume is 1300 cm3:
predictedCO2 = regr.predict([[2300, 1300]])
print(predictedCO2)
Output
Predicted Value
[107.2087328]
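To see what the fit computes, the sketch below repeats the idea with NumPy alone. It swaps in np.linalg.lstsq for LinearRegression and uses synthetic data built from known coefficients, not the cars data set, so the recovered values can be checked directly. All names and numbers here are assumptions made for illustration.

```python
import numpy as np

# Synthetic data (not the cars data set): y is built from known
# coefficients so the fitted values can be checked by eye.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(20, 2))         # two independent variables
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] + 5.0      # exact linear relationship

# Append a column of ones so the last fitted value is the intercept.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

b1, b2, intercept = coef
prediction = b1 * 4.0 + b2 * 6.0 + intercept  # predict for x1 = 4, x2 = 6
print(np.round(coef, 3))
print(round(float(prediction), 3))
```

With two predictors the model has one coefficient per variable plus an intercept, which is the same structure LinearRegression fits for Weight and Volume.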