Numpy Numerical Python_unit3
Numerical Python
Unit 3
● NumPy Basics
  ○ Arrays and Vectorized Computation
  ○ The NumPy ndarray
    ■ Creating ndarrays
    ■ Data Types for ndarrays
  ○ Arithmetic with NumPy Arrays
  ○ Indexing
    ■ Basic Indexing and Slicing
    ■ Boolean Indexing
  ○ Transposing Arrays and Swapping Axes
● Universal Functions
  ○ Fast Element-Wise Array Functions
● Mathematical and Statistical Methods
● Sorting, Unique and Other Set Logic
NumPy
● Tools for reading/writing array data to disk and working with memory-mapped files
● Fast vectorized array operations for data munging and cleaning, subsetting and filtering, transformation, and any other kinds of computations
● Common array algorithms like sorting, unique, and set operations
● Efficient descriptive statistics and aggregating/summarizing data
● Data alignment and relational data manipulations for merging and joining together heterogeneous datasets
● Expressing conditional logic as array expressions instead of loops with if-elif-else branches
import numpy as np
In [25]: arr2.ndim
Out[25]: 2
In [26]: arr2.shape
Out[26]: (2, 4)
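The attributes above need an existing array to inspect. The slides do not show how arr2 was built, so the following minimal sketch assumes a nested list with two rows of four values; any 2 x 4 input would give the same ndim and shape.

```python
import numpy as np

# Assumed sample data: the construction of arr2 is not shown in the slides.
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)   # nested lists of equal length become a 2-D array

print(arr2.ndim)    # number of dimensions
print(arr2.shape)   # (rows, columns)
```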
Array creation functions
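A few of the standard creation functions, as a small sketch. The shapes and fill values are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))               # 2x3 array of 0.0
ones = np.ones(4)                      # length-4 array of 1.0
counting = np.arange(5)                # like range(5), but returns an ndarray
identity = np.eye(3)                   # 3x3 identity matrix
filled = np.full((2, 2), 7)            # every element set to 7
like_zeros = np.zeros_like(counting)   # zeros with counting's shape and dtype

print(zeros.shape, counting, filled)
```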
The data type or dtype is a special object containing the information (or
metadata, data about data) the ndarray needs to interpret a chunk of
memory as a particular type of data:
import numpy as np
Output:
int64 # This means: integer, 64 bits = 8 bytes per value
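As a hedged illustration of inspecting a dtype: the sketch below passes the dtype explicitly, because the default integer type can differ between platforms and NumPy versions. The array contents are arbitrary.

```python
import numpy as np

# dtype is given explicitly so the result does not depend on the platform default
arr_int = np.array([1, 2, 3], dtype=np.int64)
arr_float = np.array([1, 2, 3], dtype=np.float64)

print(arr_int.dtype)      # the element type of the array
print(arr_int.itemsize)   # bytes used per element
print(arr_float.dtype)
```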
NumPy data types
Casting data types using astype
You can explicitly convert or cast an array from one dtype to another
using ndarray’s astype method:
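A short sketch of astype with made-up values. It shows three common cases: integer to float, float to integer (the fractional part is truncated), and numeric strings to float.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)
float_arr = arr.astype(np.float64)     # astype always returns a new array

# Casting floats to integers drops the fractional part (truncation toward zero)
values = np.array([3.7, -1.2, -2.6, 0.5])
int_values = values.astype(np.int32)

# Strings that represent numbers can be converted as well
numeric_strings = np.array(['1.25', '-9.6', '42'])
parsed = numeric_strings.astype(np.float64)

print(float_arr.dtype, int_values, parsed)
```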
In [88]: arr
Out[88]: array([ 0, 1, 2, 3, 4, 64, 64, 64, 8, 9])
In [89]: arr[1:6]
Out[89]: array([ 1, 2, 3, 4, 64])
Indexing with slices
Consider the two-dimensional array from before, arr2d. Slicing this array is a bit different:
In [90]: arr2d
Out[90]: array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
In [91]: arr2d[:2]
Out[91]: array([[1, 2, 3],
       [4, 5, 6]])
● As you can see, it has sliced along axis 0, the first axis.
● A slice, therefore, selects a range of elements along an axis.
● It can be helpful to read the expression arr2d[:2] as “select the first two rows of arr2d.”
Indexing with slices
You can pass multiple slices just like you can pass multiple indexes:
In [92]: arr2d[:2, 1:]
Out[92]: array([[2, 3],
       [5, 6]])
● By mixing integer indexes and slices, you get lower dimensional slices.
● For example, select the second row but only the first two columns like so:
In [93]: arr2d[1, :2]
Out[93]: array([4, 5])
● Similarly, I can select the third column but only the first two rows like so:
In [94]: arr2d[:2, 2]
Out[94]: array([3, 6])
Two-dimensional array slicing
Note that a colon by itself means to take the entire axis, so you can slice only higher dimensional axes by doing:
In [95]: arr2d[:, :1]
Out[95]:
array([[1],
       [4],
       [7]])
Assigning to a slice expression assigns to the whole selection:
In [96]: arr2d[:2, 1:] = 0
In [97]: arr2d
Out[97]:
array([[1, 0, 0],
       [4, 0, 0],
       [7, 8, 9]])
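The slicing rules for two-dimensional arrays can be collected into one runnable sketch. It rebuilds the 3 x 3 arr2d used in the transcripts; the variable names for the slices are illustrative only. It also relies on the fact that basic slices are views, so assigning through a slice changes the original array.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

first_two_rows = arr2d[:2]       # slice along axis 0 only
upper_right = arr2d[:2, 1:]      # rows 0-1, columns 1-2
second_row_part = arr2d[1, :2]   # integer index + slice gives a 1-D result
first_column = arr2d[:, :1]      # a bare colon keeps the whole axis

# Basic slices are views: assigning through one modifies arr2d itself
upper_right[:] = 0
print(arr2d)
```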
Boolean Indexing
As an example, suppose we have some data in an array and an array of names with duplicates. Here the randn function in numpy.random generates some random normally distributed data:
In [98]: names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
In [99]: data = np.random.randn(7, 4)
In [100]: names
Out[100]:
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')
In [101]: data
Out[101]:
array([[ 0.0929,  0.2817,  0.769 ,  1.2464],
       [ 1.0072, -1.2962,  0.275 ,  0.2289],
       [ 1.3529,  0.8864, -2.0016, -0.3718],
       [ 1.669 , -0.4386, -0.5397,  0.477 ],
       [ 3.2489, -1.0212, -0.5771,  0.1241],
       [ 0.3026,  0.5238,  0.0009,  1.3438],
       [-0.7135, -0.8312, -2.3702, -1.8608]])
Boolean Indexing
● Suppose each name corresponds to a row in the data array and we
wanted to select all the rows with corresponding name 'Bob'.
● Like arithmetic operations, comparisons (such as ==) with arrays are also vectorized.
● Thus, comparing names with the string 'Bob' yields a boolean array:
In [102]: names == 'Bob'
Out[102]: array([ True, False, False, True, False, False, False], dtype=bool)
● This boolean array can be passed when indexing the array:
In [103]: data[names == 'Bob']
Out[103]: array([[ 0.0929, 0.2817, 0.769 , 1.2464],
[ 1.669 , -0.4386, -0.5397, 0.477 ]])
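The same selection can be reproduced deterministically. This sketch swaps the random data for np.arange values (an assumption made so the result is repeatable); the names array matches the one used above.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
# Stand-in for the random data: 7 rows of 4 consecutive integers
data = np.arange(28).reshape(7, 4)

is_bob = names == 'Bob'    # vectorized comparison gives one boolean per name
bob_rows = data[is_bob]    # keeps only the rows where the mask is True

print(is_bob)
print(bob_rows)
```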
Boolean Indexing- mix and match boolean arrays
● The boolean array must be of the same length as the array axis it’s indexing.
● You can even mix and match boolean arrays with slices or integers (or sequences of integers).
● Recent NumPy releases raise an IndexError if the boolean array is not the correct length.
● In these examples, select from the rows where names == 'Bob' and index the columns:
In [104]: data[names == 'Bob', 2:]
Out[104]: array([[ 0.769 ,  1.2464],
       [-0.5397,  0.477 ]])
In [105]: data[names == 'Bob', 3]
Out[105]: array([ 1.2464, 0.477 ])
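A sketch of combining a boolean row mask with a column slice or integer, again using np.arange values as a repeatable stand-in for the random data. The last part assumes a reasonably recent NumPy release, where a mask of the wrong length raises IndexError; short_mask is a hypothetical example of such a mask.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape(7, 4)   # stand-in for the random data

last_two_cols = data[names == 'Bob', 2:]   # boolean rows + column slice -> 2-D
last_col = data[names == 'Bob', 3]         # boolean rows + integer column -> 1-D

# A mask whose length does not match the axis (3 instead of 7)
short_mask = np.array([True, False, True])
try:
    data[short_mask]
    mismatch_error = None
except IndexError as exc:
    mismatch_error = exc

print(last_two_cols, last_col, mismatch_error)
```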
Boolean Indexing- mix and match boolean arrays
To select everything but 'Bob', you can either use != or negate the condition
using ~:
In [106]: names != 'Bob'
Out[106]: array([False, True, True, False, True, True, True], dtype=bool)
In [107]: data[~(names == 'Bob')]
Out[107]: array([[ 1.0072, -1.2962, 0.275 , 0.2289],
[ 1.3529, 0.8864, -2.0016, -0.3718],
[ 3.2489, -1.0212, -0.5771, 0.1241],
[ 0.3026, 0.5238, 0.0009, 1.3438],
[-0.7135, -0.8312, -2.3702, -1.8608]])
Boolean Indexing- mix and match boolean arrays
The ~ operator can be useful when you want to invert a general condition:
In [108]: cond = names == 'Bob'
In [109]: data[~cond]
Out[109]:
array([[ 1.0072, -1.2962, 0.275 , 0.2289],
[ 1.3529, 0.8864, -2.0016, -0.3718],
[ 3.2489, -1.0212, -0.5771, 0.1241],
[ 0.3026, 0.5238, 0.0009, 1.3438],
[-0.7135, -0.8312, -2.3702, -1.8608]])
Multiple boolean conditions
To select two of the three names, combine multiple boolean conditions using boolean arithmetic operators like & (and) and | (or):
In [110]: mask = (names == 'Bob') | (names == 'Will')
In [111]: mask
Out[111]: array([ True, False, True, True, True, False, False], dtype=bool)
In [112]: data[mask]
Out[112]: array([[ 0.0929, 0.2817, 0.769 , 1.2464],
[ 1.3529, 0.8864, -2.0016, -0.3718],
[ 1.669 , -0.4386, -0.5397, 0.477 ],
[ 3.2489, -1.0212, -0.5771, 0.1241]])
Selecting data from an array by Boolean indexing always creates a copy of the data, even if the returned array is unchanged.
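A small sketch of combining conditions and of the copy behavior, using np.arange values in place of the random data so the result is repeatable. Each comparison is wrapped in parentheses because & and | bind more tightly than ==.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape(7, 4)   # stand-in for the random data

mask = (names == 'Bob') | (names == 'Will')
selected = data[mask]

# Boolean indexing returns a copy, so changing it leaves data untouched
selected[:] = -1
print(mask)
print(data[0])
```

The Python keywords and and or do not work on boolean arrays; use & and | instead.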
Setting negative values in data to 0
Setting values with boolean arrays works in a common-sense way. To set all of the negative values in data to 0 we need only do:
In [113]: data[data < 0] = 0
In [114]: data
Out[114]:
array([[ 0.0929,  0.2817,  0.769 ,  1.2464],
       [ 1.0072,  0.    ,  0.275 ,  0.2289],
       [ 1.3529,  0.8864,  0.    ,  0.    ],
       [ 1.669 ,  0.    ,  0.    ,  0.477 ],
       [ 3.2489,  0.    ,  0.    ,  0.1241],
       [ 0.3026,  0.5238,  0.0009,  1.3438],
       [ 0.    ,  0.    ,  0.    ,  0.    ]])
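A minimal, repeatable version of assignment through a boolean mask. The small array below is made up for the illustration; the assignment modifies the array in place.

```python
import numpy as np

# Hypothetical values chosen so the effect is easy to read
data = np.array([[0.5, -1.0],
                 [-2.5, 3.0]])

negative = data < 0   # boolean array with the same shape as data
data[negative] = 0    # in-place: only the True positions are overwritten

print(data)
```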
Transposing Arrays
Transposing is a special form of reshaping that similarly returns a view on the underlying data without copying anything.
Arrays have the transpose method and also the special T attribute:
In [126]: arr = np.arange(15).reshape((3, 5))
In [127]: arr
Out[127]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
In [128]: arr.T
Out[128]:
array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])
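That the transpose is a view rather than a copy can be checked directly. In this sketch, writing through the transposed array changes the original; the value 99 is arbitrary.

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))
transposed = arr.T   # shape (5, 3), no data copied

# Element [0, 1] of the transpose is element [1, 0] of the original
transposed[0, 1] = 99

print(arr[1, 0])
print(np.shares_memory(arr, transposed))
```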
Transposing Arrays
When doing matrix computations, you may do this very often, for example when computing the inner matrix product using np.dot:
In [129]: arr = np.random.randn(6, 3)
In [130]: arr
Out[130]:
array([[-0.8608,  0.5601, -1.2659],
       [ 0.1198, -1.0635,  0.3329],
       [-2.3594, -0.1995, -1.542 ],
       [-0.9707, -1.307 ,  0.2863],
       [ 0.378 , -0.7539,  0.3313],
In [131]: np.dot(arr.T, arr)
Out[131]:
array([[ 9.2291,  0.9394,  4.948 ],
       [ 0.9394,  3.7662, -1.3622],
       [ 4.948 , -1.3622,  4.3437]])
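A repeatable sketch of the inner matrix product. It uses np.random.default_rng with an arbitrary seed instead of np.random.randn, which is a substitution made here so the run is reproducible; the @ operator is shown as an equivalent spelling for two-dimensional arrays.

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen arbitrarily
arr = rng.standard_normal((6, 3))   # 6 observations, 3 columns

gram = np.dot(arr.T, arr)           # (3, 6) times (6, 3) gives (3, 3)
same_as_matmul = np.allclose(gram, arr.T @ arr)

print(gram.shape)
print(np.allclose(gram, gram.T))    # X^T X is symmetric
```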
Higher dimensional arrays, transpose
For higher dimensional arrays, transpose will accept a tuple of axis numbers to permute the axes (for extra mind bending):
In [132]: arr = np.arange(16).reshape((2, 2, 4))
In [133]: arr
Out[133]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],
       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])
In [134]: arr.transpose((1, 0, 2))
Out[134]:
array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],
       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])
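One way to read the axis tuple: with (1, 0, 2), the first two axes trade places and the last stays put, so permuted[i, j, k] equals arr[j, i, k]. The sketch below checks this, and adds a second array with unequal axis lengths (an illustrative choice) so the change in shape is visible.

```python
import numpy as np

arr = np.arange(16).reshape((2, 2, 4))
permuted = arr.transpose((1, 0, 2))   # swap axes 0 and 1, keep axis 2

# permuted[i, j, k] == arr[j, i, k]
print(permuted[0, 1])   # same values as arr[1, 0]

# With unequal axis lengths the reordering shows up in the shape
uneven = np.zeros((2, 3, 4))
print(uneven.transpose((1, 0, 2)).shape)
```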
Swapping Axes
Simple transposing with .T is a special case of swapping axes. ndarray has the method swapaxes, which takes a pair of axis numbers and switches the indicated axes to rearrange the data:
In [135]: arr
Out[135]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],
       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])
In [136]: arr.swapaxes(1, 2)
Out[136]:
array([[[ 0,  4],
        [ 1,  5],
        [ 2,  6],
        [ 3,  7]],
       [[ 8, 12],
        [ 9, 13],
        [10, 14],
        [11, 15]]])
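A brief sketch of swapaxes on the same 2 x 2 x 4 array. It also checks two points that follow from the description: the result shares memory with the original, and for a two-dimensional array swapping axes 0 and 1 matches .T. The matrix variable is an illustrative addition.

```python
import numpy as np

arr = np.arange(16).reshape((2, 2, 4))
swapped = arr.swapaxes(1, 2)   # shape (2, 2, 4) becomes (2, 4, 2)

print(swapped.shape)
print(np.shares_memory(arr, swapped))   # a view, no copy made

matrix = np.arange(6).reshape((2, 3))
print(np.array_equal(matrix.swapaxes(0, 1), matrix.T))
```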
Universal Functions: Fast Element-Wise Array
Functions
In [141]: x = np.random.randn(8)
In [142]: y = np.random.randn(8)
In [143]: x
Out[143]: array([-0.0119, 1.0048, 1.3272, -0.9193, -1.5491, 0.0222, 0.7584, -0.6605])
In [144]: y
Out[144]: array([ 0.8626, -0.01 , 0.05 , 0.6702, 0.853 , -0.9559, -0.0235, -2.3042])
In [145]: np.maximum(x, y)
Out[145]: array([ 0.8626, 1.0048, 1.3272, 0.6702, 0.853 , 0.0222, 0.7584, -0.6605])
Here, numpy.maximum computed the element-wise maximum of the elements in x and y.
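A compact sketch of unary and binary universal functions on fixed inputs (the values are made up so the results are easy to verify by hand). np.sqrt takes one array, np.maximum takes two, and np.modf is an example of a ufunc that returns two arrays.

```python
import numpy as np

x = np.array([-0.5, 1.0, 4.0])
y = np.array([0.25, -2.0, 9.0])

elementwise_max = np.maximum(x, y)               # binary ufunc
roots = np.sqrt(np.array([0.0, 1.0, 4.0, 9.0]))  # unary ufunc

# modf splits each value into fractional and integral parts
frac, whole = np.modf(np.array([3.5, -1.25]))

print(elementwise_max, roots, frac, whole)
```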
Unary universal functions
Binary universal functions
Mathematical and Statistical Methods
Like Python’s built-in list type, NumPy arrays can be sorted in-place with the sort
method:
In [195]: arr = np.random.randn(6)
In [196]: arr
Out[196]: array([ 0.6095, -0.4938, 1.24 , -0.1357, 1.43 , -0.8469])
In [197]: arr.sort()
In [198]: arr
Out[198]: array([-0.8469, -0.4938, -0.1357, 0.6095, 1.24 , 1.43 ])
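The in-place method has a copying counterpart, the top-level function np.sort. The following sketch contrasts the two on a small made-up array.

```python
import numpy as np

arr = np.array([0.6, -0.5, 1.2, -0.1])

sorted_copy = np.sort(arr)    # returns a sorted copy, arr is left as it was
untouched = arr.copy()        # snapshot taken before the in-place sort

result = arr.sort()           # sorts arr itself and returns None

print(sorted_copy, untouched, arr, result)
```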
Sorting
You can sort each one-dimensional section of values in a multidimensional array in place along an axis by passing the axis number to sort:
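A minimal sketch of sorting along an axis, with a small integer array standing in for random data. Passing axis=1 sorts within each row; axis=0 sorts within each column. Copies are sorted here only so both results can be shown side by side.

```python
import numpy as np

arr = np.array([[3, 1, 2],
                [9, 7, 8],
                [6, 4, 5]])

by_rows = arr.copy()
by_rows.sort(axis=1)   # each row sorted independently, in place

by_cols = arr.copy()
by_cols.sort(axis=0)   # each column sorted independently, in place

print(by_rows)
print(by_cols)
```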