
Numpy

This document is a cheat sheet for using NumPy in Python for data science, covering essential functions for array manipulation, mathematical operations, and data types. It includes examples of array creation, indexing, slicing, and various arithmetic operations. Additionally, it provides information on saving/loading arrays and performing aggregate functions.

Python For Data Science Cheat Sheet: NumPy Basics
Learn Python for Data Science Interactively at www.DataCamp.com

NumPy
The NumPy library is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

Use the following import convention:
>>> import numpy as np

NumPy Arrays
(Figure: a 1D array, a 2D array with axis 0 and axis 1, and a 3D array with axis 0, axis 1 and axis 2.)

Creating Arrays
>>> a = np.array([1,2,3])
>>> b = np.array([(1.5,2,3), (4,5,6)], dtype = float)
>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]], dtype = float)

Initial Placeholders
>>> np.zeros((3,4))                    Create an array of zeros
>>> np.ones((2,3,4),dtype=np.int16)    Create an array of ones
>>> d = np.arange(10,25,5)             Create an array of evenly spaced values (step value)
>>> np.linspace(0,2,9)                 Create an array of evenly spaced values (number of samples)
>>> e = np.full((2,2),7)               Create a constant array
>>> f = np.eye(2)                      Create a 2x2 identity matrix
>>> np.random.random((2,2))            Create an array with random values
>>> np.empty((3,2))                    Create an empty (uninitialized) array

I/O
Saving & Loading On Disk
>>> np.save('my_array', a)
>>> np.savez('array.npz', a, b)
>>> np.load('my_array.npy')

Saving & Loading Text Files
>>> np.loadtxt("myfile.txt")
>>> np.genfromtxt("my_file.csv", delimiter=',')
>>> np.savetxt("myarray.txt", a, delimiter=" ")

Data Types
>>> np.int64         Signed 64-bit integer types
>>> np.float32       Standard single-precision floating point
>>> np.complex128    Complex numbers represented by two 64-bit floats (128 bits in total)
>>> np.bool_         Boolean type storing True and False values
>>> np.object_       Python object type
>>> np.bytes_        Fixed-length byte string type
>>> np.str_          Fixed-length unicode type
Note: older NumPy releases also accepted np.complex, np.bool, np.object, np.string_ and np.unicode_; those aliases have since been removed or changed.

Inspecting Your Array
>>> a.shape          Array dimensions
>>> len(a)           Length of array
>>> b.ndim           Number of array dimensions
>>> e.size           Number of array elements
>>> b.dtype          Data type of array elements
>>> b.dtype.name     Name of data type
>>> b.astype(int)    Convert an array to a different type

Asking For Help
>>> np.info(np.ndarray.dtype)

Array Mathematics
Arithmetic Operations
>>> g = a - b                          Subtraction
array([[-0.5, 0. , 0. ],
       [-3. , -3. , -3. ]])
>>> np.subtract(a,b)                   Subtraction
>>> b + a                              Addition
array([[ 2.5, 4. , 6. ],
       [ 5. , 7. , 9. ]])
>>> np.add(b,a)                        Addition
>>> a / b                              Division
array([[ 0.66666667, 1. , 1. ],
       [ 0.25 , 0.4 , 0.5 ]])
>>> np.divide(a,b)                     Division
>>> a * b                              Multiplication
array([[ 1.5, 4. , 9. ],
       [ 4. , 10. , 18. ]])
>>> np.multiply(a,b)                   Multiplication
>>> np.exp(b)                          Exponentiation
>>> np.sqrt(b)                         Square root
>>> np.sin(a)                          Element-wise sine
>>> np.cos(b)                          Element-wise cosine
>>> np.log(a)                          Element-wise natural logarithm
>>> e.dot(f)                           Dot product
array([[ 7., 7.],
       [ 7., 7.]])

Comparison
>>> a == b                             Element-wise comparison
array([[False, True, True],
       [False, False, False]])
>>> a < 2                              Element-wise comparison
array([True, False, False])
>>> np.array_equal(a, b)               Array-wise comparison

Aggregate Functions
>>> a.sum()                            Array-wise sum
>>> a.min()                            Array-wise minimum value
>>> b.max(axis=0)                      Maximum value along axis 0 (one value per column)
>>> b.cumsum(axis=1)                   Cumulative sum of the elements along axis 1
>>> a.mean()                           Mean
>>> np.median(b)                       Median (a function, not an array method)
>>> np.corrcoef(a)                     Correlation coefficient (a function, not an array method)
>>> np.std(b)                          Standard deviation

Subsetting, Slicing, Indexing (also see Lists)
Subsetting
>>> a[2]                               Select the element at index 2
3
>>> b[1,2]                             Select the element at row 1 column 2 (equivalent to b[1][2])
6.0

Slicing
>>> a[0:2]                             Select items at index 0 and 1
array([1, 2])
>>> b[0:2,1]                           Select items at rows 0 and 1 in column 1
array([ 2., 5.])
>>> b[:1]                              Select all items at row 0 (equivalent to b[0:1, :])
array([[1.5, 2., 3.]])
>>> c[1,...]                           Same as [1,:,:]
array([[ 3., 2., 1.],
       [ 4., 5., 6.]])
>>> a[ : :-1]                          Reversed array a
array([3, 2, 1])

Boolean Indexing
>>> a[a<2]                             Select elements from a less than 2
array([1])

Fancy Indexing
>>> b[[1, 0, 1, 0],[0, 1, 2, 0]]       Select elements (1,0),(0,1),(1,2) and (0,0)
array([ 4. , 2. , 6. , 1.5])
>>> b[[1, 0, 1, 0]][:,[0,1,2,0]]       Select a subset of the matrix's rows and columns
array([[ 4. , 5. , 6. , 4. ],
       [ 1.5, 2. , 3. , 1.5],
       [ 4. , 5. , 6. , 4. ],
       [ 1.5, 2. , 3. , 1.5]])

Copying Arrays
>>> h = a.view()                       Create a view of the array with the same data
>>> np.copy(a)                         Create a copy of the array
>>> h = a.copy()                       Create a deep copy of the array

Sorting Arrays
>>> a.sort()                           Sort an array in place
>>> c.sort(axis=0)                     Sort the elements of an array's axis in place

Array Manipulation
Transposing Array
>>> i = np.transpose(b)                Permute array dimensions
>>> i.T                                Permute array dimensions

Changing Array Shape
>>> b.ravel()                          Flatten the array
>>> g.reshape(3,-1)                    Reshape, but don't change data (-1 lets NumPy infer that dimension)

Adding/Removing Elements
>>> h.resize((2,6))                    Change the array's shape to (2,6) in place (returns None)
>>> np.append(h,g)                     Append items to an array
>>> np.insert(a, 1, 5)                 Insert items in an array
>>> np.delete(a,[1])                   Delete items from an array

Combining Arrays
>>> np.concatenate((a,d),axis=0)       Concatenate arrays
array([ 1, 2, 3, 10, 15, 20])
>>> np.vstack((a,b))                   Stack arrays vertically (row-wise)
array([[ 1. , 2. , 3. ],
       [ 1.5, 2. , 3. ],
       [ 4. , 5. , 6. ]])
>>> np.r_[e,f]                         Stack arrays vertically (row-wise)
>>> np.hstack((e,f))                   Stack arrays horizontally (column-wise)
array([[ 7., 7., 1., 0.],
       [ 7., 7., 0., 1.]])
>>> np.column_stack((a,d))             Create stacked column-wise arrays
array([[ 1, 10],
       [ 2, 15],
       [ 3, 20]])
>>> np.c_[a,d]                         Create stacked column-wise arrays

Splitting Arrays
>>> np.hsplit(a,3)                     Split the array horizontally into 3 equal parts
[array([1]),array([2]),array([3])]
>>> np.vsplit(c,2)                     Split the array vertically into 2 equal parts (output shown after c.sort(axis=0) above)
[array([[[ 1.5, 2. , 1. ],
         [ 4. , 5. , 6. ]]]),
 array([[[ 3., 2., 3.],
         [ 4., 5., 6.]]])]
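The difference between a.view() and a.copy() listed under Copying Arrays is easiest to see by modifying the original array afterwards. The following is a minimal sketch; the names view_of_a and copy_of_a are illustrative and do not appear in the cheat sheet.

```python
import numpy as np

a = np.array([1, 2, 3])

view_of_a = a.view()   # new array object that shares a's data buffer
copy_of_a = a.copy()   # new array object with its own data buffer

a[0] = 99              # modify the original in place

print(view_of_a[0])    # the view reflects the change made to a
print(copy_of_a[0])    # the copy still holds the old value

# np.shares_memory reports whether two arrays overlap in memory
print(np.shares_memory(a, view_of_a))
print(np.shares_memory(a, copy_of_a))
```

A view is cheap because no data is duplicated, but writes through either name are visible through the other; a copy is the safer choice when the original must stay untouched.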
A Chinese translation of the cheat sheet above (Python Data Science Cheat Sheet: NumPy Basics) followed here. Translated by 呆鸟, 天善智能 business intelligence and big data community, www.hellobi.com; original author: DataCamp. Its content duplicates the English version.
Numpy (Numerical Python) Cheat Sheet
Python Package
Created By: Arianne Colton and Sean Chen

What is NumPy?
Foundation package for scientific computing in Python

Why NumPy?
• Numpy 'ndarray' is a much more efficient way of storing and manipulating "numerical data" than the built-in Python data structures.
• Libraries written in lower-level languages, such as C, can operate on data stored in Numpy 'ndarray' without copying any data.

N-DIMENSIONAL ARRAY (NDARRAY)

What is NdArray?
Fast and space-efficient multidimensional array (container for homogeneous data) providing vectorized arithmetic operations

Create NdArray                       np.array(seq1)
                                     # seq1 - is any sequence like object, i.e. [1, 2, 3]
Create Special NdArray               1, np.zeros(10)
                                     # one dimensional ndarray with 10 elements of value 0
                                     2, np.ones((2, 3))
                                     # two dimensional ndarray with 6 elements of value 1
                                     3, np.empty((3, 4, 5)) *
                                     # three dimensional ndarray of uninitialized values
                                     4, np.eye(N) or np.identity(N)
                                     # creates N by N identity matrix
NdArray version of Python's range    np.arange(1, 10)
Get # of Dimension                   ndarray1.ndim
Get Dimension Size                   dim1size, dim2size, .. = ndarray1.shape
Get Data Type **                     ndarray1.dtype
Explicit Casting                     ndarray2 = ndarray1.astype(np.int32) ***

* Cannot assume empty() will return all zeros. It could be garbage values.
** Default data type is 'np.float64'. This is equivalent to Python's float type which is 8 bytes (64 bits); thus the name 'float64'.
*** If casting were to fail for some reason, an exception (typically 'ValueError' or 'TypeError') will be raised.

SLICING (INDEXING/SUBSETTING)
• Slicing (i.e. ndarray1[2:6]) is a 'view' on the original array. Data is NOT copied. Any modifications (i.e. ndarray1[2:6] = 8) to the 'view' will be reflected in the original array.
• Instead of a 'view', explicit copy of slicing via:
  ndarray1[2:6].copy()
• Multidimensional array indexing notation:
  ndarray1[0][2] or ndarray1[0, 2]
• Boolean indexing:
  ndarray1[(names == 'Bob') | (names == 'Will'), 2:]
  # '2:' means select from 3rd column on
  * Selecting data by boolean indexing ALWAYS creates a copy of the data.
  * The 'and' and 'or' keywords do NOT work with boolean arrays. Use & and |.
• Fancy indexing (aka 'indexing using integer arrays')
  Select a subset of rows in a particular order:
  ndarray1[ [3, 8, 4] ]
  ndarray1[ [-1, 6] ]
  # negative indices select rows from the end
  * Fancy indexing ALWAYS creates a copy of the data.
• Setting data with assignment:
  ndarray1[ndarray1 < 0] = 0 *
  * If ndarray1 is two-dimensional, ndarray1 < 0 creates a two-dimensional boolean array.

COMMON OPERATIONS

1. Transposing
• A special form of reshaping which returns a 'view' on the underlying data without copying anything.
  ndarray1.transpose() or
  ndarray1.T or
  ndarray1.swapaxes(0, 1)

2. Vectorized wrappers (for functions that take scalar values)
• math.sqrt() works on only a scalar
  np.sqrt(seq1) # any sequence (list, ndarray, etc) to return a ndarray

3. Vectorized expressions
• np.where(cond, x, y) is a vectorized version of the expression 'x if condition else y'
  np.where([True, False], [1, 2], [2, 3]) => ndarray (1, 3)
• Common Usages:
  np.where(matrixArray > 0, 1, -1)
  => a new array (same shape) of 1 or -1 values
  np.where(cond, 1, 0).argmax() *
  => Find the first True element
  * argmax() can be used to find the index of the maximum element. Example usage is to find the first element that has a "price > number" in an array of price data.

4. Aggregations/Reductions Methods (i.e. mean, sum, std)
Compute mean                         ndarray1.mean() or np.mean(ndarray1)
Compute statistics over axis *       ndarray1.mean(axis = 1)
                                     ndarray1.sum(axis = 0)
* axis = 0 reduces down the rows (one result per column); axis = 1 reduces across the columns (one result per row).

5. Boolean arrays methods
Count # of 'Trues' in boolean array  (ndarray1 > 0).sum()
If at least one value is 'True'      ndarray1.any()
If all values are 'True'             ndarray1.all()
Note: These methods also work with non-boolean arrays, where non-zero elements evaluate to True.

6. Sorting
Inplace sorting                      ndarray1.sort()
Return a sorted copy instead of      sorted1 = np.sort(ndarray1)
inplace

7. Set methods
Return sorted unique values          np.unique(ndarray1)
Test membership of ndarray1 values   resultBooleanArray = np.in1d(ndarray1, [2, 3, 6])
in [2, 3, 6]
• Other set methods: intersect1d(), union1d(), setdiff1d(), setxor1d()

8. Random number generation (np.random)
• Supplements the built-in Python random * with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions.
  samples = np.random.normal(size =(3, 3))
  * Python built-in random ONLY samples one value at a time.

Created by Arianne Colton and Sean Chen
www.datasciencefree.com
Based on content from 'Python for Data Analysis' by Wes McKinney
Updated: August 18, 2016
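The "find the first element with price > number" usage of np.where(cond, 1, 0).argmax() mentioned above can be sketched as follows. The price values and the threshold are made up for illustration. One caveat worth stating: argmax() also returns 0 when no element satisfies the condition, so the sketch checks cond.any() first.

```python
import numpy as np

prices = np.array([8.5, 9.0, 12.5, 11.0, 15.0])  # hypothetical price data
threshold = 10.0                                  # hypothetical cutoff

cond = prices > threshold

# argmax() returns the index of the first maximum, i.e. the first 1 (True).
# It returns 0 when nothing matches, so guard with cond.any().
first_idx = int(np.where(cond, 1, 0).argmax()) if cond.any() else None
print(first_idx)

# Combine conditions with & and | (not 'and' / 'or'), each side parenthesized
in_band = prices[(prices > 9.0) & (prices < 13.0)]
print(in_band)
```

Because cond is already a boolean array, cond.argmax() gives the same index; the np.where form is kept here only to match the cheat sheet.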
LEARN DATA SCIENCE ONLINE
Start Learning For Free - www.dataquest.io

Data Science Cheat Sheet
NumPy

KEY
We'll use shorthand in this cheat sheet
arr - A numpy Array object

IMPORTS
Import these to start
import numpy as np

IMPORTING/EXPORTING
np.loadtxt('file.txt') - From a text file
np.genfromtxt('file.csv',delimiter=',') - From a CSV file
np.savetxt('file.txt',arr,delimiter=' ') - Writes to a text file
np.savetxt('file.csv',arr,delimiter=',') - Writes to a CSV file

CREATING ARRAYS
np.array([1,2,3]) - One dimensional array
np.array([(1,2,3),(4,5,6)]) - Two dimensional array
np.zeros(3) - 1D array of length 3 all values 0
np.ones((3,4)) - 3x4 array with all values 1
np.eye(5) - 5x5 array of 0 with 1 on diagonal (Identity matrix)
np.linspace(0,100,6) - Array of 6 evenly divided values from 0 to 100
np.arange(0,10,3) - Array of values from 0 to less than 10 with step 3 (eg [0,3,6,9])
np.full((2,3),8) - 2x3 array with all values 8
np.random.rand(4,5) - 4x5 array of random floats between 0-1
np.random.rand(6,7)*100 - 6x7 array of random floats between 0-100
np.random.randint(5,size=(2,3)) - 2x3 array with random ints between 0-4

INSPECTING PROPERTIES
arr.size - Returns number of elements in arr
arr.shape - Returns dimensions of arr (rows, columns)
arr.dtype - Returns type of elements in arr
arr.astype(dtype) - Convert arr elements to type dtype
arr.tolist() - Convert arr to a Python list
np.info(np.eye) - View documentation for np.eye

COPYING/SORTING/RESHAPING
np.copy(arr) - Copies arr to new memory
arr.view(dtype) - Creates view of arr elements with type dtype
arr.sort() - Sorts arr
arr.sort(axis=0) - Sorts specific axis of arr
two_d_arr.flatten() - Flattens 2D array two_d_arr to 1D
arr.T - Transposes arr (rows become columns and vice versa)
arr.reshape(3,4) - Reshapes arr to 3 rows, 4 columns without changing data
arr.resize((5,6)) - Changes arr shape to 5x6 and fills new values with 0

ADDING/REMOVING ELEMENTS
np.append(arr,values) - Appends values to end of arr
np.insert(arr,2,values) - Inserts values into arr before index 2
np.delete(arr,3,axis=0) - Deletes row on index 3 of arr
np.delete(arr,4,axis=1) - Deletes column on index 4 of arr

COMBINING/SPLITTING
np.concatenate((arr1,arr2),axis=0) - Adds arr2 as rows to the end of arr1
np.concatenate((arr1,arr2),axis=1) - Adds arr2 as columns to end of arr1
np.split(arr,3) - Splits arr into 3 sub-arrays
np.hsplit(arr,5) - Splits arr horizontally into 5 equal sub-arrays

INDEXING/SLICING/SUBSETTING
arr[5] - Returns the element at index 5
arr[2,5] - Returns the 2D array element on index [2][5]
arr[1]=4 - Assigns array element on index 1 the value 4
arr[1,3]=10 - Assigns array element on index [1][3] the value 10
arr[0:3] - Returns the elements at indices 0,1,2 (On a 2D array: returns rows 0,1,2)
arr[0:3,4] - Returns the elements on rows 0,1,2 at column 4
arr[:2] - Returns the elements at indices 0,1 (On a 2D array: returns rows 0,1)
arr[:,1] - Returns the elements at index 1 on all rows
arr<5 - Returns an array with boolean values
(arr1<3) & (arr2>5) - Returns an array with boolean values
~arr - Inverts a boolean array
arr[arr<5] - Returns array elements smaller than 5

SCALAR MATH
np.add(arr,1) - Add 1 to each array element
np.subtract(arr,2) - Subtract 2 from each array element
np.multiply(arr,3) - Multiply each array element by 3
np.divide(arr,4) - Divide each array element by 4 (dividing by zero yields inf, or np.nan for 0/0, with a RuntimeWarning)
np.power(arr,5) - Raise each array element to the 5th power

VECTOR MATH
np.add(arr1,arr2) - Elementwise add arr2 to arr1
np.subtract(arr1,arr2) - Elementwise subtract arr2 from arr1
np.multiply(arr1,arr2) - Elementwise multiply arr1 by arr2
np.divide(arr1,arr2) - Elementwise divide arr1 by arr2
np.power(arr1,arr2) - Elementwise raise arr1 to the power of arr2
np.array_equal(arr1,arr2) - Returns True if the arrays have the same elements and shape
np.sqrt(arr) - Square root of each element in the array
np.sin(arr) - Sine of each element in the array
np.log(arr) - Natural log of each element in the array
np.abs(arr) - Absolute value of each element in the array
np.ceil(arr) - Rounds up to the nearest int
np.floor(arr) - Rounds down to the nearest int
np.round(arr) - Rounds to the nearest int

STATISTICS
np.mean(arr,axis=0) - Returns mean along specific axis
arr.sum() - Returns sum of arr
arr.min() - Returns minimum value of arr
arr.max(axis=0) - Returns maximum value of specific axis
np.var(arr) - Returns the variance of array
np.std(arr,axis=1) - Returns the standard deviation of specific axis
np.corrcoef(arr) - Returns correlation coefficient of array
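Two entries on this sheet are easy to misread: what np.divide does when the divisor is zero, and how the correlation coefficient is obtained. The sketch below uses made-up arrays (arr, zeros, x, y are illustrative names) and assumes floating-point inputs.

```python
import numpy as np

arr = np.array([1.0, 0.0, -2.0])
zeros = np.zeros(3)

# Silence the RuntimeWarning NumPy would otherwise emit for these divisions
with np.errstate(divide="ignore", invalid="ignore"):
    quotient = np.divide(arr, zeros)
print(quotient)  # positive / 0 is inf, 0 / 0 is nan, negative / 0 is -inf

# corrcoef is a module-level function; ndarray has no corrcoef method
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0            # perfectly linear in x
r = np.corrcoef(x, y)        # 2x2 correlation matrix
print(r[0, 1])               # off-diagonal entry: correlation of x with y
```

np.corrcoef returns a matrix, so the pairwise coefficient is read from the off-diagonal entry.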

NumPy Cheat Sheets: Tips and Tricks

What's the Meaning of Axes and Shape properties?

1D NumPy Array (axis 0)
→ a.ndim = 1 "axis 0"
→ a.shape = (5,) "five rows"
→ a.size = 5 "5 elements"

2D NumPy Array (axis 0, axis 1)
→ a.ndim = 2 "axis 0 and axis 1"
→ a.shape = (5, 4) "five rows, four cols"
→ a.size = 20 "5*4=20 elements"

3D NumPy Array (axis 0, axis 1, axis 2)
→ a.ndim = 3 "axis 0, axis 1, and axis 2"
→ a.shape = (5, 4, 3) "5 rows, 4 cols, 3 levels"
→ a.size = 60 "5*4*3=60 elements"
What's Broadcasting?

Goal: bring arrays with different shapes into the same shape during arithmetic operations. NumPy does that for you!

import numpy as np

salary = np.array([2000, 4000, 8000])
salary_bump = 1.1

print(salary * salary_bump)
# [2200. 4400. 8800.]

• For any dimension where first array has size of one, NumPy conceptually copies its data until the size of the second array is reached.
• If dimension is completely missing for array B, it is simply copied along the missing dimension.

How to Search Arrays? The np.nonzero() Trick

Goal: find elements that meet a certain condition in a NumPy array

• Step 1: Understanding np.nonzero()

import numpy as np

X = np.array([[1, 0, 0],
              [0, 2, 2],
              [3, 0, 0]])

print(np.nonzero(X))
# (array([0, 1, 1, 2], dtype=int64), array([0, 1, 2, 0], dtype=int64))

The result is a tuple of two NumPy arrays. The first array gives the row indices of non-zero elements. The second array gives the column indices of non-zero elements.
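To make the tuple returned by np.nonzero() more concrete, the following sketch pairs the row and column indices into coordinates and uses them to pull the non-zero values back out of X. The variable names rows, cols, coords and values are illustrative.

```python
import numpy as np

X = np.array([[1, 0, 0],
              [0, 2, 2],
              [3, 0, 0]])

rows, cols = np.nonzero(X)  # one index array per dimension

# Pair the two index arrays into (row, column) coordinates
coords = list(zip(rows.tolist(), cols.tolist()))
print(coords)

# Indexing with both arrays returns the non-zero values themselves
values = X[rows, cols]
print(values)
```

Indexing with X[rows, cols] is the fancy-indexing counterpart of the boolean mask X[X != 0]; both return the same values in row-major order.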
• Step 2: Use np.nonzero() and broadcasting to find elements

import numpy as np

## Data: air quality index AQI data (row = city)
X = np.array(
    [[ 42, 40, 41, 43, 44, 43 ],   # Hong Kong
     [ 30, 31, 29, 29, 29, 30 ],   # New York
     [ 8, 13, 31, 11, 11, 9 ],     # Berlin
     [ 11, 11, 12, 13, 11, 12 ]])  # Montreal

cities = np.array(["Hong Kong", "New York", "Berlin", "Montreal"])

# Find cities with above average pollution
polluted = set(cities[np.nonzero(X > np.average(X))[0]])
print(polluted)

The Boolean expression "X > np.average(X)" uses broadcasting to bring both operands to the same shape. Then it performs an element-wise comparison to determine a Boolean array that contains "True" if the respective measurement observed an above average AQI value. The function np.average() computes the average AQI value over all NumPy array elements. Boolean indexing accesses all city rows with above average pollution values.

What's Boolean Indexing?

Goal: create a new array that contains a fine-grained element selection of the old array

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

indices = np.array([[False, False, True],
                    [False, False, False],
                    [True, True, False]])

print(a[indices])
# [3 7 8]

We create two arrays "a" and "indices". The first array contains two-dimensional numerical data (=data array). The second array has the same shape and contains Boolean values (=indexing array). You can use the indexing array for fine-grained data array access. This creates a new NumPy array from the data array containing only those elements for which the indexing array contains "True" Boolean values at the respective array positions. Thus, the resulting array contains the three values 3, 7, and 8.
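Broadcasting and Boolean indexing combine naturally when the mask is computed rather than written by hand. The sketch below reuses the AQI data from this sheet but, as an assumption not found in the original, compares each measurement with its own city's mean instead of the global average; keepdims=True keeps the means in a (4, 1) column so they broadcast across the six columns.

```python
import numpy as np

# AQI data from the sheet (row = city: Hong Kong, New York, Berlin, Montreal)
X = np.array([[42, 40, 41, 43, 44, 43],
              [30, 31, 29, 29, 29, 30],
              [ 8, 13, 31, 11, 11,  9],
              [11, 11, 12, 13, 11, 12]])

# Per-city mean, kept as a column of shape (4, 1)
row_mean = X.mean(axis=1, keepdims=True)

# (4, 6) compared with (4, 1): the size-one axis is stretched by broadcasting
mask = X > row_mean

# Boolean indexing returns a new 1D array of the selected measurements
above = X[mask]
print(mask[2])   # Berlin row: only the 31 reading exceeds Berlin's own mean
print(above)
```

The same selection can be written with np.nonzero(mask), which returns the index arrays of the True positions.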

"A Puzzle A Day to Learn, Code, and Play!"
