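NumPy Basics
Translated by 呆鸟
天善智能 (business intelligence and big data community), www.hellobi.com

NumPy
NumPy is the core Python library for scientific computing in data science. It provides a high-performance multidimensional array object and tools for working with these arrays.

Asking for Help
>>> np.info(np.ndarray.dtype)

Inspecting Your Array
>>> a.shape            Array shape (number of rows and columns)
>>> len(a)             Length of the array
>>> b.ndim             Number of array dimensions
>>> e.size             Number of array elements
>>> b.dtype            Data type of the array elements
>>> b.dtype.name       Name of the data type
>>> b.astype(int)      Convert the array to a different data type

Subsetting
>>> a[2]               Select the element at index 2
3
>>> b[1,2]             Select the element at row 1, column 2 (equivalent to b[1][2])
6.0

Slicing
>>> a[0:2]             Select the elements at index 0 and 1
array([1, 2])
>>> b[0:2,1]           Select the elements in rows 0 and 1 of column 1
array([ 2., 5.])
>>> b[:1]              Select all elements in row 0 (equivalent to b[0:1, :])
array([[1.5, 2., 3.]])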
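The subsetting and slicing expressions above can be tried together in one short script. This is a minimal sketch that assumes the arrays a and b defined under the sheet's array-creation section; the result names (third, corner, and so on) are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

third = a[2]          # element at index 2
corner = b[1, 2]      # row 1, column 2 (same element as b[1][2])
first_two = a[0:2]    # elements at index 0 and 1
col1 = b[0:2, 1]      # column 1 of rows 0 and 1
row0 = b[:1]          # row 0, kept as a 2-D array (same as b[0:1, :])

print(third, corner, first_two, col1, row0.shape)
```

Note that b[:1] keeps two dimensions (shape (1, 3)), whereas b[0] would return a one-dimensional row.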
Creating Arrays
>>> a = np.array([1,2,3])
>>> b = np.array([(1.5,2,3), (4,5,6)], dtype = float)
>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]],
                 dtype = float)

Initial Placeholders
>>> np.zeros((3,4))                    Create an array of zeros
>>> np.ones((2,3,4),dtype=np.int16)    Create an array of ones
>>> d = np.arange(10,25,5)             Create an array of evenly spaced values (step value)
>>> np.linspace(0,2,9)                 Create an array of evenly spaced values (number of samples)
>>> e = np.full((2,2),7)               Create a constant array
>>> f = np.eye(2)                      Create a 2x2 identity matrix
>>> np.random.random((2,2))            Create an array of random values
>>> np.empty((3,2))                    Create an empty (uninitialized) array

Array Mathematics
>>> np.divide(a,b)                     Division
>>> a * b                              Multiplication
array([[ 1.5,  4. ,  9. ],
       [ 4. , 10. , 18. ]])
>>> np.multiply(a,b)                   Multiplication
>>> np.exp(b)                          Exponential (e**x)
>>> np.sqrt(b)                         Square root
>>> np.sin(a)                          Sine
>>> np.cos(b)                          Cosine
>>> np.log(a)                          Natural logarithm
>>> e.dot(f)                           Dot product
array([[ 7., 7.],
       [ 7., 7.]])

Comparison
>>> a == b                             Element-wise comparison
array([[False,  True,  True],
       [False, False, False]])
>>> a < 2                              Element-wise comparison
array([ True, False, False])
>>> np.array_equal(a, b)               Array-wise comparison

Array Manipulation
Transposing
>>> i = np.transpose(b)                Transpose the array
>>> i.T                                Transpose the array
Changing Array Shape
>>> b.ravel()                          Flatten the array
>>> g.reshape(3,-1)                    Reshape the array without changing its data
Adding or Removing Elements
>>> h.resize((2,6))                    Resize the array in place to shape (2,6)
>>> np.append(h,g)                     Append items to an array
>>> np.insert(a, 1, 5)                 Insert items into an array
>>> np.delete(a,[1])                   Delete items from an array
Combining Arrays
>>> np.concatenate((a,d),axis=0)       Concatenate arrays
array([ 1, 2, 3, 10, 15, 20])
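The arithmetic, comparison, and concatenation entries above can be checked in one place. This is a minimal sketch using the arrays a, b, d, e, and f as the sheet defines them; the result names are illustrative, and b stands in for the reshape example because g is not defined in this part of the sheet.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
d = np.arange(10, 25, 5)      # array([10, 15, 20])
e = np.full((2, 2), 7)
f = np.eye(2)

product = a * b               # a (shape (3,)) is broadcast across both rows of b
same = np.array_equal(product, np.multiply(a, b))
dot = e.dot(f)                # multiplying by the identity keeps the values
equal_mask = a == b           # element-wise comparison, shape (2, 3)
joined = np.concatenate((a, d), axis=0)

flat = b.ravel()              # 1-D array with all 6 elements
reshaped = b.reshape(3, -1)   # -1 lets NumPy infer the second dimension

print(product, dot, equal_mask, joined, reshaped.shape, sep="\n")
```

The operator form (a * b) and the function form (np.multiply(a, b)) give the same result; the function form is mainly useful when extra arguments such as out= are needed.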
Created By: Arianne Colton and Sean Chen

Numpy (Numerical Python)

What is NumPy?
Foundation package for scientific computing in Python

Why NumPy?
• Numpy ‘ndarray’ is a much more efficient way of storing and manipulating “numerical data” than the built-in Python data structures.
• Libraries written in lower-level languages, such as C, can operate on data stored in Numpy ‘ndarray’ without copying any data.

N-DIMENSIONAL ARRAY (NDARRAY)

What is NdArray?

** Default data type is ‘np.float64’. This is equivalent to Python’s float type, which is 8 bytes (64 bits); thus the name ‘float64’.
*** If casting were to fail for some reason, an exception such as ‘TypeError’ or ‘ValueError’ will be raised.

SLICING (INDEXING/SUBSETTING)
• Slicing (e.g. ndarray1[2:6]) is a ‘view’ on the original array. Data is NOT copied. Any modifications (e.g. ndarray1[2:6] = 8) to the ‘view’ will be reflected in the original array.
• Instead of a ‘view’, an explicit copy of a slice can be made with .copy(), e.g. ndarray1[2:6].copy()
• Boolean indexing: ndarray1[ndarray1 < 0] = 0 *
* If ndarray1 is two-dimensional, ndarray1 < 0 creates a two-dimensional boolean array.

COMMON OPERATIONS

1. Transposing
• A special form of reshaping which returns a ‘view’ on the underlying data without copying anything.
ndarray1.transpose() or
ndarray1.T or
ndarray1.swapaxes(0, 1)

2. Vectorized wrappers (for functions that take scalar values)
• math.sqrt() works on only a scalar
np.sqrt(seq1) # any sequence (list, ndarray, etc) to return a ndarray

5. Boolean arrays methods
If at least one value is ‘True’            ndarray1.any()
If all values are ‘True’                   ndarray1.all()
Note: These methods also work with non-boolean arrays, where non-zero elements evaluate to True.

6. Sorting
Inplace sorting                            ndarray1.sort()
Return a sorted copy instead of inplace    sorted1 = np.sort(ndarray1)

7. Set methods
Return sorted unique values                np.unique(ndarray1)
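The view-versus-copy behaviour of slices, Boolean-mask assignment, and the Boolean, sorting, and set methods listed above can be seen in a few lines. This is a minimal sketch; the sample values in ndarray1 and the names view, copy, has_any, and all_nonneg are made up for illustration.

```python
import numpy as np

ndarray1 = np.array([3, -1, 4, -1, 5, 9, -2, 6])

view = ndarray1[2:6]           # a view: shares memory with ndarray1
copy = ndarray1[2:6].copy()    # an independent copy of the same slice
view[0] = 8                    # also changes ndarray1[2]
copy[1] = 100                  # leaves ndarray1 untouched

ndarray1[ndarray1 < 0] = 0     # Boolean indexing: replace negatives with 0

has_any = (ndarray1 > 8).any()       # True if at least one element exceeds 8
all_nonneg = (ndarray1 >= 0).all()   # True if every element is non-negative

sorted1 = np.sort(ndarray1)          # sorted copy; ndarray1 keeps its order
unique_vals = np.unique(ndarray1)    # sorted unique values

print(ndarray1, view, copy, sorted1, unique_vals, sep="\n")
```

Because view shares memory with ndarray1, it also reflects the later mask assignment, while copy does not.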
KEY
We’ll use shorthand in this cheat sheet
arr - A numpy Array object

IMPORTS
Import these to start
import numpy as np
[Figure: one-, two-, and three-dimensional arrays with their axes labeled]
1D array: a.ndim = 1 (axis 0); a.shape = (5,) (five rows); a.size = 5 (5 elements)
2D array: a.ndim = 2 (axis 0 and axis 1); a.shape = (5, 4) (five rows, four cols); a.size = 20 (5*4=20 elements)
3D array: a.ndim = 3 (axis 0, axis 1, and axis 2); a.shape = (5, 4, 3) (5 rows, 4 cols, 3 levels); a.size = 60 (5*4*3=60 elements)
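The relationship between ndim, shape, and size described above can be confirmed directly. This is a minimal sketch; the zero-filled arrays a1, a2, and a3 are hypothetical stand-ins with the shapes named in the figure.

```python
import numpy as np

# Hypothetical arrays with the three shapes shown above
a1 = np.zeros(5)
a2 = np.zeros((5, 4))
a3 = np.zeros((5, 4, 3))

for arr in (a1, a2, a3):
    # size is always the product of the entries in shape
    print(arr.ndim, arr.shape, arr.size)

# Reducing along an axis removes that axis from the shape
col_sums = a2.sum(axis=0)   # collapses the 5 rows, leaving shape (4,)
row_sums = a2.sum(axis=1)   # collapses the 4 columns, leaving shape (5,)
```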
Goal: bring arrays with different shapes into the same shape during arithmetic operations. NumPy does that for you!

indices = np.array([[False, False, True],
                    [False, False, False],
                    [True, True, False]])
print(a[indices])
# [3 7 8]

We create two arrays “a” and “indices”. The first array contains two-dimensional numerical data (=data array). The second array has the same shape and contains Boolean values (=indexing array). You can use the indexing array for fine-grained data array access. This creates a new NumPy array from the data array containing only those elements for which the indexing array contains “True” Boolean values at the respective array positions. Thus, the resulting array contains the three values 3, 7, and 8.

Goal: find elements that meet a certain condition in a NumPy array
• Step 1: Understanding np.nonzero()

cities = np.array(["Hong Kong", "New York", "Berlin", "Montreal"])

# Find cities with above average pollution
polluted = set(cities[np.nonzero(X > np.average(X))[0]])
print(polluted)

The Boolean expression “X > np.average(X)” uses broadcasting to bring both operands to the same shape. Then it performs an element-wise comparison to determine a Boolean array that contains “True” if the respective measurement observed an above average AQI value. The function np.average() computes the average AQI value over all NumPy array elements. Boolean indexing accesses all city rows with above average pollution values.
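Both snippets above rely on arrays that are not shown here (the data array a and the measurement array X). The following sketch fills them in with assumed values so the examples run end to end: a is one possible 3x3 array that yields 3, 7, and 8, and the AQI numbers in X are made up, with one row per city.

```python
import numpy as np

# Assumed data array: any 3x3 array with 3, 7, 8 at the True positions works
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
indices = np.array([[False, False, True],
                    [False, False, False],
                    [True, True, False]])
selected = a[indices]      # 1-D array of the elements where indices is True

# Made-up AQI measurements: one row per city, one column per measurement
X = np.array([[42, 40, 41, 43, 44, 43],    # Hong Kong
              [30, 31, 29, 29, 29, 30],    # New York
              [8, 13, 31, 11, 11, 9],      # Berlin
              [11, 11, 12, 13, 11, 12]])   # Montreal
cities = np.array(["Hong Kong", "New York", "Berlin", "Montreal"])

# np.nonzero returns one index array per axis for the True entries
rows, cols = np.nonzero(X > np.average(X))
polluted = set(cities[rows])   # the set removes repeated city names

print(selected)
print(polluted)
```

Only the row indices are needed to look up city names, which is why the original expression takes element [0] of the np.nonzero() result.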