Topic 4 Aggregates
[2]: import numpy as np
L = np.random.random(100)
sum(L)
[2]: 51.93544860115952
The syntax of Python’s built-in sum is quite similar to that of NumPy’s sum function, and in the
simplest case the result is the same, up to floating-point rounding in the last digit:
[3]: np.sum(L)
[3]: 51.93544860115953
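The last-digit difference between the two outputs above can be reproduced with a short sketch. The seeded generator `rng` and the variable names are assumptions made for reproducibility; they are not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (assumption, for reproducibility)
L = rng.random(100)

builtin_total = sum(L)      # Python-level loop, strict left-to-right accumulation
numpy_total = np.sum(L)     # compiled loop; NumPy may add the values in a different order

# The two totals can differ in the last digit because of the order of the additions,
# so compare them with a tolerance rather than with ==
print(builtin_total, numpy_total)
print(np.isclose(builtin_total, numpy_total))
```

Comparing with np.isclose rather than == is the safer choice whenever two floating-point sums are computed by different routes.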
However, because it executes the operation in compiled code, NumPy’s version of the operation is
computed much more quickly:
[4]: big_array = np.random.rand(1000000)
%timeit sum(big_array)
%timeit np.sum(big_array)
94.2 ms ± 5.28 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
845 µs ± 102 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
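Outside IPython the %timeit magic is not available. A rough equivalent, assuming the standard-library timeit module and an arbitrary choice of three repetitions, might look like this:

```python
import timeit

import numpy as np

big_array = np.random.rand(1_000_000)

repeats = 3  # kept small because the built-in sum is slow on large arrays

# Average wall-clock time per call for each version
t_builtin = timeit.timeit(lambda: sum(big_array), number=repeats) / repeats
t_numpy = timeit.timeit(lambda: np.sum(big_array), number=repeats) / repeats

print(f"built-in sum: {t_builtin * 1e3:.2f} ms per call")
print(f"np.sum:       {t_numpy * 1e3:.2f} ms per call")
```

The absolute numbers depend on the machine, so only the ratio between the two timings is meaningful.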
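• Be careful: the built-in sum function and the np.sum function are not identical. Their optional
arguments have different meanings (the second argument of sum is a start value, while that of
np.sum is an axis), and np.sum is aware of multiple array dimensions.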
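A small sketch of how the two functions diverge on a two-dimensional array. The array `x` and the result names are illustrative choices, not taken from the notebook.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Built-in sum iterates over the first axis and adds the rows together
rows_added = sum(x)
print(rows_added)        # [3 5 7]

# np.sum reduces over every element by default
total = np.sum(x)
print(total)             # 15

# The second positional argument means different things:
with_start = sum(x, 1)   # start value 1 is added to the row sum
print(with_start)        # [4 6 8]
per_row = np.sum(x, 1)   # axis=1, i.e. one sum per row
print(per_row)           # [ 3 12]
```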
1.2 Minimum and Maximum
Similarly, Python has built-in min and max functions, used to find the minimum value and maximum
value of any given array:
[5]: min(big_array), max(big_array)
NumPy’s corresponding functions have similar syntax, and again operate much more quickly:
[6]: np.min(big_array), np.max(big_array)
[7]: %timeit min(big_array)
%timeit np.min(big_array)
58.7 ms ± 7.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
570 µs ± 47.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
For min, max, sum, and several other NumPy aggregates, a shorter syntax is to use methods of the
array object itself:
[8]: print(big_array.min(), big_array.max(), big_array.sum())
[10]: 5.661022791056153
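As a quick check of the method syntax described above, the sketch below confirms that each array method returns the same value as the corresponding NumPy function. The array is regenerated here so that the block runs on its own.

```python
import numpy as np

big_array = np.random.rand(1_000_000)

# Each method gives the same value as the matching NumPy function
assert big_array.min() == np.min(big_array)
assert big_array.max() == np.max(big_array)
assert big_array.sum() == np.sum(big_array)

print(big_array.min(), big_array.max(), big_array.sum())
```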
Aggregation functions take an additional argument specifying the axis along which the aggregate is
computed. For example, given a two-dimensional array M, we can find the minimum value within each
column by specifying axis=0:
[11]: M.min(axis=0)
The function returns four values, corresponding to the four columns of numbers.
Similarly, we can find the maximum value within each row:
[12]: M.max(axis=1)
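The axis behaviour can be sketched on a small array. The shape (3, 4) and the seeded generator are assumptions: the text only indicates that M has four columns.

```python
import numpy as np

rng = np.random.default_rng(42)   # seeded generator (assumption, for reproducibility)
M = rng.random((3, 4))            # hypothetical shape: 3 rows, 4 columns

col_min = M.min(axis=0)   # collapse the rows: one value per column
row_max = M.max(axis=1)   # collapse the columns: one value per row

print(col_min.shape)      # (4,)
print(row_max.shape)      # (3,)
print(M.sum())            # no axis given: aggregate over the whole array
```

The axis keyword names the dimension that is collapsed, not the one that is returned, so axis=0 aggregates down each column and axis=1 aggregates across each row.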