Topic 4 Aggregates

Aggregates

Pruthvish Rajput, Venus Patel

February 23, 2023

1 Aggregations: Min, Max, and Everything In Between

• A first step in data processing is often to compute summary statistics, such as:
– the mean and standard deviation
– the sum
– the product
– the median
– the minimum and maximum, quantiles, etc.

1.1 Summing the Values in an Array

As a quick example, consider computing the sum of all values in an array. Python itself can do this
using the built-in sum function:
[1]: import numpy as np

[2]: L = np.random.random(100)
sum(L)

[2]: 51.93544860115952

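The syntax is quite similar to that of NumPy’s sum function, and in the simplest case the result is
the same up to floating-point rounding (the last digit differs here because the two functions add
the values in a different order):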
[3]: np.sum(L)

[3]: 51.93544860115953

However, because it executes the operation in compiled code, NumPy’s version of the operation is
computed much more quickly:
[4]: big_array = np.random.rand(1000000)
%timeit sum(big_array)
%timeit np.sum(big_array)

94.2 ms ± 5.28 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
845 µs ± 102 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
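
The %timeit magic is only available inside IPython or Jupyter. Outside a notebook, a rough equivalent can be sketched with the standard-library timeit module. The array size, the number of repetitions, and the variable names below are illustrative choices, not part of the original notes, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

# Illustrative data: a smaller array than big_array so the sketch runs quickly.
rng = np.random.default_rng(0)
values = rng.random(100_000)

repeats = 5  # arbitrary number of repetitions for averaging

# Average time per call for the built-in sum and for np.sum.
builtin_seconds = timeit.timeit(lambda: sum(values), number=repeats) / repeats
numpy_seconds = timeit.timeit(lambda: np.sum(values), number=repeats) / repeats

print(f"built-in sum: {builtin_seconds * 1e3:.3f} ms per call")
print(f"np.sum:       {numpy_seconds * 1e3:.3f} ms per call")
```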
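• Be careful, though: the built-in sum function and np.sum are not identical. Their optional
arguments have different meanings (the second argument of sum is a start value, while the second
argument of np.sum is an axis), and np.sum is aware of multiple array dimensions.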
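
A small sketch of how the built-in sum and np.sum can diverge on a two-dimensional array. The array M_small and the result names are made up for illustration; only sum and np.sum themselves come from the notes.

```python
import numpy as np

# Illustrative 2 x 3 array: [[0, 1, 2], [3, 4, 5]]
M_small = np.arange(6).reshape(2, 3)

# The built-in sum iterates over the first axis and adds the rows together,
# so the result is still an array.
builtin_total = sum(M_small)      # array([3, 5, 7])

# np.sum aggregates over every element by default.
numpy_total = np.sum(M_small)     # 15

# The second positional argument means different things:
# for the built-in sum it is a start value, for np.sum it is the axis.
builtin_with_start = sum(M_small[0], 1)   # 1 + 0 + 1 + 2 = 4
numpy_with_axis = np.sum(M_small, 1)      # row sums: array([3, 12])

print(builtin_total, numpy_total, builtin_with_start, numpy_with_axis)
```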

1.2 Minimum and Maximum
Similarly, Python has built-in min and max functions, used to find the minimum value and maximum
value of any given array:
[5]: min(big_array), max(big_array)

[5]: (6.064240321013159e-08, 0.9999998126919177)

NumPy’s corresponding functions have similar syntax, and again operate much more quickly:
[6]: np.min(big_array), np.max(big_array)

[6]: (6.064240321013159e-08, 0.9999998126919177)

[7]: %timeit min(big_array)
%timeit np.min(big_array)

58.7 ms ± 7.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
570 µs ± 47.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

For min, max, sum, and several other NumPy aggregates, a shorter syntax is to use methods of the
array object itself:
[8]: print(big_array.min(), big_array.max(), big_array.sum())

6.064240321013159e-08 0.9999998126919177 500287.889621271

Whenever possible, make sure that you are using the NumPy version of these aggregates when
operating on NumPy arrays!

1.2.1 Multidimensional Aggregates

One common type of aggregation operation is an aggregate along a row or column. Say you have
some data stored in a two-dimensional array:
[9]: M = np.random.random((3, 4))
print(M)

[[0.28500679 0.68234357 0.06552604 0.04215306]
 [0.60573798 0.67655455 0.69527212 0.24607059]
 [0.89005827 0.13258705 0.35994861 0.97976416]]

By default, each NumPy aggregation function will return the aggregate over the entire array:
[10]: M.sum()

[10]: 5.661022791056153

Aggregation functions take an additional argument specifying the axis along which the aggregate is
computed. For example, we can find the minimum value within each column by specifying axis=0:

[11]: M.min(axis=0)

[11]: array([0.28500679, 0.13258705, 0.06552604, 0.04215306])

The function returns four values, corresponding to the four columns of numbers.
Similarly, we can find the maximum value within each row by specifying axis=1:
[12]: M.max(axis=1)

[12]: array([0.68234357, 0.69527212, 0.97976416])
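
The axis keyword names the dimension that is collapsed, not the one that is returned: axis=0 collapses the rows and leaves one value per column, while axis=1 collapses the columns and leaves one value per row. A minimal sketch with a small fixed array (grid and the result names are illustrative, chosen so the results are easy to check by hand):

```python
import numpy as np

# Illustrative 3 x 4 array with hand-checkable values.
grid = np.array([[1, 5, 2, 8],
                 [7, 3, 9, 4],
                 [6, 0, 2, 5]])

col_min = grid.min(axis=0)   # collapse rows    -> one value per column
row_max = grid.max(axis=1)   # collapse columns -> one value per row

print(col_min, col_min.shape)   # [1 0 2 4] (4,)
print(row_max, row_max.shape)   # [8 9 6] (3,)
```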

1.2.2 Other Aggregation Functions

Function Name    NaN-safe Version    Description

np.sum           np.nansum           Compute sum of elements
np.prod          np.nanprod          Compute product of elements
np.mean          np.nanmean          Compute mean of elements
np.std           np.nanstd           Compute standard deviation
np.var           np.nanvar           Compute variance
np.min           np.nanmin           Find minimum value
np.max           np.nanmax           Find maximum value
np.argmin        np.nanargmin        Find index of minimum value
np.argmax        np.nanargmax        Find index of maximum value
np.median        np.nanmedian        Compute median of elements
np.percentile    np.nanpercentile    Compute rank-based statistics of elements
np.any           N/A                 Evaluate whether any elements are true
np.all           N/A                 Evaluate whether all elements are true
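
A short sketch of how the NaN-safe versions differ from the plain aggregates when a value is missing. The array data and the result names are made up for illustration; the functions are the ones listed in the table.

```python
import numpy as np

# Illustrative data with one missing value marked as NaN.
data = np.array([1.0, np.nan, 3.0, 5.0])

plain_mean = np.mean(data)          # NaN propagates through the plain aggregate
safe_mean = np.nanmean(data)        # NaN is ignored: (1 + 3 + 5) / 3 = 3.0
safe_median = np.nanmedian(data)    # median of [1, 3, 5] = 3.0
safe_argmax = np.nanargmax(data)    # index of the largest non-NaN value = 3
has_missing = np.any(np.isnan(data))  # True if at least one element is NaN

print(plain_mean, safe_mean, safe_median, safe_argmax, has_missing)
```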
