Unit 5: NumPy
1. Arrays and Vectorized Computation
Definition:
NumPy (Numerical Python) is a library used for numerical computations in Python. It provides
powerful N-dimensional array objects and functions for operations on these arrays. The core of
NumPy is the ndarray (N-dimensional array).
Advantages of NumPy Arrays:
● Compact, typed memory storage and fast computation, because the core routines run in compiled code.
● Supports vectorized operations (element-wise operations without explicit Python loops).
● Ideal for handling large datasets in scientific computing.
Vectorized Computation:
Vectorization replaces explicit Python loops with array expressions. The looping happens inside NumPy's compiled code, which is usually much faster than iterating in Python.
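As a minimal sketch of this idea, the snippet below squares ten numbers twice: once with a Python loop and once with a single array expression. The variable names and sample values are illustrative only.

```python
import numpy as np

# Illustrative sample data: the integers 0..9 as a NumPy array.
values = np.arange(10)

# Loop version: square each element one at a time in Python.
squares_loop = [int(v) ** 2 for v in values]

# Vectorized version: one array expression, no explicit loop.
squares_vec = values ** 2

print(squares_loop)
print(squares_vec)
```

Both versions give the same numbers; the vectorized form is shorter and scales better to large arrays.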
Use Cases:
● Data analysis
● Machine learning preprocessing
● Simulations
2. The NumPy ndarray
Definition:
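An ndarray is the core data structure of NumPy. It represents an array with any number of dimensions whose elements all share one data type.
Creating ND arrays: pass a Python list (or a nested list for more dimensions) to np.array().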
Attributes of ndarray:
● ndim: Number of dimensions.
● shape: Tuple giving the size along each dimension, e.g. (rows, columns) for a 2D array.
● size: Total number of elements.
● dtype: Data type of array elements.
Example:
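A minimal sketch, using made-up sample values, that builds a small 2D array from a nested list and reads the four attributes listed above:

```python
import numpy as np

# Build a 2D array from a nested Python list (sample values are illustrative).
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(arr.ndim)   # number of dimensions: 2
print(arr.shape)  # size along each dimension: (2, 3)
print(arr.size)   # total number of elements: 6
print(arr.dtype)  # element type; the default integer type depends on the platform
```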
3. Creating ND Arrays
Methods to Create Arrays:
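● np.array(): Convert a Python list (or nested list) to an array.
● np.zeros((2,3)): 2x3 array of zeros (float64 by default).
● np.ones((2,3)): 2x3 array of ones (float64 by default).
● np.arange(0, 10, 2): Values from 0 up to, but not including, 10 with step 2, i.e. 0, 2, 4, 6, 8.
● np.linspace(0, 1, 5): 5 evenly spaced numbers from 0 to 1, with both endpoints included.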
Example:
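The creation functions listed above can be sketched as follows. The variable names are illustrative; the comments show the values each call produces.

```python
import numpy as np

from_list = np.array([1, 2, 3])   # from a Python list
zeros = np.zeros((2, 3))          # 2x3 array filled with 0.0
ones = np.ones((2, 3))            # 2x3 array filled with 1.0
evens = np.arange(0, 10, 2)       # 0, 2, 4, 6, 8 (the stop value 10 is excluded)
grid = np.linspace(0, 1, 5)       # 0.0, 0.25, 0.5, 0.75, 1.0 (both endpoints included)

for name, value in [("array", from_list), ("zeros", zeros), ("ones", ones),
                    ("arange", evens), ("linspace", grid)]:
    print(name, value, sep="\n")
```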
These methods help in scientific simulations and test data generation.
4. Data Types for ND Arrays
Purpose:
NumPy lets you choose a specific data type (dtype) for an array. This controls how much memory each element uses and keeps the data compatible with C-based libraries.
Common Data Types:
● int32, int64
● float32, float64
● bool
● complex64, complex128
Check/Change Data Types:
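A short sketch of checking and changing data types, assuming illustrative sample values: the dtype attribute reports the type, astype() converts to a new type, and the dtype argument sets the type at creation time.

```python
import numpy as np

a = np.array([1.7, 2.2, 3.9])     # Python floats become float64 by default
print(a.dtype)                     # float64

# astype() returns a new array; converting float to int drops the fractional part.
b = a.astype(np.int32)
print(b, b.dtype)                  # values 1, 2, 3 with dtype int32

# The dtype can also be fixed when the array is created.
c = np.array([1, 2, 3], dtype=np.float32)
print(c.dtype, c.itemsize)         # float32, 4 bytes per element
```

Note that astype() leaves the original array a unchanged.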
5. Arithmetic with NumPy Arrays
Element-wise Operations: arithmetic operators applied to two arrays of the same shape act on corresponding elements.
Operations:
● Addition: a + b
● Subtraction: a - b
● Multiplication: a * b
● Division: a / b (true division; the result is floating point)
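The four operations above can be sketched with two small arrays of the same shape. The values of a and b are made up for illustration.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([1, 2, 5])

total = a + b        # 11, 22, 35
difference = a - b   # 9, 18, 25
product = a * b      # 10, 40, 150 (element-wise, not a matrix product)
quotient = a / b     # 10.0, 10.0, 6.0 (always floating point)

print(total, difference, product, quotient, sep="\n")
```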
Scalar Operations: an operation between an array and a single number applies that number to every element.
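A brief sketch of scalar operations, using an illustrative array:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

doubled = a * 2    # 2, 4, 6, 8
shifted = a + 10   # 11, 12, 13, 14
halved = a / 2     # 0.5, 1.0, 1.5, 2.0
squared = a ** 2   # 1, 4, 9, 16

print(doubled, shifted, halved, squared, sep="\n")
```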
These operations run in NumPy's compiled code and are typically much faster than equivalent native Python loops.
6. Basic Indexing and Slicing
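Indexing: a[i] returns the element at position i. Positions start at 0, and negative positions count from the end.
Slicing: a[start:stop:step] returns the elements from start up to, but not including, stop.
2D Indexing: m[i, j] returns the element in row i and column j; slices can be used for either position, e.g. m[:, j] for a whole column.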
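The three forms can be sketched as follows. The arrays a and m hold made-up sample data, and the comments show the values selected.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

first = a[0]           # 10
last = a[-1]           # 50
middle = a[1:4]        # 20, 30, 40 (index 4 is excluded)
every_other = a[::2]   # 10, 30, 50

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

item = m[1, 2]         # row 1, column 2: 6
row = m[0]             # first row: 1, 2, 3
col = m[:, 1]          # second column: 2, 5, 8
block = m[:2, 1:]      # rows 0-1, columns 1-2: [[2, 3], [5, 6]]

# A basic slice is a view, so writing through it changes the original array.
middle[0] = 99
print(a)               # a[1] is now 99
```

Call .copy() on a slice when an independent array is needed.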
Slicing extracts sub-arrays efficiently because a basic slice is a view of the original data, not a copy.
7. Boolean Indexing
Definition:
Boolean indexing selects the elements of an array for which a boolean condition is True.
Example:
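A minimal sketch with a hypothetical array of test scores. A comparison produces a boolean mask, and indexing with that mask keeps only the elements where the mask is True.

```python
import numpy as np

scores = np.array([35, 82, 47, 91, 60])   # illustrative data

mask = scores > 50        # False, True, False, True, True
passed = scores[mask]     # 82, 91, 60

# Combine conditions with & (and) or | (or); each condition needs parentheses.
mid_range = scores[(scores >= 40) & (scores <= 85)]   # 82, 47, 60

# A boolean mask can also select elements for assignment.
capped = scores.copy()
capped[capped > 90] = 90  # 35, 82, 47, 90, 60

print(passed, mid_range, capped, sep="\n")
```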
Boolean indexing is helpful in data selection and preprocessing.
8. Transposing Arrays and Swapping Axes
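Transpose: a.T (or a.transpose()) reverses the order of the axes; for a 2D array it swaps rows and columns.
Swapping Axes: a.swapaxes(i, j) interchanges axis i and axis j and leaves the other axes in place.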
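A short sketch of both operations on small illustrative arrays; the shapes in the comments show how the axes move.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]], shape (2, 3)
t = m.T                            # [[0, 3], [1, 4], [2, 5]], shape (3, 2)

cube = np.arange(24).reshape(2, 3, 4)
swapped = cube.swapaxes(0, 2)      # axes 0 and 2 exchanged, shape (4, 3, 2)

print(t)
print(cube.shape, "->", swapped.shape)
```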
These operations are used to rearrange the axes of data for ML models and image processing.
9. Universal Functions (ufuncs): Fast Element-wise Functions
Definition:
Universal functions (ufuncs) such as np.add() and np.sqrt() operate element-wise on whole arrays. They run in compiled code, so they are faster than equivalent Python loops.
Examples:
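A minimal sketch of a unary ufunc (one input array) and two binary ufuncs (two input arrays), using illustrative values:

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0])
b = np.array([1.0, 2.0, 3.0])

roots = np.sqrt(a)          # 1.0, 2.0, 3.0
sums = np.add(a, b)         # 2.0, 6.0, 12.0 (same result as a + b)
larger = np.maximum(a, b)   # element-wise maximum: 1.0, 4.0, 9.0

print(roots, sums, larger, sep="\n")
```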
Other Examples:
● np.exp(a), np.sin(a), np.log(a), np.power(a, 2)
They allow fast mathematical computations on arrays.
10. Mathematical and Statistical Methods
Functions:
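● sum(), mean(), std(), var(), min(), max(): aggregate over the whole array, or along one axis when axis is given.
● argmax(), argmin(): index of the largest and smallest element.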
Example:
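A short sketch on a small illustrative 2D array. Without an axis argument the methods reduce over all elements; axis=0 reduces down the columns and axis=1 across the rows.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = data.sum()              # 21
average = data.mean()           # 3.5
largest = data.max()            # 6
where_largest = data.argmax()   # 5 (position in the flattened array)

col_sums = data.sum(axis=0)     # one value per column: 5, 7, 9
row_means = data.mean(axis=1)   # one value per row: 2.0, 5.0

spread = data.std()             # population standard deviation (ddof=0 by default)

print(total, average, largest, where_largest)
print(col_sums, row_means, spread)
```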
Statistical analysis is vital in data science and numerical modeling.
11. Sorting, Unique, and Set Logic
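Sorting: np.sort(a) returns a sorted copy of the array; a.sort() sorts the array in place.
Unique: np.unique(a) returns the distinct values of the array in sorted order.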
Set Operations:
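● np.intersect1d(a, b): sorted values present in both arrays.
● np.union1d(a, b): sorted values present in either array.
● np.setdiff1d(a, b): sorted values present in a but not in b.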
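Sorting, unique values, and the set operations above can be sketched with two small illustrative arrays:

```python
import numpy as np

a = np.array([3, 1, 2, 3, 1])
b = np.array([2, 3, 4])

sorted_a = np.sort(a)            # 1, 1, 2, 3, 3 (a itself is unchanged)
unique_a = np.unique(a)          # 1, 2, 3
common = np.intersect1d(a, b)    # 2, 3
combined = np.union1d(a, b)      # 1, 2, 3, 4
only_in_a = np.setdiff1d(a, b)   # 1

print(sorted_a, unique_a, common, combined, only_in_a, sep="\n")
```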
Used in data cleaning and analysis to identify overlaps and differences.
12. Data Visualization
Definition:
Visualization helps interpret data patterns using graphs and plots.
Libraries:
● matplotlib.pyplot
● seaborn
Examples:
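A minimal sketch using matplotlib.pyplot only, assuming matplotlib is installed. It draws a line plot and a histogram side by side from NumPy arrays; the data, titles, and variable names are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# Data for the line plot: one period of a sine wave.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

# Data for the histogram: seeded random samples so the result is reproducible.
rng = np.random.default_rng(0)
samples = rng.normal(size=500)

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, y)
ax_line.set_title("Line plot: sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("sin(x)")

ax_hist.hist(samples, bins=20)
ax_hist.set_title("Histogram of random samples")

fig.tight_layout()
# plt.show()  # uncomment to open the figure window in an interactive session
```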
Common Plots:
● Line plot
● Bar chart
● Histogram
● Scatter plot
● Pie chart
Used in data exploration, presentation, and machine learning.