🔢 Complete NumPy Basics Guide
What is NumPy?
NumPy (Numerical Python) is the fundamental package for scientific computing in Python. It provides support for large, multi-dimensional arrays and
matrices, along with mathematical functions to operate on them.
1. Installation & Import
# Install NumPy (run in a terminal)
pip install numpy
# Import NumPy (in Python)
import numpy as np
2. Creating Arrays
From Lists
# 1D Array
arr1d = np.array([1, 2, 3, 4, 5])
print(arr1d)
Output: [1 2 3 4 5]
Explanation: np.array() converts a Python list into a NumPy array. Notice no commas in output - this is NumPy array format.
# 2D Array (Matrix)
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d)
print("Shape:", arr2d.shape)
Output: [[1 2 3]
[4 5 6]]
Shape: (2, 3)
Explanation: Double brackets [[]] create a 2D array. .shape returns (rows, columns) - here 2 rows, 3 columns.
Built-in Array Creation Functions
# Array of zeros
zeros = np.zeros((3, 4))
print("Zeros:\n", zeros)
# Array of ones
ones = np.ones((2, 3))
print("Ones:\n", ones)
# Identity matrix
eye = np.eye(3)
print("Identity:\n", eye)
# Array with range of values
arange = np.arange(0, 10, 2)
print("Arange:", arange)
# Array with evenly spaced values
linspace = np.linspace(0, 10, 5)
print("Linspace:", linspace)
Zeros:
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
Ones:
[[1. 1. 1.]
[1. 1. 1.]]
Identity:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
Arange: [0 2 4 6 8]
Linspace: [ 0. 2.5 5. 7.5 10. ]
Explanation:
np.zeros((3,4)) - Creates 3×4 array filled with zeros
np.ones((2,3)) - Creates 2×3 array filled with ones
np.eye(3) - Creates 3×3 identity matrix (1s on diagonal, 0s elsewhere)
np.arange(0,10,2) - Like Python's range: start=0, stop=10, step=2
np.linspace(0,10,5) - 5 evenly spaced numbers between 0 and 10
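The difference between arange and linspace at the stop value is easy to miss. The short sketch below reuses the calls from this section and adds the endpoint=False option of np.linspace for comparison; the variable names a, b and c are only illustrative.

```python
import numpy as np

# arange excludes the stop value, like Python's range
a = np.arange(0, 10, 2)                     # 0, 2, 4, 6, 8

# linspace includes both endpoints by default
b = np.linspace(0, 10, 5)                   # 0, 2.5, 5, 7.5, 10

# endpoint=False drops the stop value and adjusts the spacing
c = np.linspace(0, 10, 5, endpoint=False)   # 0, 2, 4, 6, 8

print(a)
print(b)
print(c)
```

Note that arange returns integers here because all of its arguments are integers, while linspace always returns floats.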
3. Array Properties
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print("Array:\n", arr)
print("Shape:", arr.shape)  # Dimensions
print("Size:", arr.size)    # Total elements
print("Dtype:", arr.dtype)  # Data type
print("Ndim:", arr.ndim)    # Number of dimensions
Array:
[[1 2 3 4]
[5 6 7 8]]
Shape: (2, 4)
Size: 8
Dtype: int64
Ndim: 2
Explanation:
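.shape - Length of each dimension; for a 2D array this is (rows, columns)
.size - Total number of elements (2×4=8)
.dtype - Data type of the elements (int64 = 64-bit integers; the default integer type can be int32 on some platforms)
.ndim - Number of dimensions (2D array = 2)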
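The same properties apply to arrays with any number of dimensions. As a small sketch, here they are for a 1D and a 3D array; the names vec and cube are only illustrative.

```python
import numpy as np

vec = np.array([1, 2, 3])        # 1D array
cube = np.zeros((2, 3, 4))       # 3D array: 2 blocks of 3 rows and 4 columns

print(vec.shape, vec.ndim)       # (3,) 1
print(cube.shape, cube.ndim)     # (2, 3, 4) 3
print(cube.size)                 # 24  (2 * 3 * 4)
print(cube.dtype)                # float64 (the default for np.zeros)
```

A 1D array's shape is a one-element tuple such as (3,), not (1, 3) or (3, 1).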
4. Array Reshaping & Manipulation
Reshape
arr = np.array([1, 2, 3, 4, 5, 6])
print("Original:", arr)
reshaped = arr.reshape(2, 3)
print("Reshaped (2x3):\n", reshaped)
reshaped2 = arr.reshape(3, 2)
print("Reshaped (3x2):\n", reshaped2)
Original: [1 2 3 4 5 6]
Reshaped (2x3):
[[1 2 3]
[4 5 6]]
Reshaped (3x2):
[[1 2]
[3 4]
[5 6]]
Explanation: reshape() changes array dimensions but keeps same data. Total elements must match: 6 elements can become 2×3 or 3×2, but not 2×4.
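To illustrate the element-count rule, the sketch below uses -1 to let NumPy infer one dimension and then attempts the invalid 2×4 reshape mentioned above. The variable names are only illustrative.

```python
import numpy as np

arr = np.arange(1, 7)            # [1 2 3 4 5 6]

# -1 asks NumPy to work out that dimension from the element count
auto = arr.reshape(2, -1)
print(auto.shape)                # (2, 3)

# 6 elements cannot fill a 2x4 array, so NumPy raises ValueError
try:
    arr.reshape(2, 4)
except ValueError as exc:
    error_message = str(exc)
    print("reshape failed:", error_message)
```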
Flatten
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
flattened = arr2d.flatten()
print("Original:\n", arr2d)
print("Flattened:", flattened)
Original:
[[1 2 3]
[4 5 6]]
Flattened: [1 2 3 4 5 6]
Explanation: flatten() returns a 1D copy of any multi-dimensional array. It reads row by row: first row [1,2,3], then second row [4,5,6].
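A related method, ravel(), is not covered above but is worth contrasting with flatten(): flatten() always copies, while ravel() returns a view when it can. A minimal sketch of the difference:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr2d.flatten()   # always a new, independent array
flat_copy[0] = 99             # arr2d is not affected

flat_view = arr2d.ravel()     # a view here, because arr2d is contiguous
flat_view[1] = -1             # this also changes arr2d

print(arr2d)
# [[ 1 -1  3]
#  [ 4  5  6]]
```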
5. Array Stacking
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
# Vertical stack (row-wise)
vstack = np.vstack((arr1, arr2))
print("Vertical Stack:\n", vstack)
# Horizontal stack (column-wise)
hstack = np.hstack((arr1, arr2))
print("Horizontal Stack:\n", hstack)
Vertical Stack:
[[1 2]
[3 4]
[5 6]
[7 8]]
Horizontal Stack:
[[1 2 5 6]
[3 4 7 8]]
Explanation:
vstack() - Stacks arrays vertically (adds more rows)
hstack() - Stacks arrays horizontally (adds more columns)
Arrays must have compatible shapes for stacking
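What "compatible shapes" means in practice: vstack needs matching column counts and hstack needs matching row counts. The sketch below shows one case that works and one that fails, plus the equivalent np.concatenate call; the array names are only illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])        # shape (2, 2)
row = np.array([[5, 6]])              # shape (1, 2)

# Column counts match (2 and 2), so vertical stacking works
stacked = np.vstack((a, row))
print(stacked.shape)                  # (3, 2)

# For 2D arrays, vstack is the same as concatenating along axis 0
same = np.concatenate((a, row), axis=0)
print(np.array_equal(stacked, same))  # True

# Row counts differ (2 and 1), so horizontal stacking fails
try:
    np.hstack((a, row))
except ValueError as exc:
    hstack_error = str(exc)
    print("hstack failed:", hstack_error)
```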
6. Mathematical Operations
Basic Arithmetic
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])
print("arr1:", arr1)
print("arr2:", arr2)
print("Addition:", arr1 + arr2)
print("Subtraction:", arr1 - arr2)
print("Multiplication:", arr1 * arr2)
print("Division:", arr1 / arr2)
print("Power:", arr1 ** 2)
arr1: [1 2 3 4]
arr2: [5 6 7 8]
Addition: [ 6 8 10 12]
Subtraction: [-4 -4 -4 -4]
Multiplication: [ 5 12 21 32]
Division: [0.2 0.33333333 0.42857143 0.5 ]
Power: [ 1 4 9 16]
Explanation: All operations happen element-wise: [1,2,3,4] + [5,6,7,8] = [1+5, 2+6, 3+7, 4+8] = [6,8,10,12]. This is called vectorization - much faster than Python
loops!
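To make the comparison with loops concrete, here is a small sketch that computes the same sum both ways. It only checks that the results agree; it does not measure speed, and the names looped and vectorized are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Explicit Python loop: one addition per iteration
looped = [int(x) + int(y) for x, y in zip(arr1, arr2)]

# Vectorized: a single expression over the whole arrays
vectorized = arr1 + arr2

# Operations with a scalar are applied to every element too
scaled = arr1 * 10

print(looped)                # [6, 8, 10, 12]
print(vectorized.tolist())   # [6, 8, 10, 12]
print(scaled.tolist())       # [10, 20, 30, 40]
```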
Statistical Functions
arr = np.array([1, 2, 3, 4, 5])
print("Array:", arr)
print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))
print("Median:", np.median(arr))
print("Standard deviation:", np.std(arr))
print("Min:", np.min(arr))
print("Max:", np.max(arr))
print("Argmin (index of min):", np.argmin(arr))
print("Argmax (index of max):", np.argmax(arr))
Array: [1 2 3 4 5]
Sum: 15
Mean: 3.0
Median: 3.0
Standard deviation: 1.4142135623730951
Min: 1
Max: 5
Argmin (index of min): 0
Argmax (index of max): 4
Explanation:
sum() - Adds all elements: 1+2+3+4+5=15
mean() - Average: 15/5=3.0
argmin()/argmax() - Returns INDEX of min/max, not the value itself
Min value 1 is at index 0, max value 5 is at index 4
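Two details of these functions are worth a short sketch: the axis argument, which applies the function per column or per row of a 2D array, and the ddof argument of np.std, which switches between the population formula (the default, used above) and the sample formula. The array name data is only illustrative.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = np.sum(data)               # 21: all elements
col_sums = np.sum(data, axis=0)    # [5 7 9]: one sum per column
row_sums = np.sum(data, axis=1)    # [ 6 15]: one sum per row

arr = np.array([1, 2, 3, 4, 5])
pop_std = np.std(arr)              # divides by n (ddof=0), about 1.414
sample_std = np.std(arr, ddof=1)   # divides by n - 1, about 1.581

print(total, col_sums, row_sums)
print(pop_std, sample_std)
```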
7. Array Indexing & Slicing
1D Array Indexing
arr = np.array([10, 20, 30, 40, 50])
print("Array:", arr)
print("First element:", arr[0])
print("Last element:", arr[-1])
print("Slice [1:4]:", arr[1:4])
print("Every 2nd element:", arr[::2])
Array: [10 20 30 40 50]
First element: 10
Last element: 50
Slice [1:4]: [20 30 40]
Every 2nd element: [10 30 50]
Explanation:
arr[0] - First element (index starts from 0)
arr[-1] - Last element (negative indexing from end)
arr[1:4] - Elements from index 1 to 3 (4 is excluded)
arr[::2] - Every 2nd element (step=2)
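One behavior that differs from Python lists: a NumPy slice is a view of the original array, not a copy. The sketch below shows the effect and how .copy() avoids it; the variable names are only illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

view = arr[1:4]                # a view: shares memory with arr
view[0] = 99                   # also changes arr[1]
print(arr)                     # [10 99 30 40 50]

independent = arr[1:4].copy()  # an explicit copy
independent[0] = 0             # arr is not affected
print(arr)                     # [10 99 30 40 50]
```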
2D Array Indexing
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("2D Array:\n", arr2d)
print("Element at [1,2]:", arr2d[1, 2])
print("First row:", arr2d[0, :])
print("Second column:", arr2d[:, 1])
print("Subarray [0:2, 1:3]:\n", arr2d[0:2, 1:3])
2D Array:
[[1 2 3]
[4 5 6]
[7 8 9]]
Element at [1,2]: 6
First row: [1 2 3]
Second column: [2 5 8]
Subarray [0:2, 1:3]:
[[2 3]
[5 6]]
Explanation:
arr2d[1,2] - Row 1, Column 2 (value 6)
arr2d[0,:] - Row 0, all columns (first row)
arr2d[:,1] - All rows, column 1 (second column)
arr2d[0:2, 1:3] - Rows 0-1, columns 1-2 (submatrix)
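A point that often causes confusion is the shape of the result: indexing with an integer removes that dimension, while a slice keeps it. A brief sketch using the same 3×3 array, with illustrative variable names:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

col_1d = arr2d[:, 1]       # integer index: result is 1D, shape (3,)
col_2d = arr2d[:, 1:2]     # slice: result stays 2D, shape (3, 1)
last_row = arr2d[-1]       # negative index: last row, [7 8 9]
corner = arr2d[-1, -1]     # last row, last column: 9

print(col_1d.shape, col_2d.shape)   # (3,) (3, 1)
print(last_row, corner)
```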
8. Boolean Indexing
arr = np.array([1, 2, 3, 4, 5, 6])
# Boolean condition
condition = arr > 3
print("Array:", arr)
print("Condition (arr > 3):", condition)
print("Elements > 3:", arr[condition])
# Direct boolean indexing
print("Elements < 4:", arr[arr < 4])
Array: [1 2 3 4 5 6]
Condition (arr > 3): [False False False True True True]
Elements > 3: [4 5 6]
Elements < 4: [1 2 3]
Explanation:
arr > 3 creates a boolean array (True/False for each element)
arr[condition] returns only elements where condition is True
arr[arr < 4] combines condition and indexing in one line
This is powerful for filtering data based on conditions!
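Filtering often needs more than one condition. As a sketch, conditions can be combined with & (and) and | (or), each wrapped in parentheses; Python's "and"/"or" keywords do not work element-wise. The example also shows np.where and assignment through a mask, which go slightly beyond the text above.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

between = arr[(arr > 2) & (arr < 6)]   # both conditions hold: [3 4 5]
outside = arr[(arr < 2) | (arr > 5)]   # either condition holds: [1 6]

# np.where(condition, a, b): take a where True, b where False
capped = np.where(arr > 3, 3, arr)     # [1 2 3 3 3 3]

# A boolean mask can also select elements to overwrite
zeroed = arr.copy()
zeroed[zeroed > 4] = 0                 # [1 2 3 4 0 0]

print(between, outside, capped, zeroed)
```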
9. Random Numbers
# Set seed for reproducibility
np.random.seed(42)
# Random integers
rand_int = np.random.randint(1, 10, size=5)
print("Random integers:", rand_int)
# Random floats between 0 and 1
rand_float = np.random.rand(3, 3)
print("Random floats:\n", rand_float)
# Random normal distribution
rand_normal = np.random.randn(5)
print("Random normal:", rand_normal)
# Random choice from array
choices = np.random.choice([1, 2, 3, 4, 5], size=3)
print("Random choices:", choices)
Sample output (illustrative; the exact values depend on the order of the calls after seeding):
Random integers: [7 4 8 9 9]
Random floats:
[[0.37454012 0.95071431 0.73199394]
[0.59865848 0.15601864 0.15599452]
[0.05808361 0.86617615 0.60111501]]
Random normal: [ 1.32765 -0.234153 1.46210794 -0.20515826 0.3130677 ]
Random choices: [4 2 1]
Explanation:
np.random.seed(42) - Sets seed for reproducible results
randint(1,10,size=5) - 5 random integers from 1 to 9 (the upper bound 10 is excluded)
rand(3,3) - 3×3 array of random floats in the range [0, 1), so 1 itself is never returned
randn(5) - 5 random numbers from normal distribution (mean=0, std=1)
choice() - Randomly picks from given array
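The functions above belong to NumPy's legacy random interface. Newer NumPy versions (1.17 and later) also offer a Generator object created with np.random.default_rng. The sketch below shows rough equivalents of the calls above; it is an alternative to, not a replacement for, the code in this section, and no output is shown because the values differ from the legacy functions.

```python
import numpy as np

rng = np.random.default_rng(42)              # seeded Generator

ints = rng.integers(1, 10, size=5)           # integers from 1 to 9 (10 excluded)
floats = rng.random((3, 3))                  # floats in [0, 1)
normals = rng.standard_normal(5)             # mean 0, std 1
picks = rng.choice([1, 2, 3, 4, 5], size=3)  # random picks from the list

# A second generator with the same seed repeats the same sequence
repeat_ints = np.random.default_rng(42).integers(1, 10, size=5)
print(np.array_equal(ints, repeat_ints))     # True
```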
10. Useful NumPy Functions
Function      Description               Example
np.round()    Round to given decimals   np.round([1.234, 2.567], 2)
np.abs()      Absolute values           np.abs([-1, -2, 3])
np.sqrt()     Square root               np.sqrt([4, 9, 16])
np.exp()      Exponential               np.exp([1, 2, 3])
np.log()      Natural logarithm         np.log([1, 2, 3])
np.sort()     Sort array                np.sort([3, 1, 4, 2])
np.unique()   Unique elements           np.unique([1, 1, 2, 2, 3])
Examples:
# Rounding
arr = np.array([1.234, 2.567, 3.891])
print("Original:", arr)
print("Rounded to 2 decimals:", np.round(arr, 2))
# Sorting
unsorted = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print("Unsorted:", unsorted)
print("Sorted:", np.sort(unsorted))
# Unique values
with_duplicates = np.array([1, 1, 2, 2, 3, 3, 3])
print("With duplicates:", with_duplicates)
print("Unique:", np.unique(with_duplicates))
Original: [1.234 2.567 3.891]
Rounded to 2 decimals: [1.23 2.57 3.89]
Unsorted: [3 1 4 1 5 9 2 6]
Sorted: [1 1 2 3 4 5 6 9]
With duplicates: [1 1 2 2 3 3 3]
Unique: [1 2 3]
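A few common variations on these functions, sketched with the same data: counting duplicates with return_counts, getting the sort order with np.argsort (not in the table above), and sorting in descending order by reversing the result. The variable names are only illustrative.

```python
import numpy as np

with_duplicates = np.array([1, 1, 2, 2, 3, 3, 3])
values, counts = np.unique(with_duplicates, return_counts=True)
print(values, counts)            # [1 2 3] [2 2 3]

unsorted = np.array([3, 1, 4, 1, 5, 9, 2, 6])

# argsort returns the indices that would sort the array;
# kind="stable" keeps equal values in their original order
order = np.argsort(unsorted, kind="stable")
print(order)                     # [1 3 6 0 2 4 7 5]

# np.sort is ascending only; reverse it for descending order
descending = np.sort(unsorted)[::-1]
print(descending)                # [9 6 5 4 3 2 1 1]
```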
11. Data Type Conversion
# Integer to float
int_arr = np.array([1, 2, 3, 4])
float_arr = int_arr.astype(float)
print("Integer array:", int_arr, "dtype:", int_arr.dtype)
print("Float array:", float_arr, "dtype:", float_arr.dtype)
# Float to integer
float_arr2 = np.array([1.7, 2.3, 3.9])
int_arr2 = float_arr2.astype(int)
print("Float array:", float_arr2)
print("Integer array:", int_arr2)  # Note: truncation, not rounding
Integer array: [1 2 3 4] dtype: int64
Float array: [1. 2. 3. 4.] dtype: float64
Float array: [1.7 2.3 3.9]
Integer array: [1 2 3]
Important: When converting float to int, NumPy truncates toward zero (it doesn't round). Apply np.round() before astype(int) if you want rounding behavior.
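The difference between truncating, rounding and flooring shows up most clearly with a negative value. A minimal sketch, with -1.7 added to the array from the example above for illustration:

```python
import numpy as np

floats = np.array([1.7, 2.3, 3.9, -1.7])

truncated = floats.astype(int)           # drops the fraction: [ 1  2  3 -1]
rounded = np.round(floats).astype(int)   # nearest integer:    [ 2  2  4 -2]
floored = np.floor(floats).astype(int)   # rounds down:        [ 1  2  3 -2]

print(truncated, rounded, floored)
```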
12. Practical Example: Matrix Operations
# Create matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print("Matrix A:\n", A)
print("Matrix B:\n", B)
# Element-wise multiplication
element_wise = A * B
print("Element-wise multiplication:\n", element_wise)
# Matrix multiplication (dot product)
matrix_mult = np.dot(A, B)
print("Matrix multiplication:\n", matrix_mult)
# Transpose
transpose_A = A.T
print("Transpose of A:\n", transpose_A)
Matrix A:
[[1 2]
[3 4]]
Matrix B:
[[5 6]
[7 8]]
Element-wise multiplication:
[[ 5 12]
[21 32]]
Matrix multiplication:
[[19 22]
[43 50]]
Transpose of A:
[[1 3]
[2 4]]
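For 2D arrays, the @ operator gives the same result as np.dot and is often easier to read. The sketch below uses the same A and B to confirm this, and also shows that matrix multiplication depends on the order of the operands while element-wise multiplication does not.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# For 2D arrays, @ and np.dot compute the same matrix product
print(np.array_equal(A @ B, np.dot(A, B)))   # True

# Order matters for matrix multiplication
AB = A @ B        # [[19 22], [43 50]]
BA = B @ A        # [[23 34], [31 46]]
print(np.array_equal(AB, BA))                # False

# Element-wise multiplication gives the same result in either order
print(np.array_equal(A * B, B * A))          # True
```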
Quick Tips:
• Use np.array() to create arrays from lists
• .shape tells you dimensions, .size tells you total elements
• Use reshape() to change array dimensions
• * is element-wise multiplication, np.dot() is matrix multiplication
• Boolean indexing is powerful for filtering data
• Always check data types with .dtype
13. Common Patterns & Best Practices
# Creating arrays with specific data types
arr_int = np.array([1, 2, 3], dtype=np.int32)
arr_float = np.array([1, 2, 3], dtype=np.float64)
print("Int32 array:", arr_int, "dtype:", arr_int.dtype)
print("Float64 array:", arr_float, "dtype:", arr_float.dtype)
# Vectorized operations (faster than loops)
# Instead of loops, use vectorized operations
arr = np.array([1, 2, 3, 4, 5])
squared = arr ** 2  # Much faster than a for loop
print("Original:", arr)
print("Squared:", squared)
# Broadcasting example
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_1d = np.array([10, 20, 30])
result = arr_2d + arr_1d  # Broadcasting
print("2D array:\n", arr_2d)
print("1D array:", arr_1d)
print("Broadcasted addition:\n", result)
Int32 array: [1 2 3] dtype: int32
Float64 array: [1. 2. 3.] dtype: float64
Original: [1 2 3 4 5]
Squared: [ 1 4 9 16 25]
2D array:
[[1 2 3]
[4 5 6]]
1D array: [10 20 30]
Broadcasted addition:
[[11 22 33]
[14 25 36]]
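Broadcasting works when, comparing shapes from the last dimension backwards, each pair of sizes is either equal or one of them is 1. The sketch below adds a column of shape (2, 1) to the same 2D array, then shows a shape that cannot be broadcast; the names col and bad are only illustrative.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)

# Shape (2, 1) stretches across the 3 columns
col = np.array([[100], [200]])
by_row = arr_2d + col
print(by_row)
# [[101 102 103]
#  [204 205 206]]

# Shape (2,) is compared with the last dimension (3), so it does not fit
bad = np.array([10, 20])
try:
    arr_2d + bad
except ValueError as exc:
    broadcast_error = str(exc)
    print("broadcast failed:", broadcast_error)
```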
🎯 Summary:
This guide covers the essential NumPy operations you'll use most often. NumPy is the foundation for many Python scientific computing libraries, including Pandas,
Scikit-learn, and Matplotlib. Master these basics, and you'll have a solid foundation for data science and scientific computing!