03 Numpy
What is NumPy?
NumPy (Numerical Python) is a Python library used for numerical computing. It provides
support for multi-dimensional arrays, mathematical operations, and linear algebra, making it
essential for scientific computing, data analysis, and machine learning.
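import numpy as np
arr1 = np.array([1, 2, 3])             # 1D array (illustrative values; the original definitions are missing)
arr2 = np.array([[1, 2], [3, 4]])      # 2D array (illustrative values)
zeros_array, ones_array = np.zeros((2, 3)), np.ones((2, 3))  # 2x3 arrays of zeros and ones (assumed shapes)
random_array = np.random.rand(2, 3)    # 2x3 array of uniform random values in [0, 1)
print(arr1)
print(arr2)
print(zeros_array)
print(ones_array)
print(random_array)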
6. Statistical Functions
arr = np.array([1, 2, 3, 4, 5])
print(np.mean(arr)) # Output: 3.0 (Mean)
print(np.median(arr)) # Output: 3.0 (Median)
print(np.std(arr)) # Output: 1.4142135623730951 (population standard deviation, ddof=0)
2. 2D Array (Matrix)
A two-dimensional array represents data in rows and columns.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix)
print(matrix.shape) # Output: (2, 3) → 2 rows, 3 columns
3. 3D Array (Tensor)
A three-dimensional array is useful for handling multi-layered data, like images.
tensor = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(tensor.shape) # Output: (2, 2, 2) → 2 layers, 2 rows, 2 columns
🔹 Supports various data types: int32, int64, float32, float64, complex, etc.
5. ndarray.itemsize → Memory Size of Each Element
📌 Returns the size (in bytes) of one element in the array.
print(arr.itemsize) # Output: 8 (for int64), 4 (for int32)
int64       64-bit signed integer (large range)   8 bytes
🔹 Example:
import numpy as np
arr = np.array([1, 2, 3], dtype=np.int32)
print(arr.dtype) # Output: int32
B. Floating-Point Types
Data Type   Description                           Memory
float16     16-bit floating point                 2 bytes
float32     32-bit floating point                 4 bytes
float64     64-bit floating point (default)       8 bytes
🔹 Example:
arr = np.array([1.2, 2.3, 3.4], dtype=np.float32)
print(arr.dtype) # Output: float32
2. Boolean Type
Data Type   Description
bool_       Boolean (True or False)
🔹 Example:
arr = np.array([True, False, True], dtype=np.bool_)
print(arr.dtype) # Output: bool
3. Complex Number Type
Data Type    Description                          Memory
complex64    Complex number with 32-bit floats    8 bytes
complex128   Complex number with 64-bit floats    16 bytes
🔹 Example:
arr = np.array([1+2j, 3+4j], dtype=np.complex64)
print(arr.dtype) # Output: complex64
4. String Type
Data Type   Description
str_        Fixed-size Unicode string
🔹 Example:
arr = np.array(["apple", "banana", "cherry"], dtype=np.str_)
print(arr.dtype) # Output: <U6 (Unicode strings of up to 6 characters, the length of the longest item)
object_     Stores mixed data types
🔹 Example:
arr = np.array([1, "hello", 3.14], dtype=np.object_)
print(arr.dtype) # Output: object
1. Indexing in NumPy
NumPy supports zero-based indexing, meaning the first element has an index of 0.
A. Indexing in 1D Arrays
import numpy as np
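The example for this subsection appears to be missing. A minimal sketch of zero-based indexing on a 1D array follows; the array values and variable names are illustrative assumptions, not the original data.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])  # illustrative values (assumed)

first = arr[0]   # index 0 is the first element
third = arr[2]   # index 2 is the third element
last = arr[-1]   # negative indices count back from the end

print(first, third, last)
```

Negative indices are often the most convenient way to reach the end of an array without knowing its length.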
B. Indexing in 2D Arrays
For 2D arrays, use [row, column] indexing.
arr_2d = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
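A short sketch of [row, column] indexing on the array above. The specific elements picked out here are chosen for illustration.

```python
import numpy as np

arr_2d = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print(arr_2d[0, 0])    # row 0, column 0 -> 1
print(arr_2d[1, 2])    # row 1, column 2 -> 6
print(arr_2d[-1, -1])  # last row, last column -> 9
print(arr_2d[1])       # a single index selects a whole row -> [4 5 6]
```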
C. Indexing in 3D Arrays
For 3D arrays, use [depth, row, column] indexing.
arr_3d = np.array([
[[1, 2, 3], [4, 5, 6]], # First layer
[[7, 8, 9], [10, 11, 12]] # Second layer
])
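A short sketch of [depth, row, column] indexing on the array above. The elements selected are chosen for illustration.

```python
import numpy as np

arr_3d = np.array([
    [[1, 2, 3], [4, 5, 6]],      # First layer
    [[7, 8, 9], [10, 11, 12]]    # Second layer
])

print(arr_3d[0, 1, 2])  # layer 0, row 1, column 2 -> 6
print(arr_3d[1, 0, 0])  # layer 1, row 0, column 0 -> 7
print(arr_3d[1])        # a single index selects a whole layer (a 2x3 matrix)
```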
2. Slicing in NumPy
Slicing allows extracting subarrays using the format:
array[start:stop:step]
A. Slicing a 1D Array
arr = np.array([10, 20, 30, 40, 50, 60, 70])
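A few slices of the array above, sketched with the start:stop:step format. The stop index is exclusive, and omitted parts default to the whole range.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

print(arr[1:5])    # indices 1 to 4 -> [20 30 40 50]
print(arr[:3])     # first three elements -> [10 20 30]
print(arr[::2])    # every second element -> [10 30 50 70]
print(arr[::-1])   # reversed -> [70 60 50 40 30 20 10]
```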
B. Slicing a 2D Array
arr_2d = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]
])
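For a 2D array, each axis takes its own slice, separated by a comma. A brief sketch using the array above; the particular slices are illustrative.

```python
import numpy as np

arr_2d = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

print(arr_2d[0:2, 1:3])  # rows 0-1, columns 1-2 -> [[2 3] [6 7]]
print(arr_2d[:, 0])      # first column of every row -> [1 5 9]
print(arr_2d[1, :])      # all columns of row 1 -> [5 6 7 8]
```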
C. Slicing a 3D Array
arr_3d = np.array([
[[1, 2, 3], [4, 5, 6]],
[[7, 8, 9], [10, 11, 12]]
])
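The same idea extends to three axes. A brief sketch using the array above; the particular slices are illustrative.

```python
import numpy as np

arr_3d = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]]
])

print(arr_3d[:, 0, :])   # first row of every layer -> [[1 2 3] [7 8 9]]
print(arr_3d[1, :, 1:])  # second layer, last two columns -> [[8 9] [11 12]]
print(arr_3d[:, :, 0])   # first column of every row in every layer -> [[1 4] [7 10]]
```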
For 2D arrays:
arr_2d = np.array([
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
])
rows = [0, 1, 2]
cols = [2, 1, 0]
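The indexing step that uses rows and cols seems to have been lost. A likely sketch: passing both lists together selects one element per (row, column) pair, here the anti-diagonal. The variable name selected is an assumption.

```python
import numpy as np

arr_2d = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])
rows = [0, 1, 2]
cols = [2, 1, 0]

# Pairs are (0, 2), (1, 1), (2, 0)
selected = arr_2d[rows, cols]
print(selected)  # [30 50 70]
```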
4. Boolean Indexing
Retrieve elements based on conditions.
arr = np.array([10, 20, 30, 40, 50])
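A minimal sketch of condition-based selection on the array above. The thresholds and variable names are illustrative assumptions.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

mask = arr > 25                           # boolean array, one flag per element
greater = arr[mask]                       # keeps elements where the flag is True
in_range = arr[(arr > 15) & (arr < 45)]   # combine conditions with & and parentheses

print(mask)      # [False False  True  True  True]
print(greater)   # [30 40 50]
print(in_range)  # [20 30 40]
```

Note that Python's `and`/`or` do not work element-wise on arrays; use `&` and `|` instead.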
For 2D arrays:
arr_2d = np.array([
[10, 20, 30],
[40, 50, 60]
])
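A short sketch for the 2D case. Boolean indexing a 2D array returns the matching elements as a flat 1D array. The threshold and variable name are illustrative.

```python
import numpy as np

arr_2d = np.array([
    [10, 20, 30],
    [40, 50, 60]
])

above_25 = arr_2d[arr_2d > 25]
print(above_25)        # [30 40 50 60]
print(above_25.ndim)   # 1 (the 2D structure is not preserved)
```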
🔹 NumPy functions like mean(), max(), and min() make statistical calculations easy.
🔹 Vectorized operations enable fast percentage change calculations.
Step 3: Identify Days When Stock Price Was Above the Average
# Find days where stock price was above average
above_avg_days = stock_prices[stock_prices > average_price]
Python Code
import numpy as np
# 3. Calculate the percentage change from the first to the last day
percentage_change = ((stock_prices[-1] - stock_prices[0]) / stock_prices[0]) * 100
print(f"Percentage Change Over the Week: {percentage_change:.2f}%")
Expected Output
Stock Prices Over the Week: [150 152 148 155 157 160 158]
Average Stock Price: $154.29
Highest Stock Price: $160
Lowest Stock Price: $148
Percentage Change Over the Week: 5.33%
Stock Prices Above Average: [155 157 160 158]
Day with Highest Stock Price: Day 6
Day with Lowest Stock Price: Day 3
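Most of the program that produced this output is missing from the text above. The following is a hedged reconstruction: the prices are taken from the expected output, while variable names other than stock_prices, average_price, above_avg_days, and percentage_change are assumptions.

```python
import numpy as np

# Daily prices, copied from the expected output
stock_prices = np.array([150, 152, 148, 155, 157, 160, 158])

average_price = np.mean(stock_prices)
highest_price = np.max(stock_prices)
lowest_price = np.min(stock_prices)
percentage_change = ((stock_prices[-1] - stock_prices[0]) / stock_prices[0]) * 100
above_avg_days = stock_prices[stock_prices > average_price]

# argmax/argmin return zero-based positions, so add 1 to get a day number
highest_day = np.argmax(stock_prices) + 1
lowest_day = np.argmin(stock_prices) + 1

print("Stock Prices Over the Week:", stock_prices)
print(f"Average Stock Price: ${average_price:.2f}")
print(f"Highest Stock Price: ${highest_price}")
print(f"Lowest Stock Price: ${lowest_price}")
print(f"Percentage Change Over the Week: {percentage_change:.2f}%")
print("Stock Prices Above Average:", above_avg_days)
print(f"Day with Highest Stock Price: Day {highest_day}")
print(f"Day with Lowest Stock Price: Day {lowest_day}")
```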
What is a 3D Array?
A 3D array is a three-dimensional NumPy array, which consists of multiple 2D matrices stacked
together. It is structured as:
(depth, rows, columns) → Like a stack of 2D arrays.
📌 Example of a 3D array:
import numpy as np
Structure of array_3d
[
[[ 1 2 3] # First matrix
[ 4 5 6]
[ 7 8 9]],
Uses of a 3D Array
1. Image Processing (RGB Images)
📌 In computer vision, images are stored as 3D arrays (Height × Width × Channels).
image = np.zeros((256, 256, 3)) # A blank 256x256 RGB image
✅ Used in: OpenCV, TensorFlow, and deep learning for image classification.
2. Medical Imaging (MRI, CT Scans)
📌 3D arrays store multiple slices of scans in medical imaging.
mri_scan = np.zeros((100, 256, 256)) # 100 slices of 256x256 resolution
3. Slicing in a 3D Array
Slicing allows you to extract subarrays.
Extract a subarray
print(array_3d[:, 1:, 2:]) # Extracts last two rows & last two columns from both layers
1. Summing Elements
print(np.sum(array_3d)) # Sum of all elements in the array
print(np.sum(array_3d, axis=0)) # Sum over layers (depth): adds the 2D matrices element-wise
print(np.sum(array_3d, axis=1)) # Sum over rows: column totals within each layer
print(np.sum(array_3d, axis=2)) # Sum over columns: row totals within each layer
Transpose a 3D Array
transposed_array = array_3d.transpose(1, 0, 2) # Swaps rows and layers
print(transposed_array.shape) # Output: (3, 2, 4)
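The definition of array_3d used in these snippets is not shown. The sketch below assumes a (2, 3, 4) array, which is consistent with the transposed shape (3, 2, 4) and with the slice comment above; the values themselves are placeholders.

```python
import numpy as np

# Assumed shape: 2 layers, 3 rows, 4 columns, filled with 1..24
array_3d = np.arange(1, 25).reshape(2, 3, 4)

sub = array_3d[:, 1:, 2:]               # last two rows and last two columns of each layer
total = np.sum(array_3d)                # sum of every element
sum_layers = np.sum(array_3d, axis=0)   # collapses the layer axis
sum_rows = np.sum(array_3d, axis=1)     # collapses the row axis
sum_cols = np.sum(array_3d, axis=2)     # collapses the column axis
transposed_array = array_3d.transpose(1, 0, 2)  # swaps the layer and row axes

print(sub.shape)               # (2, 2, 2)
print(total)                   # 300
print(sum_layers.shape)        # (3, 4)
print(sum_rows.shape)          # (2, 4)
print(sum_cols.shape)          # (2, 3)
print(transposed_array.shape)  # (3, 2, 4)
```

Summing along an axis removes that axis from the result, which is a quick way to check that the right axis was chosen.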
# 1. Compute total sales for each product across all stores and days
total_sales_per_product = np.sum(sales_data, axis=(0, 1))
print("\nTotal Sales for Each Product:", total_sales_per_product)
Total Sales per Day: [1490 1410 1820 2090 1620 1710 1630]
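The sales_data array itself is not shown. The sketch below assumes it is laid out as (stores, days, products), which is what axis=(0, 1) in the snippet above implies. The numbers are made-up placeholders, not the original dataset.

```python
import numpy as np

# Hypothetical data: 2 stores, 3 days, 2 products
sales_data = np.array([
    [[10, 20], [30, 40], [50, 60]],   # store 1
    [[1, 2], [3, 4], [5, 6]],         # store 2
])

# Collapse stores and days, leaving one total per product
total_sales_per_product = np.sum(sales_data, axis=(0, 1))
# Collapse stores and products, leaving one total per day
total_sales_per_day = np.sum(sales_data, axis=(0, 2))

print("Total Sales for Each Product:", total_sales_per_product)  # [ 99 132]
print("Total Sales per Day:", total_sales_per_day)               # [ 33  77 121]
```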
Use Case of a 2D NumPy Array in Python
Use Case: Student Grade Analysis
📌 Scenario:
A university records students' scores in 3 subjects (Math, Science, and English) for 5 students.
Using a 2D NumPy array, we will analyze the data by calculating average scores, highest scores,
lowest scores, and subject-wise performance.
🔹 2D Array Structure: Each row represents a student, and each column represents a subject.
Step 2: Compute Key Statistics
1. Calculate the Average Score for Each Student
average_per_student = np.mean(grades, axis=1)
print("Average Score per Student:", average_per_student)
📌 max() and min() functions help identify top and bottom-performing students in each
subject.
📌 sum(axis=1) calculates total scores for each student, and argmax() finds the highest
scorer.
Output:
Student Grades (Rows: Students, Columns: Subjects):
[[85 90 78]
[88 76 92]
[90 88 85]
[70 65 80]
[95 98 95]]
Complete Program
Python Code
import numpy as np
Expected Output
Student Grades (Rows: Students, Columns: Subjects):
[[85 90 78]
[88 76 92]
[90 88 85]
[70 65 80]
[95 98 95]]
Average Score per Subject (Math, Science, English): [85.6 83.4 86. ]
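The body of the complete program is missing from the text. The following is a hedged reconstruction: the grades come from the expected output, while variable names other than grades and average_per_student are assumptions.

```python
import numpy as np

# Rows: students, columns: Math, Science, English (from the expected output)
grades = np.array([
    [85, 90, 78],
    [88, 76, 92],
    [90, 88, 85],
    [70, 65, 80],
    [95, 98, 95],
])

average_per_student = np.mean(grades, axis=1)   # one value per row
average_per_subject = np.mean(grades, axis=0)   # one value per column
highest_per_subject = np.max(grades, axis=0)
lowest_per_subject = np.min(grades, axis=0)
total_per_student = np.sum(grades, axis=1)
top_student = np.argmax(total_per_student)      # zero-based row index

print("Student Grades (Rows: Students, Columns: Subjects):")
print(grades)
print("Average Score per Student:", average_per_student)
print("Average Score per Subject (Math, Science, English):", average_per_subject)
print("Highest Score per Subject:", highest_per_subject)
print("Lowest Score per Subject:", lowest_per_subject)
print(f"Top Scorer: Student {top_student + 1}")
```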
# Creating an array
arr = np.array([10, 20, 30, 40, 50])
Loading an Array
loaded_arr = np.load('data.npy')
print(loaded_arr) # Output: [10 20 30 40 50]
print(loaded['first']) # Output: [1 2 3]
print(loaded['second']) # Output: [[4 5 6] [7 8 9]]
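The saving steps are not shown above. A minimal round-trip sketch for both formats follows; it writes to a temporary directory, and the .npz file name is a hypothetical choice.

```python
import os
import tempfile

import numpy as np

arr = np.array([10, 20, 30, 40, 50])
first = np.array([1, 2, 3])
second = np.array([[4, 5, 6], [7, 8, 9]])

with tempfile.TemporaryDirectory() as tmp:
    # Single array -> .npy
    npy_path = os.path.join(tmp, "data.npy")
    np.save(npy_path, arr)
    loaded_arr = np.load(npy_path)

    # Several named arrays -> .npz (file name is an assumption)
    npz_path = os.path.join(tmp, "multiple_arrays.npz")
    np.savez(npz_path, first=first, second=second)
    with np.load(npz_path) as loaded:
        loaded_first = loaded["first"]
        loaded_second = loaded["second"]

print(loaded_arr)     # [10 20 30 40 50]
print(loaded_first)   # [1 2 3]
print(loaded_second)
```

The keyword names passed to np.savez become the keys used to retrieve each array.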
2. Working with Text Files (.txt)
Text files are human-readable and commonly used for structured datasets.
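A minimal sketch of writing and reading a text file with np.savetxt and np.loadtxt. The array values and file name are illustrative assumptions, and a temporary directory is used so nothing is left behind.

```python
import os
import tempfile

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])  # illustrative data

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.txt")
    np.savetxt(txt_path, arr, fmt="%d")         # one row per line, space-separated integers
    arr_loaded = np.loadtxt(txt_path, dtype=int)

print(arr_loaded)
```

Without fmt="%d", np.savetxt writes values in scientific notation, which is still readable by np.loadtxt but harder on the eyes.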
# Read data
arr_loaded = np.memmap('large_data.dat', dtype='float32', mode='r',
                       shape=(10000, 10000))
print(arr_loaded[0, 0]) # Access without loading the entire file
Summary Table
File Format                  Save Function                                           Load Function
.npy (Binary)                np.save('file.npy', arr)                                np.load('file.npy')
.npz (Multiple Arrays)       np.savez('file.npz', a=arr1, b=arr2)                    data = np.load('file.npz'); data['a']
.txt (Text File)             np.savetxt('file.txt', arr, fmt='%d')                   np.loadtxt('file.txt', dtype=int)
.csv (Comma-separated)       np.savetxt('file.csv', arr, delimiter=',', fmt='%d')    np.loadtxt('file.csv', delimiter=',', dtype=int)
.csv (with missing values)   -                                                       np.genfromtxt('file.csv', delimiter=',', filling_values=0)
.dat (Memory-mapped)         np.memmap('file.dat', dtype, mode, shape)               np.memmap('file.dat', dtype, mode, shape)
1. Reading a Large CSV File for Stock Market Analysis
📌 Use Case: Import stock prices and analyze them.
Example:
import numpy as np
✔ Key Insights:
• Reads a stock price dataset from a CSV file.
• Extracts Open, High, Low, Close prices.
• Computes the average closing price.
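The code for this example is missing from the text. The sketch below shows one way the listed steps could look; the column layout (Open, High, Low, Close), the prices, and the use of an in-memory io.StringIO in place of a real CSV file are all assumptions.

```python
import io

import numpy as np

# Stand-in for a CSV file on disk (hypothetical contents)
csv_file = io.StringIO(
    "Open,High,Low,Close\n"
    "100.0,105.0,99.0,104.0\n"
    "104.0,108.0,103.0,107.0\n"
    "107.0,110.0,105.0,106.0\n"
)

# skiprows=1 skips the header line
data = np.loadtxt(csv_file, delimiter=",", skiprows=1)

# Each column becomes its own 1D array
open_prices, high_prices, low_prices, close_prices = data.T

average_close = np.mean(close_prices)
print(f"Average Closing Price: {average_close:.2f}")
```

With a real file, the path string would be passed to np.loadtxt in place of csv_file.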
✔ Key Insights:
• Reads temperature sensor data stored in a .txt file.
• Finds the maximum and minimum recorded temperature.
3. Loading Image Data for Machine Learning (Binary .npy)
📌 Use Case: Load preprocessed image data from a NumPy binary file.
Example:
# Create a random image array (100x100 pixels, grayscale)
image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)  # upper bound is exclusive, so 256 allows the value 255
✔ Key Insights:
• Saves a random 100x100 grayscale image to a .npy file.
• Loads the image back into a NumPy array.
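The save and load steps are not shown above. A minimal sketch, assuming a hypothetical file name and a temporary directory:

```python
import os
import tempfile

import numpy as np

# Random 100x100 grayscale image with pixel values 0..255
image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    image_path = os.path.join(tmp, "image_data.npy")  # hypothetical file name
    np.save(image_path, image)
    loaded_image = np.load(image_path)

print(loaded_image.shape)  # (100, 100)
print(loaded_image.dtype)  # uint8
```

The .npy format stores the dtype and shape alongside the data, so the array comes back exactly as it was saved.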
✔ Key Insights:
• Uses np.genfromtxt() to handle missing values.
• Replaces missing values with 0.
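The code for this example is missing. A minimal sketch of the two points above, using an in-memory io.StringIO as a stand-in for a CSV file; the data is made up for illustration.

```python
import io

import numpy as np

# Stand-in for a CSV file with one empty field in the second row
csv_file = io.StringIO(
    "1,2,3\n"
    "4,,6\n"
    "7,8,9\n"
)

# Empty fields are treated as missing and replaced with filling_values
data = np.genfromtxt(csv_file, delimiter=",", filling_values=0)
print(data)
```

Unlike np.loadtxt, which raises an error on the empty field, np.genfromtxt fills it in and returns a float array by default.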
✔ Key Insights:
• Uses np.memmap() for big data processing.
• Avoids loading the full dataset into RAM.
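A small end-to-end sketch of the memory-mapped workflow: create the file in write mode, flush it, then reopen it read-only. The shape is kept small here for illustration, and the file is placed in a temporary directory; both are assumptions.

```python
import os
import tempfile

import numpy as np

shape = (1000, 1000)  # kept small for the example

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "large_data.dat")

    # mode='w+' creates the file on disk and maps it into memory
    arr_mm = np.memmap(path, dtype="float32", mode="w+", shape=shape)
    arr_mm[0, :] = 1.0   # only the touched pages are written
    arr_mm.flush()       # push changes to disk
    del arr_mm           # release the mapping

    # mode='r' reopens the same file read-only
    arr_loaded = np.memmap(path, dtype="float32", mode="r", shape=shape)
    first_value = float(arr_loaded[0, 0])
    first_row_sum = float(arr_loaded[0].sum())
    del arr_loaded       # release before the temporary directory is removed

print(first_value)    # 1.0
print(first_row_sum)  # 1000.0
```

The dtype and shape must be supplied again when reopening, because a raw .dat file stores no metadata.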