🔷 1. Creation of NumPy Arrays
NumPy arrays are data structures for handling numerical data efficiently. Because every
element of an array has the same data type, operations on arrays are usually much faster than
the same operations on Python lists. Arrays are widely used in data science and machine learning.
🔹 Importing NumPy
import numpy as np
🔹 Types of Array Creation
Method | Description | Example
np.array() | Converts Python list/tuple to NumPy array | np.array([1, 2, 3])
np.zeros() | Creates an array filled with zeros | np.zeros((2, 3))
np.ones() | Creates an array filled with ones | np.ones((2, 2))
np.arange() | Creates an array with a range of values | np.arange(0, 10, 2)
np.linspace() | Creates evenly spaced values in a range | np.linspace(0, 1, 5)
np.eye() | Creates an identity matrix | np.eye(3)
np.random.rand() | Random values between 0 and 1 | np.random.rand(2, 3)
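To see what the creation functions actually return, here is a small sketch that calls each one from the table and prints a few results. The variable names are only for illustration.

```python
import numpy as np

from_list = np.array([1, 2, 3])     # from a Python list
zeros = np.zeros((2, 3))            # 2 rows, 3 columns, all 0.0
ones = np.ones((2, 2))              # 2 rows, 2 columns, all 1.0
evens = np.arange(0, 10, 2)         # start 0, stop 10 (not included), step 2
points = np.linspace(0, 1, 5)       # 5 values, both end points included
identity = np.eye(3)                # 3x3 identity matrix
randoms = np.random.rand(2, 3)      # 2x3 random values, different on every run

print(evens.tolist())     # [0, 2, 4, 6, 8]
print(points.tolist())    # [0.0, 0.25, 0.5, 0.75, 1.0]
print(zeros.shape)        # (2, 3)
print(identity.tolist())  # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

Note that np.arange() leaves out the stop value, while np.linspace() includes it.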
🔹 Example
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)
Output:
[[1 2 3]
 [4 5 6]]
🔷 2. Array Indexing in NumPy
Indexing is used to access or modify specific elements or rows/columns in an array.
🔹 1D Array Indexing
arr = np.array([10, 20, 30, 40])
print(arr[1]) # Output: 20
🔹 2D Array Indexing
a = np.array([[1, 2, 3],
              [4, 5, 6]])
print(a[0][1]) # Output: 2 (row 0, column 1)
Alternatively:
print(a[0, 1]) # Same result
🔹 Negative Indexing
Access elements from the end.
arr = np.array([10, 20, 30])
print(arr[-1]) # Output: 30
🔹 Slicing
a = np.array([10, 20, 30, 40, 50])
print(a[1:4]) # Output: [20 30 40]
🔹 Indexing a 2D Array (Rows and Columns)
a = np.array([[10, 20, 30],
              [40, 50, 60]])
print(a[:, 1]) # All rows, 2nd column → [20 50]
print(a[1, :]) # 2nd row, all columns → [40 50 60]
📌 Why NumPy Arrays Are Important
Faster than Python lists
Support vectorized operations
Used in ML for data storage, preprocessing, and feeding data into models
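"Vectorized" means that one expression works on the whole array at once, without writing a loop. The sketch below squares the same numbers with a list and with an array. How much faster the array version is depends on the size of the data, so no timing is claimed here.

```python
import numpy as np

values = [1, 2, 3, 4, 5]
arr = np.array(values)

# List version: Python visits the elements one by one.
squared_list = [v * v for v in values]

# Array version: a single vectorized expression, no Python-level loop.
squared_arr = arr * arr

print(squared_list)           # [1, 4, 9, 16, 25]
print(squared_arr.tolist())   # [1, 4, 9, 16, 25]
```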
NumPy: Reshaping Arrays and Array Math & Assignment
🔷 1. Reshaping Arrays
Reshaping means changing the shape (i.e., number of rows and columns) of an array without
changing its data.
🔹 Function Used:
array.reshape(new_shape)
🔹 Rules:
Total number of elements must remain the same.
Can convert between 1D, 2D, and 3D arrays.
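A short sketch of both rules: a 12-element array can become a 3D array of shape (2, 3, 2) because 2 x 3 x 2 = 12, but asking for shape (5, 3) fails because that would need 15 elements. The flag reshape_failed is only there to make the result easy to check.

```python
import numpy as np

a = np.arange(12)               # 12 elements: 0, 1, ..., 11

cube = a.reshape((2, 3, 2))     # allowed: 2 * 3 * 2 == 12
print(cube.shape)               # (2, 3, 2)

reshape_failed = False
try:
    a.reshape((5, 3))           # not allowed: 5 * 3 == 15, not 12
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```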
🔹 Examples:
1D to 2D:
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6])
b = a.reshape((2, 3))
print(b)
Output:
[[1 2 3]
 [4 5 6]]
2D to 1D:
a = np.array([[1, 2, 3], [4, 5, 6]])
b = a.reshape(-1)
print(b) # Output: [1 2 3 4 5 6]
Using -1 (Auto-calculate dimension):
a = np.array([1, 2, 3, 4, 5, 6])
b = a.reshape((3, -1)) # Automatically calculates column size
print(b)
Output:
[[1 2]
 [3 4]
 [5 6]]
🔷 2. Array Math and Assignment
NumPy allows element-wise operations and broadcasting, making it very efficient for
mathematical calculations.
🔹 Basic Math Operations (Element-wise)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b) # [5 7 9]
print(a - b) # [-3 -3 -3]
print(a * b) # [4 10 18]
print(a / b) # [0.25 0.4 0.5 ]
🔹 Mathematical Functions
np.sum(a) # Sum of all elements
np.mean(a) # Average
np.max(a) # Maximum
np.min(a) # Minimum
np.sqrt(a) # Square root of each element
np.exp(a) # e raised to the power of each element
np.log(a) # Natural logarithm (base e) of each element
🔹 Broadcasting
Allows arithmetic between arrays of different shapes.
Example:
a = np.array([[1, 2], [3, 4]])
b = np.array([10, 20])
print(a + b)
Output:
[[11 22]
 [13 24]]
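The sketch below looks at why this example works and when broadcasting fails. NumPy compares the shapes from the last dimension backwards; each pair of sizes must be equal, or one of them must be 1. The names row and col are only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])     # shape (2, 2)
row = np.array([10, 20])           # shape (2,): added to every row
col = np.array([[100], [200]])     # shape (2, 1): added to every column

print((a + row).tolist())   # [[11, 22], [13, 24]]
print((a + col).tolist())   # [[101, 102], [203, 204]]

# Shapes (2, 2) and (3,) do not match: the last sizes are 2 and 3.
broadcast_failed = False
try:
    a + np.array([1, 2, 3])
except ValueError as err:
    broadcast_failed = True
    print("broadcast failed:", err)
```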
🔹 Assignment (Changing Values)
You can assign new values to elements or slices of an array.
a = np.array([10, 20, 30, 40])
a[0] = 99 # Change first element
a[1:3] = [55, 66] # Change second and third
print(a) # Output: [99 55 66 40]
In 2D:
b = np.array([[1, 2], [3, 4]])
b[1][0] = 100
print(b)
📌 Summary
Concept | Description
reshape() | Change array shape without changing data
Array Math | Perform element-wise operations
Broadcasting | Auto-align smaller array to match dimensions
Assignment | Modify values of elements directly
Manipulating Tabular Data using Pandas
Pandas is a powerful Python library used for data analysis and manipulation. It provides two
main data structures:
🔷 1. Pandas Series
✅ What is a Series?
A Series is a one-dimensional labeled array.
Can hold any data type: integers, floats, strings, etc.
Think of it like a column in Excel.
🔹 Creating a Series:
import pandas as pd
s = pd.Series([10, 20, 30, 40])
print(s)
Output:
0    10
1    20
2    30
3    40
dtype: int64
The left side is the index.
The right side is the data values.
🔹 Custom Index:
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
🔹 Accessing Elements:
s['a'] # Output: 10 (access by label)
s.iloc[0] # Output: 10 (access by position)
🔹 Operations:
s + 5 # Adds 5 to each element
s.mean() # Average
s.max(), s.min()
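Putting access and operations together, a small sketch with the custom-index Series from above. It uses .loc for labels and .iloc for positions, which keeps the two kinds of access clearly apart.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

print(s.loc['a'])          # 10   (by label)
print(s.iloc[0])           # 10   (by position)

plus_five = s + 5          # a new Series; s itself is unchanged
print(plus_five.tolist())  # [15, 25, 35]
print(s.mean())            # 20.0
print(s.max(), s.min())    # 30 10
```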
🔷 2. Pandas DataFrame
✅ What is a DataFrame?
A DataFrame is a two-dimensional table (like a spreadsheet).
Contains rows and columns, each with labels.
Built from lists, dictionaries, Series, or NumPy arrays.
🔹 Creating a DataFrame:
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'Score': [85, 90, 95]}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age  Score
0    Alice   25     85
1      Bob   30     90
2  Charlie   35     95
🔷 3. Manipulating DataFrames
🔹 Accessing Columns:
df['Name'] # Returns Series
df[['Name', 'Age']] # Multiple columns
🔹 Accessing Rows:
df.loc[0] # Row by label
df.iloc[1] # Row by position
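With the default index 0, 1, 2 the label and the position of a row are the same number, so .loc and .iloc look identical. The sketch below uses made-up labels 10, 20, 30 (an assumption, only for illustration) to show the difference.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=[10, 20, 30],        # hypothetical row labels
)

print(df.loc[10, 'Name'])      # Alice  (row whose label is 10)
print(df.iloc[1, 0])           # Bob    (second row, first column, by position)
```

Here df.loc[1] would raise a KeyError, because no row has the label 1.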
🔹 Adding a New Column:
df['Grade'] = ['B', 'A', 'A+']
🔹 Filtering Rows (Conditional Selection):
df[df['Age'] > 28]
🔹 Updating Values:
df.at[0, 'Score'] = 88
🔹 Dropping Columns or Rows:
df.drop('Age', axis=1) # Drop column (returns a new DataFrame; df is unchanged)
df.drop(1, axis=0) # Drop row by index label
🔹 Sorting:
df.sort_values(by='Score', ascending=False)
🔹 Grouping:
df.groupby('Grade').mean(numeric_only=True) # Average of the numeric columns for each grade
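A worked sketch of grouping. The grades used here are made up so that one group contains two rows. Selecting the Score column before calling mean() avoids trying to average the text column Name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85, 90, 95],
    'Grade': ['B', 'A', 'A'],      # hypothetical grades
})

avg_score = df.groupby('Grade')['Score'].mean()

print(float(avg_score['A']))   # 92.5  (average of 90 and 95)
print(float(avg_score['B']))   # 85.0
```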
🔹 Missing Values:
df.isnull() # Check missing
df.dropna() # Drop rows with missing
df.fillna(0) # Fill missing with 0
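A small sketch of these three calls on a table with two missing values (written as np.nan). The data is made up for illustration. Both dropna() and fillna() return a new DataFrame and leave the original unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [25, np.nan, 35], 'Score': [85, 90, np.nan]})

missing_per_column = df.isnull().sum()
print(int(missing_per_column['Age']))     # 1
print(int(missing_per_column['Score']))   # 1

complete_rows = df.dropna()
print(len(complete_rows))                 # 1  (only the first row has no gaps)

filled = df.fillna(0)
print(filled['Age'].tolist())             # [25.0, 0.0, 35.0]
```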
✅ Summary Table:
Feature | Series | DataFrame
Structure | 1D | 2D (rows and columns)
Access | Index only | Index and Column
Creation | pd.Series() | pd.DataFrame()
Example Use | Single column | Complete table
✅ Why Use Pandas for Tabular Data?
Easy to load CSV/Excel files
Clean and preprocess large datasets
Group, summarize, filter, and sort
Integrates with Matplotlib and Seaborn for visualization
Pandas Series and DataFrame
Pandas is a Python library used for working with structured (tabular) data. It provides two
main data structures:
🔷 1. Pandas Series
✅ What is a Series?
A Series is a one-dimensional labeled array.
It can hold data like integers, strings, floats, etc.
Think of it like a single column from an Excel sheet.
✅ Key Points:
Has data values and index labels
Built on top of NumPy arrays
🔹 Example:
import pandas as pd
s = pd.Series([10, 20, 30, 40])
print(s)
Output:
0    10
1    20
2    30
3    40
dtype: int64
0, 1, 2, 3 are index labels (can be changed)
10, 20, 30, 40 are values
🔹 With custom index:
s = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
🔷 2. Pandas DataFrame
✅ What is a DataFrame?
A DataFrame is a two-dimensional labeled data structure.
It consists of rows and columns (like a table).
Each column in a DataFrame is actually a Series.
✅ Key Features:
Columns can have different data types
Data can be accessed using row/column labels
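The two points "each column is a Series" and "columns can have different data types" can be checked directly. This is a minimal sketch with made-up values.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'Score': [85.5, 90.0]})

age = df['Age']
print(isinstance(age, pd.Series))   # True: a single column is a Series
print(age.dtype)                    # int64
print(df['Score'].dtype)            # float64
```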
🔹 Creating a DataFrame:
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85, 90, 95]
}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age  Score
0    Alice   25     85
1      Bob   30     90
2  Charlie   35     95
✅ Difference Between Series and DataFrame:
Feature | Series | DataFrame
Structure | 1D | 2D (rows and columns)
Shape | Single column | Multiple rows and columns
Example | [10, 20, 30] | Table with rows and columns
Use | Represents a column | Represents full table of data
What is Data Visualization?
Data Visualization means converting data into visual formats like graphs, charts, and plots.
This helps in understanding trends, patterns, and relationships in data.
What is Matplotlib?
Matplotlib is a popular Python library used to create static, interactive, and animated
visualizations.
It works well with NumPy and Pandas.
The main module used is:
import matplotlib.pyplot as plt
Basic Steps to Create a Graph
1. Import the library:
import matplotlib.pyplot as plt
2. Prepare the data:
x = [1, 2, 3, 4]
y = [10, 20, 25, 30]
3. Plot the graph:
plt.plot(x, y)
4. Show the graph:
plt.show()
Types of Charts in Matplotlib
1. Line Chart
Used to show trends over time.
plt.plot(x, y)
2. Bar Chart
Used to compare different categories.
plt.bar(x, y)
3. Histogram
Used to show distribution of a dataset.
plt.hist(data) # data: a list or array of numbers
4. Pie Chart
Used to show percentage distribution.
plt.pie(values, labels=names) # values: slice sizes, names: one label per slice
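The histogram and pie chart calls need data to run. Below is a complete sketch with made-up numbers. It assumes you want image files instead of a pop-up window, so it selects the "Agg" backend and saves to hypothetical file names; remove the matplotlib.use line and call plt.show() to see a window instead.

```python
import matplotlib
matplotlib.use("Agg")               # assumption: draw off-screen and save to files
import matplotlib.pyplot as plt

# Made-up data for illustration
marks = [55, 62, 67, 70, 71, 75, 78, 80, 85, 92]
names = ['Rent', 'Food', 'Travel']
values = [50, 30, 20]

# Histogram: how many marks fall into each of 4 ranges (bins)
plt.figure()
counts, bin_edges, _ = plt.hist(marks, bins=4)
plt.title("Distribution of Marks")
plt.savefig("histogram.png")        # hypothetical file name

# Pie chart: share of each category
plt.figure()
plt.pie(values, labels=names)
plt.title("Spending by Category")
plt.savefig("pie.png")              # hypothetical file name

print(int(counts.sum()))            # 10: every mark lands in exactly one bin
```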
Important Functions in Matplotlib
Function | Purpose
plt.plot() | Draws line chart
plt.bar() | Draws bar chart
plt.hist() | Draws histogram
plt.pie() | Draws pie chart
plt.xlabel() | Sets x-axis label
plt.ylabel() | Sets y-axis label
plt.title() | Sets the title of the graph
plt.legend() | Adds a legend to the chart
plt.grid() | Shows grid lines
plt.show() | Displays the plot
Example with Title and Labels
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [5, 10, 15, 20]
plt.plot(x, y)
plt.title("Simple Line Graph")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()
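plt.legend() is listed in the table but not used in the example above. A legend is useful when a chart has more than one line. The sketch below uses made-up data and product names, and saves to a hypothetical file name with the off-screen "Agg" backend (an assumption; use plt.show() for a window).

```python
import matplotlib
matplotlib.use("Agg")               # assumption: draw off-screen and save to a file
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]

plt.figure()
plt.plot(x, [5, 10, 15, 20], label="Product A")   # label= is the text the legend shows
plt.plot(x, [3, 6, 12, 24], label="Product B")
plt.title("Two Lines with a Legend")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()                        # builds the legend from the label= values
plt.grid(True)
plt.savefig("legend_example.png")   # hypothetical file name
```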
Why Use Matplotlib? (Advantages)
Easy to use and flexible
Can create many types of charts
Good for basic as well as advanced plotting
Works well with other Python libraries
✅ Example: Why Use the Alias plt?
Instead of this:
import matplotlib.pyplot
matplotlib.pyplot.plot([1,2,3], [4,5,6])
matplotlib.pyplot.show()
We use this:
import matplotlib.pyplot as plt
plt.plot([1,2,3], [4,5,6])
plt.show()
✅ Summary
Part | Meaning
import | Load a library into Python
matplotlib.pyplot | The plotting part of the Matplotlib library
as plt | Give it a short name plt for convenience
Data Visualization Using Seaborn
🔷 What is Seaborn?
Seaborn is a Python visualization library based on Matplotlib.
It provides a high-level interface for drawing attractive and informative statistical
graphics.
Works well with Pandas DataFrames.
✅ Importing Seaborn
import seaborn as sns
import matplotlib.pyplot as plt
🔹 1. Line Chart (Line Plot)
Used to show trends over time or continuous data.
import seaborn as sns
import matplotlib.pyplot as plt
# Example data
data = sns.load_dataset("flights")
sns.lineplot(x="year", y="passengers", data=data)
plt.title("Line Chart")
plt.show()
🔹 2. Bar Chart
Used to compare categories.
import seaborn as sns
import matplotlib.pyplot as plt
# Example data
data = sns.load_dataset("titanic")
sns.barplot(x="class", y="fare", data=data)
plt.title("Bar Chart")
plt.show()
🔹 3. Pie Chart
🔸 Note: Seaborn does not support pie charts directly.
🔸 Use Matplotlib for pie charts:
import matplotlib.pyplot as plt
# Example data
labels = ['A', 'B', 'C']
sizes = [30, 45, 25]
plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.title("Pie Chart")
plt.show()
🔹 4. Scatter Plot
Used to show relationship between two continuous variables.
import seaborn as sns
import matplotlib.pyplot as plt
# Example data
data = sns.load_dataset("iris")
sns.scatterplot(x="sepal_length", y="sepal_width", data=data)
plt.title("Scatter Plot")
plt.show()
🔹 5. Subplots (Multiple Plots in One Figure)
Use Matplotlib to create subplots, and use Seaborn for each plot.
import seaborn as sns
import matplotlib.pyplot as plt
# Load data
data = sns.load_dataset("iris")
# Create subplots
plt.figure(figsize=(10,5))
# Subplot 1
plt.subplot(1, 2, 1)
sns.histplot(data["sepal_length"])
plt.title("Sepal Length")
# Subplot 2
plt.subplot(1, 2, 2)
sns.histplot(data["sepal_width"])
plt.title("Sepal Width")
plt.tight_layout()
plt.show()
✅ Summary Table
Chart Type | Seaborn Function | Notes
Line Chart | sns.lineplot() | Best for time-series data
Bar Chart | sns.barplot() | Good for categorical data
Pie Chart | Use matplotlib.pyplot.pie() | Seaborn does not support it
Scatter Plot | sns.scatterplot() | Good for 2D relationships
Subplots | Use plt.subplot() + sns | Combine multiple charts
🔍 Difference Between Seaborn and Matplotlib
Feature | Matplotlib | Seaborn
Library Type | Low-level visualization library | High-level visualization library built on Matplotlib
Import Syntax | import matplotlib.pyplot as plt | import seaborn as sns
Code Complexity | More code needed to customize | Less code; auto-styling and themes
Design | Simple/basic-looking plots | Beautiful, modern, and clean plots
Data Format | Works with lists, arrays | Works best with Pandas DataFrames
Plot Types | Line, bar, scatter, pie, histogram, etc. | Line, bar, scatter, heatmap, violin, etc.
Customization | Very customizable but manual | Auto handles many settings like legends, colors
Themes | Not automatic | Built-in themes and styles (sns.set())
Advanced Stats Plots | Needs manual work | Built-in support (e.g. boxplot, violinplot)
Pie Chart Support | ✅ Supported | ❌ Not directly supported
✅ Example Comparison
🔹 Matplotlib Line Plot:
import matplotlib.pyplot as plt
x = [1, 2, 3]
y = [5, 7, 9]
plt.plot(x, y)
plt.title("Line Plot")
plt.show()
🔹 Seaborn Line Plot:
import seaborn as sns
import matplotlib.pyplot as plt
data = {'x': [1, 2, 3], 'y': [5, 7, 9]}
sns.lineplot(x='x', y='y', data=data)
plt.title("Line Plot")
plt.show()
🏁 Summary:
Matplotlib gives more control and is used for custom plots.
Seaborn makes it easier to create beautiful statistical plots quickly.
🔷 Why Import Matplotlib Along with Seaborn?
Because Matplotlib gives us control over extra things like:
Feature | Example
plt.title() | Set a custom title
plt.xlabel() / plt.ylabel() | Label axes
plt.xlim() / plt.ylim() | Set limits for axes
plt.subplot() | Create subplots
plt.grid() | Add gridlines
plt.legend() | Modify legends
plt.show() | Display the plot (final step)
🧠 Think of it like this:
Seaborn = Cooked dish 🍛 (ready-made, nice-looking plots)
Matplotlib = Kitchen tools 🍳 (to adjust taste, style, layout, etc.)
You use Seaborn for easier plotting
...and Matplotlib for final touch-ups and display.
✅ Example: Use Both Together
import seaborn as sns
import matplotlib.pyplot as plt
# Example data
data = sns.load_dataset("tips")
# Seaborn plot
sns.barplot(x="day", y="total_bill", data=data)
# Matplotlib enhancements
plt.title("Average Bill per Day")
plt.xlabel("Day of Week")
plt.ylabel("Total Bill")
plt.grid(True)
plt.show()
Here:
Seaborn creates the bar chart.
Matplotlib adds title, labels, grid, and shows the plot.
🟡 Conclusion:
We use Matplotlib with Seaborn to customize and display plots. Seaborn does not replace
Matplotlib; it extends it.
What is Scikit-learn?
Scikit-learn (sklearn) is a popular Python library used for machine learning.
It provides tools to load datasets, preprocess data, build models (like classification,
regression, clustering), and evaluate model performance.
✅ Key Features of Scikit-learn
Feature | Description
Easy to use | Simple and consistent API
Built-in datasets | Has small datasets for practice (e.g. iris, digits)
Preprocessing tools | Normalize, scale, split, etc.
Many ML models | SVM, Decision Trees, KNN, Linear Regression
Model Evaluation | Accuracy score, confusion matrix, etc.
🔹 How to Install
pip install scikit-learn
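Once installed, the four kinds of tools listed above (load, preprocess, build, evaluate) usually appear in that order. The sketch below is one possible workflow on the built-in iris dataset. The choices test_size=0.25, random_state=0 and n_neighbors=5 are assumptions made for this illustration, not recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# 1. Load a built-in dataset: X holds the measurements, y the flower species
X, y = load_iris(return_X_y=True)

# 2. Preprocess: split into training and test parts, then scale the features
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
scaler = StandardScaler().fit(X_train)      # learn the scaling from training data only

# 3. Build a model (KNN classifier) and train it
model = KNeighborsClassifier(n_neighbors=5)
model.fit(scaler.transform(X_train), y_train)

# 4. Evaluate on data the model has not seen
predictions = model.predict(scaler.transform(X_test))
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

Every scikit-learn model follows the same fit() then predict() pattern, so swapping KNN for another classifier changes only the line that creates the model.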
What is Machine Learning?
Machine Learning (ML) is a branch of Artificial Intelligence (AI) that allows computers to
learn from data and make decisions or predictions without being explicitly programmed.
📌 Example:
If you give a machine many images of cats and dogs, it can learn the patterns and later predict
whether a new image is of a cat or a dog.
🧠 Why is Machine Learning Important?
Can automate tasks (like spam detection)
Learns and improves from experience
Helps in predictions, recommendations, classification, etc.
Used in self-driving cars, recommendation systems, medical diagnosis, etc.
🔍 Classification of Machine Learning
ML is mainly divided into 3 types:
1️⃣ Supervised Learning
Data is labeled (we know the input and correct output).
The model learns from examples.
📘 Examples:
Task | Description
Email Spam Filter | "Spam" or "Not Spam"
Disease Detection | "Positive" or "Negative" result
Price Prediction | Predict house price based on size
✅ Algorithms: Linear Regression, Decision Tree, KNN, SVM
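As a sketch of supervised learning, the price prediction task can be tried with Linear Regression from scikit-learn. The sizes and prices below are made up so that price = 2 x size + 20 exactly; real data would not be this clean.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Labeled data: input (house size) and the correct output (price)
sizes = np.array([[50], [80], [100], [120]])    # one column: size in square metres
prices = np.array([120, 180, 220, 260])         # made-up prices in thousands

model = LinearRegression()
model.fit(sizes, prices)                        # learn from the labeled examples

predicted = float(model.predict(np.array([[90]]))[0])
print(round(predicted, 1))                      # 200.0, since 2 * 90 + 20 = 200
```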
2️⃣ Unsupervised Learning
Data is not labeled (no predefined output).
The model tries to find hidden patterns in data.
📘 Examples:
Task | Description
Customer Segmentation | Group similar customers automatically
Market Basket Analysis | Find items bought together often
✅ Algorithms: K-Means Clustering, Hierarchical Clustering, PCA
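As a sketch of unsupervised learning, K-Means can group customers without being told any labels. The customer numbers below are made up, and n_clusters=2 is an assumption that fits this tiny example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: [visits per month, average spend] for six made-up customers
customers = np.array([
    [1, 10], [2, 12], [1, 11],        # occasional, low spend
    [10, 100], [11, 95], [12, 105],   # frequent, high spend
])

model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(customers)

# The first three customers share one label and the last three share the other.
# Which group is called 0 and which is called 1 is arbitrary.
print(labels)
```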
3️⃣ Reinforcement Learning
The model learns by trial and error.
It gets rewards or penalties based on actions.
📘 Examples:
Task | Description
Game Playing AI | Learns to play chess or Atari games
Self-driving Car | Learns how to drive safely
✅ Concepts: Agent, Environment, Reward, Policy
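The four concepts can be seen in a very small trial-and-error sketch. It is a simplified illustration (an "epsilon-greedy" strategy on three actions), not a full reinforcement learning algorithm, and the reward chances are made up. The agent does not know these chances; it only sees the rewards it receives.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Environment: hidden chance of a reward for each of 3 actions (made up)
true_reward_prob = [0.2, 0.5, 0.8]

# Agent: keeps an estimate of how good each action is
estimates = np.zeros(3)
counts = np.zeros(3, dtype=int)
epsilon = 0.1                       # policy: try a random action 10% of the time

for step in range(5000):
    if rng.random() < epsilon:
        action = int(rng.integers(3))         # explore: random action
    else:
        action = int(np.argmax(estimates))    # exploit: best action so far
    # Reward: 1 for a payout, 0 otherwise
    reward = 1.0 if rng.random() < true_reward_prob[action] else 0.0
    counts[action] += 1
    # Update the running average reward for this action
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = int(np.argmax(estimates))
print("Best action found:", best_action)
```

After enough trials the agent's estimates move towards the hidden chances, so it ends up preferring the action with the highest reward.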
Summary Table
Type | Data Type | Learns From | Goal | Examples
Supervised | Labeled | Past examples | Predict output | Spam filter, disease detection
Unsupervised | Unlabeled | Hidden patterns | Group or reduce data | Clustering, market analysis
Reinforcement | Feedback | Trial & error | Maximize rewards | Robot control, game AI