
SCHOOL OF ENGINEERING AND TECHNOLOGY

LAB RECORD

Of

Introduction to Machine Learning

Submitted in partial fulfillment of the requirement for the course

Introduction to Machine Learning (4CSPL2041) in

Bachelor of Technology in Computer Science and Engineering

Submitted By-

Jeevith G L

[22BBTCS122]

Course In-charge:
Prof. Shana Aneevvan
Assistant Professor,
Dept. of CSE, SOET

Department of Computer Science and Engineering

Off Hennur – Bagalur Main Road,


Near Kempegowda International Airport, Chagalahatti, Bangalore,
Karnataka – 562149
2024-2025
CONTENTS

Sl. No.  Experiment
1.  Study and usage of the Python and R tools
2.  Implement a classifier for the sales data
3.  Develop a predictive model for predicting house prices
4.  Implement the FIND-S algorithm and verify that it successfully produces the trace for the EnjoySport example (Tom Mitchell reference)
5.  Implement a decision tree algorithm for sales prediction/classification in the retail sector
6.  Implement the back propagation algorithm for stock price prediction
7.  Implement a clustering algorithm for insurance fraud detection
8.  Implement a clustering algorithm for identifying cancerous data
9.  Apply reinforcement learning and develop a game of your own
10. Develop a traffic signal control system using a reinforcement learning technique


General Laboratory Instructions

1. Students are advised to arrive at the laboratory at least 5 minutes before the starting time; those who arrive later than that will not be allowed into the lab.
2. Plan your task well before the session commences, and come prepared to the lab with the synopsis / program / experiment details.
3. Students should enter the laboratory with:
a. Laboratory observation notes with all the details (Problem Statement, Aim, Algorithm, Procedure, Program, Expected Output, etc.) filled in for the lab session.
b. Laboratory record updated up to the last session's experiments, along with any other materials needed in the lab.
c. Proper dress code and identity card.
4. Sign the laboratory login register, write the TIME-IN, and occupy the computer system allotted to you by the faculty.
5. Execute your task in the laboratory, record the results / output in the lab observation notebook, and get it certified by the concerned faculty.
6. All students should be polite and cooperative with the laboratory staff, and must maintain discipline and decency in the laboratory.
7. Computer labs are equipped with sophisticated, high-end branded systems, which should be used properly.
8. Students and faculty must keep their mobile phones SWITCHED OFF during lab sessions. Misuse of the equipment or misbehaviour with the staff or systems will attract severe punishment.
9. Students must take the faculty's permission in case of any urgency to go out; anybody found loitering outside the lab / class without permission during working hours will be treated seriously and punished appropriately.
10. Students should LOG OFF / SHUT DOWN the computer system before leaving the lab after completing the task (experiment) in all aspects, and must ensure the system / seat is left in proper order.


Ex.No:1 Introduction to Python and R Tool

AIM:
To study the usage of the Python and R tools.

PROGRAM:
Python

1. Getting Started:
Install Python via python.org, or use a distribution like Anaconda for a bundled data science environment.
Use IDEs such as PyCharm, Jupyter Notebook, or VS Code.
2. Core Concepts:
Learn the basics: data types, control structures, functions, and modules.
Explore libraries like NumPy (numerical computation), pandas (data manipulation), and matplotlib/seaborn (visualization).
3. Advanced Applications:
Machine learning: scikit-learn, TensorFlow, or PyTorch.
Data visualization: Plotly, Dash.
Web scraping: Beautiful Soup, Selenium.
Automation: Python scripts for task automation.
4. Practice:
Platforms: Kaggle, HackerRank, or LeetCode.
Projects: Build small projects, such as data analysis scripts or visualization dashboards (see the short sketch below).
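
A minimal sketch of these basics (assuming NumPy, pandas, and matplotlib are installed, for example via Anaconda):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# NumPy: generate an array of simulated measurements
values = np.random.normal(loc=50, scale=10, size=100)

# pandas: wrap the array in a DataFrame and summarize it
df = pd.DataFrame({'value': values})
print(df.describe())

# matplotlib: quick histogram of the data
df['value'].plot(kind='hist', bins=20, title='Sample Distribution')
plt.xlabel('value')
plt.show()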

R

1. Getting Started:
Download R and RStudio from CRAN and RStudio.
Familiarize yourself with the RStudio environment.
2. Core Concepts:
Data manipulation: Learn functions for vectors, matrices, and data frames.
Libraries: Master dplyr (data manipulation), ggplot2 (visualization), and tidyr (data tidying).
3. Advanced Applications:
Statistical analysis: Perform hypothesis testing, regression, and clustering.
Machine learning: Use packages like caret or mlr.
4. Visualization: Build interactive plots with shiny or plotly.
5. Practice:
Use datasets like iris, mtcars, or external datasets from Kaggle.
Analyze real-world data to develop actionable insights.


Ex.No:2 Implementation of Classifier for Sales Data

AIM:
To build and implement a classifier for the sales data.

PROGRAM:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Step 1: Load the dataset


data = pd.read_csv('C:\\Users\\JGS Ram\\Downloads\\sales\\Sales_Data.csv')

# Step 2: Display first few rows and column names to understand the structure
print("Columns in the dataset:", data.columns)
print(data.head())

# Step 3: Strip spaces in column names to avoid any errors


data.columns = data.columns.str.strip()

# Step 4: Handle missing values by replacing with the mean of numeric columns
data.fillna(data.select_dtypes(include=['number']).mean(), inplace=True)

# Step 5: Inspect the dataset columns for normalized data


print("Columns in the dataset:", data.columns)

# Step 6: Check for relevant columns for creating 'High_Sales' label


high_sales_created = False

# Check for one of the 'Normalized' columns (e.g., 'Normalized 1', 'Normalized 2', etc.)
if 'Normalized 1' in data.columns:  # Replace this with the actual relevant column you choose
    threshold = data['Normalized 1'].mean()
    data['High_Sales'] = (data['Normalized 1'] > threshold).astype(int)
    high_sales_created = True
elif 'Normalized 2' in data.columns:
    threshold = data['Normalized 2'].mean()
    data['High_Sales'] = (data['Normalized 2'] > threshold).astype(int)
    high_sales_created = True


else:
    print("No normalized columns found for defining high sales.")
    exit()

# Step 7: Encode categorical variables only if they exist


if 'Product_Code' in data.columns:
    data['Product_Code'] = data['Product_Code'].astype('category').cat.codes

# Step 8: Define the features (X) and target (y) only if 'High_Sales' was created
if high_sales_created:
    X = data.drop(columns=['High_Sales'])  # Features (assumes the remaining columns are numeric)
    y = data['High_Sales']  # Target

    # Step 9: Split the data into training and testing sets (80% train, 20% test)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Step 10: Initialize and train the RandomForestClassifier
    model = RandomForestClassifier(random_state=42)
    model.fit(X_train, y_train)

    # Step 11: Evaluate the model
    y_pred = model.predict(X_test)

    # Accuracy
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Accuracy: {accuracy * 100:.2f}%")

    # Confusion Matrix
    print("Confusion Matrix:")
    print(confusion_matrix(y_test, y_pred))

    # Classification Report
    print("Classification Report:")
    print(classification_report(y_test, y_pred))
else:
    print("No valid 'High_Sales' column created. The model will not be trained.")


Output:

Result:

A classifier was successfully implemented on the sales data, accurately predicting sales categories from the given features.


Ex.No:3 House Price Prediction

AIM:
To develop a predictive model for predicting house prices.

PROGRAM:
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Step 1: Load the California Housing Dataset


california = fetch_california_housing()

# Step 2: Convert the dataset to a pandas DataFrame


data = pd.DataFrame(california.data, columns=california.feature_names)
data['Price'] = california.target

# Step 3: Display the first few rows of the dataset


print("Dataset Overview:")
print(data.head())

# Step 4: Split the dataset into features (X) and target (y)
X = data.drop(columns=['Price']) # Features
y = data['Price'] # Target

# Step 5: Split the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 6: Initialize and train the Linear Regression model


model = LinearRegression()
model.fit(X_train, y_train)

# Step 7: Make predictions on the test set


y_pred = model.predict(X_test)

# Step 8: Evaluate the model


mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error: {mse:.2f}”)


print(f"R^2 Score: {r2:.2f}")

# Step 9: Display the coefficients


print("Model Coefficients:")
print(pd.DataFrame(model.coef_, X.columns, columns=['Coefficient']))
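
As a brief usage sketch (an illustration added here, not part of the recorded program), the trained model can score a single held-out sample:

# Hypothetical usage: predict the price of one test sample
sample = X_test.iloc[[0]]  # double brackets keep the DataFrame shape predict() expects
predicted_price = model.predict(sample)[0]
print(f"Predicted median house value (in units of $100,000): {predicted_price:.2f}")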

Output:

Result:

A predictive model was successfully developed, estimating California house prices from features such as median income, house age, and location.


Ex.No:4. FIND-S Algorithm

AIM:
To implement the FIND-S algorithm and verify that it successfully produces the trace for the
EnjoySport example.

PROGRAM:

import pandas as pd

# Training examples for the "Enjoy Sport" dataset


data = {
'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain', 'Rain', 'Overcast', 'Sunny', 'Sunny', 'Rain', 'Sunny',
'Overcast', 'Overcast', 'Rain'],
'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool', 'Mild', 'Cool', 'Mild', 'Mild', 'Mild', 'Hot',
'Mild'],
'Humidity': ['High', 'High', 'High', 'High', 'Low', 'Low', 'Low', 'High', 'Low', 'Low', 'High', 'High', 'Low',
'High'],
'Wind': ['Weak', 'Strong', 'Weak', 'Weak', 'Weak', 'Strong', 'Strong', 'Weak', 'Weak', 'Weak', 'Strong', 'Strong',
'Weak', 'Strong'],
'Enjoy Sport': ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No']
}

# Convert the data to a pandas DataFrame


df = pd.DataFrame(data)

# Filter the data to include only positive examples (Enjoy Sport == 'Yes')
positive_examples = df[df['Enjoy Sport'] == 'Yes']

# Step 1: Start with the first positive example as the hypothesis (the most specific hypothesis)
hypothesis = positive_examples.iloc[0, :-1].values.copy()

# Step 2: Iterate through the remaining positive examples and generalize the hypothesis
for index, example in positive_examples.iloc[1:].iterrows():
    for i in range(len(hypothesis)):
        # If the current hypothesis doesn't match this example, generalize that attribute
        if hypothesis[i] != example.iloc[i]:
            hypothesis[i] = '?'


# Output the resulting hypothesis


print(f"The most specific hypothesis for the 'Enjoy Sport' concept is: {hypothesis}")

Output:

Result:

The FIND-S algorithm was successfully implemented, identifying the most specific hypothesis for the EnjoySport dataset by generalizing from positive examples.


Ex.No:5 Decision Tree Algorithm

AIM:
To implement a decision tree algorithm for sales prediction/classification in the retail sector.

PROGRAM:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.preprocessing import LabelEncoder

# Sample retail sales data


data = {
'Product_ID': [101, 102, 103, 104, 105, 106, 107, 108],
'Category': ['Electronics', 'Furniture', 'Electronics', 'Clothing', 'Furniture', 'Clothing', 'Electronics',
'Furniture'],
'Region': ['North', 'East', 'West', 'South', 'North', 'West', 'East', 'South'],
'Promotion': ['Yes', 'No', 'Yes', 'Yes', 'No', 'No', 'Yes', 'Yes'],
'Sales': ['High', 'Low', 'High', 'High', 'Low', 'Low', 'High', 'High']
}

# Convert the data into a pandas DataFrame


df = pd.DataFrame(data)

# Encode categorical variables (Category, Region, Promotion, and Sales)


label_encoder = LabelEncoder()

df['Category'] = label_encoder.fit_transform(df['Category'])
df['Region'] = label_encoder.fit_transform(df['Region'])
df['Promotion'] = label_encoder.fit_transform(df['Promotion'])
df['Sales'] = label_encoder.fit_transform(df['Sales'])  # Target variable (alphabetical encoding: High=0, Low=1)

# Feature selection: Independent variables (X) and target variable (y)


X = df[['Category', 'Region', 'Promotion']] # Features
y = df['Sales'] # Target

# Split the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the Decision Tree Classifier


model = DecisionTreeClassifier(random_state=42)

# Train the model on the training data


model.fit(X_train, y_train)

# Make predictions on the test set


y_pred = model.predict(X_test)

# Evaluate the model


accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")

# Confusion Matrix
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))

# Classification Report
print("Classification Report:")
print(classification_report(y_test, y_pred))

# Visualizing the decision tree (Optional)
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 8))
# class_names must follow the encoded class order (High=0, Low=1)
plot_tree(model, feature_names=['Category', 'Region', 'Promotion'], class_names=['High', 'Low'], filled=True)
plt.show()
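
As a brief usage sketch (with hypothetical encoded values, matching the LabelEncoder mappings above), the trained tree can classify a new retail record:

# Hypothetical new record: encoded Category, Region, and Promotion values
new_record = pd.DataFrame([[0, 1, 1]], columns=['Category', 'Region', 'Promotion'])
prediction = model.predict(new_record)[0]
print('High sales' if prediction == 0 else 'Low sales')  # alphabetical encoding: High=0, Low=1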


Output:

Result:

A decision tree algorithm was successfully implemented, effectively classifying and predicting sales trends in the retail sector based on historical data.


Ex.No:6 Back Propagation Algorithm

AIM:
To implement the back propagation algorithm for stock price prediction.

PROGRAM:
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
import matplotlib.pyplot as plt

# Step 1: Create a dummy stock price dataset


np.random.seed(42)
dates = pd.date_range('2022-01-01', periods=1000, freq='B')  # 1000 business days
stock_prices = np.random.uniform(100, 500, size=(1000,)) # Random prices between 100 and 500

# Create DataFrame
data = pd.DataFrame({'Date': dates, 'Close': stock_prices})

# Preview the first few rows


print(data.head())

# Step 2: Use only 'Close' price for prediction


stock_data = data[['Close']]

# Step 3: Normalize the data using MinMaxScaler


scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(stock_data)

# Step 4: Prepare the data for training


def create_dataset(data, time_step=1):
    X, y = [], []
    for i in range(len(data) - time_step - 1):
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

time_step = 60 # Use the last 60 days to predict the next day's price
X, y = create_dataset(scaled_data, time_step)


# Step 5: Keep X two-dimensional (samples, time_step); MLPRegressor expects 2-D input
X = X.reshape(X.shape[0], X.shape[1])

# Step 6: Split the data into training and testing sets
# (note: a random split on time-series data leaks future information into training;
# a chronological split would be more realistic)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 7: Initialize the MLPRegressor model


model = MLPRegressor(hidden_layer_sizes=(50, 50), max_iter=1000)

# Step 8: Train the model


model.fit(X_train, y_train)

# Step 9: Make predictions


predictions = model.predict(X_test)

# Step 10: Inverse the scaling to get the actual price values
predictions = scaler.inverse_transform(predictions.reshape(-1, 1))
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))

# Step 11: Plot the actual vs predicted values


plt.figure(figsize=(12,6))
plt.plot(y_test_actual, color='blue', label='Actual Stock Price')
plt.plot(predictions, color='red', label='Predicted Stock Price')
plt.title('Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Stock Price')
plt.legend()
plt.show()

Output:


Result:

The backpropagation-based MLP model effectively predicted stock prices using past data. The plotted results
showed a close alignment between actual and predicted values, demonstrating the model's capability to capture
stock price trends.


Ex.No:7 Insurance Fraud Detection

AIM:
To implement a clustering algorithm for insurance fraud detection.

PROGRAM:
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Step 1: Generate a synthetic dataset for insurance fraud detection


# For the sake of this example, let's assume we have 5 features related to the insurance claim.
# 1. Claim amount
# 2. Age of the policyholder
# 3. Number of claims made
# 4. Number of previous fraud reports
# 5. Claim type (e.g., medical, auto, etc.)

# Create random data to simulate this scenario


np.random.seed(42)

n_samples = 1000
claim_amount = np.random.normal(2000, 500, n_samples) # Normal distribution for claim amount
age = np.random.normal(40, 10, n_samples) # Normal distribution for age
num_claims = np.random.randint(1, 10, n_samples) # Random number of claims
num_previous_frauds = np.random.randint(0, 2, n_samples) # 0 for non-fraudulent, 1 for fraudulent
claim_type = np.random.randint(0, 3, n_samples) # 0,1,2 representing different types of claims

# Simulate fraud data by adding some extreme claim amounts and a higher chance of previous fraud for fraudulent cases
fraudulent_indices = np.random.choice(n_samples, size=50, replace=False)
claim_amount[fraudulent_indices] = np.random.normal(10000, 5000, size=50)  # High fraudulent claim amounts
num_previous_frauds[fraudulent_indices] = 1  # Assign fraud flag to these rows

# Create a DataFrame
data = pd.DataFrame({
    'Claim Amount': claim_amount,
    'Age': age,
    'Num Claims': num_claims,
    'Num Previous Frauds': num_previous_frauds,
    'Claim Type': claim_type
})

# Step 2: Preprocess the data (Normalization)


scaler = StandardScaler()
data_scaled = scaler.fit_transform(data)

# Step 3: Apply K-Means Clustering


kmeans = KMeans(n_clusters=2, random_state=42) # We assume 2 clusters: fraudulent and non-fraudulent
kmeans.fit(data_scaled)

# Step 4: Assign labels to the data


data['Cluster'] = kmeans.labels_

# Step 5: Visualize the clusters (first two features)


plt.figure(figsize=(8, 6))
plt.scatter(data['Claim Amount'], data['Age'], c=data['Cluster'], cmap='viridis', label='Cluster')
plt.title('Insurance Fraud Detection Clusters')
plt.xlabel('Claim Amount')
plt.ylabel('Age of Policyholder')
plt.colorbar(label='Cluster')
plt.show()

# Step 6: Analyze the clusters


print(data.head())

# Check number of samples in each cluster


cluster_counts = data['Cluster'].value_counts()
print(f"Number of samples in each cluster:\n{cluster_counts}")

# Optionally, check for outliers or anomalies in the larger claim amounts or fraudulent cases (see the sketch below).
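
As a possible follow-up (a sketch using the variables above, not part of the recorded output), each cluster can be profiled to see which one captures the simulated fraud, and unusually large claims can be flagged:

# Profile each cluster on mean claim amount and fraud-flag rate
cluster_profile = data.groupby('Cluster')[['Claim Amount', 'Num Previous Frauds']].mean()
print(cluster_profile)

# Flag claims far above the overall mean as potential outliers
threshold = data['Claim Amount'].mean() + 3 * data['Claim Amount'].std()
outliers = data[data['Claim Amount'] > threshold]
print(f"Potential outlier claims: {len(outliers)}")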


Output:

Result:

A clustering algorithm was successfully implemented, grouping insurance claims to detect potential fraud patterns based on anomalies and similarities.


Ex.No:8 Clustering Algorithm for Cancer Prediction

AIM:
To implement a clustering algorithm for identifying cancerous data.

PROGRAM:
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix

# Step 1: Load the Breast Cancer Wisconsin dataset


cancer = load_breast_cancer()
X = cancer.data # Features
y = cancer.target # Actual labels (0 for benign, 1 for malignant)

# Step 2: Preprocess the data (standardize the features)


scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Step 3: Apply K-Means clustering


kmeans = KMeans(n_clusters=2, random_state=42) # 2 clusters: benign and malignant
kmeans.fit(X_scaled)

# Step 4: Assign clusters to the data


y_pred = kmeans.labels_ # Clustering labels from K-Means

# Step 5: Evaluate the clustering performance


# Since the dataset has ground truth (y), we can compare the predicted labels with the actual ones.
# Note: K-Means cluster IDs are arbitrary, so the confusion matrix may appear inverted
# (see the alignment sketch after the program).
print("Confusion Matrix:")
print(confusion_matrix(y, y_pred))
print("\nClassification Report:")
print(classification_report(y, y_pred))

# Step 6: Visualizing the clusters (using the first two features for simplicity)
plt.figure(figsize=(8, 6))
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y_pred, cmap='viridis', label='Cluster')
plt.title('Breast Cancer Clustering with K-Means')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.colorbar(label='Cluster')
plt.show()

# Optional: Check the centers of the clusters


print("Cluster Centers:")
print(kmeans.cluster_centers_)
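
Because K-Means assigns arbitrary cluster IDs, cluster 0 may correspond to malignant rather than benign. A minimal sketch (using the variables above; an addition for illustration) that aligns the predicted labels with the ground truth before evaluating:

# If the clustering disagrees with the true labels on more than half the samples,
# the cluster IDs are most likely flipped, so invert them before evaluating
y_pred_aligned = 1 - y_pred if np.mean(y_pred == y) < 0.5 else y_pred
print("Aligned Confusion Matrix:")
print(confusion_matrix(y, y_pred_aligned))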

Output:

Result:

A clustering algorithm was successfully implemented, effectively distinguishing cancerous and non-cancerous data based on feature similarities.

Ex.No:9 Reinforcement Learning

AIM:
To apply reinforcement learning and develop a game of your own.

PROGRAM:
import numpy as np
import random
import matplotlib.pyplot as plt

# Step 1: Set up the environment


class GridWorld:
    def __init__(self, size=5, goal=(4, 4)):
        self.size = size
        self.grid = np.zeros((size, size))
        self.goal = goal
        self.position = (0, 0)  # Initial position at top-left corner

    def reset(self):
        self.position = (0, 0)  # Reset to initial position
        return self.position

    def step(self, action):
        # Define possible actions: 0 - up, 1 - down, 2 - left, 3 - right
        if action == 0:  # up
            new_position = (self.position[0] - 1, self.position[1])
        elif action == 1:  # down
            new_position = (self.position[0] + 1, self.position[1])
        elif action == 2:  # left
            new_position = (self.position[0], self.position[1] - 1)
        else:  # right
            new_position = (self.position[0], self.position[1] + 1)

        # Check boundaries of the grid
        if (new_position[0] < 0 or new_position[0] >= self.size or
                new_position[1] < 0 or new_position[1] >= self.size):
            new_position = self.position  # Stay in place if out of bounds

        self.position = new_position

        # Reward function: goal position has a reward of +1, else -0.1 for each step
        if self.position == self.goal:
            return self.position, 1  # Reward for reaching the goal
        else:
            return self.position, -0.1  # Small negative reward for each step

    def render(self):
        grid = np.full((self.size, self.size), " ")
        grid[self.goal] = "G"  # Goal


grid[self.position] = "A" # Agent


print("\n".join([" ".join(row) for row in grid]))
print()

# Step 2: Set up Q-learning
class QLearningAgent:
    def __init__(self, actions, learning_rate=0.1, discount_factor=0.9, epsilon=0.2):
        self.actions = actions
        self.learning_rate = learning_rate
        self.discount_factor = discount_factor
        self.epsilon = epsilon  # Exploration-exploitation tradeoff
        self.q_table = {}  # Q-table for storing state-action values

    def get_q_value(self, state, action):
        return self.q_table.get((state, action), 0.0)

    def update_q_value(self, state, action, reward, next_state):
        max_next_q = max([self.get_q_value(next_state, a) for a in self.actions])
        current_q = self.get_q_value(state, action)
        new_q = current_q + self.learning_rate * (reward + self.discount_factor * max_next_q - current_q)
        self.q_table[(state, action)] = new_q

    def choose_action(self, state):
        if random.uniform(0, 1) < self.epsilon:
            return random.choice(self.actions)  # Exploration: random action
        else:
            q_values = [self.get_q_value(state, action) for action in self.actions]
            max_q = max(q_values)
            best_actions = [i for i, q in enumerate(q_values) if q == max_q]
            return random.choice(best_actions)  # Exploitation: best action

# Step 3: Train the agent
def train_agent(epochs=1000):
    env = GridWorld()
    agent = QLearningAgent(actions=[0, 1, 2, 3])  # 4 possible actions
    rewards = []

    for epoch in range(epochs):
        state = env.reset()
        total_reward = 0

        done = False
        while not done:
            action = agent.choose_action(state)
            next_state, reward = env.step(action)
            agent.update_q_value(state, action, reward, next_state)
            total_reward += reward
            state = next_state

            if state == env.goal:
                done = True  # Episode ends when goal is reached

        rewards.append(total_reward)
        if epoch % 100 == 0:
            print(f"Epoch {epoch}: Total Reward = {total_reward}")

    # Return the trained agent along with the reward history so its learned
    # Q-table can be reused during testing
    return agent, rewards

# Step 4: Test the trained agent
def test_agent():
    env = GridWorld()

    # Train an agent and reuse its learned Q-table for testing
    # (a freshly created agent would otherwise act on an empty Q-table)
    agent, _ = train_agent(epochs=1000)

    # Test the trained agent
    state = env.reset()
    total_reward = 0
    steps = 0
    while state != env.goal and steps < 100:
        env.render()
        action = agent.choose_action(state)
        state, reward = env.step(action)
        total_reward += reward
        steps += 1

    env.render()
    print(f"Total reward after {steps} steps: {total_reward}")

# Train and test the agent (test_agent trains its own agent, so two training runs are printed)
train_agent(epochs=1000)
test_agent()

Output:
Epoch 0: Total Reward = -0.30000000000000004
Epoch 100: Total Reward = -0.09999999999999987
Epoch 200: Total Reward = 0.10000000000000009
Epoch 300: Total Reward = -0.09999999999999987
Epoch 400: Total Reward = 1.1102230246251565e-16
Epoch 500: Total Reward = -0.19999999999999996
Epoch 600: Total Reward = 0.30000000000000004
Epoch 700: Total Reward = -0.19999999999999996
Epoch 800: Total Reward = 0.20000000000000007
Epoch 900: Total Reward = 0.10000000000000009
Epoch 0: Total Reward = -1.0000000000000004
Epoch 100: Total Reward = 0.10000000000000009
Epoch 200: Total Reward = -0.30000000000000004
Epoch 300: Total Reward = 0.20000000000000007
Epoch 400: Total Reward = 0.10000000000000009
Epoch 500: Total Reward = 0.30000000000000004
Epoch 600: Total Reward = 0.30000000000000004
Epoch 700: Total Reward = 0.20000000000000007


Epoch 800: Total Reward = 0.20000000000000007


Epoch 900: Total Reward = -0.40000000000000013
... (grid renderings omitted) ...


Total reward after 10 steps: -0.9999999999999999

Result:

The final results after training and testing the reinforcement learning agent in the Grid World environment are as follows:
During training, the agent gradually learned to navigate the grid, with total rewards fluctuating as it explored different paths. Some episodes resulted in negative rewards due to inefficient movements, while others showed improvement, reaching the goal in fewer steps.
After testing, the agent took 10 steps and achieved a total reward of -1.0, indicating that it may still need further training or hyperparameter tuning (for example, decaying epsilon, as sketched below) to improve efficiency.
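
One possible refinement (a sketch based on the classes above; not part of the recorded experiment) is to decay epsilon across episodes, so the agent explores early and exploits its learned Q-table later:

# Hypothetical epsilon-decay training schedule for the agent above
def train_with_decay(epochs=1000, eps_start=0.9, eps_end=0.05, decay=0.995):
    env = GridWorld()
    agent = QLearningAgent(actions=[0, 1, 2, 3], epsilon=eps_start)
    for epoch in range(epochs):
        state = env.reset()
        done = False
        while not done:
            action = agent.choose_action(state)
            next_state, reward = env.step(action)
            agent.update_q_value(state, action, reward, next_state)
            state = next_state
            done = (state == env.goal)
        # Gradually shift from exploration toward exploitation
        agent.epsilon = max(eps_end, agent.epsilon * decay)
    return agent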


Ex.No:10 Traffic Signal Control System

AIM:
To develop a traffic signal control system using reinforcement learning technique.

PROGRAM:
import numpy as np
import random

# Traffic Signal Environment with Multiple States


class TrafficSignalEnv:
    def __init__(self):
        self.states = [0, 1, 2]  # 0: Red, 1: Green, 2: Yellow (additional state for complexity)
        self.actions = [0, 1]  # 0: Do nothing (keep signal), 1: Switch signal
        self.state = 0  # Initial state (Red light)
        self.traffic = [10, 20, 5]  # Simple representation of traffic (e.g., cars)
        self.reward = 0

    def reset(self):
        self.state = 0  # Reset state to Red
        return self.state

    def step(self, action):
        if self.state == 0:  # Red Light
            if action == 1:
                self.state = 1  # Switch to Green
                self.reward = 10  # Reward for switching to Green
            else:
                self.reward = -1  # Penalty for doing nothing

        elif self.state == 1:  # Green Light
            if action == 1:
                self.state = 2  # Switch to Yellow
                self.reward = 5  # Reward for switching to Yellow
            else:
                self.reward = -1  # Penalty for doing nothing

        elif self.state == 2:  # Yellow Light
            if action == 1:
                self.state = 0  # Switch back to Red
                self.reward = 2  # Reward for switching to Red
            else:
                self.reward = -1  # Penalty for doing nothing

        return self.state, self.reward

# Q-learning Agent for Traffic Signal Control
class QLearningAgent:
    def __init__(self, n_actions, n_states, learning_rate=0.1, discount_factor=0.9, epsilon=0.9):
        self.n_actions = n_actions
        self.n_states = n_states
        self.learning_rate = learning_rate
        self.discount_factor = discount_factor
        self.epsilon = epsilon
        self.q_table = np.zeros((n_states, n_actions))  # Initialize Q-table with zeros

    def select_action(self, state):
        if random.uniform(0, 1) < self.epsilon:  # Exploration
            return random.choice(range(self.n_actions))
        else:  # Exploitation
            return np.argmax(self.q_table[state])

    def update_q_table(self, state, action, reward, next_state):
        best_next_action = np.argmax(self.q_table[next_state])  # Find the best action for the next state
        q_value = self.q_table[state, action]
        max_future_q = self.q_table[next_state, best_next_action]

        # Update Q-value using the Bellman equation
        self.q_table[state, action] = q_value + self.learning_rate * (reward + self.discount_factor * max_future_q - q_value)

# Training the Agent
env = TrafficSignalEnv()
agent = QLearningAgent(n_actions=2, n_states=3)

epochs = 100
steps_per_epoch = 10

for epoch in range(epochs):
    total_reward = 0
    state = env.reset()  # Reset the environment at the start of each epoch

    for step in range(steps_per_epoch):
        action = agent.select_action(state)  # Select an action based on the current state
        next_state, reward = env.step(action)  # Take the action and receive the next state and reward

        # Update Q-table
        agent.update_q_table(state, action, reward, next_state)

        # Accumulate total reward for this epoch
        total_reward += reward

        state = next_state  # Update the current state

    print(f"Epoch {epoch + 1}: Total Reward = {total_reward}")

    # Stopping condition if learning stabilizes early
    # (note: with the reward scheme above the best possible total per epoch is 61,
    # so this condition never actually fires; it is kept as a safeguard)
    if epoch > 0 and total_reward == steps_per_epoch * 10:
        print(f"Learning stabilized at epoch {epoch + 1}.")
        break

# Print the final Q-table after training


print("\nFinal Q-table:")
print(agent.q_table)

Output:
Epoch 1: Total Reward = 41
Epoch 2: Total Reward = 30
Epoch 3: Total Reward = 27
Epoch 4: Total Reward = 27
Epoch 5: Total Reward = 30
Epoch 6: Total Reward = 41
Epoch 7: Total Reward = 10
Epoch 8: Total Reward = 27
Epoch 9: Total Reward = 30
Epoch 10: Total Reward = 50
Epoch 11: Total Reward = 30
Epoch 12: Total Reward = 30
Epoch 13: Total Reward = 21
Epoch 14: Total Reward = 10
Epoch 15: Total Reward = 21
Epoch 16: Total Reward = 47
Epoch 17: Total Reward = 21
Epoch 18: Total Reward = 10
Epoch 19: Total Reward = 50
Epoch 20: Total Reward = 41
Epoch 21: Total Reward = 30
Epoch 22: Total Reward = 27
Epoch 23: Total Reward = 27
Epoch 24: Total Reward = 30

Epoch 25: Total Reward = 41


Epoch 26: Total Reward = 41
Epoch 27: Total Reward = 21
Epoch 28: Total Reward = 41
Epoch 29: Total Reward = 10
Epoch 30: Total Reward = 41
Epoch 31: Total Reward = 27
Epoch 32: Total Reward = 27
Epoch 33: Total Reward = 30
Epoch 34: Total Reward = 27
Epoch 35: Total Reward = 27
Epoch 36: Total Reward = 7
Epoch 37: Total Reward = 21
Epoch 38: Total Reward = 21
Epoch 39: Total Reward = 27
Epoch 40: Total Reward = 30
Epoch 41: Total Reward = 27
Epoch 42: Total Reward = 21
Epoch 43: Total Reward = 41
Epoch 44: Total Reward = 27
Epoch 45: Total Reward = 10
Epoch 46: Total Reward = 21
Epoch 47: Total Reward = 1
Epoch 48: Total Reward = 30
Epoch 49: Total Reward = 47
Epoch 50: Total Reward = 27
Epoch 51: Total Reward = 27
Epoch 52: Total Reward = 30
Epoch 53: Total Reward = 10
Epoch 54: Total Reward = 30
Epoch 55: Total Reward = 27
Epoch 56: Total Reward = 41
Epoch 57: Total Reward = 27
Epoch 58: Total Reward = 41
Epoch 59: Total Reward = 30
Epoch 60: Total Reward = 10
Epoch 61: Total Reward = 30
Epoch 62: Total Reward = 21
Epoch 63: Total Reward = 27
Epoch 64: Total Reward = 30
Epoch 65: Total Reward = 27
Epoch 66: Total Reward = 10
Epoch 67: Total Reward = 10
Epoch 68: Total Reward = 30

Epoch 69: Total Reward = 21


Epoch 70: Total Reward = 10
Epoch 71: Total Reward = 41
Epoch 72: Total Reward = 47
Epoch 73: Total Reward = 27
Epoch 74: Total Reward = 27
Epoch 75: Total Reward = 27
Epoch 76: Total Reward = 47
Epoch 77: Total Reward = 47
Epoch 78: Total Reward = 21
Epoch 79: Total Reward = 27
Epoch 80: Total Reward = 41
Epoch 81: Total Reward = 41
Epoch 82: Total Reward = 21
Epoch 83: Total Reward = 30
Epoch 84: Total Reward = 10
Epoch 85: Total Reward = 30
Epoch 86: Total Reward = 30
Epoch 87: Total Reward = 27
Epoch 88: Total Reward = 27
Epoch 89: Total Reward = 27
Epoch 90: Total Reward = 41
Epoch 91: Total Reward = 10
Epoch 92: Total Reward = 21
Epoch 93: Total Reward = 27
Epoch 94: Total Reward = 30
Epoch 95: Total Reward = 30
Epoch 96: Total Reward = 41
Epoch 97: Total Reward = 30
Epoch 98: Total Reward = 21
Epoch 99: Total Reward = 41
Epoch 100: Total Reward = 41

Final Q-table:
[[43.28894396 50.26009451]
[38.81486928 45.50582136]
[38.16273778 46.17528699]]

Result:

A reinforcement learning-based traffic signal control system was successfully developed; the agent learned a signal-switching policy that maximizes the defined rewards, illustrating how such a technique could be used to optimize signal timings and improve traffic flow.

