114AG01 Intro To AI & ML
L. D. College of Engineering
Opp Gujarat University, Navrangpura, Ahmedabad - 380015
LAB MANUAL
Honors/Minor Degree: Artificial Intelligence and Machine Learning
Faculty Details:
1. Prof. M. K. Shah
2. Dr. N. H. Domadiya
3. Prof. H. D. Rajput
Introduction to AI and Machine Learning (114AG01)
Practical List
Subject Name: Introduction to AI and Machine Learning (114AG01)
Term: 2023-2024
Table of Contents
Practical-6: Write the code in Kanren to demonstrate the constraint system.
Practical-10: Write the code in python to implement logistic regression for single class classification.
Practical-11: Write the code in python to implement logistic regression for multi class classification.
Practical-1
Implement the following programs in Python: Basics of Python
Program
mode = int(input("""Enter mode:
(1) Convert from Celsius to Fahrenheit
(2) Convert from Fahrenheit to Celsius
> """))
if mode == 1:
    celsius = float(input("Enter temperature in celsius: "))
    fahrenheit = (celsius * 9 / 5) + 32
    print(f'{celsius:.2f} Celsius is: {fahrenheit:0.2f} Fahrenheit')
elif mode == 2:
    fahrenheit = float(input("Enter temperature in fahrenheit: "))
    celsius = (fahrenheit - 32) * 5 / 9
    print(f'{fahrenheit:.2f} Fahrenheit is: {celsius:0.2f} Celsius')
else:
    print("Invalid mode selected")
Output:
Enter mode:
(1) Convert from Celsius to Fahrenheit
(2) Convert from Fahrenheit to Celsius
> 1
Enter temperature in celsius: 43
43.00 Celsius is: 109.40 Fahrenheit
operation = int(input("""Select Mode:
1. Addition
2. Subtraction
3. Multiplication
4. Division
> """))
n1 = float(input("Enter the First Number: "))
n2 = float(input("Enter the Second Number: "))
if operation == 1:
    print(n1, "+", n2, "=", (n1 + n2))
elif operation == 2:
    print(n1, "-", n2, "=", (n1 - n2))
elif operation == 3:
    print(n1, "*", n2, "=", (n1 * n2))
elif operation == 4:
    print(n1, "/", n2, "=", (n1 / n2))
else:
    print("Invalid Input")
Output:
Select Mode:
1. Addition
2. Subtraction
3. Multiplication
4. Division
> 3
Enter the First Number: 5
Enter the Second Number: 9
5.0 * 9.0 = 45.0
import math

def is_prime(n):
    if n < 2:
        return False
    if n % 2 == 0 and n > 2:
        return False
    return all(n % i for i in range(3, int(math.sqrt(n)) + 1, 2))
def input_to_tuple(i):
    name = input(f"Enter name for student {i + 1}: ")
    cpi = input(f"Enter CPI for student {i + 1}: ")
    return name, float(cpi)
# Roll numbers already present (assumed from the sample output below)
students = {'123', '456'}
while True:
    mode = int(input("""Enter mode:
(1) Add student roll number
(2) Delete student roll number
(3) Display students' roll numbers
(4) Exit
> """))
    if mode == 1:
        s = input("Enter roll number to add: ")
        students.add(s)
    elif mode == 2:
        s = input("Enter roll number to remove: ")
        students.remove(s)
    elif mode == 3:
        print(students)
    else:
        break
Output:
Enter mode:
(1) Add student roll number
(2) Delete student roll number
(3) Display students' roll numbers
(4) Exit
> 1
Enter roll number to add: 782
Enter mode:
(1) Add student roll number
(2) Delete student roll number
(3) Display students' roll numbers
(4) Exit
> 3
{'123', '456', '782'}
Enter mode:
(1) Add student roll number
(2) Delete student roll number
(3) Display students' roll numbers
(4) Exit
> 2
Enter roll number to remove: 123
Enter mode:
(1) Add student roll number
(2) Delete student roll number
(3) Display students' roll numbers
(4) Exit
> 2
Enter roll number to remove: 782
Enter mode:
(1) Add student roll number
(2) Delete student roll number
(3) Display students' roll numbers
(4) Exit
> 3
{'456'}
Enter mode:
(1) Add student roll number
(2) Delete student roll number
(3) Display students' roll numbers
(4) Exit
> 4
from pprint import pprint

student = {'Branch': 'EC', 'EnrollmentNo': '230283111005', 'Name': 'Aryan Chudasama'}
print("Original dictionary: ")
pprint(student)

del student["Name"]
print("After removal: ")
pprint(student)

student["FullName"] = 'Chudasama Aryan Hiteshbhai'
print("After insertion: ")
pprint(student)
Output:
Original dictionary:
{'Branch': 'EC', 'EnrollmentNo': '230283111005', 'Name': 'Aryan Chudasama'}
After removal:
{'Branch': 'EC', 'EnrollmentNo': '230283111005'}
After insertion:
{'Branch': 'EC',
'EnrollmentNo': '230283111005',
'FullName': 'Chudasama Aryan Hiteshbhai'}
9. Write a program to read and display the contents of a file. Also display
the number of lines and words in the given file.
Answer:
./test_file.txt
./Prac1-9.py
with open('./test_file.txt') as f:
    data = str(f.read())

num_lines = len(data.splitlines())
num_words = len(data.split())
print(f"""File contents:
{data}
END
Number of lines: {num_lines}
Number of words: {num_words}
""")
Output:
File contents:
This is a test file.
Alpha beta gamma.
Beep boop.
END
Number of lines: 3
Number of words: 10
Practical-2
Study the numpy, pandas, Scikit-learn and matplotlib libraries.
Pandas:
Pandas is a powerful Python library for data manipulation and analysis. It provides
easy-to-use data structures such as DataFrames, which are tabular structures that
allow you to store and manipulate data efficiently. Pandas offers a wide range of
functions and methods for data cleaning, preprocessing, merging, grouping, and
more. It is widely used for data wrangling and exploration tasks in data science
projects.
NumPy:
NumPy is the fundamental package for numerical computing in Python. It provides
the N-dimensional array object together with fast, vectorized mathematical
operations, broadcasting, linear algebra routines, and random number generation.
Most of the scientific Python ecosystem, including Pandas and Scikit-learn, is built
on top of NumPy arrays.
Matplotlib:
Matplotlib is a plotting library for creating static, animated, and interactive
visualizations in Python. Its pyplot interface provides a simple way to produce line
plots, scatter plots, bar charts, histograms, and other figures, with fine-grained
control over figures, axes, labels, and styles.
Scikit-learn:
Scikit-learn is a machine learning library built on top of NumPy and SciPy. It
provides simple, consistent tools to build machine learning models and evaluate
their performance. It also includes utilities for data preprocessing, feature
extraction, and model evaluation.
Seaborn:
Seaborn is a statistical data visualization library built on top of Matplotlib. It
provides a high-level interface for drawing attractive and informative statistical
graphics, and it integrates closely with Pandas DataFrames.
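A short end-to-end example tying these libraries together (a sketch for illustration; the data values here are arbitrary):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# NumPy: generate noisy linear data as arrays
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3 * x + 2 + rng.normal(size=50)

# Pandas: wrap the arrays in a DataFrame and summarize them
df = pd.DataFrame({"x": x, "y": y})
print(df.describe())

# Scikit-learn: fit a simple linear model
model = LinearRegression().fit(df[["x"]], df["y"])
print("slope:", model.coef_[0], "intercept:", model.intercept_)

# Matplotlib: visualize the data and the fitted line
plt.scatter(df["x"], df["y"], s=10)
plt.plot(df["x"], model.predict(df[["x"]]), color="red")
plt.show()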
Practical-3
Getting Started with Python Logic Programming using Kanren and
SymPy packages.
Kanren:
kanren enables one to express sophisticated relations—in the form of goals—and
generate values that satisfy the relations. The following code is the "Hello, world!" of
logic programming; it asks for values of the logic variable x such that x == 5:
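A minimal sketch of that example, following kanren's documented API (the original code cell did not survive extraction):

>>> from kanren import run, var, eq
>>> x = var()
>>> run(1, x, eq(x, 5))
(5,)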
Multiple logic variables and goals can be used simultaneously. The following code
asks for one list containing the values of x and z such that x == z and z == 3:
>>> z = var()
>>> run(1, [x, z], eq(x, z),
...                eq(z, 3))
([3, 3],)
kanren uses unification to match forms within expression trees. The following code
asks for values of x such that (1, 2) == (1, x):
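A sketch of that query, again following the library's documented example:

>>> run(1, x, eq((1, 2), (1, x)))
(2,)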
The examples above made implicit use of the goal constructors lall and lany, which
represent goal conjunction and disjunction, respectively. Many useful relations can be
built by composing these goal constructors.
Representing Knowledge
kanren stores data as facts that state relationships between terms. The following
code creates a parent relationship and uses it to state facts about who is a parent of
whom within the Simpsons family:
We can use intermediate variables for more complex queries. For instance, who is
Bart's grandfather?
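A sketch of both queries, using kanren's Relation and facts helpers (the original cells were lost in extraction):

>>> from kanren import Relation, facts
>>> parent = Relation()
>>> facts(parent, ("Homer", "Bart"),
...               ("Homer", "Lisa"),
...               ("Abe", "Homer"))
>>> run(1, x, parent(x, "Bart"))
('Homer',)
>>> y = var()
>>> run(1, x, parent(x, y), parent(y, "Bart"))
('Abe',)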
Introduction to sympy
What is Symbolic Computation?
Symbolic computation deals with the computation of mathematical objects
symbolically. This means that the mathematical objects are represented exactly, not
approximately, and mathematical expressions with unevaluated variables are left in
symbolic form.
import sympy
from sympy import *
import math
from IPython.display import display
init_printing(use_latex="svg", use_unicode=True)
math.sqrt(9)
9 is a perfect square, so we got the exact answer, 3. But suppose we computed the
square root of a number that isn’t a perfect square
math.sqrt(8)
Here we got an approximate result. 2.82842712475 is not the exact square root of 8
(indeed, the actual square root of 8 cannot be represented by a finite decimal, since
it is an irrational number). If all we cared about was the decimal form of the square
root of 8, we would be done.
But suppose we want to go further. Recall that sqrt(8) = sqrt(4 * 2) = 2 * sqrt(2). We
would have a hard time deducing this from the above result. This is where symbolic
computation comes in. With a symbolic computation system like SymPy, square roots
of numbers that are not perfect squares are left unevaluated by default
sympy.sqrt(3)
sympy.sqrt(8)
The above example starts to show how we can manipulate irrational numbers exactly
using SymPy. But it is much more powerful than that. Symbolic computation systems
(which, by the way, are also often called computer algebra systems, or CASs, such as
Mathematica or Maple) such as SymPy are capable of computing symbolic expressions with variables.
As we will see later, in SymPy, variables are defined using symbols. Unlike many
symbolic manipulation systems, variables in SymPy must be defined before they are
used.
Let us define a symbolic expression, representing the mathematical expression x + 2y.
x, y = symbols('x y')
expr = x + 2*y
expr
Note that we wrote x + 2*y just as we would if x and y were ordinary Python
variables. But in this case, instead of evaluating to something, the expression remains
as just x + 2*y. Now let us play around with it:
expr + 1
expr - x
Notice something in the above example. When we typed expr - x, we did not get x
+ 2*y - x, but rather just 2*y. The x and the -x automatically canceled one
another. This is similar to how sqrt(8) automatically turned into 2*sqrt(2) above. This
isn’t always the case in SymPy, however:
x*expr
Here, we might have expected x(x + 2y) to transform into x^2 + 2xy, but
instead we see that the expression was left alone. This is a common theme in SymPy.
Aside from obvious simplifications like x - x = 0 and simplification of radicals,
most simplifications are not performed automatically. This is because we might
prefer the factored form, or we might prefer the expanded form. Both forms are
useful in different circumstances. In SymPy, there are functions to go from one form
to the other
expanded_expr = expand(x*expr)
expanded_expr
factor(expanded_expr)
The real power of a symbolic computation system such as SymPy is the ability to do
all sorts of computations symbolically. SymPy can simplify expressions, compute
derivatives, integrals, and limits, solve equations, work with matrices, and much,
much more, and do it all symbolically. It includes modules for plotting, printing (like
2D pretty printed output of math formulas, or LaTeX), code generation, physics,
statistics, combinatorics, number theory, geometry, logic, and more.
x, t, z, nu = symbols('x t z nu')
Take a derivative and compute an integral (the definitions of exp1 and exp2 were lost in extraction; the expressions below are the standard SymPy tutorial examples and are assumptions):
exp1 = sin(x) * exp(x)
diff(exp1, x)
exp2 = exp(x) * sin(x) + exp(x) * cos(x)
integrate(exp2, x)
exp3 = sin(x**2)
exp3
Find a limit:
Limit(sin(x)/x, x, 0)
limit(sin(x)/x, x, 0)
Solve an equation
Equality(x ** 2 - 2, x)
solve(x**2 - 2, x)
Solve a differential equation (diff_eqn's definition was lost in extraction; a standard example is assumed):
y = Function('y')
diff_eqn = Eq(y(t).diff(t, t) - y(t), exp(t))
dsolve(diff_eqn, y(t))
Compute the eigenvalues of a matrix (mat's definition was also lost; a small example matrix is assumed):
mat = Matrix([[1, 2], [2, 2]])
mat.eigenvals()
A concrete example: Find the exponential Fourier series coefficients c(k) for
a triangular pulse.
import numpy as np
import sympy
from sympy import *
import math
from IPython.display import display
import matplotlib.pyplot as plt
from sympy.utilities.lambdify import lambdify
import builtins
init_printing(use_latex="svg", use_unicode=True)
t = sympy.symbols('t')
n, k = sympy.symbols('n k', integer=True)
x = sympy.Function('x')
# Parameters of the triangle wave (assumed values; the original definitions were lost
# in extraction, but the plots below show edges at multiples of pi, implying period 2*pi)
period_of_x = 2 * sympy.pi
amplitude_of_x = 1
# Determine fundamental_frequency = w0
fundamental_frequency = 2 * pi / period_of_x
half_period = period_of_x / 2
# Define a single period of the function
x_single_period = sympy.Piecewise(
    (amplitude_of_x / half_period * (t + half_period), (-half_period <= t) & (t <= 0)),
    (-amplitude_of_x / half_period * (t - half_period), (0 < t) & (t <= half_period)),
)
x_single_period
x = x_single_period.subs(
    t, (t - floor(t / period_of_x) * period_of_x - half_period))
x = simplify(x.subs(t, (t + half_period)))
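The lambdified version of x used for plotting was created in a cell that did not survive extraction; assuming the standard call:

x_np = lambdify(t, x, "numpy")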
actual_t = np.arange((-3*half_period).evalf(), (3*half_period).evalf(), 0.01)
plt.plot(actual_t, x_np(actual_t))
plt.title("Plot of triangle wave x(t)")
plt.ylabel("Amplitude")
plt.xlabel("Time")
plt.grid()
plt.xlim(float(actual_t[0]), float(actual_t[-1]))
plt.show()
t_0 = -half_period
# Find C0 by integrating x(t) over its period
computed_c_0 = 1 / period_of_x * integrate(
    x_single_period,       # Function
    (
        t,                 # Variable of integration
        t_0,               # Lower limit
        t_0 + period_of_x  # Upper limit
    )
)
computed_c_0
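The cell deriving the general coefficient was lost in extraction. A sketch consistent with the names used below, which integrates x(t)·exp(−jnω₀t) over one period and combines it with the n = 0 case:

# Find c_n for n != 0 by integrating x(t) * exp(-j*n*w0*t) over one period
computed_c_n_not_zero = simplify(1 / period_of_x * integrate(
    x_single_period * sympy.exp(-sympy.I * n * fundamental_frequency * t),
    (t, t_0, t_0 + period_of_x)
))
# Combine the n == 0 and n != 0 cases into one expression
computed_c_n = Piecewise(
    (computed_c_0, Eq(n, 0)),
    (computed_c_n_not_zero, True)
)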
Here we see that computed_c_n_not_zero has 3 cases: one for even coefficients, one
for odd coefficients, and one for zero. To compare against the coefficients found by
hand, we define the ideal values separately:
ideal_c_0 = amplitude_of_x / 2
ideal_c_n = Piecewise(
    (ideal_c_0, Eq(n, 0)),
    (0, Eq(n % 2, 0)),
    (2 * amplitude_of_x / pow(n * pi, 2), Eq(n % 2, 1))
)
ideal_c_n
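The numeric versions of both coefficient expressions, used in the comparison below, are assumed to come from the usual lambdify calls:

ideal_c_n_np = lambdify(n, ideal_c_n, "numpy")
computed_c_n_np = lambdify(n, computed_c_n, "numpy")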
n_max = 10
n_array = np.arange(-n_max, n_max)
# Compare the Fourier series coefficients: one set evaluated by hand, the other computed by SymPy
ideal_c_n_values = ideal_c_n_np(n_array)
computed_c_n_values = computed_c_n_np(n_array)
# Check that they are (at least approximately) the same.
assert np.all(np.abs(ideal_c_n_values - computed_c_n_values) < 0.00001)
ideal_c_n_values
plt.stem(n_array, ideal_c_n_values)
plt.grid()
plt.title("Plot of frequency spectrum of X(w)")
plt.ylabel("Magnitude")
plt.xlabel("n")
ideal_c_n
def get_reconstructed_x(n_max):
    individual_term_of_fourier_series = computed_c_n * \
        exp(sympy.I * n * fundamental_frequency * t)
    x_as_fourier_series = simplify(
        summation(individual_term_of_fourier_series.subs(n, k), (k, -n, n)))
    reconstructed_x = simplify(x_as_fourier_series.subs(n, n_max))
    reconstructed_x_np = lambdify(t, reconstructed_x, "numpy")
    return reconstructed_x, reconstructed_x_np
Reconstruct x(t) from its Fourier series expansion, plot it and compare it with its original expression:
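The call that produced reconstructed_x_np was lost in extraction; with n_max = 10 as set earlier, it would be:

reconstructed_x, reconstructed_x_np = get_reconstructed_x(n_max)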
actual_t = np.arange((-3*half_period).evalf(), (3*half_period).evalf(), 0.01, dtype=float)
original_x_vals = x_np(actual_t)
reconstructed_x_vals = np.real(reconstructed_x_np(actual_t))
diff = np.abs(original_x_vals - reconstructed_x_vals)
max_diff = np.max(diff)
rms_diff = np.sqrt(np.mean(diff ** 2))
display(max_diff)
display(rms_diff)
Here, we see that their plots line up almost perfectly, with the sharp edge at times t =
0, -pi, +pi, -2pi, +2pi, etc., having been smoothed out in the reconstructed version.
If we plot the reconstructed x(t) alone, like so:
plt.plot(actual_t, reconstructed_x_vals)
plt.grid()
plt.title("Plot of triangle wave x(t)'s reconstructed variant")
plt.ylabel("Amplitude")
plt.xlabel("Time")
plt.xlim([actual_t[0], actual_t[-1]])
We see that the function is a bit wobbly, and the edges have been smoothed out
greatly compared to the original triangle wave. This is due to reconstruction error,
which can be reduced by increasing n_max.
For example, for n_max = 30:
n_max = 30
accurate_rec_x, accurate_rec_x_np = get_reconstructed_x(n_max)
accurate_rec_x
accurate_rec_x_vals = np.real(accurate_rec_x_np(actual_t))
diff = np.abs(original_x_vals - accurate_rec_x_vals)
max_diff = np.max(diff)
rms_diff = np.sqrt(np.mean(diff ** 2))
plt.plot(actual_t, accurate_rec_x_vals)
plt.grid()
plt.title(f"Plot of x(t) reconstructed with n_max = {n_max}")
plt.ylabel("Amplitude")
plt.xlabel("Time")
plt.xlim([actual_t[0], actual_t[-1]])
Practical-4
Write the code in Kanren to demonstrate the following:
a) The use of logical variables.
Answer:
from kanren import run, var, eq, membero
from pprint import pprint

x = var()
x = var()
# Find x, such that x is a member of (1, 2, 3), and x is a member of (2, 3, 4)
print("x, such that x is a member of (1, 2, 3), and x is a member of (2, 3, 4): ")
pprint(run(0, x, membero(x, (1, 2, 3)), membero(x, (2, 3, 4))))
Output:
x, such that x is a member of (1, 2, 3), and x is a member of (2, 3, 4):
(2, 3)
Practical-5
Write the code in Kanren to create parent and grandparent
relationships and use it to state facts and query based on the facts.
Answer:
from kanren import run, var, Relation, facts
from pprint import pprint

x = var()
parent = Relation()
facts(parent, ("Homer", "Bart"), ("Homer", "Lisa"), ("Abe", "Homer"))
print("Find x such that x is a parent of 'Bart': ")
pprint(run(1, x, parent(x, "Bart")))
Output:
Find x such that x is a parent of 'Bart':
('Homer',)
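The grandparent query asked for in this practical uses an intermediate variable, as in Practical-3 (a sketch; the original cell was lost in extraction):

y = var()
print("Find x such that x is a grandparent of 'Bart': ")
pprint(run(1, x, parent(x, y), parent(y, "Bart")))

Output:
Find x such that x is a grandparent of 'Bart':
('Abe',)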
Practical-6
Write the code in Kanren to demonstrate the constraint system.
Answer:
from numbers import Integral
from pprint import pprint
from kanren import run, var, membero
from kanren.constraints import neq, isinstanceo

x = var()
print("Find x such that x != 1, and x != 3, and x is a member of {1, 2, 3}: ")
pprint(run(0, x, neq(x, 1), neq(x, 3), membero(x, (1, 2, 3))))
print("Find x such that x is an integer, and x is a member of {1.1, 2, 3.2, 4}: ")
pprint(run(0, x, isinstanceo(x, Integral), membero(x, (1.1, 2, 3.2, 4))))
Output:
Find x such that x != 1, and x != 3, and x is a member of {1, 2, 3}:
(2,)
Find x such that x is an integer, and x is a member of {1.1, 2, 3.2, 4}:
(2, 4)
Practical-7
Write the code in Kanren to match the mathematical expressions.
Answer:
from kanren import run, var, fact
from kanren.assoccomm import eq_assoccomm as eq
from kanren.assoccomm import commutative, associative

# Operation tokens (assumed definitions; the original lines were lost in extraction)
add = 'add'
mul = 'mul'

# Declare that these ops are commutative & associative using the facts system
fact(commutative, mul)
fact(commutative, add)
fact(associative, mul)
fact(associative, add)
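The matching query itself was lost in extraction; the sketch below follows kanren's documented assoccomm example, whose result matches the output shown:

# Define logic variables to match against
x, y = var('x'), var('y')

# Match the pattern (1 + x) * y against the expression 2 * (3 + 1)
pattern = (mul, (add, 1, x), y)
expr = (mul, 2, (add, 3, 1))
print(run(0, (x, y), eq(pattern, expr)))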
Output:
((3, 2),)
Practical-8
Write the code in python to implement linear regression for one
variable.
Answer:
Simple Linear Regression
Import libraries
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
Import data
# Get dataset
df_sal = pd.read_csv('./Salary_Data.csv').replace([np.inf, -np.inf], np.nan)
df_sal.head()
Output:
YearsExperience Salary
0 1.1 39343.0
1 1.3 46205.0
2 1.5 37731.0
3 2.0 43525.0
4 2.2 39891.0
Analyze data
Describe
# Describe data
df_sal.describe()
YearsExperience Salary
count 30.000000 30.000000
mean 5.313333 76003.000000
std 2.837888 27414.429785
min 1.100000 37731.000000
25% 3.200000 56720.750000
50% 4.700000 65237.000000
75% 7.700000 100544.750000
max 10.500000 122391.000000
Distribution
# Data distribution
sns.displot(df_sal['Salary'])
plt.title('Salary Distribution Plot')
plt.show()
Split data
Split into Independent/Dependent variables
# Splitting variables
X = df_sal.iloc[:, :1] # independent
y = df_sal.iloc[:, 1:] # dependent
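The train/test split call itself did not survive extraction; a sketch assuming the usual sklearn helper and an 80/20 split:

# Split into training and testing sets (test_size and random_state are assumed values)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)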
Train model
# Regressor model
regressor = LinearRegression()
regressor.fit(X_train, y_train)
Output:
LinearRegression()
Predict results
# Prediction result
y_pred_test = regressor.predict(X_test) # predicted value of y_test
y_pred_train = regressor.predict(X_train) # predicted value of y_train
Visualize predictions
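The plotting and coefficient-printing code was lost in extraction; a minimal sketch mirroring the plots used in Practical-9:

# Prediction on training set
plt.scatter(X_train, y_train, color='lightcoral')
plt.plot(X_train, y_pred_train, color='firebrick')
plt.title('Salary vs Experience (Training Set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()

print(f"Coefficient: {regressor.coef_}")
print(f"Intercept: {regressor.intercept_}")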
Output:
Coefficient: [[9312.57512673]]
Intercept: [26780.09915063]
Practical-9
Write the code in python to implement linear regression using
gradient descent for one variable.
Answer:
Import libraries
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
Import data
# Get dataset
df_sal = pd.read_csv('./Salary_Data.csv').replace([np.inf, -np.inf], np.nan)
df_sal.head()
Output:
YearsExperience Salary
0 1.1 39343.0
1 1.3 46205.0
2 1.5 37731.0
3 2.0 43525.0
4 2.2 39891.0
Split data
x = df_sal["YearsExperience"].to_numpy()
y = df_sal["Salary"].to_numpy()
# The train/test split call was lost in extraction; assuming the usual sklearn helper
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
x_train_len = len(x_train)

# Initialize the parameters; eta and the epoch count are assumed values
w0, w1 = 0.0, 0.0
eta = 0.01
w0_history = []
w1_history = []
mse_history = []
for epoch_num in range(10000):
    # Current predictions and mean squared error
    y_pred = w1 * x_train + w0
    curr_mse = np.mean((y_pred - y_train) ** 2)
    # Partial derivatives of the MSE with respect to w0 and w1
    del_mse_by_del_w0 = 2 * np.mean(y_pred - y_train)
    del_mse_by_del_w1 = 2 * np.mean((y_pred - y_train) * x_train)
    w0_history.append(w0)
    w1_history.append(w1)
    mse_history.append(curr_mse)
    if epoch_num % 100 == 0:
        print(f"Epoch {epoch_num}: MSE: {curr_mse:.5f}")
    w0 = w0 - eta * del_mse_by_del_w0
    w1 = w1 - eta * del_mse_by_del_w1
Predict results
# Prediction result
y_pred_test = w1 * x_test + w0 # predicted value of y_test
y_pred_train = w1 * x_train + w0 # predicted value of y_train
Visualize predictions
Prediction on training set
# Prediction on training set
plt.scatter(x_train, y_train, color = 'lightcoral')
plt.plot(x_train, y_pred_train, color = 'firebrick')
plt.title('Salary vs Experience (Training Set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.legend(['X_train/Pred(y_test)', 'X_train/y_train'], title='Sal/Exp', loc='best', facecolor='white')
plt.box(False)
plt.show()
Output:
Coefficient: 9312.965074356625
Intercept: 26777.62790981993
Practical-10
Write the code in python to implement logistic regression for
single class classification.
Answer:
# fmt: off
from typing import *
import numpy as np
import pandas as pd
import seaborn as sns
import itertools
import matplotlib.pyplot as plt
# sklearn modules under the aliases used below (these import lines were lost in extraction)
from sklearn import metrics as sk_metrics
from sklearn import compose as sk_compose
from sklearn import preprocessing as sk_pre
from sklearn import linear_model as sk_linear_model
from sklearn.model_selection import train_test_split
# fmt: on
sns.set()
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
ROOT_INPUT_DIR = "./"
def make_confusion_matrix(
        y_true,
        y_pred,
        classes: list = None,
        figsize: Tuple[int, int] = (10, 10),
        text_size: int = 15,
        norm: bool = False,
        savefig: Union[str, bool] = False,
        remove_diagonal: bool = False):
    """Makes a labelled confusion matrix comparing predictions and ground truth labels.

    If classes is passed, the confusion matrix will be labelled; if not, integer class
    values will be used.

    Args:
        y_true: Array of truth labels (must be same shape as y_pred).
        y_pred: Array of predicted labels (must be same shape as y_true).
        classes: Array of class labels (e.g. string form). If `None`, integer labels are used.
        figsize: Size of output figure (default=(10, 10)).
        text_size: Size of output figure text (default=15).
        norm: normalize values or not (default=False).
        savefig: save confusion matrix to file (default=False).

    Returns:
        A labelled confusion matrix plot comparing y_true and y_pred.

    Example usage:
        make_confusion_matrix(y_true=test_labels,  # ground truth test labels
                              y_pred=y_preds,      # predicted labels
                              classes=class_names, # array of class label names
                              figsize=(15, 15),
                              text_size=10)
    """
    # Create the confusion matrix
    cm: np.ndarray = sk_metrics.confusion_matrix(y_true, y_pred)
    if remove_diagonal:
        np.fill_diagonal(cm, 0)
    cm_norm = cm.astype("float") / cm.sum(axis=1)[:, np.newaxis]  # normalize it
    n_classes = cm.shape[0]  # find the number of classes we're dealing with

    # Plot the figure and make it pretty
    fig, ax = plt.subplots(figsize=figsize)
    # colors will represent how 'correct' a class is; darker == better
    cax = ax.matshow(cm, cmap=plt.cm.Blues)
    plt.grid(False)
    fig.colorbar(cax)

    # Is there a list of classes?
    if classes is not None:
        labels = classes
    else:
        labels = np.arange(cm.shape[0])
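The dataset-loading cell did not survive extraction. The columns listed below match the Kaggle Titanic training set, so a plausible sketch (the file name is an assumption):

raw_data = pd.read_csv(ROOT_INPUT_DIR + "train.csv")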
View shape
raw_data.shape
Output:
(891, 12)
raw_data.isna().sum()
Output:
PassengerId 0
Survived 0
Pclass 0
Name 0
Sex 0
Age 177
SibSp 0
Parch 0
Ticket 0
Fare 0
Cabin 687
Embarked 2
dtype: int64
Remove NaNs
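The NaN-removal code itself was lost; given the counts above, a plausible sketch drops the sparse Cabin column and then the remaining incomplete rows (an assumption):

# Drop the mostly-empty Cabin column, then any rows still containing NaNs
data_no_nan = raw_data.drop(columns=["Cabin"]).dropna().reset_index(drop=True)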
## Input/Target split
ids = data_no_nan["PassengerId"]
targets = data_no_nan["Survived"]
inputs = data_no_nan.drop(columns=[ids.name, targets.name])
## Categorical columns
categorical_columns = [
    *data_no_nan.columns[data_no_nan.dtypes == "object"],
    "Pclass"
]
numeric_columns = inputs.columns[~inputs.columns.isin(set(categorical_columns))]
(categorical_columns, numeric_columns)
Categorical columns
w = 5
h = 5
cols = 3
rows = int(np.ceil(len(categorical_columns) / cols))
plt.figure(figsize=(cols * w, rows * h))
for i, col in enumerate(categorical_columns[:]):
    plt.subplot(rows, cols, i + 1)
    sns.countplot(x=inputs[col], hue=targets.astype(str))
    # plt.hist(inputs[col])
    plt.xticks(rotation=75)
    plt.title(col)
w=5
h=5
cols = 3
rows = int(np.ceil(len(numeric_columns) / cols))
plt.figure(figsize=(cols * w, rows * h))
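The numeric-column plotting loop was lost in extraction; a sketch mirroring the categorical loop above, using histograms:

for i, col in enumerate(numeric_columns):
    plt.subplot(rows, cols, i + 1)
    sns.histplot(x=inputs[col], hue=targets.astype(str))
    plt.title(col)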
X_train: pd.DataFrame
X_val: pd.DataFrame
y_train: pd.Series
y_val: pd.Series
X_train, X_val, y_train, y_val = train_test_split(inputs, targets, test_size=0.1)
X_train.reset_index(drop=True, inplace=True)
X_val.reset_index(drop=True, inplace=True)
y_train.reset_index(drop=True, inplace=True)
y_val.reset_index(drop=True, inplace=True)
X_train.shape, X_val.shape
Transform data
col_transformer = sk_compose.ColumnTransformer([
    ("MinMaxScaler", sk_pre.MinMaxScaler(), numeric_columns),
    ("one_hot_encoder", sk_pre.OneHotEncoder(drop="first", handle_unknown="ignore"), categorical_columns),
], remainder="passthrough", verbose_feature_names_out=False)
X_train_scaled = pd.DataFrame(col_transformer.fit_transform(X_train), columns=col_transformer.get_feature_names_out())
X_val_scaled = pd.DataFrame(col_transformer.transform(X_val), columns=col_transformer.get_feature_names_out())
col_transformer.get_feature_names_out()
Setup model
model = sk_linear_model.LogisticRegression()
model.fit(X_train_scaled, y_train)
m_preds = model.predict(X_val_scaled) > 0.5
Output:
(0.7, 0.6809954751131222)
m_metrics
m_class_metrics
make_confusion_matrix(y_val, m_preds)
Practical-11
Write the code in python to implement logistic regression for multi
class classification.
Answer:
Multi class logistic regression in numpy
Softmax regression, also called multinomial logistic regression, extends logistic
regression to multiple classes.
Given: a training set of $m$ labelled examples $(\mathbf{x}^{(i)}, y^{(i)})$, where each input has $n_{features}$ features and each label is one of $K$ classes.
Step 0: Initialize the weight matrix and bias values with zeros (or small random
values).
Step 1: For each class $k$, compute a linear combination of the input features and the
weight vector of class $k$; that is, for each training example compute a score for each
class. For class $k$ and input vector $\mathbf{x}^{(i)}$ we have:

$$\mathrm{score}_k(\mathbf{x}^{(i)}) = \mathbf{w}_k^T \cdot \mathbf{x}^{(i)} + b_k$$

where $\cdot$ is the dot product and $\mathbf{w}_k$ is the weight vector of class $k$. We can compute the scores for all
classes and training examples in parallel, using vectorization and broadcasting:

$$\mathbf{scores} = \mathbf{X} \cdot \mathbf{W}^T + \mathbf{b}$$

where $\mathbf{X}$ is a matrix of shape $(n_{samples}, n_{features})$ that holds all training examples, and $\mathbf{W}$ is a
matrix of shape $(n_{classes}, n_{features})$ that holds the weight vector for each class.
Step 2: Apply the softmax activation function to transform the scores into
probabilities. The probability that an input vector $\mathbf{x}^{(i)}$ belongs to class $k$ is given by:

$$\hat{p}_k(\mathbf{x}^{(i)}) = \frac{\exp\left(\mathrm{score}_k(\mathbf{x}^{(i)})\right)}{\sum_{j=1}^{K} \exp\left(\mathrm{score}_j(\mathbf{x}^{(i)})\right)}$$

Again we can perform this step for all classes and training examples at once using vectorization. The
class predicted by the model for $\mathbf{x}^{(i)}$ is then simply the class with the highest probability.
Step 3: Compute the cost over the whole training set. We want our model to predict
a high probability for the target class and a low probability for the other classes. This
can be achieved using the cross-entropy loss function:

$$J(\mathbf{W}, b) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \log\left(\hat{p}_k^{(i)}\right)$$

In this formula, the target labels are one-hot encoded. So $y_k^{(i)}$ is 1 if the target class for $\mathbf{x}^{(i)}$ is $k$;
otherwise $y_k^{(i)}$ is 0.

Note: when there are only two classes, this cost function is equivalent to the cost function of
logistic regression.
Step 4: Compute the gradient of the cost function with respect to each weight vector
and bias. The general formula for class $k$ is given by:

$$\nabla_{\mathbf{w}_k} J(\mathbf{W}, b) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{x}^{(i)} \left[\hat{p}_k^{(i)} - y_k^{(i)}\right]$$

The gradient for the bias $b_k$ has the same form, with $\mathbf{x}^{(i)}$ replaced by 1.

Step 5: Update the weights and biases for each class, where $\eta$ is the learning rate:

$$\mathbf{w}_k = \mathbf{w}_k - \eta \, \nabla_{\mathbf{w}_k} J \qquad b_k = b_k - \eta \, \nabla_{b_k} J$$
# fmt: off
from typing import *
import numpy as np
import pandas as pd
import seaborn as sns
import itertools
import matplotlib.pyplot as plt
# sklearn modules under the aliases used below (these import lines were lost in extraction)
from sklearn import metrics as sk_metrics
from sklearn.model_selection import train_test_split
# fmt: on
sns.set()
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
ROOT_INPUT_DIR = "./"
def make_confusion_matrix(
        y_true,
        y_pred,
        classes: list = None,
        figsize: Tuple[int, int] = (10, 10),
        text_size: int = 15,
        norm: bool = False,
        savefig: Union[str, bool] = False,
        remove_diagonal: bool = False):
    """Makes a labelled confusion matrix comparing predictions and ground truth labels.

    If classes is passed, the confusion matrix will be labelled; if not, integer class
    values will be used.

    Args:
        y_true: Array of truth labels (must be same shape as y_pred).
        y_pred: Array of predicted labels (must be same shape as y_true).
        classes: Array of class labels (e.g. string form). If `None`, integer labels are used.
        figsize: Size of output figure (default=(10, 10)).
        text_size: Size of output figure text (default=15).
        norm: normalize values or not (default=False).
        savefig: save confusion matrix to file (default=False).

    Returns:
        A labelled confusion matrix plot comparing y_true and y_pred.

    Example usage:
        make_confusion_matrix(y_true=test_labels,  # ground truth test labels
                              y_pred=y_preds,      # predicted labels
                              classes=class_names, # array of class label names
                              figsize=(15, 15),
                              text_size=10)
    """
    # Create the confusion matrix
    cm: np.ndarray = sk_metrics.confusion_matrix(y_true, y_pred)
    if remove_diagonal:
        np.fill_diagonal(cm, 0)
    cm_norm = cm.astype("float") / cm.sum(axis=1)[:, np.newaxis]  # normalize it
    n_classes = cm.shape[0]  # find the number of classes we're dealing with
class MultiClassLogisticRegression:
    def __init__(self, n_iter=10000, thres=1e-3):
        self.n_iter = n_iter
        self.thres = thres
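Only the constructor survived extraction; the methods below are a minimal sketch of the softmax-regression steps described above, written to match how the model is called later (helper names such as add_bias and one_hot are assumptions):

    def add_bias(self, X):
        # Prepend a column of ones so the bias is learned as an extra weight
        return np.insert(np.asarray(X, dtype=float), 0, 1.0, axis=1)

    def one_hot(self, y):
        # One-hot encode integer class labels (Step 3's target encoding)
        y = np.asarray(y, dtype=int)
        out = np.zeros((len(y), len(self.classes)))
        out[np.arange(len(y)), y] = 1.0
        return out

    def softmax(self, z):
        # Subtract the row-wise max for numerical stability (Step 2)
        z = z - z.max(axis=1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    def predict(self, X):
        # Class probabilities, shape (n_samples, n_classes)
        return self.softmax(self.add_bias(X) @ self.weights.T)

    def fit(self, X, y, lr=0.001):
        self.classes = np.unique(y)
        Xb = self.add_bias(X)
        Y = self.one_hot(y)
        self.weights = np.zeros((len(self.classes), Xb.shape[1]))  # Step 0
        self.loss = []
        for _ in range(self.n_iter):
            P = self.softmax(Xb @ self.weights.T)       # Steps 1-2
            # Cross-entropy loss (Step 3)
            self.loss.append(-np.mean(np.sum(Y * np.log(P + 1e-12), axis=1)))
            # Gradient (Step 4) and update (Step 5)
            grad = (P - Y).T @ Xb / len(Xb)
            self.weights -= lr * grad
            if np.linalg.norm(lr * grad) < self.thres:
                break
        return self

The cell that loaded the data was also lost; the head shown below matches scikit-learn's bundled iris dataset, so a sketch:

from sklearn import datasets

iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target
df.head()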
Output:
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0
X: pd.DataFrame = df.drop(columns=["target"])
y: pd.Series = df["target"]
X.shape
Output:
(150, 4)
Train-test split
X_train: pd.DataFrame
X_val: pd.DataFrame
y_train: pd.Series
y_val: pd.Series
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.4)
X_train.reset_index(drop=True, inplace=True)
X_val.reset_index(drop=True, inplace=True)
y_train.reset_index(drop=True, inplace=True)
y_val.reset_index(drop=True, inplace=True)
X_train.shape
Output:
(90, 4)
model = MultiClassLogisticRegression(thres=1e-5)
model.fit(X_train, y_train, lr=0.0001)
m_preds = model.predict(X_val).argmax(axis=1)
Output:
(0.95, 0.9444444444444445)
m_metrics
m_class_metrics
make_confusion_matrix(y_val, m_preds)
plt.plot(np.arange(len(model.loss)), model.loss)
plt.xlabel("Number of iterations")
plt.ylabel("Loss")
plt.show()
X_train.columns
Index(['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'],
dtype='object')
col0 = "sepal length (cm)"
col1 = "sepal width (cm)"
X_train_trimmed = X_train[[col0, col1]] # we only take the first two features.
logreg = MultiClassLogisticRegression()
logreg.fit(X_train_trimmed, y_train)
# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = X[col0].min() - .5, X[col0].max() + .5
y_min, y_max = X[col1].min() - .5, X[col1].max() + .5
h = .02 # step size in the mesh
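The meshgrid and region-plotting code was lost in extraction; a sketch consistent with the limits and step size defined above (pcolormesh and the colormap choice are assumptions):

xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
# Predict the class for every point in the mesh
Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()]).argmax(axis=1)
Z = Z.reshape(xx.shape)
plt.figure(figsize=(6, 5))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)
# Plot the training points over the decision regions
plt.scatter(X_train_trimmed[col0], X_train_trimmed[col1], c=y_train, edgecolors='k', cmap=plt.cm.Paired)
plt.xlabel(col0)
plt.ylabel(col1)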
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xticks(())
plt.yticks(())
plt.show()