
EXPERIMENT – 1

AIM
Exploring and demonstrating Python.

1. Variables and Data Types


Variables are used to store data in Python. Python is dynamically typed, meaning the data
type of a variable is inferred when the variable is assigned a value.
Data Types:
 Integers: Whole numbers, e.g., 5, 100, -3
 Floats: Numbers with a decimal point, e.g., 3.14, -1.23
 Strings: Sequences of characters enclosed in either single or double quotes, e.g.,
"Hello", 'World'
 Booleans: Represents True or False values, used for logical operations
 Lists: Ordered, mutable collections that can contain elements of different data types
 Tuples: Ordered, immutable collections
 Dictionaries: Collections of key-value pairs (insertion-ordered since Python 3.7)
 Sets: Unordered collections of unique elements
Python also supports None (a null value) to represent the absence of a value.
Example:

# Define variables of different data types


integer_var = 10
float_var = 3.14
string_var = "Hello, World!"
boolean_var = True
list_var = [1, 2, 3]
tuple_var = (4, 5, 6)
set_var = {7, 8, 9}
dict_var = {"key1": "value1", "key2": "value2"}
none_var = None

# Print variables with their data types


print(f"Integer: {integer_var}, Data type: {type(integer_var)}")
print(f"Float: {float_var}, Data type: {type(float_var)}")
print(f"String: '{string_var}', Data type: {type(string_var)}")
print(f"Boolean: {boolean_var}, Data type: {type(boolean_var)}")
print(f"List: {list_var}, Data type: {type(list_var)}")
print(f"Tuple: {tuple_var}, Data type: {type(tuple_var)}")
print(f"Set: {set_var}, Data type: {type(set_var)}")
print(f"Dictionary: {dict_var}, Data type: {type(dict_var)}")
print(f"None: {none_var}, Data type: {type(none_var)}")

Output:
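To make the mutable/immutable distinction above concrete, a small sketch (the values are arbitrary):

list_var = [1, 2, 3]
tuple_var = (4, 5, 6)
list_var[0] = 99       # Lists are mutable: this succeeds
print(list_var)        # [99, 2, 3]
try:
    tuple_var[0] = 99  # Tuples are immutable: this raises a TypeError
except TypeError as e:
    print("Error:", e)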

2. Conditionals
Conditional statements allow you to execute specific blocks of code based on conditions.
Python uses if, elif, and else for condition checking.
 if: Used to check a condition
 elif: Used for additional conditions if the previous if is false
 else: Executed when none of the conditions in if or elif are met.

Example:
age = 20
if age >= 18:
    print("Adult")
else:
    print("Minor")
Output:
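The example above uses only if and else; a short sketch that also exercises elif (the marks value and thresholds are arbitrary):

marks = 75
if marks >= 90:
    print("Grade A")
elif marks >= 60:
    print("Grade B")  # Runs here, since 60 <= 75 < 90
else:
    print("Grade C")
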
3. Loops
Loops allow you to repeat a block of code multiple times. Python provides two main types of
loops:
 for loop: Used to iterate over a sequence (like a list, tuple, or string) or to repeat a
block of code a specific number of times.
 while loop: Continues to execute a block of code as long as a condition is True.
Example of a for loop:
for i in range(5):
    print(i)  # Output: 0, 1, 2, 3, 4

Output:

Example of a while loop:


count = 0
while count < 8:
    print(count)  # Output: 0, 1, 2, 3, 4, 5, 6, 7
    count += 1
Output:

4. Functions
Functions in Python are defined using the def keyword, and they allow you to organize your
code into reusable blocks.
Functions can have parameters (inputs) and return values (outputs).

Example of a simple function:

def greet(name):
    print(f"Hello, {name}!")

greet("Honey")
Output:
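The greet function above only prints; a minimal sketch of a function that returns a value (the function name add is illustrative):

def add(a, b):
    return a + b

result = add(2, 3)
print(result)  # 5
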
5. Lists
A list is an ordered collection of elements, and it is one of the most commonly used data
structures in Python. Lists are mutable, meaning you can change their contents after they
are created.
Example:
fruits = ["apple", "banana", "cherry"]
fruits.append("orange") # Adds an item to the list
print(fruits) # Output: ['apple', 'banana', 'cherry', 'orange']
Output:

6. Dictionaries
Dictionaries are collections of key-value pairs (insertion-ordered since Python 3.7). Each key is unique and maps to a value. You can access, modify, and add elements using the keys.
Example:
person = {"name": "Honey", "age": 20}
print(person["name"])
person["age"] = 21  # Modify an existing value
person["city"] = "Delhi"  # Add a new key-value pair
print(person)
Output:
7. Error Handling
Python uses try, except, else, and finally to handle exceptions (errors) during execution. This
allows your program to continue running even if an error occurs.
 try: Block of code that might raise an exception
 except: Handles the exception
 else: Executes if no exception occurs
 finally: Executes no matter what, after try and except

Example:
try:
    x = int(input("Enter a number: "))
    result = 10 / x
except ZeroDivisionError:
    print("Cannot divide by zero!")
except ValueError:
    print("Invalid input!")
else:
    print(f"The result is {result}")
finally:
    print("Execution completed.")
Output:

8. Classes and Objects (Object-Oriented Programming)


Python supports object-oriented programming (OOP), which allows you to structure your
code in terms of objects and classes.
 Class: A blueprint for creating objects
 Object: An instance of a class
 Methods: Functions that are associated with a class
Example:
class Car:
    def __init__(self, make, model):
        self.make = make
        self.model = model

    def display_info(self):
        print(f"Car Make: {self.make}, Model: {self.model}")

# Creating an object of class Car
my_car = Car("Toyota", "Corolla")
my_car.display_info()
Output:

9. File Handling
Python allows you to interact with files using built-in functions. You can open, read, write,
and close files.
 open(): Opens a file for reading or writing
 read(): Reads the contents of a file
 write(): Writes to a file
 close(): Closes the file (called automatically when the file is opened in a with block)
Example:
# Writing to a file
with open("example.txt", "w") as file:
    file.write("Hello, World!")

# Reading from a file
with open("example.txt", "r") as file:
    content = file.read()
print(content)
Output:

10. List Comprehensions


List comprehensions provide a concise way to create lists. They allow you to generate a new
list by applying an expression to each element in an existing iterable.
Example:
squares = [x**2 for x in range(5)]
print(squares) # Output: [0, 1, 4, 9, 16]
Output:

11. Lambda Functions


Lambda functions are small, anonymous functions that can have any number of arguments
but only one expression. They are often used for short-term operations.
Example:
multiply = lambda x, y: x * y
print(multiply(2, 3))
Output:
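A common short-lived use of a lambda is as a sort key; a small sketch:

pairs = [(1, 'b'), (3, 'a'), (2, 'c')]
pairs.sort(key=lambda p: p[1])  # Sort by the second element of each tuple
print(pairs)  # [(3, 'a'), (1, 'b'), (2, 'c')]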

12. Modules and Libraries


Python has a rich ecosystem of built-in libraries and third-party modules. You can import and
use them in your code using the import keyword.
Example using the math module:
import math
result = math.sqrt(16)
print(result)
Output:

You can also create your own modules and import them into your programs.
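For example, a rough sketch of a user-defined module (the file name mymath.py and the function cube are illustrative):

# mymath.py
def cube(x):
    return x ** 3

# main.py (in the same directory)
import mymath
print(mymath.cube(3))  # 27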

PYTHON LIBRARIES
1. NumPy
NumPy is a Python library used for working with arrays.
It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open-source project and you can use it freely.
NumPy stands for Numerical Python.
Some functionalities of NumPy are:
 Array Creation
import numpy as np

# Create a 1D array
array_1d = np.array([1, 2, 3, 4])
print("1D Array:", array_1d)

# Create a 2D array
array_2d = np.array([[1, 2], [3, 4]])
print("2D Array:\n", array_2d)

# Create arrays with zeros, ones, or random numbers


zeros_array = np.zeros((2, 3))
ones_array = np.ones((2, 3))
random_array = np.random.rand(2, 3)
print("Zeros Array:\n", zeros_array)
print("Ones Array:\n", ones_array)
print("Random Array:\n", random_array)

Output:
 Array Operations

import numpy as np
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])

# Element-wise addition, subtraction, multiplication, and division


print("Addition:", array1 + array2)
print("Subtraction:", array1 - array2)
print("Multiplication:", array1 * array2)
print("Division:", array1 / array2)

# Dot product
print("Dot Product:", np.dot(array1, array2))

Output:

 Indexing and Slicing


import numpy as np
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Accessing elements
print("Element at (1,2):", array[1, 2])

# Slicing rows and columns


print("First row:", array[0, :])
print("First column:", array[:, 0])
print("Sub-array:\n", array[1:3, 1:3])
Output:

 Mathematical Functions
import numpy as np

data = np.array([1, 2, 3, 4, 5])

print("Mean:", np.mean(data))
print("Median:", np.median(data))
print("Standard Deviation:", np.std(data))
print("Sum:", np.sum(data))
print("Cumulative Sum:", np.cumsum(data))

Output:

 Linear Algebra
import numpy as np

matrix = np.array([[1, 2], [3, 4]])

# Transpose of a matrix
print("Transpose:\n", np.transpose(matrix))

# Determinant
print("Determinant:", np.linalg.det(matrix))

# Inverse of a matrix
print("Inverse:\n", np.linalg.inv(matrix))

# Eigenvalues and eigenvectors


eigenvalues, eigenvectors = np.linalg.eig(matrix)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)
Output:

2. Pandas
Pandas is a Python library widely used for data analysis and manipulation. It provides
structures like DataFrame and Series, which allow you to work with structured data
efficiently. Here are some key functionalities of Pandas along with implementation
examples:
 Creating Data Structure
import pandas as pd

# Create a Series
data = [10, 20, 30, 40]
series = pd.Series(data, index=['a', 'b', 'c', 'd'])
print("Series:\n", series)

# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
print("\nDataFrame:\n", df)

Output:
 Indexing and Selecting Data
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Access a column
print("Column A:\n", df['A'])

# Access rows using loc and iloc


print("\nFirst row using loc:\n", df.loc[0])
print("\nFirst row using iloc:\n", df.iloc[0])

# Access specific elements


print("\nElement at (0, 1):", df.iloc[0, 1])

Output:

 Filtering and conditional selection


import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35],
        'Salary': [50000, 60000, 70000]}
df = pd.DataFrame(data)

# Filter rows where Age > 30


filtered = df[df['Age'] > 30]
print("Filtered DataFrame:\n", filtered)

Output:
 Data Cleaning
import pandas as pd
data = {'A': [1, 2, None], 'B': [None, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Check for missing values


print("Missing Values:\n", df.isnull())

# Fill missing values


df_filled = df.fillna(0)
print("\nFilled DataFrame:\n", df_filled)

# Drop rows with missing values


df_dropped = df.dropna()
print("\nDataFrame after dropping missing values:\n", df_dropped)

Output:

 Groupby Operation

import pandas as pd
data = {'Category': ['A', 'A', 'B', 'B'], 'Values': [10, 20, 30, 40]}
df = pd.DataFrame(data)

# Group by 'Category' and calculate sum


grouped = df.groupby('Category').sum()
print("Grouped Data:\n", grouped)

Output:
3. Matplotlib
Matplotlib is a powerful Python library for creating static, interactive, and animated
visualizations. Here are some basic functionalities of Matplotlib with implementation
examples:
 Plotting a Simple Line Graph

import matplotlib.pyplot as plt

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a line plot


plt.plot(x, y, label='y = 2x')
plt.title("Simple Line Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.show()

Output:

 Scatter Plot

import matplotlib.pyplot as plt

# Data
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78]

# Create a scatter plot


plt.scatter(x, y, color='red', label="Data Points")
plt.title("Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.show()

Output:

 Bar Chart

import matplotlib.pyplot as plt

# Data
categories = ['A', 'B', 'C', 'D']
values = [3, 7, 8, 5]

# Create a bar chart


plt.bar(categories, values, color='blue')
plt.title("Bar Chart")
plt.xlabel("Categories")
plt.ylabel("Values")
plt.show()
Output:

 Histogram

import matplotlib.pyplot as plt

# Data
data = [1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8]

# Create a histogram
plt.hist(data, bins=5, color='green', edgecolor='black')
plt.title("Histogram")
plt.xlabel("Bins")
plt.ylabel("Frequency")
plt.show()
Output:

 Pie Chart

import matplotlib.pyplot as plt

# Data
labels = ['Python', 'Java', 'C++', 'Ruby']
sizes = [50, 30, 15, 5]
colors = ['gold', 'yellowgreen', 'lightcoral', 'lightskyblue']

# Create a pie chart


plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=140)
plt.title("Pie Chart")
plt.show()

Output:

4. SciPy
SciPy is a Python library built on top of NumPy that is widely used for scientific and
numerical computations. It provides modules for optimization, integration, interpolation,
linear algebra, signal processing, statistics, and more. Below are the basic functionalities of
SciPy with examples:
 Linear Algebra Operations

from scipy import linalg


import numpy as np

# Matrix
A = np.array([[3, 2], [1, 4]])

# Compute the determinant


det = linalg.det(A)
print("Determinant:", det)

# Compute the inverse


inverse = linalg.inv(A)
print("\nInverse:\n", inverse)

# Solve a linear system (Ax = b)


b = np.array([6, 8])
x = linalg.solve(A, b)
print("\nSolution to Ax = b:\n", x)

Output:

 Optimization

from scipy.optimize import minimize

# Define a function to minimize


def func(x):
    return (x - 3)**2 + 4

# Minimize the function


result = minimize(func, x0=0) # x0 is the initial guess
print("Optimization Result:\n", result)

Output:

 Integration

from scipy import integrate

# Define a function
def f(x):
    return x**2

# Integrate f(x) from 0 to 3


result, error = integrate.quad(f, 0, 3)
print("Integration Result:", result)

Output:

 Interpolation

from scipy import interpolate


import numpy as np

# Data points
x = [0, 1, 2, 3, 4]
y = [1, 2, 0, 2, 1]

# Create a cubic spline interpolation


f = interpolate.interp1d(x, y, kind='cubic')

# Interpolate at new points


x_new = np.linspace(0, 4, 50)
y_new = f(x_new)

# Plot the result


import matplotlib.pyplot as plt
plt.plot(x, y, 'o', label='Data Points')
plt.plot(x_new, y_new, '-', label='Cubic Spline')
plt.legend()
plt.show()

Output:

 Signal Processing

from scipy.signal import butter, lfilter


import numpy as np
import matplotlib.pyplot as plt

# Create a sample signal


fs = 500 # Sampling frequency
t = np.linspace(0, 1, fs, endpoint=False) # Time vector
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.random.randn(t.size)

# Create a low-pass Butterworth filter


b, a = butter(4, 0.1, btype='low') # Order=4, cutoff=0.1

# Apply the filter


filtered_signal = lfilter(b, a, signal)

# Plot the result


plt.plot(t, signal, label='Original Signal')
plt.plot(t, filtered_signal, label='Filtered Signal', linewidth=2)
plt.legend()
plt.show()
Output:
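The overview above also lists statistics among SciPy's modules; a minimal sketch using scipy.stats.describe to summarize a small sample:

from scipy import stats
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])
print(stats.describe(data))  # Reports n, min/max, mean, variance, skewness, kurtosis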

5. Scikit-learn
Scikit-learn is a powerful Python library used for machine learning, providing simple and
efficient tools for data mining and data analysis. It supports various supervised and
unsupervised learning algorithms, and it's built on top of NumPy, SciPy, and matplotlib.
Below are the basic functionalities of Scikit-learn with examples:

 Data Preprocessing

from sklearn.preprocessing import StandardScaler, MinMaxScaler


import numpy as np

# Data
data = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])

# Standardization (zero mean, unit variance)


scaler = StandardScaler()
standardized_data = scaler.fit_transform(data)
print("Standardized Data:\n", standardized_data)

# Normalization (scales between 0 and 1)


normalizer = MinMaxScaler()
normalized_data = normalizer.fit_transform(data)
print("\nNormalized Data:\n", normalized_data)

Output:

 Train-Test Split

from sklearn.model_selection import train_test_split


import numpy as np

# Data
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
y = np.array([1, 2, 3, 4, 5])

# Split into 80% training and 20% testing


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print("X_train:", X_train)
print("X_test:", X_test)

Output:

 Linear Regression

import numpy as np
from sklearn.linear_model import LinearRegression

# Data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 4, 5])

# Create a Linear Regression model


model = LinearRegression()
model.fit(X, y)

# Predict
y_pred = model.predict([[6]])
print("Predicted value for input 6:", y_pred)
Output:

 Logistic Regression

import numpy as np
from sklearn.linear_model import LogisticRegression

# Data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([0, 0, 1, 1, 1]) # Binary target variable

# Create and train a Logistic Regression model


model = LogisticRegression()
model.fit(X, y)

# Predict
y_pred = model.predict([[6]])
print("Predicted class for input 6:", y_pred)

Output:

 K-Nearest Neighbors (KNN)

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([0, 0, 1, 1, 1]) # Binary target variable
# Create and train a KNN model
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

# Predict
y_pred = knn.predict([[6]])
print("Predicted class for input 6:", y_pred)

Output:

EXPERIMENT – 2
AIM
Perform Data Preprocessing like outlier detection, handling missing value, analyzing
redundancy and normalization on different datasets.

THEORY
Data preprocessing is a crucial step in machine learning, ensuring that data is clean,
consistent, and ready for training. Below are common data preprocessing techniques with
Python code using pandas and scikit-learn.

1. Handling Missing Values


Missing values can be handled by removing them or imputing them.
import pandas as pd
from sklearn.impute import SimpleImputer
# Sample data
data = {'Age': [25, 30, None, 35, 40],
        'Salary': [50000, 60000, 75000, None, 90000]}
df = pd.DataFrame(data)
# Impute missing values with mean
imputer = SimpleImputer(strategy='mean')
df[['Age', 'Salary']] = imputer.fit_transform(df[['Age', 'Salary']])
print(df)
Output:

2. Encoding Categorical Data


Machine learning models work with numerical values, so categorical data must be encoded.
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
import numpy as np
import pandas as pd
# Sample categorical data
df = pd.DataFrame({'City': ['Delhi', 'Mumbai', 'Delhi', 'Bangalore', 'Mumbai']})
# Label Encoding
label_encoder = LabelEncoder()
df['City_Label'] = label_encoder.fit_transform(df['City'])
# One-Hot Encoding
one_hot_encoder = OneHotEncoder(sparse_output=False)
encoded = one_hot_encoder.fit_transform(df[['City']])
# Convert to DataFrame
df_encoded = pd.DataFrame(encoded,
                          columns=one_hot_encoder.get_feature_names_out(['City']))
df = pd.concat([df, df_encoded], axis=1)
print(df)
Output:

3. Feature Scaling
Feature scaling standardizes or normalizes numerical features.

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler
# Sample data
data = {'Height': [150, 160, 170, 180, 190],
        'Weight': [50, 60, 70, 80, 90]}
df = pd.DataFrame(data)
# Standardization (Z-score normalization)
scaler = StandardScaler()
df[['Height', 'Weight']] = scaler.fit_transform(df[['Height', 'Weight']])
# Normalization to [0, 1] would use MinMaxScaler the same way:
# df[['Height', 'Weight']] = MinMaxScaler().fit_transform(df[['Height', 'Weight']])
print(df)

Output:
4. Feature Engineering (Polynomial Features)
Creating new features from existing ones.

import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
# Sample data
data = {'Feature': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
# Polynomial Features (Degree 2)
poly = PolynomialFeatures(degree=2, include_bias=False)
df_poly = pd.DataFrame(poly.fit_transform(df), columns=['Feature', 'Feature^2'])
print(df_poly)
Output:

5. Dimensionality Reduction (PCA)


Reduces the number of features while preserving variance.
import pandas as pd
from sklearn.decomposition import PCA
import numpy as np
# Sample data
np.random.seed(42)
data = np.random.rand(5, 3) # 5 samples, 3 features
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
# Applying PCA
pca = PCA(n_components=2)
df_pca = pd.DataFrame(pca.fit_transform(df), columns=['PC1', 'PC2'])
print(df_pca)
print("Explained variance ratio:", pca.explained_variance_ratio_)  # Share of variance kept by each component
Output:

6. Handling Outliers (Using IQR)


Detecting and removing outliers using the Interquartile Range (IQR) method.
import numpy as np
import pandas as pd
# Sample data
data = {'Salary': [50000, 60000, 75000, 90000, 120000, 300000]} # 300000 is an outlier
df = pd.DataFrame(data)
# Calculate IQR
Q1 = df['Salary'].quantile(0.25)
Q3 = df['Salary'].quantile(0.75)
IQR = Q3 - Q1
# Define bounds
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
# Remove outliers
df_cleaned = df[(df['Salary'] >= lower_bound) & (df['Salary'] <= upper_bound)]
print(df_cleaned)

Output:

EXPERIMENT – 3
AIM
Write a program to implement decision trees based on the ID3, C4.5, and CART algorithms.

THEORY
A decision tree is a supervised learning algorithm used for both classification and regression
tasks. It models decisions in a tree-like structure where:
 Each internal node represents a feature (attribute).
 Each branch represents a decision based on that feature’s value.
 Each leaf node represents a class label (for classification) or a numeric output (for
regression).
ID3
The ID3 algorithm, developed by Ross Quinlan, builds a decision tree using Entropy and
Information Gain as splitting criteria.
How ID3 Works
 Entropy (H) measures the impurity (randomness) in a dataset:
o If all examples belong to one class, entropy is 0 (pure).
o If the examples are evenly split between two classes, entropy is 1 (maximum impurity for a binary split).
 Information Gain (IG) measures how much a feature reduces entropy:
o The feature with the highest Information Gain is chosen as the root node.
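As a quick illustration, a hypothetical node with 9 "Yes" and 5 "No" examples has entropy -(9/14)*log2(9/14) - (5/14)*log2(5/14) ≈ 0.940; a short check:

import numpy as np
p_yes, p_no = 9/14, 5/14
print(round(-(p_yes * np.log2(p_yes) + p_no * np.log2(p_no)), 3))  # 0.94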

CODE
import numpy as np
import pandas as pd
from collections import Counter

def entropy(data):
    labels = data.iloc[:, -1]  # Assuming last column is the target
    label_counts = Counter(labels)
    total = len(labels)
    ent = -sum((count/total) * np.log2(count/total) for count in label_counts.values())
    return ent

def information_gain(data, feature):
    total_entropy = entropy(data)
    values = data[feature].unique()
    weighted_entropy = sum((len(subset)/len(data)) * entropy(subset) for value in values
                           if len(subset := data[data[feature] == value]) > 0)
    gain = total_entropy - weighted_entropy
    print(f"Information Gain for {feature}: {gain:.4f}")
    return gain

def best_feature(data):
    features = data.columns[:-1]  # Exclude target column
    return max(features, key=lambda feature: information_gain(data, feature))

def id3(data, tree=None, depth=0):
    labels = data.iloc[:, -1]
    if len(set(labels)) == 1:
        return labels.iloc[0]  # Pure class
    if len(data.columns) == 1:
        return labels.mode()[0]  # Majority class
    best_feat = best_feature(data)
    print(f"\nBest Feature at depth {depth}: {best_feat}")
    if tree is None:
        tree = {}
    tree[best_feat] = {}
    for value in data[best_feat].unique():
        subset = data[data[best_feat] == value].drop(columns=[best_feat])
        tree[best_feat][value] = id3(subset, depth=depth+1)
    return tree

def print_tree(tree, indent=""):
    if not isinstance(tree, dict):
        print(indent + "-> " + str(tree))
        return
    for key, subtree in tree.items():
        print(indent + str(key))
        for value, subsubtree in subtree.items():
            print(indent + f"  {value}:")
            print_tree(subsubtree, indent + "    ")

# Example dataset
columns = ["CGPA", "Interactiveness", "Practical Knowledge", "Skills", "Placed"]
data = pd.DataFrame([
    ["High", "Good", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"],
    ["Medium", "Average", "Average", "Medium", "No"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Medium", "Good", "Good", "Medium", "Yes"],
    ["Low", "Average", "Poor", "Weak", "No"],
    ["Medium", "Good", "Average", "Medium", "Yes"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"]
], columns=columns)

# Build and display tree
tree = id3(data)
print("\nDecision Tree:")
print_tree(tree)
Output:

C4.5
C4.5, developed by Ross Quinlan, is an extension of ID3 with improvements.
Key Improvements in C4.5
1. Handles both categorical and numerical data
o If a numerical feature is selected, it finds the best threshold (e.g., age > 30).
2. Uses Gain Ratio instead of Information Gain
o Gain Ratio solves the bias in Information Gain by normalizing it.
o Formula: Gain Ratio = Information Gain / Split Information
o Split Information is the entropy of the split itself, -Σv (|Sv|/|S|) * log2(|Sv|/|S|); it prevents the algorithm from favoring attributes with many unique values.
3. Handles missing values
o It assigns probabilities for missing values.
4. Pruning to reduce overfitting
o Uses post-pruning, removing branches that add little value.

CODE
import numpy as np
import pandas as pd
from collections import Counter

def entropy(data):
    labels = data.iloc[:, -1]  # Assuming last column is the target
    label_counts = Counter(labels)
    total = len(labels)
    ent = -sum((count/total) * np.log2(count/total) for count in label_counts.values())
    return ent

def split_info(data, feature):
    values = data[feature].unique()
    total = len(data)
    split_ent = -sum((len(subset)/total) * np.log2(len(subset)/total) for value in values
                     if len(subset := data[data[feature] == value]) > 0)
    return split_ent

def gain_ratio(data, feature):
    gain = information_gain(data, feature)
    split = split_info(data, feature)
    ratio = gain / split if split != 0 else 0
    print(f"Split Info for {feature}: {split:.4f}")
    print(f"Gain Ratio for {feature}: {ratio:.4f}")
    return ratio

def information_gain(data, feature):
    total_entropy = entropy(data)
    values = data[feature].unique()
    weighted_entropy = sum((len(subset)/len(data)) * entropy(subset) for value in values
                           if len(subset := data[data[feature] == value]) > 0)
    gain = total_entropy - weighted_entropy
    print(f"Information Gain for {feature}: {gain:.4f}")
    return gain

def best_feature(data):
    features = data.columns[:-1]  # Exclude target column
    return max(features, key=lambda feature: gain_ratio(data, feature))

def c45(data, tree=None, depth=0):
    labels = data.iloc[:, -1]
    if len(set(labels)) == 1:
        return labels.iloc[0]  # Pure class
    if len(data.columns) == 1:
        return labels.mode()[0]  # Majority class
    best_feat = best_feature(data)
    print(f"\nBest Feature at depth {depth}: {best_feat}")
    if tree is None:
        tree = {}
    tree[best_feat] = {}
    for value in data[best_feat].unique():
        subset = data[data[best_feat] == value].drop(columns=[best_feat])
        tree[best_feat][value] = c45(subset, depth=depth+1)
    return tree

def print_tree(tree, indent=""):
    if not isinstance(tree, dict):
        print(indent + "-> " + str(tree))
        return
    for key, subtree in tree.items():
        print(indent + str(key))
        for value, subsubtree in subtree.items():
            print(indent + f"  {value}:")
            print_tree(subsubtree, indent + "    ")

columns = ["CGPA", "Interactiveness", "Practical Knowledge", "Skills", "Placed"]
data = pd.DataFrame([
    ["High", "Good", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"],
    ["Medium", "Average", "Average", "Medium", "No"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Medium", "Good", "Good", "Medium", "Yes"],
    ["Low", "Average", "Poor", "Weak", "No"],
    ["Medium", "Good", "Average", "Medium", "Yes"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"]
], columns=columns)

# Build and display C4.5 decision tree
tree = c45(data)
print("\nC4.5 Decision Tree:")
print_tree(tree)

Output:
CART
CART, developed by Breiman et al., is another decision tree algorithm. Unlike ID3 and C4.5,
it:
 Works for both classification and regression.
 Uses the Gini Index (instead of entropy) to find the best split.
 If Gini = 0, the node is pure (only one class present).
 Higher Gini means a more impure node (the maximum is 0.5 for a two-class split).
Splitting is done in a binary way (each node splits into two branches only).
Example: Instead of splitting on Color (Red, Blue, Green), CART creates binary splits like Color
= Red?.
Regression Trees use Mean Squared Error (MSE) instead of Gini.
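As a quick illustration, a hypothetical node with 9 "Yes" and 5 "No" examples has Gini = 1 - (9/14)^2 - (5/14)^2 ≈ 0.459; a one-line check:

print(round(1 - (9/14)**2 - (5/14)**2, 4))  # 0.4592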

CODE
import numpy as np
import pandas as pd
from collections import Counter

def gini_index(data):
    labels = data.iloc[:, -1]
    label_counts = Counter(labels)
    total = len(labels)
    gini = 1 - sum((count/total) ** 2 for count in label_counts.values())
    return gini

def gini_split(data, feature):
    values = data[feature].unique()
    total = len(data)
    weighted_gini = 0
    for value in values:
        subset = data[data[feature] == value]
        if len(subset) > 0:
            gini_val = gini_index(subset)
            print(f"Gini index for {feature} = {value}: {gini_val:.4f}")
            weighted_gini += (len(subset)/total) * gini_val
    return weighted_gini

def best_feature_cart(data):
    features = data.columns[:-1]
    return min(features, key=lambda feature: gini_split(data, feature))

def cart(data, tree=None, depth=0):
    labels = data.iloc[:, -1]
    if len(set(labels)) == 1:
        return labels.iloc[0]
    if len(data.columns) == 1:
        return labels.mode()[0]
    best_feat = best_feature_cart(data)
    print(f"\nBest Feature at depth {depth}: {best_feat}")
    if tree is None:
        tree = {}
    tree[best_feat] = {}
    for value in data[best_feat].unique():
        subset = data[data[best_feat] == value].drop(columns=[best_feat])
        tree[best_feat][value] = cart(subset, depth=depth+1)
    return tree

def print_tree(tree, indent=""):
    if not isinstance(tree, dict):
        print(indent + "-> " + str(tree))
        return
    for key, subtree in tree.items():
        print(indent + str(key))
        for value, subsubtree in subtree.items():
            print(indent + f"  {value}:")
            print_tree(subsubtree, indent + "    ")

# Example dataset
columns = ["CGPA", "Interactiveness", "Practical Knowledge", "Skills", "Placed"]
data = pd.DataFrame([
    ["High", "Good", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"],
    ["Medium", "Average", "Average", "Medium", "No"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Medium", "Good", "Good", "Medium", "Yes"],
    ["Low", "Average", "Poor", "Weak", "No"],
    ["Medium", "Good", "Average", "Medium", "Yes"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"]
], columns=columns)

# Build and display CART decision tree
tree = cart(data)
print("\nCART Decision Tree:")
print_tree(tree)

Output:
EXPERIMENT – 4
AIM
To implement a simple Artificial Neural Network (ANN) with the Backpropagation algorithm from scratch using Python and NumPy, and test it on a suitable dataset (here, a binary subset of the Iris dataset).

THEORY
An Artificial Neural Network (ANN) is inspired by the structure of biological neurons. It
contains:
 Input Layer
 Hidden Layer(s)
 Output Layer
The Backpropagation algorithm is used to minimize error by updating weights using gradients: each weight is adjusted as w ← w − η * ∂E/∂w, where η is the learning rate and E is the loss.

Dataset Used:
Iris dataset: A famous dataset consisting of 3 types of Iris flowers (Setosa, Versicolor,
Virginica) with 4 features:
 Sepal Length
 Sepal Width
 Petal Length
 Petal Width
We'll simplify it for binary classification:
 Class 0: Setosa
 Class 1: Versicolor
(We’ll ignore Virginica to keep it binary.)
 Input Layer: 4 neurons (4 features)
 Hidden Layer: 5 neurons
 Output Layer: 1 neuron (binary output)

CODE
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Activation and loss
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

def binary_cross_entropy(y_true, y_pred):
    epsilon = 1e-9
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# ANN class
class SimpleANN:
    def __init__(self, input_size, hidden_size, output_size):
        self.W1 = np.random.randn(input_size, hidden_size)
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size)
        self.b2 = np.zeros((1, output_size))

    def forward(self, X):
        self.z1 = np.dot(X, self.W1) + self.b1
        self.a1 = sigmoid(self.z1)
        self.z2 = np.dot(self.a1, self.W2) + self.b2
        self.a2 = sigmoid(self.z2)
        return self.a2

    def backward(self, X, y, output, learning_rate=0.1):
        m = y.shape[0]
        dz2 = output - y
        dW2 = np.dot(self.a1.T, dz2) / m
        db2 = np.sum(dz2, axis=0, keepdims=True) / m

        dz1 = np.dot(dz2, self.W2.T) * sigmoid_derivative(self.a1)
        dW1 = np.dot(X.T, dz1) / m
        db1 = np.sum(dz1, axis=0, keepdims=True) / m

        # Update weights
        self.W2 -= learning_rate * dW2
        self.b2 -= learning_rate * db2
        self.W1 -= learning_rate * dW1
        self.b1 -= learning_rate * db1

    def train(self, X, y, epochs=1000, learning_rate=0.1):
        for i in range(epochs):
            output = self.forward(X)
            loss = binary_cross_entropy(y, output)
            self.backward(X, y, output, learning_rate)
            if i % 100 == 0:
                print(f"Epoch {i}, Loss: {loss:.4f}")

    def predict(self, X):
        output = self.forward(X)
        return (output > 0.5).astype(int)

# Load dataset
iris = load_iris()
X = iris.data[:100]  # Only Setosa and Versicolor
y = iris.target[:100].reshape(-1, 1)  # 0 or 1

# Preprocessing
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Train/Test Split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Create and train model
model = SimpleANN(input_size=4, hidden_size=5, output_size=1)
model.train(X_train, y_train, epochs=1000, learning_rate=0.1)

# Test accuracy
predictions = model.predict(X_test)
accuracy = np.mean(predictions == y_test)
print(f"Test Accuracy: {accuracy * 100:.2f}%")

# 🔍 Predict on user input
print("\n--- Predict a New Sample ---")
sample = np.array([[5.1, 3.5, 1.4, 0.2]])  # Example: likely Setosa
sample_scaled = scaler.transform(sample)
result = model.predict(sample_scaled)
print("Predicted Class:", "Setosa" if result[0][0] == 0 else "Versicolor")

Output
EXPERIMENT – 5

AIM
To implement the K-Nearest Neighbors (K-NN) algorithm from scratch in Python and use it to classify data points from the Wine dataset. Display both correct and wrong predictions.

THEORY
 K-NN is a lazy, instance-based learning algorithm.
 For each test point, it finds the K nearest points in the training set using Euclidean
distance, and predicts the most common class among those neighbors.
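For instance, the distance between the points (1, 2) and (4, 6) is sqrt(3^2 + 4^2) = 5; a quick check:

import numpy as np
print(np.sqrt(np.sum((np.array([1, 2]) - np.array([4, 6])) ** 2)))  # 5.0
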
Dataset Used:
Wine dataset (from sklearn.datasets)
 178 samples
 13 features (like alcohol, magnesium, etc.)
 3 classes (0, 1, 2) — different wine cultivars

CODE
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from collections import Counter

# Euclidean Distance Function
def euclidean_distance(x1, x2):
    return np.sqrt(np.sum((x1 - x2) ** 2))

# KNN Class
class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X_train, y_train):
        self.X_train = X_train
        self.y_train = y_train

    def predict(self, X_test):
        predictions = []
        for x in X_test:
            distances = [euclidean_distance(x, x_train) for x_train in self.X_train]
            k_indices = np.argsort(distances)[:self.k]
            k_labels = [self.y_train[i] for i in k_indices]
            most_common = Counter(k_labels).most_common(1)[0][0]
            predictions.append(most_common)
        return np.array(predictions)

# Load Wine Dataset
data = load_wine()
X, y = data.data, data.target

# Scale features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Train the model
model = KNN(k=5)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Evaluate model
print("Correct Predictions:")
for i in range(len(y_test)):
    if predictions[i] == y_test[i]:
        print(f"Sample {i}: Predicted = {predictions[i]}, Actual = {y_test[i]} ✅")

print("\nWrong Predictions:")
for i in range(len(y_test)):
    if predictions[i] != y_test[i]:
        print(f"Sample {i}: Predicted = {predictions[i]}, Actual = {y_test[i]} ❌")

# Take a sample input for prediction
sample_input = [13.0, 2.3, 2.4, 15.6, 100.0, 2.8, 2.5, 0.3, 1.9, 5.0, 1.0, 3.0, 1000.0]
sample_scaled = scaler.transform([sample_input])  # Normalize like training data
sample_prediction = model.predict(sample_scaled)
print(f"\n📌 Predicted class for sample input = {sample_prediction[0]}")

OUTPUT
