EXPERIMENT – 1
BASIC FUNCTIONALITIES OF PYTHON
AIM
To explore and demonstrate the basic functionalities of Python.
2. Conditionals
Conditional statements allow you to execute specific blocks of code based on conditions.
Python uses if, elif, and else for condition checking.
if: Used to check a condition
elif: Used for additional conditions if the previous if is false
else: Executed when none of the conditions in if or elif are met.
Example:
age = 20
if age >= 18:
    print("Adult")
else:
    print("Minor")
Output:
Adult
3. Loops
Loops allow you to repeat a block of code multiple times. Python provides two main types of
loops:
for loop: Used to iterate over a sequence (like a list, tuple, or string) or to repeat a
block of code a specific number of times.
while loop: Continues to execute a block of code as long as a condition is True.
Example of a for loop:
for i in range(5):
    print(i)  # Output: 0, 1, 2, 3, 4
Output:
0
1
2
3
4
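Example of a while loop (a small sketch equivalent to the for loop above):
count = 0
while count < 5:
    print(count)   # same output: 0, 1, 2, 3, 4
    count += 1     # without this increment the loop would never end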
4. Functions
Functions in Python are defined using the def keyword, and they allow you to organize your
code into reusable blocks.
Functions can have parameters (inputs) and return values (outputs).
def greet(name):
    print(f"Hello, {name}!")

greet("Honey")
Output:
Hello, Honey!
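Functions can also return a value instead of printing it; a minimal sketch (the add name is illustrative):
def add(a, b):
    return a + b  # hand the result back to the caller

result = add(3, 4)
print(result)  # Output: 7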
5. Lists
A list is an ordered collection of elements, and it is one of the most commonly used data
structures in Python. Lists are mutable, meaning you can change their contents after they
are created.
Example:
fruits = ["apple", "banana", "cherry"]
fruits.append("orange") # Adds an item to the list
print(fruits) # Output: ['apple', 'banana', 'cherry', 'orange']
Output:
6. Dictionaries
Dictionaries are collections of key-value pairs (insertion-ordered since Python 3.7). Each key is
unique, and each key maps to a value. You can access, modify, and add elements using the keys.
Example:
person = {"name": "Honey", "age": 20}
print(person["name"])
person["age"] = 21          # Modify an existing value
person["city"] = "Delhi"    # Add a new key-value pair (illustrative)
print(person)
Output:
Honey
{'name': 'Honey', 'age': 21, 'city': 'Delhi'}
7. Error Handling
Python uses try, except, else, and finally to handle exceptions (errors) during execution. This
allows your program to continue running even if an error occurs.
try: Block of code that might raise an exception
except: Handles the exception
else: Executes if no exception occurs
finally: Executes no matter what, after try and except
Example:
try:
    x = int(input("Enter a number: "))
    result = 10 / x
except ZeroDivisionError:
    print("Cannot divide by zero!")
except ValueError:
    print("Invalid input!")
else:
    print(f"The result is {result}")
finally:
    print("Execution completed.")
Output:
8. Classes and Objects
Python supports object-oriented programming. A class is defined with the class keyword, instance
attributes are set in __init__, and methods access them through self.
Example:
class Car:
    def __init__(self, make, model):  # constructor inferred from the attributes used below
        self.make = make
        self.model = model

    def display_info(self):
        print(f"Car Make: {self.make}, Model: {self.model}")

my_car = Car("Toyota", "Corolla")
my_car.display_info()
Output:
Car Make: Toyota, Model: Corolla
9. File Handling
Python allows you to interact with files using built-in functions. You can open, read, write,
and close files.
open(): Opens a file for reading or writing
read(): Reads the contents of a file
write(): Writes to a file
close(): Closes the file (the with statement used below does this automatically)
Example:
# Writing to a file
with open("example.txt", "w") as file:
    file.write("Hello, World!")

# Reading from a file
with open("example.txt", "r") as file:
    content = file.read()
    print(content)
Output:
Hello, World!
10. Modules
Python code can be organized into modules: any .py file can be imported with the import
statement. You can also create your own modules and import them into your programs.
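A minimal sketch of a user-defined module, assuming a file named mymath.py next to the main script (the file and function names are illustrative):
# mymath.py
def add(a, b):
    return a + b

# main.py
import mymath
print(mymath.add(2, 3))  # Output: 5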
PYTHON LIBRARIES
1. NumPy
NumPy is a Python library used for working with arrays.
It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open-source project and you can use
it freely.
NumPy stands for Numerical Python.
Some functionalities of NumPy are:
Array Creation
import numpy as np
# Create a 1D array
array_1d = np.array([1, 2, 3, 4])
print("1D Array:", array_1d)
# Create a 2D array
array_2d = np.array([[1, 2], [3, 4]])
print("2D Array:\n", array_2d)
Output:
1D Array: [1 2 3 4]
2D Array:
 [[1 2]
 [3 4]]
Array Operations
import numpy as np
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
# Dot product
print("Dot Product:", np.dot(array1, array2))
Output:
Dot Product: 32
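Element-wise arithmetic also works directly on arrays of the same shape (a small additional sketch):
import numpy as np
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
print("Sum:", array1 + array2)      # [5 7 9]
print("Product:", array1 * array2)  # [ 4 10 18]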
Array Indexing
import numpy as np
array = np.array([[1, 2, 3], [4, 5, 6]])  # sample 2D array (illustrative)
# Accessing elements
print("Element at (1,2):", array[1, 2])  # Output: Element at (1,2): 6
Mathematical Functions
import numpy as np
data = np.array([1, 2, 3, 4, 5])  # sample data (illustrative)
print("Mean:", np.mean(data))
print("Median:", np.median(data))
print("Standard Deviation:", np.std(data))
print("Sum:", np.sum(data))
print("Cumulative Sum:", np.cumsum(data))
Output:
Linear Algebra
import numpy as np
matrix = np.array([[1, 2], [3, 4]])  # sample matrix (illustrative)
# Transpose of a matrix
print("Transpose:\n", np.transpose(matrix))
# Determinant
print("Determinant:", np.linalg.det(matrix))
# Inverse of a matrix
print("Inverse:\n", np.linalg.inv(matrix))
2. Pandas
Pandas is a Python library widely used for data analysis and manipulation. It provides
structures like DataFrame and Series, which allow you to work with structured data
efficiently. Here are some key functionalities of Pandas along with implementation
examples:
Creating Data Structure
import pandas as pd
# Create a Series
data = [10, 20, 30, 40]
series = pd.Series(data, index=['a', 'b', 'c', 'd'])
print("Series:\n", series)
# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
print("\nDataFrame:\n", df)
Output:
Indexing and Selecting Data
import pandas as pd
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)
# Access a column
print("Column A:\n", df['A'])
Output:
Data Cleaning
import pandas as pd
data = {'A': [1, 2, None], 'B': [None, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)
# Drop rows that contain missing values
print(df.dropna())
# Or fill missing values with a default
print(df.fillna(0))
Output:
Groupby Operation
import pandas as pd
data = {'Category': ['A', 'A', 'B', 'B'], 'Values': [10, 20, 30, 40]}
df = pd.DataFrame(data)
# Sum of Values within each Category
print(df.groupby('Category')['Values'].sum())
Output:
3. Matplotlib
Matplotlib is a powerful Python library for creating static, interactive, and animated
visualizations. Here are some basic functionalities of Matplotlib with implementation
examples:
Plotting a Simple Line Graph
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
plt.plot(x, y)  # draw a line plot
plt.show()
Output:
Scatter Plot
import matplotlib.pyplot as plt
# Data
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78]
plt.scatter(x, y)  # draw a scatter plot
plt.show()
Output:
Bar Chart
import matplotlib.pyplot as plt
# Data
categories = ['A', 'B', 'C', 'D']
values = [3, 7, 8, 5]
plt.bar(categories, values)  # draw a bar chart
plt.show()
Histogram
import matplotlib.pyplot as plt
# Data
data = [1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8]
# Create a histogram
plt.hist(data, bins=5, color='green', edgecolor='black')
plt.title("Histogram")
plt.xlabel("Bins")
plt.ylabel("Frequency")
plt.show()
Output:
Pie Chart
import matplotlib.pyplot as plt
# Data
labels = ['Python', 'Java', 'C++', 'Ruby']
sizes = [50, 30, 15, 5]
colors = ['gold', 'yellowgreen', 'lightcoral', 'lightskyblue']
plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%')  # draw a pie chart
plt.show()
Output:
4. SciPy
SciPy is a Python library built on top of NumPy that is widely used for scientific and
numerical computations. It provides modules for optimization, integration, interpolation,
linear algebra, signal processing, statistics, and more. Below are the basic functionalities of
SciPy with examples:
Linear Algebra Operations
import numpy as np
from scipy import linalg
# Matrix
A = np.array([[3, 2], [1, 4]])
print("Determinant:", linalg.det(A))  # 3*4 - 2*1 = 10
print("Inverse:\n", linalg.inv(A))
Output:
Optimization
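A minimal sketch using scipy.optimize, assuming a simple quadratic objective (the function and starting point are illustrative):
from scipy.optimize import minimize

def objective(x):
    return (x - 2) ** 2  # minimum at x = 2

result = minimize(objective, x0=0)
print("Minimum at x =", result.x)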
Output:
Integration
from scipy import integrate
# Define a function
def f(x):
    return x ** 2
# Integrate f from 0 to 2 (bounds illustrative)
result, error = integrate.quad(f, 0, 2)
print("Integral result:", result)  # 8/3 ≈ 2.667
Output:
Interpolation
from scipy.interpolate import interp1d
# Data points
x = [0, 1, 2, 3, 4]
y = [1, 2, 0, 2, 1]
f = interp1d(x, y, kind='cubic')  # build an interpolating function
print("Value at x = 2.5:", f(2.5))
Output:
Signal Processing
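A minimal sketch using scipy.signal, with peak detection on a small illustrative signal:
import numpy as np
from scipy.signal import find_peaks

signal = np.array([0, 1, 0, 2, 0, 3, 0])  # illustrative signal
peaks, _ = find_peaks(signal)
print("Peak indices:", peaks)  # [1 3 5]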
5. Scikit-learn
Scikit-learn is a powerful Python library used for machine learning, providing simple and
efficient tools for data mining and data analysis. It supports various supervised and
unsupervised learning algorithms, and it's built on top of NumPy, SciPy, and matplotlib.
Below are the basic functionalities of Scikit-learn with examples:
Data Preprocessing
import numpy as np
from sklearn.preprocessing import StandardScaler
# Data
data = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
scaler = StandardScaler()  # standardization step (scaling choice assumed)
print("Scaled data:\n", scaler.fit_transform(data))
Output:
Train-Test Split
import numpy as np
from sklearn.model_selection import train_test_split
# Data
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
y = np.array([1, 2, 3, 4, 5])
# Split into 80% train / 20% test (ratio assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print("X_train:", X_train)
print("X_test:", X_test)
Output:
Linear Regression
import numpy as np
from sklearn.linear_model import LinearRegression
# Data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 4, 5])
# Create and train the model
model = LinearRegression()
model.fit(X, y)
# Predict
y_pred = model.predict([[6]])
print("Predicted value for input 6:", y_pred)
Output:
Predicted value for input 6: [6.]
Logistic Regression
import numpy as np
from sklearn.linear_model import LogisticRegression
# Data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([0, 0, 1, 1, 1]) # Binary target variable
# Create and train the model
model = LogisticRegression()
model.fit(X, y)
# Predict
y_pred = model.predict([[6]])
print("Predicted class for input 6:", y_pred)
Output:
Predicted class for input 6: [1]
K-Nearest Neighbors (KNN)
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
# Data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([0, 0, 1, 1, 1]) # Binary target variable
# Create and train a KNN model
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
# Predict
y_pred = knn.predict([[6]])
print("Predicted class for input 6:", y_pred)
Output:
Predicted class for input 6: [1]
EXPERIMENT – 2
AIM
Perform Data Preprocessing like outlier detection, handling missing value, analyzing
redundancy and normalization on different datasets.
THEORY
Data preprocessing is a crucial step in machine learning, ensuring that data is clean,
consistent, and ready for training. Below are common data preprocessing techniques with
Python code using pandas and scikit-learn.
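Handling missing values and detecting outliers normally precede scaling; a minimal sketch using pandas (the sample values, median imputation, and the 1.5 * IQR rule are illustrative assumptions):
import pandas as pd

# Illustrative data: one missing value and one extreme value
df = pd.DataFrame({'Score': [55, 60, 58, None, 62, 300]})

# Impute the missing value with the column median
df['Score'] = df['Score'].fillna(df['Score'].median())

# Flag outliers with the 1.5 * IQR rule
q1, q3 = df['Score'].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df['Score'] < q1 - 1.5 * iqr) | (df['Score'] > q3 + 1.5 * iqr)]
print(outliers)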
3. Feature Scaling
Feature scaling standardizes or normalizes numerical features.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler
# Sample data
data = {'Height': [150, 160, 170, 180, 190],
'Weight': [50, 60, 70, 80, 90]}
df = pd.DataFrame(data)
# Standardization (Z-score normalization)
scaler = StandardScaler()
df[['Height', 'Weight']] = scaler.fit_transform(df[['Height', 'Weight']])
print(df)
Output:
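The imported MinMaxScaler works the same way for min-max normalization to [0, 1]; a short sketch on fresh, unscaled data (reusing the imports above):
df2 = pd.DataFrame({'Height': [150, 160, 170, 180, 190],
                    'Weight': [50, 60, 70, 80, 90]})
minmax = MinMaxScaler()
df2[['Height', 'Weight']] = minmax.fit_transform(df2[['Height', 'Weight']])
print(df2)  # each column now runs from 0.0 to 1.0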
4. Feature Engineering (Polynomial Features)
Creating new features from existing ones.
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
# Sample data
data = {'Feature': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
# Polynomial Features (Degree 2)
poly = PolynomialFeatures(degree=2, include_bias=False)
df_poly = pd.DataFrame(poly.fit_transform(df), columns=['Feature', 'Feature^2'])
print(df_poly)
Output:
EXPERIMENT – 3
AIM
Write a program to implement decision tree based on ID3, C4.5 and CART algorithm.
THEORY
A decision tree is a supervised learning algorithm used for both classification and regression
tasks. It models decisions in a tree-like structure where:
Each internal node represents a feature (attribute).
Each branch represents a decision based on that feature’s value.
Each leaf node represents a class label (for classification) or a numeric output (for
regression).
ID3
The ID3 algorithm, developed by Ross Quinlan, builds a decision tree using Entropy and
Information Gain as splitting criteria.
How ID3 Works
Entropy (H) measures the impurity (randomness) in a dataset: H(S) = -Σ pᵢ log₂(pᵢ), where pᵢ is the proportion of class i.
o If all examples belong to one class, entropy is 0 (pure).
o If the examples are evenly split between two classes, entropy is 1 (high impurity).
For example, a node with 9 "Yes" and 5 "No" samples has H = -(9/14)log₂(9/14) - (5/14)log₂(5/14) ≈ 0.94.
Information Gain (IG) measures how much a feature reduces entropy:
o The feature with the highest Information Gain is chosen as the root node.
CODE
import numpy as np
import pandas as pd
from collections import Counter
def entropy(data):
    labels = data.iloc[:, -1]  # assuming the last column is the target
    label_counts = Counter(labels)
    total = len(labels)
    ent = -sum((count/total) * np.log2(count/total) for count in label_counts.values())
    return ent
def information_gain(data, feature):
    # Gain = entropy before the split minus the weighted entropy after it
    total_entropy = entropy(data)
    weighted_entropy = 0
    for value in data[feature].unique():
        subset = data[data[feature] == value]
        weighted_entropy += (len(subset) / len(data)) * entropy(subset)
    return total_entropy - weighted_entropy

def best_feature(data):
    features = data.columns[:-1]  # exclude the target column
    return max(features, key=lambda feature: information_gain(data, feature))
# Example dataset
columns = ["CGPA", "Interactiveness", "Practical Knowledge", "Skills", "Placed"]
data = pd.DataFrame([
["High", "Good", "Excellent", "Strong", "Yes"],
["Low", "Poor", "Weak", "Weak", "No"],
["Medium", "Average", "Good", "Medium", "Yes"],
["High", "Good", "Good", "Strong", "Yes"],
["Medium", "Average", "Average", "Medium", "No"],
["Low", "Poor", "Weak", "Weak", "No"],
["High", "Excellent", "Excellent", "Strong", "Yes"],
["Medium", "Good", "Good", "Medium", "Yes"],
["Low", "Average", "Poor", "Weak", "No"],
["Medium", "Good", "Average", "Medium", "Yes"],
["High", "Excellent", "Excellent", "Strong", "Yes"],
["Low", "Poor", "Weak", "Weak", "No"],
["Medium", "Average", "Good", "Medium", "Yes"],
["High", "Good", "Good", "Strong", "Yes"]
], columns=columns)
# Choose the root attribute by maximum information gain
print("Best feature to split on:", best_feature(data))
C4.5
C4.5, developed by Ross Quinlan, is an extension of ID3 with improvements.
Key Improvements in C4.5
1. Handles both categorical and numerical data
o If a numerical feature is selected, it finds the best threshold (e.g., age > 30).
2. Uses Gain Ratio instead of Information Gain
o Gain Ratio solves the bias in Information Gain by normalizing it.
o Formula:
Gain Ratio = Information Gain / Split Information
o Split Information prevents the algorithm from favoring attributes with many
unique values.
3. Handles missing values
o It assigns probabilities for missing values.
4. Pruning to reduce overfitting
o Uses post-pruning, removing branches that add little value.
CODE
import numpy as np
import pandas as pd
from collections import Counter

def entropy(data):
    labels = data.iloc[:, -1]
    label_counts = Counter(labels)
    total = len(labels)
    ent = -sum((count/total) * np.log2(count/total) for count in label_counts.values())
    return ent

def split_information(data, feature):
    # Penalizes features with many distinct values
    values = data[feature].unique()
    total = len(data)
    split_ent = -sum((len(data[data[feature] == v]) / total) * np.log2(len(data[data[feature] == v]) / total) for v in values)
    return split_ent

def gain_ratio(data, feature):
    total_entropy = entropy(data)
    values = data[feature].unique()
    weighted_entropy = sum((len(data[data[feature] == v]) / len(data)) * entropy(data[data[feature] == v]) for v in values)
    gain = total_entropy - weighted_entropy
    split_ent = split_information(data, feature)
    return gain / split_ent if split_ent != 0 else 0

def best_feature(data):
    features = data.columns[:-1]
    return max(features, key=lambda feature: gain_ratio(data, feature))

def c45(data, tree=None):
    labels = data.iloc[:, -1]
    if len(set(labels)) == 1:       # pure node: return the class label
        return labels.iloc[0]
    if len(data.columns) == 1:      # no features left: return the majority class
        return Counter(labels).most_common(1)[0][0]
    best_feat = best_feature(data)
    if tree is None:
        tree = {}
    tree[best_feat] = {}
    for value in data[best_feat].unique():
        subset = data[data[best_feat] == value].drop(columns=[best_feat])
        tree[best_feat][value] = c45(subset)
    return tree

def print_tree(tree, indent=""):
    if not isinstance(tree, dict):  # leaf node
        print(indent + "-> " + str(tree))
        return
    for key, branches in tree.items():
        print(indent + str(key))
        for value, subtree in branches.items():
            print(indent + "  = " + str(value))
            print_tree(subtree, indent + "    ")

# Example dataset (same as in the ID3 code above)
columns = ["CGPA", "Interactiveness", "Practical Knowledge", "Skills", "Placed"]
data = pd.DataFrame([
    ["High", "Good", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"],
    ["Medium", "Average", "Average", "Medium", "No"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Medium", "Good", "Good", "Medium", "Yes"],
    ["Low", "Average", "Poor", "Weak", "No"],
    ["Medium", "Good", "Average", "Medium", "Yes"],
    ["High", "Excellent", "Excellent", "Strong", "Yes"],
    ["Low", "Poor", "Weak", "Weak", "No"],
    ["Medium", "Average", "Good", "Medium", "Yes"],
    ["High", "Good", "Good", "Strong", "Yes"]
], columns=columns)

tree = c45(data)
print_tree(tree)
Output:
CART
CART (Classification and Regression Trees), developed by Breiman et al., is another decision
tree algorithm. Unlike ID3 and C4.5, it:
Works for both classification and regression.
Uses the Gini Index (instead of entropy) to find the best split: Gini = 1 - Σ pᵢ².
If Gini = 0, the node is pure (only one class present).
Gini is maximal when the classes are evenly mixed (0.5 for two classes).
For example, a node with 6 "Yes" and 4 "No" samples has Gini = 1 - (0.6² + 0.4²) = 0.48.
Splitting is done in a binary way (each node splits into two branches only).
Example: Instead of splitting on Color (Red, Blue, Green), CART creates binary splits like Color
= Red?.
Regression Trees use Mean Squared Error (MSE) instead of Gini.
CODE
import numpy as np
import pandas as pd
from collections import Counter
def gini_index(data):
    labels = data.iloc[:, -1]
    label_counts = Counter(labels)
    total = len(labels)
    gini = 1 - sum((count/total) ** 2 for count in label_counts.values())
    return gini
def gini_split(data, feature):
    # Weighted Gini impurity of the partitions induced by the feature
    total = len(data)
    weighted_gini = 0
    for value in data[feature].unique():
        subset = data[data[feature] == value]
        weighted_gini += (len(subset) / total) * gini_index(subset)
    return weighted_gini

def best_feature_cart(data):
    features = data.columns[:-1]
    return min(features, key=lambda feature: gini_split(data, feature))
# Example dataset
columns = ["CGPA", "Interactiveness", "Practical Knowledge", "Skills", "Placed"]
data = pd.DataFrame([
["High", "Good", "Excellent", "Strong", "Yes"],
["Low", "Poor", "Weak", "Weak", "No"],
["Medium", "Average", "Good", "Medium", "Yes"],
["High", "Good", "Good", "Strong", "Yes"],
["Medium", "Average", "Average", "Medium", "No"],
["Low", "Poor", "Weak", "Weak", "No"],
["High", "Excellent", "Excellent", "Strong", "Yes"],
["Medium", "Good", "Good", "Medium", "Yes"],
["Low", "Average", "Poor", "Weak", "No"],
["Medium", "Good", "Average", "Medium", "Yes"],
["High", "Excellent", "Excellent", "Strong", "Yes"],
["Low", "Poor", "Weak", "Weak", "No"],
["Medium", "Average", "Good", "Medium", "Yes"],
["High", "Good", "Good", "Strong", "Yes"]
], columns=columns)
# Choose the split attribute by minimum weighted Gini impurity
print("Best feature to split on:", best_feature_cart(data))
Output:
EXPERIMENT – 4
AIM
To implement a simple Artificial Neural Network (ANN) using the Backpropagation algorithm
from scratch in Python and NumPy, and test it on a suitable binary-classification dataset
(here, a two-class subset of the Iris dataset).
THEORY
An Artificial Neural Network (ANN) is inspired by the structure of biological neurons. It
contains:
Input Layer
Hidden Layer(s)
Output Layer
The Backpropagation algorithm minimizes the error by propagating gradients backward through
the network and updating each weight as w ← w - η · ∂E/∂w, where η is the learning rate.
Dataset Used:
Iris dataset: A famous dataset consisting of 3 types of Iris flowers (Setosa, Versicolor,
Virginica) with 4 features:
Sepal Length
Sepal Width
Petal Length
Petal Width
We'll simplify it for binary classification:
Class 0: Setosa
Class 1: Versicolor
(We’ll ignore Virginica to keep it binary.)
Input Layer: 4 neurons (4 features)
Hidden Layer: 5 neurons
Output Layer: 1 neuron (binary output)
CODE
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is already the sigmoid output, so the derivative is x * (1 - x)
    return x * (1 - x)

# ANN class
class SimpleANN:
    def __init__(self, input_size, hidden_size, output_size):
        self.W1 = np.random.randn(input_size, hidden_size)
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size)
        self.b2 = np.zeros((1, output_size))

    def forward(self, X):
        # Input -> hidden -> output, with sigmoid activations
        self.a1 = sigmoid(X @ self.W1 + self.b1)
        self.a2 = sigmoid(self.a1 @ self.W2 + self.b2)
        return self.a2

    def backward(self, X, y, learning_rate):
        m = X.shape[0]
        # Gradients from the output layer back to the input layer
        d2 = (self.a2 - y) * sigmoid_derivative(self.a2)
        dW2 = self.a1.T @ d2 / m
        db2 = np.sum(d2, axis=0, keepdims=True) / m
        d1 = (d2 @ self.W2.T) * sigmoid_derivative(self.a1)
        dW1 = X.T @ d1 / m
        db1 = np.sum(d1, axis=0, keepdims=True) / m
        # Update weights
        self.W2 -= learning_rate * dW2
        self.b2 -= learning_rate * db2
        self.W1 -= learning_rate * dW1
        self.b1 -= learning_rate * db1

    def train(self, X, y, epochs=1000, learning_rate=0.1):
        for _ in range(epochs):
            self.forward(X)
            self.backward(X, y, learning_rate)

    def predict(self, X):
        return (self.forward(X) > 0.5).astype(int)

# Load dataset
iris = load_iris()
X = iris.data[:100]                    # Only Setosa and Versicolor
y = iris.target[:100].reshape(-1, 1)   # 0 or 1

# Preprocessing
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Train/Test Split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Build and train the network (4 inputs, 5 hidden neurons, 1 output)
model = SimpleANN(input_size=4, hidden_size=5, output_size=1)
model.train(X_train, y_train)

# Test accuracy
predictions = model.predict(X_test)
accuracy = np.mean(predictions == y_test)
print(f"Test Accuracy: {accuracy * 100:.2f}%")
Output
EXPERIMENT – 5
AIM
To implement the K-Nearest Neighbors (K-NN) algorithm from scratch in Python and use it
to classify data points from the Wine dataset. Display both correct and wrong predictions.
THEORY
K-NN is a lazy, instance-based learning algorithm.
For each test point, it finds the K nearest points in the training set using Euclidean
distance, d(x, y) = √(Σᵢ (xᵢ - yᵢ)²), and predicts the most common class among those neighbors.
Dataset Used:
Wine dataset (from sklearn.datasets)
178 samples
13 features (like alcohol, magnesium, etc.)
3 classes (0, 1, 2) — different wine cultivars
CODE
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from collections import Counter

# KNN Class
class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # Lazy learner: just store the training data
        self.X_train = X
        self.y_train = y

    def predict(self, X):
        preds = []
        for x in X:
            # Euclidean distance to every training point
            distances = np.linalg.norm(self.X_train - x, axis=1)
            k_indices = np.argsort(distances)[:self.k]
            # Majority vote among the k nearest neighbors
            preds.append(Counter(self.y_train[k_indices]).most_common(1)[0][0])
        return np.array(preds)

# Load dataset
wine = load_wine()
X, y = wine.data, wine.target

# Scale features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Fit and predict
knn = KNN(k=3)
knn.fit(X_train, y_train)
predictions = knn.predict(X_test)

print("Correct Predictions:")
for i in range(len(y_test)):
    if predictions[i] == y_test[i]:
        print(f"Sample {i}: Predicted = {predictions[i]}, Actual = {y_test[i]} ✅")

print("\nWrong Predictions:")
for i in range(len(y_test)):
    if predictions[i] != y_test[i]:
        print(f"Sample {i}: Predicted = {predictions[i]}, Actual = {y_test[i]} ❌")
OUTPUT