ml lab programs 2
PROGRAM CODE :
from collections import Counter

def compute_mean(numbers):
    return sum(numbers) / len(numbers)

def compute_median(numbers):
    sorted_numbers = sorted(numbers)
    n = len(sorted_numbers)
    if n % 2 == 0:
        mid = n // 2
        return (sorted_numbers[mid - 1] + sorted_numbers[mid]) / 2
    else:
        return sorted_numbers[n // 2]

def compute_mode(numbers):
    # Counter comes from the standard library (imported above)
    count = Counter(numbers)
    max_count = max(count.values())
    # Every value sharing the highest frequency is part of the mode
    mode = [num for num, freq in count.items() if freq == max_count]
    return mode if mode else None

if __name__ == "__main__":
    # Sample input; change this list to test with different data
    data = [1, 2, 3, 4, 5, 6, 6, 7, 8, 8, 8]
    mean = compute_mean(data)
    median = compute_median(data)
    mode = compute_mode(data)
    print(f"Data: {data}")
    print(f"Mean: {mean}")
    print(f"Median: {median}")
    print(f"Mode: {mode}")
OUTPUT :
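Running the program with the sample list above prints the following (the mean is an unrounded float, shown here to a few decimal places):

Data: [1, 2, 3, 4, 5, 6, 6, 7, 8, 8, 8]
Mean: 5.2727... (58/11)
Median: 6
Mode: [8]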
PROGRAM CODE:
def compute_mean(numbers):
    return sum(numbers) / len(numbers)

def compute_variance(numbers):
    mean = compute_mean(numbers)
    squared_diff = [(x - mean) ** 2 for x in numbers]
    # Population variance: divide by n rather than n - 1
    variance = sum(squared_diff) / len(numbers)
    return variance

def compute_standard_deviation(numbers):
    variance = compute_variance(numbers)
    standard_deviation = variance ** 0.5
    return standard_deviation

if __name__ == "__main__":
    # Taking user input for a list of numbers
    input_data = input("Enter a list of numbers separated by spaces: ")
    try:
        # Convert the user input into a list of floats
        data = [float(num) for num in input_data.split()]
        # Compute the statistics before printing them
        # (these two calls were missing in the original listing)
        variance = compute_variance(data)
        standard_deviation = compute_standard_deviation(data)
        print(f"Data: {data}")
        print(f"Variance: {variance}")
        print(f"Standard Deviation: {standard_deviation}")
    except ValueError:
        print("Invalid input! Please enter a list of numbers separated by spaces.")
OUTPUT :
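A sample run (the input list here is only an example):

Enter a list of numbers separated by spaces: 2 4 4 4 5 5 7 9
Data: [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
Variance: 4.0
Standard Deviation: 2.0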
Below is an example of applying the K-Nearest Neighbors (KNN) algorithm
for both classification and regression using Python. We'll use the popular
scikit-learn library and some sample datasets to illustrate the concepts.
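Only the accuracy calculation from the classification half of the listing survives in this copy. A minimal sketch of the setup it assumes is given below; the variable names match the surviving fragment, and a 30% test split is consistent with the 95.56% (43/45) accuracy shown in the output, but the split fraction and random seed are assumptions:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset and hold out a test set
iris = load_iris()
X_train_c, X_test_c, y_train_c, y_test_c = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42)

# KNN classification with k = 3: each prediction is the majority
# class among the 3 nearest training samples
knn_clf = KNeighborsClassifier(n_neighbors=3)
knn_clf.fit(X_train_c, y_train_c)
y_pred_c = knn_clf.predict(X_test_c)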
# Calculate accuracy
accuracy = accuracy_score(y_test_c, y_pred_c)
print("Classification Results:")
print(f"Accuracy: {accuracy * 100:.2f}%")
Output
When you run the above code, you'll get the following type of output:
Classification Results:
Accuracy: 95.56%
Regression Results:
1. Classification:
o We used the Iris dataset, a built-in dataset in scikit-learn, to classify flowers into three species.
o The KNeighborsClassifier was initialized with k = 3, meaning the class of a test sample is determined by the majority class among its 3 nearest neighbors.
2. Regression:
o We created a synthetic regression dataset with make_regression.
o The KNeighborsRegressor was initialized with k = 3, meaning the predicted value of a test sample is the average of its 3 nearest neighbors' values (a minimal sketch of this regression half follows below).
Key Points
Would you like further details, such as visualizations of the results or how to optimize k?
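The regression listing did not survive either; a matching sketch (sample count, noise level, and seed are assumptions) might look like:

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

# Synthetic regression data: one feature with added noise
X_r, y_r = make_regression(n_samples=200, n_features=1, noise=10, random_state=42)
X_train_r, X_test_r, y_train_r, y_test_r = train_test_split(
    X_r, y_r, test_size=0.3, random_state=42)

# KNN regression with k = 3: each prediction averages the
# targets of the 3 nearest training samples
knn_reg = KNeighborsRegressor(n_neighbors=3)
knn_reg.fit(X_train_r, y_train_r)
y_pred_r = knn_reg.predict(X_test_r)

print("Regression Results:")
print(f"Mean Squared Error: {mean_squared_error(y_test_r, y_pred_r):.2f}")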
Here’s a Python program to demonstrate the Decision Tree Algorithm for a
classification problem using the Iris dataset. The program also includes parameter
tuning using Grid Search for better results.
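Most of this listing was lost in this copy; only the fit call below survives. A minimal sketch of the setup it assumes follows, where the grid values, test-set size, and random seed are illustrative choices (a 50-sample test set matches the support of 50 in the sample reports further down):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Load the Iris data and hold out a 50-sample test set
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=50, random_state=42)

# Grid over the hyperparameters described in point 3 below
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [None, 2, 3, 4, 5],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
grid_search = GridSearchCV(DecisionTreeClassifier(random_state=42),
                           param_grid, cv=5)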
grid_search.fit(X_train, y_train)
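Continuing the same sketch, the best model can then be pulled out and evaluated:

from sklearn.metrics import accuracy_score, classification_report

best_tree = grid_search.best_estimator_
y_pred = best_tree.predict(X_test)
print("Best parameters:", grid_search.best_params_)
print(f"Tuned accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred, target_names=iris.target_names))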
1. Dataset:
o We used the Iris dataset, which has 4 features and 3 target classes.
2. Default Decision Tree:
o A basic decision tree is trained without parameter tuning.
o We evaluate its accuracy and visualize the decision tree.
3. Parameter Tuning:
o We used Grid Search with a parameter grid to find the optimal hyperparameters.
o Parameters tuned include:
   criterion: the function to measure split quality (Gini or Entropy).
   max_depth: the maximum depth of the tree.
   min_samples_split: the minimum number of samples required to split an internal node.
   min_samples_leaf: the minimum number of samples required at a leaf node.
o The best model is selected, evaluated, and visualized.
4. Evaluation:
o The accuracy and classification report are displayed for both default and
tuned models.
Sample Output
Default decision tree:
Classification Report:
              precision    recall  f1-score   support
    accuracy                           0.96        50
   macro avg       0.95      0.94      0.94        50
weighted avg       0.96      0.96      0.96        50

Tuned decision tree:
Classification Report:
              precision    recall  f1-score   support
    accuracy                           0.98        50
   macro avg       0.96      0.96      0.96        50
weighted avg       0.98      0.98      0.98        50
Visualizations
Key Takeaways
Let me know if you'd like to explore any specific part in more detail!
Here’s an example of using the Decision Tree algorithm for regression in Python. We'll
use a synthetic regression dataset and evaluate the model's performance based on
metrics such as Mean Squared Error (MSE) and R² score.
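The code listing itself did not survive in this copy. A minimal sketch matching the description below (sample count, noise level, and seed are assumptions) is:

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic dataset: one feature, with noise to simulate real-world data
X, y = make_regression(n_samples=200, n_features=1, noise=15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Decision tree regressor trained with default parameters
tree_reg = DecisionTreeRegressor(random_state=42)
tree_reg.fit(X_train, y_train)
y_pred = tree_reg.predict(X_test)

print("Regression Results:")
print(f"Mean Squared Error: {mean_squared_error(y_test, y_pred):.2f}")
print(f"R² Score: {r2_score(y_test, y_pred):.2f}")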
1. Dataset:
o A synthetic regression dataset is created using make_regression with one
feature and some noise added to simulate real-world data.
2. Decision Tree Regressor:
o A DecisionTreeRegressor is trained on the training data to predict the
target variable.
o Default parameters are used for the initial model.
3. Evaluation:
o The performance of the model is evaluated using:
   Mean Squared Error (MSE): measures the average squared difference between predicted and actual values.
   R² Score: indicates how well the model explains the variability of the target variable (1 indicates a perfect fit).
4. Visualization:
o The decision tree is visualized to understand its structure.
o A scatter plot is created to compare the actual values with the predicted
values.
Sample Output
Regression Results:
Visualizations:
Key Takeaways
o Decision trees can capture complex patterns but may overfit if not pruned or regularized.
o Visualizing the tree helps in understanding how the model is splitting the data.
o For better performance, consider hyperparameter tuning (e.g., max_depth, min_samples_split) or ensemble methods (e.g., Random Forests).
Let me know if you'd like to extend this example with parameter tuning or advanced
techniques!
Here's a demonstration of the Naïve Bayes Classification algorithm using Python. We'll
use the Gaussian Naïve Bayes model from sklearn and apply it to the Iris dataset to
classify different species of flowers.
This script loads the Iris dataset, trains a Gaussian Naïve Bayes classifier on a training split, and evaluates its predictions on a held-out test set. Only fragments of the listing survive here:
# Load dataset
iris = load_iris()
X, y = iris.data, iris.target
# Make predictions
y_pred = nb_classifier.predict(X_test)
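A minimal complete sketch that incorporates the surviving fragments (the split fraction and random seed are assumptions):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Train the Gaussian Naive Bayes classifier
nb_classifier = GaussianNB()
nb_classifier.fit(X_train, y_train)

# Make predictions and evaluate
y_pred = nb_classifier.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))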
OUTPUT
Accuracy: 1.00
Classification Report: