AIML Mid-2 Notes
Simple, paragraph-style notes covering the AIML syllabus, meant for quick revision,
fast learning, and easy understanding for the Mid-2 and Semester exams.
Machine Learning (ML) is a part of artificial intelligence where computers are trained to
learn from data and make decisions without being explicitly programmed. The main
idea is to let systems learn patterns and improve over time through experience. There
are three main types of ML: Supervised Learning, where the model learns from
labeled data; Unsupervised Learning, where it finds patterns from unlabeled data; and
Reinforcement Learning, where the system learns by trial and error using feedback or
rewards. ML is used in spam filtering, recommendation systems, image recognition, and
many real-world applications.
Before feeding data into a machine learning model, we need to clean and prepare it —
this is called Data Preprocessing. It includes steps like:
• Data Cleaning: Removing or filling missing data, correcting errors, and handling
duplicates.
• Data Splitting: Dividing data into training and testing sets to evaluate the
model’s performance.
• Data Normalization: Scaling data so that all features have similar ranges (like
bringing all values between 0 and 1).
• Data Batching: Breaking data into small parts (batches) to train the model
gradually, especially in deep learning.
• Data Shuffling: Mixing the order of the data to prevent the model from learning
an ordering pattern that isn't useful.
Together, these steps ensure the model trains efficiently and gives accurate results.
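The preprocessing steps above can be sketched in plain Python on a toy dataset (the numbers are made up; real projects typically use libraries like scikit-learn or pandas for this):

```python
import random

# Toy dataset: one feature per sample, plus a label (values are hypothetical).
data = [(2.0, 0), (4.0, 0), (6.0, 1), (8.0, 1), (10.0, 1)]

# Data Cleaning: drop samples with missing (None) features.
data = [(x, y) for (x, y) in data if x is not None]

# Data Normalization: scale the feature into the range [0, 1].
xs = [x for x, _ in data]
lo, hi = min(xs), max(xs)
data = [((x - lo) / (hi - lo), y) for (x, y) in data]

# Data Shuffling: mix the order so the model sees no artificial pattern.
random.seed(0)          # fixed seed only to keep this sketch reproducible
random.shuffle(data)

# Data Splitting: 80% for training, 20% for testing.
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

# Data Batching: break the training set into batches of 2.
batches = [train[i:i + 2] for i in range(0, len(train), 2)]
print(len(train), len(test), len(batches))  # 4 1 2
```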
Overfitting happens when a model learns the training data too well — including its
noise and errors — so it performs poorly on new, unseen data. Underfitting, on the
other hand, means the model is too simple and fails to learn enough from the training
data, leading to poor performance on both training and testing sets. A good ML model
balances both by generalizing well to new data.
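Overfitting can be seen in a tiny made-up example: a model that just memorizes its noisy training data gets a perfect training score but does worse on new data than a simple straight-line fit (all numbers below are hypothetical):

```python
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.5, 3.5, 6.5, 7.5]      # roughly y = 2x, with noise added
test_x  = [1.5, 2.5, 3.5]
test_y  = [3.0, 5.0, 7.0]           # noise-free "new" data

def memorizer(x):
    # Overfitted model: return the label of the nearest training point.
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

# Simple model: least-squares line through the training data.
mx = sum(train_x) / len(train_x)
my = sum(train_y) / len(train_y)
slope = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y)) \
        / sum((x - mx) ** 2 for x in train_x)
intercept = my - slope * mx
def line(x):
    return slope * x + intercept

def total_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys))

print(total_error(memorizer, train_x, train_y))  # 0.0  (perfect on training)
print(total_error(memorizer, test_x, test_y))    # 2.5  (poor on new data)
print(total_error(line, test_x, test_y))         # ~0.4 (generalizes better)
```

The memorizer learned the noise, not the trend; the line generalizes because it is just complex enough for the pattern.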
Performance Metrics in ML
• Precision tells how many of the positive predictions were actually correct.
• Cross-validation is used to train and test the model on different parts of the
dataset to ensure it's reliable and not overfitted.
Supervised Learning Algorithms
• Linear Regression is used for predicting continuous values (like house prices) by
fitting a straight line through the data.
• Logistic Regression is used for classification problems (like spam or not spam).
• Naive Bayes Classifier is based on probability and assumes all features are
independent.
• Decision Trees classify data by making decisions based on feature values — like
a flowchart.
• K-Nearest Neighbors (KNN) classifies new data based on the majority class
among the 'k' closest points.
• Support Vector Machines (SVM) find the best boundary (hyperplane) that
separates data into classes.
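KNN is simple enough to sketch directly. A minimal version on toy 2-D points (the data below is hypothetical) picks the 'k' nearest neighbors by Euclidean distance and takes a majority vote:

```python
from collections import Counter
import math

def knn_classify(train, new_point, k=3):
    """Classify new_point by majority vote among its k nearest neighbors.

    `train` is a list of ((x, y), label) pairs -- toy 2-D data.
    """
    # Sort training points by Euclidean distance to the new point.
    by_dist = sorted(train, key=lambda p: math.dist(p[0], new_point))
    # Majority class among the k closest points.
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# Hypothetical data: class 'A' near the origin, class 'B' near (5, 5).
train = [((0, 0), 'A'), ((1, 0), 'A'), ((0, 1), 'A'),
         ((5, 5), 'B'), ((6, 5), 'B'), ((5, 6), 'B')]
print(knn_classify(train, (1, 1)))   # 'A'
print(knn_classify(train, (5, 4)))   # 'B'
```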
Unsupervised learning finds hidden patterns or groups in data without labeled outputs.
• K-Means Clustering divides the data into 'k' clusters based on similarity. Each
data point is assigned to the nearest cluster center.
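The two K-Means steps (assign each point to the nearest center, then move each center to the mean of its points) can be sketched like this; the points and starting centers are made up, and the sketch assumes no cluster ever ends up empty:

```python
import math

def kmeans(points, centers, iters=10):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [(sum(x for x, _ in cl) / len(cl),
                    sum(y for _, y in cl) / len(cl)) for cl in clusters]
    return centers, clusters

# Hypothetical points forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers)   # about (1.33, 1.33) and (8.33, 8.33)
```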
ANNs are inspired by the human brain and are made of layers of interconnected nodes
(neurons). Each connection has a weight, and each neuron has a bias; together they
determine the output. The ANN learns by adjusting these weights and biases to reduce
error. The basic unit is the McCulloch-Pitts
neuron, and the simplest model is the Perceptron. A single-layer Perceptron can solve
only linearly separable problems (for example, it cannot learn XOR), while Multi-Layer
Perceptrons (MLPs) can solve complex ones using hidden layers.
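A single perceptron learning the (linearly separable) AND function is a classic minimal example. This sketch uses the standard perceptron learning rule, nudging weights and bias in proportion to the error:

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Train one perceptron (step activation) with the perceptron rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            # Adjust weights and bias in proportion to the error.
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b    += lr * err
    return w, b

# Logical AND is linearly separable, so one perceptron can learn it.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in and_data])  # [0, 0, 0, 1]
```

Swapping in XOR data here would never converge, which is exactly why hidden layers (MLPs) are needed for such problems.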
There are various types:
• Recurrent Neural Networks (RNNs) have loops and are good for sequences like
text or time-series data.
ANNs are used in image recognition, language translation, and more.