ML Notes
Machine Learning
Artificial Intelligence (AI): The broader concept of machines performing tasks that typically
require human intelligence, such as reasoning, problem-solving, and decision-making. AI
encompasses fields like natural language processing, robotics, and expert systems.
Machine Learning (ML): A subset of AI focusing on algorithms that enable systems to learn
patterns from data and improve performance over time without being explicitly programmed
for each task. ML is data-driven and forms the basis of many AI applications.
Key Differences: AI is the overall goal, while ML is a specific approach to achieve AI. AI
includes ML, expert systems, and reasoning algorithms; ML focuses solely on learning from
data.
1.
(a) Training and Test Data: Training data is used to fit the model, enabling it
to learn patterns. Test data is held out during training and used to evaluate the model's
performance on unseen data.
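A minimal sketch of how such a split might be made, using only the Python standard library. The function name train_test_split, the 80/20 ratio, and the fixed seed are assumptions chosen for illustration, not something the notes prescribe.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))
train, test = train_test_split(data, test_fraction=0.2, seed=0)
print(len(train), len(test))
```

The model is fitted on train only; test is touched once, at evaluation time, so the score reflects data the model has not seen.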
(b) Bayes Theorem and Its Significance:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Significance: It updates the probability of a hypothesis A after observing evidence B and is
widely used in probabilistic models like Naive Bayes.
(c) Differences Between Linear and Logistic Regression:
Linear regression predicts continuous outcomes; logistic regression predicts probabilities for
classification. Linear regression fits a line; logistic regression uses the sigmoid function to
model probabilities.
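The contrast can be sketched in a few lines of Python. The single-feature setup and the weight and bias values are assumptions for illustration; a real model would learn them from data.

```python
import math

def linear_predict(x, weight, bias):
    """Linear regression: the output is an unbounded continuous value."""
    return weight * x + bias

def sigmoid(z):
    """Map any real number into (0, 1); written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

def logistic_predict(x, weight, bias):
    """Logistic regression: the same linear score, squashed into a probability."""
    return sigmoid(weight * x + bias)

score = linear_predict(3.0, weight=2.0, bias=1.0)
probability = logistic_predict(3.0, weight=2.0, bias=1.0)
predicted_class = 1 if probability >= 0.5 else 0  # 0.5 is a common default cutoff
```

Both models compute the same linear score; logistic regression only adds the sigmoid and a decision threshold on top of it.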
2.
(a) Differences Between Classification and Regression:
Classification predicts discrete labels (e.g., spam or not spam); regression predicts continuous
values (e.g., house price).
(b) Steps for Building a Decision Tree:
1. Identify the best split based on criteria like Gini index or entropy.
2. Divide data into subsets based on the split.
3. Repeat the process recursively until a stopping condition is met (e.g., maximum
depth).
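The three steps above can be sketched as follows. This is a simplified illustration, assuming a single numeric feature and the Gini index as the split criterion. The names gini, best_split, and build_tree and the dictionary layout of the tree are hypothetical choices, not a standard API.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Step 1: find the threshold with the lowest weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

def build_tree(values, labels, depth=0, max_depth=3):
    """Steps 2 and 3: divide the data and recurse until a stopping condition."""
    threshold, _ = best_split(values, labels)
    # Stop at maximum depth, on a pure node, or when no split is possible.
    if depth >= max_depth or gini(labels) == 0.0 or threshold is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    left = [(x, y) for x, y in zip(values, labels) if x <= threshold]
    right = [(x, y) for x, y in zip(values, labels) if x > threshold]
    return {
        "threshold": threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([x for x, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
tree = build_tree(values, labels)
print(tree)
```

On this toy data a single split at 3.0 separates the two classes, so both children become leaves immediately.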
3.
(a) Ensemble Modeling: Combines the predictions of multiple models, using techniques such
as bagging or boosting, to improve predictive performance.
(b) Recurrent Networks: RNNs have feedback loops allowing information to persist,
making them suitable for sequential data like time series and natural language processing.
(c) Concept of a Perceptron: A single-layer neural network with weights, bias, and an
activation function. It classifies linearly separable data by adjusting its weights whenever
it misclassifies a training example.
[Attach a diagram: shows input, weights, summation, and activation output.]
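A minimal sketch of the perceptron learning rule, trained on the logical AND function, which is linearly separable. The step activation, the learning rate of 1.0, and the epoch count are assumptions chosen for illustration.

```python
def predict(weights, bias, x):
    """Weighted sum of the inputs plus bias, passed through a step activation."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else 0

def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Adjust weights and bias only when a training example is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

# Truth table of logical AND.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
print([predict(weights, bias, x) for x in samples])
```

On data that is not linearly separable (XOR, for example) this loop would never settle on weights that classify every point correctly.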
4.
(a) ANN: Artificial Neural Networks mimic biological neurons for complex tasks like image
recognition.
(b) Deep Learning: A subset of ML using multi-layered ANNs to learn from large datasets,
achieving state-of-the-art results in vision and language.
(c) Hierarchical Agglomerative Clustering: A bottom-up clustering technique that starts with
each data point as its own cluster and iteratively merges the two most similar clusters.
(d) PCA: Principal Component Analysis reduces dimensionality by projecting data onto a
new coordinate system whose axes (the principal components) are ordered by the variance
they capture.
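One common way to compute PCA is to take the eigenvectors of the data's covariance matrix. The sketch below does this by hand for 2-D points, using the closed-form eigenvalues of a symmetric 2x2 matrix so that no external library is needed. The function name pca_2d and the sample points are assumptions for illustration; in practice a linear algebra library would handle higher dimensions.

```python
import math

def pca_2d(points):
    """Return (first principal axis, its variance, 1-D projections) for 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    largest = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b != 0:
        vx, vy = largest - c, b  # eigenvector belonging to the largest eigenvalue
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)  # already axis-aligned
    norm = math.hypot(vx, vy)
    axis = (vx / norm, vy / norm)
    # Project each centered point onto the principal axis: 2-D becomes 1-D.
    projections = [x * axis[0] + y * axis[1] for x, y in centered]
    return axis, largest, projections

points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
axis, variance, projected = pca_2d(points)
print(axis, variance)
```

Because the sample points lie exactly on the line y = x, the first principal axis points along that diagonal and the 1-D projection loses no information.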
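(e) Multilayer Networks and Backpropagation: Multilayer networks contain one or more
hidden layers between input and output. Backpropagation applies the chain rule backward
through the layers to compute the gradient of the error between predicted and actual outputs
with respect to each weight; gradient descent then uses these gradients to update the weights.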
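A small sketch of one backpropagation step on a network with two inputs, two hidden units, and one output. The layer sizes, sigmoid activations, squared-error loss, starting weights, and learning rate are all assumptions chosen to keep the example short.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Forward pass: return hidden activations and the network output."""
    W1, b1, w2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output

def loss(params, x, target):
    """Squared error between predicted and actual output."""
    _, output = forward(params, x)
    return 0.5 * (output - target) ** 2

def backprop(params, x, target):
    """Chain rule applied from the output layer back to the hidden layer."""
    W1, b1, w2, b2 = params
    hidden, output = forward(params, x)
    delta_out = (output - target) * output * (1.0 - output)
    grad_w2 = [delta_out * h for h in hidden]
    grad_b2 = delta_out
    # Each hidden unit receives the output error scaled by its outgoing weight.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = list(delta_hidden)
    return grad_W1, grad_b1, grad_w2, grad_b2

def gradient_step(params, grads, learning_rate=0.5):
    """Gradient descent: move every parameter against its gradient."""
    W1, b1, w2, b2 = params
    gW1, gb1, gw2, gb2 = grads
    new_W1 = [[w - learning_rate * g for w, g in zip(row, grad_row)]
              for row, grad_row in zip(W1, gW1)]
    new_b1 = [b - learning_rate * g for b, g in zip(b1, gb1)]
    new_w2 = [w - learning_rate * g for w, g in zip(w2, gw2)]
    return new_W1, new_b1, new_w2, b2 - learning_rate * gb2

params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, target = [1.0, 0.5], 1.0
loss_before = loss(params, x, target)
updated = gradient_step(params, backprop(params, x, target))
loss_after = loss(updated, x, target)
print(loss_before, loss_after)
```

Training repeats this forward pass, backward pass, and update over many examples; the single step here only shows that the error moves in the right direction.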
Short Notes
Principal Component Analysis (PCA): A dimensionality reduction technique that
transforms data into a lower-dimensional space while retaining maximum variance. It
uses eigenvectors and eigenvalues of the covariance matrix.
Logistic Regression: A classification algorithm predicting probabilities using the
sigmoid function. It is used for binary outcomes.
Artificial Neural Network (ANN): A network of interconnected nodes (neurons)
designed to simulate human learning, widely used in image recognition and NLP.
Decision Tree and Pruning: A tree-based algorithm for classification and regression.
Pruning removes unnecessary branches to prevent overfitting and improve
generalization.
Multiple Linear Regression: Models the relationship between a dependent variable
and multiple independent variables, expressed as
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon$$
Information Gain (IG): Measures the reduction in entropy after a dataset is split on
an attribute.
Example: In a decision tree, IG helps choose the attribute that best splits the dataset.
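Information gain can be computed directly from its definition: the entropy of the labels before the split minus the weighted entropy of the groups after it. The sketch below assumes categorical attributes; the tiny dataset and the attribute names outlook and windy are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Entropy before the split minus the weighted entropy after it."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder

labels = ["yes", "yes", "no", "no"]
outlook = ["sunny", "sunny", "rain", "rain"]  # separates the classes perfectly
windy = ["true", "false", "true", "false"]    # tells us nothing about the class

gain_outlook = information_gain(labels, outlook)
gain_windy = information_gain(labels, windy)
print(gain_outlook, gain_windy)
```

A decision tree would split on outlook here, because it has the higher information gain of the two attributes.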
K-Means Algorithm
K-Means is a clustering algorithm that partitions data into k clusters:
1. Initialize k centroids.
2. Assign each data point to the nearest centroid.
3. Update centroids as the mean of points in each cluster.
4. Repeat steps 2 and 3 until centroids stabilize.
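The four steps above can be sketched as follows. Picking the initial centroids at random from the data, using squared Euclidean distance, and keeping the centroid of an empty cluster unchanged are assumptions of this sketch; other initialization schemes exist.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, max_iterations=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # step 1: initialize k centroids
    for _ in range(max_iterations):
        # Step 2: assign each point to the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 3: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(coords) / len(cluster) for coords in zip(*cluster)))
            else:
                new_centroids.append(old)  # empty cluster: keep the old centroid
        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
print(sorted(centroids))
```

K-Means can converge to different results for different starting centroids, so in practice it is often run several times with different seeds.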
Bayes Theorem
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Bayes theorem calculates the probability of an event A given evidence B. It is widely used in
probabilistic models like Naive Bayes.
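A short worked example makes the formula concrete. The scenario (a diagnostic test) and all three input probabilities are hypothetical numbers chosen for illustration. P(B) is expanded with the law of total probability.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) from P(A), P(B|A), and P(B|not A)."""
    # P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# A: person has the condition, B: test is positive.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # roughly 0.161
```

Even with a fairly accurate test, the posterior stays low here because the condition is rare: most positive results come from the much larger healthy group.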
Hierarchical Clustering
A clustering technique that builds a tree-like structure (dendrogram) by either merging
clusters (agglomerative) or splitting clusters (divisive). It does not require the number of
clusters to be specified beforehand.
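A minimal sketch of the agglomerative (bottom-up) variant on 1-D points. Single linkage (the distance between the closest members of two clusters) is an assumption of this sketch; complete and average linkage are common alternatives. The recorded merges are the information a dendrogram would display.

```python
def single_linkage(cluster_a, cluster_b):
    """Distance between the two closest members of two clusters."""
    return min(abs(a - b) for a in cluster_a for b in cluster_b)

def agglomerative(points, target_clusters=1):
    """Merge the two closest clusters until target_clusters remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    merges = []                       # merge history, in order
    while len(clusters) > target_clusters:
        pairs = [(single_linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        distance, i, j = min(pairs)
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for index, c in enumerate(clusters)
                    if index not in (i, j)] + [merged]
    return clusters, merges

points = [1.0, 1.2, 5.0, 5.1, 9.0]
clusters, merges = agglomerative(points, target_clusters=3)
print(sorted(sorted(c) for c in clusters))
```

Running with target_clusters=1 records the full merge history, and the tree can then be cut at any level to obtain the desired number of clusters after the fact.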
Short Notes
Supervised Learning: Models learn from labeled data. Example: Predicting housing
prices.
Unsupervised Learning: Models discover patterns in unlabeled data. Example:
Customer segmentation.
Agglomerative Clustering: A bottom-up approach to clustering where each data point
starts as its own cluster and the closest clusters are merged iteratively.
Overfitting: When a model performs well on training data but poorly on unseen data.
Support Vector Machine (SVM): A supervised algorithm that finds the hyperplane
separating the classes with the maximum margin.