AI Foundations and Challenges
Here are detailed answers to the questions from your Midterm Examination on Artificial
Intelligence & Machine Learning:

The foundations of AI draw on several disciplines:

1. Mathematics – Provides the theoretical basis for logic, probability, and optimization.
2. Computer Science – AI depends on algorithms, data structures, and programming
techniques.
3. Neuroscience – Inspires neural networks and brain-like computational models.
4. Linguistics – Aids in natural language processing (NLP) for human-computer
interaction.
5. Psychology & Cognitive Science – Helps in understanding learning, reasoning,
and problem-solving.
6. Statistics & Probability – Essential for machine learning, pattern recognition, and
decision-making.
7. Engineering & Robotics – Involves designing intelligent agents and autonomous
systems.
Due to these limitations, more advanced logic systems, such as first-order logic (FOL), are
used in AI.

Common uninformed search algorithms include Breadth-First Search (BFS), Depth-First
Search (DFS), and Uniform Cost Search (UCS).

Limitations:
1. High Time Complexity – Because they explore blindly, they may take an exponential
amount of time.
2. Memory Intensive – Some methods (such as BFS) require large amounts of memory to
store all nodes.
3. Not Efficient for Large Problems – They do not scale well to complex real-world
problems.
Q1 (d): Is heuristic search always better than blind search? Justify your
answer.
Not always, but heuristic search is generally better than blind search because it uses
additional knowledge to guide the search toward the goal efficiently:
1. Reduces Search Space – It focuses only on promising paths, unlike blind search.
2. Faster Execution – Heuristic algorithms such as A* typically reach the goal state faster
than BFS or DFS.
3. More Practical for Large Problems – Real-world AI applications, such as
pathfinding in maps, use heuristic search for efficiency.
However, a heuristic search is only as good as its heuristic: a poorly designed heuristic can
mislead the search, and A* guarantees an optimal solution only when its heuristic is
admissible.
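To make this concrete, here is a compact A* sketch on a small hypothetical graph; the edge costs and h(n) values below are made-up illustration data, not from the original answer:

```python
# A* search: always expand the node with the smallest f(n) = g(n) + h(n).
import heapq

graph = {                   # assumed example graph: node -> [(neighbor, step cost)]
    'A': [('B', 1), ('C', 4)],
    'B': [('D', 3), ('E', 1)],
    'C': [('G', 6)],
    'D': [('G', 4)],
    'E': [('G', 2)],
}
h = {'A': 4, 'B': 3, 'C': 5, 'D': 2, 'E': 1, 'G': 0}  # estimated cost to goal G

def a_star(start, goal):
    frontier = [(h[start], 0, start, [start])]        # (f, g, node, path)
    visited = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in visited:
            continue
        visited.add(node)
        for nxt, cost in graph.get(node, []):
            heapq.heappush(frontier, (g + cost + h[nxt], g + cost, nxt, path + [nxt]))
    return None, float('inf')

print(a_star('A', 'G'))     # (['A', 'B', 'E', 'G'], 4)
```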
| Aspect | Forward Chaining | Backward Chaining |
| --- | --- | --- |
| Direction | Starts from known facts and applies rules to reach a conclusion. | Starts from the goal and works backward to find supporting facts. |
| Efficiency | Works well when there are many rules and fewer goals. | Works well when there are few possible conclusions but many facts. |
Artificial Intelligence (AI) is a branch of computer science that develops systems
capable of mimicking human intelligence—learning, reasoning, and problem-solving.
Given: Two jugs of capacity 4L and 3L, and a target of 2L in one jug.
States: Different water levels in the jugs (e.g., (0,0), (4,0), (0,3)).
Actions: Fill, empty, pour water between jugs.
Goal State: (2, y) or (x, 2), i.e., any state in which one jug holds exactly 2 liters.
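As a sketch, this state space can be solved with a plain breadth-first search; the helper below is a hypothetical illustration, not part of the original answer:

```python
# BFS over water-jug states (a, b); returns the shortest sequence of states.
from collections import deque

def water_jug_bfs(cap_a=4, cap_b=3, target=2):
    start = (0, 0)
    parent = {start: None}            # also serves as the visited set
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        if a == target or b == target:
            path, state = [], (a, b)  # reconstruct the path back to the start
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        pour_ab = min(a, cap_b - b)   # how much A can pour into B
        pour_ba = min(b, cap_a - a)   # how much B can pour into A
        # Actions: fill A, fill B, empty A, empty B, pour A->B, pour B->A.
        for nxt in [(cap_a, b), (a, cap_b), (0, b), (a, 0),
                    (a - pour_ab, b + pour_ab), (a + pour_ba, b - pour_ba)]:
            if nxt not in parent:
                parent[nxt] = (a, b)
                queue.append(nxt)
    return None

print(water_jug_bfs())  # [(0, 0), (0, 3), (3, 0), (3, 3), (4, 2)]
```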
Example: Consider a graph traversal where we need to find a path from A to G. If h(n)
represents the estimated cost from node n to the goal, Best-First Search always expands the
node with the smallest h(n).
Advantages:
Disadvantage:
A Constraint Satisfaction Problem (CSP) has two key ingredients:
1. Variables that must be assigned values.
2. Constraints that restrict the allowed values.
Example: Sudoku – each cell is a variable with domain 1–9, and the constraints forbid
repeating a digit in any row, column, or 3×3 box.
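As a sketch of the same idea on a smaller instance than Sudoku, here is a minimal backtracking solver for a hypothetical map-coloring CSP (the variables, domains, and constraints below are illustration data):

```python
# Backtracking search for a CSP: assign variables one at a time and
# backtrack whenever a constraint is violated.
variables = ['WA', 'NT', 'SA', 'Q']
domains = {v: ['red', 'green', 'blue'] for v in variables}
neighbors = {'WA': ['NT', 'SA'], 'NT': ['WA', 'SA', 'Q'],
             'SA': ['WA', 'NT', 'Q'], 'Q': ['NT', 'SA']}

def backtrack(assignment):
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # Constraint: adjacent regions must get different colors.
        if all(assignment.get(n) != value for n in neighbors[var]):
            result = backtrack({**assignment, var: value})
            if result:
                return result
    return None

print(backtrack({}))  # {'WA': 'red', 'NT': 'green', 'SA': 'blue', 'Q': 'red'}
```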
Example (Unification):
1. Given predicates:
P(x, y): Loves(x, y)
Q(x): Loves(John, x)
2. To unify Loves(John, x) with Loves(y, Mary):
Substitution: {x = Mary, y = John}, which makes both expressions Loves(John, Mary).
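A toy unification sketch in Python; the term encoding (tuples for predicates, lowercase strings for variables) is an assumption for illustration, and the occurs-check is omitted:

```python
# Recursive unification: returns a substitution dict, or False on failure.
def unify(x, y, subst=None):
    if subst is None:
        subst = {}
    if subst is False:
        return False
    if x == y:
        return subst
    if isinstance(x, str) and x[0].islower():   # lowercase string = variable
        return unify_var(x, y, subst)
    if isinstance(y, str) and y[0].islower():
        return unify_var(y, x, subst)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):                # unify argument by argument
            subst = unify(xi, yi, subst)
            if subst is False:
                return False
        return subst
    return False                                # two different constants

def unify_var(var, term, subst):
    if var in subst:                            # variable already bound: follow it
        return unify(subst[var], term, subst)
    return {**subst, var: term}

# Loves(John, x) unified with Loves(y, Mary):
print(unify(('Loves', 'John', 'x'), ('Loves', 'y', 'Mary')))  # {'y': 'John', 'x': 'Mary'}
```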
| Feature | BFS | DFS |
| --- | --- | --- |
| Strategy | Explores level by level (queue). | Explores deep paths first (stack). |
| Memory Usage | High (stores all nodes at a level). | Low (stores only the current path). |
| Completeness | Always finds a solution (if one exists). | May get stuck at infinite depth. |
Supervised learning is a type of machine learning in which the model is trained on
input–output pairs; that is, we give the model pre-labeled data so it can learn patterns.

Types:
1. Generative Learning
2. Discriminative Learning
1. Generative Learning
🔹 Example Models:
Naïve Bayes Classifier
Gaussian Mixture Models (GMMs)
Hidden Markov Models (HMMs)
🔹 Use Cases:
Spam Detection (Email spam or not spam)
Face Recognition
Speech Recognition
2. Discriminative Learning
🔹 Example Models:
Logistic Regression
Support Vector Machines (SVMs)
Neural Networks
Random Forest
🔹 Use Cases:
Object Detection (whether an object is present in an image)
Sentiment Analysis (positive or negative review)
Credit Card Fraud Detection
| Aspect | Generative Learning | Discriminative Learning |
| --- | --- | --- |
| Approach | First learns the structure of the data, then classifies | Learns only the class boundary |
🎯 Conclusion:
If the probability distribution of the data is important, use Generative Learning.
If you only need classification or prediction, Discriminative Learning is the better
choice.
👉 An analogy:
A Generative Model is like an artist who can recreate the entire object.
A Discriminative Model is like a judge who only decides whether the object is genuine
or not.
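A minimal sketch of this contrast, assuming the scikit-learn API and synthetic data:

```python
# A generative classifier (Gaussian Naive Bayes models P(x | class) and P(class))
# versus a discriminative one (Logistic Regression models P(class | x) directly).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

generative = GaussianNB().fit(X_train, y_train)
discriminative = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("Naive Bayes accuracy:", generative.score(X_test, y_test))
print("Logistic Regression accuracy:", discriminative.score(X_test, y_test))
```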
Machine learning models can be classified into two categories based on how they learn
from data:
1. Parametric Learning
2. Non-Parametric Learning
🔹 Example Models:
✅ Linear Regression (y = mx + c)
✅ Logistic Regression
✅ Naïve Bayes
✅ Neural Networks (with a fixed architecture)
🔹 Use Cases:
✅ Stock Market Prediction (Assuming stock prices follow a linear trend)
✅ Medical Diagnosis (Classifying diseases based on fixed parameters)
✅ Spam Detection (Using probability-based classification)
🔹 Advantages:
✔️ Fast & Efficient → Requires fewer computations.
✔️ Less Data Required → Works well even with small datasets.
✔️ Easy to Interpret → The meaning of each parameter is clear and understandable.
🔹 Disadvantages:
❌ Limited Flexibility → May oversimplify complex data.
❌ Incorrect Assumptions → If the data does not follow the assumed function, the
model will have low accuracy.
🔹 How does it work?
No fixed equation or function assumption.
The more data you provide, the better the model performs.
These models are data-driven and can capture complex patterns.
🔹 Example Models:
✅ K-Nearest Neighbors (KNN)
✅ Decision Trees & Random Forests
✅ Support Vector Machines (SVMs)
✅ Neural Networks (Adaptive architectures)
🔹 Use Cases:
✅ Face Recognition (Deep learning models that adapt to new patterns)
✅ Weather Prediction (Detecting complex, non-linear patterns)
✅ Recommendation Systems (Personalized movie or product recommendations)
🔹 Advantages:
✔️ More Flexible → Can handle complex and non-linear data.
✔️ Better Accuracy → Can improve significantly with more data.
✔️ No Strong Assumptions → Works well even if the data is unpredictable.
🔹 Disadvantages:
❌ Slow Training → Training and predictions can be slow for large datasets.
❌ More Data Required → Needs a lot of data to perform well.
❌ High Memory Usage → Some models (like KNN) require storing the entire dataset.
💡 Parametric vs. Non-Parametric – Key Differences

| Feature | Parametric Learning | Non-Parametric Learning |
| --- | --- | --- |
| Assumptions | Assumes a fixed functional form (e.g., linear) | Makes no strong assumptions about the data |
| Speed | Fast to train and predict | Can be slow on large datasets |
| Data Required | Works well with small datasets | Needs a lot of data to perform well |
| Flexibility | Limited; may oversimplify complex data | High; captures complex, non-linear patterns |
| Examples | Linear/Logistic Regression, Naïve Bayes | KNN, Decision Trees, SVMs |
🎯 Conclusion:
If the problem is simple and less data is available, then Parametric Learning is
the best choice.
If the problem is complex and a large dataset is available, then Non-Parametric
Learning is better.

👉 An analogy: Parametric Learning is like a formula-based shortcut that works fast for
simple problems, while Non-Parametric Learning is like a trial-and-error method that
improves with experience but takes more time.
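To see the contrast concretely, a minimal sketch assuming the scikit-learn API and synthetic data:

```python
# A parametric model (Linear Regression: just a slope and intercept) versus a
# non-parametric one (k-Nearest Neighbors: keeps the training data around).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)   # a non-linear target

parametric = LinearRegression().fit(X, y)          # assumes y = mx + c
non_parametric = KNeighborsRegressor(n_neighbors=5).fit(X, y)

print("Linear Regression R^2:", parametric.score(X, y))   # low: wrong assumption
print("KNN R^2:", non_parametric.score(X, y))             # high: adapts to the curve
```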
If you need to classify complex data accurately while avoiding overfitting, SVM is an
excellent choice.
🔹 1. Linear SVM → Used when the data is linearly separable (the classes can be
separated by a straight line).
🔹 2. Non-Linear SVM → Used when the data is complex and not linearly separable (a
straight line cannot separate the classes, so the Kernel Trick is used).
4️⃣ Kernel Trick - Solution for Non-Linear Data
If data is not linear, SVM uses the Kernel Trick to map data into higher dimensions,
making it linearly separable.
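A small sketch of the kernel trick in practice, assuming the scikit-learn API and a synthetic non-linear dataset:

```python
# Compare a linear SVM with an RBF-kernel SVM on data shaped like two
# interleaving half-moons, which no straight line can separate.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))  # the RBF kernel should score higher
```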
Unsupervised Learning: Clustering
Clustering is one of the most important techniques in unsupervised learning, where the
model tries to group similar data points without any predefined labels.
🔹 Applications of Clustering
✅ Customer Segmentation → Identifying groups of similar customers for marketing.
✅ Anomaly Detection → Detecting fraud transactions in banking.
✅ Medical Diagnosis → Identifying diseases based on symptoms.
✅ Image Segmentation → Dividing an image into multiple regions.
🎯 Conclusion
Clustering is useful when you don't have labeled data.
Different algorithms work better for different types of data.
It helps in pattern discovery and decision-making.
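A minimal sketch, assuming the scikit-learn API and synthetic unlabeled blobs:

```python
# K-Means groups unlabeled points into k clusters around learned centroids.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three synthetic blobs, no labels attached.
X = np.vstack([rng.normal(loc, 0.5, size=(50, 2)) for loc in ((0, 0), (5, 5), (0, 5))])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])      # cluster index assigned to each point
print(kmeans.cluster_centers_)  # one learned centroid per cluster
```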
Let me know if you need more details! 😊🚀
dimensionality reduction, kernel methods.
🔹 Explanation
1️⃣ What is Dimensionality Reduction?
When data has too many features (dimensions), analysis becomes difficult.
Dimensionality Reduction is a technique that reduces the number of features
(variables) without losing important information.
It reduces overfitting and makes model training faster.
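A small sketch of dimensionality reduction with PCA, assuming the scikit-learn API and its bundled digits dataset:

```python
# PCA compresses 64-dimensional digit images down to 10 principal components.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)            # shape (1797, 64)
pca = PCA(n_components=10).fit(X)
X_reduced = pca.transform(X)                   # shape (1797, 10)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```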
🎯 Conclusion
Dimensionality reduction helps simplify data while retaining important
information.
Kernel methods improve performance by transforming non-linear data into a
higher-dimensional space.
Both techniques are crucial for improving accuracy and efficiency in machine
learning.
🔹 Probability Formula:

$P(A) = \dfrac{\text{Favorable Outcomes}}{\text{Total Outcomes}}$

🔹 Example:
When tossing a fair coin, $P(\text{Heads}) = P(\text{Tails}) = \tfrac{1}{2}$.

1. Addition Theorem (Union of Events)
Formula:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Example: Drawing a card that is either red or a king from a deck.

2. Multiplication Theorem (Intersection of Independent Events)
Formula:
$P(A \cap B) = P(A) \times P(B)$
Example: The probability of rolling a 6 on a die and flipping heads on a coin.

3. Bayes' Theorem
Used when the probability of an event is updated with prior information:
$P(A \mid B) = \dfrac{P(B \mid A) \cdot P(A)}{P(B)}$
Example: Disease diagnosis, spam filtering.
Vectors: An ordered list of numbers, e.g.

$v = \begin{bmatrix} 3 \\ 5 \\ -1 \end{bmatrix}$

Matrices: A grid of numbers used to represent data.
Example:

$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$

Tensors: Multi-dimensional arrays used in deep learning.

✅ Matrix Multiplication:
Multiply the rows of the first matrix with the columns of the second matrix and sum the products.
Example:

$A \cdot B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \times \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}$

✅ Transpose:
Swap rows with columns.
Example:

$A^T = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^T = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}$
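These operations map directly onto NumPy; a small sketch:

```python
import numpy as np

v = np.array([3, 5, -1])            # a vector
A = np.array([[1, 2], [3, 4]])      # a 2x2 matrix
B = np.array([[5, 6], [7, 8]])

print(A @ B)   # matrix multiplication -> [[19 22] [43 50]]
print(A.T)     # transpose             -> [[1 3] [2 4]]
```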
Midterm Examination
Time: 1 Hour 30 Minutes
Max Marks: 30
Instructions: Attempt any three questions including Q1, which is compulsory.
Q1 (Compulsory)
(a) Explain the difference between Generative Learning and Discriminative Learning
with examples. (3 Marks)
(b) What are parametric and non-parametric learning? Give examples of each. (3
Marks)
(c) Define Basic Probability Theory and explain Bayes’ Theorem with an example. (4
Marks)
Q2
(a) What is Linear Algebra? Explain how matrices and vectors are used in Machine
Learning. (5 Marks)
(b) Explain the working of a Support Vector Machine (SVM). How does it help in
classification problems? (5 Marks)
Q3
(a) What is clustering in Unsupervised Learning? Explain any two clustering algorithms
with examples. (5 Marks)
Q4
(a) Explain Neural Networks and how they are different from traditional machine learning
models. (5 Marks)
(b) What are kernel methods? How do they improve classification models like SVM? (5
Marks)
🚀 This paper ensures conceptual understanding while maintaining a balance between
theory and application. Let me know if you need modifications!
answer them
Q1 (Compulsory)
(a) Generative vs. Discriminative Learning (3 Marks)

| Aspect | Generative Learning | Discriminative Learning |
| --- | --- | --- |
| Use Case | Useful when the data distribution is important | Better for classification tasks |
🔹 Example:
Generative: Naïve Bayes predicts spam emails by learning the probability
distribution of words in spam vs. non-spam.
Discriminative: Logistic Regression directly learns a boundary between spam and
non-spam.
(b) Parametric vs. Non-Parametric Learning (3 Marks)
| Aspect | Parametric Learning | Non-Parametric Learning |
| --- | --- | --- |
| Use Case | Works well with structured, limited data | Works well with large, unstructured data |
🔹 Example:
Parametric: Logistic Regression assumes data follows a linear relationship.
Non-Parametric: KNN does not assume any fixed form and adapts based on
neighbors.
Bayes' Theorem
Bayes' Theorem helps update the probability of an event based on new evidence:
$P(A \mid B) = \dfrac{P(B \mid A) \cdot P(A)}{P(B)}$

where $P(A \mid B)$ is the posterior probability, $P(B \mid A)$ the likelihood, $P(A)$ the prior, and $P(B)$ the evidence.
Using Bayes' Theorem, we can calculate the actual probability of having the disease
given a positive test.
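A worked sketch with assumed numbers, since the answer leaves the figures unspecified: suppose the disease affects 1% of people ($P(D) = 0.01$), the test detects it 99% of the time ($P(+ \mid D) = 0.99$), and it gives false positives 5% of the time ($P(+ \mid \neg D) = 0.05$). Then:

$$P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.05 \times 0.99} \approx 0.167$$

So even after a positive test, the probability of actually having the disease is only about 17%, because the disease is rare.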
Q2
🔹 Example:
If you have three features and two data points, the dataset can be written as:

$X = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$
How does it work?
1. SVM finds the best hyperplane that separates the data into classes.
2. It maximizes the margin between the two classes.
3. If the data is not linearly separable, SVM uses kernel functions to transform it.
🔹 Example:
Linear SVM separates positive and negative classes using a straight line.
Non-Linear SVM uses the Kernel Trick to classify complex data.
Q3
1. K-Means Clustering: Partitions the data into K clusters by repeatedly assigning each
point to the nearest centroid and recomputing each centroid as the mean of its cluster.
🔹 Example:
If we have 1000 features, PCA can reduce it to 10 main components, making
computations faster.
Q4
Neural Networks (NN) mimic the human brain and outperform traditional ML models in
complex tasks.
| Aspect | Neural Networks | Traditional ML Models |
| --- | --- | --- |
| Example | CNNs (for images), RNNs (for time series) | Logistic Regression, Decision Trees |
Kernel Methods map data to a higher-dimensional space to make it linearly separable.
🔹 Common Kernels:
1. Linear Kernel: For linearly separable data.
2. Polynomial Kernel: For medium complexity data.
3. Radial Basis Function (RBF) Kernel: Maps data into infinite dimensions.
🔹 Example:
A simple XOR dataset is not separable in 2D, but using an RBF kernel, it can be
separated in higher dimensions.
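A tiny sketch of the XOR example, assuming the scikit-learn API:

```python
# XOR is not linearly separable in 2D, but an RBF-kernel SVM handles it.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])                               # XOR labels

print(SVC(kernel="linear").fit(X, y).score(X, y))        # ~0.5: no line separates XOR
print(SVC(kernel="rbf", gamma=2).fit(X, y).score(X, y))  # 1.0
```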
Conclusion
🚀 This answer key covers all the questions in detail with examples and explanations. Let
me know if you need more clarifications or modifications!