FML Solution 1

The document covers various topics related to human learning, machine learning, and statistical concepts. It includes definitions, types, and applications of learning methods, as well as specific algorithms and techniques like K-NN, regression, and data preprocessing. Additionally, it discusses tools like Scikit-learn, NumPy, and Matplotlib, highlighting their features and applications in machine learning.

Uploaded by howareyu494

✅ Q.1 (a) Human Learning (03 Marks)


🧠 Understanding:
Human learning is the natural way people gain knowledge or skills from experiences, studying,
observing, or being taught. It's how we understand the world, improve performance, and make
decisions.

Types:

●​ Explicit Learning: Conscious effort (e.g., reading a book)​

●​ Implicit Learning: Unconscious (e.g., learning to ride a bike)​

✅ Short Answer:
Human learning is the process of gaining knowledge or skills through experience, study, or
teaching.

💡 Flashcard:
Q: What is human learning?​
A: Learning through experience, study, or teaching.

✅ Q.1 (b) Supervised vs Unsupervised Learning (04 Marks)

🧠 Understanding:
Both are machine learning types but differ by whether the data is labeled.

Feature  | Supervised Learning        | Unsupervised Learning
Data     | Labeled                    | Unlabeled
Output   | Known                      | Hidden
Goal     | Predict output             | Find patterns
Examples | Classification, Regression | Clustering, PCA

✅ Short Answer:
Supervised learning uses labeled data to train models, while unsupervised learning finds
patterns from unlabeled data.

💡 Flashcard:
Q: Difference between supervised and unsupervised learning?​
A: Supervised → Labeled data.​
Unsupervised → Unlabeled data.

✅ Q.1 (c) Machine Learning Activities (07 Marks)


🧠 Understanding:
Machine Learning involves several steps to build an intelligent model from raw data. These
activities are done in sequence for better results.

Activities:

1.​ Data Collection​

2.​ Data Preprocessing​

3.​ Feature Selection​

4.​ Model Selection​

5.​ Training the Model​

6.​ Model Evaluation​

7.​ Prediction​

8.​ Tuning​

✅ Short Answer:
ML activities include collecting data, preprocessing it, selecting features and a model, training,
evaluating, predicting, and tuning.

💡 Flashcard:
Q: What are machine learning activities?​
A: Collection, Preprocessing, Feature selection, Model selection, Training, Evaluation, Prediction, Tuning.

✅ Q.1 (c) OR: Find Mean, Median, Mode (07 Marks)


Given Data:​
1, 1, 1, 2, 4, 5, 5, 6, 6, 7, 7, 7, 7, 8, 9, 10, 11

🧠 Understanding:
●​ Mean: Average = Total Sum / Count​

●​ Median: Middle value when sorted​

●​ Mode: Most frequent number​

Solution:

● Mean = 97 / 17 ≈ 5.71

● Median = 9th of the 17 sorted values = 6

● Mode = most frequent value (appears 4 times) = 7
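
The arithmetic can be double-checked with Python's standard `statistics` module. This is only a quick verification sketch; the variable names are chosen here for illustration.

```python
from statistics import mean, median, mode

data = [1, 1, 1, 2, 4, 5, 5, 6, 6, 7, 7, 7, 7, 8, 9, 10, 11]

total = sum(data)   # total sum of all values
count = len(data)   # number of values

data_mean = mean(data)       # total / count
data_median = median(data)   # middle value of the sorted list
data_mode = mode(data)       # most frequent value

print(f"sum={total}, count={count}")
print(f"mean={data_mean:.2f}, median={data_median}, mode={data_mode}")
```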

✅ Short Answer:
Mean ≈ 5.71, Median = 6, Mode = 7

💡 Flashcard:
Q: Mean, median, and mode of the given data?
A: Mean ≈ 5.71, Median = 6, Mode = 7

✅ Q.2 (a) Hold-Out Method Steps (03 Marks)


🧠 Understanding:
The hold-out method is used to split the dataset for training and testing. It helps evaluate model
performance.

Steps:

1.​ Divide dataset into 2 parts: training and testing (e.g., 80-20 split)​

2.​ Train the model using training data.​

3.​ Test the model using testing data.​

4.​ Evaluate performance.​
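
Step 1 (the split) can be sketched in plain Python as below. The function name `hold_out_split`, the `seed` parameter, and the sample data are assumptions made for illustration; training and evaluation depend on the chosen model and are not shown.

```python
import random

def hold_out_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

samples = list(range(10))                      # stand-in for 10 data rows
train, test = hold_out_split(samples, test_ratio=0.2)
print(len(train), "training rows,", len(test), "testing rows")
```

Shuffling before splitting avoids a test set made only of the last rows, which matters when the data is stored in some sorted order.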

✅ Short Answer:
Split data → Train model → Test model → Evaluate results.

💡 Flashcard:
Q: What are steps in the hold-out method?​
A: Split, Train, Test, Evaluate.

✅ Q.2 (b) Confusion Matrix Structure (04 Marks)


🧠 Understanding:
A confusion matrix helps evaluate classification models by comparing predicted and actual
results.

Structure:

                | Predicted Positive  | Predicted Negative
Actual Positive | True Positive (TP)  | False Negative (FN)
Actual Negative | False Positive (FP) | True Negative (TN)
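
The four cells can be counted directly from actual and predicted labels. A minimal sketch, assuming binary labels where 1 is the positive class; the function name and the example labels are made up for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, FN, TN for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # predicted positive, actually positive
        elif a != positive and p == positive:
            fp += 1          # predicted positive, actually negative
        elif a == positive and p != positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```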

✅ Short Answer:
A table showing TP, FP, FN, TN to evaluate model accuracy.

💡 Flashcard:
Q: What is a confusion matrix?​
A: A table with TP, FP, FN, TN to evaluate predictions.

✅ Q.2 (c) Data Preprocessing (07 Marks)


🧠 Understanding:
Data preprocessing prepares raw data for machine learning. It ensures the data is clean, consistent,
and usable.

Methods:

1.​ Data Cleaning – Handle missing or incorrect values.​


2.​ Data Transformation – Scaling, encoding, normalizing.​

3.​ Feature Selection – Choosing important features.​

4. Data Reduction – Reduce data size, e.g., with PCA.

5.​ Data Integration – Combine data from multiple sources.​
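
Two of these methods, cleaning and transformation, can be sketched in plain Python. The example assumes missing values are stored as `None` and uses mean imputation and min-max scaling, which are only one common choice each; the function names and the `ages` data are illustrative.

```python
def fill_missing_with_mean(values):
    """Data cleaning: replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Data transformation: rescale values to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # all values equal, nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

ages = [20, None, 40, 60]
cleaned = fill_missing_with_mean(ages)
scaled = min_max_scale(cleaned)
print(cleaned)
print(scaled)
```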

✅ Short Answer:
Data preprocessing improves data quality using cleaning, transformation, feature selection, reduction,
and integration.

💡 Flashcard:
Q: What is data preprocessing?​
A: Preparing data using cleaning, transformation, and selection methods.

✅ Q.2 OR (a) Histogram with Example (03 Marks)


🧠 Understanding:
A histogram is a bar graph showing the frequency of data values grouped in intervals (bins).

Example:​
If student marks are grouped (0–10, 10–20…), the histogram shows how many fall in each
group.
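
The grouping behind a histogram can be sketched without any plotting library. The marks below are invented for illustration, bins are assumed to be 10 wide, and the top mark (50) is placed in the last bin.

```python
def histogram_counts(values, bin_width, n_bins):
    """Count how many non-negative values fall into each bin."""
    counts = [0] * n_bins
    for v in values:
        index = min(int(v // bin_width), n_bins - 1)   # clamp the maximum into the last bin
        counts[index] += 1
    return counts

marks = [5, 12, 18, 25, 27, 29, 33, 41, 45, 50]
counts = histogram_counts(marks, bin_width=10, n_bins=5)

# Text version of the histogram: one bar of '#' per bin.
for i, c in enumerate(counts):
    print(f"{i * 10:>2}-{(i + 1) * 10:<3} {'#' * c}")
```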

✅ Short Answer:
Histogram shows frequency distribution using bars.

💡 Flashcard:
Q: What is a histogram?​
A: A graph showing data frequency using bars.

✅ Q.2 OR (b) Data Types for Examples (04 Marks)


🧠 Understanding:
Example            | Data Type
Gender of a person | Nominal
Rank of students   | Ordinal
Price of home      | Ratio
Color of a flower  | Nominal

✅ Short Answer:
●​ Gender → Nominal​

●​ Rank → Ordinal​

●​ Price → Ratio​

●​ Color → Nominal​

💡 Flashcard:
Q: Data types for gender, rank, price, and color?​
A: Nominal, Ordinal, Ratio, Nominal

✅ Q.2 OR (c) K-Fold Cross Validation (07 Marks)


🧠 Understanding:
K-fold cross-validation divides the data into k parts. It ensures the model is trained and tested
on all data parts, improving reliability.

Steps:

1.​ Split data into k equal parts (e.g., k = 5)​

2.​ Train on k−1 parts, test on the remaining one.​


3.​ Repeat k times, changing the test set each time.​

4.​ Average results for final accuracy.​
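
The splitting in steps 1 to 3 can be sketched as a generator of index lists. The function name `k_fold_indices` is hypothetical, the data is not shuffled, and model training and scoring are left out because they depend on the chosen algorithm.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each sample appears in exactly one test fold, so averaging the k scores uses every sample for testing once.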

✅ Short Answer:
K-fold divides data into k parts and tests the model k times with different sets.

💡 Flashcard:
Q: What is K-fold cross-validation?​
A: It splits data into k parts, testing and training k times.

✅ Q.3 (a) Applications of Regression (03 Marks)


🧠 Understanding:
Regression is used to predict continuous values.

Applications:

1.​ Predicting house prices​

2.​ Estimating sales revenue​

3.​ Forecasting weather​

✅ Short Answer:
Used in price prediction, sales estimation, and weather forecasting.

💡 Flashcard:
Q: Applications of regression?​
A: House price, sales, weather prediction.

✅ Q.3 (b) Single Linear Regression (04 Marks)


🧠 Understanding:
It is a regression model with one independent variable (X) and one dependent variable (Y). The
relationship is shown by a straight line, where m is the slope and c is the intercept:
Y = mX + c
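
One common way to find m (slope) and c (intercept) is the least-squares fit. The notes do not say which fitting method is expected, so treat this as a sketch under that assumption; the data points are made up and lie exactly on Y = 2X + 1.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept c for Y = mX + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx                 # slope
    c = mean_y - m * mean_x       # intercept
    return m, c

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, c = fit_line(xs, ys)
print(f"Y = {m:.2f} X + {c:.2f}")
print("prediction for X = 6:", m * 6 + c)
```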

✅ Short Answer:
Uses one variable to predict another using a straight line.

💡 Flashcard:
Q: What is single linear regression?​
A: Predicts one value using one variable in a straight-line form.

✅ Q.3 (c) K-NN Algorithm (07 Marks)


🧠 Understanding:
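K-Nearest Neighbors (K-NN) is a supervised algorithm used mainly for classification (it can also be used for regression). It labels a new point using the classes of its k closest training points.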

Steps:

1.​ Choose number of neighbors (k)​

2.​ Measure distance (e.g., Euclidean)​

3.​ Find k nearest neighbors to input​

4.​ Classify based on majority class of neighbors​
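
The four steps map onto a few lines of Python. A minimal sketch, assuming Euclidean distance and small in-memory lists; the function name `knn_predict` and the toy points are illustrative only.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Step 2: distance from the query to every training point.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Step 3: keep the labels of the k nearest neighbors.
    nearest_labels = [label for _, label in distances[:k]]
    # Step 4: majority class wins.
    return Counter(nearest_labels).most_common(1)[0][0]

points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(points, labels, query=(2, 2), k=3))
print(knn_predict(points, labels, query=(6, 5), k=3))
```

An odd k is a common choice for two classes because it avoids tied votes.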

✅ Short Answer:
K-NN classifies based on nearest k data points.

💡 Flashcard:
Q: What does K-NN do?​
A: Classifies based on the nearest k neighbors.

✅ Q.3 OR (a) Supervised Learning in Healthcare (03 Marks)

🧠 Understanding:
Examples:

1.​ Disease prediction​

2.​ Patient risk classification​

3.​ Medical image diagnosis​

✅ Short Answer:
Used for disease prediction, diagnosis, and patient classification.

💡 Flashcard:
Q: 3 supervised learning examples in healthcare?​
A: Disease prediction, risk analysis, diagnosis.

✅ Q.3 OR (b) Classification vs Regression (04 Marks)


🧠 Understanding:
Feature     | Classification | Regression
Output Type | Categories     | Continuous values
Example     | Spam or not    | House price prediction

✅ Short Answer:
Classification = categories; Regression = values.

💡 Flashcard:
Q: Classification vs Regression?​
A: Classification = classes; Regression = numbers.

✅ Q.3 OR (c) Classification Learning Steps (07 Marks)


🧠 Understanding:
Steps:

1.​ Collect labeled data​

2.​ Preprocess data​

3.​ Choose classification algorithm​

4.​ Train the model​

5.​ Test on unseen data​

6.​ Evaluate using accuracy, precision, etc.​
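
For step 6, accuracy and precision follow directly from the confusion-matrix counts. Recall is added here as one example of the "etc."; the counts passed in are hypothetical numbers, not results from any real model.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total                        # share of all predictions that are correct
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # share of predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # share of actual positives that were found
    return accuracy, precision, recall

accuracy, precision, recall = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
```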

✅ Short Answer:
Collect → Preprocess → Choose algorithm → Train → Test → Evaluate

💡 Flashcard:
Q: Steps in classification learning?
A: Collect, preprocess, choose algorithm, train, test, evaluate.

✅ Q.4 (a) Clustering vs Classification (03 Marks)


🧠 Understanding:
Feature | Clustering        | Classification
Labels  | Not given         | Given
Output  | Groups (clusters) | Categories (classes)

✅ Short Answer:
Clustering = no labels, Classification = labeled data.

💡 Flashcard:
Q: Clustering vs Classification?​
A: Clustering = no labels; Classification = labeled.

✅ Q.4 (b) Apriori Algorithm – Advantages & Disadvantages (04 Marks)

🧠 Understanding:
Advantages:

●​ Easy to understand​

●​ Works well with small datasets​

Disadvantages:

●​ Slow with large data​

● Generates many candidate itemsets and rules, and needs repeated scans of the data

✅ Short Answer:
+ Easy to use; − Slow on big data.

💡 Flashcard:
Q: One advantage and disadvantage of Apriori?​
A: + Simple; − Slow on large data.

✅ Q.4 (c) Applications of Unsupervised Learning (07 Marks)

🧠 Understanding:
Applications:

1.​ Customer segmentation​

2.​ Fraud detection​

3.​ Market basket analysis​

4.​ Document clustering​

5.​ Recommender systems​

✅ Short Answer:
Used in grouping, fraud detection, recommendations.

💡 Flashcard:
Q: 3 uses of unsupervised learning?​
A: Segmentation, fraud detection, clustering.

✅ Q.4 OR (a) Apriori Applications (03 Marks)


🧠 Understanding:
Used in:

●​ Market basket analysis​

●​ Product recommendation​

●​ Customer purchase behavior​


✅ Short Answer:
Used in finding item patterns in market data.

💡 Flashcard:
Q: Where is Apriori used?​
A: Market basket, product suggestions.

✅ Q.4 OR (b) Support & Confidence (04 Marks)


🧠 Understanding:
● Support: How frequently an itemset appears.
Support(A) = (transactions containing A) / (total transactions)

● Confidence: How often B appears in transactions that contain A.
Confidence(A→B) = Support(A & B) / Support(A)
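
Both formulas can be tried on a small set of transactions. The shopping baskets below are invented for illustration, and the function names are not taken from any library.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support(A & B) / Support(A) for the rule A -> B."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

# Bread is in 4 of 5 baskets; 3 of those 4 also contain milk.
print("Support(bread) =", support(transactions, {"bread"}))
print("Confidence(bread -> milk) =", round(confidence(transactions, {"bread"}, {"milk"}), 2))
```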

✅ Short Answer:
Support = frequency, Confidence = rule accuracy.

💡 Flashcard:
Q: Support vs Confidence?​
A: Support = frequency; Confidence = rule success.

✅ Q.4 OR (c) K-means Clustering (07 Marks)


🧠 Understanding:
Steps:

1.​ Choose number of clusters (k)​

2.​ Randomly pick k centroids​


3.​ Assign points to nearest centroid​

4.​ Recalculate centroids​

5.​ Repeat until convergence​
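
The five steps can be sketched in plain Python as follows. This is a simplified version: initial centroids are drawn from the data points with a fixed `seed`, an empty cluster keeps its old centroid, and convergence means the centroids stop changing. The function name and the sample points are assumptions for illustration.

```python
import math
import random

def k_means(points, k, seed=0, max_iter=100):
    """Return (centroids, clusters) after running basic k-means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)              # step 2: random initial centroids
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:                           # step 3: assign to nearest centroid
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for i, cluster in enumerate(clusters):     # step 4: recalculate centroids
            if cluster:
                new_centroids.append(tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # empty cluster keeps its centroid
        if new_centroids == centroids:             # step 5: stop when nothing changes
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, k=2)
print(sorted(centroids))
```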

✅ Short Answer:
K-means groups data into k clusters using centroids.

💡 Flashcard:
Q: What does K-means do?​
A: Groups data into k clusters.

✅ Q.5 (a) Predictive vs Descriptive Model (03 Marks)


🧠 Understanding:
Type   | Predictive      | Descriptive
Goal   | Forecast future | Explain past data
Output | Predictions     | Patterns/insights

✅ Short Answer:
Predictive = future, Descriptive = past.

💡 Flashcard:
Q: Predictive vs Descriptive model?​
A: Predictive = forecasts; Descriptive = explains.

✅ Q.5 (b) Applications of Scikit-learn (04 Marks)


🧠 Understanding:
Applications:

●​ Classification​

●​ Regression​

●​ Clustering​

●​ Model selection​

●​ Dimensionality reduction​

✅ Short Answer:
Used for ML tasks like classification, regression, and clustering.

💡 Flashcard:
Q: What is Scikit-learn used for?​
A: ML tasks like classification and regression.

✅ Q.5 (c) Features and Applications of NumPy (07 Marks)

🧠 Understanding:
Features:

●​ N-dimensional array​

●​ Broadcasting​

●​ Fast operations​

Applications:

●​ Data analysis​

●​ Scientific computing​
●​ Machine learning backend​

✅ Short Answer:
Fast array handling for ML and data analysis.

💡 Flashcard:
Q: NumPy use in ML?​
A: Fast arrays and numerical computing.

✅ Q.5 OR (a) Bagging (03 Marks)


🧠 Understanding:
Bagging = Bootstrap Aggregating. It trains multiple models on bootstrap samples (random samples
drawn with replacement from the training data) and combines their predictions by voting or averaging.
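
A toy sketch of the idea, assuming a very simple base learner (a nearest-class-mean classifier on one feature) and majority voting. Real bagging usually uses stronger base models such as decision trees; all names and data here are illustrative.

```python
import random
from collections import Counter

def train_nearest_mean(samples):
    """Base learner: store the mean feature value of each class."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_nearest_mean(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))

def bagging_fit(samples, n_models=25, seed=0):
    """Train each model on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_nearest_mean(bootstrap))
    return models

def bagging_predict(models, x):
    """Combine the models by majority vote."""
    votes = Counter(predict_nearest_mean(m, x) for m in models)
    return votes.most_common(1)[0][0]

samples = [(0.0, "low"), (1.0, "low"), (2.0, "low"), (3.0, "low"),
           (8.0, "high"), (9.0, "high"), (10.0, "high"), (11.0, "high")]
models = bagging_fit(samples)
print(bagging_predict(models, 0.5), bagging_predict(models, 10.5))
```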

✅ Short Answer:
Combines models trained on random bootstrap samples to reduce variance and error.

💡 Flashcard:
Q: What is bagging?​
A: Combines models to reduce variance.

✅ Q.5 OR (b) Features of Pandas (04 Marks)


🧠 Understanding:
Features:

●​ DataFrames for tabular data​

●​ Handling missing data​


●​ Easy filtering and merging​

●​ Fast operations​

✅ Short Answer:
Supports DataFrame, filtering, missing data handling.

💡 Flashcard:
Q: Key features of Pandas?​
A: DataFrame, filtering, fast data handling.

✅ Q.5 OR (c) Features and Applications of Matplotlib (07 Marks)

🧠 Understanding:
Features:

●​ 2D plotting​

●​ Customizable plots​

●​ Integration with NumPy and Pandas​

Applications:

●​ Data visualization​

●​ Trend analysis​

●​ ML result plotting​

✅ Short Answer:
Used for creating graphs and visualizations in ML.

💡 Flashcard:
Q: Use of Matplotlib?​
A: Plot graphs for data and ML.
