Advanced Machine Learning 3

The document discusses advanced machine learning concepts, including concept learning through the Find-S algorithm, fraud detection using the PyCaret library, and various word embedding techniques in NLP. It illustrates how concept learning derives general rules from specific examples, highlights the automation of fraud detection with minimal coding, and compares traditional and advanced word embedding methods. Key insights emphasize the importance of feature engineering in fraud detection and the advantages of modern embedding techniques for understanding word meanings.


Master of Business Administration

Batch 2023-25

Advanced Machine Learning

Assignment 3

Submitted to:
Dr. Praveen Gujjar

Submitted By:
Rishwanth GS
USN: 23MBAR0069
1. Demonstrate the Use of Concept Learning with Example
Concept learning is the process of deriving a general rule from specific training examples. It
plays a crucial role in supervised machine learning, where a model learns to classify new
data based on given attributes. One of the simplest approaches in concept learning is the
Find-S Algorithm, which finds the most specific hypothesis that fits all positive examples.
The algorithm starts with the most restrictive hypothesis and generalizes it as it encounters
new positive examples.
Example: Predicting if a Person Will Play Tennis
Consider the following dataset where we predict whether a person will play tennis based on
weather conditions.

Outlook Temperature Humidity Wind Play Tennis

Sunny Hot High Weak No

Sunny Hot High Strong No

Overcast Hot High Weak Yes

Rain Mild High Weak Yes

Rain Cool Normal Weak Yes

Using Find-S, we begin with the most specific hypothesis, h = (∅, ∅, ∅, ∅), where ∅ means
"no value accepted". The first positive example replaces each ∅ with that example's attribute
value; each later positive example generalizes any attribute that disagrees to '?':
1. First positive example (Overcast, Hot, High, Weak): h = (Overcast, Hot, High, Weak)
2. Second positive example (Rain, Mild, High, Weak): h = (?, ?, High, Weak)
3. Third positive example (Rain, Cool, Normal, Weak): h = (?, ?, ?, Weak)
Final Hypothesis
(?, ?, ?, Weak) → The person will play tennis if the wind is weak, regardless of other
conditions.
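The generalization steps above can be sketched in plain Python (a minimal illustration using
the tennis dataset from the table; attribute order is Outlook, Temperature, Humidity, Wind):

```python
# Find-S: find the most specific hypothesis consistent with all positive examples.
# The returned hypothesis uses '?' to mean "any value is acceptable".

def find_s(examples):
    hypothesis = None
    for attributes, label in examples:
        if label != "Yes":            # Find-S ignores negative examples
            continue
        if hypothesis is None:        # first positive example initializes h
            hypothesis = list(attributes)
        else:                         # generalize attributes that disagree to '?'
            hypothesis = [h if h == a else "?"
                          for h, a in zip(hypothesis, attributes)]
    return hypothesis

data = [
    (("Sunny",    "Hot",  "High",   "Weak"),   "No"),
    (("Sunny",    "Hot",  "High",   "Strong"), "No"),
    (("Overcast", "Hot",  "High",   "Weak"),   "Yes"),
    (("Rain",     "Mild", "High",   "Weak"),   "Yes"),
    (("Rain",     "Cool", "Normal", "Weak"),   "Yes"),
]

print(find_s(data))  # ['?', '?', '?', 'Weak']
```

Running this reproduces the final hypothesis derived above: only Wind = Weak survives all
three positive examples.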
Limitations
 Find-S assumes there are no contradictory examples (all data is clean).
 It does not handle negative examples or missing data well.
 It finds only one hypothesis and ignores alternative possibilities.
Despite its simplicity, Find-S introduces the core idea of concept learning and is a stepping
stone to more advanced classification models like Decision Trees and Neural Networks.

2. Illustrate Fraud Detection Using PyCaret


Fraud detection is a critical machine learning application that helps identify suspicious
transactions in financial systems. It involves classifying transactions as either fraudulent (1)
or non-fraudulent (0) based on various features such as transaction amount, location, and
user behavior. PyCaret, an automated machine learning (AutoML) library, simplifies fraud
detection by handling data preprocessing, model selection, and evaluation with minimal
coding.
Example: Detecting Fraudulent Transactions
We use a dataset where each transaction has features like amount, time, transaction type, and
location. Our goal is to predict whether a transaction is fraudulent.
Step 1: Install and Import PyCaret
!pip install pycaret
from pycaret.classification import *
import pandas as pd
Step 2: Load the Fraud Detection Dataset
from pycaret.datasets import get_data
data = get_data('fraud') # Example dataset
print(data.head())
Step 3: Set Up PyCaret
clf = setup(data, target='Class', session_id=123)
 target='Class' means we are predicting whether a transaction is fraudulent.
Step 4: Train a Model
model = create_model('rf') # Train a Random Forest classifier
Step 5: Evaluate the Model
evaluate_model(model)
This opens an interactive dashboard to analyze performance using metrics like Accuracy,
Precision, Recall, and F1-score.
Step 6: Make Predictions
predictions = predict_model(model)
print(predictions[['Class', 'Label']].head())
 Class is the actual fraud label, while Label is the model’s prediction (in PyCaret 3.x
this column is named prediction_label).
Key Insights
1. PyCaret automates fraud detection by selecting the best algorithm.
2. Fraud detection requires feature engineering, including past transaction history
and behavioral analysis.
3. Combining PyCaret with deep learning and anomaly detection can further
improve fraud identification.
This approach makes fraud detection accessible even for those with minimal coding
experience while delivering high accuracy.
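To make the feature-engineering point concrete without any library, the sketch below derives
one simple behavioral feature: how far a new transaction amount deviates from a user's past
average. The function name and the idea of flagging large deviations are illustrative
assumptions, not part of PyCaret; such a feature would be added to the dataset before setup().

```python
from statistics import mean, pstdev

def amount_zscore(history, amount):
    """Standard deviations between a new amount and the user's historical average."""
    avg = mean(history)
    sd = pstdev(history) or 1.0   # avoid division by zero for constant histories
    return (amount - avg) / sd

# Past transaction amounts for one user
history = [20.0, 35.0, 25.0, 30.0]

print(round(amount_zscore(history, 27.0), 2))   # typical amount -> small deviation
print(round(amount_zscore(history, 500.0), 2))  # unusually large -> worth flagging
```

A model trained on such derived features (rather than raw amounts alone) can pick up the
behavioral patterns mentioned in the insights above.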

3. Highlight the Various Word Embedding Techniques


Word embeddings are techniques used in Natural Language Processing (NLP) to convert
words into numerical vectors while preserving their meaning. Traditional text
representation methods, such as One-Hot Encoding and TF-IDF, do not capture semantic
relationships between words. More advanced techniques like Word2Vec, GloVe, FastText,
and BERT allow models to understand word meanings based on context.
Example: Different Word Embedding Techniques
1. One-Hot Encoding
 Assigns a unique binary vector to each word in a vocabulary.
 Example: For ["dog", "cat", "fish"]:
o dog = [1, 0, 0], cat = [0, 1, 0], fish = [0, 0, 1]
 Limitation: High dimensionality and no word similarity captured.
2. TF-IDF (Term Frequency-Inverse Document Frequency)
 Weighs words based on how often they appear in a document relative to
all documents.
 Example: "Apple" in a fruit-related article will have a higher TF-IDF score than in
an article about technology.
 Limitation: Does not capture word meaning or relationships.
3. Word2Vec
 Developed by Google, it predicts words based on context.
 Types:
o CBOW (Continuous Bag of Words): Predicts a word from surrounding words.
o Skip-Gram: Predicts surrounding words from a given word.
 Example: "King" − "Man" + "Woman" ≈ "Queen"
 Advantage: Captures semantic relationships.
4. GloVe (Global Vectors for Word Representation)
 Uses word co-occurrence statistics.
 Example: "Ice" and "winter" appear together frequently, so their vectors will
be similar.
 Advantage: Captures meaning based on word usage patterns.
5. FastText
 Enhances Word2Vec by considering subwords.
 Example: The word "playing" is represented by character n-grams such as <pl, pla, lay,
ayi, yin, ing, ng>, improving rare word handling.
 Advantage: Useful for morphologically rich languages.
6. BERT (Bidirectional Encoder Representations from
Transformers)
 Uses deep learning to generate context-aware embeddings.
 Example:
o In "I deposited money in the bank", "bank" means financial institution.
o In "She sat by the river bank", "bank" means land near a river.
 Advantage: Most powerful for context-aware NLP tasks.
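The contrast between the count-based techniques above can be shown with only the standard
library. The sketch below builds one-hot vectors and raw TF-IDF scores for a toy corpus;
the exact IDF formula (unsmoothed log) is an assumption, and real libraries such as
scikit-learn apply different smoothing and normalization:

```python
import math

docs = [
    ["apple", "fruit", "sweet"],
    ["apple", "phone", "technology"],
    ["fruit", "market", "fresh"],
]

vocab = sorted({w for d in docs for w in d})

# One-hot: each word gets a unique standard-basis vector (no similarity captured)
one_hot = {w: [1 if i == j else 0 for j in range(len(vocab))]
           for i, w in enumerate(vocab)}

def tf_idf(word, doc, docs):
    tf = doc.count(word) / len(doc)            # term frequency within the document
    df = sum(word in d for d in docs)          # number of documents containing the word
    idf = math.log(len(docs) / df)             # plain (unsmoothed) idf, an assumption
    return tf * idf

# "apple" appears in 2 of 3 docs, so it scores lower than "sweet" (1 of 3)
print(tf_idf("apple", docs[0], docs))
print(tf_idf("sweet", docs[0], docs))
```

Note that both representations treat "apple" identically in the fruit and phone documents;
it is exactly this limitation that Word2Vec, GloVe, FastText, and BERT address.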
