Lecture#11

Word Embedding and How It Performs

Uploaded by Qareena sadiq

Machine Learning (ML) for Natural

Language Processing (NLP)


Machine Learning for NLP
• Machine learning (ML) for natural language processing (NLP) and
text analytics involves using machine learning algorithms and
“narrow” artificial intelligence (AI) to understand the meaning of
text documents. These documents can be just about anything that
contains text: social media comments, online reviews, survey
responses, even financial, medical, legal and regulatory
documents. In essence, the role of machine learning and AI in
natural language processing and text analytics is to improve,
accelerate and automate the underlying text analytics functions
and NLP features that turn this unstructured text into usable data
and insights.
• Most importantly, “machine learning” really means “machine
teaching.” We know what the machine needs to learn, so our task
is to create a learning framework and provide properly-formatted,
relevant, clean data for the machine to learn from.
Supervised Machine Learning for Natural Language
Processing and Text Analytics
Machine learning for NLP and text
analytics involves a set of statistical
techniques for identifying parts of speech,
entities, sentiment, and other aspects of
text. The techniques can be expressed
as a model that is then applied to other
text, also known as supervised machine
learning. It also could be a set of
algorithms that work across large sets of
data to extract meaning, which is known
as unsupervised machine learning. It’s
important to understand the difference
between supervised and unsupervised
learning, and how you can get the best of
both in one system.
Supervised Machine Learning for Natural Language
Processing and Text Analytics
In supervised machine learning, a batch of text documents
is tagged or annotated with examples of what the
machine should look for and how it should interpret each
aspect. These documents are used to “train” a statistical
model, which is then given untagged text to analyze.
The most popular supervised NLP machine learning
algorithms are:
•Support Vector Machines
•Naive Bayes
•Maximum Entropy
•Conditional Random Field
•Neural Networks/Deep Learning
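As a minimal sketch of this supervised workflow, the following uses scikit-learn's LogisticRegression (the maximum-entropy classifier from the list above) trained on a few hand-tagged sentences. The tiny corpus and its labels are invented for illustration only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny hand-annotated corpus (invented for illustration): 1 = positive, 0 = negative.
docs = ["great product, works well",
        "terrible support, very slow",
        "excellent value and fast shipping",
        "broken on arrival, awful"]
labels = [1, 0, 1, 0]

# Turn raw text into bag-of-words counts, then fit a maximum-entropy model.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)
clf = LogisticRegression().fit(X, labels)

# Apply the trained model to un-tagged text.
print(clf.predict(vectorizer.transform(["fast and excellent"])))
```

Any of the listed algorithms could replace LogisticRegression here; the vectorize-train-predict shape of the pipeline stays the same.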
Support Vector Machines
Support Vector Machine (SVM) is one of the most popular
supervised learning algorithms. It can be used for both classification
and regression problems, but in machine learning it is primarily
used for classification.

The goal of the SVM algorithm is to find the best line or decision
boundary that segregates n-dimensional space into classes, so that
new data points can easily be placed in the correct category in the
future. This best decision boundary is called a hyperplane.

SVM chooses the extreme points/vectors that help in creating the
hyperplane. These extreme cases are called support vectors, which
is why the algorithm is termed a Support Vector Machine. Consider the
diagram below, in which two different categories are
classified using a decision boundary, or hyperplane.
SVM is supervised learning
Which hyperplane is best?
Pros:
• It works really well when there is a clear margin of
separation.
• It is effective in high-dimensional spaces.
• It is effective in cases where the number of
dimensions is greater than the number of
samples.
• It uses a subset of training points in the
decision function (called support vectors),
so it is also memory efficient.
Cons:
• It doesn’t perform well on large data sets,
because the required training time is
higher.
• It also doesn’t perform very well when the data
set has more noise, i.e. when the target classes
overlap.
• SVM doesn’t directly provide probability
estimates; these are calculated using an
expensive five-fold cross-validation, as
implemented in the related SVC method of the
Python scikit-learn library.
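A minimal sketch of the idea, using scikit-learn's SVC on invented 2-D toy points: the linear-kernel SVM finds the maximum-margin hyperplane, and the support vectors are the extreme training points that define it.

```python
from sklearn.svm import SVC

# Toy 2-D points in two linearly separable classes (invented for illustration).
X = [[0, 0], [1, 1], [0, 1], [8, 8], [9, 9], [8, 9]]
y = [0, 0, 0, 1, 1, 1]

# A linear-kernel SVM fits the maximum-margin hyperplane between the classes.
clf = SVC(kernel="linear").fit(X, y)

# The support vectors are the extreme points that define the margin.
print(clf.support_vectors_)
print(clf.predict([[7, 8]]))  # a point near the second cluster
```

Note that passing probability=True to SVC is what triggers the expensive internal cross-validation mentioned in the cons above; leaving it off keeps training cheap.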
Naive Bayes Algorithm
• It is a classification technique based on Bayes’ Theorem with an
assumption of independence among predictors. In simple terms, a
Naive Bayes classifier assumes that the presence of a particular
feature in a class is unrelated to the presence of any other feature.
• For example, a fruit may be considered to be an apple if it is red,
round, and about 3 inches in diameter. Even if these features
depend on each other or on the existence of the other features,
all of these properties independently contribute to the probability
that this fruit is an apple, which is why it is known as “naive”.
• The Naive Bayes model is easy to build and particularly useful for very
large data sets. Along with its simplicity, Naive Bayes is known to
outperform even highly sophisticated classification methods.
Bayes’ theorem computes the posterior as P(c|x) = P(x|c) × P(c) / P(x), where:
• P(c|x) is the posterior probability of class c (target) given
predictor x (attributes).
• P(c) is the prior probability of the class.
• P(x|c) is the likelihood, i.e. the probability of the predictor
given the class.
• P(x) is the prior probability of the predictor.
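These quantities combine via Bayes' theorem, P(c|x) = P(x|c) × P(c) / P(x). A minimal arithmetic sketch for the fruit example, with all probabilities invented for illustration:

```python
# Invented numbers for the fruit example: probability a fruit is an
# apple given that it is red, computed via Bayes' theorem.
p_apple = 0.30            # P(c): prior probability of class "apple"
p_red_given_apple = 0.80  # P(x|c): likelihood of "red" given "apple"
p_red = 0.40              # P(x): prior probability of predictor "red"

# Posterior: P(c|x) = P(x|c) * P(c) / P(x)
p_apple_given_red = p_red_given_apple * p_apple / p_red
print(p_apple_given_red)  # 0.6
```

A Naive Bayes classifier applies this same update once per feature (red, round, 3 inches), multiplying the per-feature likelihoods together under the independence assumption.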
Unsupervised Machine Learning for Natural
Language Processing and Text Analytics
• Unsupervised machine learning involves training a model
without pre-tagging or annotating. Some of these techniques are
surprisingly easy to understand.
• Clustering means grouping similar documents together into
groups or sets. These clusters are then sorted based on
importance and relevancy (hierarchical clustering).
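As a sketch of clustering without labels, the following groups a few invented documents by TF-IDF similarity using scikit-learn's KMeans. No annotations are provided; the grouping emerges purely from the text.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# A few un-annotated documents about two topics (invented for illustration).
docs = ["engine exhaust manifold",
        "engine oil exhaust",
        "chocolate cake recipe",
        "cake recipe book"]

# No labels: KMeans groups the documents purely by vector similarity.
X = TfidfVectorizer().fit_transform(docs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)  # same label for documents in the same cluster
```

Hierarchical clustering (e.g. scikit-learn's AgglomerativeClustering) would additionally arrange these clusters into a tree, which is what allows them to be sorted by importance and relevancy.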
Latent Semantic Indexing (LSI)
• Another type of unsupervised learning is Latent Semantic
Indexing (LSI). This technique identifies words and phrases
that frequently occur with each other. Data scientists use LSI
for faceted searches, or for returning search results that aren’t
exact matches for the search term.
• For example, the terms “manifold” and “exhaust” are closely
related in documents that discuss internal combustion engines.
So, when you Google “manifold”, you also get results that
contain “exhaust”.
Thanks for Listening
