Parvathy V J, Engineer Special Programs, Livewire, Trivandrum

This document provides an overview of natural language processing (NLP), including:
• The basics of NLP and how it gives machines the ability to understand human language
• Common NLP tasks such as sentiment analysis, question answering, and language translation
• The challenges of NLP due to ambiguities in human language
• The main techniques in NLP, including syntactic analysis, semantic analysis, and vectorisation methods such as bag-of-words, TF-IDF, and Word2Vec that convert text into numerical representations

Outline
• Basics of NLP
• Text pre-processing & vectorisation
• NLP process implementation in Python
• Spam email classifier use-case
What is an Intelligent Machine?
• Generalized Adaptability
• Automated Reasoning
• Knowledge Representation
• Natural Language Processing
• Problem Solving Ability
• Machine Learning
[Figure: Alan Turing]
• Natural Language Processing (NLP) is a field of Artificial Intelligence that
gives machines the ability to read, understand and derive meaning from
human languages
Human-Machine Interaction using NLP
• Language detection
• Next word prediction
• Automated query answering
• Audio to text conversion
• Data to audio conversion
• Processing of the text data
What is NLP used for?
• Recognition and prediction of diseases: Amazon Comprehend Medical
• Customer review analysis on social media: sentiment analysis
• Personal and cognitive assistants: IBM
• Spam email filtering: Yahoo and Google
• Fake news identification: NLP Group at MIT
• Voice assistants: Siri, Alexa, Cortana
• NLP in health care: Woebot chatbot therapist
• Language translation applications: Google Translate
• Checking grammatical accuracy of texts: Microsoft Word and Grammarly
Why is NLP difficult?
• The ambiguity and imprecision of natural languages make NLP
difficult for machines
How does NLP work?

• Syntactic analysis and semantic analysis are the main techniques used
• Syntax is the arrangement of words in a sentence so that it makes grammatical sense
• Syntactic analysis checks how well natural language aligns with grammatical rules
• Semantics refers to the meaning conveyed by a text
At a glimpse…
• Tokenisation
• Removal of stop-words, punctuation, numbers and special characters
• Stemming
• Lemmatisation
• Bag of Words
• TF-IDF
• Word2Vec
• POS tags
• Named Entity Recognition
• Chunking
• Parsing
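The first steps in the list above (tokenisation, removal of punctuation, numbers and special characters, and stemming) can be sketched in plain Python. This is a minimal illustration using only the standard library. The helper names tokenise and naive_stem are hypothetical, and the suffix-stripping rule is a crude stand-in for a real stemmer such as those in NLTK.

```python
import re

def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens.

    Punctuation, numbers and special characters are dropped because the
    pattern only matches runs of letters.
    """
    return re.findall(r"[a-z]+", text.lower())

def naive_stem(token):
    """Rough stand-in for a stemmer: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

tokens = tokenise("The 2 cats were playing, jumping & running!")
stems = [naive_stem(t) for t in tokens]
print(tokens)  # ['the', 'cats', 'were', 'playing', 'jumping', 'running']
print(stems)   # ['the', 'cat', 'were', 'play', 'jump', 'runn']
```

The stem 'runn' is not a real word. Stemming only chops suffixes, whereas lemmatisation maps a word to its dictionary form (here, 'run').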
Stop-words Removal
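As a rough sketch of stop-word removal, the snippet below filters tokens against a small hand-written stop-word list. The list and the function name remove_stop_words are assumptions made for illustration; libraries such as NLTK and spaCy ship much longer lists.

```python
# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were",
              "in", "on", "of", "to", "and"}

def remove_stop_words(tokens):
    """Return the tokens that are not in the stop-word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

sentence = "The cat is sitting on the mat"
filtered = remove_stop_words(sentence.split())
print(filtered)  # ['cat', 'sitting', 'mat']
```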
Vectorisation in NLP
• Bag of Words
• TF-IDF: Term Frequency-Inverse Document Frequency
• Word2Vec
Converting words into vectors: Bag of Words
Converts text into vectors containing the count of word occurrences in the document

• Shows which words are present

• Semantic weight is not stored
• If the vocabulary is large, the vectors may become too sparse
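A minimal Bag of Words sketch, assuming whitespace tokenisation and two made-up example sentences: each document becomes a vector of word counts over a shared, sorted vocabulary. The function name bag_of_words is hypothetical.

```python
from collections import Counter

# Two made-up example documents.
docs = ["the cat sat on the mat", "the dog sat on the log"]

tokenised = [d.split() for d in docs]
# The vocabulary is every distinct word across all documents, in sorted order.
vocabulary = sorted({w for doc in tokenised for w in doc})

def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[w] for w in vocabulary]

vectors = [bag_of_words(doc, vocabulary) for doc in tokenised]
print(vocabulary)  # ['cat', 'dog', 'log', 'mat', 'on', 'sat', 'the']
print(vectors[0])  # [1, 0, 0, 1, 1, 1, 2]
print(vectors[1])  # [0, 1, 1, 0, 1, 1, 2]
```

Even in this tiny example each vector contains zeros for words that appear only in the other document. With a large vocabulary most entries are zero, which is the sparsity problem noted above.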
TF-IDF
Sentence shortening

Histogram Creation
TF-IDF
• Term frequency (TF)
• Inverse document frequency (IDF)
Final vector = TF × IDF
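The formulas from this slide are not present in the text. One commonly used definition, given here as an assumption rather than as the author's exact formulation, is the following, where f_{t,d} is the number of times term t occurs in document d, N is the number of documents, and n_t is the number of documents containing t:

```latex
\mathrm{TF}(t, d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{IDF}(t) = \log\frac{N}{n_t}, \qquad
\text{TF-IDF}(t, d) = \mathrm{TF}(t, d) \times \mathrm{IDF}(t)
```

Libraries often use smoothed variants of IDF, so values computed with a toolkit may differ from this plain form.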

• Gives importance to uncommon words


• Word association is not analysed
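A small worked TF-IDF example, assuming the plain definitions TF = count of the term / number of terms in the document and IDF = log(number of documents / number of documents containing the term). The documents and function names are made up for illustration, and the code assumes every queried term occurs in at least one document.

```python
import math

# Two made-up example documents, tokenised on whitespace.
docs = ["the cat sat on the mat", "the dog sat on the log"]
tokenised = [d.split() for d in docs]

def term_frequency(term, tokens):
    """Share of the document's tokens that are this term."""
    return tokens.count(term) / len(tokens)

def inverse_document_frequency(term, corpus):
    """log(N / n_t); assumes the term occurs in at least one document."""
    containing = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / containing)

def tf_idf(term, tokens, corpus):
    return term_frequency(term, tokens) * inverse_document_frequency(term, corpus)

common = tf_idf("the", tokenised[0], tokenised)  # "the" is in every document
rare = tf_idf("cat", tokenised[0], tokenised)    # "cat" is in one document only
print(common)  # 0.0
print(round(rare, 4))  # 0.1155
```

The word "the" scores zero because it appears in every document, while the less common "cat" gets a positive weight.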
Word Embedding: Word2Vec
• Finds how close words are to each other in meaning
• Each word is represented as a vector of 32 or more dimensions instead of a
single number
• Semantic information and the relations between different words are preserved
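Word closeness between embeddings is usually measured with cosine similarity. The sketch below uses hand-made 3-dimensional vectors purely for illustration. They are not output from a trained Word2Vec model, which would learn vectors of many more dimensions from a corpus (for example with Gensim).

```python
import math

# Hypothetical toy vectors, invented for this example only.
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sim_royal = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_fruit = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])
print(sim_royal > sim_fruit)  # True
```

With these toy values, "king" is far closer to "queen" than to "apple", which is the kind of relation a trained embedding is meant to capture.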
Word2Vec: visual representation
At a glimpse…
• Tokenisation
• Removal of stop-words, punctuation, numbers and special characters
• Stemming
• Lemmatisation
• Bag of Words
• TF-IDF
• Word2Vec
• POS tags
• Named Entity Recognition
• Chunking
• Parsing
Tools for NLP
• NLTK
• Gensim
• spaCy
• CoreNLP