Lect01

Uploaded by

rodrigoferraribr

Introduction to NLP

By Ivan Wong
How is the weather today?

It is 37 degrees centigrade outside with no rain today.

What does my schedule look like?

You have a strategy meeting at 4 p.m. and an all-hands at 5:30 p.m.
Based on today’s traffic situation, it is recommended you leave for
the office by 8:15 a.m.

What should I wear today?

White seems like a good choice.

Why NLP?
• Natural language has been the primary medium of
communication between humans since time immemorial.
• But computers can only process data in binary, i.e., 0s and 1s.
• While we can represent language data in binary, how do we make
machines understand the language?
• This is where natural language processing (NLP) comes in.
• It is an area of computer science that deals with methods to
analyze, model, and understand human language.
NLP in the Real World
• Email platforms, such as Gmail and Outlook, use NLP
extensively to provide a range of product features, such as spam
classification, priority inbox, calendar event extraction, and
auto-complete.
• Voice-based assistants, such as Apple Siri, Google Assistant,
Microsoft Cortana, and Amazon Alexa rely on a range of NLP
techniques to interact with the user, understand user commands,
and respond accordingly.
NLP in the Real World (Cont.)
• Modern search engines, such as Google and Bing, which are the
cornerstone of today’s internet, use NLP heavily for various
subtasks, such as query understanding, query expansion,
question answering, information retrieval, and ranking and
grouping of the results, to name a few.
• Machine translation services, such as Google Translate, Bing
Microsoft Translator, and Amazon Translate are increasingly used
in today’s world to handle a wide range of scenarios and business
use cases.
NLP in the Real World (Cont.)
• Organizations across verticals analyze their social media feeds to
build a better and deeper understanding of the voice of their
customers.
• NLP is widely used to solve diverse sets of use cases on
e-commerce platforms like Amazon. These vary from extracting
relevant information from product descriptions to understanding
user reviews.
• Advances in NLP are being applied to solve use cases in domains
such as healthcare, finance, and law.
NLP in the Real World (Cont.)
• NLP forms the backbone of spelling- and grammar-correction
tools, such as Grammarly and spell check in Microsoft Word and
Google Docs.
• NLP is used in a range of learning and assessment tools and
technologies, such as automated scoring in exams like the
Graduate Record Examination (GRE), plagiarism detection
(e.g., Turnitin), intelligent tutoring systems, and language learning
apps like Duolingo.
• NLP is used to build large knowledge bases, such as the Google
Knowledge Graph, which are useful in a range of applications like
search and question answering.
NLP Tasks
• Language modeling
• This is the task of predicting what the next word in a sentence will be
based on the history of previous words. The goal of this task is to learn the
probability of a sequence of words appearing in a given language.
Language modeling is useful for building solutions for a wide variety of
problems, such as speech recognition, optical character recognition,
handwriting recognition, machine translation, and spelling correction.
• Text classification
• This is the task of bucketing the text into a known set of categories based
on its content. Text classification is by far the most popular task in NLP
and is used in a variety of tools, from email spam identification to
sentiment analysis.
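The language modeling task described above can be illustrated with a very small bigram model. This is only a sketch under stated assumptions: the toy corpus, the `<s>` and `</s>` boundary markers, and the function names are invented for illustration, and the model looks at just one previous word rather than the full history.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark sentence boundaries (an illustrative convention).
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def next_word_probability(counts, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0

def predict_next(counts, prev):
    """Return the most frequent follower of prev, or None if prev is unseen."""
    followers = counts.get(prev)
    return followers.most_common(1)[0][0] if followers else None

# Toy corpus, invented for illustration.
corpus = [
    "the weather is nice today",
    "the weather is cold",
    "the meeting is today",
]
model = train_bigram_model(corpus)

print(predict_next(model, "weather"))
print(next_word_probability(model, "the", "weather"))
```

In this corpus "the" is followed by "weather" twice and by "meeting" once, so the estimated probability of "weather" after "the" is 2/3. Practical language models condition on much longer histories and smooth the counts so that unseen word pairs do not get zero probability.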
NLP Tasks (Cont.)
• Information extraction
• As the name indicates, this is the task of extracting relevant information
from text, such as calendar events from emails or the names of people
mentioned in a social media post.
• Information retrieval
• This is the task of finding documents relevant to a user query from a large
collection. Applications like Google Search are well-known use cases of
information retrieval.
• Conversational agent
• This is the task of building dialogue systems that can converse in human
languages. Alexa, Siri, etc., are some common applications of this task.
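As a rough illustration of the information extraction task above, the sketch below pulls clock times out of a message with a regular expression. The pattern, the function name, and the choice to return 24-hour `(hour, minute)` pairs are assumptions made for this example; real calendar-event extraction handles far more varied phrasing.

```python
import re

# Matches times written like "4 p.m." or "5:30 p.m." (an assumed format).
TIME_PATTERN = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*([ap])\.m\.", re.IGNORECASE)

def extract_times(text):
    """Return (hour, minute) pairs in 24-hour form for each time mention."""
    times = []
    for hour, minute, meridiem in TIME_PATTERN.findall(text):
        hour = int(hour) % 12          # treat 12 a.m. as hour 0
        if meridiem.lower() == "p":
            hour += 12                 # shift afternoon hours
        times.append((hour, int(minute or 0)))
    return times

message = "You have a strategy meeting at 4 p.m. and an all-hands at 5:30 p.m."
print(extract_times(message))
```

A hand-written pattern like this is brittle: it misses "16:00", "four o'clock", or "noon", which is one reason extraction systems are often learned from data instead.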
NLP Tasks (Cont.)
• Text summarization
• This task aims to create short summaries of longer documents while retaining
the core content and preserving the overall meaning of the text.
• Question answering
• This is the task of building a system that can automatically answer questions
posed in natural language.
• Machine translation
• This is the task of converting a piece of text from one language to another. Tools
like Google Translate are common applications of this task.
• Topic modeling
• This is the task of uncovering the topical structure of a large collection of
documents. Topic modeling is a common text-mining tool and is used in a wide
range of domains, from literature to bioinformatics.
What Is Language?
• Language is a structured system of communication that involves
complex combinations of its constituent components, such as
characters, words, sentences, etc. Linguistics is the systematic
study of language.
• We can think of human language as composed of four major
building blocks:
• phonemes,
• morphemes and lexemes,
• syntax, and
• context.
Why Is NLP Challenging?
• Ambiguity: Ambiguity means uncertainty of meaning. Most human
languages are inherently ambiguous.
• “I made her duck.” This sentence has multiple meanings.
• The first one is: I cooked a duck for her.
• The second meaning is: I made her bend down to avoid an object.
• Which of the two meanings applies depends on the context in which the
sentence appears.
Why Is NLP Challenging? (Cont.)
• Common knowledge: It is the set of all facts that most humans are
aware of. In any conversation, it is assumed that these facts are
known, hence they’re not explicitly mentioned, but they do have a
bearing on the meaning of the sentence.
• Consider “man bit dog” and “dog bit man.” We all know that the first sentence
describes an unlikely event, while the second one is very possible.
• Why do we say so? Because we all “know” that it is very unlikely that a
human will bite a dog.
• Creativity: Language is not just rule driven; there is also a creative
aspect to it.
• Various styles, dialects, genres, and variations are used in any language.
Poems are a great example of creativity in language.
• Making machines understand creativity is a hard problem not just in NLP,
but in AI in general.
• Diversity across languages: For most languages in the world, there
is no direct mapping between the vocabularies of any two
languages.
• This makes porting an NLP solution from one language to another hard.
• A solution that works for one language might not work at all for another
language.
• This means that one must either build a solution that is language agnostic or
build separate solutions for each language.
• While the first option is conceptually very hard, the second is laborious and
time intensive.
Machine Learning, Deep Learning, and NLP: An Overview
• The goal of ML is to “learn” to perform tasks based on examples
(called “training data”) without explicit instruction.
• This is typically done by creating a numeric representation (called
“features”) of the training data and using this representation to
learn the patterns in those examples.
• Machine learning algorithms can be grouped into three primary
paradigms: supervised learning, unsupervised learning, and
reinforcement learning.
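To make the idea of a numeric representation ("features") concrete, here is a minimal bag-of-words sketch. Bag-of-words is only one possible representation, chosen here for illustration; the function names and the three example sentences are assumptions, not part of the lecture.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}

def bag_of_words(document, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector

docs = ["dog bit man", "man bit dog", "dog bit dog"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]

print(vocab)
for doc, vec in zip(docs, vectors):
    print(doc, vec)
```

Note that "dog bit man" and "man bit dog" receive the same vector, because this representation discards word order. That is a known limitation of counting words, and one reason richer representations are used in practice.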
Supervised Learning
• In supervised learning, the goal is to learn the mapping function
from input to output given a large number of examples in the form
of input-output pairs.
• The input-output pairs are known as training data, and the outputs
are specifically known as labels or ground truth.
• An example of a supervised learning problem related to language
is learning to classify email messages as spam or non-spam given
thousands of examples in both categories.
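The spam example above can be sketched with a small Naive Bayes classifier. Naive Bayes with add-one smoothing is a technique chosen here for illustration, not one named in the lecture, and the four training messages are invented; a real system would learn from thousands of labeled examples.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """examples: list of (text, label) pairs. Returns the counts needed to predict."""
    label_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        tokens = text.lower().split()
        word_counts.setdefault(label, Counter()).update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log-probability under add-one smoothing."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total_examples)  # log prior
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for tok in text.lower().split():
            if tok in vocabulary:  # skip words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Tiny labeled dataset (input-output pairs), invented for illustration.
training_data = [
    ("win a free prize now", "spam"),
    ("free money claim now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("lunch with the team on monday", "non-spam"),
]
model = train_naive_bayes(training_data)

print(predict(model, "claim your free prize"))
print(predict(model, "team meeting on monday"))
```

Here the texts are the inputs and the "spam" / "non-spam" strings are the labels; the learned mapping is nothing more than word counts per label.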
Unsupervised Learning
• Unsupervised learning refers to a set of machine learning
methods that aim to find hidden patterns in given input data
without any reference output.
• That is, in contrast to supervised learning, unsupervised learning
works with large collections of unlabeled data.
• In NLP, an example of such a task is to identify latent topics in a
large collection of textual data without any knowledge of these
topics.
Semi-supervised Learning
• Common in real-world NLP projects is a case of semi-supervised
learning, where we have a small labeled dataset and a large
unlabeled dataset.
• Semi-supervised techniques involve using both datasets to learn
the task at hand.
• Last but not least, reinforcement learning deals with methods to
learn tasks via trial and error and is characterized by the absence
of either labeled or unlabeled data in large quantities. The learning
is done in a self-contained environment and improves via
feedback (reward or punishment) facilitated by the environment.
Deep Learning
• Deep learning refers to the branch of machine learning that is
based on artificial neural network architectures.
• The ideas behind neural networks are inspired by neurons in the
human brain and how they interact with one another.
• In the past decade, deep learning–based neural architectures
have been used to successfully improve the performance of
various intelligent applications, such as image and speech
recognition and machine translation.
• This has resulted in a proliferation of deep learning–based
solutions in industry, including in NLP applications.
Approaches to NLP: Heuristics-Based NLP
• Similar to other early AI systems, early attempts at designing NLP
systems were based on building rules for the task at hand.
• This required that the developers had some expertise in the domain to
formulate rules that could be incorporated into a program.
• Such systems also required resources like dictionaries and
thesauruses, typically compiled and digitized over a period of time.
• An example of designing rules to solve an NLP problem using such
resources is lexicon-based sentiment analysis. It uses counts of
positive and negative words in the text to deduce the sentiment of the
text.
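Lexicon-based sentiment analysis as described above can be sketched in a few lines. The two word lists below are tiny hypothetical lexicons made up for this example; real systems rely on much larger curated resources.

```python
# Hypothetical mini-lexicons, for illustration only.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "happy"}
NEGATIVE_WORDS = {"bad", "terrible", "poor", "hate", "sad"}

def lexicon_sentiment(text):
    """Label text by comparing counts of positive and negative lexicon words."""
    # Lowercase each token and strip surrounding punctuation before lookup.
    tokens = [tok.strip(".,!?;:\"'").lower() for tok in text.split()]
    positive = sum(tok in POSITIVE_WORDS for tok in tokens)
    negative = sum(tok in NEGATIVE_WORDS for tok in tokens)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"

print(lexicon_sentiment("The food was great and the service was excellent!"))
print(lexicon_sentiment("Terrible battery, bad screen."))
print(lexicon_sentiment("It arrived on Tuesday."))
```

A rule this simple ignores negation and context: "not good" still counts as one positive word. Limitations of this kind are typical of heuristics-based systems.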
Approaches to NLP: Machine Learning for NLP
• Machine learning techniques are applied to textual data just as they’re
used on other forms of data, such as images, speech, and structured
data.
• Supervised machine learning techniques such as classification and
regression methods are heavily used for various NLP tasks.
• As an example, an NLP classification task would be to classify news
articles into a set of news topics like sports or politics.
• On the other hand, regression techniques, which give a numeric
prediction, can be used to estimate the price of a stock based on
processing the social media discussion about that stock.
• Similarly, unsupervised clustering algorithms can be used to club
together text documents.
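As a rough sketch of grouping text documents without labels, the example below clusters documents by word overlap. The greedy single-pass scheme, the Jaccard similarity measure, and the 0.2 threshold are all assumptions chosen to keep the example short; they stand in for the standard clustering algorithms a real project would more likely use.

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets: shared tokens over all tokens."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_documents(documents, threshold=0.2):
    """Greedy single-pass clustering.

    Each document joins the first cluster whose seed document is similar
    enough; otherwise it starts a new cluster and becomes its seed.
    """
    clusters = []  # list of (seed_token_set, member_documents)
    for doc in documents:
        tokens = set(doc.lower().split())
        for seed, members in clusters:
            if jaccard(tokens, seed) >= threshold:
                members.append(doc)
                break
        else:
            clusters.append((tokens, [doc]))
    return [members for _, members in clusters]

# Unlabeled documents, invented for illustration.
docs = [
    "the team won the football match",
    "the stock market fell sharply today",
    "football fans cheered the team",
    "investors sold stock as the market fell",
]
groups = cluster_documents(docs)
for group in groups:
    print(group)
```

No topic labels are supplied, yet the sports and finance sentences end up in separate groups purely because of shared vocabulary.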
Approaches to NLP: Deep Learning for NLP
• In the last few years, we have seen a huge surge in using neural
networks to deal with complex, unstructured data.
• Language is inherently complex and unstructured, so we need
models with better representation and learning capability to
understand and solve language tasks.
• Here are a few popular deep neural network architectures that
have become the status quo in NLP.
• Recurrent neural networks
• Long short-term memory
• Convolutional neural networks
• Transformers
• Autoencoders
Why Deep Learning Is Not Yet the Silver Bullet for NLP
• Despite such tremendous success, DL is still not the silver bullet
for all NLP tasks when it comes to industrial applications. Some of
the key reasons for this are as follows:
• Overfitting on small datasets
• Few-shot learning and synthetic data generation
• Domain adaptation
• Interpretable models
• Common sense and world knowledge
• Cost
• On-device deployment
