Natural Language Processing: Course ID: 1905380
Course Description
• This course introduces the field of natural language processing (NLP),
providing a theoretical foundation and hands-on (lab-style) practice in
computational approaches to processing natural language text. We will discuss
problems involving different components of the language system (such as
meaning in context and linguistic structures).
• Students will collaborate in teams on modeling and implementing natural
language processing and digital text solutions using Python and a variety of
relevant tools.
• We will begin by discussing machine learning methods for NLP as well as core
NLP tasks, such as language modeling, part-of-speech tagging, and parsing.
• We will also discuss applications such as information extraction, machine
translation, text generation, and automatic summarization.
Course Objectives
By the end of this course, students will be able to:
• Understand the key concepts and principles of natural language processing.
• Implement common NLP techniques such as tokenization, stemming, lemmatization, and
part-of-speech tagging.
• Develop language models using n-grams and neural networks.
• Perform text classification, sentiment analysis, and named entity recognition.
• Build machine translation systems using sequence-to-sequence models.
• Explore advanced NLP topics such as dialogue systems, question answering, and text
generation.
• Apply NLP techniques to solve practical problems in various domains, such as customer
service, content analysis, and information retrieval.
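As a minimal sketch of two of the techniques listed above (tokenization and an n-gram language model), the following uses only the Python standard library. The regular expression, the toy corpus, and the function names `tokenize`, `train_bigram_counts`, and `bigram_probability` are illustrative assumptions, not part of the course materials. The probabilities are unsmoothed maximum-likelihood estimates, so unseen bigrams get probability zero.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into tokens of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def train_bigram_counts(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + tokenize(sentence) + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_probability(prev, word, unigrams, bigrams):
    """Maximum-likelihood estimate: P(word | prev) = count(prev, word) / count(prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

# Toy corpus (an assumption for illustration only).
corpus = ["I like NLP.", "I like Python.", "Students like NLP."]
unigrams, bigrams = train_bigram_counts(corpus)

print(tokenize("I like NLP."))
print(bigram_probability("like", "nlp", unigrams, bigrams))
```

In this toy corpus "like" occurs three times and is followed by "nlp" twice, so the estimate for P(nlp | like) is 2/3. A practical model would add smoothing so that unseen word pairs do not receive zero probability.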
What is NLP?
• NLP is a subfield of computer science and artificial intelligence (AI) that uses machine
learning to enable computers to understand and communicate in human language.
• NLP combines computational linguistics (rule-based modeling of human language) with
statistical modeling, machine learning, and deep learning to enable computers and
digital devices to recognize, understand, and generate text and speech.
• NLP research has enabled the era of generative AI, from large language models to image
generation models that can understand and respond to natural language requests.
• NLP is already widely applied in everyday technologies like search engines, chatbots,
voice assistants, and GPS systems.
• NLP also plays a growing role in enterprise solutions that help streamline and automate
business operations, increase employee productivity, and simplify mission-critical
business processes.
Turing test
• Core NLP technologies and methodologies trace back to the famous Turing
test, proposed in 1950 by Alan Turing (1912–1954), who is widely regarded
as the father of AI.
NLP Challenges
• Word sense / meaning ambiguity, as in ambiguous headlines
Credit: http://stuffsirisaid.com
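To make word sense ambiguity concrete, here is a minimal sketch of a simplified Lesk-style disambiguator: it picks the sense whose dictionary gloss shares the most content words with the surrounding sentence. The two-sense inventory for "bank", the stopword list, and the function names are hypothetical and written by hand for illustration; a real system would draw glosses from a lexical resource.

```python
import re

# Hypothetical, hand-written sense inventory (an assumption for illustration).
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and lends money",
        "river_side": "the sloping land beside a river or lake",
    }
}

# Small illustrative stopword list; not exhaustive.
STOPWORDS = {"a", "an", "the", "that", "and", "or", "of", "to", "in", "on", "by", "at"}

def content_words(text):
    """Return the set of lowercase alphabetic tokens that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def simplified_lesk(word, sentence):
    """Choose the sense whose gloss overlaps most with the sentence.

    Ties (including zero overlap) fall back to the first sense listed.
    """
    context = content_words(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("bank", "She watched the river from the bank"))
print(simplified_lesk("bank", "He went to the bank to deposit money"))
```

The first sentence shares "river" with the river-side gloss, and the second shares "money" with the financial gloss. Exact word overlap is brittle ("deposit" does not match "deposits"), which is one reason this problem is hard.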
• Language is not static:
• Language grows
• Cyber language: BRB, G2G, …
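One common way to handle abbreviations like these is a normalization step that expands them before further processing. The sketch below assumes a small hand-written lookup table; the `SLANG` dictionary, its expansions, and the function name `normalize_slang` are illustrative assumptions, not a standard resource.

```python
import re

# Hypothetical lookup table; a real lexicon would need continual updates.
SLANG = {"brb": "be right back", "g2g": "got to go"}

def normalize_slang(text):
    """Replace known slang tokens with their expansions; leave other tokens unchanged."""
    def expand(match):
        token = match.group(0)
        return SLANG.get(token.lower(), token)
    return re.sub(r"[A-Za-z0-9]+", expand, text)

print(normalize_slang("BRB, G2G soon"))
print(normalize_slang("TTYL"))  # not in the table, so it passes through unchanged
```

The second call shows the limitation: any term missing from the table is left as-is, so a fixed dictionary falls behind as the language grows.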
• Language is compositional
• Scale
• Huge amounts of data
• Penn Treebank: ~1M words from the Wall Street Journal
• Newswire collection: 500M+ words
• Wikipedia: 2.9 billion words (English)
• Web: several billion words
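Corpora of these sizes generally cannot be held in memory as a single string, so word statistics are usually accumulated one line at a time. The following is a minimal sketch of that pattern, assuming plain-text input; the function name `count_words` and the two-line sample are illustrative, and `io.StringIO` stands in for a large file opened with `open(...)`.

```python
import io
import re
from collections import Counter

def count_words(lines):
    """Accumulate word counts from any iterable of text lines, one line at a time."""
    counts = Counter()
    for line in lines:
        counts.update(re.findall(r"[a-z0-9']+", line.lower()))
    return counts

# Stand-in for a large corpus file (an assumption for illustration).
sample = io.StringIO("The market rose.\nThe market fell.\n")
counts = count_words(sample)

print(sum(counts.values()))  # total number of tokens
print(counts["market"])      # frequency of one word
```

Because the function only iterates over lines, the same code works on a file object of any size; only the vocabulary counts are kept in memory, not the text itself.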