Unit1 A
Natural Language Processing (NLP) has a rich history that spans several
decades, with its roots in linguistics, computer science, and artificial intelligence.
Early Foundations
Recent Developments
7. 2020s: Large Language Models and Applications
• GPT-3 and Beyond: The release of GPT-3 by OpenAI in 2020, a model
with 175 billion parameters, demonstrated the power of large-scale
language models trained on vast amounts of text data. Such models
perform well across many NLP tasks, from translation to text
generation, often with few or no task-specific training examples.
• Ethical and Societal Considerations: The increasing capabilities of
NLP models have raised important ethical questions about bias,
privacy, and the impact of AI on society. Researchers and
practitioners are actively working on addressing these concerns.
1. Human-Computer Interaction
• Voice Assistants: NLP enables virtual assistants like Siri, Alexa, and Google
Assistant to understand and respond to spoken commands, facilitating more
intuitive human-computer interactions.
• Chatbots: NLP is used in chatbots to provide automated customer support,
answer queries, and assist with various tasks in a conversational manner.
8. Accessibility
• Speech-to-Text: Converting spoken language into written text, which is
useful for individuals with hearing impairments or those who prefer written
communication.
• Text-to-Speech: Converting written text into spoken language, which can
help individuals with visual impairments or reading difficulties.
9. Business Intelligence
• Data Analysis: Analyzing textual data from various sources, such as
reports, emails, and customer interactions, to gain insights and support
decision-making.
• Trend Analysis: Identifying emerging trends and patterns in market
research, social media, and other sources of textual data.
10. Legal and Compliance
• Document Review: Automating the review of legal documents, contracts,
and compliance reports to identify key terms, clauses, and potential issues.
• Regulatory Compliance: Monitoring and analyzing communications to
ensure compliance with regulations and standards.
Levels of NLP:
1. Phonetics and Phonology
• Phonetics: The study of the physical sounds of human speech. Phonetics deals with how
sounds are produced, transmitted, and perceived.
• Phonology: The study of the abstract, cognitive aspects of sounds, including how they
pattern in particular languages and how contrasts between sounds distinguish meaning.
2. Morphology
• Morphological Analysis: Breaking down words into their constituent morphemes (the
smallest units of meaning), such as prefixes, roots, and suffixes. For example, “unhappiness”
is decomposed into “un-,” “happy,” and “-ness.”
• Morphological Tagging: Identifying and classifying the morphemes and their grammatical
functions, such as parts of speech or grammatical markers.
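The morphological analysis described above can be sketched as simple affix stripping. This is a minimal illustration, not a real morphological analyzer: the affix lists, the root lexicon, and the function name segment are all assumptions made for this example.

```python
# Toy affix lists (assumptions); a real analyzer would use a far larger inventory.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("ness", "ment", "ing", "ed", "s")
# Hypothetical root lexicon, used to undo spelling changes such as y -> i.
ROOTS = {"happy", "kind", "play", "agree"}


def segment(word):
    """Split a word into prefix, root, and suffix using the toy lists above."""
    word = word.lower()
    # Take the first matching prefix, but only if enough of the word remains.
    prefix = next((p for p in PREFIXES
                   if word.startswith(p) and len(word) > len(p) + 2), "")
    stem = word[len(prefix):]
    # Take the first matching suffix under the same length condition.
    suffix = next((s for s in SUFFIXES
                   if stem.endswith(s) and len(stem) > len(s) + 2), "")
    root = stem[:len(stem) - len(suffix)]
    # Restore "happi" to "happy" when the lexicon confirms the root.
    if root not in ROOTS and root.endswith("i") and root[:-1] + "y" in ROOTS:
        root = root[:-1] + "y"
    morphemes = []
    if prefix:
        morphemes.append(prefix + "-")
    morphemes.append(root)
    if suffix:
        morphemes.append("-" + suffix)
    return morphemes


print(segment("unhappiness"))
```

With these toy lists, "unhappiness" is split into "un-", "happy", and "-ness", matching the example in the text. Words outside the lists are returned unsegmented.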
3. Syntax
• Syntax Analysis: Understanding the structure of sentences and the rules that govern word
order and sentence formation. This involves identifying the grammatical structure and
relationships between words.
• Parsing: The process of analyzing a sentence to determine its syntactic structure, often
represented as a parse tree, which shows the hierarchical relationships between different
elements of the sentence.
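To make the idea of a parse tree concrete, here is a small recursive-descent parser for a toy grammar. The grammar (S -> NP VP, NP -> Det N, VP -> V NP | V PP | V, PP -> P NP), the lexicon, and the function name parse_sentence are assumptions chosen for this sketch. Practical parsers handle ambiguity and much larger grammars.

```python
# Toy lexicon mapping words to part-of-speech tags (an assumption for this sketch).
LEXICON = {
    "the": "Det", "a": "Det",
    "cat": "N", "mat": "N", "dog": "N",
    "sat": "V", "saw": "V",
    "on": "P",
}


def parse_sentence(tokens):
    """Return a nested-tuple parse tree for S -> NP VP, or None if parsing fails."""
    tags = [LEXICON.get(t.lower()) for t in tokens]

    def terminal(tag, i):
        # Match a single word with the given tag at position i.
        if i < len(tokens) and tags[i] == tag:
            return (tag, tokens[i]), i + 1
        return None

    def np(i):  # NP -> Det N
        det = terminal("Det", i)
        if det is None:
            return None
        noun = terminal("N", det[1])
        if noun is None:
            return None
        return ("NP", det[0], noun[0]), noun[1]

    def pp(i):  # PP -> P NP
        prep = terminal("P", i)
        if prep is None:
            return None
        obj = np(prep[1])
        if obj is None:
            return None
        return ("PP", prep[0], obj[0]), obj[1]

    def vp(i):  # VP -> V NP | V PP | V
        verb = terminal("V", i)
        if verb is None:
            return None
        for rule in (np, pp):
            comp = rule(verb[1])
            if comp is not None:
                return ("VP", verb[0], comp[0]), comp[1]
        return ("VP", verb[0]), verb[1]

    subj = np(0)
    if subj is None:
        return None
    pred = vp(subj[1])
    # Accept only if the whole sentence was consumed.
    if pred is not None and pred[1] == len(tokens):
        return ("S", subj[0], pred[0])
    return None


tree = parse_sentence("the cat sat on the mat".split())
print(tree)
```

The nested tuples mirror the hierarchy of a parse tree: the sentence node contains a noun phrase and a verb phrase, and the verb phrase in turn contains a prepositional phrase.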
4. Semantics
• Word Sense Disambiguation (WSD): Determining the correct meaning of a word based on
its context, as many words have multiple meanings.
• Semantic Role Labeling: Identifying the roles that different words or phrases play in a sentence,
such as agents, patients, and instruments.
• Named Entity Recognition (NER): Identifying and classifying entities in text, such as names
of people, places, organizations, and other specific items.
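Word sense disambiguation can be illustrated with a simplified Lesk-style approach: choose the sense whose dictionary gloss shares the most words with the surrounding sentence. The two glosses for "bank", the stopword list, and the function names below are hand-written assumptions for this sketch, not entries from a real dictionary.

```python
# Hand-written glosses for one ambiguous word (assumptions for illustration).
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and lends money",
        "river_side": "the sloping land beside a river or stream of water",
    }
}
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "at", "on", "in", "by"}


def content_words(text):
    """Lowercase the text, strip basic punctuation, and drop stopwords."""
    return {w.strip(".,;!?").lower() for w in text.split()} - STOPWORDS


def disambiguate(word, sentence):
    """Pick the sense whose gloss overlaps most with the sentence context."""
    context = content_words(sentence)
    scores = {sense: len(context & content_words(gloss))
              for sense, gloss in SENSES[word].items()}
    return max(scores, key=scores.get)


print(disambiguate("bank", "She deposits money at the bank"))
print(disambiguate("bank", "They fished from the bank of the river"))
```

The first sentence shares "deposits" and "money" with the financial gloss, while the second shares "river" with the other gloss, so each context selects a different sense.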
5. Pragmatics
• Discourse Analysis: Understanding how context and prior discourse influence the
interpretation of sentences. This includes analyzing how sentences relate to each other within
a larger text or conversation.
• Intent Recognition: Identifying the underlying intentions or goals behind language,
particularly in interactive systems like chatbots or virtual assistants.
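Intent recognition in a simple assistant can be approximated by keyword matching, as sketched below. The intent names, keyword sets, and the function name recognize_intent are hypothetical. Production systems typically use trained classifiers rather than fixed keyword lists.

```python
import re

# Hypothetical intents and trigger keywords (assumptions for this sketch).
INTENT_KEYWORDS = {
    "check_weather": {"weather", "forecast", "rain", "temperature"},
    "set_alarm": {"alarm", "wake", "remind"},
    "play_music": {"play", "song", "music"},
}


def recognize_intent(utterance):
    """Return the intent with the most keyword matches, or 'unknown' if none match."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    scores = {intent: len(words & keywords)
              for intent, keywords in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


print(recognize_intent("Will it rain tomorrow?"))
```

An utterance with no matching keyword falls back to "unknown", which a dialogue system could use to ask the user to rephrase.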
6. Text Processing
• Tokenization: Dividing text into smaller units (tokens), such as words, subwords, or punctuation marks, to facilitate further
processing. For example, splitting the sentence "The cat sat on the mat" into tokens: ["The",
"cat", "sat", "on", "the", "mat"].
• Text Normalization: Standardizing text for processing, including tasks such as converting to
lowercase, removing punctuation, and expanding contractions.
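The two text-processing steps above can be sketched with Python's standard re module. The contraction table and the function names tokenize and normalize are assumptions for this example. The table is deliberately tiny and handles only straight apostrophes.

```python
import re

# Small illustrative contraction table (an assumption, far from complete).
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "it's": "it is",
    "don't": "do not",
}


def tokenize(text):
    """Split text into word tokens (keeping contractions whole) and punctuation tokens."""
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)


def normalize(text):
    """Lowercase, expand known contractions, remove punctuation, collapse whitespace."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())


print(tokenize("The cat sat on the mat"))
print(normalize("It's late; I can't stay!"))
```

Contractions are expanded before punctuation is removed. In the opposite order the apostrophe would already be gone, and "can't" would become "cant" instead of "cannot".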
7. Applications
8. Advanced Topics
• Coreference Resolution: Determining when different words or phrases refer to the same
entity within a text. For example, resolving "he" to "John" in the sentence "John went to the
store. He bought milk."
• Text Generation: Creating coherent and contextually appropriate text, including generating
summaries, stories, or responses in dialogue systems.
• Dialogue Systems: Developing systems that can engage in natural conversations with humans,
integrating various levels of NLP for understanding and generating responses.
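As a rough illustration of the coreference resolution described above, the sketch below links each pronoun to the most recently mentioned name of matching gender. The name table, pronoun table, and function name resolve_pronouns are assumptions for this example. Real coreference systems rely on much richer syntactic and semantic cues.

```python
import re

# Hypothetical lookup tables (assumptions for this sketch).
KNOWN_NAMES = {"John": "male", "Mary": "female"}
PRONOUN_GENDER = {"he": "male", "him": "male", "she": "female", "her": "female"}


def resolve_pronouns(text):
    """Return (pronoun, antecedent) pairs using the most recent gender-matching name."""
    last_seen = {}  # maps gender -> most recently mentioned name
    links = []
    for token in re.findall(r"[A-Za-z]+", text):
        if token in KNOWN_NAMES:
            last_seen[KNOWN_NAMES[token]] = token
        elif token.lower() in PRONOUN_GENDER:
            antecedent = last_seen.get(PRONOUN_GENDER[token.lower()])
            if antecedent is not None:
                links.append((token, antecedent))
    return links


print(resolve_pronouns("John went to the store. He bought milk."))
```

For the example sentence from the text, the pronoun "He" is linked to "John". A pronoun with no earlier matching name is left unresolved.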