Assignment 1
Natural Language Processing (NLP) is a branch of artificial intelligence (AI) that focuses on enabling
computers to understand, interpret, and generate human language. It combines linguistics, computer
science, and machine learning to process and analyze large amounts of natural language data.
NLP faces several challenges because human language is complex, ambiguous, and variable. Some of
the key challenges include:
1. Ambiguity – Words and sentences can have multiple meanings (e.g., "bank" can mean a
financial institution or a riverbank).
2. Context Understanding – Many words and phrases depend on context (e.g., "He saw the bat"
could refer to an animal or a baseball bat).
3. Sarcasm & Irony – Difficult to detect, as the literal meaning is often different from the intended
meaning.
4. Language Variability – Dialects, slang, abbreviations, and informal language make it hard for
NLP models to handle text consistently.
5. Grammar & Syntax Complexity – Some languages have complex grammatical structures that
are difficult to process.
6. Lack of Labeled Data – Training NLP models requires large amounts of high-quality annotated
text.
7. Low-Resource Languages – Many languages lack enough digital text data to build accurate
models.
8. Bias in AI Models – NLP models can inherit biases from training data, leading to unfair or
incorrect predictions.
9. Real-Time Processing – Understanding and generating language quickly in real-time applications
is a technical challenge.
10. Code-Switching – Some people mix languages in conversation (e.g., "Spanglish"), making NLP
processing more complex.
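The ambiguity and context challenges (items 1 and 2) can be made concrete with a toy sketch. It picks a sense for "bank" or "bat" by counting how many hand-picked cue words appear in the sentence. The SENSES table, its cue words, and the disambiguate function are all invented for this illustration; this is a minimal sketch under those assumptions, not how a production NLP system resolves word senses.

```python
import re

# Hypothetical sense inventory: each sense of an ambiguous word is paired with
# a few hand-picked context cue words (invented for this illustration).
SENSES = {
    "bank": {
        "financial institution": {"money", "loan", "account", "deposit", "cash"},
        "riverbank": {"river", "water", "fishing", "shore", "boat"},
    },
    "bat": {
        "animal": {"cave", "wings", "flew", "night", "fly"},
        "baseball bat": {"ball", "hit", "swing", "baseball", "player"},
    },
}


def disambiguate(word, sentence):
    """Return the sense whose cue words overlap most with the sentence.

    Returns None when the word is unknown or no cue word appears, which
    mirrors the point that some sentences stay ambiguous without more context.
    """
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    best_sense, best_overlap = None, 0
    for sense, cues in SENSES.get(word, {}).items():
        overlap = len(context & cues)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(disambiguate("bank", "She went to the bank to deposit money"))
    print(disambiguate("bank", "They went fishing on the bank of the river"))
    print(disambiguate("bat", "He saw the bat"))
```

With only "He saw the bat" as input, no cue word is present, so the sketch returns None: the sentence alone does not carry enough context to choose a sense.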
3-What are the different tasks in NLP?
Natural Language Processing (NLP) involves several tasks that help machines understand, process, and
generate human language. These tasks can be broadly categorized into text processing, understanding,
and generation:
1. Stopword Removal – Removing common words (e.g., "the", "is") to focus on meaningful terms.
2. Stemming & Lemmatization – Reducing words to a base form (e.g., "running" → "run"). Stemming
strips suffixes heuristically, while lemmatization maps a word to its dictionary form.
3. Named Entity Recognition (NER) – Identifying entities like names, locations, and dates.
4. Sentiment Analysis – Determining the emotional tone of text (e.g., positive, negative, neutral).
5. Text Classification – Categorizing text into predefined groups (e.g., spam detection, topic
classification).
6. Coreference Resolution – Identifying when different expressions refer to the same entity (e.g.,
"John said he would come" – linking "John" and "he").
7. Text Summarization – Generating short summaries of long texts (e.g., news article summaries).
8. Speech Recognition & Text-to-Speech (TTS) – Converting speech to text and vice versa.
9. Question Answering (QA) – Answering questions based on given text (e.g., search engines, AI
assistants).
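Three of the tasks above (stopword removal, stemming, and sentiment analysis) can be sketched in a few lines of plain Python. The stopword list, the suffix rules, and the sentiment word lists below are tiny hand-written assumptions made up for illustration. Real systems typically rely on much larger resources or trained models, so this should be read as a rough sketch rather than a reference implementation.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "on", "was"}

# Toy sentiment lexicon (hypothetical; invented for this illustration).
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "poor"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def naive_stem(token):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            # Undo a doubled final consonant, e.g. "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return token


def sentiment(tokens):
    """Label tokens positive, negative, or neutral by counting lexicon hits."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    tokens = remove_stopwords(tokenize("The movie is great and the cats kept running"))
    print(tokens)
    print([naive_stem(t) for t in tokens])
    print(sentiment(tokens))
```

The suffix-stripping rule is deliberately crude and will mangle many words (for example, it turns "loved" into "lov"), which is one reason lemmatization against a dictionary is often preferred when accuracy matters.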