NLP Lab Manual
DEPARTMENT OF
ARTIFICIAL INTELLIGENCE & MACHINE LEARNING
(AI&ML)
Name
Roll No
Branch AI & ML
Regulation R20
Suppose we have the following sentence: "The quick brown fox jumps over the
lazy dog."
1. Tokenization: This involves breaking down the sentence into individual
words, or tokens. In this example, the sentence would be tokenized into
the following list of words: ["The", "quick", "brown", "fox", "jumps",
"over", "the", "lazy", "dog."]
2. Stemming and Lemmatization: These are techniques that reduce words to
their root form, so that words with the same meaning are represented in
the same way. For example, the word "jumps" could be stemmed to "jump"
and lemmatized to "jump."
3. Part-of-Speech (POS) Tagging: This involves marking each word in a
sentence with its corresponding part of speech, such as noun, verb,
adjective, etc. For example, the word "jumps" in this sentence would be
tagged as a verb.
4. Named Entity Recognition (NER): This involves identifying and
categorizing named entities, such as people, organizations, and locations,
within a sentence. The example sentence contains no real named entities,
but in a sentence such as "Barack Obama visited Paris," the phrase
"Barack Obama" would be tagged as a person and "Paris" as a location.
(A short sketch of POS tagging and NER follows this list.)
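The lab program below demonstrates only tokenization and word frequencies, so the following is a minimal, illustrative sketch of steps 3 and 4 using NLTK's pos_tag and ne_chunk functions. This sketch is not part of the original lab program; the sentence "Barack Obama visited Paris." is only an assumed example, and the resource names passed to nltk.download may vary slightly between NLTK versions.

import nltk
from nltk import word_tokenize, pos_tag, ne_chunk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')

# POS tagging on the example sentence (each token receives a Penn Treebank tag)
tokens = word_tokenize("The quick brown fox jumps over the lazy dog.")
print(pos_tag(tokens))

# NER needs a sentence containing real named entities (assumed example)
ner_tokens = word_tokenize("Barack Obama visited Paris.")
print(ne_chunk(pos_tag(ner_tokens)))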
PROGRAM:
import nltk
from nltk.tokenize import word_tokenize
from collections import Counter
nltk.download('punkt')

# Sample text
text = "Natural language processing (NLP) is the ability of a computer program to understand human language as it is spoken."

# Tokenize the text into words
tokens = word_tokenize(text)
print(tokens)

# Count word frequencies and print the 10 most frequent words
word_freq = Counter(tokens)
print("The 10 most frequent words are:")
for word, freq in word_freq.most_common(10):
    print(f"{word}: {freq}")
OUTPUT:
['Natural', 'language', 'processing',
'(', 'NLP', ')', 'is', 'the', 'ability',
'of', 'a', 'computer', 'program', 'to',
'understand', 'human', 'language', 'as',
'it', 'is', 'spoken', '.']
The 10 most frequent words are:
language: 2
is: 2
Natural: 1
processing: 1
(: 1
NLP: 1
): 1
the: 1
ability: 1
of: 1
RESULT:
EXP2: WORD GENERATION
AIM: A simple Python program for word generation in NLP.
THEORY: Word generation in NLP (Natural Language Processing) refers to the
task of generating new words or sentences based on a given text corpus or set
of inputs. This can be accomplished through various methods, such as statistical
language modeling, neural network-based language generation, and rule-based
text generation. In statistical language modeling, a model is trained on a large
corpus of text and then used to generate new words or sentences by predicting
the likelihood of certain sequences of words.
In neural network-based language generation, deep learning models, such as
Recurrent Neural Networks (RNNs) or Transformer networks, are trained on a
large corpus of text to generate new words or sentences based on patterns and
relationships learned from the training data. In rule-based text generation, a set
of rules or templates is used to generate new words or sentences based on specific
patterns or relationships between words. Word generation is a valuable tool in
NLP for tasks such as data augmentation, text summarization, and language
translation, among others. However, it can also be used to generate misleading or
fake text, so it is important to be aware of its potential limitations and ethical
considerations.
Here is a simple example of word generation using a neural network-based
language model. Suppose we have a corpus of text that includes the sentence
"The quick brown fox jumps over the lazy dog." A neural network-based language
model can be trained on this text to predict the next word in a sentence, given a
set of previous words as input. For example, if we input the sequence "The quick
brown fox" into the model, it may generate the next word "jumps." We can then
feed that output back into the model to generate the next word, and so on. The
final output could be a sentence like "The quick brown fox jumps over the green
fence." In this example, the language model has learned the relationships between
words and the patterns of language used in the training corpus, and has generated
a new sentence based on that information. A minimal sketch of this next-word
idea, using simple bigram counts, follows below.
Word generation can be a useful tool in NLP for tasks such as text summarization,
data augmentation, and language translation. However, the quality of the
generated words or sentences depends on the quality of the training data and the
methods used for word generation.
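The following is a minimal sketch of the next-word idea described above. It is not part of the lab program and uses plain bigram counts from the toy sentence instead of a trained neural network, which is only an assumption made for illustration.

import random
from collections import defaultdict

# Toy corpus from the theory above
corpus = "The quick brown fox jumps over the lazy dog"
tokens = corpus.lower().split()

# Build a bigram table: for each word, the words observed immediately after it
next_words = defaultdict(list)
for w1, w2 in zip(tokens, tokens[1:]):
    next_words[w1].append(w2)

# Generate a short sequence by repeatedly sampling a next word
word = "the"
generated = [word]
for _ in range(5):
    candidates = next_words.get(word)
    if not candidates:
        break
    word = random.choice(candidates)
    generated.append(word)
print(" ".join(generated))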
PROGRAM:
import nltk
import random
from nltk.corpus import words
nltk.download('words')

# Pick a random word from the NLTK words corpus
random_word = random.choice(words.words())
print("The randomly generated word is:", random_word)
OUTPUT:
The randomly generated word is: meritoriousness
RESULT:
EXP3: MORPHOLOGY
AIM: A simple Python program for morphology in NLP.
THEORY: Morphology is the study of word structure and formation, including
inflection and derivation. Here is a simple Python program that performs basic
morphological operations, such as stemming and lemmatization, using the
Natural Language Toolkit (NLTK) library. In this program, we first tokenize the
input text into words, and then perform stemming and lemmatization on each
word. The PorterStemmer and WordNetLemmatizer classes from the NLTK
library are used for this purpose. The resulting stemmed and lemmatized words
are then returned and printed.
Morphology in NLP (Natural Language Processing) is the study of the internal
structure of words and how they can be modified to create new words. It deals
with inflection, derivation, and compounding. For example, the word "unhappy"
is formed by adding the prefix "un-" to the base word "happy." The prefix "un-"
changes the meaning of the word to its opposite, making "unhappy" mean "not
happy." This process is an example of morphology in NLP, where the structure
of words is analyzed to understand how they are formed and how their meanings
are affected by various modifications, as in the small prefix-stripping sketch below.
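As a small illustration of the derivational example above (not part of the lab program), the following sketch strips the "un-" prefix when the remaining string is a word known to WordNet. The function name and the rule itself are assumptions made purely for demonstration.

import nltk
from nltk.corpus import wordnet
nltk.download('wordnet')

def strip_un_prefix(word):
    # If removing "un-" leaves a word WordNet knows, treat that as the base form
    base = word[2:]
    if word.startswith("un") and wordnet.synsets(base):
        return base
    return word

print(strip_un_prefix("unhappy"))  # happy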
PROGRAM:
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('omw-1.4')

def perform_stemming(text):
    stemmer = PorterStemmer()
    stemmed_words = []
    words = word_tokenize(text)
    for word in words:
        stemmed_words.append(stemmer.stem(word))
    return stemmed_words

def perform_lemmatization(text):
    lemmatizer = WordNetLemmatizer()
    lemmatized_words = []
    words = word_tokenize(text)
    for word in words:
        lemmatized_words.append(lemmatizer.lemmatize(word))
    return lemmatized_words

text = "This is an example sentence showing off the stemming and lemmatization in NLP"
stemmed_words = perform_stemming(text)
lemmatized_words = perform_lemmatization(text)
print("Stemmed words:", stemmed_words)
print("Lemmatized words:", lemmatized_words)
OUTPUT:
Stemmed words: ['thi', 'is', 'an', 'exampl', 'sentenc', 'show', 'off', 'the', 'stem', 'and', 'lemmat', 'in', 'nlp']
Lemmatized words: ['This', 'is', 'an', 'example', 'sentence', 'showing', 'off', 'the', 'stemming', 'and', 'lemmatization', 'in', 'NLP']
RESULT:
EXP4: N-Grams
PROGRAM:
import nltk
from nltk.tokenize import word_tokenize
from nltk.util import ngrams
nltk.download('punkt')

# Sample text
text = "Natural language processing (NLP) is the ability of a computer program to understand human language as it is spoken."

# Generate and print all N-grams of size N
N = 3
for ngram in ngrams(word_tokenize(text), N):
    print(ngram)
OUTPUT:
RESULT:
EXP5: N-GRAMS SMOOTHING
AIM: A simple Python program that implements N-gram smoothing in NLP.
THEORY: This program uses the word_tokenize function from the nltk module
to split the text into words. It then generates bigrams (pairs of words) from the
tokens using the ngrams function. A Kneser-Ney Interpolated language model is
trained on the bigrams using the KneserNeyInterpolated class from the nltk.lm
module. The model is then used to evaluate the probabilities of the bigrams in
the text. Kneser-Ney smoothing is a type of N-gram smoothing that adjusts the
probabilities of N-grams based on the frequency of lower-order N-grams in the
training data. This helps to avoid the issue of zero-probability N-grams that can
arise with simple maximum likelihood estimation.
N-gram smoothing is a technique used in NLP (Natural Language Processing)
to estimate the probability of an N-gram that has not been seen in the training
data. This technique helps to solve the problem of zero probability, which occurs
when an N-gram that appears in the test data has not been seen in the training
data. There are several methods of N-gram smoothing, but one of the most
commonly used is add-k smoothing. In add-k smoothing, a small value k is added
to the count of each N-gram before computing the probabilities. This helps to
avoid zero probabilities and ensures that the probabilities of all N-grams sum to 1.
For example, suppose we have a corpus of text containing the following bigrams:
"the quick", "quick brown", "brown fox", and "fox jumps". We want to estimate
the probability of the bigram "jumps over", which is not present in the training
data. Using add-k smoothing with k = 1, we can compute the probability as follows:
● P("jumps over") = (0 + 1) / (4 + 4) = 1/8
Here, we added 1 to the count of each bigram to avoid zero probabilities, and
divided the result by the total count of all bigrams plus 4 (the number of bigram
types in the vocabulary). N-gram smoothing is a useful technique in NLP for
improving the accuracy of language models and other NLP applications. A small
sketch of this add-k calculation follows below.
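As a small illustration (not part of the lab program), the following sketch reproduces the simplified add-k calculation from the example above. The variable names and the bigram-type count of 4 are assumptions taken directly from that example rather than a general conditional formulation.

from collections import Counter

# Training bigrams from the example above
train_bigrams = [("the", "quick"), ("quick", "brown"), ("brown", "fox"), ("fox", "jumps")]
counts = Counter(train_bigrams)
total = sum(counts.values())   # 4 observed bigrams
num_types = 4                  # number of bigram types assumed in the example
k = 1

def add_k_prob(bigram):
    # (count + k) / (total count + k * number of bigram types)
    return (counts[bigram] + k) / (total + k * num_types)

print(add_k_prob(("jumps", "over")))  # (0 + 1) / (4 + 4) = 0.125
print(add_k_prob(("the", "quick")))   # (1 + 1) / (4 + 4) = 0.25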
PROGRAM:
import nltk
from nltk.tokenize import word_tokenize
from nltk.util import ngrams
from nltk.lm import KneserNeyInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline
nltk.download('punkt')

# Sample text
text = "Natural language processing (NLP) is the ability of a computer program to understand human language as it is spoken."
tokens = word_tokenize(text)
bigrams = list(ngrams(tokens, 2))

# Train a Kneser-Ney interpolated bigram model and score the bigrams
train_data, vocab = padded_everygram_pipeline(2, [tokens])
model = KneserNeyInterpolated(2)
model.fit(train_data, vocab)
print("Probabilities of bigrams:")
for bigram in bigrams:
    print("{}: {:.3f}".format(bigram, model.score(bigram[1], [bigram[0]])))
OUTPUT:
Probabilities of bigrams:
RESULT: