Program-7(AI)
Aim: To write a Python program that removes stop words from a passage read from a
text file, using NLTK.
Algorithm:
Step1: Import the list of stop words provided by the NLTK library (stored in the
nltk_data directory), so that we do not have to write our own.
Step2: Input the text or passage.
Step3: Split the passage into a list of its words, convert each word to lowercase,
and check whether it is present in the stop_words list provided by NLTK.
Step4: Print the list of words that are not in the NLTK stop_words list.
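The steps above can be sketched without NLTK to see the filtering logic in isolation. This is only a minimal sketch: the small SAMPLE_STOP_WORDS set and the remove_stop_words function are hypothetical names chosen for illustration, and the real program uses the much longer list returned by stopwords.words('english').

```python
# Hypothetical, hand-picked stop words for illustration only;
# NLTK's English list is much longer.
SAMPLE_STOP_WORDS = {"this", "is", "a", "the", "of", "off", "and", "in", "to"}


def remove_stop_words(passage, stop_words=SAMPLE_STOP_WORDS):
    """Lowercase the passage, split it on whitespace, and drop stop words."""
    words = passage.lower().split()  # Step 3: lowercase and split into words
    return [w for w in words if w not in stop_words]  # Steps 3-4: filter


passage = "This is a sample sentence showing off the stop words filtration"
print(remove_stop_words(passage))
# ['sample', 'sentence', 'showing', 'stop', 'words', 'filtration']
```

Splitting on whitespace leaves punctuation attached to words (for example "filtration."), which is one reason the program itself uses NLTK's word_tokenize.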
Source code:
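import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Download the stop-word list and tokenizer data (needed only once).
# Newer NLTK releases may ask for 'punkt_tab' instead of 'punkt'.
nltk.download('stopwords')
nltk.download('punkt')

# Read the passage from a text file ('passage.txt' is a placeholder name).
with open('passage.txt', encoding='utf-8') as f:
    example_sent = f.read()

stop_words = set(stopwords.words('english'))
word_tokens = word_tokenize(example_sent)

# Keep only the tokens whose lowercase form is not a stop word.
filtered_sentence = [w for w in word_tokens if w.lower() not in stop_words]

print(word_tokens)
print(filtered_sentence)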
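Note that word_tokenize returns punctuation marks as separate tokens, and these are not stop words, so they remain in filtered_sentence. A possible refinement, which is not part of the original program, is to drop punctuation-only tokens as well. The sketch below uses a hard-coded token list and a small stop-word set so that it runs without NLTK; the function name drop_stop_words_and_punctuation is hypothetical.

```python
import string


def drop_stop_words_and_punctuation(tokens, stop_words):
    """Remove stop words (case-insensitively) and punctuation-only tokens."""
    result = []
    for tok in tokens:
        if tok.lower() in stop_words:
            continue  # token is a stop word
        if all(ch in string.punctuation for ch in tok):
            continue  # token is made up only of punctuation, e.g. ',' or '...'
        result.append(tok)
    return result


# Tokens in the shape word_tokenize would typically produce for this sentence.
tokens = ['This', 'is', 'a', 'sample', 'sentence', ',', 'showing', 'off',
          'the', 'stop', 'words', 'filtration', '.']
sample_stop_words = {'this', 'is', 'a', 'off', 'the'}  # illustrative subset only

print(drop_stop_words_and_punctuation(tokens, sample_stop_words))
# ['sample', 'sentence', 'showing', 'stop', 'words', 'filtration']
```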