NLP Using Python

The document demonstrates various natural language processing techniques using the NLTK library in Python including tokenization, part-of-speech tagging, stemming, lemmatization, frequency distributions, named entity recognition and more. Code examples tokenize text from Shakespeare's Hamlet and an artificial intelligence description. Functions from NLTK are used to analyze the tokenized text, count word frequencies, stem and lemmatize words, and identify parts of speech and named entities.

Uploaded by

Me me

Tokenization

import os
import nltk
import nltk.corpus
print(os.listdir(nltk.data.find("corpora")))

Brown corpus
from nltk.corpus import brown
brown.words()

Gutenberg
nltk.corpus.gutenberg.fileids()

Shakespeare
hamlet=nltk.corpus.gutenberg.words('shakespeare-hamlet.txt')
hamlet

First 500 words in Hamlet

# Print the first 500 tokens on one line, separated by spaces
for word in hamlet[:500]:
    print(word, end=' ')

AI text
AI = """According to the father of Artificial Intelligence, John McCarthy, it is "The science and
engineering of making intelligent machines, especially intelligent computer programs". Artificial
Intelligence is a way of making a computer, a computer-controlled robot, or a software think
intelligently, in a similar manner the intelligent humans think. AI is accomplished by studying how
human brain thinks, and how humans learn, decide, and work while trying to solve a problem, and
then using the outcomes of this study as a basis of developing intelligent software and systems."""

Tokenization of AI
from nltk.tokenize import word_tokenize
AI_tokens=word_tokenize(AI)
AI_tokens
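
To give a rough feel for what word tokenization does, here is a minimal regex-based sketch that uses only the standard library. It is an approximation, not NLTK's algorithm: `word_tokenize` applies more elaborate rules. The names `TOKEN_PATTERN` and `simple_tokenize` are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical helper, not part of NLTK: a word (optionally with internal
# hyphens or apostrophes) is one token, and every other non-space
# character becomes a token of its own.
TOKEN_PATTERN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")

def simple_tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)

sample = "AI is a way of making a computer-controlled robot think."
tokens = simple_tokenize(sample)
print(tokens)
```

The final period comes out as a separate token, which is why punctuation shows up as its own entry when tokens are counted later.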

Word Count

from nltk.probability import FreqDist

fdist = FreqDist()

# Count tokens case-insensitively
for word in AI_tokens:
    fdist[word.lower()] += 1
fdist
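
The same counting idea can be sketched with the standard library's `collections.Counter`, which `FreqDist` builds on in NLTK 3. The token list below is a hand-typed fragment of the AI text, used here as an assumption so the example runs without NLTK; filtering with `isalpha()` is an extra step not present in the original loop.

```python
from collections import Counter

# Hand-typed tokens from part of the AI text (illustrative input only)
tokens = ["The", "science", "of", "making", "intelligent", "machines", ",",
          "especially", "intelligent", "computer", "programs", "."]

# Lowercase before counting so "The" and "the" share one entry,
# and keep only alphabetic tokens so punctuation is not counted.
counts = Counter(tok.lower() for tok in tokens if tok.isalpha())

most_frequent = counts.most_common(1)
print(most_frequent)
```

`most_common(n)` returns the `n` highest counts as `(token, count)` pairs, which is usually more readable than printing the whole distribution.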

Stemmer
from nltk.stem import PorterStemmer
pst=PorterStemmer()

pst.stem("having")

words_to_stem = ["give", "giving", "given", "gave"]
for words in words_to_stem:
    print(words + ":" + pst.stem(words))
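
Rule-based stemmers strip suffixes without consulting a dictionary, so irregular forms such as "gave" pass through untouched. The function below is a deliberately crude stand-in to show that idea; `toy_stem` and its suffix list are hypothetical, and the Porter algorithm behind `PorterStemmer` has many more rules and conditions.

```python
# Hypothetical suffix list for illustration; the real Porter algorithm is far richer.
SUFFIXES = ("ing", "ed", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for form in ["give", "giving", "given", "gave"]:
    print(form + ":" + toy_stem(form))
```

Because the rules only look at word endings, the result need not be a dictionary word ("giving" becomes "giv" here), and "given" and "gave" are left as they are.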

Lemmatization
from nltk.stem import wordnet
from nltk.stem import WordNetLemmatizer
word_lem=WordNetLemmatizer()

word_lem.lemmatize('corpora')

# lemmatize() treats each word as a noun unless pos is given (for example pos='v' for verbs)
for words in words_to_stem:
    print(words + ":" + word_lem.lemmatize(words))
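
The loop above calls `lemmatize()` without a part of speech, so each word is looked up as a noun and irregular verb forms such as "gave" come back unchanged. Passing `pos='v'` to `lemmatize()` requests a verb lookup instead. The sketch below is a toy stand-in for that behavior, loosely modeled on a dictionary lookup with an exception table; every table and name in it is hypothetical and much smaller than anything WordNet uses.

```python
# Hypothetical miniature tables for illustration only; WordNet's real
# exception lists and dictionary are far larger.
IRREGULAR_FORMS = {
    "n": {"corpora": "corpus"},
    "v": {"gave": "give", "given": "give"},
}
KNOWN_LEMMAS = {"n": {"corpus"}, "v": {"give"}}
SUFFIX_RULES = {
    "n": [("s", "")],
    "v": [("ing", "e"), ("ing", ""), ("s", "")],
}

def toy_lemmatize(word, pos="n"):
    """Look the word up as the given part of speech; fall back to the word itself."""
    if word in IRREGULAR_FORMS[pos]:
        return IRREGULAR_FORMS[pos][word]
    for suffix, replacement in SUFFIX_RULES[pos]:
        if word.endswith(suffix):
            candidate = word[: -len(suffix)] + replacement
            # Accept a candidate only if the toy dictionary knows it
            if candidate in KNOWN_LEMMAS[pos]:
                return candidate
    return word

forms = ["give", "giving", "given", "gave"]
as_nouns = [toy_lemmatize(w) for w in forms]
as_verbs = [toy_lemmatize(w, pos="v") for w in forms]
print(as_nouns)
print(as_verbs)
```

Treated as nouns, the four forms stay as they are; treated as verbs, they all map to "give". This is the main practical difference from stemming: the result depends on the part of speech and is always a dictionary word.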

Stop Words
from nltk.corpus import stopwords
stopwords.words('english')
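
A stop-word list is typically used to filter tokens before counting or further analysis. The sketch below shows that filtering step with a tiny hand-written stop list, which is an assumption made so the example runs without the NLTK data; `set(stopwords.words('english'))` could be substituted for it. The names `STOP_WORDS` and `remove_stop_words` are hypothetical.

```python
# Hypothetical, deliberately tiny stop list for illustration only;
# set(stopwords.words('english')) from NLTK could be used in its place.
STOP_WORDS = {"is", "a", "when", "it", "to"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens whose lowercase form is not in stop_words."""
    return [tok for tok in tokens if tok.lower() not in stop_words]

example_tokens = "Timothy is a natural when it comes to drawing".split()
content_tokens = remove_stop_words(example_tokens)
print(content_tokens)
```

Storing the stop words in a set keeps each membership test fast, and lowercasing the token before the lookup means capitalized stop words such as "It" are filtered too.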

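sent = "Timothy is a natural when it comes to drawing"
sent_tokens = word_tokenize(sent)

# Tag the whole token list in one call so the tagger can use neighboring words as context
for token, tag in nltk.pos_tag(sent_tokens):
    print(token, tag)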

sent="Timothy uyahamba kusasa uya eBhunya when it comes to drawing"


sent_tokens=word_tokenize(sent)

POS
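sent2 = "John is eating a delicious cake"
sent2_tokens = word_tokenize(sent2)
# Tag the whole token list in one call so the tagger can use neighboring words as context
for token, tag in nltk.pos_tag(sent2_tokens):
    print(token, tag)

Named Entity Recognition
from nltk import ne_chunk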

NE_sent="The US President stays in the White House"

NE_tokens=word_tokenize(NE_sent)
NE_tags=nltk.pos_tag(NE_tokens)
NE_NER=ne_chunk(NE_tags)
print(NE_NER)
