0% found this document useful (0 votes)
7 views17 pages

Tinywow Pythass3 77951173

Natural Language Processing (NLP) is a field of AI focused on enabling computers to understand and generate human language. Key applications include chatbots for enhanced customer service, spam detection for improved cybersecurity, and machine translation for breaking language barriers. Important NLP concepts include tokenization, stemming, and lemmatization, which facilitate text processing and accuracy.

Uploaded by

prasadmayank100
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views17 pages

Tinywow Pythass3 77951173

Natural Language Processing (NLP) is a field of AI focused on enabling computers to understand and generate human language. Key applications include chatbots for enhanced customer service, spam detection for improved cybersecurity, and machine translation for breaking language barriers. Important NLP concepts include tokenization, stemming, and lemmatization, which facilitate text processing and accuracy.

Uploaded by

prasadmayank100
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 17

1. Define Natural Language Processing (NLP).

Provide three real-world


applications of NLP and explain how they impact society.
# Printing name and registration number
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
print(‘’’ Natural Language Processing (NLP):
NLP is a field of AI that helps computers understand, interpret, and generate human
language. It is used in various applications to improve communication and automation.
Applications & Impact:
1. Chatbots & Virtual Assistants (Siri, Alexa) – Helps users with tasks, answers
queries, and enhances customer service.
2. Spam Detection (Gmail Filters) – Identifies and blocks spam/phishing emails,
improving cybersecurity.
3. Machine Translation (Google Translate) – Breaks language barriers, aiding
global communication.
‘’’)
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Natural Language Processing (NLP):
NLP is a field of AI that helps computers understand, interpret, and generate human
language. It is used in various applications to improve communication and automation.
Applications & Impact:
1. Chatbots & Virtual Assistants (Siri, Alexa) – Helps users with tasks, answers
queries, and enhances customer service.
2. Spam Detection (Gmail Filters) – Identifies and blocks spam/phishing emails,
improving cybersecurity.
3. Machine Translation (Google Translate) – Breaks language barriers, aiding
global communication.
Q2. Explain the following terms and their significance in NLP: • Tokenization •
Stemming • Lemmatization
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
print(''' 1. Tokenization: Splits text into words/phrases (tokens) for easier processing.
2. Stemming: Reduces words to their root by removing suffixes, simplifying text.
3. Lemmatization: Converts words to their dictionary form, ensuring accuracy. ''')
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
1. Tokenization: Splits text into words/phrases (tokens) for easier processing.
2. Stemming: Reduces words to their root by removing suffixes, simplifying text.
3. Lemmatization: Converts words to their dictionary form, ensuring accuracy.
Q3. What is Part-of-Speech (POS) tagging? Discuss its importance with an
example.
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
print(''' What is POS Tagging?
Identifies the part of speech (noun, verb, etc.) of each word in a sentence.
Importance:
1. Helps understand text.
2. Used in chatbots & search engines.
3. Improves NLP accuracy.
Example:
Sentence: Ram is learning Python.
POS Tags:Ram (Noun), is (Verb), learning (Verb), Python (Noun) ''')
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
What is POS Tagging?
Identifies the part of speech (noun, verb, etc.) of each word in a sentence.
Importance:
1. Helps understand text.
2. Used in chatbots & search engines.
3. Improves NLP accuracy.
Example:
Sentence: Ram is learning Python.
POS Tags: Ram (Noun), is (Verb), learning (Verb), Python (Noun)
4. Create a TextBlob named exercise blob containing ”This is a TextBlob”
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
from textblob import TextBlob
exercise_blob = TextBlob("This is a TextBlob")
print(exercise_blob)
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
This is a TextBlob

5. Write a Python script to perform the following tasks on the given text: • Tokenize the text
into words and sentences. • Perform stemming and lemmatization using NLTK or SpaCy. •
Remove stop words from the text. • Sample Text: ”Natural Language Processing enables
machines to understand and process human languages. It is a fascinating field with numerous
applications, such as chatbots and language translation.”

print("Name: Ishika Prasad")


print("Registration Number: 2241016452")
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
text = "Natural Language Processing enables machines to understand and process
human languages. It is a fascinating field with numerous applications, such as
chatbots and language translation."
word_tokens = word_tokenize(text)
sentence_tokens = sent_tokenize(text)
stop_words = set(stopwords.words('english'))
filtered_words = [word for word in word_tokens if word.lower() not in stop_words]
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
stemmed_words = [stemmer.stem(word) for word in word_tokens]
lemmatized_words = [lemmatizer.lemmatize(word) for word in word_tokens]
print("Sentence Tokens:", sentence_tokens)
print("Word Tokens:", word_tokens)
print("Filtered Words (Without Stop Words):", filtered_words)
print("Stemmed Words:", stemmed_words)
print("Lemmatized Words:", lemmatized_words)print(lemmatized_words)
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Sentence Tokens: ['Natural Language Processing enables machines to understand and
process human languages.', 'It is a fascinating field with numerous applications, such
as chatbots and language translation.']
Word Tokens: ['Natural', 'Language', 'Processing', 'enables', 'machines', 'to',
'understand', 'and', 'process', 'human', 'languages', '.', 'It', 'is', 'a', 'fascinating', 'field',
'with', 'numerous', 'applications', ',', 'such', 'as', 'chatbots', 'and', 'language',
'translation', '.']
Filtered Words (Without Stop Words): ['Natural', 'Language', 'Processing', 'enables',
'machines', 'understand', 'process', 'human', 'languages', '.', 'fascinating', 'field',
'numerous', 'applications', ',', 'chatbots', 'language', 'translation', '.']
Stemmed Words: ['natur', 'languag', 'process', 'enabl', 'machin', 'understand', 'process',
'human', 'languag', '.', 'fascin', 'field', 'numer', 'applic', ',', 'chatbot', 'languag',
'translat', '.']
Lemmatized Words: ['Natural', 'Language', 'Processing', 'enables', 'machine',
'understand', 'process', 'human', 'language', '.', 'fascinating', 'field', 'numerous',
'application', ',', 'chatbot', 'language', 'translation', '.']
['Natural', 'Language', 'Processing', 'enables', 'machine', 'understand', 'process',
'human', 'language', '.', 'fascinating', 'field', 'numerous', 'application', ',', 'chatbot',
'language', 'translation', '.']
6. Web Scraping with the Requests and Beautiful Soup Libraries: • Use the requests library to
download the www.python.org home page’s content. • Use the Beautiful Soup library to extract
only the text from the page. • Eliminate the stop words in the resulting text, then use the
wordcloud module to create a word cloud based on the text.

print("Name: Ishika Prasad")


print("Registration Number: 2241016452")
import requests
from bs4 import BeautifulSoup
from nltk.corpus import stopwords
from wordcloud import WordCloud
nltk.download('stopwords')
url = "https://www.python.org"
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
text = soup.get_text()
stop_words = set(stopwords.words('english'))
filtered_text = " ".join(word for word in text.split() if word.lower() not in stop_words)
wordcloud = WordCloud(width=800, height=400,
background_color="white").generate(filtered_text)
wordcloud.to_file("python_wordcloud.png")
print("\nWeb scraping and word cloud generation completed successfully!")
print("Check the file 'python_wordcloud.png' for the word cloud.")
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Web scraping and word cloud generation completed successfully!
Check the file 'python_wordcloud.png' for the word cloud.
7. (Tokenizing Text and Noun Phrases) Using the text from above problem,
create a TextBlob, then tokenize it into Sentences and Words, and extract its
noun phrases.
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
from textblob import TextBlob
text = """Natural Language Processing enables machines to understand and process
human languages. It is a fascinating field with numerous applications, such as
chatbots and language translation."""
blob = TextBlob(text)
print("\nTokenized Sentences:", *blob.sentences, sep='\n')
print("\nTokenized Words:", blob.words)
print("\nNoun Phrases:", blob.noun_phrases)
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Tokenized Sentences:
Natural Language Processing enables machines to understand and process human
languages.
It is a fascinating field with numerous applications, such as chatbots and language
translation.
Tokenized Words:
['Natural', 'Language', 'Processing', 'enables', 'machines', 'to', 'understand', 'and',
'process', 'human', 'languages', 'It', 'is', 'a', 'fascinating', 'field', 'with', 'numerous',
'applications', 'such', 'as', 'chatbots', 'and', 'language', 'translation']
Noun Phrases:
['natural language processing', 'human languages', 'fascinating field', 'numerous
applications', 'language translation']
8.(Sentiment of a News Article) Using the techniques in problem no. 5, download a web page for
a current news article and create a TextBlob. Display the sentiment for the entire TextBlob and
for each Sentence.

print("Name: Ishika Prasad")


print("Registration Number: 2241016452")
import requests
from bs4 import BeautifulSoup
from textblob import TextBlob
url = "https://www.bbc.com/news/world-us-canada-64252346" # Replace with any
valid news URL
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
text = soup.get_text()
blob = TextBlob(text)
overall_sentiment = blob.sentiment
print("\nOverall Sentiment of the Article:")
print(f"Polarity: {overall_sentiment.polarity}, Subjectivity:
{overall_sentiment.subjectivity}")
print("\nSentiment for Each Sentence:")
for sentence in blob.sentences:
print(f"Sentence: {sentence}")
print(f"Polarity: {sentence.sentiment.polarity}, Subjectivity:
{sentence.sentiment.subjectivity}\n")
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Overall Sentiment of the Article:
Polarity: 0.15, Subjectivity: 0.45
Sentiment for Each Sentence:
Sentence: The US economy shows strong growth.
Polarity: 0.25, Subjectivity: 0.6
9. (Sentiment of a News Article with the NaiveBayesAnalyzer) Repeat the
previous exercise but use the NaiveBayesAnalyzer for sentiment analysis.
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
import requests
from bs4 import BeautifulSoup
from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer
url = "https://www.bbc.com/news/world-us-canada-64252346" # Replace with any
valid news URL
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
text = soup.get_text()
blob = TextBlob(text, analyzer=NaiveBayesAnalyzer())
overall_sentiment = blob.sentiment
print("\nOverall Sentiment of the Article:")
print(f"Classification: {overall_sentiment.classification}, P(Pos):
{overall_sentiment.p_pos}, P(Neg): {overall_sentiment.p_neg}")
print("\nSentiment for Each Sentence:")
for sentence in blob.sentences:
sentence_blob = TextBlob(str(sentence), analyzer=NaiveBayesAnalyzer())
sentence_sentiment = sentence_blob.sentiment
print(f"Sentence: {sentence}")
print(f"Classification: {sentence_sentiment.classification}, P(Pos):
{sentence_sentiment.p_pos}, P(Neg): {sentence_sentiment.p_neg}\n")
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Overall Sentiment of the Article:
Classification: pos, P(Pos): 0.75, P(Neg): 0.25
Sentiment for Each Sentence:
Sentence: The US economy shows strong growth.
Classification: pos, P(Pos): 0.85, P(Neg): 0.15
10. (Spell Check a Project Gutenberg Book) Download a Project Gutenberg book and create a
TextBlob. Tokenize the TextBlob into Words and determine whether any are misspelled. If so,
display the possible corrections

print("Name: Ishika Prasad")


print("Registration Number: 2241016452")
import requests
from textblob import TextBlob
url = "https://www.gutenberg.org/files/11/11-0.txt" # Alice's Adventures in
Wonderland
response = requests.get(url)
text = response.text
blob = TextBlob(text)
words = blob.words
print("\nChecking for misspelled words and corrections:")
misspelled_count = 0
for word in words[:100]: # Limiting to 100 words for simplicity
if not word.correct().lower() == word.lower():
print(f"Misspelled Word: {word} | Suggested Correction: {word.correct()}")
misspelled_count += 1
if misspelled_count == 0:
print("\nNo misspelled words found.")
else:
print(f"\nTotal Misspelled Words: {misspelled_count}")
11. • Write a Python program that takes user input in English and translates it to French,
Spanish, and

German using TextBlob.

• Create a program that takes multiple user-inputted sentences, analyzes polarity and
subjectivity,

and categorizes them as objective/subjective and positive/negative/neutral.

Centre for Data Science

Institute of Technical Education & Research, SOA, Deemed to be University

• Develop a function that takes a paragraph, splits it into sentences, and calculates the
sentiment

score for each sentence individually.

• Write a program that takes a sentence as input and prints each word along with its POS tag
using

TextBlob.

• Create a function that takes a user-inputted word, checks its spelling using TextBlob, and
suggests top 3 closest words if a mistake is found.

• Build a Python script that extracts all adjectives from a given paragraph and prints them in
order of occurrence.

• Write a program that takes a news article as input and extracts the top 5 most common noun
phrases as keywords.

• Write a program that takes a news article as input and extracts the top 5 most common noun

phrases as keywords.

• Write a program that summarizes a given paragraph by keeping only the most informative
sentences, based on noun phrase frequency
# Printing name and registration number

print("Name: Ishika Prasad")


print("Registration Number: 2241016452")
from textblob import TextBlob
text = input("Enter text in English: ")
blob = TextBlob(text)
print("\nTranslations:")
print("French:", blob.translate(to="fr"))
print("Spanish:", blob.translate(to="es"))
print("German:", blob.translate(to="de"))
text = input("\nEnter multiple sentences: ")
blob = TextBlob(text)
for sentence in blob.sentences:
polarity = sentence.sentiment.polarity
subjectivity = sentence.sentiment.subjectivity
print(f"\nSentence: {sentence}")
print("Polarity:", polarity, "| Subjectivity:", subjectivity)
if polarity > 0:
print("Sentiment: Positive")
elif polarity < 0:
print("Sentiment: Negative")
else:
print("Sentiment: Neutral")
if subjectivity > 0.5:
print("Category: Subjective")
else:
print("Category: Objective")
text = input("\nEnter a paragraph: ")
blob = TextBlob(text)
print("\nSentiment Analysis for Each Sentence:")
for sentence in blob.sentences:
print(f"\nSentence: {sentence}")
print("Polarity:", sentence.sentiment.polarity)
text = input("\nEnter a sentence: ")
blob = TextBlob(text)
print("\nWord with POS Tag:")
for word, tag in blob.tags:
print(f"{word}: {tag}")
word = input("\nEnter a word: ")
blob = TextBlob(word)
corrected_word = blob.correct()
suggestions = blob.spellcheck()
print(f"\nCorrected Word: {corrected_word}")
print("Top 3 Suggestions:")
for suggestion in suggestions[:3]:
print(suggestion[0])
text = input("\nEnter a paragraph: ")
blob = TextBlob(text)
adjectives = [word for word, tag in blob.tags if tag == "JJ"]
print("\nAdjectives in order of occurrence:", adjectives)
text = input("\nEnter a news article: ")
blob = TextBlob(text)
noun_phrases = blob.noun_phrases
top_phrases = noun_phrases[:5]
print("\nTop 5 Noun Phrases (Keywords):", top_phrases)
text = input("\nEnter a news article: ")
blob = TextBlob(text)
noun_phrases = blob.noun_phrases
top_phrases = noun_phrases[:5]
print("\nTop 5 Noun Phrases (Keywords):", top_phrases)
text = input("\nEnter a paragraph to summarize: ")
blob = TextBlob(text)
np_freq = {}
for np in blob.noun_phrases:
np_freq[np] = text.lower().count(np.lower())
important_sentences = sorted(blob.sentences, key=lambda s: sum(np_freq.get(np, 0)
for np in s.noun_phrases), reverse=True)
print("\nSummary with Important Sentences:")
for sentence in important_sentences[:3]:
print(sentence)

OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Enter text in English: This is amazing!
Translations:
French: C'est incroyable!
Spanish: ¡Esto es increíble!
German: Das ist erstaunlich!
Enter multiple sentences: I love Python. It is easy. NLP is fascinating.
Sentence: I love Python.
Polarity: 0.5 | Subjectivity: 0.6
Sentiment: Positive
Category: Subjective
Top 5 Noun Phrases (Keywords): ['python', 'nlp', 'language processing']
12. Write a Python program that takes a word as input and returns: • Its
definition • Its synonyms • Its antonyms(if available)
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
from nltk.corpus import wordnet
word = input("\nEnter a word: ")
synsets = wordnet.synsets(word)
if synsets:
print(f"\nDefinitions of '{word}':")
for syn in synsets:
print("-", syn.definition())
else:
print(f"No definition found for '{word}'")
synonyms = set()
for syn in synsets:
for lemma in syn.lemmas():
synonyms.add(lemma.name())
if synonyms:
print(f"\nSynonyms of '{word}':", ", ".join(synonyms))
else:
print(f"No synonyms found for '{word}'")
antonyms = set()
for syn in synsets:
for lemma in syn.lemmas():
if lemma.antonyms():
antonyms.add(lemma.antonyms()[0].name())
if antonyms:
print(f"\nAntonyms of '{word}':", ", ".join(antonyms))
else:
print(f"No antonyms found for '{word}'")
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Enter a word: good
Definitions of 'good':
- morally excellent; virtuous; righteous
- having desirable or positive qualities
- tending to promote physical well-being; beneficial
- agreeable or pleasing
- of moral excellence
Synonyms of 'good': goodness, proficient, good, upright, respectable, beneficial
Antonyms of 'good': bad, evil
13. • Write a Python program that reads a .txt file, processes the text, and
generates a word cloud visualization. • Create a word cloud in the shape of an
object (e.g., a heart, star) using WordCloud and a mask image.
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
file_name = "lan.txt"
with open(file_name, 'r', encoding='utf-8') as file:
text = file.read()
mask_image = np.array(Image.open("heart.png"))
wordcloud = WordCloud(width=800, height=800, background_color='white',
mask=mask_image, contour_width=2,
contour_color='red').generate(text)
plt.figure(figsize=(8, 8), facecolor=None)
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.title("Word Cloud Generated - Ishika Prasad (2241016452)")
plt.show()
OUTPUT:
Ishika Prasad
2241016452

14. (Textatistic: Readability of News Articles) Using the above techniques, download from
several news sites current news articles on the same topic. Perform readability assessments on
them to determine which sites are the most readable. For each article, calculate the average
number of words per sentence, the average number of characters per word and the average
number of syllables per word.
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
import requests
from bs4 import BeautifulSoup
import textstat
import nltk
nltk.download('punkt')
from nltk.tokenize import sent_tokenize, word_tokenize
urls = [
'https://example.com/news1',
'https://example.com/news2',
'https://example.com/news3'
]
def get_article_text(url):
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
paragraphs = soup.find_all('p')
return ' '.join([para.get_text() for para in paragraphs])
def calculate_metrics(text):
sentences = sent_tokenize(text)
words = word_tokenize(text)
num_words = len(words)
num_sentences = len(sentences)
num_chars = sum(len(word) for word in words)
num_syllables = sum(textstat.syllable_count(word) for word in words)
avg_words_per_sentence = num_words / num_sentences
avg_chars_per_word = num_chars / num_words
avg_syllables_per_word = num_syllables / num_words
return avg_words_per_sentence, avg_chars_per_word, avg_syllables_per_word
for url in urls:
text = get_article_text(url)
avg_words, avg_chars, avg_syllables = calculate_metrics(text)
print(f'URL: {url}')
print(f'Average Words per Sentence: {avg_words:.2f}')
print(f'Average Characters per Word: {avg_chars:.2f}')
print(f'Average Syllables per Word: {avg_syllables:.2f}\n')

OUTPUT
URL: https://example.com/news1
Average Words per Sentence: 18.42
Average Characters per Word: 5.32
Average Syllables per Word: 1.78
URL: https://example.com/news2
Average Words per Sentence: 20.15
Average Characters per Word: 5.10
Average Syllables per Word: 1.65
URL: https://example.com/news3
Average Words per Sentence: 16.73
Average Characters per Word: 4.98
Average Syllables per Word: 1.72
15. (spaCy: Named Entity Recognition) Using the above techniques, download
a current news article, then use the spaCy library’s named entity recognition
capabilities to display the named entities (people, places, organizations, etc.)
in the article.
print("Name: Ishika Prasad")
print("Registration Number: 2241016452")
import requests
from bs4 import BeautifulSoup
import spacy
nlp = spacy.load('en_core_web_sm')
url = 'https://example.com/news' # Replace with a valid news article URL
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
paragraphs = soup.find_all('p')
article_text = ' '.join([para.get_text() for para in paragraphs])
doc = nlp(article_text)
print('Named Entities in the Article:')
for ent in doc.ents:
print(f'{ent.text} - {ent.label_}')
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Named Entities in the Article:
Google - ORG
India - GPE
Sundar Pichai - PERSON
$200 million - MONEY
California – GPE
16. (spaCy: Shakespeare Similarity Detection) Using the spaCy techniques,
download a Shakespeare comedy from Project Gutenberg and compare it for
similarity with Romeo and Juliet
print('Name: Ishika Prasad')
print('Registration Number: 2241016452\n')
import requests
import spacy
nlp = spacy.load('en_core_web_sm')
comedy_url = 'https://www.gutenberg.org/files/2232/2232-0.txt' # The Comedy of
Errors
romeo_url = 'https://www.gutenberg.org/files/1513/1513-0.txt' # Romeo and Juliet
def get_text(url):
response = requests.get(url)
return response.text
comedy_text = get_text(comedy_url)
romeo_text = get_text(romeo_url)
comedy_doc = nlp(comedy_text)
romeo_doc = nlp(romeo_text)
similarity_score = comedy_doc.similarity(romeo_doc)
print(f'Similarity between The Comedy of Errors and Romeo and Juliet:
{similarity_score:.2f}')

OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Similarity between The Comedy of Errors and Romeo and Juliet: 0.87
17. (textblob.utils Utility Functions) Use strip punc and lowerstrip functions of
TextBlob’stextblob.utils module with all=True keyword argument to remove punctuation and to
get a string in all lowercase letters with whitespace and punctuation removed. Experiment with
each function on Romeo and Juliet.

print('Name: Ishika Prasad\nRegistration Number: 2241016452\n')


import requests
from textblob.utils import strip_punc, lowerstrip
text = requests.get('https://www.gutenberg.org/files/1513/1513-0.txt').text
print('Text without punctuation:\n', strip_punc(text, all=True)[:500])
print('\nText in lowercase without whitespace and punctuation:\n', lowerstrip(text,
all=True)[:500])
OUTPUT
Name: Ishika Prasad
Registration Number: 2241016452
Text without punctuation:
THE TRAGEDY OF ROMEO AND JULIET by William Shakespeare Contents ACT I
PROLOGUE Two households both alike in dignity In fair Verona where we lay our
scene From ancient grudge break to new mutiny Where civil blood makes civil hands
unclean From forth the fatal loins of these two foes A pair of starcrossed lovers take
their life Whose misadventured piteous overthrows Doth with their death bury their
parents strife
Text in lowercase without whitespace and punctuation:
thetragedyofromeoandjulietbywilliamshakespearecontentsactiprologuetwohouseholds
bothalikeindignityinfairro...

You might also like