
ML Project Report: (Text Learning Case Study)

This text learning case study analyzes inaugural speeches from 3 US presidents: Franklin D. Roosevelt (1941), John F. Kennedy (1961), and Richard Nixon (1973). Key analyses included: 1) Calculating the number of characters, words, and sentences in each speech. 2) Finding the word counts before and after removing stopwords from the speeches. 3) Identifying the top 3 most frequently occurring words in each speech after removing stopwords. For Roosevelt, the top 3 words were "know", "spirit", and "life". For Kennedy, the top 3 words were "world", "sides", and "new". For Nixon, the top 3 words were "America", "peace",

Uploaded by

ankitbhagat

ML Project Report

(Text Learning Case Study)


Module 6 DSBA

Ankit Bhagat
Date of Submission: 8th Jan

Business problem

Problem 1: Text Learning

Text learning Case Study

Problem 2:
In this project, we work with the inaugural corpus from the nltk library in Python.
We will be looking at the following speeches of the Presidents of the United States of America:
1. President Franklin D. Roosevelt in 1941
2. President John F. Kennedy in 1961
3. President Richard Nixon in 1973
(Hint: use .words(), .raw(), .sents() for extracting counts)

2.1 Find the number of characters, words, and sentences for the mentioned documents.

Answer-
President Franklin D. Roosevelt in 1941 speech

R represents the character count

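from nltk.corpus import inaugural  # needs nltk.download('inaugural') once

# Character count of each speech, taken from the raw text
R1 = len(inaugural.raw('1941-Roosevelt.txt'))
R1

R1 = 7571

R2 = len(inaugural.raw('1961-Kennedy.txt'))
R2

R2 = 7618

R3 = len(inaugural.raw('1973-Nixon.txt'))
R3

R3 = 9991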

S represents the sentence count

Roosevelt S1 = 68
Kennedy S2 = 52
Nixon S3 = 69

W represents the word count

Roosevelt W1 = 1536
Kennedy W2 = 1546
Nixon W3 = 2028
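
The three counts above can be collected in one pass per speech. The following is a minimal sketch and not the report's original code: the helper names speech_counts and report_inaugural_counts are hypothetical, and it assumes the inaugural corpus and the Punkt sentence tokenizer data (punkt or punkt_tab, depending on the nltk version) have already been downloaded with nltk.download().

```python
def speech_counts(corpus, fileid):
    """Return character, word, and sentence counts for one corpus file.

    `corpus` is any object exposing raw(), words(), and sents(),
    such as nltk.corpus.inaugural.
    """
    return {
        "characters": len(corpus.raw(fileid)),
        "words": len(corpus.words(fileid)),
        "sentences": len(corpus.sents(fileid)),
    }


def report_inaugural_counts():
    """Print the counts for the three speeches studied in this report."""
    # Imported here so speech_counts can be used without nltk installed.
    # Assumption: the corpus and tokenizer data are already downloaded.
    from nltk.corpus import inaugural

    for fileid in ("1941-Roosevelt.txt", "1961-Kennedy.txt", "1973-Nixon.txt"):
        print(fileid, speech_counts(inaugural, fileid))
```

Calling report_inaugural_counts() prints one dictionary per speech. Note that .words() counts punctuation tokens as words, so the word totals include commas and full stops.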

2.2 Remove all the stopwords from the three speeches. Show the word count before and
after the removal of stopwords. Show a sample sentence after the removal of stopwords.

1) Roosevelt stopword removal (Word1)

2) Kennedy stopword removal

3) Nixon stopword removal

Before stopword removal

Roosevelt W1 = 1536
Kennedy W2 = 1546
Nixon W3 = 2028

After stopword removal

Roosevelt A1 = 670
Kennedy A2 = 716
Nixon A3 = 857
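
One way to produce before and after counts like these is sketched below. It is an illustration under stated assumptions, not the code behind the figures above: remove_stopwords and stopword_report are hypothetical names, matching is case-insensitive, and punctuation tokens are dropped, so the resulting counts will probably differ from those reported. It assumes the nltk stopwords and inaugural data, plus the Punkt tokenizer data, are already downloaded.

```python
def remove_stopwords(words, stop_words):
    """Keep alphabetic tokens that are not stopwords (case-insensitive)."""
    stop_set = {w.lower() for w in stop_words}
    return [w for w in words if w.isalpha() and w.lower() not in stop_set]


def stopword_report(fileid):
    """Return (count before, count after, sample sentence after removal)."""
    # Imported here so remove_stopwords can be used without nltk installed.
    # Assumption: the required nltk data is already downloaded.
    from nltk.corpus import inaugural, stopwords

    stop_words = stopwords.words("english")
    words = inaugural.words(fileid)
    kept = remove_stopwords(words, stop_words)
    # First sentence of the speech with stopwords removed, as a sample.
    sample = " ".join(remove_stopwords(inaugural.sents(fileid)[0], stop_words))
    return len(words), len(kept), sample
```

For example, stopword_report('1941-Roosevelt.txt') returns the two word counts together with a sample sentence. Lowercasing before the comparison is a design choice: a case-sensitive match would leave capitalized stopwords such as 'I' or 'To' in the output.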

2.3 Which word occurs most often in each president's inaugural address? Mention the
top three words (after removing the stopwords).

After removing the stopwords

a) Roosevelt

FreqDist({'know': 10, 'spirit': 9, 'life': 9, 'us': 8, 'democracy': 8, 'people': 7,
'Nation': 7, 'America': 7, 'years': 6, 'freedom': 6, ...})

1) know
2) spirit
3) life

b) Kennedy

FreqDist({'world': 8, 'sides': 8, 'new': 7, 'pledge': 7, 'citizens': 5, 'I': 5,
'power': 5, 'shall': 5, 'To': 5, 'free': 5, ...})

1) world
2) sides
3) new

c) Nixon

FreqDist({'America': 21, 'peace': 19, 'world': 17, 'new': 15, "'": 14, 'I': 12,
'responsibility': 11, 'great': 9, 'home': 9, 'nation': 9, ...})

1) America
2) peace
3) world
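
The ranking step can be sketched with the standard library alone. This is a minimal illustration, not the report's original code: top_words is a hypothetical helper, and collections.Counter stands in for nltk's FreqDist, which is a Counter subclass and offers the same most_common() method. The token list below is made up purely to show the call.

```python
from collections import Counter


def top_words(words, n=3):
    """Return the n most frequent tokens as (word, count) pairs."""
    return Counter(words).most_common(n)


# Made-up tokens, used only to demonstrate the call shape.
sample = ["peace", "world", "peace", "new", "world", "peace"]
print(top_words(sample, 2))  # prints [('peace', 3), ('world', 2)]
```

Passing a stopword-filtered token list for a speech to top_words gives its three most frequent words. Tokens such as 'I', 'To', and "'" in the frequency tables above suggest the original filtering was case-sensitive and kept some punctuation, so a stricter filter may change the lower ranks.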
