Text Summarizer

Code1

Uploaded by

Soundar Ravi
Copyright
© All Rights Reserved
1 Importing the required Libraries

In [4]: import numpy as np


import pandas as pd
import warnings
import re
import nltk
nltk.download('punkt')
nltk.download('stopwords')  # corpus needed by stopwords.words('english') below
#nltk.download('all')
from nltk import word_tokenize
from nltk.tokenize import sent_tokenize
from textblob import TextBlob
import string
from string import punctuation
from nltk.corpus import stopwords
from statistics import mean
from heapq import nlargest
from wordcloud import WordCloud
import seaborn as sns
import matplotlib.pyplot as plt

stop_words = set(stopwords.words('english'))
punctuation = punctuation + '\n' + '—' + '“' + ',' + '”' + '‘' + '-' + '’'
warnings.filterwarnings('ignore')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!

In [5]: # Importing the dataset


df_1 = pd.read_csv("C:/Users/hp/Downloads/set3/articles1.csv")
df_2 = pd.read_csv("C:/Users/hp/Downloads/set3/articles2.csv")
df_3 = pd.read_csv("C:/Users/hp/Downloads/set3/articles3.csv")

In [6]: # Checking if the columns are same or not


df_1.columns == df_2.columns

Out[6]: array([ True, True, True, True, True, True, True, True, True, True])

In [7]: # Checking if the columns are same or not


df_2.columns == df_3.columns

Out[7]: array([ True, True, True, True, True, True, True, True, True, True])

In [8]: # Making one Dataframe by appending all of them for the further process
d = [df_1, df_2, df_3]
df = pd.concat(d, keys = ['x', 'y', 'z'])
df.rename(columns = {'content' : 'article'}, inplace = True);

In [9]: df.head()

Out[9]:
     Unnamed: 0     id  title                                              publication     author                         date        year    month  url  article
x 0           0  17283  House Republicans Fret About Winning Their Hea...  New York Times  Carl Hulse                     2016-12-31  2016.0   12.0  NaN  WASHINGTON Congressional Republicans have...
  1           1  17284  Rift Between Officers and Residents as Killing...  New York Times  Benjamin Mueller and Al Baker  2017-06-19  2017.0    6.0  NaN  After the bullet shells get counted, the blood...
  2           2  17285  Tyrus Wong, ‘Bambi’ Artist Thwarted by Racial ...  New York Times  Margalit Fox                   2017-01-06  2017.0    1.0  NaN  When Walt Disney’s “Bambi” opened in 1942, cri...
  3           3  17286  Among Deaths in 2016, a Heavy Toll in Pop Musi...  New York Times  William McDonald               2017-04-10  2017.0    4.0  NaN  Death may be the great equalizer, but it isn’t...
  4           4  17287  Kim Jong-un Says North Korea Is Preparing to T...  New York Times  Choe Sang-Hun                  2017-01-02  2017.0    1.0  NaN  SEOUL, South Korea — North Korea’s leader, ...

In [10]: # Shape of the dataset


print ("The shape of the dataset : ", df.shape)

The shape of the dataset : (142570, 10)

In [11]: # Dropping the unnecessary columns


df.drop(columns = ['Unnamed: 0'], inplace = True)
df.head()

Out[11]:
        id  title                                              publication     author                         date        year    month  url  article
x 0  17283  House Republicans Fret About Winning Their Hea...  New York Times  Carl Hulse                     2016-12-31  2016.0   12.0  NaN  WASHINGTON Congressional Republicans have...
  1  17284  Rift Between Officers and Residents as Killing...  New York Times  Benjamin Mueller and Al Baker  2017-06-19  2017.0    6.0  NaN  After the bullet shells get counted, the blood...
  2  17285  Tyrus Wong, ‘Bambi’ Artist Thwarted by Racial ...  New York Times  Margalit Fox                   2017-01-06  2017.0    1.0  NaN  When Walt Disney’s “Bambi” opened in 1942, cri...
  3  17286  Among Deaths in 2016, a Heavy Toll in Pop Musi...  New York Times  William McDonald               2017-04-10  2017.0    4.0  NaN  Death may be the great equalizer, but it isn’t...
  4  17287  Kim Jong-un Says North Korea Is Preparing to T...  New York Times  Choe Sang-Hun                  2017-01-02  2017.0    1.0  NaN  SEOUL, South Korea — North Korea’s leader, ...

In [12]: # Countplot shows the distribution of Publication


plt.rcParams['figure.figsize'] = [15, 8]
sns.set(font_scale = 1.2, style = 'darkgrid')
sns_year = sns.countplot(x = df['publication'], color = 'darkcyan')
plt.xticks(rotation=45)
sns_year.set(xlabel = "Publication", ylabel = "Count", title = "Distribution of Publication according")
Out[12]: [Text(0.5, 0, 'Publication'),
Text(0, 0.5, 'Count'),
Text(0.5, 1.0, 'Distribution of Publication according')]

2 Exploratory Data Analysis


In [12]: # Replacing the erroneous row value of year with its actual value
df['year'] = df['year'].replace("https://www.washingtonpost.com/outlook/tale-of-a-woman-w

In [13]: # Years
df['year'].value_counts()

Out[13]: 2016.0    85405
         2017.0    50404
         2015.0     3705
         2013.0      228
         2014.0      125
         2012.0       34
         2011.0        8
         2010.0        6
         2008.0        3
         2009.0        3
         2004.0        2
         2003.0        2
         2005.0        2
         2007.0        1
         2000.0        1
         Name: year, dtype: int64

In [14]: # Countplot shows the distribution of the articles according to the year
plt.rcParams['figure.figsize'] = [15, 8]
sns.set(font_scale = 1.2, style = 'whitegrid')
sns_year = sns.countplot(x = df['year'], color = 'darkcyan')
sns_year.set(xlabel = "Year", ylabel = "Count", title = "Distribution of the articles according to the year")

Out[14]: [Text(0.5, 0, 'Year'),
 Text(0, 0.5, 'Count'),
 Text(0.5, 1.0, 'Distribution of the articles according to the year')]

In [15]: # Authors
df['author'].value_counts()

Out[15]: Breitbart News                                         1559
         Pam Key                                                1282
         Associated Press                                       1231
         Charlie Spiering                                        928
         Jerome Hudson                                           806
                                                                ...
         Laura Italiano, Sophia Rosenbaum and Philip Messing       1
         Larry Celona, C.J. Sullivan and Daniel Prendergast        1
         Krit McClean                                              1
         Melissa Klein and Joe Tacopino                            1
         John Yearwood                                             1
         Name: author, Length: 15647, dtype: int64

In [16]: # Countplot shows the distribution of author


plt.rcParams['figure.figsize'] = [15, 15]
sns.set(font_scale = 1, style = 'whitegrid')
df_author = df.author.value_counts().head(80)

# Keep the Axes returned by barplot so the labels are applied to this plot
ax_author = sns.barplot(x = df_author.values, y = df_author.index)
ax_author.set(xlabel = "count", ylabel = "author", title = "the most freq author")

Out[16]: [Text(0.5, 21.200000000000003, 'count'),
 Text(21.200000000000003, 0.5, 'author'),
 Text(0.5, 1.0, 'the most freq author')]
In [18]: # Changing the value "The Associated Press" to "Associated Press"
df['author'] = df['author'].replace("The Associated Press", "Associated Press")

3 Making the Article Summarizer


In [19]: contractions_dict = {
"ain't": "am not",
"aren't": "are not",
"can't": "cannot",
"can't've": "cannot have",
"'cause": "because",
"could've": "could have",
"couldn't": "could not",
"couldn't've": "could not have",
"didn't": "did not",
"doesn't": "does not",
"doesn’t": "does not",
"don't": "do not",
"don’t": "do not",
"hadn't": "had not",
"hadn't've": "had not have",
"hasn't": "has not",
"haven't": "have not",
"he'd": "he had",
"he'd've": "he would have",
"he'll": "he will",
"he'll've": "he will have",
"he's": "he is",
"how'd": "how did",
"how'd'y": "how do you",
"how'll": "how will",
"how's": "how is",
"i'd": "i would",
"i'd've": "i would have",
"i'll": "i will",
"i'll've": "i will have",
"i'm": "i am",
"i've": "i have",
"isn't": "is not",
"it'd": "it would",
"it'd've": "it would have",
"it'll": "it will",
"it'll've": "it will have",
"it's": "it is",
"let's": "let us",
"ma'am": "madam",
"mayn't": "may not",
"might've": "might have",
"mightn't": "might not",
"mightn't've": "might not have",
"must've": "must have",
"mustn't": "must not",
"mustn't've": "must not have",
"needn't": "need not",
"needn't've": "need not have",
"o'clock": "of the clock",
"oughtn't": "ought not",
"oughtn't've": "ought not have",
"shan't": "shall not",
"sha'n't": "shall not",
"shan't've": "shall not have",
"she'd": "she would",
"she'd've": "she would have",
"she'll": "she will",
"she'll've": "she will have",
"she's": "she is",
"should've": "should have",
"shouldn't": "should not",
"shouldn't've": "should not have",
"so've": "so have",
"so's": "so is",
"that'd": "that would",
"that'd've": "that would have",
"that's": "that is",
"there'd": "there would",
"there'd've": "there would have",
"there's": "there is",
"they'd": "they would",
"they'd've": "they would have",
"they'll": "they will",
"they'll've": "they will have",
"they're": "they are",
"they've": "they have",
"to've": "to have",
"wasn't": "was not",
"we'd": "we would",
"we'd've": "we would have",
"we'll": "we will",
"we'll've": "we will have",
"we're": "we are",
"we've": "we have",
"weren't": "were not",
"what'll": "what will",
"what'll've": "what will have",
"what're": "what are",
"what's": "what is",
"what've": "what have",
"when's": "when is",
"when've": "when have",
"where'd": "where did",
"where's": "where is",
"where've": "where have",
"who'll": "who will",
"who'll've": "who will have",
"who's": "who is",
"who've": "who have",
"why's": "why is",
"why've": "why have",
"will've": "will have",
"won't": "will not",
"won't've": "will not have",
"would've": "would have",
"wouldn't": "would not",
"wouldn't've": "would not have",
"y'all": "you all",
"y’all": "you all",
"y'all'd": "you all would",
"y'all'd've": "you all would have",
"y'all're": "you all are",
"y'all've": "you all have",
"you'd": "you would",
"you'd've": "you would have",
"you'll": "you will",
"you'll've": "you will have",
"you're": "you are",
"you've": "you have",
"ain’t": "am not",
"aren’t": "are not",
"can’t": "cannot",
"can’t’ve": "cannot have",
"’cause": "because",
"could’ve": "could have",
"couldn’t": "could not",
"couldn’t’ve": "could not have",
"didn’t": "did not",
"doesn’t": "does not",
"don’t": "do not",
"hadn’t": "had not",
"hadn’t’ve": "had not have",
"hasn’t": "has not",
"haven’t": "have not",
"he’d": "he had",
"he’d’ve": "he would have",
"he’ll": "he will",
"he’ll’ve": "he will have",
"he’s": "he is",
"how’d": "how did",
"how’d’y": "how do you",
"how’ll": "how will",
"how’s": "how is",
"i’d": "i would",
"i’d’ve": "i would have",
"i’ll": "i will",
"i’ll’ve": "i will have",
"i’m": "i am",
"i’ve": "i have",
"isn’t": "is not",
"it’d": "it would",
"it’d’ve": "it would have",
"it’ll": "it will",
"it’ll’ve": "it will have",
"it’s": "it is",
"let’s": "let us",
"ma’am": "madam",
"mayn’t": "may not",
"might’ve": "might have",
"mightn’t": "might not",
"mightn’t’ve": "might not have",
"must’ve": "must have",
"mustn’t": "must not",
"mustn’t’ve": "must not have",
"needn’t": "need not",
"needn’t’ve": "need not have",
"o’clock": "of the clock",
"oughtn’t": "ought not",
"oughtn’t’ve": "ought not have",
"shan’t": "shall not",
"sha’n’t": "shall not",
"shan’t’ve": "shall not have",
"she’d": "she would",
"she’d’ve": "she would have",
"she’ll": "she will",
"she’ll’ve": "she will have",
"she’s": "she is",
"should’ve": "should have",
"shouldn’t": "should not",
"shouldn’t’ve": "should not have",
"so’ve": "so have",
"so’s": "so is",
"that’d": "that would",
"that’d’ve": "that would have",
"that’s": "that is",
"there’d": "there would",
"there’d’ve": "there would have",
"there’s": "there is",
"they’d": "they would",
"they’d’ve": "they would have",
"they’ll": "they will",
"they’ll’ve": "they will have",
"they’re": "they are",
"they’ve": "they have",
"to’ve": "to have",
"wasn’t": "was not",
"we’d": "we would",
"we’d’ve": "we would have",
"we’ll": "we will",
"we’ll’ve": "we will have",
"we’re": "we are",
"we’ve": "we have",
"weren’t": "were not",
"what’ll": "what will",
"what’ll’ve": "what will have",
"what’re": "what are",
"what’s": "what is",
"what’ve": "what have",
"when’s": "when is",
"when’ve": "when have",
"where’d": "where did",
"where’s": "where is",
"where’ve": "where have",
"who’ll": "who will",
"who’ll’ve": "who will have",
"who’s": "who is",
"who’ve": "who have",
"why’s": "why is",
"why’ve": "why have",
"will’ve": "will have",
"won’t": "will not",
"won’t’ve": "will not have",
"would’ve": "would have",
"wouldn’t": "would not",
"wouldn’t’ve": "would not have",
"y’all": "you all",
"y’all’d": "you all would",
"y’all’d’ve": "you all would have",
"y’all’re": "you all are",
"y’all’ve": "you all have",
"you’d": "you would",
"you’d’ve": "you would have",
"you’ll": "you will",
"you’ll’ve": "you will have",
"you’re": "you are",
"you’ve": "you have",
}
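# Longest keys first, so that "can't've" is tried before its prefix "can't"
contractions_re = re.compile('(%s)' % '|'.join(re.escape(k) for k in sorted(contractions_dict, key=len, reverse=True)))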
# Function to clean the HTML from the article
def cleanhtml(raw_html):
    cleanr = re.compile('<.*?>')
    cleantext = re.sub(cleanr, '', raw_html)
    return cleantext

# Function to expand the contractions, if there are any
def expand_contractions(s, contractions_dict=contractions_dict):
    def replace(match):
        return contractions_dict[match.group(0)]
    return contractions_re.sub(replace, s)

# Function to preprocess the articles
def preprocessing(article):
    global article_sent

    # Converting to lowercase
    article = article.str.lower()

    # Removing the HTML
    article = article.apply(lambda x: cleanhtml(x))

    # Removing the email ids
    article = article.apply(lambda x: re.sub(r'\S+@\S+', '', x))

    # Removing the URLs
    article = article.apply(lambda x: re.sub("((http\://|https\://|ftp\://)|(www.))+(([a

    # Replacing the non-breaking space '\xa0' with a regular space
    article = article.apply(lambda x: x.replace("\xa0", " "))

    # Expanding the contractions
    article = article.apply(lambda x: expand_contractions(x))

    # Stripping the possessives (straight and curly apostrophes)
    article = article.apply(lambda x: x.replace("'s", ''))
    article = article.apply(lambda x: x.replace('’s', ''))

    # Collapsing runs of spaces into a single space
    article = article.apply(lambda x: re.sub(' +', ' ', x))

    # Copying the article for the sentence tokenization
    article_sent = article.copy()

    # Removing punctuation from the article
    article = article.apply(lambda x: ''.join(word for word in x if word not in punctuat

    # Collapsing spaces again, as removing punctuation can leave extra whitespace
    article = article.apply(lambda x: re.sub(' +', ' ', x))

    # Removing the stopwords
    article = article.apply(lambda x: ' '.join(word for word in x.split() if word not in

    return article

# Function to normalize the word frequency which is used in the function word_frequency
def normalize(li_word):
    global normalized_freq
    normalized_freq = []
    for dictionary in li_word:
        max_frequency = max(dictionary.values())
        for word in dictionary.keys():
            dictionary[word] = dictionary[word]/max_frequency
        normalized_freq.append(dictionary)
    return normalized_freq

# Function to calculate the word frequency (one dictionary per article)
def word_frequency(article_word):
    li_word = []
    for article in article_word:
        # Local name differs from the function name to avoid shadowing it
        freq = {}
        for word in word_tokenize(article):
            if word not in freq:
                freq[word] = 1
            else:
                freq[word] += 1
        li_word.append(freq)
    normalize(li_word)
    return normalized_freq

# Function to score the sentences, which is called in the function sent_token
def sentence_score(li):
    global sentence_score_list
    sentence_score_list = []
    for list_, dictionary in zip(li, normalized_freq):
        # A sentence's score is the sum of the normalized frequencies of its words
        scores = {}
        for sent in list_:
            for word in word_tokenize(sent):
                if word in dictionary:
                    if sent not in scores:
                        scores[sent] = dictionary[word]
                    else:
                        scores[sent] += dictionary[word]
        sentence_score_list.append(scores)
    return sentence_score_list

# Function to tokenize each article into sentences
def sent_token(article_sent):
    sentence_list = []
    for sent in article_sent:
        tokens = []
        for sentence in sent_tokenize(sent):
            # Strip punctuation character by character, then collapse spaces
            token_2 = ''.join(word for word in sentence if word not in punctuation)
            token_2 = re.sub(' +', ' ', token_2)
            tokens.append(token_2)
        sentence_list.append(tokens)
    sentence_score(sentence_list)
    return sentence_score_list

# Function which generates the summary of the articles (keeps the top-scoring 25% of the sentences)
def summary(sentence_score_OwO):
    summary_list = []
    for summ in sentence_score_OwO:
        select_length = int(len(summ)*0.25)
        summary_ = nlargest(select_length, summ, key = summ.get)
        summary_list.append(".".join(summary_))
    return summary_list

# Function to wrap a single article string (if passed) in a pandas Series
def make_series(art):
    global dataframe
    data_dict = {'article' : [art]}
    dataframe = pd.DataFrame(data_dict)['article']
    return dataframe

# Function to call to generate the summary; it in turn calls the other functions
def article_summarize(artefact):

    if not isinstance(artefact, pd.Series):
        artefact = make_series(artefact)

    df = preprocessing(artefact)

    word_normalization = word_frequency(df)

    sentence_score_OwO = sent_token(article_sent)

    summarized_article = summary(sentence_score_OwO)

    return summarized_article
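
The functions above amount to a frequency-based extractive summarizer: count words per article, divide by the largest count, score each sentence as the sum of its word weights, and keep the top quarter. The following is a minimal standard-library sketch of that idea on a toy text, not the notebook's implementation. The stop-word set, the regex sentence splitter and the helper names (`TOY_STOP_WORDS`, `split_sentences`, `words`, `summarize`) are assumptions made for illustration; the notebook uses NLTK for both steps. Unlike `summary` above, the sketch also keeps at least one sentence for very short inputs.

```python
import re
from collections import Counter
from heapq import nlargest

# Hypothetical toy stop-word list; the notebook uses NLTK's English list instead.
TOY_STOP_WORDS = {"the", "a", "is", "on", "and", "of", "to", "in"}

def split_sentences(text):
    """Naive splitter (assumption: sentences end with '.', '!' or '?')."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def words(sentence):
    """Lowercased alphabetic tokens with the toy stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in TOY_STOP_WORDS]

def summarize(text, ratio=0.25):
    sentences = split_sentences(text)
    counts = Counter(w for s in sentences for w in words(s))
    if not counts:
        return []
    top = max(counts.values())
    # Normalized weights: the most frequent word gets 1.0
    weights = {w: c / top for w, c in counts.items()}
    # Sentence score = sum of the weights of its words
    scores = {s: sum(weights[w] for w in words(s)) for s in sentences}
    keep = max(1, int(len(sentences) * ratio))  # at least one sentence
    return nlargest(keep, scores, key=scores.get)

text = ("Cats sleep a lot. Cats chase mice and cats climb trees. "
        "Dogs bark. Rain is wet.")
print(summarize(text))
```

Because scores are sums, longer sentences tend to score higher; dividing by sentence length is one common variation, though the notebook does not do so.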

In [20]: # Generating the Word Cloud of the article using the preprocessing and make_series funct
from wordcloud import WordCloud
def word_cloud(art):
    art_ = make_series(art)
    OwO = preprocessing(art_)
    wordcloud_ = WordCloud(height = 500, width = 1000, background_color = 'white').gener
    plt.figure(figsize=(15, 10))
    plt.imshow(wordcloud_, interpolation='bilinear')
    plt.axis('off');

# Generating the summaries for the first 100 articles
summaries = article_summarize(df['article'][0:100])
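
A note on the contraction step used by `preprocessing`: when a pattern is a plain alternation of dictionary keys, Python's `re` tries the alternatives from left to right, so a short key such as "can't" can match before a longer key that starts with it, such as "can't've". The sketch below contrasts dictionary order with longest-first order. The three-entry dictionary and the helper names (`toy_contractions`, `build_pattern`, `expand`) are hypothetical and only illustrate the ordering effect.

```python
import re

# Hypothetical three-entry subset of a contractions dictionary.
toy_contractions = {
    "can't": "cannot",
    "can't've": "cannot have",
    "it's": "it is",
}

def build_pattern(mapping):
    """Alternation with longer keys first; each key is regex-escaped."""
    keys = sorted(mapping, key=len, reverse=True)
    return re.compile("|".join(re.escape(k) for k in keys))

def expand(text, mapping, pattern):
    """Replace every match of pattern with its expansion from mapping."""
    return pattern.sub(lambda m: mapping[m.group(0)], text)

naive = re.compile("|".join(toy_contractions))   # dictionary order
ordered = build_pattern(toy_contractions)        # longest key first

sample = "it's odd that you can't've seen it"
print(expand(sample, toy_contractions, naive))    # leaves a stray 've
print(expand(sample, toy_contractions, ordered))  # expands the full contraction
```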

In [21]: print ("The Actual length of the article is : ", len(df['article'][0]))


df['article'][0]

The Actual length of the article is : 5607


Out[21]: 'WASHINGTON — Congressional Republicans have a new fear when it comes to their hea
lth care lawsuit against the Obama administration: They might win. The incoming Trump ad
ministration could choose to no longer defend the executive branch against the suit, whi
ch challenges the administration’s authority to spend billions of dollars on health insu
rance subsidies for and Americans, handing House Republicans a big victory on iss
ues. But a sudden loss of the disputed subsidies could conceivably cause the health care
program to implode, leaving millions of people without access to health insurance before
Republicans have prepared a replacement. That could lead to chaos in the insurance marke
t and spur a political backlash just as Republicans gain full control of the government.
To stave off that outcome, Republicans could find themselves in the awkward position of
appropriating huge sums to temporarily prop up the Obama health care law, angering conse
rvative voters who have been demanding an end to the law for years. In another twist, Do
nald J. Trump’s administration, worried about preserving executive branch prerogatives,
could choose to fight its Republican allies in the House on some central questions in th
e dispute. Eager to avoid an ugly political pileup, Republicans on Capitol Hill and the
Trump transition team are gaming out how to handle the lawsuit, which, after the electio
n, has been put in limbo until at least late February by the United States Court of Appe
als for the District of Columbia Circuit. They are not yet ready to divulge their strate
gy. “Given that this pending litigation involves the Obama administration and Congress,
it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump
transition effort. “Upon taking office, the Trump administration will evaluate this case
and all related aspects of the Affordable Care Act. ” In a potentially decision in 201
5, Judge Rosemary M. Collyer ruled that House Republicans had the standing to sue the ex
ecutive branch over a spending dispute and that the Obama administration had been distri
buting the health insurance subsidies, in violation of the Constitution, without approva
l from Congress. The Justice Department, confident that Judge Collyer’s decision would b
e reversed, quickly appealed, and the subsidies have remained in place during the appea
l. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, Hous
e Republicans last month told the court that they “and the ’s transition team currently
are discussing potential options for resolution of this matter, to take effect after the
’s inauguration on Jan. 20, 2017. ” The suspension of the case, House lawyers said, wi
ll “provide the and his future administration time to consider whether to continue pro
secuting or to otherwise resolve this appeal. ” Republican leadership officials in the H
ouse acknowledge the possibility of “cascading effects” if the payments, which have to
taled an estimated $13 billion, are suddenly stopped. Insurers that receive the subsidie
s in exchange for paying costs such as deductibles and for eligible consumers could
race to drop coverage since they would be losing money. Over all, the loss of the subsid
ies could destabilize the entire program and cause a lack of confidence that leads other
insurers to seek a quick exit as well. Anticipating that the Trump administration might
not be inclined to mount a vigorous fight against the House Republicans given the ’s di
m view of the health care law, a team of lawyers this month sought to intervene in the c
ase on behalf of two participants in the health care program. In their request, the lawy
ers predicted that a deal between House Republicans and the new administration to dismis
s or settle the case “will produce devastating consequences for the individuals who rece
ive these reductions, as well as for the nation’s health insurance and health care syste
ms generally. ” No matter what happens, House Republicans say, they want to prevail on t
wo overarching concepts: the congressional power of the purse, and the right of Congress
to sue the executive branch if it violates the Constitution regarding that spending powe
r. House Republicans contend that Congress never appropriated the money for the subsidie
s, as required by the Constitution. In the suit, which was initially championed by John
A. Boehner, the House speaker at the time, and later in House committee reports, Republi
cans asserted that the administration, desperate for the funding, had required the Treas
ury Department to provide it despite widespread internal skepticism that the spending wa
s proper. The White House said that the spending was a permanent part of the law passed
in 2010, and that no annual appropriation was required — even though the administrati
on initially sought one. Just as important to House Republicans, Judge Collyer found tha
t Congress had the standing to sue the White House on this issue — a ruling that many
legal experts said was flawed — and they want that precedent to be set to restore con
gressional leverage over the executive branch. But on spending power and standing, the T
rump administration may come under pressure from advocates of presidential authority to
fight the House no matter their shared views on health care, since those precedents coul
d have broad repercussions. It is a complicated set of dynamics illustrating how a quick
legal victory for the House in the Trump era might come with costs that Republicans neve
r anticipated when they took on the Obama White House.'

In [22]: print ("The length of the summarized article is : ", len(summaries[0]))


summaries[0]

The length of the summarized article is : 1682


Out[22]: 'anticipating that the trump administration might not be inclined to mount a vigorous fi
ght against the house republicans given the dim view of the health care law a team of la
wyers this month sought to intervene in the case on behalf of two participants in the he
alth care program.the incoming trump administration could choose to no longer defend the
executive branch against the suit which challenges the administration authority to spend
billions of dollars on health insurance subsidies for and americans handing house republ
icans a big victory on issues. in a potentially decision in 2015 judge rosemary m collye
r ruled that house republicans had the standing to sue the executive branch over a spend
ing dispute and that the obama administration had been distributing the health insurance
subsidies in violation of the constitution without approval from congress.in their reque
st the lawyers predicted that a deal between house republicans and the new administratio
n to dismiss or settle the case will produce devastating consequences for the individual
s who receive these reductions as well as for the nation health insurance and health car
e systems generally.just as important to house republicans judge collyer found that cong
ress had the standing to sue the white house on this issue a ruling that many legal expe
rts said was flawed and they want that precedent to be set to restore congressional leve
rage over the executive branch.but on spending power and standing the trump administrati
on may come under pressure from advocates of presidential authority to fight the house n
o matter their shared views on health care since those precedents could have broad reper
cussions'
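
`heapq.nlargest` returns items in descending score order, which is why the summary above opens with a sentence ("anticipating that ...") that appears late in the article. If reading order matters, one option is to select the top-scoring sentence positions and then sort them by position. The sketch below assumes scores are held in a list parallel to the sentences; the function name `top_sentences_in_order` and the scores are made up for illustration.

```python
from heapq import nlargest

def top_sentences_in_order(sentences, scores, k):
    """Return the k highest-scoring sentences, kept in their original order."""
    best = nlargest(k, range(len(sentences)), key=lambda i: scores[i])
    return [sentences[i] for i in sorted(best)]

sentences = ["first point", "minor aside", "key finding", "closing remark"]
scores = [2.0, 0.5, 3.5, 1.0]  # made-up scores for illustration

# Ranked by score, as a plain nlargest call would return them
by_score = nlargest(2, sentences, key=lambda s: scores[sentences.index(s)])
# Same two sentences, restored to document order
in_order = top_sentences_in_order(sentences, scores, 2)

print(by_score)
print(in_order)
```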

ROUGE for Text Rank


In [11]: from rouge import Rouge

In [12]: model_out = ['anticipating that the trump administration might not be inclined to mount
'the incoming trump administration could choose to no longer defend the exe
'in a potentially decision in 2015 judge rosemary m collyer ruled that hous
'in their request the lawyers predicted that a deal between house republica
'just as important to house republicans judge collyer found that congress h
'but on spending power and standing the trump administration may come under

reference=['anticipating that the trump administration might not be ready to mount a vig
'the incoming trump administration could choose to no longer defend the exe
'in a potentially decision in 2015 judge rosemary m collyer ruled that hous
'in their request the advocates predicted that a deal between house republi
'just as important to house republicans judge collyer found that congress h
'but on spending power and standing the trump administration may come under

In [13]: rouge = Rouge()


rouge.get_scores(model_out, reference, avg=True)

Out[13]: {'rouge-1': {'r': 0.9639172428646113,
  'p': 0.9639172428646113,
  'f': 0.9639172378646114},
 'rouge-2': {'r': 0.9424169228517054,
  'p': 0.9424169228517054,
  'f': 0.9424169178517054},
 'rouge-l': {'r': 0.9639172428646113,
  'p': 0.9639172428646113,
  'f': 0.9639172378646114}}
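
For orientation, the `r`, `p` and `f` entries are recall, precision and F-score. In the visible portions, the `reference` list differs from `model_out` by only a few words, which is consistent with scores this close to 1. The sketch below computes ROUGE-1 from its textbook definition (clipped unigram overlap) on a made-up sentence pair. It is an illustration only: the `rouge` package may tokenize and count differently, so its numbers need not match this function, and `rouge_1` is a hypothetical helper name.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Textbook ROUGE-1: unigram overlap with counts clipped per word."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # min count of each shared word
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = recall + precision
    f_score = 2 * recall * precision / total if total else 0.0
    return {"r": recall, "p": precision, "f": f_score}

# Made-up pair: one word out of six differs
scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

Five of the six unigrams match here, so recall, precision and F-score all equal 5/6.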

text-summarization-using-nlp

Text cleaning
In [1]: text=""" Congressional Republicans have a new fear when it comes to their health care

In [2]: len(text)

Out[2]: 5592

In [3]: import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

In [17]: stopwords = list(STOP_WORDS)

In [18]: nlp = spacy.load('en_core_web_sm')

In [19]: doc = nlp(text)

Word tokenization
In [20]: tokens = [token.text for token in doc]
print(tokens)

[' ', 'Congressional', 'Republicans', 'have', 'a', 'new', 'fear', 'when', 'it', 'comes',
'to', 'their', ' ', 'health', 'care', 'lawsuit', 'against', 'the', 'Obama', 'administr
ation', ':', 'They', 'might', 'win', '.', 'The', 'incoming', 'Trump', 'administration',
'could', 'choose', 'to', 'no', 'longer', 'defend', 'the', 'executive', 'branch', 'agains
t', 'the', 'suit', ',', 'which', 'challenges', 'the', 'administration', '’s', 'authorit
y', 'to', 'spend', 'billions', 'of', 'dollars', 'on', 'health', 'insurance', 'subsidie
s', 'for', ' ', 'and', ' ', 'Americans', ',', 'handing', 'House', 'Republicans', 'a',
'big', 'victory', 'on', ' ', 'issues', '.', 'But', 'a', 'sudden', 'loss', 'of', 'the',
'disputed', 'subsidies', 'could', 'conceivably', 'cause', 'the', 'health', 'care', 'prog
ram', 'to', 'implode', ',', 'leaving', 'millions', 'of', 'people', 'without', 'access',
'to', 'health', 'insurance', 'before', 'Republicans', 'have', 'prepared', 'a', 'replacem
ent', '.', 'That', 'could', 'lead', 'to', 'chaos', 'in', 'the', 'insurance', 'market',
'and', 'spur', 'a', 'political', 'backlash', 'just', 'as', 'Republicans', 'gain', 'ful
l', 'control', 'of', 'the', 'government', '.', 'To', 'stave', 'off', 'that', 'outcome',
',', 'Republicans', 'could', 'find', 'themselves', 'in', 'the', 'awkward', 'position',
'of', 'appropriating', 'huge', 'sums', 'to', 'temporarily', 'prop', 'up', 'the', 'Obam
a', 'health', 'care', 'law', ',', 'angering', 'conservative', 'voters', 'who', 'have',
'been', 'demanding', 'an', 'end', 'to', 'the', 'law', 'for', 'years', '.', 'In', 'anothe
r', 'twist', ',', 'Donald', 'J.', 'Trump', '’s', 'administration', ',', 'worried', 'abou
t', 'preserving', 'executive', 'branch', 'prerogatives', ',', 'could', 'choose', 'to',
'fight', 'its', 'Republican', 'allies', 'in', 'the', 'House', 'on', 'some', 'central',
'questions', 'in', 'the', 'dispute', '.', 'Eager', 'to', 'avoid', 'an', 'ugly', 'politic
al', 'pileup', ',', 'Republicans', 'on', 'Capitol', 'Hill', 'and', 'the', 'Trump', 'tran
sition', 'team', 'are', 'gaming', 'out', 'how', 'to', 'handle', 'the', 'lawsuit', ',',
'which', ',', 'after', 'the', 'election', ',', 'has', 'been', 'put', 'in', 'limbo', 'unt
il', 'at', 'least', 'late', 'February', 'by', 'the', 'United', 'States', 'Court', 'of',
'Appeals', 'for', 'the', 'District', 'of', 'Columbia', 'Circuit', '.', 'They', 'are', 'n
ot', 'yet', 'ready', 'to', 'divulge', 'their', 'strategy', '.', '“', 'Given', 'that', 't
his', 'pending', 'litigation', 'involves', 'the', 'Obama', 'administration', 'and', 'Con
gress', ',', 'it', 'would', 'be', 'inappropriate', 'to', 'comment', ',', '”', 'said', 'P
hillip', 'J.', 'Blando', ',', 'a', 'spokesman', 'for', 'the', 'Trump', 'transition', 'ef
fort', '.', '“', 'Upon', 'taking', 'office', ',', 'the', 'Trump', 'administration', 'wil
l', 'evaluate', 'this', 'case', 'and', 'all', 'related', 'aspects', 'of', 'the', 'Afford
able', 'Care', 'Act', '.', '”', 'In', 'a', 'potentially', ' ', 'decision', 'in', '201
5', ',', 'Judge', 'Rosemary', 'M.', 'Collyer', 'ruled', 'that', 'House', 'Republicans',
'had', 'the', 'standing', 'to', 'sue', 'the', 'executive', 'branch', 'over', 'a', 'spend
ing', 'dispute', 'and', 'that', 'the', 'Obama', 'administration', 'had', 'been', 'distri
buting', 'the', 'health', 'insurance', 'subsidies', ',', 'in', 'violation', 'of', 'the',
'Constitution', ',', 'without', 'approval', 'from', 'Congress', '.', 'The', 'Justice',
'Department', ',', 'confident', 'that', 'Judge', 'Collyer', '’s', 'decision', 'would',
'be', 'reversed', ',', 'quickly', 'appealed', ',', 'and', 'the', 'subsidies', 'have', 'r
emained', 'in', 'place', 'during', 'the', 'appeal', '.', 'In', 'successfully', 'seekin
g', 'a', 'temporary', 'halt', 'in', 'the', 'proceedings', 'after', 'Mr.', 'Trump', 'wo
n', ',', 'House', 'Republicans', 'last', 'month', 'told', 'the', 'court', 'that', 'the
y', '“', 'and', 'the', ' ', '’s', 'transition', 'team', 'currently', 'are', 'discussin
g', 'potential', 'options', 'for', 'resolution', 'of', 'this', 'matter', ',', 'to', 'tak
e', 'effect', 'after', 'the', ' ', '’s', 'inauguration', 'on', 'Jan.', '20', ',', '201
7', '.', '”', 'The', 'suspension', 'of', 'the', 'case', ',', 'House', 'lawyers', 'said',
',', 'will', '“', 'provide', 'the', ' ', 'and', 'his', 'future', 'administration', 'tim
e', 'to', 'consider', 'whether', 'to', 'continue', 'prosecuting', 'or', 'to', 'otherwis
e', 'resolve', 'this', 'appeal', '.', '”', 'Republican', 'leadership', 'officials', 'i
n', 'the', 'House', 'acknowledge', 'the', 'possibility', 'of', '“', 'cascading', 'effect
s', '”', 'if', 'the', ' ', 'payments', ',', 'which', 'have', 'totaled', 'an', 'estimate
d', '$', '13', 'billion', ',', 'are', 'suddenly', 'stopped', '.', 'Insurers', 'that', 'r
eceive', 'the', 'subsidies', 'in', 'exchange', 'for', 'paying', ' ', 'costs', 'such',
'as', 'deductibles', 'and', ' ', 'for', 'eligible', 'consumers', 'could', 'race', 'to',
'drop', 'coverage', 'since', 'they', 'would', 'be', 'losing', 'money', '.', 'Over', 'al
l', ',', 'the', 'loss', 'of', 'the', 'subsidies', 'could', 'destabilize', 'the', 'entir
e', 'program', 'and', 'cause', 'a', 'lack', 'of', 'confidence', 'that', 'leads', 'othe
r', 'insurers', 'to', 'seek', 'a', 'quick', 'exit', 'as', 'well', '.', 'Anticipating',
'that', 'the', 'Trump', 'administration', 'might', 'not', 'be', 'inclined', 'to', 'moun
t', 'a', 'vigorous', 'fight', 'against', 'the', 'House', 'Republicans', 'given', 'the',
' ', '’s', 'dim', 'view', 'of', 'the', 'health', 'care', 'law', ',', 'a', 'team', 'of',
'lawyers', 'this', 'month', 'sought', 'to', 'intervene', 'in', 'the', 'case', 'on', 'beh
alf', 'of', 'two', 'participants', 'in', 'the', 'health', 'care', 'program', '.', 'In',
'their', 'request', ',', 'the', 'lawyers', 'predicted', 'that', 'a', 'deal', 'between',
'House', 'Republicans', 'and', 'the', 'new', 'administration', 'to', 'dismiss', 'or', 's
ettle', 'the', 'case', '“', 'will', 'produce', 'devastating', 'consequences', 'for', 'th
e', 'individuals', 'who', 'receive', 'these', 'reductions', ',', 'as', 'well', 'as', 'fo
r', 'the', 'nation', '’s', 'health', 'insurance', 'and', 'health', 'care', 'systems', 'g
enerally', '.', '”', 'No', 'matter', 'what', 'happens', ',', 'House', 'Republicans', 'sa
y', ',', 'they', 'want', 'to', 'prevail', 'on', 'two', 'overarching', 'concepts', ':',
'the', 'congressional', 'power', 'of', 'the', 'purse', ',', 'and', 'the', 'right', 'of',
'Congress', 'to', 'sue', 'the', 'executive', 'branch', 'if', 'it', 'violates', 'the', 'C
onstitution', 'regarding', 'that', 'spending', 'power', '.', 'House', 'Republicans', 'co
ntend', 'that', 'Congress', 'never', 'appropriated', 'the', 'money', 'for', 'the', 'subs
idies', ',', 'as', 'required', 'by', 'the', 'Constitution', '.', 'In', 'the', 'suit',
',', 'which', 'was', 'initially', 'championed', 'by', 'John', 'A.', 'Boehner', ',', 'th
e', 'House', 'speaker', 'at', 'the', 'time', ',', 'and', 'later', 'in', 'House', 'commit
tee', 'reports', ',', 'Republicans', 'asserted', 'that', 'the', 'administration', ',',
'desperate', 'for', 'the', 'funding', ',', 'had', 'required', 'the', 'Treasury', 'Depart
ment', 'to', 'provide', 'it', 'despite', 'widespread', 'internal', 'skepticism', 'that',
'the', 'spending', 'was', 'proper', '.', 'The', 'White', 'House', 'said', 'that', 'the',
'spending', 'was', 'a', 'permanent', 'part', 'of', 'the', 'law', 'passed', 'in', '2010',
',', 'and', 'that', 'no', 'annual', 'appropriation', 'was', 'required', ' ', '—', ' ',
'even', 'though', 'the', 'administration', 'initially', 'sought', 'one', '.', 'Just', 'a
s', 'important', 'to', 'House', 'Republicans', ',', 'Judge', 'Collyer', 'found', 'that',
'Congress', 'had', 'the', 'standing', 'to', 'sue', 'the', 'White', 'House', 'on', 'thi
s', 'issue', ' ', '—', ' ', 'a', 'ruling', 'that', 'many', 'legal', 'experts', 'said',
'was', 'flawed', ' ', '—', ' ', 'and', 'they', 'want', 'that', 'precedent', 'to', 'be',
'set', 'to', 'restore', 'congressional', 'leverage', 'over', 'the', 'executive', 'branc
h', '.', 'But', 'on', 'spending', 'power', 'and', 'standing', ',', 'the', 'Trump', 'admi
nistration', 'may', 'come', 'under', 'pressure', 'from', 'advocates', 'of', 'presidentia
l', 'authority', 'to', 'fight', 'the', 'House', 'no', 'matter', 'their', 'shared', 'view
s', 'on', 'health', 'care', ',', 'since', 'those', 'precedents', 'could', 'have', 'broa
d', 'repercussions', '.', 'It', 'is', 'a', 'complicated', 'set', 'of', 'dynamics', 'illu
strating', 'how', 'a', 'quick', 'legal', 'victory', 'for', 'the', 'House', 'in', 'the',
'Trump', 'era', 'might', 'come', 'with', 'costs', 'that', 'Republicans', 'never', 'antic
ipated', 'when', 'they', 'took', 'on', 'the', 'Obama', 'White', 'House', '.']

In [21]: punctuation = punctuation + '\n'
punctuation

Out[21]: '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~\n—“,”‘-’\n'

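In [22]: word_frequencies = {}
# Count every token that is neither a stop word nor punctuation.
# Keys keep the token's original case, so 'Congressional' and 'congressional' are separate entries.
for word in doc:
    if word.text.lower() not in stopwords:
        if word.text.lower() not in punctuation:
            if word.text not in word_frequencies.keys():
                word_frequencies[word.text] = 1
            else:
                word_frequencies[word.text] += 1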
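
The counting loop above can also be written with `collections.Counter`. The following is a minimal sketch, not the notebook's code: it swaps the spaCy `doc` for a plain regular-expression tokenizer, uses a tiny hand-written stop-word set (`toy_stopwords`), and lowercases every key. The function name `count_words` and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter
from string import punctuation

# Hypothetical sample text and stop-word set (not taken from the notebook).
sample_text = "The House sued. The House won, and the suit ended."
toy_stopwords = {"the", "and"}

def count_words(text, stop_words):
    """Count lowercased tokens, skipping stop words and punctuation marks."""
    # Split into runs of word characters and single punctuation characters.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    return Counter(
        tok for tok in tokens
        if tok not in stop_words and tok not in punctuation
    )

word_counts = count_words(sample_text, toy_stopwords)
print(word_counts.most_common(2))
```

Lowercasing the keys here means a later lookup by lowercased token finds every entry, whatever case the word had in the text.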

In [23]: print(word_frequencies)
{' ': 7, 'Congressional': 1, 'Republicans': 15, 'new': 2, 'fear': 1, 'comes': 1, ' ':
3, 'health': 11, 'care': 7, 'lawsuit': 2, 'Obama': 5, 'administration': 13, 'win': 1, 'i
ncoming': 1, 'Trump': 9, 'choose': 2, 'longer': 1, 'defend': 1, 'executive': 5, 'branc
h': 5, 'suit': 2, 'challenges': 1, 'authority': 2, 'spend': 1, 'billions': 1, 'dollars':
1, 'insurance': 5, 'subsidies': 7, ' ': 9, 'Americans': 1, 'handing': 1, 'House': 18,
'big': 1, 'victory': 2, 'issues': 1, 'sudden': 1, 'loss': 2, 'disputed': 1, 'conceivabl
y': 1, 'cause': 2, 'program': 3, 'implode': 1, 'leaving': 1, 'millions': 1, 'people': 1,
'access': 1, 'prepared': 1, 'replacement': 1, 'lead': 1, 'chaos': 1, 'market': 1, 'spu
r': 1, 'political': 2, 'backlash': 1, 'gain': 1, 'control': 1, 'government': 1, 'stave':
1, 'outcome': 1, 'find': 1, 'awkward': 1, 'position': 1, 'appropriating': 1, 'huge': 1,
'sums': 1, 'temporarily': 1, 'prop': 1, 'law': 4, 'angering': 1, 'conservative': 1, 'vot
ers': 1, 'demanding': 1, 'end': 1, 'years': 1, 'twist': 1, 'Donald': 1, 'J.': 2, 'worrie
d': 1, 'preserving': 1, 'prerogatives': 1, 'fight': 3, 'Republican': 2, 'allies': 1, 'ce
ntral': 1, 'questions': 1, 'dispute': 2, 'Eager': 1, 'avoid': 1, 'ugly': 1, 'pileup': 1,
'Capitol': 1, 'Hill': 1, 'transition': 3, 'team': 3, 'gaming': 1, 'handle': 1, 'electio
n': 1, 'limbo': 1, 'late': 1, 'February': 1, 'United': 1, 'States': 1, 'Court': 1, 'Appe
als': 1, 'District': 1, 'Columbia': 1, 'Circuit': 1, 'ready': 1, 'divulge': 1, 'strateg
y': 1, 'Given': 1, 'pending': 1, 'litigation': 1, 'involves': 1, 'Congress': 5, 'inappro
priate': 1, 'comment': 1, 'said': 4, 'Phillip': 1, 'Blando': 1, 'spokesman': 1, 'effor
t': 1, 'taking': 1, 'office': 1, 'evaluate': 1, 'case': 4, 'related': 1, 'aspects': 1,
'Affordable': 1, 'Care': 1, 'Act': 1, 'potentially': 1, 'decision': 2, '2015': 1, 'Judg
e': 3, 'Rosemary': 1, 'M.': 1, 'Collyer': 3, 'ruled': 1, 'standing': 3, 'sue': 3, 'spend
ing': 5, 'distributing': 1, 'violation': 1, 'Constitution': 3, 'approval': 1, 'Justice':
1, 'Department': 2, 'confident': 1, 'reversed': 1, 'quickly': 1, 'appealed': 1, 'remaine
d': 1, 'place': 1, 'appeal': 2, 'successfully': 1, 'seeking': 1, 'temporary': 1, 'halt':
1, 'proceedings': 1, 'Mr.': 1, 'won': 1, 'month': 2, 'told': 1, 'court': 1, 'currently':
1, 'discussing': 1, 'potential': 1, 'options': 1, 'resolution': 1, 'matter': 3, 'effec
t': 1, 'inauguration': 1, 'Jan.': 1, '20': 1, '2017': 1, 'suspension': 1, 'lawyers': 3,
'provide': 2, 'future': 1, 'time': 2, 'consider': 1, 'continue': 1, 'prosecuting': 1, 'r
esolve': 1, 'leadership': 1, 'officials': 1, 'acknowledge': 1, 'possibility': 1, 'cascad
ing': 1, 'effects': 1, 'payments': 1, 'totaled': 1, 'estimated': 1, '13': 1, 'billion':
1, 'suddenly': 1, 'stopped': 1, 'Insurers': 1, 'receive': 2, 'exchange': 1, 'paying': 1,
'costs': 2, 'deductibles': 1, 'eligible': 1, 'consumers': 1, 'race': 1, 'drop': 1, 'cove
rage': 1, 'losing': 1, 'money': 2, 'destabilize': 1, 'entire': 1, 'lack': 1, 'confidenc
e': 1, 'leads': 1, 'insurers': 1, 'seek': 1, 'quick': 2, 'exit': 1, 'Anticipating': 1,
'inclined': 1, 'mount': 1, 'vigorous': 1, 'given': 1, 'dim': 1, 'view': 1, 'sought': 2,
'intervene': 1, 'behalf': 1, 'participants': 1, 'request': 1, 'predicted': 1, 'deal': 1,
'dismiss': 1, 'settle': 1, 'produce': 1, 'devastating': 1, 'consequences': 1, 'individua
ls': 1, 'reductions': 1, 'nation': 1, 'systems': 1, 'generally': 1, 'happens': 1, 'wan
t': 2, 'prevail': 1, 'overarching': 1, 'concepts': 1, 'congressional': 2, 'power': 3, 'p
urse': 1, 'right': 1, 'violates': 1, 'contend': 1, 'appropriated': 1, 'required': 3, 'in
itially': 2, 'championed': 1, 'John': 1, 'A.': 1, 'Boehner': 1, 'speaker': 1, 'later':
1, 'committee': 1, 'reports': 1, 'asserted': 1, 'desperate': 1, 'funding': 1, 'Treasur
y': 1, 'despite': 1, 'widespread': 1, 'internal': 1, 'skepticism': 1, 'proper': 1, 'Whit
e': 3, 'permanent': 1, 'passed': 1, '2010': 1, 'annual': 1, 'appropriation': 1, 'importa
nt': 1, 'found': 1, 'issue': 1, 'ruling': 1, 'legal': 2, 'experts': 1, 'flawed': 1, 'pre
cedent': 1, 'set': 2, 'restore': 1, 'leverage': 1, 'come': 2, 'pressure': 1, 'advocate
s': 1, 'presidential': 1, 'shared': 1, 'views': 1, 'precedents': 1, 'broad': 1, 'repercu
ssions': 1, 'complicated': 1, 'dynamics': 1, 'illustrating': 1, 'era': 1, 'anticipated':
1, 'took': 1}

Sentence tokenization
In [24]: max_frequency = max(word_frequencies.values())

In [25]: max_frequency
Out[25]: 18

In [26]: # Normalize each count by the largest count, so the most frequent word gets weight 1.0.
for word in word_frequencies.keys():
    word_frequencies[word] = word_frequencies[word]/max_frequency
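
As a worked check of the normalization step, the sketch below applies the same division to four counts copied from the printed table ('House': 18, 'Republicans': 15, 'health': 11, 'took': 1). The names `raw_counts` and `normalized` are illustrative, and building a new dictionary rather than updating in place is a stylistic choice, not something the notebook does.

```python
# A small subset of the raw counts printed earlier in the notebook.
raw_counts = {"House": 18, "Republicans": 15, "health": 11, "took": 1}

max_count = max(raw_counts.values())

# Divide every count by the maximum; the result lies in the range (0, 1].
normalized = {word: count / max_count for word, count in raw_counts.items()}

for word, weight in normalized.items():
    print(f"{word}: {weight:.4f}")
```

The weights agree with the notebook's printed values, for example 15/18 ≈ 0.8333 for 'Republicans' and 11/18 ≈ 0.6111 for 'health'.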

In [27]: print(word_frequencies)
{' ': 0.3888888888888889, 'Congressional': 0.05555555555555555, 'Republicans': 0.8333333
333333334, 'new': 0.1111111111111111, 'fear': 0.05555555555555555, 'comes': 0.0555555555
5555555, ' ': 0.16666666666666666, 'health': 0.6111111111111112, 'care': 0.38888888888
88889, 'lawsuit': 0.1111111111111111, 'Obama': 0.2777777777777778, 'administration': 0.7
222222222222222, 'win': 0.05555555555555555, 'incoming': 0.05555555555555555, 'Trump':
0.5, 'choose': 0.1111111111111111, 'longer': 0.05555555555555555, 'defend': 0.0555555555
5555555, 'executive': 0.2777777777777778, 'branch': 0.2777777777777778, 'suit': 0.111111
1111111111, 'challenges': 0.05555555555555555, 'authority': 0.1111111111111111, 'spend':
0.05555555555555555, 'billions': 0.05555555555555555, 'dollars': 0.05555555555555555, 'i
nsurance': 0.2777777777777778, 'subsidies': 0.3888888888888889, ' ': 0.5, 'Americans':
0.05555555555555555, 'handing': 0.05555555555555555, 'House': 1.0, 'big': 0.055555555555
55555, 'victory': 0.1111111111111111, 'issues': 0.05555555555555555, 'sudden': 0.0555555
5555555555, 'loss': 0.1111111111111111, 'disputed': 0.05555555555555555, 'conceivably':
0.05555555555555555, 'cause': 0.1111111111111111, 'program': 0.16666666666666666, 'implo
de': 0.05555555555555555, 'leaving': 0.05555555555555555, 'millions': 0.0555555555555555
5, 'people': 0.05555555555555555, 'access': 0.05555555555555555, 'prepared': 0.055555555
55555555, 'replacement': 0.05555555555555555, 'lead': 0.05555555555555555, 'chaos': 0.05
555555555555555, 'market': 0.05555555555555555, 'spur': 0.05555555555555555, 'politica
l': 0.1111111111111111, 'backlash': 0.05555555555555555, 'gain': 0.05555555555555555, 'c
ontrol': 0.05555555555555555, 'government': 0.05555555555555555, 'stave': 0.055555555555
55555, 'outcome': 0.05555555555555555, 'find': 0.05555555555555555, 'awkward': 0.0555555
5555555555, 'position': 0.05555555555555555, 'appropriating': 0.05555555555555555, 'hug
e': 0.05555555555555555, 'sums': 0.05555555555555555, 'temporarily': 0.0555555555555555
5, 'prop': 0.05555555555555555, 'law': 0.2222222222222222, 'angering': 0.055555555555555
55, 'conservative': 0.05555555555555555, 'voters': 0.05555555555555555, 'demanding': 0.0
5555555555555555, 'end': 0.05555555555555555, 'years': 0.05555555555555555, 'twist': 0.0
5555555555555555, 'Donald': 0.05555555555555555, 'J.': 0.1111111111111111, 'worried': 0.
05555555555555555, 'preserving': 0.05555555555555555, 'prerogatives': 0.0555555555555555
5, 'fight': 0.16666666666666666, 'Republican': 0.1111111111111111, 'allies': 0.055555555
55555555, 'central': 0.05555555555555555, 'questions': 0.05555555555555555, 'dispute':
0.1111111111111111, 'Eager': 0.05555555555555555, 'avoid': 0.05555555555555555, 'ugly':
0.05555555555555555, 'pileup': 0.05555555555555555, 'Capitol': 0.05555555555555555, 'Hil
l': 0.05555555555555555, 'transition': 0.16666666666666666, 'team': 0.16666666666666666,
'gaming': 0.05555555555555555, 'handle': 0.05555555555555555, 'election': 0.055555555555
55555, 'limbo': 0.05555555555555555, 'late': 0.05555555555555555, 'February': 0.05555555
555555555, 'United': 0.05555555555555555, 'States': 0.05555555555555555, 'Court': 0.0555
5555555555555, 'Appeals': 0.05555555555555555, 'District': 0.05555555555555555, 'Columbi
a': 0.05555555555555555, 'Circuit': 0.05555555555555555, 'ready': 0.05555555555555555,
'divulge': 0.05555555555555555, 'strategy': 0.05555555555555555, 'Given': 0.055555555555
55555, 'pending': 0.05555555555555555, 'litigation': 0.05555555555555555, 'involves': 0.
05555555555555555, 'Congress': 0.2777777777777778, 'inappropriate': 0.05555555555555555,
'comment': 0.05555555555555555, 'said': 0.2222222222222222, 'Phillip': 0.055555555555555
55, 'Blando': 0.05555555555555555, 'spokesman': 0.05555555555555555, 'effort': 0.0555555
5555555555, 'taking': 0.05555555555555555, 'office': 0.05555555555555555, 'evaluate': 0.
05555555555555555, 'case': 0.2222222222222222, 'related': 0.05555555555555555, 'aspect
s': 0.05555555555555555, 'Affordable': 0.05555555555555555, 'Care': 0.05555555555555555,
'Act': 0.05555555555555555, 'potentially': 0.05555555555555555, 'decision': 0.1111111111
111111, '2015': 0.05555555555555555, 'Judge': 0.16666666666666666, 'Rosemary': 0.0555555
5555555555, 'M.': 0.05555555555555555, 'Collyer': 0.16666666666666666, 'ruled': 0.055555
55555555555, 'standing': 0.16666666666666666, 'sue': 0.16666666666666666, 'spending': 0.
2777777777777778, 'distributing': 0.05555555555555555, 'violation': 0.05555555555555555,
'Constitution': 0.16666666666666666, 'approval': 0.05555555555555555, 'Justice': 0.05555
555555555555, 'Department': 0.1111111111111111, 'confident': 0.05555555555555555, 'rever
sed': 0.05555555555555555, 'quickly': 0.05555555555555555, 'appealed': 0.055555555555555
55, 'remained': 0.05555555555555555, 'place': 0.05555555555555555, 'appeal': 0.111111111
1111111, 'successfully': 0.05555555555555555, 'seeking': 0.05555555555555555, 'temporar
y': 0.05555555555555555, 'halt': 0.05555555555555555, 'proceedings': 0.0555555555555555
5, 'Mr.': 0.05555555555555555, 'won': 0.05555555555555555, 'month': 0.1111111111111111,
'told': 0.05555555555555555, 'court': 0.05555555555555555, 'currently': 0.05555555555555
555, 'discussing': 0.05555555555555555, 'potential': 0.05555555555555555, 'options': 0.0
5555555555555555, 'resolution': 0.05555555555555555, 'matter': 0.16666666666666666, 'eff
ect': 0.05555555555555555, 'inauguration': 0.05555555555555555, 'Jan.': 0.05555555555555
555, '20': 0.05555555555555555, '2017': 0.05555555555555555, 'suspension': 0.05555555555
555555, 'lawyers': 0.16666666666666666, 'provide': 0.1111111111111111, 'future': 0.05555
555555555555, 'time': 0.1111111111111111, 'consider': 0.05555555555555555, 'continue':
0.05555555555555555, 'prosecuting': 0.05555555555555555, 'resolve': 0.05555555555555555,
'leadership': 0.05555555555555555, 'officials': 0.05555555555555555, 'acknowledge': 0.05
555555555555555, 'possibility': 0.05555555555555555, 'cascading': 0.05555555555555555,
'effects': 0.05555555555555555, 'payments': 0.05555555555555555, 'totaled': 0.0555555555
5555555, 'estimated': 0.05555555555555555, '13': 0.05555555555555555, 'billion': 0.05555
555555555555, 'suddenly': 0.05555555555555555, 'stopped': 0.05555555555555555, 'Insurer
s': 0.05555555555555555, 'receive': 0.1111111111111111, 'exchange': 0.05555555555555555,
'paying': 0.05555555555555555, 'costs': 0.1111111111111111, 'deductibles': 0.05555555555
555555, 'eligible': 0.05555555555555555, 'consumers': 0.05555555555555555, 'race': 0.055
55555555555555, 'drop': 0.05555555555555555, 'coverage': 0.05555555555555555, 'losing':
0.05555555555555555, 'money': 0.1111111111111111, 'destabilize': 0.05555555555555555, 'e
ntire': 0.05555555555555555, 'lack': 0.05555555555555555, 'confidence': 0.05555555555555
555, 'leads': 0.05555555555555555, 'insurers': 0.05555555555555555, 'seek': 0.0555555555
5555555, 'quick': 0.1111111111111111, 'exit': 0.05555555555555555, 'Anticipating': 0.055
55555555555555, 'inclined': 0.05555555555555555, 'mount': 0.05555555555555555, 'vigorou
s': 0.05555555555555555, 'given': 0.05555555555555555, 'dim': 0.05555555555555555, 'vie
w': 0.05555555555555555, 'sought': 0.1111111111111111, 'intervene': 0.05555555555555555,
'behalf': 0.05555555555555555, 'participants': 0.05555555555555555, 'request': 0.0555555
5555555555, 'predicted': 0.05555555555555555, 'deal': 0.05555555555555555, 'dismiss': 0.
05555555555555555, 'settle': 0.05555555555555555, 'produce': 0.05555555555555555, 'devas
tating': 0.05555555555555555, 'consequences': 0.05555555555555555, 'individuals': 0.0555
5555555555555, 'reductions': 0.05555555555555555, 'nation': 0.05555555555555555, 'system
s': 0.05555555555555555, 'generally': 0.05555555555555555, 'happens': 0.0555555555555555
5, 'want': 0.1111111111111111, 'prevail': 0.05555555555555555, 'overarching': 0.05555555
555555555, 'concepts': 0.05555555555555555, 'congressional': 0.1111111111111111, 'powe
r': 0.16666666666666666, 'purse': 0.05555555555555555, 'right': 0.05555555555555555, 'vi
olates': 0.05555555555555555, 'contend': 0.05555555555555555, 'appropriated': 0.05555555
555555555, 'required': 0.16666666666666666, 'initially': 0.1111111111111111, 'champione
d': 0.05555555555555555, 'John': 0.05555555555555555, 'A.': 0.05555555555555555, 'Boehne
r': 0.05555555555555555, 'speaker': 0.05555555555555555, 'later': 0.05555555555555555,
'committee': 0.05555555555555555, 'reports': 0.05555555555555555, 'asserted': 0.05555555
555555555, 'desperate': 0.05555555555555555, 'funding': 0.05555555555555555, 'Treasury':
0.05555555555555555, 'despite': 0.05555555555555555, 'widespread': 0.05555555555555555,
'internal': 0.05555555555555555, 'skepticism': 0.05555555555555555, 'proper': 0.05555555
555555555, 'White': 0.16666666666666666, 'permanent': 0.05555555555555555, 'passed': 0.0
5555555555555555, '2010': 0.05555555555555555, 'annual': 0.05555555555555555, 'appropria
tion': 0.05555555555555555, 'important': 0.05555555555555555, 'found': 0.055555555555555
55, 'issue': 0.05555555555555555, 'ruling': 0.05555555555555555, 'legal': 0.111111111111
1111, 'experts': 0.05555555555555555, 'flawed': 0.05555555555555555, 'precedent': 0.0555
5555555555555, 'set': 0.1111111111111111, 'restore': 0.05555555555555555, 'leverage': 0.
05555555555555555, 'come': 0.1111111111111111, 'pressure': 0.05555555555555555, 'advocat
es': 0.05555555555555555, 'presidential': 0.05555555555555555, 'shared': 0.0555555555555
5555, 'views': 0.05555555555555555, 'precedents': 0.05555555555555555, 'broad': 0.055555
55555555555, 'repercussions': 0.05555555555555555, 'complicated': 0.05555555555555555,
'dynamics': 0.05555555555555555, 'illustrating': 0.05555555555555555, 'era': 0.055555555
55555555, 'anticipated': 0.05555555555555555, 'took': 0.05555555555555555}

In [28]: sentence_tokens = [sent for sent in doc.sents]
print(sentence_tokens)

[ , Congressional Republicans have a new fear when it comes to their health care laws
uit against the Obama administration: They might win., The incoming Trump administration
could choose to no longer defend the executive branch against the suit, which challenges
the administration’s authority to spend billions of dollars on health insurance subsidie
s for and Americans, handing House Republicans a big victory on issues., But a su
dden loss of the disputed subsidies could conceivably cause the health care program to i
mplode, leaving millions of people without access to health insurance before Republicans
have prepared a replacement., That could lead to chaos in the insurance market and spur
a political backlash just as Republicans gain full control of the government., To stave
off that outcome, Republicans could find themselves in the awkward position of appropria
ting huge sums to temporarily prop up the Obama health care law, angering conservative v
oters who have been demanding an end to the law for years., In another twist, Donald J.
Trump’s administration, worried about preserving executive branch prerogatives, could ch
oose to fight its Republican allies in the House on some central questions in the disput
e., Eager to avoid an ugly political pileup, Republicans on Capitol Hill and the Trump t
ransition team are gaming out how to handle the lawsuit, which, after the election, has
been put in limbo until at least late February by the United States Court of Appeals for
the District of Columbia Circuit., They are not yet ready to divulge their strategy., “G
iven that this pending litigation involves the Obama administration and Congress, it wou
ld be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump trans
ition effort., “Upon taking office, the Trump administration will evaluate this case and
all related aspects of the Affordable Care Act. ”, In a potentially decision in 2015,
Judge Rosemary M. Collyer ruled that House Republicans had the standing to sue the execu
tive branch over a spending dispute and that the Obama administration had been distribut
ing the health insurance subsidies, in violation of the Constitution, without approval f
rom Congress., The Justice Department, confident that Judge Collyer’s decision would be
reversed, quickly appealed, and the subsidies have remained in place during the appeal.,
In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House R
epublicans last month told the court that they “and the ’s transition team currently ar
e discussing potential options for resolution of this matter, to take effect after the
’s inauguration on Jan. 20, 2017. ”, The suspension of the case, House lawyers said, wil
l “provide the and his future administration time to consider whether to continue pros
ecuting or to otherwise resolve this appeal. ”, Republican leadership officials in the H
ouse acknowledge the possibility of “cascading effects” if the payments, which have to
taled an estimated $13 billion, are suddenly stopped., Insurers that receive the subsidi
es in exchange for paying costs such as deductibles and for eligible consumers coul
d race to drop coverage since they would be losing money., Over all, the loss of the sub
sidies could destabilize the entire program and cause a lack of confidence that leads ot
her insurers to seek a quick exit as well., Anticipating that the Trump administration m
ight not be inclined to mount a vigorous fight against the House Republicans given the
’s dim view of the health care law, a team of lawyers this month sought to intervene in
the case on behalf of two participants in the health care program., In their request, th
e lawyers predicted that a deal between House Republicans and the new administration to
dismiss or settle the case “will produce devastating consequences for the individuals wh
o receive these reductions, as well as for the nation’s health insurance and health care
systems generally. ”, No matter what happens, House Republicans say, they want to prevai
l on two overarching concepts: the congressional power of the purse, and the right of Co
ngress to sue the executive branch if it violates the Constitution regarding that spendi
ng power., House Republicans contend that Congress never appropriated the money for the
subsidies, as required by the Constitution., In the suit, which was initially championed
by John A. Boehner, the House speaker at the time, and later in House committee reports,
Republicans asserted that the administration, desperate for the funding, had required th
e Treasury Department to provide it despite widespread internal skepticism that the spen
ding was proper., The White House said that the spending was a permanent part of the law
passed in 2010, and that no annual appropriation was required — even though the admin
istration initially sought one., Just as important to House Republicans, Judge Collyer f
ound that Congress had the standing to sue the White House on this issue — a ruling t
hat many legal experts said was flawed — and they want that precedent to be set to re
store congressional leverage over the executive branch., But on spending power and stand
ing, the Trump administration may come under pressure from advocates of presidential aut
hority to fight the House no matter their shared views on health care, since those prece
dents could have broad repercussions., It is a complicated set of dynamics illustrating
how a quick legal victory for the House in the Trump era might come with costs that Repu
blicans never anticipated when they took on the Obama White House.]

Sentence scoring


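In [29]: sentence_scores = {}
# Score each sentence by summing the normalized frequencies of its words.
# Note: the lookup lowercases each token, but word_frequencies keeps the original case,
# so entries stored under a capitalized key (e.g. 'House', 'Republicans') are never added.
for sent in sentence_tokens:
    for word in sent:
        if word.text.lower() in word_frequencies.keys():
            if sent not in sentence_scores.keys():
                sentence_scores[sent] = word_frequencies[word.text.lower()]
            else:
                sentence_scores[sent] += word_frequencies[word.text.lower()]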
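
The scoring rule above (a sentence's score is the sum of the weights of its words) can be sketched without spaCy as follows. This is an illustration under stated assumptions, not the notebook's implementation: sentences are plain strings, tokens come from a simple regular expression, and the `weights` dictionary holds made-up values with lowercase keys so that the lowercase lookup matches every entry.

```python
import re
from collections import defaultdict

# Hypothetical normalized weights with lowercase keys (illustrative values).
weights = {"house": 1.0, "republicans": 0.8, "health": 0.6}

sentences = [
    "House Republicans filed the suit.",
    "The health care law remained in place.",
]

def score_sentences(sents, word_weights):
    """Sum the weight of every known word in each sentence."""
    scores = defaultdict(float)
    for sent in sents:
        for token in re.findall(r"\w+", sent.lower()):
            # Unknown words contribute nothing to the score.
            scores[sent] += word_weights.get(token, 0.0)
    return dict(scores)

scores = score_sentences(sentences, weights)
for sent, score in scores.items():
    print(f"{score:.1f}  {sent}")
```

Because longer sentences contain more words, a plain sum tends to favor them; dividing by sentence length is one possible variation, though the notebook does not do this.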

In [30]: sentence_scores

Out[30]: { : 0.3888888888888889,
Congressional Republicans have a new fear when it comes to their health care lawsuit
against the Obama administration: They might win.: 2.388888888888889,
The incoming Trump administration could choose to no longer defend the executive branch
against the suit, which challenges the administration’s authority to spend billions of d
ollars on health insurance subsidies for and Americans, handing House Republicans a
big victory on issues.: 5.444444444444443,
But a sudden loss of the disputed subsidies could conceivably cause the health care pro
gram to implode, leaving millions of people without access to health insurance before Re
publicans have prepared a replacement.: 3.222222222222221,
That could lead to chaos in the insurance market and spur a political backlash just as
Republicans gain full control of the government.: 0.8333333333333335,
To stave off that outcome, Republicans could find themselves in the awkward position of
appropriating huge sums to temporarily prop up the Obama health care law, angering conse
rvative voters who have been demanding an end to the law for years.: 2.3333333333333335,
In another twist, Donald J. Trump’s administration, worried about preserving executive
branch prerogatives, could choose to fight its Republican allies in the House on some ce
ntral questions in the dispute.: 2.055555555555556,
Eager to avoid an ugly political pileup, Republicans on Capitol Hill and the Trump tran
sition team are gaming out how to handle the lawsuit, which, after the election, has bee
n put in limbo until at least late February by the United States Court of Appeals for th
e District of Columbia Circuit.: 1.0555555555555556,
They are not yet ready to divulge their strategy.: 0.16666666666666666,
“Given that this pending litigation involves the Obama administration and Congress, it
would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump tr
ansition effort.: 1.5555555555555556,
“Upon taking office, the Trump administration will evaluate this case and all related a
spects of the Affordable Care Act. ”: 1.6111111111111112,
In a potentially decision in 2015, Judge Rosemary M. Collyer ruled that House Republi
cans had the standing to sue the executive branch over a spending dispute and that the O
bama administration had been distributing the health insurance subsidies, in violation o
f the Constitution, without approval from Congress.: 4.222222222222221,
The Justice Department, confident that Judge Collyer’s decision would be reversed, quic
kly appealed, and the subsidies have remained in place during the appeal.: 0.94444444444
44446,
In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House
Republicans last month told the court that they “and the ’s transition team currently a
re discussing potential options for resolution of this matter, to take effect after the
’s inauguration on Jan. 20, 2017. ”: 2.333333333333333,
The suspension of the case, House lawyers said, will “provide the and his future admi
nistration time to consider whether to continue prosecuting or to otherwise resolve this
appeal. ”: 2.499999999999999,
Republican leadership officials in the House acknowledge the possibility of “cascading
effects” if the payments, which have totaled an estimated $13 billion, are suddenly st
opped.: 1.2222222222222223,
Insurers that receive the subsidies in exchange for paying costs such as deductibles
and for eligible consumers could race to drop coverage since they would be losing mone
y.: 1.9444444444444446,
Over all, the loss of the subsidies could destabilize the entire program and cause a la
ck of confidence that leads other insurers to seek a quick exit as well.: 1.333333333333
3335,
Anticipating that the Trump administration might not be inclined to mount a vigorous fi
ght against the House Republicans given the ’s dim view of the health care law, a team
of lawyers this month sought to intervene in the case on behalf of two participants in t
he health care program.: 4.944444444444445,
In their request, the lawyers predicted that a deal between House Republicans and the n
ew administration to dismiss or settle the case “will produce devastating consequences f
or the individuals who receive these reductions, as well as for the nation’s health insu
rance and health care systems generally. ”: 3.944444444444444,
No matter what happens, House Republicans say, they want to prevail on two overarching
concepts: the congressional power of the purse, and the right of Congress to sue the exe
cutive branch if it violates the Constitution regarding that spending power.: 2.11111111
1111111,
House Republicans contend that Congress never appropriated the money for the subsidies,
as required by the Constitution.: 0.7777777777777778,
In the suit, which was initially championed by John A. Boehner, the House speaker at th
e time, and later in House committee reports, Republicans asserted that the administrati
on, desperate for the funding, had required the Treasury Department to provide it despit
e widespread internal skepticism that the spending was proper.: 2.333333333333333,
The White House said that the spending was a permanent part of the law passed in 2010,
and that no annual appropriation was required — even though the administration initia
lly sought one.: 3.0,
Just as important to House Republicans, Judge Collyer found that Congress had the stand
ing to sue the White House on this issue — a ruling that many legal experts said was
flawed — and they want that precedent to be set to restore congressional leverage ove
r the executive branch.: 3.833333333333333,
But on spending power and standing, the Trump administration may come under pressure fr
om advocates of presidential authority to fight the House no matter their shared views o
n health care, since those precedents could have broad repercussions.: 3.333333333333332
6,
It is a complicated set of dynamics illustrating how a quick legal victory for the Hous
e in the Trump era might come with costs that Republicans never anticipated when they to
ok on the Obama White House.: 1.0000000000000002}

Summarization
In [31]: from heapq import nlargest

In [32]: select_length = int(len(sentence_tokens)*0.3)
select_length

Out[32]: 8

In [33]: summary = nlargest(select_length, sentence_scores, key = sentence_scores.get)
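
`nlargest` returns the selected sentences ordered by score, not by their position in the article. If the summary should read in the original order, one option is to select the top sentences first and then filter the original list. The sketch below shows that idea with plain strings and made-up scores; `select_in_order` is a hypothetical helper, not part of the notebook.

```python
from heapq import nlargest

# Hypothetical sentences and scores (not the notebook's values).
sentences = ["First point.", "Minor aside.", "Key finding.", "Closing note."]
scores = {
    "First point.": 2.0,
    "Minor aside.": 0.5,
    "Key finding.": 3.5,
    "Closing note.": 1.0,
}

def select_in_order(sents, sent_scores, k):
    """Pick the k best-scoring sentences, then restore document order."""
    top = set(nlargest(k, sent_scores, key=sent_scores.get))
    return [s for s in sents if s in top]

picked = select_in_order(sentences, scores, 2)
print(" ".join(picked))
```

In the notebook the dictionary keys are spaCy spans rather than strings; the same pattern should apply there as long as the spans come from the same `doc`.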

In [34]: summary
Out[34]: [The incoming Trump administration could choose to no longer defend the executive branch
against the suit, which challenges the administration’s authority to spend billions of d
ollars on health insurance subsidies for and Americans, handing House Republicans a
big victory on issues.,
Anticipating that the Trump administration might not be inclined to mount a vigorous fi
ght against the House Republicans given the ’s dim view of the health care law, a team
of lawyers this month sought to intervene in the case on behalf of two participants in t
he health care program.,
In a potentially decision in 2015, Judge Rosemary M. Collyer ruled that House Republi
cans had the standing to sue the executive branch over a spending dispute and that the O
bama administration had been distributing the health insurance subsidies, in violation o
f the Constitution, without approval from Congress.,
In their request, the lawyers predicted that a deal between House Republicans and the n
ew administration to dismiss or settle the case “will produce devastating consequences f
or the individuals who receive these reductions, as well as for the nation’s health insu
rance and health care systems generally. ”,
Just as important to House Republicans, Judge Collyer found that Congress had the stand
ing to sue the White House on this issue — a ruling that many legal experts said was
flawed — and they want that precedent to be set to restore congressional leverage ove
r the executive branch.,
But on spending power and standing, the Trump administration may come under pressure fr
om advocates of presidential authority to fight the House no matter their shared views o
n health care, since those precedents could have broad repercussions.,
But a sudden loss of the disputed subsidies could conceivably cause the health care pro
gram to implode, leaving millions of people without access to health insurance before Re
publicans have prepared a replacement.,
The White House said that the spending was a permanent part of the law passed in 2010,
and that no annual appropriation was required — even though the administration initia
lly sought one.]

In [35]: final_summary = [word.text for word in summary]

In [36]: summary = ' '.join(final_summary)

In [37]: print(text)

Congressional Republicans have a new fear when it comes to their health care lawsuit
against the Obama administration: They might win. The incoming Trump administration coul
d choose to no longer defend the executive branch against the suit, which challenges the
administration’s authority to spend billions of dollars on health insurance subsidies fo
r and Americans, handing House Republicans a big victory on issues. But a sudden
loss of the disputed subsidies could conceivably cause the health care program to implod
e, leaving millions of people without access to health insurance before Republicans have
prepared a replacement. That could lead to chaos in the insurance market and spur a poli
tical backlash just as Republicans gain full control of the government. To stave off tha
t outcome, Republicans could find themselves in the awkward position of appropriating hu
ge sums to temporarily prop up the Obama health care law, angering conservative voters w
ho have been demanding an end to the law for years. In another twist, Donald J. Trump’s
administration, worried about preserving executive branch prerogatives, could choose to
fight its Republican allies in the House on some central questions in the dispute. Eager
to avoid an ugly political pileup, Republicans on Capitol Hill and the Trump transition
team are gaming out how to handle the lawsuit, which, after the election, has been put i
n limbo until at least late February by the United States Court of Appeals for the Distr
ict of Columbia Circuit. They are not yet ready to divulge their strategy. “Given that t
his pending litigation involves the Obama administration and Congress, it would be inapp
ropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effor
t. “Upon taking office, the Trump administration will evaluate this case and all related
aspects of the Affordable Care Act. ” In a potentially decision in 2015, Judge Rosemar
y M. Collyer ruled that House Republicans had the standing to sue the executive branch o
ver a spending dispute and that the Obama administration had been distributing the healt
h insurance subsidies, in violation of the Constitution, without approval from Congress.
The Justice Department, confident that Judge Collyer’s decision would be reversed, quick
ly appealed, and the subsidies have remained in place during the appeal. In successfully
seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last
month told the court that they “and the ’s transition team currently are discussing pot
ential options for resolution of this matter, to take effect after the ’s inauguration
on Jan. 20, 2017. ” The suspension of the case, House lawyers said, will “provide the
and his future administration time to consider whether to continue prosecuting or to oth
erwise resolve this appeal. ” Republican leadership officials in the House acknowledge t
he possibility of “cascading effects” if the payments, which have totaled an estimated
$13 billion, are suddenly stopped. Insurers that receive the subsidies in exchange for p
aying costs such as deductibles and for eligible consumers could race to drop cover
age since they would be losing money. Over all, the loss of the subsidies could destabil
ize the entire program and cause a lack of confidence that leads other insurers to seek
a quick exit as well. Anticipating that the Trump administration might not be inclined t
o mount a vigorous fight against the House Republicans given the ’s dim view of the hea
lth care law, a team of lawyers this month sought to intervene in the case on behalf of
two participants in the health care program. In their request, the lawyers predicted tha
t a deal between House Republicans and the new administration to dismiss or settle the c
ase “will produce devastating consequences for the individuals who receive these reducti
ons, as well as for the nation’s health insurance and health care systems generally. ” N
o matter what happens, House Republicans say, they want to prevail on two overarching co
ncepts: the congressional power of the purse, and the right of Congress to sue the execu
tive branch if it violates the Constitution regarding that spending power. House Republi
cans contend that Congress never appropriated the money for the subsidies, as required b
y the Constitution. In the suit, which was initially championed by John A. Boehner, the
House speaker at the time, and later in House committee reports, Republicans asserted th
at the administration, desperate for the funding, had required the Treasury Department t
o provide it despite widespread internal skepticism that the spending was proper. The Wh
ite House said that the spending was a permanent part of the law passed in 2010, and tha
t no annual appropriation was required — even though the administration initially sou
ght one. Just as important to House Republicans, Judge Collyer found that Congress had t
he standing to sue the White House on this issue — a ruling that many legal experts s
aid was flawed — and they want that precedent to be set to restore congressional leve
rage over the executive branch. But on spending power and standing, the Trump administra
tion may come under pressure from advocates of presidential authority to fight the House
no matter their shared views on health care, since those precedents could have broad rep
ercussions. It is a complicated set of dynamics illustrating how a quick legal victory f
or the House in the Trump era might come with costs that Republicans never anticipated w
hen they took on the Obama White House.

In [38]: print(summary)

The incoming Trump administration could choose to no longer defend the executive branch
against the suit, which challenges the administration’s authority to spend billions of d
ollars on health insurance subsidies for and Americans, handing House Republicans a
big victory on issues. Anticipating that the Trump administration might not be inclin
ed to mount a vigorous fight against the House Republicans given the ’s dim view of the
health care law, a team of lawyers this month sought to intervene in the case on behalf
of two participants in the health care program. In a potentially decision in 2015, Jud
ge Rosemary M. Collyer ruled that House Republicans had the standing to sue the executiv
e branch over a spending dispute and that the Obama administration had been distributing
the health insurance subsidies, in violation of the Constitution, without approval from
Congress. In their request, the lawyers predicted that a deal between House Republicans
and the new administration to dismiss or settle the case “will produce devastating conse
quences for the individuals who receive these reductions, as well as for the nation’s he
alth insurance and health care systems generally. ” Just as important to House Republica
ns, Judge Collyer found that Congress had the standing to sue the White House on this is
sue — a ruling that many legal experts said was flawed — and they want that preced
ent to be set to restore congressional leverage over the executive branch. But on spendi
ng power and standing, the Trump administration may come under pressure from advocates o
f presidential authority to fight the House no matter their shared views on health care,
since those precedents could have broad repercussions. But a sudden loss of the disputed
subsidies could conceivably cause the health care program to implode, leaving millions o
f people without access to health insurance before Republicans have prepared a replaceme
nt. The White House said that the spending was a permanent part of the law passed in 201
0, and that no annual appropriation was required — even though the administration ini
tially sought one.

In [39]: len(summary)

Out[39]: 2134

LexRank algorithm


In [1]: # splitting the paragraph into sentences
import spacy
nlp = spacy.load('en_core_web_sm')

text = '''WASHINGTON — Congressional Republicans have a new fear when it comes to the
tokens = nlp(text)

for sent in tokens.sents:
    print(sent.text.strip())

WASHINGTON — Congressional Republicans have a new fear when it comes to their heal
th care lawsuit against the Obama administration: They might win.
The incoming Trump administration could choose to no longer defend the executive branch
against the suit, which challenges the administration’s authority to spend billions of d
ollars on health insurance subsidies for and Americans, handing House Republicans a
big victory on issues.
But a sudden loss of the disputed subsidies could conceivably cause the health care prog
ram to implode, leaving millions of people without access to health insurance before Rep
ublicans have prepared a replacement.
That could lead to chaos in the insurance market and spur a political backlash just as R
epublicans gain full control of the government.
To stave off that outcome, Republicans could find themselves in the awkward position of
appropriating huge sums to temporarily prop up the Obama health care law, angering conse
rvative voters who have been demanding an end to the law for years.
In another twist, Donald J. Trump’s administration, worried about preserving executive b
ranch prerogatives, could choose to fight its Republican allies in the House on some cen
tral questions in the dispute.
Eager to avoid an ugly political pileup, Republicans on Capitol Hill and the Trump trans
ition team are gaming out how to handle the lawsuit, which, after the election, has been
put in limbo until at least late February by the United States Court of Appeals for the
District of Columbia Circuit.
They are not yet ready to divulge their strategy.
“Given that this pending litigation involves the Obama administration and Congress, it w
ould be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump tra
nsition effort.
“Upon taking office, the Trump administration will evaluate this case and all related as
pects of the Affordable Care Act. ”
In a potentially decision in 2015, Judge Rosemary M. Collyer ruled that House Republic
ans had the standing to sue the executive branch over a spending dispute and that the Ob
ama administration had been distributing the health insurance subsidies, in violation of
the Constitution, without approval from Congress.
The Justice Department, confident that Judge Collyer’s decision would be reversed, quick
ly appealed, and the subsidies have remained in place during the appeal.
In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House R
epublicans last month told the court that they “and the ’s transition team currently ar
e discussing potential options for resolution of this matter, to take effect after the
’s inauguration on Jan. 20, 2017. ”
The suspension of the case, House lawyers said, will “provide the and his future admin
istration time to consider whether to continue prosecuting or to otherwise resolve this
appeal. ”
Republican leadership officials in the House acknowledge the possibility of “cascading e
ffects” if the payments, which have totaled an estimated $13 billion, are suddenly sto
pped.
Insurers that receive the subsidies in exchange for paying costs such as deductibles
and for eligible consumers could race to drop coverage since they would be losing mone
y.
Over all, the loss of the subsidies could destabilize the entire program and cause a lac
k of confidence that leads other insurers to seek a quick exit as well.
Anticipating that the Trump administration might not be inclined to mount a vigorous fig
ht against the House Republicans given the ’s dim view of the health care law, a team o
f lawyers this month sought to intervene in the case on behalf of two participants in th
e health care program.
In their request, the lawyers predicted that a deal between House Republicans and the ne
w administration to dismiss or settle the case “will produce devastating consequences fo
r the individuals who receive these reductions, as well as for the nation’s health insur
ance and health care systems generally. ”
No matter what happens, House Republicans say, they want to prevail on two overarching c
oncepts: the congressional power of the purse, and the right of Congress to sue the exec
utive branch if it violates the Constitution regarding that spending power.
House Republicans contend that Congress never appropriated the money for the subsidies,
as required by the Constitution.
In the suit, which was initially championed by John A. Boehner, the House speaker at the
time, and later in House committee reports, Republicans asserted that the administratio
n, desperate for the funding, had required the Treasury Department to provide it despite
widespread internal skepticism that the spending was proper.
The White House said that the spending was a permanent part of the law passed in 2010, a
nd that no annual appropriation was required — even though the administration initial
ly sought one.
Just as important to House Republicans, Judge Collyer found that Congress had the standi
ng to sue the White House on this issue — a ruling that many legal experts said was f
lawed — and they want that precedent to be set to restore congressional leverage over
the executive branch.
But on spending power and standing, the Trump administration may come under pressure fro
m advocates of presidential authority to fight the House no matter their shared views on
health care, since those precedents could have broad repercussions.
It is a complicated set of dynamics illustrating how a quick legal victory for the House
in the Trump era might come with costs that Republicans never anticipated when they took
on the Obama White House.

In [2]: from lexrank import LexRank
from lexrank.mappings.stopwords import STOPWORDS
from path import Path

documents = []
documents_dir = Path('C:/Users/hp/Downloads/set3')

# Read every .txt file in the directory; each document is kept as a list of lines
for file_path in documents_dir.files('*.txt'):
    with file_path.open(mode='rt', encoding='utf-8') as fp:
        documents.append(fp.readlines())

# Fit LexRank on the document collection, ignoring English stopwords
lxr = LexRank(documents, stopwords=STOPWORDS['en'])

sentences = ['WASHINGTON — Congressional Republicans have a new fear when it comes to


'The incoming Trump administration could choose to no longer defend the executive branch
'But a sudden loss of the disputed subsidies could conceivably cause the health care pro
'That could lead to chaos in the insurance market and spur a political backlash just as
'To stave off that outcome,' 'Republicans could find themselves in the awkward position
'In another twist, Donald J. Trump’s administration,' 'worried about preserving executiv
'Eager to avoid an ugly political pileup,' 'Republicans on Capitol Hill and the Trump tr
'They are not yet ready to divulge their strategy.',
'“Given that this pending litigation involves the Obama administration and Congress, it
'“Upon taking office,' 'the Trump administration will evaluate this case and all related
'In a potentially decision in 2015,' ' Judge Rosemary M. Collyer ruled that House Repub
'The Justice Department,' 'confident that Judge Collyers decision would be reversed,' '
'In successfully seeking a temporary halt in the proceedings after Mr. Trump won,' 'Hous
'The suspension of the case,' 'House lawyers said, will “provide the and his future adm
'Republican leadership officials in the House acknowledge the possibility of “cascading
'Insurers that receive the subsidies in exchange for paying costs such as deductibles
'Over all, the loss of the subsidies could destabilize the entire program and cause a la
'Anticipating that the Trump administration might not be inclined to mount a vigorous fi
'In their request, the lawyers predicted that a deal between House Republicans and the n
'No matter what happens, House Republicans say,' 'they want to prevail on two overarchin
'House Republicans contend that Congress never appropriated the money for the subsidies,
'In the suit, which was initially championed by John A.', 'Boehner, the House speaker at
'The White House said that the spending was a permanent part of the law passed in 2010,'
'Just as important to House Republicans, Judge Collyer found that Congress had the stand
'But on spending power and standing,' 'the Trump administration may come under pressure
'It is a complicated set of dynamics illustrating how a quick legal victory for the Hous
]

In [3]: # get summary with classical LexRank algorithm
summary = lxr.get_summary(sentences, summary_size=2, threshold=.1)
print(summary)

['It is a complicated set of dynamics illustrating how a quick legal victory for the Hou
se in the Trump era might come with costs that Republicans never anticipated when they t
ook on the Obama White House.', 'In successfully seeking a temporary halt in the proceed
ings after Mr. Trump won,House Republicans last month told the court that they “and the
’s transition team currently are discussing potential options for resolution of this mat
ter,to take effect after the ’s inauguration on Jan. 20, 2017. ”']

In [4]: # get summary with continuous LexRank
summary_cont = lxr.get_summary(sentences, threshold=None)
print(summary_cont)

['It is a complicated set of dynamics illustrating how a quick legal victory for the Hou
se in the Trump era might come with costs that Republicans never anticipated when they t
ook on the Obama White House.']

In [5]: # get LexRank scores for sentences
scores_cont = lxr.rank_sentences(
    sentences,
    threshold=None,
    fast_power_method=False,
)
print(scores_cont)

[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1.]

In [2]: # from lexrank import degree_centrality_scores
import logging

import numpy as np
from scipy.sparse.csgraph import connected_components
from scipy.special import softmax

logger = logging.getLogger(__name__)

def degree_centrality_scores(
    similarity_matrix,
    threshold=None,
    increase_power=True,
):
    if not (
        threshold is None
        or isinstance(threshold, float)
        and 0 <= threshold < 1
    ):
        raise ValueError(
            '\'threshold\' should be a floating-point number '
            'from the interval [0, 1) or None',
        )

    if threshold is None:
        markov_matrix = create_markov_matrix(similarity_matrix)

    else:
        markov_matrix = create_markov_matrix_discrete(
            similarity_matrix,
            threshold,
        )

    scores = stationary_distribution(
        markov_matrix,
        increase_power=increase_power,
        normalized=False,
    )

    return scores

def _power_method(transition_matrix, increase_power=True, max_iter=10000):
    eigenvector = np.ones(len(transition_matrix))

    if len(eigenvector) == 1:
        return eigenvector

    transition = transition_matrix.transpose()

    for _ in range(max_iter):
        eigenvector_next = np.dot(transition, eigenvector)

        if np.allclose(eigenvector_next, eigenvector):
            return eigenvector_next

        eigenvector = eigenvector_next

        if increase_power:
            transition = np.dot(transition, transition)

    logger.warning("Maximum number of iterations for power method exceeded without conve

    return eigenvector_next

def connected_nodes(matrix):
    _, labels = connected_components(matrix)

    groups = []

    for tag in np.unique(labels):
        group = np.where(labels == tag)[0]
        groups.append(group)

    return groups

def create_markov_matrix(weights_matrix):
    n_1, n_2 = weights_matrix.shape
    if n_1 != n_2:
        raise ValueError('\'weights_matrix\' should be square')

    row_sum = weights_matrix.sum(axis=1, keepdims=True)

    # normalize probability distribution differently if we have negative transition valu
    if np.min(weights_matrix) <= 0:
        return softmax(weights_matrix, axis=1)

    return weights_matrix / row_sum

def create_markov_matrix_discrete(weights_matrix, threshold):
    discrete_weights_matrix = np.zeros(weights_matrix.shape)
    ixs = np.where(weights_matrix >= threshold)
    discrete_weights_matrix[ixs] = 1

    return create_markov_matrix(discrete_weights_matrix)

def stationary_distribution(
    transition_matrix,
    increase_power=True,
    normalized=True,
):
    n_1, n_2 = transition_matrix.shape
    if n_1 != n_2:
        raise ValueError('\'transition_matrix\' should be square')

    distribution = np.zeros(n_1)

    grouped_indices = connected_nodes(transition_matrix)

    for group in grouped_indices:
        t_matrix = transition_matrix[np.ix_(group, group)]
        eigenvector = _power_method(t_matrix, increase_power=increase_power)
        distribution[group] = eigenvector

    if normalized:
        distribution /= n_1

    return distribution

In [9]: similarity_matrix = np.array(
    [
        [1.00, 0.17, 0.02, 0.03, 0.00, 0.01, 0.00, 0.17, 0.03, 0.00, 0.00],
        [0.17, 1.00, 0.32, 0.19, 0.02, 0.03, 0.03, 0.04, 0.01, 0.02, 0.01],
        [0.02, 0.32, 1.00, 0.13, 0.02, 0.02, 0.05, 0.05, 0.01, 0.03, 0.02],
        [0.03, 0.19, 0.13, 1.00, 0.05, 0.05, 0.19, 0.06, 0.05, 0.06, 0.03],
        [0.00, 0.02, 0.02, 0.05, 1.00, 0.33, 0.09, 0.05, 0.03, 0.03, 0.06],
        [0.01, 0.03, 0.02, 0.05, 0.33, 1.00, 0.09, 0.04, 0.06, 0.08, 0.04],
        [0.00, 0.03, 0.05, 0.19, 0.09, 0.09, 1.00, 0.05, 0.01, 0.01, 0.01],
        [0.17, 0.04, 0.05, 0.06, 0.05, 0.04, 0.05, 1.00, 0.04, 0.05, 0.04],
        [0.03, 0.01, 0.01, 0.05, 0.03, 0.06, 0.01, 0.04, 1.00, 0.20, 0.24],
        [0.00, 0.02, 0.03, 0.06, 0.03, 0.08, 0.01, 0.05, 0.20, 1.00, 0.10],
        [0.00, 0.01, 0.02, 0.03, 0.06, 0.04, 0.01, 0.04, 0.24, 0.10, 1.00],
    ],
)

# scores calculated with classical LexRank algorithm
degree_centrality_scores(similarity_matrix, threshold=0.1)

Out[9]: array([1.01972014, 1.12818088, 1.01972013, 1.12818088, 0.91125939,
       0.91125939, 0.91125939, 0.91125939, 1.01972014, 1.01972014,
       1.01972014])

In [10]: # scores by continuous LexRank
degree_centrality_scores(similarity_matrix, threshold=None)

Out[10]: array([0.98208279, 1.01726288, 1.0028002 , 1.01441783, 1.00362166,
       1.00906347, 0.98945771, 0.99353531, 1.00263787, 0.99343547,
       0.99168482])

These are the degree-centrality scores computed from the similarity matrix
with classical (threshold=0.1) and continuous (threshold=None) LexRank
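
To make the difference between the two variants concrete, here is a minimal sketch on a hypothetical 3-sentence similarity matrix. It either row-normalises the raw weights (continuous) or first binarises them at a threshold (classical), then runs plain power iteration. The function `toy_lexrank` and the matrix `toy_sim` are illustrative names, not part of the lexrank package, and this simplified version leaves out the softmax fallback and the connected-component handling used in the code above, so its numbers are not comparable with the outputs shown.

```python
import numpy as np

def toy_lexrank(sim, threshold=None, n_iter=100):
    """Simplified LexRank scores (illustrative, not the library's code)."""
    weights = sim.astype(float)
    if threshold is not None:
        # Classical variant: keep only edges at or above the threshold
        weights = (weights >= threshold).astype(float)
    # Each row becomes a probability distribution over neighbouring sentences
    markov = weights / weights.sum(axis=1, keepdims=True)
    # Power iteration: repeatedly apply the transposed transition matrix
    scores = np.full(len(sim), 1.0 / len(sim))
    for _ in range(n_iter):
        scores = markov.T @ scores
    return scores

# Hypothetical similarity matrix for three sentences
toy_sim = np.array([
    [1.0, 0.4, 0.1],
    [0.4, 1.0, 0.3],
    [0.1, 0.3, 1.0],
])

print(toy_lexrank(toy_sim, threshold=0.2))  # classical: edges are 0 or 1
print(toy_lexrank(toy_sim))                 # continuous: edges keep their weight
```

In both cases the middle sentence gets the highest score because it is the one most similar to the other two; the classical variant only sees how many neighbours pass the threshold, while the continuous variant also uses how similar they are.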

ROUGE

The ROUGE metric is used to measure the performance of
automatic summarization and machine translation systems.
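
As a rough illustration of what ROUGE-1 measures, the sketch below computes unigram-overlap recall, precision, and F1 by hand. It assumes lowercase whitespace tokenisation and clipped counts; `rouge_1` is a hypothetical helper written for this example, and the `rouge` package used below may tokenise and count differently, so the values need not match it exactly.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Unigram-overlap ROUGE-1 on whitespace tokens (simplified sketch)."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a word counts only as often as it appears in both texts
    overlap = sum((cand_counts & ref_counts).values())
    precision = overlap / max(1, sum(cand_counts.values()))
    recall = overlap / max(1, sum(ref_counts.values()))
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"r": recall, "p": precision, "f": f1}

# One word out of six differs between the two toy sentences
print(rouge_1("the cat sat on the mat", "the cat lay on the mat"))
```

Recall is the share of reference words recovered by the candidate, precision is the share of candidate words found in the reference, and F1 is their harmonic mean.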
In [1]: from rouge import Rouge

Here the split sentences are passed as "model_out". A copy of the same
sentences, with a few words replaced by words of similar meaning, is passed
as "reference".
In [4]: model_out = ["WASHINGTON — Congressional Republicans have a new fear when it comes to
"The incoming Trump administration could choose to no longer defend the
"But a sudden loss of the disputed subsidies could conceivably cause the
"That could lead to chaos in the insurance market and spur a political b
"To stave off that outcome, Republicans could find themselves in the awk
"In another twist, Donald J. Trump’s administration, worried about prese
"Eager to avoid an ugly political pileup, Republicans on Capitol Hill an
"They are not yet ready to divulge their strategy.",
"“Given that this pending litigation involves the Obama administration a
"“Upon taking office, the Trump administration will evaluate this case a
"In a potentially decision in 2015, Judge Rosemary M. Collyer ruled th
"The Justice Department, confident that Judge Collyer’s decision would b
"In successfully seeking a temporary halt in the proceedings after Mr. T
"The suspension of the case, House lawyers said, will “provide the and
"Republican leadership officials in the House acknowledge the possibilit
"Insurers that receive the subsidies in exchange for paying costs suc
"Over all, the loss of the subsidies could destabilize the entire program
"Anticipating that the Trump administration might not be inclined to mou
"In their request, the lawyers predicted that a deal between House Repub
"No matter what happens, House Republicans say, they want to prevail on
"House Republicans contend that Congress never appropriated the money fo
"In the suit, which was initially championed by John A. Boehner, the Hou
"The White House said that the spending was a permanent part of the law p
"Just as important to House Republicans, Judge Collyer found that Congre
"But on spending power and standing, the Trump administration may come u
"It is a complicated set of dynamics illustrating how a quick legal vict

reference = ["WASHINGTON — Congressional Republicans have a new anxiety when it comes to


"The incoming Trump administration could choose to no longer defend the
"But a sudden loss of the disputed subsidies could conceivably cause the
"That could lead to chaos in the insurance market and spur a political c
"To stave off that outcome, Republicans could find themselves in the awk
"In another twist, Donald J. Trump’s administration, worried about prese
"Eager to avoid an ugly political crash, Republicans on Capitol Hill and
"They are not yet ready to divulge their strategy.",
"“Given that this pending action involves the Obama administration and C
"“Upon taking office, the Trump administration will evaluate this case a
"In a potentially decision in 2015, Judge Rosemary M. Collyer ruled th
"The Justice Department, confident that Judge Collyer’s decision would b
"In successfully seeking a temporary block in the proceedings after Mr.
"The suspension of the case, House lawyers said, will “provide the and
"Republican leadership officials in the House acknowledge the possibilit
"Insurers that receive the subsidies in exchange for paying costs suc
"Over all, the loss of the subsidies could destabilize the entire program
"Anticipating that the Trump administration might not be inclined to mou
"In their request, the lawyers predicted that a deal between House Repub
"No matter what happens, House Republicans say, they want to prevail on
"House Republicans contend that Congress never appropriated the money fo
"In the suit, which was initially championed by John A. Boehner, the Hou
"The White House said that the spending was a permanent part of the law p
"Just as important to House Republicans, Judge Collyer found that Congre
"But on spending power and standing, the Trump administration may come u
"It is a hard set of dynamics illustrating how a quick legal victory for

In [5]: rouge = Rouge()
rouge.get_scores(model_out, reference, avg=True)

Out[5]: {'rouge-1': {'r': 0.989787609045332,
  'p': 0.989787609045332,
  'f': 0.989787604045332},
 'rouge-2': {'r': 0.9802638215589187,
  'p': 0.9802638215589187,
  'f': 0.9802638165589185},
 'rouge-l': {'r': 0.989787609045332,
  'p': 0.989787609045332,
  'f': 0.989787604045332}}

# For each ROUGE variant we receive the F1 score (f), precision (p), and recall (r)
