
Get to The Point Presentation

Professor : Mehmet Can Yavuz


Student : Konuralp Dalkılınç - 23COMP5004
Isik University Faculty of Engineering
and Natural Sciences

Get To The Point: Summarization
with Pointer-Generator Networks
Improving Abstractive Summarization through Hybrid Copying and Coverage

Problems:

➢ Repetition
➢ Factual inaccuracy
➢ Out-of-vocabulary (OOV) words

Proposed Solution: a hybrid Pointer-Generator Network with a Coverage Mechanism

Result: outperforms the current state-of-the-art abstractive models

Abigail See (Stanford University), Peter J. Liu (Google Brain), Christopher D. Manning (Stanford University)
[email protected], [email protected], [email protected]
Summarization

Abstractive
➢ Generating words from scratch
+ Creates more human-like summaries
- Possibility of inaccurate factual details and repetitive text
- Harder to train and fine-tune

Extractive
➢ Copying words from the input text
+ Easier to implement than abstractive
- Less human-like summaries

What This Paper Proposes
Summarization = seq2seq with attention + pointer generation + coverage mechanism

The combined model is capable of:

➢ Generating human-like, factually accurate summaries by seamlessly integrating new word generation with
direct word extraction from the input text
➢ Handling the out-of-vocabulary (OOV) problem
➢ Reducing repetitive text generation

Basic Definitions
Tokenization:

➢ Tokens are the smallest units in natural language processing; they can be words, subwords, characters, or sentences

Word Embedding:

➢ Word embeddings are numerical vector representations of words, sentences, and documents that capture their
semantic essence. Similar embeddings cluster together, reflecting their related context or meaning

Bi-directional long short-term memory (LSTM):

➢ LSTM is a type of recurrent neural network (RNN) designed to better capture long-term
dependencies in sequential data. A Bi-LSTM processes sequence data in both
forward and reverse directions by combining two LSTMs

** UNK : unknown token
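
As a minimal sketch of these definitions (PyTorch, with a toy vocabulary, toy dimensions, and whitespace tokenization chosen only for illustration):

import torch
import torch.nn as nn

# Toy whitespace tokenization; any word outside the vocabulary maps to the UNK token.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}
tokens = "the cat sat on the mat".split()
ids = torch.tensor([[vocab.get(t, vocab["<unk>"]) for t in tokens]])   # shape (1, seq_len)

# Word embedding: each token id becomes a dense vector; similar words end up close together after training.
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=32)
embedded = embedding(ids)                                              # (1, seq_len, 32)

# Bi-directional LSTM: reads the sequence forward and backward and
# concatenates both directions into one hidden state per token.
bilstm = nn.LSTM(input_size=32, hidden_size=64, bidirectional=True, batch_first=True)
hidden_states, _ = bilstm(embedded)                                    # (1, seq_len, 128)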

Seq2Seq with Attention
➢ An encoder-decoder architecture using bi-LSTM
➢ Encoder processes input tokens into hidden states
➢ Decoder generates summary token-by-token

Role of attention:

➢ By computing attention weights at each time step t, it helps the decoder focus on the relevant parts of the input

Limitations:

➢ Struggles with out-of-vocabulary (OOV) words
➢ Factual inaccuracies
➢ Repetition of text
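
A rough sketch of the additive attention used in such a seq2seq model, assuming the encoder hidden states and the decoder state already exist (layer names and sizes below are illustrative, not the paper's exact implementation):

import torch
import torch.nn as nn
import torch.nn.functional as F

enc_dim, dec_dim, attn_dim, seq_len = 128, 64, 64, 6
encoder_states = torch.randn(1, seq_len, enc_dim)    # h_i from the Bi-LSTM encoder
decoder_state = torch.randn(1, dec_dim)               # s_t at decoder step t

W_h = nn.Linear(enc_dim, attn_dim, bias=False)
W_s = nn.Linear(dec_dim, attn_dim, bias=True)
v = nn.Linear(attn_dim, 1, bias=False)

# Score every input token: e_i = v^T tanh(W_h h_i + W_s s_t + b), then normalize.
scores = v(torch.tanh(W_h(encoder_states) + W_s(decoder_state).unsqueeze(1)))   # (1, seq_len, 1)
attention = F.softmax(scores.squeeze(-1), dim=-1)    # attention distribution over the input tokens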

Encoders and Decoders
Encoders
➢ Process the input tokens one by one
➢ Convert each word into a vector representation using a Bi-LSTM
➢ These vectors are called hidden states and encode the words' semantic meaning
Context Vector
➢ A weighted combination of encoder hidden states
➢ Created by applying attention over the encoder outputs
➢ Tells the decoder which part of the input to focus on
Attention distribution
➢ A set of scores showing how much focus to put on each word in the input
Vocab distribution
➢ Shows the probability of every word in the vocabulary being the next word
➢ The word with the highest probability is selected if the model decides to generate
Decoders
➢ Generate the summary one word at a time
➢ Use the previous word, the context vector, and the past hidden state to predict the next word
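
Continuing the previous sketch, a hypothetical next step builds the context vector and the vocabulary distribution from the decoder state (shapes and layer names are assumptions for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, enc_dim, dec_dim, seq_len = 50000, 128, 64, 6
encoder_states = torch.randn(1, seq_len, enc_dim)
decoder_state = torch.randn(1, dec_dim)
attention = F.softmax(torch.randn(1, seq_len), dim=-1)   # stands in for the distribution computed above

# Context vector: attention-weighted sum of the encoder hidden states.
context = torch.bmm(attention.unsqueeze(1), encoder_states).squeeze(1)   # (1, enc_dim)

# Vocabulary distribution: decoder state + context vector -> softmax over the whole vocabulary.
out_proj = nn.Linear(dec_dim + enc_dim, vocab_size)
vocab_dist = F.softmax(out_proj(torch.cat([decoder_state, context], dim=-1)), dim=-1)

next_word_id = vocab_dist.argmax(dim=-1)   # greedy pick, if the model decides to generate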

Seq2Seq (architecture diagram)
Pointer Generator Network
A Hybrid Approach:

➢ At each step, the model decides whether to generate a new word or to copy one directly from the input text

Role of the pointer-generator network:

➢ Like extractive summarization, it copies words from the input text
➢ Like abstractive summarization, it generates new words from the vocabulary

This solves:

➢ The out-of-vocabulary (OOV) problem
➢ Factual inaccuracies
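
A minimal sketch of the copy/generate mixture (the scatter step assumes the source tokens have already been mapped to ids in an extended vocabulary; this is an illustration, not the authors' code):

import torch
import torch.nn as nn

# Assumed inputs, reusing the shapes from the earlier sketches.
context = torch.randn(1, 128)            # context vector
decoder_state = torch.randn(1, 64)       # decoder state s_t
decoder_input = torch.randn(1, 32)       # embedding of the previous word x_t
attention = torch.softmax(torch.randn(1, 6), dim=-1)     # over 6 source tokens
vocab_dist = torch.softmax(torch.randn(1, 50), dim=-1)   # tiny vocabulary for illustration
source_ids = torch.tensor([[4, 7, 2, 49, 11, 7]])        # source tokens as extended-vocab ids

# p_gen = sigmoid(W [context; s_t; x_t]): probability of generating vs. copying at this step.
p_gen_layer = nn.Linear(128 + 64 + 32, 1)
p_gen = torch.sigmoid(p_gen_layer(torch.cat([context, decoder_state, decoder_input], dim=-1)))

# Final distribution: p_gen * vocabulary distribution + (1 - p_gen) * copy (attention) distribution.
# Out-of-vocabulary source words get probability mass purely through the copy term.
final_dist = (p_gen * vocab_dist).scatter_add(1, source_ids, (1 - p_gen) * attention)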

Pointer Generator Network (model diagram)
Coverage Mechanism
➢ A coverage vector tracks how much attention each input word has received up to the current time step
➢ The coverage vector guides attention so the model does not keep attending to the same words over and over

Without coverage, the model tends to:

➢ Repeat the same phrases
➢ Lose track of what has already been summarized

Coverage helps by:

➢ Reducing repetition
➢ Keeping track of what has already been covered
➢ Improving ROUGE & METEOR test performance
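
A small sketch of the bookkeeping behind coverage: the coverage vector is the running sum of past attention distributions, and a coverage loss penalizes re-attending to already-covered tokens (the random attention here only illustrates the update; in the real model attention is also conditioned on the coverage vector):

import torch

seq_len, steps = 6, 4
coverage = torch.zeros(1, seq_len)    # c^t: attention received so far by each source token, starts at zero
coverage_loss = torch.tensor(0.0)

for t in range(steps):
    attention = torch.softmax(torch.randn(1, seq_len), dim=-1)   # stand-in for the step-t attention

    # covloss_t = sum_i min(a_i^t, c_i^t): large when the model re-attends to covered tokens.
    coverage_loss = coverage_loss + torch.minimum(attention, coverage).sum()

    coverage = coverage + attention   # update the running coverage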

Results
➢ This graph shows how the coverage mechanism affects repetition. Without it, the model keeps repeating
words and phrases; with coverage, repetition decreases and the summaries become much more fluent

** 1-grams, 2-grams, sentences = single words, two-word phrases, full sentences

Results
➢ This graph shows how much of each generated summary is novel (new) rather than copied. The
pointer-generator model creates more unique phrases and sentences than the older baseline

** 1-grams, 2-grams, sentences = single words, two-word phrases, full sentences

Results

➢ ROUGE-1 (basic content overlap), ROUGE-2 (fluency and phrase quality), and ROUGE-L (structural and
grammatical accuracy) evaluate word overlap and structure
➢ METEOR captures more human-like properties such as meaning and grammar
➢ The model performs better, as the numbers also show
➢ Some baseline models used an anonymized version of the data, marked with * (e.g. "@entity3 gave a speech")
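
As a concrete illustration of what this family of metrics measures, a toy ROUGE-1 computation over unigrams (real evaluation uses the official ROUGE toolkit with stemming and proper tokenization; this is only a sketch):

from collections import Counter

def rouge_1_f1(candidate: str, reference: str) -> float:
    """Toy ROUGE-1 F1: unigram overlap between a candidate summary and a reference."""
    cand, ref = Counter(candidate.lower().split()), Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge_1_f1("the model reduces repetition", "the model greatly reduces repetition"))   # ~0.89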

