
Text Generation

Module 4
Text Generation
• In data-to-text generation, the input ranges from structured records, such as the data underlying a weather forecast, to unstructured perceptual data, such as a raw image or video.
• The output may be a single sentence, such as an image caption, or a multi-paragraph argument.
Process involved in Text Generation

Content Selection:
determining what parts of the data to describe.
Text Planning:
• planning a presentation of this information.
• lexicalizing the data into words and phrases.
• organizing words and phrases into well-formed sentences and paragraphs.
Example:
[Figure: Text Planning and Generation Example (slide graphic not reproduced)]
Process involved in Text Generation

Surface Realization:
It can be performed by grammars or templates, which link specific types of data to candidate words and phrases.
Example: a minimal template-based sketch is shown below.
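Below is a minimal sketch of template-based surface realization. The hypothetical GAME_RESULT and WEATHER record types, their fields, and the template wording are illustrative assumptions, not taken from any specific system.

```python
# A minimal sketch of template-based surface realization: each record type
# is linked to a candidate phrase, and fields fill its slots.

TEMPLATES = {
    "GAME_RESULT": "The {winner} defeated the {loser} {score}.",
    "WEATHER": "Expect {condition} with a high of {high} degrees.",
}

def realize(record_type: str, record: dict) -> str:
    """Fill the template that the record type is linked to."""
    return TEMPLATES[record_type].format(**record)

print(realize("GAME_RESULT",
              {"winner": "Knicks", "loser": "Nets", "score": "110-98"}))
# -> The Knicks defeated the Nets 110-98.
```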
Challenges in Text Generation
• For more complex cases, it may be necessary to apply morphological inflections such as pluralization and tense marking; a toy sketch follows this list.
• Languages such as Russian would require case-marking suffixes for the team names.
• Another difficult challenge for surface realization is the generation of varied referring expressions (e.g., The Knicks, New York, they), which is critical to avoid repetition.
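As a toy illustration of why inflection goes beyond string templates, here is a sketch of rule-based English pluralization; the rules are deliberately incomplete stand-ins, not a real morphological analyzer.

```python
# Toy rule-based pluralization: surface realization needs morphological
# rules, not just slot filling. These rules are far from complete.

def pluralize(noun: str) -> str:
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"                       # match -> matches
    if len(noun) > 1 and noun.endswith("y") and noun[-2] not in "aeiou":
        return noun[:-1] + "ies"                 # victory -> victories
    return noun + "s"                            # goal -> goals

print(pluralize("match"), pluralize("victory"), pluralize("goal"))
```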
NITROGEN Model
• It is a combination of rule-based and statistical techniques, proposed by Langkilde and Knight (1998).
• The input to NITROGEN is an abstract meaning representation of the semantic content to be expressed in a single sentence.
• In data-to-text scenarios, the abstract meaning representation is the output of a higher-level text planning stage.
• A set of rules then converts the abstract meaning representation into various sentence plans, which may differ in both high-level structure (e.g., active versus passive voice) and low-level details (e.g., word and phrase choice); see the sketch below.
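The sketch below illustrates the overgenerate-and-rank idea in miniature: hand-written rules expand a meaning representation into candidate sentence plans, and a statistical score picks among them. The meaning representation, the two rules, and the toy scorer are all illustrative assumptions; NITROGEN itself ranked a lattice of candidates with a word-bigram language model.

```python
# Toy overgenerate-and-rank: rules produce candidate sentence plans, and a
# stand-in "language model" score selects the best surface string.

def cap(s: str) -> str:
    return s[0].upper() + s[1:]

mr = {"pred": "defeat", "agent": "the Knicks", "patient": "the Nets"}

def expand(mr: dict) -> list:
    """Rule-based expansion into active- and passive-voice plans."""
    return [
        cap(f"{mr['agent']} defeated {mr['patient']}."),
        cap(f"{mr['patient']} were defeated by {mr['agent']}."),
    ]

def lm_score(sentence: str) -> float:
    """Stand-in for a bigram language-model score (toy: fewer words wins)."""
    return -len(sentence.split())

print(max(expand(mr), key=lm_score))  # -> The Knicks defeated the Nets.
```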
NITROGEN Model
Example: [slide figure not reproduced]
Neural network based Text Generation
• In neural machine translation, the attention mechanism links words in the source to words in the target.
• In data-to-text generation, the attention mechanism can likewise link each part of the generated text back to a record in the data.
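A minimal sketch of that linking, with random vectors standing in for learned record encodings and a decoder hidden state:

```python
# Dot-product attention from a decoder state over encoded records: each
# generated word can be traced back to the record it attends to most.
# All vectors here are random stand-ins for learned representations.

import numpy as np

rng = np.random.default_rng(0)
records = rng.normal(size=(3, 8))    # one encoded vector per record
decoder_state = rng.normal(size=8)   # hidden state while generating a word

scores = records @ decoder_state               # alignment scores
attn = np.exp(scores) / np.exp(scores).sum()   # softmax over records
print("attends most to record", int(attn.argmax()), attn.round(3))
```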
Neural network based Text Generation
Data Encoders:
• In some types of structured records, all values are drawn from discrete sets.
• For example, the birthplace of an individual is drawn from a discrete set of possible locations, and the diagnosis and treatment of a patient are drawn from an exhaustive list of clinical codes.
• In such cases, vector embeddings can be estimated for each field and possible value: for example, a vector embedding for the field BIRTHPLACE, and another for the value BERKELEY CALIFORNIA.
• The table of such embeddings serves as the encoding of a structured record.
• It is also possible to compress the entire table into a single vector representation by pooling across the embeddings of each field and value.
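The sketch below makes this concrete; the fields, values, and dimensions are illustrative assumptions, and random vectors stand in for learned embeddings.

```python
# Encoding a structured record with per-field and per-value embeddings,
# then pooling the table into a single vector.

import numpy as np

rng = np.random.default_rng(0)
DIM = 8
field_emb = {f: rng.normal(size=DIM) for f in ["BIRTHPLACE", "OCCUPATION"]}
value_emb = {v: rng.normal(size=DIM)
             for v in ["BERKELEY_CALIFORNIA", "COMPUTER_SCIENTIST"]}

record = {"BIRTHPLACE": "BERKELEY_CALIFORNIA",
          "OCCUPATION": "COMPUTER_SCIENTIST"}

# The table of (field, value) embeddings encodes the record ...
table = np.stack([np.concatenate([field_emb[f], value_emb[v]])
                  for f, v in record.items()])   # shape (2, 16)

# ... and pooling across rows compresses it into a single vector.
pooled = table.mean(axis=0)                      # shape (16,)
print(table.shape, pooled.shape)
```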
Neural network based Text Generation
Sequences:
• Some types of structured records have a natural ordering, such as events in a game (Chen and Mooney, 2008) and steps in a recipe (Tutin and Kittredge, 1992).
• For example, the following records describe a sequence of events in a robot soccer match:
PASS(arg1 = PURPLE6, arg2 = PURPLE3)
KICK(arg1 = PURPLE3)
BADPASS(arg1 = PURPLE3, arg2 = PINK9)
• Each event is a single record, and can be encoded by a concatenation of vector representations for the event type (e.g., PASS), the field (e.g., arg1), and the values (e.g., PURPLE3), as sketched below.
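A sketch of that encoding, with random vectors standing in for learned embeddings of event types, fields, and values:

```python
# Encode one event record as a concatenation of embeddings for its event
# type, fields, and values. Symbols and dimensions are illustrative.

import numpy as np

rng = np.random.default_rng(1)
DIM = 4
embed = {sym: rng.normal(size=DIM)
         for sym in ["PASS", "KICK", "BADPASS", "arg1", "arg2",
                     "PURPLE6", "PURPLE3", "PINK9"]}

def encode_event(event_type: str, args: dict) -> np.ndarray:
    parts = [embed[event_type]]
    for field, value in args.items():
        parts += [embed[field], embed[value]]
    return np.concatenate(parts)

seq = [encode_event("PASS", {"arg1": "PURPLE6", "arg2": "PURPLE3"}),
       encode_event("KICK", {"arg1": "PURPLE3"}),
       encode_event("BADPASS", {"arg1": "PURPLE3", "arg2": "PINK9"})]
print([e.shape for e in seq])  # events differ in arity, so lengths differ
```

In practice the per-event vectors would be padded or projected to a common size before being fed to a sequence encoder.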
Neural network based Text Generation
• Another flavor of data-to-text generation is the generation of text captions for images.
• Images are naturally represented as tensors: a color image of 320 × 240 pixels would be stored as a tensor with 320 × 240 × 3 intensity values.
• The dominant approach to image classification is to encode images as vectors using a combination of convolution and pooling, as sketched below.
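The sketch below shows one convolution-plus-pooling step in plain numpy; the single 3 × 3 filter bank, ReLU, and global average pool are illustrative stand-ins for a deep pretrained CNN encoder.

```python
# Encode an image tensor into a single vector via convolution and pooling.
# The image is a small random stand-in for a 240 x 320 x 3 color image.

import numpy as np

rng = np.random.default_rng(2)
image = rng.random((24, 32, 3))           # H x W x 3 intensity values
filters = rng.normal(size=(8, 3, 3, 3))   # 8 filters over 3x3x3 patches

def conv_and_pool(img: np.ndarray, f: np.ndarray) -> np.ndarray:
    """Convolve, apply ReLU, then global average-pool to one vector."""
    H, W, _ = img.shape
    n_filt, kh, kw, _ = f.shape
    out = np.empty((H - kh + 1, W - kw + 1, n_filt))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = img[i:i + kh, j:j + kw, :]
            out[i, j] = np.maximum(0.0, np.tensordot(f, patch, axes=3))
    return out.mean(axis=(0, 1))          # pooling -> one vector per image

print(conv_and_pool(image, filters).shape)  # (8,)
```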
Example of Text Generation from Image: [slide figure not reproduced]
Neural network based Text Generation
Attention model:
• In coarse-to-fine attention, each record receives a global attention a_r ∈ [0, 1], which is independent of the decoder state.
• This global attention, which represents the overall importance of the record, is multiplied with the decoder-based attention scores before computing the final normalized attentions.
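A sketch of the coarse-to-fine combination with made-up scores; in a real model the global attention would come from a learned, decoder-independent scoring of each record.

```python
# Coarse-to-fine attention: a decoder-independent global score per record
# gates the decoder-based attention scores before normalization.

import numpy as np

def coarse_to_fine(decoder_scores: np.ndarray, global_attn: np.ndarray):
    """decoder_scores: unnormalized, from the current decoder state;
    global_attn: values in [0, 1], independent of the decoder."""
    weighted = np.exp(decoder_scores) * global_attn  # gate each record
    return weighted / weighted.sum()        # final normalized attention

decoder_scores = np.array([2.0, 0.5, 1.0])  # per-record decoder scores
global_attn = np.array([0.9, 0.1, 0.8])     # overall record importance
print(coarse_to_fine(decoder_scores, global_attn))
```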
Structured attention model:
• Structured attention vectors can be computed by running the forward-backward algorithm to obtain marginal attention probabilities, as sketched below.
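The sketch below treats attention as a binary selection variable per record in a linear chain and runs forward-backward to get the marginal attention on each record; the unary and pairwise potentials are random stand-ins for learned scores.

```python
# Structured attention over a chain of records: forward-backward yields
# marginal probabilities p(z_i = 1) of attending to each record.

import numpy as np

def forward_backward(unary: np.ndarray, pairwise: np.ndarray) -> np.ndarray:
    """unary: (n, 2) scores for z_i in {0, 1}; pairwise: (2, 2) shared
    transition scores. Returns the (n,) marginals p(z_i = 1)."""
    n = unary.shape[0]
    U, T = np.exp(unary), np.exp(pairwise)
    alpha = np.zeros((n, 2))
    beta = np.zeros((n, 2))
    alpha[0] = U[0]
    for i in range(1, n):                 # forward pass
        alpha[i] = U[i] * (alpha[i - 1] @ T)
    beta[-1] = 1.0
    for i in range(n - 2, -1, -1):        # backward pass
        beta[i] = T @ (U[i + 1] * beta[i + 1])
    marg = alpha * beta
    marg /= marg.sum(axis=1, keepdims=True)
    return marg[:, 1]

rng = np.random.default_rng(3)
print(forward_backward(rng.normal(size=(5, 2)), rng.normal(size=(2, 2))))
```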
