2012 12th IEEE International Conference on Advanced Learning Technologies

Using a Text Mining Tool to Support Text Summarization

Eliseo Reategui, Miriam Klemann, Mateus David Finco


PPGIE/PPGEDU, Federal University of Rio Grande do Sul (UFRGS)

Abstract: This paper presents a mining tool that is able to extract graphs from texts, and proposes their use in helping students to write summaries. The text summarization method is based on the use of the graphs as graphic organizers, leading students to reflect further on the main ideas of the text before getting to the actual task of writing. An experiment demonstrated that the tool helped students reflect on the main ideas of the text and supported the writing of the summaries.

Keywords: text mining, text summarization, graphs, writing

I. INTRODUCTION

The use of graphic organizers and other prewriting activities has proven to be an effective aid for writing, enabling learners to segment the topic they have to consider, and helping them to structure their writing [1][2]. Strangman et al. [3] define a graphic organizer as an illustration that depicts the relationships between facts and terms, providing an opportunity to organize thoughts. Based on this idea of graphic organizers, we propose in this paper the use of a particular tool which extracts graphs automatically from texts, as a way to support summarization, a particular type of writing task. The tool, implemented according to a particular text mining technique [4], has already been used in different educational applications, such as the analysis of discussion forums [5].

The next section presents a brief overview of graphic organizers. Section III presents the text mining tool proposed here, and Section IV describes an experiment carried out to validate this research. The last section of the paper presents conclusions and directions for future work.

II. GRAPHIC ORGANIZERS FOR READING AND WRITING

Graphic organizers have been applied across a large range of subject areas, demonstrating their benefits in different activities such as mapping cause and effect, note taking, comparing and contrasting concepts, and relating information to main ideas or themes [3]. Despite the differences in visual representation among graphic organizer models, they frequently share an underlying principle, which is the conversion of linear textual statements into nonlinear graphic presentations [6].

Regarding the use of graphic organizers to support writing, Ruddell [7] stresses the importance of providing tools that allow students to illustrate their constructions and organization of knowledge, enabling them to express visually which ideas are the most meaningful, and how these ideas are connected. Capretz et al. [8] showed that visualizing information graphically can improve students’ organization skills during the writing process. As for text summarization, the use of graphic organizers (in particular concept maps) has been shown to be an effective method closely related to text comprehension [6]. The authors attribute this to the fact that concept mapping emphasizes the selection of major ideas, connecting and organizing these concepts with links, then using this information for writing. As noted by Brown and Day [9], there is an intersection between this sequence of tasks and the process of text summarization.

The method proposed in this paper is based on the use of a particular type of visual representation to help students in summarization tasks. The next section presents the text mining tool Sobek, as well as the method for its use in text summarization tasks.

III. THE TEXT MINING TOOL

The text mining tool Sobek was developed using a particular mining algorithm based on the n-simple distance graph model, in which nodes represent the main terms found in the text, and the edges represent adjacency information [4]. The method used here relies on a parameter n to extract compound concepts with more than one word. According to this parameter, we create a combination of the current word with the n subsequent words. Fig. 1 shows the interface of the mining tool, in which a graph was extracted from a short text about Realism. For complete details about the text mining algorithms implemented here, please refer to [10].

Figure 1. Graph extracted from text about Realism

978-0-7695-4702-2/12 $26.00 © 2012 IEEE


DOI 10.1109/ICALT.2012.51
Authorized licensed use limited to: St. Aloysius College Mangalore (AIMIT). Downloaded on May 30,2023 at 10:04:34 UTC from IEEE Xplore. Restrictions apply.
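As a rough illustration of the graph model described in Section III, the following Python sketch builds a simplified n-simple distance graph: nodes are terms, and a directed edge links a term to each term that follows it within n positions. It is not Sobek's implementation (see [10] for that). The stopword list, the frequency-based choice of nodes, the `max_terms` parameter and all function names are assumptions made for this example only.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "was",
             "by", "that", "it", "as", "on", "for", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def n_simple_distance_graph(text, n=2, max_terms=10):
    """Return (nodes, edges) for a simplified n-simple distance graph.

    nodes: the max_terms most frequent terms (an assumed relevance filter).
    edges: directed pairs (a, b) such that b occurs within n tokens after a.
    Distances are measured after stopword removal, and sentence
    boundaries are ignored to keep the sketch short.
    """
    tokens = tokenize(text)
    counts = Counter(tokens)
    nodes = {term for term, _ in counts.most_common(max_terms)}
    edges = set()
    for i, current in enumerate(tokens):
        if current not in nodes:
            continue
        # Combine the current word with the n subsequent words.
        for follower in tokens[i + 1 : i + 1 + n]:
            if follower in nodes and follower != current:
                edges.add((current, follower))
    return nodes, edges


sample = "Realism in literature opposed romanticism. Realism in theater followed."
nodes, edges = n_simple_distance_graph(sample, n=1)
print(sorted(nodes))
print(sorted(edges))
```

With n = 1 only directly adjacent terms are linked. Larger values of n also connect terms separated by other words, which gives a denser graph for the student to review and edit.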
The method proposed here to assist students in text summarization using the mining tool is presented below.

A. The Summarization Method

Summary writing techniques either follow a more intuitive approach without step-by-step instruction, or follow a rule-governed approach which may focus on tasks such as identifying macro-level ideas, deleting unnecessary or redundant information, and identifying or producing topic sentences [11]. Here, the method proposed is based on a different approach, where the student interacts with the text mining tool in order to grasp the main ideas of the text and to build a visual representation in which these ideas are expressed. Only afterwards does the student move on to the actual writing of the summary. Ellis [12] states that in a writing activity, most of the time spent is dedicated to planning. Aligned with this theory, Hayes and Flower [13] divide the writing process into three stages: pre-writing, writing and re-writing. The use of the software Sobek, as proposed here, focuses on the first two steps of this process.

Pre-writing:
(a) The student reads a text to be summarized. In this step the student learns about the topic he/she has to write about and identifies macro-level ideas.
(b) After reading the text, the student uses Sobek to automatically extract a graph with relevant terms and relationships from the text. The graph is then used as a first draft of a graphic organizer to help the student organize his/her ideas.
(c) The student reviews the terms and relationships identified by the tool, editing the graph according to what he/she believes to be appropriate. This is an important step, as it leads the student to reflect on the text and reread it, leading to a deeper understanding of the text.

Writing:
(d) Using the edited graph as a graphic organizer, the student starts the actual writing of the summary. From time to time during the writing process, the student compares the graph with the original text, to make sure that the summary is faithful to the ideas of the text.
(e) The cross-checking that happens in this phase makes the writing process a cycle, which may involve previous steps in the process, including the re-reading of the text and the re-editing of the graphs extracted by the mining tool.

The rewriting step, placed by Hayes and Flower [13] as the last phase in the writing process, is seen here as a subsequent phase in which the main goal is the revision of the text already structured and written. In this sense, the tool may operate as a support for the logical organization of information, a process which relates reading and writing as steps of the same cognitive process [14].

IV. EVALUATION AND RESULTS

A study with 20 high school students between 15 and 18 years old was conducted in order to validate the use of the text mining tool as a support in a summarization task. The experiment was carried out in a computer laboratory with 20 computers, one for each student. The students were observed and all their interaction with the computer was recorded with screen capture software.

The students had to work on a particular summarization task, according to the steps detailed in the previous section. The students started by reading a short text about "Realism" (complete text, in Portuguese, available at http://www.artesbr.hpg.ig.com.br/Educacao/11/interna_hpg10.html). Then the students used Sobek to extract a graph representing relevant terms and relationships from the text (Fig. 1). The graphs were then edited by the students, so as to correct inconsistencies and make adjustments according to their understanding of the text. The goal here was to produce a graphic organizer that could assist them in organizing their ideas about the summary they had to write.

As a first validation step, the summaries produced by the students were analysed to verify whether the terms of the graph were in fact present in the students’ writings. The results showed that each student used, on average, 61.6% of the total number of terms of the original graph, composed of 16 terms. This means that the students had to make changes in their graphs to adjust them to their understanding of the text, which is a positive finding, considering that such an action is the result of a reflection about the accuracy of the terms and relationships represented.

Allowing the students to modify the graphs to bring them closer to their understanding of the text is similar to the approach proposed by Chang et al. [6], where a map-correction strategy was used. In their method, the students used a concept map provided by an expert in which many of the nodes and relationships were incorrect, with the goal of letting the learners identify the problems and correct them. Here, however, the graphs were not provided by an expert, but by the text mining tool. A closer analysis of the students’ use of the terms extracted in the graphs (Table I) showed that all of the terms appear in the students’ summaries.

TABLE I. TOTAL OCCURRENCE OF TERMS IN STUDENTS’ SUMMARIES

Terms         # occurrences    Terms         # occurrences
Realism       100              romance       18
Literature    42               playwright    15
author        34               emphasize     12
Theater       34               social        10
naturalism    24               Russian       9
romanticism   24               France        5
screen        23               write         5
theme         23               -             -

Secondly, the screen recordings obtained from monitoring the students' interaction with Sobek were analysed. Two important pieces of evidence were identified in the recordings. Concerning the understanding of the text, it was clear that after viewing the graphs produced by Sobek, the students always went back to the text to reread it. Such behavior implies that the students began by asking themselves whether a certain term and/or relationship represented in the graph was in fact accurate. This reflection led the students to look for answers in the text, leading to a better understanding and improving accuracy, as discussed in [15].

As for the use of the graphs in the production of the summaries, the recordings demonstrated that learners went back

and forth to their graphs several times while producing their summaries, which is further evidence that the students referred to their graphs while writing.

An evaluation of the texts produced by the students was also made, considering different criteria detailed in [16]. The average grades obtained, on a scale from 0 to 10, were: 8.2 for main theme identification, 6.7 for cohesion, 7.0 for coherence and 5.5 for text organization. Apart from the grades for text organization, all the others were considered good, particularly the average grade for main theme identification. It is likely that the accurate identification of the main theme of the text by the mining tool helped the students to focus on it. Some of the students’ testimonials reinforced this idea: “... based on the graph I identified what was important in the text...” (student 3); “... I realized that the words selected by the graph were important, relevant...” (student 8).

The fact that the grades for text organization were not so satisfactory may be due to the lack of structure in the order for reading/interpreting the graphs. We are currently investigating other forms of visual representation so as to make the graphs more linear.

The students’ testimonials also showed how they employed the graphs in their writing: “... I used the graph many times - I had a look at it whenever I did not want to get lost in the text and I wanted things to make sense...” (student 8); “... I needed the graph to know what were the important parts that had to be included in the text...” (student 9).

The testimonials of the teacher who worked with the students in text production confirmed that the methodology for summary writing using the mining tool was very productive. The teacher stated that, compared to other writing activities, the students got good marks and showed a good level of engagement in the task proposed.

V. CONCLUSION

This paper presented a text mining tool and proposed a methodology for using it as a support in summary writing. Results of a study with 20 students demonstrated that the tool was able to produce graphs that were close to what was considered to be important about a text read by the students, but not so perfect as to leave them no room to express their own ideas about the most relevant information. The use of Sobek supported the structuring of the students’ writing process, a result that is aligned with findings presented by other research [11].

In Villalón et al. [17], the authors present a text editor that attempts to trigger students’ reflection by mining their texts. Based on these results, the system proposes questions about the texts’ contents, structure, and coherence. Although our goal here is similar in that we also employ text mining to lead students to reflect further on their writing, our summarization strategy is different, as it is based on the mining of the reading material provided to students, and not on the mining of the students’ own texts.

The observation of how students use the graphs should also give us further insight into possible applications of text mining in educational settings.

ACKNOWLEDGMENT

This work has been partially supported by the National Council for Scientific and Technological Development (CNPq - Brazil) under grant 476398/2010-0, and FAPERGS Research Support Foundation, under grant 1018248.

REFERENCES

[1] K. Beissner, D. H. Jonassen, B. L. Grabowski. “Using and Selecting Graphic Techniques to Acquire Structural Knowledge”. Performance Improvement Quarterly, vol. 7, no. 34, pp. 20-38, Dec. 1994.
[2] W. A. Wan Mohamed, B. Omar. “Using Concept Map to Facilitate Writing Assignment”. In: A. J. Cañas, M. Ahlberg, J. D. Novak (Eds.), Proceedings of the Third International Conference on Concept Mapping. Helsinki, Finland, 2008.
[3] N. Strangman, T. Hall, A. Meyer. “Graphic Organizers and Implications for Universal Design for Learning: Curriculum Enhancement Report”. National Center on Accessing the General Curriculum, USA, 2003.
[4] A. Schenker. Graph-Theoretic Techniques for Web Content Mining. PhD thesis, University of South Florida, 2003.
[5] B. Azevedo, E. Reategui, P. Behar. “Automatic Analysis of Messages in Discussion Forums”. In: Proceedings of the IEEE International Conference on Interactive Collaborative Learning, Pietsany, Czech Republic, 2011, pp. 76-81.
[6] K. E. Chang, Y. T. Sung, I. D. Chen. “The Effect of Concept Mapping to Enhance Text Comprehension and Summarization”. The Journal of Experimental Education, vol. 71, no. 1, 2002, pp. 5-23.
[7] M. R. Ruddell. Teaching content reading and writing, 3rd ed. New York: John Wiley & Sons, Inc, 2001.
[8] K. Capretz, B. Ricker, A. Sasak. “Improving organizational skills through the use of graphic organizers”. Illinois, MA: Research Project, Saint Xavier University and Skylight Professional Development, 2003.
[9] A. L. Brown, J. D. Day. “Macrorules for summarizing texts: The development of expertise”. Journal of Verbal Learning and Verbal Behavior, vol. 22, 1983, pp. 1–14.
[10] E. Reategui, D. Epstein, A. Lorenzatti, M. Klemann. “Sobek: a Text Mining Tool for Educational Applications”. In: Proceedings of the International Conference on Data Mining, Las Vegas, USA, 2011, pp. 59-64.
[11] T. W. Bean, F. L. Steenwyk. “The effect of three forms of summarization instruction on sixth graders’ summary writing and comprehension”. Journal of Reading Behavior, vol. 16, 1984, pp. 297–306.
[12] R. Ellis. Task-based language learning and teaching. New York: Oxford University Press, 2003.
[13] J. R. Hayes, L. S. Flower. “Identifying the organization of the writing process”. In: L. W. Gregg, E. R. Steinberg (Eds.), Cognitive processes in writing. Hillsdale, NJ: Lawrence Erlbaum, 1980, pp. 3-30.
[14] T. Shanahan. “Relations among Oral Language, Reading, and Writing Development”. In: C. MacArthur, S. Graham, J. Fitzgerald (Eds.), Handbook of Writing Research. New York: The Guilford Press, 2008, pp. 171-183.
[15] K. A. Rawson, J. Dunlosky. “The rereading effect: Metacomprehension accuracy improves across reading trials”. Memory & Cognition, vol. 28, no. 6, 2000, pp. 1004-1010.
[16] UFRGS: Examiner’s Manual for University Entry Examinations (Manual do Avaliador de Redações do Concurso Vestibular), Porto Alegre, RS: Editora UFRGS, 2006.
[17] J. Villalón, P. Kearney, R. A. Calvo, P. Reinmann. “Glosser: Enhanced Feedback for Student Writing Tasks”. In: Proceedings of the Eighth IEEE International Conference on Advanced Learning Technologies, Santander, Spain, 2008, pp. 454-458.
