Using a Text Mining Tool to Support Text Summarization
Abstract - This paper presents a text mining tool that extracts graphs from texts, and proposes their use in helping students to write summaries. The summarization method uses the graphs as graphic organizers, leading students to reflect further on the main ideas of the text before getting to the actual task of writing. An experiment showed that the tool helped students reflect on the main ideas of the text and supported the writing of the summaries.

Keywords - text mining, text summarization, graphs, writing

I. INTRODUCTION

The use of graphic organizers and other prewriting activities has been shown to be an effective aid for writing, enabling learners to segment the topic they have to consider and helping them to structure their writing [1][2]. Strangman et al. [3] define a graphic organizer as an illustration that depicts the relationships between facts and terms, providing an opportunity to organize thoughts. Based on this idea, we propose in this paper the use of a tool that automatically extracts graphs from texts as a way to support summarization, a particular type of writing task. The tool, implemented according to a particular text mining technique [4], has already been used in other educational applications, such as the analysis of discussion forums [5].

The next section presents a brief overview of graphic organizers. Section 3 presents the text mining tool proposed here, and Section 4 describes an experiment carried out to validate this research. The last section of the paper presents conclusions and directions for future work.

II. GRAPHIC ORGANIZERS FOR READING AND WRITING

Graphic organizers have been applied across a large range of subject areas, demonstrating their benefits in different activities such as mapping cause and effect, note taking, comparing and contrasting concepts, and relating information to main ideas or themes [3]. Despite differences in visual representation, graphic organizer models frequently share an underlying principle: the conversion of linear textual statements into nonlinear graphic presentations [6].

Regarding the use of graphic organizers to support writing, Ruddell [7] stresses the importance of providing tools that allow students to illustrate their constructions and organization of knowledge, enabling them to express visually which ideas are the most meaningful and how these ideas are connected. Capretz et al. [8] showed that visualizing information graphically can improve students’ organization skills during the writing process. As for text summarization, the use of graphic organizers (in particular, concept maps) has been shown to be an effective method closely related to text comprehension [6]. The authors attribute this to the fact that concept mapping emphasizes selecting major ideas, connecting and organizing these concepts with links, and then using this information for writing. As noted by Brown and Day [9], this sequence of tasks overlaps with the process of text summarization.

The method proposed in this paper is based on the use of a particular type of visual representation to help students in summarization tasks. The next section presents the text mining tool Sobek, as well as the method for its use in text summarization tasks.

III. THE TEXT MINING TOOL

The text mining tool Sobek was developed using a particular mining algorithm based on the n-simple distance graph model, in which nodes represent the main terms found in the text, and the edges represent adjacency information [4]. The method used here relies on a parameter n to extract compound concepts made up of more than one word: the current word is combined with the n subsequent words. Fig. 1 shows the interface of the mining tool, with a graph extracted from a short text about Realism. For complete details about the text mining algorithms implemented here, please refer to [10].

Figure 1. Graph extracted from text about Realism
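The graph model described in Section III can be illustrated with a minimal Python sketch. It is not Sobek's implementation (see [10] for that), only one possible reading of the description: candidate compound concepts are formed by joining the current word with up to n subsequent words, and an unlabeled edge links two main terms when one follows the other within n words. The function names, the stopword list, the `top_k` cut-off for "main terms" and the choice to ignore sentence boundaries are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; the list Sobek uses is not given in the paper.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, and drop stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def compound_candidates(tokens, n):
    """Join each word with its 1..n subsequent words (assumed reading of n)."""
    candidates = []
    for i in range(len(tokens)):
        for k in range(1, n + 1):
            if i + k < len(tokens):
                candidates.append(" ".join(tokens[i : i + k + 1]))
    return candidates


def simple_distance_graph(text, n=2, top_k=5):
    """Return (nodes, edges): the top_k most frequent terms, plus a directed
    edge (a, b) whenever b occurs within n words after a in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    nodes = {term for term, _ in counts.most_common(top_k)}
    edges = set()
    for i, current in enumerate(tokens):
        if current not in nodes:
            continue
        for following in tokens[i + 1 : i + 1 + n]:
            if following in nodes and following != current:
                edges.add((current, following))
    return nodes, edges


sample = "Realism in art depicts everyday life. Realism rejects romantic art."
nodes, edges = simple_distance_graph(sample, n=2, top_k=2)
print(sorted(nodes), sorted(edges))
```

Edges are stored without a distance label, which is what distinguishes the simple distance model from a labeled n-distance graph. A larger n links terms that are further apart and yields a denser graph.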
Authorized licensed use limited to: St. Aloysius College Mangalore (AIMIT). Downloaded on May 30,2023 at 10:04:34 UTC from IEEE Xplore. Restrictions apply.
and forth to their graphs several times while producing their summaries - which is further evidence that the students referred to their graphs while writing.

An evaluation of the texts produced by the students was also made, considering different criteria detailed in [16]. The average grades obtained, on a scale from 0 to 10, were 8.2 for main theme identification, 6.7 for cohesion, 7.0 for coherence and 5.5 for text organization. Apart from the grade for text organization, all the others were considered good, particularly the average grade for main theme identification. It is likely that the accurate identification of the main theme of the text by the mining tool helped the students to focus on it. Some of the students’ testimonials reinforced this idea: “... based on the graph I identified what was important in the text...” (student 3); “... I realized that the words selected by the graph were important, relevant...” (student 8).

The less satisfactory grades for text organization may be due to the fact that the graphs impose no particular order for reading or interpreting them. We are currently investigating other forms of visual representation that would make the graphs more linear.

The students’ testimonials also showed how they employed the graphs in their writing: “... I used the graph many times - I had a look at it whenever I did not want to get lost in the text and I wanted things to make sense...” (student 8); “... I needed the graph to know what were the important parts that had to be included in the text...” (student 9).

The testimonials of the teacher who worked with the students in text production confirmed that the methodology for summary writing using the mining tool was very productive. The teacher stated that, compared to other writing activities, the students got good marks and showed a good level of engagement in the task proposed.

V. CONCLUSION

This paper presented a text mining tool and proposed a methodology for using it as a support in summary writing. Results of a study with 20 students showed that the tool was able to produce graphs that were close to what was considered to be important about a text read by the students, but not so perfect as to leave them no room to express their own ideas about the most relevant information. The use of Sobek supported the structuring of the students’ writing process, a result that is aligned with findings presented in other research [11].

In Villalón et al. [17], the authors present a text editor that attempts to trigger students’ reflection by mining their texts. Based on these results, the system proposes questions about the texts’ contents, structure, and coherence. Although our goal here is similar in that we also employ text mining to lead students to reflect further on their writing, our summarization strategy is different, as it is based on the mining of the reading material provided to students and not on the mining of the students’ own texts.

The observation of how students use the graphs should also give us further insight into possible applications of text mining in educational settings.

ACKNOWLEDGMENT

This work has been partially supported by the National Council for Scientific and Technological Development (CNPq - Brazil) under grant 476398/2010-0, and FAPERGS Research Support Foundation, under grant 1018248.

REFERENCES

[1] K. Beissner, D. H. Jonassen, B. L. Grabowski. “Using and Selecting Graphic Techniques to Acquire Structural Knowledge”. Performance Improvement Quarterly, vol. 7, no. 34, Dec. 1994, pp. 20-38.
[2] W. A. Wan Mohamed, B. Omar. “Using Concept Map to Facilitate Writing Assignment”. In: A. J. Cañas, M. Ahlberg, J. D. Novak (Eds.), Proceedings of the Third International Conference on Concept Mapping. Helsinki, Finland, 2008.
[3] N. Strangman, T. Hall, A. Meyer. “Graphic Organizers and Implications for Universal Design for Learning: Curriculum Enhancement Report”. National Center on Accessing the General Curriculum, USA, 2003.
[4] A. Schenker. Graph-Theoretic Techniques for Web Content Mining. PhD thesis, University of South Florida, 2003.
[5] B. Azevedo, E. Reategui, P. Behar. “Automatic Analysis of Messages in Discussion Forums”. In: Proceedings of the IEEE International Conference on Interactive Collaborative Learning, Pietsany, Czech Republic, 2011, pp. 76-81.
[6] K. E. Chang, Y. T. Sung, I. D. Chen. “The Effect of Concept Mapping to Enhance Text Comprehension and Summarization”. The Journal of Experimental Education, vol. 71, no. 1, 2002, pp. 5-23.
[7] M. R. Ruddell. Teaching content reading and writing, 3rd ed. New York: John Wiley & Sons, Inc, 2001.
[8] K. Capretz, B. Ricker, A. Sasak. “Improving organizational skills through the use of graphic organizers”. Illinois, MA: Research Project, Saint Xavier University and Skylight Professional Development, 2003.
[9] A. L. Brown, J. D. Day. “Macrorules for summarizing texts: The development of expertise”. Journal of Verbal Learning and Verbal Behavior, vol. 22, 1983, pp. 1-14.
[10] E. Reategui, D. Epstein, A. Lorenzatti, M. Klemann. “Sobek: a Text Mining Tool for Educational Applications”. In: Proceedings of the International Conference on Data Mining, Las Vegas, USA, 2011, pp. 59-64.
[11] T. W. Bean, F. L. Steenwyk. “The effect of three forms of summarization instruction on sixth graders’ summary writing and comprehension”. Journal of Reading Behavior, vol. 16, 1984, pp. 297-306.
[12] R. Ellis. Task-based language learning and teaching. New York: Oxford University Press, 2003.
[13] J. R. Hayes, L. S. Flower. “Identifying the organization of the writing process”. In: L. W. Gregg, E. R. Steinberg (Eds.), Cognitive processes in writing. Hillsdale, NJ: Lawrence Erlbaum, 1980, pp. 3-30.
[14] T. Shanahan. “Relations among Oral Language, Reading, and Writing Development”. In: C. MacArthur, S. Graham, J. Fitzgerald (Eds.), Handbook of Writing Research. New York: The Guilford Press, 2008, pp. 171-183.
[15] K. A. Rawson, J. Dunlosky. “The rereading effect: Metacomprehension accuracy improves across reading trials”. Memory & Cognition, vol. 28, no. 6, 2000, pp. 1004-1010.
[16] UFRGS: Examiner’s Manual for University Entry Examinations (Manual do Avaliador de Redações do Concurso Vestibular), Porto Alegre, RS: Editora UFRGS, 2006.
[17] J. Villalón, P. Kearney, R. A. Calvo, P. Reinmann. “Glosser: Enhanced Feedback for Student Writing Tasks”. In: Proceedings of the Eighth IEEE International Conference on Advanced Learning Technologies, Santander, Spain, 2008, pp. 454-458.