Visual Text Analysis in Digital Humanities: Forum
1 Image and Signal Processing Group, Department of Computer Science, Leipzig University, Germany
{stjaenicke,faisal,scheuermann}@informatik.uni-leipzig.de
2 Göttingen Centre for Digital Humanities, University of Göttingen, Germany
Abstract
In 2005, Franco Moretti introduced Distant Reading to analyze entire literary text collections. This was a rather
revolutionary idea compared to the traditional Close Reading, which focuses on the thorough interpretation of an
individual work. Both reading techniques are the primary means of Visual Text Analysis. We present an overview of
the research conducted since 2005 on supporting text analysis tasks with close and distant reading visualizations
in the digital humanities. To this end, we classify the surveyed papers according to a taxonomy of text analysis
tasks, categorize the close and distant reading techniques applied to support these tasks, and
illustrate approaches that combine both reading techniques in order to provide a multifaceted view of the textual
data. In addition, we look at the text sources used and at the typical data transformation steps required for
the proposed visualizations. Finally, we summarize collaboration experiences when developing visualizations for
close and distant reading, and we give an outlook on future challenges in this research area.
Keywords: digital humanities, survey, visual text analysis, close reading, distant reading
Categories and Subject Descriptors (according to ACM CCS): H.5.2 [Information Interfaces and Presentation]: User
Interfaces—Evaluation/methodology
– so-called close reading –, he invites to count, to graph and to map them. In other words, to visualize them.

This survey observes text analysis tasks of humanities scholars – e.g., literary scholars, historians and
philologists –, and the visualization techniques that have been developed in order to support these tasks. By
providing a text analysis task taxonomy, categorizing applied close and distant reading techniques and outlining
strategies that combine close and distant reading visualizations, we present an overview suitable for
visualization scholars facing related digital humanities text analysis tasks. We further investigate the
following questions:

• Which text sources are used and which data transformations are applied in order to investigate text analysis
  research questions with close and distant reading visualizations?
• Which experiences are reported regarding collaborations between visualization experts and humanities scholars?
• What are future challenges for visualization scholars concerning visual text analysis to further improve the
  support for humanities scholars?

1.1. Relation to the Previous Article

The focus of the previous version of this survey [JFCS15] was to illustrate the diversity of applied
visualization techniques that support the close and distant reading of texts in digital humanities applications
– enriched with used visualization tools, collaboration experiences of visualization researchers working
together with humanities scholars and future challenges.

This survey extension aims to give visualization scholars new to the field of digital humanities an adequate
overview of related works in order to support carrying out successful digital humanities projects. As an
application domain for information visualization, these projects gain their motivation from humanities research
questions on texts. Therefore, we introduce a taxonomy that groups the papers into classes of text analysis
tasks in order to guide visualization researchers with similar tasks to related works that provide close and
distant reading solutions. In addition, we list text sources, and we take a closer look at data
transformations, which are substantial steps towards designing valuable visualizations. Furthermore, we extended
the collaboration experiences and future challenges sections. Finally, this survey considers 22 more related
works.

novel idea that was introduced by Franco Moretti at the beginning of the 21st century. In contrast to Moretti,
Jockers uses the terms micro- and macroanalysis instead of close and distant reading [Joc13]. Inspired by micro-
and macroeconomics, he focuses on quantitative literary text analysis using statistical analysis methods. As the
methods we analyzed are more related to visualization, we decided to use the traditional, more common terms
close and distant reading, but we also considered related works using different terminologies. This section
introduces close and distant reading techniques and draws a line from the digital humanities to information
visualization by combining both techniques.

2.1. Close Reading

The close reading of a text became a fundamental method in literary criticism in the 20th century [Haw00].
Nancy Boyles [Boy13] defines it as follows: “Essentially, close reading means reading to uncover layers of
meaning that lead to deep comprehension.” In other words, close reading is the thorough interpretation of a
text passage by the determination of central themes and the analysis of their development. Moreover, close
reading includes the analysis of (1) individuals, events, and ideas, their development and interaction,
(2) used words and phrases, (3) text structure and style, and (4) argument patterns [Jas01]. The result of a
traditional close reading approach is shown in Figure 1. In this example, the scholar used various methods to
annotate various features of the source text, e.g., the usage of different colors (blue, red, green) and
underlining styles (straight or wavy lines, circles). Furthermore, numerous thoughts are written next to the
corresponding sentences. Although most humanities scholars are trained in this traditional approach of close
reading, today’s large availability of digitized texts and of digital editions through web portals
© 2016 The Author(s)
Computer Graphics Forum
© 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities
[Figure: diagram with the labels text sources, data transformation, text analysis task, close reading,
distant reading, insight, Visual Text Analysis]

Journal/Proceedings                                                                          #Papers
IEEE Transactions on Visualization and Computer Graphics (TVCG)                              16
IEEE Symposium on Visual Analytics Science and Technology (VAST)                             6
Computer Graphics Forum                                                                      5
Proceedings of the International Conference on Information Visualization
Theory and Applications (IVAPP)                                                              2
Information Visualisation Journal                                                            1

Table 1: Visualization papers examined.
Perseus Digital Library    http://www.perseus.tufts.edu/hopper/    [BPBI10]
Project Gutenberg          https://www.gutenberg.org/              [CTA∗ 13, GCL∗ 13, KO07, GTAHS15]
which has become the humanities’ leading technology to map the structure of a digital text edition [Sin13].
XSLT stylesheets are basic ways to transform the TEI encoded information into a meaningful visualization of an
individual text (e.g., [Cor13, Pie13, HKTK14]). But most distant reading techniques require more sophisticated
preprocessing steps – a brief summary of common data transformation approaches is given below.

Tokenization and normalization are rather rudimentary natural language processing methods first applied to
segment raw text sources. The then determinable frequency distribution of words is a valuable basis for various
tasks (e.g., stylometric analyses [CEJ∗ 14, Ede14]), and can be clearly visualized in the form of tag clouds to
support the exploration of word statistics (e.g., [Bea08, GTAHS15, JBR∗ 15]). Vector space models can be used to
list term frequencies per text and support a variety of text analysis tasks (e.g., [DFM∗ 08, GCL∗ 13, KOTM13]).
On the other hand, counting n-grams allows more specific statements to be made about a text corpus
(e.g., [CDP∗ 07,Bea12,MH13]). But tokenization and normalization are also necessary steps for the data
transformation methods described below.

Sequence alignments are computed when investigating research questions concerning textual re-use
(e.g., [BGHE10, JGBS14]) and for the analysis of similarities and differences among various text editions
(e.g., [WJ13b, JGF∗ 15]). In such scenarios one typically applies the Gothenburg model, which includes
tokenization, normalization, alignment, analysis and visualization [Got15]. One of the web-based tools
implementing this model is CollateX [HDvHM∗ 15].

Part-of-speech (POS) tagging is a frequently applied preprocessing technique to automatically annotate the
words of a corpus according to their part of speech category. The use of tools like the Stanford POS
tagger [TKMS03] is a mandatory basis for investigating diverse research questions. Typically, words and their
relationships are explored (e.g., [KKL∗ 11, MH13]) or linguistic patterns are extracted from a corpus
(e.g., [Mur11, RFH14]). Furthermore, POS tagging is used to analyze phonetic features [CTA∗ 13] or for research
questions concerning stylometry [KO07].

Named entity recognition (NER) is the practice of extracting named entities such as places or persons from
texts. Preprocessing steps like part-of-speech tagging can be applied to automatically list named entity
candidates [GH11b]. With the help of lexicons, named entities are subsequently classified. For example, the
Pleiades gazetteer [Ple15] is used for the extraction of ancient place names [EJ14], and DBPedia [LIJ∗ 14]
supports the discovery of commodities in [HAC∗ 15]. The Stanford Named Entity Recognizer [FGM05] is a popular
tool for automatic named entity extraction. The manual collection of named entities is not uncommon and
guarantees the highest precision (e.g., [JW13, Wil15b]). Occasionally, named entities are already present when
using annotated TEI files as data sources (e.g., [BB15a, TFK15]).

Topic modeling algorithms are fundamental for topic-related analyses of text collections. Latent Dirichlet
allocation is the most often applied topic model [BNJ03]. It requires a predefined number of topics, which are
determined automatically based on the words contained in the text corpus (e.g., [AKV∗ 14, BJ14]). The topic
model can be used to cluster texts thematically [Wol13] – as happens with text classification methods
(e.g., [PSA∗ 06,DFM∗ 08]) –, or to define the similarity among the texts of a corpus [Joc12]. When temporal
metadata is provided, the change of topics can be analyzed (e.g., [ARR∗ 12, CLWW14]).

Semi-automatic approaches reflect the importance of integrating the humanities scholar into the data
transformation process. For instance, the scholar’s knowledge is required when manually generating or validating
a training set to produce an appropriate data mining classifier (e.g., [PSA∗ 06, KKL∗ 11,KJW∗ 14]). Other methods
include semi-automatic alignments [GZ12] and the annotation of TEI documents (e.g., [Tót13, OGH15]). Sometimes,
even the visualization entirely depends on manually collected data through crowdsourcing
(e.g., [WMN∗ 14, RPSF15, HFM16, JFS16]).

7. Taxonomy

There has been extensive research done in developing taxonomies for information visualization in the last
decades. Unfortunately, these taxonomies were either too general (e.g., [BM13, RAW∗ 15]) or too specific
(e.g., [LPP∗ 06, KKC15]) to be used for our paper collection. Therefore, we defined a taxonomy focusing on the
underlying text analysis tasks in the digital humanities domain (see Figure 5). A detailed classification of
papers is given in Table 4. Papers focusing on a single text analysis task are assigned to a single – the best
fitting – category. The few works providing visualization methods for two text analysis tasks each appear in
two categories [RRRG05, WH11, Wol13, Kau15]. The taxonomy consists of five major categories:

[Figure 5 labels: text analysis tasks
  named entities: persons, places, miscellaneous
  topics: extraction, evolution, clustering
  similar patterns: linguistic patterns, text re-use, text edition comparison
  text of interest: interpretation, sound, story flow, miscellaneous
  corpus analysis: word statistics & relationships, text similarity, exploration]

Figure 5: Taxonomy of text analysis tasks.
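For readers who want to work with the taxonomy programmatically, the hierarchy of Figure 5 can be written down
as a small nested mapping. The following Python sketch is purely illustrative: the names `TAXONOMY` and
`major_categories_of` are hypothetical and not part of any surveyed work; only the category labels are taken
from Figure 5.

```python
# Illustrative encoding of the Figure 5 taxonomy (identifier names are hypothetical).
TAXONOMY = {
    "named entities": ["places", "persons", "miscellaneous"],
    "topics": ["extraction", "evolution", "clustering"],
    "similar patterns": ["linguistic patterns", "text re-use", "text edition comparison"],
    "text of interest": ["interpretation", "sound", "story flow", "miscellaneous"],
    "corpus analysis": ["word statistics & relationships", "text similarity", "exploration"],
}


def major_categories_of(task):
    """Return every major category that contains the given task label."""
    return [major for major, tasks in TAXONOMY.items() if task in tasks]


if __name__ == "__main__":
    print(major_categories_of("text re-use"))
    # The label "miscellaneous" occurs under two major categories.
    print(major_categories_of("miscellaneous"))
```

Returning a list rather than a single category keeps the lookup honest about the label "miscellaneous", which
appears under both named entities and text of interest.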
evolution [CLT∗ 11], [KBK11], [ARR∗ 12], [DWS∗ 12], [CLWW14], [Kau15]
clustering [PSA∗ 06], [DFM∗ 08], [OST∗ 10], [Wol13], [HFM16]
sound [FS11], [CGM∗ 12], [ARLC∗ 13], [CTA∗ 13], [Pie13], [Ben14], [MLCM16]
story flow [RRRG05], [LWW∗ 13], [RSDCD∗ 13], [HPR14]
miscellaneous [MFM13], [GWFI14], [KJW∗ 14], [PBD14]
word statistics & [Bea08], [CVW09], [VCPK09], [LRKC10], [Bea11], [KKL∗ 11],
corpus analysis
The analysis of named entities is a common text analysis task of humanities scholars that is supported with
close and distant reading visualizations. When extracting places, fictional or reported geographies of a single
text or a whole collection can be explored. The extraction of persons is required to analyze social networks of
individuals or of characters in a story. Miscellaneous tasks focus on other (e.g., encyclopedia
entries [HAHB15]) or on multiple named entities (e.g., commodities and locations [HAC∗ 15]).

The analysis of topics inherent in a text corpus supports text analysis tasks that require both close and
distant reading techniques. Popular tasks are topic extractions, so that major topics in the source texts can be
tracked and topic-related text passages can be discovered. The presence of temporal data allows for the analysis
of topical evolution, and on the basis of the found topics, a topical clustering of a corpus is possible.

An analysis of similar patterns that includes the discovery, the alignment and the visualization of similar
text segments among the texts of a given collection is a typical text analysis task in the digital humanities.
Depending on the length of patterns, we divide the tasks belonging to that category into three sets. While the
analysis of linguistic patterns concerns short phrases, text re-use analysis focuses on determining deliberately
re-used text segments (e.g., quotes or plagiarized passages). In papers grouped into the category text edition
comparison, the humanities scholar is more interested in analyzing the variants between the text editions.

Some tasks focus on an individual literary work, which we call text of interest. Sophisticated close reading
techniques are often applied but are sometimes coupled with distant reading representations of the textual
content. The underlying research tasks vary from visualizing text interpretations to the analysis of sound of
literary works (mostly poems),
Column headers (as extracted): Plain, Color, Font size, Glyphs, Heat maps, Tag clouds, Maps, Timelines, Graphs,
Top-down, Bottom-up

Task                     Counts (non-empty cells, in row order)
named entities
  places                 1 2 1 12 5 1 1 2
  persons                2 1 2 1 15 1 1 1
  miscellaneous          2 1 1 1 3 1 1
topics
  extraction             2 2 4 1 1 1 1
  evolution              1 3 1 6 1
  clustering             1 2 1 1 2 1 3
similar patterns
  linguistic patterns    8 1 4 1 5 1 2 4
  text re-use            6 2 4 3 3
  interpretation         2 1 1 1
  sound                  3 3 2 3
  story flow             1 1 2 1 1 2
  miscellaneous          1 1 1 1 1 1 1 1 1
corpus analysis
  text similarity        2 5
  exploration            2 1 1 2 1 1 2 1 3
Total                    10 38 4 5 9 26 20 18 18 41 9 5 8 25

Table 5: Applied close and distant reading techniques according to text analysis tasks.
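The totals in the last row of Table 5 can be cross-checked against the counts stated in Sections 8.1 and 8.2
(ten plain close reading views, 56 enhanced close reading visualizations, 132 distant reading visualizations).
The short Python sketch below assumes that the first total belongs to the plain close reading column, the next
four to the remaining close reading columns and the following six to the distant reading columns. This grouping
is an interpretation of the row suggested by the numbers, not something the table states explicitly.

```python
# Totals row of Table 5, copied in the order in which it appears.
totals = [10, 38, 4, 5, 9, 26, 20, 18, 18, 41, 9, 5, 8, 25]

# Assumed column grouping (an interpretation, see the paragraph above).
plain_close = totals[0]
enhanced_close = sum(totals[1:5])
distant = sum(totals[5:11])

print(plain_close, enhanced_close, distant)  # prints: 10 56 132
```

Under this grouping the sums agree with the figures given in the text.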
and to the story flow analysis of a given source text. Miscellaneous tasks, for example, support the thorough
analysis [KJW∗ 14] or enhance the close reading [GWFI14] of a literary work.

The final category comprises corpus analysis tasks. Usually, distant reading visualizations are used to explore
text corpora containing a high number of texts. A major research task is the analysis of word statistics &
relationships among them. Further interests concern text similarity between the texts of a corpus, others
require platforms for corpus exploration to facilitate knowledge discovery.

8. Applied Visualization Techniques

This section provides an overview of close and distant reading visualizations examined in the papers in our
collection. In addition, we outline strategies for combining close and distant reading visualizations that
facilitate a multifaceted analysis of the underlying textual data. Table 5 shows a distribution of the applied
techniques according to the text analysis task taxonomy (see Table 4). For some research tasks, favored
visualization methods stand out. A detailed overview of which techniques are suited for which text analysis
tasks is given below.

8.1. Close Reading Techniques

A visualization that supports the close reading of a text requires that the structure of the text be retained
in order to facilitate a smooth analysis. With additional information in the form of manual annotations or of
automatically processed features of textual entities or relationships among them, a plain text can be
transformed into a comprehensive knowledge source. As can be seen in Table 5, the application of close reading
techniques is particularly important when analyzing similar patterns. For text edition comparison, close
reading is necessary to discover occurring similarities and differences, and a close reading of similar
linguistic patterns or text re-use patterns helps to analyze the contexts in which these patterns were used.
When focusing on a text of interest, close reading techniques are applied to illustrate various text features.
Other research tasks concerning named entities, topics and corpus analysis rather investigate generic features
of a corpus, and apply mostly distant reading techniques. Still, close reading is sometimes helpful to connect a
computationally gained distant view with the underlying source texts.

While ten visualizations provide only plain close reading views without additional information, 56
visualizations attend to the matter of enhancing the close reading capabilities of the humanities scholars. To
visualize such additional information for a great variety of purposes, the researchers made use of the
techniques listed below.

(a) Colored backgrounds and backgrounds with varying transparency (Figure provided by Alexander et al. and based
on [AKV∗ 14]).
(b) PRISM uses color to highlight the classification of words and font size to encode the number of annotations
(Figures under CC BY 3.0 license based on [WMN∗ 14]).
(c) Circumcircles in the Poem View (left) and connections in the Path View (right) highlight rhyme sets in poems
(Figure produced with Poemage [Poe16] based on McCurdy et al. [MLCM16]).
Figure 6: Color usage for close reading.

Color is the visual attribute most often used to display the features of textual entities and it is applied in
different ways. In most cases, a colored background is used to express various types of information about a
single word or an entire phrase (Figure 2). The tool Serendip [AKV∗ 14] varies the transparency of background
colors to encode the importance of individual words (Figure 6a). Font color is also frequently used for this
purpose (Figure 6b left). Colored circumcircles (Figure 6c) around words are used only once [MLCM16]. When
displaying digital editions of literary texts, insertions are underlined. This might be the reason that this
metaphor of underlining words is also rarely used to enhance close reading [CWG11]. Overall, coloring is a
suitable method to express a great variety of textual features. Among other purposes, coloring is used for the
analysis of similar patterns, e.g., to mark common words (e.g., [JRS∗ 09, Mur11]) and aligned text segments
(e.g., [ZNMS15, RPSF15]) in parallel texts, or when exploring a text of interest, e.g., to highlight
Figure 7: Poem Viewer uses glyphs to encode phonetic units, and connections show phonetic and semantic relationships (Figure
reproduced with permission from Abdul-Rahman et al. [ARLC∗ 13]).
the automated or manual classification of words or phrases (e.g., [KJW∗ 14, WMN∗ 14]), or to visualize various
sound patterns in poems (e.g., [CTA∗ 13, Ben14]).

Font size is another method of visualizing features of textual entities. Adopted from tag cloud
design [VWF09], this metaphor serves best to highlight the significance or weight of a textual entity in
relation to the given text or corpus. In the design of a variant graph [JGF∗ 15, JG15], which is a directed
acyclic graph that is used for text edition comparison as it visualizes differences and similarities among text
variants, font size encodes the number of occurrences of a word in all editions (Figure 9a). Within the
web-based tool PRISM [WMN∗ 14], users collaboratively group the words of literary texts into different
categories. The collected statistics are used to display the number of annotations of each word by variable
font size (Figure 6b right). In [CWG11], varying font size is used to visualize the importance of text passages
according to the user’s preferences.

Glyphs attached to individual textual entities are convenient techniques to visualize abstract annotations
that are hardly expressible with plain coloring or varying font size. All examples we found enhance the close
reading of a text of interest, mostly poems. In [ARLC∗ 13], phonetic units are drawn atop each word using color
to classify phonetic types (Figure 7). Additionally, pictograms illustrate phonetic features. The Myopia Poetry
Visualization tool uses rectangular blocks to visualize poetic feet and the spoken length of
syllables [CGM∗ 12]. For the visualization of a poem’s hermeneutic structure, Piez deploys glyphs in the form of
rectangular and circular maps [Pie10,Pie13]. An example is given in Figure 8. Goffin explores the placement and
design of so-called word-scale visualizations, which are small glyphs enriching the base text with additional
information [GWFI14]. For example, the background color of words contained in digital copies indicates OCR
certainty. Furthermore, small interactive bar charts illustrate variants of observed words.

Connections help to illustrate the structure among textual entities and are most often applied to support text
analysis tasks concerning similar patterns. One usage of connections in close reading is to highlight
subsequent words in a variant graph to track variation among text editions [BGHE10]. As shown in Figure 9a,
colored links can help to identify certain editions [JG15, JGF∗ 15]. Other approaches juxtapose the texts of
various editions and visually link related text passages [WJ13b,HKTK14,JGBS14], as instantiated in Figure 9b.
Furthermore, connections can also be used to visualize sentence structure [KZ14]. Two close reading
visualizations use connections for the analysis of sound in poems. While Abdul-Rahman [ARLC∗ 13] illustrates
phonetic and semantic relations within poems (Figure 7), McCurdy [MLCM16] draws paths between words of a poem
sharing the same tones (assonances) to highlight sonic patterns (Figure 6c).

(a) Variant graph for seven English translations of Genesis 1:5 connecting subsequent words displayed with
variable font size (Figure based on [JGF∗ 15]).

8.2. Distant Reading Techniques

A visualization that displays summarized information of the given text corpus facilitates distant reading. The
process of transforming such information into complex representations can be based upon a large variety of data
dimensions, e.g., various types of metadata of textual entities, automatically processed or manually retrieved
relationships between textual elements, or quantitative and qualitative statistics about unstructured textual
contents.

An overview of applied distant reading visualizations according to the text analysis task taxonomy is given in
Table 5. The overall usage of such techniques suggests their importance for nearly all text analysis tasks in
digital humanities, even when the close reading of a text is more important, e.g., when focusing on a text of
interest or analyzing similar patterns.

Within our research papers collection we found 132 visualizations providing a distant reading view of a given
text corpus. We extracted and grouped various approaches found to visualize summarized information into the six
following categories.

Heat maps or block matrices are often used to highlight text snippets, especially when analyzing similar
patterns and in corpus analysis tasks. Thereby, a heat map may reflect structural elements of a
text [JRS∗ 09, VCPK09] or the structure of an entire corpus [CDP∗ 07, Mur11, BGHJ∗ 14]. In such scenarios, the
coloring of rectangular blocks helps to [...] MH13, JKH∗ 15]. Another example is the usage of heat maps to show
relationships among various texts in a corpus. The similarity for each tuple of texts within the corpus can be
determined by counting similar text passages, and the result can be visualized as a heat
map [GCL∗ 13, FKT14], e.g., to highlight the similarity between Shakespearean plays [RRRG05]. Heat maps are also
applied to visualize similarities or differences among text editions [JG15, PMMR15], or to highlight re-used
passages between the texts of a corpus [JGBS14, RARC∗ 15, ZNMS15]. For the analysis of potentially plagiarized
texts, so-called Difflines reveal structural differences between several suspicious text fragments and their
alleged originals in a Focus+Context view [RPSF15]; an example is shown in Figure 10. Fingerprinting techniques,
as introduced in [KO07] in order to visualize characteristic textual features of literary works, are a further
heat map variant. In text analysis tasks concerning named entities, heat maps can be used to analyze places
mentioned in texts [AGZH15], or to reveal interpersonal relationships between characters in prose
literature [OKK13]. For the analysis of topics, Alexander et al. [AKV∗ 14] propose two matrix representations.
The RankViewer illustrates the ranking of words belonging to topics and the CorpusViewer shows relations to
certain topics for each document of a corpus. Heat maps are also used in [MSR∗ 15] to display “high-level
summaries” of topic modeling results. Finally, heat maps are used to analyze a text of interest [KJW∗ 14], e.g.,
to visualize the similarity [CTA∗ 13] or the flow [FS11, Ben14] of sound in poems.

Tag clouds are intuitive visualizations to encode the frequency of words within a selected section, a whole
document or an entire text corpus by using variable font size.

[Figure: tag clouds of words from Shakespeare plays; legible labels: As You Like It, Macbeth, Richard III]
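To make the frequency-to-font-size encoding concrete, the following Python sketch counts word frequencies after
a simple tokenization and normalization step and maps each count linearly onto a font size range. The regular
expression, the range of 10 to 48 points and the function names are assumptions chosen for illustration; the
tag cloud tools cited in this survey may use other tokenizers or non-linear scales.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text and count runs of letters (inner apostrophes or hyphens allowed)."""
    tokens = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())
    return Counter(tokens)


def font_sizes(freqs, min_pt=10.0, max_pt=48.0):
    """Map each word's count linearly onto the interval [min_pt, max_pt]."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = hi - lo
    return {
        # If all words are equally frequent, every word gets the maximum size.
        word: max_pt if span == 0 else min_pt + (count - lo) * (max_pt - min_pt) / span
        for word, count in freqs.items()
    }


if __name__ == "__main__":
    freqs = word_frequencies("Romeo, Romeo! Wherefore art thou Romeo?")
    for word, size in sorted(font_sizes(freqs).items()):
        print(f"{word}: {size:.1f}pt")
```

A linear scale is the simplest choice; for corpora with a few very frequent words, a logarithmic or square-root
scale would keep rare words legible.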
(a) Linked views for the exploration of commodity trading (Figure reproduced with permission from Hinrichs et
al. [HAC∗ 15]).
(b) Colored shapes encode emotions about London places (Figure reproduced with permission from Heuser et
al. [HAHT∗ 15]).
(c) Fictional map of Yoknapatawpha County and related places (Figure reproduced with permission from Dye et
al. [DNCM14]).
Tag clouds are therefore a suitable method for corpus anal- Maps are widely used to display the geospatial informa-
ysis tasks [VCPK09, Bea12, FKT14, GTAHS15]. TagPies tion contained in a text. Most often, maps support the analy-
– a tag cloud arranged in a pie chart manner – support sis of named entities extracted from a text or an entire cor-
the comparative analysis of the co-occurrences of search pus. Two works illustrate the geographical areas which are
terms [JBR∗ 15], an example of which is shown in Figure 11. associated to persons [Wil15b], e.g., by mapping the places
Other approaches visualize the temporal evolution of tags in tag clouds, either listing tags per time period [CVW09], or by attaching a time graph to each tag [LRKC10]. Beaven uses tag clouds to illustrate collocational relationships of a single word [Bea08] and to compare the collocates between two words [Bea11]. Tag clouds are also applied when analyzing topics, e.g., by displaying a topic’s characteristic tags [BJ14, ESK14, JOL∗15, MSR∗15] (an example can be seen in Figure 16b), or by summarizing the major tags for certain time periods [CLT∗11, CLWW14]. The usage of tag clouds to explore the classification of speculative fiction anthologies [HFM16] is shown in Figure 13. Tag clouds are rarely applied for the analysis of named entities [HAC∗15] (see Figure 12a) or when focusing on a text of interest [KJW∗14]. In some of the above-mentioned works, tag coloring is used to express additional information such as the temporal evolution of a word’s significance or the classification of tags.

of activity of musicians manually extracted from musicological literature in order to support the geospatial comparison of musicians’ activity regions [JFS16]. But usually, places mentioned in texts are analyzed. With the help of contemporary (e.g., GeoNames [Geo15]) and historical gazetteers (e.g., Pleiades [Ple15]), the extracted placenames can be enriched with geographical coordinates, and their visualization on a map supports the analysis of the (fictional) geographic space described in the source text(s). Some approaches use thematic [ÓML14] or density maps [GH11b, BB15b] for this purpose, but the usage of glyphs in the form of circles is more frequent [Tra09, DWS∗12, HAC∗15, Wil15a] as it simplifies the interaction with individual plotted places (e.g., see Figure 12a). In [HAHT∗15], circles are used to map discrete London places occurring in fictional literature, and polygons represent wider spaces such as neighborhoods or districts of London. The coloring of shapes indicates the emotions collected for these places (see Figure 12b). In [JW13], various glyphs encode various types of places occurring in medieval texts. Two works that focus on mapping the geographical knowledge of ancient Greek authors draw connections between glyphs to illustrate travel routes [EJ14] or to highlight the strength of the relationship between placenames, which is reflected by the number of co-occurrences [BPBI10]. In contrast to the previous works, the geospatial metadata associated with individual corpus texts (text creation timestamp) can be used for mapping [MBL∗06]. The visualization of Faulkner’s fictional Yoknapatawpha County includes various means of geographic mapping [DNCM14]: on the one hand, the imagined geography and, on the other, the placenames displayed on the geographic levels region, nation and world (Figure 12c). In addition to named entity analysis, maps are used for the analysis of topics [DFM∗08, GDMF∗14] and corpus exploration [JHSS12].

Figure 13: Exploring the tree structure and representative tags of a novel classification (Figure provided by Hinrichs based on [HFM16]).

Figure 14: A combination of an abstract timeline view and a tree in EMDialog (Figure provided by Hinrichs based on [HSC08]).

(a) Network of 55 poetic texts. Imitated texts marked in blue, sequels marked in red (Figure reproduced with permission from Eder [Ede14]).
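The gazetteer-based enrichment of placenames and the co-occurrence weighting described in the discussion of maps can be illustrated with a minimal sketch. It is not the method of any surveyed system: the tiny in-memory `GAZETTEER`, its approximate coordinates, and the helper names `find_places` and `map_data` are assumptions made for illustration, and a real pipeline would query a gazetteer such as GeoNames or Pleiades and use proper named entity recognition instead of exact token matching.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-gazetteer: placename -> approximate (latitude, longitude).
GAZETTEER = {
    "Athens": (37.98, 23.73),
    "Delphi": (38.48, 22.50),
    "Corinth": (37.91, 22.88),
}


def find_places(sentence):
    """Return the set of gazetteer placenames mentioned in a sentence.

    Naive exact token matching; an assumption standing in for real
    named entity recognition.
    """
    tokens = set(re.findall(r"[A-Za-z]+", sentence))
    return {name for name in GAZETTEER if name in tokens}


def map_data(sentences):
    """Derive glyph and link data for a map from a list of sentences.

    glyphs: placename -> ((lat, lon), number of sentences mentioning it),
            e.g. usable as position and radius of a circle glyph.
    links:  (place_a, place_b) -> number of sentences mentioning both,
            e.g. usable as the strength of a connection between glyphs.
    """
    mentions = Counter()
    links = Counter()
    for sentence in sentences:
        places = find_places(sentence)
        mentions.update(places)
        for pair in combinations(sorted(places), 2):
            links[pair] += 1
    glyphs = {name: (GAZETTEER[name], n) for name, n in mentions.items()}
    return glyphs, links


if __name__ == "__main__":
    glyphs, links = map_data([
        "From Athens the envoys went to Delphi.",
        "Athens and Corinth were rivals.",
    ])
    print(glyphs)
    print(links)
```

Counting per sentence is only one possible co-occurrence window; paragraphs or chapters could be used instead, which changes the link strengths but not the structure of the sketch.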
© 2016 The Author(s)
Computer Graphics Forum © 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities
In [HAHB15], a graph visualizes cross references in historic encyclopedias by linking related entries. Further applications are the visualization of scene changes and character movements in Shakespearean plays [RRRG05], as well as the display of conceptual [Arm14], contextual [HSC08] or multilingual [GZ12] information. Phrase nets connect textual entities that appear in the form of a user-specified relation (syntactic or lexical) [vHWV09]. All aforementioned works apply force-directed algorithms for the placement of nodes. Radial graphs can be used to unveil the relationships of words within poems [MFM13], or, again, to highlight the similarity among texts, in this case with nodes radially grouped by author [Wol13]. The Word Tree [WV08], also used in [MH13], visualizes sentences sharing the same beginning in the form of a tree. In contrast to the variant graph, a technique that supports close reading of textual editions, the Word Tree is a distant reading technique as it dissolves the order of sentences. Finally, we found a method that visualizes plain event trigraphs extracted through phrase mining algorithms, thus providing metaphors to display uncertain information [MLSU13]. When analyzing named entities, graphs are the means of choice to visualize the relationships between people in the form of social networks. Such representations are widely applied in the digital humanities to illustrate the relationships between characters in literary texts [CSV08, Tót13, BB15a, TFK15]. In these graphs, the size of a node can be used to encode the frequency of a character name in the text [BHW11, Pet14], the thickness of an edge [Cob05] (Figure 15b) or the proximity of the nodes [Poi15, JFS16] can serve to reflect the strength of a relationship, and edge style can be used to classify the type of relationship [KOTM13]. As in the aforementioned works, Kochtchi uses a force-based graph layout to visualize social networks automatically extracted from newspaper articles [KLB14]. In contrast, radial layouts and parallel coordinates are used in [Boo13]. For the visualization of Thomas Jefferson’s social relationships (Figure 15c), the nodes placed on a vertical axis are connected with arcs [Kle12]. Riche proposed a layout for Euler diagrams, which can also be utilized to visualize relationships between characters extracted from Shakespearean texts [RD10]. Finally, GeneaQuilts smartly visualizes large genealogies extracted from literary texts such as the Bible [BDF∗10].

Miscellaneous methods also produce beneficial results for certain text analysis tasks, most often for the exploration of similar patterns. In [JGBS14], an interactive dot plot interface is used to visualize and explore patterns of text re-use between two texts (Figure 16a). In [GCL∗13], a parallel coordinates view and a dot plot view, the latter used for filtering purposes, visualize the similarity of parallel text sections. Sankey diagrams are used to compare the categories of words contained in two books [HCC14], and to highlight plagiarized text passages when juxtaposing a PhD thesis to potential sources [RPSF15]. For the analysis of repetitions in Gertrude Stein’s The Making of Americans [CDP∗07], parallel coordinates visualize the frequency of phrases across sections, and TextArc [Pal02] is used to explore the repetition of individual words. Two miscellaneous methods are applied to analyze topics. For the exploratory thematic analysis of historical newspaper archives [ESK14], an application of the dust-and-magnet metaphor [YMSJ05] yielded useful results (Figure 16b). Another topical analysis technique uses a landscape metaphor to visualize the topology-based clustering of articles taken from the New York Times Corpus [OST∗10] (Figure 16c). Various methods were also developed to support the analysis of a text of interest. The tool PlotVis allows users to model and interact with XML-encoded literary narratives in 3D [PBD14]. A further complex tool named “Simulated Environment for Theatre (SET)” supports the story flow simulation of theatrical plays [RSDCD∗13]. It consists of various 2D interfaces
illustrating the “line of action” and a 3D interface populated by character avatars. For the analysis of word statistics & relationships, tree maps are used to illustrate the occurrences of adjectives in fairy tales in [WJ13a]. The Column Explorer introduced in [JFS16] supports the analysis of named entities, in that case by comparatively visualizing biographical profiles of musicians.

8.3. Techniques for Combining Close and Distant Reading

Most of the visualizations we found provide either a close or a distant reading of a text corpus. Still, an important feature for literary scholars when working with distant reading visualizations is direct access to source texts or, in other words, close reading. Among the papers in our collection providing close and distant reading, some visualizations combine both techniques, most often in the form of coordinated views [WBWK00]. We do not consider the methods in [WH11, CTA∗13, Ben14, BJ14] as the presented visualizations for close and distant reading serve different purposes, and are not connected to one another. Table 5 orders the remaining 37 techniques, which are outlined in detail below, according to the given text analysis tasks in three groups.

Bottom-up methods focus primarily on close reading, especially when focusing on similar patterns. In [GCL∗13], the user selects a desired text passage in Shakespeare’s Othello, which is shown in various German translations. Distant reading visualizations are processed (parallel coordinates view, dot plot view, heat map) based on that selection. Another bottom-up approach supports the semi-automatic alignment of Early New High German text variants [MRMK15]. A graph displaying the similarities between text editions is updated as annotations are collected in close reading sessions. In [Mur11], the literary scholar selects a certain phrase during the close reading process. Next, that phrase is searched within the text corpus and the phrase’s distribution is shown in the form of a heat map. Two approaches provide bottom-up strategies to support the analysis of named entities. When annotating literary texts [AGZH15], places related to Edinburgh are marked, and a linked heat map that displays the distribution of all annotations is accordingly updated. In [OGH15], the user explores automatically tagged named entities of scientific papers in close reading mode. After editing, a graph reflecting contained entities and relationships among them is generated.

Top-down & bottom-up approaches taken within one visualization entity allow for switching between close and distant reading while taking into account manipulations of the preceding view. Some of these approaches support the analysis of similar patterns. In [JRS∗09] and [PMMR15], the user can switch between heat map (distant reading) and text view (close reading). A side-by-side navigation between source text (close reading) and a distant reading graph showing the relationships among textual entities is illustrated in [WV08, RFH14]. Here, textual entities can be selected in both the graph and the text, triggering mutual updates. Other text analysis tasks also benefit from the combination of top-down and bottom-up approaches. Visual analytics methods are a typical use case. The VarifocalReader [KJW∗14] hierarchically visualizes a document with the help of distant views (structural overview, tag clouds) and close reading techniques (use of color, digital copy), thus supporting hierarchical navigation. In close reading mode, automatically acquired classifications of textual entities can be manually modified, which subsequently affects distant views. The same applies to social networks automatically extracted from newspaper articles [KLB14]. The user browses the graph, opens close reading views associated with individual nodes and annotates the source text, which, again, affects the distant view and is used for classifier training. WordSeer [MH13] allows for a multifaceted perusal of a text corpus. For selected textual entities, several close and distant reading views can be used to browse the corresponding source texts. Within the close reading views, the user can group words into classes, which can then be used as a starting point for text corpus analysis.

Top-down strategies support nearly all types of text analysis tasks. They are mostly applied to combine close and distant reading visualizations. Such methods implement the Information Seeking Mantra in its original meaning. Initially, a distant view on the textual data is shown, the user can often manipulate the visualization by means of filtering and zooming, and finally retrieve the details-on-demand by clicking on a potentially interesting data item. In some cases, the texts are simply shown at the end of the information seeking pipeline [HSC08, DWS∗12, MFM13, RSDCD∗13, FKT14, Wil15a, Wil15b, HFM16]. Observed words or text patterns are often highlighted in the close reading view by way of coloring [VCPK09, GZ12, Wol13, AKV∗14, HPR14, HAC∗15, JBR∗15, JKH∗15]. Various colors can thereby illustrate word categories [CDP∗07], e.g., types of toponyms in the Herodotus Timemap [BPBI10] or topological cluster information [OST∗10]. In some systems, close reading is more closely related to the preceding distant reading. In [BGHJ∗14], the connection between close and distant reading is achieved by zooming. The distant view, a structural overview, highlights certain patterns, and zooming allows the close reading of individual passages. In [JGBS14], a grid-based heat map visualizes similarities between the texts of a corpus, and clicking on a grid cell opens a close reading view showing the corresponding two texts juxtaposed with connections between related text passages. Similarly, the navigation between distant plagiarism overviews and the close reading of plagiarized passages is organized in [ZNMS15] and [RPSF15]. A distant reading visualization illustrating the variance of verses among multiple Bible editions provides distant views as heat maps on various text hierarchy levels (entire Bible, book, chapter) [JG15]. In the
chapter view, the close reading of individual verses is possible. The CorpusSeparator presented in [CWG11] is a distant view used to generate a weighted tag list (dependent on corpus statistics). Based upon these weights, the close reading view of a text (illustrated with Shakespeare’s A Midsummer Night’s Dream) is manipulated by coloring and sizing lines.

9. Collaboration Experiences

Within our collection, we examined papers about the research experiences reported by visualization researchers in order to provide suggestions that might help visualization scholars new to the field of digital humanities to develop successful visualizations. Some projects reveal valuable insights into collaboration experiences. Excellent design studies are outlined by Abdul-Rahman et al. [ARLC∗13], McCurdy et al. [MLCM16] and Hinrichs et al. [HFM16]. All applications were successfully presented in visualization and digital humanities issues. Other publications also share important experiences. A collective overview of the gained insights regarding various aspects of the development phase is outlined below.

Project start. The beneficial initial decision to carry out a user-centered design study [Mun09] is reported in various works (e.g., [ARLC∗13, JFS16]). This leads to a very close collaboration between researchers of the different fields, which helps to avoid steering the development of a visualization in wrong directions. Further important tasks at the beginning of a digital humanities project are discussions about the research questions and perspectives for which a visualization, be it for close or distant reading, can be beneficial [JGBS14]. These discussions include the analysis of the data features [VCPK09] as well as the setup of regular project meetings to work on and extend a collaborative idea. A typical problem of digital humanities projects is reported in [MLCM16]. The “initial conversations [between visualization and humanities scholars] were broad and open-ended,” also because the humanities scholars “did not have specific goals” in mind. Furthermore, the humanities scholars were sceptical that visualization can support their research, and there was also an “anxiety that the computer would inhibit the qualitative experience of the poetic encounter.” After humanities scholars presented examples of interesting features and computer scientists “established methods for computationally detecting and analyzing the devices that most interested them,” a common project basis and tasks had been generated. In such circumstances, special workshops can also help computer scientists and humanities scholars get acquainted with each other’s tasks, mindsets and workflows [ARLC∗13]. Abdul-Rahman reports the importance of visualization researchers participating in poetry readings and in-depth discussions with literary scholars to discover “a variety of interesting problems that might be subject to visualization solutions.” Also, a small corpus generated for literary scholars was helpful for Abdul-Rahman to examine research questions without the aid of existing visualizations. The generation of a text corpus is often an enduring task for humanities scholars that begins with a project launch [HFM16]. As a consequence, visualization researchers start with a small training set, and should therefore design a visualization as flexible as possible in order to accommodate potential changes in humanities scholars’ research interests, and to avoid limitations. In the best case, the text corpus to be analyzed is already available in digital form and a precise research question is at hand, as outlined in [JFS16].

Iterative development of prototypes. The involvement of humanities scholars in various stages of the development is necessary to ensure that the resulting visualization is intuitive and will be used. For example, regular face-to-face sessions between computer scientists and humanities scholars can help to identify problems and potential enhancements of the prototype design [JGBS14]. Such a session should be composed of a demonstration and trials of the visualization prototype as well as intense discussions in order to gather the levels of detail and complexity that a visualization should ideally reach [ARLC∗13]. Geßner [JGF∗15] stated that such a process finally helps to gain an intuitive result “even for the inexperienced, maybe sceptical user.” When designing a profiling system for musicians [JFS16], a frequent interdisciplinary get-together was important for the visualization researchers to communicate their own concerns and to iteratively redesign the underlying mathematical basis (similarity measures), thereby ensuring that aspects of data transformation remained comprehensible for the collaborating musicologists. Hinrichs et al. [HFM16] outline that scholarly exchange is of particular importance if the textual data source evolves throughout the project. On the one hand, the archival work of humanities scholars when working on the dataset may further develop hypotheses that trigger new visualization ideas, and on the other hand, a visualization has the potential of changing the humanities scholars’ research processes and their perspective on a text collection. For the development of Poemage [MLCM16], frequent meetings helped visualization researchers in understanding the problem space and engaged literary scholars in working with the visualization, and finally, in developing “an interface that reflected their interests, aesthetics, and values.” The authors also document that the departure from well-established design principles such as regarding “ambiguity as a fundamental source of insight” or “not restricting the tool to avoid clutter” was necessary in raising the value of the visualization. Another example is given by scholars involved in the development of Neatline [NMG∗13], which is based upon Omeka [Ome15], a content management system for online digital collections. The stepwise development of Neatline led to advancements of Omeka itself, thus benefiting a far wider audience than originally anticipated.

Evaluating visualizations with humanities scholars. The evaluation sessions provide important insights into design, intuitiveness, the utility of visualizations and into
potential enhancements. A number of humanities scholars working with the visualizations suggested further enhancements, some of which strengthen the importance of close reading solutions [CWG11, HFM16]. For example, when similar close and distant views were provided, “users stressed that it is preferable to see the actual words” rather than abstract overviews [JRS∗09]. When working with the VarifocalReader [KJW∗14], the user liked to view “the digitized image of a book’s page and mentioned that this would increase his trust in the approach.” The metaphor of a digitized text is also used when comparing various English translations of the Bible [JGF∗15], which “reminds the user that it is a book to be read, not just some string of letters.” Although developed for museum visitors, the importance of aesthetic appeal to engage in information exploration was reported in [HSC08]. The fact that visualizations should be designed to meet humanities work practices is mentioned in [BDF∗10]. Some humanities scholars also mentioned issues or limitations with the presented tools. For instance, the need to confirm preliminary results by analyzing larger datasets or, in other words, more texts and in more languages [GCL∗13]. In [HAC∗15], the attached labeling was a crucial issue. The authors summarized the requirement of a visual representation “to be clear in order to make visualizations a valid research tool.” As stated in [JFS16], future extensions of visualizations usually require efforts from both computer scientists and humanities scholars. Scientists involved in [HKTK14] stated that collaborative work helped to reactivate and to regenerate traditional literary methodologies rather than abandon them. The turn from initial scepticism when starting the digital humanities project to enthusiasm when using the resultant visualization is reported in [MLCM16]. In a long-term evaluation, Hinrichs et al. summarize the potential of their developed visualization for the collaborating humanities scholars [HFM16]. As in their case study on fiction literature, a visualization should be able (1) to confirm existing hypotheses, (2) to refine humanities scholars’ research questions, (3) to offer new ways of answering research questions, (4) to negotiate quantitative and qualitative interpretation of the underlying text corpus, and (5) to trigger new research questions. Other visualization researchers share similar experiences gained in evaluation sessions with humanities scholars [CWG11, ARLC∗13, GCL∗13, HAC∗15, MLCM16]. Such an example is given by Vuillemot et al. [VCPK09]. When working on Gertrude Stein’s The Making of Americans with POSVis, the collaborating literary scholar could generate substantial knowledge about the usage of the word “one”. This led to a publication she presented at the Digital Humanities conference in 2009 [CPV09].

10. Future Challenges

Throughout our work on this survey, we marked major challenges in the digital humanities where the visualization community can contribute valuable research.

Adapting existing visualization techniques. For some of the text analysis research questions posed in the digital humanities, the adaptation of existing techniques proposed in visualization research papers is beneficial. A positive example is the Trading Consequences project [HAC∗15]. The visualization scholars involved designed a system inspired by VisGets [DCCW08] and made use of Parallel Tag Clouds [CVW09]. Both visualization techniques were not primarily developed for digital humanities data, but they were beneficially adapted to support humanities scholars. Occasionally, new techniques for close and distant reading are designed while appropriate, sophisticated visualizations unrelated to digital humanities data already exist. For future research tasks, the inclusion of these visualizations into the workflows of humanities scholars could lead to faster hypothesis generation, given the limited time for development. As an example, the Sequence Surveyor, which provides a dendrogram to explore genomic structures [ADG11], could support future research. Each leaf of the dendrogram shows a heat map illustrating genome distributions. This metaphor could be used to visualize both the rhyme structure of a poem in dendrogram form and the heat maps displaying phonetic patterns. Other possible adaptations of existing visualization techniques for digital humanities research can be found in the previous version of this survey [JFCS15].

Novel techniques for close reading. Various publications outline that close reading benefits from visualization, e.g., by highlighting crowdsourcing statistics [KG13, WMN∗14] or displaying information about textual features and structure [ARLC∗13, JGF∗15] alongside the source text. Although close reading is an essential task for humanities scholars, in most cases only simple visualization techniques, such as color coding textual entities, are provided. Few works attend to the matter of enhancing close reading in a beneficial manner. For example, the work on word scale visualizations is a promising technique [GWFI14] from which many humanities scholars may profit. But despite the proposed annotations of individual words with statistics or of country names with polygons, the concept needs to be expanded to annotating other kinds of named entities. For example, providing supplementary information about (1) acting persons and their relationships, (2) artifacts mentioned in texts, or (3) occurring references could be interesting features for humanities scholars. Future work in visualization should include the development of design methods to meet such use cases, and studies that measure the benefit of glyph-based approaches for close reading in comparison to using color or font size to express certain text features.

Visualizing transpositions in parallel texts. When observing similarities and differences among various editions of a text, one focus is to detect transpositions of textual entities. Such transpositions may occur on various text hierarchy levels, e.g., changed word order, modified argumentation structures, or even when exchanging whole paragraphs or sections. Although suitable methods exist for the first two
hierarchy levels (words, sentences) [WJ13b, JGF∗15], there are no visualization techniques capable of coherently visualizing transpositions on all hierarchy levels by combining means of close and distant reading.

Geospatial uncertainty. Many visualizations deal with placenames extracted from literary texts to illustrate the geographical knowledge of a particular era. Here, various mapping issues arise [JW13]. Texts may contain placenames of varying granularity (e.g., country, region, city) or type (e.g., points for cities, polygons for areas, polylines for rivers) or even fictional placenames, which are hard to represent. Furthermore, placenames can themselves carry uncertainty of varying degrees, e.g., the exact locations of “Sparta” and “Atlantis” have yet to be discovered. Another form of uncertainty is defined by contextual information, e.g., expressions like “in London” and “close to London” cover various geospatial ranges. The development of a design space providing solutions to visualize these various types of geospatial uncertainty is one of the current primary challenges in digital humanities. Such a design space could be built upon the ideas of MacEachren for visualizing geospatial uncertainty [MRO∗12].

Temporal uncertainty. The visualization of temporal uncertainty is an equally important future task. Such uncertainties occur, for instance, when dating cultural heritage objects, such as historical manuscripts [JW13, BESL14]. Temporal metadata, in fact, can be provided in multifarious manners, e.g., 1450, before 1450, after 1450, around 1450, 15th century, first half of the 15th century, etc. One can try to transform such temporal formats into machine-parsable time ranges, but the visualization of such uncertainties is a crucial issue as it comprises considerable risks of misinterpretation. Applying methods capable of visualizing temporal uncertainty as proposed by Slingsby [SDW11] can be a first step, but their utility for humanities applications needs to be investigated.

Reconstructing workflows with visualization. In two visualization papers, the authors describe situations where, during their case studies, humanities scholars mentioned the importance of visualization features that emulate the scholar’s workflow. In [KJW∗14], users liked the display of digital copies as this builds trust in the visualization. When working with genealogy visualizations [BDF∗10], historians “insisted on redundant representation of gender ... that is consistent with their current practices.” Both situations illustrate the future challenge of inventing visualization techniques for digital humanities applications that the humanities scholar can easily adapt. An important task for the computer scientist is not only to incorporate a scholar’s workflow when designing the visualization, but to also communicate all aspects of data transformation, so that a scholar is able to generate trustworthy hypotheses. The importance of this issue is documented in [GO12].

Usability studies. Although the utility of most visualizations considered within our survey was illustrated by usage scenarios, we found little evidence of usability studies conducted to, for example, justify design decisions. The number of humanities scholars participating in such studies is potentially very small due to the multifarious research interests scholars may have in a large body of texts belonging to different eras and genres. Generating a user study format that caters for the interests of many different scholars is required to gain valuable insights into guidelines for designing visualizations for the digital humanities. When it comes to tool building, in fact, the digital humanities community poses interesting and complex challenges by virtue of its interdisciplinary nature. It embraces a wider range of disciplines, so the techniques it offers should address the larger scope. It also welcomes contrasting mindsets, methods and cultures. While sharing similar logical and analytical methods, computer scientists tend towards problem solving, humanities scholars towards knowledge acquisition and dissemination [Hen14]. Neither community should operate in subservience to the other; rather, they should work together, complementing each other’s approaches. For these reasons and in this context, specialist terminology, assumptions and technical barriers should all be avoided. It is in this sense that tool usability should be understood not only as improved functionality or aesthetics but as a transparent guide to utility [GO12].

Qualitative studies. The number of projects that include visualization components as valuable means of text analysis indicates the potential of visualization to support digital humanities research. Some scholars suggest the role of visualization as a provider of new perspectives on the texts that facilitate text comprehension and hypothesis generation. For example, humanities scholars involved in the development of the PoemViewer [ARLC∗13] mentioned that “they would not likely look for insight from the tool itself ... they would look for enhanced poetic engagement, facilitated by visualization.” Hinrichs et al. [HFM16] state that “information visualizations ... are not a means to an end but a starting point to explore, interpret, and discuss literary collections.” Similarly, Sinclair [SRR13] argues that “a visualization that produces a single output for a given body of material is of limited usefulness; a visualization that provides many ways to interact with the data, viewed from different perspectives, is better; a visualization that contributes to new and emergent ways of understanding the material is best.” Comprehensive case studies that scientifically debate the actual influence and impact of visualization could further specify its role and further strengthen its value as part of humanities research.

11. Conclusion

Computer scientists and humanities scholars seemingly do not have many things in common. Although they share some methodologies, they are geared towards different goals. But the digital age created a platform that brings people from two research areas together: the digital humanities.
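As a concrete illustration of the step mentioned under temporal uncertainty in Section 10, the transformation of dating expressions into machine-parsable time ranges might be sketched as follows. This is only a minimal sketch under stated assumptions: the function names `parse_date` and `century_bounds`, the margin `AROUND_MARGIN` of ten years for “around”, the reading of the n-th century as the years (n-1)·100+1 to n·100, and the use of `None` for open-ended bounds are all choices made here for illustration and are not taken from any surveyed system.

```python
import re
from typing import Optional, Tuple

# (earliest year, latest year); None marks an open-ended bound.
Range = Tuple[Optional[int], Optional[int]]

AROUND_MARGIN = 10  # assumption: "around X" is read as X +/- 10 years


def century_bounds(n: int) -> Tuple[int, int]:
    """First and last year of the n-th century, e.g. 15 -> (1401, 1500)."""
    return (n - 1) * 100 + 1, n * 100


def parse_date(expr: str) -> Range:
    """Map a dating expression to an inclusive year range."""
    text = expr.strip().lower()

    # Exact year: "1450"
    m = re.fullmatch(r"(\d{1,4})", text)
    if m:
        year = int(m.group(1))
        return year, year

    # Qualified year: "before 1450", "after 1450", "around 1450"
    m = re.fullmatch(r"(before|after|around) (\d{1,4})", text)
    if m:
        year = int(m.group(2))
        if m.group(1) == "before":
            return None, year - 1
        if m.group(1) == "after":
            return year + 1, None
        return year - AROUND_MARGIN, year + AROUND_MARGIN

    # Century, optionally restricted to one half:
    # "15th century", "first half of the 15th century"
    m = re.fullmatch(
        r"(?:(first|second) half of the )?(\d{1,2})(?:st|nd|rd|th) century",
        text,
    )
    if m:
        start, end = century_bounds(int(m.group(2)))
        if m.group(1) == "first":
            return start, start + 49
        if m.group(1) == "second":
            return start + 50, end
        return start, end

    raise ValueError(f"unsupported date expression: {expr!r}")


if __name__ == "__main__":
    for expr in ["1450", "before 1450", "around 1450",
                 "first half of the 15th century"]:
        print(expr, "->", parse_date(expr))
```

Such ranges make the data parsable, but they also hide interpretive choices: the width chosen for “around” or the handling of open-ended bounds directly shapes what a timeline shows, which is the risk of misinterpretation noted above.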
[BGHE10] BÜCHLER M., GESSNER A., HEYER G., ECKART T.: Detection of Citations and Textual Reuse on Ancient Greek Texts and its Applications in the Classical Studies: eAQUA Project. In Proceedings of the Digital Humanities 2010 (2010). 7, 8, 11
[BGHJ∗14] BÖGEL T., GOLD V., HAUTLI-JANISZ A., ROHRDANTZ C., SULGER S., BUTT M., HOLZINGER K., KEIM D. A.: Towards visualizing linguistic patterns of deliberation: a case study of the S21 arbitration. In Proceedings of the Digital Humanities 2014 (2014). 5, 8, 12, 16
[BHW11] BINGENHEIMER M., HUNG J.-J., WILES S.: Social network visualization from TEI data. Literary and Linguistic Computing 26, 3 (2011), 271–278. 5, 8, 15
[BJ14] BINDER J. M., JENNINGS C.: Visibility and meaning in topic models and 18th-century subject indexes. Literary and Linguistic Computing 29, 3 (2014), 405–411. 5, 7, 8, 13, 16
[BM13] BREHMER M., MUNZNER T.: A multi-level typology of abstract visualization tasks. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2376–2385. 7
[BNJ03] BLEI D. M., NG A. Y., JORDAN M. I.: Latent Dirichlet Allocation. Journal of Machine Learning Research 3 (2003), 993–1022. 7
[Boo13] BOOTH A.: Documentary Social Networks: Collective Biographies of Women. In Proceedings of the Digital Humanities 2013 (2013). 5, 8, 15
[Boy13] BOYLES N.: Closing in on Close Reading. Educational Leadership 70, 4 (2013), 36–41. 2
[BPBI10] BARKER E., PELLING C., BOUZAROVSKI S., ISAKSEN L.: Mapping the World of an Ancient Greek Historian: The HESTIA Project. In Proceedings of the Digital Humanities 2010 (2010). 5, 6, 8, 13, 14, 16
[Bra12] BRADLEY A. J.: Violence and the Digital Humanities Text as Pharmakon. In Proceedings of the Digital Humanities 2012 (2012). 3
[BW08] BYRON L., WATTENBERG M.: Stacked Graphs – Geometry & Aesthetics. Visualization and Computer Graphics, IEEE Transactions on 14, 6 (Nov 2008), 1245–1252. 14
[CAA∗14] CORRELL M., ALEXANDER E., ALBERS D., SARIKAYA A., GLEICHER M.: Navigating Reductionism and Holism in Evaluation. In Proceedings of the Fifth Workshop on Beyond Time and Errors: Novel Evaluation Methods for Visualization (New York, NY, USA, 2014), BELIV ’14, ACM, pp. 23–26. 3
[CDP∗07] CLEMENT T., DON A., PLAISANT C., AUVIL L., PAPE G., GOREN V.: ’Something that is interesting is interesting them’: Using Text Mining and Visualizations to Aid Interpreting Repetition in Gertrude Stein’s The Making of Americans. In Proceedings of the Digital Humanities 2007 (2007). 5, 7, 8, 12, 15, 16
[CEJ∗14] CRAIG H., EDER M., JANNIDIS F., KESTEMONT M., RYBICKI J., SCHÖCH C.: Validating Computational Stylistics in Literary Interpretation. In Proceedings of the Digital Humanities 2014 (2014). 7, 8, 14
[CGM∗12] CHATURVEDI M., GANNOD G., MANDELL L., ARMSTRONG H., HODGSON E.: Myopia: A Visualization Tool in Support of Close Reading. In Proceedings of the Digital Humanities 2012 (2012). 5, 6, 8, 11
[CL13] COLES K., LEIN J. G.: Solitary Mind, Collaborative Mind: Close Reading and Interdisciplinary Research. In Proceedings of the Digital Humanities 2013 (2013). 3
[CLT∗11] CUI W., LIU S., TAN L., SHI C., SONG Y., GAO Z., QU H., TONG X.: TextFlow: Towards Better Understanding of Evolving Topics in Text. Visualization and Computer Graphics, IEEE Transactions on 17, 12 (Dec 2011), 2412–2421. 8, 13, 14
[CLWW14] CUI W., LIU S., WU Z., WEI H.: How Hierarchical Topics Evolve in Large Text Corpora. Visualization and Computer Graphics, IEEE Transactions on 20, 12 (Dec 2014), 2281–2290. 7, 8, 13, 14
[CMS99] CARD S. K., MACKINLAY J. D., SHNEIDERMAN B.: Readings in information visualization: using vision to think. Morgan Kaufmann, 1999. 5
[Cob05] COBURN A.: Text Modeling and Visualization with Network Graphs. In Proceedings of the Digital Humanities 2005 (2005). 8, 14, 15
[Cor13] CORDELL R.: "Taken Possession of": The Reprinting and Reauthorship of Hawthorne’s "Celestial Railroad" in the Antebellum Religious Press. Digital Humanities Quarterly 7, 1 (2013). 5, 7, 8
[CPV09] CLEMENT T., PLAISANT C., VUILLEMOT R.: The Story of One: Humanity scholarship with visualization and text analysis. In Proceedings of the Digital Humanities 2009 (2009). 18
[CRS∗14] CHRISTIE A., ROSS S., SAYERS J., TANIGAWA K., TEAM I.-M. R.: Z-Axis Scholarship: Modeling How Modernists Write the City. In Proceedings of the Digital Humanities 2014 (2014). 3
[CSV08] CIULA A., SPENCE P., VIEIRA J. M.: Expressing complex associations in medieval historical documents: the Henry III Fine Rolls Project. Literary and Linguistic Computing 23, 3 (2008), 311–325. 8, 15
[CTA∗13] CLEMENT T., TCHENG D., AUVIL L., CAPITANU B., BARBOSA J.: Distant Listening to Gertrude Stein’s ’Melanctha’: Using Similarity Analysis in a Discovery Paradigm to Analyze Prosody and Author Influence. Literary and Linguistic Computing 28, 4 (2013), 582–602. 6, 7, 8, 11, 12, 16
[CVW09] COLLINS C., VIEGAS F., WATTENBERG M.: Parallel Tag Clouds to explore and analyze faceted text corpora. In Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on (Oct 2009), pp. 91–98. 8, 13, 18
[CWG11] CORRELL M., WITMORE M., GLEICHER M.: Exploring collections of tagged text for literary scholarship. Computer Graphics Forum 30, 3 (2011), 731–740. 8, 10, 11, 12, 17, 18
[DCCW08] DÖRK M., CARPENDALE S., COLLINS C., WILLIAMSON C.: VisGets: Coordinated Visualizations for Web-based Information Exploration and Discovery. Visualization and Computer Graphics, IEEE Transactions on 14, 6 (Nov 2008), 1205–1212. 18
[DFM∗08] DYENS O., FOREST D., MONDOU P., COOLS V., JOHNSTON D.: Information visualization and text mining: application to a corpus on posthumanism. In Proceedings of the Digital Humanities 2008 (2008). 7, 8, 13
[DNCM14] DYE D. J., NAPOLIN J. B., CORNELL E., MARTIN W.: Digital Yoknapatawpha: Interpreting a Palimpsest of Place. In Proceedings of the Digital Humanities 2014 (2014). 5, 8, 13, 14
[Dru11] DRUCKER J.: Humanities Approaches to Graphical Display. Digital Humanities Quarterly 5, 1 (2011). 3
[DWS∗12] DOU W., WANG X., SKAU D., RIBARSKY W., ZHOU M.: LeadLine: Interactive visual analysis of text data through event identification and exploration. In Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on (Oct 2012), pp. 93–102. 8, 13, 14, 16
[Ede14] EDER M.: Stylometry, network analysis, and Latin literature. In Proceedings of the Digital Humanities 2014 (2014). 5, 7, 8, 14
[EJ14] EVANS C., JASNOW B.: Mapping Homer’s Catalogue of Ships. Literary and Linguistic Computing 29, 3 (2014), 317–325. 7, 8, 13
[eMa15] eMargin, 2015. http://eMargin.bcu.ac.uk/ (Retrieved 2015-01-09). 3
[ESK14] EISENSTEIN J., SUN I., KLEIN L. F.: Exploratory Thematic Analysis for Historical Newspaper Archives. In Proceedings of the Digital Humanities 2014 (2014). 8, 13, 14, 15
[EX10] ESTEVA M., XU W.: Finding Stories in the Archive through Paragraph Alignment. In Proceedings of the Digital Humanities 2010 (2010). 8, 14
[FGM05] FINKEL J. R., GRENAGER T., MANNING C.: Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics (2005), Association for Computational Linguistics, pp. 363–370. 7
[Fin10] FINN E.: The Social Lives of Books: Mapping the Ideational Networks of Toni Morrison. In Proceedings of the Digital Humanities 2010 (2010). 5
[FKT14] FANKHAUSER P., KERMES H., TEICH E.: Combining Macro- and Microanalysis for Exploring the Construal of Scientific Disciplinarity. In Proceedings of the Digital Humanities 2014 (2014). 8, 12, 13, 16
[FMT15] FRANZINI G., MAHONY S., TERRAS M.: A Catalogue of Digital Editions. In Scholarly digital editions: Theory, practice and future perspectives (2015), Pierazzo E., Driscoll M. J., (Eds.), Open Book Publishers. 5
[FS11] FORSTALL C., SCHEIRER W. J.: Visualizing Sound as Functional N-Grams in Homeric Greek Poetry. In Proceedings of the Digital Humanities 2011 (2011). 6, 8, 12
[GCL∗13] GENG Z., CHEESMAN T., LARAMEE R. S., FLANAGAN K., THIEL S.: ShakerVis: Visual analysis of segment variation of German translations of Shakespeare’s Othello. Information Visualization (2013). 5, 6, 7, 8, 12, 15, 16, 18
[GDMF∗14] GREGORY I., DONALDSON C., MURRIETA-FLORES P., RUPP C., BARON A., HARDIE A., RAYSON P.: Digital approaches to understanding the geographies in literary and historical texts. In Proceedings of the Digital Humanities 2014 (2014). 8, 13, 14
[Geo15] GeoNames Gazetteer, 2015. http://www.geonames.org/ (Retrieved 2015-01-10). 13
[GH11a] GOODWIN J., HOLBO J.: Reading graphs, maps, trees: responses to Franco Moretti. Parlor Press, Anderson, SC, 2011. Book, Whole. 3
[GH11b] GREGORY I. N., HARDIE A.: Visual GISting: bringing together corpus linguistics and Geographical Information Systems. Literary and Linguistic Computing 26, 3 (2011), 297–314. 7, 8, 13
[GO12] GIBBS F., OWENS T.: Building Better Digital Humanities Tools: Toward broader audiences and user-centered designs. Digital Humanities Quarterly 6, 2 (2012). 19
[Goo15] Google Books, 2015. https://books.google.com/ (Retrieved 2015-01-09). 1, 3
[Got15] Gothenburg model, 2015. http://wiki.tei-c.org/index.php/Textual_Variance (Retrieved 2015-10-06). 7
[GTAHS15] GRAY S. J., TERRAS M., AMMANN R., HUDSON-SMITH A.: Textal: Unstructured Text Analysis Workflows Through Interactive Smartphone Visualisations. In Proceedings of the Digital Humanities 2015 (2015). 6, 7, 8, 13
[GTW13] GOODING P., TERRAS M., WARWICK C.: The myth of the new: Mass digitization, distant reading, and the future of the book. Literary and Linguistic Computing 28, 4 (2013), 629–639. 5
[GWFI14] GOFFIN P., WILLETT W., FEKETE J.-D., ISENBERG P.: Exploring the Placement and Design of Word-Scale Visualizations. Visualization and Computer Graphics, IEEE Transactions on 20, 12 (Dec 2014), 2291–2300. 8, 10, 11, 18
[GZ12] GEDZELMAN S., ZANCARINI J.-C.: HyperMachiavel: a translation comparison tool. In Proceedings of the Digital Humanities 2012 (2012). 7, 8, 15, 16
[HAC∗15] HINRICHS U., ALEX B., CLIFFORD J., WATSON A., QUIGLEY A., KLEIN E., COATES C. M.: Trading Consequences: A Case Study of Combining Text Mining and Visualization to Facilitate Document Exploration. Digital Scholarship in the Humanities (2015). 6, 7, 8, 13, 14, 16, 18
[HAHB15] HEUSER R., ALGEE-HEWITT M., BENDER J.: Knowledge Networks, Juxtaposed: Disciplinarity in the Encyclopédie and Wikipedia. In Proceedings of the Digital Humanities 2015 (2015). 8, 15
[HAHT∗15] HEUSER R., ALGEE-HEWITT M., TRAN V., LOCKHART A., STEINER E.: Mapping the Emotions of London in Fiction, 1700-1900: A Crowdsourcing Experiment. In Proceedings of the Digital Humanities 2015 (2015). 8, 13
[Haw00] HAWTHORN J.: A glossary of contemporary literary theory. Oxford University Press, 2000. 2
[HCC14] HSIANG J., CHEN L., CHUNG C.-H.: A glimpse of the change of worldview between 7th and 10th century China through two leishu. In Proceedings of the Digital Humanities 2014 (2014). 8, 15
[HDvHM∗15] HAENTJENS DEKKER R., VAN HULLE D., MIDDELL G., NEYT V., VAN ZUNDERT J.: Computer-supported collation of modern manuscripts: CollateX and the Beckett Digital Manuscript Project. Digital Scholarship in the Humanities 30, 3 (2015), 452–470. 7
[Hen14] HENSELER C.: Minecraft Anyone? Encouraging A New Generation of Computer Scientists and Humanists, 2014. http://tinyurl.com/lk58xlv (Retrieved 2015-01-09). 19
[HFM16] HINRICHS U., FORLINI S., MOYNIHAN B.: Speculative Practices: Utilizing InfoVis to Explore Untapped Literary Collections. Visualization and Computer Graphics, IEEE Transactions on 22, 1 (Jan 2016), 429–438. 6, 7, 8, 13, 14, 16, 17, 18, 19
[HKTK14] HOWELL S., KELLEHER M., TEEHAN A., KEATING J.: A Digital Humanities Approach to Narrative Voice in The Secret Scripture: Proposing a New Research Method. Digital Humanities Quarterly 8, 2 (2014). 7, 8, 11, 18
[Hoc04] HOCKEY S.: The history of humanities computing. A companion to digital humanities (2004), 3–19. 1
[HPR14] HOYT E., PONTO K., ROY C.: Visualizing and Analyzing the Hollywood Screenplay with ScripThreads. Digital Humanities Quarterly 8, 4 (2014). 6, 8, 14, 16
[HSC08] HINRICHS U., SCHMIDT H., CARPENDALE S.: EMDialog: Bringing Information Visualization into the Museum. Visualization and Computer Graphics, IEEE Transactions on 14, 6 (Nov 2008), 1181–1188. 5, 8, 14, 15, 16, 18
[Jas01] JASINSKI J.: Rhetoric and Society: Sourcebook on
Rhetoric: Key Concepts in Contemporary Rhetorical Studies, vol. 4. Sage Publications, 2001. 2
[JBR∗15] JÄNICKE S., BLUMENSTEIN J., RÜCKER M., ZECKZER D., SCHEUERMANN G.: Visualizing the Results of Search Queries on Ancient Text Corpora with Tag Pies. Digital Humanities Quarterly (2015). 7, 8, 12, 13, 16
[JFCS15] JÄNICKE S., FRANZINI G., CHEEMA M. F., SCHEUERMANN G.: On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges. In Eurographics Conference on Visualization (EuroVis) - STARs (2015), Borgo R., Ganovelli F., Viola I., (Eds.), The Eurographics Association. 2, 18
[JFS16] JÄNICKE S., FOCHT J., SCHEUERMANN G.: Interactive Visual Profiling of Musicians. Visualization and Computer Graphics, IEEE Transactions on 22, 1 (Jan 2016), 200–209. 7, 8, 13, 15, 16, 17, 18
[JG15] JÄNICKE S., GESSNER A.: A Distant Reading Visualization for Variant Graphs. In Proceedings of the Digital Humanities 2015 (2015). 6, 8, 11, 12, 16
[JGBS14] JÄNICKE S., GESSNER A., BÜCHLER M., SCHEUERMANN G.: Visualizations for Text Re-use. GRAPP/IVAPP (2014), 59–70. 5, 7, 8, 11, 12, 15, 16, 17
[JGF∗15] JÄNICKE S., GESSNER A., FRANZINI G., TERRAS M., MAHONY S., SCHEUERMANN G.: TRAViz: A Visualization for Variant Graphs. Digital Scholarship in the Humanities 30, suppl 1 (2015), i83–i99. 7, 8, 11, 17, 18, 19
[JHSS12] JÄNICKE S., HEINE C., STOCKMANN R., SCHEUERMANN G.: Comparative Visualization of Geospatial-temporal Data. In GRAPP/IVAPP (2012), pp. 613–625. 8, 13, 14
[JKH∗15] JOHN M., KOCH S., HEIMERL F., MÜLLER A., ERTL T., KUHN J.: Interactive Visual Analysis Of German Poetics. In Proceedings of the Digital Humanities 2015 (2015). 6, 8, 12, 16
[Joc12] JOCKERS M.: Computing and Visualizing the 19th-Century Literary Genome. In Proceedings of the Digital Humanities 2012 (2012). 7, 8, 14
[Joc13] JOCKERS M. L.: Macroanalysis: Digital Methods & Literary History. University of Illinois Press, 2013. 2
[JOL∗15] JÄHNICHEN P., OESTERLING P., LIEBMANN T., HEYER G., KURAS C., SCHEUERMANN G.: Exploratory Search Through Interactive Visualization of Topic Models. In Proceedings of the Digital Humanities 2015 (2015). 6, 8, 13
[JRS∗09] JONG C.-H., RAJKUMAR P., SIDDIQUIE B., CLEMENT T., PLAISANT C., SHNEIDERMAN B.: Interactive Exploration of Versions across Multiple Documents. In Proceedings of the Digital Humanities 2009 (2009). 8, 10, 12, 16, 18
[JW13] JÄNICKE S., WRISLEY D. J.: Visualizing Uncertainty: How to Use the Fuzzy Data of 550 Medieval Texts? In Proceedings of the Digital Humanities 2013 (2013). 5, 7, 8, 13, 14, 19
[Kau15] KAUFMAN M.: ’Everything on Paper Will Be Used Against Me’: Quantifying Kissinger. In Proceedings of the Digital Humanities 2015 (2015). 5, 6, 7, 8, 14
[KBK11] KRSTAJIC M., BERTINI E., KEIM D.: CloudLines: Compact Display of Event Episodes in Multiple Time-Series. Visualization and Computer Graphics, IEEE Transactions on 17, 12 (Dec 2011), 2432–2439. 6, 8, 14
[KG13] KEHOE A., GEE M.: eMargin: A Collaborative Textual Annotation Tool. Ariadne 71 (July 2013). 2, 3, 18
[KJW∗14] KOCH S., JOHN M., WORNER M., MULLER A., ERTL T.: VarifocalReader – In-Depth Visual Analysis of Large Text Documents. Visualization and Computer Graphics, IEEE Transactions on 20, 12 (Dec 2014), 1723–1732. 7, 8, 10, 11, 12, 13, 16, 18, 19
[KKC15] KERRACHER N., KENNEDY J., CHALMERS K.: A Task Taxonomy for Temporal Graph Visualisation. IEEE Transactions on Visualization and Computer Graphics 21, 10 (Oct 2015), 1160–1172. 7
[KKL∗11] KIM H., KANG B.-M., LEE D.-G., CHUNG E., KIM I.: Trends 21 Corpus: A Large Annotated Korean Newspaper Corpus for Linguistic and Cultural Studies. In Proceedings of the Digital Humanities 2011 (2011). 7, 8, 14
[KLB14] KOCHTCHI A., LANDESBERGER T. V., BIEMANN C.: Networks of Names: Visual Exploration and Semi-Automatic Tagging of Social Networks from Newspaper Articles. Computer Graphics Forum 33, 3 (2014), 211–220. 5, 6, 8, 15, 16
[Kle12] KLEIN L. F.: Social Network Analysis and Visualization in ’The Papers of Thomas Jefferson’. In Proceedings of the Digital Humanities 2012 (2012). 5, 6, 8, 14, 15
[KO07] KEIM D., OELKE D.: Literature Fingerprinting: A New Method for Visual Literary Analysis. In Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on (Oct 2007), pp. 115–122. 6, 7, 8, 12
[KOTM13] KIMURA F., OSAKI T., TEZUKA T., MAEDA A.: Visualization of relationships among historical persons from Japanese historical documents. Literary and Linguistic Computing 28, 2 (2013), 271–278. 7, 8, 15
[KZ14] KRAUSE T., ZELDES A.: ANNIS3: A new architecture for generic corpus query and visualization. Literary and Linguistic Computing (2014). 6, 8, 12
[LIJ∗14] LEHMANN J., ISELE R., JAKOB M., JENTZSCH A., KONTOKOSTAS D., MENDES P. N., HELLMANN S., MORSEY M., VAN KLEEF P., AUER S., ET AL.: DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web Journal 5 (2014), 1–29. 7
[LPP∗06] LEE B., PLAISANT C., PARR C. S., FEKETE J.-D., HENRY N.: Task Taxonomy for Graph Visualization. In Proceedings of the 2006 AVI Workshop on BEyond Time and Errors: Novel Evaluation Methods for Information Visualization (New York, NY, USA, 2006), BELIV ’06, ACM, pp. 1–5. 7
[LRKC10] LEE B., RICHE N., KARLSON A., CARPENDALE S.: SparkClouds: Visualizing Trends in Tag Clouds. Visualization and Computer Graphics, IEEE Transactions on 16, 6 (Nov 2010), 1182–1189. 8, 13
[LWW∗13] LIU S., WU Y., WEI E., LIU M., LIU Y.: StoryFlow: Tracking the Evolution of Stories. Visualization and Computer Graphics, IEEE Transactions on 19, 12 (Dec 2013), 2436–2445. 8, 14
[Mar12] MARCHE S.: Literature is not Data: Against Digital Humanities, 2012. http://www.lareviewofbooks.org/article.php?id=1040 (Retrieved 2015-01-09). 3
[MBL∗06] MEHLER A., BAO Y., LI X., WANG Y., SKIENA S.: Spatial Analysis of News Sources. Visualization and Computer Graphics, IEEE Transactions on 12, 5 (Sept 2006), 765–772. 8, 13
[McC15] MCCABE M. M.: Platonic Conversations. Oxford University Press, USA, 2015. 2
[MFM08] MENESES L., FURUTA R., MALLEN E.: Exploring the Biography and Artworks of Picasso with Interactive Calendars and Timelines. In Proceedings of the Digital Humanities 2008 (2008). 5
[MFM13] MENESES L., FURUTA R., MANDELL L.: Ambiances: A Framework to Write and Visualize Poetry. In Proceedings of the Digital Humanities 2013 (2013). 8, 15, 16
[MH13] MURALIDHARAN A., HEARST M. A.: Supporting exploratory text analysis in literature study. Literary and Linguistic Computing 28, 2 (2013), 283–295. 7, 8, 12, 15, 16
[MLCM16] MCCURDY N., LEIN J., COLES K., MEYER M.: Poemage: Visualizing the Sonic Topology of a Poem. Visualization and Computer Graphics, IEEE Transactions on 22, 1 (Jan 2016), 439–448. 8, 10, 12, 17, 18
[MLSU13] MILLER B., LI F., SHRESTHA A., UMAPATHY K.: Digging into Human Rights Violations: phrase mining and trigram visualization. In Proceedings of the Digital Humanities 2013 (2013). 8, 15
[Mor05] MORETTI F.: Graphs, Maps, Trees: Abstract Models for a Literary History. Verso, July 2005. 1, 3, 4
[Mor13] MORETTI F.: Distant reading. Verso, 2013. 3
[MRMK15] MEDEK A., RITTER J., MOLITOR P., KÖSSER S.: Interactive Similarity Analysis of Early New High German Text Variants. In Proceedings of the Digital Humanities 2015 (2015). 8, 14, 16
[MRO∗12] MACEACHREN A., ROTH R., O’BRIEN J., LI B., SWINGLEY D., GAHEGAN M.: Visual Semiotics & Uncertainty Visualization: An Empirical Study. Visualization and Computer Graphics, IEEE Transactions on 18, 12 (Dec 2012), 2496–2505. 19
[MSR∗15] MONTAGUE J., SIMPSON J., ROCKWELL G., RUECKER S., BROWN S.: Exploring Large Datasets with Topic Model Visualizations. In Proceedings of the Digital Humanities 2015 (2015). 6, 8, 12, 13
[Mun09] MUNZNER T.: A nested model for visualization design and validation. Visualization and Computer Graphics, IEEE Transactions on 15, 6 (2009), 921–928. 17
[Mur11] MURALIDHARAN A.: A Visual Interface for Exploring Language Use in Slave Narratives. In Proceedings of the Digital Humanities 2011 (2011). 7, 8, 10, 12, 16
[NMG∗13] NOWVISKIE B., MCCLURE D., GRAHAM W., SOROKA A., BOGGS J., ROCHESTER E.: Geo-Temporal Interpretation of Archival Collections with Neatline. Literary and Linguistic Computing 28, 4 (2013), 692–699. 17
[OGH15] ODAT S., GROZA T., HUNTER J.: Extracting structured data from publications in the Art Conservation Domain. Digital Scholarship in the Humanities 30, 2 (2015), 225–245. 6, 7, 8, 14, 16
[OKK13] OELKE D., KOKKINAKIS D., KEIM D. A.: Fingerprint Matrices: Uncovering the dynamics of social networks in prose literature. In Computer Graphics Forum (2013), vol. 32, Wiley Online Library, pp. 371–380. 8, 12
[Ome15] Omeka, 2015. http://www.omeka.org/ (Retrieved 2015-01-10). 17
[ÓML14] Ó MURCHÚ T., LAWLESS S.: The Problem of Time and Space: The Difficulties in Visualising Spatiotemporal Change in Historical Data. In Proceedings of the Digital Humanities 2014 (2014). 6, 8, 13, 14
[OST∗10] OESTERLING P., SCHEUERMANN G., TERESNIAK S., HEYER G., KOCH S., ERTL T., WEBER G.: Two-stage framework for a topology-based projection and visualization of classified document collections. In Visual Analytics Science and Technology (VAST), 2010 IEEE Symposium on (Oct 2010), pp. 91–98. 8, 15, 16
[Pal02] PALEY W. B.: TextArc: Showing word frequency and distribution in text. In Poster presented at IEEE Symposium on Information Visualization (2002), vol. 2002. 15
[PBD14] PEÑA E., BROWN M., DOBSON T.: On Metaphor in Text Visualization Prototypes. In Proceedings of the Digital Humanities 2014 (2014). 8, 15
[Per15] Perseus Digital Library, 2015. Ed. Gregory R. Crane. Tufts University. http://www.perseus.tufts.edu (accessed March 19, 2015). 1
[Pet14] PETERSON N.: Visualization As a Bridge to Close Reading: The Audience in The Castle of Perseverance. In Proceedings of the Digital Humanities 2014 (2014). 5, 8, 15
[PHI15] PHI Latin Texts, 2015. http://latin.packhum.org/ (Retrieved 2015-01-09). 1
[Pie10] PIEZ W.: Towards Hermeneutic Markup: An architectural outline. In Proceedings of the Digital Humanities 2010 (2010). 5, 8, 11
[Pie13] PIEZ W.: Markup Beyond XML. In Proceedings of the Digital Humanities 2013 (2013). 7, 8, 11
[Ple15] Pleiades: a community-built gazetteer and graph of ancient places, 2015. http://pleiades.stoa.org/ (Retrieved 2015-10-06). 7, 13
[PMMR15] PÖCKELMANN M., MEDEK A., MOLITOR P., RITTER J.: _CATview_ - Supporting The Investigation Of Text Genesis Of Large Manuscripts By An Overall Interactive Visualization Tool. In Proceedings of the Digital Humanities 2015 (2015). 8, 12, 16
[Poe16] Poemage, 2016. http://www.sci.utah.edu/~nmccurdy/Poemage/ (Retrieved 2016-03-18). 10
[Poi15] POIBEAU T.: Generating Navigable Semantic Maps from Social Sciences Corpora. In Proceedings of the Digital Humanities 2015 (2015). 6, 8, 14, 15
[Pos07] POSAVEC S.: Literary Organism, 2007. http://www.stefanieposavec.co.uk/ (Retrieved 2015-01-09). 3
[Pro15] Project Gutenberg, 2015. http://www.gutenberg.org/ (Retrieved 2015-01-09). 3
[PSA∗06] PLAISANT C., SMITH M. N., AUVIL L., ROSE J., YU B., CLEMENT T.: "Undiscovered Public Knowledge": Mining for Patterns of Erotic Language in Emily Dickinson’s Correspondence with Susan Huntington (Gilbert) Dickinson. In Proceedings of the Digital Humanities 2006 (2006). 7, 8
[RARC∗15] ROE G., ABDUL-RAHMAN A., CHEN M., GLADSTONE C., MORRISSEY R., OLSEN M.: Visualizing Text Alignments: Image Processing Techniques for Locating 18th-Century Commonplaces. In Proceedings of the Digital Humanities 2015 (2015). 8, 12
[RAW∗15] RIND A., AIGNER W., WAGNER M., MIKSCH S., LAMMARSCH T.: Task Cube: A three-dimensional conceptual space of user tasks in visualization design and evaluation. Information Visualization (2015). 7
[RD10] RICHE N., DWYER T.: Untangling Euler Diagrams. Visualization and Computer Graphics, IEEE Transactions on 16, 6 (Nov 2010), 1090–1099. 8, 15
[RFH14] REITER N., FRANK A., HELLWIG O.: An NLP-based cross-document approach to narrative structure discovery. Literary and Linguistic Computing 29, 4 (2014), 583–605. 6, 7, 8, 14, 16
[RPSF15] RIEHMANN P., POTTHAST M., STEIN B., FROEHLICH B.: Visual Assessment of Alleged Plagiarism Cases. Computer Graphics Forum 34, 3 (2015), 61–70. 6, 7, 8, 10, 12, 15, 16
[RRRG05] RUECKER S., RAMSAY S., RADZIKOWSKA M., GALEY A.: Interface Design. In Proceedings of the Digital Humanities 2005 (2005). 7, 8, 12, 15
[RSDCD∗13] ROBERTS-SMITH J., DE SOUZA-COELHO S., DOBSON T. M., GABRIELE S., RODRIGUEZ-ARENAS O., RUECKER S., SINCLAIR S., AKONG A., BOUCHARD M., HONG M., JAKACKI D., LAM D., KOVACS A., NORTHAM L., SO D.: Visualizing Theatrical Text: From Watching the Script to the Simulated Environment for Theatre (SET). Digital Humanities Quarterly 7, 3 (2013). 6, 8, 15, 16
[SDW11] SLINGSBY A., DYKES J., WOOD J.: Exploring Uncertainty in Geodemographics with Interactive Graphics. Visualization and Computer Graphics, IEEE Transactions on 17, 12 (Dec 2011), 2545–2554. 19
[Shn96] SHNEIDERMAN B.: The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. In Visual Languages, Proceedings (1996), pp. 336–343. 3
[Sin13] SINGER K.: Digital Close Reading: TEI for Teaching Poetic Vocabularies. Journal of Interactive Technology and Pedagogy 3 (2013). 7
[SOI10] SAITO S., OHNO S., INABA M.: A Platform for Cultural Information Visualization Using Schematic Expressions of Cube. In Proceedings of the Digital Humanities 2010 (2010). 1
[SRR13] SINCLAIR S., RUECKER S., RADZIKOWSKA M.: Information Visualization for Humanities Scholars, 2013. 19
[TEI15] TEI Consortium, 2015. eds. TEI P5: Guidelines for Electronic Text Encoding and Interchange. 2.8.0. 2015-04-06. TEI Consortium. http://www.tei-c.org/Guidelines/P5/ (Retrieved 2015-10-09). 1
[TFK15] TRILCKE P., FISCHER F., KAMPKASPAR D.: Digital Network Analysis of Dramatic Texts. In Proceedings of the Digital Humanities 2015 (2015). 6, 7, 8, 15
[TKMS03] TOUTANOVA K., KLEIN D., MANNING C. D., SINGER Y.: Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1 (2003), Association for Computational Linguistics, pp. 173–180. 7
[Tót13] TÓTH G. M.: The computer-assisted analysis of a medieval commonplace book and diary (MS Zibaldone Quaresimale by Giovanni Rucellai). Literary and Linguistic Computing 28, 3 (2013), 432–443. 7, 8, 15
[Tra09] TRAVIS C.: Patrick Kavanagh’s Poetic Wordscapes: GIS, Literature and Ireland, 1922-1949. In Proceedings of the Digital Humanities 2009 (2009). 8, 13
[UIU98] University of Illinois at Urbana-Champaign Digital Libraries Initiative: UIUC DLI Glossary, 1998. http://archive.is/qOUJz (Retrieved 2016-03-18). 5
[Und15] UNDERWOOD T.: A dataset for distant-reading literature in English, 1700-1922, 2015. http://tinyurl.com/nu26zr7 (Retrieved 2015-10-09). 5
[VCPK09] VUILLEMOT R., CLEMENT T., PLAISANT C., KUMAR A.: What’s being said near "Martha"? Exploring name entities in literary text collections. In Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on (Oct 2009), pp. 107–114. 6, 8, 12, 13, 14, 16, 17, 18
[vHWV09] VAN HAM F., WATTENBERG M., VIEGAS F.: Mapping Text with Phrase Nets. Visualization and Computer Graphics, IEEE Transactions on 15, 6 (Nov 2009), 1169–1176. 8, 15
[VWF09] VIEGAS F., WATTENBERG M., FEINBERG J.: Participatory Visualization with Wordle. Visualization and Computer Graphics, IEEE Transactions on 15, 6 (Nov 2009), 1137–1144. 11
[WBWK00] WANG BALDONADO M. Q., WOODRUFF A., KUCHINSKY A.: Guidelines for using multiple views in information visualization. In Proceedings of the Working Conference on Advanced Visual Interfaces (New York, NY, USA, 2000), AVI ’00, ACM, pp. 110–119. 16
[Wei15] WEIGEL M.: Graphs and Legends: Raymond Williams tried to save culture from a priestly elite. Can the same be said of the digital humanities?, 2015. http://www.thenation.com/article/graphs-and-legends/ (Retrieved 2015-10-09). 5
[WH11] WALSH J. A., HOOPER W.: Computational Discovery and Visualization of the Underlying Semantic Structure of Complicated Historical and Literary Corpora. In Proceedings of the Digital Humanities 2011 (2011). 5, 6, 7, 8, 14, 16
[Wil15a] WILLS T.: Relational data modelling of textual corpora: The Skaldic Project and its extensions. Digital Scholarship in the Humanities 30, 2 (2015), 294–313. 5, 6, 8, 13, 16
[Wil15b] WILSON E. A.: Building The Early Modern Digital University: Using Social Network Analysis and Digital Visualization Tools To Bring The Early Modern Network Of Networks (EMNON) To Life. In Proceedings of the Digital Humanities 2015 (2015). 7, 8, 13, 16
[WJ13a] WEINGART S., JORGENSEN J.: Computational analysis of the body in European fairy tales. Literary and Linguistic Computing 28, 3 (2013), 404–416. 5, 8, 14, 16
[WJ13b] WHEELES D., JENSEN K.: Juxta Commons. In Proceedings of the Digital Humanities 2013 (2013). 7, 8, 11, 19
[WMN∗14] WALSH B., MAIERS C., NALLY G., BOGGS J., TEAM P. P.: Crowdsourcing individual interpretations: Between microtasking and macrotasking. Literary and Linguistic Computing 29, 3 (2014), 379–386. 7, 8, 10, 11, 18
[Wol13] WOLFF M.: Surveying a Corpus with Alignment Visualization and Topic Modeling. In Proceedings of the Digital Humanities 2013 (2013). 7, 8, 15, 16
[WV08] WATTENBERG M., VIEGAS F.: The Word Tree, an Interactive Visual Concordance. Visualization and Computer Graphics, IEEE Transactions on 14, 6 (Nov 2008), 1221–1228. 8, 15, 16
[YMSJ05] YI J. S., MELTON R., STASKO J., JACKO J. A.: Dust & Magnet: multivariate information visualization using a magnet metaphor. Information Visualization 4, 4 (2005), 239–256. 15
[ZNMS15] ZAHORA T., NIKULIN D., MEWS C. J., SQUIRE D.: Deconstructing Bricolage: Interactive Online Analysis of Compiled Texts with Factotum. Digital Humanities Quarterly 9, 1 (2015). 8, 10, 12, 16