
pre-print of CGF paper (2016), DOI: 10.1111/cgf.12873

COMPUTER GRAPHICS forum

Visual Text Analysis in Digital Humanities

S. Jänicke,1 G. Franzini,2 M. F. Cheema1 and G. Scheuermann1

1 Image and Signal Processing Group, Department of Computer Science, Leipzig University, Germany
{stjaenicke,faisal,scheuermann}@informatik.uni-leipzig.de
2 Göttingen Centre for Digital Humanities, University of Göttingen, Germany

[email protected]

Abstract
In 2005, Franco Moretti introduced Distant Reading to analyze entire literary text collections. This was a rather
revolutionary idea compared to the traditional Close Reading, which focuses on the thorough interpretation of an
individual work. Both reading techniques are the prior means of Visual Text Analysis. We present an overview of
the research conducted since 2005 on supporting text analysis tasks with close and distant reading visualizations
in the digital humanities. Therefore, we classify the observed papers according to a taxonomy of text analysis
tasks, categorize applied close and distant reading techniques to support the investigation of these tasks, and
illustrate approaches that combine both reading techniques in order to provide a multifaceted view of the textual
data. In addition, we take a look at the used text sources and at the typical data transformation steps required for
the proposed visualizations. Finally, we summarize collaboration experiences when developing visualizations for
close and distant reading, and we give an outlook on future challenges in that research area.
Keywords: digital humanities, survey, visual text analysis, close reading, distant reading
Categories and Subject Descriptors (according to ACM CCS): H.5.2 [Information Interfaces and Presentation]: User
Interfaces—Evaluation/methodology

© 2016 The Author(s)
Computer Graphics Forum © 2016 The Eurographics Association and John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.

1. Introduction

Traditionally, humanities scholars carrying out research on a specific or on multiple literary work(s) are interested in the analysis of related texts or text passages. But the digital age has opened possibilities for scholars to enhance their traditional workflows. Enabled by digitization projects, humanities scholars can nowadays reach a large number of digitized texts through web portals such as Google Books [Goo15] and Internet Archive [Arc15]. Digital editions also exist for ancient texts; notable examples are PHI [PHI15] and the Perseus Digital Library [Per15].

This shift from reading a single book “on paper” to the possibility of browsing many digital texts is one of the origins and principal pillars of the digital humanities domain, which helps to develop solutions to handle vast amounts of cultural heritage data – text being the main data type. In contrast to the traditional methods, the digital humanities allow scholars to pose new research questions on cultural heritage datasets. Some of these questions can be answered with existing algorithms and tools provided by the computer science domain, but for other humanities questions scholars need to formulate new methods in collaboration with computer scientists.

Developed in the late 1980s [Hoc04], the digital humanities primarily focused on designing standards to represent cultural heritage data such as the Text Encoding Initiative (TEI) for texts [TEI15], and to aggregate, digitize and deliver data. In recent years, visualization techniques have gained more and more importance when it comes to analyzing data. For example, Saito [SOI10] introduced her 2010 digital humanities conference paper with: “In recent years, people have tended to be overwhelmed by a vast amount of information in various contexts. Therefore, arguments about ‘Information Visualization’ as a method to make information easy to comprehend are more than understandable.”

A major impulse for this trend was given by Franco Moretti. In 2005, he published the book “Graphs, Maps, Trees” [Mor05], in which he proposes the so-called distant reading approaches for textual data, which steered the traditional way of approaching literature towards a completely new direction. Instead of reading texts in the traditional way
– so-called close reading –, he invites us to count, to graph and to map them. In other words, to visualize them.

This survey observes text analysis tasks of humanities scholars – e.g., literary scholars, historians and philologists –, and the visualization techniques that have been developed in order to support these tasks. By providing a text analysis task taxonomy, categorizing applied close and distant reading techniques and outlining strategies that combine close and distant reading visualizations, we present an overview suitable for visualization scholars facing related digital humanities text analysis tasks. We further investigate the following questions:
• What are the used text sources and which data transformations are applied in order to investigate text analysis research questions with close and distant reading visualizations?
• Which experiences are reported regarding collaborations between visualization experts and humanities scholars?
• What are future challenges for visualization scholars concerning visual text analysis to further improve the support for humanities scholars?

1.1. Relation to the Previous Article

The focus of the previous version of this survey [JFCS15] was to illustrate the diversity of applied visualization techniques that support the close and distant reading of texts in digital humanities applications – enriched with used visualization tools, collaboration experiences of visualization researchers working together with humanities scholars and future challenges.

This survey extension aims to give visualization scholars new to the field of digital humanities an adequate overview of related works in order to support carrying out successful digital humanities projects. As an application domain for information visualization, these projects gain their motivation from humanities research questions on texts. Therefore, we introduce a taxonomy that groups the papers into classes of text analysis tasks in order to guide visualization researchers with similar tasks to related works that provide close and distant reading solutions. In addition, we list text sources, and we take a closer look at data transformations, which are substantial steps towards designing valuable visualizations. Furthermore, we extended the collaboration experiences and future challenges sections. Finally, this survey considers 22 more related works.

2. Means of Visual Text Analysis

Close reading and distant reading are the prior means for the visual analysis of textual sources in digital humanities scenarios. While the close reading of a text is a traditional method that has its roots in antiquity when Aristotle close read the works of Plato [McC15], distant reading is a rather novel idea that was introduced by Franco Moretti at the beginning of the 21st century. In contrast to Moretti, Jockers uses the terms micro- and macroanalysis instead of close and distant reading [Joc13]. Inspired by micro- and macroeconomics, he focuses on quantitative literary text analysis using statistical analysis methods. As the methods we analyzed are more related to visualization, we decided to use the traditional, more common terms close and distant reading, but we also considered related works using different terminologies. This section introduces close and distant reading techniques and draws a line from the digital humanities to information visualization by combining both techniques.

Figure 1: Traditional close reading of the second chapter of Charles Dickens’ David Copperfield (Figure reproduced with permission from Kehoe et al. [KG13]).

2.1. Close Reading

The close reading of a text became a fundamental method in literary criticism in the 20th century [Haw00]. Nancy Boyles [Boy13] defines it as follows: “Essentially, close reading means reading to uncover layers of meaning that lead to deep comprehension.” In other words, close reading is the thorough interpretation of a text passage by the determination of central themes and the analysis of their development. Moreover, close reading includes the analysis of (1) individuals, events, and ideas, their development and interaction, (2) used words and phrases, (3) text structure and style, and (4) argument patterns [Jas01]. The result of a traditional close reading approach is shown in Figure 1. In this example, the scholar used various methods to annotate various features of the source text, e.g., the usage of different colors (blue, red, green) and underlining styles (straight or wavy lines, circles). Furthermore, numerous thoughts are written next to the corresponding sentences. Although most humanities scholars are trained in this traditional approach of close reading, today’s large availability of digitized texts and of digital editions through web portals
like Google Books [Goo15] or Project Gutenberg [Pro15], opens up new possibilities for close reading, and especially for sustainable and collaborative annotation.

Figure 2 shows a straightforward approach of visualizing various scholars’ annotations of a digital edition [KG13] within the web-based environment eMargin [eMa15]. There, colors are used to highlight different text features, and a pop-up window lists the comments of collaborating scholars. In Section 8.1 we outline different approaches to support text analysis with close reading by visualizing supplementary human- or computer-generated information.

Figure 2: Digital close reading of the second chapter of Charles Dickens’ David Copperfield with eMargin (Figure reproduced with permission from Kehoe et al. [KG13]).

2.2. Distant Reading

While close reading retains the ability to read the source text without dissolving its structure, distant reading does the exact opposite. It aims to generate an abstract view by shifting from observing textual content to visualizing global features of a single or of multiple text(s). Moretti [Mor13] describes distant reading as “a little pact with the devil: we know how to read texts, now let’s learn how not to read them.” In 2005, he introduces his idea of distant reading [Mor05] with three examples using:
• graphs to analyze genre change of historical novels,
• maps to illustrate geographical aspects of novels, and
• trees to classify different types of detective stories.

But the proposed methods and the intention of distant reading are controversial in the humanities [GH11a, CRS∗14], as they quantify and abstract texts at the expense of reflecting the actual ambiguity and complexity of literary forms [Dru11, Mar12]. However, many works in the digital humanities domain are based upon Moretti’s idea. Figure 3 shows Posavec’s Literary Organism [Pos07] – a distant reading of Jack Kerouac’s On the Road in the form of a tree. While a non-interactive infographic, Posavec’s approach perfectly illustrates the idea behind Moretti’s distant reading, as it turns away from the traditional close reading by providing an abstract view of a literary text. The branching structure represents the ordered hierarchy of content objects from chapters down to words, and themes are drawn with different colors. In Section 8.2 we present a list of different distant reading techniques developed for a wide range of text analysis research questions in the digital humanities.

Figure 3: Distant reading example shows the structure of and the themes in Jack Kerouac’s On the Road (Figure reproduced with permission from Posavec [Pos07]).

2.3. Combining Close and Distant Reading

In his digital humanities collaborations, Correll worked together with literary scholars who were unfamiliar with distant reading views [CAA∗14]. Here, providing links to close reading was an important method to support distant reading hypotheses. During our literature research, we also discovered a multitude of works involving close reading and interfaces, which provide distant reading visualizations that allow users to interactively drill down to specific portions of the data. This suggests that direct access to the source texts is indeed very important for humanities scholars when working with visualizations. For example, Bradley [Bra12] asks whether it is “possible to develop a visualization technique that does not destroy the original text in the process.” Similarly, Beals [Bea14] asks: “In an age where distant reading is possible, is close reading dead?” Coles et al. argue that distant reading visualizations cannot replace close reading, but they can direct the reader to sections that may deserve further investigation [CL13].

When distant reading views are interactively used to switch to close reading views, the Information Seeking Mantra “Overview first, zoom and filter, details-on-demand” [Shn96] is accomplished. It follows that an important task for the development of visualizations is to provide
an overview of the data that highlights potentially interesting patterns. A drill down on these patterns for further exploration is the bridge between distant and close reading.

3. Visual Text Analysis

Figure 4 reflects a typical visual text analysis process in digital humanities workflows. The text analysis task at hand affects all steps of this process. First, the text sources are selected in accordance with the research task. Next, it is important to apply appropriate data transformations in order to design close and/or distant reading visualizations that are beneficial to investigate the given text analysis task. But a humanities scholar often refers to both the visualizations and the underlying texts in order to gain insights concerning the research question. While the next section defines the scope of this survey, the following sections explain the various components of the visual text analysis process in detail. Section 5 lists used text sources, and Section 6 provides a brief overview of data transformation techniques. In Section 7, we provide a taxonomy of text analysis tasks. Applied close and distant reading techniques are outlined in Section 8.

Figure 4: Visual text analysis process. [Diagram labels: text analysis task, text sources, data transformation, close reading, distant reading, insight, humanities scholar.]

4. Scope

In order to generate the research paper pool of this survey, we used the publication year of Franco Moretti’s book on distant reading techniques “Graphs, Maps, Trees” [Mor05], 2005, as a starting point. We manually scanned through the major related visualization and digital humanities journals and conference proceedings in order to generate an appropriate snapshot of existing research on visualizations for close and distant reading. To obtain a manageable list of related works, we had to narrow the scope of our survey.

First, we only considered information visualization publications. For example, we did not include papers from human-computer interaction issues (e.g., ACM Conference on Human Factors in Computing Systems (CHI)). Table 1 lists the number of related information visualization papers examined. We also considered approaches developed for other data types, where at least one related use case was provided for a cultural heritage dataset. The TVCG journal table entry includes all found papers presented at the IEEE Symposium on Information Visualization (InfoVis) as well as seven papers presented at the IEEE Symposium on Visual Analytics Science and Technology (VAST). Likewise, the Computer Graphics Forum entry includes all related papers also contained in the proceedings of the Joint Eurographics–IEEE VGTC Symposium on Visualization (EuroVis). No related papers were found in the proceedings of the IEEE Pacific Visualization Symposium (PacificVis) and of the International Conference on Information Visualisation (IV), nor in the IEEE Computer Graphics and Applications journal.

Journal/Proceedings | #Papers
IEEE Transactions on Visualization and Computer Graphics (TVCG) | 16
IEEE Symposium on Visual Analytics Science and Technology (VAST) | 6
Computer Graphics Forum | 5
Proceedings of the International Conference on Information Visualization Theory and Applications (IVAPP) | 2
Information Visualisation Journal | 1
Table 1: Visualization papers examined.

Journal/Proceedings | #Papers
Proceedings of the Annual Conference of the Alliance of Digital Humanities Organizations | 60
Literary and Linguistic Computing / Digital Scholarship in the Humanities | 18
Digital Humanities Quarterly | 6
Table 2: Digital humanities papers examined.

Second, we included related works from the major digital humanities realms into our survey. Thereby, we decided to consider the proceedings of the Annual Conference of the Alliance of Digital Humanities Organizations, which yields a suitable snapshot of the research conducted in that field, although only short papers are contained. The 60 related works presented at this major digital humanities conference (see Table 2) indicate the importance of close and distant reading visualizations for text analysis tasks. In 2014, a total of 345 individual papers were presented at the conference, of which 33 thematize information visualization techniques for cultural heritage data (9.6%), including 14 papers (4.1%) about close and/or distant reading relevant for our survey.
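The two percentages quoted for the 2014 conference follow directly from the counts given in the text. A minimal sketch of that arithmetic, using only those three numbers (the helper name `share` is ours and purely illustrative):

```python
# Counts quoted in the text for the 2014 conference.
TOTAL_PAPERS = 345          # individual papers presented
VISUALIZATION_PAPERS = 33   # papers on information visualization for cultural heritage data
READING_PAPERS = 14         # of those, papers on close and/or distant reading


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(share(VISUALIZATION_PAPERS, TOTAL_PAPERS))  # 9.6
print(share(READING_PAPERS, TOTAL_PAPERS))        # 4.1
```

Both values are taken relative to all 345 papers, which is why the 14 relevant papers appear as 4.1% rather than as a share of the 33 visualization papers.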


The collection is completed by related papers published in two major digital humanities journals, Literary and Linguistic Computing (Digital Scholarship in the Humanities as of 2015) and Digital Humanities Quarterly.

Third, we did not consider works published in humanities realms such as Shakespeare Quarterly or American Literary History.

4.1. Considered Research Papers

In order to be considered for our survey, a paper needed to fulfill the following requirements:

• Textual data: The visualization is a solution for research questions on an arbitrary text corpus, either a small text unit such as a poem, a large text unit such as a book, or a whole text collection. For example, we did not include a timeline visualization of Picasso’s works [MFM08] as it is based upon artworks.
• Cultural heritage: The underlying textual data has a historical value. While not considering approaches dealing with texts extracted from social media or wiki systems (e.g., a social network visualization of philosophers presented in [AL09], which is based upon relationships modeled in Wikipedia), we took into account visualizations for newspaper collections. This decision includes some visualization papers that do not directly relate to the humanities. But the proposed techniques are based upon or tested with contemporary newspaper collections, which are indeed part of cultural heritage.
• No straightforward metadata visualization: We only considered papers that provide a visualization that is based upon the inherent textual content. We omitted methods that only use the texts’ associated metadata. An example is given by two graphs displaying relationships between texts. Whereas the related network graph presented in [Ede14] is determined by analyzing stylistic features among the textual contents of novels, the unrelated graph visualization in [Fin10] uses Amazon recommendations to determine relationships between books.
• No traditional charts: In the digital humanities, the word visualization is frequently used. Traditional charts displaying statistical information such as line or bar charts are also labeled as visualizations. Based on the information visualization definitions given by Card [CMS99] and the UIUC DLI Glossary [UIU98], we only considered papers that provided computer-supported, non-traditional visual representations of abstract data. In contrast, we do not require interactive methods as humanities scholars often gain valuable insights using non-interactive visualizations. However, most proposed techniques implement certain means of interaction.

Altogether, in order to be considered, a paper required a research question on textual data of historical interest emphasizing content rather than metadata and supporting the close and/or distant reading of texts.

5. Text Sources

The focus of interest of the papers in our collection includes four major text types.

• Single literary texts often motivate the development of close and distant reading techniques, e.g., Adam Smith’s The Wealth of Nations [BJ14], Gertrude Stein’s The Making of Americans [CDP∗07], Herodotus’ Histories [BPBI10], or The Castle of Perseverance [Pet14].
• POI collections are text corpora containing several works by a particular author (e.g., William Faulkner novels [DNCM14]), politician (e.g., the papers of Thomas Jefferson [Kle12] or the Kissinger Collection [Kau15]), or any other “person of interest” (POI), e.g., alchemical manuscripts written by Isaac Newton [WH11] or statements about Emily Carr [HSC08].
• Text editions are collections of variants of the same source text, e.g., research questions concerning German translations of Shakespeare’s Othello [GCL∗13], English translations of the Bible [JGBS14], or reprints of Nathaniel Hawthorne’s short story The Celestial Railroad [Cor13].
• Text collections contain a multitude of texts, which usually belong to a particular typology. Among others, this includes the collection of poems [Wil15a], biographies [Boo13], tales [WJ13a], novels [Ede14], medieval texts [JW13], or news articles [KLB14].

This variety of source texts reflects the diversity of text analysis research questions raised by humanities scholars. On the other hand, it suggests the requirement of multifarious close and distant reading techniques.

While more and more digital editions and archives are made available online [GTW13, FMT15], distant reading studies are few and far between due to the limited number of available text collections [Und15]. It follows that the more and the larger the accessible text collections, the bigger the scope for close and distant reading analyses and the higher the probability to generate new research challenges for visualization scholars. The issue with this seemingly logical conclusion, however, is the unavoidable and enormous amount of manual work required in order to assemble [Und15], encode and upload [Wei15] such text collections to the web for others to generate derivative outputs. Some of the papers provide information about used publicly available data sources, which we list in Table 3.

Data Source | URL | Used by
Digital Libraries
Perseus Digital Library | http://www.perseus.tufts.edu/hopper/ | [BPBI10]
Project Gutenberg | https://www.gutenberg.org/ | [CTA∗13, GCL∗13, KO07, GTAHS15]
TextGrid Repository | https://textgridrep.de/ | [TFK15, JKH∗15]
HathiTrust | https://www.hathitrust.org/ | [AGZH15]
POI Collections
The Swinburne Project | http://swinburnearchive.indiana.edu/swinburne/ | [WH11]
The Chymistry of Isaac Newton Project | http://webapp1.dlib.indiana.edu/newton/ | [WH11]
The Papers of Thomas Jefferson | http://rotunda.upress.virginia.edu/founders/TSJN.html | [Kle12]
Internet Shakespeare Editions | http://internetshakespeare.uvic.ca/ | [RSDCD∗13]
Bob Gibson Collection of Speculative Fiction | http://contentdm.ucalgary.ca/cdm/search/collection/gcsf | [HFM16]
Literary Collections
The Poetess Archive | http://idhmc.tamu.edu/poetess/ | [FS11, CGM∗12]
Folklore and Mythology Electronic Texts | http://www.pitt.edu/~dash/folktexts.html | [RFH14]
The Skaldic Project | http://www.abdn.ac.uk/skaldic/ | [Wil15a]
Eighteenth Century Collection Online | http://quod.lib.umich.edu/e/ecco/ | [JOL∗15]
MySword (Bible Collection) | http://www.mysword.info/ | [JG15]
Political Texts
1641 Depositions | http://www.1641.tcd.ie/ | [ÓML14]
Digital National Security Archive | http://nsarchive.chadwyck.com/ | [Kau15]
PoliInformatics | http://poliinformatics.org/data/ | [Poi15]
VroniPlag Wiki | http://de.vroniplag.wikia.com/ | [RPSF15]
News Archives
Europe Media Monitor | http://emm.newsbrief.eu/ | [KBK11]
Projekt Deutscher Wortschatz | http://corpora.uni-leipzig.de/ | [KLB14]
Scientific Paper Collections
Thomson Reuters Web of Knowledge | http://wokinfo.com/ | [ARR∗12]
JSTOR | http://www.jstor.org/ | [OGH15, MSR∗15]
Collected Project Corpora
ANNIS | https://korpling.german.hu-berlin.de/annis3/ | [KZ14]
Trading Consequences | http://tradingconsequences.blogs.edina.ac.uk/about/the-corpus/ | [HAC∗15]
Other Sources
British National Corpus | http://www.natcorp.ox.ac.uk/ | [Bea08, ARLC∗13]
Documenting the American South | http://docsouth.unc.edu/ | [CTA∗13]
Metadata Offer New Knowledge | http://monk.library.illinois.edu/ | [VCPK09]
The Internet Movie Script Database | http://www.imsdb.com/ | [HPR14]
Table 3: Publicly available data sources.

6. Data Transformation Aspects

Many papers in our collection do not provide sufficient information about applied preprocessing steps to transform the given textual data into the visualization’s input format. Occasionally, a visualization directly processes annotated text in XML format (e.g., [Pie10, Boo13, BGHJ∗14]). In particular, many techniques are based upon documents in the XML-based TEI format (e.g., [BHW11, CGM∗12])
which has become the humanities leading technology to ready present when using annotated TEI files as data sources
map the structure of a digital text edition [Sin13]. XSLT (e.g., [BB15a, TFK15]).
stylesheets are basic ways to transform the TEI encoded
Topic modeling algorithms are fundamental for topic-
information into a meaningful visualization of an individ-
related analyses of text collections. The Latent Dirichlet al-
ual text (e.g., [Cor13, Pie13, HKTK14]). But most distant
location is the most often applied topic model [BNJ03]. It re-
reading techniques require more sophisticated preprocessing
quires a predefined number of topics, which are determined
steps – an brief summary of common data transformation ap-
automatically based on the words contained in the text cor-
proaches is shown below.
pus (e.g., [AKV∗ 14, BJ14]). The topic model can be used
Tokenization and normalization are rather rudimentary to cluster texts thematically [Wol13] – as happens with text
natural language processing methods first applied to segment classification methods (e.g., [PSA∗ 06,DFM∗ 08]) –, or to de-
raw text sources. The then determinable frequency distri- fine the similarity among the texts of a corpus [Joc12]. When
bution of words is a valuable basis for various tasks (e.g., temporal metadata is provided, the change of topics can be
stylometric analyses [CEJ∗ 14, Ede14]), and can be clearly analyzed (e.g., [ARR∗ 12, CLWW14]).
visualized in the form of tag clouds to support the explo-
ration of word statistics (e.g., [Bea08, GTAHS15, JBR∗15]). Vector space models can be used to list term frequencies per text and support a variety of text analysis tasks (e.g., [DFM∗08, GCL∗13, KOTM13]). On the other hand, counting n-grams allows more specific statements to be made about a text corpus (e.g., [CDP∗07, Bea12, MH13]). But tokenization and normalization are also necessary steps for the data transformation methods described below.

Sequence alignments are computed when investigating research questions concerning textual re-use (e.g., [BGHE10, JGBS14]) and for the analysis of similarities and differences among various text editions (e.g., [WJ13b, JGF∗15]). In such scenarios one typically applies the Gothenburg model, which includes tokenization, normalization, alignment, analysis and visualization [Got15]. One of the web-based tools implementing this model is CollateX [HDvHM∗15].

Part-of-speech (POS) tagging is a frequently applied preprocessing technique to automatically annotate the words of a corpus according to their part of speech category. The use of tools like the Stanford POS tagger [TKMS03] is a mandatory basis for investigating diverse research questions. Typically, words and their relationships are explored (e.g., [KKL∗11, MH13]) or linguistic patterns are extracted from a corpus (e.g., [Mur11, RFH14]). Furthermore, POS tagging is used to analyze phonetic features [CTA∗13] or for research questions concerning stylometry [KO07].

Named entity recognition (NER) is the practice of extracting named entities such as places or persons from texts. Preprocessing steps like part-of-speech tagging can be applied to automatically list named entity candidates [GH11b]. With the help of lexicons, named entities are subsequently classified. For example, the Pleiades gazetteer [Ple15] is used for the extraction of ancient place names [EJ14], and DBPedia [LIJ∗14] supports the discovery of commodities in [HAC∗15]. The Stanford Named Entity Recognizer [FGM05] is a popular tool for automatic named entity extraction. The manual collection of named entities is not uncommon and guarantees the highest precision (e.g., [JW13, Wil15b]). Occasionally, named entities are al-

Semi-automatic approaches reflect the importance of integrating the humanities scholar into the data transformation process. For instance, the scholar’s knowledge is required when manually generating or validating a training set to produce an appropriate data mining classifier (e.g., [PSA∗06, KKL∗11, KJW∗14]). Other methods include semi-automatic alignments [GZ12] and the annotation of TEI documents (e.g., [Tót13, OGH15]). Sometimes, even the visualization entirely depends on manually collected data through crowdsourcing (e.g., [WMN∗14, RPSF15, HFM16, JFS16]).

7. Taxonomy

There has been extensive research on developing taxonomies for information visualization in the last decades. Unfortunately, these taxonomies were either too general (e.g., [BM13, RAW∗15]) or too specific (e.g., [LPP∗06, KKC15]) to be used for our paper collection. Therefore, we defined a taxonomy focusing on the underlying text analysis tasks in the digital humanities domain (see Figure 5). A detailed classification of papers is given in Table 4. Papers focusing on a single text analysis task are grouped into a single, best-fitting category. The few works providing visualization methods for two text analysis tasks each appear in two categories [RRRG05, WH11, Wol13, Kau15]. The taxonomy consists of five major categories:

[Figure 5 diagram: text analysis tasks branch into named entities (places, persons, miscellaneous), topics (extraction, evolution, clustering), similar patterns (linguistic patterns, text re-use, text edition comparison), text of interest (interpretation, sound, story flow, miscellaneous) and corpus analysis (word statistics & relationships, text similarity, exploration).]
Figure 5: Taxonomy of text analysis tasks.
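The first three steps of the Gothenburg model named in the discussion of sequence alignments (tokenization, normalization, alignment) can be illustrated with a minimal sketch. It is not CollateX's algorithm: the alignment step uses Python's standard `difflib.SequenceMatcher` as a stand-in, and the function names `tokenize`, `normalize` and `align` as well as the two example witnesses are assumptions made purely for illustration.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Split a text into word tokens (punctuation is dropped)."""
    return re.findall(r"\w+", text)


def normalize(tokens):
    """Lowercase tokens so that case differences do not count as variation."""
    return [t.lower() for t in tokens]


def align(a_tokens, b_tokens):
    """Pairwise alignment: return (tag, a_segment, b_segment) triples."""
    matcher = SequenceMatcher(a=a_tokens, b=b_tokens, autojunk=False)
    return [(tag, a_tokens[i1:i2], b_tokens[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()]


# Two invented witnesses, used only to exercise the pipeline.
witness_a = "And God called the light Day"
witness_b = "God called the light day, and the darkness night"

alignment = align(normalize(tokenize(witness_a)),
                  normalize(tokenize(witness_b)))
for tag, seg_a, seg_b in alignment:
    print(f"{tag:8} {' '.join(seg_a):25} {' '.join(seg_b)}")
```

The `equal` segments correspond to text shared by both witnesses, while `delete`, `insert` and `replace` segments mark variation; a collation tool would feed such segments into the analysis and visualization steps.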

© 2016 The Author(s)
Computer Graphics Forum © 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities

named entities
    places: [MBL∗06], [Tra09], [BPBI10], [GH11b], [JW13], [DNCM14], [EJ14], [GDMF∗14], [ÓML14], [AGZH15], [BB15b], [HAHT∗15], [Wil15a]
    persons: [Cob05], [CSV08], [BDF∗10], [RD10], [BHW11], [Kle12], [Boo13], [KOTM13], [OKK13], [Tót13], [KLB14], [Pet14], [BB15a], [Poi15], [TFK15], [Wil15b], [JFS16]
    miscellaneous: [AGL∗07], [HAHB15], [HAC∗15], [OGH15]
topics
    extraction: [AKV∗14], [BJ14], [ESK14], [JOL∗15], [Kau15], [MSR∗15]
    evolution: [CLT∗11], [KBK11], [ARR∗12], [DWS∗12], [CLWW14], [Kau15]
    clustering: [PSA∗06], [DFM∗08], [OST∗10], [Wol13], [HFM16]
similar patterns
    linguistic patterns: [CDP∗07], [WV08], [vHWV09], [Mur11], [GZ12], [MLSU13], [BGHJ∗14], [KZ14], [RFH14], [JKH∗15]
    text re-use: [BGHE10], [WH11], [Wol13], [HCC14], [JGBS14], [RARC∗15], [RPSF15], [ZNMS15]
    text edition comparison: [JRS∗09], [Cor13], [GCL∗13], [WJ13b], [JG15], [JGF∗15], [MRMK15], [PMMR15]
text of interest
    interpretation: [Pie10], [Arm14], [HKTK14], [WMN∗14]
    sound: [FS11], [CGM∗12], [ARLC∗13], [CTA∗13], [Pie13], [Ben14], [MLCM16]
    story flow: [RRRG05], [LWW∗13], [RSDCD∗13], [HPR14]
    miscellaneous: [MFM13], [GWFI14], [KJW∗14], [PBD14]
corpus analysis
    word statistics & relationships: [Bea08], [CVW09], [VCPK09], [LRKC10], [Bea11], [KKL∗11], [Bea12], [MH13], [WJ13a], [GTAHS15], [JBR∗15]
    text similarity: [RRRG05], [KO07], [EX10], [WH11], [Joc12], [CEJ∗14], [Ede14]
    exploration: [HSC08], [CWG11], [JHSS12], [FKT14]

Table 4: Classification of papers according to the taxonomy of text analysis tasks.

The analysis of named entities is a common text analysis task of humanities scholars that is supported with close and distant reading visualizations. When extracting places, fictional or reported geographies of a single text or a whole collection can be explored. The extraction of persons is required to analyze social networks of individuals or of characters in a story. Miscellaneous tasks focus on other (e.g., encyclopedia entries [HAHB15]) or on multiple named entities (e.g., commodities and locations [HAC∗15]).

The analysis of topics inherent in a text corpus supports text analysis tasks that require both close and distant reading techniques. Popular tasks are topic extractions, so that major topics in the source texts can be tracked and topic-related text passages can be discovered. The presence of temporal data allows for the analysis of topical evolution, and on the basis of the found topics, a topical clustering of a corpus is possible.

An analysis of similar patterns that includes the discovery, the alignment and the visualization of similar text segments among the texts of a given collection is a typical text analysis task in the digital humanities. Depending on the length of patterns, we divide the tasks belonging to that category into three sets. While the analysis of linguistic patterns concerns short phrases, text re-use analysis focuses on determining deliberately re-used text segments (e.g., quotes or plagiarized passages). In papers grouped into the category text edition comparison, the humanities scholar is more interested in analyzing the variants between the text editions.

Some tasks focus on an individual literary work, which we call text of interest. Sophisticated close reading techniques are often applied but are sometimes coupled with distant reading representations of the textual content. The underlying research tasks vary from visualizing text interpretations to the analysis of sound of literary works (mostly poems),



Columns (left to right):
Close Reading: Plain, Color, Font size, Glyphs, Connections
Distant Reading: Heat maps, Tag clouds, Maps, Timelines, Graphs, Miscellaneous
Combining: Top-down, Bottom-up, Top-down & Bottom-up

named entities

places
1 2 1 12 5 1 1 2

persons
2 1 2 1 15 1 1 1

miscellaneous
2 1 1 1 3 1 1
topics

extraction
2 2 4 1 1 1 1

evolution
1 3 1 6 1

clustering
1 2 1 1 2 1 3
similar patterns

linguistic patterns
8 1 4 1 5 1 2 4

text re-use
6 2 4 3 3

text edition comparison


1 5 2 4 4 1 2 2 1
text of interest

interpretation
2 1 1 1

sound
3 3 2 3

story flow
1 1 2 1 1 2

miscellaneous
1 1 1 1 1 1 1 1 1
corpus analysis

word statistics & relationships


3 2 8 4 1 1 2

text similarity
2 5

exploration
2 1 1 2 1 1 2 1 3
Total: 10 38 4 5 9 26 20 18 18 41 9 5 8 25

Table 5: Applied close and distant reading techniques according to text analysis tasks.
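The last row of Table 5 holds the column totals. As a small worked check, and assuming the columns run in the order Plain, Color, Font size, Glyphs, Connections, Heat maps, Tag clouds, Maps, Timelines, Graphs, Miscellaneous, the totals can be summed per group and compared with the counts quoted in Section 8 (56 enhanced close reading visualizations and 132 distant reading visualizations). The dictionary names are illustrative only.

```python
# Column totals as printed in the last row of Table 5 (column order assumed).
close_reading = {"plain": 10, "color": 38, "font size": 4,
                 "glyphs": 5, "connections": 9}
distant_reading = {"heat maps": 26, "tag clouds": 20, "maps": 18,
                   "timelines": 18, "graphs": 41, "miscellaneous": 9}

# Close reading views that add information to the plain text.
enhanced_close = sum(n for name, n in close_reading.items() if name != "plain")
# All distant reading views.
distant_total = sum(distant_reading.values())

print(enhanced_close, distant_total)  # prints: 56 132
```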


and to the story flow analysis of a given source text. Miscellaneous tasks, for example, support the thorough analysis [KJW∗14] or enhance the close reading [GWFI14] of a literary work.

The final category comprises corpus analysis tasks. Usually, distant reading visualizations are used to explore text corpora containing a high number of texts. A major research task is the analysis of word statistics & relationships among them. Further interests concern text similarity between the texts of a corpus; others require platforms for corpus exploration to facilitate knowledge discovery.

8. Applied Visualization Techniques

This section provides an overview of close and distant reading visualizations examined in the papers in our collection. In addition, we outline strategies for combining close and distant reading visualizations that facilitate a multifaceted analysis of the underlying textual data. Table 5 shows a distribution of the applied techniques according to the text analysis task taxonomy (see Table 4). For some research tasks, favored visualization methods stand out. A detailed overview of which techniques are suited for which text analysis tasks is given below.

8.1. Close Reading Techniques

A visualization that allows close reading of a text requires that the structure of the text be retained in order to facilitate a smooth analysis. With additional information in the form of manual annotations or of automatically processed features of textual entities or relationships among them, a plain text can be transformed into a comprehensive knowledge source. As can be seen in Table 5, the application of close reading techniques is particularly important when analyzing similar patterns. For text edition comparison, close reading is necessary to discover occurring similarities and differences, and a close reading of similar linguistic patterns or text re-use patterns helps to analyze the contexts in which these patterns were used. When focusing on a text of interest, close reading techniques are applied to illustrate various text features. Other research tasks concerning named entities, topics and corpus analysis rather investigate generic features of a corpus, and apply mostly distant reading techniques. Still, close reading is sometimes helpful to connect a computationally gained distant view with the underlying source texts.

While ten visualizations provide only plain close reading views without additional information, 56 visualizations attend to the matter of enhancing the close reading capabilities of the humanities scholars. To visualize such additional information for a great variety of purposes, the researchers made use of the techniques listed below.

(a) Colored backgrounds and backgrounds with varying transparency (Figure provided by Alexander et al. and based on [AKV∗14]).
(b) PRISM uses color to highlight the classification of words and font size to encode the number of annotations (Figures under CC BY 3.0 license based on [WMN∗14]).
(c) Circumcircles in the Poem View (left) and connections in the Path View (right) highlight rhyme sets in poems (Figure produced with Poemage [Poe16] based on McCurdy et al. [MLCM16]).
Figure 6: Color usage for close reading.

Color is the visual attribute most often used to display the features of textual entities and it is applied in different ways. In most cases, a colored background is used to express various types of information about a single word or an entire phrase (Figure 2). The tool Serendip [AKV∗14] varies the transparency of background colors to encode the importance of individual words (Figure 6a). Font color is also frequently used for this purpose (Figure 6b left). Colored circumcircles (Figure 6c) around words are used only once [MLCM16]. When displaying digital editions of literary texts, insertions are underlined. This might be the reason why the metaphor of underlining words is rarely used to enhance close reading [CWG11]. Overall, coloring is a suitable method to express a great variety of textual features. Among other purposes, coloring is used for the analysis of similar patterns, e.g., to mark common words (e.g., [JRS∗09, Mur11]) and aligned text segments (e.g., [ZNMS15, RPSF15]) in parallel texts, or when exploring a text of interest, e.g., to highlight



the automated or manual classification of words or phrases (e.g., [KJW∗14, WMN∗14]), or to visualize various sound patterns in poems (e.g., [CTA∗13, Ben14]).

Figure 7: Poem Viewer uses glyphs to encode phonetic units, and connections show phonetic and semantic relationships (Figure reproduced with permission from Abdul-Rahman et al. [ARLC∗13]).

Font size is another method of visualizing features of textual entities. Adopted from tag cloud design [VWF09], this metaphor serves best to highlight the significance or weight of a textual entity in relation to the given text or corpus. In the design of a variant graph [JGF∗15, JG15], which is a directed acyclic graph that is used for text edition comparison as it visualizes differences and similarities among text variants, font size encodes the number of occurrences of a word in all editions (Figure 9a). Within the web-based tool PRISM [WMN∗14], users collaboratively group the words of literary texts into different categories. The collected statistics are used to display the number of annotations of each word by variable font size (Figure 6b right). In [CWG11], varying font size is used to visualize the importance of text passages according to the user’s preferences.

Glyphs attached to individual textual entities are convenient techniques to visualize abstract annotations that are hardly expressible with plain coloring or varying font size. All examples we found enhance the close reading of a text of interest, mostly poems. In [ARLC∗13], phonetic units are drawn atop each word using color to classify phonetic types (Figure 7). Additionally, pictograms illustrate phonetic features. The Myopia Poetry Visualization tool uses rectangular blocks to visualize poetic feet and the spoken length of syllables [CGM∗12]. For the visualization of a poem’s hermeneutic structure, Piez deploys glyphs in the form of rectangular and circular maps [Pie10, Pie13]. An example is given in Figure 8. Goffin explores the placement and design of so-called word-scale visualizations, which are small glyphs enriching the base text with additional information [GWFI14]. For example, the background color of words contained in digital copies indicates OCR certainty. Furthermore, small interactive bar charts illustrate variants of observed words.

Figure 8: Close reading example with glyphs illustrating the hermeneutic structure of a poem (Figure provided by Piez and based on [Pie10]).

(a) Variant graph for seven English translations of Genesis 1:5 connecting subsequent words displayed with variable font size (Figure based on [JGF∗15]).
(b) Juxta commons supports close reading of two parallel texts by connecting related text passages (Figure provided by Wheeles et al. and based on [WJ13b]).
Figure 9: Connections used for text alignment.

Connections help to illustrate the structure among textual entities and are most often applied to support text analysis tasks concerning similar patterns. One usage of connections in close reading is to highlight subsequent words in a variant graph to track variation among text editions [BGHE10]. As shown in Figure 9a, colored links can help to identify certain editions [JG15, JGF∗15]. Other approaches juxtapose the texts of various editions and visually link related text passages [WJ13b, HKTK14, JGBS14], as instantiated in Figure 9b. Furthermore, connections can also be used to vi-


sualize sentence structure [KZ14]. Two close reading visualizations use connections for the analysis of sound in poems. While Abdul-Rahman [ARLC∗13] illustrates phonetic and semantic relations within poems (Figure 7), McCurdy [MLCM16] draws paths between words of a poem sharing the same tones (assonances) to highlight sonic patterns (Figure 6c).

8.2. Distant Reading Techniques

A visualization that displays summarized information of the given text corpus facilitates distant reading. The process of transforming such information into complex representations can be based upon a large variety of data dimensions, e.g., various types of metadata of textual entities, automatically processed or manually retrieved relationships between textual elements, or quantitative and qualitative statistics about unstructured textual contents.

An overview of applied distant reading visualizations according to the text analysis task taxonomy is given in Table 5. The overall usage of such techniques suggests their importance for nearly all text analysis tasks in digital humanities, even when the close reading of a text is more important, e.g., when focusing on a text of interest or analyzing similar patterns.

Within our research papers collection we found 132 visualizations providing a distant reading view of a given text corpus. We extracted and grouped various approaches found to visualize summarized information into the following six categories.

Heat maps or block matrices are often used to highlight text snippets, especially when analyzing similar patterns and in corpus analysis tasks. Thereby, a heat map may reflect structural elements of a text [JRS∗09, VCPK09] or the structure of an entire corpus [CDP∗07, Mur11, BGHJ∗14]. In such scenarios, the coloring of rectangular blocks helps to analyze the distribution of specific textual patterns [CWG11, MH13, JKH∗15]. Another example is the usage of heat maps to show relationships among various texts in a corpus. The similarity for each tuple of texts within the corpus can be determined by counting similar text passages, and the result can be visualized as a heat map [GCL∗13, FKT14], e.g., to highlight the similarity between Shakespearean plays [RRRG05]. Heat maps are also applied to visualize similarities or differences among text editions [JG15, PMMR15], or to highlight re-used passages between the texts of a corpus [JGBS14, RARC∗15, ZNMS15]. For the analysis of potentially plagiarized texts, so-called Difflines reveal structural differences between several suspicious text fragments and their alleged originals in a Focus+Context view [RPSF15]; an example is shown in Figure 10. A further heat map variant is the fingerprinting technique introduced in [KO07] in order to visualize characteristic textual features of literary works. In text analysis tasks concerning named entities, heat maps can be used to analyze places mentioned in texts [AGZH15], or to reveal interpersonal relationships between characters in prose literature [OKK13]. For the analysis of topics, Alexander et al. [AKV∗14] propose two matrix representations. The RankViewer illustrates the ranking of words belonging to topics and the CorpusViewer shows relations to certain topics for each document of a corpus. Heat maps are also used in [MSR∗15] to display “high-level summaries” of topic modeling results. Finally, heat maps are used to analyze a text of interest [KJW∗14], e.g., to visualize the similarity [CTA∗13] or the flow [FS11, Ben14] of sound in poems.

Figure 10: Heat map highlighting potentially plagiarized text passages in a PhD thesis (Figure reproduced with permission from Riemann et al. [RPSF15]).

Tag clouds are intuitive visualizations to encode the frequency of words within a selected section, a whole document or an entire text corpus by using variable font size.

[Figure 11 panels: Romeo and Juliet, As You Like It, Macbeth, Othello, Richard III.]
Figure 11: Comparing the tags in five Shakespeare plays (Figure based on [JBR∗15]).
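The core idea of a tag cloud, mapping word frequency to font size, can be sketched in a few lines. This is a minimal illustration and not the method of any tool cited here: the linear scaling, the 10 to 40 point range and the names `tag_sizes`, `min_pt` and `max_pt` are assumptions.

```python
import re
from collections import Counter


def tag_sizes(text, top_n=5, min_pt=10, max_pt=40):
    """Map the top_n most frequent words to font sizes by linear scaling."""
    counts = Counter(re.findall(r"[a-z']+", text.lower())).most_common(top_n)
    if not counts:
        return {}
    hi, lo = counts[0][1], counts[-1][1]
    span = max(hi - lo, 1)  # avoid division by zero when all counts are equal
    return {word: min_pt + (n - lo) * (max_pt - min_pt) / span
            for word, n in counts}


# Invented sample text: "love" occurs 3 times, "night" twice, "death" once.
sample = "love love love night night death"
print(tag_sizes(sample))  # {'love': 40.0, 'night': 25.0, 'death': 10.0}
```

Real tag cloud layouts additionally have to place the scaled words without overlap, and often use logarithmic rather than linear scaling so that a few very frequent words do not dominate.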



Tag clouds are therefore a suitable method for corpus analysis tasks [VCPK09, Bea12, FKT14, GTAHS15]. TagPies (a tag cloud arranged in a pie chart manner) support the comparative analysis of the co-occurrences of search terms [JBR∗15], an example of which is shown in Figure 11. Other approaches visualize the temporal evolution of tags in tag clouds, either by listing tags per time period [CVW09], or by attaching a time graph to each tag [LRKC10]. Beaven uses tag clouds to illustrate collocational relationships of a single word [Bea08] and to compare the collocates between two words [Bea11]. Tag clouds are also applied when analyzing topics, e.g., by displaying a topic’s characteristic tags [BJ14, ESK14, JOL∗15, MSR∗15] (an example can be seen in Figure 16b), or by summarizing the major tags for certain time periods [CLT∗11, CLWW14]. The usage of tag clouds to explore the classification of speculative fiction anthologies [HFM16] is shown in Figure 13. Tag clouds are rarely applied for the analysis of named entities [HAC∗15] (see Figure 12a) or when focusing on a text of interest [KJW∗14]. In some of the above-mentioned works, tag coloring is used to express additional information such as the temporal evolution of a word’s significance or the classification of tags.

(a) Linked views for the exploration of commodity trading (Figure reproduced with permission from Hinrichs et al. [HAC∗15]).
(b) Colored shapes encode emotions about London places (Figure reproduced with permission from Heuser et al. [HAHT∗15]).
(c) Fictional map of Yoknapatawpha County and related places (Figure reproduced with permission from Dye et al. [DNCM14]).
Figure 12: Maps supporting distant reading.

Figure 13: Exploring the tree structure and representative tags of a novel classification (Figure provided by Hinrichs based on [HFM16]).

Maps are widely used to display the geospatial information contained in a text. Most often, maps support the analysis of named entities extracted from a text or an entire corpus. Two works illustrate the geographical areas which are associated with persons [Wil15b], e.g., by mapping the places of activity of musicians manually extracted from musicological literature in order to support the geospatial comparison of musicians’ activity regions [JFS16]. But usually, places mentioned in texts are analyzed. With the help of contemporary (e.g., GeoNames [Geo15]) and historical gazetteers (e.g., Pleiades [Ple15]), the extracted placenames can be enriched with geographical coordinates, and their visualization on a map supports the analysis of the (fictional) geographic space described in the source text(s). Some approaches use thematic [ÓML14] or density maps [GH11b, BB15b] for this purpose, but the usage of glyphs in the form of circles is more frequent [Tra09, DWS∗12, HAC∗15, Wil15a] as it simplifies the interaction with individual plotted places (e.g., see Figure 12a). In [HAHT∗15], circles are used to map discrete London places occurring in fictional literature, and polygons represent wider spaces such as neighborhoods or districts of London. The coloring of shapes indicates the emotions collected about these places (see Figure 12b). In [JW13], various glyphs encode various types of places occurring in medieval texts. Two works that focus on mapping the geographical knowledge of ancient Greek authors draw connections between glyphs to illustrate travel routes [EJ14] or to highlight the strength of the relationship between placenames, which is reflected by the number of co-occurrences [BPBI10]. In contrast to the previous works, the geospatial metadata associated with individual corpus texts (text creation timestamp) can be used for mapping [MBL∗06]. The visualization of Faulkner’s fictional Yoknapatawpha County includes various means of geographic mapping [DNCM14]: on the one hand, the imagined geography and, on the other, the placenames displayed on the geographic levels region, nation and world (Figure 12c). In addition to named entity analysis, maps are used for the analysis of topics [DFM∗08, GDMF∗14] and corpus exploration [JHSS12].
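Enriching extracted placenames with coordinates and counting their co-occurrences, as described for the map-based approaches, could look roughly as follows. The gazetteer is a tiny hand-made dictionary with approximate coordinates that merely stands in for a service such as GeoNames or Pleiades (neither API is used here), and treating two placenames in the same sentence as one co-occurrence is an assumed definition. All names and example sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini gazetteer: placename -> (latitude, longitude), approximate.
GAZETTEER = {
    "Athens": (37.98, 23.73),
    "Sparta": (37.07, 22.43),
    "Delphi": (38.48, 22.50),
}


def places_in(sentence):
    """Return gazetteer placenames found in a sentence (naive substring match)."""
    return sorted(p for p in GAZETTEER if p in sentence)


def co_occurrences(sentences):
    """Count how often two placenames appear in the same sentence."""
    pairs = Counter()
    for sentence in sentences:
        pairs.update(combinations(places_in(sentence), 2))
    return pairs


# Invented example sentences.
sentences = [
    "The envoys travelled from Athens to Sparta.",
    "From Sparta they went on to Delphi.",
    "Athens and Sparta were at war.",
]

for (a, b), weight in co_occurrences(sentences).items():
    # Each pair becomes a weighted link between two map coordinates.
    print(a, GAZETTEER[a], b, GAZETTEER[b], weight)
```

On a map, each placename would be drawn as a glyph at its coordinates and each counted pair as a connection whose width or color reflects the co-occurrence count.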


Figure 14: A combination of an abstract timeline view and a tree in EMDialog (Figure provided by Hinrichs based on [HSC08]).

Timelines are appropriate techniques to visualize historical text corpora carrying various types of temporal information. Often, timelines support the analysis of named entities. In [JW13], uncertain datings of medieval texts are visualized to support the analysis of cited places. Sometimes, the temporal information about events reported in a text needs to be extracted in order to visualize (fictional) calendars [DNCM14, GDMF∗14, ÓML14, HAC∗15] (e.g., see Figure 12a). For the exploration of placenames in Herodotus’ Histories, a timeline is used to show where certain placenames occur in the text [BPBI10]. Timelines also support the temporal analysis of topics [HFM16], e.g., the exploration of events in news articles [ESK14] (see Figure 16b). Furthermore, timelines are used for corpus analysis tasks [JHSS12]. A somewhat abstract timeline view is shown in [HSC08]. Here, a so-called tree cut section, whereby each ring represents a decade, visualizes statements from and about Emily Carr’s life and work (Figure 14). Streamgraphs are popular techniques that produce aesthetic visualizations and allow tracking the evolution of topics over time [BW08], thus generating enhanced versions of the timeline metaphor. Such visualizations are often based on newspaper sources [CLT∗11, KBK11, DWS∗12, CLWW14] or political text archives [Kau15, Poi15] to support the analysis of contemporary topic changes. Using a research paper pool [ARR∗12], the changing importance of research topics can be explored. Streamgraphs may also be used to support the analysis of storylines in a text of interest to illustrate plot evolution and changing locations [LWW∗13]. Based upon Hollywood screenplays, the tool ScripThreads visualizes action lines of movie characters [HPR14].

(a) Network of 55 poetic texts. Imitated texts marked in blue, sequels marked in red (Figure reproduced with permission from Eder [Ede14]).
(b) Excerpt from the social network in Mikhail Bulgakov’s Master and Margarita (Figure provided by Maciej Ceglowski and reproduced with permission from Coburn [Cob05]).
(c) Thomas Jefferson’s social relationships (Figure reproduced with permission from Klein [Kle12]).
Figure 15: Graphs supporting distant reading.

Graphs are valuable visualizations for all text analysis tasks. They are most often applied to visualize certain structural features of a text corpus. A common usage is the visualization of relationships between the texts (represented as nodes) of a corpus in the form of a tree [HFM16] (see Figure 13) or a network [WH11]. Proximity can be used to express the similarity of these texts [EX10, MRMK15], e.g., based upon stylistic closeness [Joc12, CEJ∗14]. For instance, Eder visualizes a network of poetic texts based on stylistic features [Ede14], thereby highlighting connected editions and sequels (Figure 15a). Graphs can also be used to illustrate topics (as nodes) and proximities of topic models applied to text corpora [Kau15]. In other works, nodes represent the words of a corpus and links are drawn to reflect semantic relationships [AGL∗07, RFH14, OGH15] or co-occurrences [VCPK09, KKL∗11, WJ13a, BB15b].



(a) Interactive dot plot visualizes text re-use patterns between two Arabic texts (Figure based on [JGBS14]).
(b) Combination of a dust-and-magnet visualization, a timeline and a tag cloud for browsing historical newspapers (Figure reproduced with permission from Eisenstein et al. [ESK14]).
(c) Topological landscape visualizes thematic clusters in the New York Times corpus (Figure based on [OST∗10]).

Figure 16: Miscellaneous distant reading methods.

In [HAHB15], a graph visualizes cross references in his- tion of Thomas Jefferson’s social relationships (Figure 15c),
toric encyclopedia by linking related entries. Further appli- the nodes placed on a vertical axis are connected with
cations are the visualization of scene changes and charac- arcs [Kle12]. Riche proposed a layout for Euler diagrams,
ter movements in Shakespearean plays [RRRG05], as well as which can also be utilized to visualize relationships between
the display of conceptual [Arm14], contextual [HSC08] or characters extracted from Shakespearean texts [RD10]. Fi-
multilingual [GZ12] information. Phrase nets connect tex- nally, GeneaQuilts smartly visualizes large genealogies ex-
tual entities that appear in the form of a user-specified re- tracted from literary texts such as the Bible [BDF∗ 10].
lation (syntactic or lexical) [vHWV09]. All aforementioned
works apply force-directed algorithms for the placement of Miscellaneous methods also produce beneficial results
nodes. Radial graphs can be used to unveil the relationships for certain text analysis tasks, most often for the explo-
of words within poems [MFM13], or, again, to highlight the ration of similar patterns. In [JGBS14], an interactive dot
similarity among texts and, in this case, as nodes radially plot interface is used to visualize and explore patterns of
grouped by authors [Wol13]. The Word Tree [WV08], also text re-use between two texts (Figure 16a). In [GCL∗ 13],
used in [MH13], visualizes sentences sharing the same be- a parallel coordinates and a dot plot view, which is used for
ginning in the form of a tree. In contrast to the variant graph, filtering purposes, visualizes the similarity of parallel text
a technique that supports close reading of textual editions, sections. Sankey diagrams are used to compare the cate-
the Word Tree is a distant reading technique as it dissolves gories of words contained in two books [HCC14], and to
the order of sentences. Finally, we found a method that visu- highlight plagiarized text passages when juxtaposing a PhD
alizes plain event trigraphs extracted through phrase mining thesis to potential sources [RPSF15]. For the analysis of
algorithms and thus providing metaphors to display uncer- repetitions in Gertrude Stein’s The Making of the Ameri-
tain information [MLSU13]. When analyzing named enti- cans [CDP∗ 07], parallel coordinates visualize the frequency
ties, graphs are the means of choice to visualize the rela- of phrases across sections, and TextArc [Pal02] is used to ex-
tionships between people in the form of social networks. plore the repetition of individual words. Two miscellaneous
Such representations are widely applied in the digital hu- methods are applied to analyze topics. For the exploratory
manities to illustrate the relationships between characters thematic analysis of historical newspaper archives [ESK14],
in literary texts [CSV08, Tót13, BB15a, TFK15]. In these an application of the dust-and-magnet metaphor [YMSJ05]
graphs, the size of a node can be used to encode the fre- yielded useful results (Figure 16b). Another topical anal-
quency of a character name in the text [BHW11, Pet14], ysis technique uses a landscape metaphor to visualize the
the thickness of an edge [Cob05] (Figure 15b) or the prox- topology-based clustering of articles taken from the New
imity of the nodes [Poi15, JFS16] can serve to reflect the York Times Corpus [OST∗ 10] (Figure 16c). Various meth-
strength of a relationship, and edge style can be used to clas- ods were also developed to support the analysis of a text of
sify the type of relationship [KOTM13]. As per the afore- interest. The tool PlotVis allows users to model and inter-
mentioned works, Kochtchi uses a force-based graph lay- act with XML-encoded literary narratives in 3D [PBD14].
out to visualize social networks automatically extracted from A further complex tool named “Simulated Environment for
newspaper articles [KLB14]. In contrast, radial layouts and Theatre (SET)” supports the story flow simulation of theatri-
parallel coordinates are used in [Boo13]. For the visualiza- cal plays [RSDCD∗ 13]. It consists of various 2D interfaces


illustrating the “line of action” and a 3D interface populated by character avatars. For the analysis of word statistics & relationships, tree maps are used to illustrate the occurrences of adjectives in fairy tales in [WJ13a]. The Column Explorer introduced in [JFS16] supports the analysis of named entities, in that case by comparatively visualizing biographical profiles of musicians.

8.3. Techniques for Combining Close and Distant Reading

Most of the visualizations we found provide either a close or a distant reading of a text corpus. Still, an important feature for literary scholars when working with distant reading visualizations is direct access to source texts or, in other words, close reading. Among the papers in our collection providing close and distant reading, some visualizations combine both techniques, most often in the form of coordinated views [WBWK00]. We do not consider the methods in [WH11, CTA∗13, Ben14, BJ14] as the presented visualizations for close and distant reading serve different purposes, and are not connected to one another. Table 5 orders the remaining 37 techniques, which are outlined in detail below, according to the given text analysis tasks in three groups.

Bottom-up methods focus primarily on close reading, especially when focusing on similar patterns. In [GCL∗13], the user selects a desired text passage in Shakespeare’s Othello, which is shown in various German translations. Distant reading visualizations are processed (parallel coordinates view, dot plot view, heat map) based on that selection. Another bottom-up approach supports the semi-automatic alignment of Early New High German text variants [MRMK15]. A graph displaying the similarities between text editions is updated as annotations are collected in close reading sessions. In [Mur11], the literary scholar selects a certain phrase during the close reading process. Next, that phrase is searched within the text corpus and the phrase’s distribution is shown in the form of a heat map. Two approaches provide bottom-up strategies to support the analysis of named entities. When annotating literary texts [AGZH15], places related to Edinburgh are marked, and a linked heat map that displays the distribution of all annotations is accordingly updated. In [OGH15], the user explores automatically tagged named entities of scientific papers in close reading mode. After editing, a graph reflecting contained entities and relationships among them is generated.

Top-down & bottom-up approaches taken within one visualization entity allow for switching between close and distant reading while taking into account manipulations of the preceding view. Some of these approaches support the analysis of similar patterns. In [JRS∗09] and [PMMR15], the user can switch between heat map (distant reading) and text view (close reading). A side-by-side navigation between source text (close reading) and a distant reading graph showing the relationships among textual entities is illustrated in [WV08, RFH14]. Here, textual entities can be selected in both the graph and the text, triggering mutual updates. Other text analysis tasks also benefit from the combination of top-down and bottom-up approaches. Visual analytics methods are a typical use case. The VarifocalReader [KJW∗14] hierarchically visualizes a document with the help of distant views (structural overview, tag clouds) and close reading techniques (use of color, digital copy), thus supporting hierarchical navigation. In close reading mode, automatically acquired classifications of textual entities can be manually modified, which subsequently affects distant views. The same applies to social networks automatically extracted from newspaper articles [KLB14]. The user browses the graph, opens close reading views associated with individual nodes and annotates the source text, which, again, affects the distant view and is used for classifier training. WordSeer [MH13] allows for a multifaceted perusal of a text corpus. For selected textual entities, several close and distant reading views can be used to browse the corresponding source texts. Within the close reading views, the user can group words into classes, which can then be used as a starting point for text corpus analysis.

Top-down strategies support nearly all types of text analysis tasks. They are mostly applied to combine close and distant reading visualizations. Such methods implement the Information Seeking Mantra in its original meaning. Initially, a distant view of the textual data is shown; the user can then often manipulate the visualization by means of filtering and zooming, and finally retrieve details-on-demand by clicking on a potentially interesting data item. In some cases, the texts are simply shown at the end of the information seeking pipeline [HSC08, DWS∗12, MFM13, RSDCD∗13, FKT14, Wil15a, Wil15b, HFM16]. Observed words or text patterns are often highlighted in the close reading view by way of coloring [VCPK09, GZ12, Wol13, AKV∗14, HPR14, HAC∗15, JBR∗15, JKH∗15]. Various colors can thereby illustrate word categories [CDP∗07], e.g., types of toponyms in the Herodotus Timemap [BPBI10] or topological cluster information [OST∗10]. In some systems, close reading is more closely related to the preceding distant reading. In [BGHJ∗14], the connection between close and distant reading is achieved by zooming. The distant view, a structural overview, highlights certain patterns, and zooming allows the close reading of individual passages. In [JGBS14], a grid-based heat map visualizes similarities between the texts of a corpus, and clicking on a grid cell opens a close reading view showing the corresponding two texts juxtaposed with connections between related text passages. Similarly, the navigation between distant plagiarism overviews and the close reading of plagiarized passages is organized in [ZNMS15] and [RPSF15]. A distant reading visualization illustrating the variance of verses among multiple Bible editions provides distant views as heat maps on various text hierarchy levels (entire Bible, book, chapter) [JG15]. In the



chapter view, the close reading of individual verses is possible. The CorpusSeparator presented in [CWG11] is a distant view used to generate a weighted tag list (dependent on corpus statistics). Based upon these weights, the close reading view of a text (illustrated with Shakespeare’s A Midsummer Night’s Dream) is manipulated by coloring and sizing lines.

9. Collaboration Experiences

Within our collection, we examined papers about the research experiences reported by visualization researchers in order to provide suggestions that might help visualization scholars new to the field of digital humanities to develop successful visualizations. Some projects reveal valuable insights into collaboration experiences. Excellent design studies are outlined by Abdul-Rahman et al. [ARLC∗13], McCurdy et al. [MLCM16] and Hinrichs et al. [HFM16]. All applications were successfully presented in visualization and digital humanities issues. Other publications also share important experiences. A collective overview of the gained insights regarding various aspects of the development phase is outlined below.

Project start. The beneficial, initial decision of carrying out a user-centered design study [Mun09] is reported in various works (e.g., [ARLC∗13, JFS16]). This leads to a very close collaboration between researchers of the different fields, which helps to avoid steering the development of a visualization in the wrong direction. Further important tasks at the beginning of a digital humanities project are discussions about the research questions and perspectives for which a visualization, be it for close or distant reading, can be beneficial [JGBS14]. These discussions include the analysis of the data features [VCPK09] as well as the setup of regular project meetings to work on and extend a collaborative idea. A typical problem of digital humanities projects is reported in [MLCM16]. The “initial conversations [between visualization and humanities scholars] were broad and open-ended,” also, because the humanities scholars “did not have specific goals” in mind. Furthermore, the humanities scholars were sceptical that visualization can support their research, and there was also an “anxiety that the computer would inhibit the qualitative experience of the poetic encounter.” After humanities scholars presented examples of interesting features and computer scientists “established methods for computationally detecting and analyzing the devices that most interested them,” a common project basis and tasks had been generated. In such circumstances, special workshops can also help computer scientists and humanities scholars get acquainted with each other’s tasks, mindsets and workflows [ARLC∗13]. Abdul-Rahman reports the importance of visualization researchers participating in poetry readings and in-depth discussions with literary scholars to discover “a variety of interesting problems that might be subject to visualization solutions.” Also, a small corpus generated for literary scholars was helpful for Abdul-Rahman to examine research questions without the aid of existent visualizations. The generation of a text corpus is often a long-running task for humanities scholars that begins with the project launch [HFM16]. As a consequence, visualization researchers start with a small training set, and should therefore design a visualization as flexible as possible in order to accommodate potential changes of humanities scholars’ research interests, and to avoid limitations. In the best case, the text corpus to be analyzed is already available in digital form and a precise research question is at hand, as outlined in [JFS16].

Iterative development of prototypes. The involvement of humanities scholars in various stages of the development is necessary to ensure the creation of an intuitive visualization that will be used. For example, regular face-to-face sessions between computer scientists and humanities scholars can help to identify problems and potential enhancements of the prototype design [JGBS14]. Such a session should be composed of a demonstration and trials of the visualization prototype as well as intense discussions in order to gather the levels of detail and complexity that a visualization should ideally reach [ARLC∗13]. Geßner [JGF∗15] stated that such a process finally helps to gain an intuitive result “even for the inexperienced, maybe sceptical user.” When designing a profiling system for musicians [JFS16], frequent interdisciplinary get-togethers were important for the visualization researchers to communicate their own concerns and to iteratively redesign the underlying mathematical basis (similarity measures), thereby ensuring that aspects of data transformation remained comprehensible for the collaborating musicologists. Hinrichs et al. [HFM16] outline that scholarly exchange is of particular importance if the textual data source evolves throughout the project. On the one hand, the archival work of humanities scholars when working on the dataset may further develop hypotheses that trigger new visualization ideas, and on the other hand, a visualization has the potential of changing the humanities scholars’ research processes and their perspective on a text collection. For the development of Poemage [MLCM16], frequent meetings helped visualization researchers in understanding the problem space and engaged literary scholars in working with the visualization, and finally, in developing “an interface that reflected their interests, aesthetics, and values.” The authors also document that departing from well-established design principles (for example, by regarding “ambiguity as a fundamental source of insight” or by “not restricting the tool to avoid clutter”) was necessary to raise the value of the visualization. Another example is given by scholars involved in the development of Neatline [NMG∗13], which is based upon Omeka [Ome15], a content management system for online digital collections. The stepwise development of Neatline led to advancements of Omeka itself, thus benefiting a far wider audience than originally anticipated.

Evaluating visualizations with humanities scholars. The evaluation sessions provide important insights into design, intuitiveness, the utility of visualizations and into


potential enhancements. A number of humanities scholars working with the visualizations suggested further enhancements, some of which underline the importance of close reading solutions [CWG11, HFM16]. For example, when similar close and distant views were provided, “users stressed that it is preferable to see the actual words” rather than abstract overviews [JRS∗09]. When working with the VarifocalReader [KJW∗14], the user liked to view “the digitized image of a book’s page and mentioned that this would increase his trust in the approach.” The metaphor of a digitized text is also used when comparing various English translations of the Bible [JGF∗15], which “reminds the user that it is a book to be read, not just some string of letters.” Although the visualization was developed for museum visitors, [HSC08] reports the importance of aesthetic appeal for engaging users in information exploration. The fact that visualizations should be designed to meet humanities work practices is mentioned in [BDF∗10]. Some humanities scholars also mentioned issues or limitations with the presented tools. For instance, they noted the need to confirm preliminary results by analyzing larger datasets or, in other words, more texts in more languages [GCL∗13]. In [HAC∗15], the attached labeling was a crucial issue. The authors summarized the requirement of a visual representation “to be clear in order to make visualizations a valid research tool.” As stated in [JFS16], future extensions of visualizations usually require efforts from both computer scientists and humanities scholars. Scientists involved in [HKTK14] stated that collaborative work helped to reactivate and to regenerate traditional literary methodologies rather than abandon them. The turn from initial scepticism when starting the digital humanities project to enthusiasm when using the resultant visualization is reported in [MLCM16]. In a long-term evaluation, Hinrichs et al. summarize the potential of their developed visualization for the collaborating humanities scholars [HFM16]. As in their case study on fiction literature, a visualization should be able (1) to confirm existing hypotheses, (2) to refine humanities scholars’ research questions, (3) to offer new ways of answering research questions, (4) to negotiate quantitative and qualitative interpretation of the underlying text corpus, and (5) to trigger new research questions. Other visualization researchers share similar experiences gained in evaluation sessions with humanities scholars [CWG11, ARLC∗13, GCL∗13, HAC∗15, MLCM16]. One such example is given by Vuillemot et al. [VCPK09]. When working on Gertrude Stein’s The Making of Americans with POSVis, the collaborating literary scholar was able to gain substantial knowledge about the usage of the word “one”. This led to a publication she presented at the Digital Humanities conference 2009 [CPV09].

10. Future Challenges

Throughout our work on this survey, we marked major challenges in the digital humanities where the visualization community can contribute valuable research.

Adapting existing visualization techniques. For some of the text analysis research questions posed in the digital humanities, the adaptation of existent techniques proposed in visualization research papers is beneficial. A positive example is the Trading Consequences project [HAC∗15]. Involved visualization scholars designed a system inspired by VisGets [DCCW08] and made use of Parallel Tag Clouds [CVW09]. Neither visualization technique was primarily developed for digital humanities data, but both were beneficially adapted to support humanities scholars. Occasionally, new techniques for close and distant reading are designed while appropriate, sophisticated visualizations unrelated to digital humanities data already exist. For future research tasks, the inclusion of these visualizations into the workflows of humanities scholars could lead to faster hypothesis generation due to the shorter development time. As an example, the Sequence Surveyor, which provides a dendrogram to explore genomic structures [ADG11], could support future research. Each leaf of the dendrogram shows a heat map illustrating genome distributions. This metaphor could be used to visualize both the rhyme structure of a poem in dendrogram form and the heat maps displaying phonetic patterns. Other possible adaptations of existing visualization techniques for digital humanities research can be found in the previous version of this survey [JFCS15].

Novel techniques for close reading. Various publications outline that close reading benefits from visualization, e.g., by highlighting crowdsourcing statistics [KG13, WMN∗14] or displaying information about textual features and structure [ARLC∗13, JGF∗15] alongside the source text. Although close reading is an essential task for humanities scholars, in most cases only simple visualization techniques, such as color coding textual entities, are provided. Few works attend to the matter of enhancing close reading in a beneficial manner. For example, the work on word-scale visualizations is a promising technique [GWFI14] from which many humanities scholars may profit. But despite the proposed annotations of individual words with statistics or of country names with polygons, the concept needs to be expanded to annotating other kinds of named entities. For example, providing supplementary information about (1) acting persons and their relationships, (2) artifacts mentioned in texts, or (3) occurring references could be interesting features for humanities scholars. Future work in visualization should include the development of design methods to meet such use cases, and studies that measure the benefit of glyph-based approaches for close reading in comparison to using color or font size to express certain text features.

Visualizing transpositions in parallel texts. When observing similarities and differences among various editions of a text, one focus is to detect transpositions of textual entities. Such transpositions may occur on various text hierarchy levels, e.g., changed word order, modified argumentation structures, or even when exchanging whole paragraphs or sections. Although suitable methods exist for the first two



hierarchy levels (words, sentences) [WJ13b, JGF∗15], there are no visualization techniques capable of coherently visualizing transpositions on all hierarchy levels by combining means of close and distant reading.

Geospatial uncertainty. Many visualizations deal with placenames extracted from literary texts to illustrate the geographical knowledge of a particular era. Here, various mapping issues arise [JW13]. Texts may contain placenames of varying granularity (e.g., country, region, city) or type (e.g., points for cities, polygons for areas, polylines for rivers) or even fictional placenames, which are hard to represent. Furthermore, placenames can themselves carry uncertainty of varying degrees, e.g., the exact locations of “Sparta” and “Atlantis” have yet to be discovered. Another form of uncertainty is defined by contextual information, e.g., expressions like “in London” and “close to London” cover different geospatial ranges. The development of a design space providing solutions to visualize these various types of geospatial uncertainty is one of the current primary challenges in digital humanities. Such a design space could be built upon the ideas of MacEachren for visualizing geospatial uncertainty [MRO∗12].

Temporal uncertainty. The visualization of temporal uncertainty is an equally important future task. Such uncertainties occur, for instance, when dating cultural heritage objects, such as historical manuscripts [JW13, BESL14]. Temporal metadata, in fact, can be provided in multifarious manners, e.g., 1450, before 1450, after 1450, around 1450, 15th century, first half of the 15th century, etc. One can try to transform such temporal formats into machine-parsable time ranges, but the visualization of such uncertainties is a crucial issue as it comprises considerable risks of misinterpretation. Applying methods capable of visualizing temporal uncertainty as proposed by Slingsby [SDW11] can be a first step, but their utility for humanities applications needs to be investigated.

Reconstructing workflows with visualization. In two visualization papers, the authors relate situations in which, during their case studies, humanities scholars mentioned the importance of visualization features that emulate the scholar’s workflow. In [KJW∗14], users liked the display of digital copies as this builds trust in the visualization. When working with genealogy visualizations [BDF∗10], historians “insisted on redundant representation of gender ... that is consistent with their current practices.” Both situations illustrate the future challenge of inventing visualization techniques for digital humanities applications that the humanities scholar can easily adapt. An important task for the computer scientist is not only to incorporate a scholar’s workflow when designing the visualization, but to also communicate all aspects of data transformation, so that a scholar is able to generate trustworthy hypotheses. The importance of this issue is documented in [GO12].

Usability studies. Although the utility of most visualizations considered within our survey was illustrated by usage scenarios, we found little evidence of usability studies conducted to, for example, justify the design decisions taken. The number of humanities scholars participating in such studies is potentially very small due to the multifarious research interests scholars may have in a large body of texts belonging to different eras and genres. Generating a user study format that caters for the interests of many different scholars is required to gain valuable insights into guidelines for designing visualizations for the digital humanities. When it comes to tool building, in fact, the digital humanities community poses interesting and complex challenges by virtue of its interdisciplinary nature. It embraces a wider range of disciplines, so the techniques it offers should address the larger scope. It also welcomes contrasting mindsets, methods and cultures. While sharing similar logical and analytical methods, computer scientists tend towards problem solving, humanities scholars towards knowledge acquisition and dissemination [Hen14]. Neither community should operate in subservience to the other; rather, they should work together, complementing each other’s approaches. For these reasons and in this context, specialist terminology, assumptions and technical barriers should all be avoided. It is in this sense that tool usability should be understood not only as improved functionality or aesthetics but as a transparent guide to utility [GO12].

Qualitative studies. The number of projects that include visualization components as valuable means of text analysis indicates the potential of visualization to support digital humanities research. Some scholars see the role of visualizations as providers of new perspectives on the texts that facilitate text comprehension and hypothesis generation. For example, humanities scholars involved in the development of the PoemViewer [ARLC∗13] mentioned that “they would not likely look for insight from the tool itself ... they would look for enhanced poetic engagement, facilitated by visualization.” Hinrichs et al. [HFM16] state that “information visualizations ... are not a means to an end but a starting point to explore, interpret, and discuss literary collections.” Similarly, Sinclair [SRR13] argues that “a visualization that produces a single output for a given body of material is of limited usefulness; a visualization that provides many ways to interact with the data, viewed from different perspectives, is better; a visualization that contributes to new and emergent ways of understanding the material is best.” Comprehensive case studies that scientifically debate the actual influence and impact of visualization could further specify its role and further strengthen its value as part of humanities research.

11. Conclusion

Computer scientists and humanities scholars seemingly do not have many things in common. Although they share some methodologies, they are geared towards different goals. But the digital age created a platform that brings people from two research areas together: the digital humanities.
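
The temporal uncertainty challenge discussed in Section 10 mentions transforming dating expressions such as “before 1450” or “15th century” into machine-parsable time ranges. As a minimal sketch of that step, the following Python function maps the formats listed there to inclusive year intervals. The function names, the margin constants and the convention that the 15th century spans 1401 to 1500 are assumptions made for illustration; they are not taken from this survey or from [SDW11].

```python
import re

# Assumed widths (in years) for vague qualifiers; illustrative values only,
# not taken from the survey or from any cited work.
AROUND_MARGIN = 10   # "around 1450" -> 1440..1460
OPEN_SPAN = 50       # "before 1450" -> 1400..1449, "after 1450" -> 1451..1500


def century_bounds(n):
    """First and last year of the n-th century, e.g. 15 -> (1401, 1500)."""
    return (n - 1) * 100 + 1, n * 100


def parse_date(expr):
    """Map a dating expression to an inclusive (earliest, latest) year range."""
    text = " ".join(expr.lower().split())  # normalize case and whitespace

    # A plain year such as "1450" is treated as a range of length one.
    m = re.fullmatch(r"(\d{3,4})", text)
    if m:
        year = int(m.group(1))
        return year, year

    # Qualified years: "before 1450", "after 1450", "around 1450".
    m = re.fullmatch(r"(before|after|around) (\d{3,4})", text)
    if m:
        year = int(m.group(2))
        if m.group(1) == "before":
            return year - OPEN_SPAN, year - 1
        if m.group(1) == "after":
            return year + 1, year + OPEN_SPAN
        return year - AROUND_MARGIN, year + AROUND_MARGIN

    # Centuries, optionally restricted to their first or second half.
    m = re.fullmatch(
        r"(?:(first|second) half of the )?(\d{1,2})(?:st|nd|rd|th) century", text
    )
    if m:
        start, end = century_bounds(int(m.group(2)))
        middle = start + 49
        if m.group(1) == "first":
            return start, middle
        if m.group(1) == "second":
            return middle + 1, end
        return start, end

    raise ValueError(f"unsupported date expression: {expr!r}")


for expression in ["1450", "before 1450", "after 1450", "around 1450",
                   "15th century", "first half of the 15th century"]:
    print(expression, "->", parse_date(expression))
```

Keeping both interval ends, rather than collapsing each expression to a single year, leaves the uncertainty explicit so that a visualization can encode it, for example as an interval instead of a point on a timeline. How wide the intervals for “before”, “after” and “around” should be is a scholarly decision that would have to be agreed with the collaborating humanities scholars.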


with Semantic Networks. In Proceedings of the Digital Humani-


ties 2007 (2007). 8, 14
[AGZH15] A LEX B., G ROVER C., Z HOU K., H INRICHS U.:
Palimpsest: Improving Assisted Curation of Loco-specific Liter-
ature. In Proceedings of the Digital Humanities 2015 (2015). 6,
8, 12, 16
[AKV∗ 14] A LEXANDER E., KOHLMANN J., VALENZA R.,
W ITMORE M., G LEICHER M.: Serendip: Topic Model-Driven
Visual Exploration of Text Corpora. In Visual Analytics Science
and Technology (VAST), 2014 IEEE Conference on (Oct 2014),
pp. 173–182. 7, 8, 10, 12, 16
[AL09] ATHENIKOS S. J., L IN X.: WikiPhiloSofia: Extraction
and Visualization of Facts, Relations, and Networks Concerning
Philosophers Using Wikipedia. In Proceedings of the Digital Hu-
manities 2009 (2009). 5
Figure 17: Papers published in the visualization and digital humanities communities by year.

During our survey, we had the opportunity to examine a variety of fascinating digital humanities projects proposing visualization techniques that support a number of text analysis tasks. We classified the papers providing visualizations for historical texts according to our proposed taxonomy of text analysis tasks in digital humanities, categorized the applied close and distant reading techniques, and analyzed methods of combining both views to allow for multifaceted data analyses. In the process, we derived insights into a research area that requires the design of intuitive interfaces, yet visualizations for textual data as part of the cultural heritage are rarely published in the visualization community. Figure 17 shows the temporal distribution of the papers in our collection. The trend of related works published within the digital humanities reflects the increasing value of close and distant reading visualizations for text analysis tasks in recent years. Until now, the visualization community has not noticed or considered these needs. One reason may lie in the obstacles encountered when publishing application papers with a digital humanities background: the quantitative evaluations that are often demanded are hard to perform because the number of collaborating humanities scholars is usually limited.

To address the shortcomings we discussed, we listed future challenges in supporting humanities scholars' tasks with close and distant reading. Developing solutions could provide beneficial contributions to both research fields. Furthermore, we outlined collaboration experiences reported by visualization researchers working in the field of digital humanities as a means of singling out the important ingredients for a successful project.
References
[Ben14] B ENNER D. C.: "The Sounds of the Psalter: Computa-
[ADG11] A LBERS D., D EWEY C., G LEICHER M.: Sequence tional Analysis of Soundplay". Literary and Linguistic Comput-
Surveyor: Leveraging Overview for Scalable Genomic Align- ing 29, 3 (2014), 361–378. 8, 11, 12, 16
ment Visualization. Visualization and Computer Graphics, IEEE [BESL14] B INDER F., E NTRUP B., S CHILLER I., L OBIN H.:
Transactions on 17, 12 (Dec 2011), 2392–2401. 18 Uncertain about Uncertainty: Different ways of processing fuzzi-
[AGL∗ 07] AUVIL L., G ROIS E., L LORÀ X., PAPE G., G OREN ness in digital humanities data. In Proceedings of the Digital
V., S ANDERS B., ACS B.: A Flexible System for Text Analysis Humanities 2014 (2014). 19


c 2016 The Author(s)
Computer Graphics Forum
c 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities

[BGHE10] B ÜCHLER M., G ESSNER A., H EYER G., E CKART Q U H., T ONG X.: TextFlow: Towards Better Understanding of
T.: Detection of Citations and Textual Reuse on Ancient Greek Evolving Topics in Text. Visualization and Computer Graphics,
Texts and its Applications in the Classical Studies: eAQUA IEEE Transactions on 17, 12 (Dec 2011), 2412–2421. 8, 13, 14
Project. In Proceedings of the Digital Humanities 2010 (2010).
[CLWW14] C UI W., L IU S., W U Z., W EI H.: How Hierarchical
7, 8, 11
Topics Evolve in Large Text Corpora. Visualization and Com-
[BGHJ∗ 14] B ÖGEL T., G OLD V., H AUTLI -JANISZ A., puter Graphics, IEEE Transactions on 20, 12 (Dec 2014), 2281–
ROHRDANTZ C., S ULGER S., B UTT M., H OLZINGER K., 2290. 7, 8, 13, 14
K EIM D. A.: Towards visualizing linguistic patterns of deliber-
[CMS99] C ARD S. K., M ACKINLAY J. D., S HNEIDERMAN B.:
ation: a case study of the S21 arbitration. In Proceedings of the
Readings in information visualization: using vision to think.
Digital Humanities 2014 (2014). 5, 8, 12, 16
Morgan Kaufmann, 1999. 5
[BHW11] B INGENHEIMER M., H UNG J.-J., W ILES S.: Social
[Cob05] C OBURN A.: Text Modeling and Visualization with Net-
network visualization from TEI data. Literary and Linguistic
work Graphs. In Proceedings of the Digital Humanities 2005
Computing 26, 3 (2011), 271–278. 5, 8, 15
(2005). 8, 14, 15
[BJ14] B INDER J. M., J ENNINGS C.: Visibility and meaning in
topic models and 18th-century subject indexes. Literary and Lin- [Cor13] C ORDELL R.: "Taken Possession of": The Reprinting
guistic Computing 29, 3 (2014), 405–411. 5, 7, 8, 13, 16 and Reauthorship of Hawthorne’s "Celestial Railroad" in the An-
tebellum Religious Press. Digital Humanities Quarterly 7, 1
[BM13] B REHMER M., M UNZNER T.: A multi-level typology of (2013). 5, 7, 8
abstract visualization tasks. IEEE Transactions on Visualization
and Computer Graphics 19, 12 (2013), 2376–2385. 7 [CPV09] C LEMENT T., P LAISANT C., V UILLEMOT R.: The
Story of One: Humanity scholarship with visualization and text
[BNJ03] B LEI D. M., N G A. Y., J ORDAN M. I.: Latent Dirichlet analysis. In Proceedings of the Digital Humanities 2009 (2009).
Allocation. the Journal of machine Learning research 3 (2003), 18
993–1022. 7
[CRS∗ 14] C HRISTIE A., ROSS S., S AYERS J., TANIGAWA K.,
[Boo13] B OOTH A.: Documentary Social Networks: Collective T EAM I.-M. R.: Z-Axis Scholarship: Modeling How Modernists
Biographies of Women. In Proceedings of the Digital Humanities Write the City. In Proceedings of the Digital Humanities 2014
2013 (2013). 5, 8, 15 (2014). 3
[Boy13] B OYLES N.: Closing in on Close Reading. Educational [CSV08] C IULA A., S PENCE P., V IEIRA J. M.: Expressing com-
Leadership 70, 4 (2013), 36–41. 2 plex associations in medieval historical documents: the Henry
[BPBI10] BARKER E., P ELLING C., B OUZAROVSKI S., I SAK - III Fine Rolls Project. Literary and Linguistic Computing 23,
SEN L.: Mapping the World of an Ancient Greek Historian: The 3 (2008), 311–325. 8, 15
HESTIA Project. In Proceedings of the Digital Humanities 2010 [CTA∗ 13] C LEMENT T., T CHENG D., AUVIL L., C APITANU B.,
(2010). 5, 6, 8, 13, 14, 16 BARBOSA J.: Distant Listening to Gertrude Stein’s ’Melanctha’:
[Bra12] B RADLEY A. J.: Violence and the Digital Humanities Using Similarity Analysis in a Discovery Paradigm to Analyze
Text as Pharmakon. In Proceedings of the Digital Humanities Prosody and Author Influence. Literary and Linguistic Comput-
2012 (2012). 3 ing 28, 4 (2013), 582–602. 6, 7, 8, 11, 12, 16
[BW08] B YRON L., WATTENBERG M.: Stacked Graphs – Geom- [CVW09] C OLLINS C., V IEGAS F., WATTENBERG M.: Parallel
etry & Aesthetics. Visualization and Computer Graphics, IEEE Tag Clouds to explore and analyze faceted text corpora. In Vi-
Transactions on 14, 6 (Nov 2008), 1245–1252. 14 sual Analytics Science and Technology, 2009. VAST 2009. IEEE
Symposium on (Oct 2009), pp. 91–98. 8, 13, 18
[CAA∗ 14] C ORRELL M., A LEXANDER E., A LBERS D.,
S ARIKAYA A., G LEICHER M.: Navigating Reductionism and [CWG11] C ORRELL M., W ITMORE M., G LEICHER M.: Explor-
Holism in Evaluation. In Proceedings of the Fifth Workshop on ing collections of tagged text for literary scholarship. Computer
Beyond Time and Errors: Novel Evaluation Methods for Visual- Graphics Forum 30, 3 (2011), 731–740. 8, 10, 11, 12, 17, 18
ization (New York, NY, USA, 2014), BELIV ’14, ACM, pp. 23– [DCCW08] D ÖRK M., C ARPENDALE S., C OLLINS C.,
26. 3 W ILLIAMSON C.: VisGets: Coordinated Visualizations for
[CDP∗ 07] C LEMENT T., D ON A., P LAISANT C., AUVIL L., Web-based Information Exploration and Discovery. Visualiza-
PAPE G., G OREN V.: ’Something that is interesting is interesting tion and Computer Graphics, IEEE Transactions on 14, 6 (Nov
them’: Using Text Mining and Visualizations to Aid Interpreting 2008), 1205–1212. 18
Repetition in Gertrude Stein’s The Making of Americans. In Pro- [DFM∗ 08] DYENS O., F OREST D., M ONDOU P., C OOLS V.,
ceedings of the Digital Humanities 2007 (2007). 5, 7, 8, 12, 15, J OHNSTON D.: Information visualization and text mining: ap-
16 plication to a corpus on posthumanism. In Proceedings of the
[CEJ∗ 14] C RAIG H., E DER M., JANNIDIS F., K ESTEMONT M., Digital Humanities 2008 (2008). 7, 8, 13
RYBICKI J., S CHÖCH C.: Validating Computational Stylistics in [DNCM14] DYE D. J., NAPOLIN J. B., C ORNELL E., M ARTIN
Literary Interpretation. In Proceedings of the Digital Humanities W.: Digital Yoknapatawpha: Interpreting a Palimpsest of Place.
2014 (2014). 7, 8, 14 In Proceedings of the Digital Humanities 2014 (2014). 5, 8, 13,
[CGM∗ 12] C HATURVEDI M., G ANNOD G., M ANDELL L., 14
A RMSTRONG H., H ODGSON E.: Myopia: A Visualization Tool [Dru11] D RUCKER J.: Humanities Approaches to Graphical Dis-
in Support of Close Reading. In Proceedings of the Digital Hu- play. Digital Humanities Quarterly 5, 1 (2011). 3
manities 2012 (2012). 5, 6, 8, 11
[DWS∗ 12] D OU W., WANG X., S KAU D., R IBARSKY W.,
[CL13] C OLES K., L EIN J. G.: Solitary Mind, Collaborative
Z HOU M.: LeadLine: Interactive visual analysis of text data
Mind: Close Reading and Interdisciplinary Research. In Pro-
through event identification and exploration. In Visual Analyt-
ceedings of the Digital Humanities 2013 (2013). 3
ics Science and Technology (VAST), 2012 IEEE Conference on
[CLT∗ 11] C UI W., L IU S., TAN L., S HI C., S ONG Y., G AO Z., (Oct 2012), pp. 93–102. 8, 13, 14, 16

c 2016 The Author(s)


Computer Graphics Forum
c 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities

[Ede14] E DER M.: Stylometry, network analysis, and Latin liter- [GTAHS15] G RAY S. J., T ERRAS M., A MMANN R., H UDSON -
ature. In Proceedings of the Digital Humanities 2014 (2014). 5, S MITH A.: Textal: Unstructured Text Analysis Workflows
7, 8, 14 Through Interactive Smartphone Visualisations. In Proceedings
of the Digital Humanities 2015 (2015). 6, 7, 8, 13
[EJ14] E VANS C., JASNOW B.: Mapping Homer’s Catalogue of
Ships. Literary and Linguistic Computing 29, 3 (2014), 317–325. [GTW13] G OODING P., T ERRAS M., WARWICK C.: The myth
7, 8, 13 of the new: Mass digitization, distant reading, and the future of
the book. Literary and Linguistic Computing 28, 4 (2013), 629–
[eMa15] eMargin, 2015. http://eMargin.bcu.ac.uk/ 639. 5
(Retrieved 2015-01-09). 3
[GWFI14] G OFFIN P., W ILLETT W., F EKETE J.-D., I SENBERG
[ESK14] E ISENSTEIN J., S UN I., K LEIN L. F.: Exploratory The- P.: Exploring the Placement and Design of Word-Scale Visual-
matic Analysis for Historical Newspaper Archives. In Proceed- izations. Visualization and Computer Graphics, IEEE Transac-
ings of the Digital Humanities 2014 (2014). 8, 13, 14, 15 tions on 20, 12 (Dec 2014), 2291–2300. 8, 10, 11, 18
[EX10] E STEVA M., X U W.: Finding Stories in the Archive [GZ12] G EDZELMAN S., Z ANCARINI J.-C.: HyperMachiavel: a
through Paragraph Alignment. In Proceedings of the Digital Hu- translation comparison tool. In Proceedings of the Digital Hu-
manities 2010 (2010). 8, 14 manities 2012 (2012). 7, 8, 15, 16
[FGM05] F INKEL J. R., G RENAGER T., M ANNING C.: Incor- [HAC∗ 15] H INRICHS U., A LEX B., C LIFFORD J., WATSON A.,
porating Non-local Information into Information Extraction Sys- Q UIGLEY A., K LEIN E., C OATES C. M.: Trading Conse-
tems by Gibbs Sampling. In Proceedings of the 43rd Annual quences: A Case Study of Combining Text Mining and Visual-
Meeting on Association for Computational Linguistics (2005), ization to Facilitate Document Exploration. Digital Scholarship
Association for Computational Linguistics, pp. 363–370. 7 in the Humanities (2015). 6, 7, 8, 13, 14, 16, 18
[Fin10] F INN E.: The Social Lives of Books: Mapping the [HAHB15] H EUSER R., A LGEE -H EWITT M., B ENDER J.:
Ideational Networks of Toni Morrison. In Proceedings of the Knowledge Networks, Juxtaposed: Disciplinarity in the Ency-
Digital Humanities 2010 (2010). 5 clopédie and Wikipedia. In Proceedings of the Digital Humani-
ties 2015 (2015). 8, 15
[FKT14] FANKHAUSER P., K ERMES H., T EICH E.: Combining
Macro- and Microanalysis for Exploring the Construal of Sci- [HAHT∗ 15] H EUSER R., A LGEE -H EWITT M., T RAN V.,
entific Disciplinarity. In Proceedings of the Digital Humanities L OCKHART A., S TEINER E.: Mapping the Emotions of London
2014 (2014). 8, 12, 13, 16 in Fiction, 1700-1900: A Crowdsourcing Experiment. In Pro-
ceedings of the Digital Humanities 2015 (2015). 8, 13
[FMT15] F RANZINI G., M AHONY S., T ERRAS M.: A Catalogue
of Digital Editions. In Scholarly digital editions: Theory, prac- [Haw00] H AWTHORN J.: A glossary of contemporary literary
tice and future perspectives (2015), Pierazzo E., Driscoll M. J., theory. Oxford University Press, 2000. 2
(Eds.), Open Book Publishers. 5 [HCC14] H SIANG J., C HEN L., C HUNG C.-H.: A glimpse of
[FS11] F ORSTALL C., S CHEIRER W. J.: Visualizing Sound as the change of worldview between 7th and 10th century China
Functional N-Grams in Homeric Greek Poetry. In Proceedings through two leishu. In Proceedings of the Digital Humanities
of the Digital Humanities 2011 (2011). 6, 8, 12 2014 (2014). 8, 15

[GCL∗ 13] G ENG Z., C HEESMAN T., L ARAMEE R. S., F LANA - [HDvHM∗ 15] H AENTJENS D EKKER R., VAN H ULLE D., M ID -
DELL G., N EYT V., VAN Z UNDERT J.: Computer-supported col-
GAN K., T HIEL S.: ShakerVis: Visual analysis of segment vari-
ation of German translations of Shakespeare’s Othello. Informa- lation of modern manuscripts: CollateX and the Beckett Digital
tion Visualization (2013). 5, 6, 7, 8, 12, 15, 16, 18 Manuscript Project. Digital Scholarship in the Humanities 30, 3
(2015), 452–470. 7
[GDMF∗ 14] G REGORY I., D ONALDSON C., M URRIETA -
[Hen14] H ENSELER C.: Minecraft Anyone? Encouraging A New
F LORES P., RUPP C., BARON A., H ARDIE A., R AYSON P.:
Generation of Computer Scientists and Humanists, 2014. http:
Digital approaches to understanding the geographies in literary
//tinyurl.com/lk58xlv (Retrieved 2015-01-09). 19
and historical texts. In Proceedings of the Digital Humanities
2014 (2014). 8, 13, 14 [HFM16] H INRICHS U., F ORLINI S., M OYNIHAN B.: Specu-
lative Practices: Utilizing InfoVis to Explore Untapped Literary
[Geo15] GeoNames Gazetteer, 2015. http://www.
Collections. Visualization and Computer Graphics, IEEE Trans-
geonames.org/ (Retrieved 2015-01-10). 13
actions on 22, 1 (Jan 2016), 429–438. 6, 7, 8, 13, 14, 16, 17, 18,
[GH11a] G OODWIN J., H OLBO J.: Reading graphs, maps, trees: 19
responses to Franco Moretti. Parlor Press, Anderson, SC, 2011. [HKTK14] H OWELL S., K ELLEHER M., T EEHAN A., K EATING
Book, Whole. 3 J.: A Digital Humanities Approach to Narrative Voice in The
[GH11b] G REGORY I. N., H ARDIE A.: Visual GISting: bringing Secret Scripture: Proposing a New Research Method. Digital
together corpus linguistics and Geographical Information Sys- Humanities Quarterly 8, 2 (2014). 7, 8, 11, 18
tems. Literary and Linguistic Computing 26, 3 (2011), 297–314. [Hoc04] H OCKEY S.: The history of humanities computing. A
7, 8, 13 companion to digital humanities (2004), 3–19. 1
[GO12] G IBBS F., OWENS T.: Building Better Digital Humani- [HPR14] H OYT E., P ONTO K., ROY C.: Visualizing and An-
ties Tools: Toward broader audiences and user-centered designs. alyzing the Hollywood Screenplay with ScripThreads. Digital
Digital Humanities Quarterly 6, 2 (2012). 19 Humanities Quarterly 8, 4 (2014). 6, 8, 14, 16
[Goo15] Google Books, 2015. https://books.google. [HSC08] H INRICHS U., S CHMIDT H., C ARPENDALE S.: EMDi-
com/ (Retrieved 2015-01-09). 1, 3 alog: Bringing Information Visualization into the Museum. Visu-
alization and Computer Graphics, IEEE Transactions on 14, 6
[Got15] Gothenburg model, 2015. http://wiki.tei-c.
(Nov 2008), 1181–1188. 5, 8, 14, 15, 16, 18
org/index.php/Textual_Variance (Retrieved 2015-
10-06). 7 [Jas01] JASINSKI J.: Rhetoric and Society: Sourcebook on


c 2016 The Author(s)
Computer Graphics Forum
c 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities

Rhetoric: Key Concepts in Contemporary Rhetorical Studies, Transactions on 20, 12 (Dec 2014), 1723–1732. 7, 8, 10, 11, 12,
vol. 4. Sage Publications, 2001. 2 13, 16, 18, 19
[JBR∗ 15] J ÄNICKE S., B LUMENSTEIN J., R ÜCKER M., [KKC15] K ERRACHER N., K ENNEDY J., C HALMERS K.: A
Z ECKZER D., S CHEUERMANN G.: Visualizing the Results of Task Taxonomy for Temporal Graph Visualisation. IEEE Trans-
Search Queries on Ancient Text Corpora with Tag Pies. Digital actions on Visualization and Computer Graphics 21, 10 (Oct
Humanities Quarterly (2015). 7, 8, 12, 13, 16 2015), 1160–1172. 7
[JFCS15] JÄNICKE S., F RANZINI G., C HEEMA M. F., [KKL∗ 11] K IM H., K ANG B.- M ., L EE D.-G., C HUNG E., K IM
S CHEUERMANN G.: On Close and Distant Reading in Digital I.: Trends 21 Corpus: A Large Annotated Korean Newspaper
Humanities: A Survey and Future Challenges. In Eurographics Corpus for Linguistic and Cultural Studies. In Proceedings of
Conference on Visualization (EuroVis) - STARs (2015), Borgo R., the Digital Humanities 2011 (2011). 7, 8, 14
Ganovelli F., Viola I., (Eds.), The Eurographics Association. 2, [KLB14] KOCHTCHI A., L ANDESBERGER T. V., B IEMANN C.:
18 Networks of Names: Visual Exploration and Semi-Automatic
[JFS16] J ÄNICKE S., F OCHT J., S CHEUERMANN G.: Interac- Tagging of Social Networks from Newspaper Articles. Computer
tive Visual Profiling of Musicians. Visualization and Computer Graphics Forum 33, 3 (2014), 211–220. 5, 6, 8, 15, 16
Graphics, IEEE Transactions on 22, 1 (Jan 2016), 200–209. 7, [Kle12] K LEIN L. F.: Social Network Analysis and Visualiza-
8, 13, 15, 16, 17, 18 tion in ’The Papers of Thomas Jefferson’. In Proceedings of the
[JG15] J ÄNICKE S., G ESSNER A.: A Distant Reading Visualiza- Digital Humanities 2012 (2012). 5, 6, 8, 14, 15
tion for Variant Graphs. In Proceedings of the Digital Humanities [KO07] K EIM D., O ELKE D.: Literature Fingerprinting: A New
2015 (2015). 6, 8, 11, 12, 16 Method for Visual Literary Analysis. In Visual Analytics Sci-
[JGBS14] J ÄNICKE S., G ESSNER A., B ÜCHLER M., S CHEUER - ence and Technology, 2007. VAST 2007. IEEE Symposium on
MANN G.: Visualizations for Text Re-use. GRAPP/IVAPP (Oct 2007), pp. 115–122. 6, 7, 8, 12
(2014), 59–70. 5, 7, 8, 11, 12, 15, 16, 17 [KOTM13] K IMURA F., O SAKI T., T EZUKA T., M AEDA A.:
[JGF∗ 15] J ÄNICKE S., G ESSNER A., F RANZINI G., T ERRAS Visualization of relationships among historical persons from
M., M AHONY S., S CHEUERMANN G.: TRAViz: A Visualiza- Japanese historical documents. Literary and Linguistic Comput-
tion for Variant Graphs. Digital Scholarship in the Humanities ing 28, 2 (2013), 271–278. 7, 8, 15
30, suppl 1 (2015), i83–i99. 7, 8, 11, 17, 18, 19 [KZ14] K RAUSE T., Z ELDES A.: ANNIS3: A new architecture
[JHSS12] J ÄNICKE S., H EINE C., S TOCKMANN R., S CHEUER - for generic corpus query and visualization. Literary and Linguis-
MANN G.: Comparative Visualization of Geospatial-temporal tic Computing (2014). 6, 8, 12
Data. In GRAPP/IVAPP (2012), pp. 613–625. 8, 13, 14 [LIJ∗ 14] L EHMANN J., I SELE R., JAKOB M., J ENTZSCH A.,
[JKH∗ 15] J OHN M., KOCH S., H EIMERL F., M ÜLLER A., E RTL KONTOKOSTAS D., M ENDES P. N., H ELLMANN S., M ORSEY
T., K UHN J.: Interactive Visual Analysis Of German Poetics. In M., VAN K LEEF P., AUER S., ET AL .: DBpedia – A Large-
Proceedings of the Digital Humanities 2015 (2015). 6, 8, 12, 16 scale, Multilingual Knowledge Base Extracted from Wikipedia.
Semantic Web Journal 5 (2014), 1–29. 7
[Joc12] J OCKERS M.: Computing and Visualizing the 19th-
Century Literary Genome. In Proceedings of the Digital Hu- [LPP∗ 06] L EE B., P LAISANT C., PARR C. S., F EKETE J.-D.,
manities 2012 (2012). 7, 8, 14 H ENRY N.: Task Taxonomy for Graph Visualization. In Pro-
ceedings of the 2006 AVI Workshop on BEyond Time and Errors:
[Joc13] J OCKERS M. L.: Macroanalysis: Digital Methods & Lit- Novel Evaluation Methods for Information Visualization (New
erary History. University of Illinois Press, 2013. 2 York, NY, USA, 2006), BELIV ’06, ACM, pp. 1–5. 7
[JOL∗ 15] J ÄHNICHEN P., O ESTERLING P., L IEBMANN T., [LRKC10] L EE B., R ICHE N., K ARLSON A., C ARPENDALE
H EYER G., K URAS C., S CHEUERMANN G.: Exploratory Search S.: SparkClouds: Visualizing Trends in Tag Clouds. Visualiza-
Through Interactive Visualization of Topic Models. In Proceed- tion and Computer Graphics, IEEE Transactions on 16, 6 (Nov
ings of the Digital Humanities 2015 (2015). 6, 8, 13 2010), 1182–1189. 8, 13
[JRS∗ 09] J ONG C.-H., R AJKUMAR P., S IDDIQUIE B., [LWW∗ 13] L IU S., W U Y., W EI E., L IU M., L IU Y.: StoryFlow:
C LEMENT T., P LAISANT C., S HNEIDERMAN B.: Interac- Tracking the Evolution of Stories. Visualization and Computer
tive Exploration of Versions across Multiple Documents. In Graphics, IEEE Transactions on 19, 12 (Dec 2013), 2436–2445.
Proceedings of the Digital Humanities 2009 (2009). 8, 10, 12, 8, 14
16, 18
[Mar12] M ARCHE S.: Literature is not Data: Against Digital Hu-
[JW13] J ÄNICKE S., W RISLEY D. J.: Visualizing Uncertainty: manities, 2012. http://www.lareviewofbooks.org/
How to Use the Fuzzy Data of 550 Medieval Texts? In Proceed- article.php?id=1040 (Retrieved 2015-01-09). 3
ings of the Digital Humanities 2013 (2013). 5, 7, 8, 13, 14, 19
[MBL∗ 06] M EHLER A., BAO Y., L I X., WANG Y., S KIENA S.:
[Kau15] K AUFMAN M.: ’Everything on Paper Will Be Used Spatial Analysis of News Sources. Visualization and Computer
Against Me’: Quantifying Kissinger. In Proceedings of the Dig- Graphics, IEEE Transactions on 12, 5 (Sept 2006), 765–772. 8,
ital Humanities 2015 (2015). 5, 6, 7, 8, 14 13
[KBK11] K RSTAJIC M., B ERTINI E., K EIM D.: CloudLines: [McC15] M C C ABE M. M.: Platonic Conversations. Oxford Uni-
Compact Display of Event Episodes in Multiple Time-Series. Vi- versity Press, USA, 2015. 2
sualization and Computer Graphics, IEEE Transactions on 17,
[MFM08] M ENESES L., F URUTA R., M ALLEN E.: Exploring the
12 (Dec 2011), 2432–2439. 6, 8, 14
Biography and Artworks of Picasso with Interactive Calendars
[KG13] K EHOE A., G EE M.: eMargin: A Collaborative Textual and Timelines. In Proceedings of the Digital Humanities 2008
Annotation Tool. Ariadne 71 (July 2013). 2, 3, 18 (2008). 5
[KJW∗ 14] KOCH S., J OHN M., W ORNER M., M ULLER A., [MFM13] M ENESES L., F URUTA R., M ANDELL L.: Ambiances:
E RTL T.: VarifocalReader – In-Depth Visual Analysis of Large A Framework to Write and Visualize Poetry. In Proceedings of
Text Documents. Visualization and Computer Graphics, IEEE the Digital Humanities 2013 (2013). 8, 15, 16

c 2016 The Author(s)


Computer Graphics Forum
c 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities

[MH13] M URALIDHARAN A., H EARST M. A.: Supporting ex- [PBD14] P EÑA E., B ROWN M., D OBSON T.: On Metaphor in
ploratory text analysis in literature study. Literary and Linguistic Text Visualization Prototypes. In Proceedings of the Digital Hu-
Computing 28, 2 (2013), 283–295. 7, 8, 12, 15, 16 manities 2014 (2014). 8, 15
[MLCM16] M C C URDY N., L EIN J., C OLES K., M EYER M.: Po- [Per15] Perseus Digital Library, 2015. Ed. Gregory R. Crane.
emage: Visualizing the Sonic Topology of a Poem. Visualization Tufts University. http://www.perseus.tufts.edu (ac-
and Computer Graphics, IEEE Transactions on 22, 1 (Jan 2016), cessed March 19, 2015). 1
439–448. 8, 10, 12, 17, 18 [Pet14] P ETERSON N.: Visualization As a Bridge to Close Read-
[MLSU13] M ILLER B., L I F., S HRESTHA A., U MAPATHY K.: ing: The Audience in The Castle of Perseverance. In Proceedings
Digging into Human Rights Violations: phrase mining and tri- of the Digital Humanities 2014 (2014). 5, 8, 15
gram visualization. In Proceedings of the Digital Humanities [PHI15] PHI Latin Texts, 2015. http://latin.packhum.
2013 (2013). 8, 15 org/ (Retrieved 2015-01-09). 1
[Mor05] M ORETTI F.: Graphs, Maps, Trees: Abstract Models for [Pie10] P IEZ W.: Towards Hermeneutic Markup: An architectural
a Literary History. Verso, July 2005. 1, 3, 4 outline. In Proceedings of the Digital Humanities 2010 (2010).
5, 8, 11
[Mor13] M ORETTI F.: Distant reading. Verso, 2013. 3
[Pie13] P IEZ W.: Markup Beyond XML. In Proceedings of the
[MRMK15] M EDEK A., R ITTER J., M OLITOR P., K ÖSSER S.: Digital Humanities 2013 (2013). 7, 8, 11
Interactive Similarity Analysis of Early New High German Text
Variants. In Proceedings of the Digital Humanities 2015 (2015). [Ple15] Pleiades: a community-built gazetteer and graph of an-
8, 14, 16 cient places, 2015. http://pleiades.stoa.org/ (Re-
trieved 2015-10-06). 7, 13
[MRO∗ 12] M AC E ACHREN A., ROTH R., O’B RIEN J., L I B.,
S WINGLEY D., G AHEGAN M.: Visual Semiotics & Uncertainty [PMMR15] P ÖCKELMANN M., M EDEK A., M OLITOR P., R IT-
Visualization: An Empirical Study. Visualization and Computer TER J.: _CATview_ - Supporting The Investigation Of Text Gen-
Graphics, IEEE Transactions on 18, 12 (Dec 2012), 2496–2505. esis Of Large Manuscripts By An Overall Interactive Visualiza-
19 tion Tool. In Proceedings of the Digital Humanities 2015 (2015).
8, 12, 16
[MSR∗ 15] M ONTAGUE J., S IMPSON J., ROCKWELL G.,
RUECKER S., B ROWN S.: Exploring Large Datasets with Topic [Poe16] Poemage, 2016. http://www.sci.utah.edu/
Model Visualizations. In Proceedings of the Digital Humanities ~nmccurdy/Poemage/ (Retrieved 2016-03-18). 10
2015 (2015). 6, 8, 12, 13 [Poi15] P OIBEAU T.: Generating Navigable Semantic Maps from
Social Sciences Corpora. In Proceedings of the Digital Humani-
[Mun09] M UNZNER T.: A nested model for visualization de-
ties 2015 (2015). 6, 8, 14, 15
sign and validation. Visualization and Computer Graphics, IEEE
Transactions on 15, 6 (2009), 921–928. 17 [Pos07] P OSAVEC S.: Literary Organism, 2007. http://www.
stefanieposavec.co.uk/ (Retrieved 2015-01-09). 3
[Mur11] M URALIDHARAN A.: A Visual Interface for Exploring
Language Use in Slave Narratives. In Proceedings of the Digital [Pro15] Project Gutenberg, 2015. http://www.gutenberg.
Humanities 2011 (2011). 7, 8, 10, 12, 16 org/ (Retrieved 2015-01-09). 3
[NMG∗ 13] N OWVISKIE B., M C C LURE D., G RAHAM W., [PSA∗ 06] P LAISANT C., S MITH M. N., AUVIL L., ROSE J., Y U
S OROKA A., B OGGS J., ROCHESTER E.: Geo-Temporal In- B., C LEMENT T.: "Undiscovered Public Knowledge": Mining
terpretation of Archival Collections with Neatline. Literary and for Patterns of Erotic Language in Emily Dickinson’s Correspon-
Linguistic Computing 28, 4 (2013), 692–699. 17 dence with Susan Huntington (Gilbert) Dickinson. In Proceed-
ings of the Digital Humanities 2006 (2006). 7, 8
[OGH15] O DAT S., G ROZA T., H UNTER J.: Extracting struc-
tured data from publications in the Art Conservation Domain. [RARC∗ 15] ROE G., A BDUL -R AHMAN A., C HEN M., G LAD -
STONE C., M ORRISSEY R., O LSEN M.: Visualizing Text Align-
Digital Scholarship in the Humanities 30, 2 (2015), 225–245. 6,
7, 8, 14, 16 ments: Image Processing Techniques for Locating 18th-Century
Commonplaces. In Proceedings of the Digital Humanities 2015
[OKK13] O ELKE D., KOKKINAKIS D., K EIM D. A.: Fingerprint (2015). 8, 12
Matrices: Uncovering the dynamics of social networks in prose
[RAW∗ 15] R IND A., A IGNER W., WAGNER M., M IKSCH S.,
literature. In Computer Graphics Forum (2013), vol. 32, Wiley
L AMMARSCH T.: Task Cube: A three-dimensional conceptual
Online Library, pp. 371–380. 8, 12
space of user tasks in visualization design and evaluation. Infor-
[Ome15] Omeka, 2015. http://www.omeka.org/ (Re- mation Visualization (2015). 7
trieved 2015-01-10). 17 [RD10] R ICHE N., DWYER T.: Untangling Euler Diagrams. Vi-
[ÓML14] Ó M URCHÚ T., L AWLESS S.: The Problem of sualization and Computer Graphics, IEEE Transactions on 16, 6
Time and Space: The Difficulties in Visualising Spatiotemporal (Nov 2010), 1090–1099. 8, 15
Change in Historical Data. In Proceedings of the Digital Human- [RFH14] R EITER N., F RANK A., H ELLWIG O.: An NLP-based
ities 2014 (2014). 6, 8, 13, 14 cross-document approach to narrative structure discovery. Liter-
[OST∗ 10] O ESTERLING P., S CHEUERMANN G., T ERESNIAK ary and Linguistic Computing 29, 4 (2014), 583–605. 6, 7, 8, 14,
S., H EYER G., KOCH S., E RTL T., W EBER G.: Two-stage 16
framework for a topology-based projection and visualization of [RPSF15] R IEHMANN P., P OTTHAST M., S TEIN B.,
classified document collections. In Visual Analytics Science F ROEHLICH B.: Visual Assessment of Alleged Plagiarism
and Technology (VAST), 2010 IEEE Symposium on (Oct 2010), Cases. Computer Graphics Forum 34, 3 (2015), 61–70. 6, 7, 8,
pp. 91–98. 8, 15, 16 10, 12, 15, 16
[Pal02] PALEY W. B.: TextArc: Showing word frequency and [RRRG05] RUECKER S., R AMSAY S., R ADZIKOWSKA M., G A -
distribution in text. In Poster presented at IEEE Symposium on LEY A.: Interface Design. In Proceedings of the Digital Human-
Information Visualization (2002), vol. 2002. 15 ities 2005 (2005). 7, 8, 12, 15


c 2016 The Author(s)
Computer Graphics Forum
c 2016 The Eurographics Association and John Wiley & Sons Ltd.
S. Jänicke, G. Franzini, M. F. Cheema & G. Scheuermann / Visual Text Analysis in Digital Humanities

[RSDCD∗ 13] ROBERTS -S MITH J., D E S OUZA -C OELHO S., [WBWK00] WANG BALDONADO M. Q., W OODRUFF A.,
D OBSON T. M., G ABRIELE S., RODRIGUEZ -A RENAS O., K UCHINSKY A.: Guidelines for using multiple views in infor-
RUECKER S., S INCLAIR S., A KONG A., B OUCHARD M., mation visualization. In Proceedings of the Working Conference
H ONG M., JAKACKI D., L AM D., KOVACS A., N ORTHAM L., on Advanced Visual Interfaces (New York, NY, USA, 2000), AVI
S O D.: Visualizing Theatrical Text: From Watching the Script to ’00, ACM, pp. 110–119. 16
the Simulated Environment for Theatre (SET). Digital Humani- [Wei15] W EIGEL M.: Graphs and Legends: Raymond Williams
ties Quarterly 7, 3 (2013). 6, 8, 15, 16 tried to save culture from a priestly elite. Can the same be said
[SDW11] S LINGSBY A., DYKES J., W OOD J.: Exploring Uncer- of the digital humanities?, 2015. http://www.thenation.
tainty in Geodemographics with Interactive Graphics. Visualiza- com/article/graphs-and-legends/ (Retrieved 2015-
tion and Computer Graphics, IEEE Transactions on 17, 12 (Dec 10-09). 5
2011), 2545–2554. 19 [WH11] WALSH J. A., H OOPER W.: Computational Discovery
[Shn96] S HNEIDERMAN B.: The Eyes Have It: A Task by Data and Visualization of the Underlying Semantic Structure of Com-
Type Taxonomy for Information Visualizations. In Visual Lan- plicated Historical and Literary Corpora. In Proceedings of the
guages, Proceedings (1996), pp. 336–343. 3 Digital Humanities 2011 (2011). 5, 6, 7, 8, 14, 16
[Sin13] S INGER K.: Digital Close Reading: TEI for Teaching [Wil15a] W ILLS T.: Relational data modelling of textual corpora:
Poetic Vocabularies. Journal of Interactive Technology and Ped- The Skaldic Project and its extensions. Digital Scholarship in the
agogy 3 (2013). 7 Humanities 30, 2 (2015), 294–313. 5, 6, 8, 13, 16
[SOI10] S AITO S., O HNO S., I NABA M.: A Platform for Cultural [Wil15b] W ILSON E. A.: Building The Early Modern Digital
Information Visualization Using Schematic Expressions of Cube. University: Using Social Network Analysis and Digital Visual-
In Proceedings of the Digital Humanities 2010 (2010). 1 ization Tools To Bring The Early Modern Network Of Networks
(EMNON) To Life. In Proceedings of the Digital Humanities
[SRR13] S INCLAIR S., RUECKER S., R ADZIKOWSKA M.: In- 2015 (2015). 7, 8, 13, 16
formation Visualization for Humanities Scholars, 2013. 19
[WJ13a] W EINGART S., J ORGENSEN J.: Computational analy-
[TEI15] TEI Consortium, 2015. eds. TEI P5: Guide- sis of the body in European fairy tales. Literary and Linguistic
lines for Electronic Text Encoding and Interchange. 2.8.0. Computing 28, 3 (2013), 404–416. 5, 8, 14, 16
2015-04-06. TEI Consortium. http://www.tei-c.org/
Guidelines/P5/ (Retrieved 2015-10-09). 1 [WJ13b] W HEELES D., J ENSEN K.: Juxta Commons. In Pro-
ceedings of the Digital Humanities 2013 (2013). 7, 8, 11, 19
[TFK15] T RILCKE P., F ISCHER F., K AMPKASPAR D.: Digital
Network Analysis of Dramatic Texts. In Proceedings of the Dig- [WMN∗ 14] WALSH B., M AIERS C., NALLY G., B OGGS J.,
ital Humanities 2015 (2015). 6, 7, 8, 15 T EAM P. P.: Crowdsourcing individual interpretations: Between
microtasking and macrotasking. Literary and Linguistic Com-
[TKMS03] T OUTANOVA K., K LEIN D., M ANNING C. D., puting 29, 3 (2014), 379–386. 7, 8, 10, 11, 18
S INGER Y.: Feature-Rich Part-of-Speech Tagging with a Cyclic
Dependency Network. In Proceedings of the 2003 Conference [Wol13] W OLFF M.: Surveying a Corpus with Alignment Visu-
of the North American Chapter of the Association for Compu- alization and Topic Modeling. In Proceedings of the Digital Hu-
tational Linguistics on Human Language Technology-Volume 1 manities 2013 (2013). 7, 8, 15, 16
(2003), Association for Computational Linguistics, pp. 173–180. [WV08] WATTENBERG M., V IEGAS F.: The Word Tree, an Inter-
7 active Visual Concordance. Visualization and Computer Graph-
[Tót13] T ÓTH G. M.: The computer-assisted analysis of a me- ics, IEEE Transactions on 14, 6 (Nov 2008), 1221–1228. 8, 15,
dieval commonplace book and diary (MS Zibaldone Quaresimale 16
by Giovanni Rucellai). Literary and Linguistic Computing 28, 3 [YMSJ05] Y I J. S., M ELTON R., S TASKO J., JACKO J. A.: Dust
(2013), 432–443. 7, 8, 15 & Magnet: multivariate information visualization using a magnet
[Tra09] T RAVIS C.: Patrick Kavanagh’s Poetic Wordscapes: GIS, metaphor. Information Visualization 4, 4 (2005), 239–256. 15
Literature and Ireland, 1922-1949. In Proceedings of the Digital [ZNMS15] Z AHORA T., N IKULIN D., M EWS C. J., S QUIRE D.:
Humanities 2009 (2009). 8, 13 Deconstructing Bricolage: Interactive Online Analysis of Com-
[UIU98] University of Illinois at Urbana-Champaign Digital Li- piled Texts with Factotum. Digital Humanities Quarterly 9, 1
braries Initiative: UIUC DLI Glossary, 1998. http:// (2015). 8, 10, 12, 16
archive.is/qOUJz (Retrieved 2016-03-18). 5
[Und15] U NDERWOOD T.: A dataset for distant-reading litera-
ture in English, 1700-1922, 2015. http://tinyurl.com/
nu26zr7 (Retrieved 2015-10-09). 5
[VCPK09] V UILLEMOT R., C LEMENT T., P LAISANT C., K U -
MAR A.: What’s being said near "Martha"? Exploring name en-
tities in literary text collections. In Visual Analytics Science and
Technology, 2009. VAST 2009. IEEE Symposium on (Oct 2009),
pp. 107–114. 6, 8, 12, 13, 14, 16, 17, 18
[vHWV09] VAN H AM F., WATTENBERG M., V IEGAS F.: Map-
ping Text with Phrase Nets. Visualization and Computer Graph-
ics, IEEE Transactions on 15, 6 (Nov 2009), 1169–1176. 8, 15
[VWF09] V IEGAS F., WATTENBERG M., F EINBERG J.: Partic-
ipatory Visualization with Wordle. Visualization and Computer
Graphics, IEEE Transactions on 15, 6 (Nov 2009), 1137–1144.
11
