Leximancer Manual
Version 4
2011
Manual Version 4
Table of Contents
SECTION 1. INTRODUCTION TO LEXIMANCER
WHAT IS LEXIMANCER?
THEORY: CONTENT ANALYSIS
WHAT IS CONTENT ANALYSIS?
Types of Content Analysis
INTERESTED IN LEARNING MORE ABOUT CONTENT ANALYSIS?
SECTION 2. THE CONCEPT MAP
Theory: Concepts and Conceptual Mapping in Leximancer
Concept Seed Words
Concept Learning
The Initial Display
Themes
Concepts
Buttons in the header above the Concept Map
The Concept Cloud
REPORT TABS
Themes tab
Concepts tab
Thesaurus
Pathway tab
Query
Summary tab
SECTION 3: CREATING AN AUTOMATIC/EXPLORATORY MAP
CREATING AN AUTOMATIC CONCEPT MAP
Supported File Types
Desktop Installations
Leximancer Portal Accounts
CREATING A NEW FOLDER AND PROJECT
New Folder
New Project
THE MAIN LEXIMANCER USER INTERFACE:
Using the Web Crawler
SECTION 4: CREATING A MANUALLY ADJUSTED MAP
2A. TEXT PROCESSING
STOPWORD REMOVAL
2B. CONCEPT SEEDS SETTINGS
3. THESAURUS GENERATION:
3A. PRACTICAL: CONFIGURING CONCEPT EDITING
USING TAGS
THE AUTOMATIC SENTIMENT LENS
The learning materials in this section are designed to give the new
Leximancer user an introduction to the workings of the program;
This section also identifies some common applications of the
software;
Theoretical background on content analysis is then provided.
What is Leximancer?
Leximancer is a text analytics tool that can be used to analyse the content
of collections of textual documents and to display the extracted
information visually. The information is displayed by means of a
conceptual map that provides a bird's-eye view of the material,
representing the main concepts contained within the text as well as
information about how they are related:
Essentially, this map allows the user to view the conceptual structure of a
body of text, as well as perform a directed search of the documents. The
interactive nature of the map permits the user to explore examples of
concepts, their connections to each other, as well as links to the original
text.
Applications of Leximancer
Columns: Application; Type of Text; Output Options; Possible Projects
Any non-protected text:
Communication research;
Customer feedback data;
Statistical, via Leximancer data exports;
Electronic content;
Litigation evidence (electronic form)
Media Analysis
Electronic media articles
Customer Relationship Management (CRM)
Communication from customers
Academic Research
Any
Link open-ended questions to metadata.
Profile concepts being investigated;
Concept co-occurrence data.
Legal e-discovery;
Alternative to manually maintained site maps.
Concept Map;
History, Literature, Media Studies, Sociology, Politics
Statistical output.
Employee satisfaction survey;
This section of the manual is for those wishing to understand more about the
theoretical underpinnings of Leximancer;
More practical, instructional chapters are to follow.
It is used for
Figure labels: Leximancer Concept; Associated thesaurus items
Aside from detecting the overall presence of a concept in the text, the
concept definitions are also used to determine the frequency of
co-occurrence between concepts.
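As a rough illustration of the idea (not Leximancer's actual algorithm), the sketch below codes each text segment with a concept whenever one of that concept's thesaurus terms appears in it, then counts how often pairs of concepts are coded in the same segment. The thesaurus contents, the sample segments and the function names are all invented for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concept definitions: concept name -> thesaurus terms.
THESAURUS = {
    "gas": {"gas", "lpg"},
    "pipework": {"pipework", "pipe", "pipes"},
    "explosion": {"explosion", "blast"},
}

def code_segment(segment, thesaurus):
    """Return the set of concepts whose terms appear in the segment."""
    words = set(segment.lower().split())
    return {name for name, terms in thesaurus.items() if words & terms}

def cooccurrence_counts(segments, thesaurus):
    """Count concept frequencies and pairwise co-occurrences."""
    freq, pairs = Counter(), Counter()
    for segment in segments:
        found = sorted(code_segment(segment, thesaurus))
        freq.update(found)
        # Each unordered pair of concepts in the segment co-occurs once.
        pairs.update(combinations(found, 2))
    return freq, pairs

segments = [
    "the lpg leaked from the pipe",
    "the gas caused an explosion",
    "corroded pipework was replaced",
]
freq, pairs = cooccurrence_counts(segments, THESAURUS)
```

Here `freq` holds how often each concept was coded, and `pairs` holds the co-occurrence count for each alphabetically ordered pair of concepts.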
They are called seeds as they represent the starting point of the concept,
with more terms being added to the definition through learning.
Occasionally, more appropriate central terms may be discovered, pushing
the seeds away from the centre of the concept definition.
Concept Learning
Leximancer begins with a set of seed words, as defined above. During the
learning process, words highly relevant to the seed are continuously
updated, and eventually form a thesaurus of terms for each concept.
Themes
The concepts are clustered into higher-level themes when the map is
generated. Concepts that appear together often in the same pieces of
text attract one another strongly, and so tend to settle near one another
in the map space. The themes aid interpretation by grouping the clusters
of concepts, and are shown as coloured circles on the map:
Here, a cluster of conceptually related concepts is grouped by the theme
pipework.
The themes are heat-mapped to indicate importance. This means that the
hottest or most important theme appears in red, and the next hottest in
orange, and so on according to the colour wheel.
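The ranking-to-colour idea can be sketched in a few lines. This is only an illustration: the importance scores are made up, and the colour order after red and orange is an assumption about what "the colour wheel" means here.

```python
# Colour order from hottest to coolest; the exact palette is an assumption.
HEAT_COLOURS = ["red", "orange", "yellow", "green", "blue", "purple"]

def heat_map(theme_scores):
    """Map theme names to colours, hottest colour for the highest score."""
    ranked = sorted(theme_scores, key=theme_scores.get, reverse=True)
    # Themes beyond the palette length reuse the coolest colour.
    return {
        theme: HEAT_COLOURS[min(rank, len(HEAT_COLOURS) - 1)]
        for rank, theme in enumerate(ranked)
    }

# Hypothetical importance scores for three themes.
colours = heat_map({"pipework": 120, "tank": 45, "safety": 80})
```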
When the map first opens, the Theme Size is set to 33%, but you can
move the Theme Size slider beneath the map to adjust the grouping of
concepts on the map. Move the slider to the right to make fewer, broader
themes, and move it to the left to make more, tighter themes:
When the map first opens, the tab on the right presents a Summary of the
Themes. A bar chart ranks the most important themes relative to one
another, and beneath that the concepts visible within each theme are
listed. A list of representative excerpts is included for each theme, so that
you can read some examples quickly to understand how and why the
concepts in that theme appear together in the text. Hover your mouse
over the More button to see the query syntax used to return each excerpt
within a theme, and click the More button to read further examples.
In the screen shot below, the Pipework theme (shown as a red circle on
the map) contains concepts such as pipework, LPG and gas. Excerpts
linking these concepts are shown in the Theme Summary in the right-hand tab:
If you adjust the size of the theme circles using the slider beneath the
map, the Themes Summary updates to represent the new groups you
have created on the map.
You can make all the themes disappear from the map by moving the
Theme Size slider all the way to the left (0%).
If you hover your mouse over a theme circle on the map, the name of that
theme will appear.
Initially, each theme takes its name from the most frequent and
connected concept within that circle.
You can change the names of the themes if you right click on the map
near the theme name. A list of nearby concepts and themes will appear.
Hover your mouse over the concept or theme of interest to get an option
to Rename it:
You can make all the theme names visible permanently on the map by
clicking the Map Settings (crossed spanner and screwdriver) button in the
header above the map and ticking Theme Names Always Visible.
Concepts
The Concept Map contains the names of the main concepts that occur
within the text. These are shown as grey labels on the map.
Concept Display (figure labels): Regular Word Concept; Name Concept
The frequencies with which the name- and word-like concepts appear in
the text are also listed separately in the Concepts tab on the right of the
map:
Figure labels: Name Concept; Regular Word Concept
The brightness of a concept's label reflects its frequency in the text. The
brighter the concept label, the more often the concept is coded in the
text.
With concept visibility at 50%, some concepts do not appear on the map. The
grey nodes illustrate where they would have appeared.
Figure label: Map sliders
You can reveal hidden concepts by moving the % Visible Concepts slider
(underneath the map) to the right. To reveal the most important concepts
in order, move the slider far left then slowly drag the pointer to the right.
Hover your mouse over a button above the map to see its name.
Starting from the left, they include:
The Center Map button simply centers the map image in the screen
space.
The Reset Map to Original View button returns the map to the way it
looked when it first opened.
The Recluster Map button scatters the concepts randomly in the map
space initially, then uses a clustering algorithm to allow the concepts to
attract one another once more so as to lay the map out on screen.
The Cluster Map button allows the concepts more iterations of attracting
one another to settle in stable locations on the map (without randomising
them first).
The Map Settings button allows you to change various visual aspects of
the concept map. Clicking the crossed spanner and screwdriver button
above the map opens this interface:
The Map Settings interface allows you to increase the Font Size of labels
on the map, and to change the background colour from white to black.
It also allows you to make the theme names always visible on the map by
ticking the Theme Names Always Visible box.
You can choose whether to show the spanning tree on the map. The
spanning tree appears as a grey network of connections between
concepts (like a spider web) beneath the concept network. It shows the
most-likely connections between concepts (like a road map of highways),
but there are other (less-strong) connections between concepts (like
backstreets).
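One common way to obtain such a backbone of strongest connections is a maximum spanning tree. The sketch below uses Kruskal's algorithm for this. It is an assumption that this resembles what the map displays, and the concept names and connection strengths are invented.

```python
def maximum_spanning_tree(nodes, weighted_edges):
    """Kruskal's algorithm, taking the strongest edges first."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the root, shortening the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for weight, a, b in sorted(weighted_edges, reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate groups: keep it
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree

# Hypothetical connection strengths between concepts: (weight, a, b).
edges = [
    (9, "gas", "pipework"),
    (7, "gas", "explosion"),
    (3, "pipework", "explosion"),
    (5, "tank", "gas"),
]
tree = maximum_spanning_tree(["gas", "pipework", "explosion", "tank"], edges)
```

The weakest link (pipework to explosion) is left out of the tree because both concepts are already reachable through gas. In the manual's analogy it is a backstreet rather than a highway.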
Untick the Name, Word or Tag boxes to hide certain types of items on the
map.
If your map contains tags, ticking Tags as themes allows the tags to give
their names to themes if appropriate. Normally tags are not allowed to
give their names to themes.
The map below was created using the map settings shown above:
The Increase and Decrease Map Zoom buttons allow you to zoom your
map view in or out.
The Toggle Pathway Mode button allows you to turn the pathways
facility on and off. Pathway Mode, when enabled, behaves like this: if you
A black line appears on the map to indicate which other concepts might
be bypassed in order to move from the gas to the explosion concepts in
this example. The tab on the right shows the probability of each leg of
the path, and presents an excerpt of text linking the two concepts
involved in each leg.
The pathways are intended to tell stories emerging from the text, and
focus on indirect connections between concepts on the map.
Toggling the pathways mode off allows you to click on more than one
concept on the map without a pathway being drawn between them.
Clicking the same button again returns you to the pathways mode.
The Export Map button allows you to take a picture of the concept map:
You can choose the file type and resolution of the image.
Note that pop-ups must be enabled in the browser so that the map
image can appear in a new window. The image can then be saved to local
disk.
The Save Map button lets you save the current map configuration (in case
you have changed theme names etc.) so that you can return to it later.
Simply enter a name for the map image, and click OK.
The Load Map button lets you reload a map configuration saved
previously using the Save Map button. Simply select the name of the map
image you require and click Load.
The Concept Cloud, like the concept map, is heat-mapped, in that hot
colours (red, orange) denote the most relevant concepts, and cool colours
(blue, green) denote the least relevant. The font size of each concept's
label denotes its frequency in the text.
The Concept Cloud is fully interactive. It behaves like the concept map, in
that you can click on a concept (or tag) to select it and see the list of
related concepts in the right-hand tab:
Report Tabs
The right-hand window contains several report tabs that allow for further
interaction with the Concept Map. Each tab represents a different way to
interact with the map and explore the results.
Themes tab
button to see the syntax of the query run to return that piece of text.
Also notice the Hits score to the right showing how many text excerpts
match each query.
Concepts tab
of the related name- and word-like concepts is displayed
From the related concepts lists, you can browse the locations in the
document where concepts co-occur by clicking the Browse button (the
magnifying glass icon).
Browsing extracts automatically takes you to the Query tab in the
right-hand panel, and displays instances where the concepts of interest
co-occur in the text.
From the Query Results, click on Add to Log to add an extract of interest
to the LogBook for export or reporting. When an extract is logged, the
Add to Log button changes to a View Log option. Click View Log to review
the list of extracts in the LogBook, then select Edit to add your own notes
about an excerpt:
Thesaurus
The Thesaurus tab displays a list of your concepts, the number of
iterations performed by Leximancer on that concept, and a ranked list of
the thesaurus words that define and describe each concept.
Click on a concept in the alphabetical list on the left to reveal the list of
words Leximancer found to be associated with that concept on the right. The
thesaurus list also shows the relevancy weightings associated with each
indicative word. The iterations count (top left) tells you the number of
times the corpus was reread and coded with evolving concept definitions
before a stable classification result was achieved.
You can click on the Evidence button (the magnifying lens icon) to the left
of a thesaurus item to browse text excerpts where that thesaurus term
appears as evidence for a concept of interest:
Pathway tab
The Pathway tab displays and describes the most likely relationship chain
between two concepts. It allows you to navigate the most likely path in
conceptual space from a start concept to an end concept. To create a
pathway you must first enable Pathway Mode using the relevant button
above the map. Then select a concept on the map:
Now, clicking another concept will illustrate the pathway between them,
along with example text. The relationships between concepts are best
thought of as correlations, though the text segments describing the
relationship may define a direction for cause.
The example below creates a pathway between the tank concept and the
explosion concept:
The start concept appears at the top of the list, and the associated text
segment explains the link between this and the next related concept in
the pathway. This pattern continues until the final text segment linked to
the end concept is listed.
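To make the notion of a "most likely path" concrete, the sketch below finds the chain of concepts whose leg probabilities multiply to the largest value, using Dijkstra's algorithm on the negative logarithm of each probability. This is one standard technique and is an assumption, not a description of Leximancer's internals. The probabilities are invented.

```python
import heapq
import math

def most_likely_path(links, start, end):
    """Return (path, probability) maximising the product of leg probabilities.

    links maps each concept to {neighbour: probability of that leg}.
    """
    # Minimising the sum of -log(p) maximises the product of p.
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, concept, path = heapq.heappop(queue)
        if concept == end:
            return path, math.exp(-cost)
        if concept in visited:
            continue
        visited.add(concept)
        for neighbour, prob in links.get(concept, {}).items():
            if neighbour not in visited and prob > 0:
                heapq.heappush(
                    queue, (cost - math.log(prob), neighbour, path + [neighbour])
                )
    return None, 0.0

# Hypothetical leg probabilities between concepts.
links = {
    "tank": {"gas": 0.5, "explosion": 0.1},
    "gas": {"explosion": 0.4},
}
path, probability = most_likely_path(links, "tank", "explosion")
```

In this made-up data the indirect route through gas (0.5 x 0.4 = 0.2) is more likely than the direct leg (0.1), so the pathway passes through the intermediate concept.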
The link between tank and explosion revealed the following connections:
Query
We have already discussed one of Leximancer's query functions: querying
one concept against another:
NAME:[concept]
- searches for your specified name concept
WORD:[concept]
- searches for your specified regular word concept
TAG:[file, folder, or tag]
- searches for a pre-defined tag in your data
WTERM:[word]
- searches for a regular keyword (different to a concept)
NTERM:[word]
- searches for a name keyword (different to a concept)
TERM:[word]
- equivalent to (WTERM:[word] OR NTERM:[word])
NAME:[partial concept]*
- adding a star to the end of a concept search will cause the query to
be applied to the letters present and any derivatives. You must
provide at least one character for this search. NAME:* alone will not
produce results.
- For example, a query of NAME:environment* will search for
environment, environments, environmental, etc.
NAME:[concept] OR NAME:[concept]^2
- searches for either concept specified, but with the latter concept to
be considered more important
- for example, a query of NAME:greenhouse OR NAME:emissions^2
will search for both greenhouse and emissions concepts, but
emissions concepts will be considered more important.
+NAME:[concept] +WORD:[concept]
- this is shorthand code for the search NAME:[concept] AND
WORD:[concept]
+NAME:[concept] +WTERM:[word]
- searches for the name concept co-occurring with a keyword
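If you assemble many queries, it can help to build the strings programmatically rather than typing them by hand. The helper functions below are hypothetical and are not part of Leximancer. They simply compose text in the syntax listed above, and the search term tax is an invented example.

```python
def field(prefix, value, boost=None):
    """Build one search term such as NAME:emissions or NAME:emissions^2."""
    term = f"{prefix}:{value}"
    return f"{term}^{boost}" if boost is not None else term

def any_of(*terms):
    """Join terms with OR."""
    return " OR ".join(terms)

def all_of(*terms):
    """Require every term, using the + shorthand for AND."""
    return " ".join(f"+{term}" for term in terms)

boosted = any_of(field("NAME", "greenhouse"), field("NAME", "emissions", boost=2))
wildcard = field("NAME", "environment*")
required = all_of(field("NAME", "greenhouse"), field("WTERM", "tax"))
```

The resulting strings can be pasted into the Query tab as they are.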
Summary tab
The Summary tab displays extracts containing the most important
concepts discovered from the text. The list contains characteristic text
segments that illustrate the relationships between key concepts:
When you have finished exploring the Concept Map project tabs, close
the Map Explorer tab using the Close button (X) in the top right-hand
corner.
Exploratory maps involve minimal input from the user and are the
starting off point for analysis using Leximancer. They are a means of
gaining an overview of your data before adjusting Leximancer settings for
more tailored results. Manually adjusted maps will be explored in the
next chapter.
A Leximancer icon will appear in your system tray when the program is
running. To exit the application, right click on the icon and select Exit
Leximancer.
When the program starts, you're presented with the Manage Projects
interface:
As a project file or folder is created using this name, try to use typical
conventions for naming files (for instance, avoid including characters
such as ".", "\", "*", or "/").
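If you generate folder or project names from a script, a quick check like the one below can catch problem characters before you type the name into Leximancer. The set of forbidden characters contains only the four examples given above and is not exhaustive. The function names are invented for illustration.

```python
# Only the four characters named in the text; your system may forbid more.
FORBIDDEN = set('.\\*/')

def invalid_characters(name):
    """Return the forbidden characters found in a proposed name, sorted."""
    return sorted(set(name) & FORBIDDEN)

def is_safe_name(name):
    """True when the name is non-empty and has no forbidden characters."""
    return bool(name.strip()) and not invalid_characters(name)
```

For example, `is_safe_name("Hearing transcripts")` is true, while `is_safe_name("day1/day2")` is false because of the slash.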
After you have named the folder, click OK. This creates an empty project
folder by this name under Leximancer / User Projects.
New Project
Right click on your new folder under Leximancer / User Projects, and
select Create Project to name (and optionally describe) a new project.
This opens the newly-created project, and displays the main user
interface in the right hand panel:
You can open several Leximancer projects at once. When you've opened
one or more projects, you can also collapse the Project Selection
interface using the arrows:
Having created a project, you are now ready to use the Main Leximancer User Interface.
A middle column allowing you to change the settings for each processing
stage:
And a status column that keeps track of the completion of each stage of
processing:
For an Automatic Map, we only need to complete the Load Data step, and
then click Run Project. Both of these buttons appear in orange in the
main user interface:
- You will see your local drives listed in the Available Documents panel on
the left.
- Expand the drive directories to see the files and folders within them.
- Drag and drop the desired files and folders into the Document Set area
on the right to link them to the project, then click OK.
- Right click on your user data folder (usually your name or company) in
the Available Documents panel on the left, and select Upload.
- Browse to locate the data on your local machine. If you wish to upload
multiple files at once, place them in a zipped folder before selecting them
for uploading. The files and folders will be extracted from the zipped
archive automatically on upload to Leximancer.
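Any zip tool will do for preparing the archive. As one possibility, the sketch below uses Python's standard zipfile module to pack a folder while keeping its sub-folder structure. The function name and folder names are placeholders, and the archive should be written somewhere outside the folder being zipped.

```python
import zipfile
from pathlib import Path

def zip_for_upload(source_dir, archive_path):
    """Write every file under source_dir into a zip, keeping sub-folders."""
    source = Path(source_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the top folder is kept.
                arcname = path.relative_to(source.parent).as_posix()
                archive.write(path, arcname)
    return archive_path
```

Calling `zip_for_upload("transcripts", "upload.zip")` would produce an archive whose entries all begin with `transcripts/`, ready to select in the Upload dialogue.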
In the File Upload window, select the documents you wish to analyse and
click Add. Once all the documents you want have been selected, click OK
to begin uploading:
There is some feedback in the lower left of screen to let you know that
your files are uploading, and when it is complete.
Once the data is uploaded, you can expand the parent folder to see the
files and folders within:
You could choose to analyse just one of the documents within a folder
(e.g., a single day's transcript) by dragging and dropping an individual
file into the Document Set area on the right. Alternatively you can drag
the whole parent folder into the Document Set area on the right to
analyse all its contents.
Selecting the parent folder instructs Leximancer to analyse all the files
and folders within. This allows you to analyse multiple files and folders at
Provide a name for the web crawl, then press Add Sources and type in
the URL(s) you are interested in analysing. Blog and review sites tend to
work best.
Tick the box to the right to do a Single Page Fetch if you only want the
text on that page to be retrieved.
In the Advanced Settings you can set the Level of Recursion. This defaults
to 2, allowing the crawling to follow links from the page to 2 levels down.
You can also increase the maximum data size if you want to crawl a large
amount of textual data.
Enter keywords into the Specific or Global Content Pattern boxes to focus
the crawl on pages that mention particular words.
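The effect of the Level of Recursion setting can be pictured with a small breadth-first crawl over a made-up site. No real web requests are made, the page names are invented, and treating a Single Page Fetch as a depth of 0 is an assumption made for this illustration.

```python
from collections import deque

def crawl(start_url, get_links, max_depth=2):
    """Breadth-first crawl that follows links at most max_depth levels down."""
    seen = {start_url}
    order = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the recursion limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order

# A tiny made-up site, standing in for real HTTP requests.
SITE = {
    "home": ["reviews", "about"],
    "reviews": ["review-1"],
    "review-1": ["comments"],
}
pages = crawl("home", lambda url: SITE.get(url, []), max_depth=2)
single = crawl("home", lambda url: SITE.get(url, []), max_depth=0)
```

With a depth of 2, the crawl reaches review-1 (two links away from home) but not the comments page beyond it.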
The web crawl will begin when you click OK. When it is complete, you can
drag and drop the folder of crawled text into the Document Set area, and
then run the project as usual.
For all users, once you have moved some files into the Document Set
area:
You have the option at this stage to specify the type of document(s) you
wish to analyse. The current example uses online data, so .html was
selected. Selection of data type helps Leximancer to be more sensitive to
the idiosyncrasies of that data type (for example, html artifacts from
online documents).
This returns you to the main user interface, where the Status of the Load
Data stage will be green, as this step has now been completed:
While the stages are running, they will flash orange and say In Process.
Once they are completed, the status nodes turn to green in colour and
display the word Ready. Progress information is also visible at the bottom
left of the screen:
When all phases of processing are complete, either click the Concept Map
button at the bottom of the main Control Panel, or click Open Map in
New Window (in the upper left of the screen) to open a larger map in a new
window.
A Close tab button (x) that allows you to close the current project
(current settings are saved) to exit or load other projects;
However, the Load Data stage will not be covered in this chapter. Please
refer to the start of the previous chapter, Creating an Automatic/
Exploratory Map: For Beginner Users (page 38) for information on how to
start Leximancer and load data.
Stages of Processing
1. Load Data:
Please refer to the previous chapter (page 38) for information on how to
load data in Leximancer.
Alternatively this stage can be expanded (using the plus sign next to
Show Settings) to reveal two sub-stages within:
There are options to Edit the Text Processing and Concept Seeds Settings.
transitions in meaning. The conceptual map of the
information (such as the words "and" and "of").
To remove non-textual
material from the text, such as menus and forms in web pages,
sentences that are unlikely to be part of the specified language are
removed. This is achieved heuristically by removing sentences that
contain fewer than 1 (or 2) of the stop-list words. If processing
spoken language, this setting should be turned off.
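The heuristic can be sketched as a simple count. The stop-word list here is a tiny sample rather than Leximancer's own list, the tokenisation is deliberately crude, and the function name is invented.

```python
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "is"}  # tiny sample list

def passes_prose_test(sentence, threshold=1, stop_words=STOP_WORDS):
    """Keep a sentence only if it contains at least `threshold` stop-words."""
    words = sentence.lower().split()
    count = sum(1 for word in words if word.strip(".,;:!?") in stop_words)
    # A threshold of 0 keeps every sentence, i.e. the test is switched off.
    return count >= threshold
```

A sentence such as "The tank is full of gas." passes with the default threshold, whereas a menu fragment like "Home | Products | Contact" contains no stop-words and would be removed.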
Clicking on the Edit button for Text Processing reveals the following
interface:
Prose Test Threshold (0-5): The Prose Test feature examines raw text
sentences to decide whether they are valid prose from the configured
languages. This is achieved by counting the number of stop-words that
appear within each sentence. If this number is high, it is likely to be a
useful much of the time for tagging proper names. Note that it binds
compound names into one token.
Break at Paragraph (Yes|No): This setting is to prevent context blocks
from crossing paragraph boundaries. Only if the majority of paragraphs
in the text are shorter than 2 sentences should you consider ignoring
paragraphs.
Auto-Paragraphing (Yes|No):
the previously prepared text from being deleted. This will also
prevent the importation and preparation of any files in the data
folder which have been prepared before.
Stopword Removal
During Preprocessing, words with low semantic-content (meaning) are
removed from the text data using a predefined Stopword List. An
example of an English Stopword is 'and'. This word occurs frequently in
English text, but would not constitute a useful or clearly-defined concept.
If you are using an unsupported language, you can update this list (e.g.
by translating the contained words into your language). Note that stop
words are removed from the text to be analysed, and cannot be selected as
manual seed words.
Edit Stopword List (Button): You can check the words that are counted
as stop-words by clicking this button:
You can browse through this alphabetical stoplist and see if you find any
words that you would rather have left in the analysis. If so, click on the
Remove option next to the word, and change the status of the word to
Allow using the dropdown menu. Set the word's status to Evidence if you
want to allow the word to form part of a concept thesaurus, but not to be
considered as a possible concept candidate itself:
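The difference between the three statuses can be summarised in code. This is only a model of the behaviour described above: the status table is invented, and treating unlisted words as Allow is an assumption.

```python
# Hypothetical status table; words absent from it are treated as "Allow".
STATUS = {"and": "Remove", "of": "Remove", "not": "Evidence"}

def can_be_evidence(word, status=STATUS):
    """Words may enter a concept thesaurus unless they are removed."""
    return status.get(word, "Allow") != "Remove"

def can_be_concept(word, status=STATUS):
    """Only fully allowed words may be proposed as concepts themselves."""
    return status.get(word, "Allow") == "Allow"

words = "gas and not pipework".split()
evidence = [w for w in words if can_be_evidence(w)]
candidates = [w for w in words if can_be_concept(w)]
```

In this example the word "not" can contribute to a thesaurus but is never offered as a concept, while "and" is dropped altogether.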
You can edit words in the list by clicking on them and typing in the text
box that's revealed. You can also Add words to the stoplist, or Remove
them from the list. The Download button lets you download the
current .xml stoplist file, and the Upload option lets you upload
another .xml stoplist file.
The Language column lets you know the language from which particular
stopwords come in case you are analysing documents in different
languages. You can change this setting so that stopwords are only
removed from text in the appropriate language. The language
You may also add the stoplist of another language. This includes
high-frequency, non-semantic words from a wide selection of languages. Note
that the relevant stoplist will automatically load after selection of the
language in the earlier document selection window:
Tagging Options
Tags are important for comparing different documents based on their
conceptual content, for example, different speakers in transcript
documents, or for a comparison between different text sources. At this
stage in processing, you can instruct Leximancer to pay attention to
certain tag categories so that you may analyse them later.
The Apply Folder Tags (Yes|No) and Apply File Tags (Yes|No) options
can cause each part of the folder path to a file, and optionally the
filename itself, to be inserted as a tag on each sentence in the file. In our
example project, we can use the File Tagging facility to compare the
content of the transcripts on different days of the hearing. Source
document tags can then be included as concepts on the map.
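To picture what these two options do, the sketch below derives tags from a file's location. How Leximancer actually names its tags is not shown in this manual, so the plain tag names, the use of the filename without its extension, and the example path are all assumptions.

```python
from pathlib import PurePosixPath

def path_tags(relative_path, folder_tags=True, file_tags=True):
    """Return the tags a file's sentences would receive from its path."""
    path = PurePosixPath(relative_path)
    tags = []
    if folder_tags:
        # One tag for each folder on the way down to the file.
        tags.extend(path.parent.parts)
    if file_tags:
        # Assumption: the file tag is the filename without its extension.
        tags.append(path.stem)
    return tags

tags = path_tags("hearing/week1/day3.txt")
folders_only = path_tags("hearing/week1/day3.txt", file_tags=False)
```

Every sentence in day3.txt would carry the tags hearing, week1 and day3, so the days of the hearing can later be compared with one another.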
Commentary: This is a powerful feature that lets you code all the
sentences in each document with categorical tags just by placing
the files in folders, possibly within other folders etc. The tags can
Automatically Identify Concepts (Yes|No): Turn off automatic
identification of concepts if you would like only concepts that you define
yourself on the map.
the list. If you are not interested in names at all, you can set this to
0%. Increase this number if you are particularly interested in
names.
Commentary: Use this with caution. You need lots of memory and
lots of concepts to use this option effectively.
it could also cull concepts you might want. Note that this filter does
not remove the words from the text, it just stops them being
selected as automatic concepts. You can still use these words as
manual seeds.
3. Thesaurus Generation:
Alternatively you can expand this stage (using the plus sign next to Show
Settings) to Edit the settings for two sub-stages within:
you may wish to create your own concepts (such as violence) that
you are interested in exploring, or create categories (such as dog)
containing specific instances of terms found in your text (such as
hound and puppy).
Once the Generate Concept Seeds stage has been run, you can check and
modify the discovered concepts by opening the Generate Thesaurus
Settings and clicking on Edit Concept Seeds. The following interface will
appear:
Single click on concept seeds to select them. Use the Remove button to
remove any unwanted automatic concept seeds. Use the Select All button,
or hold down control <ctrl> while clicking, to select multiple items.
You can merge similar concept seeds by selecting two concepts and
clicking the Merge button. If you do so, the merged concept takes its
name from one of the concept seeds, and the concept then has two
thesaurus items. For example, if you merge the concept seeds company
and companies, the following will result:
If you change your mind, you can select the merged concept in the list
and use the Unmerge button to separate the two original seeds.
You can also edit automatically extracted concepts. For instance, if you
wish to add additional thesaurus items, select the concept from the list
and click Edit.
The following dialogue will appear:
This interface allows you to change the name of the concept, identify it as
a name-like or word-like concept, and choose whether to allow
Leximancer to rename the concept during subsequent learning.
You can use the arrow buttons to Add or Remove terms related to the
concept. Add terms if you believe that there are other words that predict
well the presence of the concept in a section of text. You can choose
words from the Frequent Words or Frequent Names lists, or enter your
own words in the Add Words or Add Names text boxes. You can also
identify terms that constitute negative evidence, or evidence that a
concept is absent, but use this option with care. Leximancer will
automatically learn the weightings for these words from the text during
the Thesaurus Learning phase.
If you wish to create your own concept(s), close this dialogue and return
to the Concept Seed Editor interface. Click on the User Defined Concepts
tab and then click on Add. This opens the Add Concepts interface, where
you can define new concepts yourself.
Name the new concept using the lists of frequent words and names, or
type a concept name into the text box. Use the right arrow button to
move the name of the new concept over to the Current Concepts list:
Click OK to close this window and see your new concept under the User
Defined Concepts tab in the main Edit Emergent Concept Seeds dialogue.
The new user-defined concept can now be edited in a similar fashion to
automatic concepts.
If you wish to rerun the project from a prior stage and retain your edits to
emergent concepts, you should click OK to save your edits. Then reopen
the Edit Concept Seeds interface, and Download your edited list of
concepts somewhere on your local drives.
If you run the Generate Concept Seeds stage again, Leximancer will
revert to its original list of auto concepts.
If you wish to use your saved list of edited concepts, go into the Edit
Concept Seeds interface again, and Upload your saved concept seeds file
in the Auto-concepts tab.
You must save the concept seeds in each of the Auto- and User-defined
tabs separately. The separation affords greater control by allowing you to
reload individual seeds lists if you wish.
Using Tags
If you have opted to Apply File Tags in the Text Processing settings, then
you should see a tag representing each of your source documents in the
Auto Tags tab in the Edit Concept Seeds interface:
Tag concepts are concepts for which no associated terms will be learned
by Leximancer (unless otherwise instructed). They are useful if you want
to make comparisons among groups within the data.
You can aggregate the tags using the Merge button, similar to merging
concepts. In this case for example, we could merge the document tags to
create 4 weeks of hearing transcripts for comparison:
If you have created Folder Tags, the numbering lets you know in which
level of the hierarchy a folder resides (Level 1 is the top level).
**Please note: if you make other changes to your user seeds, including
tags, you must make these changes FIRST and then save them by clicking
ok. THEN you may re-enter the Edit Emergent Concept Seeds stage and
If you run the Thesaurus Learning stage, and then return to the User
Defined Concepts tab in the Edit Emergent Concept Seeds stage, you
can observe the effect of Sentiment Lens. Sentiment thesaurus terms that
are irrelevant or inconsistent in your text will be grey. Those left coloured
are suitable for analyzing sentiment in your text and will be used as
thesaurus items to develop sentiment concepts.
Once you reach the Interactive Concept Map, you may also observe new
Sentiment terms that have been automatically added to the Thesaurus.
When analysing news articles about climate change for example, the term
mongering does not appear in the original seed list under _unfavterms.
Yet once Sentiment Lens is applied and run, it appears in the list of
Thesaurus items for _unfavterms (next page):
Thesaurus tab in
included in each concept. After you have run the learning phase, examine
the log to see how many iterations of thesaurus learning took place to
arrive at the final concept definitions. This number should ideally be
between 3 and 9. If the number is more than 9, consider lowering the
learning threshold. Conversely, if the number of iterations is very low,
consider raising this threshold.
Learn From Tags (Yes|No): You can use this option if you have any tags,
either automatically extracted from tables, file or folder names, or
speaker tags, or manually entered user tags. Turning on Learn From Tags
will treat tags like concept seeds, learning a thesaurus definition for each.
This setting is important if you are conducting Concept Profiling
(discussed below) where you wish to extract concepts that discriminate
between different folders or files (such as extracting what topics
segregate Liberal from Labour party speeches).
Learning Type (Normal|Supervised): There are two forms of learning
that are supported by Leximancer: Automatic and Supervised. Automatic
Concept Profiling
These settings allow the learning process to discover new concepts that
are associated with selected user-defined and automatic concepts. This is
useful for profiling concepts or names, for doing discriminant analysis on
prior concepts, or for adding a layer of more specific concepts which
expand upon a top layer of general concepts. Profiling also allows you to
ignore large sections of text that are not relevant to your particular
interests.
Once the initial concept definitions have been created, words that are
highly relevant to these concepts can be identified as potential seeds for
new concepts. For example, if you profile the initial seed flowers, a
concept definition is grown around this word as usual. Then new
concepts are developed from the flowers definition that would produce
more specific topics, such as roses, daffodils, petals, and bees.
This process is useful if you are trying to generate concepts that will
allow segregation between various document categories. For example, if
you are trying to discover differences between good and bad applicants,
simply place exemplars of each in two separate folders (one for each
type), and Apply Folder Tags in the Preprocessing stage. This will create a
concept class for each folder. In the Concept Editor, only retain these
folder tags in your Automatic Concepts and Tags lists. Switch on Learn
From Tags in the Thesaurus Learning phase, and use the profiling
settings described below to extract relevant segregating concepts.
Only Discover Names (Yes|No): This option lets you discover name
concepts only when profiling. This is useful for discovering social
networks of association.
4. Run Project:
You can run this stage of processing (and all of those preceding it)
using default settings by clicking the Run Project button.
Alternatively you can expand this stage (using the plus sign next to Show
Settings) to Edit the settings for three sub-stages within:
Select the boxes for any concepts you wish to combine into compound
concepts. There are lists of all tags, name concepts and word concepts in
the left and centre columns. Move them across to the compounding
workbench using the right-hand arrow:
Combining the concepts using the AND operator requires both concepts to
appear in the same (2-sentence) piece of text for the compound to be
coded.
After moving the concepts you wish to combine into the right column,
tick both of them again and click the AND button in the header:
Hover your mouse over the compound concept to see the equation
defining it.
Use the same method to combine concepts with the Boolean operator
OR. Using the OR conjunction means that evidence for your compound
concept will be calculated to include evidence for either of your concepts.
Compounding concepts using the OR operator is similar to Merging
concepts in the Edit Concept Seeds interface.
To make a compound concept using the NOT operator, first select the
concept you wish to negate. The NOT button at the top of the column
will become available. Clicking the NOT button will negate the concept
you have selected:
You can now combine the negated concept with another concept by
following the steps for combining concepts with AND as outlined above.
Doing so will include instances of the positive concept, and exclude
instances of the negated concept:
The same procedures outlined above can also be used to build more
complex Sentiment concepts.
This is done automatically when you click Sentiment Lens in the Concept
Seeds Editing interface. For example, the Sentiment Lens creates a
compound concept for Positive Sentiment that includes favourable terms
and excludes negation terms:
You can specify different rules for coding sentiment by combining the
concepts used as building blocks for the Sentiment Lens (favterms,
unfavterms, negationterms) in different ways if you wish.
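As a rough illustration of how the AND, OR and NOT operators behave, the sketch below treats each text segment as the set of concepts coded into it. It is only a model of the logic described above, not Leximancer's own implementation; the helper names (concept, c_and, c_or, c_not) and the sample segments are hypothetical.

```python
# Hypothetical model: a text segment is the set of concepts coded into it,
# and a concept (simple or compound) is a test applied to that set.

def concept(name):
    """A simple concept is present when its name is in the segment."""
    return lambda segment: name in segment

def c_and(a, b):
    """AND: both parts must be present in the same segment."""
    return lambda segment: a(segment) and b(segment)

def c_or(a, b):
    """OR: evidence for either part counts as evidence for the compound."""
    return lambda segment: a(segment) or b(segment)

def c_not(a):
    """NOT: negates a concept so it can be excluded from a compound."""
    return lambda segment: not a(segment)

# Positive sentiment as described above: favourable terms are included,
# negation terms are excluded.
positive = c_and(concept("favterms"), c_not(concept("negationterms")))

segments = [
    {"favterms", "service"},
    {"favterms", "negationterms", "service"},
    {"unfavterms"},
]
coded = [positive(segment) for segment in segments]
print(coded)  # [True, False, False]
```

A different coding rule can be tried by recombining the same building blocks, for example c_or(concept("favterms"), concept("unfavterms")) to flag any segment that carries sentiment of either kind.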
Finally, once you have created a compound concept you may edit it. Do
so by selecting a compound, and then clicking Edit:
This allows you to rename your new compound concept, specify whether
it is a name or a regular word, and whether or not it should be treated as
sentiment:
(NOTE: Sentiment terms will not appear on the map, but will be present in
the report tabs to the right of the map).
Select the compound concept you wish to add to the map, and move it to
the left hand column using the arrow.
Now when you click Ok, Run the Project, then open the concept map,
your newly created compound concept will appear:
By default, All Concepts and All Discovered Names appear in the Mapping
Concepts tab. This means that all word-like and name-like concepts
discovered by Leximancer will appear on the concept map. Tags,
compound concepts and name-like user-defined concepts must be
added to the list manually. Using the arrows to replace these wildcards
with others from the General list allows you to map other groups of
concepts:
Instead of using the option in the General list, you can choose to map
particular names (including tags) and concepts using the lists on the
right. Simply use the arrow buttons to add or delete concepts from each
of the lists.
This interface also allows you to filter records in and out of the analysis
by specifying Kill Concepts and Required Concepts. Simply click on the
appropriate tab and move the desired concept(s) or tag(s) across using
the arrow buttons.
For name-like concepts, all the associated terms that are present in
the block are noted. The block is said to contain the concept if the
word with the highest relevancy to the concept is above a set
threshold.
Classification Settings
Clicking on the Options tab opens the following dialogue:
are summed. The block is said to contain the concept if this sum is
greater than a predefined threshold. This threshold specifies how much
cumulative evidence per sentence is needed for a word concept
classification to be assigned to a context block.
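The two classification rules can be contrasted with a small sketch. It assumes that each concept has a thesaurus of terms with learned weights, and that a context block is a list of words; the weights, thresholds and function names below are hypothetical, and the sketch does not normalise evidence per sentence.

```python
def contains_name_concept(block_words, thesaurus, threshold):
    """Name-like rule: the single most relevant term found in the block
    must be above the threshold."""
    weights = [thesaurus[w] for w in block_words if w in thesaurus]
    return bool(weights) and max(weights) > threshold

def contains_word_concept(block_words, thesaurus, threshold):
    """Word-like rule: the summed evidence of all terms found in the block
    must be greater than the threshold."""
    total = sum(thesaurus[w] for w in block_words if w in thesaurus)
    return total > threshold

# Hypothetical thesaurus weights for one concept.
dog = {"dog": 3.0, "hound": 1.5, "puppy": 1.2, "bark": 0.6}
block = ["the", "puppy", "and", "the", "hound", "bark"]

as_word = contains_word_concept(block, dog, threshold=2.4)
as_name = contains_name_concept(block, dog, threshold=2.4)
print(as_word, as_name)  # True False
```

With the same threshold, several weak terms can add up to a word-like classification, whereas the name-like rule needs one sufficiently strong term on its own.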
Classification Threshold for Supervised Classifiers (0.7-1.4):
random place on the edge, the marble could settle in different valleys
depending on where it starts. There may be multiple shallow
valleys (local minima) in the map terrain if words are used ambiguously
and the data is semantically confused. In this case the data should not
form a stable pattern anyway. Another possibility is that some concepts
in the data should in fact be stop words, but aren't in the list. An example
of this is the emergence of the concept 'think' in interview transcripts.
This concept is often bleached of semantic meaning and used by
convention only. The technical result of the presence of highly-connected
and indiscriminate concept nodes is that the map loses differentiation
and stability. The over-connected concept resembles a mountain which
negates the existence of all the valleys in the terrain. To fix this, remove
the over-connected concept.
Note that rotations and reflections are permitted
Default Theme Size Percentage (10-65): This option sets the size of
themes (concept groupings) visible when the map interface opens
initially. The theme size slider beneath the concept map also allows this
parameter to be adjusted through the map interface.
You can also specify the Size (Width and Height) of the concept map to be
produced in this interface.
Final Outputs
The Concept Map
Using the Segment Statistics setting, concepts are coded and classified at
the level of the text segment. You can define a text segment (the coding
resolution) using the Sentences per Block setting in the Pre-processing
Options.
Using the Section Statistics setting, concept codes are applied to sections
of the data. The definition of a document section depends on the type
of data:
The Auto Scale setting scales the axes in the Quadrant graphic.
This spreads the Attributes in the Quadrant space for improved
visibility.
Select the Tags (or concepts) of interest from the Available Names
(or Concepts) list(s), and use the appropriate left arrow to add
these to the Categories list.
Note: If you wish to use Tags as Categories, you must add these to the
Mapping Concepts list by hand in the Select Concept to Locate phase.
This codes the data with the Tags so that they can be used in the
Dashboard Report. It will also cause them to be clustered on the map among
the topical concepts.
Select the All Concepts wild card from the General list, then click
the Attributes left arrow to use all the word-like concepts as
Attributes. This wildcard includes all the entries in the Available
Concepts list on the right. Alternatively, you can select individual
concepts from the list and add them as Attributes by hand.
Select the All Discovered Names wild card from the General list,
then click the Attributes left arrow to use all the name-like concepts
as Attributes. This wildcard includes all the entries in the Available
Names list in the centre. Alternatively, you can select individual
names from the list and add them as Attributes by hand.
When the settings have been configured, click Ok and run the final
stages of processing to produce the Dashboard Report.
After clicking the Run Project button, you can download the Dashboard
Report from the button beneath the main Project Control Panel.
The html version is useful if you wish to make edits to the Report.
This version can be saved as a zipped folder (or archive) on your
local machine. You may need to rename the folder (changing the
extension from insight-dashboard-zip to insight-dashboard.zip) to
allow it to be extracted or opened. Click on the insight-dashboard.html
file to view the Report in a browser tab, or right
click and select Open With to view the Report in another application
(such as Microsoft Word).
The Dashboard is named after the project in which it was created. The
header provides counts of the Total text Segments or Sections coded in
the Report. It also presents counts for the number of Concepts and
Categories specified.
Note: The preamble explains the various sections of the Report, and
is included as part of the Dashboard to allow others to understand its
contents.
The Strength score is the reciprocal conditional probability. Given that the
Attribute is present in a section of text, it gives the probability that this
text comes from that Category. Strong concepts distinguish the Category
from others, whether or not the Attribute is mentioned frequently.
The percentages in the Ranked Concept for Categories lists match the
quadrant coordinates. They reflect the same Strength and Frequency
conditional probabilities.
The Prominence score combines the Strength and Frequency scores using
Bayesian statistics. Prominence scores are absolute measures of
correlation between category and attribute, and can be used to make
comparisons over time.
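The Strength score can be illustrated with simple counts of coded text segments. The sketch below assumes that Frequency is the conditional probability in the other direction (the probability that the Attribute appears, given the Category), which is what "reciprocal" suggests here; that reading, the function names and the counts are assumptions. Prominence is not reproduced because its exact formula is not given in this section.

```python
def strength(n_both, n_attribute):
    """P(category | attribute): of the segments coded with the attribute,
    the share that come from the category."""
    return n_both / n_attribute

def frequency(n_both, n_category):
    """P(attribute | category): of the segments in the category,
    the share coded with the attribute (assumed definition)."""
    return n_both / n_category

# Hypothetical counts of coded text segments.
n_category = 200   # segments belonging to the category
n_attribute = 50   # segments coded with the attribute
n_both = 40        # segments that have both

s = strength(n_both, n_attribute)
f = frequency(n_both, n_category)
print(s, f)  # 0.8 0.2
```

In this example the attribute is mentioned in only a fifth of the category's segments, yet four out of five of its mentions fall in that category, so it is a strong but infrequent attribute.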
Data Exports
Among the project's results, Leximancer produces several statistical
reports. These can be exported for reporting or to allow further analysis
in other applications.
Several reports are available for Download from the Data Exports button
beneath the Project Control Panel. These include: (1) the Pairs of
Concepts across the Entire Corpus (the co-occurrence matrix); (2) the List
or Pairs of Concepts within each Text Excerpt; (3) the List of Concepts
within each Context Block, within each Text Excerpt; and (4) The
Sentiment Lens Seeds Set:
Hover your mouse over the Download button for a description of the level
of detail in each of the reports.
Leximancer Exports are formulated from the base Data Corpus and
loaded into the system for analysis. The granularity of the Exports is
denoted by:
1. Pairs of Concepts across the entire Data Corpus (the co-occurrence matrix)
Downloads a comma delimited file displaying the matrix of co-occurrences
between concepts. This file will open in a spreadsheet program,
including recent versions of Excel. It contains co-occurrence counts,
listed for every concept pair combination, as well as x,y coordinates for
each concept on the map, and the weight for each concept, which is the
sum of its co-occurrences with all the other concepts:
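The concept weight described above can be recomputed from the pair counts. The following sketch uses a small in-memory table of hypothetical counts rather than the real export, whose exact column layout is not documented in this section.

```python
from collections import defaultdict

# Hypothetical co-occurrence counts for unordered concept pairs.
cooccurrence = {
    ("dog", "puppy"): 12,
    ("dog", "walk"): 7,
    ("puppy", "walk"): 3,
}

def concept_weights(pairs):
    """Weight of a concept = sum of its co-occurrences with all other concepts."""
    weights = defaultdict(int)
    for (first, second), count in pairs.items():
        weights[first] += count
        weights[second] += count
    return dict(weights)

weights = concept_weights(cooccurrence)
print(weights)  # {'dog': 19, 'puppy': 15, 'walk': 10}
```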
This tab delimited text file contains one row for each context block. Each
row indicates:
- the file, text excerpt, and starting sequence number of the context
block;
- the html surrogate link for viewing this context block in a browser;
- the presence or absence of a concept or tag in the context block.
There is one column for each concept or tag class. As a result, this table
is very sparse.
This export is specifically designed for input into statistics or data mining
packages for building models such as: decision trees, rule sets, logistic
regression, or market basket analysis. There is a setting in the
Classifications Settings tab to generate either real valued or binary values.
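As a starting point for further analysis, a tab delimited table of this shape can be read with a few lines of Python. The sketch below works on a small in-memory sample; the column names (file, excerpt, block) and the concept columns are hypothetical, so check the header row of your own export and adjust them before use.

```python
import csv
import io

# Hypothetical sample; the real export's column names and order may differ.
sample = (
    "file\texcerpt\tblock\tdog\tviolence\n"
    "a.txt\t1\t1\t1\t0\n"
    "a.txt\t1\t3\t0\t0\n"
    "b.txt\t2\t1\t1\t1\n"
)

# Assumed names of the columns that identify a block rather than a concept.
META_COLUMNS = {"file", "excerpt", "block"}

def concept_counts(handle):
    """Count the context blocks in which each concept or tag is present."""
    counts = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        for column, value in row.items():
            if column in META_COLUMNS:
                continue
            # Works for binary values and for real values (present if above zero).
            present = 1 if float(value) > 0 else 0
            counts[column] = counts.get(column, 0) + present
    return counts

counts = concept_counts(io.StringIO(sample))
print(counts)  # {'dog': 2, 'violence': 1}
```

To read a real file, pass an open file handle (for example open(path, newline="")) in place of the io.StringIO sample.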
This file can be Uploaded directly into other projects via the Concept
Seeds Settings Edit interface.
Logbook Exports
In the map interface, Leximancer allows complex records of queries to be
stored and exported. Find a particular query of interest, and its example
text, and add it to the logbook:
From here, you can export either just the current page, or every entry in
the logbook:
these act like lists of keywords, or fixed dictionaries; they are not
modified by the learning process.
in the User Defined Concepts tab, you may wish to Add your own
seeds (such as violence) to search for concepts that you are interested
in exploring, or create categories (such as dogs) containing the
specific instances found in your text (such as hound and puppy).
There are tabs for editing the Automatic Concepts (concepts identified by
Leximancer) and User Defined concepts (concepts that you wish to define
yourself).
At this stage, only the central key word for each concept has been
identified. The learning of associated terms and their weightings occurs
in the following Thesaurus Learning phase.
In the interface above, you can select and Merge or Delete concept seeds
(note: holding down <ctrl> while clicking allows you to select multiple
items).
Open the User-Defined Concepts tab and click on Add to open the Add
Concepts interface:
Here you can add terms strongly-related to your concept. For example, if
you are interested in finding sections in your text containing violence,
create the concept violence and add any terms from the Frequent Words
or Names list above that you think indicate a violent act. Only use words
that fairly unambiguously indicate this concept in the text. Leximancer
will automatically find additional terms from the text during the
Thesaurus Learning phase, so you don't have to know all the relevant
words in advance. Click on Ok to save your changes and exit.
When you return to the Concept Seed Editing interface, you can also
create and edit Tag categories. These are concepts for which no
associated terms will be learned by Leximancer. This is useful if you want to
compare groups in the data (using file or folder tags) or perform a simple
keyword search for terms.
Click on OK to close the Concept Seed Editor and return to the main
Project Control Panel. Click on Run Project to run the remaining phases
on default settings. Click on Concept Map to view the map containing the
new concepts.
2. Profiling
This function is not the same as automatic concept discovery. The aim
here is to discover new concepts during learning which are relevant to the
concepts defined in advance, either in the Automatic- or in the
User-Defined Concepts tabs. For example, this setting would allow you to
extract the main concepts related to stem cell research from a broader
set of documents. Concept profiling settings can be found under Edit for
the Thesaurus Settings in the Generate Thesaurus stage. Note that Tags
do not take part in this process automatically. If you have Tag categories,
folder tags for example, which you wish to profile, you must turn on the
Learn From Tags option in the Thesaurus Settings. It is important to
understand that although these discovered concepts are seeded from
words that are relevant to the prior concepts, they are then learned as
fully-fledged independent concepts. As a result, the map will contain
some peripheral areas of meaning that do not directly contain the prior
concepts. Contrast this with the Required Concepts function described
below.
The profiling function has three alternative behaviours: ALL, ANY, and
EACH. You can ask for the related concepts to be relevant to many of the
prior concepts, and thus follow a theme encompassed by all the prior
concepts; this is the ALL option, and resembles set intersection.
Alternatively, the discovered concepts need only be related to at least one
of the prior concepts; this is the ANY option, which is similar to set
union. The EACH option discovers equal fractions of profile concepts for
each predefined concept, and these concepts show what is peculiar to
each predefined concept. The EACH option is very useful for enhanced
discrimination of prior concepts.
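The difference between the three behaviours can be seen in a toy model. The sketch below assumes that each prior concept has a list of candidate words with relevance scores, and it interprets EACH as an equal quota of words that are peculiar to one prior concept. It illustrates the set logic only; the function name, the scores and the selection details are hypothetical and are not Leximancer's algorithm.

```python
def profile(candidates, mode, quota):
    """candidates maps each prior concept to {word: relevance}.
    Returns up to `quota` candidate seed words for the chosen mode."""
    sets = [set(words) for words in candidates.values()]

    def best(word):
        # Highest relevance of the word to any prior concept.
        return max(words.get(word, 0.0) for words in candidates.values())

    if mode == "ALL":        # relevant to every prior concept (intersection)
        pool = set.intersection(*sets)
    elif mode == "ANY":      # relevant to at least one prior concept (union)
        pool = set.union(*sets)
    elif mode == "EACH":     # equal share of words peculiar to each prior
        share = quota // len(candidates)
        picked = []
        for prior, words in candidates.items():
            others = set().union(
                *(set(w) for p, w in candidates.items() if p != prior)
            )
            own = sorted(
                (w for w in words if w not in others),
                key=lambda w: -words[w],
            )
            picked.extend(own[:share])
        return picked
    else:
        raise ValueError(f"unknown mode: {mode}")
    return sorted(pool, key=lambda w: (-best(w), w))[:quota]

# Hypothetical relevance scores for two prior concepts.
candidates = {
    "labour": {"workers": 0.9, "ethics": 0.5, "research": 0.4},
    "liberal": {"business": 0.8, "ethics": 0.6, "research": 0.3},
}

in_all = profile(candidates, "ALL", quota=2)
in_any = profile(candidates, "ANY", quota=3)
in_each = profile(candidates, "EACH", quota=2)
print(in_all)   # ['ethics', 'research']
print(in_any)   # ['workers', 'business', 'ethics']
print(in_each)  # ['workers', 'business']
```

ALL keeps the shared theme, ANY keeps the most relevant words regardless of which prior concept they belong to, and EACH keeps the words that set each prior concept apart.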
If you wanted to extract the main concepts related to stem cell research
from a broader set of documents, for example, you could disable
Automatic Concept Identification in Concept Seeds Edit in the Generate
Concepts stage. Instead you would create user-defined seeds for multiple
simple concepts that encompass the scenario. You might seed concepts
such as research, ethics, debate and so on. Keep the seed words for
each concept simple, and don't try too hard to restrict the seeds of each
concept to just the topic you are after. In this instance we will be
considering the intersection of all these elements. Expand the Generate
Thesaurus Settings, and Click on Edit for Thesaurus Settings. Specify a
quota of concepts to be discovered in the Concept Profiling options.
Choose to discover several profiled concepts per prior concept. Select
Concepts in ALL as the operator.
If you are attempting to discover the network of associated names around
a name or a scenario, you can choose to only discover name-like
concepts during profiling. You should try the Social mapping algorithm
first for this style of map.
The concept map will contain profiled concepts specific to your area of
interest. The map below is a profile of the Pipework concept in the
Explosion Enquiry (hearing transcript) data set:
Concept profiling is very useful in cases where you have a large amount
of text, but are only interested in exploring particular themes.
Profiling also allows you to make conceptual comparisons (highlighting
divergence or consensus) between groups in your data. This strategy
combines the use of tag categories and profiling, and is described in the
next section (Profiling using Tag Categories).
should then show the tag categories distributed around the concepts. If
the locations of the tags relative to the concept field show little
repeatability, then you should conclude that the tag categories are
difficult to differentiate based on the global concept selection. Essentially,
they all address most of the same global concepts to similar degrees.
This is a result in itself, but if you actually wish to discriminate between
tag categories, see the section on Discrimination of Concepts, Names, or
Categories based on Semantics.
break up a book into various chapters (one file per chapter) to allow
you to explore what topics or characters appear in each chapter,
naming letters or reports - make each file name indicate the person or
organisation who wrote the document to let you see each of these
bodies on the map,
Leximancer can create a category called a Tag for each folder and / or file
name in your data documents. You can create multiple levels of folders
under your parent (top-level) data folder. For example, you can create a
folder for each newspaper, and under each of those a folder for each
month, and under each of those a folder for each journalist. You would
place the text file for each article in the appropriate folder at the bottom
of this tree. Leximancer can then create a Tag for each folder at each
level of the tree.
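The mapping from a folder tree to tags can be pictured as follows. The sketch derives one (level, folder name) pair per folder above a file, with level 1 as the top level under the parent data folder. The example path is hypothetical, and the way Leximancer actually names its folder tags is not reproduced here.

```python
from pathlib import PurePosixPath

def folder_tags(relative_path):
    """Return (level, folder_name) for each folder above the file.
    The path is relative to the parent data folder; level 1 is the top level."""
    folders = PurePosixPath(relative_path).parts[:-1]  # drop the file name
    return [(level, name) for level, name in enumerate(folders, start=1)]

# Hypothetical layout: newspaper / month / journalist / article file.
tags = folder_tags("Herald/2011-03/Smith/article17.txt")
print(tags)  # [(1, 'Herald'), (2, '2011-03'), (3, 'Smith')]
```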
To enable this function, expand the Generate Concepts Settings and click
on Edit Text Preprocessing Settings to bring up the interface below. In the
Tagging Options, select Apply folder tags or Apply file tags:
the Generate Concepts and Generate Thesaurus stages of
processing. Expand the Run Project Settings, and Edit the Concept Coding
Settings. Use the left-hand arrow button to add the file tags to the list of
Mapping Concepts:
Click Ok, and then click on Run Project in the main Control Panel to complete
the final phase of processing. Click on Concept Map when the project is
complete.
The concept map now includes the chapter file tags, and the concepts are
clustered around these according to their relationships. Concepts coming
from the content of a particular chapter will tend to settle near that
chapter's file tag in the map space.
You can explore the topics characteristic of a chapter by clicking on a file
tag. A ranked list of related topics is revealed in the panel on the right.
These are the concepts that are coded into the chapter frequently:
In the example above, the Mock Turtle and Gryphon are clustered near the
Chapter 10 file tag, indicating that they are fairly specific to this chapter.
Filename and folder tags are useful if you wish to explore similarities or
differences between various conditions. For example, placing speeches by
various politicians into folders according to the party to which they
belong will allow you to explore the views of the party, whereas placing
files into folders corresponding to different periods of time will allow you
to explore temporal differences.
The final map will show the tag categories in proximity to their
discriminating discovered concepts. Be aware that the discovered
concepts in this case do not characterise the whole text. In fact they are
quite specific to the tag categories, and do not necessarily cover the
major themes of the whole data set at all.
In order to profile tags:
Select your data files as usual. In this example, the data consists of 4
folders containing speeches by members of Australian political parties
concerning the stem cell debate.
Expand the Generate Concepts Settings, and Edit the Text Processing
Settings. Use the Apply Folder Tags option to create a tag for the
speeches of each political party.
Edit on the Concept Seeds Settings and disable the radio button near
the top. Here we are interested in profiling the tags, and don't want
any automatic- or user-defined concepts present.
Expand the Generate Thesaurus Settings, and open the Concept Seeds
Settings for editing. In the Auto Tags tab you will see that a tag has
been created for each data sub-folder. Delete the Nationals and
Independents tags, so that you are only left with tags for the Labour
and Liberal parties. Click Ok.
Edit the Thesaurus Settings, and tick the Learn From Tags option. In
the Concept Profiling section, enter the number of concepts that you
would like to see on the map related to the two tags (60 in this
example). Since we want to discover concepts that distinguish the
party tags, use the Concepts in EACH operator this time:
Click Ok, then expand the Run Project Settings, and Edit the Concept
Coding Settings. Move the Folder tags of interest into the Mapping
Concepts list on the left (to allow them to be shown on the map).
Click Ok, then click the Run Project button in the main Project Control
Panel to complete the project, and then click Concept Map to see the
party profile map.
The concept map will contain concepts that distinguish the tags from one
another. The map below shows the clusters of concepts that contrast the
arguments made by the Australian Labour and Liberal parties on the
subject of the stem cell debate:
and to characterize the conceptual nature of their
Next, go into the Thesaurus Settings and enable the Concept Profiling
function. Select the Concepts in ANY operator, and choose to discover
one concept per prior name. You can increase the number of discovered
concepts if you want a richer map.
Run the Generate Thesaurus phase, then expand the Run Project Settings
and go into the Concept Coding Settings. Make sure the tags, names and
concepts of interest are included in the Mapping Concepts list. If not, use
the left-hand arrow to add them to the list.
Open the Concept Coding Settings, and put only the name-like
concepts in the Mapping Concepts list. Place the desired structural
concepts in the list of Required Concepts, then run the remaining
stages.
The result will be that only text segments that contain one of the
Required Concepts will be mapped. Consequently, the map will show a
network of names based on relationships that involve at least one of the
required concepts.
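The filtering effect of Required and Kill Concepts can be modelled in a few lines. The sketch assumes that each text segment is represented by the set of concepts and tags coded into it, that a segment is dropped if it contains any Kill Concept, and that it is kept only if it contains at least one Required Concept (when any are specified). The function name and sample data are hypothetical.

```python
def filter_segments(segments, required=(), kill=()):
    """Keep segments with no kill concept and, if any are required,
    at least one required concept."""
    required, kill = set(required), set(kill)
    kept = []
    for concepts in segments:
        if kill & concepts:
            continue                      # suppressed by a Kill Concept
        if required and not (required & concepts):
            continue                      # lacks every Required Concept
        kept.append(concepts)
    return kept

# Hypothetical coded segments.
segments = [
    {"Interviewer", "pipework"},
    {"Smith", "pipework"},
    {"Jones", "budget"},
]

kept = filter_segments(segments, required={"pipework"}, kill={"Interviewer"})
print(kept)  # [{'Smith', 'pipework'}] (set element order may vary)
```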
5. Analysing Transcripts
Transcripts of meetings, interviews and focus groups can be analysed in
Leximancer as normal text, and if you group the interviews into files and
folders, you can use Folder Tagging (Chapter 14) to enhance your
analysis. Moreover, if your transcripts are in plain text or Microsoft Word
and are suitably formatted, Leximancer allows you to select, ignore, or
compare all the utterances of each distinct speaker. To allow the program
to identify the speaker of any text segment, each speaker must be labelled
in a certain way, and a new speaker label must be inserted whenever a new
speaker begins. The format requires dialogue markers that are at the
start of a paragraph; use an upper-case first letter for each constituent
term; are made up of a maximum of three terms; and end in a colon
followed by a space. For example:
Given text data in this format, Leximancer can extract the dialogue
markers as tags and identify the speaker of every subsequent sentence
until the next dialogue marker.
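The marker rules above can be checked on your own transcripts before loading them. The following is a minimal sketch in Python, not Leximancer's own parser: the function name tag_paragraphs, the sample transcript and the characters allowed inside a term are all assumptions made for illustration, and it works on whole paragraphs rather than sentences.

```python
import re

# Marker rules taken from the text: the marker starts a paragraph, has one
# to three terms, each term begins with an upper-case letter, and the marker
# ends with a colon followed by a space.
# Assumption: a term contains only letters and digits.
MARKER = re.compile(r"^((?:[A-Z][A-Za-z0-9]*)(?: [A-Z][A-Za-z0-9]*){0,2}): ")


def tag_paragraphs(paragraphs):
    """Pair each paragraph with the most recent speaker label.

    Returns a list of (speaker, text) tuples; speaker is None for any
    paragraph that appears before the first dialogue marker.
    """
    speaker = None
    tagged = []
    for paragraph in paragraphs:
        match = MARKER.match(paragraph)
        if match:
            speaker = match.group(1)
            paragraph = paragraph[match.end():]  # strip the marker itself
        tagged.append((speaker, paragraph))
    return tagged


# Hypothetical transcript, one list item per paragraph.
transcript = [
    "Interviewer: How do you use the library?",
    "Study Participant One: Mostly online.",
    "I rarely visit in person.",
    "note: lower-case labels are not treated as markers.",
]
for speaker, text in tag_paragraphs(transcript):
    print(speaker, "|", text)
```

Paragraphs that do not begin with a valid marker are attributed to the previous speaker, which mirrors the behaviour described above. A label with more than three terms, or with a lower-case first letter, is left in the text and does not start a new speaker.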
This transforms each dialogue marker in the text into a tag, which is then
inserted into each relevant sentence and displayed for you under the
Auto Tags tab in the Edit Concept Seeds interface.
There is another setting in the Pre-process stage called the Prose Test
Threshold. If your interview text is quite colloquial and does not conform
to standard stop-word usage, set the Prose Test Threshold to 0. This
filter is most useful for prose interspersed with non-textual material,
such as web pages.
Then click on the Auto Tags tab to see the list of speakers identified by
Leximancer:
The Mapping Concepts list lets you choose which tags and concepts you
wish to appear on the concept map. By default, there are two items in the
list, All Concepts and All Discovered Names.
The All Discovered Names wildcard represents only those name-like
concepts discovered automatically by the software.
Full lists of the possible name-like and word-like concepts appear in the
right-hand panels so that you can choose which entries you would like to
see on the map.
The Required and Kill Concepts tabs allow you to select whose utterances
you wish to analyse, and whose you wish to leave out from the analysis.
For example, if you wanted to suppress all the utterances of the
Interviewer, you would move the Interviewer speaker tag into the Kill
Concepts list. This causes all the concepts coded into questions asked by
the Interviewer to be removed from the analysis:
With these changes made, you can click Run Project to complete the final
phase of processing and produce a concept map:
Create a Leximancer project as usual, and click on Load Data. Browse and
select the spreadsheet file (the .tsv or .csv file). Drag and drop it into the
Document Set area, then click Ok.
Run the first phase of processing (Generate Concepts), then expand the
Generate Thesaurus Settings and Edit the Concept Seed Settings. You will
see the concept seeds automatically extracted by the software in the
Auto-Concepts tab:
You can seed your own concepts in the User Defined Concepts tab.
If you want more (or fewer) automatic concepts, just go back one node to
the Concept Seeds Settings and change the Total Number of Concepts to
be suggested by Leximancer.
You can review what tags Leximancer has extracted by clicking on the
Auto Tags tab. The tags take their names from the column headers and
categorical responses in the data. There is no hard limit to the number of
free-text and categorical variables that can be analysed in Leximancer.
The Auto Tags can be used for data mining correlations with textual
concepts, and for selecting which text column(s) you wish to map at any
time:
Run the Generate Thesaurus phase to extract a thesaurus from the data
describing each concept seed.
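As a rough illustration of how tag names relate to the spreadsheet, the sketch below builds Header: value labels from the categorical cells of each record, in the style of the Position and Gender tags used in this chapter. It is written in Python and is not Leximancer's own code. The sample data, the CATEGORICAL list and the row_tags function are hypothetical, and the choice of which columns count as categorical is an assumption made here by hand.

```python
import csv
import io

# Hypothetical survey extract; the column names echo the examples in this
# chapter, but the rows are invented for illustration.
DATA = """Position,Gender,IT feedback comments
Faculty,female,The wireless network drops out in lectures.
Student,male,The computer labs are crowded.
"""

# Assumption: these columns hold categorical answers rather than free text.
CATEGORICAL = ["Position", "Gender"]


def row_tags(row, categorical):
    """Build 'Header: value' tags from the categorical cells of one record."""
    return [f"{column}: {row[column]}" for column in categorical if row[column]]


for row in csv.DictReader(io.StringIO(DATA)):
    print(row_tags(row, CATEGORICAL), "->", row["IT feedback comments"])
```

Previewing the labels this way can help you check that the column headers in your .csv or .tsv file are short and meaningful before you load the file.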
Expand the Run Project Settings, and Edit the Concept Coding Settings to
access the data mining options.
The Mapping Concepts list lets you select what variables you want on the
concept map, like choosing the columns you want in a database query.
For example, let's say you wish to examine the correlations of the three
position types (faculty members, staff and students) with the comments
from the text column called IT feedback comments in the data. You
would like all the textual concepts on the map, so retain the All Concepts
wildcard in the Mapping Concepts list. You would also need to select and
add the three respondent type tags (Position: Faculty, Position: Staff and
Position: Student) to the Mapping Concepts tab using the left arrow:
Once you have configured the data mining settings, run the last phase of
processing and inspect the resulting concept map:
You could choose to correlate the textual concepts with the satisfaction
ratings instead of the position categories. To do so, return to the data
mining settings in the Concept Coding Settings, and remove the position
tags from the Mapping Concepts list. Replace them with the satisfaction
score tags. Rerun the final phase to produce this new view of the data
very quickly:
You could also aggregate the satisfaction tags in the Edit Concept Seeds
stage to produce Low and High satisfaction tags to place on the
concept map. For instance, you could Merge the satisfaction scores of
1-3 and Edit the merged tag to rename it as Low, then Merge the scores of
4-7 and Edit that tag to rename it as High. Adding the Low and High tags to
the Mapping Concepts list in the Concept Coding Settings produces the
following map:
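The Low and High aggregation amounts to a simple binning rule over the 1-7 satisfaction scores. If you prefer to prepare such a column in the spreadsheet before loading it, the rule can be sketched in Python as follows. The function name satisfaction_band is hypothetical, and this is an illustration of the rule, not part of Leximancer.

```python
def satisfaction_band(score):
    """Map a 1-7 satisfaction score to the Low or High band.

    Scores 1-3 are treated as Low and scores 4-7 as High, matching the
    merge example in the text.
    """
    if not 1 <= score <= 7:
        raise ValueError(f"score out of range: {score}")
    return "Low" if score <= 3 else "High"


scores = [1, 3, 4, 7]
print([satisfaction_band(s) for s in scores])
```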
You can also use the Required Concepts and Kill Concepts tabs in the
Concept Coding Settings to filter records in or out of your analysis. For
instance, if you had more than one text column in your spreadsheet, you
could choose to examine the concepts associated with a particular text
response. You would do this by moving the tag denoting the text column
of interest into the Required Concepts tab. In this case, if a data cell does
not come from that text column, it will not be coded for concepts and
mapped.
A Kill Concept is almost the opposite of a Required Concept. If a text
segment matches a Kill Concept tag or concept, it will not be coded with
any other classifier. For example, you could suppress the analysis of all
text segments that match the concept "available" by identifying
"available" as a Kill Concept.
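The combined effect of the two lists can be pictured as a filter over tagged text segments. The sketch below, in Python, shows that logic only and is not Leximancer's implementation: the select_segments function and the sample segments are hypothetical. It assumes that a segment passes the Required list if it matches at least one required entry, as described earlier for Required Concepts, and that a Kill match takes priority.

```python
def select_segments(segments, required=(), kill=()):
    """Keep the segments allowed by the Required and Kill lists.

    segments: iterable of (tags, text) pairs, where tags is the set of tag
              and concept names matched by that text segment.
    required: if non-empty, a segment must match at least one of these.
    kill:     a segment matching any of these is dropped.
    """
    required, kill = set(required), set(kill)
    kept = []
    for tags, text in segments:
        if kill & tags:
            continue  # suppressed by a Kill Concept
        if required and not (required & tags):
            continue  # matches none of the Required Concepts
        kept.append((tags, text))
    return kept


# Hypothetical tagged segments for illustration.
segments = [
    ({"Gender: female", "Position: Staff"}, "The wireless network drops out."),
    ({"Gender: male", "Position: Student"}, "The labs are crowded."),
    ({"Gender: female", "Position: Faculty"}, "Support staff were helpful."),
]

by_required = select_segments(segments, required={"Gender: female"})
by_kill = select_segments(segments, kill={"Gender: male"})
print(by_required == by_kill)
```

In this sample every record carries one of two gender tags, so requiring one tag and killing the other select the same segments. With missing or additional categories the two approaches would differ.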
By way of example, if we wished to map only the comments made by
women in this spreadsheet, we could either add the Gender: female tag to
the Required Concepts tab, or add the Gender: male tag to the Kill
Concepts tab:
If we remove all the tags from the Mapping Concepts tab (so that only the
default All Concepts and All Discovered Names wildcards remain), the
resulting map reflects all the responses made by women in the data:
*********************************************************************
This documentation is Copyright 2011 Leximancer Pty Ltd,
http://www.leximancer.com/.
All rights reserved.
********************************************************************