Semiotic Margins Meaning in Multimodalities
Semiotic Margins
Meaning in Multimodalities
Edited by
Shoshana Dreyfus,
Susan Hood
and
Maree Stenglin
Continuum International Publishing Group
The Tower Building, 11 York Road, London SE1 7NX
80 Maiden Lane, Suite 704, New York, NY 10038
www.continuumbooks.com
All rights reserved. No part of this publication may be reproduced or transmitted in any
form or by any means, electronic or mechanical, including photocopying, recording, or
any information storage or retrieval system, without prior permission in writing from the
publishers.
Introduction
Shoshana Dreyfus
University of Sydney
Susan Hood
University of Technology, Sydney
Maree Stenglin
University of Sydney
The initial inspiration for this book was a conference held at the University
of Sydney in December 2007. The conference, entitled ‘Semiotic Margins:
Reclaiming Meaning’, brought together scholars interested in multimodal
discourse studies from a social-semiotic perspective. As the terms ‘semiotic’
and ‘margins’ suggest, it was motivated by a strong desire to explore meaning-
making resources other than language, especially those modes that are often
considered to lie on the borders, fringes and peripheries of semiosis, and which
have tended to receive less attention from the field of semiotics. Unifying the
contributions to that conference were connections to social semiotics within
the Systemic Functional tradition, acknowledging a theoretical heritage in the
work of Michael Halliday and his colleagues in language as a social-semiotic
system. Even though Halliday (e.g. Halliday & Hasan 1985) has long
emphasized that language is only one semiotic system among many, the work
on language as a resource for meaning-making has to date dominated the
semiotic landscape.
This volume of current work in the field reflects similar interests and
concerns with pushing at the margins of our understandings of semiosis. The
contributions present analyses of the meaning-making potential of a wide range
of modalities including body language, colour and ambience, laughter, architectural
spaces, music, diagramming and image-verbiage relations. Contributions
also engage with a second interpretation of the title Semiotic Margins, one to
do with the relationship of other modalities to language, the question of what
mode is marginal to what, and the ways in which different modes co-articulate,
or co-pattern to create meaning.
The contributions to the volume have been subdivided into four key themes:
1. Beyond paralinguistics;
2. Evolving accounts of space and music;
3. Intermodality between the visual, verbal and aural;
4. Imaging representations of meaning.
Beyond Paralinguistics
The first theme explores modalities of meaning that have been categorized
as paralinguistic, here including body language and laughter.
In Chapter 1, Naomi Knight explores the meaning-making potential of
laughter, articulating its semiotic functions. Of particular interest are the ways
laughter construes affiliation and bonding. The data in her study are drawn
from conversational humour.
In Chapter 2, Susan Hood investigates the ways in which teachers use body
language in face-to-face teaching in tertiary classrooms, with a particular focus
on the ways in which body language functions to give salience to particular
information and to manage interaction and engagement. From analyses of the
data, Hood begins to construct system networks of choices in interpersonal and
textual meaning.
In Chapter 3, Shoshana Dreyfus analyses the non-verbal communication of
a boy with a severe intellectual disability and shows how this requires that a
range of modes of semiotic behaviour be analysed. In particular, behaviours
that have traditionally been regarded as ‘paralinguistic’ become central rather
than peripheral to communication.
The third theme is concerned with theorizing the co-articulation of visual and
verbal meaning in children’s picture books, and the implications of that for
student literacy.
In Chapter 6, Clare Painter, J.R. Martin and Len Unsworth present two
resources for analysing aspects of visual semiosis that fall within the textual
metafunction. They are ‘framing’ and ‘balance’. The chapter illustrates these
systems using images from well-known picture books.
In Chapter 7, Eveline Chan presents findings from a research project invest-
igating how visual and verbal meanings co-pattern in reading comprehension
tests. Chan introduces a new model that begins with the interplay between
representation and composition, and applies it to school literacy test materials.
The study shows that successful comprehension requires the recovery of meanings
across semiotic modes.
In Chapter 8, Theo van Leeuwen addresses the question of intermodal
relationships, challenging the notion that we can communicate at all in a single
modality. Central to his explanation of relationships across modalities is the
element of ‘rhythm’, which is seen as an essential framework for coordinating
and aligning different modalities in meaning making.
In Chapter 9, David Rose explores how children learn to engage with books
as a mode of communication, and how this engagement is pivotal to how they
learn from reading in school. Rose’s account of student literacy includes a
methodology for supporting those children relegated to the socioeconomic
margins of meaning making, and facilitating their apprenticeship into main-
stream literacy practices.
The final theme engages with new approaches to transcribing and mapping
representations of multimodal semiosis in screen-based technologies.
In Chapter 10, Michele Zappavigna presents three new tools for visualizing
text patterns: text arcs, stream graphs and animated networks. Zappavigna
demonstrates that the advantages of these tools are two-fold: they visualize
linguistic patterns that the eye is unable to identify and provide a synoptic
perspective on text without sacrificing logogenesis.
In Chapter 11, David Caldwell and Michele Zappavigna explore how one
visualization method, the arc diagram, is able to represent the way meanings
build up as a text unfolds. The text chosen for analysis is a rap song by Kanye
West and two of his collaborators, and the analysis focuses on one system of
meaning: graduation, a system from appraisal theory (Martin & White 2005).
The key focus of the analysis is how graduation is materialized in the end-rhymes
of a rap song: a feature that distinguishes the rhyming capacity of a rap artist.
Chapter 12, by J.R. Martin, concludes this volume with an overview of what it
means to say that language is a semiotic system, and a summary account of what
the dimensions of that semiosis are. The discussion generates a set of questions
for the multimodal analyst in terms of how they theorize the modalities they
are exploring. Martin then addresses a number of significant challenges in
multimodal research, some of which are the focus of ongoing work, some
of which are newly emerging and some of which we have barely begun to
consider.
Conclusion
The contributions in this book are stunning in terms of both their depth and
breadth. They are also at the cutting edge of multimodal discourse analysis in
three main ways. First, they extend the field of semiotic analysis; second, they
expand the theory; and finally, they contribute to the ongoing dialogue between
linguistics and other disciplines, including those of psychology, architecture,
music, education, language disorder, advertising and information technology.
References
Halliday, M.A.K. & Hasan, R. (1985). Language, context and text: Aspects of language
in a social-semiotic perspective. Geelong: Deakin University Press.
Martin, J.R. & White, P.R.R. (2005). The language of evaluation: Appraisal in English.
Hampshire/NY: Palgrave Macmillan.
Part One
Beyond Paralinguistics
Chapter 1
How much lies in Laughter: the cipher-key, wherewith we decipher the whole man.
(Thomas Carlyle)
Introduction
Literature on Laughter
We begin with a review of just some of the broad scope of literature that pro-
vides a backdrop to the current study of laughter and interpersonal meaning,
a scope that ranges from an interest in the phylogenetic origins of laughter in
the body language of apes, to the ontogenetic development of laughter in
babies and its importance in the development of mother tongue, and to the
growing body of studies from a number of theoretical orientations that concern
the role of laughter in human conversation.
observed both in the infant’s laugh and in the origins of ape displays as discussed
above. An infant’s laugh is a signal to the trusted mother of ‘non-threat’
(removal of fear) and functions interactionally to bring infant and caregiver
together in bonding. It is a signal of togetherness with the meaning of ‘you and
me’ (Halliday 1975).
Painter (2003) suggests that in the ontogenetic transition from protolanguage
to language, laughter again plays an important role, and is in fact a driving
force behind the move into language proper. As the child transitions
from protolanguage into the mother tongue and incorporates
the notion of metafunctions, the functionality of laughter develops as well.
Following Matthiessen (2006), laughter becomes a semiotic system that fuses
with language in the semantic stratum where ‘different semiotic systems are
integrated as complementary contributions to the making of meaning in con-
text’ (2006:2). In other words, different expressive resources are distributed
across semiotic systems, including that of laughter, ‘paralanguage’ and ‘body
language’. Each has different semiotic affordances on the content plane, but
each combines with language to achieve a unified ‘performance’ (2006:3) of
meaning in context. While language is distinguished from other semiotic
systems because it has a level of grammar, or a ‘higher degree of systematisa-
tion of its meaning potential’ (2006:7), laughter is a semiotic system that
construes meanings of language by its own expression system. Figure 1.1
represents laughter as a semiotic system that coordinates with language to make
meaning.
[Figure 1.1: laughter and language as coordinated semiotic systems; labels shown are context, semantics, content plane, lexicogrammar, LANGUAGE and LAUGHTER]
C And Yana somehow it’s like she only got in because she was
N ()
C When she was auditioning, = =
N = = Oh yea:::h ( ) laughing
C Marissa laughed so she felt bad (laughs) so she let her in
N, C (laugh) = =
F = = Oh:::!
N (laughs) = =
C = = It was like she just started laughing in the middle of our audition so
she was like felt so bad (N laughs) and she was like ‘Let her in’ (laughs)
with the use of inscribed attitude in ‘she felt bad’, and later as graded up in
force in ‘she felt so bad’. C also negatively evaluates Yana in terms of judgement
as capacity in the opening move but this time the attitudinal meaning is invoked
rather than inscribed in ‘Yana somehow . . . got in’.
In Examples 1 and 2 the speakers share laughter in the interaction (which
we will attend to shortly). However, in Example 2, the speakers also discuss
instances of the shared experience of laughter and its meaning potential,
providing insight into the potential for laughter to convey attitudes on its own
without the complement of speech. In telling her funny story to her friends,
C is not suggesting that Marissa herself inscribed in speech her negative
judgement of Yana’s audition. Rather, C interprets the attitudinal meaning
potential of Marissa’s laughter as suggesting this negative judgement. Neither
does C suggest that Marissa inscribed in speech a negative self-judgement.
Rather this is intuited from the fact that having laughed at Yana’s performance
she subsequently gave her a part in the play to minimize the negative judge-
ment she had seemingly conveyed towards Yana. In this conversation, the
speakers’ references to laughter in a previous interaction are seen to convey
evaluative meanings.
In their re-telling, the participants also share in laughter, and this gives rise to
another level of analysis in the relationship of laughter to attitude to do with
the sharing or otherwise of attitudes in processes of affiliation.
text are the points around which we negotiate our alignments in degrees
of ‘otherness’ and ‘in-ness’ and laughter is a key resource in this process of
negotiation. In this way, laughter serves as our way in to the study of affiliation,
as it offers an explicit signal that this social process is going on.
In humour, laughter therefore does not only indicate attitude but attitudes
coupled with experience that are laughable to the interactants. These couplings
are only found laughable in relation to how the interactants have been affiliating
together. In Example 3, the interactants laugh off the coupling of the
ideational meaning of ‘eating’ with the positive appreciation ‘well’ (in N’s
utterance ‘We all ate well’) because it creates a tension with the underlying values
around eating too much that they share.
The laughter shows that attitudinal couplings are presented that create laugh-
able tensions for the participants in their negotiations of affiliation. Laughter
provides a window into how interactants negotiate their communal values,
identities and alignments, indicating degrees of ‘otherness’ and ‘in-ness’ (Eggins
& Slade 1997:155). In phases of humour in conversation, laughter is a reaction
to (and marker of) value-infused meanings that need to be negotiated in pro-
cesses of affiliation. It is through laughter that interactants manage tensions
that may arise as they construe themselves as members of communities.
There is more to be considered here in terms of the particular expressive
features of the laughter in the talk and how this relates to attitudinal meaning,
but first we will look more closely at the placement of the laughter in terms
of the move structure of the interaction.
The meanings of the laughter are dependent upon its placement with
speech as a conversational move in the exchange. That is, the meanings are
affected by the speech function the laughing constitutes. Laughter can mark
humorous tension, as in U’s own laughter (in Example 1) following her
utterance ‘I ate well’, or speakers can laugh off a tension, as they do following N’s
utterance ‘We all ate well’. Move choices that are made in the articulation of
the laugh are described in the following section.
When in combination with speech, a laugh co-articulates the move that the
speech is construing. In the following example, the reaction move functions to
‘counter’ the speaker’s claim and is expressed in verbiage and laughter:
The meaning of the laughter therefore depends on the kind of move function
it realizes in the interaction, whether it is part of an initiation or a reaction, and
where it occurs in relation to verbiage.
Before we bring together the partial analyses presented so far to demonstrate
how laughter makes meaning as a conversational move in relation to attitude
and affiliation, there is one further aspect of the meaning potential of laughter
that we need to attend to, that of the choices in expressive features. Because laughter is
a social semiotic, speakers vary the meaning indicated in their laughter by changing
the characteristics of its expression. By considering laughter within a particular
context, it is possible to represent these choices of sound features systematically
and paradigmatically in a system network. The following section will present
a system network that models the sound potential of laughter in convivial
conversational humour, relating its systems of meaning with particular uses in
the social context.
[Figure 1.2: system network diagram; system labels recoverable from the figure include CONSTRICTION, VOICING, LENGTH, PITCH and MOVEMENT]
Figure 1.2 A system network of laughter sound potential in convivial conversational humour
The Interpersonal Semiotics of Having a Laugh 19
As laughter indicates rather than realizes meaning, its expressions are depend-
ent on the context and surrounding co-text in the interaction; and in making
choices from the specified options, it is important to interpret these in their
situational environment. That is to say that who is laughing (e.g. speaker or
hearer(s)) must be considered in relation to the utterances in the text. Whether
the verbal co-text specifically precedes or follows or is overlapped by the laugh
conveys information about meanings produced as well. In combination with
language the semiotic potential of the laugh is more fully revealed.
The paradigmatic options in sound for laughter are given with respect to its
articulatory and its prosodic features,7 and these can change depending on its
movement, providing three simultaneous subsystems in the system network.
The ‘stable’ versus ‘shifting’ distinction in the subsystem of movement follows
Halliday’s (1992) classification for Chinese syllabic phonology,8 and captures
the possibility of a laugh changing its course, from which the laugher re-enters
the system; or if stable, only one entry into the system is necessary. This may
impact upon the meaning made as the speaker may alter his or her attitude by
changing to a different combination of sound features.
Choices in the articulation of the laugh are presented in a close-up version in
Figure 1.3.
As a laugh is articulated, the closure of the mouth is captured in the system
of constriction. While it is represented as distinct choices, it may be seen to follow
Stewart (1995) in that the vocalic sound that combines with the initial /h/
aspiration (e.g. ‘ha’) can be measured on a continuum based on the opening
of the mouth.9 A laugh can also be articulated as voiced or unvoiced in the
system of voicing, and this combined with the closure of the mouth affects
whether the laugh is more nasal or oral. Chafe (2007:28) also identifies ingressive
voicing (recovery inhalations with enough laryngeal friction to be
audible) as a feature of laughter that is not found in ordinary speech, and
that has a highly distinctive sound. This has been incorporated into the
[Figure 1.3: close-up of the articulation systems. CONSTRICTION: non-close (NON-CLOSE-TYPE: open or half-close) or close. VOICING: voiced (VOICED-TYPE: ingressive or egressive) or unvoiced.]
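The articulation systems shown in Figure 1.3 can also be read as a small data structure. The Python sketch below is an illustration only and is not part of the original analysis. It assumes that CONSTRICTION and VOICING are simultaneous systems and that ‘non-close’ and ‘voiced’ each open one dependent system. The names `CONSTRICTION`, `VOICING`, `paths` and `selections` are hypothetical helpers introduced here.

```python
from itertools import product

# Feature labels are taken from Figure 1.3; the nested-dictionary layout is
# an assumption made for illustration. A value of None marks a terminal
# feature; a nested dict is the dependent system that the feature opens.
CONSTRICTION = {
    "non-close": {"open": None, "half-close": None},  # NON-CLOSE-TYPE
    "close": None,
}
VOICING = {
    "voiced": {"ingressive": None, "egressive": None},  # VOICED-TYPE
    "unvoiced": None,
}


def paths(system):
    """List every selection path through one system, general feature first."""
    out = []
    for feature, dependent in system.items():
        if dependent is None:
            out.append((feature,))
        else:
            # Continue into the more delicate system opened by this feature.
            out.extend((feature,) + tail for tail in paths(dependent))
    return out


def selections():
    """Pair one CONSTRICTION path with one VOICING path (simultaneous choice)."""
    return list(product(paths(CONSTRICTION), paths(VOICING)))


if __name__ == "__main__":
    for constriction, voicing in selections():
        print(" > ".join(constriction), "|", " > ".join(voicing))
```

Under these assumptions the sketch yields nine possible articulation selections, for example a close, unvoiced laugh. The full network in Figure 1.2 would add the prosodic systems and the stable/shifting choice in movement as further simultaneous dimensions.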
This example shows speaker P joking that both the restaurant owners and the
government have broken the law and engaged in bribery. By doing so, P implies
a coupling of positive judgement for bribery as sanctioned behaviour on the
part of both the restaurant owners and the government, creating an affiliative
tension with the values shared with her interactants as law-abiding citizens.
In the system of move in Negotiation, the laughs by G indicate reactions,
which work to laugh off the tension created by the positive attitudinal coupling.
At the same time, the laugh shows that an additional attitude is being shared
towards the restaurant and the government, as their ‘breaking of the law’ is
laughable (rather than actually sanctionable), and so in reality the supposed
‘law-breakers’ are positively judged as aligning with these interactants as
law-abiding citizens (and administrators of the law).
This additional attitude and their affiliation with the restaurant are varied,
however, by the particular choices in sounding of G’s reacting laughter.
In Example 5, the first laugh (1) is a mid-pitch, neutral, single burst that is
moderate in volume, and is noticeably short and stable. G’s second laugh (2) is
similar and only slightly higher in pitch, with a short, quick pulse. These follow
from her first reaction to the utterance They probably paid off somebody with Yea:::h,
in which she indicates a possible agreement for negatively judging the restaur-
ant for its sanctionable behaviour (because she does not at first laugh it off).
With her short, quick laughter pulses that follow, G indicates that she cannot
fully share the underlying restaurant community by which these utterances
are taken to be funny. In fact, G makes clearer in later talk that she is not a
regular visitor to the restaurant (‘I haven’t been ever!’), while P is (‘I’ve been two
times! . . . just this year’). The laugh expressions thus show that to laugh off
K Yeah but you see a lot of guys in Brazil who aren’t necessarily gay
who like to dress like women an . . . Because I remember being at = =
T = = Oh you’re talking about (festival) right
K the Carnival and like a whole group of guys they were all dressed like
women = =
T = = Yeah but they’re not men dressed like women; they’re like in a
costume like a little costume like you know whaddamean? You can li-
they’re not reading into this about women’s feelings you know what
I mean? They-they don’t wanna know what it’s about to be a woman.
They-they wanna just have fun an-an I don’t know pick up girls that’s the
idea of the thing. Well that’s how they ( ) = =
K = = °Dressed like a girl° (laughs) = =
T = = Well they don’t really dress like a girl! Alright?
moves to nearly close. The quiet, near close and high-pitched quality of her
laughter indicates negative judgement (and may also indicate self-consciousness
on her part for doing so, cf. Edmondson 1987) and further suggests that T is
the target as she continues laughing through his following speech. While T
continues to construe himself as a serious member of the Brazilian male com-
munity, K’s laughter expression not only acknowledges the tension his values
create but also conveys her own judgement of him, affecting their affiliation.
It is also informative to consider the prosody of laughter in a text, or specifi-
cally, a humorous phase of discourse, as the changes in participant laughter
as they construe a humorous sequence indicate the affiliation process that is
occurring. We expand on the laughter description for Example 1 to exhibit the
way that the changing laughter expression affects the meanings it conveys (see
Example 7). Recall that the three interactants construe a family community in
which eating heavy foods was a value they shared with that community over the
holidays, but something that they need to laugh off to share a young female
community in their conversation:
U = = Yeah I saw like my family and friends . . . I ate well (laughs) (1)
N We all ate well.
(all laugh) (2)
N Dude we all (laughing) ate good pie!
(continuous laughing)
U Yes I agree. (continuous laughing) On a diet now.
(all laugh) (3)
While the speakers present the couplings as creating tension, the laughter
exhibits a rising solidarity as they all share belonging to both of the communities
construed.10 The first speaker marks her coupling (‘I ate well’) as laughable by
expressing a single quiet breathy burst with a front-spread posture following
her own speech in a continuing move in (1) (see Example 7). She has coupled
positive appreciation with heavy eating, and her somewhat nervous (or self-
conscious) laughter indicates a negative self-judgement for having done so and
creating an affiliative tension that needs to be laughed off with the others.
This is made more explicit when the following speaker shifts the underlying
judgement towards all those in the conversation, and reiterates the laughable
coupling (‘We all ate well’). The reacting laughter (2) (see Example 7) is marked
by an increase in amplitude and a decrease in pitch with continuous iteration,
and is shared by speaker and all hearers. Together the participants laugh off the
tension that their ‘ate + well’ coupling causes, and they exhibit their
shared membership of both the family and the young female communities
being construed (as they all participated in this ‘bad behaviour’). Towards the
end of the phase, they begin to negotiate even within the young female com-
munity by laughing off dieting and their own negative self-judgements, and as
a response, the laughter in (3) (see Example 7) is even louder, with a more
open constriction in its iteration. The negative self-judgement is now jubilantly
laughed off, and this roar is shared as the interactants achieve solidarity in
affiliation by identifying as close members of similar contrasting communities,
laughing off those tensions that their respective couplings cause for all of them
as family members. The prosodic unfolding of the laughing from a single quiet
burst marking the humorous tension to a shared roar indicates the progression
of affiliation, and the achievement in the moment-to-moment negotiation of
community through convivial conversational humour.
These examples display various combinations of sounding choices in laughter
that speakers can make in convivial conversational humour, and suggest that
in relation to the context and verbal co-text, particular forms of laughter
indicate distinguishable attitudinal and affiliative meanings. Their placement
in the text also shows how moves in Negotiation are indicated and impact
upon the affiliative relations of the participants. The meanings indicated by
laughter in this context show a development from the social functions of
laughter as a signal of ‘non-threat’ to its role as an indicator of a humorous
(or non-threatening) tension between the social values of communities in
convivial conversational humour. Laughter is not only a semiotic system that
combines facial expression and vocalization to construe various interpersonal
meanings, but in combination with speech as well, its meaning potential grows
as an essential component of the social negotiation of affiliation.
Conclusion
While laughter has been variously linked to differing origins and social
functions, its development as a semiotic system functioning interpersonally,
and complementing speech in the negotiation of affiliation, exhibits its role as
a meaningful mechanism for the maintenance of cohesive relations between
interactants. The meaning potential of laughter has developed from the micro-
functions of the reflective mode into the array of interpersonal discourse
semantic systems that are shared in language. Systematized choices in sound
may be combined to make particular meanings within a specified context, and
this has been exhibited through convivial conversational humour. Interactants
combine these variables to indicate particular attitudes and to negotiate differ-
ent degrees of affiliation in relation to their complex identities, and in this way,
laughter functions as a powerful tool in casual conversation for the manage-
ment of social values that bring people together in communities of the culture.
Laughter is a complementary semiotic to language, and their co-articulation not only
demonstrates the intrinsic functionality and expanding meaning potential that these
combined semiotic systems make possible, but also displays the development
of laughter as a social semiotic in its own right, which enables the constant
negotiation of similarity and difference that characterizes casual conversation.
Beyond conversation, laughter has also been shown to convey a variety of
meanings from play to a display of superiority, indicating that laughter may
be the cipher-key for unlocking a world of semiotic potential beyond speech
in systemic functional linguistic research. This study has provided an initial
attempt to open the door.
Acknowledgements
Notes
1
This can be explained in relation to the systemic functional classification of
intensive identifying processes in lexicogrammar, in which classes of signifying
processes are separated according to the relationship between Token and Value
(what is identified) (cf. Halliday & Matthiessen 2004:238). Meanings that are
realized or denoted within semiotic systems (in lexical items such as ‘signify’ and
‘realize’) are distinguished from those that are suggested rather than denoted (in
lexical items such as ‘indicate’ and ‘suggest’) (and these are also distinguished
from relationships between non-semiotic manifestations and their meanings)
(Martin 1992:280–282), and this reflects the difference between language and
semiotic systems like laughter.
2
This is similar to Provine’s (2000) term ‘convivial humor’, but is specific to
conversation between friends and intimates, characterized by shared laughter
and the negotiation of community values. This is also a reformulation of the
earlier term ‘cooperative conversational humour’ used by Knight (2008), and
was recommended by Salvatore Attardo (personal communication, 2008) to
remove its association with the pragmatics notion of ‘cooperation’.
3
Names have been changed for privacy.
4
Because they can be variously negotiated, bonds here differ from Stenglin’s
(2004) bonding and the notion of ‘bonding icons’ in that bonding icons bring
interactants together into communities around quite strong and serious values
such as nationhood (see Stenglin 2004:410) and peace (see Martin 2008:131)
that cannot be laughed off.
5
The ‘+’ symbol will hereafter denote the coupling of attitude with ideational
meaning.
6
We may also reject couplings altogether in the ‘condemning’ strategy of
affiliation (see Knight 2008), such as in discourses of gossip.
7
These may correspond with ‘calls’ in laughter for the former and ‘bouts’ of
laughter for the latter (see Owren 2007). Chafe (2007) also refers to these as
‘pulses’ and ‘laugh clusters’. Each laugh should be considered as a whole, but its
pulses can be distinguished by the constriction, posture and voicing characteris-
tics, while the whole cluster makes differences in amplitude, length, character
and pitch in relation to all of its pulses.
8
This classification is, however, incorporated here as an overall option rather than
in relation only to what Halliday has classified as ‘aperture’.
9
This is adapted from Halliday’s (1992) system for aperture, but in constriction,
it is the opening and closure of the vocal chamber with the lips rather than its
narrowing or opening by the placement of the tongue that is chosen from in
laughter.
10
Extended thanks to John Knox for his feedback in regard to the prosody of
laughter in this clip.
References
Abercrombie, D. (1968). Paralanguage. British Journal of Disorders of Communication,
3, 55–59.
Apte, M.L. (1985). Humor and laughter: An anthropological approach. Ithaca, NY:
Cornell University Press.
Archakis, A. & Tsakona, V. (2005). Analyzing conversational data in GTVH terms:
A new approach to the issue of identity construction via humor. Humor, 18(1),
41–68.
Bachorowski, J.A. & Owren, M.J. (2001). Not all laughs are alike: Voiced but
not unvoiced laughter readily elicits positive affect. Psychological Science, 12(3),
252–257.
Bateson, G. (1987). Steps to an ecology of mind. New Jersey, London: Jason Aronson.
Bonaiuto, M., Castellana, E. & Pierro, A. (2003). Arguing and laughing: The use
of humor to negotiate in group discussions. Humor, 16(2), 183–223.
Chafe, W. (2001). Laughing while talking. In D. Tannen & J.E. Alatis (Eds), Lin-
guistics, language, and the real world: Discourse and beyond (pp. 36–49). Washington,
D.C.: Georgetown University Press.
Chafe, W. (2007). The importance of not being earnest: The feeling behind
laughter and humor. Amsterdam/Philadelphia, PA: John Benjamins.
Clark, J. & Yallop, C. (1990). An introduction to phonetics & phonology. Oxford: Basil
Blackwell.
Coates, J. (2007). Talk in a play frame: More on laughter and intimacy. Journal
of Pragmatics, 39, 29–49.
Darwin, C. (1965 [1872]). The expression of the emotions in man and animals.
Chicago, IL: University of Chicago Press.
Devillers, L. & Vidrascu, P. (2007). Positive and negative emotional states behind
the laughs in spontaneous spoken dialogs. Paper presented at the Interdisciplinary
Workshop on The Phonetics of Laughter, Saarbrücken, 4–5 August.
Edmondson, M.S. (1987). Notes on laughter. Anthropological Linguistics, 29, 23–34.
Eggins, S. & Slade, D. (1997). Analysing casual conversation. London, New York:
Cassell.
Ellis, Y. (1997). Laughing together: Laughter as a feature of affiliation in French
conversation. Journal of French Language Studies, 7 (2), 147–161.
Freud, S. (1976 [1905]). Jokes and their relation to the unconscious (J. Strachey, trans.
and A. Richard, ed.). Harmondsworth: Penguin Books.
Glenn, P. (2003). Laughter in interaction. Cambridge: Cambridge University Press.
Goodwin, C. (1986). Audience diversity, participation and interpretation. Text, 6
(3), 283–316.
Gumperz, J. (1982). Discourse strategies. Cambridge: Cambridge University Press.
Halliday, M.A.K. (1975). Learning how to mean: Explorations in the development of
language. London: Edward Arnold.
Halliday, M.A.K. (1978a). Language as social semiotic: The social interpretation of
language and meaning. London: Edward Arnold.
Halliday, M.A.K. (1978b). Meaning and the construction of reality in early
childhood. In H.L. Pick & E. Saltzman (Eds), Modes of perceiving and processing
of information (pp. 67–96). Hillsdale, NJ: Lawrence Erlbaum Associates.
Halliday, M.A.K. (1992). A systemic interpretation of Peking syllable finals. In P. Tench
(Ed.), Studies in systemic phonology (pp. 98–121). London, New York: Pinter.
Halliday, M.A.K. & Matthiessen, C.M.I.M. (1999). Construing experience through
meaning: A language-based approach to cognition. London: Cassell.
Halliday, M.A.K. & Matthiessen, C.M.I.M. (2004). An introduction to functional
grammar (3rd edn). London: Edward Arnold.
Jefferson, G. (1979). A technique for inviting laughter and its subsequent accept-
ance/declination. In G. Psathas (Ed.), Everyday language: Studies in ethnomethodology
(pp. 79–96). New York: Irvington Publishers.
Jefferson, G. (1984). On the organization of laughter in talk about troubles. In
J.M. Atkinson & J. Heritage (Eds), Structures of social action: Studies in conversation
analysis (pp. 346–369). Cambridge: Cambridge University Press.
Jefferson, G. (1985). An exercise in the transcription and analysis of laughter.
In T.A. van Dijk (Ed.), Handbook of discourse analysis: Volume 3 (pp. 25–34).
London: Academic Press.
Jefferson, G., Sacks, H. & Schegloff, E. (1987). Notes on laughter in the pursuit
of intimacy. In G. Button & J.R.E. Lee (Eds), Talk and social organisation
(pp. 152–205). Clevedon: Multilingual Matters.
Knight, N.K. (2010). Wrinkling complexity: Concepts of identity and affiliation
in humour. In M. Bednarek & J.R. Martin (Eds), New discourse on language:
Functional perspectives on multimodality, identity, and affiliation (pp. 59–98).
London: Continuum.
Knight, N.K. (2008). ‘Still cool . . . and american too!’: An SFL analysis of deferred
bonds in internet messaging humour. In N. Norgaard (Ed.), Systemic functional
linguistics in use (pp. 481–502). Odense: Odense Working Papers in Language
and Communication, vol. 29.
Knight, N.K. (in preparation). Laughing our bonds off: Conversational humour
in relation to affiliation. PhD Thesis in progress, Department of Linguistics,
University of Sydney.
Koestler, A. (1964). The act of creation. London: Hutchinson.
Martin, J.R. (1992). English text: System and structure. Philadelphia, PA: John
Benjamins.
Martin, J.R. (2000). Beyond exchange: Appraisal systems in English. In S. Hunston
& G. Thompson (Eds), Evaluation in text: Authorial stance and the construction of
discourse (pp. 142–175). Oxford: Oxford University Press.
Sacks, H., Schegloff, E.A. & Jefferson, G. (1974). A simplest systematics for the
organization of turn-taking for conversation. Language, 50, 696–735.
Spencer, H. (2007 [1911]). The physiology of laughter. In H. Spencer (Ed.), Essays
on education and kindred subjects (pp. 298–309). London: Dent.
Sroufe, L.A. & Wunsch, J.P. (1972). The development of laughter in the first year
of life. Child Development, 43, 1326–1344.
Stenglin, M.K. (2004). Packaging curiosities: Towards a grammar of three-dimensional
space. PhD Thesis, University of Sydney.
Stewart, S. (1995). The multiple functions of laughter in a Dominican Spanish
conversation. Paper presented at the Language South of the Rio Bravo Confer-
ence, Tulane University.
Trouvain, J. (2001). Phonetic aspects of ‘Speech-laughs’. Proceedings of the
conference on orality & gestuality (ORAGE) (pp. 634–639). Aix-en-Provence
(France).
van Hooff, J.A. (1967). The facial displays of the catarrhine monkeys and apes. In
D. Morris (Ed.), Primate ethology (pp. 7–68). Chicago, IL: Aldine.
van Hooff, J.A. (1972). A comparative approach to the phylogeny of laughter and
smiling. In R. Hinde (Ed.), Non-verbal communication (pp. 209–241). Cambridge:
Cambridge University Press.
van Leeuwen, T. (1999). Speech, music, sound. Basingstoke: Macmillan.
Vettin, J. & Todt, D. (2004). Laughter in conversation: Features of occurrence and
acoustic structure. Journal of Nonverbal Behaviour, 28(2), 93–115.
Warren, J.E., Sauter, D.A., Eisner, F., Wiland, J., Dresner, M.A., Wise, R.J., Rosen, S.
& Scott, S.K. (2006). Positive emotions preferentially engage an auditory-motor
‘mirror’ system. The Journal of Neuroscience, 26(50), 13067–13075.
Zijderveld, A.C. (1983). The sociology of humour and laughter. Current Sociology,
31, 1–100.
Chapter 2
Introduction
Theory
The study of body language presented in this chapter builds on foundational
studies in gesture from a number of fields, including the seminal work in cogni-
tion of McNeill (1992, 1998, 2000), Kendon (1980, 2004) and more recently,
Enfield (2009). More directly, it draws on a growing field of social semiotics
which has over recent years extended beyond language to include modelling of
the semiotic modes of image (Kress & van Leeuwen 2006, Painter 2007), space
(Stenglin 2004, 2007, Martin & Stenglin 2006), typography (van Leeuwen 2006),
sound, music and voice quality (van Leeuwen 1999, McDonald, Chapter 5), facial
expression, gesture and position (Martinec 2000, 2001, 2002, 2004, Muntigl 2004),
and importantly to theorizing the relationships within and across different semi-
otic systems (Bednarek & Martin 2010, Painter & Martin, in press, Martinec &
Salway 2005, Royce & Bowcher 2007, Ventola, Charles & Kaltenbacher 2004).
While referencing the influences of studies in cognition and in social semi-
otics, it is important to note that each discipline approaches research on body
language from different premises and different theories, or interpretations
of theory. Studies in cognition, as articulated, for example, in Enfield (2009),
are primarily interested in cognitive processes of intention and interpretation.
Enfield explains the quest as understanding ‘how it is that interpreters may
derive meaning from composite utterances, or how we recognise others’ communicative
and informative intentions’ (2009:1). From a grounding in cognition,
Enfield critiques what he describes as a (neo-)Saussurean view of meaning
– ‘that a sign has meaning because it specifies a standing-for relation between a
signifier and a signified’. This interpretation is then negatively evaluated as a
view of signs as ‘static, arbitrary and abstract’ (2009:2). Enfield argues the need
to explain meaning as processes of interpretation of signs that are ‘dynamic,
motivated and concrete’. He suggests that the only alternative to ‘a static view
of meaning’ is available through Peircean semiotics (e.g. Peirce 1955) or
through pragmatics (e.g. Grice 1975, Levinson 1983).
In this chapter, I take a different perspective, based on a different conceptu-
alization of meaning and a different interpretation of Saussure than is consid-
ered in Enfield’s argument. Revisiting the notion of the sign in Saussurean
linguistics, an alternative interpretation to that in Enfield (2009) is provided by
Martin (1992, 2007) who argues with reference to Hjelmslev (1961) that the
domain of social semiotics is not a theorization of the relation between signifier
and signified, but is in fact the theorization of the delineating line – the space
between the two dimensions. The Saussurean contribution is to bind the signified
and signifier into the sign and then to theorize language as a system of signs.
As a system of signs, the potential to mean is in the relationship of signs to
other signs in the system. We mean in relation to what we could have meant but
did not (Martin 1992). Hjelmslev expands the meaning potential of this space
(of systems of signs) as a stratified system of signs, that is, as expression form
and content form. In Systemic Functional Linguistics (SFL) (Halliday 1978,
1992, Martin 1992, Martin & Rose 2007), the content form of language as a
system of signs is then further stratified as discourse semantics and lexicogrammar.
The relationship across these strata is one of abstraction. Martin (1992) extends
the system of signs further to a stratified context plane (context form) of genre
and register. This already rich theorization of sign systems acquires greater
explanatory power when the hierarchy of realization (briefly articulated above as
stratification) is complemented with the hierarchy of instantiation (Halliday 1991,
1992). Instantiation has to do with constraints on the generalized meaning
potential of the system through genres and registers to specific instantiations
in texts. The resultant theorization of the system of signs, of how we mean in
Body Language in Face-to-face Teaching 33
sub-stages of activities or tasks. And at a more micro level again, within such
phases a series of episodes of interaction can be identified by shifts in the
pattern of interaction in which the teacher and students are engaged. The data
were viewed multiple times as whole lessons enabling the researchers to track
shifts in patterns of body language as phases of lessons. Detailed transcriptions
were made of the verbiage and descriptions of gestures for selected phases and
sequences of phases.
In this chapter, I focus on phases of lessons in which the teacher is fronting
the class and engaged in episodes of instruction and explanation with some
teacher-coordinated discussion. The language is dominantly monologic and the
analyses focus on the teacher’s embodied meanings co-expressed with spoken
language. In all instances the teachers are intent on engaging students and guid-
ing them to a greater understanding of aspects of content (academic writing).
Image 2.1a Identifying actual wordings
Image 2.1b Identifying potential wordings

direction
  self: vector to self
  other(s)
    actual: vector directed to actual referent(s)
    potential
Figure 2.1a Partial system network for the body language of identification

Images 2.2 (a), (b), (c) Identifying with different degrees of specificity
in the data. The teacher points with her hand, index finger then little finger,
as she guides the students to attend to more general or more specific parts of
the text or wordings. The teacher’s body language functions here in relation to
another dimension of identification, that of specificity.
We can interpret the variation in the bodily resources evident in Images 2.2a,
2.2b and 2.2c as varying along a cline of specificity. The smallest body part that
enables the highest degree of specificity is the little finger. The system network
for Identification can therefore be extended as in Figure 2.1b.
There is yet a third dimension of identification enacted in body language,
that of specification as delineation. In this case a gesture is formed in such a way
as to indicate boundaries. It may, for example, include two hands extended with
palms vertical and facing inwards, as in instances where the teacher indicates
three students sitting side by side as the ones she wants to form one group. But
the delineation can also be enacted with boundaries formed by the bent finger
and thumb as in Image 2.3 where the teacher specifies the boundary of what
she wants students to attend to on a projected text.
The more complete system network of identification, represented in Figure 2.1c
can be interpreted as meaning that where identification is enacted in body
language, the gesture encodes direction to a referent (actual or potential) and
a degree of particularization, and +/− delineation of boundaries for the referent.
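
One way to make this reading of the network concrete is to treat each identifying gesture as one selection from each of the three systems. The Python sketch below is illustrative only. The class name, the feature labels and the three-step ranking of pointing body parts (hand, index finger, little finger, taken from the description of Images 2.2a to 2.2c) are assumptions made for the example, and pairing a single pointing body part with delineation is a simplification, since delineating gestures are described as using two fingers, hands or arms.

```python
from dataclasses import dataclass
from itertools import product

# Feature labels adapted from the network; the encoding itself is an assumption.
DIRECTIONS = ("self", "other:actual", "other:potential")
# Pointing body parts ordered from least to most specific (hypothetical ranking
# based on the hand / index finger / little finger description in the text).
POINTERS = ("hand", "index finger", "little finger")


@dataclass(frozen=True)
class IdentifyingGesture:
    """One selection expression: a choice from each simultaneous system."""

    direction: str
    pointer: str
    delineation: bool

    def __post_init__(self) -> None:
        if self.direction not in DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction!r}")
        if self.pointer not in POINTERS:
            raise ValueError(f"unknown pointer: {self.pointer!r}")

    @property
    def particularization(self) -> int:
        """Rank on the cline of specificity (0 = least specific)."""
        return POINTERS.index(self.pointer)


def selection_expressions() -> list[IdentifyingGesture]:
    """Enumerate every combination of direction, pointer and +/- delineation."""
    return [
        IdentifyingGesture(direction, pointer, delineation)
        for direction, pointer, delineation in product(
            DIRECTIONS, POINTERS, (False, True)
        )
    ]
```

Under these assumptions the network yields 18 selection expressions, and two gestures can be compared for specificity through their `particularization` rank.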
In the data in this study, teachers typically rely heavily on resources of body
language to construe meanings of identification. We could say that consider-
ably more meaning of identification is committed in the teachers’ body lan-
guage than is committed in their spoken language. A general verbal reference
to students as ‘you’, for example, might be committed with additional meaning
of specificity in particularization and delineation in gesture. Similarly, verbal
reference to a segment of text as ‘this’ could be further committed in body
Identification
  direction
    self: direction to self
    other(s)
      actual: directed to actual referent(s)
      potential: not directed to actual referent(s)
  specification
    particularization (+ to −): surface size of point
Figure 2.1b Partial system network for the body language of identification
Identification
  direction
    self: direction to self
    other(s)
      actual: directed to actual referent(s)
      potential: not directed to actual referent(s)
  specification
    particularization (+ to −): surface size of point
    delineation: 2 fingers / hands / arms marking boundaries
Figure 2.1c The more complete system network of identification
Images 2.4 (a), (b), (c) Cyclic movements towards board and back to class
The spoken language associated with one or other position (i.e. close to
students or close to board) was transcribed, identifying tone groups, using
Halliday’s analytical framework of 5 tones: 1 = falling (‘certain’); 2 = rising
(‘uncertain’); 3 = level (‘unfinished’); 4 = fall-rise (‘but’); 5 = rise-fall (‘sur-
prise’) (Halliday 1963 (2005)), as well as shifts in ideational and interpersonal
meaning choices. The epilinguistic body language characterizing each position
is also described. The extract of the analyses across one wavelength in Table 2.1
(from close to board to close to class to close to board) represents the kinds
of shifts in language and body language that are repeated across subsequent
shifts of position.
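
The five-tone coding described above can be sketched as a simple lookup, with a helper that reports which tone dominates a stretch of transcribed tone groups. This is a minimal illustration and not the procedure used in the study: the function name is hypothetical, and the two example sequences are invented for the purpose of the sketch and are not drawn from Table 2.1.

```python
from collections import Counter

# The five tones as glossed in the text: tone number -> (contour, gloss).
TONES = {
    1: ("falling", "certain"),
    2: ("rising", "uncertain"),
    3: ("level", "unfinished"),
    4: ("fall-rise", "but"),
    5: ("rise-fall", "surprise"),
}


def dominant_tone(tone_groups):
    """Return (tone number, gloss) for the most frequent tone in a sequence."""
    if not tone_groups:
        raise ValueError("no tone groups given")
    unknown = set(tone_groups) - set(TONES)
    if unknown:
        raise ValueError(f"unknown tone numbers: {sorted(unknown)}")
    tone, _count = Counter(tone_groups).most_common(1)[0]
    return tone, TONES[tone][1]


# Invented tone-group codings for two teacher positions (illustration only).
at_board = [1, 1, 3, 1]
near_class = [2, 2, 1, 2]
```

Applied to the invented sequences, `dominant_tone(at_board)` gives tone 1 and `dominant_tone(near_class)` gives tone 2, the kind of contrast between positions that the analysis looks for.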
An analysis of the teacher’s spoken language and body language associated
with each stage shows evidence of a shift in the level of actualization of key
ideational meanings associated with the content of the text that the students
are attending to. When the teacher is positioned at the board these meanings
are construed as actual rather than potential. She refers to meanings realized
in the discourse and verbalizes and gestures specific locations in the text. The
referents for her identifying gestures are dominantly parts of the text. Her
tone is predominantly one of certainty (tone 1: this is what is). Her role is domi-
nantly to inform. When the teacher is close to the class the ideational meanings
are construed as potential (to be elicited/negotiated) rather than actualized
and they are de-specified through resources of focus (some kind of). Her tone is
predominantly ‘uncertain’ (tone 2: what is it?). Her body language functions
dominantly to identify her students and herself, and also to potential meanings
Table 2.1 Patterns of spoken language and body language construing phases of interaction
Columns: Spoken language in tone groups | Teacher’s position | Body language patterns | Multi-semiotic phases of interaction
(in space, not on the text). Her role is dominantly to elicit and engage. The
teacher’s shifts in position in the room correspond to shifts in the meanings
she is orienting students to, from actual to potential, and from the written
text to the students as potential writers. The teacher’s shifts in position and
accompanying shifts in patterns of body language function to texture the
discourse and hence the teaching-learning activity into phases of interaction.
Each multimodally constructed phase makes salient different kinds of informa-
tion to be attended to by the students, and expresses different expectations
in terms of student engagement and participation. These cyclical shifts can
be interpreted as an aspect of the teacher’s scaffolding of students’ academic
writing as she opens up space for new possibilities and guides students towards
new instantiations of meaning.
While not analysed in this chapter, there are also movements of the body at
much smaller wavelengths mainly involving the fingers and hands, and small
movements of the head, which are synchronous with phonological rhythms
of stress and intonation, movements that constitute linguistic body language
according to Cléirigh (in Martin, Chapter 12) in contrast to the epilinguistic
body language analysed here. While Eisenstein (2008) does not differentiate
kinds of movements in functional terms, the analyses of body movement
presented here do correspond to his description, suggesting that
the small linguistic units (e.g., phrases) are synchronized with fast moving
body parts (e.g., hands and fingers) and large discourse units (e.g., topic
segments) are synchronised with slower moving body parts (e.g., the torso).
(. . .) posture shifts occur much more frequently at segment boundaries.
(Eisenstein 2008:29)
APPRAISAL
  engagement
    monogloss
    heterogloss
      expansion
      contraction
  attitude
    affect . . .
    appreciation . . .
    judgement . . .
  graduation
    force . . .
    focus . . .
(See Figure 2.2 for a skeletal model of appraisal and Martin & White 2005 for
a comprehensive explanation).
In analysing attitude in verbal discourse a distinction can be drawn between
attitude that is explicitly expressed or inscribed and attitude that is implied or
invoked (Martin & White 2005). Graduation provides one important means
by which attitudinal meanings can be invoked (Hood 2004, 2006, 2010). By
grading an objective (ideational) meaning the speaker gives a subjective slant
to that meaning, signalling for the meaning to be interpreted evaluatively.
So, for example, when a teacher says ‘you all need to listen to this’, both all
and need are instances of grading the force of what is said, implying though
not explicitly encoding the meaning of ‘this is very important’.
In analysing body language co-expressed with spoken language we can
consider these same dimensions of meaning (see Macken-Horarik 2004 on
appraisal analysis of images). Resources of facial expressions are not analysed
here but can, for example, function to express affect as happiness, sadness etc.
But the body can also play a role in invoking attitude through the grading
of meanings along a number of clines. Meanings can be graded in intensity
in the muscle tension employed in gestures accompanying the verbiage. The
intensification may or may not be co-expressed in the verbiage. Tension
realizing intensification can be expressed in various parts of the body and
is illustrated as tensed and relaxed hand muscles in Images 2.5a and 2.5b.
Image 2.5a + muscle tension expressing intensification (. . . the grammar rules)
Image 2.5b − muscle tension expressing lack of intensification (how did you . . . ?)
Image 2.6 Amplifying size: Invoking value (That’s what we’re talking about!)
GRADUATION
  force
    intensification: muscle tension
    quantification: size
  focus
Image 2.7a Expanding space for negotiation (How did you work out the answer?)
Image 2.7b Contracting space for negotiation (a draft isn’t a complete rewrite)
out the answer? A prone-hand gesture, in contrast, functions to close down space
for other voices, and typically accompanies verbal discourse that functions in
a corresponding way. In Image 2.7b the teacher is negating the possibility of
other positions, with phonological stress on the negation (underlined) as he
says a draft isn’t a complete rewrite. While a supine and prone distinction is most
often enacted with the hand, it may also be evident in the positioning of the
index finger in pointing gestures. So pointing to a student to invite them to
contribute to the discussion can be made with the inside of the index finger
facing up, while a direction to do something can be made with the inside of the
finger facing down.
The data also reveal an oscillating gesture, constructed as a movement back and forth
between supine and prone positions. This is
interpreted as expressing modality of possibility, and in terms of engagement,
as expanding heteroglossic space by entertaining other possible positions. In
these data it was always co-instantiated with a verbal expression of modality
(congruent or metaphoric), and the extent to which a possibility is entertained
as relatively likely or unlikely seems to depend on additional resources such as
facial expression or voice quality. In these data the oscillation is typically enacted
with the hands, but other parts of the body such as the head or even the upper
torso can also be used in the expression of this meaning. The representation of
these options as a system network is shown in Figure 2.4.
ENGAGEMENT
  heteroglossic contraction
  heteroglossic expansion
    entertain
    invite: supine body positions
Figure 2.4 A system network for expanding and contracting space for negotiation
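
The correspondences between hand orientation and engagement described in this section can be summarised as a small lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the mapping encodes just the three orientations discussed in the text (supine, prone and oscillating), not the full range of bodily resources that may realize these meanings.

```python
# Hand orientation -> (engagement system, option), following the text:
# supine invites, oscillating entertains, prone closes down space.
GESTURE_ENGAGEMENT = {
    "supine": ("heteroglossic expansion", "invite"),
    "oscillating": ("heteroglossic expansion", "entertain"),
    "prone": ("heteroglossic contraction", None),
}


def engagement(orientation: str) -> str:
    """Return the engagement meaning associated with a hand orientation."""
    try:
        system, option = GESTURE_ENGAGEMENT[orientation]
    except KeyError:
        raise ValueError(f"unknown orientation: {orientation!r}") from None
    return system if option is None else f"{system}: {option}"


def opens_space(orientation: str) -> bool:
    """True if the gesture expands space for other voices."""
    return engagement(orientation).startswith("heteroglossic expansion")
```

A lookup of this kind captures only the default reading of each orientation; as noted above, how likely an entertained possibility is taken to be depends on further resources such as facial expression or voice quality.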
The frequency with which the teachers use these supine, prone and oscillat-
ing gestures varies from one stage of a lesson or pedagogic activity to another.
The more frequent use of elicitation gestures with supine hand position characterizes
phases of lessons in which teachers coordinate discussion. They function
in this context to open up space for students to contribute. The extent to
which individual teachers engage in dialogue with students is also no doubt a
reflection of a more general pedagogic model (Bourne 2003). There is an
urgent need for more research into the ways in which interpersonal epilinguis-
tic body language functions in relation to teaching and learning in face-to-face
classrooms, and in turn into the impact a lack of access to embodied meanings
might have in computer-mediated online learning.
The instances of body language described above highlight the ways in which
metafunctional meanings can be co-instantiated in both speech and body
language, albeit in ways that commit meaning potential to a greater or lesser
degree. It is also noted that the metafunctional load can be distributed differ-
ently across modes, so that meaning in relation to one metafunction may be
instantiated in gesture but not the verbiage, and vice versa, and in any one
instance of body language there may be fused different kinds of metafunctional
meanings. Pointing gestures, for example, doing the work of identification,
readily fuse with other gestural expressions of interpersonal meaning. A point-
ing gesture identifying a participant in the discourse can do so in the context
of an elicitation with a supine hand position or in the context of a command
with a prone-hand position. In Image 2.5b, for example, the gesture integrates
a meaning of elicitation together with a meaning of identification of the
intended interactant in the directionality of the fingers of the hand. Muscle
Conclusion
Acknowledgements
I would like to thank Linda, Matt and Juliana who together with their students
generously allowed me to film their classrooms, and Insearch Language Centre
for its ongoing support for and cooperation in research at a time when more
and more barriers are being constructed for research in classrooms in Australia.
I also thank my research assistant, Catherine Baird, for her technical know-how
and insight. This research was undertaken with the assistance of a grant from
the University of Technology, Sydney.
References
Bednarek, M. & Martin, J.R. (Eds) (2010). New discourse on language: Functional
perspectives on multimodality, identity, and affiliation. London: Continuum.
Bourne, J. (2003). Vertical discourse: The role of the teacher in the transmission
and acquisition of decontextualised knowledge. European Educational Research
Journal, 2(4), 496–521.
Brady, N.C., McLean, J.E., McLean, L.K. & Johnston, S. (1995). Initiation and repair
of intentional communication acts by adults with severe to profound cognitive
disabilities. Journal of Speech and Hearing Research, 38, 1334–1348.
Chafai, N.E., Pelachaud, C. & Pele, D. (2007). A case study of gesture expressivity.
Language resources and evaluation, 41, 341–365.
Christie, F. (1997). Curriculum macrogenres as forms of initiation into a culture.
In F. Christie & J.R. Martin (Eds), Genres and institutions: Social processes in the
workplace and school (pp. 134–160). London: Cassell.
Efron, D. (1972). Gesture, race, and culture. The Hague, Netherlands: Mouton de
Gruyter.
Eisenstein, J. (2008). Gesture in automatic discourse processing. PhD Thesis,
Massachusetts Institute of Technology.
Enfield, N. (2009). The anatomy of meaning: Speech, gesture, and composite utterances.
Cambridge: Cambridge University Press.
Flewitt, R. (2006). Using video to investigate pre-school classroom interaction:
Education research assumptions and methodological practices. Visual Communi-
cation, 5 (1), 25–50.
Gregory, M. & Malcolm, K. (1995). Generic situation and discourse phase: An
approach to the analysis of children’s talk. In Jin Soon Cha (Ed.), Before and towards
communication linguistics: Essays by Michael Gregory and Associates (pp. 154–195).
Seoul, Korea: Sookmyung Women’s University.
Grice, H.P. (1975). Logic and conversation. In P. Cole & J.L. Morgan (Eds), Syntax
and semantics, Vol III, Speech acts (pp. 41–58). New York: Seminar Press.
Gullberg, M. (1999). Gestures in spatial descriptions. Lund University Department of
Linguistics Working Papers, 47, 87–97.
Halliday, M.A.K. (1963). The tone of English. Archivum Linguisticum, 15(1), 1–28.
Republished in 2005 as Chapter 8: The tone of English. In J. Webster (Ed.),
Collected Works of M.A.K. Halliday, Volume 7. London: Continuum.
Halliday, M.A.K. (1978). Language as social semiotic: The social interpretation of language
and meaning. London: Edward Arnold.
Halliday, M.A.K. (1991). The notion of context in language education. In T. Le &
M. McCausland (Eds), Language education: Interaction and development: Proceedings
of the international conference (pp. 1–26). Ho Chi Minh City, Vietnam.
Halliday, M.A.K. (1992). The act of meaning. In J.E. Alatis (Ed.), Georgetown Round
Table on languages and linguistics: Language, communication and social meaning
(pp. 7–21). Washington, D.C.: Georgetown University Press. [Republished in
Martin, J.R. & White, P.R.R. (2005). The language of evaluation: Appraisal in English.
London: Palgrave Macmillan.
Martinec, R. (2000). Rhythm in multimodal texts. Leonardo, 33(4), 289–297.
Martinec, R. (2001). Interpersonal resources in action. Semiotica, 135(1/4), 117–145.
Martinec, R. (2002). Rhythmic hierarchy in monologue and dialogue. Functions of
language, 9(1), 39–59.
Martinec, R. (2004). Gestures that co-occur with speech as a systematic resource:
The realization of experiential meaning in indexes. Social semiotics, 14(2),
193–213.
Martinec, R. & Salway, A. (2005). A system for image–text relations in new (and
old) media. Visual Communication, 4(3), 337–371.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago, IL
and London: University of Chicago Press.
McNeill, D. (1998). Speech and gesture integration. In J.M. Iverson & S. Goldin-Meadow
(Eds), The nature and functions of gesture in children’s communication, 79
(pp. 11–27). San Francisco, CA: Jossey-Bass publishers.
McNeill, D. (Ed.) (2000). Language and gesture: Window into thought and action.
Cambridge: Cambridge University Press.
Muntigl, P. (2004). Modelling multiple semiotic systems: The case of gesture and
speech. In E. Ventola, C. Charles & M. Kaltenbacher (Eds), Perspectives on
multimodality (pp. 31–50). Amsterdam: John Benjamins.
Painter, C. (2007). Children’s picture book narratives: Reading sequences of
images. In A. McCabe, M. O’Donnell & R. Whittaker (Eds), Advances in language
and communication (pp. 40–59). London: Continuum.
Painter, C. & Martin, J.R. (In press). Intermodal complementarity: Modelling
affordances across verbiage and image in children’s picture books. Ilha do
Desterro: A Journal of English Language, Literatures in English and Cultural Studies
(Special Issue on Multimodality).
Peirce, C.S. (1955). Philosophical writings of Peirce. New York: Dover Publications.
Royce, T.D. & Bowcher, W.L. (Eds) (2007). New directions in the analysis of multimodal
discourse. Mahwah, NJ: Lawrence Erlbaum Associates.
Stenglin, M. (2004). Packaging curiosities: Towards a grammar of three-dimensional
space. Unpublished PhD Thesis, Department of Linguistics, University of
Sydney.
Stenglin, M. (2007). Making art accessible: Opening up a whole new world. Visual
Communication, 6(2), 202–213.
Ventola, E., Charles, C., & Kaltenbacher, M. (Eds) (2004). Perspectives on multi-
modality. Amsterdam: John Benjamins.
van Leeuwen, T. (1999). Speech, music, sound. London: Macmillan.
van Leeuwen, T. (2006). Towards a semiotics of typography. Information Design
Journal, 14(2), 139–155.
Chapter 3
Introduction
of SFL, there is no research, other than my own (see Dreyfus 2007, 2008) that
focuses on the communication of people who do not use speech as their main
form of communication and who have intellectual disabilities. My work therefore
begins, in a very tentative way, to expand into this area.
The purpose of this chapter is to highlight the complexities of both studying
non-verbal multimodal communication in general and using SFL theory for
this kind of study. The first section of the chapter examines the nature of the
communication environment for a non-verbal multimodal communicator with
an intellectual disability. The second section is a brief description of my study
which examined this kind of communication. The third section expands on
the issues arising from using a normative theory for the study.
altogether in one sign (Johnston 1996, Dreyfus 2007). For example, in the
case of the boy reported on in this chapter, grabbing the driver’s sleeve when
travelling in the car, means ‘Where are we going?’ while flicking the door
handle means ‘Can I get out?’
In summary, the semiotic umwelt of the non-verbal multimodal commun-
ication of a person with an intellectual disability encompasses:
The chapter will now turn to a closer examination of the study itself.
The Study
Modes of Expression
Bodhi’s modes of expression have been clustered into six broad types based
on the categories from another study of the communication of non-verbal
children with disabilities (see Light et al. 1985). These modes are vocalizations,
gestures, materials, actions, behaviours and eye gaze. However, the modes are
rarely used in isolation; that is to say, Bodhi generally does
not communicate using only one mode at one time, but combines different
modes in the one meaning-making move. For example, he may vocalize while
pointing to something and looking at the communication partner. Each of
the different modes carries different components of meaning. Together, these
modes of expression give Bodhi a limited but nevertheless somewhat func-
tional system of communication, with an informed or interested communication
partner (Dreyfus 2007).
[Figure: Bodhi’s modes of expression: vocalizations (word approximations, e.g. /dad/, /b√b/ = (mum), /√/ = (his brothers); sounds that are not word approximations, /i/, /U/, /√/; expressions of affect: laughter, crying); pointing (distal, contact); actions (getting something, going to something, leading someone somewhere); eye gaze (to someone, to something).]
While the analyses conducted in order to answer the questions are by no means
exhaustive, they were still able to capture Bodhi’s meaning-making abilities
with different communication partners. In brief, the study showed that Bodhi is
a very social being, and an active communicator who initiated more than half
the conversation exchanges in the data. However, his meaning-making abilities
are restricted with a limited number of experiential meanings and very few
textual meanings expressed. In terms of the semiotic space Bodhi occupies, it is
predominantly one of the concrete world, and not of the abstract world. With
regard to textual meanings, Bodhi only communicates peaks of salience in his
moves, which are constituted by new information. The rest of the meaning must
be co-constructed by the communication partner using sources outside the
text. Interpersonally, Bodhi deploys three of the four basic speech functions,
although their expression is undifferentiated and the communication partner
must work together with Bodhi to determine which speech function he means.
To paraphrase Halliday (1975), Bodhi does not have an open-ended system
with massive potential that ‘can create indefinitely many meanings and indefi-
nitely many sentences and clauses and phrases and words for the expression of
those meanings’ (pp. 35–36). The system is limited by both his lack of a lexico-
grammar and his severe intellectual disability, but the way these two interact
is hard to unravel and was not the task of the study. However, as the study
showed, not having a lexicogrammar does not mean that Bodhi cannot make
meaning. Indeed, he uses his multimodes to make a variety of meanings, albeit
limited, in a process of joint construction with his communication partners.
For a detailed description of the findings of the study, see Dreyfus (2007).
Bodhi’s laughter is a prime example of this. When Bodhi laughs, he may not
intentionally be communicating that he is happy, but the communication partner
responds to his laughter as a move communicating positive affect. The laughter
therefore constitutes a move in the conversation, and was treated as such in
the analysis. The excerpt in Example 1 illustrates this:
Mark: . . . Tomorrow Dad will take you to Saturplay and Bruce will come
in the car too. We’ll have Bruce in the car, with a banjo and a double
bass. Yes . . .
Bodhi:3 /1 ye´/ (smile)
Mark: Yeah Bruce
Example 1
But it is not just whether Bodhi’s behaviours are semiotic or not that is important;
once it is decided that a behaviour is semiotic, there then needs to be some
classification of that behaviour in terms of its semiosis. In other words, there
can be degrees of semiosis, where some semiotic behaviours are more abstractly
semiotic than others. That is to say, some of Bodhi’s moves are stratally simpler,
being much like a child’s protolanguage in that they are simple content expres-
sion pairs where there is no intermediate (abstract) stratum of meaning (i.e.
a lexicogrammar). These include modes such as behaviour and actions. For
example, he lies on the floor and kicks the door to communicate that
he wants to go out. However, there are also moves that display more stratal
complexity, having some kind of abstract layer sandwiched between the semantic
and expression planes, such as signs and pictorials/pictographs. An example
of this is when Bodhi makes the sign for ‘toilet’ (which is the index finger of
one hand contact pointing the palm of the other hand) to communicate he
wants to go or is going to go to the toilet.
Straddling the AAC view that all behaviour is communication and the SFL
view that behaviour can be divided into semiotic and non-semiotic behaviour
raises questions for the study of non-verbal multimodal communication, such
as what constitutes a language or communication system, what the boundaries
of these are, and how fixed or how flexible they can be. Do we classify these
in developmental terms such as protolanguage or adult language? And does
this kind of classification help with understanding and describing them? (See
Dreyfus 2007 for a more detailed discussion of the issues associated with the
classification of Bodhi’s communication system.)
In order to capture this variance in type or degree of semiosis, a cline is
posited along the lines of both Halliday (1985) and Cléirigh (in preparation;
see also Martin Chapter 12) (see Figure 3.2). At one end of the cline are the
modes of expression that are
content expression pairs, and more like the primary semiotic of protolanguage.
At the opposite end are the most symbolic or higher order modes, which, of
course, means the most linguistic modes.
In order to classify and analyse these different types of semiotic behaviour,
their function needs to be determined. While within verbal language it is the
Example 2
As researchers within the field of AAC articulate, any study of non-verbal multi-
modal communication needs to be able to take into account the contributions
of the communication partner. Thus, a within-clause perspective needs to be
supplemented by a beyond-clause or discourse semantic perspective. Exchange
Structure Analysis (after Coulthard & Montgomery 1981, Berry 1981a, 1981b,
1981c, Martin 1992, Ventola 1987, 1988) offers this perspective and was there-
fore used to determine the function of certain moves by looking at the move
in its dialogistic context. It is the subsequent communication partner’s moves
that can provide clues to the function of the multimodal move when the
function of the move cannot be determined from within the move itself.
This means the function of the non-verbal multimodal move can be deter-
mined retrospectively. In the exchange above, it is not until the final move,
where Bodhi giggles rather than replays, showing his satisfaction with Dodo’s
response, that we are able to understand that Bodhi’s initial move meant he
likes the bowl.
The discourse semantic (above clause) perspective was used in combination
with a within-turn or clause perspective, in order to attempt to capture the
instantiated meanings within each of Bodhi’s moves. All Bodhi’s moves were
plotted onto a table that could reflect the metafunctional perspective. As
shown in Table 3.1, there is a move of Bodhi’s that comes from an exchange
where Bodhi is travelling in the car with his father. They are on their way to
the chemist and Bodhi, who had an obsession with flushing toilets, asks if
there are toilets there (see Example 3):
Example 3
The top left-hand corner with the B (standing for ‘Bodhi’), records the number
of the turn in terms of where it is located in the transcript. The column below
that lists all the possible modes of expression. The next column, the instance
column, records which modes Bodhi actually uses in the instance of com-
munication that is his turn. The following three columns are for recording the
metafunctional meanings associated with the modes of expression. The bottom
line ‘gloss’ refers to my interpretation of what Bodhi has tried to communicate.
For those of Bodhi’s moves that were able to be analysed for metafunctional
meanings, the columns were filled in. For those that weren’t, the columns were
left empty.
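The layout of the analysis table described above can be sketched as a small data structure. The following Python sketch is illustrative only and is not the notation used in the study: the class name MoveAnalysis and its field names are assumptions introduced here, the turn number is a placeholder, and treating the sign ‘toilet’ as a gesture is likewise an assumption. The six modes are those listed earlier in the chapter, and the sample values paraphrase the car exchange in which Bodhi asks whether there are toilets at the chemist.

```python
from dataclasses import dataclass, field
from typing import Optional

# The six modes of expression named in the chapter.
MODES = ("vocalizations", "gestures", "materials", "actions", "behaviours", "eye gaze")


@dataclass
class MoveAnalysis:
    """One analysis table for a single move (hypothetical layout)."""
    turn: int                                     # position of the turn in the transcript
    instance: dict = field(default_factory=dict)  # mode -> what was actually observed
    interpersonal: Optional[str] = None           # metafunctional columns; None = left empty
    experiential: Optional[str] = None
    textual: Optional[str] = None
    gloss: Optional[str] = None                   # analyst's interpretation of the move

    def __post_init__(self):
        # Only the six listed modes may appear in the instance column.
        unknown = set(self.instance) - set(MODES)
        if unknown:
            raise ValueError(f"unknown mode(s): {sorted(unknown)}")

    def unused_modes(self):
        """Modes that were possible but not used in this move."""
        return [m for m in MODES if m not in self.instance]

    def is_analysed(self):
        """True if at least one metafunctional column could be filled in."""
        return any(v is not None
                   for v in (self.interpersonal, self.experiential, self.textual))


# Illustrative values only; turn=1 is a placeholder, not the transcript number.
example = MoveAnalysis(
    turn=1,
    instance={"vocalizations": "sounds with tone", "gestures": "sign 'toilet'"},
    interpersonal="demand",
    experiential="toilet",
    textual="New",
    gloss="Are there toilets there?",
)
```

A move whose columns could not be filled in is simply a record with empty metafunctional fields, so is_analysed() separates the two groups of moves mentioned above.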
no matter how frequently one may remind the reader that the gloss is
no substitute for the sign, if there is nothing in the text that represents
the sign per se (be it picture or script) the glossing may take on a life of
its own. (p. 6)
If we consider both Bodhi’s move and the gloss, it can be seen that the sounds
and tone express some interpersonal content (some kind of demand); the
sign ‘toilet’ expresses some experiential content; and together, they express
a textual component that can be called New. One of the findings of this study
was that Bodhi typically only expresses the New. Everything else must be gleaned
from the context, where the communication partner must work together with
Bodhi to jointly construct the meaning. However, having said that, for many
moves I was not able to fill this table in because there was not enough informa-
tion from Bodhi’s move itself or from the move in its dialogistic context.
Distinguishing between different types of services available for Bodhi then gives
rise to a different set of options within the system network of move in dialogue
(see Figure 3.4).
[Network diagram: move (initiate / respond); role (give / demand); commodity (goods-&-services: other, linguistic service (articulation); information), with the markers I and T attached to the linked options.]
Figure 3.4 Bodhi’s system of speech functions (after Halliday & Matthiessen
2004:108)
Key:
I = if, T = then
The case I: initiate + give + information, T: demand + linguistic service (articulation) means that if Bodhi
initiates by giving information, then he also demands the linguistic service of articulation of that
information.
The case I: linguistic service (articulation), T: demand means that if Bodhi expresses a linguistic service,
it is always as a demand.
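The two if/then conditions in the key can be sketched as simple checks over feature selections. This Python sketch rests on assumptions made only for illustration: it treats move, role and commodity as three freely combining systems, which may not match the exact wiring of the network, and the function names are hypothetical.

```python
from itertools import product

MOVE = ("initiate", "respond")
ROLE = ("give", "demand")
COMMODITY = ("goods-&-services", "linguistic service (articulation)", "information")

ARTICULATION = "linguistic service (articulation)"


def is_valid(move, role, commodity):
    """Second condition in the key: a linguistic service is always a demand."""
    if commodity == ARTICULATION and role != "demand":
        return False
    return True


def implied_demands(move, role, commodity):
    """First condition in the key: initiating by giving information
    also demands the linguistic service of articulation."""
    if (move, role, commodity) == ("initiate", "give", "information"):
        return [("demand", ARTICULATION)]
    return []


def valid_selections():
    """All (move, role, commodity) selections allowed under the second condition."""
    return [s for s in product(MOVE, ROLE, COMMODITY) if is_valid(*s)]
```

Under these assumptions, the first condition adds a simultaneous demand to a move rather than excluding it, which is why it is modelled separately from the validity check.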
Conclusion
This chapter has attempted to highlight some of the issues arising from the
study of the non-verbal multimodal communication of a child with an intellec-
tual disability using systemic functional linguistic theory. The chapter explains
how the differing nature of this kind of communication gives rise to a different
communication environment – a transmodal environment where meaning
is jointly negotiated and a variety of semiotic behaviour is brought into focus.
The chapter also discusses how the boundaries between semiotic and non-
semiotic behaviour are at times difficult to determine. We are at the edges of
the theory in terms of being able to accurately describe this kind of commun-
ication using current networks. As such, an expansion of the speech function
network is posited to capture the move that provides information while simul-
taneously demanding a linguistic service from the communication partner.
Expanding the range of moves possible also has ramifications for the types of
moves possible within the Exchange Structure model.
Notes
1
This is not to say that there is no research into other sorts of non-verbal multi-
modal communication that uses systemic functional linguistic theory. Other such
research includes studies of the communication systems of the primate species
Pan paniscus in Benson and Greaves (2005) and Knight (2006 and Chapter 1).
2
Trevor Johnston’s (1991, 1992 and 1996) work applying SFL theory to Auslan does
focus on a different kind of language – the non-verbal language of the deaf;
however, this is again an intellectually able population, and as Johnston has noted,
while sign languages have the added dimension of the visual-gestural medium,
they are comparable to spoken languages in that they are seen to be tristratal
languages, even if they are more iconic than spoken languages.
3
Bodhi’s vocalizations are written phonetically. The numbers in front of Bodhi’s
vocalizations reflect Halliday’s (1994) descriptions of the tones in spoken English.
4
Contact pointing refers to touching things when pointing to them (what we might
call tapping). This is contrasted with distal pointing, which refers to pointing
to things without touching them – these things are usually more than 6 inches
away (Brady et al. 1995).
References
Armstrong, E. (1991). The potential of cohesion analysis in the analysis and
treatment of aphasic discourse. Clinical Linguistics and Phonetics, 5(1), 39–51.
Armstrong, E. (1992). Clause complex relations in aphasic discourse: A longitudinal
case study. Journal of Neurolinguistics, 7(4), 261–275.
Armstrong, E. (1993). Aphasia rehabilitation: A sociolinguistic perspective. In
M.M. Forbes (Ed.), Aphasia treatment: World perspectives (pp. 263–290). San Diego,
CA: Singular.
Armstrong, E. (2001). Connecting lexical patterns of verb usage with discourse
meanings in aphasia. Aphasiology, 15(10/11), 1029–1045.
Benson, J.D. & Greaves, W.S. (Eds) (2005). Functional dimensions of ape-human
discourse. London and Oakville: Equinox.
Berry, M. (1981a). Systemic linguistics and discourse analysis: A multi-layered
approach to exchange structure. In M. Coulthard and M. Montgomery (Eds),
Studies in discourse analysis. London: Routledge and Kegan Paul.
Berry, M. (1981b). Polarity, ellipticity and propositional development, their relevance
to the well-formedness of an exchange. Nottingham Linguistic Circular, 10(1),
36–63.
Berry, M. (1981c). Towards layers of exchange structures for directive exchanges.
Network, 2, 22–32.
Brady, N.C., McLean, J.E. & McLean, L.K. (1995). Initiation and repair of intentional
communication acts by adults with severe to profound cognitive disabilities.
Journal of Speech and Hearing Research, 38, 1334–1348.
Cléirigh, C. (in preparation) The Life of Meaning.
Coulthard, M. & Montgomery, M. (1981). Studies in discourse analysis. London:
Routledge and Kegan Paul.
Dreyfus, S. (2007). When there is no speech: A case study of the nonverbal multi-
modal communication of a child with an intellectual disability. Unpublished
doctoral study, University of Wollongong.
Dreyfus, S. (2008). A systemic functional approach to misunderstandings. Bridging
Discourses. Online Proceedings of the Australian Systemic Functional Linguistics
Association conference, University of Wollongong.
Fine, J. (1994). How language works: Cohesion in normal and nonstandard communication.
Norwood, NJ: Ablex Publishing Company.
Fine, J. (2004). Language in psychiatry: A handbook of clinical practice. London: Equinox.
Halliday, M.A.K. (1975). Learning how to mean. London: Edward Arnold.
Halliday, M.A.K. (1978). Language as social semiotic. London: Edward Arnold.
Halliday, M.A.K. (1985). Spoken and written language. Geelong: Deakin University
Press.
Introduction
Space provides the setting in which we conduct all the activities that are part
of our ongoing lives. These activities constitute our evolving ontogenesis and
include working, learning, shopping, eating and resting. In western cultures,
many of these activities occur in a built environment:
First houses are the grounds of our first experiences. Crawling about at floor
level, room by room, we discover laws that we apply later to the world at large.
And who is to say if our notions of space and dimension are not determined
for all time by what we encounter there, in the particular relationship of
living rooms to attic and cellar (or, in my case, under-the-house), of inner
rooms to the verandas that are open boundaries. Each house has its own
topography, its own lore: negotiable borders, spaces open or closed. (Malouf
1985:8–9)
tropics and the cooler southern parts of Australia. These two locations have been
chosen because these choices for housing offer strong contrasts, and in doing
so, provide illuminating insights into the important role domestic space plays
in all of our lives. The final section of this chapter moves beyond space gram-
mar, to explore some of the other potential applications of the tools we will be
discussing as well as articulating some of the remaining challenges.
1. visual images (Kress & van Leeuwen 1990, 1996/2006, O’Toole 1994)
2. movement (Martinec 1997, 1998a, 1998b, 2000a, 2000b)
3. speech, music, sound (van Leeuwen 1991, 1999)
4. architecture/three-dimensional space (O’Toole 1994, 2004, Kress & van
Leeuwen 1990, 1996/2006, van Leeuwen 1998, 2005, Stenglin 2002, 2004,
2007, 2008a, 2008b, 2009a, 2009b, Martin & Stenglin 2007, Ravelli & Stenglin
2008).
[Figure: system network for Practical function, with the options private / public; industrial, commercial, agricultural, governmental, educational, medical, cultural, religious, residential; domestic / utility.]
certainly one that is used by architects who focus strongly on form and function
in their work.
From an SF perspective, it is possible to use a system network to represent
the varying functions of space that O’Toole has identified (see Figure 4.1).
In applying this network to domestic Australian architecture, it is apparent
that we are concerned with exploring spaces that are private, residential and
domestic. However, O’Toole’s public-private distinction is more complex than
it initially appears, because the degree of privacy or public exposure
tends to vary considerably from space to space. For example, the most
private domestic spaces tend to be our bedrooms while the domestic spaces
closest to the public end of the scale are the glass open-plan living areas that
are popular in contemporary architecture and that passers-by can look into.
These appear to be public as they are exposed to the external gaze but access
and entry to them remain very much private and restricted. So, a more accur-
ate representation would be semi-public. To accommodate such complexity,
O’Toole’s ‘public-private’ dimension could be represented as a sliding scale
rather than a discrete set of choices (see Figure 4.2).
Using Martin’s stratified model of context we can also project experiential
meaning contextually, which yields a focus on field (Martin 1992:536). A field-
focus means that experiential meanings can either have an object orientation
An Evolving Cartography of a Visceral Semiotic 77
[Figure: sliding scale running from private to public.]
along a central hallway. Orbital structures, on the other hand, are organized
around a nucleus-satellite configuration (Figure 4.4). A typical example would
be a house with a central courtyard functioning as the nucleus with all the
rooms located around that courtyard.
[Figure: the Binding scale, running from Too Bound (smothered, too restricted) through Bound (comfortable, safe) and Unbound (free, open) to Too Unbound (vulnerable, exposed).]
feel Too Unbound overwhelm their occupants by towering over them and not
providing enough enclosure such as large public buildings and cathedrals.
The two other choices for emotion along the Binding scale are the Bound
and the Unbound dimensions (Figure 4.6). Both are concerned with spatial
security and located in the centre of the scale. Bound spaces are womb-like
spaces that make users feel safe and secure while Unbound spaces also main-
tain a relationship of security with users by lessening the degree of spatial
enclosure and making them feel freer and less enclosed.
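The four choices along the Binding scale can be sketched as an ordered set of values. This Python sketch is a simplification offered under stated assumptions: the scale is a cline, so reducing it to four integer-coded points is an assumption made for illustration, and the names Binding, FEELINGS, is_secure and less_enclosed are hypothetical. The feelings attached to each point are the labels given on the scale.

```python
from enum import IntEnum


class Binding(IntEnum):
    """Four points on the Binding scale, ordered from most to least enclosed."""
    TOO_BOUND = 0
    BOUND = 1
    UNBOUND = 2
    TOO_UNBOUND = 3


# Feelings associated with each point, as labelled on the scale.
FEELINGS = {
    Binding.TOO_BOUND: ("smothered", "too restricted"),
    Binding.BOUND: ("comfortable", "safe"),
    Binding.UNBOUND: ("free", "open"),
    Binding.TOO_UNBOUND: ("vulnerable", "exposed"),
}


def is_secure(choice: Binding) -> bool:
    """Only the two central choices maintain a relationship of security."""
    return choice in (Binding.BOUND, Binding.UNBOUND)


def less_enclosed(a: Binding, b: Binding) -> bool:
    """True if choice a involves less spatial enclosure than choice b."""
    return a > b
```

Ordering the values makes it possible to say that one space is less enclosed than another, while is_secure captures the point that security sits in the centre of the scale and insecurity at both extremes.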
Bonding is also concerned with interpersonal meaning in space but focuses
on affiliation rather than in/security. It is a multidimensional resource con-
cerned with aligning people into groups with shared dispositions. It explores
ways of building togetherness, inclusiveness and solidarity through connection.
There are at least four tools that materialize Bonding in the third dimension:
Bonding icons, the attitudinal re/alignment of people around shared attitudes,
classification and framing (Bernstein 1975).
Bonding icons are emblems of social belonging with the potential to rally
people around shared values. They distil, crystallize and fuse interpersonal
attitudes to ideational meanings. They include buildings (e.g. the Sydney Opera
House), leaders (e.g. Nelson Mandela), songs (e.g. the Maori haka), symbols
(e.g. Olympic rings) as well as medals, badges, trophies and even paintings (e.g.
the Mona Lisa, which is a Bonding icon for the Louvre). They not only accrue
values but radiate out for communities to rally around or reject (Stenglin 2008b,
2009a, 2009b, Martin & Stenglin 2007, Ravelli & Stenglin 2008).
[Figure: system network with the option labels rhyme, contrast, segregation, separation, disconnection, lockable, sealed, permeable, gaps, total, partial, auditory, visual, permanent, temporary.]
The replication of a familiar aesthetic style also meant that the cottages
were able to function as Bonding icons in an alien environment through the
evocation of strong positive attitudes. These attitudes included positive appre-
ciation of the Georgian aesthetic; feelings of positive affect, especially familiar-
ity, security and happiness alongside positive judgements of, and confidence in,
the British empire which was the home of the Georgian cottage. By functioning
as Bonding icons in this way, the early cottages were able to palpably rally
the immigrants around shared ideals of ‘home and hearth’ and sustain their
connection to their country of origin to which they still emotionally belonged.
The early settlers living in Australia were not alone in clinging to such
Bonding icons. According to architectural writer Balwant Saini:
How glad he was to leave the tiny, sun-baked box that till now had been his
home . . . It had neither blind nor shutter; and, on entering it of a summer
midday, it had sometimes struck hotter than outside. (Richardson 1982:236)
People living in such houses clearly felt Too Bound, especially in the oppressive
heat of summer. This is important as it points to the fact that the design of
space is not a world-wide given in respect of what gives comfort and security
to occupants. There is clearly cultural and climatic variation.
In practical terms, verandas provided shade, shelter and protection from the
elements. They also moderated the heat of the sun on the walls (Archer 1998).
Initially a defence against the Australian climate, together with other choices
for Binding such as shutters, thick walls and heavily lined curtains, they were
used to deliberately exclude the sun as well as the alien Australian landscape
from intruding into nostalgically furnished domestic interiors (Drew 1992). In
this way Australian homes in the south became strongly Bound fortresses that
shut out the threatening and unfamiliar elements, that is, the climate and the
landscape.
Let us now explore how classification and framing impacted on social
interaction in these Bound fortresses. As we have already seen, most homes
in the south were initially built with two rooms. This meant that the initial
classification of the spaces was weak. However, as soon as the materials and
financial resources were available, additional rooms were added. Such expan-
sion strengthened the classification of the house. Those with the financial
means expanded into four and six room houses maintaining the English
cottage plan as the model. The ideal was strong classification: one room per
person and one room for cooking, a separate room for dining, a separate room
for reading, sleeping in and so forth (Boyd 1952:12).
In such strongly classified and strongly framed spaces, furthermore, people
have privacy. Privacy was important in the early cottages. The strong valuation
of privacy actually began two centuries before Australia was occupied, when
Queen Elizabeth of England proclaimed the principle of a private house for
each family (Boyd 1952). Privacy is thus a relatively recent phenomenon in
human history, and one that was transported to the Australian continent with
British occupation/invasion.
The Tropics
Houses built in the tropical regions of Australia also followed the Georgian
English cottage plan. Initially they fought the land in the same ways as their
counterparts in the south. Their interiors were also strongly classified and
framed into 4 to 6 compartments for cooking, sleeping, dining and enter-
taining. They also quickly added verandas to provide them with much-needed
shelter and shade. Verandas, moreover, typically covered all four sides of the
house and not just the front as they did in the south.
However, in the tropics, roofs were made of corrugated galvanized iron as it
was light and could be transported long distances at low costs. Metal, however,
is a poor choice of material as it is a poor insulator and good heat conductor.
So roofs heated up quickly and this heated everything below them. Tempera-
tures inside these houses were commonly double the temperatures outside.
Such soaring heat made living conditions intolerable. So the interpersonal
relationship set up with the occupants was one of intense smothering. In
response, windows became larger to allow the breeze inside and people began
using verandas more and more to provide respite from the heat and access to
uninterrupted breezes.
Thermal comfort is clearly a very important part of Binding or feeling secure
in a space. In fact, the hotter the climate, the more the veranda was used.
Not surprisingly in the north of WA, Queensland and the NT the veranda
became more than a shelter to the rooms – it became the main living area. In
these regions, the width of the veranda physically expanded while the size of
the rooms contracted. Soon the veranda was used not only for dining and
sitting, but also for sleeping. In the dry season, verandas were accordingly
furnished with tables, chairs, pictures and vases. The need for privacy was
served by the interior of the house, where Bound rooms (securely enclosed and
strongly framed) were used for undressing.
In terms of field and activity orientation, dining, sitting and sleeping were
not the only possibilities for action on the veranda. Verandas had many other
uses as well:
Most family life took place on the verandah, which functioned as a dining
room, a recreation centre, playground for the young on wet or scorching
days, store room and vantage point for surveying the scenery or passers-by.
Suspended from its rafters were the meat safe, the water bag, the clothesline
in bad weather, swings for the children, bird cages, the Christmas hams and
numerous pieces of wire or hooks on which to hang hats, bags and overcoats.
At night it was the coolest place to sleep, with a mosquito net carefully tucked
in for protection from the abundant tropical insect life. (Archer 1998:27)
In other words, the veranda was the space in which all the activities of daily life
took place (Drew 1992).
From the perspective of Bonding icons, these houses represent an interesting
development. Known as ‘the Queenslander’, they were one of the first vernacu-
lar housing styles to develop here from a western perspective and came to be
a Bonding icon for the entire colony – one could even argue that they still are.
In fact, the Queenslander is characterized by two features: a wide, all-encompassing
veranda and its elevation on stilts to increase airflow. Symbolically,
they point to the fact that the interpersonal bond to the mother country has
begun to weaken.
In terms of framing, the importance of the veranda in providing shade, shelter
and access to breezes meant that solid walls could not be used to compart-
mentalize it in permanent ways. The occupants needed flexibility to be able to
shift from one part of it to another as the direction of the sun changed during
the day. This flexibility and minimal framing meant that people were able to
complete the activities of their daily living comfortably. In interpersonal
terms, it also represented a significant shift towards declassifying both the
activities of domestic living and the spaces in which these activities took place.
The significance of this development has four dimensions. First, verandas
broke down the barriers separating internal and external spaces. They did
this by extending the living area into the semi-outdoor realm. This meant that
people living in the tropics did not shut themselves indoors as people in the
south did. They actually lived around the house more than inside it. So there
were two living areas in the tropics: the veranda, which was semi-public and
strongly Unbound; and the compartmentalized interior which was private and
strongly Bound.
Second, verandas began to dissolve the compartmentalization of living
areas through weak classification and minimal framing. This forced the devel-
opment of a more open and fluid lifestyle – one which was not characterized
by the strong boundaries and classifications of housing in the southern part
of the continent. This in turn meant a significant break down in the division
of behaviours associated with the kitchen, the dining room and the parlour in
the past. These behaviours now occurred simultaneously in one large space –
the veranda.
In addition, once the barriers compartmentalizing living had broken down,
it was not just the range of people’s behaviours that increased. The range of
interactions that were possible between the occupants of a house also increased
accordingly as there was no longer a one-to-one relationship between a room,
its function and behaviour. The potential for conversation on the veranda was
therefore greater as the host or hostess was no longer in control; so the topics
could range from the intimate to the more general.
Finally, weak classification and weak framing yield openness and freedom.
This openness, however, is a double-edged sword. Its flipside is that it enables
surveillance as discussed by Foucault in relation to the panopticon prison
(1977/1991). This meant that the occupants of the Queensland veranda had
a vantage point for looking at passers-by but they could also be continuously
scrutinized. So textual choices for framing were strengthened by the addition
of adjustable louvres and lattice (see Image 4.2). These optimize breezes and
maintain thermal comfort, and together with the Bound rooms of the interior
functioned to give the occupants of ‘the Queenslander’ their privacy.
In this way, Australia developed two very different baselines for domestic
security: the Unbound in the tropics and the Bound in the south of the con-
tinent. But baselines for security are not static. They are dynamic and evolve in
response to cultural changes, technological innovation and economic factors.
As a result, houses in the south also moved towards the Unbound dimension
of the security scale albeit over a much longer time period.
this was the appearance of a combined living and dining room referred to as
the common room (Boyd 1952). In addition, architects began using arches
rather than solid walls to separate the common room from the sitting room.
Textually, this meant that the flow of internal spaces was more continuous
and integrated. Experientially, this represented a very profound shift in the
culture as internal boundaries, driven by the English desire for privacy, had
been firmly entrenched for 150 years and this shift had major interpersonal
consequences.
First, the people had to learn to feel secure in houses that had less spatial
enclosure. Rather than being strongly Bound they now felt minimally Bound.
Second, the merging of rooms weakened classification and framing, and forced
a change in social and cultural attitudes to domestic living. According to archi-
tectural theorist Robin Boyd, they ‘required a degree of social informality
contrary to the established concept of suburban life’ (Boyd 1952:184).
The material shortages of life after World War II provided other challenges
as well. In particular, the size of rooms diminished as a direct consequence
of legislation restricting the size of houses to either 92 or 111 square metres.
This provided architects with a deep challenge as reducing the size of a space
makes it feel oppressive. To prevent people feeling Too Bound, they began to
use windows more judiciously as they unbind occupants and provide a sense of
spatial freedom:
Unbinding by increasing the size and span of windows was thus deliberately
adopted to compensate for the decrease in available space. Every window
within reach of a corner ran into and turned it with a curve of glass. By 1950
the material shortages were less of a problem but the trend to Unbinding in the
south continued more strongly than ever. In fact the shift to weakly classified
internal spaces and Unbinding through windows paved the way for the open-
plan ‘glass house’ living of today which was first introduced by Harry Seidler
in 1948 at Rose Seidler House built for his mother in Turramurra, Sydney.
Rose Seidler House is a landmark in domestic Australian architecture in
the south as it pushed all the boundaries towards Unbinding and weak classi-
fication and framing that had been occurring steadily since 1900. So much
so that it was seen as radical and confronting and elicited strong feelings of
insecurity from the general public. The main innovations were the use of
movable floor to ceiling glass walls, known as sliding doors, and the removal
of internal partitions, so that sleeping, play and utility areas merged into one.
Once again weak classification and weak framing delivered ‘fishbowl living’ and
locals gathered outside on weekends to peer in at the occupants. Despite this,
Binding also takes into account individual as well as cultural variation and
although many people felt insecure at the thought of fishbowl living, there are
people who enjoy being on display (personal communication, resident, Harry
Seidler’s Horizon building, Sydney, 10 October 2005).
Nevertheless, the publicity Rose Seidler House received together with the
work of other modernist architects meant domestic architecture became
increasingly Unbound in the south. People became accustomed to houses such
as those built by Seidler, and over time, such houses became a ‘Given’. Not only
that, they established a new and Unbound cultural baseline for security in the
south with an emphasis on outdoor living. So much so that many houses are
now designed to flow out into gardens (Image 4.3).
This is a strong trend especially in refurbished inner city terraces. Bi-folding
doors are frequently used to optimize the permeability between internal and
external spaces, and integrate them so that they flow seamlessly into one
another. From the point of view of security, the occupants of such houses main-
tain their privacy as the courtyard provides external boundaries that screen
them from the voyeuristic gaze of neighbours in the same way as gardens
screened the internal spaces of the wealthy in the nineteenth century.
Another trend in contemporary Australian housing is to use glass window
walls to extend the indoor spaces out onto panoramic vistas of the natural
environment or urban skylines (Image 4.4). From the point of view of attitu-
dinal re/alignment, this is a most significant cultural shift as it evokes a strongly
positive valuation and appreciation of the Australian landscape together with a
sense of pride and confidence in its city skylines, beaches, mountains and bush.
It also reflects a strong love and affection for the land, which is now openly
invited inside. This positive western valuation of the Australian environment
began with Harry Seidler and has become so strong in recent years that
Renzo Piano, the world-famous Italian architect, has said: ‘I think in this country
the sensitivity to nature, to breeze, to view, to sun is stronger than anywhere
else’ (Drew 1999:xv).
From the point of view of Bonding and social interaction, the refurbished
Unbound terrace is often characterized by weak classification and weak framing,
especially in its living areas, which are now designed to serve multiple
functions. Open-plan in design, they function as a living room, library, home
theatre, informal dining room, study and playroom. Exciting as it may sound,
living in one room with many functions is not a positive experience for every-
body. Sydney journalist Maggie Alderson describes it in the following way:
Lovely notion as it is to have one sprawling family area, with the youngest
child doing homework at the kitchen table while Dad cooks a stir-fry listening
to the cricket on the radio, Mum reads the paper and two teenagers play
a violent video game, the reality is an imperfect experience for everyone
involved. (Alderson 2007:41)
The dissatisfaction Alderson expresses seems to stem from the fact that weak
classification and framing result in too much interaction. They also result in
surveillance, which is why parents often like open-plan areas: they can easily
keep an eye on their children. Teenagers, however, tend to respond negatively
to continuous adult surveillance and seek refuge in the strongly classified
and framed spaces of their bedrooms (Image 4.5). These provide them with
sanctuary and escape.
Image 4.5 Close me in and set me free: Strong classification and strong framing
Having swung from one extreme to the other, it seems that many of us need
both openness and enclosure in our domestic spaces. Perhaps the capacity to
provide both explains why the Queenslander has successfully survived as a
housing choice for such a long time. It also explains why the house shown in
Images 4.4 and 4.6 functions so effectively. The open-plan area is both a dining
room and a lounge room. Large glass sliding doors dissolve internal-external
boundaries but the inclusion of diaphanous curtains reduces the permeability
of the space and provides the family with privacy whenever it is needed.
Privacy, for this family, is usually sought in the mornings. So the curtains
remain drawn until the family is dressed and ready to open itself up to the light,
activity and scrutiny of the world outside (Image 4.7). Once open, the curtains
tend to stay open all day and all night. Regardless of the choice the family
makes at any point in time, the curtains have the potential to either strengthen
or weaken the framing of the space and give the family control over the open-
ness of the house, that is, its privacy or degree of exposure.
An Evolving Cartography of a Visceral Semiotic 93
On the middle level of this terrace, where most of the communal activities
take place, the house contains another living area (Image 4.8). This space is
located adjacent to the open area just discussed but the choices for enclosure
are very different. The second space is strongly Bound. It is mainly used for TV
viewing and has become one of the family’s favourite spaces. It is small and the
furniture is laid out in such a way that everybody sits in very close proximity
to one another but the TV is the focal point of the space, not the interaction
between the occupants. Having the TV as the focus softens engagement and
mitigates contact through oblique angles, to use Kress and van Leeuwen’s tools
(1996/2006), but still enables the entire family to participate in the shared
activity of watching TV.
It is easy to see why this small, cocooned space is popular for a family with two
teenagers. It provides a secure environment that enables the family members to
commune by drawing them into physically close proximity with one another
and then allowing them to simply ‘be’ without pressure to overtly share feelings
or attitudes. Bonding in this space is not about valuing the external Australian
landscape. It is about feeling part of an important social unit and enjoying the
deep attachments that form between family members without demanding any
of the intimate familial social interactions that many teenagers seem to struggle
with, and rebel against. It is therefore not surprising that the mother of the
family says she would never demolish that wall and turn the living area into
one seamless space (personal communication, 5 November 2007). It also seems
that the nature of the Bonding, the physical and emotional connection family
members desire, changes over time, and our spaces can be designed to
accommodate those changes and facilitate the types of interaction that are sought.
Returning to Bonding icons, this analysis so far has been strongly oriented
to the way whole houses crystallize interpersonal attitudes to the land. At the
heart of this discussion has been a consideration of who and what is allowed to
enter into the domestic space and who or what is deliberately excluded. If
we link the idea of inclusiveness and exclusiveness to the notion of a hierarchy
of Bonding icons, it seems that households, including couples, may develop
personalized (as opposed to rallying national) icons. The family we have just
been discussing, for instance, have a collection of teddy bears, which lives on
the sill in the Bound TV nook (Image 4.9). These teddy bears evoke feelings
of security, love, warmth, tenderness, affection, happiness and intimacy. Each
member of this family has their own personal teddy with their name sewn on
it and all the members of the family are represented: past and present, nuclear
and extended. Significantly, these teddies are not a rallying icon like the
Olympic torch (see Stenglin 2008b for a detailed analysis of the Olympic torch
as a Bonding icon). They are much more privileging – you have to be part
of this family to belong to that ledge.
So it seems that we can grade Bonding icons along at least three dimensions.
First, they vary in reach, from local icons (teddy bears) to international icons
(the Olympic flame). Second,
each Bonding icon has the potential to evoke an affectual charge that varies
in its intensity: at times it can be so strong that it moves you to tears, at other
times it may just evoke a feeling of warmth. This means the intensity of the
affectual charge can be graded along a continuum of minimal to maximum.
Finally, the function of Bonding icons varies considerably: some rally while
others privilege.
Moving beyond the space grammar and domestic Australian architecture, the
final section of this chapter concludes by extending the application of the tools
we have been exploring as well as identifying those dimensions that need fur-
ther consideration. First, the contact dimension as theorized by Kress and van
Leeuwen (1996/2006) in relation to visual images needs to be incorporated
into the space grammar. At this point, it appears to sit most comfortably within
Bonding. The reason for this is that Bonding is concerned with social
interaction and contact constitutes an important dimension of that. If a space
has weak classification and weak framing, for example, it is open to surveil-
lance. At the heart of such surveillance is contact and it raises some interesting
anomalies: living in a fishbowl appears to mainly be an offer but it can also be
a demand . . . how do we reconcile these choices?
In addition, contact may need to be adapted in relation to 3D space. In 3D
space, for example, it is not just contact that is important but the directionality
of the contact: is it one-way or two-way? If one-way, is it ‘in-out’, as we saw with
lattice-enclosed verandas, or is it ‘out-in’, as is the case at night when outsiders
look into plate glass enclosures? Control also seems to be an element for
consideration. To what extent can the occupants control the degree of contact
through choices for screening offered by curtains, blinds and shutters? Another
related dimension seems to be the participants: the ‘who’ or ‘what’ the contact
is with. Is it with the natural environment, passers-by or other members of the
household? As the participants may be inanimate objects there is also a need
to theorize furnishings and the ways we interact with objects as the discussion of
the TV and contact in the contemporary Glebe terrace pointed out. Scheflen
and Ashcroft’s (1976) work on territoriality would constitute an important
starting point here but one would clearly need to go beyond and consider the
social implications of such interaction on Bonding.
Conclusion
Acknowledgements
First, I would like to thank Joan Rothery for opening up and guiding my
exploration of architectural spaces, especially housing on the Australian
continent. Second, I would like to thank my friends and family – Carolyn, Clare,
Daniel, Kelvin, Stephanie, Vanessa and Yarro – who trusted me to photograph
their private spaces, analyse them and share my thoughts in the public domain.
I also owe an enormous debt of gratitude to my critical readers – Chris Cléirigh,
Sally Humpreys, Shooshi Dreyfus, Michele Zappavigna and Ahmar Mahboob –
for their insightful and encouraging comments. Finally, I thank Roland Stocker
for his generosity in assisting with the diagrams and supporting me in every
aspect of this work.
Note
1. See Stenglin (2009b) for an alternative exploration of domestic security – one
that occurs in homes characterized by abuse – verbal, physical or sexual. This
account is based on an analysis of an exhibition called ‘Scumbag’ by renowned
Australian artist and photographer, Ella Dreyfus.
References
Alderson, M. (2007). Makeover madness. Sydney Morning Herald, Good Weekend
Magazine, June 30, p. 41.
Archer, J. (1987). The great Australian dream: The history of the Australian house.
Sydney: Angus & Robertson.
Archer, J. (1998). Your home: The inside story of the Australian home. Port Melbourne,
Victoria: Lothian Books.
Bachelard, G. (1964). The poetics of space (M. Jolas, trans.). New York: Orion Press.
Bernstein, B. (1975). Class, codes and control: Volume 3 (2nd edn). London, Boston,
MA and Henley: Routledge & Kegan Paul.
Boyd, R. (1952). Australia’s home. Melbourne: Melbourne University Press.
Broadbent, J. (2001). The colonial bungalow. Unpublished talk, 31 July, Australian
National Trust Centenary of Federation Lecture Series, Sydney: SH Irwin
Gallery.
Brown, N. (2000). Making oneself comfortable, or more rooms than persons. In
P. Troy (Ed.), A history of European housing in Australia (pp. 107–124). Cambridge:
Cambridge University Press.
Drew, P. (1992). Verandah: Embracing place. Sydney: Angus & Robertson.
Drew, P. (1999). Touch this earth lightly: Glenn Murcutt in his own words. Potts Point,
Sydney: Duffy & Snellgrove.
Evans, I. (1983). The Australian home. Sydney: Flannel Flower Press.
Falk, J. & Dierking, L. (1995). Public institutions for personal learning: Establishing a
research agenda. Washington, DC: American Association of Museums.
van Leeuwen goes on to suggest that both the use and analysis of sound are
perhaps not yet culturally developed enough to be susceptible to systemati-
zation – in his terms, they are still a ‘medium’ rather than a ‘mode’:
[T]he semiotics of sound cannot be approached in quite the same way as the
semiotics of language and or of images. It is not, or not yet, a ‘mode’, and it
has therefore not or not yet reached the levels of abstraction and functional
structuration that (written) language and image have reached, as a result
of their use in social crucial ‘design’ processes. (van Leeuwen 1999:192)
While the implications of this ‘argument from design’, and its relationship to
analysis, are not clearly spelled out by van Leeuwen, it seems to me that what
is needed for a social-semiotic treatment of any particular modality is a kind of
triangulation between the analysis of its texts, the theoretical frameworks that
have been applied to it, and the social meanings it has for its communities of
users. It is not enough to have just one or two of these: the theoretical and
social without the textual leaves the analysis ungrounded, with no way of under-
standing in detail how analysts have come up with their interpretations; the
social and the textual without the theoretical traps analysts in the (unexamined)
presuppositions of their commonsense (or ‘intuitive’) viewpoints; the textual
and theoretical without the social makes analyses ultimately only personal
ones – insightful, perhaps, but in the end only one individual interpretation.
The current chapter makes no claim to have achieved a comprehensive
model of musical meaning of this kind within a social semiotic framework.
What is attempted here is the more modest aim of what Mao Zedong referred
to as ‘reactionary editing’: in other words, putting several different kinds of
discourse about the phenomenon side by side and seeing what emerges from
the mix. It seems to me there is an enormous untapped resource for analysts
not just in ‘analyst talk’, as it were, but also in ‘audience talk’ and ‘performer-
composer talk’: the latter is particularly useful because it is based on a (literally)
hands-on experience of the modality – including the crucial – though again
often ‘intuitive’ alas – sense of the probabilities involved: how common is a
particular feature? What sort of regularities is it playing off? – in other words,
how can it be contextualized logo-genetically?
This chapter makes a start on this kind of project with an analysis of selected
analyst talk (a preliminary attempt at incorporating performer-composer
talk is made in McDonald, 2010). It begins with an analysis of two recent
textbooks dealing with music, van Leeuwen (1999) already discussed above,
and Vella (2000), then moves on to the semiotic approach of Nattiez (1990),
whose model explicitly includes the viewpoints of both the composer-performer
(‘poietic’) and listener (‘esthesic’), ending up with the phenomenological
Dealing with Musical Meaning 103
approach of Burrows (1990). The conclusion that emerges from this discursive
journey is that the crucial factor required for any conceptualization of musical
meaning is embodiment: in other words, that the ultimate locus of musical
meaning must be the signifying human body (Thibault 2004).
a model for sound and music can be discussed on the same level and not based
on hierarchies of ability. With a similar inclusive purpose, van Leeuwen’s treat-
ment places spoken language, music and sound effects on the same plane, and
discusses commonalities of meaning and expression across all three modalities.
Both, however, in effect avoid what for the analyst is the key question of musical
meaning: how it is that particular patterns of sound expression relate to their
different levels of interpretation.
Vella’s book, Musical Environments, is based on a course he taught for some
years in the School of Mathematics, Physics, Computing and Electronics, and
is very much the embodied ‘hands-on’ and ‘ears-on’ approach of a performer-
composer, as he explains:
van Leeuwen’s Speech, Music, Sound derives from courses he taught first in the
Department of Media Studies at Macquarie, and then at the London School
of Printing, and in contrast to Vella’s approach, takes more of the semiotic
analyst’s point of view:
van Leeuwen introduces here what can only be called a moralistic note, a
prominent sub-theme in his book, which both presents and demonstrates
a moral purpose, to ‘re-educate’ our ‘ill-educated’ ears. In this, van Leeuwen is
in the tradition of writers on music such as composer R. Murray Schafer, and his
near namesake Schaeffer, composer and theorist of musique concrète (Schaeffer
1967), as well as earlier examples such as the Futurists of the early twentieth
century, one of whose purposes was to awaken our senses to the range of sounds
around us, beyond what is usually thought of as ‘music’ as such.
Vella and van Leeuwen place their exploration of music in relation to quite
distinct professional and academic enterprises. Vella is himself a well-known
composer experienced in film, theatre and concert music, and his concerns are
in the first instance those of a composer and improviser, and then those of the listener
(Vella 2000:9):
This book has been conceived from a composer’s point of view which is
fundamentally a sensory relationship to the creation and ordering of sounds.
As soon as sounds are placed next to each other, we as listeners automatically
invent relationships between the sound events and therefore meaning.
The perception of texture and musical events falls within the domain of music
cognition, a complex field of study drawing together the two disciplines of
music and psychology. It is largely concerned with the way we, as listeners,
perceive musical structures, differentiate and organise sonic information,
remember, predict and reject musical events, internalise larger formal struc-
tures, and create relationships between sounds. (Vella 2000:10)
allow more options, more tools for the production and interpretation of
meaningful action . . . (van Leeuwen 1999:4–10)
Although, as noted above, neither book sets out to formally define the concept
of the meaning of music, each does nevertheless provide a clear working
definition of musical meaning for its own very different purposes. For Vella, in
line with his contextual ‘hands-on’ approach, meaning is created through
the listener’s experience of sound:
The music itself includes all its auditory qualities; its context is defined by
where or how the music is positioned in relation to the listener and its
purpose; listening to music through a pair of headphones, for example, is a
very different experience from hearing it in a concert hall; and the meaning
of music is determined by who is listening and the cultural experiences
and associations of its audience. (Vella 2000:24)
[Figure 5.1: System network for sound time. Sound time is either measured or unmeasured; measured time is either metronomic or non-metronomic.]
(Halliday 1978, Halliday & Hasan 1985), a framework within which van
Leeuwen, wearing his linguist’s hat as a phonologist, has himself worked.
Figure 5.1 illustrates his representation of the semiotic resources available
in the area of musical timing (van Leeuwen 1999:6–7):
Measured time is time you can tap your feet to . . . The physical reaction
to unmeasured time is more likely to be a slow swaying of the body . . .
‘. . . metronomic’ and ‘non-metronomic’ time form a subdivision of ‘meas-
ured time’. ‘Metronomic time’ is governed by the implacable regularity
of the machine, whether or not a metronome (or a drum machine or a stop-
watch) is actually used. It is the time of the machine, or of soldiers on the
march. ‘Non-metronomic time’ is also measured, but it subverts the regular-
ity of the machine. It stretches time, it anticipates or delays sounds and so
on. It is the time of human speech and movement, or of Billie Holliday
singing a slow blues while ‘surfing on the beat’ . . .
[Figure: Meyer’s typology of formalists, absolutists, expressionists and referentialists]
For Meyer, there are on one side absolutists who believe that meaning is
based exclusively on the relationships between the constituent elements of
the work itself, and on the other referentialists for whom there cannot be
meaning in music, except by referring to an extramusical universe of
concepts, actions, emotional states, and characters (1956:1). But this first
dichotomy is mirrored in another that does not exactly correspond to it:
the formalists who (according to Meyer) do not acknowledge that music
can provoke affective responses (it has an intrinsic significance given to
it by the play of its forms), and the expressionists who acknowledge the
existence of feelings. But though formalists are necessarily absolutists,
expressionists will be absolutists if (for them) the expression of emotion is
contained in music itself, and they will be referentialists if the expression
is explained in terms of music referring to the external world. (Nattiez
1990:108–109)
Nattiez then combines this typology with his own basic theoretical framework
of what he calls the ‘semiotic tripartition’, the three ‘dimensions’ of any ‘symbolic
phenomenon’, which he defines as follows:
1. The poietic dimension: even when it is empty of all intended meaning . . . the
symbolic form results from a process of creation that may be described
or reconstituted.
2. The esthesic dimension: ‘receivers’, when confronted by a symbolic form,
assign one or many meanings to the form; the term ‘receiver’ is, however, a
bit misleading . . . we do not ‘receive’ a ‘message’s’ meaning . . . but rather
construct meaning, in the course of an active perceptual process.
3. The trace: the symbolic form is embodied physically and materially in the
form of a trace accessible to the five senses . . . An objective description
[of the trace] can always be proposed – in other words an analysis of its
immanent and recurrent properties. This is referred to . . . as ‘analysis of
the neutral level’. (Nattiez 1990:11–12, original emphasis)
We will now use this classification, and the theorists Nattiez quotes, to give a
brief guided tour through a range of stances on musical meaning.
Nattiez then uses this distinction between empirical and normative, together
with his notion of the semiotic tripartition referred to in the previous section,
to carry out an admirably clear dissection of Hanslick’s complex position:
(1) from the poietic side, emotion exists in the composer, but does not
manifest itself except in a purely musical form;
(2) on the immanent level, music’s content is its form;
(3) from the esthesic side, emotion is the result of the form’s effect, and
its origin must be sought in the music itself
(1) Poietic: one should not write program music, or imitative or sentimental
music. In opera, music should occupy the predominant position.
(2) Immanent level: ‘the Beautiful is nothing more than form’ (Hanslick
1976[1854]:16).
(3) Esthesic: perception is not exempt from emotions, but it must try to
elevate itself to pure contemplation of forms. (Nattiez 1990:109–110)
Tones too [like words EMcD] indicate, point to something. The meaning of
a tone, however, lies not in what it points to but in the pointing itself; more
precisely, in the different way, in the individual gesture, with which each tone
points toward the same place. The meaning is not the thing indicated but
the manner of indicating (otherwise all tones would mean the same thing,
namely, 1̂ [tonic EMcD]) . . . In the strictest sense . . . what the tone means
is actually and fully contained in the tone itself. Words lead away from
themselves; but tones lead into themselves. Words only point toward what
they mean, but, beyond that, leave it, so to speak, where it is . . . Tones, on
the other hand, have completely absorbed their meaning into themselves
and discharge it upon the hearer directly in their sound. (Zuckerkandl
1956:67–68, original emphasis)
In other words, in order to understand what music means, one simply needs
to be able to identify relationships between the different parts of the musical
text, and link them into a coherent whole. Linguist Robert Austerlitz makes
a similar point, using the notion of pointing or ‘deixis’, normally used of
linguistic elements which refer outside language to aspects of the situation. In
the case of music, according to Austerlitz, any musical deixis is text-internal or
‘cataphoric’, that is, referring forward to subsequent musical patterns:
Musicologist Nicolas Ruwet, one of the earliest scholars to apply the then new
ideas of transformational-generative grammar to the analysis of music, basically
updates Hanslick by claiming that it is only an analysis of the ‘syntax’ of music,
that is, its internal patterns of organization, that can give us any ‘access’ to the
‘study of musical meaning’:
So what exactly is this ‘world’ that music ‘presents’? The term ‘presents’ sug-
gests that what is going on in music is some sort of ‘performance’, not just in
the obvious sense that music is a performance art, but in some deeper sense
that involves the music and all the participants in some sort of shared experi-
ence. However, this still leaves unproblematized the exact nature of musical
presentation, and how it is that the participants interpret what the ‘perform-
ance’ means. Thus, we seem to be pushed towards attempting to define some
sort of notion of how music ‘refers’ to something outside itself: in a broad sense,
how it is that the ‘world of music’ relates to the ‘world of lived experience’.
The analogy works here via a long tradition of identification of the ‘motion’
of the body with the ‘emotion’ of the feelings (the terms themselves, derived
from Latin for ‘to move’ and ‘to move out of’, respectively, show how this
metaphor is embedded in many Western European languages at least). Davies
argues that just as we can interpret how someone is feeling by the way they
move, using ‘move’ in the broadest possible sense to include facial expressions
and gait, we can also interpret musical form by ‘establishing a symbolic relation
between particular parts of the music and particular parts of human behaviour’.
Thus, music ‘expresses’ by ‘referring’, by presenting symbolic forms which can
be interpreted as expressing emotions.
A complementary approach to explaining what music refers to is taken by
Niall Griffith, who again stresses the performative, dynamic nature of music,
in this case in contrast to the analytic, static nature of language:
Both these types of ‘referring’, the emotional and the causational, can be
brought together in a model put forward by Michel Imberty in his notion of
Thus, music operates through elements of ‘perceived and felt change’ that
refer to ‘vitality affects’, emotional responses that have been developed on
the basis of previous experiences. Ruthrof’s The Body in Language (2000) puts
forward a similar model for language, in which he attempts to put bodily
experience back into theories of language via a process of psychological
imagery:
So it begins to seem as though it is the body that must be the ultimate locus
of attempts to ground musical meaning in something external to itself. In
the final section, I go on to deal with this notion of embodiment in relation
to music and show how it makes us rethink the whole nature of musical experi-
ence and musical meaning.
In a book produced almost a decade before van Leeuwen and with almost the
same title, musicologist David Burrows in his Sound, Speech, and Music (1990)
sets out to understand the characteristic uses of sound in language and music
in a way that stems from its fundamentally embodied nature. Burrows emphas-
izes the semiotic affordances provided by sound, claiming that ‘sound is far
more to speech than a passive conveyance’ but rather the means through which
human thought has evolved, exploiting ‘the unique capacity of vocal sound
for rapidity of articulation in detachment from the world of enduring spatial
[T]he voice is . . . the most intimate and powerful human exploitation of the
potential in sound, a means of displaying mood and attitude and a way of
bonding separate individuals and negotiating their mutual interests . . . If
speech is a displacement of the mutual awareness of speaker and listener
from Field 1, the here and now conveyed by the senses, and into the metasen-
sory domain of Field 2, then the speaking self is defined by its relationship to
shifting possibilities outside the actuality of the moment, possibilities at best
indirectly verifiable. This means that a corresponding quality of contingency
and provisionality must characterize that range of identity which is at the focus
of speech and speech-related thought. Music is seen . . . as one of a range of
activities that help compensate for this debilitation of identity by moving the
participants’ orientation towards that of Field 3. (Burrows 1990:11–13)
Burrows’ model provides a rationale for the existence of music in every human
society in a way that allows for its extra-bodily dimensions but does not detach
them from the fundamentals of bodily experience.
From a philosopher’s point of view, Stephen Davies in his Musical Meaning
and Expression (1994) gives a long discussion of various attempts within philo-
sophy and musicology to capture how it is that music expresses meaning. His
very wide-ranging study puts forward, among others, the very useful concept of
music expressing ‘emotion characteristics in appearance’ already referred to
above. But although he spends almost the whole book trying to define ‘mean-
ing’, he takes the equally slippery concept of ‘music’ as a given, and this causes
him enormous problems: for example, in trying to justify how a ‘non-sentient’
phenomenon like music can have ‘feelings’ (1994:163).
In fact as musicologist Christopher Small points out, such questions are
basically pseudo-questions: ‘[t]here is no such thing as music’:
Music is not a thing at all but an activity, something that people do.
The apparent thing ‘music’ is a figment, an abstraction of the action, whose
reality vanishes as soon as we examine it at all closely . . . If there is no
such thing as music, then to ask ‘What is the meaning of music?’ is to ask a
question that has no possible answer. Scholars of Western music seem to have
sensed rather than understood that this is so; but rather than directing their
attention to the activity we call music, whose meanings have to be grasped in
time as it flies and cannot be fixed on paper, they have quietly carried out a
process of elision by means of which the word music becomes equated with
‘works of music in the Western tradition’. Those at least do seem to have a
real existence, even if the question of how and where they exist does create
Dealing with Musical Meaning 117
problems. In this way the question ‘What is the meaning of music?’ becomes
the more manageable ‘What is the meaning of this work (or these works) of
music?’ – which is not the same question at all. (Small 1998:2–3)
Both Nattiez and Davies fall into this trap of trying to locate the meaning of
music in particular musical works, a trap that has been carefully prepared for
them, and many others, by a whole tradition of thinking about music in the
European cultural sphere with the development over the last millennium of
ever more sophisticated and comprehensive forms of musical notation. Having
the music ‘in black and white’ on the page not only means that the focus of
attention then becomes interpreting and explaining the notation, but that
music quite naturally comes to be seen as an object, removed from its
immediate context, and thus amenable to ‘scientific’ study. Small quotes ‘the doyen
of contemporary German musicologists, Carl Dalhaus’ in a pithy summary of
this attitude in which he states that ‘the concept “work” and not “event” is the
cornerstone of music history’ (Dalhaus 1983, in Small 1998:4).
Thus, Nattiez’s long discussion of the musical fact and Davies’ equally
long struggle with musical meaning both suffer from a type of misplaced
concreteness, because they both take it for granted that such a thing as ‘music’
exists, and that it exists in musical ‘works’, not musical ‘events’. Small’s study
attempts to redress this imbalance, by looking not at this pseudo-object ‘music’
or the musical work, but at the musical event, a process to which he gives the
name ‘musicking’. The fact that he does this by a detailed ethnographic study
of the Western concert hall and what goes on there is a nice riposte to the
tradition represented by Dalhaus which places actual performance outside the
ken of observation or theorization.
And as soon as we start focusing on performance, as soon as we bring back
living breathing people into our conception of music, more specifically the
musical event, we are also perforce required to focus on people’s bodies.
The notion that the body is central to musical meaning is not a new one, as
the analogy between ‘motion’ and ‘emotion’ shows, though overshadowed by
another long tradition in European thinking about music, dating back at least
to Pythagoras, which places music in some sort of abstract, celestial realm
beyond our everyday mundane lives. But the relationship between bodily
movement and musical expression was already being emphasized by French
psychologist Frances half a century ago:
The kinship between rhythmic and melodic pattern in music and the patterns
of gestures that accompany behaviour, represents one of the basic elements
of music’s expressive language . . . The basic psychological states (calm,
excitation, tension, relaxation, exaltation, despair) normally translate
themselves as gestural forms that have a given rhythm, as tendencies and ascents,
as modalities for organizing fragmentary forms within global forms . . . the
transpositions of these rhythms, tendencies, and modalities of movement
118 Semiotic Margins
into the sound structure of music constitutes music’s basic expressive
language. (Frances 1958:299, quoted in Nattiez 1990:118–119)
Even earlier, Swiss music educator Jacques-Dalcroze was stressing the inherent
link between movement and music by identifying the crucial links between
time, space and muscular energy:
In fact the whole body of theory and practice of music learning which stems
from Dalcroze’s work, known as ‘Eurythmics’, works precisely by maintaining
the link between body movement (‘rhythm’) and aural training at every step
of the music learning process. Table 5.1, simplified and adapted from a
pedagogical work in this tradition, shows how such a link can be made in terms
of a process of music learning that involves listening, movement, cognition
and symbolization (Vanderspar 1984/1992:25–26).
The sort of detailed repertoire of practice as developed by Eurythmics, allied
to an embodied conception of musical meaning such as put forth by Frances,
could begin to really get to grips – to use another embodied metaphor – with
the nature of musical meaning as grounded in the embodied context of human
semiosis more generally (Thibault 2004). Such a model has been adumbrated
in several publications by myself in collaboration with a voice expert colleague
(Callaghan & McDonald 2002, 2003, McDonald 2002, forthcoming), but the
full working out of a social semiotic model of music, including both text
analysis and ethnographic studies, remains in the form of a promissory note.
The present chapter, through its critique of a range of instances of analyst talk,
has hopefully managed at least to point one possible way out of the haze of
misconceptions with which the topic has been obscured. So it is perhaps fitting
to end this patchwork of quotations with a self-quotation which sums up the
general attitude taken here towards this complex topic:
The human animal uses its body to dance and sing and move and speak,
to model and (re)enact the processes and interactions of its material and
social worlds, as well as to create verbal and musical texts that embody
(pun intended) its semiotic worlds. How much longer can musicologists –
or linguists for that matter – ignore the fact that they are dealing with an
embodied social-semiotic system? (McDonald 2002:305)
References
Austerlitz, R. (1983). Meaning in music: Is music like language and if so, how?
American Journal of Semiotics, 2(3), 1–12.
Burrows, D. (1990). Sound, speech, and music. Amherst, MA: The University of
Massachusetts Press.
Callaghan, J. & McDonald, E. (2002). Expression, content and meaning in language
and music: An integrated semiotic approach. In P. McKevitt et al. (Eds), Language,
Vision, Music (pp. 205–230). Amsterdam: John Benjamins.
Callaghan, J. & McDonald, E. (2003). The singer’s text: Music, language and the
expression of meaning. Australian Voice, 9, 42–48.
Dalhaus, C. (1983). Foundations of Music History (J.B. Robinson, trans.). Cambridge
and London: Cambridge University Press.
Davies, S. (1994). Musical meaning and expression. Ithaca, NY: Cornell University Press.
Frances, R. (1958). La perception de la musique. Paris: Vrin.
Griffith, N. (2002). Music and language: Metaphor and causation. In P. McKevitt,
C. Mulvihill & S.O. Nuallain (Eds), Language, Vision & Music (pp. 191–220).
Amsterdam: John Benjamins.
Halliday, M.A.K. (1967). Notes on Transitivity and Theme in English, Part 1. Journal
of Linguistics, 3(1), 37–81.
Vanderspar, E. (1984). Dalcroze handbook: Principles and guidelines for teaching eurhythmics.
Updated edition 1992. Launceston, Cornwall: The Dalcroze Society.
Vella, R. (2000). Musical environments: A manual for listening, improvising and
composing. Sydney: Currency Press.
Zuckerkandl, V. (1956). Sound and symbol: Music and the external world. Princeton,
NJ: Princeton University Press.
Part Three
J.R. Martin
University of Sydney
Len Unsworth
University of New England
Introduction
The social semiotic analysis of visual texts has made considerable progress in
the past decade since the publication of Kress and van Leeuwen’s (2006)
Reading Images: The Grammar of Visual Design, which makes use of M.A.K. Halliday’s
(1978) theory of ‘metafunctions’ to identify three distinct but coexisting kinds
of meanings that interplay within any text. This chapter aims to develop further
the social semiotic analysis of visual images within one of these metafunctions
and in relation to one particular source of data – a corpus of children’s
narrative picture books. The region of meaning under focus is that of the ‘textual’
metafunction (Halliday 1978; Halliday & Matthiessen 2004) or ‘composition’
(Kress & van Leeuwen 2006), within which a number of visual choices will be
identified and their meanings discussed. Picture book narratives have the
advantage as data for visual analysis that they are ‘apprenticing’ texts in terms
of visual as well as verbal literacy, thus making the relevant visual choices salient
as they guide readers/viewers into an understanding of the meanings made.
Halliday’s notion of metafunctions as regions of meaning has been developed
as part of systemic-functional linguistics for the explanation and analysis of
verbal texts. Whereas the ‘ideational’ metafunction is concerned with the
content or topic of a text, and the ‘interpersonal’ metafunction with attitudes,
stances and relations of power and social distance between reader and writer
(or between characters in a fictional work), the ‘textual’ metafunction is said
to be concerned with the organization of both ideational and interpersonal
meanings. On the one hand the textual metafunction of language involves
‘cohesion’ by such means as ellipsis, lexical chains or pronominal reference,
which create links across different parts of a text, while on the other hand it
concerns the staging and packaging of ideational and interpersonal meanings
by such means as choice of initial clause or paragraph element (Theme) and
the organization of given and new information. It is with the visual equivalent
of this latter aspect of meaning that the current chapter will be concerned.
That is, in considering the visual systems of textual or compositional choices
found within picture books, the focus will not be on colour ‘rhymes’ or visual
repetitions that achieve cohesion across the narrative, but rather on the way
visual elements are ‘packaged’ on the page, on questions of the separation or
integration of elements and the training or direction of attention.
It is proposed here that the textual, or compositional, metafunction as
it applies to children’s picture-book images principally involves three
systems, or sets of options: those of framing, balance and intermodal
integration, the first two of which will be described and discussed in this
chapter. These systems have been inferred from an examination of a corpus
of over 50 narrative picture books including many prize-winning texts. In such
texts, the visual choices made are highly systematic and contribute to creating
the thematic significance of the story for the young reader. For example, one of
the most popular and acclaimed books aimed at the preschooler, Maurice
Sendak’s (1963) Where the Wild Things Are, immediately foregrounds the issue
of how a story image is framed by the white border or margin of the page
(framing) and how the placement of the verbal text relates to the visual
(intermodal integration). This is because the first five page openings
have verbal text on the left-hand page and on the right-hand page an image
surrounded by a white margin, but with each succeeding image larger than the
previous one, until the 6th image fills the entire page, expunging the margin,
and the 7th begins to transgress across the gutter to encroach on the left-hand
(text) page. By the 9th spread the image has extended right across both pages
so that the text has to appear beneath, rather than facing the image, following
which the text is entirely ousted and there are three spreads entirely filled
by images that extend to the page edge.
These choices are far from arbitrary when considered in relation to the
ideational and interpersonal meanings in the story. The young protagonist,
Max, is shown initially in smaller pictures ‘hemmed in’ and constrained by the
surrounding white margin. At this point, he is full of aggression, getting up
to serious ‘mischief’ and being sent to his bedroom as a punishment. Once
there he begins to use his imagination, the room expands and transforms into
a forest and Max sets off in his boat on an adventure to the land of the ‘wild
things’. The gradual expansion of the images to the edge of the page and
beyond clearly symbolizes the liberating quality of Max’s imagination. The
central wordless set of spreads depicts him having a ‘wild rumpus’ with the wild
things, after which the verbiage and the margin are reinstated by degrees as
Max’s emotional storm subsides and he gradually returns ‘home’ in a calmer,
happier and more reflective frame of mind. As various critics have noted
Organizing Visual Meaning 127
(e.g. Nodelman 1988), the framing choices are an important means of
conveying Max’s imaginative and emotional journey, and suggest clearly the kind
of meaning made by ‘margined’ images (i.e. with a border) as against those
that bleed to the edge of the page.
In Figure 6.1, which gives the entire framing system, this choice of the
presence or absence of a margin of space around the story image is indicated
by the ‘features’ [bound] versus [unbound], with the meaning residing not
in the label but in the contrast between the options (the network does not
imply that options are consciously taken up by the artist). All such pairs of
features in the diagram are to be read as either/or options, the selection of
which may lead to further options as the figure is read from left to right (the
visual realization of each feature or sub-feature is indicated by the downward
sloping arrows on the diagram). In some picture books a consistent choice of
one of these meaning features will be made throughout – for example, every
image will be similarly [bound]¹ or not, while in other cases, like Where the Wild
Things Are, the shifting of choices across the course of the narrative is what
proves most relevant in terms of the narrative theme.
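The reading conventions just described (either/or options, with each selection possibly opening further options to the right) can be made concrete in a small sketch. The Python below is an illustration only and not part of the analysis itself: it encodes just the [unbound] branch of Figure 6.1 as a dictionary of systems. The names FRAMING_SYSTEMS, is_valid_selection and selection_expression are hypothetical, as is the root label "image" used as the entry point.

```python
# Hypothetical, partial encoding of the framing network (Figure 6.1).
# Each key is an entry feature; its value lists the mutually exclusive
# (either/or) options that become available once that feature is chosen.
# The options under [bound] are deliberately omitted from this sketch,
# so any path that continues beyond "bound" is rejected here.
FRAMING_SYSTEMS = {
    "image": ("bound", "unbound"),
    "unbound": ("contextualized", "decontextualized"),
    "decontextualized": ("localized", "individuated"),
}


def is_valid_selection(path, systems=FRAMING_SYSTEMS, entry="image"):
    """Return True if `path` is a legal left-to-right walk through the network."""
    current = entry
    for feature in path:
        # A feature may only be chosen from the system its predecessor opens.
        if feature not in systems.get(current, ()):
            return False
        current = feature
    return True


def selection_expression(path):
    """Format a path in the bracketed notation used in the text."""
    return "[" + ": ".join(path) + "]"
```

On this encoding, `is_valid_selection(["unbound", "decontextualized", "localized"])` holds, whereas `["bound", "contextualized"]` is rejected, because [contextualized] only becomes available once [unbound] has been selected. A path may stop at any point, which corresponds to describing an image at a lesser degree of detail.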
While the changing margin in Sendak’s story is most significant for
referencing Max’s emotional state, Kress and van Leeuwen (2006) have also pointed
out the role of frames, borders and white space in separating elements out, or
conversely (where absent) in creating greater connections. In a picture-book
image, this characteristic helps explain an important aspect of an unbound
image – the fact that the lack of an intervening white margin between the image
and the page edge reduces as far as possible the boundary between the reader’s
world and that of the story, inviting the reader to connect and feel part of
that world. Again, consistent choices may be found within a particular story,
or in other cases, frames and margin borders may be present on many pages
but removed at key points where the reader is ‘invited in’. See, for example,
the final image of Anthony Browne’s (2004) Into the Forest, where a previously
anxious child protagonist is greeted by a close-up of the smiling mother
welcoming him with open arms. The absence of any boundary between the
story world and the reader’s world encourages the child reader at this point to
participate in the welcome and share in the strong positive affect created here.
Thus, a choice of [unbound] for the image avoids fencing in the character
and also avoids holding the reader at any distance. Such images may still vary,
though, in whether the story-world setting fills the entire page or whether the
characters are simply shown without context on a white page background (see
Image 6.1).
This is the choice of [contextualized] versus [decontextualized] for an
unbound image, an option which Kress and van Leeuwen (2006) treat as an
interpersonal one relating to the relative ‘realism’ of the image. However, in
picture books, the removal of a depicted context seems most significant as a
means of making the behaviour or attitudes of the depicted character much
more salient (Nodelman 1988), thus triggering an evaluative response in the
[Figure 6.1: the framing system network. Recoverable features and realizations: [bound] (+margin): refocalized (+margin image); ambient margin (non-white margin colour); contained (image within edge) versus breaching (image breaks edge); surrounded (margin space on all sides) versus limited (partial margin); framed: demarcated: ideational (frame formed by ideational element) versus textual (line frame), iconized (+picture frame), ambient frame (non-white frame colour). [unbound] (–margin): contextualized (setting fills page) versus decontextualized: localized (minimal (iconic) setting/symbolic attributes) versus individuated (white space background; participant(s) only, no setting).]
reader. Because of this, in some books, such images may occur at particular
moments in the story when the reader is strongly invited to empathize with or
judge the character (positively or negatively). Browne’s (1996) Piggybook has
variation of this kind, containing just a few [unbound: decontextualized]
images in the course of the story at key moments in the generic structure (see
Painter 2008 for discussion of how they contribute to the theme of the
narrative). In other cases where the entire book comprises decontextualized images,
the effect is to make the character/s rather than story world itself the focus of
attention, thus achieving the more generic status for those participants that
Kress and van Leeuwen observe for such images. An example is Machin and
Vivas’s (1991) I Went Walking, where the preschooler’s attention is to be focused
on the increasing number of animal participants at each stage of the simple
story rather than on any fully realized alternative imaginative world.
Where a largely decontextualized image nevertheless includes a very limited
local context (the [decontextualized: localized] option), that context is likely
to include what Kress and van Leeuwen (2006) refer to as ‘symbolic attributes’
associated with the character. For example, in Fox and Vivas’ renowned tale of
Possum Magic – a story in which a wise old possum makes her baby grandchild
invisible to keep him safe from predators and then faces a dilemma when he
wishes to return to visibility – we see Grandma Poss making baby Hush invisible
in an image which also contains the minimal context of a shelf of magic books.
The page is not filled out with any depiction of the background setting, but the
shelf of books provides just enough localized setting to symbolize Grandma’s
special knowledge and power.
In sum, then, the most general option for framing a picture-book image
involves the presence or absence of a margin to ‘hold’ the image within the
page. Where the margin is absent and the page edge is the only limit to
the image, it is [unbound] in two senses. The depicted characters are less
constrained by their circumstances and the story world is more opened up to
the reader. Where such an unbound image of the characters is
decontextualized, attention is focused on the behaviour or nature of the depicted character/s,
which, when used selectively, has the potential to trigger an evaluative response
at particular moments in the story or, where just a few iconic elements of the
setting are provided, to assist the symbolic ‘reading’ of the character.
When it comes to bound images (i.e. those with a margin), there are a number
of possible meaningful choices simultaneously available, as shown by the brace
enclosing five different sets of oppositions in Figure 6.1. The first two of these
relate to the way the margin may afford interpersonal meaning. This may occur
first through the use of colour in place of the default choice of white for the
margin. Colour in a picture-book image is a crucial means of creating
‘ambience’ or mood (Painter 2008) and can be carried by the margin as well as the
image itself. A nice example is provided by the Australian picture book Lucy’s
Bay (Crew & Rogers 1992), relating the story of a young boy coming to terms
with his sister’s death. All pictures are [bound] and the surrounding margin is
a soft, light peach colour, providing a warm ambience that plays an important
role in avoiding a dark, depressing atmosphere for such a sombre theme.
A second way in which the margin may afford interpersonal meaning is
by the depiction of characters in the margin itself. This is very rarely done,
but has been used to great effect in the subtle and sophisticated Australian
picture book Hyram and B (Caswell & Ottley 2003). This story of two discarded
teddy bears is narrated by one of the bears, but where another character’s
experiences are related, the presence of that character in the margin signals
a re-focalization, such that the depicted image bound by the margin is read
as that other character’s memory or experience. This text in fact makes use
of both [bound: ambient margin] and [bound: refocalized] options to great
interpersonal effect.
As well as these possibilities for managing interpersonal meaning, there are
a number of other options for bound images to be considered. First of all, the
margin may surround the image on all four sides [bound: surrounded] as in
the opening images of Where the Wild Things Are, or it may extend from only a
single picture edge, thus limiting the image on the page but not enclosing it
entirely [bound: limited], as also occurs in Wild Things as the image expands
(see Figure 6.2 for a schematic representation).
Then, in either case the image may be entirely contained by the margin
[bound: contained] or may transgress the edge created by the margin [bound:
breaching]. The choice of [breaching] is one quite frequently taken up, usually
providing an iconic way of suggesting that the depicted character has too
much energy, presence or momentum to be entirely constrained or bound by
the margin, as shown in Image 6.2. Examples of a character breaking the frame
in this way can be found in several of Browne’s books, including Piggybook
and My Dad, while in Sendak’s Where the Wild Things Are, by contrast, it is the
gradually expanding setting that breaches the margin as Max’s imaginative
world expands even beyond the confines of a single page. The [breaching]
option thus signifies in general terms the transgressing of a border, underlining
the overall meaning of the [bound] option.
The final set of options for bound images relates to the presence or absence
of lines creating a defined frame around the image.
[Image 6.3: panels (a) and (b)]
If the images in Image 6.3 are compared, it can be seen that the effect of a
frame is to make the image more overtly a ‘picture’, an option often favoured
in more traditional illustrated stories (and like the margin itself, a frame can be
coloured and thus contribute to the ambience created by the picture). While
most books make the same choice throughout in terms of framing the images
or not, Browne’s (1994) Zoo is an example of one that exploits the possibility of
variation to great effect. The book tells the humorous story of a family’s day out
at the zoo to make a moral point about the inhumanity of caging and
objectifying animals. The book is laid out with a small unframed picture of members of
the family on each left-hand page (together with the verbal text), and a large,
beautifully rendered, framed picture of the animals on the right-hand page, a
contrast in framing which quietly emphasizes the way the animals are ‘a sight’
displayed for human enjoyment. In the book, the reader/viewer is gradually
moved to take on the animals’ perspective, and as part of this process there
comes a point where the left-hand image of the family as part of an unpleasant
crowd of zoo patrons is enclosed in a frame, emphasizing how they appear as
an ugly sight from the animals’ point of view.
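Because the options for a [bound] image are simultaneously available rather than alternatives to one another, a bound image can be thought of as a bundle of independent selections. The following Python sketch illustrates this under stated assumptions: the class and field names are hypothetical, the three yes/no fields stand in for the marked options ([ambient margin], [refocalized], [framed]) against their unmarked defaults, and the further options within [framed] are left out.

```python
from dataclasses import dataclass
from enum import Enum


class Extent(Enum):
    """Does the margin enclose the image on all sides, or only partly?"""
    SURROUNDED = "surrounded"
    LIMITED = "limited"


class Edge(Enum):
    """Does the image stay within the margin edge or break through it?"""
    CONTAINED = "contained"
    BREACHING = "breaching"


@dataclass(frozen=True)
class BoundImage:
    """Hypothetical record of the five simultaneous choices for a bound image."""
    extent: Extent
    edge: Edge
    ambient_margin: bool = False  # coloured margin instead of default white
    refocalized: bool = False     # a character is depicted in the margin
    framed: bool = False          # a defined frame surrounds the image

    def features(self):
        """List the selected features in the bracketed notation of the text."""
        labels = [self.extent.value, self.edge.value]
        if self.ambient_margin:
            labels.append("ambient margin")
        if self.refocalized:
            labels.append("refocalized")
        if self.framed:
            labels.append("framed")
        return [f"[bound: {label}]" for label in labels]
```

For instance, an image with a white margin on all four sides that stays inside that margin would be recorded as `BoundImage(Extent.SURROUNDED, Edge.CONTAINED)`, and its `features()` would list only [bound: surrounded] and [bound: contained]. Each field can vary without constraining the others, which is the point of treating the five sets of oppositions as simultaneous.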
Browne’s picture books in fact make considerable and clever use of the [framed]
option. While the frames in Zoo are of the most straightforward kind, explicitly
rendered by a black or coloured line ([demarcated: textual]), in other books
there are images where it is the ideational content of the image that creates
the frame. For example, in Voices in the Park (Browne 1998), there is an image
where the playground apparatus being enjoyed by the children serves as a frame
to the image (the [demarcated: ideational] option), and in My Dad (Browne
2000), the edge of the blackboard on the wall behind Dad-as-teacher creates
an ideational frame within the textual one. In such cases where an ideational
element is used to demarcate the frame, the frame appears to serve as a
symbolic attribute: signifying the playfulness of the children in the first instance
(Voices in the Park) and the authority and knowledgeability of Dad in the second
(My Dad). Finally, another of Browne’s books, Piggybook (1996), illustrates an
additional option that can be taken up. This is the possibility of elaborating the
Image 6.4 The [framed: iconized] option: Initial spread of J. Baker’s (1991)
Window
[Figure 6.3 Choices in balance. Feature labels recoverable from the network: resolved (intermodally, intramodally), unresolved, diagonal, orthogonal, oppose, match, character (face-to-face, back-to-back), setting, vertical, horizontal, iterating, aligned, scattered.]
A centrifocal image can take a number of forms, with the principal contrast
between a [centralized] and [polarized] composition, as shown in Figure 6.4.
[Figure 6.4: [centralized] versus [polarized] composition]
The most straightforward form for a centralized image to take is for the
centre of the page to be filled, usually by a single central character or group,
drawing the gaze to that participant in an unambiguous way. This is the option
of [centralized: full] – a kind of bullseye composition that may be used to
create a moment of stasis in the momentum of the narrative. Balance can
also be created by ranging the participants around the centre of the page in a
circle, [centralized: hollow], a rare choice in narratives, though common for
life-cycle depictions in information books.
Where the centre is filled, the centralized participant may be accompanied
either by a pair of other elements [accompanied: dual] or by an encircling
ring of other participants [accompanied: radial]. These are compositional
layouts noted for other kinds of material by Kress and van Leeuwen (2006), who
refer to them as ‘triptychs’ and ‘centre-margin’ compositions, respectively. Two
different book covers can illustrate these patterns: that of Wild and Vivas’
(1991) Let the Celebrations Begin (about the sense of community possible even
in a concentration camp) shows the narrator accompanied by a companion
on either side, while Lunn and Pignataro’s (2002) Waiting for Mum, about an
overanxious child, shows the protagonist encircled by her worries. Thus, one
cover uses the [accompanied: dual] composition to signify lack of aloneness
as a positive feature, while the other uses the [accompanied: radial] option to
thematize the negative situation of feeling surrounded and besieged on all
sides (see Image 6.5).
The other group of centrifocal images are those taking up the [polarized]
option and balancing different depicted elements around a space. Budding
photographers are advised to place elements of their composition in this way
on a diagonal axis to create a sense of balance without filling the centre (Präkel
2006) (see Figure 6.5 for the points favoured for creating a balanced
composition around a vacant centre (Dondis 1973)).
Where this is done, the polarization is [diagonal] and [resolved]² (i.e.
achieving balance), which is a very common choice for picture-book images, usually
with characters as the polarized pictorial elements. Balance of this kind may
be further enhanced by mutual gaze between the depicted characters, which
strongly guides the reader to view the polarized composition as a cohering unity
(the [+eyeline vector] option in Figure 6.3). Sometimes however, the balance
is only achieved intermodally by opposing a pictorial element against a verbal
text element which participates in the composition ([resolved: intermodal]).
On occasions, of course, it is preferable not to resolve the polarization, but
rather to create an unbalanced effect in order to encourage page-turning or
to foreground narrative complication. While this option is not taken up as
often as might be predicted for narratives, the preschooler text I Went Walking
(Machin & Vivas 1991) provides a repeated series of two somewhat unresolved
images followed by a fully balanced third, matching the wording of ‘I went
walking/ What did you see?/ I saw a [animal] looking at me’. These choices
encourage the novice reader to turn the page to arrive and pause at the
balanced image, helping to create a pattern that can be broken at the climax,
and to pace the text into a series of comparable incidents, introducing the
pre-reader into some very fundamental aspects of literary form.
Polarization in picture-book images occurs not only on a diagonal axis but
also on a vertical or horizontal one [polarized: orthogonal], where a balance
may be created in relation to either the setting or the characters. For example,
polarization of setting may occur where a clump of trees on the left is balanced
against a building on the right or where the image is split into a dark and a light
half, as in Browne’s (1998) Voices in the Park, where the cheerful child sits on a
bench in a sunny, summery setting next to the nervous and cowed child in
a more gloomy setting. The choice of [polarized: opposed] here organizes
the interpersonal ambience and helps the novice reader to read the symbolic
significance of setting, teaching another fundamental aspect of narrative. Less
commonly, two similar (rather than contrasting) elements of the setting – for
example, a pair of beach umbrellas on one of the pages of Possum Magic – may
be balanced against each other in a choice of [polarized: match].
While Kress and van Leeuwen (2006) see left-right polarization as signifying
a Given-New relation, and polarization on the vertical axis as signifying an
Ideal-Real relation, these interpretations were not found to be very convincing
for images in the picture-book narratives. In fact the most frequent kind
of orthogonal polarization is the depiction of two characters on a vertical or
horizontal axis, so as to enable the image composition to organize
interpersonal meanings. Narratives are primarily concerned with interpersonal
relationships between characters and these are readily signalled by the placement
of characters on the page and their orientation to one another. Where the
stance and posture of characters ‘match’ one another, some form of solidarity
and likeness is foregrounded, as in the example from Let the Celebrations Begin,
where two of the camp inmates are shown sitting side by side with similar poses
(see Image 6.6).
If characters are depicted in a [polarized: orthogonal: opposed] composition
on the other hand, whether on a vertical or horizontal axis, the nature of the
interpersonal relation will vary according to their bodily orientation. If the
characters are face-to-face, they are in contact, possibly in dialogue, with
proximity, stance and expression indicating the intimacy and affect of the contact.
On the other hand, if back-to-back with one another, disconnection or conflict
is signalled, as in the central spread of John Burningham’s (1984/1988) Granpa.
Here the separated back-to-back image of child and old man is accompanied by
the snatch of dialogue ‘That was not a very nice thing to say to Granpa’, evoking
with both force and economy the temporary rupture in the familial
relationship, without any need for an intervening narrative voice.
Image 6.6 [polarized: orthogonal: match/character] (Wild & Vivas 1991, Let the
Celebrations Begin)
another. While some pictures are indeed arranged in only one of the basic
idealized layouts described by the network, very many in fact combine different
principles within the one image.
Another example is to be found in Possum Magic (Fox & Vivas 1983) on a
page where the two main characters are on the bottom right of a spread in an
arrangement that is [polarized: diagonal: resolved]. Alongside them a balance
is provided intermodally by several lines of text, but above them there are nine
people sitting on benches with their backs to the reader, ‘extras’ in the scene.
Taken as a whole, this upper group realizes the option [iterated: aligned], but
considered more closely there are, within that, pairs of people in either match-
ing or opposed face-to-face orientations. Examples such as these are relevant to
the question of whether there is a visual equivalent of the linguistic notion of
‘hierarchy of periodicity’, where a verbal text sets up a higher order structure.
A visual text differs from a verbal one in that it does not unfold in time but has
all its levels of organization available to the viewer simultaneously. Rather than
a macro-Theme unfolding hyper-Themes which in turn predict succeeding
Themes, the ‘layers’ of a visual text are all present simultaneously; an image
offers what might be thought of as an ‘array of foci’. Information is visually
packaged so that ‘at first glance’ one particular kind of organization is most
dominant as a general compositional principle, but closer scrutiny is possible,
allowing additional patterns to be attended to.
In organizing a complex composition of this kind, different artists may prefer
different means of training and guiding the viewer’s attention. Vivas in Possum
Magic tends to create salience through the use of size and subtle colour choices
in order to offer a number of potential foci within an overall view. By contrast,
Anthony Browne is an artist who makes heavier use of internal frames to
provide an array of foci. For example, in one image from Gorilla (1983/1992),
the protagonist Hannah is in the centre of the page in a [centralized: accom-
panied: dual] composition, where she stands between two large male father
figures (one a gorilla in coat and hat and the other the father’s outdoor clothes
hanging on a hook). Within this overall composition, a door jamb provides an
internal frame which allows us to notice the gorilla and Hannah as a distinct
pair in a balanced composition of [polarized: diagonal: character: face-to-face].
The six panes of the window set in the door offer further frames for additional
foci of attention though these will not necessarily be attended to at first. Indeed
Browne famously hides visual elements on the page by making them non-salient
on the viewer’s first overall ‘take’ as guided by the organizational balance of
the image, so that they are revealed only on closer scrutiny or subsequent
readings. The possibility of doing this depends on managing the viewer’s atten-
tion in the first place to take in a view which foregrounds certain depicted
elements to create an initial compositional ‘take’.
Where Browne’s images usually offer a clear overall principle of balance ‘at
first glance’, McKee’s (1982) I hate my teddy bear is interesting for its distracting
and somewhat confusing images in which the two child protagonists are rarely
centre stage or made particularly salient in any way. This is in keeping with
the book’s metafictive nature, which frustrates our expectations of a simple
narrative line with main characters, offering instead a myriad of potential, but
incomplete visual stories. Thus, there are on most pages several competing
foci of attention, with little sense of any single overarching compositional
principle to guide the reader. By disturbing our expectations in this way McKee
makes clearer what is going on in the more typical case.
The two systems of meaning that have been presented here, those of
framing and balance, are proposed as sets of semiotic choices within only
one of the three metafunctions into which meaning is organized. The textual
metafunction of language is sometimes described as a ‘derived’ function in
comparison with the ideational and the interpersonal. That is to say, it is brought
into being by the presence of ideational content (talking about something) and
interpersonal meaning (enacting social relations), structuring these meanings
into coherent and cohesive discourse. Similarly, the visual textual metafunction
described here (usually referred to as the compositional metafunction) serves to
organize ideational and interpersonal meanings of picture books, and in-text
interpretation needs to be considered in relation to those other metafunctions
(see Painter 2007 for some account of these in picture books). Indeed, in the
explanation of the meaning potential of the various systems, it has been
necessary to discuss such matters as how relations between characters may
be organized, how readers’ attention may be constrained, how readers may
be positioned in relation to the story world and how dynamism or stasis
may be enabled by compositional choices. The two systems of framing and
balance are not proposed as exhausting the meaning potential of the textual
metafunction, since the various ways that the verbiage may (or may not) be
visually integrated into the image also need to be taken into account, together
with a recognition of the way choices in colour, shape, setting and framing may
contribute to cohesion over the course of a complete narrative. However, the
two systems play a key role in organizing visual meaning within the page or
spread, and while our descriptions have been informed by pioneering work by
Kress and van Leeuwen (2006) on visual grammar, the exploration of children’s
picture books has also indicated the value of focusing on one particular register
of texts for further developing our understandings of visual semiotics.
Notes
1
The terms ‘bound’ and ‘unbound’ were first introduced by Stenglin (2004) as
semiotic resources for analysing interpersonal meaning in 3D space. This chapter
extends their use to textual meanings in 2D visual images.
2
The term ‘resolved’, after Caple (2009), borrows from Gestalt theories of percep-
tion in which perceptual ‘resolution’ or closure is achieved when information is
organized around the balance points shown in Figure 6.5.
142 Semiotic Margins
References
Arnheim, R. (1982). The power of the center: A study of composition in the visual arts.
Berkeley, CA: University of California Press.
Baker, J. (1991). The window. London: Julia MacRae Books.
Browne, A. (1992). Gorilla. London: Walker Books. First published London: Julia
MacRae Books, 1983.
Browne, A. (1994). Zoo. London: Red Fox. First published London: Julia MacRae
Books, 1992.
Browne, A. (1996). Piggybook. London: Walker Books. First published London: Julia
MacRae Books, 1986.
Browne, A. (1998). Voices in the park. New York: DK Publishers.
Browne, A. (2000). My dad. London: Doubleday.
Browne, A. (2004). Into the forest. London: Walker Books.
Burningham, J. (1988). Granpa. London: Puffin Books. First published London:
Jonathan Cape, 1984.
Caple, H. (2009). Playing with words and pictures: Text-image relations and
semiotic interplay in a new genre of western news reportage. PhD Thesis,
University of Sydney, Sydney.
Caswell, B. & Ottley, M. (Illus.) (2003). Hyram and B. Sydney: Hodder Children’s
Books.
Crew, G. & Rogers, G. (Illus.) (1992). Lucy’s bay. Nundah, Qld: Jam Roll Press.
Dondis, D.A. (1973). A primer of visual literacy. Cambridge, MA: MIT Press.
Fox, M. & Vivas, J. (Illus.) (1983). Possum magic. Adelaide: Omnibus Books.
Halliday, M.A.K. (1978). Language as social semiotic. London: Edward Arnold.
Halliday, M.A.K. (1979). Modes of meaning and modes of expression: Types of
grammatical structure and their determination by different semantic functions.
In D.J. Allerton, E. Carney & D. Holdcroft (Eds), Function and context in linguistic
analysis (pp. 57–79). London: Cambridge University Press.
Halliday, M.A.K. & Matthiessen, C.M.I.M. (2004). Introduction to functional grammar
(3rd edn). London: Edward Arnold.
Kress, G. & Van Leeuwen, T. (2006). Reading images (2nd edn). London & New York:
Routledge. First published in 1996.
Lunn, H. & Pignataro, A. (Illus.) (2002). Waiting for mum. Sydney: Scholastic Australia.
Machin, S. & Vivas, J. (Illus.) (1991). I went walking. Norwood: Omnibus Books.
Marsden, J. & Tan, S. (Illus.) (1998). The rabbits. Port Melbourne: Lothian.
Martin, J.R. (1996). Waves of abstraction: Organising exposition. In T. Miller (Ed.),
Functional approaches to written text: Classroom applications (pp. 87–104). Paris:
TESOL France & US Information Service.
McKee, D. (1982). I hate my teddy bear. London: Andersen.
Nodelman, P. (1988). Words about pictures: The narrative art of children’s picture books.
Athens, GA: University of Georgia Press.
Painter, C. (2007). Children’s picture book narratives: Reading sequences of
images. In A. McCabe, M. O’Donnell & R. Whittaker (Eds), Advances in language
and education v.2 (pp. 40–59). London: Continuum.
Painter, C. (2008). The role of colour in children’s picture books: Choices in
ambience. In L. Unsworth (Ed.), New literacies and the English curriculum:
Multimodal perspectives (pp. 89–111). London: Continuum.
Introduction
among the elements. It is intended that such a model will contribute to a richer
understanding of students’ reading of multimodal texts, while offering a sys-
tematic approach to describing inter-semiotic relations in a way that is both
useful and accessible to teachers and test-writers. To test the efficacy of the
model, the framework has been applied to the analysis of data from a project
investigating multimodal reading comprehension in group literacy tests admin-
istered by a state government education authority (Unsworth et al. 2006–2008).
The questions explored in this research relate to how image and verbiage
interact in the test stimulus materials and how students interpret meanings
involving image-text relations.
One of the goals of the project was to develop an account of the kinds of
image-text relations students are likely to encounter in curriculum materials,
tested in the first instance with the data from this study. The modelling of
these relations, while initially derived from theory and research on multimodal
analysis from a social-semiotic perspective, is also very much data-driven and
draws on three sets of data gathered for this project:
1. stimulus texts from the reading comprehension section of the Basic Skills
Tests (BST) for students in primary Years 3 and 5 in 2005 and 2007, and the
English Language and Literacy Assessment (ELLA) for students in Year 7
in 2007 (NSW DET 2005a, 2005b, 2007a, 2007b, 2007c);
2. student results on questions involving images from the literacy (Reading)
component of the BST and ELLA for the state test populations, and post-test
performance on the same subset of items for individual student participants
in the study; and,
3. participants’ verbalizations of their understandings of the images and
texts in the test stimulus materials, and their strategies for responding to
test items related to these texts – these were audio recorded in post-test
interviews.
The first section of this chapter presents an approach to describing the rela-
tionships between printed text and still images, drawing on related work on
modelling image-text relations in social semiotics. An account of image-text
relations that may be applied to the comprehension of multimodal texts is then
explored, examining a framework of relations through the analysis of the data
collected for the study. Examples from the test materials are used to illustrate
inter-semiotic relations in ideational, or representational meaning (Kress &
van Leeuwen 2006), and to explore briefly how this may interact with com-
positional meaning; excerpts from interviews with participants are presented
to highlight how children integrate meanings from image and text, and the
difficulties they may experience with this. In light of the findings from the
project, the chapter then revisits some questions on the nature of image-text
relations and the implications for multimodal reading comprehension and its
assessment.
Figure 7.1 A schematic representation of elements in Telling The Time Using Water
B M-S FRAME
IV TEXT main
V IMAGE [conceptual: analytical]
VI TEXT supplementary
Greek water clock [caption: heading]
1. Water supply [label]
2. Overflow [label]
3. Cogs [label]
4. Water trickles in . . . [caption: explanatory]
5. Float [label]
Figure 7.2 Equivalence in Mapping Islands. From The Earth: Oceans and Sea
by Wendy Blaxland, © Macmillan Education Australia, 2000:27. Reproduced by
permission of Macmillan Education
words shown in speech bubbles come from three speakers. The third character
is the grandma, who is represented indirectly by her projected speech
[frames 3 & 8]. In this way the image and text augment each other in repre-
senting the human participants in the story.
A second type of extension, distribution, refers to juxtaposed images and text
jointly constructing activity sequences. According to Gill (2002), there are two
types of distribution. Intra-process distribution refers to the portrayal by images
and text of different aspects of a shared process. For example, the image might
depict the end result of a process described in the verbal text. This occurs in
the extract from ‘Mr Archimedes’ Bath’ (Allen 1980), where the text states ‘the
water rose’, while the accompanying image shows water overflowing from
the bath (NSW DET 2007a:6).
Inter-process distribution occurs when images fill a gap in the meaning in the
text; image and text complement each other in that activities or processes are
distributed across the two modes. For example, in the Year 5 stimulus (NSW
DET 2005b:6), ‘Two Summers’ (an extract from Heffernan & Blackwood 2003),
the text and images are juxtaposed to jointly construct the events from one
summer to the next (Figure 7.6). The activities of the opening text (in italics),
‘Rick is coming to stay again. It takes him seven hours on the train from the city.
He’s staying for a whole week . . .’, are represented in the words alone [clauses
1–3]. This introduction is followed by the first image, depicting a scene and
some of the activities from last summer’s visit [image I]. The main text below
this image introduces the contrast portrayed in the images between the last
summer, green with plenty of water in the river and dam, and the following
drought-stricken summer in the second image [image II]. While the first
image is elaborated upon by the text that follows it [clauses 4–6], the second
image conveys, through visual representation alone, the effects of the drought
on the landscape. The changes of the second summer may be inferred from
an integrated reading of the text [clauses 7–9] and image.
Divergence was used to describe the third type of extending relation, where the
ideational content of the text is opposed or at variance to that of the image,
or vice versa. This term was also applied to instances where the meanings in
the text and image contradicted each other. An example of divergence can be
found in the extract from Anthony Browne’s (1992) Zoo (NSW DET 2007b:2–3),
where the family’s dialogue about the chocolate on the first page of the extract
is at variance with the pictures depicting the father and the giraffes.
Image 7.1 Enhancement in Eggs. ‘Eggs’. Article from Choice Magazine (Jan/Feb
2001:23). Copyright © Australian Consumers’ Association. Reproduced with permission
from CHOICE Australian Consumers’ Association
Integrating Visual and Verbal Meaning 157
[Figure: system network of image-text relations; caption not recovered. Recoverable labels:
concurrence (co-variate unity) / ELABORATION: homospatiality; equivalence (partial, complete); exposition; exemplification
complementarity (multivariate unity) / EXTENSION: augmentation; distribution (intra-process, inter-process); divergence
ENHANCEMENT: spatial; temporal; causal
PROJECTION: locution; idea]
A representative sample of students who completed the 2005 Basic Skills Tests
(BST) for Year 3 (N = 70) and Year 5 (N = 55) with results in low (L = lower 25%
of test cohort), medium (M = middle 50% of test cohort) and high (H = upper
25% of test cohort) performance bands were interviewed about their under-
standings of the stimulus texts and asked to explain their strategies for answer-
ing the test questions. A structured ‘think-aloud’ protocol was used to elicit
student verbalizations of: (a) their understandings of the texts and images;
(b) whether they thought the information in the words and pictures was
similar and/or different, and in what ways it was similar and/or different;
and (c) the strategies they used to arrive at their answers to the test questions.
The same students were interviewed again following their completion of the
2007 BST for Year 5 students (N = 55) and the 2007 English Language and
Literacy Assessment (ELLA) for Year 7 students (N = 41).
In the excerpts from the interviews below, two female students explain
what they think the test stimulus ‘Zoo’ (NSW DET 2007b:2–3) is about. Student
Mf1 (medium result band) comments on the first picture and infers that
the father in the story is mean and reads the second picture in the context
of the overall activity constructed across the words and pictures of the text.
She adopts the view of the family visiting the zoo to make sense of the text
as a whole and successfully integrates the visual and verbal elements into a
cohesive whole:
Mf1: Well, it’s really about a family going to the zoo and what . . . looking
at the animals . . . it shows two giraffes, but it doesn’t really say
anything about this picture. But this one it . . . I reckon it’s like when
they say he was in one of his moods, and they have two horns, it makes
him look like the devil.
This response contrasts with that of student Lf2 (low result band), who describes
the text quite literally and does not go beyond a close paraphrase of the literal
meanings explicitly stated in the text or represented in the pictures, and so
misses the more implicit relationships constructed across the modes:
Lf2: Okay, it’s about like a mum brought a chocolate and the two kids,
they want to eat the chocolate but the dad’s saying no, you can’t have
it now. And there’s tigers walking at the zoo and in the first picture
like on the clouds dad had horns and the chocolate has been eaten
by dad.
For the stimulus text, ‘Puddles’ (NSW DET 2007b:4), student responses showed
further differences between participants in the high, medium, and low result
groups. In the example below, a low scoring male student (Lm1) gives a literal
description of the text:
Lm1: I think it’s about a man and he wants peace and quiet. He wants to
sit down and drink his tea and read his newspaper. But there’s a little
boy who wants to play with him and take him for a walk. And . . .
oh yeah, he runs away, ‘cause he runs to his mum’s or his house –
grandma’s house. And so then he sits down there and reads his
newspaper and drinks his tea, but then the boy comes and he attaches
a lead to him and wants to take him for a walk.
Hf3: The pictures show like they’re a big part of the story, and they show
the actions that the boy and the grandfather do. They . . . in the first
three pictures, they’re showing that he’s relaxed, and like the picture
. . . the way he’s standing, and things, and . . .
Four test items with a spread in terms of item difficulty were associated with the
‘Puddles’ text (correct answers underlined). Again, some differences can be
noted in the responses from students with results in the different performance
bands with respect to the difficulties they had in answering the questions and
the strategies they used to obtain their answers. To answer question 7 correctly,
(Who says ‘Oh well. He’s such a little dear?’ (a) the boy; (b) children; (c) the man;
(d) Grandma), students needed to read both image and text, which comple-
mented each other through augmentation. For the whole test population, this
item was one of the most difficult in the test (42nd out of 46; 54% of students in
the state answered correctly).
When asked whether she found the question difficult, a low performing
student responded with the following reason:
Lf2: Kind of. Because there’s no person, like they’re not showing you the
person, it’s like there . . . they don’t know who’s saying it.
Participants were also asked to explain how they obtained their answers.
Prompts were used to elicit more elaborated responses where necessary. For
example, in response to question 8, What is the man trying to do?, participant
Lm2 was able to obtain the right answer from the images alone ((a) go for
a walk; (b) watch television; (c) make a cup of tea; (d) read the newspaper).
This was one of the easiest questions in the test (5/46; 94% of all students
answered correctly):
In question 9 (The speech bubble is drawn like this to show the speaker is (a) thinking;
(b) whispering; (c) feeling pain; (d) feeling excited), students needed to
read both image and text to answer correctly. In this instance, text and
image displayed equivalence in meaning. While this item required the integ-
rative reading of text and image, most students (86%) answered correctly,
for example,
Hf3: Because that . . . usually that’s the speech bubble, . . . it shows that it’s
something expressive, and if he was thinking or whispering, it would
be that sort of graphic. And he says ‘ow’ with it, so it’s like he’s feeling
pain.
Lf3: Two.
I: Who are they?
Lf3: The kid and grandpa.
I: How did you get your answer?
Lf3: Looking at the pictures.
The findings from the first stage of the study (summarized in Table 7.3)
indicate that the students in the high reading performance bands effectively
integrated meaning across the visual and verbal modes; used a range of test-taking
strategies in addition to reading comprehension strategies to arrive at
their answers; and could read beyond the literal representations in the text/
images. They also appeared to be more attuned to interactional meanings as
well as compositional meanings, although this was not a focus of the analysis.
Table 7.3 How do students recover meaning from the visual-verbal interface?
[Table body not recovered; column headings: Poor readers, Good readers]
For the low performing students in the study, concurrence (equivalence)
between verbal and visual meaning facilitated comprehension. The reinforce-
ment of linguistic meaning through visual representation appeared to assist the
poorer readers, providing additional cues for making sense of the material.
Where decoding linguistic meaning was unsuccessful or only partially success-
ful, students would rely on the images to support their interpretation. When
decoding visual meaning, students mostly would scan the text to find clues for
interpreting images. However, this strategy was seldom successful for struggling
readers, particularly where an unfamiliar, abstract visual representation was
accompanied by language that was also unfamiliar or grammatically complex.
Text features which appeared to cause difficulty for students with results in the
low to medium performance bands include: technicality and grammatical
abstraction in language, abstraction in images, and image-text relations of
extension (and enhancement).
data, exposition was more difficult than equivalence. These findings have clear
implications for reading comprehension as is indicated by student test results.
Where there is equivalence between text and image, there is maximal corres-
pondence of meaning across modes, each mutually reinforcing the meanings
afforded by the multimodal text. It could be expected then, that image-text
relations of this type are the easiest to comprehend. The performance of
students across the state on items targeting this kind of information supports
this expectation. For example, 81–91% of all students answered items 1 to 4
correctly in the 2005 BST3; items 3 (91%) and 1 (98%) in the 2005 BST5;
items 16 (89%) and 13 (90%) in the 2007 BST3; and items 2 (96%) and 17
(97%) in the 2007 BST5. (The percentages in brackets indicate the proportion
of the test cohort who answered the questions correctly.)
In stimulus material where the meanings in the text and image/s extend
or complement each other, it could be expected that comprehension of the
material would make greater demands on a student’s ability to access and integrate
meanings from across the modes. The questions involving images that
were most difficult in the 2005 and 2007 tests, according to state-wide student
performance, were items: 30 (32%) and 31 (51%) in the 2005 BST3; 28 (44%),
and 37 (56%) in the 2005 BST5; 35 (29%), and 30 (47%) in the 2007 BST3;
and, 32 (59%) and 30 (65%) in the 2007 BST5; all of these items targeted
relationships of augmentation and distribution in comprehending the stimulus
materials. This would suggest that the greater the difference in the meanings
represented across the modes, the greater the level of cognitive demand on
the reader in synthesizing these meanings into a coherent understanding of the
material as a multi-semiotic whole.
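The contrast between the two groups of items can be made concrete with a little arithmetic. The sketch below simply averages the state-wide percentages quoted above for each relation type; the variable and function names are hypothetical, the 2005 BST3 equivalence items are omitted because only a range (81–91%) is reported for them, and the result is a descriptive summary rather than a test of statistical significance.

```python
from statistics import mean

# State-wide percentage correct, copied from the figures quoted in the text.
# Keys are (test, item number). The 2005 BST3 equivalence items are reported
# only as a range (81-91%), so they are left out rather than guessed.
equivalence_items = {
    ("2005 BST5", 3): 91,
    ("2005 BST5", 1): 98,
    ("2007 BST3", 16): 89,
    ("2007 BST3", 13): 90,
    ("2007 BST5", 2): 96,
    ("2007 BST5", 17): 97,
}

# Items reported as targeting augmentation and distribution.
augmentation_distribution_items = {
    ("2005 BST3", 30): 32,
    ("2005 BST3", 31): 51,
    ("2005 BST5", 28): 44,
    ("2005 BST5", 37): 56,
    ("2007 BST3", 35): 29,
    ("2007 BST3", 30): 47,
    ("2007 BST5", 32): 59,
    ("2007 BST5", 30): 65,
}


def mean_percent_correct(items):
    """Average the percentage-correct values for one group of items."""
    return mean(items.values())


equivalence_mean = mean_percent_correct(equivalence_items)
extension_mean = mean_percent_correct(augmentation_distribution_items)
gap = equivalence_mean - extension_mean

print(f"equivalence items: {equivalence_mean:.1f}% correct on average")
print(f"augmentation/distribution items: {extension_mean:.1f}% correct on average")
print(f"difference: {gap:.1f} percentage points")
```

On these quoted figures, the items targeting equivalence were answered correctly far more often than those targeting augmentation and distribution, which is consistent with the expectation stated above.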
In the context of this study, I have restricted the account of image-text relations
to the ideational meanings represented in printed test stimulus materials, focus-
ing only on the affordances targeted by the test items. Even so, the analysis of
this set of data has brought to light some of the complexities encountered in
attempting to model inter-semiotic relations. One of the difficulties in modelling
image-text relations is that we are looking at the interface between typological
meanings in language, which are very often discrete realizations of meaning,
and meanings which are more typically continuous or topological (Lemke 1998).
Where there may be correspondences in ideational material at certain points
(what we have termed concurrence), there are also continuities in visual meaning
that cannot be captured in language. In that sense, any description of image-
text relations at best can connect a generalization (word) with a specific instance
(image) which stands in a relationship of elaboration to that word and vice
versa. When discrete categories are assigned for the description or analysis
of continuous meanings, typologies are inevitably imposed. Furthermore, in
[Figure labels (caption not recovered): augmentation, distribution, complementarity, exposition, concurrence, equivalence]
Notions such as how texts wholly or partially relate to an image are useful
as an initial foray into the way meanings are constructed across modes, but
in defining test constructs that may yield useful diagnostic and pedagogical
information to assist students struggling with reading complex multimodal
texts, a more finely tuned model is required. For such purposes, an operational
model of image-text relations first needs to be able to systematically describe
the different kinds of meaning relations that are constructed intermodally,
taking into account the multiple levels on which different semiotic modes
connect. It also needs to specify how test items might target particular aspects
of these relations that are significant for different levels of text comprehension.
For example, in tests of reading, to what degree are we testing students for
their visual decoding skills, or their understanding of literal meanings as com-
pared with their skills in synthesizing meaning across different modes or
critically reading the representations presented to them? Does the range of
items reflect the range of educational goals we set for students growing up in
a digitally enhanced, visually rich culture, where information is abundant if
not superfluous but at the same time largely under-evaluated? Finally, to be of
value for these purposes, any framework for description must be accessible
and comprehensible to teachers and test-writers. This
initial exploration of the impact that different types of image-text relations
may have on test item difficulty, and by implication on reading comprehension,
suggests that assessment practices need to be carefully considered alongside the
educational goals and contexts in which multimodal texts are engaged.
Acknowledgements
This research was supported under the Australian Research Council’s Linkage
Projects funding scheme (LP0561658).
The author acknowledges the copyright holders for their kind permission to
include the following material in this book:
Figure 7.2 ‘Mapping Islands’ (map and key). From The Earth: Oceans and Sea
by Wendy Blaxland. Copyright © Macmillan Education Australia, 2000.
Reproduced with permission from Macmillan Education Australia.
Figure 7.3 ‘Ten Years of Recycling – The Good, the Bad and the Ugly’.
© Reproduced with permission from NSW Department of Education,
Educational Measurement and School Accountability Directorate.
Figure 7.4 ‘Secret Life’. © Reproduced with permission from Sand Swimmers
by Narelle Oliver, Lothian Children’s Books, 1999, an imprint of Hachette
Livre Australia.
Figure 7.5 ‘Puddles’. From The Puddleman by Raymond Briggs published by
Jonathan Cape/Red Fox. Copyright © The Random House Group, 2004.
Notes
1
In Martinec and Salway (2005:50) ‘image and text are of the same level of
generality’ in exposition in contrast to those that represent a different level
of generality in exemplification. The coupling of these definitions via contrast
between categories was found to be unworkable when applied to our data as many
instances emerged where image provided more specificity than text or vice versa,
such as in Figure 7.3, but not necessarily in an exemplifying relationship.
2
Item difficulty for the BST and ELLA was estimated by the NSW Department
of Education and Training in logits (δ) using Rasch item response
modelling. This model locates student ability and item difficulty on the same scale,
allowing the interpretation of student ability scores in terms of task demands.
Item difficulty is defined probabilistically as the level of ability at which the
probability of success on the item is 0.5.
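The definition in this note can be written out as a worked equation. What follows is a sketch of the standard dichotomous Rasch model, not a statement of the exact specification used for the BST and ELLA. The symbol θ for student ability is an assumption introduced here; δ is the item difficulty in logits mentioned above.

```latex
P(X = 1 \mid \theta, \delta) = \frac{\exp(\theta - \delta)}{1 + \exp(\theta - \delta)},
\qquad
\theta = \delta \;\Rightarrow\; P(X = 1 \mid \theta, \delta) = \frac{e^{0}}{1 + e^{0}} = 0.5
```

Read this way, a student whose ability equals an item's difficulty has an even chance of answering it correctly, and each additional logit of ability above δ raises the odds of success by a factor of e.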
References
ACARA (Australian Curriculum, Assessment and Reporting Authority) (2009).
Shape of the Australian Curriculum: English. Canberra: Commonwealth of
Australia. Available from http://www.acara.edu.au/verve/_resources/Australian_Curriculum_English.pdf.
Allen, P. (1980). Mr. Archimedes’ bath. New York: Angus & Robertson/Harper Collins
Publishers.
Briggs, R. (2004). The Puddleman. London: Random House Group.
Browne, A. (1992). Zoo. London: Red Fox.
Djonov, E. (2005). Analysing the organisation of information in websites: From
hypermedia design to systemic functional hypermedia discourse analysis. PhD
Thesis, University of New South Wales, Sydney.
Gill, T. (2002). Visual and verbal playmates: An exploration of visual and
verbal modalities in children’s picture books. Unpublished BA (Hons) Thesis,
University of Sydney.
Halliday, M.A.K. (1994). An introduction to functional grammar (2nd edn). London:
Edward Arnold.
Halliday, M.A.K. (2004). An introduction to functional grammar (3rd edn). Revised
by C.M.I.M. Matthiessen. London: Edward Arnold.
Halliday, M.A.K. & Hasan, R. (1985). Language, context, and text: Aspects of language
in a social-semiotic perspective. Geelong: Deakin University Press.
Heffernan, J. & Blackwood, F. (2003). Two summers. Lindfield: Scholastic.
Kress, G. & van Leeuwen, T. (2006). Reading images: A grammar of visual design
(2nd edn). London: Routledge.
Lemke, J. (1998). Multiplying meaning: Visual and verbal semiotics in scientific
text. In J.R. Martin & R. Veel (Eds), Reading science: Critical and functional
perspectives on discourses of science (pp. 87–113). London: Routledge.
Lemke, J. (2006). Towards critical multimedia literacy: Technology, research, and
politics. In M. McKenna, D. Reinking, L. Labbo & R. Kieffer (Eds), International
Handbook of Literacy & Technology, volume 2.0. (pp. 3–14). Mahwah, NJ: Lawrence
Erlbaum Associates.
Lim, V.F. (2004). Developing an integrative multi-semiotic model. In K. O’Halloran
(Ed.), Multimodal discourse analysis: Systemic-functional perspectives (pp. 220–246).
London and New York: Continuum.
Martinec, R. & Salway, A. (2005). A system for image-text relations in new (and old)
media. Visual Communication, 4(3), 337–371.
Matthiessen, C.M.I.M. (2007). The multimodal page: A systemic functional
exploration. In T.D. Royce & W.L. Bowcher (Eds), New directions in the analysis of
multimodal discourse. Mahwah, NJ: Lawrence Erlbaum Associates.
McCloud, S. (1994). Understanding comics: The invisible art. New York: Harper Collins.
MCEETYA (Ministerial Council on Education, Employment, Training and Youth
Affairs) (2007). National Assessment Program Literacy and Numeracy (NAPLAN).
Reading sample questions. Accessed on 18 Feb 2008 from www.naplan.edu.au/
test_samples/test_samples.html.
NSW DET (New South Wales Department of Education and Training) (2005a).
Basic Skills Tests. Water, Year 3 BST 2005 (Stimulus material).
NSW DET (New South Wales Department of Education and Training) (2005b).
Basic Skills Tests. Water, Year 5 BST 2005 (Stimulus material).
NSW DET (New South Wales Department of Education and Training) (2007a).
Basic Skills Tests. Puzzles and problems, Year 3 BST 2007 (Stimulus material).
NSW DET (New South Wales Department of Education and Training) (2007b).
Basic Skills Tests. Puzzles and problems, Year 5 BST 2007 (Stimulus material).
NSW DET (New South Wales Department of Education and Training) (2007c).
English Language and Literacy Assessment. Places and possibilities, ELLA 2007
(Stimulus material).
OECD (Organisation for Economic Co-operation and Development) (2006).
Assessing scientific, reading and mathematical literacy: A framework for PISA
[Electronic Version]. Retrieved 02.03.2007 from www.oecd.org.
O’Halloran, K.L. (Ed.). (2004). Multimodal discourse analysis: Systemic-functional
perspectives. London and New York: Continuum.
Royce, T. (2007). Intersemiotic complementarity: A framework for multimodal
discourse analysis. In T. Royce & W. Bowcher (Eds), New directions in the analysis
of multimodal discourse (pp. 63–109). Mahwah, NJ and London: Lawrence Erlbaum
Associates.
Royce, T.D. & Bowcher, W.L. (Eds). (2007). New directions in the analysis of multimodal
discourse. Mahwah, NJ: Lawrence Erlbaum Associates.
Unsworth, L. (2006). Towards a metalanguage for multiliteracies education:
Describing the meaning-making resources of language-image interaction.
English Teaching: Practice and Critique, 5(1), 55–76.
Integrating Visual and Verbal Meaning 167
I once heard the jazz bassist and composer Marcus Miller explain how he composed
the score for the film Siesta, in 1987, laying a bass line first, then using a
synthesizer to build up the percussion, layer by layer. At the end of that process,
he realized there was something missing. The rhythm was all too mechanical.
So he engaged a drummer to play a single drum in the studio, on top of the
tracks he had already laid. What next, he then asked himself. I like Herbie
Hancock’s chords, I’ll put some of those in. It was at this point that I had a
revelation. I had always seen harmony as the language of Western music, and
harmonic structure as its basic source of textual development, whether in
Beethoven, Broadway or the Beatles. But to Marcus Miller chords were just
some added spicing, some added colour. It dawned on me that in multimodal
texts any semiotic mode can in principle either provide the basic structure or
remain incidental, fragmented, providing, here and there, some added colour.
Language is no exception. In transcriptions of intonation and conversation
(and today also in email messages), (spoken) language provides the basic structure
and other elements are added as diacritics or indications-in-the-margin,
providing salience, or emotive overtone, or a deictic connection, as can be seen
in Figures 8.1 and 8.2.
Figure 8.1 Intonation transcription (Crystal 1969:179). Loudness and tempo are
indicated in the margin. Pitch is indicated in the margin as well as by arrows and
grave and acute accents. Stress marks have different levels
Rhythm and Multimodal Semiosis 169
Figure 8.2 Transcript of excerpt from the Rodney King trial (from Goodwin
2001:176). The pointing finger indicates that Sgt Duke is pointing at the relevant
body part on the screen
|[but/where will I/ FIND you//] [I've / got to/pick up my/ BAGS now//|
|oh/ yes they could/ easily/ check through the/ last/ CAS es//|]
larger narrative moves. Note the increase in tempo and tension at the start of
the second of these units, where Eve says ‘Wait a minute’. Elements other than
speech – the edits of the film, the gestures of Thornhill and Eve – find their
place within the temporal order of the speech rhythm. The cuts (indicated by
a vertical line across all the rows) coincide with stressed syllables, the gestures
with the boundaries between rhythmic phrases. Even when there is no speech,
towards the end of the excerpt, the timing of the cuts still follows the rhythm
initiated by the preceding speech.
Rhythm frames and delineates the communicative moves of the unfolding
text, here the moves of the narrative. The excerpt immediately precedes the
famous scene in which Thornhill (Cary Grant) is attacked by a cropduster
plane. Eve Kendall (Eva Marie Saint) has just told Thornhill when and where to
meet a mysterious man called Kaplan. What Eve knows, and what Thornhill
does not know, is that the meeting is a trap and that Thornhill will be attacked.
After some perfunctory lines of dialogue, during which the audience is left
to wonder whether Eve will intervene, there is a change of pace. Tension rises.
At the last minute Eve seems to have second thoughts. ‘Wait a minute’, she says,
‘Please’. A tense silence hangs between them. But the moment passes, and
Thornhill leaves to board his train.
Figure 8.4 analyses a scene from Marcel Carné’s Hotel du Nord. Here the
structure is carried by the rhythm of the actor’s movements. Jean (Jean-Pierre
Aumont) and his girlfriend Renée (Annabella) have made a suicide pact and
locked themselves in a hotel room. Jean has shot Renée but as he points the
gun at himself there is a knock on the door. He escapes the hotel room via the
balcony and is then seen walking along badly lit, gloomy streets, in deep despair.
He stops on a railway bridge, obviously intending to commit suicide by throwing
himself in front of a train. Just as he has climbed over the railing, and as
an approaching train has nearly reached the bridge, a cart drawn by a white
horse passes through frame, close to the camera, obscuring Jean from view.
When the steam from the locomotive has cleared, we discover that Jean has not
jumped. He climbs over the railing and walks back in the direction he came
from to give himself up.
In this excerpt the rhythm is carried, not by speech, but by Jean’s actions. The
first rhythmic phrase leaves the audience in uncertainty as to what he will do
next and ends when a prostitute grabs his arm, speaking the only line of dialogue
in the scene. At this point the audience will wonder whether the prostitute
is going to play a role in the subsequent events. But no, Jean walks on, and
as he stops on the bridge, with a train approaching, the possibility of suicide can
be envisaged. The next larger rhythm unit is carried by the rhythm of Jean’s
deliberate movements as he is getting ready to jump. As the horse-drawn
cart passes, the rhythm of his movements, now no longer visible, can still be felt.
The clock continues to tick. At the tenth measure, well after we might have
expected something new to happen, we hear the train’s whistle, and exactly
at the moment of the twelfth measure, we cut to a frontal view, revealing that
Jean has not jumped.
In the scene from North by Northwest the edits and gestures were coordinated
with the rhythm of the speech. Here the camera movements, the edits and the
sounds, including the line of dialogue, are aligned to the rhythm of Jean’s
actions. And just as tempo and tension increase in the middle of the North by
Northwest excerpt, so here, too, the tempo becomes tighter and tenser as Jean
begins to climb over the railing of the bridge.
Figure 8.5, finally, shows a brief scene from an anonymous travel documentary
called Latin American Rhapsody. The shots of mothers and babies have
neither continuity of action, nor continuity of commentary or dialogue. It is
the musical rhythm that provides cohesion here – edits and gestures are
aligned to the musical accents and the boundaries of musical phrases, underlining
the expository structure of the short scene, which forms a mini catalogue
of ethnic variety in Latin America.
Figure 8.4 Rhythmic analysis of a scene from Hotel du Nord (Marcel Carné 1938)
In sum, either music or speech or action can provide the rhythm that carries
the narrative and expository development of texts of this kind. Of course, it may
be that two semiotic modes join in carrying the rhythm, as in dance, or that two
rhythms are in some kind of polyrhythmic relation (cf. van Leeuwen 1999), but
the general point stands: language, action and music can all be either ‘para’,
‘marginal’, or central. It would be worthwhile to study such crossmodal rhythmic
relationships not just in film (although film provides convenient examples),
but also in everyday interactions, a promising strand of research (see, for
example, Hall 1983) which was abandoned when the tape recorder replaced
the 16 mm film camera as the primary research tool in the late 1960s.
In spatially ordered texts, too, cohesion, structure and identity do not just
come from language. Densely printed pages are normally read from left to
right and from top to bottom, but so are many comic strips. In comic strips,
language may be ‘para-visual’, consisting of little more than occasional verbal
gestures (AKA Comics 2004:18):
Here, too, the language may be restricted to nouns and nominal groups which,
on their own, without the visual structure, would make no sense. Here is a home
page, without the boxes, the colours, the columns, the fonts, the
bullet points (from Lupton 2004:161):
Documents which only yesterday would have taken the form of discursive
reports are now often prepared on Excel sheets originally designed for figures.
For ‘personal action plans’, for instance, a template may be provided with
columns for ‘action’, ‘person responsible’, ‘purpose’, ‘timeline’ and so on.
Elements of this kind used to be connected through the grammar of clauses. If
I am the person responsible, and writing is my action, I write ‘I write’, and not:
Now such elements are more and more often connected by the grammar of the
diagram, the grid or the network. Martinec and I (Martinec & van Leeuwen
2008) have described a number of such diagrammatic ways of arranging
information, showing how they underlie the structure of contemporary multimodal
texts and websites. In such contexts, words and pictures become interchangeable.
I could also ‘write’ this multimodal ‘clause’ as in Figure 8.6.
Here is another of my favourite examples: a single-page magazine
advertisement for Sheba cat food, which has just four words, ‘Spoilt, spoilt,
spoilt, spoilt’ (see Image 8.1). Analysing its language alone makes little sense.
But together with the pictures these four words begin to make sense as an
almost rebus-like sentence – something like ‘This fluffy kitten is spoilt four
times over, once by each variety of Sheba cat food’. And the cohesion between
the disparate elements of this multimodal ‘clause’ is predominantly visual –
cohesion of colour (the yellow of the cat’s eyes is repeated in the tins of cat food
and the grey of the text coheres with the grey of the cat’s fur) and cohesion of
line and texture (both the outline of the kitten and the outline of the letters
are soft and flowing).
The question to ask is not, or no longer: What is the relation between
language and action, language and image, image and music, language and
music, and so on, as if they could adequately communicate on their own, or
as if some generalized statement about their central or marginal role in
multimodal texts could be made. Yes, in the past, image and caption, text and
illustration, were relatively distinct, and the performance of spoken words did
not count for as much as the words themselves. Today this is changing. Modes
can become so utterly intertwined with one another that they no longer make
sense on their own. Scholars exploring these issues, like the contributors to
this volume, may at present still feel they are in the semiotic margins, but they
will not be so for long, and their work deserves a place in the centre.
References
AKA Comics (2004). Sword of Majido. Cairo: AKA Comics.
Crystal, D. (1969). Prosodic systems and intonation in English. Cambridge: Cambridge
University Press.
Goodwin, C. (2001). Practices of seeing visual analysis: An ethnomethodological
approach. In C. Jewitt & T. van Leeuwen (Eds), Handbook of visual analysis
(pp. 175–182). London: Sage.
Hall, E.T. (1983). The dance of life – The other dimension of time. New York: Anchor
Press.
Lupton, E. (2004). Thinking with type – A critical guide for designers, writers, editors &
students. New York: Princeton Architectural Press.
Martinec, R. & van Leeuwen, T. (2008). The language of new media design. London:
Routledge.
van Leeuwen, T. (1999). Speech, music, sound. London: Macmillan.
van Leeuwen, T. (2005). Introducing social semiotics. London: Routledge.
Chapter 9
Introduction
In a dialogue, face-to-face, two persons fill the space between with expressions
of emotion. They are linked by many threads of contact between senses
and movements. Each emotion is a test or judgement in that space between
selves in the eyes of each other, a vibration in the threads. Eyes make a reciprocal
link, each person’s regard both signalling interest, or disinterest . . . But
the voice carries a more intimate message of rhythms and tones, and the
hands are active in gesturing the impulses of intention and memory, often
referring in explicit mimetic ways to absent places and events, and to hopes
and fears of protagonists in the spoken narrative . . . By the way all these
parts of the body move in concert, the traffic of thoughts and feelings in
one’s mind are offered to, and crave response from, the sensibility of the
other. (2005:104)
For most of us, this is the lived experience of everyday social interaction, but
for a few of us there is an additional parallel universe of interaction mediated
by written texts, disembodied from the direct relationship between speaking
people, and the actual times and places in which we speak. Yet as intangible
as the written world may be, it can be as real and meaningful for writers and
readers as the spoken world of interacting people, things and events. I am not
thinking here merely of losing oneself in the plot of an absorbing novel, but
of scholars exploring new fields of knowledge, or excavating old ones, making
discoveries and recharting the borders of their disciplines, all through the
virtual world of the written word.
the roles of language in communication. The analysis here assumes the strati-
fied model of language in social context described by Martin and Rose (2007a,
2008). At the level of register, the tenor of relations between speakers may
be equal or unequal, close or distant, and fields of activity may be everyday,
specialized, technical or institutional. The roles of language are to simultaneously
enact the tenor of relationships and construe these fields of experience.
These dimensions of register are coordinated at the higher level of genre,
the types of text-in-context that are recognizable in a culture, from stories
to arguments to casual conversation, each of which may vary in its tenor, field
and mode.
As language both enacts relations and construes fields, its mode varies in two
dimensions: in terms of field, from texts that accompany activity (language-in-action)
to texts that constitute their own field (language-as-reflection), and
in terms of tenor, from spoken dialogue to written monologue. Values along
these two dimensions are independently variable, for instance one can talk like
a book, or write speech down. But taking both together, at the most dialogic
end of language-in-action are direct interactions between people, the mode
in which children first acquire language. Further along the continua are oral
stories, in which speakers reconstruct past experience in face-to-face contact
with listeners. More remote again are written texts that construct new fields,
from literary fiction to academic theory. A focus of this chapter is on what
happens to the relationship between interactants in this progression, from
interacting directly with people, through interacting with oral stories, to interacting
with books. These variations in mode are modelled in Figure 9.1, and
illustrated in Tables 9.1–9.3.
[Figure 9.1: axes of tenor orientation (dialogic to monologic) and field orientation (action to reflection), locating interacting with people, interacting with stories and interacting with books]
setting: This is a Dreaming story (tjukurpa), it is said. The people were living
  in this land. In all the land, it’s said, lived the people.
Complication
problem 1: And they, those people, had useless fire, with black firesticks (i.e.
  useless for igniting a fire). With black firesticks it’s said they were living.
problem 2: Look, they were unable it’s said to obtain fire. It was like perpetual
  night, like living in darkness, in the dark night, and those people were living
  in ignorance.
problem 3: And it’s said one man, Kipara (plains bustard), was living with fire
  with good firesticks. So in numerous places men were thinking of this one man,
  of getting that fire from him.
problem 4: And they were unable to get it, as they followed him and followed him
  continuously, snatching at the fire. All those men were unable to snatch the
  fire from him.
of the story, but are all expressed by the sensory contacts between storyteller
and listeners that Trevarthen describes – eyes widening and narrowing, hands
gesturing intention and direction, the voice intimating pity, frustration, fear,
relief, joy. Furthermore, the children listening are familiar with the protagonists,
crows who crouch in darkness and snatch at scraps, the bustard who walks
in long strides with his beak in the air, and the black falcon who hovers high
above grassfires and dives into them after prey.
Where attention was directed in Table 9.1 by exophoric references to the
context, in the story listeners’ attention is directed by anaphoric references to
previous mentions in the text (underlined), and by the textual organization of
its clauses. For example, each shift from one phase to the next is signalled to
the listener by marked starting points in a clause. The first problem is signalled
by iterating identities And they, those people, the next problem by a series of
circumstances Like perpetual night, in darkness, in the dark night, the third problem
by iterating an identity And it’s said one man, Kipara, the fourth problem by
a series of circumstances At another place, at the sea, and the solution by again
iterating an identity And Warutjulyalpai, that bird Warutjulyalpai.
In sum, the resources that storytellers draw on to engage listeners include
(at least):
setting: Molly and Gracie finished their breakfast and decided to take all their
  dirty clothes and wash them in the soak further down the river. They returned
  to the camp looking clean and refreshed and joined the rest of the family in
  the shade for lunch of tinned corned beef damper and tea.
Complication
problem 1: The family had just finished eating when all the camp dogs began
  barking, making a terrible din. ‘Shut up,’ yelled their owners, throwing stones
  at them. The dogs whined and skulked away.
description: Then all eyes turned to the cause of the commotion. A tall, rugged
  white man stood on the bank above them. He could easily have been mistaken for
  a pastoralist or a grazier with his tanned complexion except that he was
  wearing khaki clothing.
reaction: Fear and anxiety swept over them when they realized that the fateful
  day they had been dreading had come at last. They always knew that it would
  only be a matter of time before the government would track them down.
problem 2: When Constable Riggs, Protector of Aborigines, finally spoke his voice
  was full of authority and purpose.
  They knew without a doubt that he was the one who took children in broad
  daylight – not like the evil spirits who came into their camps at night.
  ‘I’ve come to take Molly, Gracie and Daisy, the three half-caste girls, with
  me to Moore River Native Settlement,’ he informed the family.
Reaction
reaction: The old man nodded to show that he understood what Riggs was saying.
  The rest of the family just hung their heads, refusing to face the man who was
  taking their daughters away from them. Silent tears welled in their eyes and
  trickled down their cheeks.
reaction: Molly and Gracie sat silently on the horse, tears streaming down their
  cheeks as Constable Riggs turned the big bay stallion and led the way back to
  the depot.
reaction: A high-pitched wail broke out. The cries of agonized mothers and the
  women, and the deep sobs of grandfathers, uncles and cousins filled the air.
  Molly and Gracie looked back just once before they disappeared through the
  river gums. Behind them, those remaining in the camp found sharp objects and
  gashed themselves and inflicted deep wounds to their heads and bodies as an
  expression of their sorrow.
reaction: The two frightened and miserable girls began to cry, silently at first,
  then uncontrollably; their grief made worse by the lamentations of their loved
  ones and the visions of them sitting on the ground in their camp letting their
  tears mix with the red blood that flowed from the cuts on their heads.
comment: This reaction to their children’s abduction showed that the family were
  now in mourning. They were grieving for their abducted children and their
  relief would come only when the tears ceased to fall, and that will be a long
  time yet.
Learning to Interact with Books 185
After presenting the protagonists, the author introduces tension with the dogs
barking, pauses to describe the antagonist who caused it, and intensifies it with
the family’s feelings towards him, then worsens the problem with his brutal
announcement, followed by a series of climaxing reactions, which are then
explained to the reader. As in the oral Pitjantjatjara story, shifts from phase to
phase are signalled by their starting points, the first description with Then all
eyes turned, the reaction by Fear and anxiety, the next problem by an iterated
identity When Constable Riggs, Protector of Aborigines, and the series of reactions by
shifts from one identity to another The old man, Molly and Gracie, A high-pitched
wail, The two frightened and miserable girls.
So written stories can deploy the same resources of generic stages and
phases as oral stories do for enacting empathy and antipathy, apprehension
and commiseration, tension and relief. But in the absence of sensory contact
with storytellers and familiarity with the field of a story, the events are expanded
instead with far more diverse descriptive lexis and appraisals (including metaphors),
as well as with grammatical expansions. In this extract, descriptive lexis
and appraisals comprise a full third of the total words. The immediate sensory
exchange between speakers in a dialogue, and between storyteller and listeners
in an oral story, has been replaced by words alone. Instead of a living, feeling,
speaking, gesturing person, the reader now interacts with words on the pages
of a book.
Follow the Rabbit-Proof Fence is a novel written for adult readers, but the capacity
for being absorbed by its events, characters, scenes, feelings and judgements
begins for most readers in early childhood, particularly with parent-child
reading in the home. How do young children learn to do without
the direct expressions of interpersonal relations in spoken interactions, and
instead engage on their own with emotions expressed by written words? The
answer, of course, is that reading most often begins not as a solitary activity,
but as a medium for the sharing of emotion and attention between adult
and child.
In this respect learning to read is no different from learning to speak.
Careful observers consistently foreground the sharing of emotion and atten-
tion in early childhood learning. Painter (2003) shows how language begins in
infancy, not with experiential categorizations, but with affective appraisals of
perceptions that are shared with caregivers. Halliday (1993) describes how each
new breakthrough in language learning occurs in the context of emotionally
charged events. Trevarthen (2005) describes how communication between
child and adult begins immediately after birth with the exchange of emotion.
Parent-child reading works with this same repertoire of emotion and attention,
to engage young children in the act of reading as a meaningful activity, that
is, to learn to interact with a book as a partner in communication. How this
engagement with books develops is illustrated in the following interaction
between a mother and her 18-month old child (from McGee 1998:163), around
The Three Little Pigs (Kellogg 1997), with relevant pages shown in Images 9.2
and 9.3. The extract includes three cycles of interaction, over four pages of
the book.
As with Table 9.1, each move is labelled as K1, K2, A1 or A2. Non-verbal
moves are further distinguished as ‘nv’. For example, the first move in the
exchange is the child bringing and opening the book. This is labelled A2nv,
as she is implicitly demanding her mother read it.
In addition, the purpose of each move is labelled in two steps. First, each
interaction cycle consists of four types of phases. In one phase, the mother
prepares the child to recognize a feature of the text; in the second the child
identifies a text feature; in the third the mother evaluates her response; in the
fourth she may elaborate with more information.
Within each of these phases, the purpose of each move is further specified.
For example, in the first interaction cycle, the mother draws the child’s attention
to an image in the book by pointing at it. This move is labelled as A2nv,
as the mother is implicitly demanding the child pay attention to the image.
She then names the image, and this move is labelled as K1, as she is giving
information.
In the Prepare phase, the mother draws attention to the story’s main characters
by pointing (A2nv) and names them (K1). The child is too young to recognize
the significance of the characters, but interprets the mother’s move as preparing
her to likewise point and name. The Identify phase thus involves her
pointing at the background images of trees (A1) and naming them (K2).
Significantly the child does not simply imitate her mother, but responds with
her own innovation on pointing and naming. Her motivation for doing so is
apparent as she looks to her mother to affirm her effort. This move is labelled
K2nv as she is asking for evaluation, so that the mother’s affirmation is K1. The
Evaluation thus involves both these moves, apparently initiated by the child.
The positive emotion induced by success and affirmation expands the child’s
potential for learning something more. Elaboration phases capitalize on this
positive emotion, and on the learner’s attention to what has just been identified.
Although the child has not recognized the significance of the characters
here, the mother capitalizes on her attention, by repeating what she had said,
with correct pronunciation in a full sentence. The child has thus received
a micro-lesson in grammar and articulation, at the moment when she is affectively
and cognitively most likely to retain it.
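The labelling of this first cycle can be set out as a small data structure. The Python sketch below is only an illustration and not part of the original analysis: the names Move, label, phases_in_order and cycle_one are hypothetical, the move descriptions paraphrase the account above, and the exchange label of the elaborating move is left open because it is not stated here.

```python
from dataclasses import dataclass
from typing import Optional

# The four phases of an interaction cycle, in the order described above.
PHASES = ("Prepare", "Identify", "Evaluate", "Elaborate")


@dataclass(frozen=True)
class Move:
    """One move in the exchange (hypothetical structure, for illustration)."""
    speaker: str
    action: str
    phase: str
    role: Optional[str]      # "K1", "K2", "A1" or "A2"; None if not stated
    nonverbal: bool = False  # True adds the 'nv' suffix to the label

    def label(self) -> str:
        if self.role is None:
            return "?"
        return self.role + ("nv" if self.nonverbal else "")


cycle_one = [
    Move("mother", "points at the main characters", "Prepare", "A2", nonverbal=True),
    Move("mother", "names the characters", "Prepare", "K1"),
    # The child's pointing is given as A1 in the account above, without 'nv'.
    Move("child", "points at the trees", "Identify", "A1"),
    Move("child", "names the trees", "Identify", "K2"),
    Move("child", "looks to her mother", "Evaluate", "K2", nonverbal=True),
    Move("mother", "affirms", "Evaluate", "K1"),
    # Label not stated in the account above, so it is left open here.
    Move("mother", "repeats the naming in a full sentence", "Elaborate", None),
]


def phases_in_order(moves):
    """Return the distinct phases in the order in which they first occur."""
    seen = []
    for move in moves:
        if move.phase not in seen:
            seen.append(move.phase)
    return seen


for move in cycle_one:
    print(f"{move.phase:<10} {move.label():<5} {move.speaker}: {move.action}")
```

Keeping the phase and the exchange label as separate fields mirrors the two-step labelling: one field records the purpose of the move within the cycle, the other its role in the exchange.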
The child then innovates again by turning the page and identifying another
tree, and asks and receives another affirmation. However, the mother does not
elaborate this time, but takes advantage of her attention to initiate a second
cycle, drawing her attention to the characters, and elaborating on their actions.
In this second cycle, the learning goal progresses from identifying characters
to engaging the child’s empathy with their activities, and expectancy of events
to come. Again the mother prepares by pointing and naming the characters,
but then elaborates their activities in words and a gesture ‘Bye bye mama [waves
her hand]. We’re going to build a house’. These are not the words in the text,
rather the images are re-interpreted in terms she knows the child will recognize
from her own experience.
The child can thus see herself reflected in the characters, in their activities
and their relationship with their mother. This identification with the protagonists
is the seed of empathy. Accordingly, the child laughs in recognition,
repeating the waving gesture. Her identification also engages her interest in
the characters’ intentions, and so in the events to come, so that she turns the
page to see what happens next.
In the third cycle the learning focus progresses explicitly to feelings of empathy
and antipathy. This time the mother directs attention to both the image by
pointing, and her own facial expression with I see that wolf. She evaluates the
image with the apprehensive Oh oh, interpreting the pig’s facial expression with
her own, modelling the reader’s empathy with the protagonist, and the antipathy
to the antagonist. The child thus recognizes both the emotion and
expectancy inherent in the apprehension, and responds by turning the page,
and pointing to the next picture of the wolf and repeating Oh oh, which the
mother affirms by repeating Oh oh herself.
In the fourth cycle the mother reads the words on the page for the first time.
She prepares the child to recognize their relation to the image by blowing
on her, imitating the wolf in the image. Recognizing the wolf’s behaviour in
both words and image then provides a context for elaborating with a moral
judgement Very bad, isn’t he?
Here are the core elements to be found in any learning interaction:
the teacher directs attention, or follows the learner’s attention, and models a
behaviour, the learner applies the model, the teacher evaluates, and may then
capitalize on the learner’s success and positive feelings, by elaborating with
more information. The teacher is almost always the primary knower, with the
authority to evaluate the learner’s responses, as well as providing information,
as we saw in Table 9.1. The learner is by definition the secondary knower, the
beneficiary of the information provided, whose own offerings are evaluated
by the teacher/parent.
We have described these patterns as scaffolding interaction cycles (Rose
2004, 2007). In the parent-child reading genre they appear to consistently
include the four phases, Prepare, Identify, Evaluate and Elaborate, diagrammed
in Figure 9.2.
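One way to make the sequence of phases concrete is as a simple check on the order of phase labels. The Python sketch below assumes, on the basis of the description above, that Prepare, Identify and Evaluate occur once each in that order and that Elaborate is an optional final phase; the function name is_scaffolding_cycle is hypothetical, and the check is an illustration rather than an analytical procedure used in this chapter.

```python
# Phases assumed to occur in every cycle, in this order.
REQUIRED = ["Prepare", "Identify", "Evaluate"]
# Phase assumed to be optional, and to come last when present.
OPTIONAL_LAST = "Elaborate"


def is_scaffolding_cycle(phases):
    """True if phases run Prepare, Identify, Evaluate, optionally plus Elaborate."""
    phases = list(phases)
    return phases == REQUIRED or phases == REQUIRED + [OPTIONAL_LAST]


print(is_scaffolding_cycle(["Prepare", "Identify", "Evaluate", "Elaborate"]))  # True
print(is_scaffolding_cycle(["Prepare", "Identify", "Evaluate"]))               # True
print(is_scaffolding_cycle(["Identify", "Evaluate"]))                          # False
```

A check of this kind would need loosening for exchanges in which the learner initiates a further Identify and Evaluate before the next cycle begins.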
In this brief excerpt, the child’s attention has been drawn to features that
identify main characters, engage readers in their activities, expect sequences of
events, enact emotional reactions, and judge their behaviour. The continual
affirmations serve to engage the child in the activity of story reading, rewarding
her for responding to the mother’s preparing moves. But the affirmations also
function to give intense positive value to the meanings that the mother presents
and the child repeats. Each exchange of value-laden meanings then enhances
the child’s capacity for understanding a further elaboration, which the mother
usually takes advantage of.
The mother carefully and deliberately interprets the meanings in the book
for the child. She adjusts, translates and reduces the meanings expressed by
words and images in the book, down to the level of spoken language she knows
the child will understand. This includes making implicit meanings explicit,
which must be inferred by readers from the co-text, or interpreted from their
own experience and values. So in order to make the text’s field accessible to
the child, the mother commits fewer wordings than are presented in the text,
but commits more meanings that are implicit in the text. In Bernstein’s theory
of pedagogic classification (1990, 1996), the boundary between the child’s
oral experience and the written discourse of the book is weakened in each
preparation move. But once the child understands each meaning in her own
terms, the boundary is then strengthened in elaboration phases, to extend her
understanding of the esoteric field of the book.
Over weeks, this book will be read again and again. Each time the book is
read, the new meanings presented in elaborations become shared meanings;
these then become the basis for preparing more new meanings until the child
is thoroughly familiar with both the book’s words and its semantic patterns.
These patterns will then be identified and further elaborated in the next book.
Over months and years the complexity of reading books increases, that is, their
mode becomes more highly written. The long-term instructional sequence,
through which the child’s repertoire is steadily expanded, is thus shaped by
the system of written language, the reservoir of meanings she encounters in
children’s literature. At the same time, the child will tacitly acquire a general
orientation towards recognizing, interrogating and interpreting patterns of
meaning in written texts. This is the semantic orientation that generates and is
fed by the play of layered meanings in literature, the literary ‘gaze’ that distin-
guishes members of the middle class’ inner circles. Furthermore, the child is
building an orientation to interacting about these meanings with her parents,
or talk-around-text. When she gets to school, the child will be ready to apply
these orientations to texts and talk-around-text, and so display an aptitude for
school learning that will win her constant praise from her teacher, which will
in turn enhance her capacity for further learning, and so on, into the bright
future of a successful student.
The elements of learning that we have identified to this point constitute
what we shall call the pedagogic genre, including 4 dimensions:
learning activities: doing/studying
instructional field: skills/knowledge
social relations: inclusive/exclusive, success/failure
modalities: visual, manual, spoken, written
Reading to Learn
These lessons from parent-child reading are applied in the literacy pedagogy,
Reading to Learn (Martin 2006, Martin & Rose 2005, 2007b, Rose 2004, 2007,
2008, www.readingtolearn.com.au), together with an explicit metalanguage
designed from genre and register theory and discourse analysis (Martin & Rose
2007a, 2008). The sequence of the pedagogy is informed by this model of
language-in-context, ordering the complex task of reading and writing in
manageable steps, from patterns in the context, to the text, to its sentences
and words, enabling all learners to succeed with each component in turn.
The first step prepares learners for following a text as it is read aloud, using
spoken, visual and manual modalities to explore the text’s field, depending
on the nature of the text and the needs of students. In early years classes, for
example, the teacher may talk through a picture book with children, using
discussion around the pictures, similar to that above in Table 9.4. As with par-
ent-child reading, the text may be read again and again until the children are
thoroughly familiar with the field and can say and understand all the words
of the text. With older students, visual images may be used to explore the field,
including illustrations in books, video or other images. The sequence of the
text will then be orally paraphrased or summarized by the teacher in terms
familiar to the students, providing a framework for them to follow with general
understanding as it is read aloud, as illustrated in Image 9.4.
Once students are familiar with the sequence of meanings in the text, they
are supported to read it themselves, sentence by sentence, in an activity known
as Detailed Reading. With young children beginning to read, the teacher first
writes sentences from the reading story on cardboard strips. The children are
then shown how to point at each word in the familiar sentence as they say them,
and then to cut up words and word groups, put them back in the sentence and
read it again, until they can read the sentence accurately. This practice is a
powerful catalyst for children to make the semiotic journey from the spoken
to the written medium, via visual and manual modalities. Older students
are orally guided to identify each group of words in each sentence from the
reading text, using cues for their meaning and position in the sentence. The
students then mark the words with highlighters or underlining, and their
meaning may be elaborated. These techniques are shown in Image 9.5.
Once this control has been mastered, the Token = Value relation of spoken
and written expression evaporates, as graphology replaces phonology as the
medium of expression. That is, experienced readers do not translate from
written to spoken expression in order to recognize meanings. Martin (2006)
describes this as a shift in the child’s understanding of reading from ‘book tells
us meaning’ to ‘writing realizing meaning’; the semiotic relation shifts from
projection (a says ‘b’) to identification (a = b). The automaticity of written
expression then allows the reader to focus their conscious attention wholly
on semantic patterns in the content plane. This is what Vygotsky observes in
the development of ‘higher psychological functions’:
At the centre of development during the school age is the transition from the
lower functions of attention and memory to higher functions of voluntary
attention and logical memory . . . the intellectualisation of functions
and their mastery represent two moments of one and the same process –
the transition to higher psychological functions. We master a function to
the extent that it is intellectualised. The voluntariness in the activity is
always the other side of its conscious realization. (Wertsch 1985:26, cited
in Hasan 2004)
This cycle shares many similarities with the parent-child interaction in Table
9.4. The students’ task is to identify text elements. The teacher prepares by
directing attention and interpreting meanings, and evaluates with affirmation.
But in addition she uses a ‘Focus’ question to elicit a response from one stu-
dent, and a ‘Highlight’ instruction to ensure that all students mark the same
words (Martin 2006). Here the direction of attention is from the position in
the text and sentence, to the grammatical function ‘what the earthquake did’
(Medium+Process), to the grammatical structure It started. Instead of manu-
ally pointing, the teacher explicitly states the position (K1), which implicitly
demands the students look at the position (A1).
The meaning cue is then restated as a Focus question, directed to a particular
student. This question is labelled dK1, for ‘delayed primary knower’, as the
teacher already knows the answer. The purpose of dK1 questions, which are
pervasive in classroom discourse, is to get students to attend to and repeat
information. They function to hand control over to students to do a task them-
selves, rather than simply listening to the teacher, and then allow the teacher
to evaluate and elaborate on students’ responses.
As one student says the wording aloud (K2), all the others are also seeing it
and reading it silently, interpreting it in terms of the semantic category given
by the teacher. The teacher’s affirmation and repetition of the wording intensi-
fies the affective value of the identifying activity, then the manual activity of
highlighting the wording (A1) cements its value for each learner.
As they repeatedly do the task of identifying word groups from such cues, all
students rapidly come to consciously recognize relations between grammatical
functions, denoted by the natural metalanguage of who or what, what did/
happened, where, when, how, and so on, and the written grammatical structures
that realize these functions. (At this stage a more technical metalanguage is not
yet required for students to identify such function structures.)
This elaboration includes three cycles. In cycle 1, the teacher first directs
attention to remembering the preceding preparation ‘I used the word earth-
quake’, then to remembering the preceding mentions in the text ‘we know
it’s an earthquake’, then to the discourse function ‘what have they used instead
of earthquake’ (anaphoric reference), then the wording ‘what’s the word
they’ve used’ (a pronoun), then the position in the text ‘there to begin that
paragraph’.
In cycle 2, she uses affirmation and repetition to intensify students’ attention
to the discourse function, getting them to repeat the referent back to her, and
strongly affirming them. This creates a firm semantic basis in cycle 3 for asking
the class to remember a linguistic term that denotes a word class and its dis-
course function, ‘pronoun’. Repetition and affirmation of terms like this, within
elaboration phases, will eventually enable all students in the class to remember
and use such metalanguage appropriately. In this way, the class builds an explicit,
systematic and consistent metalanguage, through experiencing instances in
actual texts.
The next element to be identified is a grammatical metaphor long low
roar, which the teacher prepares by glossing as a ‘sort of sound’. As ‘roaring’
is actually a process, and the qualities long low are normally associated with
concrete objects, many children may not recognize this lexical item without
such support.
Again the cycle of attention begins here with the position in the sentence
‘when it started’, then the lexical category ‘what sort of sound’, then the posi-
tion of the grammatical structure ‘it started with something’, so the students
know that the wording follows with, making it easier to identify. And again
one student says the words, is affirmed, and the class is directed to highlight
the words.
Next the students are guided to interpret the conceptual image evoked by
long low roar, by reference to their previous experience.
Again the teacher uses repeated affirmation here, to intensify the class’ atten-
tion to the next elaboration, which focuses on two features, the qualities ‘long
low’ and process ‘starts’. The goal of this sequence is to direct students’ atten-
tion to the function of these elements in the discourse structure of the text ‘it
starts off long . . . low’. That is, tension builds through the text passage as
the earthquake approaches. A key technique the author uses to build tension
is to start low and uncertain, as in ‘seemed to be approaching’.
The students need to understand both the meaning of each of these
elements within the sentence, and their discourse function in the text. The
teacher’s strategy is to relate the local meaning to their own experience, draw-
ing their attention to aspects that are relevant to the discourse function. As
the Detailed Reading of the passage continues, she will point out the global
discourse patterns of mounting tension, and remind them of the aspects of
each wording that contribute to this pattern.
[Figure 9.6 (system network; labels as extracted). Initiation: type (behavioural:
demand action; instructional: instruct, direct attention to text or to memory,
elicit response by preparing a successful response or by querying without
preparing); respondent (specific student or whole class). Response: act or
verbal; identify meaning in a text (visual or verbal) or select meaning from
memory (curricular or extracurricular). Feedback: evaluate by polarity (affirm
or reject) and strength (strong, median or weak); optionally elaborate, with
field (curricular, extracurricular or text) and mode (monologic (teacher) or
dialogic (return to elicit response)).]
Thirdly, feedback moves (Figure 9.6) always involve evaluations that either
affirm or reject the response, with more or less strength. For example, affirma-
tions may range from ‘yep’ to ‘fantastic’ and are often intensified by repetition;
rejections range between qualifying responses, ignoring, negating or even
admonishing. Where affirmations function to enhance learning capacity and
engagement, rejections may have the opposite effect, particularly for students
with weak learner identities. In the stratified context of the typical classroom,
affirmations and rejections can thus serve to differentiate students. On the
other hand, where differentiation is not an issue, an interplay of affirmation
and rejection can serve to guide learners towards a goal, as in Table 9.1.
In addition feedback may elaborate on the response, providing more informa-
tion about either the text or the field. Again the field of elaboration may
be curricular or extracurricular. The mode of elaboration may be a teacher
monologue, or a dialogue with students. If the elaboration is dialogic, the cycle
begins again with eliciting a response (usually elicited by the teacher but
students may also ask questions that demand elaborations). Elaborations are
optional (shown by the minus option in Figure 9.6), but teachers typically use
students’ responses as stepping stones in a lesson, expanding them with more
technicality or detail, either strengthening the boundaries between everyday
and esoteric knowledge, or traversing back and forth between them, as
illustrated in Tables 9.4 and 9.5 (see note 4).
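The options for feedback moves described above can be read as a small data structure. The following is only a sketch under stated assumptions: the class and method names (Feedback, Elaboration, restarts_cycle) are hypothetical, and treating ‘text’ as a third value of the field of elaboration is one possible reading of Figure 9.6, not the authors’ formalization.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Polarity(Enum):
    AFFIRM = "affirm"
    REJECT = "reject"


class Strength(Enum):
    STRONG = "strong"
    MEDIAN = "median"
    WEAK = "weak"


class ElaborationField(Enum):
    # Assumption: 'text' is modelled alongside the two field options.
    TEXT = "text"
    CURRICULAR = "curricular"
    EXTRACURRICULAR = "extracurricular"


class ElaborationMode(Enum):
    MONOLOGIC = "monologic (teacher)"
    DIALOGIC = "dialogic (return to elicit response)"


@dataclass(frozen=True)
class Elaboration:
    field: ElaborationField
    mode: ElaborationMode


@dataclass(frozen=True)
class Feedback:
    # Every feedback move evaluates; elaboration is optional (the minus option).
    polarity: Polarity
    strength: Strength
    elaboration: Optional[Elaboration] = None

    def restarts_cycle(self) -> bool:
        """A dialogic elaboration begins the cycle again with eliciting a response."""
        return (
            self.elaboration is not None
            and self.elaboration.mode is ElaborationMode.DIALOGIC
        )


# Hypothetical instances: a strong affirmation with a dialogic elaboration
# on the text, and a weak rejection with no elaboration.
affirmed = Feedback(
    Polarity.AFFIRM,
    Strength.STRONG,
    Elaboration(ElaborationField.TEXT, ElaborationMode.DIALOGIC),
)
rejected = Feedback(Polarity.REJECT, Strength.WEAK)
```

Encoding the network this way makes the obligatory choices (polarity, strength) and the optional one (elaboration) explicit, which may help when annotating transcripts consistently.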
The school must disconnect its own internal hierarchy of success and failure
from ineffectiveness of teaching within the school and the external hierarchy
of power relations between social groups outside the school. How do schools
individualize failure and legitimize inequalities? The answer is clear: failure is
attributed to inborn facilities (cognitive, affective) or to the cultural deficits
relayed by the family which come to have the force of inborn facilities.
(1996/2000:5)
for teachers trained in Reading to Learn, are consistently twice to four times
beyond expected rates of growth (Culican 2006, Rose et al. 2008), accelerating
the learning of all students, while rapidly closing the gap in their levels of
achievement.
Notes
1 Anecdotes are not resolved like narratives, but conclude with a Reaction (Martin
& Rose 2008).
2 The model of pedagogic genre is derived from Bernstein’s model of ‘pedagogic
discourse’, including an instructional discourse ‘which creates specialized skills
and their relationship to each other’, but is embedded in and dominated by a
regulative discourse ‘which creates order, relations and identity’ (1996/2000:46).
Extending Martin (1999), Bernstein’s regulative discourse is re-interpreted as the
pedagogic register, including the field of learning activities, the tenor of peda-
gogic relations, and the mode of learning. These three variables in pedagogic
register project the instructional field of skills and knowledge to be acquired.
3 Grammatical functions, such as Epithet, Thing, Medium, Place, are described in
Halliday 1994/2004 and Martin and Rose 2007a.
4 Some of the points made in this analysis have been identified by neo-Vygotskyan
activity theorists such as Mercer (2000) or Wells (1999). Key differences here include:
References
Australian Bureau of Statistics (1994, 2004). Australian Social Trends 1994 & 2004:
Education - National summary tables. Canberra: Australian Bureau of Statistics,
www.abs.gov.au/ausstats.
Adams, M.J. (1990). Beginning to read: Thinking and learning about print: A summary.
Urbana-Champaign: University of Illinois.
Alexander, R. (2000). Culture and pedagogy: International comparisons in primary
education. London: Blackwell.
Bernstein, B. (1990). The structuring of pedagogic discourse. London: Routledge.
Bernstein, B. (1996). Pedagogy, symbolic control and identity: Theory, research, critique.
London: Taylor & Francis. [Revised Edition 2000].
Cloran, C. (1999). Contexts for learning. In F. Christie (Ed.), Pedagogy and the shaping
of consciousness: Linguistic and social processes (pp. 31–65). London: Cassell.
Culican, S. (2006). Learning to read: Reading to learn, a middle years literacy
intervention research project, final report 2003–4. Catholic Education Office:
Melbourne. www.cecv.melb.catholic.edu.au/Research and Seminar Papers.
Halliday, M.A.K. (1993). Towards a language-based theory of learning. Linguistics
and Education, 5(2), 93–116.
Halliday, M.A.K. (1994/2004). An introduction to functional grammar (2nd edn).
London: Edward Arnold.
Hasan, R. (2004). Semiotic mediation and three exotropic theories: Vygotsky,
Halliday and Bernstein. In J. Muller, B. Davies & A. Morais (Eds), Reading
Bernstein, researching Bernstein (pp. 30–43). London: RoutledgeFalmer.
Hasan, R. & Cloran, C. (1990). A sociolinguistic interpretation of everyday talk
between mothers and children. In M.A.K. Halliday, J. Gibbons & H. Nicholas
(Eds), Learning, keeping and using language. Vol. 1: Selected papers from the 8th World
Congress of Applied Linguistics. Amsterdam: John Benjamins.
Kellogg, S. (1997). The three little pigs. New York: Morrow Junior Books.
Kensinger, E.A. (2004). Remembering emotional experiences: The contribution
of valence and arousal. Reviews in the Neurosciences, 15(4): 241–251.
Martin, J.R. (1999). Mentoring semogenesis: ‘Genre-based’ literacy pedagogy. In
F. Christie (Ed.), Pedagogy and the shaping of consciousness: Linguistic and social
processes (pp. 123–155). London: Cassell.
Martin, J.R. (2006). Metadiscourse: Designing interaction in genre-based literacy
programs. In R. Whittaker, M. O’Donnell & A. McCabe (Eds), Language and
literacy: Functional approaches (pp. 95–122). London: Continuum.
Martin, J.R. & Rose, D. (2005). Designing literacy pedagogy: Scaffolding asymmet-
ries. In R. Hasan, C.M.I.M. Matthiessen & J. Webster (Eds), Continuing discourse
on language (pp. 251–280). London: Equinox.
Martin, J.R. & Rose, D. (2007a). Working with discourse: Meaning beyond the clause
(2nd edn). London: Continuum.
Martin, J.R. & Rose, D. (2007b). Interacting with text: The role of dialogue in
learning to read and write. Foreign Languages in China, 4(5): 66–80.
Martin, J.R. & Rose, D. (2008). Genre relations: Mapping culture. London: Equinox.
McGee, L.M. (1998). How do we teach literature to young children? In S.B. Neuman
& K.A. Roskos (Eds), Children achieving: Best practices in early literacy (pp. 162–179).
New Jersey: Newark International Reading Association.
Mercer, N. (2000). Words & minds: How we use language to work together. London:
Routledge.
Painter, C. (1996). The development of language as a resource for thinking:
A linguistic view of learning. In R. Hasan and G. Williams (Eds), Literacy in Society
(pp. 50–85). London: Longman.
Painter, C. (2003). The ‘interpersonal first’ principle in child language development.
In G. Williams & A. Lukin (Eds), Language development: Functional perspectives
on species and individuals (pp. 133–153). London: Continuum.
Pilkington, D. (1996). Follow the rabbit-proof fence. St Lucia: University of Queensland
Press.
Imaging Representations of
Meaning
Chapter 10
Introduction
suggesting how text arcs, streamgraphs and animated networks might be used
by functional discourse analysts as tools for exploring text.
Preserving Logogenesis
[Figure: strata of language. Content ‘plane’: discourse semantics, lexicogrammar.
Expression ‘plane’: phonology/graphology.]
It doesn’t matter how many clauses we analyse, it’s only once we analyse
meaning beyond the clause that we’ll be analysing discourse. And we need
to analyse discourse right along the cline of instantiation if we want to make
sense of the semiotic weather we experience in the ecosocial climate of our
times. (Martin & Rose 2003:272)
In order to make such a jump out of the clause, we need means of commun-
icating the kinds of patterning that we will find. Static forms of representation
such as bar charts will not meet our needs because they reduce the complexity
to a single value visualized in two dimensions. Instead we perhaps require tech-
niques that assist linguists in exploring the patterning of annotations that they
have made to a text across as many dimensions as are necessary.
Rather than reducing the annotated text to a table of statistical values we
might employ various kinds of text visualization to achieve a dynamic lens on
the data. For example, consider the description, provided by a developer of a
text visualization system that presents texts in a three-dimensional network:
Such a ‘qualitative slice’ may be of great use to the discourse analyst because
it emphasizes relationships between linguistic features in texts as they are
interwoven to create particular meanings. What is presented here is not an
argument against ‘counting’ these features, but a suggestion that we should not
toss out information about their sequencing.
[Figure residue: labels ‘Accumulation’ (A B C D E) and ‘Shedding’ (F G).]
making visible patterns that emerge in sequences that exceed a single page or
screen is significant. This is a problem of tractability. Until we have a way of
representing extended patterning we are limited to probing small co-patternings
of meaning deemed qualitatively relevant to the particular questions the analyst
is asking about the text.
To be true to the unfolding of a genuine, multimodal text, however, we
need to find ways of analysing and representing the unfolding of two kinds
of co-occurrence in actual texts: co-occurrence across unfolding modes, for
example, simultaneous use of a particular intonation and a particular gesture,
and co-occurrence within the same text sequence, for example, use of affect
together with graduation (see note 1) in a clause in the unfolding text. Time, in the
second type of co-occurrence, is not clock time but instead a form of ‘text
time’ dependent on the dimension of meaning that the discourse analyst is
interested in exploring. The latter type of co-occurrence has begun to be
explored in the notions of coupling (Martin 2000) and syndromes (Zappavigna
et al. 2008) introduced earlier. Coupling refers to meanings that are co-related
in a text, for example, relationships between evaluative and ideational meaning
integral to construing shared values in a community (Knight 2008). Syndromes
are larger-scale configurations involving multiple associations between different
meanings involved in the overall rhetoric being developed as the text unfolds.
I will now suggest how the domain of text visualization may offer assistance to
linguists trying to analyse unfolding textual patterns.
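As a rough illustration of the second type of co-occurrence, the following sketch counts couplings of annotation features clause by clause and records where a given coupling recurs in ‘text time’. It assumes, purely for illustration, that each clause has already been annotated with a set of feature labels; the function names and the sample annotations are hypothetical and are not drawn from the studies cited.

```python
from collections import Counter
from itertools import combinations


def couplings(annotated_clauses):
    """Count how often each pair of features co-occurs within a clause."""
    counts = Counter()
    for features in annotated_clauses:
        for pair in combinations(sorted(set(features)), 2):
            counts[pair] += 1
    return counts


def unfolding(annotated_clauses, pair):
    """Return the clause indexes ('text time') at which both features occur."""
    first, second = pair
    return [
        index
        for index, features in enumerate(annotated_clauses)
        if first in features and second in features
    ]


# Made-up annotations for four consecutive clauses.
clauses = [
    {"affect", "graduation"},
    {"judgement"},
    {"affect", "graduation", "judgement"},
    {"affect"},
]

coupling_counts = couplings(clauses)
affect_graduation_positions = unfolding(clauses, ("affect", "graduation"))
```

The counts give the synoptic view, while the list of positions keeps the sequencing information that a table of totals would discard.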
For any reader, the rather slow serial process of mentally encoding a text
document is the motivation for providing a way for them to instead use their
primarily preattentive, parallel processing powers of visual perception . . .
The goal of text visualization, then, is to spatially transform text information
into a new visual representation that reveals thematic patterns and relation-
ships between documents in a manner similar if not identical to the way the
natural world is perceived. (Wise et al. 1995:51–52)
Thus, a visualization will only be effective to the extent that it can profitably
make use of preattentive perceptual capabilities. In addition, as with all forms
of computing, ‘bad data in equals bad data out’. Careful attention needs to be
paid to which visualization strategies best accommodate the kinds of linguistic
relationships that we want to explore. We risk creating a representation that
resemiotizes our data in misleading ways.
The following sections present three visualization techniques that may be
useful in resolving the tension between gaining a synoptic perspective on the
text (the paradigmatic perspective) and capturing its unfolding (the syntag-
matic perspective). The overview of these three techniques is intended as
an invitation to the reader to think about how we might begin the task of
exploring the emergent complexity of logogenesis.
Text arcs are a technique for summarizing repetition in long strings. They have
been used to visualize text, code (Wattenberg 2001), DNA (Spell & Brady 2003)
and music (Wattenberg 2001). Text arcs are a development of the dotplot
technique, a form of recurrence plot used in, for example, bioinformatics, to
graphically compare repetition in genomic sequences (Figure 10.3). Dotplots
represent repetitions in a similarity matrix by shading identical cells. The text
arc layout, on the other hand, creates links between repeated units using trans-
lucent arcs (Figure 10.4) and is thus able to preserve a view of the time sequence.
Figure 10.3 A DNA dotplot of a human zinc finger transcription factor (GenBank
ID NM_002383), showing regional self-similarity
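The contrast between the two layouts can be sketched in a few lines. This is a simplified illustration rather than the published algorithms: it assumes that the repeated units are single word tokens and that each occurrence is linked only to the next identical occurrence, and the function names are hypothetical.

```python
from collections import defaultdict


def dotplot(units):
    """Similarity matrix: cell (i, j) is True when units i and j are identical."""
    return [[a == b for b in units] for a in units]


def text_arcs(units):
    """Link each occurrence of a unit to its next occurrence, in text order."""
    positions = defaultdict(list)
    for index, unit in enumerate(units):
        positions[unit].append(index)
    arcs = []
    for unit, indexes in positions.items():
        # Pair consecutive occurrences so the time sequence stays visible.
        for start, end in zip(indexes, indexes[1:]):
            arcs.append((start, end, unit))
    return sorted(arcs)


tokens = "hickory dickory dock the mouse ran up the clock".split()
matrix = dotplot(tokens)
arcs = text_arcs(tokens)
```

The matrix needs a cell for every pair of positions, whereas the list of arcs stays on a single axis of text sequence, which is why the arc layout scales better to long strings.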
A simplified text to speech engine is used to break down the poem into
individual phonemes, so that ‘Once upon a time’ becomes ‘w-ah-n-s ax-p-aa-n
ey t-ay-m’ these phonemes can then be identified in patterns representative
of alliteration, rhyme and rhythm. (Byron 2007)
The steps beneath the arcs represent rhythm, while the arcs link repeated
rhymes, alliteration and homophones (Figure 10.5). The rhyming engine
has also been used to create an interactive limerick writing assistance applica-
tion with which a child can begin to type a line and be prompted with informa-
tion about how many syllables remain to be used in that line. ‘As you exhaust
remaining syllables the words become shorter, if you begin to type a word,
words that begin with what letters you have typed so far are presented’ (Byron
2007).
Figure 10.5 Dynamic text arc visualization of ‘Hickory Dickory Dock’ (Byron 2007)
Text arcs have also been used to ‘represent visually different types of multi-
modal prosody so that a single text can be explored or comparisons can be
made between different texts’ (Zappavigna & Caldwell 2008). Caldwell and
Zappavigna (Chapter 11) explored how text arcs could be used in visualizing
the patterning of end-rhymes in rap music. They also showed how end-rhymes
unfold in popular rap music, providing a logogenetic view that allows the rhym-
ing style of rap artists to be compared in terms of how they unfold with the text.
In general, the text arc technique may be useful to discourse analysts investigat-
ing how repeated patterns differ across texts of the same or different genres.
Discourse analysts are usually interested in tracking the unfolding of more than
one linguistic feature as it varies over a text or across a corpus. A visualization
technique able to represent multiple features on the same diagram is the
streamgraph. Streamgraphs are a form of stacked graph, a display in which
multiple data series are positioned one on top of the other, and so offer a
way of fulfilling this requirement. Streamgraphs visualize multiple variables as
coloured ‘streams’ flowing with the time series on a single graph. Smooth
curves are generated by interpolating between points to produce the ‘flowing’
river of data. The technique has been used to visualize box office revenues
changing over time (Byron & Wattenberg 2008), changes in music listening
habits (Byron 2008), shifts in lexical themes in corpora with time (Havre et al.
2002) and changes in word association in Twitter status messages (Clark 2008b).
For example, Figure 10.6 is a streamgraph depicting a user’s ‘listening history’,
which is the variation in artists that a user listens to over time. In this graph
each layer or ‘stream’ represents a different artist and the width of the layer
represents the frequency of the listening. Time is the movement from left to
right over an 18 month span. The developer describes the graph as ‘a sort of
virtual mirror, reflecting very personally significant events made visible by the
changes in listening trends’ (Byron 2008). The colour scheme, represented
here only in greyscale, was also used to indicate the level of interest a user had
in each artist:
the earlier example of candidates running for election, we might ask how
the candidates’ themes change in response to news events. Do their speeches
appear to trigger news events? Does a candidate’s opinion have any apparent
impact on the stock market? (Havre et al. 2002:11)
However, as always, the old adage that ‘correlation does not equal cause’
needs to be kept in mind.
Streamgraphs have been used to visualize Twitter feeds (Clark 2008b). Twitter
is a micro-blogging service that allows users to post status messages in text of
up to 140 characters. Other users can subscribe to an individual’s Twitter feed to
receive these updates automatically. For example, Figure 10.8 shows a ‘Twitter
Topic Stream’ for the top 100 Twitter users (twits), which uses a variation of the
streamgraph technique to represent the distribution of the most ‘interesting’
capitalized words that occur in a database of Twitter messages for the top 100 users.
The developer employed a particular operationalization of ‘interestingness’:
discovered that it is a company founded by Loic Le Meur, the 6th top twitter
user. (Clark 2008c)
However, it is clear that any number of linguistic criteria might be used, although
these are limited by what might be automatically detected. The interactive
application is available at www.neoformix.com/Projects/TwitterStreamGraphs/
view.php.
Figure 10.9 is an example by the same developer of the streamgraph tech-
nique applied to a single text, the novel ‘Tom Sawyer’ by Mark Twain, to visual-
ize the salience of particular characters throughout the novel. The streamgraph
technique allows an intuitive exploration of temporal changes across multiple
attributes; in this case the attributes are different characters in a novel. While
the accuracy of interpolated values, that is, new values that have been calculated
based on a discrete set of known values, might be questioned, the strategy offers
a useful qualitative view on sequential data such as text.
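The two operations behind a streamgraph, interpolating between known values and stacking the resulting layers, can be sketched as follows. This is a minimal illustration under stated assumptions: linear interpolation stands in for the smooth curves of the published technique, the baseline is simply centred around zero, and the function names and character counts are made up for the example.

```python
def interpolate(series, steps):
    """Linearly interpolate `steps` segments between each pair of known values."""
    values = []
    for a, b in zip(series, series[1:]):
        for k in range(steps):
            values.append(a + (b - a) * k / steps)
    values.append(series[-1])
    return values


def stream_layers(series_by_name):
    """Stack series around a centred baseline; return (lower, upper) per point."""
    names = list(series_by_name)
    length = len(series_by_name[names[0]])
    layers = {name: [] for name in names}
    for t in range(length):
        total = sum(series_by_name[name][t] for name in names)
        lower = -total / 2  # centre the whole stack on zero
        for name in names:
            upper = lower + series_by_name[name][t]
            layers[name].append((lower, upper))
            lower = upper
    return layers


# Made-up mention counts for two characters at two known points in a text.
known = {"Tom": [2, 4], "Huck": [2, 0]}
smoothed = {name: interpolate(values, 2) for name, values in known.items()}
layers = stream_layers(smoothed)
```

The interpolated middle point is calculated rather than observed, which is exactly the accuracy caveat raised above: the width of a stream between known values is an estimate.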
The metaphor often used in SFL of the text unfolding (logogenesis) invokes
ideas about linear progression that might not be optimal for modelling a text’s
complexity. An alternative metaphor that might be invoked is that of an ani-
mated network. This type of visualization seems more in accord with viewing
the text as a complex adaptive system in which changes, particularly in initial
conditions, have repercussions throughout the system. These types of systems
are common in nature. An animated representation also invokes a metaphor of
‘becoming’ or propagation. Indeed it is through propagation that systems such
as evaluative language swarm in a text, forming prosodic rather than constitu-
ent structures (Zappavigna et al. 2008). Fry’s (2000a:19) concept of ‘organic
information visualization’ deploys related ideas, conceiving visualization as
functioning to employ ‘simulated organic properties in an interactive, visually
refined environment to glean qualitative facts from large bodies of quantitative
data generated by dynamic information sources’. His system, Valence (Fry
2000b), will be reviewed in this section. A simplified example of Valence read-
ing another of Mark Twain’s works, The Innocents Abroad, is available at www.
benfry.com/valence (screen capture provided in Figure 10.10).
Valence (Fry 2000b) is a system that visualizes word usage as a network
unfolding in a three-dimensional globe. The system renders words as ‘nodes’ in
the network and connects words with branches if they are adjacent in the text
so that ‘each time these words are found adjacent to each other, the connecting
line shortens, pulling the two words closer together in space’ (Fry 2000a:67).
An important aspect of the value of the system is this foregrounding of the
relationality of language:
The premise is that the best way to understand a large body of information
. . . is to provide a feel for general trends and anomalies in the data, by pro-
viding a qualitative slice into how the information is structured. The most
important information comes from providing context and setting up the
interrelationships between elements of the data. If needed, one can later
dig deeper to find out specifics, or further tweak the system to look at other
types of parameters. (Fry 2000b)
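The adjacency principle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions and not Fry's code: words become nodes, each adjacency in the text increments an undirected edge count, and a hypothetical rule (`edge_length`, with an invented `shrink` factor) shortens the connecting line each time the pair recurs.

```python
import re
from collections import Counter

def adjacency_counts(text):
    """Count how often each unordered pair of distinct words occurs side by side."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pairs = Counter()
    for left, right in zip(tokens, tokens[1:]):
        if left != right:
            pairs[tuple(sorted((left, right)))] += 1
    return pairs

def edge_length(count, rest_length=1.0, shrink=0.8):
    """Hypothetical rule: every repeated adjacency multiplies the line length by `shrink`."""
    return rest_length * shrink ** (count - 1)

pairs = adjacency_counts("the cat saw the cat")
lengths = {pair: edge_length(n) for pair, n in pairs.items()}
```

In this toy input the pair ('cat', 'the') occurs twice, so its line is shorter than the lines for pairs that occur once, pulling those two words closer together.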
While the system only models one kind of relationship, lexical adjacency,
a logical extension would be to replace the input data, currently ‘raw text’
(Figure 10.11), with annotated data and to specify different kinds of
relationships between annotation series. This would occur at the ‘preprocessor engine’
stage of the information pipeline that Fry proposes as a software engineering
method (Figure 10.11).
Valence ‘reads’ the text by moving words that are used most frequently to
the edges of the globe and less frequent words to its centre (Figure 10.12).
Within the system, logogenesis is represented as a proximal–distal relationship
rather than movement from left to right across a page. The text ‘unfolds’ by
moving the current lexical item being ‘read’ to the centre front of the three-
dimensional space. In some versions of the system a small page is shown next
to the network with lines of the text appearing in sync with the ‘reading’
provided by the movement of the network. A QuickTime video of Valence
reading Mark Twain’s The Innocents Abroad is available at
www.benfry.com/valence/movie.html.
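The placement rule described in this paragraph, with frequent words towards the edge of the globe and infrequent words towards its centre, can be approximated as follows. The linear mapping from relative frequency to radius is an assumption made for illustration; the source does not specify how Valence computes positions, and `radial_positions` is a hypothetical name.

```python
import re
from collections import Counter

def radial_positions(text, globe_radius=1.0):
    """Map each word to a distance from the centre proportional to its relative frequency."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    top = max(counts.values(), default=1)
    return {word: globe_radius * n / top for word, n in counts.items()}

radii = radial_positions("a a a b")
```

Here the most frequent word sits on the surface of the globe (radius 1.0) and the rarer word sits a third of the way out from the centre.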
Figure 10.11 The ‘information pipeline’, a software engineering method for the
Valence visualization (Fry 2000a:65)
The three-dimensional visualization affords a way for the user to move around
inside the text and explore relationships between words. The user is able
to zoom in or view the network from different viewpoints (Figure 10.13),
depending on the relationships that they wish to investigate.
Conclusion
This chapter has presented three text visualization techniques that use particu-
lar representation strategies for making logogenesis both visible and tractable:
Text Arcs, Streamgraphs and animated networks. The first technique is useful
for discourse analysts exploring repeated patterns in texts, the second for rep-
resenting the unfolding of more than one linguistic feature on the same graph,
and the third for achieving a dynamic representation of features unfolding
in time. The techniques are examples of moving beyond a ‘bag of entities’
perspective on texts to embrace the complex sequencing of discourse. If we are
able to develop these techniques to cope with annotated systemic functional
input then we will have a powerful lens on our data. We will also have a useful
mechanism for communicating analyses of patterns that will allow us, in turn,
to develop functional theory about discourse patterning without factoring out
time2 (Zhao 2009, forthcoming).
Effective annotation is the first step in visualization of features that cannot
be automatically extracted from text with current computational techniques.
This means that we require systems that support easy manual annotation of
texts by the linguist. Examples of existing text annotation systems developed by
Systemic Functional linguists include Systemics (Judd & O’Halloran 2001),
UAM Corpus Tool (O’Donnell 2008) and SysAM (Matthiessen & Wu 2001).
To date, there has been no work done on how the output of these systems
might be visualized. We might think of ourselves as biologists trying to map
the genome without a theory of sequencing.
Acknowledgements
Notes
1 These categories are taken from Appraisal theory (Martin & White 2005) and
refer respectively to language about emotional responses and language scaling
evaluation in a text.
2 By time, I do not refer to physical time but instead to ‘text time’ in the sense of
logogenetic unfolding.
References
Berry, M. (2003). Survey of text mining: Clustering, classification, and retrieval. New York:
Springer.
Byron, L. (2007). Children’s poetry and lymerick visualizations. Retrieved 11 August
2008, from Lee Byron: www.leebyron.com/what/poetry/.
Byron, L. (2008). Last.fm listening history – What have I been listening to? Retrieved
31 July 2008, from Lee Byron: www.leebyron.com/what/lastfm/.
Byron, L. & Wattenberg, M. (2008). Stacked graphs – Geometry & aesthetics. Retrieved
8 July 2008, from Lee Byron: www.leebyron.com/else/streamgraph/.
Clark, J. (2008a). Tom Sawyer character streamgraph. Retrieved 11 August 2008, from
Neoformix: Discovering and illustrating patterns in data:
www.neoformix.com/2008/TomSawyer.html.
Clark, J. (2008b). Twitter topic stream. Retrieved 31 July 2008, from Neoformix:
Discovering and illustrating patterns in data:
www.neoformix.com/2008/TwitterTopicStream.html.
Clark, J. (2008c). Twitter topic streams for some top users. Retrieved 11 August 2008,
from Neoformix: Discovering and illustrating patterns in data:
www.neoformix.com/2008/TwitterTopicStreamsTopUsers.html.
Fry, B. (2000a). Organic Information Design. Unpublished dissertation. Boston,
MA: Massachusetts Institute of Technology.
Fry, B. (2000b). Valence. Retrieved 18 July 2008, from Ben Fry:
www.benfry.com/valence/applet/.
Halliday, M.A.K. (1991). Towards probabilistic interpretations. In E. Ventola (Ed.),
Functional and systemic linguistics: Approaches and uses (pp. 39–61). Berlin and
New York: Walter de Gruyter.
Halliday, M.A.K. (1993). Language in a Changing World. Occasional Paper Number 13.
Toowoomba, Queensland: Applied Linguistics Association of Australia, Centre
for Language Learning and Teaching, University of Southern Queensland.
Halliday, M.A.K. & Martin, J.R. (1993). Writing science: Literacy and discursive power.
London: Routledge, Taylor & Francis Group.
Halliday, M.A.K. & Matthiessen, C. (2004). An Introduction to Functional Grammar.
London: Edward Arnold.
Havre, S., Hetzler, E., Whitney, P. & Nowell, L. (2002). ThemeRiver: Visualizing
thematic changes in large document collections. IEEE Transactions on Visualization
and Computer Graphics, 8(1), 9–20.
Judd, K. & O’Halloran, K. (2001). Systemics. Singapore: Singapore University Press.
(Educational software).
Knight, N. (2008). ‘Still cool . . . and american too!’: An SFL analysis of deferred
bonds in internet messaging humour. In N. Nørgaard (Ed.), Systemic Functional
Linguistics in Use, Odense Working Papers in Language and Communication (Vol. 29)
(pp. 481–502). Odense: University of Southern Denmark, Institute of Language
and Communication.
Lemke, J.L. (1991). Text production and dynamic text semantics. In E. Ventola (Ed.),
Functional and Systemic Linguistics: Approaches and Uses (pp. 23–38). Berlin and
New York: Mouton de Gruyter.
Martin, J.R. (2000). Beyond exchange: Appraisal systems in English. In J. Martin,
S. Hunston & G. Thompson (Eds), Evaluation in text: Authorial stance and the
construction of discourse (pp. 142–175). Oxford: Oxford University Press.
Martin, J.R. (2004). Mourning: How we get aligned. Discourse and Society, 15(2–3),
321–344.
Martin, J.R. (2008, July 21–25). Chaser’s war on context: Making meaning. Paper
presented at the 35th International Systemic Functional Congress. Sydney.
Martin, J.R. & Rose, D. (2003). Working with discourse: Meaning beyond the clause.
London, New York: Continuum.
Martin, J.R. & White, P.R.R. (2005). The language of evaluation: Appraisal in English.
New York: Palgrave Macmillan.
Matthiessen, C. (2007). The ‘architecture’ of language according to systemic
functional theory: Developments since the 1970s. In R. Hasan, C. Matthiessen &
J. Webster (Eds), Continuing discourse on language: A functional perspective (Volume
two). London: Equinox.
Matthiessen, M.I.M. & Wu, C. (2001). SysAm. [Programs for computational
Analysis] Available at: www.iminerva.ling.mq.edu.au.
O’Donnell, M. (2008). Demonstration of the UAM CorpusTool for text and
image annotation. Proceedings of the ACL-08:HLT Demo Session (Companion volume)
(pp. 13–16). Columbus, OH: Association for Computational Linguistics.
Spell, R. & Brady, R. (2003). BARD: A visualization tool for biological sequence
analysis. Proceedings of the IEEE Symposium on Information Visualization. Seattle,
WA: IEEE.
Wattenberg, M. (2001). The shape of song. Retrieved 31 July 2008, from Turbulence:
www.turbulence.org/Works/song.
Wattenberg, M. (2002). Arc diagrams: Visualizing structure in strings. Proceedings
of the IEEE Symposium on Information Visualization (pp. 110–116). Boston, MA.
Wise, J., Thomas, J., Pennock, K., Lantrip, D., Pottier, M., Schur, A., et al. (1995).
Visualizing the non-visual: Spatial analysis and interaction with information
from text documents. Proceedings of the IEEE Information Visualization Symposium
(pp. 51–58). Atlanta, GA.
Zappavigna, M. & Caldwell, D. (2008). Visualising multimodal patterning. Paper
presented at Critical Dimensions in Applied Linguistics, July 4–6. Sydney.
Zappavigna, M., Dwyer, P. & Martin, J. (2008). Syndromes of meaning: Exploring
patterned coupling in a NSW Youth Justice Conference. In A. Mahboob &
K. Knight (Eds), Questioning linguistics (pp. 103–117). Newcastle: Cambridge
Scholars Publishing.
Zhao, S. (2010). Intersemiotic relations as logogenetic patterns: Towards the
restoration of the time dimension in hypertext description. In M. Bednarek &
J. Martin (Eds), New discourse on language: Functional perspectives on multimodality,
identity, and affiliation (pp. 195–218). London: Continuum.
Chapter 11
Michele Zappavigna
University of Sydney
Text visualization is an emergent field closely related to the more general field
of information visualization, a field that represents abstract data visually. The
objective is to computationally process a text so that it can be represented in
ways that leverage the ‘primarily preattentive, parallel processing powers of
visual perception’ (Wise et al. 1999:442). In short, visualization aims to make
complex data that is encoded by machines meaningful to humans, using tools
such as colour, space and animation to produce visual representations.
Text visualization is especially useful for discourse analysts, and linguists
more generally, as they are interested in making claims about text patterns.
Such patterning is often highly complex, involving different types of linguistic
features, depending upon the linguistic theory deployed. For example, pattern-
ing of interpersonal meaning has been analogized with musical patterning:
Because of the high dimensionality of language, many such patterns are not
necessarily directly evident through close analysis of individual texts, especially
in the case of extended texts or corpora. Ware (2004) notes a number of
important advantages afforded by visualization that may assist the linguist in
exploring large data sources:
Figure 11.3 Arc diagram of the folk song Clementine (Wattenberg 2002)
Figure 11.4 Comparison of a dotplot and arc diagram for the same string
(Wattenberg 2002:3)
Figure 11.5 Dynamic arc diagram visualization of ‘Hickory Dickory Dock’ (Byron
2007)
The data for this chapter is from the contemporary, ‘popular’ North American
rap musician Kanye West. The song chosen for analysis is Spaceship,
from West’s debut album, The College Dropout (2004). The lyrics (which for
copyright reasons are only reproduced here as individual rhymes) have been
accessed online from The Original Hip-Hop Lyrics Archive (www.ohhla.com).
Drawing inspiration from Wattenberg (2002) and Byron (2007) and their
visualizations of music, we considered rap music an attractive source of data.
Generally speaking, the vocal performance of ‘rapping’ requires a performer
to match the rhythm of their voice to the beat of music, and this is often
unrehearsed and spontaneous. In addition, rapping is articulated in poetic
form so it involves rhyme, as well as African-American language practices
such as narrativizing, toasting and punning (Richardson 2006:11).
Rap is about virtuosity. It is a means by which one can establish a reputation
within the hip-hop community. And in most cases, rap artists are explicitly
We could argue then that repetition (as Intensification) is much more like
‘paralinguistic’ repetition than discourse semantic repetition. First, while the
‘sense’ or meaning of consecutive rhyming might have some kind of semantic
thread between the particular lexemes (see examples in Figure 11.6), this is not
necessarily the case. Moreover, we would argue that it is the ‘sound’, or ‘sensory
force’ of consecutive rhymes, particularly of the same sound, that signifies
Intensification or ‘force’. In a way, it can be likened to a gradual increase in
loudness (a crescendo), albeit realized through the repetition of sounds that
do not necessarily increase in amplitude. In musical terms, a crescendo is a
passage of music that gradually increases in force or loudness. So with respect to
the system of graduation (Figure 11.6), we include consecutive rhyming (of
the same sounds) as part of the Intensification system, and in particular, the
sub-system of repetition. However, we do note that this is not the same as
repetition of the discourse semantic kind; it is better classified as a kind of
paralinguistic or ‘sensory’ intensification (for want of a better term).
[Figure 11.6: the FORCE system within graduation]
FORCE
  QUANTIFICATION
    number: a few, many, heaps...
    mass/presence: tiny, small, large...
  INTENSIFICATION
    Qualities: slightly corrupt, very corrupt...
    Processes: like, love, adore...
    Repetition:
      A deplorable act, disgraceful, despicable act [Quality]
      We laughed and laughed and laughed [Processes]
      Nothing’s there, nothing’s fair, I don’t ever want to go back there [Rhyme]
The following set of arc diagrams aims to visualize the ‘virtuosity’ of a rapper in
terms of their capacity to produce consecutive end-rhymes using syllables of
the same sound. As mentioned, there are three data sets: Kanye West, GLC
and Consequence, all of which have been taken from the same song: Spaceship
(West 2004). A basic generic structure of Spaceship is outlined in Figure 11.7.
We have limited our analysis to the three verses of Spaceship. The verses vary
in size: West’s verse comprises 41 clauses, GLC’s comprises 50, and
Consequence’s comprises 25.
Before we introduce the analysis of the data, it is important to explain the way
in which we have used the text arcs to represent the build-up of consecutive
end-rhymes. Figure 11.8 is one segment of analysis of the West data set.
The horizontal axis represents time, or more technically, logogenesis; the
text as it unfolds ‘in the world’. Each horizontal axis is segmented into smaller
components with a single, vertical line. Each of these segments represents a
single line of text, basically equivalent to a clause, tone group and poetic ‘line’.
[Figure 11.7: generic structure of Spaceship, as recoverable from the figure: Introduction ^ Chorus ^ Verse 2 (GLC) ^ Chorus ^ Verse 3 (Consequence)]
Figure 11.8 Arc diagram of consecutive end-rhymes from Kanye West in Spaceship
(West 2004)
Below the horizontal axis, and within each of these segments, is the end-rhyme.
While it would be more accurate to place the end-rhyme to the far right-hand
side of each segment, there is simply not enough space. A single,
non-translucent arc is used to represent repetition; in this case, a consecutive
end-rhyme of the same sound, for example, ‘again’/‘him’ and ‘up’/‘up’ (see
Figure 11.8). However, we do not code for any end-rhymes that are not
consecutive, or that are not the same sound (with some exceptions discussed
below). So, for example, the end-rhymes ‘again’/‘him’ and ‘up’/‘up’ only
constitute two arcs. In other words, there is no arc between ‘him’ and the initial
‘up’. Unlike Wattenberg (2002), we are only using non-translucent, lower-level
arcs because the small quantity of instances does not really offer us the
potential to show patterns of patterns, just single patterns.
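The coding rule just described can be expressed as a short sketch. It assumes that each line has already been assigned a rhyme-class label by the analyst (the labels 'IN' and 'UP' below are hypothetical, and the judgement that ‘again’ and ‘him’ share a sound is taken from the analysis rather than computed); an arc is then drawn only between adjacent lines with the same label.

```python
def consecutive_rhyme_arcs(rhyme_classes):
    """Return (i, i + 1) index pairs for adjacent lines that share a rhyme class.

    `rhyme_classes` holds one analyst-assigned label per line;
    None marks a line that is not coded for rhyme.
    """
    return [
        (i, i + 1)
        for i, (a, b) in enumerate(zip(rhyme_classes, rhyme_classes[1:]))
        if a is not None and a == b
    ]

# End-rhymes 'again', 'him', 'up', 'up' with hypothetical class labels
arcs = consecutive_rhyme_arcs(["IN", "IN", "UP", "UP"])
```

For this input the sketch yields two arcs, one for ‘again’/‘him’ and one for ‘up’/‘up’, and none between ‘him’ and the initial ‘up’.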
On several occasions, we do ignore some ‘non-consecutive’ rhymes so as to
visualize a lengthy, consecutive end-rhyme coding. As illustrated above, ‘back’,
‘gap’, ‘cheque’ and ‘scratch’ are all coded as part of a consecutive string,
stole fault stole caught pat me khakis walk in blackie kanye store marl
hits hits rhymes mind helpin quit welcome struggle hustle hustle dude
Figure 11.9 Arc diagrams of end-rhymes from West in Spaceship (West 2004)
job mob time grind momma mind love mine signed time alone ya ll
mine
ball fall mine mine now prime mine twelve now shelves myself
Figure 11.10 Arc diagrams of end-rhymes from GLC in Spaceship (West 2004)
right right like night come go goes service cars shows me goatees
me me naturally actually factually catastrophe me there fair there off off bloaw
We will explain these arc diagrams using an aggregated view in the following
section.
Summary of Findings
Before we compare the rhyming capacity of the three rappers, it is worth noting
that there are very few instances where the rappers do not rhyme. While there
are clear differences in terms of the extent to which the rappers do or do not
rhyme consecutive syllables of the same sound, all three rappers clearly display
some kind of virtuosity in terms of their capacity to rhyme consistently. And
while that may not constitute or evoke the same kind of semiotic ‘force’ as
consecutive syllables of the same sound, it does however, at the very least,
demonstrate a capacity to rap. It would be worth comparing these arc diagrams
to other rap songs in which the artists were not studio recorded. In ‘freestyle’
rapping, for example, these kinds of findings would show an even greater level
of virtuosity given that the performance is spontaneous, and does not afford
the luxury of rehearsal.
Most significantly though, the arc diagrams reveal a clear difference between
West’s rhymes and those of his two collaborators. Figure 11.12 is an aggregated
text arc view, comparing the consecutive rhyming of the three artists.
This aggregated arc diagram view shows that Consequence and particularly
GLC rhyme consecutive end-rhymes of the same sound to a much greater
extent than West. So what does this then say about West’s virtuosity as a
rapper? Quite simply, we could argue that West’s status as an ‘average’ rapper
is somewhat justified, at least in terms of his use of rhyme as a means of
Intensification.
In addition, it is important to note that the really significant build-up of
Intensification through end-rhymes occurs mainly at the end of both GLC’s
and Consequence’s verses. This appears to be a very deliberate tactic, especially
when we consider that all three rappers finish their verse with ‘bloaw’, a
reference to the metaphorical spaceship ‘taking off’. One could certainly argue that
both GLC and Consequence are very aware of the ‘graduating’ function of
consecutive end-rhymes and deploy them accordingly. Moreover, this finding
suggests that consecutive rhyming of the same sound does function in a similar
way to a musical crescendo.
Figure 11.12 Aggregated text arc view from Spaceship (West 2004) [rows: West, GLC, Consequence]
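One way to check a ‘crescendo’ claim of this kind against coded data is to locate the runs of consecutive same-sound end-rhymes and ask where in the verse the longest run ends. The sketch below assumes analyst-assigned rhyme-class labels, one per line; the toy labels are invented for illustration and are not the Spaceship coding, and both function names are hypothetical.

```python
from itertools import groupby

def rhyme_runs(rhyme_classes):
    """Return (start_index, length, rhyme_class) for each run of two or more lines."""
    runs, pos = [], 0
    for cls, group in groupby(rhyme_classes):
        n = len(list(group))
        if cls is not None and n >= 2:
            runs.append((pos, n, cls))
        pos += n
    return runs

def longest_run_position(rhyme_classes):
    """Relative position (0 = first line, 1 = last line) at which the longest run ends."""
    runs = rhyme_runs(rhyme_classes)
    if not runs:
        return None
    start, length, _ = max(runs, key=lambda run: run[1])
    return (start + length - 1) / (len(rhyme_classes) - 1)

toy_verse = ["A", "A", "B", "C", "C", "C", "C"]  # invented labels
runs = rhyme_runs(toy_verse)
position = longest_run_position(toy_verse)
```

A value of `position` close to 1 would be consistent with a build-up towards the end of the verse; a value near 0 would not.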
From a different perspective though, it could be argued that by avoiding
consecutive end-rhymes of the same sound, West is actually able to express
many more detailed and elaborate meanings in his lyrics given that he is not
continually limited by having to find appropriate lexis that matches a particular
sound. Compare, for example, West’s rhyming couplets with the consecutive
end-rhymes from GLC shown in Table 11.1.
The point here, albeit difficult to recognize without a complete clause, is
that GLC’s verse is not as semantically ‘rich’ when compared with West’s. Or
perhaps, more technically, it lacks the same level of semantic or ‘ideational’
coherence (see Martin & Rose 2003). In the extract above, GLC lists people he
hopes to ‘see’, for example, ‘freddy g’ and ‘yousef g’. He then explains, without
any obvious semantic link, that police watch him (‘me’) smoking marijuana
(‘weed’). It is only in the final four clauses where GLC’s rhymes have some
kind of semantic coherence. In those clauses he is self-reflective: he recognizes
that he has people counting on him (‘me’), that he is trying to find ‘peace’,
and that, somewhat related, he should have finished school like his ‘niece’
instead of using a ‘piece’ (a gun).
Table 11.1

GLC (consecutive end-rhymes)    West (rhyming couplets)
see                             hits/
g                               hits
g                               rhymes/
g                               mind
g                               helping/
me                              welcome
weed                            struggle/
g’s                             hustle/
me                              hustle
peace
niece
piece

In contrast, West’s lyrics are much more semantically coherent as they clearly
relate to the ‘macro’ theme of the song. Spaceship is basically about leaving one’s
ordinary circumstances and ‘taking off’ to a better place, hence the
metaphorical ‘spaceship’ (see Smitherman 2006:99). West’s rhyming couplets provide
a really clever juxtaposition of his adverse circumstances and his tenacity.
For example, West contrasts the fact that he receives ‘hits’, that is, punches
(metaphorical or not), but at the same time, writes ‘hits’, that is, successful song
lyrics. Or, despite his job not ‘helping’, he quits with a departing phrase, ‘you’re
welcome’. And in the final three rhymes West is even more explicit, where he
claims that no one knows his ‘struggles’, but at the same time, they can’t match
his ‘hustle’, that is, his tenacity.
Perhaps this hypothesis, which would obviously benefit from an analysis of
more data, is best explained in terms of ‘sound versus sense’. When the ‘sound’
or sonic intensification of the consecutive rhyme is foregrounded, as it is here
with GLC, then the artist must compromise their lyrical meaning potential.
If, however, the artist, like West, foregrounds their lyrical meaning potential,
then it is more likely that the artist cannot foreground the sonic force, in this
case, through consecutive end-rhymes of the same sound.
Conclusion
This chapter has applied only one type of text visualization to a very small and
unique data set. And despite these obvious limitations, it has proved to be a good
illustration of the need to complement large-scale corpus analyses with methods
of analysis that enable us to visualize large amounts of qualitative data. In the case
of Kanye West and his collaborators, some noteworthy findings and hypotheses
may never have been considered if we were not able to visualize a specific
linguistic variable such as rhyme as it unfolded throughout a complete text.
The arc diagrams revealed a logogenetic patterning of rhyme that would
have almost certainly been lost with large-scale, quantitative methodology.
It was found that both GLC and Consequence dramatically increased their
rhyme as they neared the end of their verse. This logogenetic intensification
or ‘crescendo’ is important and should never be lost or submerged. It means
something. And in this case, those rappers deliberately built up their rhyme
to reach a point of semantic and sensory salience which perfectly coincided
with their spaceship ‘taking off’: ‘bloaw’ . . .
References
Alim, H.S. (2003). On some serious next millennium rap ishhh: Pharoahe Monch,
hip hop poetics, and the internal rhymes of Internal Affairs. Journal of English
Linguistics, 31, 60–84.
Byron, L. (2007). Children’s poetry and lymerick visualizations. Retrieved 11 August
2008, from Lee Byron: www.leebyron.com/what/poetry/
Keyes, C. (2002). Rap music and street consciousness. Chicago, IL: University of Illinois
Press.
Macken-Horarik, M. (2003). Envoi: Intractable issues in appraisal analysis? Text,
23(2), 313–319.
Martin, J. (2004). Mourning: How we get aligned. Discourse and Society, 15(2–3),
321–344.
Martin, J. & Rose, D. (2003). Working with discourse: Meaning beyond the clause.
London: Continuum.
Martin, J. & White, P. (2005). The language of evaluation: Appraisal in English. London
and New York: Palgrave.
OHHLA.com (2008). Favorite artists: Kanye West. The Original Hip-Hop Lyrics
Archive. Retrieved 1 May 2008, from: www.ohhla.com/anonymous/kan_west/
college/spaceshp.wst.txt.
Richardson, E. (2006). Hip hop literacies. London: Routledge.
Smitherman, G. (2006). Word from the mother: Language and African Americans.
New York and London: Routledge.
Ware, C. (2004). Information visualization: Perception for design. San Francisco, CA:
Morgan Kaufmann.
Wattenberg, M. (2002). Arc diagrams: Visualizing structure in strings. Paper
presented at the IEEE Symposium on Information Visualization (InfoVis’02),
Boston, MA, 28–29 October 2002, 110–116.
West, K. (2004). Spaceship. The college dropout. Roc-A-Fella/Def Jam, 5:24.
Wise, J.A., Thomas, J.J., Pennock, K., Lantrip, D., Pottier, M., Schur, A. & Crow, V.
(1999). Visualizing the non-visual: Spatial analysis and interaction with informa-
tion from text documents. In S.K. Card, J.D. Mackinlay & B. Shneiderman (Eds.),
Readings in information visualization: Using vision to think (pp. 442–450). San
Francisco, CA: Morgan Kaufmann.
Zappavigna, M. (2007). Visualising instantiation: Text visualisation techniques
for preserving logogenesis. Paper presented at Semiotic Margins: Reclaiming
Meaning, Sydney, 10–12 December 2007.
Chapter 12
Multimodal Semiotics:
Theoretical Challenges
J.R. Martin
University of Sydney
Multimodality
The Sign
To begin, I’d like to return to Saussure’s conception of the sign (1959 Baskin
translation of the Cours used here). On my reading, Saussure’s sign is consti-
tuted by an inextricable bonding of signified (hereafter signifié) with signifier
(hereafter signifiant) (see Figure 12.1). It follows that signs do not realize
meaning; rather they make meaning. The common sense idea that signs stand
for something, so that, for example, a stop sign means ‘stop’, is precisely what
Saussure is trying to supplant.
On this reading, the question for Saussure is not what a sign means but how
it means. And it means by fusing signifié with signifiant and organizing signs
into systems in which they mean in relation to what they are not. Language
is thus conceived as a system of signs, in which meaning is difference (or in
Saussure’s terms valeur). It follows that in a simple traffic lights system such as
that in Figure 12.2, it doesn’t matter whether we name signs using terms
reflecting signifié or signifiant; what matters is the relationships among the
signs – one sign versus another in the process of making meaning.
Figure 12.1 Bonding of signifié and signifiant in Saussure’s concept of the sign
[Figure 12.2: traffic lights system with three signs: stop/red, speed up/yellow, go/green]
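The point about naming can be illustrated with a toy model of the traffic lights system. This is only a sketch of the reading offered here, not a formalization found in Saussure: each sign is modelled as a bonded pair, and its valeur as the set of signs it is not. The names `Sign`, `valeur` and `contrasts` are introduced for illustration.

```python
from typing import NamedTuple

class Sign(NamedTuple):
    signifie: str    # content facet, e.g. 'stop'
    signifiant: str  # expression facet, e.g. 'red'

TRAFFIC_LIGHTS = frozenset({
    Sign("stop", "red"),
    Sign("speed up", "yellow"),
    Sign("go", "green"),
})

def valeur(sign, system):
    """Model a sign's value as the set of signs it contrasts with."""
    return system - {sign}

def contrasts(system, facet):
    """Name each sign by one facet and list what it is opposed to."""
    return {
        getattr(sign, facet): sorted(getattr(other, facet) for other in valeur(sign, system))
        for sign in system
    }

by_content = contrasts(TRAFFIC_LIGHTS, "signifie")
by_expression = contrasts(TRAFFIC_LIGHTS, "signifiant")
```

Whether the signs are labelled by signifié or by signifiant, the same three-way opposition results; only the labels differ.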
Based on this reading of Saussure one could ask of any multimodal analyst:
Realization
[Figure 12.3: labels recovered from the figure: linguistics; signifié; signifiant]
Stratification
In order to explore this complexity, Hjelmslev proposes that the bonding of
signifié with signifiant involves two interlocking systems of valeur – content
form, which deals with systems of meaning, and expression form, which deals
with systems of sound (or image or gesture if we take graphology or signing into
account). In SFL terms, these two systems of mutually defining valeur are
related by the concept of realization, and generally modelled as co-tangential
circles, with content form subsuming expression form. In these terms Figure
12.4 is best read as a conceptualization of the bonding space outlined as the
object of linguistic inquiry in Figure 12.3. Cléirigh (in preparation) refers to
hierarchies of this kind as supervenient1; Lemke (1984) refers to them as
metaredundant, since content form is a pattern of expression form (a pattern
of patterns in other words). Stratified systems can be conceived as evolving out
of single stratum systems (such as animal language or the proto-language2
spoken by infants up to around 18 months of age) through a process of
emergent complexity (Matthiessen 2004).
From an SFL perspective, the emergence of grammatical metaphor and
the elaboration of discourse resources for organizing meaning beyond the
clause argue for a tri-stratal model of language with a stratified content
plane – with discourse semantics an emergently complex pattern of lexico-
grammatical patterns (cf. Halliday & Matthiessen 1999:237, Martin 1992) (see
Figure 12.5).
Based on this reading of Hjelmslev and Halliday, one could ask of any
multimodal analyst:
1. For a given semiotic system, how many strata are you proposing, and on
which stratum is your description located?
2. Are your strata related by metaredundancy (as patterns of patterns)?
3. Are there distinct systems of valeur on each of the strata you propose?
Figure 12.4 Expression form realizing content form in a stratified semiotic system
(supervenience) [strata shown: content form; expression form]
[Figure 12.5: strata shown: discourse semantics; lexicogrammar; phonology]
Rank
In SFL the principle of distinctive valeur is used to explore relations both
between and within strata. Within strata, one possibility is that valeur is
hierarchically organized in relation to units of different size, with higher-level units
composed of one or more smaller units, which may in turn be decomposed (Halliday
& Matthiessen 2004, 2009). Distinctive levels of decomposition are referred
to as ranks – for example a tone group consisting of one or more feet, a foot
consisting of one or more syllables, and a syllable consisting of one or more
phonemes in the phonology of a stress-timed language like English. What is
critical here is that tone group systems differ from foot ones, foot ones from
syllable ones and syllable ones from phoneme ones, and that the distinctive
systems of valeur involved are related to one another by means of a constitu-
ency hierarchy. The insistence on distinctive valeur constrains the number of
ranks in the hierarchy, so that depth is not a simple function of the length of
a unit. The allocation of ranks to strata is partially exemplified for English in
Figure 12.6.
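The constituency relation among ranks can be sketched as a nested data structure. The example assumes, for illustration only, that a unit at each rank is represented as a non-empty list of units of the rank below, with phonemes as strings; the transcription in the toy example is invented and makes no claim about actual foot or tone group boundaries.

```python
PHONOLOGY_RANKS = ("tone group", "foot", "syllable", "phoneme")

def is_well_formed(unit, ranks=PHONOLOGY_RANKS):
    """True if `unit` is a valid unit of rank ranks[0].

    Every unit above the lowest rank must consist of one or more
    units of the next rank down; the lowest rank is a plain string.
    """
    if len(ranks) == 1:
        return isinstance(unit, str)
    return (
        isinstance(unit, list)
        and len(unit) >= 1
        and all(is_well_formed(part, ranks[1:]) for part in unit)
    )

# Toy tone group: two feet, each consisting of one syllable of three phonemes
tone_group = [[["k", "a", "t"]], [["s", "a", "t"]]]
ok = is_well_formed(tone_group)
```

Because the depth of the structure is fixed by the number of ranks rather than by the length of the unit, a longer tone group adds constituents at a given rank but never adds a level.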
Based on this reading of Halliday, one could ask of any multimodal analyst:
1. For a given stratum, how many ranks are you proposing, and at which rank
is your description located?
2. Are there distinct systems of valeur on each of the ranks you propose?
3. Are your distinct systems of valeur related by constituency (as parts to
wholes)?
[Figure 12.6: labels recovered from the figure, in order: sequence; clause; figure; group; etc.; etc.; word; syllable; etc.; phoneme]
Metafunction
In SFL the principle of distinctive valeur is also used to explore the organiza-
tion of valeur with respect to kinds of meaning (Halliday & Matthiessen 2004,
2009). Distinctive regions of relatively interdependent systems are referred
to as metafunctions – for example, ideational meaning (TRANSITIVITY),
interpersonal meaning (MOOD) and textual meaning (THEME) at clause rank in
lexicogrammar. What is critical here is that ideational systems complement
interpersonal systems which complement textual ones. The three kinds of
meaning cannot be integrated hierarchically into one super system; each
perspective is partial and a comprehensive account of valeur depends on look-
ing from three directions at the same time. When viewing Figure 12.7, it is
important to keep in mind that metafunctions are not three parts of language,
but three simultaneous dimensions of meaning.
Based on this reading of Halliday, one could ask of any multimodal analyst:
1. For a given semiotic system, how many metafunctions are you proposing?
2. Are there topologically distinct systems of valeur for each of the meta-
functions you propose?
3. By what criteria are systems of valeur seen as relatively independent of or
interdependent with one another?
SFL makes further suggestions about the types of structural realization associ-
ated with different kinds of meaning (e.g. Martin 1996), with interpersonal
meaning realized through prosodic structures, textual meaning through peri-
odic structures and ideational meaning through particulate ones (orbital for
experiential meaning and serial for logical meaning) (see Figure 12.8).
[Figure 12.7: metafunctions: textual, interpersonal, ideational]
[Figure 12.8: types of structure: periodic (textual), prosodic (interpersonal), and particulate, serial or orbital (ideational)]
Based on this reading of Halliday one could ask of any multimodal analyst:
1. For a given semiotic system, how many kinds of structural realization are
you proposing?
2. Are the different types of realization associated with different types of meaning?
3. When analogizing from metafunctions in language to your semiotic system
did you take kinds of meaning or types of structure as point of departure?
[Figure 12.9: system network for English MOOD: [indicative] (+Subject; +Finite) versus [imperative]; within [indicative], [declarative] (Subject^Finite) versus [interrogative] (Finite^Subject)]
System/Structure Cycles
Absolutely critical to the discussion of stratification, rank and metafunction in
this section is the theoretical dimension of axis, which underpins the relation
of system and structure in SFL (Halliday & Matthiessen 2009:41–52). Like
signifié and signifiant, system and structure are mutually defining complemen-
tarities. Paradigmatic relations (formalized in system networks) are ‘realized’
through syntagmatic relations (formalized in function structures), and con-
versely, syntagmatic relations constrain and motivate paradigmatic ones. A snip-
pet of this interfacing is outlined in Figure 12.9 for English MOOD, where
the choice of [indicative] conditions the presence of both a Subject and a
Finite element of structure, and the more delicate choices of [declarative] or
[interrogative] sequence them in relation to one another.
SFL depends on system/structure cycles of this kind to establish the ways
in which systems formalizing valeur are related to one another, and emergently
organized according to strata, rank and metafunction.
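The MOOD fragment described above can be sketched as a small system network whose features carry realization statements. This is a minimal Python sketch and not an implementation of any SFL software: the names SYSTEMS, REALIZATIONS and realize are hypothetical, the entry condition "clause" is an assumption, and only the features and realization statements mentioned in the text and in Figure 12.9 are represented.

```python
# Systems: an entry condition opens a choice between mutually exclusive features.
# The entry condition "clause" is an assumption of this sketch.
SYSTEMS = {
    "clause": ("indicative", "imperative"),
    "indicative": ("declarative", "interrogative"),
}

# Realization statements attached to features:
# ("insert", F) corresponds to +F; ("order", A, B) corresponds to A^B.
REALIZATIONS = {
    "indicative": [("insert", "Subject"), ("insert", "Finite")],
    "declarative": [("order", "Subject", "Finite")],
    "interrogative": [("order", "Finite", "Subject")],
    "imperative": [],
}


def realize(selection):
    """Map a path of features onto (inserted functions, sequence or None)."""
    entry, inserted, sequence = "clause", [], None
    for feature in selection:
        if feature not in SYSTEMS.get(entry, ()):
            raise ValueError(f"[{feature}] is not an option after [{entry}]")
        for kind, *functions in REALIZATIONS[feature]:
            if kind == "insert":
                inserted.extend(functions)
            elif kind == "order":
                missing = [f for f in functions if f not in inserted]
                if missing:
                    raise ValueError(f"cannot sequence absent functions: {missing}")
                sequence = "^".join(functions)
        entry = feature
    return inserted, sequence
```

For example, realize(["indicative", "interrogative"]) inserts Subject and Finite and sequences them as Finite^Subject, whereas realize(["declarative"]) raises an error, because the more delicate choice is only available once [indicative] has been selected. In this sense the paradigmatic choices condition the syntagm, and the syntagm gives the choices their motivation.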
Based on this reading of paradigmatic and syntagmatic relations in SFL one
could ask of any multimodal analyst:
Instantiation
Developing Hjelmslev and Firth (e.g. Firth 1957a), Halliday argues that the
hierarchy of realization outlined above has to be complemented by a hierarchy
of instantiation relating the systemic potential of a language to instances of
use (e.g. Halliday & Matthiessen 1999:382–387, 2009:79–82). In Hjelmslev’s
terms, this is the relation of system to process (for semiotic systems in general)
[Figures not reproduced. Recoverable labels: a cline of instantiation running from system through genre/register and text type to text and reading; and system/text pairs for each stratum: genre, register, discourse semantics, lexicogrammar, phonology.]
Based on this reading of system⁵ and text in SFL one could ask of any
multimodal analyst:
Coupling
While realization tells us what choices are available, instantiation explores
which choices are taken up and how they are put together to form a text
(Martin 2008a, b, 2010). The logogenetic process whereby meanings from
different systems are woven together along the instantiation hierarchy is
referred to by Martin (2008a) as coupling (Zappavigna et al. 2010). Coupling
may involve combining choices from the same semiotic system (across ranks,
[Figure not reproduced: cline of integration, from maximal integration (same system) to minimal integration, with increasing degree of diversification: different syntagm, different lower stratum, different semiotics.]
Table 12.1 Sample convergent and divergent coupling matrix (across image
and verbiage)

metafunction    meaning potential     coupling                  meaning potential
                (visual)              (convergent ↔ divergent)  (verbal)
ideational                            CONCURRENCE
                character depiction                             attribution
                setting                                         circumstances
                ...
interpersonal                         RESONANCE
                facial expression                               affect
                ambience                                        affect
                ...
textual                               SYNCHRONY
                balance                                         Theme/New
                array of foci                                   periodicity
                ...
that deployed by Kress & van Leeuwen 1996/2006 for compositional relations
of Ideal/Real, Given/New and Centre/Margin across verbiage and image
modalities – but which has arguably proved more difficult to implement for
interpersonal meaning.
Based on these intermodal integration and complementarity issues one
can ask the multimodal analyst:
Commitment
Instantiation also opens up theoretical and descriptive space for considering
commitment (Martin 2008a, 2010), which refers to the amount of meaning
instantiated as a text unfolds. This depends on the number of optional systems
taken up and the degree of delicacy pursued in those that are, so that the more
systems entered, and the more options chosen, the greater the semantic weight
of a text (Hood 2008). Commitment is one avenue for further exploring Kress
and van Leeuwen’s (e.g. 2001) notion of the affordances of a given semiotic
system. By affordance they refer to the facility with which a certain kind of
meaning is committed in one semiotic system compared to another – for
example, the different ways in which verbiage and image register evaluation,
through facial expression and bodily stance resources in image versus appraisal
resources in verbiage (Martin & White 2005). On the one hand, language has
extensive resources for inscribing affect, judgement and appreciation, whereas
image arguably inscribes a narrower range of typological affectual distinctions
and can only invoke, not inscribe, judgement and appreciation; at the same
time, images arguably afford a visceral somatic attitudinal punch (cf. Martin
2001, Stenglin, Chapter 4) that can only be approximated in language through
verbal imagery (i.e. lexical metaphor). Painter & Martin (in press) discuss
examples of the rhetorical effect of the complementary commitment of verbiage
and image in children’s picture books.
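The idea that commitment depends on the number of systems taken up and on the delicacy pursued within them can be given a deliberately crude numerical sketch. The Python fragment below is an assumption-laden illustration, not a measure proposed by Martin or Hood: the function name commitment is hypothetical, the count treats every option as equal in weight, and the system and feature names are placeholders chosen for illustration, not an analysis of any text.

```python
def commitment(selections):
    """Return (systems entered, options chosen) for one instance.

    `selections` maps a system name to the path of features chosen in it,
    ordered from least to most delicate; an empty path means the optional
    system was not taken up.
    """
    entered = {system: path for system, path in selections.items() if path}
    return len(entered), sum(len(path) for path in entered.values())


# Placeholder instances (hypothetical): the verbiage takes up two systems and
# pursues them to greater delicacy; the image takes up one, less delicately.
verbiage = {
    "ATTITUDE": ["affect", "unhappiness", "misery"],
    "GRADUATION": ["force", "raised"],
}
image = {
    "ATTITUDE": ["affect"],
    "GRADUATION": [],
}
```

On this count the verbiage commits more meaning than the image, but the count says nothing about the kind of meaning committed, which is where the question of affordances arises.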
Based on this discussion of affordances and commitment one could ask the
multimodal analyst:
1. How do you model the amount of meaning committed and thereby the com-
plementary contribution of different semiotic systems in an intermodal text?
2. How does the semantic weight of a given system’s contribution reflect its
affordances?
Semiotic Margins
The Semiotic Margins conference, the key proceedings of which this
chapter is attempting to close, addressed the question of systems of meaning
which in some sense have marginal status as semiotic systems. Unlike language,
image, music, dance and space, which are generally regarded as canonical
semiotic systems, these systems are often treated as somehow dependent on
denotative semiotic systems. Body language (including gesture, posture, facial
expression and proxemics), and paralanguage (including vocal timbre, tempo
and loudness) are well-known examples. For ease of reference we’ll refer to
body language and paralanguage together as body language below.
1. as an elaboration of protolanguage,
2. as part of language’s expression form, coordinated with phonology, and
3. as a supplementary instantiation of language’s content plane.
In doing so, Cléirigh makes explicit three senses in which body language can be
interpreted as a semiotic margin.
Parametric Systems
Van Leeuwen (2009; see also 1999, Kress & van Leeuwen 2001) discusses what
he calls parametric systems. These systems have the property of involving a
number of simultaneous systems, consisting of two terms, which are graded in
relation to one another rather than in dichotomous opposition. He illustrates
systems of this kind through sound quality (Figure 12.13), where, for example,
a singing voice can be more or less tense or lax, loud or soft, high or low, and
so on (see also Caldwell, in press). Van Leeuwen notes that the same kind of
parametric system can be proposed for typography and colour (cf. Kress & van
Leeuwen 2002, van Leeuwen 2005a).
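The two defining properties of a parametric system, simultaneity and gradedness, can be sketched in a few lines of Python. The pairs of terms are those labelled in Figure 12.13; everything else is an assumption of the sketch and not part of van Leeuwen's description: the parameter names (tension, loudness and so on), the 0.0 to 1.0 scale, and the function describe are all hypothetical.

```python
# Each parameter is a two-term system; the terms are those of Figure 12.13.
# The parameter names themselves are labels invented for this sketch.
PARAMETERS = {
    "tension": ("tense", "lax"),
    "loudness": ("loud", "soft"),
    "pitch": ("high", "low"),
    "roughness": ("rough", "smooth"),
    "breathiness": ("breathy", "non-breathy"),
    "vibrato": ("vibrato", "plain"),
    "nasality": ("nasal", "non-nasal"),
}


def describe(voice):
    """Describe a voice by the pole each parameter leans towards.

    Values are graded between 1.0 (fully the first term) and 0.0 (fully the
    second term); this scale is an assumption of the sketch.
    """
    missing = sorted(set(PARAMETERS) - set(voice))
    if missing:  # simultaneity: every parameter is always in play
        raise ValueError(f"every parameter needs a value; missing: {missing}")
    description = {}
    for name, (first, second) in PARAMETERS.items():
        value = voice[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0.0 and 1.0")
        description[name] = (first if value >= 0.5 else second, value)
    return description


voice = {"tension": 0.8, "loudness": 0.3, "pitch": 0.6, "roughness": 0.1,
         "breathiness": 0.7, "vibrato": 0.5, "nasality": 0.0}
```

A voice is thus a point in a seven-dimensional space and not a path through a network of either/or choices, which is one way of picturing the contrast with dichotomous opposition.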
Taking Cléirigh’s perspective on body language as point of departure, sys-
tems of this kind could all be explored in relation to language for their proto-
linguistic, linguistic and epilinguistic potential. As elaborated protolanguage
they co-opt physical and material resources to construe the visceral embodied
meanings that can be fashioned out of sound (cf. McDonald, Chapter 5), colour
and typeface. At the same time, like linguistic body language, they can be
co-opted by the expression plane of language to punctuate a phase of discourse
or enhance its tone (cf. Caldwell and Zappavigna, Chapter 11). As epilanguage
they can be deployed ideationally to represent physical or biological phenom-
ena (e.g. bird calls), interpersonally to register feeling (e.g. imminent danger
music in a film soundtrack) and textually to highlight meanings (e.g. bold-face,
italics, special font, colour in text).
We also need to allow for the fact that parametric resources can interact
with denotative semiotics other than language, for example, image, music or
dance. The sound track for a film, for example, arguably functions in all three
ways in relation to moving images, as would the music accompanying dance,
Figure 12.13 Van Leeuwen’s (1999) systems for sound quality (a parametric
system): tense/lax, loud/soft, high/low, rough/smooth, breathy/non-breathy,
vibrato/plain, nasal/non-nasal
I shall use the term repertoire to refer to the set of strategies and their analogic
potential possessed by any one individual and the term reservoir to refer to the
total of sets and its potential of the community as a whole. Thus the repertoire
of each member of the community will have both a common nucleus but
there will be differences between the repertoires. There will be differences
between the repertoires because of the differences between members arising
out of differences in members’ context and activities and their associated
issues. (Bernstein 1996:157)
[Figures not reproduced. Recoverable labels: culture, master identity, sub-culture and persona, related by allocation and affiliation; and reservoir/repertoire pairs for each stratum: genre, register, discourse semantics, lexicogrammar, phonology.]
In light of these concerns with identity and affiliation, one could ask:
1. How do you describe the allocation of the semiotic resources you are
focusing on to repertoires of users?
2. How do these repertoires engender communities of such users?
3. Is there a distinctive role for denotative semiotic, protosemiotic and episemi-
otic systems in this process?
as say :-) or :-(, one might even argue that they have been fully co-opted by lan-
guage’s expression plane as part of graphology (cf. Knox 2009a, b). Over time,
the overwhelming trend is to make things mean, and co-opt more and more of
the somatic and physical environment of semiosis into the semiosis itself, as
technology affords. Currently electronic communication and ‘clinical’ inter-
vention (e.g. plastic surgery, piercing and tattoo) are key technologies shifting
the borders of what counts as in or outside of a social semiotician’s gaze.
However excluded, somatic and physical systems are the material context in
which we mean, so straddling the borders of Figure 12.16, on some kind of
interdisciplinary basis or other, is ever a useful corrective to social semiotic ana-
lysis alone. We also need to consider the possibility of construing somatics as
if it were semiosis; Martinec’s work on action, for example (1998, 2000a, b, c,
2001; cf. Martinec 2004 on what is termed here), might well be considered
an endeavour of this kind by those (not including the author) who view its
purview as somehow beyond semiosis.
In light of these concerns with the limits of semiosis, one could ask:
1. On what basis do you distinguish between the semiosis you are considering
and its biological and/or physical environment?
[Figure 12.16: physical, somatic and social semiotic systems]
Challenging Theory
Let me make just two points in closing. The first is the importance of bringing
time into the picture, which I have not had space to pursue here. Clearly,
multimodal texts unfold through time (Zhao 2010), and what was referred to
as instantiation above is a logogenetic process – snowballing meaning. In addi-
tion, clearly, identity is something that develops throughout the lifetime of an
individual, and what was referred to as individuation above is an ontogenetic
process accumulating logogenesis as repertoire – seasons of meaning (Painter
2009) (see Figure 12.17). And finally, clearly, systems evolve, as reservoirs of
meaning adapt phylogenetically to changing technologies and environmental
[Figure 12.17: instantiation (system/reservoir to text) and individuation (reservoir to repertoire) in relation to logogenesis, ontogenesis and phylogenesis]
Acknowledgement
Notes
1
Supervenience sits in contrast to circumvenient systems discussed in this chapter.
Supervenient systems are in a relation of realization, whereas circumvenient
systems are in a relation of embedding.
2
Halliday (e.g. 1975) in fact describes infants’ protolanguage as bi-stratal in spite
of the fact that for languages of this kind valeur on the two strata would have to
be identical. I prefer an emergent complexity model here in which stratification
cannot be proposed in the absence of distinct valeur (and thus metaredundancy);
see Painter 1984:34–36 and Matthiessen 2007a:516–519 for discussion (for
Matthiessen the two SFL theoretical dimensions of axis and strata are conflated
in protolanguage).
3
In answering this query we have to keep in mind that structural realization may
involve a single unit; in Tagalog, for example, the realization of interrogative
from a MOOD network like that in Figure 12.9 would be the question particle ba,
not a syntagm like Finite^Subject. It is also crucial to distinguish axial realization
from instantiation; both dimensions of axis, paradigmatic system and syntagmatic
structure are instantiated in texts as they logogenetically unfold – the structural
realization of a system is NOT its instantiation!
4
To this hierarchy I have added the strata of register and genre (after Martin 1992,
Martin & Rose 2008), by way of incorporating social context as higher levels
of abstraction; in Hjelmslev’s terms register and genre are connotative semiotic
systems, defined as systems which take another semiotic as their expression
plane (versus denotative semiotics which have their own expression plane).
5
Note that I am now, like most systemicists, in the uncomfortable position of
having used the term system in two ways – with respect to axis as the paradigmatic
complement of syntagmatic structure, and with respect to instantiation as the
meaning potential specialized in texts. This dual usage of the term in SFL is too
sedimented to undo here, and in any case reflects the privileged position given
to paradigmatic relations as far as modelling the systemic reservoir of meanings
in a culture is concerned.
6
Halliday (e.g. 2004a) and Matthiessen (e.g. 2007a:516) prefer an interpretation
of protolanguage in which axis is conflated with strata (i.e. system conflated
with content form and structure with expression form) and thus refer to
protolinguistic systems as bi-stratal.
7
Hood’s work on gestures illustrating abstract concepts in disciplinary discourse
suggests a possible extension of Cléirigh’s conception of epilinguistic systems to
diagrams, which co-opt images in order to co-instantiate language’s content plane,
by drawing on a page (versus drawing in the air); a PowerPoint presentation or
instruction manual consisting solely of diagrams would thus be akin to mime.
8
Treating epilinguistic body language as co-instantiating language’s content form
is an extrapolation by the author from Cléirigh’s work.
9
Matthiessen (e.g. 2004, 2007a), following Halliday, proposes four stages of
evolution, physical, biological, social and semiotic; I find it hard to imagine a
social system without some form of communication, so prefer the physical,
biological and social semiotic genesis implied in Figure 12.16 here. It might
be possible, however, to draw a line between social systems dependent on ‘proto-
language’ type systems alone and those additionally deploying stratified linguistic
systems, which may be what Halliday and Matthiessen have in mind.
10
This would be especially true for activity theorists, who instead of seeing action
as a kind of meaning see meaning as a kind of behaviour involving verbal
artefacts (e.g. Norris & Jones 2005).
References
Bateman, J. (2008). Multimodality and genre: A foundation for the systematic analysis
of multimodal documents. London: Palgrave Macmillan.
Bednarek, M. & Martin, J.R. (Eds) (2010). New discourse on language: Functional
perspectives on multimodality, identity, and affiliation. London: Continuum.
Bernstein, B. (1996). Pedagogy, symbolic control and identity: Theory, research, critique.
London: Taylor & Francis. [Revised Edition 2000].
Caldwell, D. (in press). Making Many Meanings in Popular Rap Music. In
A. Mahboob & N. Knight (Eds), Directions in Appliable Linguistics. London:
Continuum.
Caple, H. (2009). Playing with words and pictures: Text-image relations and
semiotic interplay in a new genre of western news reportage. PhD Thesis,
Department of Linguistics, University of Sydney, Sydney.
Cléirigh, C. (in preparation). The life of meaning. Draft manuscript.
Firth, J.R. (1957a). A synopsis of linguistic theory, 1930–1955. In Studies in Linguistic
Analysis (Special volume of the Philological Society), pp. 1–13. London:
Blackwell. Reprinted in Palmer, F.R. (Ed.) (1968). Selected papers of J R Firth,
1952–1959 (pp. 168–205). London: Longman.
Firth, J.R. (1957b). Personality and language in society. Papers in linguistics
1934–1951 (pp. 177–189). Oxford: Oxford University Press.
Gill, T. (2002). Visual and verbal playmates: An exploration of visual and verbal
modalities in children’s picture books. BA Hons Thesis, Department of
Linguistics, University of Sydney, Sydney.
Halliday, M.A.K. (1975). Learning how to mean: Explorations in the development of
language. London: Edward Arnold (Explorations in Language Study).
Halliday, M.A.K. (2004a). The language of early childhood. London: Continuum
(Vol. 4 in the Collected Works of M.A.K. Halliday).
Halliday, M.A.K. (2004b). The language of science. London: Continuum (Vol. 5 in the
Collected Works of M.A.K. Halliday).
Halliday, M.A.K. & Matthiessen, C.M.I.M. (1999). Construing experience through
language: A language-based approach to cognition. London: Cassell.
Halliday, M.A.K. & Matthiessen, C.M.I.M. (2004). An introduction to functional
grammar (3rd edn). London: Edward Arnold.
Halliday, M.A.K. & Matthiessen, C.M.I.M. (2009). Systemic functional grammar: A first
step into theory. Beijing: Higher Education Press.
Hasan, R. (2005). Language, society and consciousness. London: Equinox (Vol. 1 in
The Collected Works of Ruqaiya Hasan, edited by J. Webster).
Hasan, R. (2009). Semantic variation: Meaning in society and sociolinguistics. London:
Equinox (The Collected Works of Ruqaiya Hasan, edited by J. Webster).
Hjelmslev, L. (1961). Prolegomena to a theory of language. Madison, WI: University
of Wisconsin Press.
Hood, S. (2008). Summary writing in academic contexts: Implicating meaning in
processes of change. Linguistics and Education 19, 351–365.
Knight, N. (2010). Wrinkling complexity: Concepts of identity and affiliation
in humour. In M. Bednarek & J.R. Martin (Eds), New discourse on language:
Functional perspectives on multimodality, identity, and affiliation (pp. 35–58). London:
Continuum.
Knox, J.S. (2009a). Visual minimalism in hard news: Thumbnail faces on the smh
online home page. Social Semiotics, 19(2), 165–189.
Knox, J.S. (2009b). Punctuating the home page: Image as language in an online
newspaper. Discourse and Communication, 3(2), 145–172.
Kress, G. & van Leeuwen, T. (1996). Reading images: The grammar of visual design.
London: Routledge [Revised second edition 2006].
Kress, G. & van Leeuwen, T. (2001). Multimodal discourse – The modes and media of
contemporary communication. London: Edward Arnold.
Kress, G. & van Leeuwen, T. (2002). Colour as a semiotic mode: Notes for a
grammar of colour. Visual Communication, 1(3), 343–368.
Lemke, J. (1984). Semiotics and education. Toronto: Toronto Semiotic Circle (Mono-
graphs, Working Papers and Publications 2).
Martin, J.R. (1992). English text: System and structure. Amsterdam: Benjamins.
Martin, J.R. (1996). Types of structure: Deconstructing notions of constituency in
clause and text. In E.H. Hovy & D.R. Scott (Eds), Computational and conversational
discourse: Burning issues – an interdisciplinary account (pp. 39–66). Heidelberg:
Springer (NATO Advanced Science Institute Series F – Computer and Systems
Sciences, Vol. 151).
Martin, J.R. (2001). Fair trade: Negotiating meaning in multimodal texts. In
P. Coppock (Ed.), The semiotics of writing: transdisciplinary perspectives on the
technology of writing (pp. 311–338). Brepols (Semiotic & Cognitive Studies X).
Martin, J.R. (2008a). Tenderness: Realisation and instantiation in a Botswanan
town. Odense Working Papers in Language and Communication (Special Issue of
Papers from 34th International Systemic Functional Congress edited by Nina
Nørgaard) (pp. 30–62).
Martin, J.R. (2008b). Intermodal reconciliation: Mates in arms. In L. Unsworth
(Ed.), New literacies and the English curriculum: Multimodal perspectives (pp. 112–148).
London: Continuum.
Martin, J.R. (2008c). Innocence: Realisation, instantiation and individuation in
a Botswanan town. In N. Knight & A. Mahboob (Eds), Questioning linguistics
(pp. 27–54). Cambridge: Cambridge Scholars Publishing.
Martin, J.R. (2010). Semantic variation: Modelling system, text and affiliation in
social semiosis. In M. Bednarek & J.R. Martin (Eds), New discourse on language:
Functional perspectives on multimodality, identity, and affiliation (pp. 1–34). London:
Continuum.
Martin, J.R. & Rose, D. (2008). Genre relations: Mapping culture. London: Equinox.
Martin, J.R. & Stenglin, M. (2006). Materialising reconciliation: Negotiating
difference in a post-colonial exhibition. In T. Royce & W. Bowcher (Eds),
New directions in the analysis of multimodal discourse (pp. 215–238). Mahwah, NJ:
Lawrence Erlbaum Associates.
Martin, J. & White, P.R.R. (2005). The language of evaluation: Appraisal in English.
London: Palgrave (Chinese translation in preparation for Peking University
Press).
Martinec, R. (1998). Cohesion in action. Semiotica, 120(1/2), 161–180.
Martinec, R. (2000a). Rhythm in multimodal texts. Leonardo, 33(4), 289–297.
Martinec, R. (2000b). Types of process in action. Semiotica, 130(3/4), 243–268.
Martinec, R. (2000c). Construction of identity in M Jackson’s ‘Jam’. Social Semiotics,
10(3), 313–329.
Martinec, R. (2001). Interpersonal resources in action. Semiotica, 135(1/4), 117–145.
Martinec, R. (2004). Gestures which co-occur with speech as a systematic resource:
The realisation of experiential meaning in indexes. Social Semiotics, 14(2),
193–213.