Manovich 2006 Image Future
Image Future
Lev Manovich
For the larger part of the 20th century, different areas of commercial
moving image culture maintained their distinct production methods
and distinct aesthetics. Films and cartoons were produced completely
differently and it was easy to tell their visual languages apart. Today
the situation is different. Computerization of all areas of moving image
production created a common pool of techniques, which can be used
regardless of whether one is creating motion graphics for television, a
narrative feature, an animated feature, or a music video. The ability to
composite many layers of imagery with varied transparency, to place
still and moving elements within a shared 3D virtual space and then
move a virtual camera through this space, to apply simulated motion
blur and depth of field effect, to change over time any visual parameter
of a frame – all these can now be equally applied to any images, regard-
less of whether they were captured via a lens-based recording, drawn
by hand, created with 3D software, etc.
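The first of these techniques, compositing layers with varied transparency, reduces per pixel to the standard 'over' operator. The following is a minimal single-channel sketch; the function names are illustrative rather than taken from any particular tool.

```python
# A minimal sketch of layer compositing with varied transparency: the
# standard 'over' operator blends a foreground pixel onto a background
# pixel according to the foreground's alpha (opacity) value.
# Function names are illustrative, not from any particular tool.

def over(fg, bg, alpha):
    """Composite one color channel of a foreground over a background.

    fg, bg: channel values in [0, 1]; alpha: foreground opacity in [0, 1].
    """
    return fg * alpha + bg * (1.0 - alpha)

def composite_layers(layers, background=0.0):
    """Stack layers bottom-to-top; each layer is a (value, alpha) pair."""
    result = background
    for value, alpha in layers:
        result = over(value, result, alpha)
    return result
```

A fully opaque layer (alpha = 1.0) simply replaces what is beneath it; a half-transparent layer lets half of the underlying stack show through, which is what allows lens-based, hand-drawn, and 3D elements to be interwoven in one frame.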
The existence of this common vocabulary of computer-based tech-
niques does not mean that all films now look the same. What it means,
however, is that while most live action films and animated features do
look quite distinct today, this is the result of deliberate choices rather
than the inevitable consequence of differences in production methods
and technology. At the same time, outside the realm of live action films
and animation features, the aesthetics of moving image culture dramat-
ically changed during the 1990s.
What happened can be summarized in the following way. Around
the mid-1990s, the simulated physical media for moving and still image
production (cinematography, animation, graphic design, typography),
new computer media (3D animation), and new computer techniques
(compositing, multiple levels of transparency) started to interact
within a single computing environment – either a personal computer
or a relatively inexpensive graphics workstation affordable for small
companies and even individuals. The result was the emergence of a
new hybrid aesthetics that quickly became the norm. Today this
aesthetics is at work in practically all short moving image forms: TV
advertising and broadcast graphics, music videos, short animations,
film titles, web splash pages. It also defines
a new field of media production – motion graphics – but it is import-
ant to note that the hybrid aesthetics is not confined to this field but
can be found at work everywhere else.
This aesthetics exists in endless variations but its logic is the same:
juxtaposition of previously distinct visual languages of different media
within the same sequence and, quite often, within the same frame.
Hand-drawn elements, photographic cutouts, video, type, 3D elements
are not simply placed next to each other but interwoven. The result-
ing visual language is a hybrid. It can also be called a metalanguage as
it combines the languages of design, typography, cel animation, 3D
computer animation, painting, and cinematography.
In addition to special effects features, the hybrid (or meta) aesthetics
of a great majority of short moving image sequences that surround
us today is the most visible effect of computerization of moving image
production. In this case, animation frequently appears as one element
of a sequence or even a single frame. But this is only the more obvious
role of animation in the contemporary post-digital visual landscape. In
this article I will discuss its other role: as a generalized technique that
can be applied to any images, including film and video. Here, anima-
tion functions not as a medium but as a set of general-purpose tech-
niques – used together with other techniques in the common pool of
options available to a filmmaker/designer.
I have chosen a particular example for my discussion that I think
illustrates well this new role of animation. It is a relatively new method
Uneven development
Before proceeding, I should note that not all of the special effects in
The Matrix rely on Universal Capture and, of course, other Hollywood
films already use some of the same strategies. However, in this article
I focus on the use of this process in The Matrix because Universal
Capture was actually developed for the second and third films of the
trilogy. And while the complete credits for everybody involved in
developing the process would run for a number of lines, in this text I
will identify it with Gaeta. The reason is not simply that, as the senior
special effects supervisor for The Matrix Reloaded (2003) and The
Matrix Revolutions (2003), he received the most publicity. More
importantly, in contrast to many others in the special effects industry,
Gaeta has extensively reflected on the techniques he and his colleagues
developed, presenting them as a new paradigm for cinema and entertainment,
and coining useful terms and concepts for understanding them.
In order to understand better the significance of Gaeta’s method,
let us briefly run through the history of 3D photo-realistic image
synthesis and its use in the film industry. In 1963, Lawrence G. Roberts
(then a graduate student at MIT, who would later become one of the
key people behind the development of Arpanet) published a description
of a computer algorithm for constructing images in linear perspective.
These images represented objects' edges as lines; in the contemporary
language of computer graphics they would be called 'wire frames'. Approximately 10
years later, computer scientists designed algorithms that allowed for
the creation of shaded images (so-called Gouraud shading and Phong
shading, named after the computer scientists who created the corre-
sponding algorithms). From the middle of the 1970s to the end of the
1980s, the field of 3D computer graphics went through rapid development.
Every year new fundamental techniques were created: transparency,
shadows, image mapping, bump texturing, particle systems,
compositing, ray tracing, radiosity, and so on.2 By the end of this
creative and fruitful period in the history of the field, it was possible
to use a combination of these techniques to synthesize images of
almost any subject that were often not easily distinguishable from
traditional cinematography.
All this research was based on one fundamental assumption: in
order to re-create an image of reality identical to the one captured by
30 animation: an interdisciplinary journal 1(1)
special effects artists have dealt with this challenge using a variety of
techniques and methods. What Gaeta realized earlier than others is
that the best way to align the two universes of live action and 3D
computer graphics is to build a single new universe.4
Rather than treating sampling reality as just one technique to be
used along with many other ‘proper’ algorithmic techniques of image
synthesis, Gaeta and his colleagues turned it into the key foundation
of the Universal Capture process. The process systematically takes
physical reality apart and then reassembles the elements into a
virtual, computer-based representation. The result is a new kind
of image that has a photographic/cinematographic appearance and
level of detail yet internally is structured in a completely different way.
Universal Capture was developed and refined over a three-year
period from 2000 to 2003 (Borshukov, 2004). How does the process
work? There are actually more stages and details involved, but the
basic procedure is as follows (for more details, see Borshukov et al.,
2003). An actor’s performance in ambient lighting is recorded using
five synchronized high-resolution video cameras. ‘Performance’ in this
case includes everything an actor says in a film and all possible facial
expressions.5 (During production, the studio was capturing over 5
terabytes of data each day.) Next, special algorithms are used to track
each pixel's movement from frame to frame. This information is
combined with a 3D model of a neutral expression of the actor created
using a cyberscan scanner. The result is an animated 3D shape that
accurately represents the geometry of the actor’s head as it changes
during a particular performance. The shape is mapped with color
information extracted from the captured video sequences. A separate
very high resolution scan of the actor’s face is used to create the map
of small-scale surface details like pores and wrinkles, and this map is
also added to the model.
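Very schematically, the stages just described can be summarized as a data-flow sketch. Every function below is a hypothetical stub standing in for proprietary tools; only the order of operations follows the published description.

```python
# Schematic outline of the Universal Capture stages described above.
# All functions are hypothetical placeholders (they return labeled
# tuples so the sketch runs); the real stages are image-processing
# pipelines described in Borshukov et al. (2003).

def track_pixels(streams):
    """Stage 1: track each pixel's motion across the synchronized views."""
    return ("flow", streams)

def deform(neutral_scan, flow):
    """Stage 2: drive the neutral cyberscan model with tracked motion."""
    return ("animated_shape", neutral_scan, flow)

def apply_color(shape, streams):
    """Stage 3: map color extracted from the captured video onto the shape."""
    return ("textured", shape)

def apply_detail(model, detail_scan):
    """Stage 4: add the high-resolution map of pores and wrinkles."""
    return ("virtual_human", model, detail_scan)

def universal_capture(video_streams, neutral_scan, detail_scan):
    """Run the four stages in order, yielding the 'virtual human' data."""
    flow = track_pixels(video_streams)
    shape = deform(neutral_scan, flow)
    textured = apply_color(shape, video_streams)
    return apply_detail(textured, detail_scan)
```

The point of the outline is the direction of the data flow: captured reality enters as video and scans, and what comes out is no longer footage but an animated, textured 3D model.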
After all the data have been extracted, aligned, and combined, the
result is what Gaeta calls a ‘virtual human’ – a highly accurate recon-
struction of the captured performance, now available as 3D computer
graphics data – with all the advantages that come from having such a
representation. For instance, because the actor’s performance now
exists as a 3D object in virtual space, the filmmaker can animate a
virtual camera and ‘play’ the reconstructed performance from an
arbitrary angle. Similarly, the virtual head can also be lit in any
way desired and attached to a separately constructed CG body
(Borshukov et al., 2004). For example, all the characters that appeared
in the Burly Brawl scene in Matrix 2 were created by combining the
heads constructed via Universal Capture done on the leading actors
with CG bodies which used motion capture data from a different set
of performers. Because all the characters as well as the set were
computer generated, this allowed the directors of the scene to chore-
ograph the virtual camera, making it fly around the scene in a way not
possible with real cameras on a real physical set.
Animation as an idea
methods, the animator does not directly create the movement. Instead
it is created by the software that uses some kind of mathematical
model. For instance, in the case of physically based modeling the
animator may set the parameters of a computer model, which simu-
lates a physical force such as a wind that will deform a piece of cloth
over a number of frames. Or the animator may specify that a ball drops
onto the floor, and let the physics model determine how the ball will
bounce after it hits. In the case of particle systems, used to
model everything from fireworks, explosions, water, and gas to animal
flocks and swarms, the animator only has to define initial conditions:
the number of particles, their speed, their lifespan, etc.
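A minimal sketch can make this division of labor concrete: the animator only sets the initial parameters (drop height, gravity, bounciness), and the simulation, not the animator's hand, produces the motion frame by frame. The parameter names here are illustrative, not drawn from any particular animation package.

```python
# A minimal physically based animation sketch: the animator sets
# parameters; the simulation generates the movement for every frame.

def simulate_ball(height=10.0, gravity=-9.8, restitution=0.6,
                  fps=24, frames=48):
    """Return the ball's height at each frame of a bouncing drop.

    restitution: fraction of speed kept after each bounce (energy loss).
    """
    dt = 1.0 / fps
    y, vy = height, 0.0
    trajectory = []
    for _ in range(frames):
        vy += gravity * dt      # gravity accelerates the ball downward
        y += vy * dt            # integrate position over one frame
        if y < 0.0:             # floor contact: bounce with energy loss
            y = 0.0
            vy = -vy * restitution
        trajectory.append(y)
    return trajectory
```

Re-running the function with a different `restitution` or `gravity` is exactly the production loop described above: adjust the parameters, run the model, inspect the result, and repeat until the performance is satisfactory.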
In contrast to live action cinema, these computer graphics methods
do not capture real physical movement. Does this mean that they belong
to animation? If we accept that the defining feature of traditional
animation was the manual creation of movement, the answer would be no.
But things are not so simple. With all these methods, animators set the
initial parameters, run the model, adjust the parameters, and repeat
this production loop until they are satisfied with the result. So while
the actual movement is produced not by hand but by a mathematical
model, animators maintain significant control. In a way, animators act
as film directors – only in this case they are directing not the actors
but a computer model until it produces a satisfactory performance. Or
we can also compare animators to film editors as they are selecting
the best performances of the computer model.
James Blinn, a computer scientist responsible for creating many
fundamental techniques of computer graphics, once made an interest-
ing analogy to explain the difference between the manual keyframing
method and physically based modeling.6 He told the audience at a
SIGGRAPH panel that the difference between the two methods is anal-
ogous to the difference between painting and photography. In Blinn’s
terms, an animator who creates movement by manually defining
keyframes and drawing in-between frames is like a painter who is
observing the world and then making a painting of it. The resemblance
between a painting and the world depends on the painter’s skills,
imagination and intentions, whereas an animator who uses physically
based modeling is like a photographer who captures the world as it
actually is. Blinn wanted to emphasize that mathematical techniques
can create a realistic simulation of movement in the physical world
and an animator only has to capture what is created by the simulation.
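For comparison, the keyframing method Blinn contrasts with simulation can be sketched just as minimally: the animator fixes values at keyframes and the software computes the in-between frames by interpolation (linear here; production tools typically use spline curves). The function below is an illustrative sketch, not any particular package's API.

```python
# A minimal sketch of keyframe in-betweening: the animator authors
# values at keyframes; the software interpolates the frames between.
# Linear interpolation is used here for simplicity.

def inbetween(keys, frame):
    """Interpolate a parameter between keyframes.

    keys: dict mapping frame number -> value, e.g. {0: 0.0, 24: 100.0}
    frame: the frame at which to evaluate the parameter.
    """
    marks = sorted(keys)
    if frame <= marks[0]:          # before the first keyframe: hold value
        return keys[marks[0]]
    if frame >= marks[-1]:         # after the last keyframe: hold value
        return keys[marks[-1]]
    for a, b in zip(marks, marks[1:]):
        if a <= frame <= b:        # find the bracketing keyframe pair
            t = (frame - a) / (b - a)
            return keys[a] + t * (keys[b] - keys[a])
```

Here the shape of the motion comes directly from the animator's chosen key values, which is why Blinn likens keyframing to painting: the result depends on the animator's hand rather than on a model of the world.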
Although this analogy is useful, I think it is not completely accurate.
Obviously, the traditional photographer whom Blinn had in mind (i.e.
before Photoshop) chooses composition, contrast, depth of field, and
many other parameters. Similarly, animators who are using physically
based modeling also have control over a large number of parameters
and it depends on their skills and perseverance to make the model
produce a satisfying animation. Consider the following example
from the related area of software art, which uses some of the same
And what about animation? What will be its future? As I have tried
to explain, besides purely animated films and animated sequences
used as a part of other moving image projects, animation has become
a set of principles and techniques that animators and filmmakers
employ today to create new methods and new visual styles. Therefore,
I think it is not worth asking if this or that visual style or method for
creating moving images that emerged after computerization is ‘anima-
tion’ or not. It is more constructive to say that most of these methods
were born from animation and have animation DNA – mixed with DNA
from other media. I think that a perspective that considers
'animation in an extended field' is a more productive way to think
about animation today, especially if we want our reflections to be
relevant for everybody concerned with contemporary visual and
media cultures.
Notes
1 For technical details of the method, see the publications of Georgi Borshukov
[www.virtualcinematography.org/publications.html].
2 Although not everybody would agree with this analysis, I feel that after the
end of the 1980s the field significantly slowed down, in part because all the
key techniques for creating photorealistic 3D images had already been
discovered. The rapid development of computer hardware in the 1990s also
meant that computer scientists no longer had to develop new techniques to
make rendering faster, since the already developed algorithms would now run
fast enough.
3 The terms ‘reality simulation’ and ‘reality sampling’ have been invented for
this article; the terms ‘virtual cinema’, ‘virtual human’, ‘universal capture’
and ‘virtual cinematography’ come from John Gaeta. The term ‘image-based
rendering’ first appeared in the 1990s.
4 Therefore, while the article in Wired which positioned Gaeta as a
groundbreaking pioneer and as a rebel working outside Hollywood contained
the typical journalistic exaggeration, it was not that far from the truth
(Silberman, 2003).
5 The method captures only the geometry and images of an actor’s head; body
movements are recorded separately using motion capture.
6 I am not sure about the exact year of the SIGGRAPH conference where Blinn
gave his presentation, but I think it was the end of the 1980s when physically
based modeling was still a new concept.
7 For more on this process, see the chapter ‘Synthetic Realism and its
Discontents’ in Manovich (2001).
8 From this perspective, my earlier book The Language of New Media (2001)
can be seen as a systematic investigation of a particular slice of
contemporary culture driven by this hybrid aesthetics: the slice where the
logic of the digital networked computer intersects the numerous logics of
already established cultural forms.
References
Blinn, J.F. (1978) ‘Simulation of Wrinkled Surfaces’, Computer Graphics, August:
286–92.
Borshukov, Georgi (2004) ‘Making of the Superpunch’, presentation at Imagina
2004, available at [www.virtualcinematography.org/publications/acrobat/
Superpunch.pdf].
Borshukov, Georgi, Piponi, Dan, Larsen, Oystein, Lewis, J.P. and Tempelaar-Lietz,
Christina (2003) ‘Universal Capture – Image-Based Facial Animation for “The
Matrix Reloaded”’, SIGGRAPH 2003 Sketches and Applications Program,
available at [http://www.virtualcinematography.org/publications/acrobat/
UCap-s2003.pdf].
Feeny, Catherine (2004) ‘“The Matrix” Revealed: An Interview with John Gaeta’,
VFXPro, 9 May [www.uemedia.net/CPC/vfxpro/article_7062.shtml].
Gaeta, John (2003) Presentation during a workshop on the making of The Matrix,
Art Futura 2003 festival, Barcelona, 12 October.
Manovich, Lev (2001) The Language of New Media. Cambridge, MA: MIT Press.
Silberman, Steve (2003) ‘Matrix 2’, Wired, 11 May [http://www.wired.com/
wired/archive/11.05/matrix2.html].
Venturi, Robert, Izenour, Steven and Scott Brown, Denise (1977[1972]) Learning
from Las Vegas: The Forgotten Symbol of Architectural Form, rev edn.
Cambridge, MA: MIT Press.