
A System For Automatic Animation of Piano Performances

This document describes a system for automatically animating piano performances from MIDI music files. The system uses a graph theory-based motion planning method to determine optimal fingerings for chords. It then generates 3D animations of a hand playing the piece by calculating finger and hand positions over time based on the fingering choices. The animations are optimized to smooth finger and hand motions between chords. The system is able to generate realistic-looking animations of piano performances from musical scores, using only limited sampled motion data rather than a motion capture recording of each piece.


A System for Automatic Animation of Piano Performances

Journal: Computer Animation and Virtual Worlds


Manuscript ID: CAVW-12-0012.R1
Wiley - Manuscript type: Research Article
Date Submitted by the Author: 23-Jun-2012
Complete List of Authors: Zhu, Yuanfeng; University of California, Davis, Computer Science
Ramakrishnan, Ajay; University of California, Davis, Computer Science
Hamann, Bernd; University of California, Davis, Computer Science
Neff, Michael; University of California, Davis, Computer Science
Keywords: piano animation, fingering generation, optimization method

Note: The following files were submitted by the author for peer review, but cannot be converted to
PDF. You must view these files (e.g. movies) online.
demo.rar


http://mc.manuscriptcentral.com/cavw - For Peer Review
Computer Animation and Virtual Worlds



Flowchart of system implementation



A trellis graph where consecutive time-slices correspond to consecutive chords in the piece. The weighted
nodes represent the cost of hand poses for a fingering choice and weighted edges represent the cost of hand
motion between the hand poses



Hand pose cost values for poses corresponding to different separation of the thumb and ring fingers



Repositioning of non-instructed fingers which are indicated by the red circles around them



Data flow of simulation of piano performance



Important joints in our hand model



The axes in our model



Algorithm to handle finger crossover



Arpeggio Skill



Hand motion from one chord to another. The hand will reach its highest position above the piano keyboard
in the middle of the motion



Rotation of instructed finger(s) influences the rotation of non-instructed fingers. In this example, the
rotation of instructed middle finger influences the rotation of non-instructed ring and little fingers



Key poses of finger crossover while playing scales. The first row shows a key-frame of the thumb crossing
over the middle finger while playing the C-major scale and the second row shows a key-frame of the thumb
crossing over the ring finger while playing the D-major scale, both from 3 perspectives. Note that the
ring/middle finger firmly presses down the keys, the fingers avoid collisions with black keys
in C-major, the wrist maintains a natural rotation and the thumb is positioned well on the key to play it after
crossing over



Correct fingering generated for the first part of "Bilder einer Ausstellung"



Pose for (a) E4-G4-C5 (b) F4-F5 (c) #D4-G4-#A4-#D5



Motion comparison before and after optimization. The first three sub-figures show that the motion curves along
the three axis components are smoothed after optimization, as the red curves show. The fourth sub-figure
shows that the rotation of the wrist about the vertical axis (the Y axis) is also smoothed. The four figures together
demonstrate that the hand moves with less distance and rotation for the same music clip after optimization,
and therefore the hand motion is smoother after the optimization routine



Motion comparison between ground truth data and optimization result. The motion curves along the three
translation components closely follow those of the corresponding motion capture data for the same
music clip



Some key-poses while playing "Childhood memory"
A System for Automatic Animation of Piano
Performances
Yuanfeng Zhu, Ajay Sundar Ramakrishnan, Bernd Hamann, Michael Neff
Department of Computer Science
University of California, Davis
1 Shields Ave.
Davis, CA, USA, 95616
Tel. 816-830-6264
email: [email protected]
Abstract
Playing the piano requires one to precisely position one's hands in order to strike
particular combinations of keys at specific moments in time. This paper presents the
first system for automatically generating three-dimensional animations of piano performance,
given an input MIDI music file. A graph theory-based motion planning method
is used to decide which set of fingers should strike the piano keys for each chord.
Anticipating the progression of the music, the positions of unused fingers are calculated to
make possible efficient fingering of future notes. Initial key poses of the hands, including
those for complex piano techniques such as crossovers and arpeggios, are determined
based on the finger positions and piano theory. An optimization method is used to refine
these poses, producing a natural and minimal-energy pose sequence. Motion transitions
between poses are generated using a combination of sampled piano playing motion and
music features, allowing the system to support different playing styles. Our approach
is validated through direct comparison with actual piano playing and simulation of a
complete music piece requiring various playing skills. Extensions of our system are
discussed.
Keywords: piano animation, fingering generation, optimization method
1 Introduction
The piano is a complex musical instrument to play, as one performs musical themes and
chords simultaneously. It is also a very complex task to correctly animate piano playing,
given the tight constraints on the timing of the motion, necessary to generate the correct
notes, and the high number of degrees of freedom of the hand and fingers. While specialized,
piano playing is not an uncommon activity and will occur in movies, games and virtual
environments. Furthermore, in the field of piano tuition, given any piece of music, amateur
players find it difficult to determine the most effective fingering and to render it exactly the
way it is supposed to be performed. Therefore, a system capable of generating high quality
piano animation may also be useful in piano tutoring.
Piano playing is a challenging task for multiple reasons. First, piano animation requires
high dimensional poses for both hands involving many joints (16 joints with 27 degrees
of freedom for each hand in our hand model). Second, exact musical timing constraints
require precisely aligning the hand motion with music in order to strike, hold and release
the instructed piano keys. Third, occlusions often occur in playing, such as when the hand
crosses over the thumb, making optical motion capture of limited use. Even if motion
capture were available, in order to play a new piece of music, the motion would need to
be adapted to the volume, velocity, pitch, note structure and timing of that piece, problems
that are still unexplored to the best of our knowledge. We instead develop a kinematic
system that is very flexible in the range of music it can play, including unanticipated pieces,
and uses limited motion capture data to improve the output quality.
Our system automatically generates animations of piano performances as follows:
Given any piano MIDI file as input, a novel algorithm combining geometric constraints,
graph theory and piano theory is used to determine the most efficient piano fingering to
play the whole piece. For this, a rule-based method based on geometric constraints for the
fingertips is first used to find the most comfortable fingering choices for playing each chord
(the instructed fingers). Then, an optimal motion planning method based on graph theory
is used to determine the sequence of fingering choices for the sequence of chords that make
up the musical piece, expending minimum energy. Finally, geometric constraints are used to
determine where the unused fingers for each chord (the non-instructed fingers) should be
placed in order to most efficiently play subsequent notes such that all the fingers are placed
in comfortable and natural positions. Then, using geometric constraints on the fingers, the
wrist and the keyboard contact surface, the algorithm determines initial natural hand poses
for each chord.
Given the generated set of poses, a novel optimization-based method is proposed to generate
detailed motion curves under the requirements of different performance styles: A)
playing scales, where the special case of notes having complex motion sequences such as
finger crossover is handled; B) playing chords, which is more difficult to handle because
several fingers are used to play one chord and the wrist must have reasonable translation
and rotation to maintain the naturalness of the pose while the instructed fingers are pressing
piano keys; and C) the special case of arpeggio, in which notes in a chord are played in sequence
rather than simultaneously. We consider additional important factors to enhance the
realism of the animation, such as how the music volume influences the motion and how instructed
fingers influence non-instructed fingers, and we add wrist compensation to simulate the reaction
to the piano key strike.
Our system is implemented as a plug-in component for Maya, following the procedure
outlined in Figure 1. After reviewing relevant previous work, the remainder of the paper
explains these steps in detail and presents the results obtained with the system. The applications
of our work are discussed in the last section.
2 Related Work
There are two problems that we deal with in our paper: fingering generation and hand animation.
Many researchers have worked on the problem of generating the right fingering
for various musical instruments. The fingering problem for string instruments has been
extensively surveyed by [1]. Heijink and Meulenbroek [2] use statistical approaches to
analyze the factors which influence left-hand movements in classical guitar playing, and
demonstrate that guitar players usually keep their finger joints in the middle of their range,
and control the sound variation by regulating the timing and placement of the left fingers.
Viana and Junior [3], Lin and Liu [4] and Parncutt et al [5] use rule-based expert systems
to generate piano fingering, but there are cases of conflicts between applicable rules and
cases where no rules are applicable. Yonebayashi et al [6] use Hidden Markov Models
for piano fingering generation, but cannot generate fingering for chords. Radisavljevic
and Driessen [7] propose a path difference learning method which evaluates cost function
weights to adapt guitar fingering generation for a given guitar playing style, but the generated
fingering does not perform well when not enough training data is available. Tuohy
and Potter [8] use a genetic algorithm to find playable guitar fingering, but it usually only
generates playable fingering rather than fingering like that published in guitar books, which
means the generated fingering might not be elegant, smooth and/or energy-saving to play.
Tuohy [9] employs genetic algorithms and artificial neural networks for music arrangement
and tablature generation for guitar. Namely, the method takes a music piece originally written for
another instrument as input and translates it to a new
version which a guitar can play. Rather than finding optimal fingering solutions for a certain
instrument, Handelman et al [2] report their development of an interactive program that
models various performance possibilities for the same music being played by violin, viola
or cello. Hart et al [10], Kasimi et al [11], Radicioni et al [12], Radicioni and Lombardo
[13], and Radisavljevic and Driessen [7] use a greedy algorithm approach of traversing a Trellis
graph to find optimal fingering for piano or guitar, on which we base our approach.
There has been a lot of work on human hand modeling. Yasumuro et al [14] used an
anatomical approach for building a hand model. Lee and Kroemer [15] proposed a kinematic
model of the human hand. Pollard and Zordan [16] proposed an effective physics-based
approach for grasping control and hand interaction with a small set of parameters. Much
recent work on hand animation has focused on physical models, but has not considered
complex finger tasks with exact timing constraints. By contrast, our work uses kinematic
models, but addresses the challenge of calculating a coordinated, rapidly changing set of
hand poses to interact with a keyboard and satisfy musical timing constraints.
Research has also been done on animating the motions associated with playing a musical
instrument. The Handrix system proposed in [17] generates the motion of the fretting hand
when playing a guitar using a procedural algorithm. Kim et al [18] present an approach
to control a violinist's hand movement using neural networks. Viana and Junior [3] also
implemented a 2D piano key animation given any input music.
We improve the existing work in the following two directions: 1) No previous methods
consider the position of non-instructed fingers, which is important for generating comfortable
non-instructed finger positions while minimizing hand motion for the whole music
piece. A novel fingering approach accomplishing such a goal is proposed based on hand
motion distance and ease for the whole music piece, and also based on the principle of piano
theory that the best fingering is the one which involves least effort for the player. 2)
None of the previous systems are capable of automatically generating three-dimensional piano
playing animation, while our approach automatically generates accurate hand motion at all
points of time even for complex piano performance such as finger crossover or arpeggio
skill.
3 Fingering Determination
We need to determine the best possible fingering, out of the many possibilities, for the whole
musical piece, such that it satisfies the principle of piano theory that a player will play most
effectively by making choices which relax his body and save energy. For each chord, we
determine which finger should press on the corresponding key for each note. Then, we
calculate the best positions for the fingers that do not press on a key (non-instructed fingers),
to make it easier to play the succeeding notes. Note that the fingers are numbered 1 to 5 with
1 denoting the thumb and 5 denoting the little finger.
3.1 Placement of instructed fingers
Instructed fingers are those that press on a key to play a given note while playing a chord.
The first step in our approach is to determine the position of instructed fingers to play every
note of every chord for a given piece of music.
3.1.1 Representation as a Trellis graph
We first represent the instructed fingering problem as a Trellis graph (Figure 2) where each
node in the same time-slice represents a finger combination to play that chord and the edges
connecting them correspond to the hand motion between neighboring chords.
Depending on the type of chord and the number of notes in it, we can generate different
fingering choices for it: a chord with 1 or 4 notes has 5 fingering choices, a chord with 2 or
3 notes has 10 choices, etc.
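These counts are the binomial coefficients for choosing which of the five fingers play. A minimal Python sketch, assuming fingers are assigned to notes in pitch order without crossing (the name `fingering_choices` is illustrative and not part of the described system):

```python
from itertools import combinations

FINGERS = (1, 2, 3, 4, 5)  # 1 = thumb ... 5 = little finger


def fingering_choices(num_notes):
    """Return every finger subset that could play a chord of num_notes notes.

    Assumption: fingers are assigned to notes in pitch order without
    crossing, so a fingering choice is simply a subset of the five fingers.
    """
    if not 1 <= num_notes <= 5:
        raise ValueError("one hand can play between 1 and 5 notes at once")
    return list(combinations(FINGERS, num_notes))


# Number of candidate nodes per trellis time-slice, by chord size.
counts = {n: len(fingering_choices(n)) for n in range(1, 6)}
```

Each returned tuple would correspond to one node in the time-slice of that chord.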
3.1.2 Determining cost of hand pose
The hand pose cost, corresponding to each node weight, quantifies the effort required to pose
the fingers in a particular configuration to play a given chord; these fingers are called
instructed fingers in our system.
\[ C_{N_i} = \sum_{(a,b)} C(a, b) \tag{1} \]
C(a, b) is the energy cost for any adjacent instructed fingers a and b for maintaining a pose,
and the sum runs over all such adjacent pairs.
The function value is obtained considering the distance and ease of two neighboring fingers
pressing on the piano keys. For example, we expect C(1, 4) to be minimum when d(a, b) is in the range
3-5, where d(a, b) denotes the distance between the 2 fingers pressing the piano keys in units
proportional to the breadth of a white key in the piano, because the thumb and ring fingers
are 3 fingers (corresponding to 3 keys on the piano) apart by default and so this corresponds
to the most relaxed arrangement. The cost value C(a, b), which is a segmented function
evaluated based on the performance experience of piano players, increases for larger or
smaller separations, as illustrated in Figure 3.
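As an illustration only, the node cost of Equation (1) can be sketched as a sum of piecewise pair costs. The comfort ranges, the fallback range and the penalty slope below are invented placeholders: the actual segmented function is tuned from pianists' experience and is given only graphically in Figure 3.

```python
# Hypothetical comfort ranges in white-key widths. Only the (1, 4) range of
# 3-5 is taken from the text; every other entry is a placeholder.
COMFORT = {(1, 2): (1, 3), (2, 3): (1, 2), (3, 4): (1, 2), (4, 5): (1, 2),
           (1, 3): (2, 4), (1, 4): (3, 5), (1, 5): (4, 7)}


def pair_cost(a, b, distance, slope=1.0):
    """Segmented cost C(a, b): zero inside the comfort range, linear outside."""
    # The fallback range for unlisted pairs is an assumption of this sketch.
    lo, hi = COMFORT.get((a, b), (b - a, b - a + 1))
    if distance < lo:
        return slope * (lo - distance)
    if distance > hi:
        return slope * (distance - hi)
    return 0.0


def pose_cost(fingering, keys):
    """Node cost C_N: sum of C(a, b) over adjacent instructed fingers.

    fingering lists the instructed fingers in ascending order and keys gives
    the key position of each one, in white-key widths.
    """
    current = zip(fingering, keys)
    following = zip(fingering[1:], keys[1:])
    return sum(pair_cost(a, b, kb - ka)
               for (a, ka), (b, kb) in zip(current, following))
```

With these placeholder ranges, a thumb and ring finger four keys apart cost nothing, while stretching them seven keys apart is penalized.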
3.1.3 Determining cost of hand motion
The hand motion cost from the current fingering choice $i$ to the next $j$, corresponding to the
edge $E_{i,j}$ connecting node $N_i$ to node $N_j$, arises from 3 individual costs:
\[ C_{E_{i,j}} = C_f + C_c + C_r \tag{2} \]
$C_f$, which is a constant set based on piano players' experience, is used to penalize
the reuse of a finger in the subsequent chord, if it plays a different note than in the current
chord. It is easier for a non-instructed finger from the first chord to play a new note in the
second chord, as we do not have to worry about re-positioning the fingers after playing the
first chord. This cost encourages consecutive chords to be played with different fingers.
$C_c$ represents the extra energy required for a finger to cross over other fingers. While
playing a melody, the best fingering may involve the fingers crossing over the thumb to
play the next note or the thumb passing under the fingers, but this should be avoided when
unnecessary. This value is linearly proportional to the number of fingers between the moving
finger and the finger being crossed over.
$C_r$ penalizes the extra local movement of the fingers required to strike from one note
to another, based on the fact that larger note changes for each finger will cause the wrist to
move more. This value is linearly proportional to the sum of the distances all fingers move
from the current piano keys to the next ones.
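A rough sketch of Equation (2) follows. The three weights are placeholder values, not the constants used in the paper, and crossover detection is left to the caller: `crossed_pairs` is a hypothetical input listing (moving finger, crossed finger) pairs.

```python
def motion_cost(prev, nxt, crossed_pairs,
                reuse_penalty=1.0, cross_weight=1.0, move_weight=0.1):
    """Edge cost C_E = C_f + C_c + C_r between two consecutive fingerings.

    prev and nxt map finger number (1-5) to key position in white-key widths.
    crossed_pairs lists (moving_finger, crossed_finger) tuples for crossovers
    in this transition. All weights are illustrative assumptions.
    """
    shared = [f for f in nxt if f in prev]

    # C_f: constant penalty for each finger reused on a different key.
    c_f = reuse_penalty * sum(1 for f in shared if prev[f] != nxt[f])

    # C_c: proportional to the number of fingers lying between the moving
    # finger and the finger it crosses (a literal reading of the text, so an
    # adjacent crossing contributes zero here).
    c_c = cross_weight * sum(abs(a - b) - 1 for a, b in crossed_pairs)

    # C_r: proportional to the total key distance travelled by reused fingers.
    c_r = move_weight * sum(abs(nxt[f] - prev[f]) for f in shared)

    return c_f + c_c + c_r
```

For example, passing the thumb under the middle finger from key 0 to key 3 would be called as `motion_cost({1: 0, 2: 1, 3: 2}, {1: 3}, [(1, 3)])`.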
3.1.4 Finding shortest paths by Dijkstra's algorithm
Now that we have modeled the problem of fingering choice on a Trellis graph and computed
the costs of all the nodes and all the edges connecting them, we use Dijkstra's algorithm to
find the shortest path. We cannot use Dijkstra's algorithm directly to compute the shortest
path, because the nodes have non-zero costs. Therefore we update the graph such that the
node costs are also incorporated into the edge costs as follows:
1. Update the edge cost $C_{E_{i,j}}$ of $E_{i,j}$ as:
\[ C'_{E_{i,j}} = C_{E_{i,j}} + (C_{N_i} + C_{N_j})/2, \quad \forall\, \text{edges}(i, j) \tag{3} \]
2. Update the node weights to zero as:
\[ C_{N_i} = 0, \quad \forall i \tag{4} \]
Now that all nodes have zero weights, we can use Dijkstra's algorithm to get the shortest
path in the updated trellis graph. Each node selected at each level of the graph gives the
fingering choice of instructed fingers for that chord.
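The procedure above can be sketched in Python as follows. The data layout (`node_costs`, `edge_cost`) and the virtual source and sink nodes are assumptions of this sketch rather than details from the paper; the source and sink carry half of the first and last node costs so that every node on a path is counted exactly once after the folding of Equation (3). At least one chord is assumed.

```python
import heapq


def best_fingering_path(node_costs, edge_cost):
    """Shortest path through a trellis using Dijkstra's algorithm.

    node_costs[t][i] is the pose cost of choice i for chord t, and
    edge_cost(t, i, j) is the motion cost from choice i of chord t to
    choice j of chord t + 1. Returns (choice index per chord, total cost).
    """
    last = len(node_costs) - 1
    source, sink = (-1, 0), (last + 1, 0)  # virtual nodes (assumption)

    def neighbours(node):
        t, i = node
        if node == source:
            return [((0, j), c / 2) for j, c in enumerate(node_costs[0])]
        if t == last:
            return [(sink, node_costs[t][i] / 2)]
        # Equation (3): fold half of each endpoint's node cost into the edge.
        return [((t + 1, j), edge_cost(t, i, j) + (node_costs[t][i] + c) / 2)
                for j, c in enumerate(node_costs[t + 1])]

    dist, parent = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == sink:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, weight in neighbours(node):
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], parent[nxt] = nd, node
                heapq.heappush(heap, (nd, nxt))

    # Walk back from the sink to recover one choice index per chord.
    path, node = [], parent[sink]
    while node != source:
        path.append(node[1])
        node = parent[node]
    return path[::-1], dist[sink]
```

Because a trellis is layered and acyclic, a forward dynamic-programming pass would give the same result; Dijkstra's algorithm is used here only to mirror the text.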
3.2 Placement of non-instructed fingers depending on future notes
After calculating the fingering for instructed fingers, we need to determine how to pre-position
the non-instructed fingers to minimize the overall effort, which is required by the
piano performance for smooth hand motion. This has not been addressed by any previous
work as it is essential only while generating three-dimensional animation output as our
system does. The positions of the non-instructed fingers for the current chord are chosen so that
they are easier to re-position to play the next chord where they will actually be used, and
therefore to minimize the energy cost of hand motion. The algorithm operates as follows:
1) We wish to determine the position of the non-instructed fingers for chord i and assume
that the instructed fingers for all chords have already been positioned. We consider the next
k (1 ≤ k ≤ 4) chords in positioning the non-instructed fingers at i, since there are at most 4
non-instructed fingers to pre-position. Note that the ability to pre-read the notes varies based
on the level of music, the player and familiarity with the music piece; our method provides
the solution assuming the player is familiar with the music and can generate the most reasonable
fingering for the current non-instructed fingers.
2) If j is an instructed finger in chord i + k, then j has to be positioned in chord i so
that it does not have to move much when we play chord i + k.
3) Let the instructed finger in chord i closest to the j-th finger be a. There could be two
such fingers, one on either side. For all adjacent fingers that exist, finger j's position
should be such that the distance d(j, a) between the j-th finger and an adjacent finger is in the
range of comfortable playing.
4) When the above conditions are satisfied, the position of finger j has been determined
for chord i based on chord i + k. Repeat this until the positions of all non-instructed fingers for
a chord have been determined.
When a non-instructed finger is the thumb, ring, or little finger, we update the finger
position by -0.5 along the Z axis (move it to the left). In case the new position is occupied
by another finger, we increase it instead by 1 (move it to the right). When the non-instructed
finger is the index or middle finger, we update the finger position by 0.5 (move it to the
right). When this new position is occupied by another finger, we decrease it by 1 (move it to
the left). We prefer to move some fingers to the left first and some to the right first with the aim
of generating a hand pose where fingers have minimum influence on each other, as indicated
by [19]. Row 1 of Figure 4 shows that the default positioning of the non-instructed fingers
is not always meaningful. Row 2 shows the hand pose after the correction has been done.
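The half-key correction can be sketched as below. The function name `nudge` and the `occupied` set are illustrative, and the fallback is read as one full key relative to the first attempt (so the finger ends half a key on the opposite side); the text does not state which reference point is meant, so this reading is an assumption.

```python
LEFT_FIRST = {1, 4, 5}  # thumb, ring and little finger try the left side first
# Index (2) and middle (3) fingers try the right side first.


def nudge(finger, position, occupied):
    """Shift a non-instructed finger by half a key along Z, avoiding collisions.

    position is in white-key widths; occupied is the set of positions already
    taken by other fingers. If the first candidate is taken, the finger moves
    one full key the other way from that candidate (assumed interpretation).
    """
    step = -0.5 if finger in LEFT_FIRST else 0.5
    candidate = position + step
    if candidate in occupied:
        candidate -= 2 * step  # one full key in the opposite direction
    return candidate
```

For instance, `nudge(1, 3.0, {2.5})` moves the thumb to 3.5 because the slot at 2.5 is already taken.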
4 Finger and Hand Pose Calculation for Chord
For any given chord, there are 4 steps in simulating the hand motion as Figure 5: 1) calculate
the hand pose which includes the position of the ngertips and the position and orientation
of the wrist; 2) press down the piano keys; 3) hold the piano keys; fourth; 4) release keys
back to the original position. Because the last three steps can be simulated using methods
similar to the rst one, we just focus on the rst step. Also, an algorithm to handle the
complex performance such as nger crossovers and arpeggio are described.
Before further discussion, we first describe our hand model, shown in Figure 6. The five fingers in the hand model are labeled from Finger 1 for the thumb to Finger 5 for the little finger. There are 16 joints in total with 27 DOFs in this hand model: 6 DOFs for the wrist joint (labeled with a black circle), 1 DOF for extension and flexion of each DIP (Distal Interphalangeal) joint and PIP (Proximal Interphalangeal) joint, 2 DOFs for extension and
flexion, adduction and abduction of each MCP (Metacarpophalangeal) joint for Finger 2 to Finger 5, and 3 DOFs for the thumb's finger base rotation. Each finger is assigned an IK Handler starting from the finger base and going to the fingertip. IK is used when the fingers strike and release the keys, and FK is used for the in-between movement between different chords, with a blend used to switch smoothly between FK and IK.
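As a quick consistency check of the joint and DOF counts, the tally can be written out as below. The grouping is an assumption: it counts two 1-DOF interphalangeal joints for every finger including the thumb, which is one reading that reproduces both totals stated in the text.

```python
# (joint group, number of joints, DOFs per joint)
JOINT_GROUPS = [
    ("wrist", 1, 6),
    # Assumption: two 1-DOF interphalangeal joints per finger, thumb included.
    ("DIP/PIP, Fingers 1-5", 10, 1),
    ("MCP, Fingers 2-5", 4, 2),
    ("thumb base", 1, 3),
]

total_joints = sum(count for _, count, _ in JOINT_GROUPS)
total_dofs = sum(count * dofs for _, count, dofs in JOINT_GROUPS)
print(total_joints, total_dofs)  # prints "16 27"
```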
Some important notation is defined as follows: $B_i$ refers to the base of finger $i$, and $T_i$ refers to the fingertip of finger $i$. $P$ denotes a joint position, and $\theta$ denotes a joint orientation. World axes and wrist local axes are defined as shown in Figure 7.
4.1 Initiate hand pose
The generated fingering of instructed and non-instructed fingers in Section 3 is used to decide the positions of the fingertips along the Z axis, which are then used as the basic parameters to evaluate the other position components of the fingertips and the position and orientation of the wrist. The finger base position $P_B$ is determined by the wrist position $P_W$ and orientation $\theta_W$ because the finger bases are fixed on the palm, but the fingertip positions $P_T$, wrist position $P_W$ and orientation $\theta_W$ have to be calculated.
4.1.1 Initiate finger positions
$P_{T_i}(t)_z$ represents which key is pressed and is hence determined by the fingering method. $P_{T_i}(t)_x$ is determined by the wrist position and the piano key range occupied by the fingers. $P_{T_i}(t)_y$ is the height of the black or white key being pressed.
4.1.2 Initiate wrist position
The wrist position is determined based on the observation that when the fingers are more spread, the wrist moves forward along the X axis and has a lower position value along the Y axis; along the Z axis, the wrist moves relatively closer to the little finger and farther from the thumb. Therefore we compute the wrist position as follows:
$P_W(t)_z$ is the weighted sum of the Z components of all fingertip positions. The thumb and little finger generally have much more influence in determining the wrist position along the Z axis, and therefore they have much larger weights than the other fingers.
$P_W(t)_x$ is determined by the instructed fingers' relative positions along the X axis in a standard pose, the influence of each finger (prioritized 1, 5, 2, 4 and 3 in decreasing order) and the allowable contact range on the keys pressed by these fingers. For example, if the instructed thumb has to move to a black key from a white key, then this will fully dictate the wrist movement, as the wrist will have to move less to accommodate the position of any other finger.
$P_W(t)_y$ is determined in a similar way to $P_W(t)_x$.
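A minimal sketch of the weighted sum used for the Z component of the wrist position is given below. The weight values are hypothetical placeholders; only their shape (thumb and little finger dominating) follows the text, and the normalization by the weight total is an added assumption so that the result stays within the span of the fingertips.

```python
# Hypothetical weights for Fingers 1..5 (thumb .. little finger).
# Only their shape follows the text: thumb and little finger dominate.
WEIGHTS = (0.35, 0.10, 0.10, 0.10, 0.35)


def wrist_z(fingertip_z, weights=WEIGHTS):
    """Weighted combination of the five fingertip Z positions."""
    if len(fingertip_z) != len(weights):
        raise ValueError("expected one Z value per weight")
    total = sum(weights)
    # Assumption: normalize so that the weights act as a weighted average.
    return sum(w * z for w, z in zip(weights, fingertip_z)) / total
```

With these placeholder weights, fingertips resting on five evenly spaced positions 0 to 4 place the wrist at about 2, the center of the hand.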
4.1.3 Initiate wrist orientation
The wrist orientation around the Y axis, $\theta_W(t)_y$, is computed as:
$$\theta_W(t)_y = \sum_{i=1}^{5} w_i \, \theta(P_{T_i}, P_W, t)_y \tag{5}$$
where $\theta(P_{T_i}, P_W, t)_y$ is the orientation around the Y axis of the ray from the wrist to
$T_i$ at time $t$. The weights $w_i$ for the five fingers are pre-computed and reflect the dependence between the rotation of Joint $i$ and the wrist.
In order to obtain the five $w_i$, we set up an equation set which consists of five equations based on Equation (5) for five different chords. For each equation, take $w_i$ for $i = 1, \ldots, 5$ as the five unknown variables, and obtain the values of $\theta(P_{T_i}, P_W, t)_y$ for $i = 1, \ldots, 5$ from motion capture data containing the wrist and fingertip positions for each chord. Solve the equation set to obtain the five weights.
The result of this process is that $w_1$ has the smallest weight, while $w_3$ has the largest weight, which means the wrist orientation mainly depends on the position of the non-thumb fingers on the palm and the middle finger position on the X axis and Z axis, because hand poses usually satisfy the condition that the middle fingertip, finger base and wrist are almost collinear.
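The weight estimation described above amounts to solving a 5x5 linear system. The sketch below illustrates this with plain Gaussian elimination. All numbers are synthetic stand-ins chosen only so that the system is well conditioned; they are not motion capture values from this work, and the names `solve_linear_system`, `wrist_yaw`, `RAY_ANGLES` and `TRUE_WEIGHTS` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("chords do not give independent equations")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def wrist_yaw(weights, ray_angles):
    """Equation (5): weighted sum of the five wrist-to-fingertip ray angles."""
    return sum(w * a for w, a in zip(weights, ray_angles))


# Synthetic stand-in for motion capture: one row of five ray angles per chord.
RAY_ANGLES = [
    [40.0, 10.0, 2.0, -5.0, -12.0],
    [25.0, 80.0, 0.0, -8.0, -20.0],
    [15.0, 8.0, 50.0, -2.0, -10.0],
    [20.0, 15.0, 5.0, -60.0, -15.0],
    [28.0, 9.0, -1.0, -6.0, -70.0],
]
# Synthetic "ground truth" used to generate the measured wrist angles.
TRUE_WEIGHTS = [0.05, 0.20, 0.40, 0.20, 0.15]

measured_yaw = [wrist_yaw(TRUE_WEIGHTS, row) for row in RAY_ANGLES]
recovered = solve_linear_system(RAY_ANGLES, measured_yaw)
```

In practice the five chords should be chosen so that their rows are linearly independent; otherwise the system has no unique solution and the sketch raises an error.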
We calculate $\theta_W(t)_z$ as follows:
$$\theta_W(t)_z = \theta_{W\max z} - \alpha \, w_D(t) \tag{6}$$
This equation implies that the wider the key range the fingers press on, the lower the wrist orientation around Z. This happens because we get a wider range to spread the fingers when the wrist is closer to the keyboard surface than when it is far away. $\theta_{W\max z}$ is the largest angle of the wrist along the Z axis (also the initial orientation along the Z axis for the standard pose). $\alpha$ controls the largest wrist orientation along the Z axis; $w_D(t)$ is the rotation weight influenced by the distribution of the five fingers at time $t$, and a larger finger extension
will have a larger $w_D(t)$.
$\theta_W(t)_x$ is computed as:
$$\theta_W(t)_x = \arctan \frac{P_{B(\mathrm{ring})}(t)_y - P_{B(\mathrm{index})}(t)_y}{P_{B(\mathrm{ring})}(t)_z - P_{B(\mathrm{index})}(t)_z} \tag{7}$$
This equation keeps the hand nearly parallel with the piano face. Because the index and ring finger bases are fixed on the palm, they can be used to define a line in three-dimensional space, and therefore the projection of this line onto the plane perpendicular to the X axis can be used to evaluate the orientation of the wrist around the X axis. The parameter is evaluated based on the standard pose pressing on 5 neighboring keys with distance of 4, and distance of 7, respectively, from the little finger to thumb.
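The orientation around the X axis defined by the index and ring finger bases can be sketched as follows. The function name `wrist_roll` and the `(x, y, z)` tuple layout are assumptions. The sketch uses `math.atan2` instead of the arctangent of a ratio; the two agree when the Z difference is positive, and `atan2` additionally avoids a division by zero.

```python
import math


def wrist_roll(ring_base, index_base):
    """Angle around the X axis of the line through the two finger bases.

    Both arguments are (x, y, z) tuples; only y and z are used because the
    line is projected onto the plane perpendicular to the X axis.
    """
    dy = ring_base[1] - index_base[1]
    dz = ring_base[2] - index_base[2]
    # atan2(dy, dz) equals atan(dy / dz) whenever dz > 0.
    return math.atan2(dy, dz)
```

When both bases are at the same height the angle is zero, which corresponds to a hand held parallel to the keyboard.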
4.2 Crossover between thumb and other fingers
Crossover is common while playing various note sequences. This case is handled separately using the algorithm outlined in Figure 8, which illustrates the case where the thumb crosses under finger $j$.
After finger $j$ presses down the corresponding key for chord $i$, the wrist is translated by a distance that is evaluated from the corresponding key postures extracted from motion capture data, depending on which finger the thumb crosses under: index, middle or ring. After the translation, the wrist is rotated such that
$$\theta(P_W, P_{B_j}, t)_y = \arctan \frac{P_{B_j}(t)_x - P_W(t)_x}{P_{B_j}(t)_z - P_W(t)_z} \tag{8}$$
$$\theta(P_W, P_{T_j}, t)_y = \arctan \frac{P_{T_j}(t)_x - P_W(t)_x}{P_{T_j}(t)_z - P_W(t)_z} \tag{9}$$
$$\theta(P_W, P_{B_j}, t)_y - \theta(P_W, P_{T_j}, t)_y \in [\theta_{\min}(j), \theta_{\max}(j)] \tag{10}$$
The joint angles of all fingers other than the thumb are kept the same, and their positions are translated by the wrist rotation. After finger $j$ releases the key and the thumb presses down its key, the wrist and fingertips are translated to the position for Chord $i + 1$.
The algorithm for fingers crossing over the thumb is similar to the case of the thumb crossing under other fingers.
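The rotation condition for the crossover can be illustrated with a small check that compares the direction from the wrist to the base of finger $j$ with the direction from the wrist to its fingertip. This is a sketch under assumptions: the function names, the `(x, y, z)` tuple layout and the use of radians are hypothetical, and `math.atan2` stands in for the arctangent of the X over Z ratio.

```python
import math


def ray_angle_y(wrist, point):
    """Angle around the Y axis of the ray from the wrist to `point`."""
    dx = point[0] - wrist[0]
    dz = point[2] - wrist[2]
    return math.atan2(dx, dz)


def crossover_rotation_ok(wrist, base_j, tip_j, theta_min, theta_max):
    """True if the base-ray minus tip-ray angle lies in the allowed range."""
    diff = ray_angle_y(wrist, base_j) - ray_angle_y(wrist, tip_j)
    return theta_min <= diff <= theta_max
```

A candidate wrist rotation would be accepted only if this check passes for the finger that stays on its key while the thumb passes under it.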
4.3 Arpeggio skill
In arpeggios, the notes of a chord are played in sequence rather than simultaneously. Sometimes the successive notes are far apart from each other, such as when playing a typical chord with notes C3-G3-C4-E4 using the left-hand fingering 5-3-2-1. This is more difficult to handle than playing common chords because more complex wrist motion is needed to make sure the instructed finger can reach the required piano note in the arpeggio in time while keeping the hand pose natural. The method in Section 4.1.2 is first used to compute the initial wrist position for the common chord which has the same notes as the required arpeggio, and then we shift the wrist position to satisfy the following geometry constraint:

$$\theta_{B_{i-1},T_{i-1}} - \delta_{i-1}(d(T_{i-1}, T_i)) = \theta_{B_i,T_i} - \delta_i(d(T_{i-1}, T_i)) + \theta_{W,B_{i-1}} - \theta_{W,B_i} \tag{11}$$
As shown in Figure 9, $T_i$ is the fingertip used to strike note $i$ and $B_i$ is the corresponding finger base; $\theta_{B_i,T_i}$ is the orientation along the Y axis from finger base $i$ to fingertip $i$; $\theta_{W,B_i}$ is the orientation along the Y axis of the vector from the wrist to note $i$'s finger base; $\delta_i(d(T_{i-1}, T_i))$ is the rotation offset used for the finger to strike note $i$, based on the distance between the two neighboring instructed fingertips. This constraint determines the wrist position so that the two instructed fingertips can be positioned on the required neighboring piano keys for smooth hand motion from note $i - 1$ to note $i$.
5 Optimization of Key Hand Poses
For the most relaxed playing, piano theory requires the player to keep natural and precise poses while reducing unnecessary motion. Therefore a novel optimization method with geometry constraints is proposed to smooth the hand motion between the key hand poses for each chord.
The following objective function is used to find the wrist pose sequence that minimizes the overall motion cost, determined by the wrist translation and rotation, over all $n$ given chords:
$$\min_{C_i} \Big( \sum_{i=2}^{n} C_i \Big) = \min_{P_i} \Big( \sum_{i=2}^{n} \big( \|P_i - P_{i-1}\| + w \, \|\theta_i - \theta_{i-1}\| \big) \Big) \tag{12}$$
where $C_i$ is the energy cost for Chord $i$; $w$ is the weight between the translation and rotation components; $\|P_i - P_{i-1}\|$ and $\|\theta_i - \theta_{i-1}\|$ are the wrist translation and rotation between two neighboring chords, respectively. Sequential Quadratic Programming (SQP) is used to solve for the minimum motion cost, subject to the following four constraints for each chord.
$$c_1: \theta_{J_{i,j}} \in [\theta_{J_{i,j}\min}, \theta_{J_{i,j}\max}] \tag{13}$$
$$c_2: d(T_i, B_i) \in [d(T_i, B_i)_{\min}, d(T_i, B_i)_{\max}] \tag{14}$$
$$c_3: P_i \in [P_i - \epsilon_1, P_i + \epsilon_2] \tag{15}$$
$$c_4: \theta_i \tag{16}$$
$c_1$ describes a reasonable rotation range constraint for finger $i$'s Joint $j$, which is used to ensure that the finger has a natural pose.
$c_2$ describes the distance constraint between the fingertip and base, so that the fingertip can reach the required piano key. $d(T_i, B_i)_{\min}$ and $d(T_i, B_i)_{\max}$ respectively denote the minimal and maximal distance between finger $i$'s tip and base.
$c_3$ is the translation constraint used to maintain the locally optimized position $P_i$ for Chord $i$. Because our system can usually generate a good initial pose for each chord, our optimization method uses these good initial poses to generate natural and energy-saving poses, allowing only a small range of wrist translation around the initial poses.
$c_4$ is used to generate a natural wrist orientation $\theta_i$ based on $P_i$ and the distribution of the five fingers, and is computed as in Section 4.1.3.
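To make the objective concrete, the sketch below evaluates the motion cost of a given wrist pose sequence and checks the translation constraint $c_3$ for one chord. It does not perform the optimization itself; the function names, the tuple representation of positions and orientations, and the margin parameters are hypothetical.

```python
import math


def motion_cost(positions, orientations, w=1.0):
    """Sum of wrist translation plus weighted rotation between neighbors.

    `positions` and `orientations` hold one 3-tuple per chord.
    """
    if len(positions) != len(orientations):
        raise ValueError("need one orientation per position")
    cost = 0.0
    for i in range(1, len(positions)):
        translation = math.dist(positions[i], positions[i - 1])
        rotation = math.dist(orientations[i], orientations[i - 1])
        cost += translation + w * rotation
    return cost


def within_translation_bounds(position, initial, lower_margin, upper_margin):
    """Constraint c3: each component stays near the initial wrist position."""
    return all(
        p0 - lower_margin <= p <= p0 + upper_margin
        for p, p0 in zip(position, initial)
    )
```

A solver would vary the wrist positions inside these bounds, recompute the orientations and joint ranges, and keep the sequence with the lowest `motion_cost`.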
6 Simulation of Motion Curve
After generating the optimized natural key poses for the given notes, the following steps are used to construct a realistic motion curve between these key poses.
6.1 Wrist motion between chords
When the hand moves from one chord to another, the hand will move up and down during
the motion as shown in Figure 10. Motion capture data shows that the wrist is usually raised
to its maximum height (along the Y axis) in the middle of two neighboring chords, and this
component is calculated by:
$$P_W((t_i + t_{i-1})/2)_y = P_W(t_i)_y + P_W(t_{i-1})_y + V(i) \, D(i, i-1) \, H(P_W(t_i), P_W(t_{i-1})) \tag{17}$$
where $V$ is linearly determined by the volume of Chord $i$; $D$ is linearly determined by the durations of Chord $i - 1$ and Chord $i$; $H$ is the basic height, inversely proportional to the wrist translation between the two chords.
After determining the height at the middle position, motion capture data is used to interpolate the key frames between the middle position and the two key poses for the two chords along the Y axis, and the motion capture data along the X and Z axes is used to generate the corresponding curve components in order to generate more realistic hand motion, by the following procedure: 1) Sample the hand motion from 20 performances of the same chords. 2) Segment the motion capture curves manually into up and down motion. 3) Normalize the motion capture clips to the same duration and average them to generate a reference curve. 4) Sample this reference curve and use it as the interpolation function when generating output motion.
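Steps 3 and 4 of this procedure can be sketched as follows. The sketch assumes each clip is a list of height samples that has already been segmented and scaled to run from 0 to 1, and it uses linear resampling, which is a simplification chosen here for illustration; the function names are hypothetical.

```python
def resample(clip, n):
    """Linearly resample a list of samples to n evenly spaced samples."""
    if len(clip) < 2 or n < 2:
        raise ValueError("need at least two samples")
    out = []
    for k in range(n):
        pos = k * (len(clip) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(clip) - 1)
        frac = pos - lo
        out.append(clip[lo] * (1 - frac) + clip[hi] * frac)
    return out


def reference_curve(clips, n=5):
    """Normalize all clips to the same duration and average them."""
    resampled = [resample(clip, n) for clip in clips]
    return [sum(column) / len(column) for column in zip(*resampled)]


def interpolate(start, end, curve, u):
    """Blend from `start` to `end` using the reference curve; u in [0, 1]."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    pos = u * (len(curve) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(curve) - 1)
    frac = pos - lo
    s = curve[lo] * (1 - frac) + curve[hi] * frac
    return start + (end - start) * s
```

The same reference curve can then be reused for every transition, with `start` and `end` set to the key pose height and the computed middle height.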
6.2 Influence of instructed fingers on non-instructed fingers
While striking the piano keys, the rotation of instructed fingers induces some rotation in the non-instructed fingers, as shown in Figure 11. Usually, the less skilled the player is, the stronger the dependence between the instructed and non-instructed fingers. In order to simulate the influence of instructed fingers on non-instructed fingers, we define the dependence index $\mathrm{Dep}(i)$ for non-instructed finger $i$'s movement influenced by instructed finger $j$ as:
$$\mathrm{Dep}(i) = \frac{\sum_{j} S_{ij}(i, j)}{I} \tag{18}$$
where $S_{ij}$ is the slope of the relative motion of the $i$-th finger during the $j$-th instructed movement [19], and $I$ is the number of instructed fingers. If finger $i$ is an instructed finger, the dependence index will be 0 because the instructed finger should press exactly on the piano key no matter where the other fingers are. If finger $i$ is a non-instructed finger, the index will be close to 1 when the neighboring instructed fingers have high influence on this finger and close to 0 when the neighboring instructed fingers have little influence. Given a maximum movement range $Y_{\max}(i)$ along the Y axis, the position of the non-instructed finger along the Y axis due to the influence of surrounding fingers is given by:
$$P(i)_y = Y_{\max}(i) \cdot \mathrm{Dep}(i) \tag{19}$$
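The dependence index and the induced vertical position can be sketched as below. The slope table is a hypothetical dictionary keyed by (non-instructed finger, instructed finger); real values would come from measurements such as those in [19]. Averaging over the instructed fingers is an interpretation of the division by the number of instructed fingers.

```python
def dependence_index(finger, instructed, slopes):
    """Average slope of `finger` over all instructed fingers.

    `slopes[(i, j)]` is the relative-motion slope of finger i while
    finger j performs the instructed movement (hypothetical table).
    """
    if finger in instructed or not instructed:
        # An instructed finger must hit its key exactly, so it has no
        # induced motion; with no instructed fingers nothing is induced.
        return 0.0
    return sum(slopes[(finger, j)] for j in instructed) / len(instructed)


def induced_height(finger, instructed, slopes, y_max):
    """Y displacement of a finger caused by the surrounding fingers."""
    return y_max[finger] * dependence_index(finger, instructed, slopes)
```

For example, with slopes of 0.4 and 0.2 relative to two instructed neighbors, the ring finger gets a dependence index of about 0.3 and moves by 30% of its maximum range.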
6.3 Wrist compensation
The action of pressing a key tends to induce an upward movement of the wrist due to the reaction force of the strike. While playing low-volume notes (for example, rotating only the finger base and/or wrist to strike the keys, which generates a low-volume sound), the wrist performs an up-down vertical response, and while playing high-volume notes (for example, rotating the upper arm and/or shoulder to strike the keys, which generates a much louder sound), the wrist performs a down-up-down response. Additional key frames, based on features relating the extracted motion capture data of the wrist to the corresponding sound, are inserted to achieve this, and the amount of compensation is scaled based on the note volume.
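The two response patterns can be sketched as a small key frame generator. Everything numeric here is a hypothetical placeholder: the use of MIDI velocity (0 to 127) as the note volume, the loudness threshold, the pattern shapes, the maximum offset and the frame spacing would all have to be fitted to motion capture data.

```python
def compensation_keyframes(strike_time, velocity, loud_threshold=80,
                           max_offset=1.0, step=0.05):
    """Return (time, wrist Y offset) key frames following a key strike.

    Quiet notes get an up-down response, loud notes a down-up-down
    response; the amplitude scales linearly with the note volume.
    """
    if not 0 <= velocity <= 127:
        raise ValueError("velocity must be in 0..127")
    amplitude = max_offset * velocity / 127
    if velocity >= loud_threshold:
        pattern = [-0.5, 1.0, 0.0]  # down, up, back down to rest
    else:
        pattern = [1.0, 0.0]        # up, back down to rest
    return [(strike_time + (k + 1) * step, amplitude * p)
            for k, p in enumerate(pattern)]
```

The returned offsets would be added to the wrist height curve after the corresponding key press.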
7 Results and Discussion
We present piano animations showing a range of different finger placements and motion styles.
7.1 Scales
In music, a scale is a sequence of notes in ascending or descending order that is used to conveniently represent part or all of a musical work including melody and/or harmony [20]. Finger crossovers always happen when playing scales, and are generated realistically in our system. This feature is demonstrated by Figure 12 and the first demo in the accompanying video, which includes a side-by-side comparison between the generated animation and real
playing.
7.2 Chords
A chord consists of a set of notes that are heard as simultaneous sound, but might not be played simultaneously. Generally, the notes are played at the same time, except in the special case of an arpeggio, where the notes are played quickly in sequence. This can be handled as the simple case of individual notes being pressed. Below, we discuss the common chord, which is played simultaneously, and show that our algorithm generates realistic piano performance animation for complex chords.
Figure 13 shows that our system generates correct fingering of instructed fingers for the first part (6 chords) of the musical notation of Bilder einer Ausstellung, composed by Modest Mussorgsky.
The generated fingering choice is found to be very feasible for the hand, and Figure 14 shows snapshots of hand poses for some complex chords in the chord demo, which are again feasible to play and natural.
We analyze the realism of our animation below after the optimization step. In this example our optimization method improves the total motion cost (150.6) by about 22%, the translation (111.3 cm) by 10%, and the rotation (78.5 degrees) by 45%. Note that the default weight between translation and rotation is 1.
Figure 15 visualizes the three translation components and the one rotation component
along the Y axis (the other components are based on the same parameters and are the same) before and after optimization. These graphs illustrate that the optimization method can yield a smoother key pose sequence with less wrist translation and rotation, and therefore can minimize motion cost. Note that the 28 nodes in each curve correspond to the key poses for the 28 chords; the lines between nodes are used to better trace how the wrist component changes as the chord music progresses.
Figure 16 shows that, for the music with 28 chords, the animated wrist motion agrees well with the ground truth data.
7.3 A music piece
Finally, a music piece, Childhood Memory composed by Modest Mussorgsky, is used to generate a comprehensive demo to show all the features supported by our current system, including the simulation of relative finger rotation, wrist compensation, finger crossover and arpeggio skill. Some key snapshots are shown in the images below.
The first image in Figure 17 shows the instructed index finger for the next chord causing the relative rotation of the non-instructed ring finger; the second shows the index finger fully pressing down the key while the ring finger returns to the key surface; the third shows the wrist moving up due to wrist compensation after the index finger fully presses down the key (the wrist motion causes the joints of the other fingers to rotate a little while keeping contact with the piano keys).
8 Conclusion and Future Work
We have described a system that automatically generates three-dimensional animation of piano playing, given an input piece of music. The graph-based approach determines fingering which ensures that the character plays the piece in as relaxed a manner as possible, which is one of the fundamental principles of piano theory. A novel rule-based approach is proposed to pre-position non-instructed fingers such that it is convenient and easy to use them for playing succeeding notes. Initial key hand poses are determined based on the generated fingering and piano theory, and complex but often encountered cases such as finger crossovers and arpeggios are also handled. An optimization-based method operating on geometry constraints is proposed to generate smooth and natural key pose sequences for the hand. Motion capture data is then employed to further smooth the transition between poses. We believe the resulting motion is realistic enough to be used directly as a tool for piano self-study. The comparison between the generated motion curves and the curves of the raw motion capture data of real piano playing shows a good level of realism.
Our hand touch model, which allows goal locations for each finger and maintains exact timing constraints, could be extended to perform animation of other instruments with keys, such as woodwinds, brass and string instruments. Also, our approach may be beneficial for generating natural grasping with a more beautiful hand pose.
The first limitation of our system is that although it solves the collision between the fingertip and the piano surface, it does not handle interpenetration between fingers and collisions with the sides of the black keys.
Secondly, although our system can generate plausible and reasonably realistic piano playing for standard music, it is not capable of generating emotional piano playing which reflects a personal understanding of the music and the player's performance background. This is the main future work we will be pursuing.
Thirdly, it might be beneficial to apply principles of machine learning to our system to learn the parameters for determining standard fingering more accurately in cases where multiple fingering sequences have the same optimal cost during instructed fingering generation.
Fourthly, we will enhance our system to generate animation for hand models of various sizes, to meet the requirements of piano students with different hand shapes, so that our system can be used as a good piano teaching tool.
Fifthly, sometimes a melody must be played continuously by two hands. Because our method generates fingering for each hand separately, our work cannot generate fingering for a performance that requires planning fingering simultaneously for two hands. Note that this problem is also unsolved in all of the previous work.
Finally, our solution does not consider the interdependent rotation of the joints within a finger, and does not directly simulate the influence of interdependence between fingers. Future work on this point would help to improve the finger motion during striking and releasing the piano keys.
References
[1] S. I. Sayegh. Fingering for string instruments with the optimum path paradigm. Computer Music Journal, 13(3):76-84, 1989.
[2] H. Heijink and R. G. J. Meulenbroek. On the complexity of classical guitar playing: Functional adaptations to task constraints. Journal of Motor Behaviour, 34(4).
[3] A. B. Viana and A. C. de M. Junior. Technological improvements in the siedp. IX Brazilian Symposium on Computer Music, 2003.
[4] C.-C. Lin and D. S.-M. Liu. An intelligent virtual piano tutor. Proceedings of the 2006 ACM International Conference on Virtual Reality Continuum and its Applications, pages 353-356, 2006.
[5] R. Parncutt, J. A. Sloboda, E. F. Clarke, M. Raekallio and P. Desain. An ergonomic model of keyboard fingering for melodic fragments. Music Perception: An Interdisciplinary Journal, 14:341-382, 1997.
[6] H. Yonebayashi, Y. Kameoka and S. Sagayama. Automatic decision of piano fingering based on hidden Markov models. Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 2915-2921, 2007.
[7] A. Radisavljevic and P. Driessen. Path difference learning for guitar fingering problem. Proceedings of the International Computer Music Conference, 2004.
[8] D. R. Tuohy and W. D. Potter. A genetic algorithm for the automatic generation of playable guitar tablature. Proceedings of the International Computer Music Conference, pages 499-502, 2005.
[9] D. R. Tuohy. Creating tablature and arranging music for guitar with genetic algorithms and artificial neural networks. A Thesis Submitted to the Graduate Faculty of The University of Georgia in Partial Fulfillment of the Requirements for the Degree Master of Science, The University of Georgia, 2006.
[10] R. Hart, M. Bosch and E. Tsai. Finding optimal piano fingerings. Undergraduate Mathematics and Its Applications, 21(2):67-177, 2000.
[11] E. Kasimi, A. A. Nichols and C. Raphael. Automatic fingering system. The International Society for Music Information Retrieval poster presentation, 2005.
[12] L. Radicioni, D. Anselma and V. Lombardo. A segmentation-based prototype to compute string instruments fingering. Proceedings of the Conference on Interdisciplinary Musicology, 2004.
[13] D. Radicioni and V. Lombardo. Guitar fingering for music performance. Proceedings of the International Computer Music Conference, pages 527-530, 2005.
[14] Q. Yasumuro, Y. Chen and K. Chihara. Three-dimensional modeling of the human hand with motion constraints. Proceedings of Image and Vision Computing, 17(2):149-156, 1999.
[15] K.-H. Lee and K. H. Kroemer. A finger model with constant tendon moment arms. Proceedings of Human Factors and Ergonomics Society 37th Annual Meeting, 37:710-714, 1993.
[16] N. S. Pollard and V. B. Zordan. Physically based grasping control from example. Proceedings of the 2005 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 311-318, 2005.
[17] G. ElKoura and K. Singh. Handrix: animating the human hand. Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 110-119, 2003.
[18] J. Kim, F. Cordier, and N. Magnenat-Thalmann. Neural network-based violinist's hand animation. Proceedings of Computer Graphics International 2000, pages 37-41, 2000.
[19] C. Hager-Ross and M. H. Schieber. Quantifying the independence of human finger movements: comparisons of digits, hands, and movement frequencies. The Journal of Neuroscience, 20(22):8542-8550, 2000.
[20] B. Benward and M. Saker. Music in Theory and Practice, Volume 1. McGraw-Hill College, 7th edition, 1997.
Figure 1: Flowchart of system implementation
Figure 2: A trellis graph where consecutive time-slices correspond to consecutive chords in the piece. The weighted nodes represent the cost of hand poses for a fingering choice and weighted edges represent the cost of hand motion between the hand poses
Figure 3: Hand pose cost values for poses corresponding to different separation of the thumb and ring fingers
Figure 4: Repositioning of non-instructed fingers, which are indicated by the red circles around them
Figure 5: Data flow of simulation of piano performance
Figure 6: Important joints in our hand model
Figure 7: The axes in our model
Figure 8: Algorithm to handle finger crossover
Figure 9: Arpeggio Skill
Figure 10: Hand motion from one chord to another. The hand reaches its highest position
above the piano keyboard midway through the motion.
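As a minimal sketch of the motion described in the Figure 10 caption, the following interpolates the hand position between two chord positions and adds a vertical lift that peaks at the midpoint of the motion. The sine-shaped lift profile, the function name `hand_arc`, and the `lift` parameter are illustrative assumptions and are not taken from the paper; Y is treated as the vertical axis, as in the Figure 15 caption.

```python
import math

def hand_arc(start, end, lift, t):
    """Hypothetical hand position at normalized time t in [0, 1].

    start, end: (x, y, z) hand positions at the two chords, with y vertical.
    lift: assumed extra height reached at the midpoint of the motion.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    # Linear interpolation between the two chord positions.
    x, y, z = (s + (e - s) * t for s, e in zip(start, end))
    # Assumed lift profile: zero at both ends, maximal at t = 0.5.
    y += lift * math.sin(math.pi * t)
    return (x, y, z)
```

Any profile that vanishes at both ends and has a single maximum at t = 0.5 (a parabola, for instance) would match the caption equally well; the sine is chosen here only for brevity.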
Figure 11: Rotation of instructed finger(s) influences the rotation of non-instructed
fingers. In this example, the rotation of the instructed middle finger influences the
rotation of the non-instructed ring and little fingers
Figure 12: Key poses of finger crossover while playing scales. The first row shows a
key-frame of the thumb crossing over the middle finger while playing the C-major scale, and
the second row shows a key-frame of the thumb crossing over the ring finger while playing the
D-major scale, both from three perspectives. Note that the ring/middle finger firmly presses
down the keys, the fingers avoid collisions with black keys in C-major, the wrist maintains
a natural rotation, and the thumb is positioned well on the key to play it after crossing over
Figure 13: Correct fingering generated for the first part of Bilder einer Ausstellung
Figure 14: Pose for (a) E4-G4-C5 (b) F4-F5 (c) #D4-G4-#A4-#D5
Figure 15: Motion comparison before and after optimization. The first three sub-figures
show that the motion curves along the three axis components are smoothed after optimization,
as the red curves show. The fourth sub-figure shows that the rotation of the wrist about the
vertical axis (the Y axis) is also smoothed. Together, the four sub-figures demonstrate that
the hand moves with less distance and rotation for the same music clip after optimization, and
that the hand motion is therefore smoother after the optimization routine.
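One simple way to quantify "less distance and rotation" is to sum the distances between consecutive sampled hand positions and the absolute changes between consecutive wrist angles. The sketch below assumes the motion is available as a list of (x, y, z) samples and a list of wrist angles; the function names are hypothetical, and this is not presented as the metric the authors used.

```python
import math

def path_length(positions):
    """Total distance travelled along a sequence of (x, y, z) samples."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))

def total_rotation(angles):
    """Sum of absolute angle changes between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(angles, angles[1:]))
```

Computing both values for the motion before and after optimization over the same music clip would give two pairs of numbers to compare; lower values after optimization would be consistent with the caption's claim.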
Figure 16: Motion comparison between ground truth data and optimization result. The
motion curves along the three translation components closely follow those of the
corresponding motion capture data for the same music clip.
Figure 17: Some key-poses while playing Childhood memory