Introduction To Machine Learning For Computer Graphics
Peter M. Hall
University of Bath
Abstract
Computer Graphics is increasingly using techniques from Machine Learning. The trend is motivated by several factors, but the difficulty and expense of modelling are a major driving force. Here
modelling is used very broadly to include models of reflection
(learn the BRDF of a real material), animation (learn the motion of
real objects), as well as three-dimensional models (learn to model
complex things).
Building around a few examples, we will explore the whys and
hows of Machine Learning within Computer Graphics. The course
will outline the basics of data-driven modelling, introduce the foundations of probability and statistics, describe some useful distributions, and differentiate between ML and MAP problems. The ideas
are illustrated using examples drawn from previous SIGGRAPHs;
we'll help non-artists to draw, animate traffic flow from sensor data,
and model moving trees from video.
1 Introduction

Machine Learning is of growing importance to many areas of modern life: big data, security, financial prediction, and web search are areas that impact the lives of almost everyone. Computer Graphics, along with allied fields such as Computer Vision, is also benefiting from ideas and techniques that originate in Machine Learning. Computer Graphics research that pulls on Machine Learning is typified by the inclusion of real-world data.
Users of Computer Graphics (the creative industries in film, broadcast, games, and others) articulate the value of acquiring models
(in a broad sense) from real data. Modelling for them is a very expensive business. Consider the effort involved in modelling a landscape populated with trees, rivers, flower meadows, stones on the
ground, moss on the stones ... now animate it. Machine Learning
offers a feasible way towards less expensive acquisition of models,
because it is able to construct models from data.
Looking back at past SIGGRAPHs and Eurographics there is a clear
trend towards the use of Machine Learning in Computer Graphics,
see Figure 1. The data in the figure are my interpretation of papers presented; others may take a different view. Even so, when
compared with 5 or 10 years ago the recent trend is unmistakable.
The influence of Machine Learning reaches across almost all of
Computer Graphics, but clusters around subjects such as modelling
the reflection properties of surfaces, animation of people and faces,
organising databases of models, capturing natural phenomena, interaction, and other areas where data-driven, example-based input
offers an advantage.
In this course we will first introduce the basics of Machine Learning, looking at three generic problems: regression, density estimation, and classification. Next we will study the underlying mathematics, later illustrated using two examples taken
from SIGGRAPH 2013 and one example from SIGGRAPH Asia
2011.
Our first example is a crowd sourcing application of Machine
Learning to help people draw; the input set comprises raw pen
strokes made by a user, the output set comprises refined user
strokes, and the function is learned by looking at pen strokes made
by many different people.
The second example comes from the realistic animation of traffic
flow in a city. In this case readings from sensors that monitor traffic flow make up the input set; the output set is vehicle trajectories. The
parameters of a simulation are learned to enable realistic animation.
The final example is modelling moving trees. The input is now a
video of a real tree, and the output is a statistical description of the
moving tree that enables new individual trees to be spawned. This
example shows how learned probabilistic rules can replace hardcoded rules of production.
Figure 3: Left, sample points and postulated lines with error vectors. Right, the least-squares line (red) and true line (black).

Figure 2: A schematic view of Computer Graphics, Computer Vision, and Machine Learning. The elements in blue are inputs to some computer program; the element in red is what the program produces.
Machine Learning algorithms learn a function that maps a domain, X, to a range, Y,

y = f(x),    (1)
where the inputs to the algorithm are subsets of the domain and range: {x} ⊂ X and {y} ⊂ Y. The subsets contain samples of
the wider population, typically samples are observed in some way
(by sensors, or by a human). Since the wider population may be an
infinite set it is often the case that samples are the best we can hope
for. All Machine Learning algorithms need a sampled domain as
input; but as we will see, they do not all need a sampled range.
In the rest of this course we will use X to mean a sampled domain,
and Y to mean sampled range values. We will assume that the values in these sets are not perfect, but have been jittered away from
their true values: they are noisy data.
We will now consider three generic Machine Learning problems to
make the idea of data modelling more concrete: regression, density estimation, and classification, explaining these in general terms
and using the concepts just introduced. After which, we'll provide mathematical underpinning so that we can then study three SIGGRAPH papers from a Machine Learning point of view.
2.1 Regression

Regression finds parameters, θ, for a function

y = f(x, θ).    (2)

The standard example is fitting a straight line, y = mx + c, with parameters θ = (m, c). The least-squares fit minimises the error

E(m, c) = \sum_{i=1}^{N} (m x_i + c - y_i)^2.    (3)

Setting the derivatives of E with respect to m and c to zero gives a pair of linear equations whose solution is the least-squares line shown in Figure 3.
Regression is not just for straight lines. The least squares method is
very general and can be used to fit a wide range of different types
of function. Regression is behind modern model acquisition techniques such as laser scanning. The scanner produces a point cloud,
in which X comprises points xi , each in three dimensions. Typically, subsets of points are selected and planes are fitted through
them by assuming a contour, that is f(x, θ) = 0, where θ is a tuple
that contains the parameters of the plane (e.g. a point on the plane
and a unit normal to it).
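To make equation (3) concrete, here is a minimal least-squares sketch in Python/NumPy; the synthetic data and the "true" line are assumptions made purely for illustration, not values from the course.

```python
import numpy as np

# Synthetic noisy samples of an assumed "true" line y = 2x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=50)

# Least squares: minimise E(m, c) = sum_i (m*x_i + c - y_i)^2 by stacking
# [x, 1] so the model reads A @ (m, c) ~= y, then solving in the least-squares sense.
A = np.column_stack([x, np.ones_like(x)])
(m, c), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"estimated m = {m:.3f}, c = {c:.3f}")   # close to the true (2, 1)
```

The same pattern, with a different design matrix A, fits planes to laser-scanned points or polynomials to samples.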
2.2 Density Estimation
The density of physical matter is defined as the mass per unit volume. If we think about matter as being composed of particles, each
of the same mass, then the density is directly proportional to the
number of particles in a region.
Similarly, we can think about data density. This means we think
about elements in the domain, x, as being points in space. It is common to observe more points in some regions of space, and fewer
elsewhere. Studying these patterns is a key part of Machine Learning. An analogy would be salt spilled onto a table. The number of
grains of salt in a region of the table is the local density, and the
pattern of salt could lead us to infer the manner in which the
salt was spilled.
Density estimation requires only a domain, X, as input to the Machine Learning algorithm. The range, Y, is now the local density,
hence it is not given as input to a density estimator, rather it is computed. Notice this is equivalent to determining a function, because
a function can be defined in general by listing the range elements
that correspond to domain elements (as in a look-up table).
The importance of density estimation comes from the fact that the
data density is the probability of observing a point sampled at random from the population. For example, suppose you wished to
animate someone throwing darts at a board. You might make a physical model, complete down to the tiniest details, including muscle strength, air currents, the shape of the dart's feathers, and so on. It
would be much easier to model the probability of where the dart
will land. If you could model the density of real dart-holes from a
real player, then an avatar could emulate that real player, see Figure 4.
There are many ways to estimate data density, but the large-scale
categories are parametric and non-parametric. The parametric
methods fit a function to the density of data in X. Often a Gaussian Mixture Model (GMM) is used, which is a weighted sum of K Gaussians. We'll consider Gaussians in Section 3.3.1, but a GMM is

f(x) = \sum_{i=1}^{K} \pi_i N(x | \mu_i, C_i),    (6)

with \pi_i the weights (which must sum to 1) and each (\mu_i, C_i) specifying a particular Gaussian component. The parameters \pi_i, \mu_i, C_i for a GMM density estimator can be found by regression, albeit a more complex problem than the straight line, and search must be used. The Expectation-Maximisation algorithm [Dempster et al. 1977] is a popular choice, but it is beyond the scope of this course. Once fitted, a parametric density estimator can be very informative and useful, provided the underlying density is well approximated by the chosen functions (such as the Gaussian).

Non-parametric methods estimate the density directly from the samples. A common choice is the kernel density estimator over d-dimensional points,

f(x) = \frac{1}{N h^d} \sum_{i=1}^{N} k\!\left(\frac{x - x_i}{h}\right).    (7)

Typical kernels include the top hat, k(x) = 1 only if |x| < r for some radius r; Gaussian kernels are also used, usually fixed to have a tight spread.
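A minimal sketch of the non-parametric estimator of equation (7) in one dimension, here with a Gaussian kernel (a top hat would simply replace the function k); the bandwidth h and the toy data are illustrative choices, not values from the course.

```python
import numpy as np

def kde_gaussian(query, samples, h):
    """1D kernel density estimate, f(x) = (1/(N h)) * sum_i k((x - x_i)/h),
    with a Gaussian kernel k(u) = exp(-u^2 / 2) / sqrt(2 pi)."""
    samples = np.asarray(samples, dtype=float)
    u = (np.asarray(query, dtype=float)[:, None] - samples[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return k.sum(axis=1) / (samples.size * h)

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(1.0, 1.0, 300)])
xs = np.linspace(-4.0, 4.0, 9)
print(np.round(kde_gaussian(xs, data, h=0.3), 3))   # larger values near the two modes
```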
Density estimators allow us to compute important properties of
data. They are needed to solve problems such as
x^* = \arg\max_x p(x).    (8)
2.3 Classification
Classification assigns a label, or class, to each input: y = class(x). There are many classifiers. Two of the best known are the Gaussian Mixture Model (GMM) and the Support Vector Machine (SVM). A GMM classifier assumes that each component of the GMM covers just one class, that is, the data in any one class are Gaussian distributed. SVM algorithms construct surfaces that separate one class from another, as seen in Figure 5. Usually these surfaces are planes (hyper-planes if the data are high-dimensional), but some methods allow more general separating surfaces.
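A hedged sketch of a linear SVM separating two classes, using scikit-learn; the two-dimensional toy data and the parameters are illustrative, not taken from the course.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Two Gaussian-distributed classes in 2D (toy data).
class0 = rng.normal(loc=[-2.0, 0.0], scale=0.8, size=(100, 2))
class1 = rng.normal(loc=[+2.0, 0.0], scale=0.8, size=(100, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 100 + [1] * 100)

clf = SVC(kernel="linear")   # fits a separating hyper-plane
clf.fit(X, y)
print(clf.predict([[-1.5, 0.2], [1.8, -0.4]]))   # -> [0 1]
```

Swapping kernel="linear" for kernel="rbf" gives the more general separating surfaces mentioned above.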
3 Underpinning Mathematics

3.1 Statistics and Probability

It can be easy to confuse statistics with probability, not least because the confusion permeates popular language and is evident in newspapers and the media. We should therefore begin by making the difference between these clear.

Statistics summarise data. The mean and variance of a set of scalar values x_i are

\mu = \frac{1}{N} \sum_{i=1}^{N} x_i,    (9)

\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2.    (10)

For vector-valued data the mean and covariance are

\mu = \frac{1}{N} \sum_{i=1}^{N} x_i,    (11)

C = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^T.    (12)

For example, in three dimensions the covariance matrix is

C = \frac{1}{N}
\begin{bmatrix}
\sum_{i=1}^{N} u_{i1}^2      & \sum_{i=1}^{N} u_{i1} u_{i2} & \sum_{i=1}^{N} u_{i1} u_{i3} \\
\sum_{i=1}^{N} u_{i2} u_{i1} & \sum_{i=1}^{N} u_{i2}^2      & \sum_{i=1}^{N} u_{i2} u_{i3} \\
\sum_{i=1}^{N} u_{i3} u_{i1} & \sum_{i=1}^{N} u_{i3} u_{i2} & \sum_{i=1}^{N} u_{i3}^2
\end{bmatrix},    (13)

in which u_i = x_i - \mu.
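Equations (9) to (13) computed directly with NumPy; the random data below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3))          # N samples of a 3-dimensional variable

mu = X.mean(axis=0)                    # equation (11)
U = X - mu                             # u_i = x_i - mu
C = (U.T @ U) / X.shape[0]             # equations (12)/(13), biased (1/N) form

# np.cov uses 1/(N-1) by default; bias=True matches the 1/N definition above.
assert np.allclose(C, np.cov(X, rowvar=False, bias=True))
print(np.round(C, 2))
```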
Probability, by contrast, concerns random variables: a random variate is the set of possible values that can be assigned at random, and a random variable is the variable that is assigned a value at random. The mean of a set of numbers is an expectation value, because \frac{1}{N}\sum_i x_i = \sum_i x_i\, p(x_i), with p(x_i) = 1/N for all x_i.
3.2 Axioms of Probability

The axioms of probability state that all probabilities lie between 0 and 1, and that the probabilities of mutually exclusive events sum.

Bayes Law connects a posterior with a prior, via an observed likelihood and the chance of the evidence:

p(y|x) = \frac{p(x|y)\, p(y)}{p(x)}.    (15)

The term p(x|y) is the likelihood. The probability p(y) is called the prior; it is the chance that event y is observed, and is therefore a measure of our confidence in the assumption used in the likelihood. The term p(x) is called the evidence, or evidence density, and the left-hand side, p(y|x), is the posterior.

3.3 Probability Distributions
3.3.1 Gaussian

The Gaussian distribution over K-dimensional vectors x, with mean \mu and covariance C, is

p(x|\mu, C) = \frac{1}{(2\pi)^{K/2} |C|^{1/2}} \exp\!\left( -\frac{1}{2} (x - \mu)^T C^{-1} (x - \mu) \right).    (16)

This is often written in short-hand form:

p(x|\mu, C) = N(x|\mu, C).    (17)
The term outside the exponential, 1/((2\pi)^{K/2} |C|^{1/2}), is a normalisation factor used to ensure the distribution integrates to unity. It is this factor that turns the Gaussian function into a Normal distribution, and it is an indication of the hyper-volume of the particular distribution. The covariance can be decomposed as

C = U \Lambda U^T,    (18)

in which the columns of U are the eigenvectors of C and \Lambda is the diagonal matrix of its eigenvalues \lambda_{ii}. In fact |C|^{1/2} = \prod_{i=1}^{K} \lambda_{ii}^{1/2}, which is just the volume of a cube whose side lengths are standard deviations.
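The Gaussian of equations (16) to (18) can be checked numerically. This sketch, with made-up numbers, evaluates the density by hand and with SciPy, and confirms that |C|^{1/2} is the product of the square roots of the eigenvalues.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -1.0])
C = np.array([[2.0, 0.6],
              [0.6, 1.0]])
x = np.array([0.5, 0.0])

K = mu.size
d = x - mu
norm = 1.0 / ((2.0 * np.pi) ** (K / 2) * np.sqrt(np.linalg.det(C)))   # equation (16) prefactor
p_manual = norm * np.exp(-0.5 * d @ np.linalg.solve(C, d))

p_scipy = multivariate_normal(mean=mu, cov=C).pdf(x)                  # N(x | mu, C), equation (17)
assert np.isclose(p_manual, p_scipy)

# Equation (18): C = U Lambda U^T, and |C|^(1/2) = prod_i sqrt(lambda_i).
lam, U = np.linalg.eigh(C)
assert np.isclose(np.sqrt(np.linalg.det(C)), np.prod(np.sqrt(lam)))
print(p_manual)
```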
3.3.2 Multinomial
The multinomial distribution is used for discrete events. An intuitive and useful way to think about a multinomial distribution is as
a histogram that has been scaled so that it sums to unity. Figure 8
shows a multinomial distribution constructed from a histogram of
letters in an English document.
The multinomial probability density function gives the probability of observing a histogram, given a vector of mean values. For vectors of length K, the probability of observing a histogram with entry values (m_1, m_2, ..., m_K), given a vector of means (\mu_1, \mu_2, ..., \mu_K), is

p(m_1, m_2, ..., m_K | \mu_1, \mu_2, ..., \mu_K) = \frac{\left(\sum_{i=1}^{K} m_i\right)!}{\prod_{i=1}^{K} m_i!} \prod_{i=1}^{K} \mu_i^{m_i}.    (19)

The product \prod_{i=1}^{K} \mu_i^{m_i} appears because the probability of a sequence of independent events is just the product of their individual probabilities. The normalising term in front of the product counts the number of ways to partition the M = \sum_{i=1}^{K} m_i objects into K groups: one of size m_1, another of size m_2, and so on.
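A small numerical check of equation (19), using a made-up three-symbol alphabet in the spirit of the letter histogram of Figure 8; the counts and means are invented for illustration.

```python
import numpy as np
from math import factorial
from scipy.stats import multinomial

mu = np.array([0.5, 0.3, 0.2])      # mean vector (must sum to 1)
m = np.array([5, 3, 2])             # an observed histogram with M = 10 counts

M = int(m.sum())
coef = factorial(M) / np.prod([factorial(int(k)) for k in m])   # number of partitions
p_manual = coef * np.prod(mu ** m)                               # equation (19)

p_scipy = multinomial(n=M, p=mu).pmf(m)
assert np.isclose(p_manual, p_scipy)
print(p_manual)
```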
3.4 ML and MAP

We have already seen three problem areas for Machine Learning: regression, density estimation, and classification. Each of these problems can be addressed by solving a Maximum Likelihood (ML) or Maximum A-Posteriori (MAP) problem; we will not consider the fully Bayesian approaches.

We will illustrate the difference between ML and MAP using the regression of a polynomial as an example, so D = \{(x_i, y_i)\}_{i=1}^{N}. Before continuing we note that a polynomial

y = w_0 x^0 + w_1 x^1 + w_2 x^2 + ...    (20)

can be written as an inner product, y = w^T x, in which x = (x^0, x^1, x^2, ...)^T holds the powers of the input and w holds the weights.
3.4.1 Maximum Likelihood

The Maximum Likelihood solution is

\theta^* = \arg\max_\theta p(D|\theta).    (21)

This means the program must find the parameter that maximises the likelihood of the data, p(D|\theta). Within the context of regression the probability of the data is

p(D|\theta) = \prod_{i=1}^{N} p(x_i|\theta).    (22)

If the data are assumed to be subject to Gaussian noise of variance \sigma^2, then

p(D|\theta) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(w^T x_i - y_i)^2}{2\sigma^2} \right).    (23)
Taking the negative logarithm gives, up to an additive constant,

\frac{1}{2\sigma^2} \sum_{i=1}^{N} (w^T x_i - y_i)^2,    (24)

so that maximising the likelihood is equivalent to minimising the sum of squared errors,

\sum_{i=1}^{N} (w^T x_i - y_i)^2,    (25)

which is exactly the least-squares regression of Section 2.1.
3.4.2 Maximum A-Posteriori

The Maximum A-Posteriori solution is

\theta^* = \arg\max_\theta p(\theta|D).    (26)

This differs from the maximum likelihood solution because it requires a prior over the parameters, p(\theta), so that the posterior p(\theta|D) can be maximised. Following Bayes law we have

p(\theta|D) = \frac{p(D|\theta)\, p(\theta)}{p(D)}    (27)

\propto p(\theta) \prod_{i=1}^{N} p(x_i|\theta),    (28)

and taking logarithms, the quantity to be maximised is

\ln p(\theta) + \sum_{i=1}^{N} \ln p(x_i|\theta).    (29)

For polynomial regression a convenient prior over the weights is a Gaussian,

p(w) = \frac{1}{Z} \exp\!\left( -\frac{\lambda}{2} |w|^2 \right),    (30)

so that, up to constants, the MAP solution minimises the regularised least-squares cost

\sum_{i=1}^{N} (w^T x_i - y_i)^2 + \lambda |w|^2,    (31)

in which \lambda depends on the noise variance and on the width of the prior.
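A minimal sketch of the ML and MAP polynomial fits: the ML solution solves the normal equations for equation (25), and the MAP solution adds the λ|w|² term of equation (31), which is ridge regression. The degree, λ and data are illustrative assumptions, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.1, x.size)   # noisy samples of an assumed target

degree, lam = 7, 1e-3
X = np.vander(x, degree + 1, increasing=True)           # rows (x^0, x^1, ..., x^degree)

# ML solution: minimise sum_i (w^T x_i - y_i)^2           -> normal equations
w_ml = np.linalg.solve(X.T @ X, X.T @ y)

# MAP solution with a Gaussian prior on w: add lambda*|w|^2 -> ridge regression
w_map = np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y)

print(np.round(w_ml, 2))
print(np.round(w_map, 2))    # the prior shrinks the weights towards zero
```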
Limpaecher et al. [2013] describe how to construct an app for mobile phones that helps users trace over photographs of a friend or a celebrity. The idea is to correct a user's stroke as it is drawn (via the touch screen). Correction is possible because the app learns from the strokes of many different people; this is an example of crowd-sourcing.

The app has as input a user's stroke and a database of strokes collected by crowd-sourcing. The crowd-sourced strokes are assumed to be drawn over the same photograph that the user is currently tracing. Every stroke comprises a set of points. The algorithm corrects a user's stroke in two steps: (i) for each point on the user's stroke, suggest a corrected point; (ii) correct the stroke as a whole, using all the suggested points.

We'll consider suggesting a correction point first. Let z be a point on the user's stroke for which a suggestion is to be made. Let x_i be a crowd-sourced point. To make the mathematics a little easier, we will follow Limpaecher et al. by shifting all the points so that z is at the origin, then x_i ← x_i - z. Figure 9 depicts the situation after this shift: the point z is at the origin.
The suggested point is an expectation value,

\sum_{i=1}^{N} p(x_i)\, x_i,    (32)

in which the probability of each crowd-sourced point comes from normalised weights,

p(x_i) = \frac{w_i}{\sum_{j=1}^{N} w_j}.    (33)
The weights are related to a Gaussian fitted to the crowd-sourced points. The spread of the points along a direction u is measured by

\sqrt{ \sum_{i=1}^{N} p(x_i)\, (u^T x_i)^2 },    (36)

and each crowd-sourced point is weighted under the Gaussian

N(x_i | \mu, C),    (37)

in which the covariance is assembled from a pair of directions, u_1 and u_2:

C = [u_1\; u_2] \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} [u_1\; u_2]^T.    (38)

The second step corrects the stroke as a whole. Writing x_i for the points of the user's stroke, \delta_i for the suggested corrections, and x'_i for the corrected points, the corrected stroke minimises

\sum_{i=1}^{N} \big( x'_i - (x_i + \delta_i) \big)^2 + \sum_{i=2}^{N} \big( (x'_i - x'_{i-1}) - (x_i - x_{i-1}) \big)^2,    (39)

so that each corrected point is drawn towards its suggestion while the differences between consecutive points, and hence the overall shape of the stroke, are preserved.
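A toy sketch of the first correction step, in the spirit of equations (32) and (33): the suggestion for a user point is a weighted average of crowd-sourced points, here with simple isotropic Gaussian weights rather than the anisotropic covariance of equation (38). The function name, the weighting and the numbers are mine, not Limpaecher et al.'s.

```python
import numpy as np

def suggest_point(z, crowd_points, sigma=5.0):
    """Suggest a correction for user point z (2D) from crowd-sourced points.

    Shift so z is at the origin, weight each crowd point by an isotropic
    Gaussian, normalise the weights (equation (33)), and return the
    expectation of equation (32) shifted back to the original frame.
    """
    x = np.asarray(crowd_points, dtype=float) - np.asarray(z, dtype=float)
    w = np.exp(-0.5 * np.sum(x**2, axis=1) / sigma**2)   # unnormalised weights
    p = w / w.sum()                                      # p(x_i)
    return z + p @ x                                     # z + sum_i p(x_i) x_i

crowd = np.array([[10.0, 10.0], [12.0, 9.0], [11.0, 11.0], [40.0, -5.0]])
print(suggest_point([9.0, 8.0], crowd))   # pulled towards the nearby cluster
```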
Our second example is the animation of traffic flow in a city, due to Wilkie et al. [2013], in which the state of a traffic simulation is learned from sensor data. At its heart is a relative of the Kalman filter, a predictor-corrector in which the state of a system is first predicted from previous steps and then corrected by comparing the prediction with observations.

We will develop our intuition of the Kalman filter by briefly considering its application in Computer Vision, where it is used to track objects in video. The idea is that an object (such as a vehicle or a person) being tracked has a state, here denoted x. For tracking, the state is normally a position and velocity, but sometimes includes acceleration. The state (or at least some of it) is often not directly observable; for example, the velocity and acceleration of an object must be inferred from video via an observation, here denoted z. The observation and state are related: z = H(x), for some function H.
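To make the predictor-corrector idea concrete, here is a minimal linear Kalman filter tracking one-dimensional position and velocity; the motion model, noise levels and measurements are generic textbook assumptions, not those of Wilkie et al.

```python
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])    # constant-velocity motion model
H = np.array([[1.0, 0.0]])               # we observe position only: z = H x
Q = 0.01 * np.eye(2)                     # process noise covariance
R = np.array([[0.25]])                   # observation noise covariance

x = np.array([0.0, 0.0])                 # state estimate (position, velocity)
P = np.eye(2)                            # state covariance

for z in [1.1, 2.0, 2.9, 4.2, 5.1]:      # noisy position measurements
    # Predict.
    x = F @ x
    P = F @ P @ F.T + Q
    # Correct: the Kalman gain K maps observation space into state space.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.array([z]) - H @ x)
    P = (np.eye(2) - K @ H) @ P

print(np.round(x, 2))   # roughly position ~5 and velocity ~1
```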
If the state (of a real system) were known exactly, then the state at
the next time instant could be predicted exactly. Newton's laws of
motion are often used to make a prediction, but any suitable model
will suffice. However, the state is usually not known exactly, so the
prediction will be uncertain. In fact the state is assumed to be Gaussian distributed. This means that in principle the system could be in
any state, but some are more likely than others and the likelihoods
are given by a Gaussian, so p(x) = N(x|\mu, C), for some \mu and C.
This is process noise which, written in the mathematics of Machine Learning, is

x_t = f(x_{t-1}, m_t), \quad m_t \sim N(0, I).
A road has several lanes, so we need several states; x_t^i is used to denote the state of lane i at time t. An ensemble for a road of M lanes is then X_t = \{x_t^1, x_t^2, ..., x_t^M\}.
Kalman filtering predicts the next state of a single object; EnKS predicts the next state of an ensemble. The predictor can be any function that suits the application. In their paper, Wilkie et al. use a traffic simulator designed by Aw and Rascle [Aw and Rascle 2000] and Zhang [Zhang 2002]; it is called ARZ. This particular simulator was selected because it is a good model of traffic flow, but the EnKS, and therefore the method of Wilkie et al., do not depend upon it. We only need to know that the simulator is a function call that predicts the next state given the current state, and that the prediction is corrected by Ensemble Kalman Smoothing. The EnKS algorithm is given in Figure 16.
Figure 15: State of traffic flow for a three lane road. Each lane
divides into cells, and the density of traffic and velocity of traffic in
each cell comprise the state.
Looking at the algorithm we see it is a predictor-corrector. At each time step it first uses the simulator to predict both the new state, x_t^i, and a simulated observation, z_t^i, in each cell (simulated observation is our term). Once all cells have been updated the algorithm switches from prediction to correction.
For correction to be meaningful it must relate the real measurements
to the simulated state x via the simulated observations, z. Clearly,
the simulation should be adjusted so that deviations between real
and simulated observations are minimised. More exactly, the distribution of simulated observations (which is a consequence of the
distribution of states) should match the distribution of real observations.
Correction begins by computing the mean simulated observation, \bar{z}_t, over all lanes at the given time instant. Notice that this mean is a vector. The covariance, \Sigma_t, over simulated observations is computed at the same instant. These computations assume that simulated observations are Gaussian distributed; that is, p(z) = N(z | \bar{z}_t, \Sigma_t). Since \bar{z}_t and \Sigma_t are all that is needed to characterise the distribution, they are called sufficient statistics. We can now imagine the distribution of simulated observations being carried onto the distribution of states, as sketched in Figure 17: the Kalman gain, K, maps points in observation space directly into state space, and it is used to correct each state towards the real observations.
In this way the state of the traffic flow simulation is updated using
real data. Of course, the real data need not be real: it could be
supplied by an artist, for example. The rest of Wilkie et al. describes the traffic simulator, and how to use the output of the simulator to make Computer Graphics (broadly, put models of vehicles
in the lane cells, and render them). Since we have now covered the
Machine Learning elements of the paper, we'll move on to our next
example.
Figure 17: Mappings inside the EnKS algorithm, schematically illustrated. State vectors (red points) are distributed in state space (pink) under a Gaussian (ellipse). Observations are distributed in observation space (blue), also under a Gaussian. A whitening transform carries observations to a sphere at the origin (green points), and a covariance carries whitened observations to the state distribution (black points). The Kalman Gain, K, maps points in observation space directly into state space.

Modelling Trees
Our final example shows how to create models of many moving trees using video as source material. It comes from Li et al. [Li et al. 2011]. The motivation in this case is that trees are very hard to model, and even harder to animate. Li et al. propose a method to capture 3D trees that move in 3D, and which can be used to spawn new individual trees of the same type. The only input to their system is a single video stream, and the only user input is a rough sketch around the outline of the tree in frame one. Figure 18 shows a group of trees that has been produced from two input videos.
A significant amount of Computer Vision is required to segment the tree from its background and to track the leaves of the tree as they are blown by wind. Further processing is required to use the tracking information to infer a branching structure for the tree, which is obscured by the leaves being tracked. (If the tree is bare of leaves, a method to track the branches would replace this pre-processing.) We pick up the paper at the point of converting a 2D skeleton, which moves with the 2D tree as seen in the video, into a 3D tree.
Each node of the 2D skeleton is pushed into 3D by choosing the position that maximises a posterior probability:

p(x_i | X_{i-1}, \Omega) = \frac{p(\Omega | X_{i-1}, x_i)}{p(X_{i-1}, \Omega)}\, p(X_{i-1}, x_i) = \frac{p(\Omega | X_{i-1}, x_i)}{p(X_{i-1}, \Omega)}\, p(X_{i-1} | x_i)\, p(x_i).    (43)

We want x_i, the location of a branch node in 3D. The X_{i-1} represents all of the tree below that particular node, and \Omega is the volume the tree is growing into. A greedy algorithm is used: the x_i with the largest probability value p(x_i | X_{i-1}, \Omega) is selected as the push position; this is a MAP solution.
To see how this probabilistic approach is useful in tree growing, notice that on any given push we can safely ignore any term that does not contain x_i. Therefore we need only consider p(\Omega | X_{i-1}, x_i) and p(X_{i-1} | x_i). The first term governs the growth of the tree into the volume \Omega: there are points placed on the surface of the volume that x_i is attracted to, and it is attracted to the nearest points most strongly. The term p(X_{i-1} | x_i) is used to restrict the push of x_i away from the branches below it, so that the resulting branch is not bent out of shape. The reader is directed towards the original paper for details.
Motion in 3D is reconstructed by taking advantage of foreshortening. A branch that sways towards or away from the camera plane is foreshortened by a predictable amount that depends on the angle of sway. In practice the noise in measurements makes a deterministic approach unreliable, but the prediction is the basis for a probabilistic approach to motion reconstruction. This probabilistic approach follows exactly the same form as growing, except that the push distance x_i is replaced with a 3D rotation, and the rotation of the tree below the current node is taken into account too. Again the reader is referred to the original paper.
Figure 21: Creating new trees using probabilistic rules. Angles at bifurcations (middle) sum to 2π, so can be represented as multinomials. The multinomials (bottom) are considered as random samples from a Dirichlet distribution (top). Sampling from the Dirichlet allows a 2D skeleton to be grown, which is converted into 3D (right, top and bottom).
Now, there is a multinomial for every bifurcation in the tree. Li et
al. assume that for a tree of a given type these multinomials will
be similar. For example, in a tall thin tree the angle opposite the
local root will usually be narrow, whereas in a broad flat tree the
same angle will usually be wide. The problem is to characterise the
distribution of these multinomials for a given tree. To do this Li
et al. use regression to fit a Dirichlet distribution to them [Minka
2000]. The Dirichlet distribution is defined by

p(\mu_1, \mu_2, ... | \alpha_1, \alpha_2, ...) = \frac{\Gamma\!\left(\sum_{i=1}^{K} \alpha_i\right)}{\prod_{i=1}^{K} \Gamma(\alpha_i)} \prod_{i=1}^{K} \mu_i^{\alpha_i - 1},    (44)

in which the \alpha_i are its parameters and the \mu_i are multinomial entries; here they are the angles between the three branches at a bifurcation, scaled to sum to 1. A second Dirichlet distribution is fitted to the multinomials for branch length.
Clearly, the Dirichlet distribution is closely related to the multinomial distribution; in fact a draw from a Dirichlet distribution is a multinomial that belongs to a particular class (in this case, a type of tree). This is exactly what is needed here: it provides Li et al. with a
probabilistic generative model that can be used to make new individual examples belonging to the same class as the original data.
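A sketch of the generative step under assumed, hand-picked Dirichlet parameters (Li et al. learn theirs from the tracked tree [Minka 2000]): each draw from the Dirichlet is a new multinomial, that is, a plausible new bifurcation of the same family.

```python
import numpy as np

rng = np.random.default_rng(6)

# Dirichlet parameters alpha_i for the three angles at a bifurcation.
# Larger values concentrate the draws around alpha / alpha.sum().
alpha = np.array([8.0, 3.0, 3.0])                 # e.g. a "tall thin tree" style bifurcation

new_bifurcations = rng.dirichlet(alpha, size=5)   # each row is a multinomial summing to 1
angles = 2.0 * np.pi * new_bifurcations           # rescale multinomials to angles

print(np.round(angles, 2))
print(np.allclose(angles.sum(axis=1), 2.0 * np.pi))   # True: angles at each bifurcation sum to 2*pi
```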
The generative model for bifurcations is put to work to make new
individual trees. Starting from the trunk, bifurcations are added one
by one, using exactly the same Bayesian approach as for pushing a
point into 3D. At each step, a bifurcation belonging to the family
for the tree is created using the pair of Dirichlet distributions. It is
scaled to a size that depends on its depth in the tree (which may also
be learned), rotated, and placed amongst the existing branches. Tree
motion is also generated using a Dirichlet distribution. In this case
the multinomials are over the energy distributions in the resonant
frequencies of angular velocity.
The use of a probabilistic generative model replaces the need for
hand-crafted rules (as in L-systems), does not require users to draw
a tree (although the system could learn from drawings), nor does it
require a new photograph or video for each new tree. Instead the
learning of probabilistic rules is completely automatic and hidden
from the user.
Learn More

There is plenty of material to support the Computer Graphics researcher wanting to use Machine Learning in their work. Chris Bishop's book [2006] is an excellent general reference for Pattern Recognition and Machine Learning, as is that of Duda, Hart, and Stork [2012]; both are regarded as classic texts.
Allied fields, notably Computer Vision, use Machine Learning extensively. Papers in top conferences such as CVPR, ICCV, and
ECCV are rich in its use. The Machine Learning community publish at conferences such as ICML and NIPS.
Acknowledgements
Thanks to Qi Wu for producing Figure 5, and to Dmitry Kit for
having the patience to proof-read the script. Thanks too to all of
the authors of the SIGGRAPH papers used to illustrate this course.
And last but not least, thanks to the EPSRC funding agency who
support much of my work.
References
AW, A., AND RASCLE, M. 2000. Resurrection of second order models of traffic flow. SIAM Journal on Applied Mathematics, 916-938.

BISHOP, C. M. 2006. Pattern Recognition and Machine Learning, vol. 1. Springer, New York.

COMANICIU, D., AND MEER, P. 2002. Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 5, 603-619.

DEMPSTER, A. P., LAIRD, N. M., AND RUBIN, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society 39, 1, 1-38.

DUDA, R. O., HART, P. E., AND STORK, D. G. 2012. Pattern Classification. John Wiley & Sons.
Course Slides

Machine Learning in Computer Graphics
  Why? complexity of models
  Where? ubiquitous
  How? this course!

The ML trend in CG

Regression
  Find parameters, θ, for a function y = f(x, θ)
  standard example: a straight line, y = mx + c, with parameters θ = (m, c)
  which line? least squares minimises E(m, c) = Σ_i (m x_i + c - y_i)²

Using Regression
  Collomosse and Hall, Motion Analysis in Video: Dolls, Dynamic Cues, and Modern Art. Graphical Models 2006

Density Estimation
  How many data points at (close to) x? y = p(x)
  parametric and non-parametric estimators

Classification
  What thing is x? y = class(x), e.g. class(x) = tomato
  non-linear SVM

Use of Classification
  Shape classification used in the service of NPR

Underpinning Mathematics
  Random variate: the set of possible values that can be assigned at random
  Random variable: the variable assigned a value at random
  covariance

Axioms of Probability
  1: all probabilities lie between 0 and 1
  2: the probabilities of mutually exclusive events sum

Bayes Law
  Connects a posterior with a prior, via an observed likelihood and the chance of the evidence
  example events: A (even numbers), B (greater than 4)

ML and MAP
  Maximum Likelihood: θ* = argmax_θ p(x | θ)
  Maximum A-Posteriori: θ* = argmax_θ p(θ | x)

Limpaecher et al. Real-time Drawing Assistance Through Crowd Sourcing. SIGGRAPH 2013. Used by permission.

Wilkie et al. Flow Reconstruction for Data-Driven Traffic Animation. SIGGRAPH 2013. Used by permission.

Li et al. Modeling and Generating Moving Trees from Video. SIGGRAPH Asia 2011.

Push 2D into 3D

Learn More
  Books: Bishop, Pattern Recognition and Machine Learning; Duda, Hart, and Stork, Pattern Classification
  Conferences: CVPR, ICCV, ECCV; NIPS, ICML