
Professor Anil Kokaram

[email protected]
5C1 Motion Picture Engineering
Digital Video Engineering
Engineering for Moving Pictures
Colour Keying
Binary Matte

[Slide figure: a composite frame with Foreground and Background regions labelled]
Manually set the location of this area, and measure the mean and variance of each colour plane.
E_t = 40, λ = 0.0, Iteration = all: this is ML (a sketch of this Maximum Likelihood keyer follows below).
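As a minimal sketch of the ML keyer described above, assuming the manually set area is a background sample, an independent Gaussian per colour plane, and a threshold called E_t (none of these names come from the lecture code):

```python
import numpy as np

def ml_colour_key(image, patch_box, E_t=40.0):
    """Maximum Likelihood binary matte against a hand-picked colour sample.

    image     : H x W x 3 float array
    patch_box : (r0, r1, c0, c1), the manually set sample area (assumed background)
    E_t       : energy threshold (the slide uses E_t = 40)
    """
    r0, r1, c0, c1 = patch_box
    sample = image[r0:r1, c0:c1].reshape(-1, 3)

    mu = sample.mean(axis=0)           # mean of each colour plane
    var = sample.var(axis=0) + 1e-6    # variance of each colour plane

    # Per-pixel energy: variance-normalised squared distance to the sample colour
    diff = image - mu
    energy = np.sum(diff * diff / var, axis=2)

    # High energy = unlike the sampled background colour, so label as foreground
    matte = (energy > E_t).astype(np.uint8)   # 1 = foreground, 0 = background
    return matte, energy
```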
The Gibbs Energy Function as an MRF
The Hammersley-Clifford theorem posits that if we define local energy functions like this, then the field of variables over the whole image is an MRF.
E_t = 40, λ = 10.0, Iteration = 10 (a sketch of this iterative MRF refinement follows below).
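To show what the λ-weighted clique term adds, here is a rough ICM-style sketch built on the energy map returned by ml_colour_key() above. The Ising-style 4-neighbour penalty and the parallel update are assumptions: the slide does not reproduce the lecture's exact local energy function.

```python
import numpy as np

def mrf_refine(energy, E_t=40.0, lam=10.0, iterations=10):
    """ICM-style refinement of the binary matte under an MRF smoothness prior.

    energy : H x W data energy from ml_colour_key()
    lam    : smoothness weight (lam = 0 reproduces the ML matte)
    Assumes an Ising-style penalty of lam per disagreeing 4-neighbour.
    Updates are done in parallel for brevity (true ICM visits pixels in turn).
    """
    label = (energy > E_t).astype(np.int32)  # 1 = foreground, 0 = background
    for _ in range(iterations):
        padded = np.pad(label, 1, mode='edge')
        fg_neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                         padded[1:-1, :-2] + padded[1:-1, 2:])
        # Gibbs energy of each label at every pixel: data term + clique term
        e_fg = E_t + lam * (4 - fg_neighbours)   # constant data cost for foreground
        e_bg = energy + lam * fg_neighbours      # distance-to-sample cost for background
        label = (e_fg < e_bg).astype(np.int32)   # pick the lower-energy label
    return label
```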
MAXIMUM LIKELIHOOD vs MAP
The first paper to present Matting in a Bayesian formulation

Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR 2001), Vol. II, 264-271, December 2001
Key terms in Modern Matting
• Composite Image
• Trimatte or Trimap: the image divided into Known Background, Known Foreground and an Unknown Region
• Premultiplied Foreground
• Premultiplied Background
• In fact the premultiplied foreground and background are both actually F and B composited against black (i.e. 0) respectively (see the sketch below).
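A quick sketch of the compositing equation and of why the premultiplied foreground and background are F and B composited against black (function names are mine):

```python
import numpy as np

def composite(F, B, alpha):
    """Compositing equation: C = alpha*F + (1 - alpha)*B, with alpha in [0, 1]."""
    a = alpha[..., None]          # broadcast the matte over the colour planes
    return a * F + (1.0 - a) * B

def premultiplied_foreground(F, alpha):
    """Premultiplied foreground = F composited against black (B = 0), i.e. alpha*F."""
    return composite(F, np.zeros_like(F), alpha)

def premultiplied_background(B, alpha):
    """Premultiplied background = B composited against black, i.e. (1 - alpha)*B."""
    return composite(np.zeros_like(B), B, alpha)
```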
Optimal solutions
• Belief Propagation (Judea Pearl)
• Graph Cuts (Boykov, Zabih)
• These solve the “labelling” problem with MRF priors optimally
Other kinds of Markov Fields: the Autoregressive process
MRFs for Video
• Need to know about motion!
• Need to know how to measure success!
In 2009, at the IEEE Conference on Computer Vision and Pattern Recognition, a benchmark was launched: http://www.alphamatting.com/index.html
• Idealised, synthetically created composite images with ground-truth (GT) mattes
• New metrics introduced: Gradient difference; Connectivity measure
• Good mattes are “as smooth as” the ground truth matte and “as connected as” the ground truth matte (a rough sketch of the gradient term follows below)
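The official metric definitions are in the benchmark paper; the following keeps only the core idea of the gradient-difference term (the connectivity measure is more involved and omitted here), and the smoothing used by the real benchmark is left out:

```python
import numpy as np

def gradient_error(alpha, alpha_gt):
    """Rough sketch of a gradient-based matte error: compare the spatial
    gradients of the estimated matte and the ground-truth matte."""
    gy, gx = np.gradient(alpha.astype(np.float64))
    gy_t, gx_t = np.gradient(alpha_gt.astype(np.float64))
    return np.sum((gx - gx_t) ** 2 + (gy - gy_t) ** 2)
```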
Anat Levin in 2008 produced a work of genius
• A. Levin, D. Lischinski and Y. Weiss, "A Closed-Form Solution to Natural Image Matting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 228-242, Feb. 2008, doi: 10.1109/TPAMI.2007.1177.
Transform our 3-variable problem into 1
• Assume F and B are locally constant in a small window
• Instead of estimating α, F, B from the compositing equation C = αF + (1 − α)B, JUST ESTIMATE α
• C = α(F − B) + B
• ⇒ α = C/(F − B) − B/(F − B)
• ⇒ α = aC + b
• Now our non-linear problem for estimating α, F, B becomes a linear least-squares problem, because α depends on the composite C through just two linear coefficients a, b (which absorb F and B); this step is made concrete in the sketch below
• She then went on to remove a,b from the solution entirely
• Very fast algorithm … looks a lot like filtering
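To make the α = aC + b step concrete, here is a toy per-window least-squares fit for a grey-level composite. This is only the local linear model from the bullets above, not Levin's closed-form method, which works in colour with more coefficients, eliminates a and b analytically, and solves one global sparse system:

```python
import numpy as np

def window_alpha(C_win, alpha_win, known):
    """Fit alpha ≈ a*C + b by least squares inside one small window.

    C_win     : flattened grey-level composite values in the window
    alpha_win : flattened alpha values, trusted only where `known` is True
                (e.g. from the trimap's known foreground/background)
    Returns alpha predicted for every pixel in the window from the linear model.
    """
    # Solve [C 1] @ [a, b]^T ≈ alpha over the trusted pixels
    A = np.stack([C_win[known], np.ones_like(C_win[known])], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, alpha_win[known], rcond=None)
    # Propagate alpha to every pixel in the window via the linear model
    return a * C_win + b
```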
Amazing results
Then in 2017 … along came Deep Nets
• N. Xu, B. Price, S. Cohen and T. Huang, "Deep Image Matting," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 311-320.

Key Ideas
• Input is the picture and its associated trimatte or trimap
• Two stages: an Encoder-Decoder followed by a fully convolutional “Refinement” stage. Train the E-D first, then train the “Refinement” stage
• Loss function combines an α-matte loss with a composite-image loss (a NumPy sketch follows below)
• L_α = √((α − α_g)² + ε²) ;  L_C = √((C − C_g)² + ε²)
• L = w_l L_α + (1 − w_l) L_C
• So the first network is trained to produce BOTH the alpha-matte AND the observed composite image
• The second network has NO DOWNSAMPLING, so it acts to refine the alpha-matte
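A minimal NumPy sketch of that loss combination, assuming the Charbonnier-style form √((·)² + ε²) and averaging over all pixels for simplicity; the paper restricts the losses to the trimap's unknown region, and w_l = 0.5 is an assumed default:

```python
import numpy as np

def alpha_loss(alpha_pred, alpha_gt, eps=1e-6):
    """L_alpha = sqrt((alpha - alpha_g)^2 + eps^2), averaged over pixels."""
    return np.mean(np.sqrt((alpha_pred - alpha_gt) ** 2 + eps ** 2))

def composite_loss(alpha_pred, F_gt, B_gt, C_gt, eps=1e-6):
    """L_C: re-composite with the predicted alpha and the ground-truth F, B,
    then compare against the ground-truth composite."""
    a = alpha_pred[..., None]
    C_pred = a * F_gt + (1.0 - a) * B_gt
    return np.mean(np.sqrt((C_pred - C_gt) ** 2 + eps ** 2))

def total_loss(alpha_pred, alpha_gt, F_gt, B_gt, C_gt, w_l=0.5):
    """Overall loss L = w_l * L_alpha + (1 - w_l) * L_C."""
    return (w_l * alpha_loss(alpha_pred, alpha_gt) +
            (1.0 - w_l) * composite_loss(alpha_pred, F_gt, B_gt, C_gt))
```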
Results etc
• 320 × 320 image crops: can't process 720p, for instance
• 49,300 training images
• Encoder initialised with the first 14 layers of VGG-16
Levin’s success can explain why a DNN can work
Levin and DNN
• Levin re-posed the matting problem as essentially a filtering problem over pixels
• The nonlinearity was removed by introducing “latent” variables a, b
• The DNN is learning to do the same thing PLUS it is injecting semantics into the problem … learning to spot objects and what their “coherency” should be
Why have I bothered to show you anything before 2017?
• In postproduction, DNNs are still not being used
• They are unpredictable in the sense that they are good with things they
have seen before but not well behaved with “new” material
• In video temporal coherency is EVERYTHING. Video matting is still trying to
sort this out. DNNs are only now emerging (2020) which address temporal
coherency
• DNNs are massively computationally heavy. Imagine trying to do all of this
with 8K plates! (In postproduction they call a frame a PLATE)
• Hybridisation of these two genres of techniques is the way forward: DNNs + Levin
• Remember 95% success is still not good enough for post-production!
M. Forte, B. Price, S. Cohen, N. Xu and F. Pitié, "Interactive Training And Architecture For Deep Object Selection," 2020 IEEE International Conference on Multimedia and Expo (ICME), London.

François Pitié (sigmedia.tv) is working to push deep matting into a usable state; see the ICME paper from 2020, which won a prize.
FIN
