Photorealistic Simulation of Lighting in Image Synthesis

Nicolas Holzschuch
Jury:
Sumanta Pattanaik (Reviewer)
Bernard Péroche (Reviewer)
Hans-Peter Seidel (Reviewer)
Georges-Pierre Bonneau (Chair)
George Drettakis (Examiner)
Claude Puech (Examiner)
François Sillion (Examiner)

This habilitation was prepared in the ARTIS team of the LJK laboratory. The LJK is UMR 5224, a joint laboratory of the CNRS, the INPG, the Université Joseph Fourier and the Université Pierre Mendès-France. ARTIS is a team of the LJK and a project of INRIA Rhône-Alpes.
To Myriam,
to Henry, Erik and Lena,
my family, who put up with me... and who support me.
Contents

1 Introduction
1.1 Structure of the thesis
1. Introduction
Rendering techniques in image synthesis produce pictures of a virtual scene, starting from the definition of that scene: the objects that make it up, their positions, their materials, and also the position of the observer.

Among rendering techniques, one distinguishes photorealistic rendering techniques, which aim to produce an image as close as possible to reality by simulating the exchanges of light inside the scene. One thus obtains an image of the scene with its lighting effects, both direct and indirect, its reflections, its shadows...

Research in lighting simulation has made enormous progress in recent years, to the point that producing photorealistic images is now a goal within reach of the general public. Several industrial applications take advantage of this photorealistic image generation: virtual walkthroughs of buildings, video games, virtual prototyping, special effects, architectural design...

These industrial applications in turn drive research: users (and industry) keep asking for ever more realistic effects, and researchers are called upon to provide them. The gap between the publication of a new algorithm and its use in an industrial product has shrunk considerably, from more than ten years in the 1990s to only a few years in 2006. Not only does this dynamism increase the opportunities for industrial application of our research, it also opens new research directions, to meet users' growing demands for interactivity and realism.

In this thesis, we focus on these problems of photorealistic lighting simulation. In particular, we present: lighting simulation by multi-scale finite-element methods (wavelet radiosity), the determination of characteristics of the illumination function (derivatives, frequency content), and the real-time or interactive simulation of several lighting effects (shadows, specular reflections, indirect lighting). These three areas cover all of our work during this period.
– In the area of finite-element lighting simulation, we worked on the hierarchical radiosity method, then on the wavelet radiosity method. We demonstrated the efficiency of the hierarchical representation, but also its limits: the hierarchy is constrained by the initial modelling of the scene. We showed how to lift this limitation. We also showed that extending the radiosity method to higher-order wavelets required a radical modification of the algorithm, and then how to combine higher-order wavelets efficiently with a discontinuity mesh.
– A problem common to several lighting-simulation methods is that the sampling must be adapted to the characteristics of the illumination function. But these characteristics are, by definition, unknown when the simulation starts. We showed how to compute certain characteristics of the illumination function locally: on the one hand its derivatives, on the other hand the frequency of its variations.
– Finally, we observe that a very large share of the computation in lighting simulation is devoted to the visual quality of the result rather than to the physical accuracy of the computation. It is possible to compute a physically acceptable image of a scene in a very short time, but computing a visually realistic image of the same scene multiplies the computation time by roughly 10. This rule of thumb holds for several simulation algorithms, such as radiosity and photon mapping. But visual realism is, by definition, only needed for the visible part of the scene. We developed several algorithms that compute, in real time, certain effects essential to visual realism (shadows, specular reflections). By combining this real-time computation with a separate computation of indirect lighting, our goal is a real-time simulation of global illumination in a dynamic scene.
Hierarchical and wavelet radiosity techniques use a multi-scale representation of the lighting. This hierarchical representation speeds up the computation. However, it is based on the geometric model of the scene, and the influence of this geometric model can slow the simulation down and lower its quality. We present two methods to break free of the original geometric model. We also present an analysis of the wavelet radiosity algorithm. Using hierarchical basis functions of order 2 or 3, as opposed to piecewise-constant basis functions, requires adapting the algorithm to account for costs proportional to the order of the basis functions raised to the n-th power. We show that, with a careful adaptation of every step of the algorithm, these higher-order wavelets reduce the complexity and speed up the computation.

1. James T. Kajiya. « The Rendering Equation ». Computer Graphics (ACM SIGGRAPH ’86 Proceedings), 20(4):143–150, August 1986.
2. Cindy M. Goral, Kenneth E. Torrance, Donald P. Greenberg and Bennett Battaile. « Modeling the Interaction of Light Between Diffuse Surfaces ». Computer Graphics (ACM SIGGRAPH ’84 Proceedings), 18(3):212–222, July 1984.
2. Multi-Scale Modelling of Lighting
Figure 2.1 – Images of architectural models computed with the radiosity method. The models were provided by the École d'Architecture de Nancy.

Figure 2.2 – Images of the Soda Hall model computed with the radiosity method. The Soda Hall model was provided by Carlo Séquin.
The matrix M contains the light-energy transfer coefficients between the facets of the scene's discretisation. Equation 2.2 is solved iteratively:
B = (I - M)^{-1} E = \sum_{k=0}^{\infty} M^k E          (2.3)
(a) Shooting (b) Gathering

Figure 2.3 – Gathering and shooting: using a row or a column of the transport matrix M.
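The Neumann-series solution of equation 2.3 can be sketched as an iterative gathering loop. This is a minimal illustration (toy patch numbers, dense matrix for clarity; an actual solver would use the hierarchical representation discussed below):

```python
import numpy as np

def solve_radiosity(M, E, eps=1e-9, max_iters=1000):
    """Solve B = (I - M)^-1 E as the Neumann series sum_k M^k E.

    Each multiplication by M is one full gathering sweep: row i of M
    gathers, for patch i, the light it receives from every other patch.
    """
    B = E.copy()        # k = 0 term: self-emitted energy
    bounce = E.copy()   # M^k E: energy still being propagated
    for _ in range(max_iters):
        bounce = M @ bounce            # one more bounce of light
        B += bounce
        if np.linalg.norm(bounce) < eps:
            break                      # remaining energy is negligible
    return B

# Toy two-patch enclosure (hypothetical numbers): each patch reflects
# half of what it receives from the other; only patch 0 emits.
M = np.array([[0.0, 0.5],
              [0.5, 0.0]])
E = np.array([1.0, 0.0])
B = solve_radiosity(M, E)   # converges to [4/3, 2/3]
```

Shooting would instead use a column of M to distribute one patch's unshot energy to all receivers; the series being summed is the same.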
3. Pat Hanrahan, David Salzman and Larry Aupperle. « A Rapid Hierarchical Radiosity Algorithm ». Computer Graphics (ACM SIGGRAPH ’91 Proceedings), 25(4):197–206, July 1991.
4. Pat Hanrahan and David Salzman. « A Rapid Hierarchical Radiosity Algorithm for Unoccluded Environments ». In Photorealism in Computer Graphics (Eurographics Workshop on Photosimulation, Realism and Physics in Computer Graphics), p. 151–171, June 1992.
– propagate the energy received at the different hierarchical levels of each object through the whole hierarchy (push-pull).

The hierarchical radiosity method changes the radiosity algorithm in depth: at every step we have a complete representation of the transport operator, so a radiosity transfer across the whole scene can be computed in a single step.
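The push-pull step above can be sketched on a simple patch hierarchy. This is a minimal illustration with an assumed node layout (`area`, `children`, a per-level `gathered` slot) and piecewise-constant (Haar) basis functions, not the actual implementation:

```python
class Patch:
    """One node of the patch hierarchy (assumed minimal layout)."""
    def __init__(self, area, children=()):
        self.area = area
        self.children = list(children)
        self.gathered = 0.0   # radiosity gathered at this level this sweep
        self.radiosity = 0.0  # consistent value after push-pull

def push_pull(patch, from_parent=0.0):
    """Push energy gathered by ancestors down to the leaves, then pull
    the children's results back up as an area-weighted average."""
    down = from_parent + patch.gathered
    if not patch.children:
        patch.radiosity = down
        return down
    total = 0.0
    for child in patch.children:
        total += push_pull(child, down) * child.area
    patch.radiosity = total / patch.area
    return patch.radiosity

# Tiny example: a unit patch subdivided into two equal children, with
# energy gathered both at the coarse level and at one child.
root = Patch(1.0, [Patch(0.5), Patch(0.5)])
root.gathered = 1.0
root.children[0].gathered = 2.0
push_pull(root)
```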
In a propagation step, gathering is easier to use than shooting: before a surface can send its energy into the scene, a push-pull step is needed. Shooting therefore requires too many push-pull steps, which makes it less efficient than gathering. Experimentally, we also found that shooting is more sensitive to imprecision in the transfer coefficients, and diverged with the imprecise methods used at the time to compute these coefficients.
We carried out an in-depth study of the hierarchical radiosity algorithm [16] (see p. 26). This study established two important points:
– most of the computation time is spent in visibility tests (more than 80% in a complex scene); the other steps of the algorithm have a comparatively modest impact;
– the method builds a hierarchy on each of the high-level objects that make up the scene. A link is first established between each pair of these hierarchies (initial linking), and these links are then refined. The method thus has reduced complexity with respect to the number of facets generated during refinement, but remains quadratic in the number of high-level objects in the scene.

Building on this study, we proposed a lazy linking method, which cuts the computation time by a factor of 2 and produces the first images sooner. We also proposed a new refinement oracle, which avoids useless refinements and saves another factor of 2 in computation time.
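The idea behind such a refinement oracle can be illustrated as follows. This is a hypothetical sketch, not the oracle from the paper: it subdivides a link only when the estimated error on the transfer, rather than the transfer B·F itself, may exceed the threshold, exploiting the fact that unoccluded form factors are computed accurately:

```python
FULL, PARTIAL = "full", "partial"
F_RELATIVE_ERROR = 0.05   # assumed relative accuracy of unoccluded form factors

def needs_refinement(src_radiosity, form_factor, visibility, eps):
    """Subdivide a link only when the estimated *error* on the transfer,
    not the transfer B * F itself, may exceed the threshold."""
    transfer = src_radiosity * form_factor
    if visibility == FULL:
        # Unoccluded form factors are computed accurately, so even a
        # large transfer can stay on a coarse mesh.
        error_bound = transfer * F_RELATIVE_ERROR
    else:
        # A shadow boundary may cross the element: be conservative.
        error_bound = transfer
    return error_bound > eps
```

With this test, a large but fully visible transfer no longer triggers subdivision on its own, which is exactly the kind of useless refinement the oracle is meant to avoid.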
5. François Sillion. « Clustering and Volume Scattering for Hierarchical Radiosity Calculations ». In Rendering Techniques ’95 (Eurographics Workshop on Rendering), p. 105–117, June 1994.
6. Andrew Willmott, Paul Heckbert and Michael Garland. « Face Cluster Radiosity ». In Rendering Techniques ’99 (Eurographics Workshop on Rendering), p. 293–304, 1999.
2.3 Efficiency of the Hierarchy
(a) Detail of figure 2.1(b) (b) Triangulated: 32 triangles (c) Untriangulated: a single surface

Figure 2.4 – The models contain complex planar surfaces, which get triangulated excessively.

Figure 2.6 – The radiosity function is defined on an extended planar domain, simpler than the original surface.
7. Laurent Alonso, François Cuny, Sylvain Petitjean, Jean-Claude Paul, Sylvain Lazard and Eric Wies. « The Virtual Mesh: A Geometric Abstraction for Efficiently Computing Radiosity ». ACM Transactions on Graphics, 20(3):169–201, July 2001.
8. Gregory Lecot, Bruno Lévy, Laurent Alonso and Jean-Claude Paul. « Master-Element Vector Irradiance for Large Tesselated Models ». In Third International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia (GRAPHITE ’05), p. 315–322, 463, November 2005.
9. Steven J. Gortler, Peter Schröder, Michael F. Cohen and Pat Hanrahan. « Wavelet Radiosity ». In ACM SIGGRAPH ’93, p. 221–230, 1993.
2.4 Higher-Order Wavelets
(a) Place Stanislas (figure 2.1(b)) (b) Temple (figure 2.1(a)) (c) Soda Hall (figure 2.2); curves: Original vs. Tesselated

Figure 2.7 – Convergence rate (ratio of the energy left to propagate over the initial energy) as a function of computation time (in seconds).
The storage cost of the coefficients attached to wavelets of order n is (n + 1)^k, where k is the dimension of the sampled function. Thus, the storage cost of each facet of the hierarchy grows as (n + 1)^2, and the storage cost of each interaction grows as (n + 1)^4. Using wavelets of order 2 (piecewise quadratic) therefore increases the storage cost of each interaction by two orders of magnitude.
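For concreteness, the storage growth can be checked with a few lines (a direct transcription of the formulas above, nothing more):

```python
def coefficients_per_patch(n):
    """Coefficients stored per hierarchy facet: (n + 1)^2 for order n."""
    return (n + 1) ** 2

def coefficients_per_interaction(n):
    """Coupling coefficients stored per interaction: (n + 1)^4."""
    return (n + 1) ** 4

# Haar (n = 0) versus piecewise-quadratic (n = 2) wavelets:
growth = coefficients_per_interaction(2) // coefficients_per_interaction(0)
# growth == 81, i.e. roughly two orders of magnitude per interaction
```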
For this reason, an experimental study¹⁰ concluded that higher-order wavelets brought no significant improvement over Haar wavelets (piecewise constant). That study was based on a naive implementation of the wavelet radiosity algorithm, changing only the basis functions. With the doctoral student François Cuny (co-supervised with Jean-Claude Paul and Laurent Alonso), we showed that every step of the algorithm has to be adapted to the specifics introduced by higher-order basis functions [8] (p. 56). It was also necessary to consider the consequences of each adaptation on the other steps of the algorithm; for instance, the decision not to store the links influenced the choice of the solution algorithm.

10. Andrew Willmott and Paul Heckbert. « An Empirical Comparison of Progressive and Wavelet Radiosity ». In Rendering Techniques ’97 (Eurographics Workshop on Rendering), p. 175–186, 1997.
11. Philippe Bekaert and Yves Willems. « Error Control for Radiosity ». In Rendering Techniques ’96 (Eurographics Workshop on Rendering), p. 153–164, 1996.
(a) Dining room (b) Classroom; curves: Haar, M2, M3

Figure 2.8 – Memory cost (in KB) for the different basis functions, as a function of the simulation error.
(a) Dining room (b) Classroom; curves: Haar, M2, M3; horizontal axis: CPU time (s)

Figure 2.9 – Simulation error as a function of computation time (in s).
Figure 2.10 – Comparison between Haar (constant), M2 (linear) and M3 (quadratic) wavelets, for the same computation time.
Figure 2.11 – A posteriori smoothing of Haar wavelets. The quality can be compared with figure 2.10.

…previous sections). The quality of the function we obtain with linear or quadratic wavelets, without post-processing, is higher than that obtained with piecewise-constant wavelets plus post-processing (see figure 2.11).
12. Daniel Lischinski, Filippo Tampieri and Donald P. Greenberg. « Discontinuity Meshing for Accurate Radiosity ». IEEE Computer Graphics and Applications, 12(6):25–39, November 1992.
13. Paul Heckbert. « Discontinuity Meshing for Radiosity ». In Eurographics Workshop on Rendering, p. 203–226, May 1992.
14. George Drettakis and Eugene Fiume. « A Fast Shadow Algorithm for Area Light Sources Using Backprojection ». In ACM SIGGRAPH ’94, p. 223–230, 1994.
15. Daniel Lischinski, Filippo Tampieri and Donald P. Greenberg. « Combining Hierarchical Radiosity and Discontinuity Meshing ». In ACM SIGGRAPH ’93, p. 199–208, 1993.
16. George Drettakis and François Sillion. « Accurate Visibility and Meshing Calculations for Hierarchical Radiosity ». In Rendering Techniques ’96 (Eurographics Workshop on Rendering), p. 269–278, 1996.
17. Frédo Durand, George Drettakis and Claude Puech. « Fast and Accurate Hierarchical Radiosity Using Global Visibility ». ACM Transactions on Graphics, 18(2):128–170, 1999.
…the computed discontinuities have, moreover, no visible effect on the simulation¹⁴. Finally, the generated mesh is incompatible with the wavelet approach, which requires a regular hierarchy.
Figure 2.12 – A cell cut by a discontinuity is (a) decomposed into two cells. For each cell, we identify the bounding parallelogram (b). On each parallelogram, we run a classical wavelet radiosity algorithm.
We developed an algorithm that combines higher-order wavelets with discontinuity meshing [13] (see p. 68). Our algorithm uses higher-order wavelets as much as possible, and introduces discontinuities into the mesh only if the refinement oracle establishes that they will reduce its complexity. We found that only a small number of discontinuities need to be introduced into the mesh (see figure 2.13); the remaining discontinuities are approximated well by the linear or quadratic basis functions.
Our work builds on previous work on radiosity over arbitrary planar surfaces [14], but modifies the approach on several points:
– Both sides of the discontinuity play a role in the lighting simulation, so two special cells must be created for each introduced discontinuity (see figure 2.12).
– On the discontinuity, special push-pull coefficients must be computed, accounting for the effective fraction of the surface lying on each side of the discontinuity. On the rest of the hierarchy, above and below the discontinuity, a regular subdivision is used, with all its advantages.
– Several discontinuities may intersect (shadow boundaries cast by several light sources, or umbra and penumbra boundaries that meet where the blocker touches the receiver). The discontinuities must then be processed one after the other, and some special cases must be handled.
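As an illustration of the second point, the area-weighted "pull" over a cell cut by a discontinuity might look as follows. This is a hypothetical one-coefficient sketch for constant basis functions; the actual coefficients in [13] also handle the higher-order bases:

```python
def pull_across_discontinuity(b_left, b_right, left_fraction):
    """Parent coefficient for a cell cut by a shadow discontinuity.

    The two sides of the discontinuity keep separate radiosity values;
    the parent receives their average, weighted by the effective
    fraction of the cell's area lying on each side.
    """
    return left_fraction * b_left + (1.0 - left_fraction) * b_right
```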
Our approach yields a very compact representation of radiosity, even in the presence of discontinuities (see [13] and figure 2.13). In lit areas, M3 wavelets allow very large cells while still providing a continuous representation, while on shadow boundaries the discontinuities let the refinement stop early. In penumbra areas, we showed that it is not always necessary to introduce the discontinuities: if the transition is smooth enough, M3 wavelets can model it correctly.

(a) (b)

Figure 2.13 – Radiosity with quadratic wavelets (M3) and discontinuity meshing, with an area light source. Part of the armchair's penumbra is modelled with regular cells, without inserting any discontinuities. Very large cells also suffice for the lit area at the foot of the armchair.
…the temporal coherence of the lighting, through a hierarchical decomposition that also covers the time dimension.

The preliminary work¹⁸ suffered from severe temporal discontinuities. Such discontinuities also exist in the spatial dimensions in classical hierarchical radiosity, where they are partially corrected by post-processing. Temporal discontinuities are both much more disturbing, because they affect the whole scene, and harder to treat.

Our work on higher-order wavelets had shown that these wavelets directly produce a continuous solution in the spatial domain. We showed that by introducing higher-order wavelets in the temporal domain [6] (p. 82), temporal discontinuities could be eliminated as well.
18. Cyrille Damez and François Sillion. « Space-Time Hierarchical Radiosity ». In Rendering Techniques ’99 (Eurographics Workshop on Rendering), p. 235–246, 1999.
On the other hand, our own work and that of other researchers has shown that knowledge of the underlying scene structure yields better-quality lighting simulation, for example by replacing a set of disconnected triangles with a single planar polygon [14], by replacing a set of polygons with a parametric surface⁷, or by exploiting the scene structure to build an efficient hierarchy for ray tracing, clustering or instantiation.

We launched the SHOW project, in collaboration with François Sillion and Cyril Soler, and with three other INRIA projects: ALICE, REVES and IPARLA. The goal of the SHOW project is to reconstruct the structure of a large set of unorganised data. In particular, the doctoral student Aurélien Martinet, funded by the SHOW project and whom I co-supervise with Cyril Soler and François Sillion, works on extracting a scene structure from an unorganised set of polygons.

The first results [4] (see p. 96) extracted sets of edge-connected triangles, which form basic building blocks. We also managed to extract the list of symmetries of each building block, and to identify identical blocks even when their tessellations differ. Later work¹⁹ allows the fast identification of identical objects in the scene, where objects are assemblies of building blocks.
2.6 Discussion

In this chapter, we presented our work in the area of wavelet radiosity. Our contributions are improvements to the algorithm: making the hierarchy span the whole scene, adapting the method to higher-order wavelets, and combining higher-order wavelets with discontinuity meshing.

These improvements are important, even essential, for any practical application of the radiosity method. Combined, they yield an optimal (compact) representation of the diffuse lighting in a scene. But this work has its own limitations:
– only diffuse lighting is computed, whereas complex reflectance functions play an important role in the realism of the scene;
– we lack a priori information about the behaviour of the illumination function, which prevents adapting the sampling to the function's variations;
– a very large share of the computation time goes to computing umbra and penumbra boundaries in the direct lighting, although these boundaries play only a minor role in the indirect-lighting computation.

These points are partially related. Introducing reflectance functions that are neither diffuse nor specular requires sampling in both the spatial and the angular domains, which increases the dimensionality of the problem and the memory cost of the algorithm. A priori knowledge of the variations of the illumination function would allow the (spatial and angular) sampling to be adapted to these variations, and thus the memory cost to be controlled. The next chapter is devoted to extracting the properties of the illumination function.
The third point is shared with other lighting-simulation algorithms: a large share of the computation time is devoted to effects that are important for the visual appearance of the computed image (shadow boundaries, specular reflections) but irrelevant for the indirect-lighting computation. This disproportion forces us to reconsider what we simulate, and to study ways of computing certain effects separately. This is the subject of chapter 4.

19. Aurélien Martinet. « Structuration automatique de scènes 3D ». PhD thesis, Université Joseph Fourier, 2006.
2.7 Articles

2.7.1 List of articles

– An efficient progressive refinement strategy for hierarchical radiosity (EGWR ’94)
– Wavelet Radiosity on Arbitrary Planar Surfaces (EGWR 2000)
– A novel approach makes higher order wavelets really efficient for radiosity (EG 2000)
– Combining higher-order wavelets and discontinuity meshing: a compact representation for radiosity (EGSR 2004)
– Space-time hierarchical radiosity with clustering and higher-order wavelets (CGF 2004)
– Accurate detection of symmetries in 3D shapes (TOG 2006)
2.7.2 An efficient progressive refinement strategy for hierarchical radiosity (EGWR ’94)

Authors: Nicolas Holzschuch, François Sillion and George Drettakis
Conference: 5th Eurographics Workshop on Rendering, Darmstadt, Germany.
Date: June 1994

An Efficient Progressive Refinement Strategy for Hierarchical Radiosity

Nicolas Holzschuch, François Sillion, George Drettakis
iMAGIS / IMAG*
1 Introduction

The radiosity method for the simulation of energy exchanges has been used to produce some of the most realistic synthetic images to date. In particular, its ability to render global illumination effects makes it the technique of choice for simulating the illumination of indoor spaces. Since it is based on the subdivision of surfaces using a mesh and on the calculation of the energy transfers between pairs of mesh elements, the basic radiosity method is inherently a costly algorithm, requiring a quadratic number of form factors to be computed.

Recent research has focused on reducing the complexity of the radiosity simulation process. Progressive refinement has been proposed as a possible avenue [1], whereby form factors are only computed when needed to evaluate the energy transfers from a given surface, and surfaces are processed in order of importance with respect to the overall balance of energy. The most significant advance in recent years was probably the introduction of hierarchical algorithms, which attempt to establish energy transfers between mesh elements of varying size, thus reducing the subdivision of surfaces and the total number of form factors computed [4, 5].

Since hierarchical algorithms proceed in a top-down manner, by limiting the subdivision of input surfaces to what is necessary, they first have to establish a number of top-level links between input surfaces in an "initial linking" stage. This results in a quadratic cost with respect to the number of input surfaces, which seriously impairs the ability of hierarchical radiosity systems to deal with environments of even moderate complexity. Thus a reformulation of the algorithm is necessary in order to be able to simulate meaningful architectural spaces of medium complexity (several thousands of input surfaces). To this end the questions that must be addressed are: What energy transfers are significant? When must they be computed? How can their accuracy be controlled?

* iMAGIS is a joint research project of CNRS/INRIA/UJF/INPG. Postal address: B.P. 53, F-38041 Grenoble Cedex 9, France. E-mail: [email protected].
The goal of the research presented here is to extend the hierarchical algorithm into a more progressive algorithm, by identifying the calculation components that can be delayed or removed altogether, and establishing improved refinement criteria to avoid unnecessary subdivision. Careful analysis of the performance of the hierarchical algorithm on a variety of scenes shows that the visibility calculations dominate the overall compute time.

Two main avenues are explored to reduce the cost of visibility calculations: First, the cost of initial linking is reduced by delaying the creation of the links between top-level surfaces until they are potentially significant. In a BF refinement scheme this means for instance that no link is established between dark surfaces. In addition, a form factor between surfaces can be so small that it is not worth performing the visibility calculation.

Second, experimental studies show that subdivision is often too high. This is a consequence of the assumption that the error on the form factor is of magnitude comparable to the form factor itself. In situations of full visibility between surfaces, relatively large form factors can be computed with good accuracy.
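The first avenue can be sketched as a link-creation test. The names and bounds below are illustrative, and the point-to-disc form-factor approximation stands in for whatever cheap unoccluded estimate an implementation would use:

```python
import math

FORM_FACTOR_MIN = 1e-4   # below this, visibility is not worth testing

def should_create_link(src, rcv, eps_link):
    """Lazy-linking test: defer the expensive visibility computation
    until a top-level link is potentially significant."""
    # Cheap unoccluded estimate: point-to-disc, F = A / (pi d^2 + A).
    d2 = sum((a - b) ** 2 for a, b in zip(src["center"], rcv["center"]))
    f_estimate = rcv["area"] / (math.pi * d2 + rcv["area"])
    # Avenue 1a: no link between dark surfaces (B * F below threshold).
    if src["radiosity"] * f_estimate < eps_link:
        return False
    # Avenue 1b: form factor too small to justify a visibility test.
    if f_estimate < FORM_FACTOR_MIN:
        return False
    return True   # only now pay for the expensive visibility test
```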
2 Motivation

To study the behaviour of the hierarchical algorithm, we ran the original hierarchical program [5] on a set of five different interior environments, varying from scenes with simple to moderate complexity (from 140 to 2355 input polygons). The scenes we used were built in different research efforts and have markedly different overall geometric properties. By using these different scenes, we hope to identify general properties of interior environments. We thus hope to avoid, or at least moderate, the pitfall of unjustified generalisation that often results from the use of a single scene or a class of scenes with similar properties to characterise algorithm behaviour. The scenes are: "Full office", which is the original scene used in [5]; "Dining room", which is "Scene 7" of the standard set of scenes distributed for this workshop; "East room" and "West room", which are scenes containing moderately complex desk and chair models; and finally "Hevea", a model of a hevea tree in a room. Table 1 gives a short description and the number of polygons n for each scene. Please refer to the colour section (Figs. 1, 3, 5 and 9-12) for a computed view of the test scenes.
2.1 Visibility

The first important observation we make from running the algorithm on these test scenes is the quantification of the cost of visibility calculations in the hierarchical algorithm.
Table 1. Test scenes.

Name         n     Description
Full Office  170   The original office scene
Dining room  402   A table and four chairs
East room    1006  Two desks, six chairs
West room    1647  Four desks, ten chairs
Hevea        2355  A hevea tree with three light sources
Fig. 1. Relative time spent in each procedure (Push-Pull, Visibility, Form-Factors, Refine, Gather), as a percentage, for each test scene.
Of course this is relative to the algorithm used. A better approach, e.g. with a pre-processing step, as in Teller et al. [9], could probably reduce the relative importance of visibility.

The second important observation concerns the actual cost of the initial linking step. As mentioned in the introduction, this cost is at least quadratic in the number of polygons, since each pair of input polygons has to be tested to determine if a link should be established. Since this step is performed before any transfer has commenced, it is a purely geometric visibility test, in this instance implemented by ray-casting. The cost of this test for each polygon pair can vary significantly, depending on the nature of the scene and the type of ray-casting acceleration structure used. In all the examples described below, a BSP tree is used to accelerate the ray-casting process.

Table 2. Total computation time and cost of initial linking (in seconds).

Table 2 presents timing results for all test scenes. The total computation time is given for ten steps of the multigridding method described by Hanrahan et al. [5]³.
These statistics show that the cost of initial linking grows significantly with the number of polygons in the scene. The dependence on scene structure is also evident: the growth in computation time between East room and West room is actually sublinear, while the growth between West room and Hevea is greater than cubic in the number of input polygons. For all tests of more than a thousand polygons, it is clear that the cost of initial linking becomes overwhelming. Invoking this cost at the beginning of the illumination computation is particularly undesirable, since a useful image cannot be displayed before its completion. Finally, we note that recent improvements of the hierarchical radiosity method by Smits et al. [8] and Lischinski et al. [6] have allowed significant savings in refinement time, but still rely on the original initial linking stage. Thus initial linking tends to become the most expensive step of the algorithm.⁴

³ The k'th step of the multigridding method is typically implemented as the k'th "bounce" of light: the first step performs all direct illumination, the second step all secondary illumination, the third all tertiary illumination, etc.

Progressive Refinement Strategy for Hierarchical Radiosity
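The bounce-by-bounce interpretation of multigridding (footnote 3) can be sketched on a toy two-patch system. All the numbers below (form factors, reflectances, emissions) are illustrative, not taken from the test scenes:

```python
import numpy as np

# Toy two-patch enclosure: F[i][j] is the form factor from patch i to j,
# rho the diffuse reflectances, E the emitted radiosities (illustrative).
F = np.array([[0.0, 0.6],
              [0.6, 0.0]])
rho = np.array([0.5, 0.8])
E = np.array([1.0, 0.0])

def radiosity_bounces(E, rho, F, steps):
    """Accumulate light bounce by bounce: step 1 adds direct illumination,
    step 2 secondary illumination, and so on (cf. footnote 3)."""
    B, delta = E.copy(), E.copy()
    for _ in range(steps):
        delta = rho * (F @ delta)   # radiosity added by one more bounce
        B += delta
    return B
```

Iterated to convergence, this series solves the radiosity system B = E + ρ F B; each multigridding step corresponds to adding one more term of it.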
Another interesting observation can be made concerning the number of top-level links (links between input polygons) for which the product BF never becomes greater than the refinement threshold ε_refine over the course of the ten refinement steps.⁵ Figure 2 shows the percentage of such links during the first ten iterations. A remarkably high percentage of these links never becomes a candidate for refinement: after 10 steps, between 65% and 95% of the links have not been refined. A significant number of those links probably have very little impact on the radiosity solution.
[Fig. 2: percentage of unrefined top-level links against iterations 1–11, for the Hevea, West Room, East Room, Dining Room and Full Office scenes.]
What can be concluded from the above discussion? First, if the initial linking step can be eliminated at the beginning of the computation, a useful solution becomes available much more quickly, enhancing the utility of the hierarchical method. Second, if the top-level links are only computed when they contribute significantly to the solution, there is the potential for large computational savings from eliminating a large number of visibility tests.
2.3 Unnecessary Refinement

The third important observation made when using the hierarchical algorithm is that unnecessary subdivision is incurred, especially for areas which do not include shadow boundaries. This observation is more difficult to quantify than the previous two. To demonstrate the problem we present an image of the Dining room scene, and the corresponding mesh (see colour section, Fig. 1 and 2). The simulation parameters were ε_refine = 0.5 and MinArea = 0.001.
As can be seen in Fig. 2 in the colour section, the subdivision obtained with these parameters is such that an acceptable representation of the shadows is achieved in the penumbral areas caused by the table and chairs. However, the subdivision on the walls is far higher than necessary: the illumination over the wall varies very smoothly and could thus be represented with a much coarser mesh. In previous work it was already noted that radiance functions in regions of full illumination can be accurately represented using a simple mesh based on the structure of illumination [2].
If this unnecessary subdivision is avoided, significant gains can be achieved: the total number of links will be reduced, saving memory, and an attendant reduction of visibility tests will result, saving computation time.

⁴ For example, Lischinski et al. report a refinement time of 16 minutes for an initial linking time of 2 hours and 16 minutes.
⁵ This is the ε used in the original formulation.

Nicolas Holzschuch, François Sillion, George Drettakis
time) if they are at least partially facing each other. If not, the pair is marked as classified and no link is created. If they are facing, we compute an approximation of their form factor, without a visibility test. If the product of the form factors and the radiosity is still larger than ε_link, we mark the pair of polygons as classified, and compute the visibility of the polygons. If they are visible, a link is created using the form factors and visibility already computed. Thus a pair of polygons can become classified either when a link is created, or when the two polygons are determined to be invisible. Figure 3 shows a pseudo-code listing of both the Initial Linking phase and the Main Loop in the original algorithm [5] and Fig. 4 gives the equivalent listing in our algorithm.
Initial Linking
    for each pair of polygons (p, q)
        if p and q are facing each other
            if p and q are at least partially visible from each other
                link p and q

Main Loop
    for each polygon p
        for each link l leaving p
            if BF > ε_refine
                refine l
        for each link l leaving p
            gather l
Initial Linking
    for each pair of polygons (p, q)
        record it as unclassified

Main Loop
    for each unclassified pair of polygons (p, q)
        if p and q are facing each other
            if Bp > ε_link or Bq > ε_link
                compute the unoccluded FF
                if BF > ε_link
                    link p and q
                    record (p, q) as classified
                else record (p, q) as classified
        else record (p, q) as classified
    for each polygon p
        for each link l leaving p
            if BF > ε_refine
                refine l
        for each link l leaving p
            gather l
The threshold ε_link used to establish top-level interactions is not the same as the threshold used for BF refinement, ε_refine. The main potential source of error in our algorithm is an incomplete balance of energy. Since energy is transferred across links, any polygon for which some top-level links have not been created is retaining some energy, which is not propagated to the environment.
When recursive refinement is terminated because the product BF becomes smaller than ε_refine, a link is always established, which carries some fraction of this energy (the form factor estimate used in the comparison against ε_refine is an upper bound of the form factor). On the other hand, when two top-level surfaces are not linked because the product BF is smaller than ε_link, all the corresponding energy is "lost". It is thus desirable to select a threshold such that ε_link < ε_refine.
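One pass of lazy linking over the still-unclassified pairs can be sketched as follows. This is a loose illustration, not the paper's data structures: `Patch`, the callbacks and the toy numbers are all assumptions, and the real algorithm operates on hierarchical elements rather than flat records.

```python
from dataclasses import dataclass

@dataclass
class Patch:
    name: str
    B: float          # current radiosity estimate

def lazy_link_pass(pairs, eps_link, facing, unoccluded_ff, visible):
    """One pass over the still-unclassified polygon pairs (cf. Fig. 4).
    A pair becomes classified once it is linked, or once it is shown
    not to matter; dark pairs stay unclassified and are retried later."""
    links, still_open = [], []
    for p, q in pairs:
        if not facing(p, q):
            continue                      # classified: can never interact
        if p.B <= eps_link and q.B <= eps_link:
            still_open.append((p, q))     # too dark for now: retry later
            continue
        F = unoccluded_ff(p, q)           # no visibility test yet
        if p.B * F > eps_link or q.B * F > eps_link:
            if visible(p, q):             # visibility is paid only here
                links.append((p, q, F))
        # in either case the pair is now classified
    return links, still_open
```

The point of the ordering is that the expensive `visible` callback runs only for pairs whose unoccluded transfer already exceeds ε_link.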
Since radiosity is mainly used for its ability to model light interreflection, it is important to maintain energy consistency when modifying the algorithm. An issue raised by the lazy linking strategy is that "missing" links, those that have not been created because they were deemed insignificant, do not participate in energy transfers. Thus each gather step only propagates some of the energy radiated by surfaces.
If the corresponding energy is simply ignored, the main result is that the overall level of illumination is reduced. However, a more disturbing effect can result for surfaces that have very few (or none) of their links actually established: these surfaces will appear very dark, because they will receive energy only from the few surfaces that are linked with them.
The solution we advocate in closed scenes is the use of an ambient term similar to the one proposed for progressive refinement radiosity [1]. However, the distribution of this ambient term to surfaces must be based on the estimated fraction of their interaction with the world that is missing from the current set of links. The sum of the form factors associated with all links leaving a surface gives an estimate of the fraction of this surface's interactions that is actually represented. Thus, in a closed scene, its complement to one represents the missing fraction. Using this estimate to weight the distribution of the ambient energy, the underestimation of radiosities can be partially corrected: surfaces that have no links will use the entire ambient term, whereas surfaces with many links will be only marginally affected.
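This weighting translates directly into a small helper; the function name and the flat-list representation are illustrative only.

```python
def apply_ambient(radiosity, link_ff_sums, ambient):
    """Add the ambient term weighted by each surface's missing fraction
    of interactions: one minus the sum of form factors on its established
    links (clamped, since approximate form factors can sum above one)."""
    return [B + max(0.0, 1.0 - F) * ambient
            for B, F in zip(radiosity, link_ff_sums)]
```

A surface with no links (form-factor sum 0) receives the full ambient term; a fully linked surface (sum 1) is left untouched, as described above.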
However, since form factors are based on approximate formulas, the sum of all form factors can differ from one, even for a normally linked surface. This comes from our BF refinement strategy: we accept that the form factor on a link between two dark surfaces may be over-estimated or under-estimated. This may result in energy loss, or creation. If the error introduced by not linking some surfaces is of the same order as, or smaller than, the one due to our lack of precision on the form factor estimation, using the ambient term will not suffice to correct the energy imbalance.
To quantify the influence of those errors on the overall balance of energy, we compute the following estimate of the incorrect energy:

\[ E_{ET} = \sum_p |1 - F_p| \, B_p A_p \tag{1} \]

⁶ The storage cost for the classified bit represents 62 kb for a thousand surfaces, 25 Mb for twenty thousand surfaces.
34
Progressive Re nement Strategy for Hierarchical Radiosity 9
where $A_p$ is the area of polygon p, $B_p$ its radiosity and $F_p$ the sum of the form factors on all links leaving p. This can be compared to the total energy present in the scene:

\[ E_T = \sum_p B_p A_p \tag{2} \]
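Equations (1) and (2) translate directly into a small helper; the names are illustrative.

```python
def energy_error_ratio(areas, radiosities, ff_sums):
    """Equations (1) and (2): estimate of incorrectly balanced energy,
    relative to the total energy present in the scene."""
    e_et = sum(abs(1.0 - f) * b * a
               for a, b, f in zip(areas, radiosities, ff_sums))
    e_t = sum(b * a for a, b in zip(areas, radiosities))
    return e_et / e_t
```

A perfectly linked scene (all form-factor sums equal to one) gives a ratio of zero; the ratio grows as links are skipped or form factors are mis-estimated.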
[Fig. 5: ratio $E_{ET}/E_T$ (%) against iterations 1–10, for the Full Office and Dining Room scenes, with both the original and the lazy algorithms.]
Figure 5 shows a plot of the ratio $E_{ET}/E_T$ for the Dining Room scene and the Full Office, for both the original algorithm and our algorithm. Note that the error can be significant, but is mainly due to the original algorithm.
\[ F_{q,p} = \sum_i F_{q,p_i} \tag{5} \]
These relations only concern the exact values of the form factors. However, they can be used to compare the new form factor estimates with the old ones, and determine a posteriori whether refinement was actually required. If the sum of the $F_{q,p_i}$ is close to the old $F_{q,p}$, and they are not very different from one another, little precision was gained by refining p. Moreover, if $F_{p,q}$ is close to the average of the $F_{p_i,q}$, and the $F_{p_i,q}$ are not too different from one another, then the refinement process did not introduce any additional information. In this case we force p and q to interact at the current level, since the current estimates of the form factors are accurate enough.
In our implementation we only allow reduction of links in situations of full visibility between surfaces. We compute the relative variation of the children form factors, which we test against a new threshold ε_reduce. We also check that the difference between the old form factor $F_{q,p}$ and the sum of the $F_{q,p_i}$, and the difference between $F_{p,q}$ and the average of the $F_{p_i,q}$, are both smaller than ε_reduce.
If we note $F_{u,v}$ our current estimate of the form factor between two patches u and v, and assuming we want to refine a patch p into patches $p_i$, we note:

\[ F^{min}_{p,q} = \min_i (F_{p_i,q}), \qquad F^{min}_{q,p} = \min_i (F_{q,p_i}) \]
\[ F'_{p,q} = \sum_i \frac{A_{p_i}}{A_p} F_{p_i,q}, \qquad F'_{q,p} = \sum_i F_{q,p_i} \]

and the refinement is kept only if

\[ \frac{|F'_{p,q} - F^{min}_{p,q}|}{F'_{p,q}} > \varepsilon_{reduce} \quad \text{or} \quad \frac{|F'_{q,p} - F^{min}_{q,p}|}{F'_{q,p}} > \varepsilon_{reduce} \]
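Since the criterion above suffers from extraction damage, the following is only one plausible reading of the test, sketched for the full-visibility case; the names and the exact normalisation of the variation measure are assumptions.

```python
def refinement_was_useful(Fpq_children, child_areas, A_p, Fqp_children,
                          eps_reduce):
    """Plausible a-posteriori reduction test: keep the refinement of p
    into children p_i only if the children form factors vary significantly
    relative to the combined estimates (full visibility assumed)."""
    # Recombine the children into top-level estimates (cf. eq. 5 and its
    # area-weighted counterpart for the other direction).
    F_pq = sum(a / A_p * f for f, a in zip(Fpq_children, child_areas))
    F_qp = sum(Fqp_children)
    n = len(Fqp_children)
    # Relative spread of the children around the combined estimates.
    var_p = (F_pq - min(Fpq_children)) / F_pq
    var_q = (F_qp - n * min(Fqp_children)) / F_qp
    return var_p > eps_reduce or var_q > eps_reduce
```

When the children carry essentially the same form factor, both variations vanish and the link can safely interact at the current level.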
5 Results
5.1 Lazy Linking
Figure 3 in the colour section shows the same scene as in Fig. 1, computed using the lazy linking strategy of Sect. 3. Note that it is visually indistinguishable from its original counterpart. Figure 4 plots the absolute value of the difference between these two images.
5.2 Reduction of the Number of Links
To measure the performance of the reduction criterion, we computed the ratio of the number of quadtree nodes (surface elements) obtained with this criterion to the number of nodes obtained with the original algorithm. The graph in Fig. 6a plots this ratio against the number of iterations. Note that an overall reduction by nearly a factor of two is achieved for all scenes. Figure 6b shows a similar ratio for the number of links. This global reduction of the number of objects involved leads to a similar reduction of the memory needed by the algorithm, thus making it more practical for scenes with more polygons.
[Fig. 6: (a) percentage of nodes and (b) percentage of links relative to the original algorithm, against iterations 1–10, for the East Room, West Room, Dining Room, Hevea and Full Office scenes.]
Figure 7 shows the ratio of the computation times using the improved criterion and the original algorithm. The reduction of the number of links has a dramatic impact on running times, with speedups of more than 50%.
Figures 5 and 6 in the colour section show the image obtained after link reduction. Note the variation in the mesh on the walls, and the similarity of the shaded image with the ones in Figs. 1 and 3. Figure 7 plots the absolute value of the difference between the image produced by the original algorithm and the image obtained after link reduction. Note that part of the differences are due to the lazy linking strategy of Sect. 3; Figure 8 therefore shows the difference between lazy linking and reduction of the number of links.
5.3 Overall Performance Gains
Timing results are presented in Table 3. As expected, a significant speedup is achieved, particularly for complex scenes. For all scenes, ten iterations with lazy linking took less time to compute than the first iteration alone with the original algorithm. Finally, using lazy linking and reduction produces a useful image in a matter of minutes, even for the most complex scenes in our set.

[Fig. 7: ratio of computation times (%) against iterations 1–10, for the Dining Room, West Room, East Room, Hevea and Full Office scenes.]
Table 3. Time needed for ten iterations (and time for producing the first image).
An improved subdivision criterion was introduced for situations of full visibility between surfaces, which allows a significant reduction of the number of links.
Future work will include the simplification of the hierarchical structure due to multiple sources and subsequent iterations. A surface that has been greatly refined because it receives a shadow from a given light source can be fully illuminated by a second source, so that the shadow becomes washed out in light.
Better error bounds, both on form factor magnitude and global energy transfers, should allow even greater reduction of the number of links. Accurate visibility algorithms can be used to this end, by providing exact visibility information between pairs of surfaces.
7 Acknowledgments
George Drettakis is a post-doc hosted by INRIA and supported by an ERCIM
fellowship. The hierarchical radiosity software was built on top of the original
program kindly provided by Pat Hanrahan.
References
1. Cohen, M. F., Chen, S. E., Wallace, J. R., Greenberg, D. P.: A Progressive Refinement Approach to Fast Radiosity Image Generation. SIGGRAPH (1988) 75–84
2. Drettakis, G., Fiume, E.: Accurate and Consistent Reconstruction of Illumination Functions Using Structured Sampling. Computer Graphics Forum (Eurographics 1993 Conf. Issue) 273–284
3. Goral, C. M., Torrance, K. E., Greenberg, D. P., Battaile, B.: Modeling the Interaction of Light Between Diffuse Surfaces. SIGGRAPH (1984) 213–222
4. Hanrahan, P. M., Salzman, D.: A Rapid Hierarchical Radiosity Algorithm for Unoccluded Environments. Eurographics Workshop on Photosimulation, Realism and Physics in Computer Graphics, June 1990
5. Hanrahan, P. M., Salzman, D., Aupperle, L.: A Rapid Hierarchical Radiosity Algorithm. SIGGRAPH (1991) 197–206
6. Lischinski, D., Tampieri, F., Greenberg, D. P.: Combining Hierarchical Radiosity and Discontinuity Meshing. SIGGRAPH (1993)
7. Sillion, F.: Clustering and Volume Scattering for Hierarchical Radiosity Calculations. Fifth Eurographics Workshop on Rendering, Darmstadt, June 1994 (in these proceedings)
8. Smits, B. E., Arvo, J. R., Salesin, D. H.: An Importance-Driven Radiosity Algorithm. SIGGRAPH (1992) 273–282
9. Teller, S. J., Hanrahan, P. M.: Global Visibility Algorithms for Illumination Computations. SIGGRAPH (1993) 239–246
Chapter 2. Multi-Scale Modelling of Illumination
1 Introduction
Wavelet radiosity [12] is one of the most interesting techniques for global illumination simulation. Recent research [7] has shown that higher order multi-wavelets (M2 and M3) provide a very powerful tool for radiosity computations. Multi-wavelets can approximate the radiosity function efficiently with a small number of coefficients. As a consequence, they give a solution of better quality in a shorter time.
Multi-wavelets are defined only on parallelograms and triangles. This causes problems for radiosity computations on scenes coming from real-world applications, such as architectural scenes or CAD scenes. In such scenes, planar surfaces can have a fairly complicated shape (see figures 1 and 12(a)). To do wavelet radiosity computations on such scenes, we have to tessellate these planar shapes into triangles and parallelograms, which results in a large number of input primitives (see figure 1(b)). Furthermore, this decomposition is purely geometric and not based on the illumination, yet it influences our approximation of the radiosity function. In some cases, this geometric decomposition results in a poor illumination solution (see figures 2(a) and 11(a)).
In the present paper, we separate the radiosity function from the surface geometry. This enables us to exploit the strong approximating power of multi-wavelets for radiosity computations with planar surfaces of arbitrary shape, including concave contours, contours with holes, and disjoint contours. Our algorithm results in a better approximation of the radiosity function (see figures 2(b) and 11(b)) with a smaller number of input primitives, faster convergence and lower memory costs.
¹ INRIA Lorraine.
² Institut National Polytechnique de Lorraine.
³ UMR no 7503 LORIA, a joint research laboratory between CNRS, Institut National Polytechnique de Lorraine, INRIA, Université Henri Poincaré and Université Nancy 2.
Fig. 2. Wavelet radiosity on arbitrary planar surfaces (see also figure 11): (a) detail of figure 12(a); (b) tessellated: 32 triangles; (c) our algorithm: 1 surface.
Our algorithm extends the radiosity function defined on the original shape onto a
simpler domain, better behaved for hierarchical refinement and wavelet computations.
This extension of the radiosity function is defined to be easily and efficiently approxi-
mated by multi-wavelets. The wavelet radiosity algorithm is modified to work with this
abstract representation of the radiosity function.
Our paper is organised as follows: in section 2, we review previous work on radiosity with planar surfaces of complicated shape. Section 3 is a detailed explanation of our algorithm and of the modifications we made to the wavelet radiosity algorithm. Section 4 presents the experiments we have conducted with our algorithm on different test scenes. Finally, section 5 presents our conclusions.
2 Previous work
The wavelet radiosity method was introduced by [12]. It is an extension of the radiosity method [11] and especially of the hierarchical radiosity method [13]. It allows the use of higher order basis functions in hierarchical radiosity.
In theory, higher order wavelets are a very powerful tool to approximate rapidly varying functions with few coefficients. In practice, they have several drawbacks, especially in terms of memory costs. In the early implementations of wavelet bases in the radiosity algorithm, these negative points outweighed the theoretical advantages [19]. Recent research [7] has shown that with new implementation methods [2, 3, 7, 18, 21] we can actually exploit the power of higher order wavelets, and that their positive points now largely outweigh the practical problems. They provide a better approximation of the radiosity function, with a small number of coefficients, resulting in faster convergence and smaller memory costs.
On the other hand, higher order wavelets, and especially multi-wavelets (M2 and M3), are defined as the tensor products of one-dimensional wavelets. As a consequence, they are defined over a square. The definition can easily be extended to parallelograms and triangles, but higher order wavelets are not designed to describe the radiosity function over complex surfaces.
Such complex surfaces can occur in the scenes on which we do global illumination simulations. In particular, scenes constructed using CAD tools such as CSG geometry or extrusion frequently contain complex planar surfaces, with curved boundaries or holes in them.
The simplest solution for doing radiosity computations on such surfaces is to tessellate them into triangles, and to do radiosity computations on the result of the tessellation. This method has several negative consequences on the radiosity algorithm:
– It increases the number of input surfaces, and the algorithmic complexity of the radiosity algorithm is linked to the square of the number of input surfaces.
– The tessellation is made before the radiosity computations and it influences these computations. It can prevent us from reaching a good illumination solution.
– The tessellation does not allow a hierarchical treatment over the original surface, only over each triangle created by the tessellation. We cannot fully exploit the capabilities of hierarchical radiosity, and especially of wavelet radiosity.
– By artificially subdividing an input surface into several smaller surfaces, we are creating discontinuities. These discontinuities will have to be treated at some point in the algorithm.
– Tessellation can create poorly shaped triangles (see figure 1(b)), or slivers. These slivers can cause Z-buffer artifacts when we visualise the radiosity solution, and are harder to detect in visibility tests (e.g. ray-casting).
Some of these problems can be removed by using clustering [10, 16, 17]. In clustering, neighbouring patches are grouped together into a cluster. The cluster receives radiosity and distributes it to the patches that it contains. On the other hand, current clustering strategies behave poorly in scenes with many small patches located close to each other [14]. It would probably be more efficient to apply clustering to the original planar surfaces instead of applying it to the result of the tessellation.
A better grouping strategy is face-clustering [20]. In face-clustering, neighbouring patches are grouped together according to their coplanarity. Yet even face-clustering depends on the geometry created by the tessellation. Furthermore, it would not allow us to exploit the strong approximating power of multi-wavelets.
If the original planar shape is polygonal:
– compute its convex hull (in linear time) using the chain of points [9];
– compute the minimal enclosing parallelogram of the convex hull (in linear time) using Schwarz et al. [15];
– if the previous algorithm gives several enclosing parallelograms, select the one whose angles are closest to π/2.
If the original shape is a curve, or contains curves:
– approximate the curve by a polygon;
– compute the enclosing parallelogram of the polygon;
– compute the extrema of the curve in the directions of the parallelogram;
– if needed, extend the parallelogram to include these extrema.
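The convex-hull step can be sketched with Andrew's monotone chain. Note that this is an illustration, not the paper's method: the paper relies on a linear-time chain algorithm [9] over the polygon's ordered vertices, and on Schwarz et al. [15] for the minimal enclosing parallelogram, neither of which is reproduced here.

```python
def convex_hull(points):
    """Andrew's monotone chain, returning the hull counter-clockwise.
    The generic version below sorts first, giving O(n log n)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o): > 0 for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for chain, seq in ((lower, pts), (upper, reversed(pts))):
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
    return lower[:-1] + upper[:-1]   # drop duplicated endpoints
```

The enclosing-parallelogram search then only has to consider hulls, which is what makes the linear-time bound of [15] possible.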
Bouatouch et al. [5] designed a method for discontinuity meshing and multi-wavelets. In effect, they are doing multi-wavelet computations over a non-square domain. However, their algorithm requires several expensive computations of push-pull coefficients. Our algorithm avoids these computations.
Baum et al. [1] designed a method for radiosity computations with arbitrary planar polygons, including polygons with holes. Their method ensures that the triangles produced are well-shaped and suited for radiosity computations. Since it is designed for non-hierarchical radiosity, it is done in a preliminary step, before all radiosity computations. Our method, designed for wavelet radiosity, acts during the global illumination simulation, and adapts the refinement to the radiosity.
The radiosity function

[Figure: (a) trapezoidal map for the surface in figure 1(c); (b) using the trapezoidal map for visibility queries.]
The radiosity function on the extended domain is then defined as the radiosity func-
tion, as computed by the wavelet radiosity algorithm, using this extended visibility
function in the radiosity kernel.
Visibility. For all the visibility computations, only the original planar surface can act
as an occluder. The extended domain is never used in visibility computations.
Fig. 6. The weights of the quadrature points can be seen as the area of a zone of influence.
Emission. During reception, the entire extended domain has received illumination. The radiosity received over parts of the extended domain that are not included in the original surface does not exist in reality, and it should not be sent back into the scene. Otherwise, there would be an artificial creation of energy, violating the principle of conservation of energy.
Because of the hierarchical nature of the wavelet radiosity algorithm, it would be
difficult to compute the exact part of this radiosity function that really exists. Instead,
we act on the weights of the quadrature points.
In the wavelet radiosity algorithm, all the transfer coefficients between an emitter
and a receiver are computed using quadratures. Quadratures allow the evaluation of
a complex integral by sampling the function being integrated at the quadrature points,
and multiplying the values by quadrature weights. Most implementations use Legendre-
Gauss quadratures.
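As a reminder of this quadrature machinery, here is a Legendre-Gauss rule mapped to the unit interval; `gauss_01` is an illustrative helper, not part of the paper's implementation.

```python
import numpy as np

def gauss_01(n):
    """Legendre-Gauss nodes and weights mapped from [-1, 1] to [0, 1];
    the mapped weights are positive and sum to one."""
    t, w = np.polynomial.legendre.leggauss(n)
    return (t + 1.0) / 2.0, w / 2.0

# Sample the integrand at the nodes and weight the samples:
x, w = gauss_01(3)
integral = np.sum(w * x**2)   # approximates the integral of x^2 over [0, 1]
```

An n-point rule integrates polynomials up to degree 2n-1 exactly, which is why a handful of points suffices for the smooth radiosity kernel.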
Since the weights are positive and their sum is equal to 1, one can visualise them as the length of a zone of influence for the corresponding quadrature point (see figure 6(a) for the one-dimensional case). The same applies in two dimensions: the weights of the quadrature points can be seen as the area of a zone of influence (see figure 6(b)); the weight of quadrature point p_{i,j} is w_i w_j. Please note that these zones of influence are not equal to the Voronoi diagram of the quadrature points.

for each interaction s → r:
    for each quadrature point q_i on the emitter s
        A_i = area of influence of q_i
        α_i = percentage of A_i that is inside the original emitter
        q'_i = nearest point from q_i on the emitter
        for each quadrature point p_j on the receiver r
            p'_j = nearest point on the receiver
            V(q'_i, p'_j) = visibility between q'_i and p'_j
            G(q_i, p_j) = radiosity kernel between q_i and p_j
            B_r += α_i w_i w_j B_s(q_i) V(q'_i, p'_j) G(q_i, p_j)
        end for
    end for

Fig. 7. Pseudo-code for wavelet radiosity emission using the extended domain.
We suggest an extension to the Gaussian quadrature to take into account the fact that the extended domain is not entirely covered by the actual emitter: the weight of a quadrature point is multiplied by the proportion of its area of influence that is actually covered by the emitter. For example, in figure 6(c), the weight of the quadrature point in the hashed area would normally be w₁w₂. Since the fraction of its area of influence covered by the emitter is α, the weight used in the computation is α w₁w₂.
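A one-dimensional analogue of this α-weighting can be sketched as follows. Taking the zone of influence of node i to be a consecutive interval of length w_i is an assumption made for illustration, as are the helper names.

```python
import numpy as np

def gauss_01(n):
    """Legendre-Gauss nodes and weights mapped from [-1, 1] to [0, 1]."""
    t, w = np.polynomial.legendre.leggauss(n)
    return (t + 1.0) / 2.0, w / 2.0

def partial_quadrature(f, cover_end, n=4):
    """1D analogue of the modified emission weights: the emitter occupies
    only [0, cover_end] of the extended domain [0, 1]; each weight w_i is
    scaled by the fraction alpha_i of its zone of influence (an interval
    of length w_i) that the emitter actually covers."""
    x, w = gauss_01(n)
    bounds = np.concatenate(([0.0], np.cumsum(w)))  # zones partition [0, 1]
    total = 0.0
    for i in range(n):
        lo, hi = bounds[i], bounds[i + 1]
        alpha = max(0.0, min(hi, cover_end) - lo) / (hi - lo)
        total += alpha * w[i] * f(x[i])
    return total
```

As in the 2D case, zones that are entirely inside the emitter keep their full weight, zones entirely outside contribute nothing, and only the boundary zone is scaled.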
Our method allows for a quick treatment of low-precision interactions, and for high-precision interactions it tends toward the exact value. The more we refine an interaction, the more precision we get on the radiosity on the emitter. We also get the exact value if the zone of influence is entirely full or entirely empty.
In some cases, the quadrature point can fall outside the original emitter. We use these quadrature points anyway.
Figure 7 shows the pseudo-code for radiosity emission using the extended domain.
Refinement. As with the original wavelet radiosity algorithm, the extended domain
can be subdivided if the interaction needs to be subdivided. The refinement oracle
deals with the extended domain as it would deal with any other patch. Because of the
hierarchical representation of the radiosity function in the wavelet radiosity algorithm,
we must have the same precision on the radiosity function over the entire domain. The
push-pull step can make parts of the domain that are not inside the original surface
influence our representation of the radiosity function over the entire domain.
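The push-pull step mentioned above can be sketched for a constant (Haar-like) basis; multi-wavelet bases push and pull whole coefficient vectors instead, which is omitted here, and the dict-based nodes are purely illustrative.

```python
def push_pull(node, down=0.0):
    """One push-pull sweep over a hierarchy of patches (constant basis):
    radiosity gathered at a node is pushed down to its children, and the
    parent's value is pulled back up as the area-weighted average of the
    children's values."""
    if not node["children"]:
        node["B"] += down          # leaf: accumulate everything pushed down
        return node["B"]
    pushed = down + node["B"]      # gathered radiosity goes to the children
    weighted = sum(push_pull(c, pushed) * c["area"]
                   for c in node["children"])
    node["B"] = weighted / sum(c["area"] for c in node["children"])
    return node["B"]
```

Because the pull averages over the whole subdivided domain, children lying outside the original surface still contribute to the parent's representation, which is exactly the effect described in the paragraph above.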
Table 1. Number of input surfaces before and after tessellation.

Name        # initial surfaces   after tessellation   ratio
Opera            17272                32429            1.88
Temple            7778                11087            1.43
Soda Hall       145454               201098            1.38
[Fig. 9: comparison between our algorithm and the tessellated version for the Opera, Temple and Soda Hall scenes.]
If the extended domain is refined, we deal with each part of the subdivided extended domain as we would deal with the original extended domain. Two special cases can appear (see figure 8):
– If the result of the subdivision does not intersect the original planar surface at all, it is empty. It therefore cannot play a role in the emission of radiosity, but we keep computing the radiosity function over this patch.
– If the result of the subdivision is totally included inside the original planar surface, we are back to the standard wavelet radiosity algorithm on parallelograms.
4 Experiments
We have tested our algorithm for wavelet radiosity on arbitrary planar surfaces on various test scenes (see figure 12 for images of our test scenes, and table 1 and figure 9(a) for their description). We were interested in a comparison between our algorithm and the standard wavelet radiosity algorithm, acting on parallelograms and triangles. All the computations were conducted on the same computer, an SGI Origin 2000, using a parallel version [6] of our wavelet radiosity algorithm [7, 21].
In all these test scenes, the number of surfaces after tessellation is less than twice the number of surfaces in the original scene, much less than what could be expected from figure 1: most of the initial surfaces in the scenes are parallelograms or triangles, and don't require tessellation.
The first result is that our algorithm gives better visual quality than doing wavelet radiosity computations on a tessellated surface (see figures 2 and 11). Our separation of the radiosity function from the surface geometry results in a better approximation of the radiosity function.
Fig. 10. Convergence rate (un-shot energy over initial energy) as a function of computation time
(in seconds).
computation times (see figure 10). In our experiments, we measure the energy initially present in the scene and the energy that has not yet been propagated in the scene. The ratio of these two measures tells us how far we are from complete convergence. Figure 10 displays this ratio as a function of the computation time, both for our algorithm and for the wavelet radiosity algorithm operating on a tessellated version of the scene. Our algorithm ensures faster convergence on all our test scenes. The speedup is about 30%, which shows that acting on the original planar surface instead of the tessellated surface gives more efficient refinement.
5 Conclusion
In conclusion, we have presented a method to separate the radiosity function from the surface geometry. This method removes the need to tessellate complex planar surfaces, resulting in a more efficient global illumination simulation with better visual quality. Our method results in faster convergence, with smaller memory costs.
In our future work, we want to extend this algorithm to discontinuity meshing. Discontinuity meshing introduces a geometric model of the discontinuities of the radiosity function and its derivatives, the discontinuity mesh. The discontinuity mesh provides optimal meshing for radiosity computations near the discontinuities. It is, however, a complicated structure, and it can influence radiosity computations away from the discontinuities, for example because of triangulation. We want to use our algorithm to smoothly integrate the discontinuity mesh into the natural subdivision for multi-wavelet radiosity, removing the need to tessellate the discontinuity mesh.
We also want to explore a combination of our algorithm with clustering techniques. First, our algorithm could be used to group neighbouring coplanar patches together in a natural way. This would help the clustering strategy [14] and give a more accurate result. Second, we would like to integrate our algorithm with face-clustering, bringing multi-wavelets into face-clusters.
Finally, our separation of the radiosity function from the surface geometry could also be used to compute radiosity using multi-wavelets on curved surfaces. There are several parametric surfaces for which the limits of the parametric space are not square. We suggest using our algorithm to enclose these limits within a square, making multi-wavelet computations easier on such surfaces.
6 Acknowledgements
Permission to use the Soda Hall model⁴ was kindly given by Prof. Carlo Séquin.
Jean-Claude Paul has started and motivated all this research. The authors would like
to thank him for his kind direction, support and encouragements.
References
1. D. R. Baum, S. Mann, K. P. Smith, and J. M. Winget. Making Radiosity Usable: Automatic
Preprocessing and Meshing Techniques for the Generation of Accurate Radiosity Solutions.
Computer Graphics (ACM SIGGRAPH ’91 Proceedings), 25(4):51–60, July 1991.
2. P. Bekaert and Y. Willems. Error Control for Radiosity. In Rendering Techniques ’96 (Pro-
ceedings of the Seventh Eurographics Workshop on Rendering), pages 153–164, New York,
NY, 1996. Springer-Verlag/Wien.
⁴ The Soda Hall model is available on the web, at http://www.cs.berkeley.edu/~kofler.
3. P. Bekaert and Y. D. Willems. Hirad: A Hierarchical Higher Order Radiosity Implementa-
tion. In Proceedings of the Twelfth Spring Conference on Computer Graphics (SCCG ’96),
Bratislava, Slovakia, June 1996. Comenius University Press.
4. J.-D. Boissonnat and M. Yvinec. Algorithmic Geometry. Cambridge University Press, 1998.
5. K. Bouatouch and S. N. Pattanaik. Discontinuity Meshing and Hierarchical Multiwavelet
Radiosity. In W. A. Davis and P. Prusinkiewicz, editors, Proceedings of Graphics Interface
’95, pages 109–115, San Francisco, CA, May 1995. Morgan Kaufmann.
6. X. Cavin, L. Alonso, and J.-C. Paul. Parallel Wavelet Radiosity. In Second Eurographics
Workshop on Parallel Graphics and Visualisation, pages 61–75, Rennes, France, Sept. 1998.
7. F. Cuny, L. Alonso, and N. Holzschuch. A novel approach makes higher order wavelets
really efficient for radiosity. Computer Graphics Forum (Eurographics 2000 Proceedings),
19(3), Sept. 2000. To appear. Available from
http://www.loria.fr/~holzschu/Publications/paper20.pdf.
8. O. Devillers, M. Teillaud, and M. Yvinec. Dynamic location in an arrangement of line
segments in the plane. Algorithms Review, 2(3):89–103, 1992.
9. H. Edelsbrunner. Algorithms in Combinatorial Geometry, volume 10 of EATCS Monographs
on Theoretical Computer Science. Springer-Verlag, Nov. 1987.
10. S. Gibson and R. J. Hubbold. Efficient hierarchical refinement and clustering for radiosity in
complex environments. Computer Graphics Forum, 15(5):297–310, Dec. 1996.
11. C. M. Goral, K. E. Torrance, D. P. Greenberg, and B. Battaile. Modelling the Interaction of
Light Between Diffuse Surfaces. Computer Graphics (ACM SIGGRAPH ’84 Proceedings),
18(3):212–222, July 1984.
12. S. J. Gortler, P. Schroder, M. F. Cohen, and P. Hanrahan. Wavelet Radiosity. In Computer
Graphics Proceedings, Annual Conference Series, 1993 (ACM SIGGRAPH ’93 Proceed-
ings), pages 221–230, 1993.
13. P. Hanrahan, D. Salzman, and L. Aupperle. A Rapid Hierarchical Radiosity Algorithm.
Computer Graphics (ACM SIGGRAPH ’91 Proceedings), 25(4):197–206, July 1991.
14. J. M. Hasenfratz, C. Damez, F. Sillion, and G. Drettakis. A practical analysis of clustering
strategies for hierarchical radiosity. Computer Graphics Forum (Eurographics ’99 Proceed-
ings), 18(3):C–221–C–232, Sept. 1999.
15. C. Schwarz, J. Teich, A. Vainshtein, E. Welzl, and B. L. Evans. Minimal enclosing parallelo-
gram with application. In Proc. 11th Annu. ACM Sympos. Comput. Geom., pages C34–C35,
1995.
16. F. Sillion. A Unified Hierarchical Algorithm for Global Illumination with Scattering Volumes
and Object Clusters. IEEE Transactions on Visualization and Computer Graphics, 1(3), Sept.
1995.
17. B. Smits, J. Arvo, and D. Greenberg. A Clustering Algorithm for Radiosity in Complex
Environments. In Computer Graphics Proceedings, Annual Conference Series, 1994 (ACM
SIGGRAPH ’94 Proceedings), pages 435–442, 1994.
18. M. Stamminger, H. Schirmacher, P. Slusallek, and H.-P. Seidel. Getting rid of links in hierar-
chical radiosity. Computer Graphics Journal (Proc. Eurographics ’98), 17(3):C165–C174,
Sept. 1998.
19. A. Willmott and P. Heckbert. An empirical comparison of progressive and wavelet ra-
diosity. In J. Dorsey and P. Slusallek, editors, Rendering Techniques ’97 (Proceedings of
the Eighth Eurographics Workshop on Rendering), pages 175–186, New York, NY, 1997.
Springer Wien. ISBN 3-211-83001-4.
20. A. Willmott, P. Heckbert, and M. Garland. Face cluster radiosity. In Rendering Techniques
’99, pages 293–304, New York, NY, 1999. Springer Wien.
21. C. Winkler. Expérimentation d’algorithmes de calcul de radiosité à base d’ondelettes. Thèse
d’université, Institut National Polytechnique de Lorraine, 1998.
Fig. 11. Using our algorithm for wavelet radiosity on arbitrary planar surfaces (see also figure 2): (a) tessellated, (b) our algorithm.
CHAPTER 2. MULTI-SCALE MODELLING OF ILLUMINATION
2.7.4 A novel approach makes higher order wavelets really efficient for radiosity (EG
2000)
Authors: François Cuny, Laurent Alonso and Nicolas Holzschuch
Conference: Eurographics 2000, Interlaken, Switzerland. This article was also published in
Computer Graphics Forum, vol. 19, no. 3.
Date: September 2000
EUROGRAPHICS 2000 / M. Gross and F.R.A. Hopgood (Guest Editors), Volume 19 (2000), Number 3
Abstract
Since wavelets were introduced in the radiosity algorithm [5], surprisingly little research has been devoted to higher
order wavelets and their use in radiosity algorithms. A previous study [13] has shown that wavelet radiosity, and
especially higher order wavelet radiosity, brought no significant improvement over hierarchical radiosity, while
carrying a very large extra memory cost that prohibited any effective computation. In this paper, we present a new
implementation of wavelets in the radiosity algorithm, that is substantially different from previous implementations
in several key areas (refinement oracle, link storage, resolution algorithm). We show that, with this implementation,
higher order wavelets actually bring an improvement over standard hierarchical radiosity and lower order wavelets.
F. Cuny, L. Alonso and N. Holzschuch / A novel approach makes higher order wavelets really efficient for radiosity
experimental study [13] the practical problems of higher order wavelets were largely overcoming their theoretical benefits.

However, these practical problems are not inherent to higher order wavelets themselves, only to their implementation in the radiosity method. In this paper, we present a new approach to higher order wavelets that is substantially different from previous implementations in several key areas, such as the refinement oracle, link storage and the resolution algorithm. Our approach has been developed by taking a complete look at higher order wavelets and at the way they should integrate with the radiosity method. With this implementation, we show that the theoretical advantages of higher order wavelets overcome the practical problems that have been encountered before. Higher order wavelets now provide a better approximation of the radiosity function, with faster convergence to the solution. They also require less memory for storage.

Our paper is organised as follows: in section 2, we review previous research on wavelet radiosity and higher order wavelets. Then, in section 3, we present our implementation, concentrating on the areas where it is substantially different from previous implementations: the refinement oracle, not storing the interactions, and the consequences this has on the resolution algorithm.

The main result we present in this paper is the experimental study we have conducted on higher order wavelets with our implementation. Section 4 is devoted to this experimentation and its results, namely that higher order wavelets provide faster convergence, a solution of better quality, and require less memory for their computations. Finally, section 5 presents our conclusions and future areas of research.

2. Previous work

In this section we review the basis of the wavelet radiosity algorithm (section 2.1), then we present the implementation details of previous implementations for key areas of the algorithm (section 2.2): the refinement oracle, the visibility estimation and the memory problem. This review will help for the presentation of our own implementation of these areas, in section 3.

2.1. The wavelet radiosity algorithm

In the radiosity method, we try to solve the global illumination equation, restricted to diffuse surfaces with no participating media:

$$B(x) = E(x) + \rho(x) \int_S B(y)\, K(x, y)\, \mathrm{d}y \qquad (1)$$

Eq. 1 expresses the fact that the radiosity at a given point x in the scene, B(x), is equal to the radiosity emitted by x alone, E(x), plus the radiosity reflected by x, coming from all the other objects in the scene. K(x, y) is the kernel of the equation, and expresses the part of the radiosity emitted by point y that reaches x.

To compute the radiosity function, we use finite element methods. The function we want to compute, B(x), is first projected onto a finite set of basis functions φ_i:

$$\tilde{B}(x) = \sum_i \alpha_i \phi_i(x) \qquad (2)$$

Our goal is to compute the best approximation of the radiosity function, given the set of basis functions φ_i. We must also find the optimal set of basis functions. A possibility is to use wavelets. Wavelets are mathematical functions that provide a multi-resolution analysis. They allow a multi-scale representation of the radiosity function on every object. This multi-scale representation can be used in the resolution algorithm [6, 5], allowing us to switch between different representations of the radiosity function, depending on the degree of precision required. This multi-scale resolution results in a great reduction of the complexity of the algorithm [6].

There are two broad classes of resolution algorithm: gathering and shooting. In gathering, each patch updates its own radiosity function using the energy sent by all the other patches, whereas in shooting each patch sends energy into the scene, and all the other patches update their own radiosity. In both cases, the energy is carried along links, which are established by the wavelet radiosity algorithm and used to store the information related to the interaction. A key element of the wavelet radiosity algorithm is the refinement oracle, which tells which levels of the different multi-scale representations of radiosity should interact.

Finally, before each energy propagation, we must update the multi-scale representation of radiosity, so that each level contains a representation of all the energy that has been received by the object at all the other levels. This is done during the push-pull phase.

2.2. Details of previous implementations

2.2.1. Refinement oracles

The refinement oracle is one of the most important parts of hierarchical radiosity algorithms. Since it tells at which level the interaction should be established, it has a strong influence on both the quality of the radiosity solution and the time spent doing the computations. A poor refinement oracle will give poor results, or will spend a lot of time doing unnecessary computations.

In theory, the decision whether or not to refine a given interaction could only be taken with the full knowledge of the complete solution. However, the refinement oracle must take the decision using only the information that is locally available: the energy to be sent, and the geometric configuration of the sender and the receiver.
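As a minimal illustration of the gathering strategy described in section 2.1, the discretised system of eqs. (1) and (2) can be sketched as a flat, non-hierarchical Jacobi-style iteration; the two-patch scene, kernel matrix and reflectances below are hypothetical, not taken from the paper:

```python
import numpy as np

def gather_radiosity(E, rho, K, iterations=50):
    """Jacobi-style gathering: iterate B = E + diag(rho) . K . B.

    E   -- emitted radiosity per patch
    rho -- diffuse reflectance per patch
    K   -- kernel matrix; K[i, j] is the fraction of patch j's
           radiosity that reaches patch i
    """
    B = E.copy()
    for _ in range(iterations):
        B = E + rho * (K @ B)
    return B

# Hypothetical 2-patch scene: a light source (patch 0) and a
# diffuse receiver (patch 1), exchanging 20 % of their radiosity.
E = np.array([1.0, 0.0])
rho = np.array([0.0, 0.5])
K = np.array([[0.0, 0.2],
              [0.2, 0.0]])
B = gather_radiosity(E, rho, K)  # B[1] = 0.5 * 0.2 * 1.0 = 0.1
```

In shooting, the loop would instead propagate each patch's unshot energy to all receivers; both variants converge to the same fixed point of eq. (1).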
Given two patches in the scene, let us consider their interaction: patch s, with its current approximation of the radiosity function B̃_s(y), is sending light toward patch r. Using a combination of eq. 1 and eq. 2, we can express the contribution of patch s to the radiosity of patch r:

$$B_{s \to r}(x) = \rho \sum_i \alpha_i \int_s \phi_i(y)\, K(x, y)\, \mathrm{d}y \qquad (3)$$

For the interaction between the two patches we will use the relationship coefficients, C_ij:

$$B_{s \to r}(x) = \sum_j \beta_j \phi_j(x)$$
$$\beta_j = \int_r B_{s \to r}(x)\, \phi_j(x)\, \mathrm{d}x$$
$$\beta_j = \rho \sum_i \alpha_i \int_r \int_s \phi_i(y)\, \phi_j(x)\, K(x, y)\, \mathrm{d}y\, \mathrm{d}x$$
$$\beta_j = \rho \sum_i \alpha_i C_{ij}$$

These C_ij coefficients express the relationship between the basis functions φ_j(x) and φ_i(y). Computing the C_ij requires the computation of a complex integral, which cannot be computed analytically and must be approximated, usually using quadratures.

In most current implementations, the refinement oracle estimates the error on this approximation of the C_ij. This error is then multiplied by the energy of the sender, to avoid refining interactions that are not carrying significant energy. There are several ways to estimate the error on the C_ij coefficients: pure heuristics [6], sampling the C_ij at several sample points [5], and a conservative method giving an upper bound on the propagation of the energy [10, 8].

A recurrent problem with current refinement oracles is that they concentrate on the C_ij coefficients. This provides a conservative analysis, but it can be too cautious, especially with higher order basis functions. The C_ij coefficients are usually bounded with constant functions, and hence so is the radiosity function. Such a bound does not take into account the capacity of higher order wavelets to model rapidly varying functions in a compact way. To take this into account, we need to move the radiosity function inside the refinement oracle. In section 3.1, we present a refinement oracle that addresses this problem.

2.2.2. Visibility estimations

Discontinuities of the radiosity function and its derivatives are only caused by changes in the visibility between objects [7]. Therefore, great care must be taken when adding visibility information to the radiosity algorithm.

As we have seen, we use a quadrature to compute the C_ij coefficients. This quadrature requires several estimates of the kernel function K(x, y), and therefore of the visibility between points x and y. Computing a visibility sample is much costlier than computing a kernel sample without visibility. As a consequence, estimating the visibility between two patches is the most costly operation in wavelet radiosity [9]. Several methods have been developed in order to provide a quick estimate of visibility, sometimes at the expense of reliability.

The easiest method [6, 5] assumes a constant visibility between the patches. The constant is equal to 1 for fully visible patches, 0 for fully invisible patches, and lies strictly between 0 and 1 for partially visible patches. It is estimated by computing several jittered visibility samples between the patches and averaging the results.

Another method computes exact visibility between the corners of the patches, and interpolates between these values for points located between the corners, using barycentric coordinates.

Shadow masks [16, 11] have also been used in wavelet radiosity computations. In theory, shadow masks allow the decoupling of visibility from radiosity transport, and therefore a better compression of the radiosity transport operator, thus reducing the memory cost.

All these methods attempt to approximate visibility by computing fewer visibility samples than kernel samples, in order to reduce the cost of visibility in wavelet radiosity. According to an experimental study of wavelet radiosity conducted by Willmott [13, 14], the result is a poor approximation of the radiosity function, especially near shadow boundaries.

Another method is to compute exactly one visibility sample for each kernel sample. It has been used at least by Gershbein [4], although it is not explicitly stated in his paper. According to our own experience, as well as Willmott's extended study [14], this method gives better visual results. Furthermore, it gives more numerical precision. On the other hand, it can introduce some artefacts, because the visibility samples are forced to be in a regular pattern.

In our implementation, we used one visibility sample for each kernel sample, because we were looking for numerical accuracy, and because the artefacts are removed by our refinement oracle.

2.2.3. Memory usage

Since the computation of the C_ij coefficients can be rather long, they are usually stored once they have been computed, so that they can be reused. The storage is done on the link between s and r.

An important problem with previous wavelet radiosity implementations is the memory required for this storage. If we use wavelet bases of order m, then we have m one-dimensional functions in the wavelet base. For two dimensions, such as the surface of objects in our virtual scene, we have m^2 functions in the base. As a consequence, storing the interaction between two patches requires computing and storing m^4 C_ij coefficients.
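The C_ij coefficients above can be approximated with a tensor-product Gauss-Legendre quadrature. A minimal sketch for 1D "patches" parametrised on [0, 1], with a placeholder kernel (the paper's actual kernel and patch parametrisations are more involved, and this is not the paper's implementation):

```python
import numpy as np
from numpy.polynomial import Legendre
from numpy.polynomial.legendre import leggauss

def smooth_basis(i, t):
    """i-th Legendre polynomial rescaled to [0, 1]; the M_m smoothing
    functions are tensor products of these."""
    return Legendre.basis(i)(2.0 * t - 1.0)

def coupling_coefficients(kernel, m, samples=8):
    """C[i, j] ~= integral over r x s of phi_i(y) phi_j(x) K(x, y) dy dx,
    with both patches parametrised on [0, 1], by Gauss-Legendre quadrature."""
    t, w = leggauss(samples)
    t = 0.5 * (t + 1.0)   # quadrature nodes mapped from [-1, 1] to [0, 1]
    w = 0.5 * w           # weights rescaled accordingly
    C = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            C[i, j] = sum(wx * wy * smooth_basis(i, y) * smooth_basis(j, x) * kernel(x, y)
                          for x, wx in zip(t, w) for y, wy in zip(t, w))
    return C

# With a constant kernel, only the constant/constant basis pair couples:
C = coupling_coefficients(lambda x, y: 1.0, m=3)
# For 2D patches the count becomes m^4 per link: 1, 16 and 81
# coefficients for M1 (Haar), M2 and M3 respectively.
```

Each kernel evaluation inside the double sum is where a visibility sample would be taken, which is why the quadrature dominates the cost of the method.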
Hence, the memory usage of wavelet radiosity grows with the fourth power of the order of the wavelet base used. Wavelets of order 3 will have a memory usage almost two orders of magnitude higher than wavelets with 1 vanishing moment. In an experimental study of wavelet radiosity, Willmott [13] showed that this memory usage was effectively prohibiting any serious computation with higher order wavelets.

In 1998, Stamminger [12] showed that it was possible to completely eliminate the storage of the interactions in hierarchical radiosity. His study was only made for hierarchical radiosity, but it could be extended to wavelet radiosity, and it would remove the worst problem of radiosity with higher order wavelets. In section 3.2, we review the consequences of not storing links on the wavelet radiosity algorithm.

    for each interaction s → r:
        compute the radiosity function on the receiver: B_{s→r}(x)
        for each control point P_i:
            compute the radiosity at this control point directly: B_{s→P_i}
            compare with the interpolated value, store the difference:
                δ_i = |B_{s→r}(P_i) − B_{s→P_i}|
        end for
        take the L_n norm of the differences: δ_B = ‖δ_i‖_n
        compare with the refinement threshold
    end for

Figure 1: Our refinement oracle
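The oracle of Figure 1 can be turned into a small runnable sketch; the radiosity evaluations below (`projected`, `direct`) are hypothetical stand-ins for the paper's actual wavelet reconstruction and direct computation:

```python
import numpy as np

def needs_refinement(Bs_r, B_direct, control_points, threshold, n=2):
    """Figure-1-style oracle for one interaction s -> r: compare the
    projected radiosity with direct evaluations at control points and
    refine when the L_n norm of the differences exceeds the threshold."""
    deltas = np.array([abs(Bs_r(p) - B_direct(p)) for p in control_points])
    return np.linalg.norm(deltas, ord=n) > threshold

# Hypothetical 1D example: a linear projection of a quadratically
# varying received radiosity, checked at five control points.
projected = lambda x: x          # linear approximation on the receiver
direct = lambda x: x * x         # direct evaluation of B_{s->P_i}
points = [0.0, 0.25, 0.5, 0.75, 1.0]
refine = needs_refinement(projected, direct, points, threshold=0.1)
```

Because the comparison is made on the reconstructed radiosity rather than on the C_ij coefficients, a higher order base that already fits the received radiosity well passes the test and avoids refinement.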
4.1. The experimentation protocol

4.1.1. The wavelet bases

We wanted to use our implementation of wavelet radiosity for a comparison of several wavelet bases. We have used the first three multi-wavelet bases: M1 (Haar), M2 and M3.

We use the Mn multi-wavelets as they were previously defined [1, 5]: the smoothing functions for Mn are defined by tensor products of the first n Legendre polynomials.

We have not used flatlet bases (Fn) because, although they have n vanishing moments, they are only piecewise constant, and therefore do not provide a better approximation than Haar wavelets with further refinement.

4.1.2. The test scenes

Our tests have been conducted on several test scenes, ranging from simple scenes, such as the blocker (see fig. 2(a)), to moderately complex scenes, such as the classroom (see fig. 2(d)). All our test scenes are depicted in fig. 2, with their number of input polygons.

4.1.3. Displaying the results

All the figures in this paper depict the exact results of the computations, without any post-processing of any kind: the radiosity function is displayed exactly as it has been computed. Specifically, there has been no attempt to ensure continuity of the radiosity function, except in the refinement oracle. Similarly, we haven't balanced or anchored the computed mesh. So, for example, in fig. 4(c), the continuity of the radiosity function is due only to the refinement oracle described in section 3.1.

M3 wavelets can result in quadratically varying functions, which cannot be displayed on our graphics engines. To display these functions, we subdivide each patch into four sub-patches, on which we compute four linearly varying functions approximating the quadratically varying radiosity function.

4.1.4. Computing the error

In order to compute the computational error, we have computed a reference solution, using M2 wavelets, with a very small refinement threshold. Furthermore, the minimal patch area in the reference solution was 16 times smaller than the minimal patch area in the computed solutions. We also checked that, with all the wavelet bases, the computed solutions did converge to the reference solution.

We have measured the energetic difference between this reference solution and the computed solutions. In order to have comparable results on all our test scenes, this difference is divided by the total energy of the scene. It is this ratio of the energetic difference over the total energy that we call global error. Thus, a global error of 10^-1 means there is an energetic difference of 10 % between the energetic distributions of the computed solution and the reference solution.

According to our experiments, this measure of global error is consistent, and gives comparable visual results on all the test scenes. For example, a global error of 10^-1 will always give a poor result (see fig. 3(a)), a global error of 10^-2 will give a better result, but still with visible artefacts at shadow boundaries (see fig. 3(b)), and a global error of 10^-3 will always give a correct result (see fig. 3(c)). In our experience (see fig. 3), the global error must be lower than 5×10^-3 in order to get visually acceptable results.

As has been pointed out [12], we have also found that this global error is closely correlated to the refinement threshold on each interaction (the local error).

4.1.5. Experimentation details

In all our experiments, we have used the same computer, an SGI Octane working at 225 MHz, with 256 Mb of RAM.

4.2. Results

4.2.1. Visual comparison of our three wavelet bases

The first test to conduct is whether higher order wavelets give a better visual impression. In previous tests [13], higher order wavelets were unable to provide a correct approximation of the radiosity function, especially near shadow boundaries. Shadow boundaries are very important because they have a large impact on the visual perception of the scene.

Our first experiment focuses solely on this problem. We have computed direct illumination from an area light source to a planar receiver, with an occluder partially blocking the exchange of light. All wavelet bases were used with the same computation time (66 s).

Fig. 4 shows the radiosity function computed for each wavelet base, along with the mesh used for the computation. Two elements appear clearly: higher order wavelets provide a much more compact representation of the radiosity function, even near shadow boundaries, and the radiosity function computed with M2 and M3 wavelets is smoother than the function computed with Haar wavelets.

Haar wavelets are usually not displayed as such, but using some sort of post-processing, such as Gouraud shading. Fig. 5 shows the result of applying Gouraud shading to fig. 4(a). As you can see, although it can hide some of the discontinuities, Gouraud shading can also introduce some new artefacts.

Judging from fig. 4, higher order wavelets are better for radiosity computations than lower order wavelets. This is only a qualitative result and must be confirmed by quantitative studies; that is the object of the coming sections (4.2.2 and 4.2.3).
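The global-error measure of section 4.1.4 is simply the energy difference normalised by the total scene energy; a minimal sketch with hypothetical per-patch energies:

```python
import numpy as np

def global_error(computed, reference):
    """Energetic difference between two solutions, divided by the
    total energy of the scene (per-patch energies on a common mesh)."""
    return np.abs(computed - reference).sum() / reference.sum()

# Hypothetical per-patch energies of a reference and a computed solution:
reference = np.array([4.0, 3.0, 2.0, 1.0])
computed = np.array([4.1, 2.9, 2.0, 1.0])
err = global_error(computed, reference)  # 0.2 / 10.0 = 0.02, i.e. 2 %
```

Normalising by the total energy is what makes the thresholds quoted in the paper (10^-1, 10^-2, 5×10^-3) comparable across scenes of very different sizes.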
Fig. 2. Our test scenes, with their number of input polygons: (a) Blocker (3), (b) Tube (5), (c) Dining room (402), (d) Classroom (3153).
Fig. 6. Global error as a function of CPU time (s), for the Haar, M2 and M3 wavelet bases: (a) Blocker, (b) Tube, (c) Dining room, (d) Classroom.
4.2.2. Computation time

Fig. 6 shows the relationship between global error and computation time for our four test scenes and our three wavelet bases.

The most important point that can be extracted from these experimental data is that, with our implementation, higher order wavelets perform better than lower order wavelets. They obtain results of higher quality, and they are faster: to get a visually acceptable result on the classroom scene (global error below 5×10^-3), M3 wavelets use 104 s (see fig. 6(d)). In the same computation time, Haar wavelets only reach a global error level of 10^-2. This test scene is our hardest test scene, with lots of shadow boundaries. It is on such test scenes that higher order wavelets behaved poorly in previous experimentations [13].

The advantage of higher order wavelets is more significant on high precision computations and on complex scenes. The more precision you need in your computations, the faster they are, compared to lower order wavelets.

On the contrary, for quick approximations, M2 wavelets perform better than M3 wavelets. The same applies to Haar wavelets compared to M2 wavelets, for very quick and crude approximations.

Each wavelet base has an area of competence, where it outperforms all the other wavelet bases: Haar wavelets are the most efficient base for global error above 10^-1, which corresponds to a simulation with many artefacts still visible (see fig. 3(a)). M2 wavelets are better than all the other bases for global error between 10^-1 and (roughly) 5×10^-3, and M3 wavelets are the best for global error below 5×10^-3.

4.2.3. Memory use

The key problem with higher order wavelets in previous studies [13] was their high memory use, which effectively prohibited any real computation. We have computed the memory footprint of our implementation of wavelets for our four test scenes and our three wavelet bases. Fig. 7 shows the memory used by the algorithm as a function of the global error.

As you can see, for high precision computations (global error below 5×10^-3), higher order wavelets actually have a lower memory use than low order wavelets. The effect is even more obvious on our more complex scenes (see fig. 7(c) and 7(d)).

On the other hand, for low precision computations, this hierarchy is reversed, and Haar and M2 wavelets have a lower memory use. Once again, each wavelet base has an area of competence, where it outperforms all the other wavelet bases. For very crude approximations, Haar wavelets are the most efficient with respect to memory use; then, for moderately good approximations, M2 wavelets are the most efficient, until M3 takes over for really good approximations.

A very impressive result is the way the memory cost of a given wavelet base degrades quickly if we try to bring the global error level below a certain threshold. This effect appears very clearly in fig. 7(c) and 7(d). There seems to be a maximum degree of precision for each wavelet base, and the wavelet base can only conduct global illumination simulations below this degree. Be aware, however, that the degradation is made more impressive in fig. 7 by the fact that we are using a logarithmic scale for global error and a non-logarithmic scale for memory use. Furthermore, the degradation is quite small when compared to the total memory used: between 10 % and 20 %. Since the effect appears in a similar way for all the wavelet bases used in the test, we think it could be a general effect, applying to all wavelet bases.

Please note that the fact that higher order wavelets have a lower memory use than lower order wavelets is actually quite logical. Higher order wavelets provide a more powerful tool for approximating complex functions, with a higher dimensional space for the approximation. Furthermore, they have more vanishing moments, so their representation of a given complex function is more compact and requires fewer coefficients. Our experiments are therefore bringing practical results in connection with theoretical expectations.

The fact that lower order wavelets are more compact for low precision computations was also to be expected from theory. Low precision computations are, by nature, not taking into account all the complexity of the radiosity function. As a consequence, they provide a very simple function, which is also easy to approximate, especially for simple wavelet bases.

4.3. Discussion and comparison with previous studies

Despite the fact that we are reaching opposite conclusions, we would like to point out that our study is actually consistent with the previous study by Willmott [13, 14].

In Willmott's study, higher order wavelets were carrying a strong memory cost, due to link storage. As a consequence, radiosity computations with higher order wavelets were restricted to low precision computations. According to our experiments, for low precision computations, lower order wavelets are indeed providing a faster approximation, with a lower memory use.

Our study can therefore be seen as an extension of Willmott's study to high precision computations. Such high precision computations were made possible only by getting rid of links [12]. Once you have eliminated link storage, the memory cost of the radiosity algorithm is almost reduced to the cost of mesh storage. The refinement oracle (see section 3.1) ensures that the mesh produced is close to optimal with respect to the radiosity on the surfaces.

Also, by concentrating the oracle on the mesh instead of the interactions, we are able to exploit the power of wavelet basis functions to efficiently approximate functions. This results in a coarser mesh, both at places where the radiosity function has slow variations, such as an evenly lit wall, and at places with rapid variations, such as shadow boundaries.

Fig. 7. Memory use as a function of global error, for the Haar, M2 and M3 wavelet bases: (a) Blocker, (b) Tube, (c) Dining room, (d) Classroom.

5. Conclusion and future work

We have presented an implementation of wavelet bases in the radiosity algorithm. With this implementation, we have conducted experimentations on several wavelet bases. Our experiments show that, for high precision computations, higher order wavelets provide a better approximation of the radiosity function, faster, and with a lower cost in memory. Please note that our implementation does not put any disadvantage on lower order wavelets; for Haar wavelets, our refinement oracle only uses a few tests and the visibility estimation only requires one visibility test. Similarly, the benefit of not storing links is independent of the wavelet base.

Although in this paper we have only conducted tests on relatively small test scenes (up to 3000 input polygons), our implementation (Candela [15]) enables us to use higher order wavelets on arbitrarily large scenes. Fig. 8 shows a radiosity computation with M2 wavelets made with our implementation on a scene with 144255 input polygons. The computations took 3 hours, and required approximately 2 Gb of memory on 32 processors of an SGI Origin 2000. The complete solution had approximately 1.5 million patches.

The optimal choice for radiosity computations depends on the degree of precision required. Lower order wavelets are better for low precision computations, and higher order wavelets are better for high precision computations. Each wavelet base corresponds to a certain degree of precision, where it outperforms all the other wavelet bases, both for the computation time and the memory footprint. Although our computations have been limited to Haar, M2 and M3 wavelets, we think that this effect applies to all the other wavelet bases, such as M4, M5..., and that for even more precise computations, M4 would outperform M3, and so on.

However, for moderately precise computations, M2 wavelets are quite sufficient. The precision level that corresponds, in our experience, to visually acceptable results is at the boundary between the areas of competence of M2 and M3, so M2 wavelets can be used. M2 wavelets also have a distinct advantage over all the other wavelet bases: they result in linearly varying functions that can be displayed directly on current graphics hardware (using Gouraud shading), as opposed to constant, quadric or cubic functions.

In our future work, we want to explore the possibility of using several different wavelet bases in the resolution process. In this approach, it would be possible to use Haar wavelets for interactions that do not require a lot of precision, such as interactions that do not carry a lot of energy, and M2, and perhaps M3, M4..., wavelets for interactions that require a high precision representation. We think that this approach could be especially interesting with shooting since the first

References

4. tors, Rendering Techniques '95 (Proceedings of the Sixth Eurographics Workshop on Rendering, Dublin, Ireland, June 12-14, 1995), pages 264–273, New York, NY, 1995. Springer-Verlag.
5. S. J. Gortler, P. Schroder, M. F. Cohen, and P. Hanrahan. Wavelet Radiosity. In Computer Graphics Proceedings, Annual Conference Series, 1993 (ACM SIGGRAPH '93 Proceedings), pages 221–230, 1993.
6. P. Hanrahan, D. Salzman, and L. Aupperle. A Rapid Hierarchical Radiosity Algorithm. In Computer Graphics (ACM SIGGRAPH '91 Proceedings), volume 25, pages 197–206, July 1991.
7. P. Heckbert. Discontinuity Meshing for Radiosity. In Third Eurographics Workshop on Rendering, pages 203–226, Bristol, UK, May 1992.
8. N. Holzschuch and F. X. Sillion. An exhaustive error-bounding algorithm for hierarchical radiosity. Computer Graphics Forum, 17(4):197–218, December 1998.
9. N. Holzschuch, F. Sillion, and G. Drettakis. An Efficient Progressive Refinement Strategy for Hierarchical Radiosity. In Fifth Eurographics Workshop on Rendering, pages 343–357, Darmstadt, Germany, June 1994.
interactions will carry a lot of energy, while later interactions 10. Dani Lischinski, Brian Smits, and Donald P. Greenberg.
will only carry a small quantity of energy. Bounds and Error Estimates for Radiosity. In Computer
Graphics Proceedings, Annual Conference Series, 1994 (ACM
We also want to explore the possibility to use higher order SIGGRAPH ’94 Proceedings), pages 67–74, 1994. 3
wavelets on non-planar objects. Since they have a better abil-
11. Philipp Slusallek, Michael Schroder, Marc Stamminger, and
ity to model rapidly varying radiosity functions, they seem
Hans-Peter Seidel. Smart Links and Efficient Reconstruction
to be the ideal choice for curved surfaces, such as spheres or for Wavelet Radiosity. In P. M. Hanrahan and W. Purgathofer,
cylinders. editors, Rendering Techniques ’95 (Proceedings of the Sixth
Eurographics Workshop on Rendering), pages 240–251, New
York, NY, 1995. Springer-Verlag. 3
6. Acknowledgements
12. M. Stamminger, H. Schirmacher, P. Slusallek, and H.-P. Sei-
The authors would like to give a very special thank to Jean- del. Getting rid of links in hierarchical radiosity. Computer
Claude Paul. It was his insight that started this work on Graphics Journal (Proc. Eurographics ’98), 17(3):C165–
higher order wavelets, and it was his advice and support that C174, September 1998. 4, 5, 8
ensured its success.
13. Andrew Willmott and Paul Heckbert. An empirical compar-
ison of progressive and wavelet radiosity. In Julie Dorsey
References and Phillip Slusallek, editors, Rendering Techniques ’97 (Pro-
ceedings of the Eighth Eurographics Workshop on Rendering),
1. B. Alpert, G. Beylkin, R. Coifman, and V. Rokhlin. Wavelet- pages 175–186, New York, NY, 1997. Springer Wien. ISBN
like bases for the fast solution of second-kind integral equa- 3-211-83001-4. 1, 2, 3, 4, 5, 8
tions. SIAM Journal on Scientific Computing, 14(1):159–184,
January 1993. 5 14. Andrew J. Willmott and Paul S. Heckbert. An empirical
comparison of radiosity algorithms. Technical Report CMU-
2. Philippe Bekaert and Yves Willems. Error Control for Radios- CS-97-115, School of Computer Science, Carnegie Mel-
ity. In Rendering Techniques ’96 (Proceedings of the Seventh lon University, Pittsburgh, PA, April 1997. Available from
Eurographics Workshop on Rendering), pages 153–164, New http://www.cs.cmu.edu/ radiosity/emprad-tr.html. 3, 8
York, NY, 1996. Springer-Verlag/Wien. 4
15. Christophe Winkler. Expérimentation d’algorithmes de calcul
3. Philippe Bekaert and Yves D. Willems. Hirad: A Hierarchical de radiosité à base d’ondelettes. Thèse d’université, Institut
Higher Order Radiosity Implementation. In Proceedings of National Polytechnique de Lorraine, 1998. 4, 9
the Twelfth Spring Conference on Computer Graphics (SCCG
’96), Bratislava, Slovakia, June 1996. Comenius University 16. Harold R. Zatz. Galerkin Radiosity: A Higher Order Solution
Press. 4 Method for Global Illumination. In Computer Graphics Pro-
ceedings, Annual Conference Series, 1993 (ACM SIGGRAPH
4. Reid Gershbein. Integration Methods for Galerkin Radios- ’93 Proceedings), pages 213–220, 1993. 3
ity Couplings. In P. M. Hanrahan and W. Purgathofer, edi-
66
F. Cuny, L. Alonso and N. Holzschuch / A novel approach makes higher order wavelets really efficient for radiosity
67
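The mixed-basis resolution suggested in the future-work discussion above (Haar wavelets for low-energy interactions, higher-order bases for energetic ones, especially in a shooting solver) can be sketched as follows. The `Interaction` class and the energy thresholds are illustrative assumptions, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    shooter: object
    receiver: object
    unshot_energy: float

def choose_basis(interaction, thresholds=(10.0, 1.0)):
    """Pick a wavelet basis per interaction: high-energy interactions get a
    higher-order basis, low-energy ones fall back to Haar.
    Threshold values are illustrative only."""
    e = interaction.unshot_energy
    if e >= thresholds[0]:
        return "M3"    # early shooting steps carry most of the energy
    if e >= thresholds[1]:
        return "M2"
    return "Haar"      # late, low-energy interactions: a cheap basis suffices

# In shooting, the first interactions carry a lot of energy, later ones very little:
energies = [50.0, 5.0, 0.2]
bases = [choose_basis(Interaction(None, None, e)) for e in energies]
assert bases == ["M3", "M2", "Haar"]
```

The point of the sketch is only the selection policy: the refinement oracle and propagation would then run with the per-interaction basis instead of a single global one.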
Chapter 2. Multi-Scale Modeling of Illumination
Abstract
The radiosity method is used for global illumination simulation in diffuse scenes, or as an intermediate step in
other methods. Radiosity computations using Higher-Order wavelets achieve a compact representation of the
illumination on many parts of the scene, but are more expensive near discontinuities, such as shadow boundaries.
Other methods use a mesh, based on the set of discontinuities of the illumination function. The complexity of this
set of discontinuities has so far proven prohibitive for large scenes, mostly because of the difficulty of robustly
managing a geometrically complex set of triangles. In this paper, we present a method for computing radiosity that
uses higher-order wavelet functions as a basis, and introduces discontinuities only when they simplify the resulting
mesh. The result is displayed directly, without post-processing.
Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Three-Dimensional Graphics and Realism]:
I.3.5 [Computational Geometry and Object Modeling]:
N. Holzschuch and L. Alonso / Combining Higher-Order Wavelets and Discontinuity Meshing for Radiosity
wavelets, defined on a regular subdivision in places where they provide a good approximation, and we introduce discontinuities only in places where they reduce the complexity of the mesh. This selection of effective discontinuities is done during the refinement process, by the refinement oracle. The mesh produced is still a regular grid, but some of its patches are cut by discontinuities.

We use a fragment program to display quadric wavelets directly. We display the results of our illumination computations immediately, without post-processing or final gather. We exploit the fact that higher-order wavelets with the proper refinement oracle result in apparently continuous functions after reconstruction, even in the absence of a specific step to enforce this continuity.

This paper is organised as follows: in the following section, we review previous work on hierarchical (or wavelet) radiosity and discontinuity meshing, as well as on integrating them. Then, in section 3, we present our algorithm, and in section 4 we present results and pictures from our experiments. Finally, we conclude and expose future research directions.

2. Previous Work

The radiosity method was first introduced for global illumination simulations by Goral et al. in 1984 [GTGB84]. It uses a finite element formulation of the rendering equation [KH84] for diffuse scenes, and gets a complete representation of global illumination. The radiosity method was later extended using a hierarchical formulation of the finite element method [HS92, HSA91]. The hierarchical representation limits the complexity of the radiosity algorithm to O(n) instead of O(n²). This hierarchical formulation was later extended using a wavelet framework [GSCH93, SGCH93].

It is possible to use wavelets of different orders (piecewise-constant basis, piecewise-linear basis, piecewise-polynomial basis). Early implementations of higher-order wavelets proved inefficient [WH97], until a complete analysis of the wavelet radiosity algorithm [CAH00] showed that with the right implementation, a good refinement oracle [BW96] and efficient memory management [SSSS98], they were actually more interesting than hierarchical piecewise-constant basis functions for global illumination simulations, with smaller memory costs and shorter computation times.

Piecewise-polynomial wavelets are more costly for each patch of the finite element formulation, requiring (k+1)² coefficients for a wavelet basis made of polynomials of degree k. But they provide a better approximation of the illumination function, resulting in a smaller number of patches. The study by Cuny et al. [CAH00] showed that most of the time, the reduction in the number of elements more than compensates for the extra cost of each element, allowing a faster radiosity computation and a smaller memory cost.

However, many scenes on which we wish to compute global illumination exhibit sharp discontinuities of the illumination function, for example shadows caused by point light sources or small area light sources, or shadows caused by occluders that are close to the receiver. Regular hierarchical bases of continuous polynomials are unable to model such discontinuities. In the presence of these discontinuities, most radiosity algorithms refine the hierarchy a lot, using very small patches to approximate the radiosity function. The result is that the number of patches used near the discontinuity is roughly independent of the order of the basis function. Since each patch stores (k+1)² coefficients, wavelet bases of higher-order polynomials end up being more costly at these discontinuities.

Discontinuities of the radiosity function can be computed using geometrical methods [LTG92, Hec92, DF94, SG94, GS96, DDP02]. An adaptive mesh based on these discontinuities provides a better approximation of the radiosity function [LTG92, Hec92]. Radiosity methods based on the discontinuity mesh have been proposed, either with classical radiosity [LTG92, Hec92, Stu94, DF94] or with hierarchical radiosity [LTG93, DS96, DDP99]. All these methods start with the complete set of discontinuities, triangulating it and refining it as necessary. The entire set of discontinuities is quite large, giving a very complex mesh as a starting point. Managing this mesh proves complicated, and the associated memory cost is not negligible. [DS96] used a regular mesh for visible areas, but kept the triangulated set of discontinuities for penumbra regions.

Several of the discontinuities in the discontinuity mesh are not visible in the radiosity function. Simplifications of the discontinuity mesh have been suggested [DF94, HWP97, Hed98]. But as they compute discontinuities before the illumination computations, this selection uses only geometrical tools and has no access to illumination information.

In our algorithm, however, we use a regular subdivision as often as possible, and we only introduce discontinuities if they result in a simpler mesh. Significant discontinuities are thus naturally selected during the hierarchical refinement process.

A single paper has used discontinuity meshing together with wavelet radiosity with higher-order basis functions [PB95]. Their study is quite complete, but they used only very simple scenes for their tests: a single patch with a single discontinuity. As a consequence, they could not identify several problems that only occur in larger scenes, such as intersecting discontinuities or the cost of computing push-pull coefficients: their method would not scale to much bigger scenes. They tried to merge wavelets with discontinuities by finding a wavelet-compatible parametrization of the patch that followed the discontinuity. This causes a complex computation of push-pull coefficients for each hierarchical level. Also, building such a parametrization is not always possible, in the case of intersecting discontinuities. Finally, their approach does not address the problem of managing the set of discontinuities. Our algorithm, by contrast, keeps the same parametrization for all patches in the hierarchy, making it easy to compute push-pull coefficients. We can deal with multiple discontinuities and intersecting discontinuities. Each discontinuity inserted in the hierarchy is treated only at its hierarchical level.

3. Algorithm

In this section, we present our algorithm for merging radiosity using higher-order wavelets with the meshing of discontinuities. We start with a short summary of the Hierarchical Radiosity algorithm, and of how it has been adapted to higher-order wavelets (section 3.1). Then we present our algorithm for merging the wavelet bases with discontinuities (section 3.2). Some finer points of the implementation are explained in section 3.3.

3.1. Wavelet Radiosity

3.1.1. The Hierarchical Radiosity Algorithm

In Wavelet Radiosity, each surface of the scene carries a hierarchical representation of illumination, using the wavelet basis. This representation is computed iteratively, through three essential steps:
• refinement of interactions,
• propagation of energy,
• push-pull.

At the beginning of the algorithm, we select the surface with the largest unshot energy, and identify all surfaces that are potentially visible from it. We establish interactions between the shooting surface and all these receiving surfaces.

We then refine these interactions, in a hierarchical manner. At each point in time, we consider the current multi-scale representation of the interaction, and check whether it is accurate enough, according to the refinement oracle. If not, we refine the interaction, by subdividing either the shooting surface or the receiver.

Once we are satisfied with the level of precision of the interaction, we propagate the energy by sending the unshot energy of the shooting surface to the receivers, updating the wavelet coefficients on the receivers.

After these steps, the unshot energy of the shooting surface is set to zero, and we pick the surface with the largest unshot energy as the next shooting surface.

After the propagation, the different levels of the hierarchy on each surface have received energy, but there is no consistent representation of the energy received at all hierarchical levels. This representation must be reconstructed before we can use the hierarchical representation for shooting or for display. It is done during the push-pull step. The push-pull step is a recursive procedure, where parent nodes add their energy to their children, and the children's energy is collected in each parent and averaged.

3.1.2. Using Higher-Order Wavelets

Using higher-order wavelets, such as Multi-Wavelets (M2 and M3) [Alp93], does not change the algorithm, except in these details:
• Each patch carries a wavelet representation of the radiosity function. The Mn basis is made of polynomials of degree n − 1, so each patch has n² basis functions and stores n² coefficients.
• The interaction between two patches implies computing the influence that each wavelet coefficient on the shooting patch has on every wavelet coefficient on the receiving patch. Each of these influence coefficients is expressed as an integral, which is approximated using quadratures. As there are n² coefficients on each patch, we must evaluate n⁴ integrals.
• The push-pull step implies computing the influence that each wavelet coefficient on the parent patch has on every wavelet coefficient on the child patches, and reciprocally. These influences are also expressed as integrals. These integrals only depend on the respective geometry of the parent and child patches in the hierarchy. For a regular subdivision, the push-pull coefficients are therefore constant over the hierarchy, and are pre-computed. For an irregular subdivision, the push-pull coefficients must be recomputed at each level, a potentially costly step.
• As 2D wavelets are usually defined as tensor products of 1D wavelets, they are only defined over a parallelogram. Research has shown how to extend this definition to complex planar surfaces [HCA00] and to parametric curved surfaces [ACP∗01].
• Given the large number of coefficients for each interaction (n⁴, as many as 81 coefficients for polynomials of degree 2), it is important to avoid storing them. Once we have treated an interaction, we delete all its coefficients. This strategy can result in computing the same interaction coefficients twice, but the gain in memory largely offsets the potential loss in time [SSSS98, CAH00].

3.2. Combining Wavelets and Discontinuity Meshing

3.2.1. The algorithm

Our algorithm works as follows:
• For each shooting surface and each receiving surface, we compute the set of discontinuities on the receiving surface.
• We proceed with the usual refinement of the interaction, using the oracle and a regular subdivision.
• When the refinement oracle identifies that the interaction should be subdivided only because of a discontinuity, it introduces a discontinuity-based subdivision instead of a regular subdivision.
• Discontinuity-based subdivision works by:
– computing the intersection of the current patch with the discontinuity;
– for each part of the subdivided patch, identifying the smallest parallelogram that encloses it;
– applying our radiosity algorithm using a regular subdivision over each parallelogram (see Figure 1).
• Once we are satisfied with the level of refinement of this interaction, we propagate the energy, then erase the discontinuities and the interaction coefficients. Discontinuities that have not been used for subdivision are forgotten.

Figure 1: A patch cut by a discontinuity (a) results in two child patches. For each child patch, we identify the enclosing parallelogram (b). We conduct standard wavelet radiosity on each parallelogram (c).

We want to use the regular subdivision as much as possible, for its robustness and simplicity. Our algorithm only introduces discontinuities if they are considered important by the refinement oracle. Smooth transitions that can be properly approximated by the wavelet basis will not be introduced in the hierarchy.

In the following paragraphs, we review each step of this algorithm in detail: the refinement oracle, discontinuity-based subdivisions, push-pull over a discontinuity, and the intersection of discontinuities.

3.2.2. Refinement oracle and selection of discontinuities

We use the refinement oracle described in previous publications [BW96, CAH00]: for each patch, we select testing points, where we compute radiosity directly. The values computed are compared with the values obtained using the wavelet basis. If the norm of the differences is above the refinement threshold, the oracle concludes that we should refine.

This oracle works well, especially if the testing points are chosen with a good heuristic. By putting some of the testing points on the boundaries of the patches, we have found that we obtain a representation of radiosity that looks continuous without having to ensure this continuity in post-processing (see [CAH00] and Figure 2 for an example using M3 wavelets).

In our algorithm, we do two computations of the refinement oracle: one with standard visibility computations, and one assuming full visibility. If their results differ, visibility is the only reason for subdivision, and we introduce a discontinuity-based subdivision.

Subdivisions are thus only introduced in the hierarchy if they actually cancel further refinements on at least one side, resulting in a more compact hierarchy. For point light sources, introducing a subdivision generates a coarse mesh on both sides of the subdivision (see Figure 2(a)). For area light sources, introducing subdivisions creates a coarse mesh in fully lit areas and in the umbra, while the penumbra is more refined (see Figure 2(b)).

For stability and robustness, a discontinuity is introduced only if the intersection between the discontinuity and the current patch is simple enough. Thus our algorithm only has to manage simple patches and surfaces. For complex occluders casting a combination of simple and complex discontinuities, only the simple discontinuities are introduced in the mesh (see Figure 11).

In our implementation, we have used the following criteria for selecting simple discontinuities: at least one of the patches resulting from the discontinuity-based refinement must be convex, and the number of vertices in each polygon must remain below a certain threshold.

3.2.3. Discontinuity-based subdivisions

Once we have selected a patch for discontinuity-based subdivision, we compute the intersection between the patch and the discontinuities, resulting in two separate patches. Most of the time, these sub-patches are neither parallelograms nor triangles. For each of the sub-patches, we build the smallest enclosing parallelogram (see Figure 1). We then use these enclosing parallelograms instead of the patches in the radiosity algorithm, as we would use standard patches:
• For radiosity reception, the enclosing parallelogram is treated as a standard receiver. It is subdivided normally, using regular subdivision.
• For radiosity emission, only the actual sub-patch is allowed to emit radiosity; other parts of the enclosing parallelogram are not allowed to emit. Following previous research [HCA00], we do this through the quadrature weights, during the computation of Gaussian quadratures. We see each quadrature weight as the representative of an area of influence for the quadrature point (see Figure 3). We modulate the quadrature weight by the percentage of this area of influence that is inside the actual sub-patch.
• For push-pull, we use the standard push-pull coefficients, since we have a standard subdivision.

3.2.4. Push-Pull Coefficients over a discontinuity

On most steps of the radiosity algorithm, our method uses classical methods. The main difference lies in the push-pull step over the discontinuity.

The enclosing parallelograms of the child patches are overlapping, and we need the push-pull step to compensate for this. Let us assume a patch p has been subdivided into two child patches p₁ and p₂. The child patches pᵢ are enclosed in parallelograms eᵢ. Each of the patches has its own set of wavelet basis functions: φ_j on p, φ^i_j on eᵢ. The radiosity function is expressed as:

B_p(x) = ∑_j α_j φ_j(x)
B_{e_i}(x) = ∑_j α^i_j φ^i_j(x)

3.2.4.1. Push Coefficients: For the push step, we need to project B_p on the basis functions of the children eᵢ. The wavelet coefficients of the projection will be added to the wavelet coefficients on each child eᵢ. Since, on each patch, the wavelet functions form an orthonormal basis, wavelet coefficients are expressed as the scalar product of the radiosity function with the basis functions:

α^i_j = ⟨B_{e_i} | φ^i_j⟩_i

where the subscript i on the dot product expresses the fact that the integration takes place on eᵢ. We are looking for the contribution of B_p to the α^i_j, push^i_j:

push^i_j = ⟨B_p | φ^i_j⟩_i = ⟨∑_k α_k φ_k | φ^i_j⟩_i = ∑_k α_k ⟨φ_k | φ^i_j⟩_i = ∑_k α_k C^{ij}_k

The push coefficients C^{ij}_k only depend on the basis functions and on the relative geometry of p and eᵢ. We have an integral expression for the push coefficients:

C^{ij}_k = ⟨φ_k | φ^i_j⟩_i = ∫_{e_i} φ_k(x) φ^i_j(x) dx

3.2.4.2. Pull coefficients: For the pull step, we need to combine the radiosity functions on the patches pᵢ, and express this radiosity in the wavelet basis of patch p. As the eᵢ patches are overlapping, we restrict the definition of B_{e_i} to its support. We use the characteristic function of eᵢ, δ_{e_i}, defined as being equal to 1 on eᵢ and 0 everywhere else.

Combining the radiosity functions computed on the children gives us:

B_{e_1}(x) δ_{e_1}(x) + B_{e_2}(x) δ_{e_2}(x) = ∑_i ∑_j α^i_j φ^i_j(x) δ_{e_i}(x)

The pull step projects this combined function on the wavelet basis of p:

pull_k = ∑_i ∑_j α^i_j ⟨φ^i_j δ_{e_i} | φ_k⟩ = ∑_i ∑_j α^i_j D^i_{jk}

The pull coefficients D^i_{jk} depend on the geometry of the subdivision:

D^i_{jk} = ⟨φ^i_j δ_{e_i} | φ_k⟩ = ∫_p φ^i_j(x) δ_{e_i}(x) φ_k(x) dx
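The push step above can be checked numerically in 1D. The sketch below assumes an orthonormal Legendre basis of degree ≤ 1 (the 1D analogue of the smooth M2 functions) and a single child interval standing in for an enclosing parallelogram; all names are illustrative, not the paper's code:

```python
import numpy as np

def legendre_basis(a, b):
    """Orthonormal basis {constant, linear} on [a, b]."""
    h = b - a
    p0 = lambda x: np.full_like(np.asarray(x, float), 1.0 / np.sqrt(h))
    p1 = lambda x: np.sqrt(3.0 / h) * (2.0 * (np.asarray(x, float) - a) / h - 1.0)
    return [p0, p1]

def gauss(a, b, n=4):
    """Gauss-Legendre nodes and weights mapped to [a, b]."""
    x, w = np.polynomial.legendre.leggauss(n)
    return 0.5 * (b - a) * x + 0.5 * (a + b), 0.5 * (b - a) * w

def push_coeffs(parent, child_interval):
    """C[k, j] = <phi_k | phi^i_j>_i, integrated over the child support e_i."""
    a, b = child_interval
    child = legendre_basis(a, b)
    x, w = gauss(a, b)
    return np.array([[np.sum(w * pk(x) * cj(x)) for cj in child] for pk in parent])

# Parent patch [0, 1], irregular child [0, 0.6] (as with an enclosing parallelogram).
parent = legendre_basis(0.0, 1.0)
C = push_coeffs(parent, (0.0, 0.6))

# Push a linear radiosity B(x) = 2 + 3x: parent coefficients alpha_k,
# then alpha^i_j = sum_k alpha_k C[k, j].
B = lambda x: 2.0 + 3.0 * x
x, w = gauss(0.0, 1.0)
alpha = np.array([np.sum(w * B(x) * pk(x)) for pk in parent])
alpha_child = alpha @ C

# A function in the span of the basis is reproduced exactly on the child:
child = legendre_basis(0.0, 0.6)
xs = np.array([0.1, 0.3, 0.5])
recon = sum(a * f(xs) for a, f in zip(alpha_child, child))
assert np.allclose(recon, B(xs))
```

This is exactly the property the coefficients C^{ij}_k encode: pushing a parent representation yields the child's own projection of the same function, restricted to the child's support.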
Figure 3: The weights of the quadrature points can be seen as the area of a zone of influence: (a) on the unit segment; (b) on the unit square; (c) on the extended domain.
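The weight modulation of section 3.2.3, illustrated by Figure 3, can be sketched as follows. Taking each node's zone of influence to be bounded by the midpoints between adjacent nodes is our assumption — the paper does not spell out the exact cells — and the function names are illustrative:

```python
import numpy as np

def gauss01(n):
    """Gauss-Legendre nodes and weights on [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(n)
    return 0.5 * x + 0.5, 0.5 * w

def influence_cells(nodes):
    """1D zones of influence: split [0, 1] at midpoints between nodes
    (one plausible choice of cells)."""
    mids = 0.5 * (nodes[:-1] + nodes[1:])
    return np.concatenate(([0.0], mids, [1.0]))

def modulated_weights(n, inside, samples=32):
    """Tensor-product Gauss weights on the unit square, each scaled by the
    fraction of its zone of influence lying inside the emitting sub-patch."""
    x, w = gauss01(n)
    edges = influence_cells(x)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # estimate the covered fraction of cell (i, j) on a sub-grid
            u = np.linspace(edges[i], edges[i + 1], samples)
            v = np.linspace(edges[j], edges[j + 1], samples)
            U, V = np.meshgrid(u, v)
            W[i, j] = w[i] * w[j] * np.mean(inside(U, V))
    return W

# Emission restricted to the triangle u + v <= 1 (half of the unit square):
W = modulated_weights(6, lambda u, v: (u + v <= 1.0).astype(float))
area = W.sum()   # integrating the constant 1 with the modulated weights
assert abs(area - 0.5) < 0.02
```

Integrating the constant function with the modulated weights recovers (approximately) the area of the emitting sub-patch, which is the sanity check one would expect from this scheme.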
Figure 5: Combining several discontinuities (both scenes have three light sources, red, green and blue, located in a triangle above the cube): (a) point light sources (M3 wavelets); (b) area light sources (M3 wavelets).
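The discontinuity-based subdivision of section 3.2.3 (cut the patch along the discontinuity, then enclose each part) can be sketched in the patch's parameter space. Here, axis-aligned boxes stand in for the paper's smallest enclosing parallelograms, and all names are illustrative:

```python
def clip_halfplane(poly, n, c):
    """Sutherland-Hodgman clip of polygon `poly` (list of (u, v) vertices)
    against the half-plane n . x <= c."""
    out = []
    for i, p in enumerate(poly):
        q = poly[(i + 1) % len(poly)]
        fp = n[0] * p[0] + n[1] * p[1] - c
        fq = n[0] * q[0] + n[1] * q[1] - c
        if fp <= 0:
            out.append(p)
        if fp * fq < 0:  # the edge crosses the discontinuity line
            t = fp / (fp - fq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def split_by_discontinuity(patch, n, c):
    """Cut a patch by the line n . x = c; return the two sub-patches and
    their enclosing boxes in parameter space."""
    parts = [clip_halfplane(patch, n, c),
             clip_halfplane(patch, (-n[0], -n[1]), -c)]
    boxes = [((min(u for u, _ in p), min(v for _, v in p)),
              (max(u for u, _ in p), max(v for _, v in p))) for p in parts]
    return parts, boxes

# Unit patch cut by the line u + v = 0.6:
parts, boxes = split_by_discontinuity([(0, 0), (1, 0), (1, 1), (0, 1)],
                                      (1.0, 1.0), 0.6)
# parts[0] is a lower-left triangle, parts[1] a pentagon; each gets its own
# enclosing box, on which regular wavelet subdivision can proceed.
```

Regular wavelet radiosity then runs on each enclosing region, with emission restricted to the actual sub-patch as described in section 3.2.3.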
e.g. [BDS∗92, CGA]). This randomized data structure answers our position queries in average time O(log n), with creation time O(n log n) and memory cost O(n).

Handling Discontinuities in the Refinement Oracle: The refinement oracle takes sampling points on the receiving patch. Some sampling points can lie on a discontinuity, which makes their exact value unknown. To avoid unnecessary refinement, points lying on a discontinuity can take a different value in the oracle depending on the patch being considered.

Handling Visibility Queries: In our radiosity computations, we need the percentage of the light source that is visible from the receiving points. Previous implementations used a geometric data structure, the backprojection [DF94], to compute an exact value of this percentage. We compute it instead using an OpenGL extension, OcclusionQuery [ARB], which gives us the percentage of the pixels of the light source that are visible from the receiving point. In our experiments, occlusion queries are more robust than the geometric data structure while having the same speed, and they are much faster than casting rays, while giving more precise results.

Displaying Results: M3 wavelets give quadrically varying functions; they are displayed using a small fragment program (10 lines of code). Linear interpolation from the graphics hardware (Gouraud shading) is not perfect for M2 wavelets, which are bilinear functions. It is possible to replace this linear interpolation by a small fragment program.

4. Experiments and Results

4.1. Experimentation protocol

Test scenes: We have used two different test scenes: the Cabin, from the Radiance set of test scenes, and Room 523 from the Soda Hall model. For each scene, we used either point light sources or area light sources, giving a total of four test scenes. On all test scenes, we computed direct and indirect illumination. Pictures of the test scenes are available in Figures 7 and 13 (see color plates).

Wavelet Bases: We have tested our algorithm with the first three multi-wavelet bases: M1 (Haar), M2 (piecewise-linear) and M3 (piecewise-quadric). In the pictures, Haar wavelets are displayed after a post-processing step to ensure continuity, M2 wavelets are displayed using standard linear interpolation from the graphics hardware, and M3 wavelets use a fragment program for the quadrically varying part.

Material: All computations were done on the same computer: a 2.4 GHz Pentium IV, with 1 GB memory and an NVIDIA GeForce FX 5600.

4.2. Visual comparison for point light sources

The first reason to use discontinuity meshing is the quality of the illumination computed. Adapting the mesh to the discontinuities produces a radiosity function that looks pleasing to the eye.

The leftmost columns of Figures 6 and 12 show a side-by-side comparison of the different wavelet bases on a specific detail of the Cabin test scene, with a point light source. All pictures were generated with the same computation time (25 s) to give a fair comparison of the different wavelet bases. Without discontinuity meshing, the most satisfying representation is obtained with M2 wavelets, but artefacts are clearly visible along the discontinuity line; Haar and M3 wavelets are visually not acceptable within the prescribed time frame; they would eventually achieve a satisfying result, but for a longer computation time.

With discontinuity meshing, all wavelet bases achieve a visually pleasing result. Our algorithm for merging discontinuities with wavelets thus achieves a visually better result in the same computation time.

We did a similar comparison for Room 523 of the Soda Hall. Figure 8 shows the pictures obtained with the different wavelet bases on a detail of the room. All pictures were generated with approximately the same computation time (190 s).

Figure 6: Wireframe version of the simulation for the Cabin test scene. See also the color plates.

Figure 7: Wireframe version of our test scenes after simulation with M3 wavelets (Cabin, 3 point light sources). See also the color plates.

Figure 8: Visual comparison of the different wavelet bases for a point light source. All pictures used roughly 190 s of computation time.

… higher-order wavelet bases always gives better results than existing algorithms, with a smaller memory cost.
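The qualitative behaviour reported above — piecewise-linear bases win on smooth illumination, while no basis order helps at an un-meshed shadow boundary — can be reproduced with a small 1D experiment. This is a sketch under equal coefficient budgets, not the paper's code:

```python
import numpy as np

def l2_error(f, order, cells):
    """L2 error of the best piecewise-polynomial approximation of f on a
    uniform grid. order=1: piecewise-constant (M1/Haar smooth part),
    order=2: piecewise-linear (M2)."""
    xs = np.linspace(0.0, 1.0, 20001)
    err2 = 0.0
    for c in range(cells):
        a, b = c / cells, (c + 1) / cells
        m = (xs >= a) & (xs <= b)
        x, y = xs[m], f(xs[m])
        coeffs = np.polyfit(x, y, order - 1)   # least-squares fit ~ L2 projection
        r = y - np.polyval(coeffs, x)
        err2 += np.mean(r * r) * (b - a)
    return np.sqrt(err2)

smooth = lambda x: np.sin(np.pi * x)           # smooth illumination
step = lambda x: (x > 0.37).astype(float)      # sharp, unmeshed shadow boundary

# Same coefficient budget: 8 constant cells vs 4 linear cells (8 coefficients).
e_const, e_lin = l2_error(smooth, 1, 8), l2_error(smooth, 2, 4)
assert e_lin < e_const                         # higher order wins on smooth radiosity

# At a discontinuity, accuracy is driven by the cut, not by the order:
e_const_step, e_lin_step = l2_error(step, 1, 8), l2_error(step, 2, 4)
assert not (e_lin_step < e_const_step)
```

Aligning a cell boundary with the discontinuity (which is what the discontinuity-based subdivision does) removes the error entirely for both bases, which is why the meshed versions all look acceptable in the same computation time.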
Figure 11: Our algorithm only inserts discontinuities that are perceived as useful by the refinement oracle (M3 wavelets).
…only inserts discontinuities as they are needed in the refinement process, it starts by computing all potential discontinuities for the current interaction, a costly preliminary step. We will explore the possibility of suppressing this step, using standard refinement but detecting, inside the refinement oracle, that a subdivision is probably caused by a discontinuity, then only computing and inserting this discontinuity in the mesh. This would reduce the computation cost of our algorithm. In our experiments, almost all the discontinuities caused by indirect lighting are not important enough to justify their insertion in the hierarchy. It is therefore not practical to compute them in advance.

6. Acknowledgements

This work was partly funded by the "Région Rhône-Alpes" under the DEREVE project.

The Soda Hall model was created by the original Berkeley Walkthru team under the direction of Prof. Carlo H. Sequin. For more information on the Soda Hall, see http://www.cs.berkeley.edu/~sequin/soda/soda.html

References

[ARB] ARB_occlusion_query. http://oss.sgi.com/projects/ogl-sample/registry/ARB/occlusion_query.txt.
[BDS∗92] Boissonnat J.-D., Devillers O., Schott R., Teillaud M., Yvinec M.: Applications of random sampling to on-line algorithms in computational geometry. Discrete and Computational Geometry 8, 1 (1992), 51–71.
[BW96] Bekaert P., Willems Y.: Error Control for Radiosity. In Rendering Techniques '96 (7th Eurographics Workshop on Rendering) (1996), pp. 153–164.
[CAH00] Cuny F., Alonso L., Holzschuch N.: A novel approach makes higher order wavelets really efficient for radiosity. Computer Graphics Forum (Eurographics 2000) 19, 3 (2000), C-99–C-108.
[CGA] CGAL, Computational Geometry Algorithms Library. http://www.cgal.org.
Figure 12: Visual comparison of results for the Cabin test scene. See also figure 6 for wireframe representation.
Cabin, 3 point light sources
Figure 13: Our test scenes (all figures with M3 wavelets). See also figure 7 for wireframe representation.
82 CHAPTER 2. MULTI-SCALE MODELING OF LIGHTING
2.7.6 Space-time hierarchical radiosity with clustering and higher-order wavelets (CGF
2004)
Authors: Cyrille Damez, Nicolas Holzschuch, François Sillion
Journal: Computer Graphics Forum, vol. 23, no. 2.
Date: April 2004
Volume 23 (2004), number 2 pp. 129–141 COMPUTER GRAPHICS forum
1 Max-Planck-Institut für Informatik, Saarbrücken, Germany
2 ARTIS, GRAVIR/IMAG-INRIA, Grenoble, France
Abstract
We address in this paper the issue of computing diffuse global illumination solutions for animation sequences. The
principal difficulties lie in the computational complexity of global illumination, emphasized by the movement of
objects and the large number of frames to compute, as well as the potential for creating temporal discontinuities
in the illumination, a particularly noticeable artifact. We demonstrate how space-time hierarchical radiosity, i.e.
the application to the time dimension of a hierarchical decomposition algorithm, can be effectively used to obtain
smooth animations: first by proposing the integration of spatial clustering in a space-time hierarchy; second, by
using a higher-order wavelet basis adapted for the temporal dimension. The resulting algorithm is capable of
creating time-dependent radiosity solutions efficiently.
the temporal continuity of the animations produced. Second, we have combined space-time radiosity with clustering, thus enabling the algorithm to work more efficiently and on larger scenes. This allows us to use this algorithm in a range of complexity where its benefits can be fully realized.

The paper is organized as follows. In the next section, we briefly discuss previous work in time-dependent illumination of animated scenes, and review the shortcomings of our preliminary approach. Then, in Section 3, our fully developed algorithm is given in detail. In Section 4, we provide an analysis of the performance of space-time hierarchical radiosity, compared to our previous approach as well as frame-by-frame computations. Finally, in Section 5, we draw our conclusions and trace directions for future work.

2. Background and Motivations

2.1. Global illumination algorithms for animations

Several algorithms to compute global illumination images have been proposed since the pioneering work of Goral et al. [3]. As the performance of these algorithms and the computing power of graphics workstations improved, several propositions have been made to extend these algorithms and reduce the overwhelming cost of computing globally illuminated animations. Two classes of applications can be distinguished:

• Interactive methods, which render new solutions quickly, usually by reusing previous computation results as much as possible. They aim at offering as fast a feedback as possible in response to changes made by the user. Interactive methods have been developed for Hierarchical Radiosity [4,5], Path Tracing [6,7] and Particle Tracing [8,9,10,11].

• Offline methods, where the objects' movements are supposed continuous and known a priori. They aim at rendering high-quality animations, and therefore should ensure a constant quality. Global Monte Carlo methods [12] and Particle Tracing [13] algorithms have been proposed to compute high-quality global illumination animations.

A study of the current state of the art for both types of animated global illumination algorithms can be found in Damez et al. [14]. In this section, we discuss briefly only the methods allowing the computation of higher-quality animations.

Surprisingly, the case of high-quality animations has received little attention when compared to the amount of work devoted to interactive algorithms in the literature. Indeed, most interactive algorithms could be used to compute a movie sequence. However, the quality of the resulting animation may not always be satisfying, as these methods were designed to satisfy real-time constraints instead of animation quality criteria. In particular, the accumulated errors due to the incremental nature of most interactive algorithms may cause distracting artifacts. The quality of the resulting frames may seem acceptable when each is considered separately; nevertheless, discontinuities in the shading of surfaces may appear between two consecutive frames. The interactive global illumination algorithm proposed by Wald et al. [10], though it recomputes a complete global illumination solution for each frame independently, is fast enough to converge to a good quality view-dependent solution within a couple of seconds. However, in order to avoid light flickering due to its stochastic nature, it requires using the same random seeds from one frame to the next, which only ensures temporal continuity of lighting for light paths that do not intersect moving objects.

It also seems natural to try to capitalize on the knowledge of the objects' movement to enhance the quality of the rendered animation. Therefore, it makes sense to consider high-quality animation rendering as a separate problem, and to develop algorithms specifically designed to solve it. Myszkowski et al. [13] extended the density estimation photon-tracing algorithm to the case of animated scenes, allowing the use of photons for several consecutive frames. The decision to extend or contract the segment of time during which a given sample is valid is based on a perception-based Animation Quality Metric. It is used to measure the perceived difference between consecutive frames, and therefore reduce the flickering which results from the stochastic noise. However, to this date, this method is based on a fixed mesh and lacks an adaptive refinement scheme; therefore, the spatial resolution of the solutions computed is limited.

Martin et al. [15] proposed a two-pass algorithm based on hierarchical radiosity. During the first pass, a coarse hierarchical solution for the complete animation is computed incrementally. Then, during the second pass, the resulting mesh and link structure is used to efficiently perform final gathering, assigning to each space-time mesh element a high-resolution texture movie representing the radiosity of this patch during the corresponding interval of time. Since this algorithm efficiently solves the problem of high-quality final gathering for animated scenes, which our approach does not address, both methods can be seen as complementary. In particular, the algorithm of Martin et al. does not make use of a cluster hierarchy during the first pass, which limits its application to very simple scenes. However, we show in Section 3.3 how to solve this particular issue. As a consequence, coupling both approaches seems promising.

2.2. Previous work on the space-time hierarchical radiosity algorithm

In order to reduce the cost of diffuse global illumination computations for animations, we introduced in a previous
© The Eurographics Association and Blackwell Publishing Ltd 2004
C. Damez et al. / Space-Time Hierarchical Radiosity 131
Figure 2: Example of temporal discontinuities. Visualizations of the scene at t = 1/2 − ε, at t = 1/2 + ε and the difference image. The illumination of the entire scene has been modified in a single frame interval.
equation (1) is formally equivalent to the classical radiosity equation in the static case. Therefore, any algorithm capable of solving the latter can probably be extended in a straightforward manner to solve the former. In particular, we shall see that we can derive a finite element formulation similar to that of standard radiosity [17,20].

3.2. Discretization

Equation (1) is a Fredholm equation of the second kind, and can be discretized by the Galerkin method. We want to compute an approximation $\tilde{B}$ of $B$ in a finite-dimensional function space spanned by an orthogonal basis of functions $(u_i)_{1 \le i \le N}$. Therefore, we can express $\tilde{B}$ as a linear combination of the $u_i$:

$$\tilde{B} = \sum_{j=1}^{N} B_j u_j.$$

The Galerkin condition [17] defines the approximation $\tilde{B}$ so that the residual function

$$r(X) = \tilde{B}(X) - E(X) - \int_{Y \in (S \times T)} \tilde{B}(Y) K(X, Y)\, dY$$

is orthogonal to all the $u_i$. In such a case, the coefficients $B_j$ that define $\tilde{B}$ are solutions of the following linear system:

$$(I - M) B = E, \qquad (4)$$

where $I$ is the identity matrix, the vector $E$ is defined by

$$\forall i \in [1, N] \quad E_i = \frac{\langle E, u_i \rangle}{\| u_i \|^2}$$

and the matrix coefficients are defined by:

$$\forall (i, j) \in [1, N]^2 \quad M_{i,j} = \frac{\left\langle \int_Y K(\cdot, Y)\, u_j(Y)\, dY,\; u_i \right\rangle}{\| u_i \|^2} \qquad (5)$$

The simplest possible choice of a function basis is piecewise constant functions. As discussed in Section 2.2, this choice proves unsatisfying in certain cases, where it causes noticeable temporal discontinuities in indirect lighting. As a consequence, we propose instead functions that are piecewise constant in space, and piecewise polynomial in time. To a given element $k$ of our mesh, defined as the cross product of polygon $P_k$ and time interval $T_k$, correspond $L$ basis functions $u_{Lk+i}(p, t)$ with $0 \le i < L$, equal to $0$ when $(p, t)$ is outside $(P_k \times T_k)$, and to $\psi_i^k(t)$ otherwise. Since the $u_i$ have to form an orthogonal basis, the $\psi_i^k$ are the restriction of the first $L$ Legendre polynomials to the time interval $T_k = [\alpha_k, \beta_k]$, i.e.

$$\psi_0^k(t) = 1$$
$$\psi_1^k(t) = \sqrt{3} \left( 2\, \frac{t - \alpha_k}{\beta_k - \alpha_k} - 1 \right)$$
$$\ldots$$

Therefore, the variations of radiosity of each element $k$ in the mesh will be described by $L$ unknown coefficients $B_{Lk}, \ldots, B_{Lk+L-1}$. Furthermore, from equation (5), it can be derived that each pair $(k, l)$ of elements in our mesh corresponds to a $L \times L$ block $\rho_k I_{k,l}$ in the matrix $M$, where $I_{k,l}$ is the following interaction matrix:

$$I_{k,l} = \frac{1}{\| P_k \| (\beta_k - \alpha_k)} \int_{T_k \cap T_l} G_{k,l}(t) \int_{P_k} \int_{P_l} K(p, q, t)\, dq\, dp\, dt \qquad (6)$$

and the matrix $G_{k,l}$ is defined as:

$$G_{k,l}(t) = \begin{pmatrix} \psi_k^0(t)\, \psi_l^0(t) & \cdots & \psi_k^0(t)\, \psi_l^{L-1}(t) \\ \vdots & \ddots & \vdots \\ \psi_k^{L-1}(t)\, \psi_l^0(t) & \cdots & \psi_k^{L-1}(t)\, \psi_l^{L-1}(t) \end{pmatrix}$$

The interaction matrix extends the traditional notion of form factor used in the classical static radiosity algorithms.

3.3. Hierarchical solution of the discrete equation

We are using piecewise polynomial functions to describe the variations of radiosity in time. Therefore, the resulting algorithm is an extension of the Wavelet Radiosity algorithm [21,22], using the Haar basis over the spatial dimension and Alpert's $M_L$ basis [23] over the time dimension.

Since our mesh elements are defined both by their geometry and their time interval, they can be subdivided either in space or in time.
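The temporal basis and the Galerkin projection above can be illustrated numerically. The sketch below is our own illustration (function names such as `temporal_basis` and `project` are not from the paper): it builds the first two Legendre basis functions restricted to an interval [α, β] and projects a time-varying signal onto them with a Gauss–Legendre quadrature, as done for the interaction integrals.

```python
import numpy as np

def temporal_basis(alpha, beta):
    """First two Legendre basis functions restricted to [alpha, beta],
    orthonormal for the inner product <f, g> = 1/(beta-alpha) * int f g dt."""
    psi0 = lambda t: np.ones_like(np.asarray(t, dtype=float))
    psi1 = lambda t: np.sqrt(3.0) * (2.0 * (t - alpha) / (beta - alpha) - 1.0)
    return psi0, psi1

def project(f, alpha, beta, n_quad=4):
    """Coefficients of f on (psi0, psi1), via Gauss-Legendre quadrature."""
    x, w = np.polynomial.legendre.leggauss(n_quad)   # nodes/weights on [-1, 1]
    t = 0.5 * (beta - alpha) * x + 0.5 * (beta + alpha)
    psi0, psi1 = temporal_basis(alpha, beta)
    # 1/(beta-alpha) * int f(t) psi_j(t) dt, mapped onto [-1, 1]
    return 0.5 * np.sum(w * f(t) * psi0(t)), 0.5 * np.sum(w * f(t) * psi1(t))

alpha, beta = 1.0, 3.0
psi0, psi1 = temporal_basis(alpha, beta)
f = lambda t: 2.0 * t + 1.0          # a radiosity varying linearly in time
g0, g1 = project(f, alpha, beta)
recon = lambda t: g0 * psi0(t) + g1 * psi1(t)
```

With $L = 2$, a linearly varying radiosity is represented exactly by its two coefficients, which is why this basis suppresses the staircase artifacts produced by piecewise constant (Haar) functions in time.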
Figure 4: The cluster in the left-hand image (represented in 2D for simplicity) can be subdivided either spatially (center),
or temporally (on the right). Spatial subdivision builds one new hierarchical level of clusters around the surfaces. Temporal
subdivision duplicates the surfaces and subdivides the corresponding time interval.
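The two subdivision options of Figure 4 can be sketched as a minimal data structure, together with the space-versus-time decision rule used by the paper's refinement oracle (a hypothetical sketch; all names and fields are ours):

```python
import statistics
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SpaceTimeElement:
    surfaces: List[str]                    # stand-in for the geometric content
    interval: Tuple[float, float]          # time interval (alpha, beta)
    children: List["SpaceTimeElement"] = field(default_factory=list)

    def subdivide_time(self):
        # Midpoint split of the time interval; the geometry is duplicated.
        a, b = self.interval
        mid = 0.5 * (a + b)
        self.children = [SpaceTimeElement(list(self.surfaces), (a, mid)),
                         SpaceTimeElement(list(self.surfaces), (mid, b))]
        return self.children

    def subdivide_space(self, groups):
        # Split the geometric content into sub-clusters; the interval is kept.
        self.children = [SpaceTimeElement(list(g), self.interval) for g in groups]
        return self.children

def refine_in_time(errors):
    """errors[i][j]: estimated error at control time i and control point j.
    Refine in time when the average temporal variance exceeds the average
    spatial variance, in space otherwise."""
    spatial = statistics.mean(statistics.pvariance(row) for row in errors)
    temporal = statistics.mean(statistics.pvariance(col) for col in zip(*errors))
    return temporal > spatial

cluster = SpaceTimeElement(["desk", "chair", "shelf"], (0.0, 4.0))
early, late = cluster.subdivide_time()
```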
We propose to extend to space-time radiosity an oracle designed for Wavelet Radiosity in the static case [18,19,26], based on estimates of the error on the propagated energy rather than on estimates of the variation of this energy. We use a grid of control points located on the receiving element, and a set of control points in time. On these control points, at the control times, we estimate the radiosity value using two methods:

(1) by multiplying the emitters' radiosity vector by the interaction matrix corresponding to the link, and then interpolating the radiosity values at the control times,

(2) by direct integration of the radiosity on the emitter, using a quadrature.

The difference between these two values is an indication of the error made when evaluating the interaction at this point for this level of precision. The norm of these differences is used as the error on the current interaction. Refinement will occur if this norm is above the refinement threshold set by the user.

The control points and times must be carefully chosen so that they provide meaningful information. They must be different from the quadrature points and times used for the form factor computations. The number of control points and times must be higher for large receivers so that we do not miss important features. Also, placing control times at the beginning and at the end of the time interval greatly enhances temporal continuity.

Once we have made the decision to refine an interaction, we must choose between refinement in space or in time. We compute two variance estimates for the set of estimated error values on our grid of control points and times:

• An average spatial variance: for each fixed control time we compute the variance of the error values at each control point, and then take the temporal average.

• An average temporal variance: for each fixed control point we compute the variance of the error values at each control time, and then take the spatial average.

We refine the interaction in time if the average temporal variance is above the average spatial variance, and in space otherwise.

3.5. Light exchanges computations

Computing the light exchanged between two linked surfaces is straightforward. The product of the link's interaction matrix by the radiosity of the emitter is added to the radiosity of the receiver. The interaction matrix has generally been computed previously, during the refinement procedure, using simple Gaussian quadratures.

However, interactions involving one or more clusters require a special approach, based on the one described by Sillion [27]. Roughly, anisotropic emission from a cluster is approximated by going down to the surface level to estimate the directional radiant intensity exiting the cluster (Delayed Pull), and the irradiance gathered by a cluster from a given hierarchical element is distributed to all the surfaces inside the cluster immediately at gathering time, according to their orientation (Immediate Push). The specificities of the space-time hierarchical radiosity method come from the fact that the position, orientation and radiosity of the objects can change with time.

3.5.1. Emission from a cluster: Delayed pull

In the classical hierarchical radiosity algorithm, the computation of the light emitted from an object involves the computation of the form factor between the sender l and the receiver k. It is very difficult to define what the form factor should be if the sender is an anisotropic cluster. Therefore, we directly compute the irradiance emitted by the cluster to the receiver, by summing the contributions of the N surfaces contained in
the cluster l. At a given time t, point p receives from the N elements i in l the total irradiance:

$$I_{received}(p, t) = \sum_{i=1}^{N} \int_{Q_i} B_i(q, t)\, g(p, q, t)\, v(p, q, t)\, dq,$$

where the geometric configuration function is defined by:

$$g(x, y, t) = \frac{R(t) \cos\theta'}{\pi r^2}$$

and the R function is the receiver factor defined in [27] as $\cos\theta$ if the receiver is a surface and 1 if the receiver is a cluster (the orientation of the surfaces in the receiving cluster will be taken into account by the Immediate Push mechanism described in Section 3.5.2).

We approximate the received irradiance by projecting it on our function basis: the resulting approximation is a linear combination of our L basis functions:

$$\tilde{I} = \sum_{j=0}^{L-1} \lambda_j u_{Lk+j}.$$

Since the $u_i$ are orthogonal we have:

$$\lambda_j = \frac{1}{\| u_{Lk+j} \|^2} \left\langle I_{received}, u_{Lk+j} \right\rangle = \frac{\sum_i \int_{T_k \cap T_i} \int_{P_k} \int_{Q_i} g(p, q, t)\, v(p, q, t)\, dq\, dp\; B_i(t)\, \psi_k^j(t)\, dt}{A_k (\beta_k - \alpha_k)}.$$

Computing the above integral is costly, as it involves a number of visibility estimations proportional to the number of surfaces in the cluster. Therefore we approximate it by factoring out the visibility, averaged over the sending cluster:

$$\lambda_j = \frac{\sum_i \int_{T_k \cap T_i} \int_{P_k} \int_{Q_i} g(p, q, t)\, dq\, dp\; B_i(t)\, \psi_k^j(t)\, \tilde{V}(t)\, dt}{A_k (\beta_k - \alpha_k)},$$

where

$$\tilde{V}(t) = \frac{1}{A_k A_l} \int_{P_k} \int_{P_l} v(x, y, t)\, dx\, dy.$$

We compute the $\lambda_j$ and $\tilde{V}(t)$ using a Gaussian quadrature. Since the cost of evaluating the approximate visibility must not depend on the number N of surfaces inside the cluster l, we place the quadrature points independently of the surface positions, inside the cluster's bounding box.

3.5.2. Reception inside a cluster: Immediate push

The reception inside a cluster obeys the immediate push principle: the irradiance received at the cluster level is immediately dispatched to all surfaces inside the cluster, where it is multiplied by the cosine of the angle between the normal of the surface and the direction of the incoming radiance. The origin of the incoming radiance is assumed to be the center of the emitter, whether a cluster or a surface.

Since both the cluster and the sender may be moving, the receiver factor is time-dependent. We need to project it on our wavelet basis. Let us assume a cluster k has received an irradiance $I_{received}$. This irradiance is distributed to each surface i in the cluster k according to its orientation:

$$I_i = I_{received}(t) \cos\theta_i(t)$$

$I_i$ is then reprojected on the wavelet basis for the time interval $T_i$ over which the hierarchical element i is defined. The resulting approximate irradiance is then:

$$\tilde{I}_i = \sum_{j=0}^{L-1} \gamma_j \psi_i^j$$

and the $\gamma$ coefficients are:

$$\gamma_j = \frac{1}{\| \psi_i^j \|^2} \left\langle I_i \,\middle|\, \psi_i^j \right\rangle = \frac{1}{\beta_i - \alpha_i} \int_{T_i \cap T_l} I_{received}(t) \cos\theta_i(t)\, \psi_i^j(t)\, dt.$$

These integrals are once again approximated using a Gaussian quadrature.

Our method contains two successive approximations: we have separately computed the irradiance received at the cluster level, which was time-dependent, projected it onto the function basis, then dispatched it to the surfaces, taking into account the surface movement, and reprojected it on the function basis for the receiving surface. This double approximation is consistent with the clustering approach. If the refinement oracle decides that we can compute an interaction at the cluster level, then this approximation should be sufficient. Spending more computation time to find a better approximation would impair the hierarchical nature of the algorithm and would reduce its performance.

3.6. Push–pull traversal

After the irradiances have been gathered across all links in the scene, a traversal of the complete hierarchy is necessary to maintain coherence between the different hierarchical levels. First, irradiance contributions computed at various levels of the hierarchy have to be pushed down to the lowest level of the structure and summed along the way. There, the radiosities of each leaf are computed, and these radiosities are then progressively pulled up the hierarchy and averaged to compute the correct radiosity representation corresponding to each hierarchical level.

In the case of Wavelet Radiosity [21], this process is slightly more complicated than it is for static Hierarchical Radiosity, since we need to define how to combine the coefficients describing the radiosity variations, to convert them from one hierarchical level to the other. Remember from Section 3.2 that our multi-resolution basis functions are cross products of the scale functions of the Haar basis over space, and scale functions of the $M_L$ basis over time. Since we use
a very simple midpoint subdivision scheme when subdividing elements in time, the coefficients that have to be pushed down or pulled up during this traversal can be computed using simple linear transformations, which are independent of the element or the hierarchical level. Both linear transforms are referred to as the two-scale relationship [22], and are determined by two $L \times L$ matrices P and Q. When $L = 2$ (linear wavelets), those matrices are:

$$P = \begin{pmatrix} 1 & 0 \\ -\frac{\sqrt{3}}{2} & \frac{1}{2} \end{pmatrix} \qquad Q = \begin{pmatrix} 1 & 0 \\ \frac{\sqrt{3}}{2} & \frac{1}{2} \end{pmatrix}.$$

When pushing down the total irradiance I (remember that this is an L-dimensional vector) from a given element split in time to each of its two children, the corresponding irradiances $I'$ and $I''$ to be transmitted to its first and second children are given by the following linear transform:

$$I' = {}^t\!P\, I \quad \text{and} \quad I'' = {}^t\!Q\, I.$$

Respectively, when pulling up, the average radiosity B of an element can be computed from the radiosities of its two children $B'$ and $B''$:

$$B = \frac{1}{2} \left( P B' + Q B'' \right)$$

needing change would be the Push–Pull matrices we gave in Section 3.6. However, this problem does not arise in the temporal dimension. Moreover, the lower dimensionality makes the added cost of the use of wavelets lower in the temporal dimension than it is in the spatial dimension. Therefore, we decided to limit our use of wavelets to the description of the temporal variations of radiosity. In our implementation, our function basis was composed of linearly varying functions (the M2 basis). This choice proved sufficient in practice to significantly reduce the temporal discontinuities (see Section 4).

In order to provide a smooth appearance for patches in our example animations, we applied a simple linear interpolation over the polygons as a postprocess, when traversing the space-time mesh to generate the images. Though it noticeably increases the visual appeal of the results, this postprocess doesn't improve the precision of the solution. Much better reconstruction methods have been proposed for static scenes, such as final gathering [30,31], and can be applied here on an image-per-image basis. Moreover, Martin et al. have recently proposed a final gathering acceleration method for animated scenes [15], whose coupling with our approach seems promising.
Figure 5: Variation of the radiosity function at the center of the highlighted element during the animation.
The refinement can be performed as a traversal of the hierarchy that would correspond to a depth-first-order traversal of the binary tree of time intervals. Only the elements whose time interval contains the one currently visited should be kept in memory. Disk accesses would only take place when moving from one time interval to the other.

We ran an experiment to estimate the corresponding gain in memory that could be expected from such a traversal. For the SPOT scene (see Sections 2.2 and 4.1), we used 2^5 − 1 = 31 time interval buckets to sort our elements (one for each time interval corresponding to the first five subdivision levels). The maximum total cost of the portion of the hierarchy that needs to be kept in memory is 40 MB, whereas more than 450 MB are required when we are keeping everything in RAM (cf. Table 1). In such a case, file accesses should not reduce the performance of our algorithm excessively: the added cost of reading and writing 31 files, each about 15 MB, should be reasonable, since an iteration on this scene already requires 10–20 minutes.

sweeping movement of spotlights over walls painted in different colors causes important changes in the indirect illumination of the scene. In particular, strong color bleeding effects can be observed moving on the ceiling and the floor of the scene.

As explained in Section 2.2, this scene was purposely designed as a “worst-case scenario” for the space-time hierarchical radiosity algorithm, in order to exhibit strong temporal discontinuities. When using a piecewise constant function basis to describe the variation of radiosity in time, the indirect lighting effects in this scene are extremely discontinuous. For example, the color bleeding patches seem to be updated only every second or so. The amplitude of these discontinuities is shown in the radiosity variation plot of Figure 5. It can be clearly seen that the greatest discontinuity is located at the middle of the animation, then at the first and third quarters. The magnitude of the largest discontinuity is about 40% of the time-average radiosity of this patch, which makes it quite noticeable. Smaller discontinuities can be observed at other even subdivisions of the time interval.
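The bucket count and the memory behaviour of the depth-first traversal proposed above can be reproduced with a small sketch (our illustration, not the paper's implementation): a binary time-interval tree with five subdivision levels has 2^5 − 1 = 31 nodes, while a depth-first visit only ever keeps one root-to-leaf chain of ancestors resident.

```python
def interval_tree_stats(levels):
    """Depth-first traversal of the binary tree of time intervals, down to
    `levels` subdivision levels. Tracks the total number of interval buckets
    and how many must stay resident at once: only the intervals containing
    the one currently visited (i.e. its ancestors)."""
    total = 0
    max_resident = 0

    def visit(depth, resident):
        nonlocal total, max_resident
        total += 1
        resident += 1                      # the current interval is in memory
        max_resident = max(max_resident, resident)
        if depth + 1 < levels:
            visit(depth + 1, resident)     # first half of the interval
            visit(depth + 1, resident)     # second half of the interval

    visit(0, 0)
    return total, max_resident

buckets, resident = interval_tree_stats(5)   # the first five subdivision levels
```

At most five of the 31 buckets are resident simultaneously, which is why streaming the others to disk can cut the working set substantially.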
Figure 6: Comparison of temporal discontinuities at t = 1/2. Darker colors indicate a higher discontinuity (arbitrary units). Left: hierarchical radiosity; right: M2 wavelets.
Table 1: Performance comparison on the SPOTS scene, between frame-by-frame Hierarchical Radiosity, our algorithm using the Haar basis, and our algorithm using the M2 basis (columns: computation time for direct lighting, indirect lighting and total per image; memory used, in MB).
The following comments can be made about these results:

• Though this scene is geometrically quite simple, the speedup factor obtained, when compared to a frame-by-frame computation, is about 6. (Note that this acceleration factor is only about 2 if we only take into account the time needed to compute the direct illumination.) Our algorithm's performance on such scenes, where the indirect lighting is dominant and dramatically changing over time, is therefore satisfying.

• The memory consumption when using the M2 basis is 15% lower than when using the Haar basis, in spite of the added storage cost of the second radiosity coefficient and the interaction matrices. The animation has also been computed slightly faster. This is due to the fact that fewer subdivisions in time are needed to obtain a precise enough representation of the variations of radiosity in time, resulting in a faster refinement and a lighter mesh.

4.2. Validation of the Clustering Approach

We have tested our algorithm on scenes composed of several thousands of input polygons (see Figure 7). For such scenes, Hierarchical Radiosity computations without the use of clustering would have been extremely long, because of the quadratic cost of the initial linking stage.

The first of our three test animations takes place in a small room with some furniture (a couple of desks, chairs, pens, etc.). It is lit by four area light sources. The bookshelf, against the wall, falls to the floor. The animation is 4 seconds long, and is composed of 100 frames. The input geometry is composed of 7,200 polygons. The second animation takes place in a large library hall with several desks separated by rows of bookshelves. This scene is lit by numerous area light sources. A character is moving through the hall. The animation is 20 seconds long and is composed of 500 frames. There are about 35,000 input surfaces. The third animation is somewhat similar to the test scene we use in Section 4.1. We replaced the boxes by more complex objects. The resulting scene is composed of approximately 30,000 polygons and is 24 seconds long.

Table 2 summarizes our experimental results for our three test scenes. In this table, we compare the resources necessary to compute the animations when using the space-time hierarchical radiosity algorithm and when performing hierarchical radiosity with clustering frame-by-frame, providing the same image quality. All timings have been observed on a 300 MHz MIPS R12000.

The more elements the mesh is composed of, the more advantageous the hierarchical approach is. Since it makes it possible to compute more complicated animations, clustering really allows us to benefit fully from the hierarchical nature of
© The Eurographics Association and Blackwell Publishing Ltd 2004
C. Damez et al. / Space-Time Hierarchical Radiosity 139
Table 2: Comparative results for the use of clustering: we compare computation time and memory use of our algorithm to the time and memory
needed to compute the same animation frame-by-frame with classical Hierarchical Radiosity with Clustering
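The trade-off reported for the M2 basis (fewer temporal subdivisions than with the Haar basis, hence a lighter mesh) can be illustrated with a small sketch. This is not the paper's implementation; it merely subdivides a smooth, hypothetical radiosity-over-time curve adaptively under both kinds of temporal bases and counts the leaf intervals:

```python
import math

def leaves(f, t0, t1, tol, order):
    """Count leaf intervals of an adaptive subdivision of [t0, t1].
    order=0: piecewise-constant (Haar-like) fit, value taken at the midpoint.
    order=1: piecewise-linear fit through the interval endpoints."""
    ts = [t0 + u * (t1 - t0) for u in (0.0, 0.25, 0.5, 0.75, 1.0)]
    if order == 0:
        c = f(0.5 * (t0 + t1))
        err = max(abs(f(t) - c) for t in ts)
    else:
        a, b = f(t0), f(t1)
        err = max(abs(f(t) - (a + (b - a) * (t - t0) / (t1 - t0))) for t in ts)
    if err <= tol:
        return 1
    mid = 0.5 * (t0 + t1)
    return leaves(f, t0, mid, tol, order) + leaves(f, mid, t1, tol, order)

# A smoothly varying "radiosity over time" curve (hypothetical):
f = lambda t: 0.5 + 0.4 * math.sin(2.0 * math.pi * t)

haar = leaves(f, 0.0, 1.0, 1e-2, order=0)
linear = leaves(f, 0.0, 1.0, 1e-2, order=1)
print(haar, linear)  # the linear basis needs far fewer intervals
```

For a smooth curve, the piecewise-constant count grows like 1/tol while the piecewise-linear count grows like 1/sqrt(tol), which is consistent with the lighter mesh reported above.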
the simultaneous use of a piecewise-linear wavelet basis in the time dimension and of an adequate space-time refinement oracle.

Promising directions for future research include:

• The derivation of a space-time final gathering approach, adapting the one proposed by Martín, Pueyo and Tost [15].
• The implementation and extensive testing of disk caching schemes such as the one suggested in Section 3.7.2.
• The parallelization of this algorithm. This should be straightforward on a shared memory architecture [32] but will certainly prove more difficult on a cluster of PCs.
• Experiments with alternative wavelet bases in the time dimension, for example using higher-order polynomials.
• The extension of our algorithm to nondiffuse scenes, using a unified mesh-based particle shooting approach [9].
• The construction of a refinement criterion using human perception-based animation quality metrics [13].

References

1. P. Hanrahan, D. Salzman and L. Aupperle. A rapid hierarchical radiosity algorithm. In Computer Graphics (ACM SIGGRAPH ’91 Proceedings), vol. 25, pp. 197–206. 1991.
2. C. Damez and F. Sillion. Space-time hierarchical radiosity. In Proceedings of the 10th Eurographics Workshop on Rendering, pp. 235–246. 1999.
3. C. M. Goral, K. E. Torrance, D. P. Greenberg and B. Battaile. Modelling the interaction of light between diffuse surfaces. In Computer Graphics (ACM SIGGRAPH ’84 Proceedings), vol. 18, pp. 212–222. 1984.
4. E. Shaw. Hierarchical radiosity for dynamic environments. Computer Graphics Forum, 16(2):107–118, 1997.
5. G. Drettakis and F. Sillion. Interactive update of global illumination using a line-space hierarchy. In Computer Graphics (ACM SIGGRAPH ’97 Proceedings), pp. 57–64. 1997.
6. B. Walter, G. Drettakis and S. Parker. Interactive rendering using the render cache. In Proceedings of the 10th Eurographics Workshop on Rendering, pp. 235–246. 1999.
7. P. Tole, F. Pellaccini, B. Walter and D. P. Greenberg. Interactive global illumination in dynamic scenes. ACM Transactions on Graphics (SIGGRAPH ’02 Proceedings), 21(3), pp. 537–546. 2002.
8. A. Keller. Instant radiosity. In Computer Graphics (ACM SIGGRAPH ’97 Proceedings), pp. 49–56. 1997.
9. X. Granier and G. Drettakis. Incremental updates for rapid glossy global illumination. In Computer Graphics Forum (Proceedings of Eurographics 2001), vol. 20, pp. 268–277. 2001.
10. I. Wald, T. Kollig, C. Benthin, A. Keller and P. Slusallek. Interactive global illumination. In Proceedings of the 13th Eurographics Workshop on Rendering. 2002.
11. K. Dmitriev, S. Brabec, K. Myszkowski and H.-P. Seidel. Interactive global illumination using selective photon tracing. In Proceedings of the 13th Eurographics Workshop on Rendering. 2002.
12. G. Besuievsky and X. Pueyo. Animating radiosity environments through the multi-frame lighting method. Journal of Visualization and Computer Animation, 12:93–106, 2001.
13. K. Myszkowski, T. Tawara, H. Akamine and H.-P. Seidel. Perception-guided global illumination solution for animation rendering. In Computer Graphics (ACM SIGGRAPH ’01 Proceedings), pp. 221–230. 2001.
14. C. Damez, K. Dmitriev and K. Myszkowski. State of the art in global illumination for interactive applications and high-quality animations. Computer Graphics Forum, 22(1):55–77, 2003.
15. I. Martín, X. Pueyo and D. Tost. Frame-to-frame coherent animation with two-pass radiosity. IEEE Transactions on Visualization and Computer Graphics, 9(1):70–84, 2003.
16. B. Smits, J. Arvo and D. Greenberg. A clustering algorithm for radiosity in complex environments. In Computer Graphics (ACM SIGGRAPH ’94 Proceedings), pp. 435–442. 1994.
17. F. Sillion. Clustering and volume scattering for hierarchical radiosity calculations. In Proceedings of the 5th Eurographics Workshop on Rendering, pp. 105–117. 1994.
18. P. Bekaert and Y. D. Willems. Error control for radiosity. In Proceedings of the 7th Eurographics Workshop on Rendering, pp. 153–164. 1996.
19. P. Bekaert and Y. D. Willems. Hirad: A hierarchical higher order radiosity implementation. In Proceedings of the Twelfth Spring Conference on Computer Graphics (SCCG ’96), Bratislava, Slovakia, Comenius University Press, June 1996.
20. M. F. Cohen and J. R. Wallace. Radiosity and Realistic Image Synthesis. Academic Press Professional, Boston, MA, 1993.
21. S. J. Gortler, P. Schröder, M. F. Cohen and P. Hanrahan. Wavelet radiosity. In Computer Graphics (ACM SIGGRAPH ’93 Proceedings), pp. 221–230. 1993.
22. P. Schröder, S. J. Gortler, M. F. Cohen and P. Hanrahan. Wavelet projections for radiosity. In Fourth Eurographics Workshop on Rendering, pp. 105–114. 1993.
23. B. K. Alpert. A class of bases in L² for the sparse representation of integral operators. SIAM Journal on Mathematical Analysis, 24(1):246–262, 1993.
24. P. H. Christensen, D. Lischinski, E. J. Stollnitz and D. H. Salesin. Clustering for glossy global illumination. ACM Transactions on Graphics, 16(1):3–33, 1997.
25. J.-M. Hasenfratz, C. Damez, F. Sillion and G. Drettakis. A practical analysis of clustering strategies for hierarchical radiosity. In Computer Graphics Forum (Proc. Eurographics ’99), vol. 18, pp. C221–C232. Sept. 1999.
26. F. Cuny, L. Alonso and N. Holzschuch. A novel approach makes higher order wavelets really efficient for radiosity. In Computer Graphics Forum (Proc. Eurographics 2000), vol. 19, pp. C99–C108. 2000.
27. F. Sillion. A unified hierarchical algorithm for global illumination with scattering volumes and object clusters. IEEE Transactions on Visualization and Computer Graphics, 1(3):240–254, 1995.
28. A. Willmott and P. Heckbert. An empirical comparison of progressive and wavelet radiosity. In J. Dorsey and P. Slusallek (eds), Rendering Techniques ’97 (Proceedings of the Eighth Eurographics Workshop on Rendering), New York, NY, pp. 175–186, Springer, Wien. 1997. ISBN 3-211-83001-4.
29. M. Stamminger, H. Schirmacher, P. Slusallek and H.-P. Seidel. Getting rid of links in hierarchical radiosity. Computer Graphics Journal (Proc. Eurographics ’98), 17(3):C165–C174, 1998.
30. M. Reichert. A Two-Pass Radiosity Method for Transmitting and Specularly Reflecting Surfaces. M.Sc. thesis, Cornell University, 1992.
31. A. Scheel, M. Stamminger and H.-P. Seidel. Grid based final gather for radiosity on complex clustered scenes. Computer Graphics Forum, 21(3):547–556, 2002.
32. F. Sillion and J.-M. Hasenfratz. Efficient parallel refinement for hierarchical radiosity on a DSM computer. In Proceedings of the Third Eurographics Workshop on Parallel Graphics and Visualisation, pp. 61–74. 2000. http://www-imagis.imag.fr/Membres/Jean-M..Hasenfratz/PUBLI/EGWPGV00.html.
CHAPTER 2. MULTI-SCALE MODELING OF LIGHTING
We propose an automatic method for finding symmetries of 3D shapes, that is, isometric transforms which leave a shape globally
unchanged. These symmetries are deterministically found through the use of an intermediate quantity: the generalized moments.
By examining the extrema and spherical harmonic coefficients of these moments, we recover the parameters of the symmetries
of the shape. The computation for large composite models is made efficient by using this information in an incremental algorithm
capable of recovering the symmetries of a whole shape using the symmetries of its subparts. Applications of this work range from
coherent remeshing of geometry with respect to the symmetries of a shape to geometric compression, intelligent mesh editing,
and automatic instantiation.
Categories and Subject Descriptors: I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling—Curve, surface,
solid and object representations
General Terms: Algorithms
1. INTRODUCTION
Many shapes and geometrical models exhibit symmetries: isometric transforms that leave the shape
globally unchanged. Using symmetries, one can manipulate models more efficiently through coherent
remeshing or intelligent mesh editing programs. Other potential applications include model compres-
sion, consistent texture-mapping, model completion, and automatic instantiation.
The symmetries of a model are sometimes made available by the creator of the model and represented
explicitly in the file format the model is expressed in. Usually, however, this is not the case, and auto-
matic translations between file formats commonly result in the loss of this information. For scanned
models, symmetry information is also missing by nature.
In this article, we present an algorithm that automatically retrieves symmetries in a geometrical
model. Our algorithm is independent of the tessellation of the model; in particular, it does not assume that the model has been tessellated in a manner consistent with the symmetries we attempt to identify,
and it works well on noisy objects such as scanned models. Our algorithm uses a new tool, the generalized
moment functions. Rather than computing these functions explicitly, we directly compute their spherical
harmonic coefficients, using a fast and accurate technique. The extrema of these functions and their
spherical harmonic coefficients enable us to deterministically recover the symmetries of a shape.
For composite shapes, that is, shapes built by assembling simpler structures, we optimize the compu-
tation by applying the first algorithm to the subparts, then iteratively building the set of symmetries of
Authors’ addresses: A. Martinet, C. Soler, N. Holzschuch, F. X. Sillion, ARTIS, INRIA Rhône-Alpes, Saint Ismier, France; email: [email protected].
© 2006 ACM 0730-0301/06/0400-0439 $5.00
ACM Transactions on Graphics, Vol. 25, No. 2, April 2006, Pages 439–464.
440 • A. Martinet et al.
the composite shape, taking into account both the relative positions of the subparts and their relative
orientations.
We envision many applications for our work, including geometric compression, consistent mesh edit-
ing, and automatic instantiation.
This article is organized as follows. In the following section, we review previous work on identifying
geometric symmetries on 2D and 3D shapes. Then in Section 3, we present an overview of the symmetry-
detection problem and the quantities used in our algorithms. In Section 4, we introduce the generalized
moments and our method to compute them efficiently; in Section 5, we present our algorithm for
identifying symmetries of a shape. The extension of this algorithm to composite shapes is then presented
in Section 6. Finally, in Section 7, we show various applications of our algorithm.
2. RELATED WORK
Early approaches to symmetry detection focused on the 2D problem. Atallah [1985], Wolter et al. [1985] and Highnam [1985] present methods to reduce the 2D-symmetry detection problem to a 1D pattern matching problem for which efficient solutions are known [Knuth et al. 1977]. Their algorithms efficiently
detect all possible symmetries in a point set but are highly sensitive to noise.
Identifying symmetries for 3D models is much more complex, and little research on this subject has
been published. Jiang and Bunke [1991] present a symmetry-detection method, restricted to rotational
symmetry, based on a scheme called generate and test, first finding hypothetical symmetry axes, then
verifying these assumptions. This method is based on a graph representation of a solid model and uses
graph theory. The dependency between this graph representation and the mapping between points
makes their method highly dependent on the topology of the mesh and sensitive to small modifications
of the object geometry. Brass and Knauer [2004] provide a model for general 3D objects and give an
algorithm to test congruence or symmetry for these objects. Their approach is capable of retrieving
symmetry groups of an arbitrary shape but is also topology-dependent since it relies on a mapping
between points of the model. Starting from an octree representation, Minovic et al. [1993] describe an
algorithm based on octree traversal to identify symmetries of a 3D object. Their algorithm relies on
PCA to find the candidate axis; PCA, however, fails to identify axes for a large class of objects, including
highly symmetric objects such as regular solids.
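This PCA failure mode is easy to reproduce: for a regular solid such as the cube, the covariance matrix of the geometry is a multiple of the identity, so every direction is an eigenvector and no candidate axis stands out. A small numpy illustration (ours, not code from the cited papers):

```python
import numpy as np

# The eight vertices of a cube centered at its center of gravity:
verts = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                 dtype=float)

# PCA: eigen-decomposition of the covariance matrix of the geometry.
cov = verts.T @ verts / len(verts)
eigvals = np.linalg.eigvalsh(cov)
print(eigvals)  # [1. 1. 1.]: degenerate, so PCA yields no preferred axis
```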
All these methods try to find strict symmetries for 3D models. As a consequence, they are sensitive
to noise and data imperfections. Zabrodsky et al. [1995] define a measure of symmetry for nonperfect
models, defined as the minimum amount of work required to transform a shape into a symmetric shape.
This method relies on the ability to first establish correspondence between points, a very restrictive
precondition.
Sun and Sherrah [1997] use the Extended Gaussian Image to identify symmetries by looking at
correlations in the Gaussian image. As in Minovic et al. [1993], they rely on PCA to identify potential
axes of symmetry, thus possibly failing on highly symmetric objects. More recently, Kazhdan et al. [2004]
introduced the symmetry descriptors, a collection of spherical functions that describe the measure of a
model’s rotational and reflective symmetry with respect to every axis passing through the center of mass.
Their method provides good results in shape identification but involves a surface integration for each sampled direction; this surface integration is carried out on a voxel grid. Using the symmetry descriptors
to identify symmetries requires an accurate sampling in all directions, making their algorithm very
costly for an accurate set of results. In contrast, our algorithm only computes a deterministic small
number of surface integrals, which are performed on the shape itself, and still provides very accurate
results. Effective complexity comparisons will be given in Section 8.
Accurate Detection of Symmetries in 3D Shapes • 441
Fig. 1. Mirror symmetries and rotational symmetries found by our algorithm for a cube (for clarity, not all elements are repre-
sented).
3. OVERVIEW
Considering a surface S, the symmetries of S are the isometric transforms which map S onto itself, in
any coordinate system centered on its center of gravity. The symmetries of a shape form a group under function composition, with the identity as its neutral element. For a given shape, the study of such a group relates to the domain of mathematical crystallography [Prince 2004].
The group of the cube, for instance, contains 48 elements (see Figure 1): the identity, eight 3-fold rotations around 4 possible axes, nine 4-fold rotations around 3 possible axes, six 2-fold rotations around 6 possible axes, nine mirror symmetries, and fifteen other elements obtained by composing rotations and mirror symmetries.
Studying the group of isometries in $\mathbb{R}^3$ shows that, for a given isometry $I$, there always exists an orthonormal basis $(X, Y, Z)$ in which the matrix of $I$ takes the following form:

$$I(\lambda, \alpha) = \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix} \quad\text{with}\quad \lambda = \pm 1,\ \alpha \in [0, 2\pi[$$

As suggested by the example of the cube, this corresponds to 3 different classes of isometries: rotations, mirror symmetries, and their composition, depending on whether λ is positive and/or α ≡ 0 (mod π). Finding a symmetry of a shape thus resolves into finding a vector X (which we call the axis of the isometry) and an angle α (which we call the angle of the isometry) such that I(λ, α) maps this shape onto itself.
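For concreteness, the canonical matrix above can be written down directly. A minimal numpy sketch (ours; the function names are illustrative) builds it and names the three classes:

```python
import numpy as np

def isometry(lam, alpha):
    """Matrix of an isometry of R^3 in an orthonormal basis (X, Y, Z)
    aligned with its axis X; lam = +1 gives a rotation of angle alpha
    around X, lam = -1 composes it with a mirror symmetry."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[lam, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

def classify(lam, alpha):
    """Name the three classes of isometries described in the text."""
    if lam == 1:
        return "identity" if alpha % (2.0 * np.pi) == 0.0 else "rotation"
    return "mirror symmetry" if alpha % (2.0 * np.pi) == 0.0 else "rotation + mirror"

# Every such matrix is orthogonal, with determinant lam:
M = isometry(-1, np.pi / 3.0)
print(np.allclose(M @ M.T, np.eye(3)), np.isclose(np.linalg.det(M), -1.0))  # True True
```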
However, finding all symmetries of a shape is much more difficult than simply checking whether
a given transform actually is a symmetry. In particular, the naive approach that would consist of
checking as many sampled values of (X, λ, α) as possible to find a symmetry is far too costly. We thus
need a deterministic method for finding good candidates.
Our approach to finding symmetries is to use intermediate functions whose set of symmetries is a superset of the set of symmetries of the shape itself, but for which computing the symmetries is
much easier. By examining these functions, we will derive in Section 5 a deterministic algorithm
which finds a finite number of possible candidates for X, λ, and α. Because some unwanted triplets
of values will appear during the process, these candidates are then checked back on the original
shape. Choosing a family of functions which fulfill these requirements is easy. More difficult is the
task of finding such functions for which computing the symmetries can be done both accurately and
efficiently.
Inspired by the work on principal component analysis [Minovic et al. 1993], we introduce the gener-
alized moment functions of the shape for this purpose. These functions will be the topic of Section 4.
These functions, indeed, have the same symmetries as the shape itself plus a small number of extra
candidates. Furthermore, we propose an elegant framework based on spherical harmonics to accurately
and efficiently find their symmetries.
A second contribution of this article is to extend the proposed algorithm into a constructive algorithm
which separately computes the symmetries of subcomponents of an object (using the first method), and then combines this information to compute the symmetries of the whole composite shape. This constructive algorithm proves to be more accurate in some situations, and more efficient when it is possible to decompose an object according to its symmetries. It is presented in Section 6.
4. GENERALIZED MOMENTS
In this section, we introduce a new class of functions: the generalized moments of a shape. We then
show that these functions have at least the same symmetries as the shape itself and that their own
symmetries can be computed in a very efficient way.
4.1 Definition
For a surface S in a 3-dimensional domain, we define its generalized moment of order 2p in direction ω by

$$M^{2p}(\omega) = \int_{s \in S} \|s \times \omega\|^{2p}\, ds. \qquad (1)$$

In this definition, s is a vector which links the center of gravity of the shape (placed at the origin) to a point on the surface, and ds is thus an infinitesimal surface element. $M^{2p}$ itself is a directional function. It should be noted that, considering S to have some thickness dt, the expression $M^{2}(\omega)\,dt$ (i.e., the generalized moment of order 2) corresponds to the moment of inertia of the thin shell S along ω, hence the name of these functions. Furthermore, the choice of an even exponent and a cross product will lead to very interesting properties.
This theorem implies that the axes of the symmetries of a shape are to be found in the intersection
of the sets of directions which zero the gradients of each of its moment functions. The properties are
not reciprocal, however. Once the directions of the zeros of the gradients of the moment functions have
been found, they must be checked on the shape itself to eliminate false positives.
$$N_l^m = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-|m|)!}{(l+|m|)!}}.$$

We will use the following very powerful property of spherical harmonics. Any spherical harmonic of degree l can be expressed in a rotated coordinate system using harmonics of the same degree and coefficients depending on the rotation R:

$$Y_l^m \circ R = \sum_{-l \le m' \le l} D_l^{m,m'}(R)\, Y_l^{m'}. \qquad (2)$$

Any combination of spherical harmonics of degree less than l can therefore be expressed in a rotated coordinate system, using spherical harmonics of degree less than l, without loss of information. Coefficients $D_l^{m,m'}(R)$ can be efficiently obtained using recurrence formulas [Ivanic and Ruedenberg 1996] or computed directly [Ramamoorthi and Hanrahan 2004].
Computation of Moment Functions. As defined by Equation (1), the 2p-moment function of a shape S is expressed as:

$$M^{2p}(\omega) = \int_{s \in S} \|s \times \omega\|^{2p}\, ds = \int_{s \in S} \|s\|^{2p} \sin^{2p}\beta\, ds,$$

where β is the angle between s and ω.
with:

$$S_l^p = \frac{\sqrt{(4l+1)\pi}}{2^{2l}} \sum_{k=l}^{2l} (-1)^k\, \frac{2^{2p+1}\, p!\, (2k)!\, (p+k-l)!}{(2(p+k-l)+1)!\, (k-l)!\, k!\, (2l-k)!}. \qquad (3)$$

For the sake of completeness, we provide the corresponding derivation and the proof of the finite decomposition in the appendix section of this article.
Let $R_s$ be a rotation which maps z, the unit vector along the z-axis, to s. Using Equation (2) for rotating the $Y_{2l}^0$ zonal harmonics, we have:

$$\sin^{2p}\beta = \sum_{l=0}^{p} \sum_{m=-2l}^{2l} S_l^p\, D_{2l}^{0,m}(R_s)\, Y_{2l}^m(\omega).$$

And finally:

$$M^{2p}(\omega) = \sum_{l=0}^{p} \sum_{m=-2l}^{2l} C_{2l,m}^{2p}\, Y_{2l}^m(\omega), \qquad (4)$$

using

$$C_{2l,m}^{2p} = S_l^p \int_{s \in S} \|s\|^{2p}\, D_{2l}^{0,m}(R_s)\, ds. \qquad (5)$$
Equation (4) says that $M^{2p}$ decomposes into a finite number of spherical harmonics, and Equation (5) allows us to directly compute the coefficients. The cost of computing $M^{2p}$ is therefore (p + 1)(2p + 1) surface integrals (one integral per even order of harmonic, up to order 2p). This is much cheaper than the alternative method of computing the scalar product of $M^{2p}$, as defined by Equation (1), with each spherical harmonic basis function: this would indeed require many evaluations of $M^{2p}$, which is itself defined as a surface integral. Furthermore, numerical accuracy is only a concern when computing the $C_{2l,m}^{2p}$ coefficients, and we can now compute both $M^{2p}$ and its gradient analytically from Equation (4).
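The finite decomposition can be verified numerically. The sketch below (ours; it assumes the expression for $S_l^p$ given in Equation (3) as we read it) reconstructs $\sin^{2p}\beta$ from the zonal harmonics $Y_{2l}^0$ and compares it to a direct evaluation:

```python
import math
import numpy as np
from numpy.polynomial import legendre

def S(l, p):
    """Coefficient S_l^p of Equation (3)."""
    total = sum(
        (-1) ** k
        * 2 ** (2 * p + 1) * math.factorial(p) * math.factorial(2 * k)
        * math.factorial(p + k - l)
        / (math.factorial(2 * (p + k - l) + 1) * math.factorial(k - l)
           * math.factorial(k) * math.factorial(2 * l - k))
        for k in range(l, 2 * l + 1))
    return math.sqrt((4 * l + 1) * math.pi) / 2 ** (2 * l) * total

def Y0(l, beta):
    """Zonal spherical harmonic Y_l^0(beta) = sqrt((2l+1)/4pi) P_l(cos beta)."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0
    return math.sqrt((2 * l + 1) / (4.0 * math.pi)) * legendre.legval(np.cos(beta), coeffs)

# Check: sin^{2p}(beta) == sum over l = 0..p of S_l^p * Y_{2l}^0(beta).
p = 3
beta = np.linspace(0.0, math.pi, 7)
recon = sum(S(l, p) * Y0(2 * l, beta) for l in range(p + 1))
print(np.allclose(recon, np.sin(beta) ** (2 * p)))  # True
```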
in several directions, and faces are sorted by order of the minimal value found. Only faces with small
minimum values are refined recursively. The number of points to look at in each face, as well as the
number of faces to keep at each depth level, are constant parameters of the algorithm.
In a second step, we perform a steepest-descent minimization on $\|\nabla M^{2p}(\omega)\|^2$, starting from each of the candidates found during the first step. For this we need to evaluate the derivatives of $\nabla M^{2p}$, which we do using analytically computed second-order derivatives of the spherical harmonics along with Equation (4). The minimization converges in a few steps because the starting positions are by nature very close to actual minima. This method has the double advantage that (1) the derivatives are very efficiently computed and (2) no approximation enters the calculation of the direction of the axis beyond the precision of the calculation of the $C_{2l,m}^{2p}$ coefficients.
During this process, multiple instances of the same direction can be found. We filter them out by estimating their relative distance. While nothing in theory prevents the first step from missing the area of attraction of a minimum, it works very well in the present context. Indeed, moment functions are very smooth, and shapes having two isometries with very close, yet different, axes are not common. Finally, because all moment functions, whatever their order, must have an extremum in the direction of the axis of the symmetries of the shape, we compute such sets of directions for multiple moment functions (e.g., M^4, M^6 and M^8) but keep only those which simultaneously zero the gradient of all these functions, which in practice leaves no or very few false positives to check for.
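The duplicate-filtering and intersection steps can be sketched as follows (our illustrative code, not the authors': an axis and its opposite are identified, and a small angular tolerance replaces exact equality):

```python
import numpy as np

def merge_directions(dirs, tol_deg=1.0):
    """Merge duplicate axis directions (d and -d are the same axis),
    keeping one representative per cluster."""
    kept = []
    for d in dirs:
        d = d / np.linalg.norm(d)
        if all(abs(np.dot(d, k)) < np.cos(np.radians(tol_deg)) for k in kept):
            kept.append(d)
    return kept

def common_directions(sets, tol_deg=1.0):
    """Keep only axes that appear (up to tol_deg) in every set of candidate
    extrema directions, e.g. those of M^4, M^6 and M^8."""
    cos_tol = np.cos(np.radians(tol_deg))
    out = []
    for d in merge_directions(sets[0], tol_deg):
        if all(any(abs(np.dot(d, e / np.linalg.norm(e))) >= cos_tol for e in s)
               for s in sets[1:]):
            out.append(d)
    return out

m4 = [np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0])]
m6 = [np.array([0.0, 0.0, -1.0]), np.array([0.0, 1.0, 0.0])]
print(len(common_directions([m4, m6])))  # 1: only the z axis is shared
```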
5.2 Determination of Rotation Parameters
After finding the zero directions for the gradient of the moment functions, we still need to find the
parameters of the corresponding isometric transforms. This is done deterministically by studying the
spherical harmonic coefficients of the moment functions themselves. We use the following properties.
PROPERTY 1. A function has a mirror symmetry $S_z$ around the z = 0 plane if and only if all its spherical harmonic coefficients for which l + m is odd are zero (i.e., it decomposes onto z-symmetric harmonics only). In the specific case of the moment functions:

$$\forall\omega\quad M^{2p}(\omega) = M^{2p}(S_z\,\omega) \iff \bigl(m \not\equiv 0\ (\mathrm{mod}\ 2) \Rightarrow C_{2l,m}^{2p} = 0\bigr).$$
PROPERTY 2. A function has a revolution symmetry around the z axis if and only if it decomposes onto zonal harmonics only, that is,

$$\forall l\ \forall m\quad m \ne 0 \Rightarrow C_l^m = 0.$$
PROPERTY 3. A function is self-similar through a rotation $R_\alpha$ of angle α around z if and only if all its spherical harmonic coefficients $C_l^m$ verify:

$$\forall l\ \forall m\quad C_l^m = \cos(m\alpha)\, C_l^m - \sin(m\alpha)\, C_l^{-m}. \qquad (6)$$

Property 3 can be adapted to check if the function is self-similar through the composition of a rotation and a symmetry with the same axis (i.e., the case λ = −1 as defined in Section 3). In this case, the equation to be checked is:

$$\forall l\ \forall m\quad (-1)^{l+m}\, C_l^m = \cos(m\alpha)\, C_l^m - \sin(m\alpha)\, C_l^{-m}. \qquad (7)$$
These properties are easily derived from the very expression of the spherical harmonic functions
[Hobson 1931].
Before using these properties, the moment function must be expressed in a coordinate system where the z axis coincides with the previously found candidate axis. This is performed using the rotation formula in Equation (2). Then checking for Properties 1 and 2 is trivial, provided that some tolerance is accepted on the equalities. Using Property 3 is more subtle; the coefficients of the function are first examined by order of decreasing m. For λ = 1, for instance, when the first nonzero value of $C_l^m$ is found, Equation (6) is solved by:

$$\tan\frac{m\alpha}{2} = \frac{C_l^{-m}}{C_l^m}, \quad\text{that is,}\quad \alpha = \frac{2}{m}\arctan\frac{C_l^{-m}}{C_l^m} + \frac{2k\pi}{m},$$

then all the remaining coefficients are checked with the obtained values of α. If the test passes, then α is the angle of an existing rotation symmetry for the moment function. A very similar process is used to search for α when λ = −1.
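The angle-recovery procedure for λ = 1 can be sketched as follows (our hypothetical code, not the authors': coefficients are stored in a dict keyed by (l, m), and the trivial solution α = 0 is skipped):

```python
import math

def check_rotation(coeffs, alpha, tol=1e-9):
    """Equation (6): C_l^m == cos(m*a)*C_l^m - sin(m*a)*C_l^{-m} for all (l, m)."""
    return all(abs(c - (math.cos(m * alpha) * c
                        - math.sin(m * alpha) * coeffs.get((l, -m), 0.0))) < tol
               for (l, m), c in coeffs.items())

def find_alpha(coeffs):
    """Solve Equation (6) on the first nonzero coefficient with m > 0, then
    verify every remaining coefficient with the candidate angles."""
    for (l, m), c in sorted(coeffs.items(), key=lambda kv: -abs(kv[0][1])):
        if m > 0 and abs(c) > 1e-12:
            base = (2.0 / m) * math.atan2(coeffs.get((l, -m), 0.0), c)
            for k in range(1, m + 1):
                cand = (base + 2.0 * math.pi * k / m) % (2.0 * math.pi)
                if cand > 1e-9 and check_rotation(coeffs, cand):
                    return cand
    return None

# A function whose only m != 0 terms have m = 4 is invariant under alpha = pi/2:
coeffs = {(4, 4): 1.0, (4, -4): 0.0, (2, 0): 2.0}
print(find_alpha(coeffs))  # pi/2, approximately 1.5708
```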
The error tolerance used when checking for Properties 1, 2, and 3 can be considered as a way of
detecting approximate symmetries on objects. We will show in the results section that symmetries can
indeed be detected on noisy data such as scanned models.
The symmetric measure $d_A(S)$ of a shape S with respect to a symmetry A is then defined by:

$$d_A(S) = \max\bigl(d_M(S, AS),\ d_M(AS, S)\bigr).$$

It should be noted that this definition is different from that of the Hausdorff distance since, in Equation (8), not all points of S are considered but only the mesh vertices, whereas all points of R are used. However, because S is polyhedral, $d_A(S) = 0$ still implies that AS = S.
Computing d A is costly, but fortunately we only compute it for a few choices of A which are the
candidates we found at the previous step of the algorithm. This computation is much cheaper than
computing a full symmetry descriptor [Kazhdan et al. 2004] for a sufficient number of directions to
reach the precision of our symmetry detection algorithm.
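As an illustration of $d_A$ (our sketch; for brevity the point-to-surface distance $d_M$ is approximated here by a point-to-vertex distance, which is exact for the cube example below):

```python
import numpy as np

def one_sided(a, b):
    """Max over points of `a` of the distance to the nearest point of `b`
    (a point-sampled stand-in for the point-to-surface distance d_M)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).max()

def symmetry_measure(verts, A):
    """d_A(S) = max(d_M(S, AS), d_M(AS, S)) for a candidate transform A
    (3x3 matrix), with S represented by its vertex set."""
    tv = verts @ A.T
    return max(one_sided(verts, tv), one_sided(tv, verts))

verts = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                 dtype=float)
mirror_z = np.diag([1.0, 1.0, -1.0])
print(symmetry_measure(verts, mirror_z))  # 0.0: the cube maps onto itself
```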
5.4 Results
Complete Example. The whole process is illustrated in Figure 2. Starting from the original object (a),
the moment functions of orders 4, 6, and 8 are computed (see, e.g., M8 in (b)). The gradients of these
moments are then computed analytically (c) and used for finding the directions of the minima. The
unfiltered set of directions contains 7 directions among which only 3 are common extrema of M4 , M6 ,
and M8 . This set of 3 directions (D1 ,D2 , and D3 ) must contain the axes of the symmetries of the shape.
After checking the symmetry axis and parameters on the actual shape, D1 is revealed as the axis of a
2-fold symmetry which is the composition of the two remaining mirror symmetries of axes D2 and D3 .
The example of the cube, shown in Figure 1, illustrates the extraction of rotations and mirror symmetries. Experiments have shown that our method finds all 48 symmetries, whatever the coordinate system the cube is originally expressed in.
Fig. 2. Extraction of symmetries for a single shape. Starting from the original shape (a), generalized moments (b) and their
gradients (c) are computed. The set of their common extrema directions contains the axes of the symmetries of the shape, depicted
at right. Here, both mirror symmetries have been found as well as the 2-fold rotational symmetry. Note that the original shape
is neither convex nor star-shaped and that the mesh is not consistent with the symmetries of the geometry.
Fig. 3. View of the three 3D models used in the robustness tests presented in Figure 4 shown with their symmetries. For the
sake of clarity, we chose models with only one symmetry each.
Robustness Tests. We now study the sensitivity of our method to small perturbations of the 3D model
in two different ways.
(1) Noise. We randomly perturb each vertex of each polygon independently in the original model by a
fraction of the longest length of the model’s bounding box.
(2) Delete. We randomly delete a small number of polygons in the model.
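A hypothetical harness for these two perturbations might look as follows (our sketch; `add_noise` and `delete_faces` are illustrative names, not the authors' code):

```python
import numpy as np

def add_noise(verts, magnitude, rng):
    """Perturb each vertex independently by at most `magnitude` times the
    longest side of the model's bounding box."""
    extent = (verts.max(axis=0) - verts.min(axis=0)).max()
    return verts + rng.uniform(-1.0, 1.0, verts.shape) * magnitude * extent

def delete_faces(faces, fraction, rng):
    """Randomly delete approximately `fraction` of the model's polygons."""
    keep = rng.random(len(faces)) >= fraction
    return faces[keep]

rng = np.random.default_rng(1)
verts = rng.random((100, 3))
noisy = add_noise(verts, 0.01, rng)
print(float(np.abs(noisy - verts).max()))  # bounded by 0.01 * bbox extent
```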
We use a set of three models to test the robustness of our method. These models, as well as their symmetries, are shown in Figure 3. For the sake of clarity, we use objects with only one symmetry.
In order to test the robustness of the method, we progressively increase the magnitude of the noise
and let the algorithm automatically detect the symmetry. In our robustness tests, we consider shapes as
single entities and use the first algorithm presented in Section 5 to detect these symmetries. To evaluate
Fig. 4. We test the sensitivity of the method to noise by progressively increasing noise magnitude and letting the algorithm
detect the symmetry for each of our three test models. We evaluate the accuracy of the results by computing the angular de-
viation between the axis found and the axis of the symmetry of the original model. Top row: We perturb each vertex of each
polygon independently by a fraction of the longest length of the bounding box on each of the three test models. The left fig-
ure shows a noisy pick-up model with a noise magnitude of 1% and the right figure shows angular deviation evolution for
the three models for a magnitude ranging from 0% to 1%. Bottom row: We randomly delete polygons of the models. The left
figure shows a noisy pick-up obtained by deleting 5% of the polygons and the right figure shows angular deviation evolution
by deleting 0% to 5% of the polygons of the three models. As can be seen from the curves, for small variations of the models, our method has an approximately linear dependency on noise and delivers high-quality results even for nonperfect symmetries.
the reliability of the results, we compute the angular deviation between the found axis of symmetry and the real one, that is, the one computed with no noise. In our experiments, noise magnitude varies from 0 to 1%
of the longest length of the model’s bounding box, and the number of deleted polygons ranges from 0 to
5% of the total number of polygons in the model (see Figure 4).
The results of these experiments show that, for small variations, our method has an approximately linear dependency on noise and delivers high-quality results even for nonperfect symmetries.
These statistical results can also be used to derive an upper bound on the mean angular error obtained
as a function of the noise in the model.
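Concretely, the angular-deviation measure used in these robustness tests can be sketched as follows. This is a hypothetical helper, not the authors' code; an axis and its opposite are treated as the same direction:

```python
import math

def axis_deviation(axis_a, axis_b):
    """Angular deviation (in radians) between two symmetry axes.
    An axis and its opposite represent the same direction, hence the
    abs() on the dot product. Sketch of the error measure, not the
    authors' implementation."""
    dot = sum(a * b for a, b in zip(axis_a, axis_b))
    norm_a = math.sqrt(sum(a * a for a in axis_a))
    norm_b = math.sqrt(sum(b * b for b in axis_b))
    cos_angle = min(1.0, abs(dot) / (norm_a * norm_b))  # clamp rounding
    return math.acos(cos_angle)
```

With this convention, a detected axis that is exactly opposite to the reference axis yields a deviation of zero, as expected for an undirected symmetry axis.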
5.4.1 Application to Scanned Models. We present in Figure 5 examples of applying the single-shape algorithm to scanned models, retrieved from a Web database and used as is (see http://shapes.aim-at-shape.net). Our algorithm perfectly detects all the parameters of candidate symmetries for all these
Accurate Detection of Symmetries in 3D Shapes • 449
Fig. 5. Our algorithm perfectly detects approximate symmetries of scanned models. Detecting these symmetries requires relaxing the constraints when checking candidate symmetries on the model. Please note that these scanned models are by nature neither axis-aligned nor tessellated according to their symmetries. This illustrates the fact that our algorithm depends neither on the coordinate system nor on the mesh of the objects.
shapes. When testing these symmetries, one should allow a large enough symmetry distance error (as
defined in Section 5.3) because these models are by nature not perfectly symmetric.
5.5 Discussion
Because the M^{2p} functions are trigonometric polynomials on the sphere, they have a maximum number of strict extrema depending on p: the larger p is, the better M^{2p} is able to capture the information of a symmetry, that is, to have an extremum in the direction of its axis. But because all moment functions must have a null gradient in this direction (according to Theorem 1), these extrema are bound to become nonstrict extrema for small values of p, and M^{2p} is forced to be constant on a subdomain of nonnull dimension. Using the cube as an example (in which case M^2 is a constant function), a trigonometric polynomial of order 2 simply cannot have enough strict extrema to represent all 12 distinct directions of the symmetries of the cube.
In all the tests we conducted, however, using moments up to order 10 has never skipped any symmetry
on any model. But it would still be interesting to know the exact maximum number of directions
permitted by moments of a given order.
Fig. 6. This figure illustrates the reliability of our congruency descriptor (as defined by Equation (9)). Two identical objects, meshed differently and expressed in two different coordinate systems (A and B), have extremely close descriptor vectors, while a slightly different object (C) has a different descriptor. The graph on the right shows each component of the three descriptors.
The constructive algorithm first computes (if necessary) the symmetries of all separate tiles using the single-shape algorithm. It then detects which tiles are similar up to an isometric transform and finds the transformations between similar tiles. Finally, it explores all one-to-one mappings between tiles, discarding mappings which do not correspond to a symmetry of the group of tiles as a whole.
Section 6.2 explains how we detect similar tiles and Section 6.3 details the algorithm which both
explores tile-to-tile mappings and finds the associated symmetry for the whole set of tiles.
Because it is always possible to apply the algorithm presented in Section 5 to the group of tiles,
considering it as a single complex shape, questioning the usefulness of the constructive method is
legitimate. For this reason, we will explain in Section 6.5 in which situations the constructive method
is preferable to the algorithm for single shapes; but let us first explain the method itself.
with

    d_{2l}^{2k} = Σ_{−2l ≤ m ≤ 2l} (C_{2l,m}^{2k})^2    (10)

(see Figure 6). It has been shown by Kazhdan et al. [2003] that d_l^k, as defined in Equation (10), does not depend on the coordinate system in which the spherical harmonic decomposition is expressed. This means that each d_{2l}^{2p}, and therefore D^{2p} itself, is not modified by isometric transforms of the shape. Mirror
Fig. 7. Scenes used for testing the object congruency descriptor. In each scene, the descriptor has been used to detect objects
with similar geometry (but possibly different meshes) up to a rigid transform. Objects found to be congruent are displayed with
the same color.
symmetries do not affect d_{2l}^{2p} either, since they only change the sign of the coefficients of some harmonics in a coordinate system aligned with the axis.
Two tiles A and B are considered to be similar up to an isometric transform, at a precision ε, when:

    ‖D^{2p}(A) − D^{2p}(B)‖ < ε.
Theoretically, this shape descriptor can produce false positives, that is, tiles that are not congruent but have the same descriptor; but it cannot produce false negatives, because of its deterministic nature.
Our experiments have shown that using moments up to order 6 produces a sufficiently discriminant
shape descriptor on all test scenes. This is illustrated in Table II where we present the average precision
value, that is, the percentage of matched tiles that are actually identical up to an isometric transform,
for a set of architectural scenes (Figure 7).
By definition, congruent tiles should have the same set of symmetries, possibly expressed in different
coordinate systems. Since we know the symmetries of each of the tiles, we introduce this constraint,
thereby increasing the discriminating power of our shape descriptor as shown in Table III.
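The congruency test itself reduces to a thresholded distance between descriptor vectors. A minimal sketch, assuming descriptors are given as plain lists of floats (the helper name and default tolerance are ours):

```python
import math

def congruent(desc_a, desc_b, eps=1e-3):
    """Hypothetical congruency check: two tiles are considered identical
    up to an isometric transform when their rotation-invariant descriptor
    vectors satisfy ||D(A) - D(B)|| < eps."""
    assert len(desc_a) == len(desc_b)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(desc_a, desc_b)))
    return dist < eps
```

In practice, the extra constraint described above (congruent tiles must share the same set of symmetries) would be checked before or after this distance test to reduce false positives.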
and S2 ∈ S \ H2 , we restrict the set of isometric transforms to the isometric transforms that also map
S1 onto S2 (but not necessarily S2 onto S1 ). Because these tiles have symmetries, this usually leaves
multiple possibilities.
Note that the global symmetries found must always be applied with respect to the center of mass g
of S, according to the definition of a symmetry of S.
At the end of the recursion step, we have the set of isometric transforms that map H1 ∪ {S1 } onto
H2 ∪ {S2 }.
Each recursion step narrows the choice of symmetries for S. The recursion stops either when this set is reduced to the identity transform or when we have used all the component tiles in the model. In the
latter case, the isometric transforms found are the symmetries of the composite shape. The recursion
is initiated by taking for H1 and H2 two similar tiles, that is, two tiles of the same class.
In the following paragraphs, we review the individual steps of the algorithm: finding all the isometric
transforms which map tile S1 onto similar tile S2 and reducing the set of compatible symmetries of S.
We then illustrate the algorithm in a step-by-step example.
6.3.2 Finding All the Isometries Which Transform a Tile onto a Similar Tile. At each step of our
algorithm, we examine pairs of similar tiles, S1 and S2 , and we have to find all the isometries which
map S1 onto S2 .
If gi is the center of mass of tile Si and g is the center of mass of the composite shape S, this condition
implies that the isometries we are looking for transform vector g1 − g into g2 − g. In order to generate
the set of all isometric transforms that map S1 onto S2 , we use the following property.
PROPERTY 4. If J is an isometry that maps S1 onto a similar tile S2, then all the isometries K which map S1 onto S2 are of the following form:

    K = J T^{-1} A T    with A ∈ G_{S1} such that A(g1 − g) = g1 − g,    (11)

where G_{S1} is the group of symmetries of S1, and T is the translation of vector g − g1 (refer to the Appendix for the proof of this property).
This property states that, once we know a single seed isometric transform which maps S1 onto S2, we can generate all such transforms by using the elements of G_{S1} in Equation (11).
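Property 4 translates directly into an enumeration procedure. The sketch below is our own illustration, not the paper's code: affine maps are represented as (matrix, translation) pairs, and we generate K = J T^{-1} A T for every symmetry A of S1 that fixes the vector g1 − g:

```python
def compose(f, g):
    """Compose affine maps (M, t) meaning x -> M x + t; compose(f, g) = f o g."""
    (M2, t2), (M1, t1) = f, g
    M = [[sum(M2[i][k] * M1[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(M2[i][k] * t1[k] for k in range(3)) + t2[i] for i in range(3)]
    return (M, t)

def translation(u):
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return (identity, list(u))

def all_isometries(J, sym_group, g, g1, eps=1e-9):
    """Enumerate K = J . T^-1 . A . T over the symmetries A of S1 that fix
    g1 - g (Property 4). `sym_group` holds (matrix, zero-translation) pairs;
    all names are ours."""
    u = [a - b for a, b in zip(g, g1)]              # vector g - g1
    T, T_inv = translation(u), translation([-x for x in u])
    v = [-x for x in u]                             # vector g1 - g
    result = []
    for A in sym_group:
        Av = [sum(A[0][i][k] * v[k] for k in range(3)) for i in range(3)]
        if all(abs(a - b) < eps for a, b in zip(Av, v)):
            result.append(compose(J, compose(T_inv, compose(A, T))))
    return result
```

Given the seed J and the (usually small) symmetry group of S1, this produces the complete, finite set of candidate isometries mapping S1 onto S2.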
6.3.3 Finding a Seed Transform. We need to find a seed transform J that maps S1 onto S2. For each tile, we extract a minimal set of independent vectors that correspond to extrema of their generalized even moment functions. The number of vectors needed depends on the symmetries of the tile. J is then defined as any isometric transform that maps the first set of vectors onto the second, as well as vector g1 − g onto g2 − g. Most of the time, at most a single isometric transform is possible. When multiple choices exist, the candidate transforms are checked against the shapes using the distance presented in Section 5.3. This ensures that we find at least one seed transform.
Fig. 8. Three spheres uniformly distributed on a circle in the z = 0 plane. All one-to-one mappings of the set of tiles onto itself which map each tile onto a similar tile are used to detect the symmetries of the shape. Note that the 3-fold symmetry H is detected and is associated with a circular permutation mapping.
6.3.4 Ensuring Compatibility with Previous Isometries. During the recursion, we need to store the
current set of compatible isometries we have found. We do this by storing a minimal set of linearly
independent vectors along with their expected images by these isometries. For example, if we have to
store a symmetry of revolution, we store only one vector, the axis of the symmetry, and its image (itself).
For mirror symmetries, rotations, and central symmetries, we store three independent vectors, along
with their images by this isometric transform. For instance, in the case of a rotation of angle π around
axis X, we have:
    X → X,    Y → −Y,    Z → −Z.    (12)
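Checking a candidate isometry against the stored (vector, expected image) constraints is then a few lines. A sketch under our own representation (3×3 row-major matrices; the helper name is ours):

```python
def compatible(M, constraints, eps=1e-9):
    """Check a candidate 3x3 isometry matrix M against stored constraints,
    given as (vector, expected_image) pairs, as in the mapping
    X -> X, Y -> -Y, Z -> -Z used for a rotation of angle pi around X."""
    for v, image in constraints:
        Mv = [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]
        if any(abs(a - b) > eps for a, b in zip(Mv, image)):
            return False
    return True
```

During the recursion, a candidate symmetry that fails this check against the constraints accumulated so far prunes the corresponding branch.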
By examining all the one-to-one mappings of the set of all tiles onto itself which map each tile onto a similar tile, we are able to detect all symmetries of the set of tiles (see Figure 8). Note in this example that the 3-fold symmetry H is detected and is associated with a circular permutation mapping.
6.4 Step-By-Step Example
Figure 9 presents a very simple example of a shape (a pair of pliers) composed of 3 tiles, S1 , S2 (the
handles), and R (the head). Two of the tiles are similar up to an isometric transform, S1 and S2 . Figure 9
also displays the centers of mass, g1 , and g2 of tiles S1 and S2 (which are not in the plane z = 0), and
the center of mass g of the whole shape. In the coordinate systems centered on their respective centers
of mass, S1 and S2 have a mirror symmetry of axis Z, and R has a rotation symmetry around axis X of
angle π .
Our constructive algorithm starts by selecting tile R and a similar tile (here, the only possible choice
is R).
Step 1. The algorithm explores the possibilities to transform R into itself. Two possibilities exist: (a) the identity transform, and (b) the rotation around X of angle π, deduced from (a) by Property 4.
At this point, the algorithm branches, and either tries to map S1 to itself (branch 1) or to S2 (branch 2).
Branch 1, Step 1. The algorithm tries to match S1 to itself. The only compatible transform is the identity
transform.
Fig. 9. Illustration of the constructive algorithm on a very simple example: from the symmetries of each of the 3 parts of the object, the symmetries of the whole object are recovered. Please note that no symmetry was omitted in this figure. In particular, tile R has only a rotational symmetry, but no mirror symmetry. See the text of Section 6.4 for a detailed explanation.
Fig. 10. A complex model which has the same group of symmetries as the icosahedron. The constructive algorithm successfully retrieves all 15 planes of mirror symmetry (center) and all 31 distinct axes of rotational symmetry (right) using the rotational and mirror symmetries of each tile (left). The presence of 3-fold and 5-fold symmetries shows that our algorithm also detects symmetries which map a set of similar tiles onto itself through a complex permutation.
Branch 1, Step 2. The algorithm then tries to map S2 to itself. Once again, the only possible transform is the
identity transform, and the recursion stops because all the tiles in the model have been used.
Branch 2, Step 1. The algorithm tries to match S1 to S2 . The only compatible transform is the rotation around X
of angle π.
Branch 2, Step 2. The algorithm then tries to match S2 to S1 . Once again, the only compatible transform is the
rotation around X of angle π, and the recursion stops because all the tiles in the model have been used.
Two symmetries have been found that map the shape onto itself: the identity transform and the rotation around X of angle π. Note that, although our algorithm can potentially create many branches, we prune those that result in empty sets of transforms, so that in practice we only explore a small number of branches.
middle) of the shape, using the symmetries of each tile (Figure 10, left), which are one symmetry of revolution and one mirror symmetry.
Conversely, directly applying the first algorithm to such a shape shows that M^2 to M^8 are extremely close to constant functions, making the extraction of directions an inaccurate process. The single-shape algorithm still correctly finds all the axes when using moments up to order 10, but this has some impact on computation times. Furthermore, the single-shape algorithm requires checking all of the symmetries found on the model, which accounts for a significant part of its computation time. This is not the case for the constructive algorithm, because it relies only on its knowledge of the symmetries of the tiles. Because many symmetries exist for this model, the total computation time of the single-shape algorithm is therefore much higher. This is summarized in Table IV, where we compare the computation times of both methods at equivalent precision (i.e., 10^{-4} radians).
6.5.2 Finding Symmetries Inside Noncoherent Geometry. There exist common situations where 3D scenes do not come as a set of closed, separate objects, but as an incoherent list of polygons. This happens, for instance, when retrieving geometric data from a Web site, mostly because a list of polygons constitutes a practical common denominator for all possible formats.
In such a case, applying the single-shape algorithm would certainly give the symmetries of the whole scene; but if we are able to partition the set of polygons into adequate groups, that is, tiles to which we apply the constructive algorithm, we may be able to extract symmetric objects from the scene, as well as the set of symmetries for the whole scene, more rapidly, as illustrated in Figure 10.
The gain in using the constructive algorithm to recover symmetries in the scene comes from the fact that, once tile symmetries have been computed, grouping tiles together and testing for symmetries of composed objects only adds a negligible cost; this is not the case when we try to apply the single-shape algorithm to many possible groups of polygons, or even to the entire scene itself.
The various issues in decomposing a raw list of polygons into meaningful tiles are beyond the scope of this article. In our case, tiles only need to be consistent with the symmetries. We propose the following heuristic, which achieves this correctly for most scenes:
We define tiles as maximal sets of edge-connected polygons. To obtain them, we insert all vertices of the model into a KDTree and use it to efficiently recover which polygons share vertices, up to a given precision, and therefore share an edge. By propagating connectivity information between neighboring polygons, we then build classes of edge-connected polygons, which we define to be our tiles. Figure 11 gives examples of such tiles for objects collected from the Web as a raw list of polygons.
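The tiling heuristic can be sketched as follows. This is an illustration under simplifying assumptions, not the paper's implementation: a hash on quantized coordinates stands in for the KDTree, polygons are given as tuples of vertex coordinates, and edge-connected components are grouped with union-find:

```python
def build_tiles(polygons, precision=1e-5):
    """Group edge-connected polygons into tiles. `polygons` is a list of
    tuples of 3D vertex coordinates. Vertices closer than `precision` are
    merged by quantizing their coordinates (a hash grid stands in for a
    KDTree). Returns a list of tiles, each a list of polygon indices."""
    def key(v):
        # quantize a vertex so nearly identical vertices hash identically
        return tuple(round(c / precision) for c in v)

    parent = list(range(len(polygons)))
    def find(i):                       # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edge_owner = {}
    for pi, poly in enumerate(polygons):
        ids = [key(v) for v in poly]
        for a, b in zip(ids, ids[1:] + ids[:1]):   # consecutive edges
            e = (min(a, b), max(a, b))             # undirected edge key
            if e in edge_owner:
                parent[find(edge_owner[e])] = find(pi)   # merge components
            else:
                edge_owner[e] = pi

    tiles = {}
    for pi in range(len(polygons)):
        tiles.setdefault(find(pi), []).append(pi)
    return list(tiles.values())
```

Two triangles sharing an edge end up in the same tile, while an isolated triangle forms a tile of its own; real scenes would simply feed larger polygon soups through the same grouping.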
Our simple heuristic for building tiles produced very good results on all the scenes we tested, and suffices as a proof of concept for the constructive algorithm. This is illustrated in Figure 11, where a lamp object and a chess game are shown along with their global symmetries. These symmetries were
Fig. 11. Two models taken from the Web. From the raw list of polygons (left), our heuristic for scene partitioning extracts tiles, before the single-shape algorithm computes the symmetries of each of them (center). Using this information, the constructive algorithm computes the symmetries of the whole model (right). Top row: a lamp object which has seven mirror symmetries and a 7-fold rotational symmetry. Bottom row: a chess board which is composed of pieces with very different symmetries, but turns out to have only a single 2-fold symmetry around a vertical axis. (Note: in this last model, once tiles had been identified, the chess pieces were moved so as to obtain a model with at least one global symmetry.)
computed from the symmetries of each of the subparts. These, in turn, were separately computed using
the algorithm presented in Section 5.
Obviously, this application requires that the constructed tiles be consistent with the symmetries, that is, that it is possible to partition the scene into tiles which map onto each other through the symmetries of the scene. This may not be easy with scanned models, for instance, nor with perturbed data. In such a case, our simple heuristic should be modified so as to base polygon neighborhood relationships on proximity distances between polygons, rather than on vertex positions only. Doing so, cutting one tile into two parts and remeshing them independently would have a high probability of producing the same original tile after reconstruction. If not, then the existence of a symmetry inside the model may become questionable. Suppose, for instance, that the pliers in the step-by-step example (Section 6.4) get split into tiles that are not exact symmetrical copies of one another, and that these two tiles are too far apart to be merged into a single tile. Then the model is, by nature, no longer symmetric, which will also be the output of the constructive algorithm.
7. APPLICATIONS
7.1 Geometry Compression and Instantiation
Our framework can be used for model compression at two different levels: (1) if a model exhibits symmetries, it can be compressed by storing only the significant part of the model and using the symmetries to recreate the full model; (2) if a model contains multiple instances of the same part, these parts can be instantiated (see Figure 12).
Although complex models often do not present symmetries, symmetry-based compression can usually be used on some subparts of the model. The ability to express a model by explicitly storing only the significant parts while instancing the rest of the scene is provided by some recent 3D file formats such as X3D (see Table VI). We thus measure our compression ratios as the ratio of the sizes of the X3D files before and after our two compression operations, which we now detail.
The scene is first loaded as a raw collection of polygons, before being decomposed into tiles using the heuristic presented in Section 6.5.2. We then compute symmetries and congruency descriptors for each tile. Computation times shown in Table VI give the average time needed to compute the symmetries and congruency descriptor of a single tile. As computing the properties of a tile does not depend on the other tiles, this process is easily parallelizable. The scene is then compressed, first by instancing the tiles; second, when storing each tile, we only store the minimum significant part of its geometry according to its symmetries. This part is extracted using the same algorithm we present for remeshing a tile according to its symmetries in the next section. Note that the compression rates shown in this table are computed using geometry information only, that is, neither texturing nor material information is taken into account. Compression times shown in Table VI are the times needed to detect all classes of tile congruency.
Fig. 12. Detecting symmetries and similarities between tiles created from a raw list of polygons allows us to compress geometric models in two ways: (1) by instancing similar tiles and (2) inside each symmetric tile, by instancing only the part of the geometry needed to reconstruct the whole tile. On a model as large as the power plant (13 million triangles), we achieve a compression ratio (ratio of geometry file sizes in X3D format) of 1:4.5. We show in this figure two subparts of the complete model. For each, we show the tiles computed by our heuristic (see Section 6.5), as well as the compression ratio obtained. The PowerPlant model is courtesy of The Walkthru Project.
8. DISCUSSION
We discuss here a number of features of our technique as well as differences with existing approaches.
Fig. 13. Starting from an object in arbitrary orientation, we detect the symmetries of the shape (in the figure, a planar symmetry) and use them to remesh the object with respect to these symmetries. A user can then easily edit the mesh and modify it while keeping the symmetries of the initial shape.
The second algorithm (for assembled objects) naturally works just as well for non-star-shaped objects, as illustrated by the examples in Figure 11.
9. CONCLUSIONS
We have presented an algorithm to automatically retrieve symmetries of geometric shapes and models. Our algorithm efficiently and accurately retrieves all symmetries of a given model, independently of its tessellation.
We use a new tool, the generalized moment functions, to identify candidates for symmetries. The
validity of each candidate is checked against the original shape using a geometric measure. Generalized
moments are not computed directly: instead, we compute their spherical harmonic coefficients using an
integral expression. Having an analytical expression for the generalized moment functions and their
gradients, our algorithm finds potential symmetry axes quickly and with good accuracy.
For composite shapes assembled from simpler elements, we have presented an extension of this algorithm that works by first identifying the symmetries of each element, then sets of congruent elements. We then use this information to iteratively build the symmetries of the composite shape. This extension is able to handle complex shapes with better accuracy, since it pushes the accuracy issues down to the scale of the tiles.
Future Work
The constructive algorithm presented in Section 6 automatically detects instantiation relationships between the tiles of a composite shape.
We are currently developing a constructive instantiation algorithm which iteratively collates similar
tiles into instances, checking at each step that the relative orientation of each tile with respect to each
already constructed instance is preserved.
This algorithm requires knowing the symmetries of the tiles and maintaining the symmetries of the instances found so far. For this, we use our shape congruency metric, our algorithm for finding the symmetries of single shapes, and our algorithm for finding the symmetries of composite shapes.
APPENDIX (PROOFS)
PROOF OF THEOREM 1. Let A be an isometric transform which leaves a shape S globally unchanged.
We have:

    ∀ω   M^{2p}(Aω) = ∫_{s∈S} ‖s × Aω‖^{2p} ds
                    = ∫_{t∈A^{-1}S} ‖At × Aω‖^{2p} |det A| dt
                    = ∫_{t∈A^{-1}S} ‖t × ω‖^{2p} dt
                    = M^{2p}(ω)

At line 2, we change variables and integrate over the surface transformed by A^{-1}. At line 3, an isometric transform is a unit transform, so its determinant is ±1 and the factor |det A| vanishes; the norm of the cross product is also left unchanged by applying an isometric transform to each of its terms. At line 4, because AS = S, we also have S = A^{-1}S. The isometric transform A is thus also a symmetry of the moment function M^{2p}.
Let A be an isometric transform with axis v, and suppose that A is a symmetry of M^{2p}. Let d_v be the direction of steepest descent of the function M^{2p} around direction v. Because A is a symmetry of M^{2p}, we have:

    d_{Av} = A d_v = d_v.    (13)

If A is a rotation, this is impossible because d_v ⊥ v. Moreover, for all directions ω, we have M^{2p}(−ω) = M^{2p}(ω), and thus:

    d_{−v} = −d_v.    (14)

If A is a mirror symmetry, we have Av = −v; from Equations (13) and (14), we then get d_v = −d_v, which is impossible.
In both cases, M^{2p} cannot have a direction of steepest descent at direction v. Because M^{2p} is infinitely differentiable, this implies that ∇M^{2p}(v) = 0.
PROOF OF PROPERTY 4. Let S and R be two shapes, identical up to an isometric transform, and let J be an isometric transform such that J S = R. Let T be the translation of vector −u_S, with u_S = g_S − g, where g_S is the center of mass of S and g is the origin of the coordinate system in which J is applied.
— Let A ∈ G_S be a symmetry of S such that A u_S = u_S. We have A T S = T S (the symmetry A operates in the coordinate system centered on g_S). Let K = J T^{-1} A T. Then:

    K S = J T^{-1} A T S              K 0 = J T^{-1} A T 0
        = J T^{-1} T S      and           = J T^{-1} A (−u_S)
        = J S                             = J T^{-1} (−u_S)
        = R                               = J 0 = 0
By construction, K is a rigid transform and conserves distances; it maps the origin onto itself. K is thus an isometric transform. Furthermore, K maps S onto R.
— Let K be an isometric transform such that K S = R. Let us choose A = T J^{-1} K T^{-1}. This choice leads to K = J T^{-1} A T. Moreover:

    A T S = T J^{-1} K T^{-1} T S          A u_S = T J^{-1} K T^{-1} u_S
          = T J^{-1} K S          and            = T J^{-1} K 2u_S
          = T S                                  = T 2u_S = u_S
and

    A 0 = T J^{-1} K T^{-1} 0
        = T J^{-1} K u_S
        = T J^{-1} (g_R − g)
        = T u_S
        = 0
By construction, A is affine and conserves distances; it maps 0 onto 0. A is thus an isometric transform. A is also a symmetry of S, and it verifies A u_S = u_S.
— The set of isometries which map S to R is therefore the set of functions K of the form K = J T^{-1} A T, where A ∈ G_S is a symmetry of S such that A(g − g_S) = g − g_S.
PROOF OF EQUATION 3. We compute the decomposition of the function θ ↦ sin^{2p} θ into zonal spherical harmonics. We prove that this decomposition is finite, and give the values of the coefficients.
By definition [Hobson 1931], we have:

    Y_L^0(θ, φ) = sqrt((2L + 1)/(4π)) P_L(cos θ)
                = sqrt((2L + 1)/(4π)) · ((−1)^L/(2^L L!)) · [d^L/dx^L (1 − x^2)^L](cos θ)

where P_k is the Legendre polynomial of order k. Because the set of Legendre polynomials P_0, P_1, ..., P_n is a basis for the polynomials of order not greater than n, the function θ ↦ sin^{2p} θ = (1 − cos^2 θ)^p can be uniquely expressed in terms of the P_L(cos θ). The decomposition of θ ↦ sin^{2p} θ is thus finite and has terms up to Y_{2p}^0 at most.
Let us compute them explicitly:

    d^L/dx^L (1 − x^2)^L = d^L/dx^L Σ_{k=0}^{L} (−1)^{L−k} C_L^k x^{2L−2k}
                         = (−1)^L d^L/dx^L Σ_{k=0}^{L} (−1)^k C_L^k x^{2k}
So:

    Y_L^0(θ, φ) = sqrt((2L + 1)/(4π)) · (1/(2^L L!)) Σ_{L ≤ 2k ≤ 2L} (−1)^k C_L^k ((2k)!/(2k − L)!) cos^{2k−L} θ
We have (writing J_q = ∫_0^π sin^{2q+1} θ dθ):

    J_q = [−cos θ sin^{2q} θ]_0^π + 2q ∫_0^π cos^2 θ sin^{2q−1} θ dθ
        = 2q J_{q−1} − 2q J_q

Therefore:

    J_q = (2q/(2q + 1)) J_{q−1}
        = (2q(2q − 2) ··· 2)/((2q + 1)(2q − 1) ··· 3) J_0
        = 2^{2q+1} (q!)^2 / (2q + 1)!
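The closed form for J_q = ∫_0^π sin^{2q+1} θ dθ can be checked numerically; a small sketch (function names and the midpoint quadrature are ours):

```python
import math

def J_closed(q):
    """Closed form J_q = 2^(2q+1) (q!)^2 / (2q+1)! for the integral
    J_q = integral of sin^(2q+1)(theta) over [0, pi]."""
    return 2.0 ** (2 * q + 1) * math.factorial(q) ** 2 / math.factorial(2 * q + 1)

def J_numeric(q, n=100000):
    """Midpoint-rule estimate of the same integral, as a sanity check."""
    h = math.pi / n
    return sum(math.sin((i + 0.5) * h) ** (2 * q + 1) for i in range(n)) * h
```

The recurrence J_q = 2q/(2q + 1) J_{q−1} and the base case J_0 = 2 both follow directly from the closed form.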
For m even, we can take m = 2r and q = p + r; we get:
For L even, we set L = 2l. Using r = k − l to match Equation (16) in Equation (15), we get:

    S_l^p = ∫ Y_{2l}^0(θ, φ) sin^{2p} θ sin θ dθ dφ
          = 2π sqrt((4l + 1)/(4π)) Σ_{2l ≤ 2k ≤ 4l} ((−1)^k/(2^{2l} (2l)!)) C_{2l}^k ((2k)!/(2k − 2l)!) · ((2k − 2l)! p! 2^{2p+1} (p + k − l)!)/((k − l)! (2p + 2k − 2l + 1)!)
          = (sqrt((4l + 1)π)/(2^{2l} (2l)!)) Σ_{l ≤ k ≤ 2l} (−1)^k C_{2l}^k ((2k)! p! 2^{2p+1} (p + k − l)!)/((k − l)! (2p + 2k − 2l + 1)!)
          = (sqrt((4l + 1)π)/2^{2l}) Σ_{l ≤ k ≤ 2l} (−1)^k ((2k)! p! 2^{2p+1} (p + k − l)!)/(k! (2l − k)! (k − l)! (2p + 2k − 2l + 1)!)
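As a sanity check, the closed form for S_l^p can be compared against direct numerical integration of the zonal projection; a sketch (function names, the Legendre recurrence, and the midpoint quadrature are ours):

```python
import math

def legendre(n, x):
    """Legendre polynomial P_n(x), via the Bonnet recurrence."""
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(2, n + 1):
        p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
    return p1

def S_closed(l, p):
    """Closed form for S_l^p derived above."""
    f = math.factorial
    total = 0.0
    for k in range(l, 2 * l + 1):
        total += ((-1) ** k * f(2 * k) * f(p) * 2.0 ** (2 * p + 1)
                  * f(p + k - l)
                  / (f(k) * f(2 * l - k) * f(k - l)
                     * f(2 * p + 2 * k - 2 * l + 1)))
    return math.sqrt((4 * l + 1) * math.pi) / 2.0 ** (2 * l) * total

def S_numeric(l, p, n=20000):
    """Midpoint estimate of the integral of Y_{2l}^0 sin^{2p} sin."""
    h = math.pi / n
    acc = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        acc += legendre(2 * l, math.cos(th)) * math.sin(th) ** (2 * p + 1)
    return 2 * math.pi * math.sqrt((4 * l + 1) / (4 * math.pi)) * acc * h
```

The two evaluations agree to quadrature precision, which confirms the algebra of the derivation above for the (even) orders actually used by the algorithm.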
REFERENCES
ATALLAH, M. J. 1985. On symmetry detection. IEEE Trans. Comput. 34, 663–666.
BRASS, P. AND KNAUER, C. 2004. Testing congruence and symmetry for general 3-dimensional objects. Comput. Geom. Theory
Appl. 27, 1, 3–11.
HIGHNAM, P. T. 1985. Optimal algorithms for finding the symmetries of a planar point set. Tech. Rep. CMU-RI-TR-85-13 (Aug).
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.
HOBSON, E. W. 1931. The Theory of Spherical and Ellipsoidal Harmonics. Cambridge University Press, Cambridge, UK.
IVANIC, J. AND RUEDENBERG, K. 1996. Rotation matrices for real spherical harmonics, direct determination by recursion. J. Phys.
Chem. A. 100, 6342–6347. (See also Additions and corrections in vol. 102, No. 45, 9099-9100).
JIANG, X.-Y. AND BUNKE, H. 1991. Determination of the symmetries of polyhedra and an application to object recognition.
In Proceedings of the International Workshop on Computational Geometry—Methods, Algorithms and Applications (CG ’91).
Lecture Notes in Computer Science, vol. 553. Springer, London, UK, 113–121.
KAZHDAN, M. M., FUNKHOUSER, T. A., AND RUSINKIEWICZ, S. 2003. Rotation invariant spherical harmonic representation of 3D
shape descriptors. In Proceedings of the 2003 Eurographics/ACM Siggraph Symposium on Geometry Processing (SGP ’03).
Eurographics Association, Aire-la-Ville, Switzerland, 167–175.
KAZHDAN, M. M., FUNKHOUSER, T. A., AND RUSINKIEWICZ, S. 2004. Symmetry descriptors and 3D shape matching. In Proceedings of
the 2004 Eurographics/ACM Siggraph Symposium on Geometry Processing (SGP ’04). Eurographics Association, Aire-la-Ville,
Switzerland.
KNUTH, D. E., MORRIS, JR., J. H., AND PRATT, V. R. 1977. Fast pattern matching in strings. SIAM J. Comput. 6, 2, 323–350.
MINOVIC, P., ISHIKAWA, S., AND KATO, K. 1993. Symmetry identification of a 3-D object represented by octree. IEEE Trans. Patt.
Analy. Mach. Intell. 15, 5, 507–514.
PRINCE, E. 2004. Mathematical Techniques in Crystallography and Materials Science, 3rd Ed. Springer, Berlin, Germany.
RAMAMOORTHI, R. AND HANRAHAN, P. 2004. A signal-processing framework for reflection. ACM Trans. Graph. 23, 4, 1004–1042.
SUN, C. AND SHERRAH, J. 1997. 3D symmetry detection using extended Gaussian image. IEEE Trans. Patt. Analy. Mach.
Intell. 19, 2 (Feb.), 164–168.
WOLTER, J. D., WOO, T. C., AND VOLZ, R. A. 1985. Optimal algorithms for symmetry detection in two and three dimensions.
Visual Comput. 1, 37–48.
ZABRODSKY, H., PELEG, S., AND AVNIR, D. 1995. Symmetry as a continuous feature. IEEE Trans. Patt. Analy. Mach. Intell. 17, 12,
1154–1166.
3.
Properties of the Illumination Function
    B(x) = (ρ/π) ∫_{E(x)} B(y) (cos θ1 cos θ2 / r^2) dy    (3.1)
where E(x) denotes the part of the emitter E that is visible from point x, and therefore accounts for the influence of obstacles. The function B(x) is differentiable, and it is possible to compute its derivatives [15, 11, 30]. We have provided the expression of the first derivative (the Jacobian, or gradient) and of the second derivative (the Hessian) of the radiosity at point x.

These derivatives can be computed explicitly [15]; this computation reuses several quantities that are also needed to compute the radiosity itself. It is therefore possible to compute the radiosity and its derivatives simultaneously, for a modest overhead (on the order of 30%) compared to computing the radiosity alone.
Figure 3.1 – The sharpness of a reflection depends on the BRDF. Here, the BRDF ranges from perfectly specular, on the left, to diffuse, on the right.
Lighting simulation involves phenomena that are more or less blurry, depending on the objects and light sources. For example, the reflection off a specular object is perfectly sharp and contains all the details of the reflected scene, while the reflection off a diffuse object is very blurry (see Figure 3.1). Similarly, the shadow cast by a point light source is sharp, while the shadow cast by an area light source is blurry (see Figure 3.2). Finally, indirect lighting in a scene is generally blurrier than direct lighting (see Figure 3.3).

This sharpness or blurriness can be expressed in terms of the frequency content of the illumination function: sharp effects (hard shadows, specular reflections) correspond to high frequencies, while blurry effects (soft shadows, diffuse reflections, indirect lighting) correspond to low frequencies.
With François Sillion and Cyril Soler, in a collaboration with Frédo Durand and Eric Chan of MIT CSAIL, funded by an INRIA Équipe Associée, we have shown that this frequency content of the lighting can be predicted as a function
1. Daniel Lischinski, Brian Smits and Donald P. Greenberg. "Bounds and Error Estimates for Radiosity". In ACM SIGGRAPH '94, p. 67–74, 1994.
2. James Arvo, Kenneth Torrance and Brian Smits. "A Framework for the Analysis of Error in Global Illumination Algorithms". In ACM SIGGRAPH '94, p. 75–84, 1994.
3. George Drettakis and Eugene Fiume. "Accurate and Consistent Reconstruction of Illumination Functions Using Structured Sampling". Computer Graphics Forum (Eurographics '93), 12(3):C273–C284, September 1993.
Figure 3.2 – The shadow cast by a point light source is sharp, while the shadow cast by an area light source is softer (images courtesy of Ulf Assarsson).
Figure 3.3 – Indirect lighting is generally blurrier than direct lighting (images courtesy of Cyril Soler).
of the scene configuration (light sources, occluders, materials) [5] (see p. 164). We consider both the spatial spectrum and the angular spectrum:
– We consider the lighting as a local light field, parameterized by a distance and an angle relative to a reference ray.
– At the light source, the (spatial and angular) spectrum of this local light field is known.
– Each step between the light source and the receiver is seen as a filter acting on the frequency content:
  – Transport through free space acts as a shear of the spectrum along the angular-frequency axis. This effect converts spatial frequencies into angular frequencies.
  – In the presence of an obstacle, the spectrum of the obstacle is convolved with that of the local light field, introducing new spatial frequencies.
  – Reflection on a receiver can be decomposed into several phases, each with its own associated filter. The overall effect is that of a low-pass filter on the angular frequencies. The cutoff frequency of this filter is tied to how specular the BRDF is: a diffuse BRDF cuts the angular frequencies entirely, while a specular BRDF preserves them entirely.
– The combined effect of these different filters makes it possible to predict the extent of the (spatial and angular) spectrum at any given point of the scene. This knowledge can then be exploited to guide lighting simulation, by adapting the sampling to the frequencies.
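The filter pipeline above can be sketched as a conservative bandwidth tracker. This is our own simplification: the full analysis manipulates 2D spectra of the local light field, while the sketch below only propagates a scalar bound on the maximum spatial and angular frequency.

```python
from dataclasses import dataclass

@dataclass
class Bandwidth:
    spatial: float   # bound on the spatial frequencies of the local light field
    angular: float   # bound on its angular frequencies

def transport(bw, d):
    # Free-space transport over distance d shears the spectrum along
    # the angular axis: spatial frequencies become angular frequencies.
    return Bandwidth(bw.spatial, bw.angular + d * bw.spatial)

def occlusion(bw, occluder_spatial):
    # A blocker convolves its spectrum with the light field's,
    # introducing new spatial frequencies.
    return Bandwidth(bw.spatial + occluder_spatial, bw.angular)

def reflection(bw, brdf_cutoff):
    # Reflection low-passes the angular frequencies; a diffuse BRDF
    # has a cutoff near zero, a mirror keeps everything.
    return Bandwidth(bw.spatial, min(bw.angular, brdf_cutoff))
```

Chaining these operators along a light path yields a conservative bound on the local frequency content, which is exactly the information needed to choose a sampling rate.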
One of the most interesting observations from this work is that spatial and angular frequencies are coupled by the free-space transport step. When a non-specular BRDF removes certain angular frequencies, this also removes spatial frequencies as a consequence. The longer the transport, the more the spatial frequencies become tied to high angular frequencies, and thus the more the BRDF's cutoff of angular frequencies translates into low spatial frequencies.
This effect, confirmed by experimental studies, opens up many possibilities in lighting simulation. The ability to predict the maximum frequencies at every point of the scene makes it possible to guide the sampling during the simulation process, whatever the method used for the computations (photon mapping, radiosity, PRT...). This work should serve as the basis for many future studies and practical applications.
Figure 3.4 – Application of our frequency analysis of the illumination function. The spatial frequencies on the diffuse receiver vary non-monotonically with the frequencies of the occluders: (a) low-frequency occluders: low frequencies on the receiver; (b) higher frequencies in the occluders: higher frequencies on the receiver; (c) high-frequency occluders: low frequencies on the receiver.
As an application of our analysis, consider the spectrum of the illumination function on a diffuse reflector in the presence of occluders (see figure 3.4). This spectrum varies non-monotonically with the spectrum of the occluders: at first, increasing the occluder frequency increases the spatial frequencies on the receiver (see figure 3.4(b)). Beyond a certain threshold, however, increasing the occluder frequency instead decreases the spatial frequencies on the receiver (see figure 3.4(c)).
This effect, already studied before⁴, is fully explained by our analysis: the occluder introduces spatial frequencies; the transport after the occluder pushes these spatial frequencies into the angular frequencies; and the reflection on a diffuse surface cuts the angular frequencies, and hence the spatial frequencies tied to them.
4. François Sillion and George Drettakis. "Feature-Based Control of Visibility Error: A Multiresolution Clustering Algorithm for Global Illumination". In ACM SIGGRAPH '95, p. 145–152, 1995.
3.3 Discussion
In this chapter, we have presented our work on the properties of illumination functions. We have shown that the local properties of the illumination can be deduced from the respective positions of the objects. These properties can be used to guide resolution methods, thereby increasing their efficiency.
Our work on the frequency content of the illumination function has not yet delivered its full potential; we intend to pursue it with further research.
Each reflection on a non-specular surface after transport through free space lowers the frequency content, both in space and in angle. Consequently, high-frequency effects occur either during specular reflections, or in direct lighting, or when free-space transport has little effect, that is, when two objects are close to each other.
Given the progress of programmable graphics hardware, some of these high-frequency effects can be computed separately and interactively. The low-frequency effects could then be computed separately, with sparser sampling. This real-time computation of high-frequency lighting effects is the subject of the next chapter.
3.4 Papers
3.4.1 List of papers
– Accurate Computation of the Radiosity Gradient for Constant and Linear Emitters (EGWR '95)
– An exhaustive error-bounding algorithm for hierarchical radiosity (CGF '98)
– A Frequency Analysis of Light Transport (Siggraph 2005)
3.4.2 Accurate Computation of the Radiosity Gradient for Constant and Linear Emitters (EGWR '95)
Authors: Nicolas Holzschuch and François Sillion
Conference: 6th Eurographics Workshop on Rendering, Dublin, Ireland.
Date: June 1995
Accurate Computation of the Radiosity Gradient for
Constant and Linear Emitters
Nicolas Holzschuch, François Sillion
iMAGIS/IMAG⋆
1 Introduction
Computing the effect of a given patch on the radiosity of another patch is easily done assuming the radiosity on both patches is constant. In that case, we can express the influence of the emitter on the receiver with a single number, the form-factor. However, assuming the radiosity on both patches is constant is a strong assumption, and it introduces a specific source of error in the resolution algorithm.
In 1994, Arvo et al. [2] recorded all possible sources of error in global illumination algorithms, and introduced a framework for the analysis of error. Errors can occur at several levels in the resolution process:
– During modeling: our geometry is not exactly that of the scene we want to compute, and the BRDFs are not exact either.
– During discretisation: our set of basis functions is not able to represent the real solution, but only an approximation of it.
– During computation: we do not compute transfer elements exactly, but only with finite precision.
Lischinski et al. [9] presented an error driven refinement strategy for hierarchical
radiosity. They were able to maintain upper and lower bounds on computed radiosity,
and to concentrate their work in places where the difference was too large.
However, practical tools are still lacking to measure discretisation error. The problem
is to efficiently reconstruct the radiosity function, with only a small number of samples.
The best position for sampling points can only be found with total knowledge of the
radiosity function.
In practice, at each step, we have to intuit the behaviour of the function from our
current set of samples, in order to guess if we should – or not – introduce new sampling
points, and where.
⋆ iMAGIS is a joint research project of CNRS/INRIA/INPG/UJF. Postal address: B.P. 53, F-38041 Grenoble Cedex 9, France. E-mail: [email protected].
Knowing the radiosity derivatives allows better sampling, and thus a reduction of the discretisation error. Heckbert [6] and Lischinski et al. [7] predicted an efficient surface mesh using derivative discontinuities. Drettakis and Fiume [4, 5] used information on the structure of the function to accurately reconstruct the illumination. Vedel and Puech [11] presented a refinement criterion based on gradient values at the nodes. However, these authors usually resorted to approximate values of the partial derivatives, obtained from several radiosity computations and finite differences. Computing exact values for the gradient allows arbitrary precision in our refinement criterion.
Arvo [1] presented a method to compute the irradiance Jacobian in the case of partially occluded light sources. His method is presented for constant emitters. This paper introduces a new formulation of the radiosity gradient, valid for arbitrary radiosity functions on the emitter. The derivation is presented in the case of total visibility, i.e. without occluders. However, we shall see that extending the algorithm to the case of partial visibility is easy using Arvo's technique, since the two algorithms are largely independent.
[Figure 1: geometric notation for the radiosity integral: point x on the receiver A_1 with normal \vec{n}_1, point y on the emitter A_2 with normal \vec{n}_2, the vector \vec{r}_{12} joining x to y, and the angles \theta_1 and \theta_2 between \vec{r}_{12} and the normals.]
Our knowledge of radiosity at the receiving point derives from the integral equation:

B(x) = \frac{\rho}{\pi} \int_{A_2} B(y)\, \frac{\cos\theta_1 \cos\theta_2}{\|\vec{r}_{12}\|^2}\, dA_2 \qquad (1)

where \vec{r}_{12} is the vector joining point x on the receiver and point y on the emitter, \theta_1 is the angle between \vec{r}_{12} and the normal on the receiver, \theta_2 the angle between \vec{r}_{12} and the normal on the emitter, and dA_2 the area element on the emitter around point y (see Fig. 1).
Should any occluders be present between point x and emitter A2 , the integral would
only be over the part of A2 visible from x.
We can reformulate Equation 1 as the expression of the flux of a vector field through surface A_2:

B(x) = \int_{A_2} \vec{F} \cdot d\vec{A}_2 \qquad (2)

where \vec{F} is:

\vec{F} = -\frac{\rho\, B(y)\, (\vec{r}_{12} \cdot \vec{n}_1)\, \vec{r}_{12}}{\pi \|\vec{r}_{12}\|^4}
A classic way to deal with flux integrals such as Equation 2 is to transform them into a line integral using Stokes' theorem²:

\int_A (\nabla \times \vec{V}) \cdot d\vec{A} = \oint_{\partial A} \vec{V} \cdot d\vec{x} \qquad (3)

These line integrals can be easier to compute, and are also easier to estimate when there is no closed form. However, to use Stokes' theorem (3), we need to express the vector field \vec{F} as the curl of another vector field, \vec{V}.
A classic property is that this is equivalent to \vec{F} having a null divergence (\nabla \cdot \vec{F} = 0). Intuitively, the divergence of a vector field is a quantity that expresses, at each point, how much the field "radiates away" from this point, while the curl of a vector field expresses how much it "turns around" it at each point. The divergence of a curl is always null (\nabla \cdot (\nabla \times \vec{V}) = 0), and if a field has a null divergence, it can be expressed as a curl.
An easy computation shows that the divergence of \vec{F} with respect to point y on surface A_2 is³:

\nabla \cdot \vec{F} = -\frac{\rho}{\pi}\, \frac{\vec{r}_{12} \cdot \vec{n}_1}{\|\vec{r}_{12}\|^2}\, (\nabla B \cdot \vec{r}_{12}) \qquad (4)

and hence is null if the gradient of B on the emitting surface is null, that is to say, if the radiosity of the emitter is constant.
We can always separate \vec{F} into two parts:

\vec{F} = \nabla \times \vec{V} + \vec{G} \qquad (5)
Using the properties of cross-products and dot-products, we can rewrite Equation 5 as:

\frac{2\pi}{\rho} B(x) = -\vec{n}_1 \cdot \oint_{\partial A_2} B(y)\, \frac{\vec{r}_{12} \times d\vec{x}_2}{\|\vec{r}_{12}\|^2} + \int_{A_2} \frac{\vec{r}_{12}}{\|\vec{r}_{12}\|^2} \cdot (\vec{n}_1 \times (\nabla B \times \vec{n}_2))\, dA_2 \qquad (6)
² \oint_{\partial A} stands for the integral over the contour of A, and expresses that this contour is closed.
³ In this section, all derivative signs (\nabla, \nabla\cdot, \nabla\times) are relative to point y on surface A_2.
Note that this rewriting process does not make any assumption whatsoever on B(y). Hence it can be used in any case. An interesting case is when B(y) is constant: then \vec{G} = \vec{0}, and the second term is null. Another interesting case is B(y) being linear: then its gradient is constant and can be carried out of the second integral, leaving only a pure geometric factor to compute. Appendix A presents a detailed study of these two cases.
This rewriting process separates the radiosity into two terms: a contour integral that we can generally compute, provided that we know the radiosity on the emitter, and a surface integral, generally harder to compute in closed form. But, as shown later, since we have an integral form of this term, we can compute its value to arbitrary precision.
In case the emitter A_2 does not depend on the position of the point x (that is to say, in case there are no occluders between point x and the emitting surface A_2), this equation is equivalent to:

\nabla B(x) = \int_{A_2} \nabla(\vec{F} \cdot d\vec{A}_2)

Or, if we use Equation 5:

\nabla B(x) = \oint_{\partial A_2} \nabla(\vec{V} \cdot d\vec{x}_2) + \int_{A_2} \nabla(\vec{G} \cdot d\vec{A}_2) \qquad (8)
If the emitter depends on the position of point x (that is, if there are occluders), the expression of \nabla B(x) is the sum of two terms; the first one takes into account the variation of \vec{F}, and is exactly the term we are discussing, and the second one takes into account the variation of the emitter. Thus, it is easy to merge a method that computes the gradient with occluders and a constant emitter, as in Arvo [1], with our method, which computes the gradient for an arbitrary emitter but without occluders.
Note that in this section we are taking a derivative with respect to point x on the receiving surface, not with respect to point y on the emitting surface. For our differentiation operator, the radiosity at the emitting point B(y) can therefore be regarded as constant, as can its gradient \nabla B(y).
Using the properties of the gradient of a scalar product, starting from Equation 8, we can express the gradient of radiosity at the receiving point:

-\frac{2\pi}{\rho} \nabla B(x) = \vec{n}_1 \times \oint_{\partial A_2} B(y)\, \frac{d\vec{x}_2}{\|\vec{r}_{12}\|^2} + 2 \oint_{\partial A_2} B(y)\, \frac{\vec{n}_1 \cdot \vec{r}_{12}}{\|\vec{r}_{12}\|^4}\, (\vec{r}_{12} \times d\vec{x}_2) + \int_{A_2} (\vec{n}_1 \times (\nabla B(y) \times \vec{n}_2))\, \frac{dA_2}{\|\vec{r}_{12}\|^2} - 2 \int_{A_2} \frac{(\vec{n}_1 \times (\nabla B(y) \times \vec{n}_2)) \cdot \vec{r}_{12}}{\|\vec{r}_{12}\|^4}\, \vec{r}_{12}\, dA_2 \qquad (9)
This equation, like the radiosity equation (6), is divided into two parts: a contour integral which usually has a closed form, and a surface integral that we can estimate to any arbitrary precision.
As before, two interesting cases occur: if the gradient on the emitter is null, that is, if we assume a constant radiosity on the emitter, all surface integrals vanish. And if the gradient on the emitter is constant, that is, if we assume a linear radiosity on the emitter, it can be carried out of the surface integrals, leaving us with purely geometrical factors or vectors to compute. Please refer to Appendix A for a detailed study of these cases.
3.2 Using the gradient
Knowing the gradient at a point gives very valuable information on the function we are
studying. As previous authors pointed out, the gradient may be used either to reconstruct
the illumination function before display, or to check the consistency of our discretisation
hypothesis.
Reconstructing the illumination function If we know the radiosity values and the
gradient at our sample points, we can then reconstruct the radiosity function as, e.g. a
bicubic spline.
Salesin et al. [10] and Bastos et al. [3] proposed such methods for reconstruction
of radiosity using estimates of gradient. Ward and Heckbert [12] computed irradiance
gradients to interpolate irradiance on receiving surfaces.
Refinement criterion Many radiosity algorithms assume a constant radiosity over patches. It may seem strange to compute the gradient of radiosity in that case, but in fact the information given by the gradient can be used there as well.
Using the derivatives allows us to check precisely whether our discretisation hypotheses were correct, and if they were not, it also gives a hint on where it would be best to refine in order to minimize the discretisation error.
[Figure 2: linear interpolation of the radiosity across a patch of given width, between samples B(y_0) and B(y_1), compared with the polynomial approximation built from the same samples.]
this cubic interpolant differs from our linear assumption (see Fig. 2b for an example in
2D). We can even compute the difference between linear and cubic interpolant without
explicitly computing the interpolants. This criterion also gives the best next sampling
point, the position of the maximum difference between the two interpolants.
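In one dimension, this criterion can be sketched as follows (our own illustrative code, not the paper's implementation): build the cubic Hermite interpolant from the two radiosity samples and their gradients, and locate the largest deviation from the linear interpolant; the deviation estimates the discretisation error, and its location suggests the next sampling point.

```python
import numpy as np

def hermite_vs_linear_gap(b0, b1, g0, g1):
    # Cubic Hermite interpolant on a unit patch, built from the
    # endpoint radiosities b0, b1 and the endpoint gradients g0, g1.
    t = np.linspace(0.0, 1.0, 1001)
    h = (b0 * (2 * t**3 - 3 * t**2 + 1) + g0 * (t**3 - 2 * t**2 + t)
         + b1 * (-2 * t**3 + 3 * t**2) + g1 * (t**3 - t**2))
    lin = b0 + (b1 - b0) * t              # linear interpolant
    gap = np.abs(h - lin)                 # pointwise difference
    i = int(np.argmax(gap))
    return gap[i], t[i]                   # max difference and its location
```

When the gradients are consistent with the linear assumption the gap is zero, so no refinement is triggered; otherwise the returned location is a good candidate for the next sample.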
[Figure 3: comparison between the real error and the approximate error estimated from B(y_0), B(y_1) and the gradient \nabla B.]
The ability to compute radiosity gradients for linear emitters is especially interesting when using linear basis functions or linear wavelets. In that case, the discretisation error can be precisely isolated.
Our next step will be a complete implementation of the refinement criterion described in section 3.2, to effectively reduce the discretisation error within a hierarchical radiosity framework with linear radiosity.
We will then have the necessary background for a complete radiosity algorithm in which all possible sources of error (visibility, discretisation, computational) are recorded and monitored, allowing us to focus the computing resources at the points where this error is large.
6 Acknowledgements
Color pictures were computed by Myriam Houssay-Holzschuch using the GMT package, developed by Wessel and Smith [13].
The authors would like to thank the anonymous reviewers for useful insights and
positive criticism.
References
1. Arvo, J.: The Irradiance Jacobian for Partially Occluded Polyhedral Sources. SIGGRAPH
(1994) 343–350
2. Arvo, J., Torrance, K., Smits, B.: A Framework for the Analysis of Error in Global Illumination Algorithms. SIGGRAPH (1994) 75–84
3. Bastos, R. M., de Sousa, A. A., Ferreira, F. N.: Reconstruction of Illumination Functions using Bicubic Hermite Interpolation. Fourth Eurographics Workshop on Rendering (June 1993) 317–326
4. Drettakis, G., Fiume, E.: Accurate and Consistent Reconstruction of Illumination Functions Using Structured Sampling. Computer Graphics Forum (Eurographics 1993 Conf. Issue) 273–284
5. Drettakis, G., Fiume, E.: Concrete Computation of Global Illumination Using Structured
Sampling. Third Eurographics Workshop on Rendering (May 1992) 189–201
6. Heckbert, P. S.: Simulating Global Illumination Using Adaptive Meshing. PhD Thesis, University of California, Berkeley, June 1991.
7. Lischinski, D., Tampieri, F., Greenberg, D. P.: Discontinuity Meshing for Accurate Radios-
ity. IEEE Computer Graphics and Applications 12,6 (November 1992) 25–39
8. Lischinski, D., Tampieri, F., Greenberg, D. P.: Combining Hierarchical Radiosity and Dis-
continuity Meshing. SIGGRAPH (1993)
9. Lischinski, D., Smits, B., Greenberg, D. P.: Bounds and Error Estimates for Radiosity.
SIGGRAPH (1994) 67–74
10. Salesin, D., Lischinski, D., DeRose, T.: Reconstructing Illumination Functions with Selected
Discontinuities. Third Eurographics Workshop on Rendering (May 1992) 99–112
11. Vedel, C., Puech, C.: Improved Storage and Reconstruction of Light Intensities on Surfaces.
Third Eurographics Workshop on Rendering (May 1992) 113–121
12. Ward, G. J., Heckbert, P. S.: Irradiance Gradients. Third Eurographics Workshop on Ren-
dering (May 1992) 85–98
13. Wessel, P. and Smith, W. H. F.: Free Software helps Map and Display Data. EOS Trans.
Amer. Geophys. U., vol. 72, 441–446, 1991
A Application to Constant and Linear Emitters
A.1 Case of a constant emitter
In the case of a constant emitter, Equations 6 and 9 reduce to:

\frac{2\pi}{\rho} B(x) = -\vec{n}_1 \cdot \oint_{\partial A_2} B(y)\, \frac{\vec{r}_{12} \times d\vec{x}_2}{\|\vec{r}_{12}\|^2} \qquad (10)

-\frac{2\pi}{\rho} \nabla B(x) = \vec{n}_1 \times \oint_{\partial A_2} B(y)\, \frac{d\vec{x}_2}{\|\vec{r}_{12}\|^2} + 2 \oint_{\partial A_2} B(y)\, \frac{\vec{n}_1 \cdot \vec{r}_{12}}{\|\vec{r}_{12}\|^4}\, (\vec{r}_{12} \times d\vec{x}_2) \qquad (11)
If A_2 is a polygon, these integrals have a closed form, and yield:

\frac{2\pi}{\rho} B(x) = -B_2\, \vec{n}_1 \cdot \sum_i I_1(i)\, (\vec{r}_i \times \vec{e}_i)

-\frac{2\pi}{\rho} \nabla B(x) = B_2 \sum_i I_1(i)\, (\vec{n}_1 \times \vec{e}_i) + 2 B_2 \sum_i ((\vec{r}_i \times \vec{e}_i) \cdot \vec{n}_1)\, (I_2(i)\, \vec{r}_i + J_2(i)\, \vec{e}_i)

where the sum extends over all the edges of the polygon, and B_2 is the radiosity of the emitter. \vec{r}_i, \vec{e}_i, I_1(i), I_2(i) and J_2(i) stand for (see also Fig. 4):

\vec{r}_i = \overrightarrow{x E_i}
\vec{e}_i = \overrightarrow{E_i E_{i+1}}
I_1(i) = \frac{\gamma_i}{\|\vec{r}_i \times \vec{e}_i\|}
I_2(i) = \frac{1}{2 \|\vec{r}_i \times \vec{e}_i\|^2} \left( \frac{\vec{r}_{i+1} \cdot \vec{e}_i}{\|\vec{r}_{i+1}\|^2} - \frac{\vec{r}_i \cdot \vec{e}_i}{\|\vec{r}_i\|^2} + \|\vec{e}_i\|^2\, I_1(i) \right)
J_2(i) = \frac{1}{2 \|\vec{e}_i\|^2} \left( \frac{1}{\|\vec{r}_i\|^2} - \frac{1}{\|\vec{r}_{i+1}\|^2} - 2 I_2(i)\, \vec{r}_i \cdot \vec{e}_i \right)

and \gamma_i is the angle subtended by edge \vec{e}_i from point x.
[Figure 4: notation for a polygonal emitter A_2 seen from point x on A_1: \vec{r}_i joins x to vertex E_i, \vec{e}_i is the edge from E_i to E_{i+1}, and \gamma_i is the angle subtended by this edge at x.]
Computing \nabla B(x) requires roughly 87 more multiplications, 57 more additions and 3 more divisions, which, on the same hardware, amounts to approximately 150 additions.
Although this computational cost may depend on implementation details as well as on the computer used (some compilers have very fast implementations of arc cos and square root), computing the gradient along with the radiosity does not unduly increase computation time.
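The constant-emitter contour integral (Equation 10) is straightforward to implement. The sketch below is our own code, not the paper's: it folds the winding-dependent sign into an absolute value, and assumes an unoccluded polygonal emitter entirely in front of the receiver.

```python
import numpy as np

def point_to_polygon_form_factor(x, n1, verts):
    """Point-to-polygon form factor from the contour integral:
    F = (1/2pi) |n1 . sum_i gamma_i unit(r_i x r_{i+1})|,
    with gamma_i the angle subtended by edge i from x.
    Illustrative sketch: unoccluded, front-facing emitter only."""
    x, n1 = np.asarray(x, float), np.asarray(n1, float)
    total = np.zeros(3)
    k = len(verts)
    for i in range(k):
        ri = np.asarray(verts[i], float) - x
        rj = np.asarray(verts[(i + 1) % k], float) - x
        cross = np.cross(ri, rj)                      # r_i x e_i
        gamma = np.arccos(np.clip(
            np.dot(ri, rj) / (np.linalg.norm(ri) * np.linalg.norm(rj)),
            -1.0, 1.0))                               # subtended angle
        total += gamma * cross / np.linalg.norm(cross)
    return abs(np.dot(n1, total)) / (2.0 * np.pi)
```

For a 2 x 2 square emitter at unit height above a parallel receiver, this matches the classical closed form for a differential element under a rectangle.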
A.2 Case of a linear emitter
If the emitter is not constant, the gradient of radiosity on the emitter is not null, and
must be used in our computations. However, if we assume the radiosity of the emitter
is linear, then its gradient is constant and can be carried out of the integrals. Moreover,
this gradient is orthogonal to ~n2 , and can be expressed as:
\nabla B(y) = \vec{n}_2 \times \vec{k}

with:

\vec{m} = \int_{A_2} \frac{\vec{r}_{12}}{\|\vec{r}_{12}\|^2}\, dA_2 = \int_{A_2} \nabla(\ln r_{12})\, dA_2
Computing the contour integrals does not pose any particular difficulty. However, computing \vec{m} is harder. We can make use of Ostrogradsky's theorem, similar to Stokes':

\int_A \nabla(V) \times d\vec{A} = -\oint_{\partial A} V\, d\vec{x}

to express \vec{m} \times \vec{n}_2. \vec{m} \cdot \vec{n}_2 is null if point x is on polygon A_2. If point x is not on polygon A_2, it can be estimated with arbitrary precision.
The formula for B(x) is then:

\frac{2\pi}{\rho} B(x) = -\vec{n}_1 \cdot \oint_{\partial A_2} B(y)\, \frac{\vec{r}_{12} \times d\vec{x}_2}{\|\vec{r}_{12}\|^2} - (\vec{n}_1 \cdot \vec{n}_2)\, \vec{k} \cdot \oint_{\partial A_2} \ln(r_{12})\, d\vec{x}_2

-\frac{2\pi}{\rho} \nabla B(x) = \vec{n}_1 \times \oint_{\partial A_2} B(y)\, \frac{d\vec{x}_2}{\|\vec{r}_{12}\|^2} + 2 \oint_{\partial A_2} B(y)\, \frac{\vec{n}_1 \cdot \vec{r}_{12}}{\|\vec{r}_{12}\|^4}\, (\vec{r}_{12} \times d\vec{x}_2) - (\vec{n}_1 \cdot \vec{n}_2) \oint_{\partial A_2} \frac{\vec{r}_{12}}{\|\vec{r}_{12}\|^2}\, (\vec{k} \cdot d\vec{x}_2) + (\vec{n}_2 \cdot (\vec{n}_1 \times \vec{k}))\, (\vec{n}_2\, X_1 - 2 (\vec{n}_2 \cdot \vec{r}_0)\, \vec{p})
with:

\vec{p} = \int_{A_2} \frac{\vec{r}_{12}}{\|\vec{r}_{12}\|^4}\, dA_2
X_1 = \int_{A_2} \frac{dA_2}{\|\vec{r}_{12}\|^2}

Computing \vec{p} is exactly like computing \vec{m}: we can compute \vec{p} \times \vec{n}_2, and we can estimate \vec{p} \cdot \vec{n}_2 with arbitrary precision. Then we use:

\vec{p} = \vec{n}_2 \times (\vec{p} \times \vec{n}_2) + (\vec{p} \cdot \vec{n}_2)\, \vec{n}_2
Hence:

\frac{2\pi}{\rho} B(x) = -\vec{n}_1 \cdot \sum_i (B_i\, I_1(i) + \delta B_i\, J_1(i))\, (\vec{r}_i \times \vec{e}_i) - (\vec{n}_1 \cdot \vec{n}_2) \sum_i (\vec{k} \cdot \vec{e}_i)\, K_1(i)

with:

\delta B_i = B_{i+1} - B_i
J_1(i) = \frac{1}{\|\vec{e}_i\|^2} \left( \ln \frac{\|\vec{r}_{i+1}\|}{\|\vec{r}_i\|} + (\vec{r}_i \cdot \vec{e}_i)\, I_1(i) \right)
K_1(i) = \frac{1}{2 \|\vec{e}_i\|^2} \left( \vec{r}_{i+1} \cdot \vec{e}_i\, \ln(\|\vec{r}_{i+1}\|^2) - \vec{r}_i \cdot \vec{e}_i\, \ln(\|\vec{r}_i\|^2) + 2 \|\vec{r}_i \times \vec{e}_i\|\, \gamma_i \right)
K_2(i) = \frac{1}{\|\vec{e}_i\|^2} \left( I_1(i) - \|\vec{r}_i\|^2\, I_2(i) - 2 (\vec{r}_i \cdot \vec{e}_i)\, J_2(i) \right)
X_2 = \int_{A_2} \frac{dA_2}{\|\vec{r}_{12}\|^4}
If the distance between point x and the emitter surface is null, \vec{m} \cdot \vec{n}_2 and \vec{p} \cdot \vec{n}_2 are both null. If it is not, we prefer to estimate X_1 and X_2. As we know bounds on the values of the function and its derivatives, we make use of a Gaussian quadrature.
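The decomposition of \vec{p} used above is the standard identity splitting a vector into its component in the emitter plane and its component along the normal. A throwaway numerical check (our own code, not the paper's):

```python
import numpy as np

def recompose(p, n2):
    # p = n2 x (p x n2) + (p . n2) n2, for a unit normal n2:
    # the first term is the component of p in the emitter plane,
    # the second the component along the normal.
    in_plane = np.cross(n2, np.cross(p, n2))
    along = np.dot(p, n2) * n2
    return in_plane + along
```

This is why knowing \vec{p} \times \vec{n}_2 exactly and \vec{p} \cdot \vec{n}_2 approximately is enough to recover \vec{p}.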
[Color plates: A. Radiosity on the receiving plane, due to a constant emitter. B. Norm of the radiosity gradient, due to a constant emitter.]
An exhaustive error-bounding algorithm for hierarchical radiosity

Nicolas Holzschuch†
François X. Sillion
iMAGIS‡
GRAVIR/IMAG - INRIA
Abstract
This paper presents a complete algorithm for the evaluation and control of error in radiosity calculations. Pro-
viding such control is both extremely important for industrial applications and one of the most challenging issues
remaining in global illumination research.
In order to control the error, we need to estimate the accuracy of the calculation while computing the energy exchanged between two objects. Having this information for each radiosity interaction allows us to allocate more resources to refine interactions with greater potential error, and to avoid spending time refining interactions already represented with sufficient accuracy.
Until now, the accuracy of the computed energy exchange could only be approximated using heuristic algorithms.
This paper presents the first exhaustive algorithm to compute fully reliable upper and lower bounds on the energy
being exchanged in each interaction. This is accomplished by computing first and second derivatives of the ra-
diosity function where appropriate, and making use of two concavity conjectures. These bounds are then used in a
refinement criterion for hierarchical radiosity, resulting in a global illumination algorithm with complete control
of the error incurred.
Results are presented, demonstrating the possibility to create radiosity solutions with guaranteed precision. We
then extend our algorithm to consider linear bounding functions instead of constant functions, thus creating sim-
pler meshes in regions where the function is concave, without loss of precision.
Our experiments show that the computation of radiosity derivatives along with the radiosity values only requires
a modest extra cost, with the advantage of a much greater precision.
† Current position: Invited Researcher, Department of Computer Science, University of Cape Town, South Africa.
‡ iMAGIS is a joint research project between CNRS, INRIA, INPG and Université Joseph Fourier (Grenoble I). Postal address: B.P. 53, F-38041 Grenoble Cedex 9, France. E-mail: [email protected].

Global illumination algorithms generally have at least a parameter that the user can manipulate, choosing either fast computations or precise results. For Monte-Carlo ray tracing algorithms, this parameter can be the number of rays. For hierarchical radiosity algorithms, it can be the refinement threshold, used to decide whether or not to refine a given
2 N. Holzschuch and F. X. Sillion / An exhaustive error-bounding algorithm for hierarchical radiosity
[Figure 1: geometric notations for the radiosity equation: patch A_1 with point x, patch A_2 with point y.]

In 1994, Lischinski¹ proposed a refinement criterion for hierarchical radiosity such that the error on the energy at each point of the scene could be controlled by the refinement threshold. Their algorithm used upper and lower bounds on the point-to-area form factor for each interaction in order to compute upper and lower bounds for the radiosity at each point in the scene. However, they had no way to compute reliable upper and lower bounds for the point-to-area form-factor on a given interaction, and still resorted to sampling: computing a set of values for the form-factor, and taking the minimum and maximum of these values.

Although Lischinski's method is easy to implement, it is not totally reliable. In this paper, we present a method allowing us to compute fully reliable upper and lower bounds for the point-to-area form-factor on any interaction. To achieve this goal, we use our knowledge of the point-to-area form-factor derivatives together with its concavity properties.

These concavity properties of the point-to-area form-factor are described in section 3. They extend the unimodality conjecture proposed by Drettakis²,³. Like the unimodality conjecture, they are only conjectures, and despite their apparent simplicity, we have been unable to find a complete demonstration for them. However, we also have been unable to exhibit a counter-example.

As is explained in appendix B, we can compute exact values for the derivatives of the point-to-area form-factor, either for the first derivative, the gradient vector, or for the second derivative, the Hessian matrix. As we shall also see in appendix B, it is indeed faster to compute an exact value for the form-factor derivative than to compute approximate values using several samples. Using our knowledge of the derivatives along with the concavity properties of the point-to-area form-factor, we show in section 4 how to derive bounds for the point-to-area form-factor in any unoccluded interaction. We also show an implementation of the refinement criterion using these bounds.

When dealing with partially occluded interactions we cannot use the previous bounds, as the concavity conjectures do not hold in this case. But we can exhibit two emitters that are convex and bound the actual emitter, which we call the minimal and the maximal emitter. Using the previously defined algorithm, we find an upper bound for the maximal emitter, and a lower bound for the minimal emitter. The algorithm for finding these convex emitters is detailed in section 5.

2. Background

The radiosity method was introduced in the field of light transfer in 1984 by Goral⁴. This method uses a simplification in order to solve the global illumination problem: it assumes that all the objects in the scene are ideal diffuse surfaces: their bidirectional reflectance is uniform, and thus does not depend on the outgoing direction.

In this case, the radiosity emitted at a given point x can be expressed as an integral equation:

B(x) = E(x) + \rho_d(x) \int_{y \in S} B(y)\, \frac{\cos\theta_1 \cos\theta_2}{\pi r^2}\, V(x, y)\, dy \qquad (1)

In this equation, S is the set of all points y, r is the distance between point x and point y, and \theta_1 and \theta_2 are the angles between the \overrightarrow{xy} vector and the normals to the surfaces at points x and y respectively (refer to figure 1 for the geometric notations). \rho_d(x) is the diffuse reflectance at point x, and V(x, y) expresses whether point x is visible from point y or not.

In order to solve equation 1, Goral⁴ suggested to discretize the scene into a set of patches [P_i], over which a constant radiosity B_i is assumed. In this case, the radiosity at point x becomes:

B(x) = E(x) + \rho_d(x) \sum_i B_i \int_{y \in P_i} \frac{\cos\theta \cos\theta'}{\pi r^2}\, V(x, y)\, dy \qquad (2)

The purely geometric quantity

F_i(x) = \int_{y \in P_i} \frac{\cos\theta \cos\theta'}{\pi r^2}\, V(x, y)\, dy

is called the point-to-area form-factor at point x from patch i. It only depends on the respective positions of point x and patch i.

Since we assume a constant radiosity value within the patch, we can compute this value as the average of all the point values. This leads to a matrix equation:

B_j = E_j + \rho_j \sum_i F_{ji} B_i \qquad (3)
where the geometric quantity

    F_ij = (1/A_j) ∫_{x∈P_j} ∫_{y∈P_i} (cos θ cos θ′ / π r²) V(x, y) dx dy

is called the form-factor. Schröder5 showed that there is a closed-form expression for the form-factor in the case of two fully visible polygonal patches. In the general case, we do not have access to the exact value of the form-factor, but only to approximate values.

Equation 3 can be solved in an iterative manner, using the Jacobi or Gauss-Seidel iterative methods (see Cohen6). The problem is that in order to compute one full bounce of light across the surfaces in the scene, we have to compute the entire form-factor matrix, which is quadratic in the number of patches.

A significant improvement over the classical radiosity method is hierarchical radiosity. In "standard" radiosity, the discretisation of one object into patches does not depend on the objects with which it interacts. In order to model the interaction between objects that are very close, and exchange lots of energy, we need to subdivide them into many patches, so as to get a precise modelling of the radiosity. On the other hand, an interaction between two objects that are far away could be modelled with fewer patches.

In hierarchical radiosity, introduced in 1990 by Hanrahan7, each object is subdivided into a hierarchy of patches, with each node in the hierarchy carrying the average of the radiosity of its children. Interactions between objects far away from each other are modelled as interactions between nodes at a high level in each hierarchy. On the other hand, interactions between objects close to each other are modelled as interactions between nodes at a lower level in the hierarchy, thereby allowing more precision in the modelling of radiosity. Each interaction between two nodes is modelled by a link, a data structure carrying the identity of the sender and the receiver, as well as the form-factor, and possibly other information on the respective visibility of both patches. This hierarchical radiosity algorithm has since been extended using wavelets (see Gortler8).

The most important step in the hierarchical radiosity method is the decision whether or not to refine a given interaction. This decision is deferred to a refinement criterion. Early implementations of the hierarchical radiosity method used crude approximations of the form-factor between two patches. It was known that these form-factor estimates were most imprecise when the result of the approximation was large. Hence, interactions were refined as long as the form-factor estimate was above a certain threshold (Hanrahan7).

This refinement criterion does not give the user full control of the precision of the modelling of the radiosity function. In particular, it does not give any guarantee that it will refine all problematic interactions, and it can also refine excessively in places where the solution has already attained a correct level of precision (Holzschuch9).

Part of these problems can be addressed by using discontinuity meshing, where the patches are first subdivided along the discontinuity lines of the radiosity function and its derivatives (see Heckbert10, Lischinski11,12 and Drettakis13). These discontinuity lines can be computed using geometric algorithms. However, as pointed out by Drettakis, these discontinuity lines are not of equal importance. Some of them do not have a noticeable effect on the final radiosity solution. Hence it is not necessary to compute all the discontinuity lines. Deciding which discontinuity lines are relevant is done by a refinement oracle, using heuristic methods like the one described above.

Many of the latest research results have dealt with giving the user better control of the level of precision in the modelling of radiosity in the hierarchical radiosity method.

In the most promising paper on the subject, Lischinski1 suggested computing for each interaction an upper and a lower bound for the point-to-patch form-factor between the points of the receiving patch and the emitting patch, namely Fmax and Fmin, as well as an upper and a lower bound for the radiosity of the emitting patch, using information already available in the hierarchy. We then know that the radiosity on the receiving patch is between Fmin Bmin and Fmax Bmax.

Hence, the uncertainty on the radiosity of the receiving patch due to this particular interaction is:

    δB_receiver = Fmax Bmax − Fmin Bmin

The inaccuracy on the energy of the receiving patch due to this particular interaction is:

    δE_receiver = A_receiver (Fmax Bmax − Fmin Bmin)

We can then decide to refine all interactions where this imprecision on the transported energy is above a given threshold. The most difficult part of this algorithm is finding reliable values for the bounds on the form-factor. Lischinski1 suggested computing exact values for the point-to-area form-factor at different sampling points on the receiver, and using the maximum and minimum values at these sampling points as the upper and lower bounds. Although this algorithm does not give totally reliable bounds, it does provide a close approximation, and is quite easy to implement on top of an existing hierarchical radiosity implementation.

In the following sections we show that it is possible to compute reliable upper and lower bounds for the point-to-area form-factor. These bounds can then be used in the preceding algorithm, allowing the refinement of all interactions where the inaccuracy on the transported energy is above the threshold.
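Equation 3 can be solved iteratively, as discussed above. A minimal Python sketch of the Jacobi iteration, with a hypothetical two-patch scene; note that it builds the dense form-factor matrix whose quadratic size is exactly what hierarchical radiosity avoids:

```python
# Jacobi iteration for the radiosity system of equation 3:
#   B_j = E_j + rho_j * sum_i F_ji * B_i
# Illustrative sketch only; F, rho and E are hypothetical two-patch data.

def solve_radiosity_jacobi(F, rho, E, iterations=50):
    n = len(E)
    B = list(E)                    # start from the emitted radiosity
    for _ in range(iterations):    # each sweep propagates one bounce of light
        B = [E[j] + rho[j] * sum(F[j][i] * B[i] for i in range(n))
             for j in range(n)]
    return B

# Two facing patches: patch 0 emits, patch 1 only reflects.
F = [[0.0, 0.2], [0.2, 0.0]]       # form-factor matrix (F_01 = F_10 = 0.2)
rho = [0.5, 0.5]                   # diffuse reflectances
E = [1.0, 0.0]                     # emitted radiosities
B = solve_radiosity_jacobi(F, rho, E)
# B converges to the fixed point B_0 = 1/0.99, B_1 = 0.1/0.99.
```

The Gauss-Seidel variant mentioned in the text simply updates B in place, reusing the new values within the same sweep.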
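The energy-based refinement test described above (refine every interaction whose uncertainty on the transported energy, A_receiver (Fmax Bmax − Fmin Bmin), is above a given threshold) amounts to a one-line predicate per link. A sketch with hypothetical link values, not taken from the paper:

```python
# Refinement oracle sketch: refine a link when the bound on the transported
# energy, A_receiver * (Fmax * Bmax - Fmin * Bmin), exceeds the threshold.
# All numeric values below are hypothetical link data.

def should_refine(area_receiver, f_max, b_max, f_min, b_min, epsilon):
    uncertainty = area_receiver * (f_max * b_max - f_min * b_min)
    return uncertainty > epsilon

# Loose form-factor bounds: the link must be refined.
loose = should_refine(4.0, f_max=0.2, b_max=1.0, f_min=0.1, b_min=0.8,
                      epsilon=0.1)
# Tight bounds: the link can be kept as is.
tight = should_refine(4.0, f_max=0.11, b_max=1.0, f_min=0.1, b_min=0.99,
                      epsilon=0.1)
```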
[Figure 2: Concavity for univariate functions. Panels show a concave function (f″(x) < 0) below its tangent, a convex function (f″(x) > 0) above its tangent, and an inflection point (f″(x) = 0) where the function crosses its tangent.]

[Figure 3: A function that remains concave across an interval [Xmin, Xmax] lies above its secant, and below all its tangents on this interval.]

[Figure 4: A point where the function is concave: the function lies below the tangent plane.]
3. The Concavity Conjectures

3.1. Definition of Concavity

Univariate functions are said to be concave at a point when they lie entirely below their tangent at that point; conversely, they are said to be convex when they lie above their tangent. When the function crosses its tangent, the point is said to be an inflection point (see figure 2). Classically, the concavity of the function is linked to the sign of its second derivative: if the second derivative is positive, then the function is convex. If it is negative, then the function is concave. It is only when the second derivative changes sign that we have an inflection point.

Concavity is often used to find upper and lower bounds for functions: if a function is concave on an interval, then it is below all its tangents on this interval, and above all its secants (see figure 3). Since concavity allows bounding by affine functions (like tangents) instead of constants, it generally provides bounds that are closer to each other, and hence a "better" range.

This notion of concavity extends naturally to bivariate functions, such as radiosity defined over a surface. A bivariate function is said to be concave at a point when it lies below its tangent plane (see figure 4), convex when it lies above its tangent plane, and indefinite when the function crosses the tangent plane (see figure 5). As with univariate functions, concavity can be used to find upper and lower bounds: if a function is concave over a triangular area, then on this area it lies below all its tangent planes, and above the secant plane defined by the three corners of the triangle.

A univariate function usually crosses its tangent at an isolated point, the inflection point. By contrast, the set of points where a bivariate function crosses its tangent plane is a whole region.

The second derivative of a bivariate function is a 2 × 2 matrix, called the Hessian matrix. As with univariate functions, the concavity of the function is linked to its second derivative: if the Hessian matrix is positive definite, then the function is convex; if the Hessian matrix is negative definite, then the function is concave; if the Hessian matrix is indefinite, then the function is indefinite. The Hessian can be expressed with respect to the partial derivatives of the function:

    H = [ ∂²f/∂u²   ∂²f/∂u∂v ;  ∂²f/∂u∂v   ∂²f/∂v² ] = (1/2) [ r  s ;  s  t ]        (4)

[Figure 5: A point where the concavity is indefinite: the function crosses its tangent plane.]

[Figure 6: The C1 conjecture: the radiosity function has indefinite concavity everywhere, except over a convex area (hatched), where the radiosity function is concave.]

Like Drettakis, we consider a finite convex emitter, with constant radiosity, and we assume the receiver is an infinite plane. We state the following two conjectures on the concavity of the radiosity on the receiver:
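The sign tests on the Hessian of equation 4 reduce to two scalar checks on its entries: the determinant gives the product of the two eigenvalues, and the top-left entry their common sign. A minimal sketch, with a hypothetical helper name:

```python
# Classify the concavity of a bivariate function from its 2x2 Hessian
#   H = [[r, s], [s, t]]  (symbols as in equation 4).
# Negative definite -> concave (below its tangent plane);
# positive definite -> convex; mixed-sign eigenvalues -> indefinite.

def classify_hessian(r, s, t):
    det = r * t - s * s          # product of the two eigenvalues
    if det > 0:                  # both eigenvalues share the sign of r
        return "concave" if r < 0 else "convex"
    if det < 0:                  # eigenvalues of opposite signs: saddle
        return "indefinite"
    return "degenerate"          # a zero eigenvalue: the test is inconclusive

# f(u, v) = -(u*u + v*v) has Hessian [[-2, 0], [0, -2]]: concave everywhere.
```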
must be balanced against what it would require to compute approximate values for the derivatives using several form-factor computations: in this case, the cost increase for the gradient would be of 100%, and that of the Hessian 600%.

In our refinement phase, we compute the values of the point-to-area form-factor and its derivatives at the vertices of the receiving patch. These values can be reused in the radiosity propagation phase to obtain the radiosity values at the vertices.

4.2.2. An exact value for the minimum

Since we chose to compute the point-to-area form-factor at the vertices of the receiving patch A1, we do have access to the exact value of the minimum across A1: it is the minimum of our computed values for the point-to-area form-factor at the vertices of A1.

4.2.3. Finding the Position of the Maximum

A consequence of U2 is that given a point x, the point-to-area form-factor F(x) and its gradient ∇F(x) at that point, for all points p such that (p − x) · ∇F(x) < 0, we have F(p) < F(x). Otherwise, there would be one local minimum between p and x on the line passing through p and x, and hence two local maxima, which is in contradiction with U2.

[Figure: the gradient ∇F(x) at a point x divides the plane into two half-planes: the maximum lies in the half-plane that ∇F(x) points into, and all form-factor values in the opposite half-plane are smaller than F(x).]

If the above algorithm tells us that the maximum can only be at one vertex of the receiving patch, then we know the exact value of the maximum: it is the value of the point-to-area form-factor at that vertex.

4.2.5. If the Maximum is Inside the Receiving Patch

If the above algorithm tells us there exists an area inside the receiving patch A1 where the maximum can be, then we do not have access to the exact value of the maximum of the point-to-area form-factor across A1. The only thing we know at this stage is that the value of the maximum must be greater than the values computed at the vertices of A1.

[Figure 9: Using the gradient to locate the maximum inside or outside the receiving patch. Left: the half-planes defined by the gradients at the vertices x1, x2, x3, x4 leave a region inside the polygon where the maximum can be. Right: the half-planes rule out the interior, so the maximum is at one of the vertices.]

There are three kinds of algorithms for finding an upper bound for the point-to-area form-factor across A1:

Heuristic Algorithms: Compute another sample value for the point-to-area form-factor inside patch A1. The position of the sampling point can be arbitrary, or it can make use of the information given by the form-factor gradient.

Concavity Algorithms: If the point-to-area form-factor function on the receiving patch is concave, use the tangent planes to find an upper bound.

Geometric Algorithms: Using geometric tools, build an emitter that encloses the actual emitter for all the points of the receiving patch, and for which we can find the value of the maximum. This value is an upper bound.

Heuristic algorithms include gradient descent algorithms, as described by Arvo15 and Drettakis2,3. Gradient descent algorithms make use of the information provided by the gradient to subdivide the receiving patch until convergence. The gradient can either be approximated (Drettakis2,3) or computed exactly (Arvo15).

In our implementation, we use concavity algorithms wherever possible, and resort to geometric algorithms if the point-to-area form-factor function is not concave.

4.2.5.1. Concavity Algorithms According to C1, the zone where the point-to-area form-factor function is concave is a convex one. As a consequence, if the form-factor Hessian is negative definite at the vertices of the receiving patch, then it stays negative definite across the receiving patch.

In this case, the form-factor function lies below all its tangent planes at the vertices across the receiving patch. We know these tangent planes since we know the form-factor gradient at the vertices. Finding an upper bound for the point-to-area form-factor is then equivalent to computing the intersection of the tangent planes.

This is mainly a linear programming problem (see, for example, Preparata18); the computational complexity of the problem depends on the dimension of the problem (always two here, since we are dealing with bivariate functions) and on the number of vertices in the receiving patch. Usually, in hierarchical radiosity algorithms, we restrict ourselves to triangular or quadrangular patches. If this is the case, we can assume that the complexity of computing the intersection of the tangent planes is constant.

4.2.5.2. Geometric Algorithms If the form-factor Hessian is not negative definite at all the vertices of the receiving patch, then the point-to-area form-factor function is not concave across the entire receiving patch. It is therefore not possible to use concavity algorithms. In this case, we resort to geometric algorithms: in a plane parallel to the plane of the receiver, we construct an emitter with the following two properties:

• From all the points of the receiver, it is seen as including the original emitter.
• It has two axes of symmetry, so that we can find the maximum form-factor due to the emitter.

The reason for the second item lies in the symmetry principle: if the emitter and the receiver are left unchanged by a planar symmetry, then so is the point-to-area form-factor function on the receiver; thus its maximum can only lie on the intersection of the plane of symmetry and the plane of the receiver. If there are two planes that leave the emitter and the receiver unchanged, then the maximum can only be at their intersection (see figure 24, in the color section).

To build this emitter:

• select a plane P parallel to the plane of the receiving patch;
• for each vertex Vi of the receiving patch, build the projection pi of the original emitter through this vertex onto P (see figure 10);
• this projection is totally equivalent to the original emitter for this particular vertex;
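The half-plane elimination used in section 4.2.3 above can be written as a small predicate: a candidate point q can still hold the maximum only if no vertex gradient rules it out. A sketch under the U2 conjecture, with hypothetical vertex and gradient data:

```python
# Half-plane elimination from section 4.2.3, under the U2 conjecture:
# if (p - x) . grad_F(x) < 0 then F(p) < F(x), so a candidate point q can
# only hold the maximum if no vertex gradient rules it out (figure 9).
# Vertices and gradients below are hypothetical sample data.

def may_contain_maximum(q, vertices, gradients):
    """Can point q still be the location of the maximum?"""
    for x, g in zip(vertices, gradients):
        dot = (q[0] - x[0]) * g[0] + (q[1] - x[1]) * g[1]
        if dot < 0:              # q lies in the "smaller" half-plane of vertex x
            return False
    return True

# Unit-square patch whose gradients all point in +x: F increases to the
# right, so the maximum can only lie on the right edge.
verts = [(0, 0), (1, 0), (1, 1), (0, 1)]
grads = [(1, 0)] * 4
interior = may_contain_maximum((0.25, 0.5), verts, grads)   # ruled out
right_edge = may_contain_maximum((1.0, 0.5), verts, grads)  # still possible
```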
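When the form-factor is concave over the patch (section 4.2.5.1), the upper bound is the maximum, over the patch, of the lower envelope of the vertex tangent planes. The paper computes this maximum exactly as a small linear program (Preparata18); the sketch below instead approximates it by brute-force sampling on a barycentric grid, using hypothetical concave data:

```python
# Upper bound for a concave point-to-area form-factor over a triangle
# (section 4.2.5.1): the function lies below every vertex tangent plane,
#   F(q) <= min_k [ F(x_k) + (q - x_k) . grad_F(x_k) ],
# so an upper bound is the maximum of this envelope over the patch.  This
# sketch approximates that maximum on a barycentric sampling grid; the
# exact solution would use linear programming instead.

def envelope(q, vertices, values, gradients):
    return min(f + (q[0] - x[0]) * g[0] + (q[1] - x[1]) * g[1]
               for x, f, g in zip(vertices, values, gradients))

def upper_bound_concave(vertices, values, gradients, n=100):
    (ax, ay), (bx, by), (cx, cy) = vertices
    best = max(values)                 # the envelope equals F at each vertex
    for i in range(n + 1):
        for j in range(n + 1 - i):
            u, v = i / n, j / n
            w = 1.0 - u - v
            q = (u * ax + v * bx + w * cx, u * ay + v * by + w * cy)
            best = max(best, envelope(q, vertices, values, gradients))
    return best

# Hypothetical concave test function F(u, v) = 1 - u^2 - v^2:
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vals = [1.0 - x * x - y * y for x, y in tri]
grads = [(-2.0 * x, -2.0 * y) for x, y in tri]
bound = upper_bound_concave(tri, vals, grads)
# The true maximum on the triangle is F(0, 0) = 1, and the bound matches it.
```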
lies below all the tangent planes for the point-to-area form-factor, and above the secant plane. Therefore, we can say that our uncertainty on the point-to-area form-factor on the receiver is equal to the maximum of the distance between the secant plane and these tangent planes.

Computing this distance is again a linear programming problem (see, for example, Preparata18). The complexity depends on the number of vertices of the receiver, n_r, which is usually three or four. Let us denote by E_FF this uncertainty on the form-factor. E_FF can be used in our expressions as a replacement for Fmax − Fmin. Using the fact that:

    Bmax Fmax − Bmin Fmin = Bmax (Fmax − Fmin) + Fmin (Bmax − Bmin)

we decide to refine a given interaction if

    A_receiver (Bmax E_FF + Fmin (Bmax − Bmin)) > ε

It must be noted that this new bounding of the form-factor does not introduce any uncertainty. We are still bounding the form-factor by fully reliable functions. However, since these functions are affine instead of constant, they provide much tighter bounds, and we can expect a simpler mesh in the areas where the point-to-area form-factor is concave.

Figure 25 (in the color section) shows the result of our refinement criterion on a simple box, with only direct illumination. Notice that the mesh produced is coarser in some areas with respect to the immediately neighbouring areas (the disc-shaped area on the floor, and the drop-shaped areas on the walls). These are the places where the Hessian is negative definite.

This refinement criterion extends, in some ways, the mesh simplification found in previous work (Holzschuch9). The shape of the mesh produced is quite similar between our new algorithm and the algorithm in Holzschuch9. However, our new refinement criterion, while keeping memory costs low, also gives fully reliable upper and lower bounds on the radiosity of each patch.

4.3.3. Dealing with Singularities

4.3.4. Relative Complexity of the Algorithm

Our algorithm requires the computation of the first two derivatives of the point-to-area form-factor at the vertices of the receiver. This implies a 100% increase in the computation time for each vertex (see appendix B). That is to say, computing the point-to-area form-factor and its derivatives costs twice what it would cost to compute the point-to-area form-factor alone.

Since vertices are shared by several patches, this overhead cost is shared by several interactions. On average, we are only computing one point-to-area form-factor and its derivatives for each patch. Thus, the cost of our algorithm is approximately the cost of computing two point-to-area form-factors for each patch, plus the time needed for the exploitation of the derivatives for computing upper and lower bounds.

Existing heuristic refinement algorithms (see Lischinski1) compute one form-factor sample for each of the receiver vertices, plus one sample at the center of the receiving patch. If we assume that the form-factor values at the vertices are shared with the neighbouring patches, we are computing an average of two point-to-area form-factors for each receiver.

Thus, the cost of the heuristic algorithm and the cost of our algorithm are roughly similar. The main overhead of our algorithm compared with the heuristic algorithm is the time needed for the actual computations for finding the position of the maximum, and for finding an upper bound for the maximum when necessary.

Hence, the relative costs of our refinement criterion are in fact quite small and can generally be regarded as acceptable, especially with respect to the complete control it gives on the error carried by each interaction.

Also, our algorithm allows for a significant mesh simplification (see figure 25, in the color section) which may, depending on the scene considered, induce a smaller computation time for the exhaustive refinement criterion when compared to a heuristic refinement criterion.

5. Error Control for Partially Occluded Interactions

The above algorithm for finding upper and lower bounds only works in the case of unoccluded interactions, and with a convex emitter. It relies on the concavity and unimodality conjectures, which do not hold if there are occluders between the emitter and the receiver.

However, it is possible to construct, using geometrical tools, a minimal and a maximal emitter with the following qualities:

• both are convex;
• any point of the minimal emitter is fully visible from the receiver;
• the maximal emitter contains all the points of the emitter that are visible from at least one point of the receiver.

Then, at any given point on the receiver,

• the form-factor due to the minimal emitter is less than or equal to the actual form-factor,
• and the form-factor due to the maximal emitter is greater than or equal to the actual form-factor.

We apply our previous algorithm to these emitters, and find a lower bound using the minimal emitter, and an upper bound using the maximal emitter.

Figure 26 (in the color section) shows an example of minimal and maximal emitters for a simple configuration with only one occluder: the small red square on the ground is the
[Figure 12: A single interaction with occluders: emitter, occluders, receiver.]

[Figure 13: Computing the "umbra" and "penumbra" volumes using the receiver as a light source.]

[Figure 14: The maximal emitter can be any convex region including the complement of the "umbra" region.]

[Figure 15: Several possible candidates for the minimal emitter: the complement of the "penumbra" leaves several candidates.]
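The umbra/penumbra construction illustrated in figures 13 to 15 projects each occluder from the receiver vertices onto the emitter supporting plane; the convex hull of the projections bounds the penumbra region, while their intersection (a polygon-clipping step not shown here) bounds the umbra. A minimal sketch with hypothetical geometry; the hull is Andrew's monotone chain:

```python
# Sketch of the per-occluder projection step: project an occluder from each
# receiver vertex onto the emitter supporting plane (z = z_e).  The convex
# hull of all the projections bounds the penumbra region for this occluder;
# intersecting the per-vertex projections (polygon clipping, not shown)
# would bound the umbra region.  Geometry below is hypothetical.

def project_to_plane(r, o, z_e):
    """Central projection of occluder vertex o from receiver vertex r onto z = z_e."""
    t = (z_e - r[2]) / (o[2] - r[2])
    return (r[0] + t * (o[0] - r[0]), r[1] + t * (o[1] - r[1]))

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Receiver square at z = 0, small occluder square at z = 1, emitter plane z = 2:
receiver = [(0, 0, 0), (2, 0, 0), (2, 2, 0), (0, 2, 0)]
occluder = [(0.9, 0.9, 1), (1.1, 0.9, 1), (1.1, 1.1, 1), (0.9, 1.1, 1)]
projections = [project_to_plane(r, o, 2.0) for r in receiver for o in occluder]
penumbra_hull = convex_hull(projections)
```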
to have several candidates for the minimal emitter. Ideally, we would like to pick the candidate that gives the largest estimate for the minimum, since this would give tighter bounds, and hence reduce the number of unnecessary refinements. However, it is impossible to find this candidate without computing the point-to-area form-factor for all the candidates, which would prove very time-consuming. In our implementation, we choose the candidate with the largest area, since it is likely to induce a larger form-factor.

5.2. Implementation and testing

We have implemented our algorithm for finding upper and lower bounds for the point-to-area form-factor using the maximal and minimal emitters.

Figure 27 (in the color section) shows the result of our refinement criterion on a simple scene, with a single occluder. Notice that the algorithm detects the shadow boundaries and refines properly in order to model them. Outside of the shadow, the mesh produced is identical to the mesh produced without occluders.

5.3. Complexity of the Algorithm and Possible Improvements

Our algorithm relies on the computation of the umbra and penumbra volumes for all the interactions. This computation can be quite costly if it is implemented in a naive way.

Previous work by Chin24 has shown that the use of a BSP-tree can greatly improve the computation of umbra and penumbra volumes. Teller22 showed that by extending the data structure used to store the interaction between patches to also store the possible occluders for this interaction, the complexity of visibility computations could be greatly reduced. Both of these improvements work with our algorithm.

Our algorithm can also be used in combination with standard discontinuity meshing, as described in Lischinski11. A preliminary light-source discontinuity meshing will reduce the complexity of the minimal and maximal emitter computations by providing occlusion information and reducing the number of patches where we have to compute these emitters.

The backprojection algorithm described by Drettakis13,3 gives, for each patch created during the discontinuity meshing step, the geometric structure of the emitter as seen from this patch. Implementing our algorithm on top of a backprojection algorithm should be a straightforward post-processing step.

It has been shown (Lischinski11 and Drettakis13,3) that the boundary of the umbra volume can include a quadric surface, and hence can be quite complex to model. However, our algorithm does not require a complete computation of the umbra and penumbra volumes for each interaction, but only the computation of a surface included in the umbra volume, and of a surface enclosing the penumbra volume. Two such surfaces can be computed in a straightforward way:

• For each occluder:
  – For each receiver vertex, compute the projection of the occluder onto the emitter supporting plane;
  – The intersection of these projections is the umbra volume for this particular receiver;
  – The convex hull of these projections is the penumbra volume for this receiver.
• The union of the penumbra volumes for all occluders is the penumbra volume for the entire interaction.
• The union of the umbra volumes for all occluders is not equal to the umbra volume for the entire interaction. However, it is included in the actual umbra volume (see Lischinski11). Hence, we can use it for building the maximal emitter.

The computation of the projection of the occluders onto the emitter supporting plane, and the computation of the union of these projections, can be reused for computing the exact value of the point-to-area form-factor in the radiosity propagation phase.

The only extra cost of our refinement criterion is then the computation of the minimal and maximal emitters, given the projections of all the occluders on the emitter plane. This is a two-dimensional problem: computing a convex region that contains the complement of the umbra volume, and another convex region that is included in the complement of the penumbra volume. Note that we do not have to explicitly construct the umbra and penumbra volumes, only the two convex regions. We can use several methods for computing these convex regions, as described in section 4.2.5.2. The cost of our algorithm is the cost of finding two convex regions enclosing n_r·n polygons, where n is the number of occluders and n_r is the number of vertices of the receiver.

The heuristic algorithm described by Lischinski1 uses the same computation of the exact values of the point-to-area form-factor at the vertices of the receiver, which will be reused in the radiosity propagation phase, plus the computation of the point-to-area form-factor at the center of the receiving patch, which implies the projection of the occluders on the emitter supporting plane and the computation of the union of these projections. Hence, the cost of the heuristic algorithm is n projections and the union of n two-dimensional polygons.

6. Conclusions and Future Directions

We have introduced a new and reliable way of computing the maximum and the minimum of the point form-factor on any interaction. These bounds on the form-factor allow a control of the precision of the hierarchical radiosity algorithm,
precision that can be required for certain applications of the algorithm, such as architectural planning.

These bounds have been integrated into a new refinement criterion for hierarchical radiosity. We have also presented another refinement criterion that, while maintaining control on the upper and lower bounds of the energy transported, allows a coarser mesh to be constructed in some places, thus reducing memory and computation costs.

This algorithm is a significant step in error control for global illumination methods. Although it has been devised and implemented in a hierarchical radiosity framework, nothing in the algorithm prevents the refinement criterion from being implemented with progressive refinement radiosity, as described by Cohen25.

Knowledge of the error produced in all the parts of the algorithm allows global illumination programs to concentrate their work on the parts of the scene where the error is still large, and to skip the parts where it can be neglected. Thus, our algorithm can be hoped to accelerate global illumination computations by reducing the amount of unnecessary refinement.

Our algorithm relies on several conjectures: the unimodality conjectures (U1 and U2) and the concavity conjecture (C1), as well as on knowledge of the radiosity derivatives. Table 1 recalls, for each part of the algorithm, which conjecture and which derivatives are being used.

The concavity and unimodality conjectures assume that radiosity on the emitter is constant, that the receiver is diffuse and that there is full visibility. An extension of our error-control algorithm to cases where radiosity on the emitter is not constant, or to reflectance functions that are not constant, would first require a careful study of the extent to which our concavity and unimodality conjectures still hold. For example, it is clear that they cannot hold for an arbitrary distribution of radiosity on the emitter, but only for specific cases. These specific cases, once identified, can be used as a functional basis for radiosity.

We have dealt with the partial visibility problem by computing maximal and minimal emitters, thereby reducing the problem to two full-visibility problems. However, it is known that it is possible to compute the radiosity gradient in the presence of occluders (see Arvo15), and it seems possible to compute the radiosity Hessian in the presence of occluders as well (see Holzschuch17). In this case, it would be possible to extend our refinement criterion to some partially visible interactions without having to compute the maximal and minimal emitters. Once again, this can be done only in specific configurations where the concavity or unimodality conjectures still hold. This is not the case for generic occluders (see figure 28, in the color section), but only for certain specific, simple occluders (see figure 29 in the color section).

Although the algorithm described in this paper makes use of the U1, U2 and C1 conjectures, and of the form-factor gradient and Hessian, table 1 shows that it is possible to build a simpler algorithm to find upper and lower bounds by using only U1, U2 and the form-factor gradient.

This algorithm would be very similar to the gradient-descent algorithms described by Arvo15 and Drettakis2,13. The main difference would be the use of geometric tools, as described in section 4.2.5.2, to find an upper bound. These geometric tools will provide a fully reliable upper bound on the receiving patch.

This simpler algorithm would not allow mesh simplification as described in section 4.3.2; also, since it would only use geometric methods to find upper bounds, it can be expected to give greater upper bounds, and hence induce more refinement than our current algorithm. On the other hand, it would not require the computation of the form-factor Hessian, thus saving computation time, and it would probably be easier to extend to partial visibility cases, where C1 may not hold.

Future work will include an implementation of this simpler algorithm, and timing and memory cost comparisons between our full algorithm, the simpler algorithm and the heuristic algorithm, as well as error measurements.

7. Acknowledgements

The first author has been funded by an AMN grant from Université Joseph Fourier from 1994 to 1996.

References

1. D. Lischinski, B. Smits, and D. P. Greenberg, “Bounds and Error Estimates for Radiosity”, in Computer Graphics Proceedings, Annual Conference Series, 1994 (ACM SIGGRAPH ’94 Proceedings), pp. 67–74, (1994).

2. G. Drettakis and E. Fiume, “Accurate and Consistent Reconstruction of Illumination Functions Using Structured Sampling”, in Computer Graphics Forum (Eurographics ’93), vol. 12, (Barcelona, Spain), pp. C273–C284, (September 1993).

3. G. Drettakis, “Structured Sampling and Reconstruction of Illumination for Image Synthesis”, CSRI Technical Report 293, Department of Computer Science, University of Toronto, Toronto, Ontario, (January 1994).

4. C. M. Goral, K. E. Torrance, D. P. Greenberg, and B. Battaile, “Modelling the Interaction of Light Between Diffuse Surfaces”, in Computer Graphics (ACM SIGGRAPH ’84 Proceedings), vol. 18, pp. 212–222, (July 1984).

5. P. Schröder and P. Hanrahan, “On the Form Factor Between Two Polygons”, in Computer Graphics Proceedings, Annual Conference Series, 1993 (ACM SIGGRAPH ’93 Proceedings), pp. 163–164, (1993).
6. M. Cohen and D. P. Greenberg, "The Hemi-Cube: A Radiosity Solution for Complex Environments", in Computer Graphics (ACM SIGGRAPH '85 Proceedings), vol. 19, pp. 31–40, (August 1985).

7. P. Hanrahan, D. Salzman, and L. Aupperle, "A Rapid Hierarchical Radiosity Algorithm", in Computer Graphics (ACM SIGGRAPH '91 Proceedings), vol. 25, pp. 197–206, (July 1991).

8. S. J. Gortler, P. Schröder, M. F. Cohen, and P. Hanrahan, "Wavelet Radiosity", in Computer Graphics Proceedings, Annual Conference Series, 1993 (ACM SIGGRAPH '93 Proceedings), pp. 221–230, (1993).

9. N. Holzschuch, F. Sillion, and G. Drettakis, "An Efficient Progressive Refinement Strategy for Hierarchical Radiosity", in Fifth Eurographics Workshop on Rendering, (Darmstadt, Germany), pp. 343–357, (June 1994).

10. P. Heckbert, "Discontinuity Meshing for Radiosity", in Third Eurographics Workshop on Rendering, (Bristol, UK), pp. 203–226, (May 1992).

11. D. Lischinski, F. Tampieri, and D. P. Greenberg, "Discontinuity Meshing for Accurate Radiosity", IEEE Computer Graphics and Applications, 12(6), pp. 25–39 (1992).

12. D. Lischinski, F. Tampieri, and D. P. Greenberg, "Combining Hierarchical Radiosity and Discontinuity Meshing", in Computer Graphics Proceedings, Annual Conference Series, 1993 (ACM SIGGRAPH '93 Proceedings), pp. 199–208, (1993).

13. G. Drettakis and E. Fiume, "A Fast Shadow Algorithm for Area Light Sources Using Backprojection", in Computer Graphics Proceedings, Annual Conference Series, 1994 (ACM SIGGRAPH '94 Proceedings), pp. 223–230, (1994).

14. R. Siegel and J. R. Howell, Thermal Radiation Heat Transfer, 3rd Edition. New York, NY: Hemisphere Publishing Corporation, (1992).

15. J. Arvo, "The Irradiance Jacobian for Partially Occluded Polyhedral Sources", in Computer Graphics Proceedings, Annual Conference Series, 1994 (ACM SIGGRAPH '94 Proceedings), pp. 343–350, (1994).

16. N. Holzschuch and F. Sillion, "Accurate Computation of the Radiosity Gradient for Constant and Linear Emitters", in Rendering Techniques '95 (Proceedings of the Sixth Eurographics Workshop on Rendering) (P. M. Hanrahan and W. Purgathofer, eds.), (New York, NY), pp. 186–195, Springer-Verlag, (1995).

17. N. Holzschuch, Le Contrôle de l'Erreur dans la Méthode de Radiosité Hiérarchique (Error Control in Hierarchical Radiosity). Ph.D. thesis, Équipe iMAGIS/IMAG, Université Joseph Fourier, Grenoble, France, (March 5th, 1996).

18. F. P. Preparata and M. I. Shamos, Computational Geometry: An Introduction. New York: Springer Verlag, (1985).

19. J. D. Foley, A. van Dam, S. K. Feiner, and J. F. Hughes, Computer Graphics, Principles and Practice, Second Edition. Reading, Massachusetts: Addison-Wesley, (1990).

20. T. L. Kay and J. T. Kajiya, "Ray tracing complex scenes", Computer Graphics, 20(4), pp. 269–276 (1986). Proceedings of SIGGRAPH '86 in Dallas (USA).

21. D. L. Toth, "On ray-tracing parametric surfaces", Computer Graphics, 19(3), pp. 171–179 (1985). Proceedings of SIGGRAPH '85 in San Francisco (USA).

22. S. J. Teller, "Computing the antipenumbra of an area light source", in Computer Graphics (ACM SIGGRAPH '92 Proceedings), vol. 26, pp. 139–148, (July 1992).

23. S. Teller and P. Hanrahan, "Global Visibility Algorithms for Illumination Computations", in Computer Graphics Proceedings, Annual Conference Series, …
(Figure: notation for a differential area emitter dA, with normal n and angle θ, above a receiving plane parameterized by (u, v) with origin (0, 0); alongside, a plot of the resulting function.)
…function of u,

    f(u) = \frac{1}{\pi}\,\frac{u\cos\theta + \sin\theta}{\left(u^2 + (au+b)^2 + 1\right)^2}

The form-factor is equal to f(u) if u cos θ + sin θ > 0. If u cos θ + sin θ ≤ 0, then the form-factor is null.

It must be noted that f(u) goes to zero when u goes to ±∞, and that f(u) is equal to zero only for u = u0 = −tan θ.

It is possible to compute the first and the second derivative of f(u). The first derivative, f′(u), has the sign of a second-degree polynomial in u, and the second derivative, f′′(u), has the sign of a third-degree polynomial in u. As a consequence, f′(u) can change sign at most twice, and f′′(u) at most three times.

Since the function f(u) goes to zero when u goes to ±∞, it must have one maximum between u0 and +∞, and one minimum between u0 and −∞. As a consequence, f′(u) must change sign exactly twice. Let us call u1 and u2 the points where the first derivative changes sign (u1 < u0 < u2).

f′(u) also goes to zero when u goes to ±∞. As a consequence, it must have one minimum between u2 and +∞, and another between −∞ and u1, and it must have one maximum between u1 and u2. So the second derivative changes sign exactly three times. One of the points where the second derivative changes sign is smaller than u1, which is smaller than u0, and one of them is greater than u2, which is greater than u0.

Then the second derivative changes sign at least once and at most twice on [u0, +∞]. When u goes to +∞, f is convex, and f′′ is positive. So we just proved that f′′ can be negative only over a unique bounded segment on [u0, +∞].

The form-factor on the line is equal to f(u) for u > u0, and null everywhere else. So the form-factor on a line is concave only over a unique bounded segment. This proves the C2 conjecture for a differential area emitter.

(Figure 20 sketch: the emitter polygon A2 with vertices Ei, the receiver A1 with point x and normal n1; ri = Ei − x, ei joins Ei to Ei+1, and γi is the angle between ri and ri+1.)
Figure 20: Notation when the emitter is a polygon.

    F = 0
    foreach edge [Ei, Ei+1]
        ri        = Ei − x
        ri+1      = Ei+1 − x
        crossprod = ri × ri+1
        gamma     = arccos( (ri · ri+1) / (ri ri+1) )
        I1        = gamma / ||crossprod||
        mixt      = n1 · crossprod
        F        −= I1 · mixt
    F *= 1/(2π)

Figure 21: Pseudo-code for computing the form-factor.

…form-factor gradient is 30 %, while computing an approximate value of the gradient would require two form-factor samples, thus increasing computation time by 100 %.

The Point-to-Area Form-Factor

Let us recall that the point-to-area form-factor from a point x on a patch A1 to a patch A2 (see figure 1) can be expressed as a contour integral:

    F(x) = -\vec{n}_1 \cdot \frac{1}{2\pi}\oint_{\partial A_2} \frac{\vec{r}_{12}\times d\vec{\ell}_2}{\|\vec{r}_{12}\|^2}
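The contour pseudo-code of Figure 21 translates almost line for line into NumPy. The sketch below is our own illustration, not the authors' code; the test configuration relies on the fact that an emitter filling the whole hemisphere above x has a form factor of 1.

```python
import numpy as np

def point_to_area_form_factor(x, n1, emitter):
    """Point-to-area form factor from point x (unit normal n1) to a
    polygonal emitter, following the contour pseudo-code of Figure 21.
    The sign of the result depends on the winding of the vertex list."""
    x, n1 = np.asarray(x, float), np.asarray(n1, float)
    E = np.asarray(emitter, float)
    F = 0.0
    for i in range(len(E)):
        ri = E[i] - x                       # ri   = Ei - x
        ri1 = E[(i + 1) % len(E)] - x       # ri+1 = Ei+1 - x
        crossprod = np.cross(ri, ri1)
        cos_g = ri.dot(ri1) / (np.linalg.norm(ri) * np.linalg.norm(ri1))
        gamma = np.arccos(np.clip(cos_g, -1.0, 1.0))
        I1 = gamma / np.linalg.norm(crossprod)
        mixt = n1.dot(crossprod)
        F -= I1 * mixt                      # F -= I1 * mixt
    return F / (2.0 * np.pi)                # F *= 1/(2 pi)

# A very large parallel square almost fills the hemisphere above x,
# so the form factor should be close to 1.
L = 1000.0
square = [(L, L, 1.0), (L, -L, 1.0), (-L, -L, 1.0), (-L, L, 1.0)]
F = point_to_area_form_factor((0, 0, 0), (0, 0, 1), square)
```

The vertex winding above is the one that makes F positive for a receiver normal pointing toward the emitter; reversing the list flips the sign.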
(Figure 19, four panels: a line cutting through the radiosity function; the radiosity function f(x) on the line; its first derivative d(x) on the line; its second derivative s(x), which is negative only over a segment.)
Figure 19: The radiosity on any line on the receiving plane is concave only over a segment.
Form-Factor Gradient

The point-to-area form-factor gradient can easily be computed by derivation of the previous formula (see Arvo15, or Holzschuch17):

    \vec{\nabla}F(x) = -\frac{1}{2\pi}\sum_i\left[(\vec{n}_1\times\vec{e}_i)\,I_1 + 2\,\vec{n}_1\cdot(\vec{r}_i\times\vec{r}_{i+1})\,(\vec{r}_i I_2 + \vec{e}_i J_2)\right]

with \vec{e}_i = \overrightarrow{E_iE_{i+1}} and:

    I_1 = \frac{\gamma_i}{\|\vec{e}_i\times\vec{r}_i\|}

    I_2 = \frac{1}{2\|\vec{e}_i\times\vec{r}_i\|^2}\left(\frac{\vec{e}_i\cdot\vec{r}_{i+1}}{r_{i+1}^2} - \frac{\vec{e}_i\cdot\vec{r}_i}{r_i^2} + e_i^2 I_1\right)

    J_2 = \frac{1}{2e_i^2}\left(\frac{1}{r_i^2} - \frac{1}{r_{i+1}^2}\right) - \frac{\vec{e}_i\cdot\vec{r}_i}{e_i^2}\, I_2

    F = 0
    G = 0
    foreach edge [Ei, Ei+1]
        ...                        (as in Figure 21)
        F  −= I1 · mixt
        I2  = (ei · ri+1)/r²i+1 − (ei · ri)/r²i + e²i · I1
        I2 /= 2 ||crossprod||²
        J2  = 0.5 (1/r²i − 1/r²i+1) − (ei · ri) · I2
        J2 /= e²i
        G  += (n1 × ei) · I1 + 2 mixt · (ri I2 + ei J2)
    F *= 1/(2π)
    G *= −1/(2π)

Figure 22: Pseudo-code for computing the gradient of the form-factor.
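As the text notes, the analytic gradient costs about 30 % of a form-factor evaluation, whereas approximating it requires two extra form-factor samples per axis. That finite-difference route is nevertheless a convenient cross-check for an implementation of Figure 22. The sketch below (our own Python illustration, not the original code) differentiates the contour-integral form factor numerically:

```python
import numpy as np

def form_factor(x, n1, E):
    # Contour-integral point-to-area form factor (Figure 21).
    x, n1, E = np.asarray(x, float), np.asarray(n1, float), np.asarray(E, float)
    F = 0.0
    for i in range(len(E)):
        ri, ri1 = E[i] - x, E[(i + 1) % len(E)] - x
        cos_g = ri.dot(ri1) / (np.linalg.norm(ri) * np.linalg.norm(ri1))
        cp = np.cross(ri, ri1)
        F -= np.arccos(np.clip(cos_g, -1, 1)) / np.linalg.norm(cp) * n1.dot(cp)
    return F / (2 * np.pi)

def form_factor_gradient_fd(x, n1, E, h=1e-5):
    # Central differences: two extra form-factor samples per axis.
    x = np.asarray(x, float)
    grad = np.zeros(3)
    for k in range(3):
        d = np.zeros(3)
        d[k] = h
        grad[k] = (form_factor(x + d, n1, E) - form_factor(x - d, n1, E)) / (2 * h)
    return grad

# Unit-side square emitter at height 1. Below its center, the lateral
# derivatives vanish by symmetry, and F grows as x moves toward the emitter.
sq = [(0.5, 0.5, 1), (0.5, -0.5, 1), (-0.5, -0.5, 1), (-0.5, 0.5, 1)]
g = form_factor_gradient_fd((0, 0, 0), (0, 0, 1), sq)
```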
Figure 24: The symmetries of the scene can help find the location of the maximum.
Figure 25: Direct illumination with our refinement criterion, unoccluded scene.
Figure 27: Direct illumination with our refinement criterion, with one occluder.
Figure 28: With generic occluders, the unimodality conjectures do not hold.
Figure 29: With certain occluders, the unimodality conjectures still hold.
Abstract

We present a signal-processing framework for light transport. We study the frequency content of radiance and how it is altered by phenomena such as shading, occlusion, and transport. This extends previous work that considered either spatial or angular dimensions, and it offers a comprehensive treatment of both space and angle.

We show that occlusion, a multiplication in the primal, amounts in the Fourier domain to a convolution by the spectrum of the blocker. Propagation corresponds to a shear in the space-angle frequency domain, while reflection on curved objects performs a different shear along the angular frequency axis. As shown by previous work, reflection is a convolution in the primal and therefore a multiplication in the Fourier domain. Our work shows how the spatial components of lighting are affected by this angular convolution.

Our framework predicts the characteristics of interactions such…

(Figure 1 panels, each plotting angular against spatial frequencies: 1. spectrum of the source; 2. spectrum after the first blocker; 3. spectrum after the second blocker; 4. incoming spectrum at the receiver; 5. outgoing spectrum after the receiver.)
1.2 Related work

Radiance exhibits both spatial and angular variations. A wealth of previous work has studied the frequency content along one of these components, but rarely have both space and angle been addressed. We do not discuss all applications of Fourier analysis, but rather focus on studies of frequency modification in light transport.

Filtering and sampling: Heckbert's seminal work on texture antialiasing [1989] derives local bandwidth for texture pre-filtering based on a first-order Taylor expansion of the perspective transform. The effect of perspective is also studied in the contexts of holography and light field sampling [Halle 1994; Isaksen et al. 2000; Chai et al. 2000; Stewart et al. 2003], mostly ignoring visibility and specular effects.

Local illumination as a convolution: Recently, local illumination has been characterized in terms of convolution, and it was shown that the outgoing radiance is band-limited by the BRDF [Ramamoorthi and Hanrahan 2001b; Ramamoorthi and Hanrahan 2004; Basri and Jacobs 2003]. However, the lighting is assumed to come from infinity and occlusion is ignored. Frolova et al. [2004] explored spatial lighting variations, but only for convex diffuse objects. We build on these approaches and extend them by adding spatial dimensions as well as other phenomena such as occlusion and transport, at the expense of first-order approximations and a local treatment. Ramamoorthi et al. [2004] have also studied local occlusion in a textured object made of pits, such as a sponge. Our treatment of occlusion considers complex blockers at an arbitrary distance between the source and the receiver.

Wavelets and frequency bases: Wavelets and spherical harmonics have been used extensively as basis functions for lighting simulation [Gortler et al. 1993; Keller 2001] or pre-computed radiance transfer [Sloan et al. 2002; Ramamoorthi and Hanrahan 2002]. They are typically used in a data-driven manner and in the context of projection methods, where an oracle helps in the selection of the relevant components based on the local frequency characteristics of radiance. Refinement criteria for multiresolution calculations often implicitly rely on frequency decomposition [Sillion and Drettakis 1995]. In our framework we study the frequency effect of the equations of light transport in the spirit of linear systems, and obtain a more explicit characterization of frequency effects. Our results on the required sampling rate can therefore be used with stochastic methods or to analyze the well-posedness of inverse problems.

Ray footprint: A number of techniques use notions related to bandwidth in a ray's neighborhood and propagate a footprint for adaptive refinement [Shinya et al. 1987] and texture filtering [Igehy 1999]. Chen and Arvo use perturbation theory to exploit ray coherence [2000]. Authors have also exploited the frequency content of the image on the fly to make better use of rays [Bolin and Meyer 1998; Myszkowski 1998; Keller 2001]. Our work is complementary and provides a framework for frequency-content prediction.

Illumination differentials have been used to derive error bounds on radiance variations (e.g. gradients [Ward and Heckbert 1992; Annen et al. 2004], Jacobians [Arvo 1994], and Hessians [Holzschuch and Sillion 1998]), but they only provide local information, which cannot easily be used for sampling control.

Fourier analysis has also been used extensively in optics [Goodman 1996], but in the context of wave optics where phase and interferences are crucial. In contrast, we consider geometric optics and characterize frequency content in the visible spatial frequencies. The varying contrast sensitivity of humans to these spatial frequencies can be exploited for efficient rendering, e.g. [Bolin and Meyer 1995; Ferwerda et al. 1997; Bolin and Meyer 1998; Myszkowski 1998]. Finally, we note that the Fourier basis can separate different phenomena and thus facilitate inverse lighting [Ramamoorthi and Hanrahan 2001b; Basri and Jacobs 2003], depth from focus [Pentland 1987], and shape from texture [Malik and Rosenholtz 1997].

ℓ_R : light field (2D) around ray R
x : spatial dimension (distance to the central ray)
v : directional dimension in the two-plane parameterization
θ : directional dimension in the plane-sphere parameterization
f̂ : Fourier transform of function f
Ω_X : frequency along dimension X
i : √−1
f ⊗ g : convolution of f by g
d : transport distance
V(x, v) : visibility function of the blockers
cos⁺(θ) : clamped cosine term, max(cos θ, 0)
dE : differential irradiance (after the cosine term)
ρ : BRDF
Figure 2: Notations.

Figure 3: (a-b) The two light field parameterizations used in this article. Locally, they are mostly equivalent: we linearize v = tan θ. (c) Transport in free space: the angular dimension v is not affected, but the spatial dimension is reparameterized depending on v: x′ = x − vd.

2 Preliminaries

We want to analyze the radiance function in the neighborhood of a ray along all steps of light propagation. For this, we need a number of definitions and notations, summarized in Fig. 2. Most of the derivations in this paper are carried out in 2D for clarity, but we shall see that our main observations extend naturally to 3D.

2.1 Local light field and frequency content

We consider the 4D (resp. 2D) slice of radiance at a virtual plane orthogonal to a central ray. We focus on the neighborhood of the central ray, and we call radiance in such a 4D (resp. 2D) neighborhood slice a local light field (Fig. 3 left). Of the many parameterizations that have been proposed for light fields, we use two distinct ones in this paper, each allowing for a natural expression of some transport phenomena. Both use the same parameter for the spatial coordinates in the virtual plane, x, but they differ slightly in their treatment of directions. For our two-plane parameterization, we follow Chai et al. [2000] and use the intersection v with a parallel plane at unit distance, expressed in the local frame of x (Fig. 3-a). In the plane-sphere parameterization, we use the angle θ with the central direction (Fig. 3-b) [Camahort et al. 1998]. These two parameterizations are linked by v = tan θ and are equivalent around the origin thanks to a linearization of the tangent.

We study the Fourier spectrum of the radiance field ℓ_R, which we denote by ℓ̂_R. For the two-plane parameterization, we use the following definition of the Fourier transform:

    \hat\ell_R(\Omega_x, \Omega_v) = \int_{x=-\infty}^{\infty}\int_{v=-\infty}^{\infty} \ell_R(x, v)\, e^{-2i\pi\Omega_x x}\, e^{-2i\pi\Omega_v v}\, dx\, dv \qquad (1)

Examples are shown for two simple light sources in Fig. 4, with the spatial dimension along the horizontal axis and the direction along the vertical axis. We discuss the plane-sphere parameterization in Section 4.

One of the motivations for using Fourier analysis is the convolution-multiplication theorem, which states that a convolution…
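The convolution-multiplication theorem holds exactly for the discrete Fourier transform as well, which makes it easy to check numerically. A minimal NumPy illustration (ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
f = rng.standard_normal(N)
g = rng.standard_normal(N)

# Circular convolution computed directly from its definition ...
conv = np.array([sum(f[m] * g[(n - m) % N] for m in range(N)) for n in range(N)])

# ... has a DFT equal to the product of the individual DFTs.
assert np.allclose(np.fft.fft(conv), np.fft.fft(f) * np.fft.fft(g))
```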
(Figure 4 panels: light fields (x, v) and spectra (Ω_x, Ω_v) of two simple light sources.)

The longer the travel, the more pronounced the shear.

3.2 Visibility

Occlusion creates high frequencies and discontinuities in the radiance function. Radiance is multiplied by the binary occlusion function of the occluders:

    \ell_{R'}(x, v) = \ell_R(x, v)\, V(x, v) \qquad (4)

…g(v) is a bell-like curve (Fig. 5-a); its Fourier transform is

    \hat g(\Omega_v) = 4\pi |\Omega_v|\, K_1(2\pi |\Omega_v|)

where K_1 is the first-order modified Bessel function of the second kind. ĝ is highly concentrated on low frequencies (Fig. 5-b); the effect of convolution by ĝ is a very small blur of the spectrum in the angular dimension (Fig. 6, step 5).
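The spectral effect of occlusion, a multiplication in the primal and a convolution in the Fourier domain, is easy to reproduce numerically. In this sketch (our illustration, not the paper's code), a low-frequency radiance profile is multiplied by a binary blocker whose fundamental sits at frequency bin 16; the spectrum of the product acquires a replica of the low-frequency lobe there, where the unoccluded signal had essentially no energy:

```python
import numpy as np

N = 256
x = np.arange(N)
radiance = np.exp(-((x - N / 2) ** 2) / (2 * 8.0 ** 2))       # smooth, low-frequency
blocker = (np.cos(2 * np.pi * 16 * x / N) > 0).astype(float)  # binary, fundamental at bin 16

S_radiance = np.abs(np.fft.fft(radiance))
S_occluded = np.abs(np.fft.fft(radiance * blocker))
# Multiplying by the blocker convolves the spectrum with the blocker's
# spectrum, replicating the low-frequency lobe around bin 16.
```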
(Figure 6 rows: 1. emission from the emitter, with occluders below; 2. transport; 3. visibility; 4. transport; 5. cos⁺ θ; 6. diffuse BRDF. Columns: the light field (x, v), its Fourier transform (Ω_x, Ω_v), and a 3D version (X–Θ spectrum); a final row shows the resulting radiosity and its Fourier spectrum (X–Y spectrum).)
Figure 6: Effects on the spectrum of the various steps of light transport with a diffuse reflector. 2D Fourier transforms for steps 1 to 4 are obtained analytically; step 5 (convolution) is performed numerically. 3D spectra are obtained numerically, via a photon-mapping algorithm and an FFT of the computed light field.
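The transport steps above are shears of the light field, and shears have an exact discrete analogue: shearing a sampled light field, ℓ(x, v) → ℓ(x − dv, v), permutes its 2D DFT by the transposed shear along the other axis. A small NumPy check (our illustration; integer d and periodic wrap keep the identity exact):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 32, 3
L = rng.standard_normal((N, N))  # L[v, x]: sampled light field

# Primal shear: transport by distance d, x -> x - d*v (periodic wrap).
sheared = np.empty_like(L)
for v in range(N):
    sheared[v] = np.roll(L[v], d * v)

# In Fourier space the spectrum is sheared along the conjugate axis:
# FFT(sheared)[kv, kx] == FFT(L)[(kv + d*kx) % N, kx]
A = np.fft.fft2(L)
B = np.fft.fft2(sheared)
kv, kx = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
assert np.allclose(B, A[(kv + d * kx) % N, kx])
```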
(Figure 7 sketch: an emitter, a first and a second occluder, and a receiver.)
Figure 7: Scene configuration for the visibility experiment. Left: spectrum with only one occluder. Right: spectrum with two occluders, computed with full precision and phase.

3.4 Example and discussion

Fig. 6 illustrates the various steps of light transport for a simple scene such as Fig. 9b. The slopes of the transport shears correspond to the travel distance (steps 2 and 4). Visibility increases the spatial frequency content through the convolution by a horizontal kernel in frequency space (step 3). There are only a finite number of blockers in Fig. 6, which explains why their spectrum is not a Dirac comb times a sinc, but a blurry version. The blocker spectrum mostly contains a main central lobe corresponding to the average occlusion and two side lobes corresponding to the blocker's main frequency. This results in a replication of the sheared source spectrum on the two sides. The smaller the blocker pattern, the further away these replicas are in frequency space. The final diffuse integration (step 6) discards all directional frequencies.

The main differences between the 3D and 2D plots of the spectra in Fig. 6 come from aliasing problems that are harder to fix with the 4D light field. Furthermore, in the 3D scene, the position of the blockers is jittered (see Fig. 9), which results in a smoother spectrum.

Feature-based visibility: The spectra in Fig. 6 show that the second transport (step 4) pushes the "replicas" to the angular domain. This effect is more pronounced for high-frequency blockers, for which the replicas are farther from the vertical line. Since the final diffuse integration keeps only the spatial line of frequencies (step 6), the main high-frequency lobe of the blockers is eliminated by diffuse shading. This is related to the feature-based approach to visibility [Sillion and Drettakis 1995], where the effect of small occluders on soft shadows is approximated by an average occlusion. However, our finding goes one step further: where the feature-based technique ignores high frequencies, we show that, for small-enough blockers, most high frequencies are effectively removed by integration.

Combining several blockers: A difficult scene for visibility is the case of two occluders that individually block half of the light, and together block all the light (Fig. 7). In our framework, if one carries out the computations with full precision, taking phase into account, one gets the correct result: an empty spectrum (Fig. 7, right). However, for practical applications, it is probably not necessary to compute the full spectrum. Instead, we consider elements of information about the maximal frequency caused by the scene configuration, as we show in Section 6.2. In that case, one can get an overestimation of the frequencies caused by a combination of blockers, but not an underestimation.

4 General case for surface interaction

So far, we have studied only diffuse shading for a central ray normal to a planar receiver (although rays in the neighborhood have a non-normal incidence angle). We now discuss the general case, taking into account the incidence angle, arbitrary BRDFs, and receiver curvature, as well as spatial albedo variation. Our framework builds upon Ramamoorthi and Hanrahan [2001b] and extends it in several ways, which we discuss in Section 6.1.

In a nutshell, local reflection simply corresponds to a multiplication by the cosine term and a convolution by the BRDF. However, a number of reparameterizations are necessary to take into account the incidence and outgoing angles, as well as the surface curvature. We first treat the special case of rotation-invariant BRDFs such as Phong before addressing more general forms as well as texture and spatially-varying BRDFs. Recall that we study frequency content in ray neighborhoods, which means that for local reflection, we consider an incoming neighborhood and an outgoing neighborhood.

Plane-sphere parameterization: Since local reflection mostly involves integrals over the directional dimension, it is more naturally expressed in a parameterization where angles are uniform. This is why we use here a plane-sphere parameterization where the directional component θ is the angle to the central ray (Fig. 3-b). The spatial dimension is unaffected.

In the plane-sphere parameterization, the domain of directions is the circle S¹, which means that frequency content along this dimension is now a Fourier series, not a transform. Fig. 8 shows the effect of reparameterizing angles on the frequency plane. The frequency distribution is very similar, although the spectrum is blurred by the non-linear reparameterization. For bandwidth analysis, this introduces no significant error. Note that for all local interactions with the surface (and thus in this entire section), there is no limitation to small values of θ; the linearization v = tan θ ≈ θ will only be used again after light leaves the surface, for subsequent transport.

Figure 8: Spectrum arriving at the receiver (step 4 of Fig. 6), before and after the sphere-plane reparameterization. Left: (Ω_x, Ω_v) spectrum. Right: (Ω_x, Ω_θ) spectrum.

4.1 Rotation-invariant BRDFs on curved receivers

Local shading is described by the shading equation

    \ell_o(x_i, \theta_o) = \int_{\theta_i} \ell(x_i, \theta_i)\, \rho_{x_i}(\theta'_o, \theta'_i)\, \cos^+\theta'_i\, d\theta_i \qquad (8)

where the primed angles are in the local frame of the normal while the unprimed angles are in the global frame (Fig. 10). For now, we assume that the BRDF ρ does not vary with x_i. Local shading is mostly a directional phenomenon with no spatial interaction: the outgoing radiance at a point is only determined by the incoming radiance at that point. However, the normal varies per point.

As pointed out by Ramamoorthi and Hanrahan [2001b], local reflection combines quantities that are naturally expressed in a global frame (incoming and outgoing radiance) and quantities that live in the local frame defined by the normal at a point (cosine term and BRDF). For this, we need to rotate all quantities at each spatial location to align them with the normal. This means that we rotate (reparameterize) the incoming radiance, perform local shading in the local frame, and rotate (reparameterize) again to obtain the outgoing radiance in a global frame. All steps of the local shading process are illustrated in Fig. 10 and discussed below.

Steps 1 & 7: Reparameterization into the tangent frame: We first take the central incidence angle θ_0 into account, and reparameterize in the local tangent frame with respect to the central normal direction. This involves a shift by θ_0 in angle and a scale in space…
(Figure: three X–θ spectra of the incident light on the receiver.)
…transport shear. Similarly, the Fourier transform is sheared along the spatial dimension (Fig. 10, step 2, last row):

    \hat\ell'_i(\Omega'_x, \Omega'_\theta) = \hat\ell_i(\Omega'_x + k\Omega'_\theta, \Omega'_\theta) \qquad (11)

After this reparameterization, our two-dimensional spatio-directional local light field is harder to interpret physically. For each column, it corresponds to the incoming radiance in the frame of the local normal: the frame varies for each point. In a sense, we have unrolled the local surface and warped the space of light rays in the process [Wood et al. 2000]. The direction of the shear depends on the sign of the curvature (concave vs. convex).

Step 3: Cosine term and differential irradiance: In the local frame of each point, we compute differential irradiance by multiplying by the spatially-constant clamped cosine function cos⁺. This multiplication corresponds in frequency space to a convolution by a Dirac in space times a narrow function in angle:

    \widehat{dE}'(\Omega_x, \Omega_\theta) = \hat\ell'_i(\Omega_x, \Omega_\theta) \otimes \left[\widehat{\cos^+}(\Omega_\theta)\,\delta_{\Omega_x=0}\right] \qquad (12)

Over the full directional domain, the spectrum of cos⁺ is:

    \widehat{\cos^+}(\Omega_\theta) = \frac{2\cos(\pi^2\Omega_\theta)}{1-(2\pi\Omega_\theta)^2} \qquad (13)

Most of the energy is centered around zero (Fig. 12-a) and the 1/Ω²_θ frequency falloff comes from the derivative discontinuity¹ at π/2. As with the two-plane reparameterization (Section 3.3), the cosine term has only a small vertical blurring effect.

¹ A function with a discontinuity in the nth derivative has a spectrum falling off as 1/Ω^(n+1). A Dirac has a constant spectrum.

…which is a convolution of dE′_r by ρ′ for each x_i: that is, we convolve the 2D function dE′_r by a spatial Dirac times the directional shift-invariant BRDF ρ′ (Fig. 10, step 5). In the Fourier domain, this is a multiplication by a spatial constant times the directional spectrum of the BRDF.

    \hat\ell'_o(\Omega'_x, \Omega'_\theta) = \widehat{dE}'_r(\Omega'_x, \Omega'_\theta)\, \hat\rho'(\Omega'_\theta) \qquad (16)

Note, however, that our expression of the BRDF is not reciprocal. We address more general forms of BRDF below.

Step 6: Per-point rotation back to tangent frame: We now apply the inverse directional shear to go back to the global frame. Because we have applied a mirror transform in step 4, the shear and inverse shear double their effect rather than canceling each other. Since the shear comes from the object curvature, this models the effect of concave and convex mirrors and how they deform reflections. In particular, a mirror sphere maps the full 360-degree field to the 180-degree hemisphere, as exploited for light probes.

4.2 Discussion

The important effects due to curvature, the cosine term, and the BRDF are summarized in Fig. 10. Local shading is mostly a directional phenomenon, and the spatial component is a double shear due to curvature (steps 2 and 6). The cosine term results, in frequency space, in a convolution by a small directional kernel (step 3), while the BRDF band-limits the signal with a multiplication of the spectrum (step 5). Rougher materials operate a more aggressive low-pass, while in the special case of mirror BRDFs, the BRDF is a Dirac and the signal is unchanged.

Curvature has no effect on the directional bandwidth of the outgoing light field, which means that previous bounds derived in the…
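The closed form of equation (13) for the spectrum of the clamped cosine can be checked by direct numerical integration over the directional domain [−π, π]. This is our own sanity check of the reconstructed formula, not part of the paper:

```python
import numpy as np

# Dense sampling of the directional domain [-pi, pi].
theta = np.linspace(-np.pi, np.pi, 200001)
cos_plus = np.maximum(np.cos(theta), 0.0)
h = theta[1] - theta[0]

for Omega in [0.0, 0.3, 1.0, 2.5]:
    # cos+ is even, so its Fourier transform reduces to a cosine integral.
    numeric = np.sum(cos_plus * np.cos(2 * np.pi * Omega * theta)) * h
    closed_form = 2 * np.cos(np.pi**2 * Omega) / (1 - (2 * np.pi * Omega) ** 2)
    assert abs(numeric - closed_form) < 1e-6
```

(The Ω values are chosen away from Ω = 1/(2π), where the denominator vanishes and the expression must be taken as a limit.)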
(Figure 10: the seven steps of local shading, shown in scene space, ray space, and Fourier space: 1. reparameterization to the tangent frame (shift by θ_0, scale by 1/cos θ_0); 2. per-point rotation to the local frame; 3. cosine term (differential irradiance); 4. mirror reparameterization; 5. shading (integral with the BRDF ρ′); 6. per-point rotation back to the tangent frame; 7. reparameterization to the outgoing light field (shift by θ_1, scale by cos θ_1). The light field goes through ℓ_i, ℓ′_i, dE′, dE′_r, ℓ′_o, ℓ″_o.)
(Additional panels: a concave mirror focusing an incoming beam of light at a focal point; directional BRDF spectra for aluminium, silver, copper and plastic.)
step 1, 2, 3, 4 step 5
over the directional domain, but the kernel varies spatially.

To model this effect, we exploit the fact that a 2D Fourier transform can be decomposed into two separable 1D transforms, the first one vertically, then horizontally. We consider the intermediate semi-Fourier space ℓ̊(x, Ωθ) that represents for each location x the 1D Fourier transform of the directional variation of incoming light. The full Fourier space is then the 1D Fourier transform of the semi-Fourier transform along the x dimension. We have

ℓ̊o′(x, Ωθ) = dE̊r′(x, Ωθ) ρ̊(x, Ωθ),

which is a multiplication in the semi-Fourier domain, and therefore a convolution along x only in full Fourier space:

ℓ̂′(Ωx, Ωθ) = dÊr′(Ωx, Ωθ) ⊗x ρ̂(Ωx, Ωθ)

This means that in order to characterize the effect of spatially-varying BRDFs, we consider the spectrum of ρ(x, θ). We then take the spectrum of the incoming illumination ℓ̂ and convolve it only horizontally along Ωx, not vertically. We call this a semi-convolution in Ωx, which we note ⊗x.

In the special case of non-varying BRDFs, the spectrum of ρ(x, θ) is a Dirac times the directional spectrum of the BRDF. The horizontal convolution is a multiplication. If the spectrum of ρ is separable (texture mapping), then the spatially-varying BRDF case is a multiplication followed by a convolution. The special case of a spatially-varying combination of BRDFs [Lensch et al. 2001] can be handled more simply as the superposition of multiple BRDFs with weights encoded as textures.

5 Extension to 3D

We now show how our framework extends to 3D scenes.

5.1 Light-field parameterization

The derivations presented in Section 3 involve a two-plane light-field parameterization and extend directly to 3D. The only notable difference is the calculation of differential irradiance (Eq. 7), where the projected surface area in 3D becomes:

dA⊥ = du dv / (1 + u² + v²)² = G(u, v) du dv

Fig. 5-c presents the spectrum of G(u, v).

5.2 Shading in plane-sphere parameterization

The sphere S² of directions is unfortunately hard to parameterize, which prompted many authors to use spherical harmonics as the equivalent of the Fourier basis on this domain. In contrast, we have chosen to represent directions using spherical coordinates and to use traditional Fourier analysis, which is permitted by our restriction to local neighborhoods of S². This solution enables a more direct extension of our 2D results, and in particular it expresses well the interaction between the spatial and angular components.

Spherical coordinates We use the spherical coordinates θ, ϕ, where θ, in [−π, π], is the azimuth and ϕ, in [−π/2, π/2], the co-latitude. The distortion of this parameterization is cos ϕ, which means that one must remain around the equator to avoid distortion. In this neighborhood, the parameterization is essentially Euclidean, to a first-order approximation.

Local reflection is challenging because it involves four neighborhoods of directions: the incoming direction, the normal, the mirror direction, and the outgoing direction; in general, we cannot choose a spherical parameterization where they all lie near the equator. Fortunately, we only need to consider two of these neighborhoods at a time (Fig. 4).

For this, we exploit the fact that a rotation around an axis on the equator can be approximated to first order by a Euclidean rotation of the (θ, ϕ) coordinates: (θ′, ϕ′) = Rα(θ, ϕ).

Figure 13: 3D direction parameterizations.

For brevity, we omit the comprehensive remapping formulas for 3D shading, but we describe the appropriate parameterization for each step as well as the major differences with the 2D case.

Tangent frame We start with a parameterization where the equator is in the incident plane, as defined by the central ray of the incident light field and the central normal vector (Fig. 13-b). If the light field has been properly rotated, only the x spatial dimension undergoes the scaling by cos θ0 (Eq. 9).

Curvature In 2D, we approximated the angle with the local normal linearly by α(x) = kx. For a surface, the corresponding linearization of the normal direction (θN, ϕN) involves a bilinear form [Do Carmo 1976]:

(θN, ϕN) = M (x, y)    (19)

If x and y are aligned with the principal directions of the surface, the matrix M is an anisotropic scaling where the scale factors are the two principal curvatures. The corresponding remapping of (x, y, θ, ϕ) is a shear in 4D, with different amounts in the two principal directions. As with the 2D case, the frequency content is sheared along the spatial dimensions.

Differential irradiance and cosine Step 3 is mostly unchanged. Since we placed the equator along the incident plane, the cosine term depends only on θ to a first approximation. The spectrum is convolved with a small 1D kernel in θ (Fig. 12-a).

Rotationally-symmetric BRDFs The mirror reparameterization of step 4 is unchanged, and the angles remain near the equator since the equator also contains the normals. We express the convolution of the mirrored incoming light field by the BRDF in the neighborhood of the outgoing direction. For this, we rotate our spherical coordinates so that the new equator contains both the mirror direction and the direction of the central outgoing ray (Fig. 13-c). Because all the angles are near the equator, the difference angles between an outgoing ray and a mirrored ray can be approximated by θo′ − θr′ and ϕo′ − ϕr′, and Eq. 16 applies.

Recap of rotations In summary, we first need to rotate the light-field parameterization so that the central incidence plane is along one of the axes before reparameterizing from two-plane to sphere-plane (Fig. 13-b). We then need to rotate between the mirror reparameterization and the BRDF convolution to place the central outgoing direction on the equator (Fig. 13-c). Finally we rotate again to put the outgoing plane defined by the normal and central outgoing direction in the equator (not shown).

5.3 Anisotropies in 3D

Our extension to 3D exploits the low distortion of spherical coordinates near the equator, at the cost of additional reparameterizations.
It follows the 2D derivation, with additional reparameterization steps and anisotropies.

In practice, the notion of locality is invoked for three different reasons, whose importance depends on the context and application:
– the use of first-order Taylor series, for example for the curvature or for the tan θ ≈ θ remapping,
– the principle of uncertainty, which states that low frequencies cannot be measured on small windows (in which case big neighborhoods …)

                                Ray space        Fourier        Spectrum formula
Transport
  Travel                        shear            shear          ℓ̂(Ωx, Ωv + dΩx)
  Visibility                    multiplication   convolution    ℓ̂ ⊗ V̂
Local geometric configuration
  Light incidence               scale spatial    scale spatial  e^(−iΩθ θ0) / |cos θ0| · ℓ̂(−Ωx cos θ0, Ωθ)
  Outgoing angle                scale spatial    scale spatial  e^(iΩθ θ1) / |cos θ1| · ℓ̂o(Ωx / cos θ1, Ωθ)
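The travel row above can be illustrated on a discrete local light field: with an integer travel distance on a periodic grid, the ray-space shear is exact, and the 2D spectrum is sheared along the angular-frequency axis, matching ℓ̂(Ωx, Ωv + dΩx). A toy numpy check (grid size, conventions and signs are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 16          # grid resolution in both x (space) and v (angle)
d = 3           # integer travel distance, so the shear is exact on the grid

l = rng.standard_normal((N, N))  # local light field l[x, v]

# Travel in ray space: a ray observed at (x, v) came from (x - d*v, v).
x, v = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
l_travel = l[(x - d * v) % N, v]

# Spectra (2D DFT over space and angle).
L = np.fft.fft2(l)
L_travel = np.fft.fft2(l_travel)

# The spectrum is sheared along the angular frequency axis:
# L_travel[kx, kv] == L[kx, kv + d*kx]  (indices taken mod N).
kx, kv = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
assert np.allclose(L_travel, L[kx, (kv + d * kx) % N])
print("travel: shear in ray space, opposite shear of the spectrum")
```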
Figure 16: Criteria and sampling pattern used to render Fig. 17. The sampling adapts to curvature, the viewing angle, the BRDF as well as the harmonic average of the distance to potential blockers. (Diagram: glossy reflection criterion combining the BRDF, the curvature k, the normal N, the viewing angle θ1 and the distance d′ to occluders; panel strip: (a) incoming spectrum, (b) curvature shear, (c) mirror reparam., (d) BRDF low pass, (e) inverse shear, (f) exitant angle, (g) transport to eye.)
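To make the sampling criterion concrete, here is a deliberately naive sketch (not the implementation used for the figures) of turning a per-pixel bandwidth estimate into shading-sample counts, using the fact that a 2D signal of bandwidth B requires a sampling density proportional to B²:

```python
import numpy as np

# Per-pixel bandwidth estimates (hypothetical values).
B = np.array([[1.0, 1.0, 4.0],
              [1.0, 2.0, 4.0]])

# 2D sampling: the required density grows with the square of the bandwidth.
density = B**2 / np.sum(B**2)

# Distribute a fixed budget of 20,000 shading samples accordingly.
samples = np.round(density * 20000)
print(samples)
```

High-bandwidth pixels (shiny, highly curved, or near occluders) receive proportionally more of the fixed budget, which is the behavior visible in Fig. 17.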
Figure 17: Scene rendered without (uniform sampling) and with (using our bandwidth prediction) adaptive sampling rate based on our prediction of frequency content. Only 20,000 shading samples were used to compute these 800 × 500 images. Note how our approach better captures the sharp detail in the shiny dinosaur’s head and feet. The criteria and sampling are shown in Fig. 16. Images rendered using PBRT [Pharr and Humphreys 2004].

1. Compute visibility at full resolution
2. Use finite-differences for curvature criterion
3. Compute harmonic blocker distance for sparse samples
4. Perform bilateral reconstruction
5. Compute B’ based on blocker and curvature
6. Generate blue noise sampling based on B’
7. Compute shading for samples
8. Perform bilateral reconstruction

Observe how our sampling density is increased in areas of high curvature, grazing angles, and near occluders. The environment map casts particularly soft shadows, and note how the high-frequency detail on the nose of the foreground dinosaur is well captured, especially given that the shading sampling is equivalent to a 200 × 100 resolution image.

Although these results are encouraging, the approach needs improvement in several areas. The visibility criterion in particular should take into account the light source intensity in a directional neighborhood to better weight the inverse distances. Even so, the simple method outlined above illustrates how knowledge of the modifications of the frequency content through light transport can be exploited to drive rendering algorithms. In particular, similar derivations are promising for precomputed radiance transfer [Sloan et al. 2002] in order to relate spatial and angular sampling.

7 Conclusions and future work

We have presented a comprehensive framework for the description of radiance in frequency space, through operations of light transport. By studying the local light field around the direction of propagation, we can characterize the effect of travel in free space, occlusion, and reflection in terms of frequency content both in space and angle. In addition to the theoretical insight offered by our analysis, we have shown that practical conclusions can be drawn from a frequency analysis, without explicitly computing any Fourier transforms, by driving the sampling density of a ray tracer according to frequency predictions.

Future work On the theory side, we are working on the analysis of additional local shading effects such as refraction, bump-mapping, and local shadowing [Ramamoorthi et al. 2004]. We hope to study the frequency cutoff for micro, meso, and macro-geometry effects [Becker and Max 1993]. The study of participating media is promising given the ability of Fourier analysis to model differential equations. The spectral analysis of light interaction in a full scene is another challenging topic. Finally, the addition of the time dimension is a natural way to tackle effects such as motion blur.

We are excited by the wealth of potential applications encompassed by our framework. In rendering, we believe that many traditional algorithms can be cast in this framework in order to derive theoretical justification, but also to allow extensions to more general cases (such as from diffuse to glossy). Our preliminary study of sampling rates in ray tracing is promising, and we want to develop new algorithms and data structures to predict local bandwidth, especially for occlusion effects. Precomputed radiance transfer is another direct application of our work.

Our analysis extends previous work in inverse rendering [Ramamoorthi and Hanrahan 2001b; Basri and Jacobs 2003] and we are working on applications to inverse rendering with close-range sources, shape from reflection, and depth from defocus.

Acknowledgments

We thank Jaakko Lehtinen, the reviewers of the Artis and MIT teams, as well as the SIGGRAPH reviewers for insightful feedback. This work was supported by an NSF CAREER award 0447561 “Transient Signal Processing for Realistic Imagery,” an NSF CISE Research Infrastructure Award (EIA9802220), an ASEE National Defense Science and Engineering Graduate fellowship, the European Union IST-2001-34744 “RealReflect” project, an INRIA équipe associée, and the MIT-France program.

References

Annen, T., Kautz, J., Durand, F., and Seidel, H.-P. 2004. Spherical harmonic gradients for mid-range illumination. In Rendering Techniques 2004 (Proc. EG Symposium on Rendering 2004).

Arvo, J. 1994. The irradiance Jacobian for partially occluded polyhedral sources. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 343–350.

Basri, R., and Jacobs, D. 2003. Lambertian reflectance and linear subspaces. IEEE Trans. Pattern Anal. Mach. Intell. 25, 2.

Becker, B. G., and Max, N. L. 1993. Smooth transitions between bump rendering algorithms. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 183–190.

Bolin, M. R., and Meyer, G. W. 1995. A frequency based ray tracer. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 409–418.

Bolin, M. R., and Meyer, G. W. 1998. A perceptually based adaptive sampling algorithm. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 299–309.

Camahort, E., Lerios, A., and Fussell, D. 1998. Uniformly sampled light fields. In Rendering Techniques ’98 (Proc. of EG Workshop on Rendering ’98), Eurographics, 117–130.

Chai, J.-X., Chan, S.-C., Shum, H.-Y., and Tong, X. 2000. Plenoptic sampling. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 307–318.

Chen, M., and Arvo, J. 2000. Theory and application of specular path perturbation. ACM Trans. Graph. 19, 4, 246–278.

Do Carmo, M. 1976. Differential Geometry of Curves and Surfaces. Prentice Hall.

Ferwerda, J. A., Shirley, P., Pattanaik, S. N., and Greenberg, D. P. 1997. A model of visual masking for computer graphics. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 143–152.

Frolova, D., Simakov, D., and Basri, R. 2004. Accuracy of spherical harmonic approximations for images of Lambertian objects under far and near lighting. In ECCV 2004, European Conference on Computer Vision, 574–587.

Goodman, J. W. 1996. Introduction To Fourier Optics. McGraw-Hill.

Gortler, S. J., Schröder, P., Cohen, M. F., and Hanrahan, P. 1993. Wavelet radiosity. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 221–230.

Halle, M. 1994. Holographic stereograms as discrete imaging systems. In SPIE Proc. Vol. 2176: Practical Holography VIII, S. Benton, Ed., SPIE, 73–84.

Heckbert, P. 1989. Fundamentals of Texture Mapping and Image Warping. Master’s thesis, University of California at Berkeley, Computer Science Division.

Holzschuch, N., and Sillion, F. X. 1998. An exhaustive error-bounding algorithm for hierarchical radiosity. Computer Graphics Forum 17, 4.

Igehy, H. 1999. Tracing ray differentials. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH.

Isaksen, A., McMillan, L., and Gortler, S. J. 2000. Dynamically reparameterized light fields. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 297–306.

Keller, A. 2001. Hierarchical monte carlo image synthesis. Mathematics and Computers in Simulation 55, 1–3 (Feb.), 79–92.

Lawrence, J., Rusinkiewicz, S., and Ramamoorthi, R. 2004. Efficient BRDF importance sampling using a factored representation. ACM Transactions on Graphics (Proc. SIGGRAPH 2004) 23, 3 (Aug.), 496–505.

Lensch, H. P. A., Kautz, J., Goesele, M., Heidrich, W., and Seidel, H.-P. 2001. Image-based reconstruction of spatially varying materials. In Rendering Techniques ’01 (Proc. EG Workshop on Rendering 2001), Eurographics, 104–115.

Malik, J., and Rosenholtz, R. 1997. Computing local surface orientation and shape from texture for curved surfaces. International Journal of Computer Vision 23, 2, 149–168.

McCool, M. D. 1999. Anisotropic diffusion for monte carlo noise reduction. ACM Transactions on Graphics 18, 2, 171–194.

Ostromoukhov, V., Donohue, C., and Jodoin, P.-M. 2004. Fast hierarchical importance sampling with blue noise properties. ACM Transactions on Graphics (Proc. SIGGRAPH 2004) 23, 3 (Aug.), 488–495.

Pentland, A. P. 1987. A new sense for depth of field. IEEE Transactions on Pattern Analysis and Machine Intelligence 9, 4 (July).

Pharr, M., and Humphreys, G. 2004. Physically Based Rendering: From Theory to Implementation. Morgan Kaufmann.

Ramamoorthi, R., and Hanrahan, P. 2001a. An efficient representation for irradiance environment maps. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH.

Ramamoorthi, R., and Hanrahan, P. 2001b. A signal-processing framework for inverse rendering. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH.

Ramamoorthi, R., and Hanrahan, P. 2002. Frequency space environment map rendering. ACM Transactions on Graphics (Proc. SIGGRAPH 2002) 21, 3, 517–526.

Ramamoorthi, R., and Hanrahan, P. 2004. A signal-processing framework for reflection. ACM Transactions on Graphics 23, 4.

Ramamoorthi, R., Koudelka, M., and Belhumeur, P. 2004. A Fourier theory for cast shadows. In ECCV 2004, European Conference on Computer Vision, 146–162.

Shinya, M., Takahashi, T., and Naito, S. 1987. Principles and applications of pencil tracing. Computer Graphics (Proc. SIGGRAPH ’87) 21, 4, 45–54.

Sillion, F., and Drettakis, G. 1995. Feature-based control of visibility error: A multi-resolution clustering algorithm for global illumination. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 145–152.

Sloan, P.-P., Kautz, J., and Snyder, J. 2002. Precomputed radiance transfer for real-time rendering in dynamic, low-frequency lighting environments. ACM Trans. on Graphics 21, 3, 527–536.

Soler, C., and Sillion, F. X. 1998. Fast calculation of soft shadow textures using convolution. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 321–332.

Stewart, J., Yu, J., Gortler, S. J., and McMillan, L. 2003. A new reconstruction filter for undersampled light fields. In Proc. EG Symposium on Rendering 2003, Eurographics, 150–156.

Tomasi, C., and Manduchi, R. 1998. Bilateral filtering for gray and color images. In Proc. IEEE International Conference on Computer Vision, IEEE, 836–846.

Ward, G. J., and Heckbert, P. 1992. Irradiance gradients. In Proc. of EG Workshop on Rendering ’92, Eurographics, 85–98.

Ward, G. J., Rubinstein, F. M., and Clear, R. D. 1988. A ray tracing solution for diffuse interreflection. Computer Graphics (Proc. SIGGRAPH ’88) 22, 4 (Aug.), 85–92.

Wood, D. N., Azuma, D. I., Aldinger, K., Curless, B., Duchamp, T., Salesin, D. H., and Stuetzle, W. 2000. Surface light fields for 3D photography. In Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 287–296.
4.
Using Programmable Graphics Cards

We have observed that several lighting effects, such as shadows and specular reflections, are both essential to the visual quality of the simulation and expensive to compute within a global illumination simulation. On the other hand, modern programmable graphics cards can carry out many computations for each pixel of the displayed image. It therefore becomes possible to offload a number of computations from the CPU by entrusting them to the graphics card.

In this chapter, we present our work on the use of graphics cards for simulating lighting effects such as soft shadows and specular reflections. This work increases the visual realism of the lighting simulation while freeing up computation time for the simulation of other effects.
4.1 Introduction

Lighting simulation demands significant computing resources, both processor time and memory. The visual quality of the result depends heavily on certain high-frequency effects, such as shadow boundaries or specular reflections. Our experiments have shown that these same effects are also the most expensive to compute. On some scenes, computing the shadow boundaries due to direct lighting alone accounts for 90 % of the time spent computing global illumination (direct and indirect).

This precision is required mainly for the visual appearance of the scene, not for the numerical accuracy of the computations. Indirect lighting computations can use a simplified version of the direct lighting, with coarsely modeled shadow boundaries, without losing accuracy. Several lighting simulation techniques, such as hierarchical radiosity or Photon Mapping, already exploit this property.

Going back to our study of the frequency properties of the illumination function [5], we see that high-frequency effects occur mainly:
– in the presence of specular BRDFs,
– on shadow boundaries,
– when two objects are close to each other.

A characteristic common to all these cases is that they do not require complete information about the lighting over the whole scene: each of them involves only a small number of objects at a time. For instance, to determine a shadow boundary one only needs to know the positions of the light source, of the blockers and of the receiver. Although these phenomena are not purely local, they are not, strictly speaking, global phenomena either. One could call them semi-local phenomena.
Conversely, truly global phenomena, which involve the lighting of the entire scene, tend to involve rather diffuse BRDFs and to be low-frequency phenomena, for which a coarse spatial sampling can be sufficient.

On one side, we have a set of phenomena that matter mostly for their visual effect, that therefore only need to be computed for the displayed image, and that involve only a small number of scene elements. On the other side, we have graphics cards whose capabilities have expanded and which can run powerful programs, in parallel, for every pixel of the screen. Their main limitations (computations are restricted to the displayed image, and each program has access to only a small amount of memory) actually match our needs here.

It is therefore natural to move the expensive but visually important part of the lighting simulation onto programmable graphics cards, while keeping the central processor for the global computations, which require access to information about the entire scene.

Note that graphics cards can also be used to accelerate global illumination simulation; in our work, we did so to speed up visibility queries using a hemicube [9], to efficiently compute umbra and penumbra boundaries [13], to quickly compute the percentage of visibility of an object with occlusion queries [13], and to display non-linear illumination functions [13]. That work will not be described here.
4.3. PRECOMPUTED AMBIENT OCCLUSION
…then computes only the soft shadow cast by that part of the blockers. This approximation limits the quality of the soft shadows, particularly if the light source is very large relative to the blockers.
– Several algorithms are based on an analysis of the geometric model of the objects. The cost of the computation is then proportional to the geometric complexity of the blockers. A soft shadow is generally a low-frequency phenomenon, whose visual complexity is much lower than the geometric complexity of the blockers. This creates unnecessary extra work, tied to the complexity of the objects.
Figure 4.1 – Our algorithm computes soft shadows in real time (left) by replacing the blockers with a discretized version (right) computed from the shadow map. These images are computed at 84 Hz.
Building on the knowledge accumulated in this state of the art, as well as on insights from the CYBER project, we developed a new soft shadow algorithm [2] with M2R student Lionel Atty (co-advised with Jean-Marc Hasenfratz). This algorithm relies on a discretization of the blockers into a depth map, and computes the soft shadow cast by the discretized blocker (see Figure 4.1). Although it does not address every issue raised by our survey, this algorithm has the advantage of being very fast and, above all, independent of the geometric complexity of the blockers. Moreover, the wider the penumbra, the fewer pixels are needed to discretize the blockers: we can thus spend less computation time on the soft shadows cast by very large sources.

The key point of our algorithm is the use of a discretized version of the blockers to compute the soft shadow. This approach is bound to spread in future work: by replacing a complex geometric model with a simpler, automatically computed discrete version, other computation algorithms could be accelerated as well.
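The key idea, replacing the blockers by a discretized version, can be illustrated by a brute-force computation (a toy sketch for intuition only, not the GPU algorithm of the paper; geometry, sampling and names are ours):

```python
import numpy as np

# Toy setup: a square area light in the plane z = 2, an occluder discretized
# into axis-aligned square cells in the plane z = 1 (as if re-read from a
# shadow map), and receiver points in the plane z = 0.

def visible_fraction(receiver, light_samples, cells, cell_size):
    """Fraction of light samples visible from `receiver`, where the
    occluder is the union of square cells (centers in `cells`) at z = 1."""
    visible = 0
    for s in light_samples:
        # Intersection of the segment receiver -> sample with the plane z = 1.
        t = (1.0 - receiver[2]) / (s[2] - receiver[2])
        p = receiver + t * (s - receiver)
        blocked = any(np.max(np.abs(p[:2] - c)) <= cell_size / 2 for c in cells)
        visible += not blocked
    return visible / len(light_samples)

# 4x4-sample square light of side 1 centered on the z axis, at z = 2.
u = np.linspace(-0.5, 0.5, 4)
light = np.array([(a, b, 2.0) for a in u for b in u])

# Occluder: a single square cell of side 1 centered on the axis, at z = 1.
cells = [np.array([0.0, 0.0])]

print(visible_fraction(np.array([0.0, 0.0, 0.0]), light, cells, 1.0))  # umbra: 0.0
print(visible_fraction(np.array([5.0, 0.0, 0.0]), light, cells, 1.0))  # fully lit: 1.0
```

Receiver points between these two extremes get intermediate values, forming the penumbra; the cost depends only on the number of cells, not on the original geometric complexity.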
Figure 4.2 – (a) Ambient occlusion can be defined as a cone (d, α). (b) We place a 3D grid around the object; at the center of each cell, we compute the ambient occlusion. (c) At render time, these values are used to shade neighboring objects. This scene is displayed at 800 Hz.
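The value stored in each grid cell can be estimated by Monte Carlo, as the fraction of directions around the cell center that hit the object; a toy sketch with a spherical occluder (setup and numbers are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Uniform random directions on the unit sphere.
d = rng.standard_normal((n, 3))
d /= np.linalg.norm(d, axis=1, keepdims=True)

# Occluder: a sphere of radius 1 whose center is 2 units away along +z.
center = np.array([0.0, 0.0, 2.0])

# A ray from the cell center hits the sphere iff its angle to `center`
# is below asin(radius / distance) = asin(1/2).
cos_cap = np.sqrt(1.0 - 0.5**2)  # cos(asin(0.5))
hit = d @ center / np.linalg.norm(center) > cos_cap
occlusion = hit.mean()

print(occlusion)  # close to the analytic value (1 - cos(asin(0.5))) / 2 ~ 0.067
```

Since ambient occlusion varies slowly in space, this per-cell value can be stored on a coarse grid, which is exactly what makes the raw 3D storage discussed below practical.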
For animated scenes, recent research3, 4 has focused on storing an ambient occlusion field, attached to a moving object, that influences neighboring objects. For compact storage, this research projects the ambient occlusion field onto a space of simple but well-suited functions (for example, rational fractions of the distance to the center) and stores the coefficients in a 2D cube map indexed by the direction relative to the object’s center. Storage is thus O(n²), at the price of extra work on the data, both during precomputation and at render time.
In collaboration with Mattias Malmer and Fredrik Malmer (Syndicate Ent. AB) and Ulf Assarsson (Chalmers University of Technology), we showed that it is more cost-effective to store these data in raw form in a 3D grid, without pre-processing [1] (see p. 224). At render time, the data are used directly. In theory, the drawback of this method is that storage is O(n³). In practice, since ambient occlusion is a very slowly varying phenomenon, small values of n can be used. We found that n = 32 was adequate for all our scenes. For such values of n, the cost of raw O(n³) storage is comparable to that of the O(n²) storage (about 100 KB per blocker), because of the larger number of coefficients per cell in the latter.
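The trade-off can be made concrete with back-of-the-envelope numbers (illustrative only; the per-texel coefficient count for the cube-map scheme is hypothetical, the text above only reports roughly 100 KB per blocker for both schemes):

```python
n = 32

# Raw 3D grid: one byte of ambient occlusion per cell.
raw_grid = n**3 * 1
print(raw_grid / 1024)  # 32.0 (KB)

# Cube-map storage: 6 faces of n x n texels, each holding a few
# projection coefficients (counts below are made up for illustration).
coeffs_per_texel = 4
bytes_per_coeff = 4     # 32-bit floats
cube_map = 6 * n**2 * coeffs_per_texel * bytes_per_coeff
print(cube_map / 1024)  # 96.0 (KB)
```

At such small n, the cubic term is not the dominant cost, which is the point made above.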
Beyond its obvious practical interest, this result is also interesting scientifically: for some phenomena, a raw representation can be more valuable than an elaborate one. This effect is most pronounced for low-frequency phenomena such as ambient occlusion. It will be interesting to study the use of such 3D-grid storage for other lighting phenomena.
3. Kun Zhou, Yaohua Hu, Steve Lin, Baining Guo and Heung-Yeung Shum. « Precomputed Shadow Fields for Dynamic Scenes ». ACM Transactions on Graphics (proceedings of Siggraph 2005), 24(3), 2005.
4. Janne Kontkanen and Samuli Laine. « Ambient Occlusion Fields ». In Symposium on Interactive 3D Graphics and Games, pp. 41–48, 2005.
4.5. INDIRECT LIGHTING
…using an environment map of the scene, but this technique assumes that the distance between the reflector and the reflected object is infinite.

(a) Our algorithm (b) Reference (ray tracing) (c) Environment mapping
With PhD student David Roger (co-advised with François Sillion), we developed a real-time method for computing specular reflections that works even when the reflector and the reflected object are in contact [3] (see p. 238). This technique computes (on the graphics card) the reflection of the scene’s vertices, then interpolates between the reflected vertex positions. The method’s main advantage is its robustness, including in difficult situations (see Figure 4.3).

Our method has linear complexity in the number of objects in the scene and runs in real time for scenes of up to 20,000 polygons. Its main drawback lies in the linear interpolation between the reflected vertex positions, performed by the graphics card, which can produce visible errors in the computed reflection.
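For intuition, reflecting a vertex is a closed-form operation in the planar case (the method above handles curved reflectors, which require a per-vertex search; this sketch only shows the standard planar mirror formula):

```python
import numpy as np

def reflect_vertex(p, q, n):
    """Mirror image of point p across the plane through q with normal n."""
    n = n / np.linalg.norm(n)
    return p - 2.0 * np.dot(p - q, n) * n

# Reflect a vertex across the ground plane z = 0.
v = np.array([1.0, 2.0, 3.0])
print(reflect_vertex(v, np.zeros(3), np.array([0.0, 0.0, 1.0])))  # [ 1.  2. -3.]
```

Rendering the reflected vertices and letting the rasterizer interpolate between them is what makes the approach fast, and also what introduces the interpolation errors mentioned above.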
(b) Direct lighting (c) Indirect lighting computed by our algorithm (d) Resulting global illumination

Figure 4.4 – Our algorithm for the interactive computation of global illumination. These images are computed at 15 Hz.
4.6 Discussion

In this chapter, we presented our work on simulating lighting effects using programmable graphics cards. We saw that certain effects that are important for lighting realism, such as soft shadows or specular reflections, can be simulated at a speed compatible with interactivity.

Programmable graphics cards open up new directions thanks to their computing power, but that power has its limits. The card’s architecture makes it best suited to localized computations that do not require global access to the scene. Graphics cards are thus well adapted to computing the illumination at a point, to shadow computations (even soft shadows), and to specular reflections.

Similarly, we saw that graphics cards can be used for lighting effects attached to an object and affecting only very nearby objects, such as ambient occlusion.

Thus, the phenomena for which graphics cards are best suited are also phenomena that are mostly high-frequency, according to our frequency analysis of the previous chapter.

This work on the use of graphics cards for lighting simulation is also our most promising in terms of industrial collaborations and international cooperation.
4.7 Articles

4.7.1 List of articles

– A survey of real-time soft shadows algorithms (CGF 2003)
– Soft shadow maps: efficient sampling of light source visibility (CGF 2006)
– Fast Precomputed Ambient Occlusion for Proximity Shadows (JGT 2006)
– Accurate specular reflections in real-time (EG 2006)
– Wavelet radiance transport for interactive indirect lighting (EGSR 2006)
Artis GRAVIR/IMAG-INRIA∗∗
Abstract
Recent advances in GPU technology have produced a shift in focus for real-time rendering applications, whereby
improvements in image quality are sought in addition to raw polygon display performance. Rendering effects
such as antialiasing, motion blur and shadow casting are becoming commonplace and will likely be considered
indispensable in the near future. The last complete and famous survey on shadow algorithms — by Woo et al.52 in
1990 — has to be updated in particular in view of recent improvements in graphics hardware, which make new
algorithms possible. This paper covers all current methods for real-time shadow rendering, without venturing into
slower, high quality techniques based on ray casting or radiosity. Shadows are useful for a variety of reasons: first,
they help understand relative object placement in a 3D scene by providing visual cues. Second, they dramatically
improve image realism and allow the creation of complex lighting ambiances. Depending on the application, the
emphasis is placed on a guaranteed framerate, or on the visual quality of the shadows including penumbra effects
or “soft shadows”. Obviously no single method can render physically correct soft shadows in real time for any
dynamic scene! However our survey aims at providing an exhaustive study allowing a programmer to choose the
best compromise for his/her needs. In particular we discuss the advantages, limitations, rendering quality and
cost of each algorithm. Recommendations are included based on simple characteristics of the application such
as static/moving lights, single or multiple light sources, static/dynamic geometry, geometric complexity, directed
or omnidirectional lights, etc. Finally we indicate which methods can efficiently exploit the most recent graphics
hardware facilities.
Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Computer Graphics]: Three-Dimensional
Graphics and Realism – Color, shading, shadowing, and texture, I.3.1 [Computer Graphics]: Hardware Archi-
tecture – Graphics processors, I.3.3 [Computer Graphics]: Picture/Image Generation – Bitmap and framebuffer
operations
Keywords: shadow algorithms, soft shadows, real-time, shadow mapping, shadow volume algorithm.
Hasenfratz et al. / Real-time Soft Shadows
(a) Shadows provide information about the relative positions of objects. On the left-hand image, we cannot determine the position of the robot, whereas on the other three images we understand that it is more and more distant from the ground.

(b) Shadows provide information about the geometry of the receiver. Left: not enough cues about the ground. Right: shadow reveals ground geometry.

Figure 3: Shadows provide information about the geometry of the occluder. Here we see that the robot holds nothing in his left hand on Figure 3(a), a ring on Figure 3(b) and a teapot on Figure 3(c).
graphics and we shall see that several algorithms let us compute hard shadows in real time.

In the more realistic case of a light source with finite extent, a point on the receiver can have a partial view of the light, i.e. only a fraction of the light source is visible from that point. We distinguish the umbra region (if it exists) in which the light source is totally blocked from the receiver, and the penumbra region in which the light source is partially visible. The determination of the umbra and penumbra is a difficult task in general, as it amounts to solving visibility relationships in 3D, a notoriously hard problem. In the case of polygonal objects, the shape of the umbra and penumbra regions is embedded in a discontinuity mesh13 which can be constructed from the edges and vertices of the light source and the occluders (see Figure 4(b)).

Soft shadows are obviously much more realistic than hard shadows (see Figures 4(c) and 4(d)); in particular the degree of softness (blur) in the shadow varies dramatically with the distances involved between the source, occluder, and receiver. Note also that a hard shadow, with its crisp boundary, could be mistakenly perceived as an object in the scene, while this would hardly happen with a soft shadow.

In computer graphics we can approximate a small or distant light source as a point source only when the distance from the light to the occluder is much larger than the distance from the occluder to the receiver, and the resolution of the final image does not allow proper rendering of the penumbra. In all other cases great benefits can be expected from properly representing soft shadows.
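The distinction between umbra and penumbra can be made concrete with a small sketch (purely illustrative 2D brute-force sampling; none of this code comes from the paper): a receiver point lies in the penumbra when the visible fraction of a segment light source is strictly between 0 and 1.

```python
# Toy 2D illustration of partial visibility: fraction of a segment light
# source visible from a receiver point, given one segment occluder.

def visible_fraction(receiver, light_a, light_b, occ_a, occ_b, samples=1000):
    def seg_intersect(p, q, a, b):
        # proper segment/segment intersection via cross-product signs
        def side(o, d, x):
            return (d[0]-o[0])*(x[1]-o[1]) - (d[1]-o[1])*(x[0]-o[0])
        return (side(p, q, a) * side(p, q, b) < 0 and
                side(a, b, p) * side(a, b, q) < 0)
    hits = 0
    for i in range(samples):
        t = (i + 0.5) / samples
        s = (light_a[0] + t*(light_b[0]-light_a[0]),
             light_a[1] + t*(light_b[1]-light_a[1]))
        if not seg_intersect(receiver, s, occ_a, occ_b):
            hits += 1
    return hits / samples

# Occluder covering half of the rays: the receiver is in the penumbra.
visible_fraction((0, 0), (-1, 10), (1, 10), (-5, 5), (0, 5))   # ~0.5
```

A fraction of 1 means the point is fully lit, 0 means it lies in the umbra; real algorithms avoid this brute-force sampling, which is exactly what makes the problem hard.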
(Figure 4: diagrams comparing a hard shadow cast by a point light — umbra only — with the umbra and penumbra regions cast on a receiver by an extended light source.)
2.4. Important issues in computing soft shadows

2.4.1. Composition of multiple shadows

While the creation of a shadow is easily described for a (light source, occluder, receiver) triple, care must be taken to allow for more complex situations.

Shadows from several light sources: Shadows produced by multiple light sources are relatively easy to obtain if we know how to deal with a single source (see Figure 5). Due to the linear nature of light transfer we simply sum the contribution of each light (for each wavelength or color band).

Shadows from several objects: For point light sources, shadows due to different occluders can be easily combined, since the shadow area (where the light source is invisible) is the union of all individual shadows.

With an area light source, combining the shadows of several occluders is more complicated. Recall that the lighting contribution of the light source on the receiver involves a partial visibility function: a major issue is that no simple combination of the partial visibility functions of distinct occluders can yield the partial visibility function of the set of occluders considered together. For instance there may be points in the scene where the light source is not occluded by any object taken separately, but is totally occluded by the set of objects taken together. The correlation between the partial visibility functions of different occluders cannot be predicted easily, but can sometimes be approximated or bounded45, 5.

As a consequence, the shadow of the union of the objects can be larger than the union of the shadows of the objects (see Figure 6). This effect is quite real, but is not very visible on typical scenes, especially if the objects casting shadows are animated.

2.4.2. Physically exact or fake shadows

Shadows from an extended light source: Soft shadows come from spatially extended light sources. To model properly the shadow cast by such light sources, we must take into account all the parts of the occluder that block light coming from the light source. This requires identifying all parts of the object casting shadow that are visible from at least one point of the extended light source, which is algorithmically much more complicated than identifying parts of the occluder that are visible from a single point.
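The linearity argument for multiple light sources can be sketched as follows (hypothetical names; `visibility` stands for whatever per-light shadow computation is used):

```python
# With linear light transfer, shadows from several sources compose by
# simply summing each source's visibility-weighted contribution.

def shade(point, lights, visibility):
    """visibility(point, light) in [0, 1]: fraction of `light` visible from `point`."""
    return sum(light["intensity"] * visibility(point, light) for light in lights)

lights = [{"intensity": 0.6}, {"intensity": 0.4}]
# First light fully blocked at this point, second one fully visible:
vis = lambda point, light: 0.0 if light is lights[0] else 1.0
shade((0, 0, 0), lights, vis)   # only the second light contributes: 0.4
```

Note that no such simple composition exists across *occluders* of one area light, which is exactly the difficulty discussed above.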
Figure 5: Complex shadow due to multiple light sources. Note the complex interplay of colored lights and shadows in the
complementary colors.
Figure 7: When the light source is significantly larger than the occluder, the shape of the shadow is very different from the
shape computed using a single sample; the sides of the object are playing a part in the shadowing.
Because this visibility information is much more difficult to compute with extended light sources than with point light sources, most real-time soft shadow algorithms compute visibility information from just one point (usually the center of the light source) and then simulate the behavior of the extended light source using this visibility information (computed for a point).

This method produces shadows that are not physically exact, of course, but can be close enough to real shadows for most practical applications. The difference between the approximation and the real shadow is harder to notice if the objects and their shadow are animated — a common occurrence in real-time algorithms.

The difference becomes more noticeable if the difference between the actual extended light source and the point used for the approximation is large, as seen from the object casting shadow. A common example is a large light source, close enough to the object casting shadow that points of the light source are actually seeing different sides of the object (see Figure 7). In that case, the physically exact shadow is very different from the approximated version.

While large light sources are not frequent in real-time algorithms, the same problem also occurs if the object casting shadow is extended along the axis of the light source, e.g. a character with elongated arms whose right arm is pointing toward the light source, and whose left arm is close to the receiver.

In such a configuration, if we want to compute a better looking shadow, we can either:

• Use the complete extension of the light source for visibility computations. This is algorithmically too complicated to be used in real-time algorithms.
• Separate the light source into smaller light sources24, 5. This removes some of the artefacts, since each light source is treated separately, and is geometrically closer to the
• extend the umbra region outwards, by computing an outer penumbra region,
• shrink the umbra region, and complete it with an inner penumbra region,
• compute both inner penumbra and outer penumbra.

The first method (outer penumbra only) will always create shadows made of an umbra and a penumbra. Objects will have an umbra, even if the light source is very large with respect to the occluders. This effect is quite noticeable, as it makes the scene appear much darker than anticipated, except for very small light sources.

On the other hand, computing the inner penumbra region can result in light leaks between neighboring objects whose shadows overlap.

Our focus in this paper is on real-time applications, therefore we have chosen to ignore all techniques that are based on an expensive pre-process, even when they allow later modifications at interactive rates37. Given the fast evolution of graphics hardware, it is difficult to draw a hard distinction between real-time and interactive methods, and we consider here that frame rates in excess of 10 fps, for a significant number of polygons, are an absolute requirement for “real-time” applications. Note that stereo viewing usually requires double this performance.

For real-time applications, the display refresh rate is often the crucial limiting factor, and must be kept high enough (if not constant) through time. An important feature to be considered in shadowing algorithms is therefore their ability to guarantee a sustained level of performance. This is of course
Shadowing algorithms may place particular constraints on the scene. Examples include the type of object model (techniques that compute a shadow as a texture map typically require a parametric object, if not a polygon), or the necessity/possibility to identify a subset of the scene as occluders or shadow receivers. This latter property is important in adapting the performance of the algorithm to sustain real-time.

2.5. Basic techniques for real-time shadows

In this State of the Art Review, we focus solely on real-time soft shadow algorithms. As a consequence, we will not describe other methods for producing soft shadows, such as radiosity, ray-tracing, Monte-Carlo ray-tracing or photon mapping.

We now describe the two basic techniques for computing shadows from point light sources, namely shadow mapping and the shadow volume algorithm.

2.5.1. Shadow mapping

Method: The basic operation for computing shadows is identifying the parts of the scene that are hidden from the light source. Intrinsically, it is equivalent to visible surface determination, from the point-of-view of the light source.

The first method to compute shadows17, 44, 50 starts by computing a view of the scene, from the point-of-view of the light source. We store the z values of this image. This Z-buffer is the shadow map (see Figure 8).

The shadow map is then used to render the scene (from the normal point-of-view) in a two-pass rendering process:

• a standard Z-buffer technique, for hidden-surface removal;
• for each pixel of the scene, we now have the geometrical position of the object seen in this pixel. If the distance between this object and the light is greater than the distance stored in the shadow map, the object is in shadow. Otherwise, it is illuminated.

Shadow mapping is implemented in current graphics hardware. It uses an OpenGL extension for the comparison between Z values, GL_ARB_SHADOW (this extension, or the earlier version GL_SGIX_SHADOW, is available on Silicon Graphics hardware above Infinite Reality 2, on NVidia graphics cards after GeForce3 and on ATI graphics cards after Radeon9500).

Improvements: The depth buffer is sampled at a limited precision. If surfaces are too close to each other, sampling problems can occur, with surfaces shadowing themselves. A possible solution42 is to offset the Z values in the shadow map by a small bias51.

If the light source has a cut-off angle that is too large, it is not possible to project the scene in a single shadow map without excessive distortion. In that case, we have to replace the light source by a combination of light sources, and use several depth maps, thus slowing down the algorithm.

Shadow mapping can result in large aliasing problems if the light source is far away from the viewer. In that case, individual pixels from the shadow map are visible, resulting in a staircase effect along the shadow boundary. Several methods have been implemented to solve this problem:

• Storing the ID of objects in the shadow map along with their depth26.
• Using deep shadow maps, storing coverage information for all depths for each pixel36.
• Using multi-resolution, adaptive shadow maps18, computing more details in regions with shadow boundaries that are close to the eye.
• Computing the shadow map in perspective space46, effectively storing more details in parts of the shadow map that are closer to the eye.

The last two methods are directly compatible with existing OpenGL extensions, and therefore require only a small amount of coding to work with modern graphics hardware.

An interesting alternative version of this algorithm is to
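The two-pass shadow-map test described above can be sketched on the CPU (a minimal illustration with an orthographic light; the names and the toy point-splatting are ours, not from the papers):

```python
# Pass 1: per shadow-map cell, store the depth of the surface closest
# to the light (points are assumed to be already in light space, with
# z = distance to the light).
def build_shadow_map(points, resolution):
    INF = float("inf")
    zbuf = [[INF] * resolution for _ in range(resolution)]
    for x, y, z in points:
        zbuf[int(y)][int(x)] = min(zbuf[int(y)][int(x)], z)
    return zbuf

# Pass 2: a point is in shadow if something closer to the light was stored
# in its cell; the small bias avoids surfaces shadowing themselves.
def in_shadow(point, shadow_map, bias=1e-3):
    x, y, z = point
    return z > shadow_map[int(y)][int(x)] + bias

zbuf = build_shadow_map([(2, 2, 1.0), (2, 2, 5.0)], resolution=4)
in_shadow((2, 2, 5.0), zbuf)   # receiver behind the occluder: shadowed
in_shadow((2, 2, 1.0), zbuf)   # the occluder itself: lit, thanks to the bias
```

On the GPU the comparison of pass 2 is exactly what the GL_ARB_SHADOW extension performs per fragment.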
Discussion: Shadow mapping has many advantages:

• it can be implemented entirely using graphics hardware;
• creating the shadow map is relatively fast, although it still depends on the number and complexity of the occluders;
• it handles self-shadowing.

It also has several drawbacks:

• it is subject to many sampling and aliasing problems;
• it cannot handle omni-directional light sources;
• at least two rendering passes are required (one from the light source and one from the viewpoint).

2.5.2. The Shadow Volume Algorithm

Another way to think about shadow generation is purely geometrical. This method was first described by Crow12, and first implemented using graphics hardware by Heidmann23.

Method: The algorithm consists in finding the silhouette of the occluders as seen from the light source, then extruding this silhouette along the light direction, thus forming a shadow volume. Objects that are inside the shadow volume are in shadow, and objects that are outside are illuminated.

The shadow volume is calculated in two steps:

• the first step consists in finding the silhouette of the occluder as viewed from the light source. The simplest method is to keep edges that are shared by a triangle facing the light and another facing in the opposite direction. This actually gives a superset of the true silhouette, but it is sufficient for the algorithm.

Therefore the complete algorithm to obtain a picture using the shadow volume method is:

• render the scene with only ambient/emissive lighting;
• calculate and render shadow volumes in the stencil buffer;
• render the scene illuminated, with the stencil test enabled: only pixels whose stencil value is 0 are rendered; the others are not updated, keeping their ambient color.

Improvements: The cost of the algorithm is directly linked to the number of edges in the shadow volume. Batagelo and Júnior7 minimize the number of volumes rendered by precalculating in software a modified BSP tree. McCool39 extracts the silhouette by first computing a shadow map, then extracting the discontinuities of the shadow map, but this method requires reading back the depth buffer from the graphics board to the CPU, which is costly. Brabec and Seidel10 report a method to compute the silhouette of the occluders using programmable graphics hardware14, thus obtaining an almost completely hardware-based implementation of the shadow volume algorithm (they still have to read back a buffer to the CPU for parameter transfer).

Roettger et al.43 suggest an implementation that doesn’t require the stencil buffer; they draw the shadow volume into the alpha buffer, replacing increment/decrement with a multiply/divide-by-2 operation.

Everitt and Kilgard15 have described a robust implementation of the shadow volume algorithm. Their method includes capping the shadow volume, setting w = 0 for extruded vertices (effectively making infinitely long quads) and setting the far plane at an infinite distance (they prove that this step
only decreases Z-buffer precision by a few percent). Finally, they render the shadow volume using the zfail technique; it works by rendering the shadow volume backwards:

• we render the scene, storing the Z-buffer;
• in the first pass, we increment the stencil buffer for all back-facing faces, but only if the face is behind an existing object of the scene;
• in the second pass, we decrement the stencil buffer for all front-facing faces, but only if the face is behind an existing object;
• the stencil buffer contains the intersection of the shadow volume and the objects of the scene.

The zfail technique was discovered independently by Bilodeau and Songy and by Carmack.

Recent extensions to OpenGL15, 16, 21 allow the use of shadow volumes using the stencil buffer in a single pass, instead of the two passes required so far. They also15 provide depth-clamping, a method in which polygons are not clipped at the near and far distance, but their vertices are projected onto the near and far plane. This provides in effect an infinite view pyramid, making the shadow volume algorithm more robust.

The main problem with the shadow volume algorithm is that it requires drawing large polygons, the faces of the shadow volume. The fillrate of the graphics card is often the bottleneck. Everitt and Kilgard15, 16 list different solutions to reduce the fillrate, either using software methods or using the graphics hardware, such as scissoring, constraining the shadow volume to a particular fragment.

Discussion: The shadow volume algorithm has many advantages:

• it works for omnidirectional light sources;
• it renders eye-view pixel-precision shadows;
• it handles self-shadowing.

It also has several drawbacks:

• the computation time depends on the complexity of the occluders;
• it requires the computation of the silhouette of the occluders as a preliminary step;
• at least two rendering passes are required;
• rendering the shadow volume consumes fillrate of the graphics card.

3. Soft shadow algorithms

In this section, we review algorithms that produce soft shadows, either interactively or in real time. As in the previous section, we distinguish two types of algorithms:

• Algorithms that are based on an image-based approach, and build upon the shadow map method described in Section 2.5.1. These algorithms are described in Section 3.1.
• Algorithms that are based on an object-based approach, and build upon the shadow volume method described in Section 2.5.2. These algorithms are described in Section 3.2.

3.1. Image-Based Approaches

In this section, we present soft shadow algorithms based on shadow maps (see Section 2.5.1). There are several methods to compute soft shadows using image-based techniques:

1. Combining several shadow textures taken from point samples on the extended light source25, 22.
2. Using layered attenuation maps1, replacing the shadow map with a Layered Depth Image, storing depth information about all objects visible from at least one point of the light source.
3. Using several shadow maps24, 54, taken from point samples on the light source, and an algorithm to compute the percentage of the light source that is visible.
4. Using a standard shadow map, combined with image analysis techniques to compute soft shadows9.
5. Convolving a standard shadow map with an image of the light source45.

The first two methods approximate the light source as a combination of several point samples. As a consequence, the time for computing the shadow textures is multiplied by the number of samples, resulting in significantly slower rendering. On the other hand, these methods actually compute more information than other soft shadow methods, and thus compute more physically accurate shadows. Most of the artefacts listed in Section 2.4.2 will not appear with these two methods.

3.1.1. Combination of several point-based shadow images25, 22

The simplest method22, 25 to compute soft shadows using image-based methods is to place sample points regularly on the extended light source. These sample points are used to compute binary occlusion maps, which are combined into an attenuation map, used to modulate the illumination (calculated separately).

Method: Herf25 makes the following assumptions on the geometry of the scene:

• a light source of uniform color,
• subtending a small solid angle with respect to the receiver,
• and with distance from the receiver having small variance.

With these three assumptions, contributions from all sample points placed on the light source will be roughly equal.

The user identifies in advance the objects casting shadows, and the objects onto which we are casting shadow. For each object receiving shadow, we are going to compute a texture containing the soft shadow.
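The combination step just described can be sketched as follows (a toy CPU version under our own conventions: each binary occlusion map holds 1 where its light sample is visible from the texel):

```python
# Average N binary occlusion maps (one per sample point on the light)
# into an attenuation map in [0, 1] that modulates the illumination.
def attenuation_map(occlusion_maps):
    n = len(occlusion_maps)
    height, width = len(occlusion_maps[0]), len(occlusion_maps[0][0])
    return [[sum(m[y][x] for m in occlusion_maps) / n for x in range(width)]
            for y in range(height)]

# Three samples on the light, a 1x2 receiver texture: the left texel sees
# only one of the three samples, the right texel sees them all.
maps = [[[0, 1]], [[0, 1]], [[1, 1]]]
att = attenuation_map(maps)   # left texel keeps 1/3 of the light
```

The cost is linear in the number of samples, which is why these methods are slower but closer to a physically exact result.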
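Returning briefly to the zfail variant of the shadow volume algorithm (Section 2.5.2), the stencil counting can be sketched per eye ray (illustrative CPU code, not an actual stencil-buffer implementation):

```python
# zfail counting along one eye ray: only shadow-volume faces *behind* the
# visible surface are counted; back-facing faces increment the stencil,
# front-facing faces decrement it, and a non-zero count means "in shadow".
def zfail_in_shadow(surface_depth, volume_faces):
    """volume_faces: list of (depth, is_back_facing) crossings along the ray."""
    stencil = 0
    for depth, back_facing in volume_faces:
        if depth > surface_depth:          # the "zfail" test: behind the surface
            stencil += 1 if back_facing else -1
    return stencil != 0

# Eye ray crossing a shadow volume that spans depths [2, 6]:
faces = [(2.0, False), (6.0, True)]
zfail_in_shadow(4.0, faces)   # surface inside the volume: True
zfail_in_shadow(8.0, faces)   # surface beyond the volume: False
```

A surface in front of the volume gets +1 and −1 (both faces are behind it), hence a zero count, which is what makes the technique robust to near-plane clipping.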
and the light source:

    f = dist(Pixel_Occluder, Pixel_Receiver) / (R · S · z_Receiver · |z_Receiver − z_Occluder|)

where R and S are user-definable parameters. The intensity of the pixel is modulated using8:

• 0.5 · (1 + f), clamped to [0.5, 1] if the pixel is outside the shadow,
• 0.5 · (1 − f), clamped to [0, 0.5] if the pixel is inside the shadow.

For pixels that are far away from the boundary of the shadow, either deep inside the shadow or deep inside the fully lit area, f gets greater than 1, resulting in a modulation coefficient of respectively 0 or 1. On the original shadow boundary, f = 0, and the two curves meet each other continuously with a modulation coefficient of 0.5. The actual width of the penumbra region depends on the ratio of the distances to the light source of the occluder and the receiver, which is perceptually correct.

The slowest phase of this algorithm is the search of neighbouring pixels in the shadow map, to find the potential occluder. In theory, an object can cast a penumbra that spans the entire scene, if it is close enough to the light source. In practice, we limit the search to a maximal distance to the current pixel of R_max = R · z_Receiver.

To ensure that an object is correctly identified as being in shadow or illuminated, the information from the depth map is combined with an item buffer, following Hourcade and Nicolas26.

Discussion: The aim of this algorithm is to produce perceptually pleasing, rather than physically exact, soft shadows. The width of the penumbra region depends on the ratio of the respective distances to the light source of the occluder and the receiver. The penumbra region is larger if the occluder is far from the receiver, and smaller if the occluder is close to the receiver.

Of course, the algorithm suffers from several shortcomings. Since the shadow is only determined by a single sample shadow map, it can fail to identify the proper shadowing edge. It works better if the light source is far away from the occluder. The middle of the penumbra region is placed on the boundary of the shadow from the single sample, which is not physically correct.

The strongest point of this algorithm is its speed. Since it only needs to compute a single shadow map, it can achieve framerates of 5 to 20 frames per second, compared with 2 to 3 frames per second for multi-sample image-based methods. The key parameter in this algorithm is R, the search radius. For smaller values of R, the algorithm works faster, but can miss large penumbras. For larger values of R, the algorithm can identify larger penumbras, but takes longer for each rendering.

A faster version of this algorithm, by Kirsch and Doellner33, computes both the shadow map and a shadow-width map: for each point in shadow, we precompute the distance to the nearest point that is illuminated. For each pixel, we do a look-up in the shadow map and the shadow-width map. If the point is occluded, we have the depth of the current point (z), the depth of the occluder (z_occluder) and the shadow width (w). A 2D function gives us the modulation coefficient:

    I(z, w) = 1                                             if z = z_occluder
    I(z, w) = 1 + c_bias − c_scale · w / (z_occluder − z)   otherwise

The shadow-width map is generated from a binary occlusion map, transformed into the width map by repeated applications of a smoothing filter. This repeated filtering is done using graphics hardware, during rendering. Performance depends mostly on the size of the occlusion map and on the size of the filter; for a shadow map resolution of 512 × 512 pixels, and a large filter, they attain 20 frames per second. Performance depends linearly on the number of pixels in the occlusion map, thus doubling the size of the occlusion map divides the rendering speed by 4.

3.1.5. Convolution technique45

As noted earlier, soft shadows are a consequence of partial visibility of an extended light source. Therefore the calculation of soft shadows is closely related to the calculation of the visible portion of the light source.

Soler and Sillion45 observe that the percentage of the source area visible from a receiving point can be expressed as a simple convolution for a particular configuration. When the light source, occluder, and receiver all lie in parallel planes, the soft shadow image on the receiver is obtained by convolving an image of the occluder and an image of the light source. While this observation is only mathematically valid in this very restrictive configuration, the authors describe how the same principle can be applied to more general configurations:

First, appropriate imaging geometries are found, even when the objects are non-planar and/or not parallel. More importantly, the authors also describe an error-driven algorithm in which the set of occluders is recursively subdivided according to an appropriate error estimate, and the shadows created by the subsets of occluders are combined to yield the final soft shadow image.

Discussion: The convolution technique’s main advantages are the visual quality of the soft shadows (not their physical fidelity), and the fact that it operates from images of the source and occluders, therefore once the images are obtained the complexity of the operations is entirely under control. Sampling is implicitly performed when creating a light source image, and the combination of samples is handled
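The clamped modulation used by the single-sample method above can be sketched as (function and parameter names are ours):

```python
# Map the penumbra factor f to a modulation coefficient in [0, 1]:
# exactly 0.5 on the hard-shadow boundary (f = 0), saturating to 0 deep
# inside the shadow and to 1 deep inside the lit region (f > 1 there).
def modulation(f, inside_shadow):
    if inside_shadow:
        return max(0.0, min(0.5, 0.5 * (1.0 - f)))
    return max(0.5, min(1.0, 0.5 * (1.0 + f)))

modulation(0.0, True)    # 0.5 on the boundary
modulation(2.0, True)    # 0.0 deep inside the shadow
modulation(2.0, False)   # 1.0 deep inside the lit area
```

The two branches meet continuously at f = 0, which is what removes the hard shadow edge.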
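The convolution observation can be illustrated in one dimension (a sketch under the parallel-planes assumption; this direct convolution is ours for clarity, while the real method operates on 2D images):

```python
# 1D soft shadow as a convolution: a binary occluder mask (1 = blocked)
# convolved with the normalized light-source image gives the fraction of
# the source blocked at each receiver texel, i.e. umbra plus penumbra.
def convolve(signal, kernel):
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j in range(len(kernel)):
            if 0 <= i - j < len(signal):
                out[i] += signal[i - j] * kernel[j]
    return out

occluder = [0, 0, 1, 1, 1, 0, 0]        # blocker mask on the occluder plane
light = [1/3, 1/3, 1/3]                 # normalized 1D image of the source
occlusion = convolve(occluder, light)   # ramps 0 -> 1 -> 0: soft edges
```

A point light collapses the kernel to a single spike and the ramp disappears, recovering a hard shadow.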
Figure 18: Computing the area of the light source that is covered by a given edge. The fragment program computes the hatched area for each pixel inside the corresponding wedge.

seen from the area light source, is very different from the silhouette computed using the single sample. Such scenes include scenes where a large area light source is close to the object (see Figure 7), and scenes where the shadows of several objects are combined together (as in Figure 6). In those circumstances, it is possible to compute a more accurate shadow by splitting the light source into smaller light sources. The authors report that splitting large light sources into 2 × 2 or 3 × 3 smaller light sources is usually enough to remove visible artefacts. It should be noted that splitting the light source into n light sources does not cut the speed of the algorithm by n, since the rendering time depends on the number of pixels covered by the penumbra wedges, and smaller light sources have smaller penumbra wedges.
• Another form of user control is to add more samples on the light source22, 25, 1, or to subdivide large light sources into a set of smaller ones2, 4, 5, 24, 54. It should be noted that the order of magnitude for this parameter is variable: 256 to 1024 samples are required for point-based methods22, 25, 1 to produce shadows without artefacts, while area-based methods2, 4, 5, 24, 54 just need to cut the light source into 2 × 2 or 3 × 3 smaller sources. Either way, the rendering time is usually multiplied by the number of samples or sources.
• All image-based methods are also tuneable by changing the resolution of the buffer.
• Other parameters are method-specific:
– the single sample soft shadows9 method is tuneable by changing the search radius;
– Convolution45 is tuneable by subdividing the occluders into several layers;
– Plateaus19 are tuneable by changing the number of vertices used to discretize the cones and patches;
– Smoothies11 are tuneable by changing the maximum width of the smoothies;

4.2. Controlling the aspect

Another important consideration in choosing a real-time soft shadow algorithm is the aspect of the shadow it produces. Some of the algorithms described in this review can produce a physically exact solution if we allow them a sufficient rendering time. Other methods produce a physically exact solution in simple cases, but are approximate in more complex scenes, and finally a third class of methods produce shadows that are always approximate, but are usually faster to compute.

Physically exact (time permitting): Methods based on point samples on the light source22, 25, 1 will produce physically exact shadows if the number of samples is sufficient. However, with current hardware, the number of samples compatible with interactive applications gives shadows that are not visually excellent (hence the poor mark these methods receive in Table 1).

Physically exact on simple scenes: Methods that compute the percentage of the light source that is visible from the current pixel will give physically exact shadows in places where the assumptions they make on the respective geometry of the light source and the occluders are verified. For example, soft shadow volumes4, 5 give physically exact shadows for isolated convex objects, provided that the silhouette computed is correct (that is, the occluder is far away from the light source). The visibility channel24, 54 gives physically exact shadows for convex occluders and linear light sources24, and for isolated edges and polygonal light sources54. Convolution45 is physically exact for planar and parallel light source, receiver and occluder.

Always approximate: All methods that restrict themselves to computing only the inner- or the outer-penumbra are intrinsically always approximate. They include single-sample soft shadows using a shadow-width map33, plateaus19 and smoothies11. The original implementation of single sample soft shadows9 computes both the inner- and the outer-penumbra, but always gives them the same width, which is not physically exact.

The second class of methods is probably the most interesting for producing nice-looking pictures. While the conditions imposed seem excessively hard, it must be pointed out that they are conditions for which it is guaranteed that the shadow is exact in all the points of the scene. In most places of a standard scene, these methods will also produce physically exact shadows.
4.3. Number and shape of the light sources

The first cause of soft shadows is the light source. Each real-time soft shadow method makes an assumption on the light sources: their shapes, their angles of emission and, more importantly, their number.

Field of emission: All the methods that are based on an image of the scene computed from the light source are restricted with respect to the field of emission of the light source, as a field of emission that is too large will result in distortions in the image. This restriction applies to all image-based algorithms, plus smoothies11 and volume-based algorithms if the silhouette is computed using discontinuities in the shadow map39.

On the contrary, volume-based methods can handle omnidirectional illumination.

Receiver: The strongest restriction is when the object receiving shadows is a plane, as with the plateaus method19. Multi-sample soft shadows25, 22 are also restricted to a small number of receivers for interactive rendering. In that case, self-shadowing is not applicable.

Self-shadowing: The convolution45 method requires that the scene is cut into clusters, within which no self-shadows are computed.

Silhouette: For all the methods that require a silhouette extraction — such as object-based methods — it is implicitly assumed that we can compute a silhouette for all the objects in the scene. In practice, this usually means that the scene is made of closed triangle meshes.
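The silhouette extraction assumed above (the first step of the shadow volume algorithm, Section 2.5.2) can be sketched for a closed triangle mesh as follows (illustrative code: edges shared by a light-facing and a back-facing triangle are kept):

```python
# Silhouette of a closed triangle mesh as seen from a point light:
# keep every edge shared by a triangle facing the light and one facing away.
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def cross(a, b): return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2],
                         a[0]*b[1] - a[1]*b[0])
def dot(a, b): return sum(x*y for x, y in zip(a, b))

def faces_light(tri, light):
    n = cross(sub(tri[1], tri[0]), sub(tri[2], tri[0]))
    return dot(n, sub(light, tri[0])) > 0

def silhouette_edges(triangles, light):
    facing = {}
    for tri in triangles:
        f = faces_light(tri, light)
        for i in range(3):
            edge = frozenset((tri[i], tri[(i + 1) % 3]))
            facing.setdefault(edge, []).append(f)
    # an edge is (possibly) on the silhouette when its two triangles disagree
    return [e for e, fs in facing.items() if len(fs) == 2 and fs[0] != fs[1]]

front = ((0, 0, 0), (1, 0, 0), (0, 1, 0))          # faces a light at z = +10
back = ((0, 0, 0), (1, 0, 0), (0.5, -1.0, 5.0))    # faces away from it
silhouette_edges([front, back], (0, 0, 10))        # their shared edge
```

As noted in Section 2.5.2, this yields a superset of the true silhouette, which is sufficient for extruding the shadow volume.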
We have seen that the latest algorithms benefit from the programmability of recent graphics hardware. Two main directions appear attractive to render high-quality soft shadows in real time: programming graphics hardware, and taking advantage simultaneously of both image-based and object-based techniques. Distributed rendering, using for instance PC clusters, is another promising avenue, although little has been achieved so far. Interactive display speeds can be obtained today even on rather complex scenes. Continuing improvements of graphics technology — in performance and programmability — let us expect that soft shadows will soon become a common standard in real-time rendering.

Acknowledgments

The “Hugo” robot used in the pictures of this paper was created by Laurence Boissieux.

This work was supported in part by the “ACI Jeunes Chercheurs” CYBER of the French Ministry of Research, and by the “Région Rhône-Alpes” through the DEREVE research consortium.

We wish to express our gratitude to the authors of the algorithms described in this review, who have provided us with useful detailed information about their work, and to the anonymous reviewers whose comments and suggestions have significantly improved the paper.

Remark: All the smooth shadow pictures in this paper were computed with distributed ray-tracing, using 1024 samples on the area light sources.

6. ATI. Smartshader™ technology white paper. http://www.ati.com/products/pdf/smartshader.pdf, 2001. 14, 15

7. Harlen Costa Batagelo and Ilaim Costa Júnior. Real-time shadow generation using BSP trees and stencil buffers. In SIBGRAPI, volume 12, pages 93–102, October 1999. 8

8. Stefan Brabec. Personal communication, May 2003. 13

9. Stefan Brabec and Hans-Peter Seidel. Single sample soft shadows using depth maps. In Graphics Interface, 2002. 9, 12, 17, 18

10. Stefan Brabec and Hans-Peter Seidel. Shadow volumes on programmable graphics hardware. Computer Graphics Forum (Eurographics 2003), 25(3), September 2003. 8, 18

11. Eric Chan and Fredo Durand. Rendering fake soft shadows with smoothies. In Rendering Techniques 2003 (14th Eurographics Symposium on Rendering). ACM Press, 2003. 14, 15, 17, 18

12. Franklin C. Crow. Shadow algorithms for computer graphics. Computer Graphics (SIGGRAPH 1977), 11(3):242–248, 1977. 8

13. George Drettakis and Eugene Fiume. A fast shadow algorithm for area light sources using backprojection. In Computer Graphics (SIGGRAPH 1994), Annual Conference Series, pages 223–230. ACM SIGGRAPH, 1994. 3, 6
17. Cass Everitt, Ashu Rege, and Cem Cebenoyan. Hardware shadow mapping. http://developer.nvidia.com/object/hwshadowmap_paper.html.
18. Randima Fernando, Sebastian Fernandez, Kavita Bala, and Donald P. Greenberg. Adaptive shadow maps. In Computer Graphics (SIGGRAPH 2001), Annual Conference Series, pages 387–390. ACM SIGGRAPH, 2001.
19. Eric Haines. Soft planar shadows using plateaus. Journal of Graphics Tools, 6(1):19–27, 2001.
20. Evan Hart. ARB Fragment Program: Fragment level programmability in OpenGL. http://www.ati.com/developer/gdc/GDC2003_OGL_ARBFragmentProgram.pdf, 2003.
21. Evan Hart. Other New OpenGL Stuff: Important stuff that doesn't fit elsewhere. http://www.ati.com/developer/gdc/GDC2003_OGL_MiscExtensions.pdf, 2003.
22. Paul S. Heckbert and Michael Herf. Simulating soft shadows with graphics hardware. Technical Report CMU-CS-97-104, Carnegie Mellon University, January 1997.
23. Tim Heidmann. Real shadows, real time. In Iris Universe, volume 18, pages 23–31. Silicon Graphics Inc., 1991.
24. Wolfgang Heidrich, Stefan Brabec, and Hans-Peter Seidel. Soft shadow maps for linear lights. In Rendering Techniques 2000 (11th Eurographics Workshop on Rendering), pages 269–280. Springer-Verlag, 2000.
25. Michael Herf. Efficient generation of soft shadow textures. Technical Report CMU-CS-97-138, Carnegie Mellon University, 1997.
26. J.-C. Hourcade and A. Nicolas. Algorithms for antialiased cast shadows. Computers & Graphics, 9(3):259–265, 1985.
27. Geoffrey S. Hubona, Philip N. Wheeler, Gregory W. Shirah, and Matthew Brandt. The role of object shadows in promoting 3D visualization. ACM Transactions on Computer-Human Interaction, 6(3):214–242, 1999.
28. M. Isard, M. Shand, and A. Heirich. Distributed rendering of interactive soft shadows. In 4th Eurographics Workshop on Parallel Graphics and Visualization, pages 71–76. Eurographics Association, 2002.
29. Brett Keating and Nelson Max. Shadow penumbras for complex objects by depth-dependent filtering of multi-layer depth images. In Rendering Techniques 1999 (10th Eurographics Workshop on Rendering), pages 205–220. Springer-Verlag, 1999.
30. Daniel Kersten, Pascal Mamassian, and David C. Knill. Moving cast shadows and the perception of relative depth. Technical Report 6, Max-Planck-Institut für biologische Kybernetik, 1994.
31. Daniel Kersten, Pascal Mamassian, and David C. Knill. Moving cast shadows and the perception of relative depth. Perception, 26(2):171–192, 1997.
32. Mark J. Kilgard. Improving shadows and reflections via the stencil buffer. http://developer.nvidia.com/docs/IO/1348/ATT/stencil.pdf, 1999.
33. Florian Kirsch and Juergen Doellner. Real-time soft shadows using a single light sample. Journal of WSCG (Winter School on Computer Graphics 2003), 11(1), 2003.
34. David C. Knill, Pascal Mamassian, and Daniel Kersten. Geometry of shadows. Journal of the Optical Society of America, 14(12):3216–3232, 1997.
35. Johann Heinrich Lambert. Die freye Perspektive. 1759.
36. Tom Lokovic and Eric Veach. Deep shadow maps. In Computer Graphics (SIGGRAPH 2000), Annual Conference Series, pages 385–392. ACM SIGGRAPH, 2000.
37. Céline Loscos and George Drettakis. Interactive high-quality soft shadows in scenes with moving objects. Computer Graphics Forum (Eurographics 1997), 16(3), September 1997.
38. Pascal Mamassian, David C. Knill, and Daniel Kersten. The perception of cast shadows. Trends in Cognitive Sciences, 2(8):288–295, 1998.
39. Michael D. McCool. Shadow volume reconstruction from depth maps. ACM Transactions on Graphics, 19(1):1–26, 2000.
40. Steve Morein. ATI Radeon HyperZ technology. In Graphics Hardware Workshop, 2000.
41. Steven Parker, Peter Shirley, and Brian Smits. Single sample soft shadows. Technical Report UUCS-98-019, Computer Science Department, University of Utah, October 1998.
42. William T. Reeves, David H. Salesin, and Robert L. Cook. Rendering antialiased shadows with depth maps. Computer Graphics (SIGGRAPH 1987), 21(4):283–291, 1987.
43. Stefan Roettger, Alexander Irion, and Thomas Ertl. Shadow volumes revisited. In Winter School on Computer Graphics, 2002.
44. Mark Segal, Carl Korobkin, Rolf van Widenfelt, Jim Foran, and Paul Haeberli. Fast shadows and lighting effects using texture mapping. Computer Graphics (SIGGRAPH 1992), 26(2):249–252, July 1992.
CHAPTER 4. USING PROGRAMMABLE GRAPHICS CARDS
4.7.3 Soft shadow maps: efficient sampling of light source visibility (CGF 2006)
Authors: Lionel Atty, Nicolas Holzschuch, Marc Lapierre, Jean-Marc Hasenfratz, Charles Hansen and François X. Sillion.
Journal: Computer Graphics Forum, vol. 25, no. 4.
Date: December 2006
Lionel Atty1 , Nicolas Holzschuch1 , Marc Lapierre2 , Jean-Marc Hasenfratz1,3 , Charles Hansen4 and François X. Sillion1
Figure 1: Our algorithm computes soft shadows in real-time (left) by replacing the occluders with a discretized version (right), using information from the shadow map. This scene runs at 84 fps.
Abstract
Shadows, particularly soft shadows, play an important role in the visual perception of a scene by providing visual
cues about the shape and position of objects. Several recent algorithms produce soft shadows at interactive rates,
but they do not scale well with the number of polygons in the scene or only compute the outer penumbra. In
this paper, we present a new algorithm for computing interactive soft shadows on the GPU. Our new approach
provides both inner- and outer-penumbra, and has a very small computational cost, giving interactive frame-rates
for models with hundreds of thousands of polygons.
Our technique is based on a sampled image of the occluders, as in shadow map techniques. These shadow samples
are used in a novel manner, computing their effect on a second projective shadow texture using fragment programs.
In essence, the fraction of the light source area hidden by each sample is accumulated at each texel position of
this Soft Shadow Map. We include an extensive study of the approximations caused by our algorithm, as well as
its computational costs.
Categories and Subject Descriptors (according to ACM CCS): I.3.1 [Computer Graphics]: Graphics processors I.3.7
[Computer Graphics]: Color, shading, shadowing, and texture
L. Atty et al. / Soft Shadow Maps
[…] method interactively computes a projective shadow texture, the Soft Shadow Map, that incorporates soft shadows based on light source visibility from receiver objects (see Fig. 2). This texture is then projected onto the scene to provide interactive soft shadows of dynamic objects and dynamic area light sources.

Figure 2: Applying our algorithm (200,000 polygons, occluder map 256 × 256, displayed at 32 fps).

There are several advantages to our technique when compared to existing interactive soft-shadow algorithms: First, it is not necessary to compute silhouette edges. Second, the algorithm is not fill-bound, unlike methods based on shadow volumes. These properties provide better scaling for occluding geometry than other GPU based soft shadow techniques [WH03, CD03, AAM03]. Third, unlike some other shadow map based soft shadow techniques, our algorithm does not dramatically overestimate the umbra region [WH03, CD03]. Fourth, while other methods have relied on an interpolation from the umbra to the non-shadowed region to approximate the penumbra for soft shadows [AHT04, WH03, CD03, BS02], our method computes the visibility of an area light source for receivers in the penumbra regions.

Our algorithm also has some limitations when compared to existing algorithms. First, our algorithm splits scene geometry into occluders and receivers and self shadowing is not accounted for. Also, since our algorithm uses shadow maps to approximate occluder geometry, it inherits the well known issues with aliasing from shadow map techniques. For large area light sources, the soft shadows tend to blur the artifacts but for smaller area light sources, such aliasing is apparent.

We acknowledge that these limitations are important, and they may prevent the use of our algorithm in some cases. However, there are many applications such as video games or immersive environments where the advantages of our algorithm (a very fast framerate, and a convincing soft shadow) outweigh its limitations. We also think that this new algorithm could be the start of promising new research.

In the following section, we review previous work on interactive computation of soft shadows. In Section 3, we present the basis of our algorithm, and in the following section, we provide implementation details. In the next two sec- […]

2. Previous Work

Researchers have investigated shadow algorithms for computer-generated images for nearly three decades. The reader is referred to a recent state-of-the-art report by Hasenfratz et al. [HLHS03], the overview by Woo et al. [WPF90] and the book by Akenine-Möller and Haines [AMH02].

The two most common methods for interactively producing shadows are shadow maps [Wil78] and shadow volumes [Cro77]. Both of these techniques have been extended for soft shadows. In the case of shadow volumes, Assarsson and Akenine-Möller [AAM03] used penumbra wedges in a technique based on shadow volumes to produce soft shadows. Their method depends on locating silhouette edges to form the penumbra wedges. While providing good soft shadows without an overestimate of the umbra, the algorithm is fill-limited, particularly when zoomed in on a soft shadow region. Since it is necessary to compute the silhouette edges at every frame, the algorithm also suffers from scalability issues when rendering occluders with large numbers of polygons.

The fill-rate limitation is a well known limitation of shadow-volume based algorithms. Recent publications [CD04, LWGM04] have focused on limiting the fill-rate for shadow-volume algorithms, thus removing this limitation.

On shadow maps, Chan and Durand [CD03] and Wyman and Hansen [WH03] both employed a technique which uses the standard shadow map method for the umbra region and builds a map containing an approximate penumbra region that can be used at run-time to give the appearance, including hard shadows at contact, of soft shadows. While these methods provide interactive rendering, both only compute the outer-penumbra, the part of the penumbra that is outside the hard shadow. In effect, they are overestimating the umbra region, resulting in the incorrect appearance of soft shadows in the case of large area light sources. These methods also depend on computing the silhouette edges in object space for each frame; this requirement limits the scalability for occluders with large numbers of polygons.

Arvo et al. [AHT04] used an image-space flood-fill method to produce approximate soft shadows. Their algorithm is image-based, like ours, but works on a detection of shadow boundary pixels, followed by several passes to replace the boundary by a soft shadow, gradually extending the soft shadow at each pass. The main drawback of their method is that the number of passes required is proportional
to the extent of the penumbra region, and the rendering time is proportional to the number of shadow-filling passes.

Guennebaud et al. [GBP06] also used the back projection of each pixel in the shadow map to compute the soft shadow. Their method was developed independently of ours, yet is very similar. The main differences between the two methods lie in the order of the computations: we compute the soft shadow in shadow map space, while they compute the soft shadow in screen space, requiring a search in the shadow map.

Brabec and Seidel [BS02] and Kirsch and Doellner [KD03] use a shadow map to compute soft shadows, by searching at each pixel of the shadow map for the nearest boundary pixel, then interpolating between illumination and shadow as a function of the distance between this pixel and the boundary pixel and the distances between the light source, the occluder and the receiver. Their algorithm requires scanning the shadow map to look for boundary pixels, a potentially costly step; in practical implementations they limit the search radius, thus limiting the actual size of the penumbra region.

Soler and Sillion [SS98] compute a soft shadow map as the convolution of two images representing the source and blocker. Their technique is only accurate for planar and parallel objects, although it can be extended using an object hierarchy. Our technique can be seen as an extension of this approach, where the convolution is computed for each sample of an occlusion map, and the results are then combined.

Finally, McCool [McC00] presented an algorithm merging shadow volume and shadow map algorithms by detecting silhouette pixels in the shadow map and computing a shadow volume based on these pixels. Our algorithm is similar in that we are computing a shadow volume for each pixel in the shadow map. However, we never display this shadow volume, thus avoiding fill-rate issues.

3. Algorithm

3.1. Presentation of the algorithm

Our algorithm assumes a rectangular light source and starts by separating potential occluders (such as moving characters) from potential receivers (such as the background in a scene) (Fig. 3(a)). We will compute the soft shadows only from the occluders onto the receivers.

Our algorithm computes a Soft Shadow Map (SSM) for each light source: a texture containing the texelwise percentage of occlusion from the light source. This soft shadow map is then projected onto the scene from the position of the light source, to give soft shadows (see Fig. 2).

Our algorithm is an extension of the shadow map algorithm: we start by computing depth buffers of the scene. Unlike the standard shadow map method, we will need two depth buffers: one for the occluders (the occluder map) and the other for the receivers.

The occluder map depth buffer is used to discretize the set of occluders (see Fig. 3(b)): each pixel in this occluder map is converted into a micro-patch that covers the same image area but is located in a plane parallel to the light source, at a distance corresponding to the pixel depth. Pixels that are close to the light source are converted into small rectangles and pixels that are far from the light source are converted into larger rectangles. At the end of this step, we have a discrete representation of the occluders.

The receiver map depth buffer will be used to provide the receiver depth, as our algorithm uses the distance between light source and receiver to compute the soft shadow values.

We compute the soft shadow of each of the micro-patches constituting the discrete representation of the occluders (see Fig. 3(c)), and sum them into the soft shadow map (SSM) (see Fig. 3(d)). This step would be potentially costly, but we achieve it in a reasonable amount of time with two key points: 1) the micro-patches are parallel to the light source, so computing their penumbra extent and their percentage of occlusion only requires a small number of operations, and 2) these operations are computed on the graphics card, exploiting the parallelism of the GPU engine. The percentage of occlusion from each micro-patch takes into account the relative distances between the occluders, the receiver and the light source. Our algorithm introduces several approximations on the actual soft shadow. These approximations will be discussed in Section 5.

The pseudo-code for our algorithm is given in Fig. 4.

Figure 4: Our algorithm.

    Compute depth map of receivers
    Compute depth map of occluders
    for all pixels in occluder map
        Retrieve depth of occluder at this pixel
        Compute micro-patch associated with this pixel
        Compute extent of penumbra for this micro-patch
        for all pixels in penumbra extent for micro-patch
            Retrieve receiver depth at this pixel
            Compute percentage of occlusion for this pixel
            Add to current percentage in soft shadow map
        end
    end
    Project soft shadow map on the scene

In the following subsections, we will review in detail the individual steps of the algorithm: discretizing the occluders (Section 3.2), computing the penumbra extent for each micro-patch (Section 3.3) and computing the percentage of occlusion for each pixel in the Soft Shadow Map (Section 3.4). Specific implementation details will be given in Section 4.
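The per-pixel occlusion computation inside the pseudo-code of Fig. 4 can be sketched on the CPU (a simplified sketch with hypothetical names; the paper runs this per texel in a fragment program and accumulates the results with blending): project the micro-patch from the receiver point onto the light plane and clip against the light rectangle.

```python
def project_to_light(p_xy, z_p, r_xy, z_r):
    """Project point p (depth z_p) onto the light plane z = 0, as seen
    from the receiver point r (depth z_r, with z_r > z_p > 0)."""
    t = z_r / (z_r - z_p)          # parameter where the ray r -> p reaches z = 0
    return [rc + t * (pc - rc) for pc, rc in zip(p_xy, r_xy)]

def patch_occlusion(c_xy, half, z_p, r_xy, z_r, light_half):
    """Fraction of a square light (centered at the origin, half-size
    light_half) hidden by one axis-aligned micro-patch, seen from r."""
    lo = project_to_light([c_xy[0] - half, c_xy[1] - half], z_p, r_xy, z_r)
    hi = project_to_light([c_xy[0] + half, c_xy[1] + half], z_p, r_xy, z_r)
    area = 1.0
    for a, b in ((lo[0], hi[0]), (lo[1], hi[1])):
        overlap = min(b, light_half) - max(a, -light_half)  # 1D intersection
        area *= max(overlap, 0.0)
    return area / (2.0 * light_half) ** 2

# A micro-patch of half-size 0.5 at depth 1, seen from a receiver point
# straight below at depth 2, projects exactly onto a light of half-size 1.
print(patch_occlusion([0.0, 0.0], 0.5, 1.0, [0.0, 0.0], 2.0, 1.0))  # 1.0
```

Summing `patch_occlusion` over all micro-patches of the occluder map gives the accumulated percentage stored in the soft shadow map.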
Figure 3: (a) Scene view (b) Discretizing occluders (c) Soft shadows from one micro-patch (d) Summing the soft shadows.

Figure 5: The penumbra extent of a micro-patch is a rectangular pyramid.

Figure 6: Finding the apex of the pyramid is reduced to a 2D problem.
3.2. Discretizing the occluders

The first step in our algorithm is a discretization of the occluders. We compute a depth buffer of the occluders, as seen from the light source, then convert each pixel in this occluder map into the equivalent polygonal micro-patch that lies in a plane parallel to the light source, at the appropriate depth and occupies the same image plane extent (see Fig. 1).

The occluder map is axis-aligned with the rectangular light source and has the same aspect ratio: all micro-patches created in this step are also axis-aligned with the light source and have the same aspect ratio.

3.3. Computing penumbra extents

Each micro-patch in the discretized occluder is potentially blocking some light between the light source and some portion of the receiver. To reduce the amount of computations, we compute the penumbra extent of the micro-patches, and we only compute occlusion values inside these extents.

Since the micro-patches are parallel, axis-aligned with the light source and have the same aspect ratio, the penumbra extent of each micro-patch is a rectangular pyramid (Fig. 5). Finding the penumbra extent of the light source is equivalent to finding the apex O of the pyramid (Fig. 6(a)). This reduces to a 2D problem, considering parallel edges (LL′) and (PP′) on both polygons (Fig. 6(b)). Since (LL′) and (PP′) are parallel lines, we have:

    $$\frac{OL}{OP} = \frac{OL'}{OP'} = \frac{LL'}{PP'}$$

This ratio is the same if we consider the center of each line segment:

    $$\frac{OC_L}{OC_P} = \frac{LL'}{PP'}$$

Since the micro-patch and the light source have the same aspect ratio, the ratio $r = \frac{LL'}{PP'}$ is the same for both sides of the micro-patch (thus, the penumbra extent of the micro-patch is indeed a pyramid).

We find the apex of the pyramid by applying a scaling to the center of the micro-patch ($C_P$), with respect to the center
Figure 7: The intersection between the pyramid and the virtual plane is an axis-aligned rectangle.

Figure 9: We reproject the occluding micro-patch onto the light source and compute the percentage of occlusion.
of the light source ($C_L$):

    $$\overrightarrow{C_L O} = \frac{r}{1+r}\,\overrightarrow{C_L C_P}$$

where $r$ is again the ratio $r = \frac{LL'}{PP'}$.

We now use this pyramid to compute occlusion in the soft shadow map (see Fig. 7). We use a virtual plane, parallel to the light source, to represent this map (which will be projected onto the scene). The intersection of the penumbra pyramid with this virtual plane is an axis-aligned rectangle. We only have to compute the percentage of occlusion inside this rectangle.

Computing the position and size of the penumbra rectangle uses the same formulas as for computing the apex of the pyramid (see Fig. 8):

    $$\overrightarrow{C_L C_R} = \frac{z_R}{z_O}\,\overrightarrow{C_L O} \qquad RR' = LL'\,\frac{z_R - z_O}{z_O}$$

Figure 8: Computing the position and extent of the penumbra rectangle for each micro-patch.

3.4. Computing the soft shadow map

For all the pixels of the SSM lying inside this penumbra extent, we compute the percentage of the light source that is occluded by this micro-patch. This percentage of occlusion depends on the relative positions of the light source, the occluders and the receivers. To compute it, for each pixel on the receiver inside this extent, we project the occluding micro-facet back onto the light source [DF94] (Fig. 9). The result of this projection is an axis-aligned rectangle; we need to compute the intersection between this rectangle and the light source.

Computing this intersection is equivalent to computing the two intersections between the respective intervals on both axes. This part of the computation is done on the GPU, using a fragment program: the penumbra extent is converted into an axis-aligned quad, which we draw in a float buffer. For each pixel inside this quad, the fragment program computes the percentage of occlusion. These percentages are summed using the blending capability of the graphics card (see Section 4.2).

3.5. Two-sided soft-shadow maps

As with many other soft shadow computation algorithms [HLHS03], our algorithm exhibits artifacts because we are computing soft shadows using a single view of the occluder. Shadow effects linked to parts of the occluder that are not directly visible from the light source are not visible. In Fig. 10(a), our algorithm only computes the soft shadow for the front part of the occluder, because the back part of the occluder does not appear in the occluder map. This limitation is frequent in real-time soft-shadow algorithms [HLHS03].

For our algorithm, we have devised an extension that solves this limitation: we compute two occluder maps. In the first, we discretize the closest, front-facing faces of the occluders (see Fig. 10(b)). In the second, we discretize the furthest, back-facing faces of the occluders (see Fig. 10(c)).
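The apex formula of Section 3.3 and the penumbra-rectangle formulas above can be checked numerically on a 1D slice (a sketch; the helper names and the test values are ours, with the light centered at the origin): the apex given by the scaling must coincide with the intersection of the two crossing penumbra rays, and the extent at the receiver must match those rays extended to depth z_R.

```python
def line_intersection(p0, p1, q0, q1):
    """Intersection of the infinite lines (p0,p1) and (q0,q1); points are (x, z)."""
    dx1, dz1 = p1[0] - p0[0], p1[1] - p0[1]
    dx2, dz2 = q1[0] - q0[0], q1[1] - q0[1]
    t = ((q0[0] - p0[0]) * dz2 - (q0[1] - p0[1]) * dx2) / (dx1 * dz2 - dz1 * dx2)
    return (p0[0] + t * dx1, p0[1] + t * dz1)

# Light LL' = [-1, 1] at z = 0; micro-patch PP' = [0.05, 0.45] at depth 0.5.
l_half, p_half = 1.0, 0.2
c_p = (0.25, 0.5)                    # patch center (C_L is the origin)
r = l_half / p_half                  # r = LL'/PP' = 5

# Apex from the scaling formula: C_L O = r/(1+r) * C_L C_P
s = r / (1.0 + r)
o = (s * c_p[0], s * c_p[1])

# Apex from the crossing rays (+L through P, -L through P'): must agree.
o_check = line_intersection((+l_half, 0.0), (c_p[0] - p_half, c_p[1]),
                            (-l_half, 0.0), (c_p[0] + p_half, c_p[1]))

# Penumbra extent at receiver depth z_r, from the Section 3.4 formulas.
z_o, z_r = o[1], 2.0
c_r = (z_r / z_o) * o[0]                      # C_L C_R = (z_R/z_O) C_L O
rr = (2.0 * l_half) * (z_r - z_o) / z_o       # RR' = LL' (z_R - z_O)/z_O

# Cross-check: extend the rays apex -> light edges down to depth z_r.
def at_receiver(x_edge):
    return o[0] + (x_edge - o[0]) * (z_r - z_o) / (0.0 - z_o)

lo, hi = at_receiver(+l_half), at_receiver(-l_half)
print(o, o_check)                          # same apex
print(c_r, rr, (lo + hi) / 2.0, hi - lo)   # same center and size
```

The rays cross at the apex, so the ray through the +LL′/2 edge bounds the penumbra on the − side of the receiver and vice versa; the formulas reproduce both the center and the size.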
Figure 10: The original algorithm fails for some geometry. The two-pass method gives the correct shadow. (a) Original algorithm (b) Closest, front faces of the occluder discretized with their shadow (c) Furthest, back faces of the occluder discretized with their shadow (d) Combining the two soft shadow maps.

Figure 11: (a) One pass (148 fps) (b) One pass with bottom patches (142 fps) (c) Two passes (84 fps) (d) Ground truth.
We then compute a soft shadow map for each occluder map, and merge them, using the maximum of each occluder map. The resulting occlusion map has eliminated most artifacts (Fig. 10(d) and 11). Empirically, the cost of the two-pass algorithm is between 1.6 and 1.8 times the cost of the one-pass algorithm. Depending on the size of a model and the quality requirements of a given application, the second pass may be worth this extra cost. For example, for an animated model of less than 100,000 polygons, the one-pass algorithm renders at approximately 60 fps. Adding the second pass drops the framerate to 35 fps — which is still interactive.

4. Implementation details

4.1. Repartition between CPU and GPU

Our algorithm (see Fig. 4) starts by rendering two depth maps, one for the occluders and one for the receivers; these depth maps are both computed by the GPU. Then, in order to generate the penumbra extents for the micro-patches, the occluders depth map is transferred back to the CPU.

On the CPU, we generate the penumbra extents for the micro-patch associated to each non-empty pixel of the occluders depth map. We then render these penumbra extents, and for each pixel, we execute a small fragment program to compute the percentage of occlusion. Computing the percentage of occlusion at each pixel of the soft shadow map is done on the GPU (see Section 4.2).

These contributions from each micro-patch are added together; we use for this the blending ability of the GPU: occlusion percentages are rendered into a floating-point buffer with blending enabled, thus the percentage values for each micro-patch are automatically added to the previously computed percentage values.

4.2. Computing the intersection

For each pixel of the SSM lying inside the penumbra extent of a micro-patch, we compute the percentage of the light source that is occluded by this micro-patch, by projecting the occluding micro-patch back onto the light source (see Fig. 9). We have to compute the intersection of two axis-aligned rectangles, which is the product of the two intersections between the respective intervals on both axes.

We have therefore reduced our intersection problem from a 2D problem to two separate 1D problems. To further optimize the computations, we use the SAT instructions in the fragment program assembly language: without loss of generality, we can convert the rectangle corresponding to the light source to [0, 1] × [0, 1]. Each interval intersection becomes the intersection between one [a, b] interval and [0, 1]. Exploiting the SAT instruction and swizzling, computing the area of the intersection between the projection of the occluder [a, b] × [c, d] and the light source [0, 1] × [0, 1] only requires three instructions:

    MOV_SAT rs, {a, b, c, d}
    SUB     rs, rs, rs.yxwz
    MUL     result.color, rs.x, rs.z

Computing the [a, b] × [c, d] intervals requires projecting the micro-patch onto the light source and scaling the projection. This uses 8 other instructions: 6 basic operations (ADD, MUL, SUB), one reciprocal (RCP) and one texture lookup to get the depth of the receiver. The total length of our fragment program is therefore 11 instructions, including one texture lookup.

4.3. Possible improvements

As it stands, our algorithm makes a very light use of GPU resources: we only execute a very small fragment program, once for each pixel covered by the penumbra extent, and we exploit the blending ability for floating point buffers.

The main bottleneck of our algorithm is that the penumbra extents have to be computed on the CPU. This requires transferring the occluders depth map to the CPU, and looping over the pixels of the occluders depth map on the CPU. It should be possible to remove this step by using the render-to-vertex-buffer function: instead of rendering the occluders depth map, we would directly render the penumbra extents for each micro-patch into a vertex buffer. This vertex buffer would be rendered in a second pass, generating the soft shadow map.

5. Error Analysis and comparison

In this section, we analyze our algorithm, its accuracy and how it compares with the exact soft shadows. We first study potential sources of error from a theoretical point of view, in Section 5.1, then we conduct an experimental analysis, comparing the soft shadows produced with exact soft shadows, in Section 5.2.

5.1. Theoretical analysis

Our algorithm replaces the occluder with a discretized version. This discretization ensures interactive framerates, but it can also be a source of inaccuracies. From a given point on the receiver, we are separately estimating occlusion from several micro-patches, and adding these occlusion values together. We have identified three potential sources of error in our algorithm:

• We are only computing the shadow of the discretized occluder, not the shadow of the actual occluder. This source of error will be analyzed in Section 5.1.1.
• The reprojections of the micro-patches on the light source may overlap or be disjoined. This cause of error will be analyzed in Section 5.1.2.
• We are adding many small values (the occlusion from each micro-patch) to form a large value (the occlusion from the entire occluder). If the micro-patches are too small, we run into numerical accuracy issues, especially with floating-point numbers expressed on 16 bits. This cause of error will be analyzed in Section 5.1.3.

5.1.1. Discretization error

Our algorithm computes the shadow of the discretized occluder, not the shadow of the actual occluder. The discretized occluder corresponds to the part of the occluder that is visible from the camera used to compute the depth buffers, usually the center of the light source. Although we reproject each micro-patch of the discretized occluder onto the area light source, we are missing the parts of the occluder that are not visible from the shadow map camera but are still visible from some points of the area light source. This is a limitation that is frequent in real-time soft shadow algorithms [HLHS03], especially algorithms relying on the silhouette of the occluder as computed from a single point [WH03, CD03, AAM03].

We also use a discrete representation based on the shadow map, not a continuous representation of the occluder. For each pixel of the shadow map, we are potentially overestimating or underestimating the actual occluder by at most half a pixel.

If the occluder has one or more edges aligned with the edges of the shadow map, these discretization errors are of the same sign over the edge, and add themselves; the worst case scenario is a square aligned with the axis of the shadow map.

For more practical occluders the discretization errors on neighboring micro-patches compensate: some of the micro-patches overestimate the occluder while others underestimate it.

5.1.2. Overlapping reprojections

At any given point on the receiver, the parts of the light source that are occluded by two neighboring micro-patches should be joined exactly for our algorithm to compute the exact percentage of occlusion on the light source. This is typically not the case, and these parts may overlap or there may be a gap between them (Fig. 12). The amount of overlap (or gap) between the occluded parts of the light source depends on the relative positions of the light source, the occluding micro-patches and the receiver.

If we consider the 2D equivalent of this problem (Fig. 13), with two patches separated by δh and at a distance zO from the light source, with the receiver being at a distance zR from the light source, there is a point P0 on the receiver where there is no overlap between the occluded parts. As we move away from this point, the overlap increases. For a point at a
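The three-instruction intersection of Section 4.2 can be emulated on the CPU to see why it works: saturation clamps both intervals against [0, 1], the swizzled subtraction yields the negated interval lengths, and their product is the positive intersection area. A sketch:

```python
def sat(x):
    """Clamp to [0, 1] - the _SAT modifier of fragment program assembly."""
    return min(max(x, 0.0), 1.0)

def occlusion_area(a, b, c, d):
    """Area of the projected occluder [a,b] x [c,d] clipped against the
    light square [0,1]^2, mirroring the three instructions:
        MOV_SAT rs, {a,b,c,d}
        SUB     rs, rs, rs.yxwz
        MUL     result.color, rs.x, rs.z
    """
    rs = (sat(a), sat(b), sat(c), sat(d))
    x = rs[0] - rs[1]      # sat(a) - sat(b), non-positive when a <= b
    z = rs[2] - rs[3]      # sat(c) - sat(d), non-positive when c <= d
    return x * z           # product of two non-positive lengths: the area

print(occlusion_area(0.0, 1.0, 0.0, 1.0))     # 1.0  (light fully occluded)
print(occlusion_area(-0.5, 0.5, 0.25, 0.75))  # 0.25
```

Both 1D clips happen in a single vector operation, which is why the GPU version needs no branches at all.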
213
L. Atty et al. / Soft Shadow Maps
Figure 14: Blending with FP16 numbers: if the resolution of the shadow map is too high, numerical issues appear, resulting in wrong shadows. Using higher accuracy for blending removes this issue (here, FP32 blending was done on the CPU). (a) 128² pixels, FP16 blending (66 Hz); (b) 512² pixels, FP16 blending (20 Hz); (c) 512² pixels, FP32 blending (CPU); (d) ground truth (CPU).
Figure 14 shows an example of these problems. Counter-intuitively, increasing the resolution of the shadow map makes these problems more likely to appear (for a complete study of floating-point blending accuracy, see Appendix A). The best workaround is therefore to use a relatively low resolution for the occluder map, such as 128 × 128 or 256 × 256. While this may seem a low resolution compared to other shadow map algorithms, our shadow map is focused on the moving occluder (such as a character), not on the entire scene, so 128 × 128 pixels is usually enough resolution. We see this only as a temporary issue that will disappear as soon as hardware FP32 blending becomes available on graphics cards.

5.2. Comparison with ground truth

We ran several tests to experimentally compare the shadows produced by our algorithm with the actual shadows. The reference values were computed using occlusion queries, giving an accurate estimation of the real occlusion of the light source. In this section, we review the practical differences we observed.

5.2.1. Experimentation method

For each image, we computed an error metric as follows: for each pixel in the soft shadow map, we compute the actual occlusion value (using occlusion queries) and its difference with the occlusion value computed by our algorithm. We summed the modulus of the differences, then divided the result by the total number of pixels lying either in the shadow or in the penumbra, averaging the error over the actual soft shadow. We used the number of pixels that are either in shadow or in penumbra, and not the total number of pixels in the occluders depth map, because the soft shadow can occupy only a small part of the depth map; dividing by the total number of pixels in the depth map would have underestimated the error.

We have used 3 different scenes (a square plane parallel to the light source, a Buddha model and a Bunny model). These scenes exhibit several interesting features. The Buddha and Bunny are complex models, with folds and creases. The Bunny also has important self-occlusion, and in our scene it is in contact with the ground, providing information on the behavior of our algorithm in that case. The square plane is an illustration of the special case of occluders aligned with the axes of the occluders depth map.

We have tested both the one-pass and the two-pass versions of our algorithm. We selected four separate parameters: the size of the light source, the resolution of the shadow map, and moving the occluder, either vertically from the receiver to the light source or laterally with respect to the light source. For each parameter, we plot the variation of the error introduced by our algorithm as a function of the parameter and analyze the results.

5.2.2. Visual comparison with ground truth

Fig. 16 shows a side-by-side comparison of our algorithm with ground truth. Even though there are slight differences with ground truth, our algorithm exhibits the proper behavior for soft shadows: sharp shadows at places where the object is close to the ground, a large penumbra zone where the object is further away from the receiver. Our algorithm visibly computes both the inner and the outer penumbra of the object.

Looking at the pictures of the differences (Fig. 16(d) and 16(g)) between the shadow values computed by our algorithm and the ground truth values, it appears that the differences lie mostly on the silhouette, since our algorithm only computes the soft shadow of the discretized object, as seen from the center of the light source. The actual shape of the soft shadow depends on subtle effects happening at the boundary of the silhouette.

5.2.3. Size of the buffer

Figure 17 shows the average difference between the occlusion values computed with our algorithm and the actual occlusion values for our three test scenes, when changing the resolution of the shadow map. In these figures, the abscissa
Figure 16: Side-by-side comparison with ground truth: (a) scene view, (b) our algorithm, (c) ground truth, (d) difference between the occlusion values; (e) our algorithm, (f) ground truth, (g) difference between the occlusion values (difference scale: 0–15%).

Figure 17: Variation of the error with respect to the resolution of the shadow map (abscissa: buffer resolution in pixels, 0–1024; ordinate: average error).
is the number of pixels for one side of the shadow map, so 128 corresponds to a 128 × 128 shadow map. For this test, we used non-power-of-two textures, in order to have enough sampling data. We can make several observations by looking at the data:

Two-pass version: the two-pass version of the algorithm consistently outperforms the single-pass version, always giving more accurate results. The only exception is of course the square plane: since it has no thickness, the single-pass and two-pass versions give the same results.

Shadow map resolution: as expected from the theoretical study (see Section 5.1.2), the error decreases as the resolution of the shadow map increases. What is interesting is that this effect reaches a limit quite rapidly. Roughly, increasing the shadow map resolution above 200 pixels does not bring an improvement in quality. Since the computation costs are related to the size of the shadow map, shadow map sizes of 200 × 200 pixels are close to optimal.
The fact that the error does not decrease continuously as we increase the resolution of the occluder map is a little surprising at first, but can be explained. It is linked to the silhouette effect. As we have seen in Fig. 16, the error introduced by our algorithm comes from the boundary of the silhouette of the occluder, from parts of the occluder that are not visible from the center of the light source, but visible from other parts of the light source. Increasing the resolution of the shadow map does not solve this problem. The optimal size for the shadow map is related to the size of the light source. As the light source gets larger, we can use smaller bitmaps.

Discretization error: the error curve for the square plane presents many important spikes. Looking at the results, it appears that these spikes correspond to discretization error (see Section 5.1.1). Since the square occluder is aligned with the axes of the shadow map, it magnifies discretization error.

5.2.4. Size of the light source

Figure 18 shows the average difference between the occlusion values computed with our algorithm and the actual occlusion values when we change the size of the light source for our three test scenes. The parameter values range from a point light source (parameter = 0.01) to a very large light source, approximately as large as the occluder (parameter = 0.2). We used a bitmap of 128 × 128 pixels for all these tests. We can make several observations by looking at the data:

Point light sources: the beginning of the curves (parameter = 0.01) corresponds to a point light source. In that case, the error is quite large. This corresponds to an error of 1 over the entire shadow boundary; as we are computing the shadow of the discretized occluder, we miss the actual shadow boundary, sometimes by as much as half a pixel. The result is a large error, but it occurs only at the shadow boundary.

Light source size: except for the special case of point light sources, the error increases with the size of the light source. This is consistent with our theoretical analysis (see Section 5.1.2).

5.2.5. Occluder moving laterally

Figure 19 shows the average difference between the occlusion values computed with our algorithm and the actual occlusion values, when we move the occluder from left to right under the light source. The parameter corresponds to the position with respect to the center of the light, with 0 meaning that the center of the object is aligned with the center of the light. We used a bitmap of 128 × 128 for all these tests.

The error is at its minimum when the occluder is roughly under the light source, and increases as the occluder moves laterally. The Buddha and Bunny models are not symmetric, so their curves are slightly asymmetric, and the minimum does not correspond exactly to 0.

5.2.6. Occluder moving vertically

Figure 20 shows the average difference between the occlusion values computed with our algorithm and the actual occlusion values, when we move the occluder vertically. The smallest value of the parameter corresponds to an occluder touching the receiver, and the largest value corresponds to an occluder touching the light source. We used a bitmap of 128 × 128 for all these tests.

As predicted by the theory, the error increases as the occluder approaches the light source (see Section 5.1.2). For the Bunny, the error becomes quite large when the upper ear touches the light source.

6. Complexity

The main advantages of our algorithm are its rendering speed and its scalability. With a typical setup (a modern PC, an occluder map of 128 × 128 pixels, a scene between 50,000 and 300,000 polygons), we get framerates between 30 and 150 fps. In this section, we study the numerical complexity of our algorithm and its rendering speed. We first conduct a theoretical analysis of the complexity of our algorithm, in Section 6.1, then an experimental analysis, where we test the variation of the rendering speed with respect to several parameters: the size of the shadow map, the number of polygons and the size of the light source (Section 6.2). Finally, in Section 6.3, we compare the complexity of our algorithm with a state-of-the-art algorithm, Soft Shadow Volumes [AAM03].

6.1. Theoretical complexity

Our algorithm starts by rendering a shadow map and downloading it into main memory. This preliminary step has a
Figure 18: Variation of the error with respect to the size of the light source (abscissa: light source size, 0–0.2; ordinate: average error, up to 0.08).

Figure 19: Variation of the error with respect to the lateral position of the occluder (abscissa: occluder position, −0.3 to 0.3; ordinate: average error, up to 0.025).
complexity linear with respect to the number of polygons in the scene, and linear with the size of the shadow map, measured in the total number of pixels.

Then, for each pixel of the shadow map corresponding to the occluder, we compute its extent in the occlusion map, and for each pixel of this extent we execute a small fragment program of 11 instructions, including one texture lookup.

The overall complexity of this second step of the algorithm is the number of pixels covered by the occluder, multiplied by the number of pixels covered by the extent of each of them, multiplied by the cost of the fragment program. This second step is executed on the GPU, and benefits from the high performance and the parallelism of the graphics card.

The worst case would be a situation where each micro-patch in the shadow map covers a large number of pixels in the soft shadow map. But this situation corresponds to an object with a large penumbra zone, and if we have a large penumbra zone, we can use a lower resolution for the shadow maps. So we can compensate for the cost of the algorithm by running it with bitmaps of lower resolution.

Using our algorithm with a large-resolution shadow map in a situation of large penumbra results in relatively high computing costs, but a low-resolution shadow map would give the same visual results, for a smaller computation time.

6.2. Experimental complexity

All measurements in this section were conducted on a 2.4 GHz Pentium 4 PC with a GeForce 6800 Ultra graphics card. All framerates and rendering times correspond to observed framerates, that is, the framerate for a user manipulating our system. We are therefore measuring the time it takes to display the scene and to compute soft shadows, not just the time it takes to compute soft shadows.

6.2.1. Number of polygons

We studied the influence of the polygon count. Fig. 21 shows the observed rendering time (in ms) as a function of the polygon count, with a constant occluder map size of 128 × 128 pixels. The first thing we note is the speed of our algorithm: even on a large scene of 340,000 polygons, we achieve real-time framerates (more than 30 frames per second). Second, we observe that the rendering time varies linearly with respect to the number of polygons. That was to be expected, as we must render the scene twice (once for the occluder map and once for the actual display), and the time it takes for the graphics card to display a scene varies linearly with respect to the number of polygons. For smaller scenes (less than 10,000 polygons, rendering time below 10 ms), some factors other than the polygon count play a more important role.

Our algorithm exhibits good scaling, and can handle significantly large scenes without incurring a high performance
Figure 20: Variation of the error with respect to the vertical position of the occluder (single-pass vs. double-pass versions; ordinate: average error, up to 0.15).

Figure 21: (a) Rendering times (in ms) as a function of the number of polygons, with the 30 fps limit shown; (b) our largest test scene (565,203 polygons).

Figure 22: (a) Rendering times (in ms) as a function of the number of pixels in the occluder map (128², 256², 512²), with the 30 fps and 10 fps limits shown; (b) test scene (24,000 polygons).
cost. The maximum size of the scene depends on the requirements of the user.

6.2.2. Size of occluder map

Fig. 22 shows the observed rendering times (in ms) of our algorithm, on a scene with 24,000 polygons (Fig. 22(b)), when the size of the occluder map changes. We plotted the rendering time as a function of the number of pixels in the occluder map (that is, the square of the size of the occluder map) to illustrate the observed linear variation of rendering time with respect to the total number of pixels.

An occluder map of 512² pixels gives a rendering time of 150 ms — or 7 fps, too slow for interactive rendering. An occluder map of 128² or 256² pixels gives a rendering time of 10 to 50 ms, or 20 to 100 fps, fast enough for real-time rendering. For a large penumbra region, an occluder map of 128² pixels qualitatively gives a reasonable approximation, as in Fig. 22(b). For a small penumbra region, our algorithm behaves like the classical shadow mapping algorithm and artifacts can appear with a small occluder map of 128² pixels; in that case, it is better to use 256² pixels.

The fact that the rendering time of our algorithm is proportional to the number of pixels in the occluder map confirms that the bottleneck of our algorithm is its transfer to the CPU. Due to the cost of this transfer, we found that for some scenes it was actually faster to use textures whose dimensions are not a power of 2: if the difference in pixel count is sufficient, the gain in transfer time compensates for the losses in rendering time.

6.2.3. Light source size

Another important parameter is the size of the light source, compared to the size of the scene itself. A large light source results in a large penumbra region for each micro-patch, resulting in more pixels of the soft shadow map being covered, and a larger computational cost. Fig. 24(a) shows the observed framerate as a function of the size of the light source. We did the tests with several bitmap resolutions (256², 128², 64²). Fig. 24(b) shows the error as a function of the size of the light source, for the same bitmap resolutions.

As you can see from Fig. 24(a), the rendering time increases with the size of the light source. What is interesting is the error introduced by our algorithm (see Fig. 24(b)). The error logically increases with the size of the light source, and for small light sources, larger bitmaps result in more accurate images. But for large light sources, a smaller bitmap will give a soft shadow of similar quality. A visual comparison of the soft shadows with a small bitmap and ground truth shows the small bitmap gives a very acceptable soft shadow (see Fig. 23).

Figure 23: Large light sources with small bitmaps. (a) Bitmap of 64² (184 fps); (b) ground truth.

This effect was observed by previous researchers: as the light source becomes larger, the features in the soft shadow become blurrier, hence they can be modeled accurately with a smaller bitmap.

6.3. Comparison with Soft-Shadow Volumes

Finally, we performed a comparison with a state-of-the-art algorithm for computing soft shadows, the Soft-Shadow Volumes by Assarsson and Akenine-Möller [AAM03].

Fig. 25 shows the same scene, with soft shadows, computed by both algorithms. We ran the tests with a varying number of jeeps, to test how both algorithms scale with respect to the number of polygons. Fig. 25(c) shows the rendering times as a function of the number of polygons for both algorithms. These figures were computed using a window of 512 × 512 pixels for both algorithms, and with the two-pass version of our algorithm, with an occluder map resolution of 210 × 210.

Our algorithm scales better with respect to the number of polygons. On the other hand, Soft-Shadow Volumes provide a better-looking shadow (see Fig. 25(b)), closer to the actual truth.

It is important to remember that the rendering time for the Soft-Shadow Volumes algorithm varies with the number of screen pixels covered by the penumbra region. If the viewpoint is close to a large penumbra region, the rendering time becomes much larger. The figures we used for this comparison correspond to an observer walking around the scene (as in Fig. 25(b)).

7. Conclusion and Future Directions

In this paper, we have presented a new algorithm for computing soft shadows in real-time on dynamic scenes. Our algorithm is based on the shadow mapping algorithm, and is entirely image-based. As such, it benefits from the advantages of image-based algorithms, especially speed.

The largest advantage of our algorithm is its high framerate, hence there remains plenty of computational power available for performing other tasks, such as interacting with the user or performing non-graphics processing such as physics computations within game engines. Possibly the
Figure 24: Changing the size of the light source (floating bunny scene): (a) rendering time as a function of the light source size (0–0.3), for bitmap resolutions of 256², 128² and 64², with the 30 fps limit shown; (b) average error (up to 0.07) as a function of the light source size, for the same resolutions.
Figure 25: Comparison with Soft Shadow Volumes: (a) Soft Shadow Maps; (b) Soft Shadow Volumes; (c) rendering times (in ms) as a function of the number of polygons (0–20,000), for both algorithms.
largest limitation of our algorithm is the fact that it does not compute self-occlusion and that it requires a separation between occluders and receivers. We know that this limitation is very important, and we plan to remove it in future work, possibly by using layered depth images.

An important aspect of our algorithm is that we can use low-resolution shadow maps in places with a large penumbra, even though we still need higher-resolution shadow maps for places with small penumbra, for example close to the contact between the occluder and the receiver. An obvious improvement to our algorithm would be the ability to use hierarchical shadow maps, switching resolutions depending on the shadow being computed. This work could also be combined with perspective-corrected shadow maps [SD02, WSP04, MT04, CG04], in order to have higher resolution in places with sharp shadows close to the viewpoint.

In its current form, our algorithm still requires a transfer of the occluder map from the GPU to the main memory, and a loop, on the CPU, over all the pixels in the occluder map. We would like to design a GPU-only implementation of our algorithm, using the future render-to-vertex-buffer capabilities.

8. Acknowledgments

Most of this work was conducted while Charles Hansen was on sabbatical leave from the University of Utah, and was a visiting professor at ARTIS/GRAVIR IMAG, partly funded by INPG and INRIA.
The authors gratefully thank Ulf Assarsson for doing the tests for the comparison with the soft-shadow volumes algorithm.
The Stanford Bunny, Happy Buddha and dragon 3D models appearing in this paper and in the accompanying video were digitized and kindly provided by the Stanford University Computer Graphics Laboratory.
The smaller Buddha 3D model appearing in this paper was digitized and kindly provided by Inspeck.
The Jeep 3D model appearing in this paper was designed and kindly provided by Psionic.
The horse 3D model appearing in the accompanying video was digitized by Cyberware, Inc., and was kindly provided by the Georgia Tech "Large Geometric Models Archive".
The skeleton foot 3D model appearing in the accompanying video was digitized and kindly provided by Viewpoint Datalabs Intl.

References

[AAM03] Assarsson U., Akenine-Möller T.: A geometry-based soft shadow volume algorithm using graphics hardware. ACM Transactions on Graphics (Proc. of SIGGRAPH 2003) 22, 3 (2003), 511–520.
[AHT04] Arvo J., Hirvikorpi M., Tyystjärvi J.: Approximate soft shadows using image-space flood-fill algorithm. Computer Graphics Forum (Proc. of Eurographics 2004) 23, 3 (2004), 271–280.
[AMH02] Akenine-Möller T., Haines E.: Real-Time Rendering, 2nd ed. A. K. Peters, 2002.
[BS02] Brabec S., Seidel H.-P.: Single sample soft shadows using depth maps. In Graphics Interface (2002).
[CD03] Chan E., Durand F.: Rendering fake soft shadows with smoothies. In Rendering Techniques 2003 (Proc. of the Eurographics Symposium on Rendering) (2003), pp. 208–218.
[CD04] Chan E., Durand F.: An efficient hybrid shadow rendering algorithm. In Rendering Techniques 2004 (Proc. of the Eurographics Symposium on Rendering) (2004), pp. 185–195.
[CG04] Chong H., Gortler S. J.: A lixel for every pixel. In Rendering Techniques 2004 (Proc. of the Eurographics Symposium on Rendering) (2004), pp. 167–172.
[Cro77] Crow F. C.: Shadow algorithms for computer graphics. Computer Graphics (Proc. of SIGGRAPH '77) 11, 2 (1977), 242–248.
[DF94] Drettakis G., Fiume E.: A fast shadow algorithm for area light sources using backprojection. In SIGGRAPH '94 (1994), pp. 223–230.
[GBP06] Guennebaud G., Barthe L., Paulin M.: Real-time soft shadow mapping by backprojection. In Rendering Techniques 2006 (Proc. of the Eurographics Symposium on Rendering) (2006).
[HLHS03] Hasenfratz J.-M., Lapierre M., Holzschuch N., Sillion F.: A survey of real-time soft shadows algorithms. Computer Graphics Forum 22, 4 (2003), 753–774.
[KD03] Kirsch F., Doellner J.: Real-time soft shadows using a single light sample. Journal of WSCG (Winter School on Computer Graphics) 11, 1 (2003).
[KMK97] Kersten D., Mamassian P., Knill D. C.: Moving cast shadows and the perception of relative depth. Perception 26, 2 (1997), 171–192.
[LWGM04] Lloyd B., Wendt J., Govindaraju N., Manocha D.: CC shadow volumes. In Rendering Techniques 2004 (Proc. of the Eurographics Symposium on Rendering) (2004), pp. 197–205.
[McC00] McCool M. D.: Shadow volume reconstruction from depth maps. ACM Transactions on Graphics 19, 1 (2000), 1–26.
[MT04] Martin T., Tan T.-S.: Anti-aliasing and continuity with trapezoidal shadow maps. In Rendering Techniques 2004 (Proc. of the Eurographics Symposium on Rendering) (2004), pp. 153–160.
[SD02] Stamminger M., Drettakis G.: Perspective shadow maps. ACM Transactions on Graphics (Proc. of SIGGRAPH 2002) 21, 3 (2002), 557–562.
[SS98] Soler C., Sillion F. X.: Fast calculation of soft shadow textures using convolution. In SIGGRAPH '98 (1998), pp. 321–332.
[Wan92] Wanger L.: The effect of shadow quality on the perception of spatial relationships in computer generated imagery. In Symposium on Interactive 3D Graphics (1992), pp. 39–42.
[WFG92] Wanger L., Ferwerda J. A., Greenberg D. P.: Perceiving spatial relationships in computer-generated images. IEEE Computer Graphics and Applications 12, 3 (1992), 44–58.
[WH03] Wyman C., Hansen C.: Penumbra maps: Approximate soft shadows in real-time. In Rendering Techniques 2003 (Proc. of the Eurographics Symposium on Rendering) (2003), pp. 202–207.
[Wil78] Williams L.: Casting curved shadows on curved surfaces. Computer Graphics (Proc. of SIGGRAPH '78) 12, 3 (1978), 270–274.
[WPF90] Woo A., Poulin P., Fournier A.: A survey of shadow algorithms. IEEE Computer Graphics & Applications 10, 6 (1990), 13–32.
[WSP04] Wimmer M., Scherzer D., Purgathofer W.: Light space perspective shadow maps. In Rendering Techniques 2004 (Proc. of the Eurographics Symposium on Rendering) (2004), pp. 143–152.

Appendix A: Floating-point blending accuracy

In this section, we review the issues behind the hardware blending accuracy problems we have encountered and propose a temporary fix for these issues.

All the accuracy issues are linked to the fact that hardware blending is, at the time of writing, only available for 16-bit floating-point numbers. NVidia graphics hardware stores these floating-point numbers using the s10e5 format: one bit of sign, 10 bits of mantissa, 5 bits of exponent, with a bias of 15 for the exponent. The important point for addition is that the mantissa is stored on 10 bits. As a result, adding a large number X and a small number ε will give an inaccurate result if ε < 2^−10 X:

    X + ε = X   if ε < 2^−10 X   (in FP16)

For example, 2048 + 1 = 2048 (in FP16 format) and 0.5 + 1/2049 = 0.5 (also in FP16 format).

In some cases, the addition of the contributions from all micro-patches will be 1 (meaning complete occlusion of the light source). As a consequence, we can expect numerical
accuracy issues if some micro-patches hide less than 2^−10 of the light source. Because 32² = 2^10, this means that the width of the reprojection of one micro-patch should be larger than 1/32 of the width of the light source.

This translates easily into conditions for the position of the occluder:

    1/zO < 1/zR + (64 tan α)/(N L)    (4)

where L is the width of the light source, N is the resolution of the bitmap, α is the half-angle of the camera used to generate the shadow map, zO is the distance between the light source and the occluder and zR is the distance between the light source and the receiver.

Bitmap resolution: the most important point is that increasing N makes this error more likely to appear. This explains why, using a bitmap of 512 × 512 pixels, we see a poor-looking shadow, while the 128 × 128 bitmap gives the correct shadow (see Fig. 14).

Light source size: in Equation 4, the size of the light source appears in a product with the resolution of the bitmap. If the light source is large, the bitmap must be low resolution in order to avoid FP16 blending errors. Fortunately, a large light source means a large penumbra for most occluders, so a low-resolution bitmap might be enough for these penumbra effects.

Occluder position: as the occluder moves closer to the receiver, the likelihood of blending errors gets lower.

Camera half-angle: similarly, increasing the camera half-angle improves the FP16 blending accuracy.

Basically, all these conditions amount to the same thing: using fewer pixels to describe the occluder in the shadow map. While this improves the FP16 blending accuracy, it obviously degrades the discretization of the occluder and also increases the overlap between reprojections of neighboring pixels.

In our experiments (see Fig. 14), the blending accuracy problem appears very often when the resolution of the shadow map is larger than 512 × 512, sometimes with a shadow map resolution of 256 × 256 and very rarely with a shadow map resolution of 128 × 128.

The problem will disappear when hardware blending becomes available for higher-accuracy floating-point numbers. FP32 numbers have a mantissa of 23 bits, allowing the use of micro-patches that block less than 2^−23 of the light source, meaning that the width of the back-projection of a micro-patch should only be larger than 2^−11 times the width of the light source (64 times smaller than the current threshold). Compared with the current method, this would allow the use of shadow maps with a resolution above 4096 × 4096.

With FP16 blending only, the best solution is to use a hierarchical shadow map for soft-shadow computations, as was suggested by Guennebaud et al. [GBP06]: the low-resolution shadow map would be used for large penumbra regions, and the high-resolution shadow map for areas with hard shadows, e.g. when the occluder and the receiver are in contact.
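The FP16 rounding behavior described in this appendix can be reproduced on the CPU with NumPy's float16 type (a sketch for illustration; actual GPU blending hardware may round differently):

```python
import numpy as np

# s10e5 half precision has a 10-bit mantissa, so the spacing between
# representable values at 2048 (= 2^11) is 2: adding 1 is lost to
# round-to-nearest-even.
x = np.float16(2048.0)
print(x + np.float16(1.0))               # 2048.0

# Accumulating many small occlusion contributions: the FP16 accumulator
# stalls once each contribution falls below 2^-10 times the running sum,
# while an FP32 accumulator reaches the expected total of 1.0.
contribution = np.float16(2.0 ** -12)    # one micro-patch hiding 1/4096 of the light
acc16 = np.float16(0.0)
acc32 = np.float32(0.0)
for _ in range(4096):
    acc16 = np.float16(acc16 + contribution)
    acc32 += np.float32(contribution)
print(acc16)                             # 0.5: stalls far below the true sum
print(acc32)                             # 1.0
```

The stall at 0.5 occurs exactly where the contribution (2^−12) equals half the FP16 spacing (2^−11), and round-to-even then discards it, mirroring the ε < 2^−10 X condition above.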
CHAPTER 4. USING PROGRAMMABLE GRAPHICS CARDS
4.7.4 Fast Precomputed Ambient Occlusion for Proximity Shadows (JGT 2006)
Authors: Mattias M, Fredrik M, Ulf A and Nicolas H
Journal: Journal of Graphics Tools (accepted, to appear)
Date: accepted in October 2006.
Ulf Assarsson
Chalmers University of Technology
Nicolas Holzschuch
ARTIS-GRAVIR/IMAG INRIA
Abstract. Ambient occlusion is used widely for improving the realism of real-time lighting
simulations. We present a new method for precomputed ambient occlusion, where we store
and retrieve unprocessed ambient occlusion values in a 3D grid. Our method is very easy to
implement, has a reasonable memory cost, and the rendering time is independent of the
complexity of the occluder or the receiving scene. This makes the algorithm highly suitable
for games and other real-time applications.
1. Introduction
effects for motion pictures [Landis 02] and for illumination simulations in commercial
software [Christensen 02, Christensen 03].
Ambient occlusion also results in objects having contact shadows: for two close
objects, ambient occlusion alone creates a shadow of one object onto the other (see
Figure 1).
For offline rendering, ambient occlusion is usually precomputed at each vertex of
the model, and stored either as vertex information or into a texture. For real-time
rendering, recent work [Zhou et al. 05, Kontkanen and Laine 05] suggests storing
ambient occlusion as a field around moving objects, and projecting it onto the scene
as the object moves. These methods provide important visual cues for the spatial
position of the moving objects, in real-time, at the expense of extra storage. They
pre-process ambient occlusion, expressing it as a function of space whose parameters
are stored in a 2D texture wrapped around the object. In contrast, our method stores
these values unprocessed, in a 3D grid attached to the object. The benefits are numerous:
• faster run-time computations, and very low impact on the GPU, with a com-
putational cost being as low as 5 fragment shader instructions per pixel,
• very easy to implement, just by rendering one cube per shadow casting object,
• inter-object occlusion has high quality even for receiving points inside the oc-
cluding object’s convex hull.
The obvious drawback should be the memory cost, since our method’s memory
costs are in O(n³), instead of O(n²). But since ambient occlusion is a low-frequency
phenomenon, it only needs a low-resolution sampling. In [Kontkanen and Laine 05],
as in our own work, a texture size of n = 32 is sufficient. And since we are storing
a single component per texel, instead of several function coefficients, the overall
memory cost of our method is comparable to theirs. For a texture size of 32 pixels,
[Kontkanen and Laine 05] report a memory cost of 100 Kb for each unique moving
object. For the same resolution, the memory cost of our algorithm is of 32 Kb if we
only store ambient occlusion, and of 128 Kb if we also store the average occluded
direction.
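The memory figures above can be checked with one line of arithmetic (grid size and channel counts as stated in the text; this sketch is ours, not part of the paper):

```python
n = 32                       # grid resolution used in the paper
kb = 1024                    # bytes per kilobyte

ao_only = n ** 3             # one byte per texel: ambient occlusion only
with_direction = 4 * n ** 3  # RGBA: cone axis (RGB) + occlusion (A)

print(ao_only // kb, "KB")         # 32 KB
print(with_direction // kb, "KB")  # 128 KB
```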
2. Background
Ambient occlusion was first introduced by [Zhukov et al. 98]. In modern imple-
mentations [Landis 02, Christensen 02, Christensen 03, Pharr and Green 04, Bun-
nell 05, Kontkanen and Laine 05], it is defined as the percentage of ambient light
blocked by geometry close to point p:
$$ao(p) = \frac{1}{\pi} \int_{\Omega} (1 - V(\omega))\,\lfloor n \cdot \omega \rfloor \, d\omega \qquad (1)$$
Occlusion values are weighted by the cosine of the angle of the occluded direction
with the normal n: occluders that are closer to the direction n contribute more, and
occluders closer to the horizon contribute less, corresponding to the importance of
each direction in terms of received lighting. Ambient occlusion is computed as a
percentage, with values between 0 and 1, hence the $\frac{1}{\pi}$ normalization factor.
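Equation (1) can be checked numerically with a Monte Carlo estimator over the hemisphere. The sketch below is ours (the `visible` predicate is a hypothetical stand-in for a visibility query, not part of the paper); it illustrates the cosine weighting and why 1/π is the right normalization:

```python
import math, random

def ambient_occlusion(visible, n_samples=10000):
    """Monte Carlo estimate of eq. (1): ao(p) = (1/pi) * integral over the
    hemisphere of (1 - V(w)) * <n, w> dw, with the normal n = (0, 0, 1).
    `visible` maps a direction to True if no occluder blocks it."""
    total = 0.0
    for _ in range(n_samples):
        # Uniform sampling of the upper hemisphere.
        z = random.random()                  # cos(theta), uniform in [0, 1)
        phi = 2.0 * math.pi * random.random()
        r = math.sqrt(max(0.0, 1.0 - z * z))
        w = (r * math.cos(phi), r * math.sin(phi), z)
        total += (0.0 if visible(w) else 1.0) * z   # cosine term <n, w>
    # Hemisphere solid angle is 2*pi; dividing by pi makes a fully
    # occluded hemisphere integrate to exactly 1.
    return (2.0 * math.pi / n_samples) * total / math.pi

print(ambient_occlusion(lambda w: False))  # close to 1 for a fully occluded point
```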
Most recent algorithms [Bunnell 05, Kontkanen and Laine 05] also store the aver-
age occluded direction, using it to modulate the lighting, depending on the normal at
the receiving point and the environment.
[Greger et al. 98] also used a regular grid to store illumination values, but their
grid was attached to the scene, not to the object. [Sloan et al. 02] attached radiance
transfer values to a moving object, using it to recompute the effects of the moving
object on the environment.
3. Algorithm
Our algorithm fits into a classical framework where other shading information, such as direct lighting, shadows, etc., is computed in separate rendering passes.
One rendering pass will be used to compute ambient lighting, combined with ambi-
ent occlusion. We assume we have a solid object moving through a 3D scene, and
we want to compute ambient occlusion caused by this object.
227
✐ ✐
✐ ✐
✐ ✐
Figure 2. We construct a grid around the object. At the center of each grid element, we
compute a spherical occlusion sample. At runtime, this information is used to apply shadows
on receiving objects.
Our algorithm can either be used with classical shading, or with deferred shading.
In the latter case, the world-space position and the normal of all rendered pixels are
readily available. In the former, this information must be stored in a texture, using
the information from previous rendering passes.
Runtime:
• render world space position and normals of all shadow receivers in the scene, including occluders.
• For each occluder:
  1. render the back faces of the occluder’s grid (depth-testing is disabled).
  2. for every pixel accessed, execute a fragment program:
     (a) retrieve the world space position of the pixel.
     (b) convert this world space position to voxel position in the grid, passed as a 3D texture.
     (c) retrieve the ambient occlusion value in the grid, using linear interpolation.
  3. Ambient occlusion values a from each occluder are blended in the frame buffer using multiplicative blending with 1 − a.
The entire computation is thus done in just one extra rendering pass. We used the
back faces of the occluder’s grid, because it is unlikely that they are clipped by the
far clipping plane; using the front faces could result in artifacts if they are clipped by
the front clipping plane.
The ambient occlusion values we have stored correspond to the occlusion caused by
the occluder itself:
$$ao'(p) = \frac{1}{4\pi} \int_{\Omega} (1 - V(\omega)) \, d\omega \qquad (2)$$
that is, the percentage of the entire sphere of directions that is occluded. When we
apply these occlusion values at a receiving surface, during rendering, the occlusion
only happens over a half-space, since the receiver itself is occluding the other half-
space. To account for this occlusion, we scale the occlusion value by a factor 2.
This shading does not take into account the position of the occluder with respect to
the normal of the receiver. It is an approximation, but we found it performs quite
well in several cases (see Figure 1). It is also extremely cheap in both memory and
computation time, as the value extracted from the 3D texture is used directly.
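In CPU terms, the per-pixel work of this simple variant is one trilinear grid fetch followed by the factor-2 correction. A Python sketch under our own grid layout (helper names are ours; the paper's actual implementation is the Cg fragment program that follows):

```python
def sample_grid(grid, p):
    """Trilinear interpolation in a 3D occlusion grid.
    `grid` is an n*n*n nested list of ao' values, `p` grid-space coords."""
    n = len(grid)
    x, y, z = (min(max(c, 0.0), n - 1.001) for c in p)
    i, j, k = int(x), int(y), int(z)
    fx, fy, fz = x - i, y - j, z - k
    def g(a, b, c):
        return grid[min(a, n - 1)][min(b, n - 1)][min(c, n - 1)]
    # Interpolate along x, then y, then z.
    c00 = g(i, j, k) * (1 - fx) + g(i + 1, j, k) * fx
    c10 = g(i, j + 1, k) * (1 - fx) + g(i + 1, j + 1, k) * fx
    c01 = g(i, j, k + 1) * (1 - fx) + g(i + 1, j, k + 1) * fx
    c11 = g(i, j + 1, k + 1) * (1 - fx) + g(i + 1, j + 1, k + 1) * fx
    c0 = c00 * (1 - fy) + c10 * fy
    c1 = c01 * (1 - fy) + c11 * fy
    return c0 * (1 - fz) + c1 * fz

def receiver_occlusion(grid, p):
    # ao' was integrated over the full sphere (eq. 2); the receiver surface
    # hides half of the directions, hence the factor 2 (clamped to 1).
    return min(1.0, 2.0 * sample_grid(grid, p))

# A constant grid of ao' = 0.25 yields a receiver occlusion of 0.5.
```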
We use the following fragment program (using Cg notation):
    float4 pworld = texRECT(PositionTex, pscreen);   // world-space position of the pixel
    float3 pgrid  = mul(MWorldToGrid, pworld);       // into the occluder's grid space
    out.color.w   = 1 - tex3D(GridTexture, pgrid);   // occlusion lookup, blended as 1 - a
There are two important drawbacks with this simple approximation: first, the in-
fluence of the occluder is also visible where it should not, such as a character moving
on the other side of a wall; second, handling self-occlusion requires a specific treat-
ment, with a second pass and a separate grid of values.
3.3. Shading surfaces with ambient occlusion and average occluded direction
For more accurate ambient occlusion effects, we also store the average occluded
direction. That is equivalent to storing the set of occluded directions as a cone (see
Figure 3). The cone is defined by its axis (d) and the percentage of occlusion a
(linked to its aperture angle α). Axis and percentage of occlusion are precomputed
for all moving objects and stored on the sample points of the grid, in an RGBA
texture, with the cone axis d stored in the RGB-channels and occlusion value a stored
in the A-channel.
In order to compute the percentage of ambient occlusion caused by the moving oc-
cluder, we clip the cone of occluded directions by the tangent surface to the receiver
(see Figure 3(b)). The percentage of effectively occluded directions is a function of
two parameters: the angle between the direction of the cone and the normal at the
receiving surface (β), and the percentage of occlusion of the cone (a). We precom-
pute this percentage and store it in a lookup table T clip . The lookup table also stores
(a) The cone is defined by its direction d and its aperture α. (b) The cone is clipped by the tangent plane to the receiver to give the ambient occlusion value.
Figure 3. Ambient occlusion is stored as a cone.
Figure 4. Ambient occlusion computed with our algorithm that accounts for the surface
normal of the receiver and the direction of occlusion.
the effect of the diffuse BRDF (the cosine of the angle between the normal and the
direction). For simplicity, we access the lookup table using cos β.
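The lookup table T_clip can be precomputed by brute force. The sketch below samples the cone and keeps only directions above the tangent plane, with the diffuse cosine folded in; the mapping cos α = 1 − 2a between the occlusion fraction and the aperture, the table resolution, and all names are our assumptions, not taken from the paper:

```python
import math, random

def clipped_cosine_weighted_occlusion(cos_beta, a, n=20000):
    """Fraction of cosine-weighted incoming light blocked by a cone of
    occluded directions. `a` is the cone's fraction of the full sphere,
    so its half-angle satisfies cos(alpha) = 1 - 2a (our convention);
    `cos_beta` is the cosine between cone axis and surface normal."""
    cos_alpha = 1.0 - 2.0 * a
    sin_beta = math.sqrt(max(0.0, 1.0 - cos_beta * cos_beta))
    blocked = 0.0
    for _ in range(n):
        # Uniform direction inside a cone around the +z axis...
        z = 1.0 - random.random() * (1.0 - cos_alpha)
        phi = 2.0 * math.pi * random.random()
        r = math.sqrt(max(0.0, 1.0 - z * z))
        d = (r * math.cos(phi), r * math.sin(phi), z)
        # ...rotated (about y) so that +z maps onto the cone axis.
        w = (d[0] * cos_beta + d[2] * sin_beta,
             d[1],
             -d[0] * sin_beta + d[2] * cos_beta)
        if w[2] > 0.0:            # above the receiver's tangent plane
            blocked += w[2]       # diffuse (cosine) weighting
    solid_angle = 2.0 * math.pi * (1.0 - cos_alpha)
    return (solid_angle / n) * blocked / math.pi

# Precompute T_clip on a small (cos_beta, a) grid:
T_clip = [[clipped_cosine_weighted_occlusion(cb / 7.0, occ / 7.0 * 0.5, 2000)
           for occ in range(8)] for cb in range(8)]
```

A sanity check of the normalization: a cone aligned with the normal that covers the whole upper hemisphere (a = 0.5) should block all the diffuse lighting, i.e., return a value close to 1.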
We now use the following fragment program:
(a) Gouraud shading. (b) Blending occlusion from multiple occluders. (c) Ground truth.
Figure 6. Checking the accuracy of our blending method: comparison of ambient occlusion values computed with ground truth.
When we have several moving occluders in the scene, we compute occlusion values
from each moving occluder, and merge these values together. The easiest method to
do this is to use OpenGL blending operation: in a single rendering pass, we render
the occlusion values for all the moving occluders. The occlusion value computed
for the current occluder is blended to the color buffer, multiplicatively modulating it
with (1 − a).
[Kontkanen and Laine 05] show that modulating with (1 − ai ), for all occluders i,
is statistically the best guess. Our experiments also show that it gives very satisfying
results for almost all scenes. This method has the added advantage of being very
simple to implement: the combined occlusion value for one pixel is independent
from the order in which the occluders are treated for this pixel, so we only need one
rendering pass.
Each occluder is rendered sequentially, using our ambient occlusion fragment pro-
gram, into an occlusion buffer. The cone axes are stored in the RGB channels and
the occlusion value is stored in the alpha channel. Occlusion values are blended mul-
tiplicatively and cone axes are blended additively, weighted by their respective solid
angle:
$$\alpha_R = (1 - \alpha_A)(1 - \alpha_B) \qquad d_R = \alpha_A d_A + \alpha_B d_B$$
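In software, the blending rules above look like this (a sketch with our own names; we return the combined occlusion 1 − Π(1 − aᵢ), matching the multiplicative modulation described above, and a renormalized axis):

```python
import math

def blend_cones(cones):
    """Merge (axis, occlusion) pairs from several occluders: occlusion
    values combine multiplicatively through their transparencies (1 - a),
    cone axes are summed weighted by their occlusion and renormalized."""
    unoccluded = 1.0
    axis = [0.0, 0.0, 0.0]
    for d, a in cones:
        unoccluded *= 1.0 - a
        for i in range(3):
            axis[i] += a * d[i]
    norm = math.sqrt(sum(c * c for c in axis)) or 1.0
    return [c / norm for c in axis], 1.0 - unoccluded

# Two aligned occluders, each blocking 25%: combined 1 - 0.75^2 = 0.4375.
axis, a = blend_cones([((0.0, 0.0, 1.0), 0.25), ((0.0, 0.0, 1.0), 0.25)])
```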
The occlusion cones can also be used to approximate the incoming lighting from
an environment map, as suggested by [Pharr and Green 04]. For each pixel, we
first compute the lighting due to the environment map, using the surface normal for
Lambertian surfaces, or using the reflected cone for glossy objects. Then, we subtract
from this lighting the illumination corresponding to the cone of occluded directions.
We only need to change the last step of blending the color buffer and occlusion
buffer. Each shadow receiving pixel is rendered using the following code:
1. Read cone d, α from the occlusion buffer.
2. Read the normal from the normal buffer.
3. Compute the mipmap level from the cone angle α.
4. A = EnvMap(d, α), i.e., look up the occluded light within the cone.
5. B = AmbientLighting(normal), i.e., look up the incoming light due to the environment map.
6. Return B − A.
In order to use large filter sizes, we used lat-long maps. It is also possible to use
cube maps with a specific tool for mip-mapping across texture seams [Scheuermann
and Isidoro 06].
An important parameter of our algorithm is the spatial extent of the grid. If the grid
is too large, we run the risk of under-sampling the variations of ambient occlusion,
[Figure 8(a): the occluder and its surrounding grid, with labels r_i, A_i, e_i.]
otherwise we have to increase the resolution, thus increasing the memory cost. If the
grid is too small, we would miss some of the effects of ambient occlusion.
To compute the optimal spatial extent of the grid, we use the bounding box of the
occluder. This bounding box has three natural axes, with dimension 2ri on each axis,
and a projected area of Ai perpendicular to axis i (see Figure 8(a)).
Along the i axis, the ambient occlusion of the bounding box is approximately:
$$a_i \approx \frac{1}{4\pi} \frac{A_i}{(d - r_i)^2} \qquad (3)$$
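Solving equation (3) for the distance d at which the occluder's contribution drops below a threshold ε gives the grid's half-extent along each axis. A sketch (the ε = 0.1 value is from the text; function and variable names are ours):

```python
import math

def grid_half_extent(r_i, A_i, eps=0.1):
    """Half-extent of the grid along axis i: the distance d at which the
    bounding-box occlusion a_i = A_i / (4*pi*(d - r_i)^2) drops to eps."""
    return r_i + math.sqrt(A_i / (4.0 * math.pi * eps))

# Unit cube: half-size 0.5 and projected area 1 on each axis.
d = grid_half_extent(0.5, 1.0)
```

At the returned distance, plugging d back into equation (3) gives exactly ε again, which is the defining property of the extent.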
(a) Using raw values, discontinuities can appear. (b) After re-scaling, ambient occlusion blends continuously.
Figure 9. We need to re-scale occlusion values inside the grid to avoid visible artifacts.
following the shape of the object, but with the grid being thinner on the longer axes
of the object (see Figure 8(c)).
We use a relatively large epsilon value (0.1), resulting in a small spatial extent. As
a consequence, there can be visible discontinuities on the boundary of the grid (see
Figure 9(a)). To remove these discontinuities, we re-scale the values inside the grid
so that the largest value at the boundary is 0. If the largest value on the boundary of
the grid is $V_M$, each cell of the grid is rescaled so that its new value $V'$ is:
$$V' = \begin{cases} V & \text{if } V > 0.3 \\[4pt] 0.3\,\dfrac{V - V_M}{0.3 - V_M} & \text{if } V \le 0.3 \end{cases}$$
The effect of this scaling can be seen on Figure 9(b). The overall aspect of ambient
occlusion is kept, while the contact shadow ends continuously on the border of the
grid.
Sampling points that are inside the occluder will have occlusion values of 1, ex-
pressing that they are completely hidden. As we interpolate values on the grid, a
point located on the boundary of the occluder will often have non-correct values.
To counter this problem, we modify the values inside the occluder (which are never
used) so that the interpolated values on the surface are as correct as possible.
A simple but quite effective automatic way to do this is: for all grid cells where
occlusion value is 1, replace this value by an average of the surrounding grid cells
that have an occlusion value smaller than 1. This algorithm was used on all the
figures in this paper.
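The two post-processing steps above, boundary re-scaling and interior filling, can be sketched as follows (thresholds as in the text; code and names are ours):

```python
def rescale(v, v_m, t=0.3):
    """Re-scale a grid value so occlusion ends continuously at the grid
    boundary: values above the threshold t are kept, values below are
    remapped so that the largest boundary value v_m goes to 0."""
    if v > t:
        return v
    return t * (v - v_m) / (t - v_m)

def fill_interior(grid):
    """Replace interior cells (occlusion == 1, never sampled at runtime)
    by the average of the 6-neighbours with occlusion < 1, so that values
    interpolated on the occluder's surface are closer to correct."""
    n = len(grid)
    out = [[[grid[i][j][k] for k in range(n)] for j in range(n)]
           for i in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if grid[i][j][k] == 1.0:
                    vals = [grid[a][b][c]
                            for (a, b, c) in ((i-1, j, k), (i+1, j, k),
                                              (i, j-1, k), (i, j+1, k),
                                              (i, j, k-1), (i, j, k+1))
                            if 0 <= a < n and 0 <= b < n and 0 <= c < n
                            and grid[a][b][c] < 1.0]
                    if vals:
                        out[i][j][k] = sum(vals) / len(vals)
    return out
```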
4. Results
All timings and figures in this paper were computed on a Pentium 4, running at 2.8
GHz, with an NVidia GeForce 7800 GTX, using a grid resolution of 32³.
The strongest point of our method is its performance: adding ambient occlusion
to any scene increases the rendering time by ≈ 0.9 ms for each occluder. In our
experiments, this value stayed the same regardless of the complexity of the scene or
of the occluder. We can render scenes with 40 different occluders at nearly 30 fps.
The cost of the method depends on the number of pixels covered by the occluder’s
grid, so the cost of our algorithm decreases nicely for occluders that are far from the
viewpoint, providing an automatic level-of-detail.
The value of 0.9 ms corresponds to the typical situation, visible in all the pictures
in this paper: the occluder has a reasonable size, neither too small nor too large,
compared to the size of the viewport.
Precomputed values for ambient occlusion are stored in a 3D texture, with a memory
cost of O(n³) bytes. With a grid size of 32, the value we have used in all our tests,
the memory cost for ambient occlusion values is 32 Kb per channel. Thus, storing
just the ambient occlusion value gives a memory cost of 32 Kb. Adding the average
occluded direction requires three extra channels, bringing the complete memory cost
to 128 Kb.
Figure 5(b)-5(c) and 6(b)-6(c) show a side-by-side comparison between our algo-
rithm and ground truth. Our algorithm has computed all the relevant features of
ambient occlusion, including proximity shadows. The main difference is that our
algorithm tends to underestimate ambient occlusion.
There are several reasons for this difference: we have limited the spatial influence
of each occluder, by using a small grid, and the blending process (see Section 3.3.2)
can underestimate the combined occlusion value of several occluders.
While it would be possible to improve the accuracy of our algorithm (using a
more accurate blending method and a larger grid), we point out that ambient occlusion methods are approximate by nature. What is important is to show all the
Acknowledgments. ARTIS is an INRIA research project and a research team in the GRAVIR
laboratory, a joint research unit of CNRS, INRIA, INPG and UJF.
This work was started while Ulf Assarsson was a post-doctoral student at the ARTIS re-
search team, funded by INRIA.
The space ship model used in this paper was designed by Max Shelekhov.
References
[Bunnell 05] Michael Bunnell. “Dynamic Ambient Occlusion and Indirect Lighting.” In
GPU Gems 2, edited by Matt Pharr, pp. 223–233. Addison Wesley, 2005.
[Christensen 02] Per H. Christensen. “Note 35: Ambient occlusion, image-based illumi-
nation, and global illumination.” In PhotoRealistic RenderMan Application Notes.
Emeryville, CA, USA: Pixar, 2002.
[Christensen 03] Per H. Christensen. “Global Illumination and All That.” In Siggraph 2003
course 9: Renderman, Theory and Practice, edited by Dana Batall, pp. 31 – 72. ACM
Siggraph, 2003.
[Greger et al. 98] G. Greger, P. Shirley, P. M. Hubbard, and D. P. Greenberg. “The Irradiance
Volume.” IEEE Computer Graphics and Applications 18:2 (1998), 32–43.
[Kontkanen and Laine 05] Janne Kontkanen and Samuli Laine. “Ambient Occlusion Fields.”
In Symposium on Interactive 3D Graphics and Games, pp. 41–48, 2005.
[Landis 02] Hayden Landis. “Production Ready Global Illumination.” In Siggraph 2002
course 16: Renderman in Production, edited by Larry Gritz, pp. 87 – 101. ACM Sig-
graph, 2002.
[Pharr and Green 04] Matt Pharr and Simon Green. “Ambient Occlusion.” In GPU Gems,
edited by Randima Fernando, pp. 279–292. Addison Wesley, 2004.
[Scheuermann and Isidoro 06] Thorsten Scheuermann and John Isidoro. “Cubemap Filtering
with CubeMapGen.” In Game Developer Conference 2006, 2006.
[Sloan et al. 02] Peter-Pike Sloan, Jan Kautz, and John Snyder. “Precomputed radiance trans-
fer for real-time rendering in dynamic, low-frequency lighting environments.” ACM
Transactions on Graphics (Proc. of Siggraph 2002) 21:3 (2002), 527–536.
[Zhou et al. 05] Kun Zhou, Yaohua Hu, Steve Lin, Baining Guo, and Heung-Yeung Shum.
“Precomputed Shadow Fields for Dynamic Scenes.” ACM Transactions on Graphics
(proceedings of Siggraph 2005) 24:3.
[Zhukov et al. 98] S. Zhukov, A. Iones, and G. Kronin. “An Ambient Light Illumination
Model.” In Rendering Techniques ’98 (Proceedings of the 9th EG Workshop on Render-
ing), pp. 45 – 56, 1998.
Web Information:
Two videos, recorded in real-time and demonstrating the effects of pre-computed ambient
occlusion on animated scenes are available at:
http://www.ce.chalmers.se/~uffe/ani.mov
http://www.ce.chalmers.se/~uffe/cubedance.mov
A technique for better accuracy in blending the occlusion from two cones is described in a
supplemental material.
CHAPITRE 4. UTILISATION DES CARTES GRAPHIQUES PROGRAMMABLES
Figure 1: Left: Specular reflections computed with our algorithm. Middle: ray-traced reference. Right: Environment map reflection.
Abstract
Specular reflections provide many important visual cues in our daily environment. They inform us of the shape of
objects, of the material they are made of, of their relative positions, etc. Specular reflections on curved objects are
usually approximated using environment maps. In this paper, we present a new algorithm for real-time computation
of specular reflections on curved objects, based on an exact computation for the reflection of each scene vertex.
Our method exhibits all the required parallax effects and can handle arbitrary proximity between the reflector and
the reflected objects.
Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Computer Graphics]: Three-Dimensional
Graphics and Realism
D. Roger & N. Holzschuch / Accurate Specular Reflections in Real-Time
compute the accurate reflected position of each vertex in the scene, then interpolate between these positions. The advantage of our method is that it is computing the reflection of the object depending on the position on the reflector. We are therefore exhibiting all parallax effects, and we can handle proximity and even contact between the reflector and the reflected objects.

However, our method also has obvious limitations: as it is vertex-based and uses the graphics hardware for linear interpolation between the projections of the vertices, artifacts can appear if the model is not finely tessellated enough. These artifacts can be overcome using either adaptive tessellation or curvilinear interpolation. If the model is finely tessellated, these artifacts are not visible. Our algorithm provides solutions for situations where no convincing solutions existed before.

Our paper is organized as follows: in the next section, we review previous work on real-time computation of specular reflections. Then, in section 3, we present our algorithm for computing vertex-based specular reflections on curved surfaces. In section 4, we present experiments on various scenes and comparisons with existing methods. Finally, in section 5, we conclude and present future directions for research.

2. Previous Works

Ray-tracing has historically been used to compute reflections on specular objects. Despite several advances using either highly parallel computers [WSB01, WBWS01, WSS05] or GPUs [CHH02, PBMH02], ray-tracing is not, currently, available for real-time computations on a standard workstation.

Planar specular reflectors are easy to model, at the cost of a second rendering pass, with a camera placed in the mirror position of the viewpoint [McR96]. Curved reflectors are more complex; the easiest method uses environment mapping [BN76].

Environment mapping computes an image of the scene and maps it on the reflector as if it was located at an infinite distance. The reflection only depends on the direction of the incoming vector from the viewpoint, and can be easily computed in real-time on graphics hardware. Obviously, environment mapping suffers from parallax issues, since the reflection depends on a single image computed from a single point of view. There is also the question of accuracy: since all objects are assumed to be at an infinite distance, their reflection is not necessarily accurate, and the difference becomes larger as the object gets closer to the reflector.

There has been much research to improve the original environment mapping algorithm. To remove the parallax issues, Martin and Popescu [MP04] interpolate between several environment maps. Yu et al. [YYM05] used an environment light-field, containing all the information of a light field, but organized like an environment map. Both methods remove parallax issues, at the cost of a longer precomputation time. The specular reflector is also restricted, and can only be moved inside the area where the light field or the environment maps were computed. If it is moved outside of this area, the environment light field must be recomputed, a costly step.

Other research has dealt with distance-based reflection. The simplest method is to replace the infinite-radius sphere associated with the environment map by a finite-radius sphere [Bjo04]; the reflection changes with the position of the reflector in the environment, but parallax effects cannot be modeled.

More accurate methods use the Z-buffer to compute a distance map along with the environment map. For each pixel of the environment map, they know both its color and the distance to the center of the reflected object. Patow [Pat95] and Szirmay-Kalos et al. [SKALP05] used this information to select the proper pixel inside the environment map. Their reflections change depending on the distance between the reflector and the reflected object. Szirmay-Kalos et al. [SKALP05] use the GPU for a fast computation of the reflected pixel, and achieve real-time rendering for moderately complex scenes. Still, image-based methods are inherently limited to the information included in the original image.

For planar reflectors, the easiest way to compute the reflection is vertex-based, using an alternative camera to compute the image of the scene as reflected by the planar reflector. For curved reflectors, there is no simple rule to tell the position of the reflection of the objects. Even for a finite-radius sphere, the simplest specular reflector, the position of the reflection depends on a 4th-order polynomial.

Mitchell and Hanrahan [MH92] used the equation of the underlying surface to compute the characteristic points in the caustic created by a curved reflector. Ofek [Ofe98] and Ofek and Rappoport [OR98] computed the explosion map to find intersected triangle IDs based on the reflected vector. Chen and Arvo [CA00b, CA00a] used ray-tracing to compute the reflection of some vertices, then applied perturbation to these reflections to compute the reflection of neighboring vertices.

Estalella et al. [EMD∗05] computed the reflection of scene vertices on curved specular objects by an iterative method. At each iteration, the position of the reflection of the vertex is modified, using the angles between the normal, the vertex and the viewpoint, in the direction where these angles will follow Descartes’ law. They did a fixed number of iterations, and have implemented the method only on the CPU. In a subsequent work, developed concurrently with ours, Estalella et al. [EMDT06] extended this work to the GPU, searching the position of the reflection of the vertex in image space.

Our method is comparable to that of Estalella et al. [EMD∗05, EMDT06], but we use a different refinement
(a) Example of successive triangles generated by our algorithm. (b) Example image rendered with our algorithm. (c) Number of iterations required for convergence.
Figure 6: Computing the illumination of the reflected scene: illumination at the reflected point is computed using its BRDF, with $\overrightarrow{VL}$ and $\overrightarrow{VP}$ as incoming and outgoing directions; it is then multiplied by the BRDF on the reflector, with $\overrightarrow{PV}$ and $\overrightarrow{PE}$ as incoming and outgoing directions.

Figure 7: For a ray originating from the eye, we have to resolve visibility issues both between P and P′, on the reflector, and between V and V′, on the reflected ray.

3.3.5. Direction-dependent lighting on the reflected scene

When we display a fragment of the reflected scene, we know its spatial position V and the approximate spatial position of its reflection P. We use this information to compute directionally-dependent lighting:
• compute illumination at point V, using its BRDF, with the light source L as the incoming direction and the reflected point P as the outgoing direction (see Figure 6).
• multiply this by the BRDF of the specular reflector at point P, using the reflected point V as the incoming direction, and the viewpoint E as the outgoing direction.

This simple rule allows us to have directional lighting on the reflected scene. The lighting on the reflected scene is thus not necessarily the same as the lighting on the original scene.

3.3.6. Multiple Hidden-Surface Removal

Hidden surface removal requires special handling, as we have several possible sources of occlusion (see Figure 7):
• pre-render the frontmost back-facing polygons of the reflector into a depth texture; clear the Z-buffer and frame-buffer.
• render the scene, with lighting and shadowing; clear the stencil-buffer.
• render the reflector, with hidden surface removal. For pixels that are touched by the reflector, set the stencil buffer to 1.
• clear the depth buffer and render the reflected scene using our algorithm. The fragments generated are discarded if the stencil buffer is not equal to 1 (using the classical stencil test) and if they are further away than the back-faces of the reflector (using the depth texture computed at the first step).
• (optional) enable blending and render the reflector, computing its illumination.

Our strategy correctly handles occlusions between the reflector and the scene (using the stencil test), as well as self-occlusion of the reflector, using the depth texture. Note that we have to use frontmost back-facing polygons: using the frontmost front-facing polygons would falsely remove all the reflected scene for locally convex reflectors, since we are linearly interpolating between reflected points that are on the surface of the reflector.

3.3.7. GPU implementation

We have implemented our algorithm on the GPU for better efficiency. To compute the reflected position of one vertex, we need access to the equation and derivatives of the specular reflector. Since we stored these in a texture to handle arbitrary specular reflectors, this limits us to two possible implementation strategies:
• place our algorithm in a vertex shader, using graphics hardware with vertex texture fetch (NVidia GeForce 6 and above).
• place our algorithm in a fragment shader and render the reflected positions of the vertices into a Vertex Buffer Object. In a subsequent pass, render this VBO. This requires hardware with render-to-vertex-buffer capability, which was not available to us at the time of writing.
(a) Our method (b) Ray traced reference (c) Environment mapping
Figure 8: Comparison of our results (left) with ray-tracing (center, for reference) and environment-mapping (right). The differences are especially visible for objects that are close to the kettle, such as its handle and the right hand of the character.
Figure 9: Our algorithm is able to display objects that are not visible from the center of the reflector. Notice here how the back
of the chair is properly rendered.
(a) The bar is not tessellated, and its reflection is not curved as it should be. (b) The problem disappears if we tessellate the bar.
Figure 12: The scene has to be well tessellated, or artifacts appear because we cannot render curved triangles.
[EMDT06] Estalella P., Martin I., Drettakis G., Tost D.: A GPU-driven algorithm for accurate interactive reflections on curved objects. In Rendering Techniques 2006 (Proc. EG Symposium on Rendering) (June 2006).
[McR96] McReynolds T.: Programming with OpenGL: Advanced rendering. Siggraph ’96 Course, 1996.
[MH92] Mitchell D., Hanrahan P.: Illumination from curved reflectors. Computer Graphics (Proc. of SIGGRAPH ’92) 26, 2 (1992), 283–291.
[MP04] Martin A., Popescu V.: Reflection Morphing. Tech. Rep. CSD TR#04-015, Purdue University, 2004.
[Ofe98] Ofek E.: Modeling and Rendering 3-D Objects. PhD thesis, Institute of Computer Science, The Hebrew University, 1998.
[OR98] Ofek E., Rappoport A.: Interactive reflections on curved objects. In Proc. of SIGGRAPH ’98 (1998), pp. 333–342.
[Pat95] Patow G. A.: Accurate reflections through a Z-buffered environment map. In Proceedings of Sociedad Chilena de Ciencias de la Computación (1995).
[PBMH02] Purcell T. J., Buck I., Mark W. R., Hanrahan P.: Ray tracing on programmable graphics hardware. ACM Transactions on Graphics (Proc. of Siggraph 2002) 21, 3 (July 2002), 703–712.
[SKALP05] Szirmay-Kalos L., Aszódi B., Lazányi I., Premecz M.: Approximate ray-tracing on the GPU with distance impostors. Computer Graphics Forum (Proceedings of Eurographics ’05) 24, 3 (2005).
[WBWS01] Wald I., Benthin C., Wagner M., Slusallek P.: Interactive rendering with coherent ray tracing. Computer Graphics Forum (Proc. of EUROGRAPHICS 2001) 20, 3 (2001).
[WSB01] Wald I., Slusallek P., Benthin C.: Interactive distributed ray tracing of highly complex models. In Rendering Techniques 2001 (Proc. 12th EUROGRAPHICS Workshop on Rendering) (2001), pp. 277–288.
[WSS05] Woop S., Schmittler J., Slusallek P.: RPU: a programmable ray processing unit for realtime ray tracing. ACM Transactions on Graphics (Proc. of Siggraph 2005) 24, 3 (2005), 434–444.
[YYM05] Yu J., Yang J., McMillan L.: Real-time reflection mapping with parallax. In Proc. I3D 2005 (2005), pp. 133–138.
4.7. ARTICLES
4.7.6 Wavelet radiance transport for interactive indirect lighting (EGSR 2006)
Authors: Janne Kontkanen, Emmanuel Turquin, Nicolas Holzschuch and François X. Sillion
Conference: Eurographics Symposium on Rendering 2006.
Date: June 2006
Eurographics Symposium on Rendering (2006)
Tomas Akenine-Möller and Wolfgang Heidrich (Editors)
Abstract
Global illumination is a complex all-frequency phenomenon including subtle effects caused by indirect lighting.
Computing global illumination interactively for dynamic lighting conditions has many potential applications,
notably in architecture, motion pictures and computer games. It remains a challenging issue, despite the consid-
erable amount of research work devoted to finding efficient methods. This paper presents a novel method for fast
computation of indirect lighting; combined with a separate calculation of direct lighting, we provide interactive
global illumination for scenes with diffuse and glossy materials, and arbitrarily distributed point light sources. To
achieve this goal, we introduce three new tools: a 4D wavelet basis for concise radiance expression, an efficient
hierarchical pre-computation of the Global Transport Operator representing the entire propagation of radiance in
the scene in a single operation, and a run-time projection of direct lighting onto our wavelet basis. The resulting
technique allows unprecedented freedom in the interactive manipulation of lighting for static scenes.
Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Computer Graphics]: Three-Dimensional
Graphics and Realism
J. Kontkanen, E. Turquin, N. Holzschuch & F. X. Sillion / Wavelet Radiance Transport for Interactive Indirect Lighting
thus reduce the dimensionality of the emission to 2. Yet, local light sources (a.k.a. near-field illumination) have 5 degrees of freedom, that can be narrowed down to 4 without loss of generality if we consider that light travels through a vacuum; this high dimensionality tends to make classical PRT methods extremely costly.

In this paper, we present a technique for interactive computation of global illumination in static scenes with diffuse and glossy materials, and arbitrarily placed dynamic point/spotlights. Our algorithm uses a precomputed Global Transport Operator that expresses the relationship between incident and outgoing surface radiance. During run-time we project the direct light from the light sources to the surfaces, and apply this precomputed operator to get full global illumination. Rather than following the common compute, then compress scheme, we try to generate the operator directly in a compact representation.

Our contributions are: a new 4D wavelet basis for compact representation of radiance, a method to efficiently compute the Global Transport Operator, greatly speeding up the precomputation time, and a method to efficiently project direct lighting from point light sources on our hierarchical basis at runtime. These three contributions, combined together, result in interactive manipulation of light sources, with immediately visible results in the global lighting.

The most noticeable limitation of our approach is directly linked to a well-known problem of finite-element methods for global illumination: our basis functions have to be expressed on the surfaces of the scene. Incidentally, our example scenes are exclusively composed of large quads. Another important limitation is that BRDFs must be relatively low-frequency to be efficiently representable in our wavelet basis.

2. Previous work

Global illumination has been the subject of research in Computer Graphics for decades. Dutré et al. [DBB03] give a complete survey of the state of the art of global illumination techniques. There have been plentiful research efforts to speed up global illumination computations and achieve real-time or interactive framerates.

Ray tracing has been ported to the GPU [PBMH02, PDC∗03] or to specific architectures [WSB01, SWS05]. The same has been done with the radiosity algorithm [Kel97, CHL04], while others use the GPU for fast computation of hierarchical form factors [LMM05].

Nijasure et al. [NPG05] compute a representation of the incident radiance at several sample points sparsely covering the volume enclosed by the scene. Incident radiance is stored using spherical harmonics. Spherical harmonics coefficients are interpolated between the sample points and applied to the surfaces of the scene. The system can be iterated to compute multiple bounces of light, at the expense of rendering time. This approach can be seen as a dynamic generalization of Greger's irradiance volumes [GSHG98].

Dachsbacher and Stamminger [DS06] introduce an extended shadow map to create first-bounce indirect light sources. They splat the contribution of these sampled sources onto the final image using deferred shading. They only compute the first indirect light bounce, without taking visibility into account. Nevertheless, they observe that the results look plausible in most situations.

Despite using GPUs or even custom hardware, the above methods currently barely run interactively, unless they restrict themselves to small scenes or degrade the accuracy of the simulation.

In a separate research direction, PRT techniques [Leh04] precompute light exchanges and store the relationship between incoming lighting and outgoing global illumination on the surface of an object. The result of these precomputations, the light transport operator, is compressed and used at runtime for interactive display of global illumination. Most PRT techniques start by precomputing the light transport operator with great accuracy, then compress it, typically using clustered principal component analysis [SHHS03].

The cost of the uncompressed light transport is directly related to the degrees of freedom (DOF) $n$ in the operator, growing with $O(k^n)$. As discussed in section 1, the general expression for emission space, assuming no participating media, has 4 DOF. Given that the outgoing surface radiance also has 4, the general form of the operator ends up with 8 DOF.

To keep memory and precomputation costs tractable, most PRT techniques somehow restrict these degrees of freedom. It is generally achieved by assuming infinitely distant lighting, as done by Sloan et al. [SKS02] and many others. Another option is to fix the locations of the light sources in space [DKNY95]. Yet another one is to fix the viewpoint [NRH03]: in this work, Ng et al. demonstrate that all-frequency lighting from an infinitely distant environment can be rendered efficiently by using a light transport operator expressed in the Haar wavelet basis and non-linearly compressed. The fixed viewpoint restriction applies when the scene contains glossy materials. In a subsequent publication [NRH04] the authors remove this restriction by introducing triple wavelet product integrals. As a result they are able to generate high quality pictures that solve the 6-dimensional transport problem, but not with real-time or interactive rates.

Haar wavelets have successfully been used by others [LSSS04, WTL06] to efficiently express all-frequency transport from detailed environment maps to glossy surfaces. These methods utilize a separable decomposition, consisting of a purely light-dependent term and a purely view-dependent term.
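The $O(k^n)$ cost argument above can be made concrete with a small back-of-the-envelope sketch; the value k = 32 below is an arbitrary illustrative per-dimension resolution, not a figure from the paper:

```python
# Rough size of an uncompressed transport operator with n degrees of
# freedom (DOF), assuming k basis functions per dimension: O(k^n) entries.
# k = 32 is an arbitrary illustrative choice, not a value from the paper.

def operator_entries(k: int, dof: int) -> int:
    """Number of coefficients in a dense operator over `dof` dimensions."""
    return k ** dof

k = 32
full_8dof = operator_entries(k, 8)     # 4 DOF incident + 4 DOF outgoing
restricted_4dof = operator_entries(k, 4)  # e.g. distant lighting, fixed view

print(full_8dof)                        # 32^8 = 1099511627776 coefficients
print(full_8dof // restricted_4dof)     # restricting DOF saves a factor 32^4
```

This is why PRT methods either restrict degrees of freedom or, as in this paper, build the operator directly in a compressed hierarchical representation.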
The building blocks of the Haar basis are the following smooth function φ and wavelet ψ:

$$\phi(x) = \begin{cases} 1 & \text{for } 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \qquad \psi(x) = \begin{cases} 1 & \text{for } 0 \le x < 1/2 \\ -1 & \text{for } 1/2 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \quad (1)$$

All the wavelets and smooth functions of the Haar basis are formed by scales and translates of the above elementary functions as follows:

$$\phi_{ij}(x) = \phi(2^i x - j), \qquad \psi_{ij}(x) = \psi(2^i x - j) \quad (2)$$

where i gives the scale, and j gives the translation. For a comprehensive introduction to Haar wavelets and wavelets in general we refer to [SDS96].

Multi-dimensional wavelet bases are usually formed by combining one-dimensional wavelet bases. There are two systems for creating multi-dimensional wavelet bases: the standard refinement (see Figure 2a), where the dimensions are refined separately, and the non-standard refinement (see Figure 2b), where refinement is performed alternately along all dimensions.

The non-standard refinement method merges together the different dimensions, treating them equally. As a consequence, it is more widely used in fields such as Image Analysis and Image Synthesis, where the two spatial dimensions serve an equal purpose.

For Radiance computations, the spatial and angular dimensions are not equivalent. A surface can exhibit large variations on the spatial domain and be more continuous over the angular domain, and linking the resolutions of the spatial and angular dimensions is not always efficient. For this reason, we decouple the spatial and angular domains, using standard refinement between these dimensions. For the 2D sub-domains for angular and spatial dimensions, we still use non-standard refinement. Our wavelet basis for 4D radiance therefore uses a combination of standard and non-standard refinement.

For the angular domain, the hemisphere of directions is mapped to the unit square using a cosine-weighted concentric map [SC97]; we then apply wavelet analysis over the unit square. Using this mapping allows a pre-integrated cosine on the hemisphere of directions, with a low angular distortion and constant area mapping.

[Figure 3: Light transport from sending basis function $b^s_s(y)\,b^s_a(\omega)$ to receiving basis function $b^r_s(x)\,b^r_a(\alpha)$.]

4.2. Wavelet Basis for Transport Operator

The projected transport operator consists of coefficients that describe the influence of each basis function on all the other ones. The non-standard operator decomposition is a more common choice in hierarchical radiosity, as in theory it gives a more compact representation than the standard decomposition. In spite of this we chose to follow [CSSD94] and used the standard operator decomposition. We see two advantages in using the standard decomposition: it decouples the resolution for sender and receiver, and there is no need for a push-pull step. The former is an obvious advantage, for example when the sender and receiver differ greatly in size or in complexity.

The latter requires an explanation: conventional global illumination methods, using the non-standard representation, require a push-pull step between light bounces [HSA91]. For these methods, the cost of the push-pull step is not prohibitive. However, we are using the DTO to compute the Global Transport Operator, using Neumann series (see section 4.4). During this computation, we perform several multiplications between operators. During these operator multiplications, the fact that we do not require a push-pull step greatly accelerates the computation.

4.3. Direct Transport Operator

We compute the Direct Transport Operator to express a single bounce of light. As we are going to conduct operator multiplications, we require the output space of the DTO to be equal to its input space. This leaves a choice: either we express the DTO in terms of incident radiance or in terms of outgoing radiance. We chose to use the incident form of the Direct Transport Operator.

The incident form of the Direct Transport Operator is defined as follows:

$$(TL)(x, x \leftarrow y) = \int f_r(\omega, y, y \rightarrow x)\, V(x, y)\, \lfloor \omega \cdot n_y \rfloor\, L(y, \omega)\, d\omega \quad (3)$$

The transport operator maps the incident radiance arriving at location y from direction ω to incident radiance at another location x from direction x ← y. Given a certain distribution of incident radiance, applying this operator once gives the
distribution of light that has been reflected once from the surfaces of the scene. Here $f_r$ refers to the BRDF and $V$ to the visibility term. Along with the $\lfloor \omega \cdot n_y \rfloor$ term, they form the kernel $k(x, y, \omega)$ of the light transport operator.

The projected form of the transport operator is obtained by integrating the 6D kernel against each 8D wavelet, in a similar fashion to [CSSD94]:

$$\int K(x, y, \omega)\, b^s_s(y)\, b^s_a(\omega)\, b^r_s(x)\, b^r_a(x \leftarrow y)\, d\omega\, dx\, dy \quad (4)$$

where $K(x, y, \omega) = \frac{\lfloor (x \leftarrow y) \cdot n_y \rfloor}{r_{xy}^2}\, k(x, y, \omega)$ and $b^r_s$, $b^r_a$, $b^s_s$ and $b^s_a$ refer to the elementary non-standard basis functions of the receiving spatial, receiving angular, sending spatial and sending angular dimensions, respectively (see section 4.1). x and y are integrated over surfaces, while ω is integrated over the hemisphere oriented according to the corresponding surface normal. For a visual illustration, see Figure 3.

In the context of light transport, a wavelet coefficient obtained from Equation 4 has traditionally been called a link. We will use this term to refer to a group of coefficients for 8D basis functions sharing the same support on all 2D sub-spaces. In practice, this means that each link corresponds to 255 wavelets and a single smooth function coefficient. This can be seen by considering that each 2D sub-space has 4 elementary non-standard basis functions that share the same support, and $4^4 = 256$. As an example of elementary 2D functions, see the four functions in the lower left corner of Figure 2b.

We compute the Direct Transport Operator by progressively refining the existing links. We start by creating interactions between the coarsest level basis functions in the scene, and then refine these. At each step, we consider 256 basis function coefficients. Note, however, that not all the 256 coefficients are stored. We only store the necessary parts of a link: a link between two diffuse surfaces does not need wavelets in the angular domain. In practice, each link contains between 1 and 256 wavelet coefficients depending on its type.

A refinement oracle (see section 4.3.2) tells us whether a link needs to be refined. For each link it has a choice to refine in any of the 2D sub-domains (spatial receiver, angular receiver, spatial sender, angular sender). The refinement oracle may independently choose each option, possibly refining both the sender and the receiver in space and angle, or simply refining the receiver in space, or any combination.

When the link is refined, we create the child wavelets, and recursively consider each newly created link. Consider a refinement of one of the spatial basis functions: when a spatial basis function is refined, four new child links are created (the spatial patch is divided into four child patches). However, when two of the 2D sub-domains are refined, there will be $4 \times 4 = 16$ child links to consider, and finally if all the dimensions are refined $4^4 = 256$ child links are created.

When performing progressive refinement in the standard basis, the refinement may arrive at a certain link from several parent links. This means that when we chose the standard method for combining dimensions in our 8D basis, we partially lost the tree property of our basis.

However, we use the same solution for this problem as was used in conventional wavelet radiance [CSSD94]. If, in the refinement process, we arrive at an 8D wavelet coefficient that has already been visited, we terminate the traversal. The difference with conventional wavelet radiance is that we have four independent subspaces instead of two.

An important point in our algorithm is that, as with Wavelet Radiance [CSSD94], we are computing wavelet transport coefficients directly between wavelet coefficients, and not between smooth functions. This eliminates the need for the push-pull step.

4.3.1. Numerical Integration

The actual coefficients corresponding to each link are computed by generating quasi-randomly distributed samples in the support area of the link. Thus, we are computing Equation 4 by quasi-Monte Carlo integration.

The coefficients of the coarsest links are difficult to compute accurately without a significantly large amount of samples. On the other hand, the finer scale wavelets do not require as many samples since within a smaller support the kernel does not deviate as much. Because of this we adopt the adaptive integration procedure used in Wavelet Radiance [CSSD94]: we first refine the link structure to the finest level and then perform a wavelet transform to compute the coarser links in terms of the finer ones. As a result, only the finest scale wavelet coefficients are computed directly. In our implementation, this procedure is done during a single recursive visit.

4.3.2. Refinement Oracle

The refinement oracle considers each link, i.e. a cluster of coefficients of wavelets sharing the same support, at a time. It works by testing quasi-random samples of the kernel, and using explicit knowledge of the BRDF. If the oracle finds that the operator is smooth, then the refinement stops and the kernel samples are used to compute the wavelet coefficients.

At each refinement step, the refinement oracle has to select whether to refine the sender or the receiver, or both, and whether to refine them spatially or angularly, or both. Thus, the oracle can refine between 0 and 4 dimensions, resulting in 16 possible combinations. The ability to make an independent refinement decision in each sub-space is a consequence of using standard refinement as described in sections 4.1 and 4.2.

The decision to refine the interaction in the angular domain is based solely on the BRDFs of the sender and receiver, unless the sender and the receiver are mutually invisible, in which case the interaction is not refined. The basis
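The bottom-up integration strategy of Section 4.3.1 — sample only the finest level, then derive coarser coefficients by a wavelet transform — can be sketched in one dimension with the unnormalized Haar transform of Equations 1 and 2. This is an illustrative sketch, not the paper's 4D implementation, and it assumes a power-of-two input length:

```python
def haar_analysis(finest: list[float]) -> list[float]:
    """Turn finest-level values into [overall average, detail coefficients...].
    Coarser coefficients are derived purely from finer ones, mirroring the
    single recursive visit used to compute link coefficients bottom-up."""
    smooth, details = list(finest), []
    while len(smooth) > 1:
        pairs = list(zip(smooth[0::2], smooth[1::2]))
        details = [(a - b) / 2.0 for a, b in pairs] + details  # wavelet part
        smooth = [(a + b) / 2.0 for a, b in pairs]             # smooth part
    return smooth + details

def haar_synthesis(coeffs: list[float]) -> list[float]:
    """Inverse transform: reconstruct the finest-level values exactly."""
    values = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(values)]
        values = [v for s, d in zip(values, details) for v in (s + d, s - d)]
        pos += len(details)
    return values

samples = [4.0, 2.0, 5.0, 5.0]
coeffs = haar_analysis(samples)
print(coeffs)                  # [4.0, -1.0, 1.0, 0.0]
print(haar_synthesis(coeffs))  # [4.0, 2.0, 5.0, 5.0]
```

The round trip is lossless, which is what makes it safe to compute only the finest-level coefficients by quasi-Monte Carlo sampling and obtain all coarser links from them.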
In order to use the GTO to generate indirect lighting, we need to project the light from the dynamic light sources to the 4D radiance basis defined on the surfaces of the scene.

For each light source and for each surface of the scene, our method proceeds as follows:

1. Estimate the level of precision required.
2. Compute all the smooth coefficients at this level of precision, by integrating direct lighting on the support of each coefficient.
3. Perform a wavelet transform on these smooth coefficients to compute the wavelet coefficients, then discard the smooth coefficients. This generates a wavelet representation of the direct light on all the surfaces of the scene.

This projection of direct lighting onto our wavelet basis is fundamental for interactive rendering, so it is important to perform these computations efficiently. Unfortunately, step 2 involves computing direct lighting for all the smooth coefficients, a costly step for arbitrary light sources.

To estimate the level of precision required (step 1), we look at the solid angle subtended by the geometry of the object, multiplied by the intensity of the light source in the direction of the object.

The computation of the smooth coefficients (step 2) involves computing direct lighting in the scene, including visibility between the light source and the support of the smooth coefficient. In our implementation, we tried both area light sources and point light sources, but we found that only point light sources were currently compatible with interactive framerates.

For point light sources, we compute the visibility using occlusion queries, i.e. we render the smooth functions from the viewpoint of the light source using the GL_ARB_occlusion_query extension of OpenGL, and estimate the solid angle each basis function subtends based on the number of visible pixels. Our current implementation only supports direct light projection to directionally smooth basis functions. This means that direct light falling on a specular surface gets reflected as if the surface was diffuse.

To benefit from our sparse wavelet representation for surface radiance, the elements need to be dynamically (de-)allocated. To avoid an excessive amount of dynamic memory management we use the following method: before projecting the direct light at each frame we set the existing allocated coefficients to zero. Then we project the light as described, and after the projection we de-allocate the entries that are still null. This minimizes the amount of dynamic allocations and de-allocations required during run-time.

Once we have a wavelet projected representation of direct incident lighting, we multiply it by the GTO to give the converged incident radiance:

$$X = GE$$

where E represents the projected direct light, G is the GTO, and X is the resulting converged incident radiance. All the wavelet representations above are in sparse format, so that only non-zero coefficients are stored.

For efficient multiplication, it is important to take advantage of the sparseness of E: typically the direct light can be expressed with a small number of wavelet coefficients, since it is often either spatially localized or falling from a far away light source, in which case only coarse basis functions are present in E. We perform the multiplication by considering only the non-zero elements of E and accumulating the results into X.

We use the same technique to minimize the amount of dynamic memory allocations in X as we used for computing E (section 5.1).

5.3. Multiplication by the BRDF

X represents the incident indirect radiance, and yet we need the outgoing radiance for display. Thus, we need a final multiplication by the BRDF. In our implementation, we associate a wavelet representation of the BRDF with each surface of the scene and this step simply translates into a multiplication in wavelet space. Note that we use the same wavelet representation that the oracle uses to determine the angular refinement (section 4.3.2).

5.4. Rendering from the Wavelet Basis

To generate the final view for the user, we first render the scene representing the indirect light, and then additively blend in the direct light using standard techniques.

The indirect light is synthesized from the 4D wavelet basis to textures using the CPU. Then the whole scene is drawn using standard texture mapping and optionally bi-linear filtering (results without this filtering can be seen in Figure 4).

To get rid of the discontinuities that would appear between neighboring coarse level quads, we use border texels (supported by standard graphics hardware) to ensure a smooth reconstructed result across the edges of quads.

Each quad is associated with its own texture, and thus it is possible to use a specific texture resolution for each quad. In our current implementation we select the texture resolution according to the maximum of the spatial and angular resolutions present in the wavelet basis.

The texture synthesis is performed by traversing all non-zero wavelet coefficients for a given quad. For performance,
[Figure 5: For texture synthesis, we traverse the wavelet hierarchy in the order shown here. We terminate the angular traversal as soon as we detect that the angular sub-tree points away from the viewer.]

[Figure 7: GTO error in the maze scene as a function of the threshold on wavelet coefficients (log-log plot; one curve for the GTO error, one for the gathered GTO error).]
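The run-time multiplication X = GE described above exploits the sparseness of E. The sketch below uses a hypothetical dict-of-dicts layout, not the authors' actual data structure, and accumulates only over the non-zero coefficients of E:

```python
# Sparse application of the Global Transport Operator: X = G @ E.
# G maps each source coefficient index to its non-zero output entries;
# E holds only the non-zero direct-light coefficients. Both layouts are
# illustrative choices, not the paper's implementation.

def apply_gto(G: dict[int, dict[int, float]],
              E: dict[int, float]) -> dict[int, float]:
    X: dict[int, float] = {}
    for src, e in E.items():                  # iterate non-zeros of E only
        for dst, g in G.get(src, {}).items():
            X[dst] = X.get(dst, 0.0) + g * e  # accumulate into X
    return {k: v for k, v in X.items() if v != 0.0}  # drop null entries

G = {0: {0: 0.5, 1: 0.25}, 1: {1: 0.1}}
E = {0: 2.0}  # spatially localized direct light: a single coefficient
print(apply_gto(G, E))  # {0: 1.0, 1: 0.5}
```

Dropping entries that remain null after accumulation mirrors the zero-then-deallocate strategy the paper uses to limit dynamic memory management at run-time.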
Table 1: Summary of the performance of our algorithm. All matrices were computed with a single 3 GHz Pentium 4.

                       Cornell   Cornell hi-res   Maze     Maze hi-res   G       G hi-res
t_DTO (precomp.)       2 min     25 min           23 min   1 h 12 min    1 min   9 min
t_GTO (precomp.)       < 1 s     2 s              40 s     1 min         < 1 s   1 min
FPS (run-time)         60        25               15       7             8       3
Links DTO              3477      30366            65501    288628        4260    36778
Links GTO              418       648              24151    24599         1712    34176
Links gathered GTO     14169     53100            164813   589361        44383   195037
Memory cons. in MB     1.7       6.4              19.7     70            5.3     23.4

[Figure 8: Memory cost of the GTO as a function of the error on the operator (maze scene); curves for the GTO and the gathered GTO.]

[Figure 9: Rendering times for the different steps of our run-time component (direct lighting, projection of direct lighting onto the wavelet basis, indirect lighting). For each scene (Cornell, Maze), we tried a high resolution (moderately compressed) and a low resolution (aggressively compressed) GTO.]

cost of the GTO and the error it represents (see Figure 8). For both versions of the GTO, the error decreases as the memory cost increases. We observe that, surprisingly, the GTO outperforms the gathered GTO: for a given error, it always provides a more compact representation of the operator. Even so, this compact representation does not always translate into visual quality (see Figure 4): comparing the two representations of the GTO with similar error levels, we found that the gathered GTO gives better visual results. The non-linear compression we used for computing the GTO removes links based on the energy they represent. However, the visual quality of the image is not directly linked to energy levels, but also to other spatial information.

6.2. Runtime component

We analyzed the performance of our runtime component. Timings for each step can be seen in Figure 9. We tried two different scenes, each of them with a different level of compression of the GTO. All these rendering times correspond to observed framerates.

Computations related to direct lighting only depend on scene complexity, as the projection of direct lighting does not depend on the accuracy of the GTO. Computing the projection of direct lighting is about as expensive as the computation of visible direct lighting on the GPU.

Not surprisingly, the computations related to indirect lighting (multiplying the direct lighting by the GTO, the multiplication by the BRDF and conversion to textures) dominate, especially when using a high quality GTO (moderate compression). We observe that the rendering time for indirect lighting is related to the number of coefficients in the GTO.

7. Conclusions and Future work

In this paper, we have presented a novel algorithm for fast computation of indirect lighting. Combined with a separate computation of direct lighting, our algorithm allows interactive global illumination.

Our algorithm makes use of three different contributions: first, a new wavelet basis for efficient storage of radiance, using standard refinement to separate the angular and spatial dimensions; secondly, a hierarchical precomputation method for PRT; and third, a fast projection of direct lighting on our basis. Our method works in a top-down approach, and therefore aims to only compute the information that is necessary for PRT computations.

Our main limitation is inherited from finite element methods: the elements (i.e. the basis functions) need to be mapped to the surfaces of the scene. In the simplest case, the scene needs to initially consist of large quads as in the examples of this paper. Nonetheless, as future work we wish to study the possibility of relieving these restrictions in the spirit of bi-scale radiance transfer [SLSS03]. This would require an easily parameterizable coarse version of the scene
[Result images: direct, indirect, and combined lighting.]
and a method to transfer lighting from this coarse surface to a finer one.

Another direction for future work is establishing a more explicit relationship between the different compressions done in our algorithm. In the current method, we have separate thresholds for hierarchical refinement and the non-linear compression used during the Neumann series computation. It would be advantageous to only have a single threshold related to the quality of the GTO we want to obtain.

The oracle's ability to independently refine each sub-space is both a strength and a challenge. The refinement heuristics we presented are not optimal, since they do not take into account the dependence of directional and spatial dimensions, as explained by Durand et al. [DHS∗05]. For instance, it might not make sense to link a sender with a narrow angular support to a spatially large receiver. We believe that this is a very promising direction for future research and that clear performance improvements are possible.

Yet another idea concerns applying knowledge of run-time importance, projected from the camera to the surfaces of the scene. This projection could be used to speed up the GTO multiplication as we would know beforehand which basis functions really contribute to the image. So we could use only the parts of the GTO that actually have a visible effect on the result.

Finally, to leverage the full potential of our 4D representation, we plan to explore run-time projection of arbitrarily distributed area light sources instead of point lights. This would have to be coupled with an accurate display of direct lighting, which could benefit from the information we collect during the projection.

Acknowledgements

We would like to thank the Computer Graphics Group of Helsinki University of Technology, the ARTIS team, and our anonymous reviewers for their valuable feedback. This work has been supported by Bitboys, Hybrid Graphics, Remedy Entertainment, Anima Vitae, and the National Technology Agency of Finland.

References

[CHL04] Coombe G., Harris M. J., Lastra A.: Radiosity on graphics hardware. In Graphics Interface 2004 (2004).

[CSSD94] Christensen P. H., Stollnitz E. J., Salesin D. H., DeRose T. D.: Wavelet Radiance. In Photorealistic Rendering Techniques (Proc. of EG Workshop on Rendering) (June 1994), pp. 287–302.

[CSSD96] Christensen P. H., Stollnitz E. J., Salesin D. H., DeRose T. D.: Global Illumination of Glossy Environments Using Wavelets and Importance. ACM Transactions on Graphics 15, 1 (Jan. 1996), 37–71.

[DBB03] Dutré P., Bala K., Bekaert P.: Advanced Global Illumination. AK Peters, 2003.

[DHS∗05] Durand F., Holzschuch N., Soler C., Chan E., Sillion F. X.: A frequency analysis of light transport. ACM Transactions on Graphics (Proc. of SIGGRAPH 2005) 24, 3 (Aug. 2005).

[DKNY95] Dobashi Y., Kaneda K., Nakatani H., Yamashita H.: A quick rendering method using basis functions for interactive lighting design. Computer Graphics Forum (Proc. of EUROGRAPHICS 1995) 14, 3 (Sept. 1995).

[DS06] Dachsbacher C., Stamminger M.: Splatting indirect illumination. In Interactive 3D Graphics 2006 (2006).

[GSCH93] Gortler S. J., Schröder P., Cohen M. F., Hanrahan P.: Wavelet Radiosity. In SIGGRAPH ’93 (1993), pp. 221–230.

[GSHG98] Greger G., Shirley P., Hubbard P. M., Greenberg D. P.: The irradiance volume. IEEE Computer Graphics and Applications 18, 2 (March/April 1998), 32–43.

[HPB06] Hašan M., Pellacini F., Bala K.: Direct-to-indirect transfer for cinematic relighting. ACM Transactions on Graphics (Proc. of SIGGRAPH 2006) 25, 3 (Aug. 2006).

[HSA91] Hanrahan P., Salzman D., Aupperle L.: A Rapid Hierarchical Radiosity Algorithm. Computer Graphics (Proc. of SIGGRAPH ’91) 25, 4 (July 1991).

[KAMJ05] Kristensen A. W., Akenine-Möller T., Jensen H. W.: Precomputed local radiance transfer for real-time lighting design. ACM Transactions on Graphics (Proc. of SIGGRAPH 2005) 24, 3 (Aug. 2005).

[Kel97] Keller A.: Instant radiosity. In SIGGRAPH ’97 (1997), pp. 49–56.

[Leh04] Lehtinen J.: Foundations of Precomputed Radiance Transfer. Master’s thesis, Helsinki University of Technology, Sept. 2004.

[LMM05] Lum E. B., Ma K.-L., Max N.: Calculating hierarchical radiosity form factors using programmable graphics hardware. Journal of Graphics Tools 10, 4 (2005).

[LSSS04] Liu X., Sloan P.-P. J., Shum H.-Y., Snyder J.: All-frequency precomputed radiance transfer for glossy objects. In Rendering Techniques (Proc. of EG Symposium on Rendering) (2004), pp. 337–344.

[NPG05] Nijasure M., Pattanaik S. N., Goel V.: Real-time global illumination on GPUs. Journal of Graphics Tools 10, 2 (2005).

[NRH03] Ng R., Ramamoorthi R., Hanrahan P.: All-frequency shadows using non-linear wavelet lighting approximation. ACM Transactions on Graphics (Proc. of SIGGRAPH 2003) 22, 3 (July 2003), 376–381.

[NRH04] Ng R., Ramamoorthi R., Hanrahan P.: Triple product wavelet integrals for all-frequency relighting. ACM Transactions on Graphics (Proc. of SIGGRAPH 2004) 23, 3 (Aug. 2004), 477–487.

[PBMH02] Purcell T. J., Buck I., Mark W. R., Hanrahan P.: Ray tracing on programmable graphics hardware. ACM Transactions on Graphics (Proc. of SIGGRAPH 2002) 21, 3 (July 2002), 703–712.

[PDC∗03] Purcell T. J., Donner C., Cammarano M., Jensen H. W., Hanrahan P.: Photon mapping on programmable graphics hardware. In Graphics Hardware 2003 (2003), pp. 41–50.

[SC97] Shirley P., Chiu K.: A low distortion map between disk and square. Journal of Graphics Tools 2, 3 (1997), 45–52.

[SDS96] Stollnitz E. J., DeRose T. D., Salesin D. H.: Wavelets for Computer Graphics. Morgan Kaufmann, 1996.

[SHHS03] Sloan P.-P., Hall J., Hart J., Snyder J.: Clustered principal components for precomputed radiance transfer. ACM Transactions on Graphics (Proc. of SIGGRAPH 2003) 22, 3 (2003).

[SKS02] Sloan P.-P., Kautz J., Snyder J.: Precomputed radiance transfer for real-time rendering in dynamic, low-frequency lighting environments. ACM Transactions on Graphics (Proc. of SIGGRAPH 2002) 21, 3 (July 2002).

[SLSS03] Sloan P.-P., Liu X., Shum H.-Y., Snyder J.: Bi-scale radiance transfer. ACM Transactions on Graphics (Proc. of SIGGRAPH 2003) 22, 3 (2003).

[SWS05] Woop S., Schmittler J., Slusallek P.: RPU: A programmable ray processing unit for realtime ray tracing. ACM Transactions on Graphics (Proc. of SIGGRAPH 2005) 24, 3 (Aug. 2005).

[WSB01] Wald I., Slusallek P., Benthin C.: Interactive distributed ray tracing of highly complex models. In Rendering Techniques 2001 (Proc. EG Workshop on Rendering) (2001).

[WTL06] Wang R., Tran J., Luebke D.: All-frequency relighting of glossy objects. ACM Transactions on Graphics (to appear) (2006).
261
J. Kontkanen, E. Turquin, N. Holzschuch & F. X. Sillion / Wavelet Radiance Transport for Interactive Indirect Lighting
[Figure: comparison panels — Direct, Indirect, and Combined illumination]
5.
Conclusion and perspectives
We have developed three main themes in this dissertation: lighting simulation using multi-scale finite-element methods, the characterization of the properties of the illumination function, and the real-time or interactive simulation of certain lighting effects.
We summarize our contribution to each theme:
– Regarding finite-element lighting simulation, we demonstrated the efficiency of the hierarchical representation, including with higher-order wavelet bases. We also showed how strongly finite-element methods depend on the original mesh, and we developed a method to overcome the limitations of this mesh. Finally, we showed how to combine higher-order wavelets with discontinuity meshing.
– For the analysis of the properties of the illumination function, we developed a method to predict the local frequency content of the lighting, for each interaction, as a function of the obstacles encountered. We also developed a method for computing the derivatives of the illumination function.
– Finally, in the field of real-time rendering, we developed methods for the real-time simulation of specific lighting effects: soft shadows, specular reflections, and indirect lighting.
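The refinement oracle at the heart of hierarchical methods can be sketched as follows. This is a deliberately simplified, hypothetical illustration (1D patches, a crude point-to-point form-factor stand-in, a single source), not the actual algorithm of this thesis: a light transfer is represented at the coarsest level for which the estimated transported energy stays below a tolerance, and subdivided otherwise.

```python
import math

# Hypothetical sketch of hierarchical refinement for radiosity transport.
# All names and the form-factor estimate are illustrative only.

class Patch:
    def __init__(self, area, emission=0.0):
        self.area = area
        self.radiosity = emission
        self.children = []

    def subdivide(self):
        if not self.children:
            self.children = [Patch(self.area / 2), Patch(self.area / 2)]
        return self.children

def form_factor_estimate(dst, distance=1.0):
    # Crude stand-in estimate: F ~ A_dst / (pi * d^2).
    return dst.area / (math.pi * distance ** 2)

def refine_and_gather(src, dst, eps, min_area=1e-3):
    """Transfer energy src -> dst, subdividing dst while the estimated
    transported energy exceeds eps (the refinement oracle)."""
    transported = src.radiosity * form_factor_estimate(dst)
    if transported > eps and dst.area > min_area:
        for child in dst.subdivide():
            refine_and_gather(src, child, eps, min_area)
    else:
        dst.radiosity += transported
```

A loose tolerance keeps the transfer at the top level of the hierarchy; a tight one pushes it toward the leaves, which is what gives hierarchical methods their favorable link counts.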
These works encountered several limitations and difficulties, and raise a number of interesting open problems.
Finite-element methods, even hierarchical ones, remain strongly tied to the mesh used to represent the scene. This limitation is inherent to the finite-element representation, even though several methods have been developed that partially overcome it (clustering, face clustering, instantiation, virtual mesh...). Moreover, too large a share of the computation time is spent on effects that matter for the visual appearance of the scene, such as shadow boundaries, but matter less for the computation of indirect lighting.
The frequency analysis of the illumination function opens the way to many future studies. We have developed a tool to predict the frequency behavior of the lighting at each point. Much research remains to be done in this area, both on the effective computation of the frequency content within the scene and on the use of these frequencies in lighting simulation methods. Crucially, the time saved by exploiting the frequency content must exceed the time spent computing it.
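The cost–benefit argument can be made concrete with a small sketch (hypothetical names; the bandwidth estimates are assumed given, e.g. by a frequency analysis): a local bandwidth bound sets a per-region sample count via the Nyquist criterion, instead of sampling everything at the worst-case rate.

```python
import math

# Hypothetical sketch: turn a local bandwidth estimate (cycles per unit
# length) into a per-region sample count via the Nyquist criterion.

def samples_needed(bandwidth, extent, safety=2.0):
    """Nyquist: at least 2 * bandwidth samples per unit of extent."""
    return max(1, math.ceil(safety * bandwidth * extent))

def adaptive_budget(regions):
    """regions: list of (bandwidth, extent) pairs -> per-region counts."""
    return [samples_needed(b, e) for b, e in regions]
```

For instance, a smooth indirect-lighting region (bandwidth 0.5, extent 10) needs 10 samples while a sharp shadow boundary (bandwidth 50, extent 1) needs 100, for 110 in total against 1000 for uniform worst-case sampling; the saving pays off only if estimating the bandwidths costs less than the avoided samples.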
Finally, the use of graphics hardware for simulating lighting effects is a very promising area. By simulating certain lighting effects on the graphics card, it is possible to increase the realism of lighting simulations while reducing computation time. At the same time, these programmable cards have limitations: they follow a SIMD model, with no communication between the individual processors, with a limited instruction count... The algorithms that will best exploit the power of these cards are those that adapt to these limitations. In general, they are best suited to simulating local or semi-local phenomena.
5.1 Perspectives
Our goal for future work is to achieve photorealistic simulation of global illumination in an arbitrary scene, in real time. To reach this goal, we will pursue several research directions:
– First, it is necessary to be able to simulate in real time the full set of phenomena tied to direct lighting. Direct-lighting computations with arbitrary reflectance functions and area light sources, soft-shadow computations, and reflections on glossy surfaces are some of the local or semi-local phenomena that we want to simulate photorealistically in real time.
– Second, a number of phenomena tied to indirect lighting are only observable in the immediate vicinity of objects, such as the ambient occlusion caused by an object, or the reflections caused by semi-diffuse BRDFs. For these phenomena, one would attach to each moving object a zone of influence, inside which the effect would be computed. This zone of influence could carry a set of precomputed coefficients for the simulation.
– In lighting simulation, several algorithms are usually available for a given effect, or several parameter settings for a given algorithm. Choices must therefore be made, and to guide them we propose to use our frequency analysis of lighting. One could then select the best-suited algorithm, or reduce the sampling rate for low-frequency phenomena.
– This frequency analysis of lighting also has applications in offline lighting simulation. Our approach should make it possible to guide photon-mapping computations, or to adapt the spatial sampling in Precomputed Radiance Transfer methods.
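The zone-of-influence idea from the second direction above can be sketched in a few lines (a hypothetical, heavily simplified 1D illustration; in practice the precomputed coefficients would live on a 3D grid attached to the moving object): occlusion values are tabulated once in the object's local frame, and a run-time lookup returns zero outside the zone.

```python
# Hypothetical sketch of a per-object "zone of influence" for ambient
# occlusion. 1D for brevity; a real implementation would use a 3D grid.

class OcclusionZone:
    def __init__(self, radius, resolution, occlusion_fn):
        self.radius = radius
        self.res = resolution
        step = 2 * radius / resolution
        # Precompute occlusion at the cell centers of the local grid.
        self.grid = [occlusion_fn(-radius + (i + 0.5) * step)
                     for i in range(resolution)]

    def sample(self, local_x):
        """Occlusion at a point given in the object's local frame."""
        if abs(local_x) >= self.radius:
            return 0.0  # outside the zone: the object contributes nothing
        i = int((local_x + self.radius) / (2 * self.radius) * self.res)
        return self.grid[min(i, self.res - 1)]
```

Because the table moves rigidly with the object, only the world-to-local transform changes per frame; the coefficients themselves are computed once.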
List of publications
International journals
[1] Mattias Malmer, Fredrik Malmer, Ulf Assarsson and Nicolas Holzschuch. Fast Precomputed Ambient Occlusion for Proximity Shadows. Journal of Graphics Tools, 2007. (to appear).
[2] Lionel Atty, Nicolas Holzschuch, Marc Lapierre, Jean-Marc Hasenfratz, Chuck Hansen and François Sillion. Soft shadow maps: Efficient sampling of light source visibility. Computer Graphics Forum, vol. 25, no 4, December 2006.
[3] David Roger and Nicolas Holzschuch. Accurate specular reflections in real-time. Computer Graphics Forum (Proceedings of Eurographics 2006), vol. 25, no 3, September 2006.
[4] Aurélien Martinet, Cyril Soler, Nicolas Holzschuch and François Sillion. Accurate detection of symmetries in 3D shapes. ACM Transactions on Graphics, vol. 25, no 2, April 2006.
[5] Frédo Durand, Nicolas Holzschuch, Cyril Soler, Eric Chan and François Sillion. A frequency analysis of light transport. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2005), vol. 24, no 3, August 2005.
[6] Cyrille Damez, Nicolas Holzschuch and François Sillion. Space-time hierarchical radiosity with clustering and higher-order wavelets. Computer Graphics Forum, vol. 23, no 2, April 2004.
[7] Jean-Marc Hasenfratz, Marc Lapierre, Nicolas Holzschuch and François Sillion. A survey of real-time soft shadows algorithms. Computer Graphics Forum, vol. 22, no 4, December 2003.
[8] François Cuny, Laurent Alonso and Nicolas Holzschuch. A novel approach makes higher order wavelets really efficient for radiosity. Computer Graphics Forum (Proceedings of Eurographics 2000), vol. 19, no 3, September 2000.
[9] Laurent Alonso and Nicolas Holzschuch. Using graphics hardware to speed-up your visibility queries. Journal of Graphics Tools, vol. 5, no 2, April 2000.
[10] François Cuny, Laurent Alonso, Christophe Winkler and Nicolas Holzschuch. Radiosité à base d'ondelettes sur des mailles quelconques. Revue internationale de CFAO et d'informatique graphique, vol. 14, no 1, October 1999.
[11] Nicolas Holzschuch and François Sillion. An exhaustive error-bounding algorithm for hierarchical radiosity. Computer Graphics Forum, vol. 17, no 4, December 1998.