
When Deep Learning Meets Digital Image Correlation

S. Boukhtache¹, K. Abdelouahab², F. Berry¹, B. Blaysat¹, M. Grédiac¹, F. Sur³

¹Institut Pascal, UMR 6602, Université Clermont-Auvergne, CNRS, SIGMA Clermont, Clermont-Ferrand, France

²Sma-RTy SAS, Aubière, France

³LORIA, UMR 7503, Université de Lorraine, CNRS, Inria, Nancy, France

Keywords: Convolutional Neural Network, Deep Learning, GPU, Digital Image Correlation, Error Quantification, Photomechanics, Speckles

1. Introduction

Digital Image Correlation (DIC) is a full-field displacement and strain measurement technique which has rapidly spread in the experimental mechanics community. The main reason is that this technique achieves a good compromise between versatility, ease of use, and metrological performance [1]. Many recent papers illustrate the use of DIC in various situations [2]. Some others discuss how to characterize or improve its metrological performance, as [3] written within the framework of the DIC Challenge [4].

Despite all its advantages, DIC suffers from some drawbacks. For instance, DIC is by essence an iterative procedure, which automatically leads to mobilizing significant computational resources. Consequently, its use is limited when dense (i.e. pixelwise-defined) displacement or strain distributions are to be measured. Another drawback is the fact that DIC acts as a low-pass filter, which causes the retrieved displacement and strain fields to be blurred. Indeed, it is shown in [5, 6] that the displacement field rendered by a DIC system is not the actual one but, regardless of noise, the actual one convolved by a Savitzky-Golay filter. This makes DIC unable to correctly render displacement or strain fields featuring high spatial frequencies. Overcoming these limitations seems however difficult without introducing a paradigm shift in image processing itself. For instance, the minimization of the optical residual that DIC performs iteratively in the spatial domain can be switched to the Fourier domain [7]. The benefit is to considerably reduce the computing time and to allow the use of optimal patterns in terms of camera sensor noise propagation [8, 9, 10]. The drawback is that such patterns are periodic, and depositing them on specimens remains challenging as long as no simple transferring technique is commercially available. In the present study, we propose to use random speckle patterns as in DIC and to investigate to what extent a Convolutional Neural Network (CNN) can be used to retrieve displacement and strain fields from a pair of reference and deformed images of a speckle pattern. To the best of the authors' knowledge, this route has never been explored so far to retrieve dense subpixel displacement fields. However, CNNs have been widely used in computer vision in the recent past, but mainly in order to perform image classification or recognition [11], or to estimate rigid-body displacements of solids [12]. The problem addressed here is somewhat different because deformation occurs, and because the resulting displacement is generally much lower in amplitude than in the aforementioned cases of rigid-body movements. Consequently, we mainly focus here on the estimation of subpixel displacements.

The present paper is organized as follows. The basics of CNNs and deep learning are first briefly given in Section 2. We examine in Section 3 how a problem similar to ours, namely optical flow determination, has been tackled with CNNs in the literature. CNNs must be trained, and since no dataset suited to the problem at hand is available in the literature, we explain in Section 4 how a first dataset containing speckle images has been generated. Then we test in Section 5 how four pre-existing networks are fine-tuned with this dataset, and examine whether these networks are able to determine displacement fields from different speckle images artificially deformed. The network giving the best results is then selected and improved in several ways to give displacement fields of better quality. The different improvements and the corresponding results are given in Section 6. Finally, we use the network resulting from all the suggested improvements, named here "StrainNet", to process some pairs of speckle images from the DIC Challenge and from a previous study.

The numerical experiments proposed in this paper can be reproduced with Matlab and PyTorch codes as well as with datasets available at the following URL: https://github.com/DreamIP/StrainNet.

2. A short primer on deep learning

Data-driven approaches have revolutionized several scientific domains in the last decade. This is probably due to a combination of several factors, in particular the dramatic improvement of computing and information storage capacity. For example, it is now quite easy to have large datasets gathering many examples for a task of interest, such as millions of labeled images for an image classification task. However, the most crucial point is probably the rise of new machine learning techniques built on artificial neural networks. While neural networks have been introduced in the 1950s and have been continuously improved since then, a major breakthrough occurred a few years ago with the demonstration that deep neural networks [13] give excellent results in many signal processing applications. The most iconic breakthrough is probably the famous 2012 paper [11] which demonstrates a deep learning based system outperforming the competing approaches in the ImageNet classification challenge.

The basic elements of neural networks are neurons connected together. In feedforward neural networks, neurons are organized in layers. The input layer encodes input data (an image in image classification tasks, two images in the present problem) and the output layer encodes the associated label (the posterior probability of a class in classification, a displacement field here). The intermediate layers are the hidden layers. They are made of neurons which output a signal. Each neuron of the hidden and output layers is connected to neurons from the preceding layer. The output of these neurons is a weighted sum of the connected neurons modulated by a continuous non-decreasing function called activation function, except for the output layer in a regression problem as here, where no activation is involved. The most popular activation function is the so-called Rectified Linear Unit (ReLU) [13].
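As a minimal illustration (a sketch with arbitrary layer sizes, not taken from the paper's code), such a hidden layer with a ReLU activation can be written in PyTorch, the framework used later in this paper, as follows:

import torch
import torch.nn as nn

# One hidden layer: each of the 32 output neurons computes a weighted sum
# of the 64 input values (plus a bias), modulated by the ReLU activation.
layer = nn.Sequential(nn.Linear(64, 32), nn.ReLU())

x = torch.randn(1, 64)   # one input sample with 64 values
y = layer(x)             # 32 neuron outputs; ReLU clips negatives to zero
print(y.shape)           # torch.Size([1, 32])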
Deep neural networks are called "deep" because they may be made of several tens of layers. As we have seen, neural connections between layers involve weights. Note that "weights" are also commonly referred to as the "network parameters" in the literature. The term "weight" is used here to avoid confusion with the other parameters defined in the paper.
Computer vision applications typically call for a subclass of deep neural networks known as convolutional neural networks (CNNs), where the number of weights is mitigated by imposing that the weighted sums correspond to convolutions, which turn out to be the basic operators of signal processing. Neurons from a convolutional layer are represented as the output of the implemented convolutions, hence the blue parallelepipeds in Figure 1. Another ingredient used in CNNs is down-sampling, which reduces the width of the layers involved. In the present work, down-sampling by a factor (the so-called stride) of two is obtained by computing convolutions shifted by two units instead of one unit as in a standard convolution. This explains the narrower and narrower layers in the feature extraction part in Figure 1. Note that in this figure, up-sampling is performed through the so-called transposed convolutions when predicting the displacement field.
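The following sketch (illustrative channel counts, unrelated to any specific network) shows how a stride-2 convolution halves the spatial resolution and how a transposed convolution doubles it back, which is the mechanism behind the down- and up-sampling parts described above:

import torch
import torch.nn as nn

x = torch.randn(1, 3, 256, 256)   # a 3-channel 256 x 256 input

# Convolution shifted by two units (stride 2): the output is half as wide.
down = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)
print(down(x).shape)              # torch.Size([1, 16, 128, 128])

# Transposed convolution: used for up-sampling in the prediction part.
up = nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1)
print(up(down(x)).shape)          # torch.Size([1, 8, 256, 256])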
However, deep CNNs are still likely to require several millions of weights. As in any supervised learning task, the weights are set by training the CNN over a dataset made of observations, that is,

pairs of inputs and corresponding expected outputs. More precisely, training, called deep learning in this context, consists in minimizing with respect to the weights the cost of errors between the expected outputs and the CNN outputs obtained from the inputs and the current weights. The error cost is measured through the so-called loss function. The optimization method implementing the training process is most of the time a variation of the mini-batch stochastic gradient descent (SGD) algorithm: small sets of observations are iteratively randomly drawn from the whole dataset, giving a so-called mini-batch, and the weights of the CNN are then slightly updated in order that the sum of the losses over each mini-batch decreases. At each iteration, the outputs of the CNN computed from the mini-batch inputs are thus supposed to get closer to the expected outputs. The magnitude of the weight update is proportional to the step size of the SGD algorithm, called learning rate in machine learning applications and denoted λ in this paper. This parameter has to be set carefully. Each complete pass over the dataset is called an epoch. Many epochs are needed to perform the full training of a CNN.
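The following self-contained sketch (a toy regression problem standing in for any dataset of inputs and expected outputs) illustrates this vocabulary in PyTorch: mini-batches, loss function, learning rate and epochs:

import torch
import torch.nn as nn

# Toy network and dataset standing in for any supervised learning task.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
dataset = torch.utils.data.TensorDataset(torch.randn(256, 8), torch.randn(256, 2))
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

loss_fn = nn.MSELoss()                                    # the loss function
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)  # lr is the learning rate

for epoch in range(5):               # one epoch = one full pass over the dataset
    for x, y in loader:              # each draw from the loader is a mini-batch
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)  # cost of errors between outputs and targets
        loss.backward()              # gradient of the loss w.r.t. the weights
        optimizer.step()             # slight update of the weights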
However, training a deep CNN requires a large dataset and heavy computational resources, which is simply not possible in many situations. A popular alternative approach consists in using freely available pre-trained networks, that is, CNNs that have been trained on standard datasets, and in adjusting their weights to the problem of interest. Adjustment can be performed by fine tuning or transfer learning. The former consists in marginally changing the weights in order to adapt them to the problem of interest and its dataset. The latter consists in changing a part of the architecture of the pre-trained network and in learning the corresponding weights from the problem of interest, the remaining weights being kept constant; a short sketch contrasting the two options is given below. Let us now examine how such networks have been used in the literature for solving a classic computer vision problem similar to ours, namely optical flow estimation.
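A sketch of the two adjustment strategies (with a toy two-part model standing in for a pre-trained network):

import torch
import torch.nn as nn

# A toy pre-trained network: a feature extraction part and a final layer.
features = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
head = nn.Linear(16, 2)
model = nn.Sequential(features, head)

# Transfer learning: freeze the pre-trained part, learn only the new head.
for p in features.parameters():
    p.requires_grad = False
opt_transfer = torch.optim.SGD(head.parameters(), lr=1e-4)

# Fine tuning: all weights remain trainable, starting from pre-trained values.
for p in model.parameters():
    p.requires_grad = True
opt_finetune = torch.optim.SGD(model.parameters(), lr=1e-4)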

3. A brief review of CNN-based methods for optical flow estimation

Optical flow is the apparent displacement field obtained from two views of a scene. It is caused by the relative motion of the observer and objects in the scene, which may move or deform. Recent algorithms of optical flow estimation based on CNNs provide a promising alternative to the methods classically used to resolve this problem in computer vision after the seminal papers by Horn and Schunck [14] or by Lucas and Kanade [15]. As in a typical machine learning setup, AI-based optical flow estimation algorithms can be divided into three categories: unsupervised, semi-supervised, or supervised. Unsupervised [16, 17, 18] and semi-supervised [19, 20] methods are reviewed in the literature to address the problem of limited training data in optical flow estimation. In contrast, these methods do not yet reach the accuracy of their supervised counterparts. Furthermore, supervised methods are the predominant way of learning, as described in the preceding section, and generally provide good performance. However, they require a large amount of accurate, ground-truth optical flow measurements for training. The most accurate models use CNNs as one of the components of their system, as in DCFlow [21], MRFlow [22], DeepFlow [23], and FlowFields [24]. None of these previous approaches provide end-to-end trainable models or real-time processing performance. The most efficient algorithms recently proposed in the literature for optical flow estimation are reviewed below. In general, they share the same architecture (a schematic view is represented in Figure 1). The first part of the network extracts the features of the images, and the optical flow is predicted through an up-sampling process in the second part of the network.
Dosovitskiy et al. [12] presented two CNNs called FlowNetS and FlowNetC to learn the optical flow from a synthetic dataset. These two CNNs are constructed based on the U-Net architecture [25]. FlowNetS is an end-to-end CNN-based network. It concatenates the reference and the current images together and feeds them to the network in order to extract the optical flow. In contrast, FlowNetC creates two separate processing pipelines for each of these two images. Then, it combines them with a correlation layer that performs multiplicative patch comparisons between the two generated feature maps. The resolution of the output optical flow in both networks is equal to 1/4 of the image resolution. Indeed, the authors explain that adding other layers to reach full resolution is computationally expensive, and does not really improve the accuracy. A consequence is that with this network, the optical flow at full resolution is obtained by performing a bilinear interpolation with an upscale factor of 4.

• Convolutions + ReLUs
• Transposed Convolution + ReLU + Convolution

Figure 1: Schematic view of a convolutional neural network. Anticipating the results discussed in Section 6.1, we present here a schematic view of StrainNet-f. The feature extraction part consists of several convolution layers followed by a ReLU (Rectified Linear Unit [13]). The last layer has a stride of two to perform down-sampling. Such a stack is called a level. The output of these levels can thus be represented by narrower and narrower blocks, in blue here. In the displacement prediction part, the levels are made of transposed convolution layers (for up-sampling) and convolution layers. The input of each level of this part is the output of the preceding level, concatenated to the output of the levels from the feature extraction part, as represented by the black arrows.

This point is discussed further below, as we do not use exactly the same approach to solve the problem addressed in this paper.
Hu et al. [26] proposed a recurrent spatial pyramid network that inputs the full-resolution images and generates an initial optical flow at 1/8 of the full resolution. The initial flow is then upscaled and refined recurrently based on an energy function. The initial generated optical flow is converted to reach the full-resolution flow by performing this recurrent operation on the spatial pyramid. This network achieves comparable performance to FlowNetS with 90 times fewer weights. It is however twice as slow as FlowNetS because of the recurrent operations performed on the spatial pyramid. Note that when CNNs are compared, it is important to use the same image size and the same GPU for all of them. In addition, the computing time, or inference time, is generally considered as equal to the mean time value over a certain number of pairs of images.
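A possible timing protocol along these lines (a sketch; the model and the list of pre-formed input tensors are placeholders):

import time
import torch

def mean_inference_time(model, pairs, device="cuda"):
    # Average inference time of `model` over a list of input tensors.
    model = model.to(device).eval()
    times = []
    with torch.no_grad():
        for x in pairs:
            x = x.to(device)
            torch.cuda.synchronize()            # GPU execution is asynchronous
            t0 = time.perf_counter()
            model(x)
            torch.cuda.synchronize()
            times.append(time.perf_counter() - t0)
    return sum(times) / len(times)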
FlowNet 2.0 [27] is the extension of FlowNet [12]. It stacks multiple FlowNetS and FlowNetC networks in order to obtain a more accurate model. FlowNet 2.0 reduces the estimation error by more than 50% compared to FlowNet. However, it has three limitations. First, the model size is 4 times larger than that of the original FlowNet (over 160 million weights). Second, FlowNet 2.0 is 4 times slower than FlowNet. Third, the sub-networks need to be trained sequentially to reduce overfitting. Overfitting is a non-desirable phenomenon which occurs in deep learning when the cardinality of the dataset is too small with respect to the number of weights to be trained. This leads to a network which fails to correctly predict the optical flow from images which are too different from those contained in the training dataset.
Ranjan et al. [28] developed a compact spatial pyramid network, called SpyNet. It combines deep learning with classical optical flow estimation principles such as image warping at each pyramid level, in order to reduce the model size. In SpyNet, the pyramid levels are trained independently. Then, each level uses the previous one as an initialization to warp the second image toward the first through the spatial pyramid network. It achieves similar performance to FlowNetC on the same benchmark but is not as accurate as FlowNetS and FlowNet 2.0. In contrast to FlowNet [12], the SpyNet model is about 26 times smaller but almost similar in terms of running speed. In addition, it is trained level by level, which means that it is more difficult to train than FlowNet.
LiteFlowNet [29] is composed of two compact sub-networks specialized in pyramidal feature extraction and optical flow estimation, respectively named NetC and NetE. NetC transforms the two input images into two pyramids of multi-scale high-dimensional features. NetE consists of cascaded "flow inference" and "regularization" modules that estimate the optical flow fields. LiteFlowNet performs better than SpyNet and FlowNet, but it is outperformed by FlowNet 2.0 in terms of accuracy. It has 30 times fewer weights than FlowNet 2.0 and is 1.36 times faster.
Sun et al. [30] proposed PWC-Net. This network is based on simple and well-established principles such as pyramidal processing, warping, and cost volume. Similarly to FlowNet 2.0, PWC-Net shows the potential of combining deep learning and optical flow knowledge. Compared to FlowNet 2.0, the model size of PWC-Net is about 17 times smaller and the inference time is divided by two. This network is also easier to train than SpyNet [28]. PWC-Net outperforms all the previous optical flow methods on the MPI Sintel [31] and KITTI 2015 [32] benchmarks.
All these studies demonstrate the usability of CNNs for estimating the optical flow. This motivated the development of similar models in order to tackle analogous problems arising in other research fields. This is typically the case in [33], where the authors used FlowNetS to extract dense velocity fields from particle images of fluid flows. The method introduced in this reference provides promising results, but the experiments indicate that the deep learning model does not outperform classical methods in terms of accuracy. Cai et al. [34] developed a network based on LiteFlowNet [29] to extract dense velocity fields from a pair of particle images. The original LiteFlowNet network is enhanced by adding more layers in order to improve the performance. This leads to more accurate results than those given in [33], at the price of extra computing time.
In the following section, we propose to employ a deep convolutional neural network to measure full-field displacement fields that occur on the surface of deformed solids. As for 2D DIC, we assume here that the deformation is plane and that the surface is patterned with a random speckle. The main difference with the examples from the literature discussed above is twofold. First, a deformation occurs and, second, the resulting displacements are generally small, which means that subpixel resolution must be achieved. In addition, the idea is to use this measurement in the context of metrology, meaning that the errors affecting these measurement fields must be thoroughly estimated. A specific dataset and a dedicated network christened "StrainNet" are proposed herein to reach these goals, as described in the following sections.

4. Dataset

4.1. Existing datasets


Supervised training of deep CNNs requires large datasets. In the context of optical flow, datasets are made of pairs of images together with a ground-truth optical flow, which may be achieved by rendering synthetic images. Although generating such large datasets is a tedious task, several datasets were generated and used in previous studies on optical flow estimation [35]. The datasets most commonly used for optical flow estimation are listed in Table 1.

Dataset | Number of frames | Resolution | Displacements
Middlebury [36] | 72 | varies | Small (≤ 10 pixels)
KITTI 2012 [37] | 194 | 1226×370 | Large
KITTI 2015 [32] | 800 | 1242×375 | Large
MPI Sintel [31] | 1064 | 1024×436 | Small and large
FlyingThings3D [38] | 21818 | 960×540 | Small and large
FlyingChairs [12] | 22782 | 384×512 | Small and large

Table 1: Commonly used optical flow datasets.

Almost all existing datasets are limited to large rigid-body motions or quasi-rigid motions, and deal with natural images. Hence, an appropriate dataset consisting of deformed speckle images had to be developed in the present study. This dataset should be made of pairs of images mimicking typical reference and deformed speckles, as well as their associated deformation fields. It has to be representative of real images and deformations so that the resulting CNN has a chance to perform well when inferring the deformation field from a new pair of images that are not in the training dataset. Speckle images have a smaller variability than natural images processed by computer vision applications. However, we shall see that we cannot use datasets smaller than those of Table 1 because we seek tiny displacements.

4.2. Developing a speckle dataset


In order to estimate very small subpixel displacements (of the order of 10⁻² pixel), we propose a new dataset called Speckle dataset¹. Two versions were developed in this study. The first one, referred to as Speckle dataset 1.0 in the following, contains 36,663 pairs of frames (this number is justified below) with their corresponding subpixel displacement fields. A second version called Speckle dataset 2.0 was also developed, as explained later in this paper. Speckle dataset 1.0 was generated as follows:
1. Generating speckle reference frames.
2. Defining the displacement fields.
3. Generating the deformed frames.
These different steps are detailed in turn in the following subsections.

Generating reference speckle frames. In real experiments, the first step is to prepare the specimen, often by spray-painting the surface in order to deposit black droplets on a white surface. This process was mimicked here by using the speckle generator proposed in [39]. This generator consists in randomly rendering small black disks superimposed in an image, in order to get synthetic speckled patterns that are similar to actual ones. Reference frames of size 256 × 256 pixels were rendered with this tool. This frame size influences the quality of the results, as discussed in Section 6.1. These reference frames were obtained here by mixing the different settings or parameters which are listed in Table 2 (see [39] for more details).
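For illustration only, a crude NumPy version of this idea (random dark disks on a white background) is sketched below; the actual generator of [39] uses a more careful Monte Carlo rendering scheme that avoids aliasing, so this sketch is not a substitute for it:

import numpy as np

def render_speckle(size=256, n_disks=36000, mean_radius=0.6, contrast=0.8, seed=0):
    # Crude speckle: dark disks with random centers and radii on a white image.
    rng = np.random.default_rng(seed)
    img = np.ones((size, size))
    cx = rng.uniform(0, size, n_disks)
    cy = rng.uniform(0, size, n_disks)
    radii = rng.exponential(mean_radius, n_disks)   # one possible radius law
    for x0, y0, r0 in zip(cx, cy, radii):
        x1, x2 = int(max(x0 - r0, 0)), int(min(x0 + r0 + 2, size))
        y1, y2 = int(max(y0 - r0, 0)), int(min(y0 + r0 + 2, size))
        yy, xx = np.mgrid[y1:y2, x1:x2]
        mask = (xx - x0) ** 2 + (yy - y0) ** 2 <= r0 ** 2
        img[y1:y2, x1:x2][mask] = 1.0 - contrast    # dark disk
    return img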

Parameter | Speckle datasets 1.0 and 2.0 | Star images
Probability distribution function of the radius of the disks | Uniform, Exponential and Poisson | Exponential
Average radius of disks | 0.45 to 0.8 pixel by step of 0.025 | 0.5
Average number of disks per image | 36,000 | 556,667
Contrast of the speckle | 0.5 to 1 by step of 0.05 | 0.6
Size of the images | 256 × 256 | 501 × 2000
Average number of disks per pixel | 0.549 | 0.556

Table 2: Parameters used to render the images for Speckle datasets 1.0 and 2.0 (second column), and the images deformed through the Star displacement of Section 5.3 (third column).

The quality of these reference frames was visually estimated and those featuring very large spots or poor contrast were eliminated. Only 363 frames were kept for the next step. Figure 2-a shows a typical speckle obtained with our generator and used to build Speckle dataset 1.0. It can be visually checked that the aspect is similar to a real speckle used for DIC in experimental mechanics, see typical examples in [1].

¹Reference speckle frames used in speckle dataset generation and Matlab code are available at github.com/DreamIP/StrainNet

Figure 2: Typical synthetic speckle image and random displacements (in pixels) along x and y used in Speckle dataset 1.0: (a) reference speckle image, (b) random displacements along the x axis, (c) random displacements along the y axis. All dimensions are in pixels.

Defining the displacements. The following strategy was used in order to generate a rich set of displacement fields covering the widest possible types of displacement fields. The first step consisted in splitting each reference frame into adjacent square regions of size 8 × 8 pixels. The pixels located at the corners of these regions were assigned a random displacement. The second step consisted in linearly interpolating the displacements between all these pixels to get the displacement at the remaining pixels of each square region.

Since we were interested here in estimating subpixel displacements, the imposed random displacements lay between -1 and +1 pixel. Furthermore, the displacements at pixels located along the boundary were set to zero to limit the error due to the missing information in these regions. Figures 2-b and -c show typical random displacements used to deform reference speckle images to obtain their deformed counterparts.
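A minimal NumPy sketch of this construction for one displacement component (assuming bilinear interpolation between the grid nodes and zeroing of the boundary nodes, which are details not fully specified above):

import numpy as np
from scipy.interpolate import RegularGridInterpolator

def random_displacement(size=256, step=8, amplitude=1.0, seed=0):
    # Random corner values on a `step`-pixel grid, interpolated in between.
    rng = np.random.default_rng(seed)
    n = size // step + 1                           # grid nodes per side
    nodes = rng.uniform(-amplitude, amplitude, (n, n))
    nodes[0, :] = nodes[-1, :] = 0.0               # zero displacement at the
    nodes[:, 0] = nodes[:, -1] = 0.0               # boundary of the image
    grid = np.arange(n) * step                     # node positions in pixels
    interp = RegularGridInterpolator((grid, grid), nodes, method="linear")
    yy, xx = np.mgrid[0:size, 0:size]
    pts = np.stack([yy.ravel(), xx.ravel()], axis=-1)
    return interp(pts).reshape(size, size)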

Generating the deformed frames. Each of the 363 reference frames was deformed 101 times, each time by generating a new random displacement field. 363 × 100 = 36,300 pairs of frames were used to form the dataset on which the network was trained. 363 other pairs were used for validation purposes in order to assess the ability of the network to render faithful displacement fields after training. As in other studies dealing with optical flow estimation, this ability is quantified by calculating the so-called Average Endpoint Error (AEE). AEE is the Euclidean distance between the predicted flow vector and the ground truth, averaged over all the pixels. Thus

AEE = \frac{1}{KL} \sum_{i=1}^{K} \sum_{j=1}^{L} \sqrt{(u_e(i,j) - u_g(i,j))^2 + (v_e(i,j) - v_g(i,j))^2}    (1)

where (u_e, v_e) and (u_g, v_g) are the estimated optical flow and the ground truth, respectively, defined at each pixel coordinate (i, j). K and L are the dimensions of the zone over which the AEE value is calculated. This quantity is in pixels, the displacements being expressed in pixels.
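Equation (1) translates directly into a few lines of NumPy (a sketch, with the displacement components given as 2-D arrays over the zone of interest):

import numpy as np

def aee(u_e, v_e, u_g, v_g):
    # Average Endpoint Error (Equation 1), in pixels.
    return np.mean(np.sqrt((u_e - u_g) ** 2 + (v_e - v_g) ** 2))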
The main interest of the speckle generator described in [39] is that it does not rely on interpolation when deforming the artificial speckled patterns, in order to prevent additional biases caused by interpolation. However, since it relies on a Monte Carlo integration scheme, a long computing time is required. This generator is therefore only suitable for the generation of a limited number of pairs of synthetic reference and deformed images. Such an approach is not adequate here since several thousands of images had to be generated. The solution adopted here was to use bi-cubic interpolation to deform the reference images through the randomly-defined displacement fields, considering that it yielded a reasonable trade-off between computing time and quality of the results. Hence the speckle generator described in [39] was only employed to generate the reference frames mimicking real speckles used in experimental mechanics.
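A possible implementation of this deformation step, sketched here with SciPy's cubic interpolation (the backward-mapping convention below is an assumption of this sketch; the paper's actual Matlab code may differ in its details):

import numpy as np
from scipy.ndimage import map_coordinates

def deform(reference, u, v):
    # Deform `reference` through the displacement field (u, v): the deformed
    # image at pixel (x, y) is sampled from the reference at (x - u, y - v),
    # using bi-cubic interpolation (order=3).
    h, w = reference.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    return map_coordinates(reference, [yy - v, xx - u], order=3, mode="nearest")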

5. Fine-tuning networks of the literature

In general, CNNs are developed for specific applications and trained with datasets suited to the problem to be solved. The question is therefore to know if the studies reviewed in Section 3 above, which have good performance for estimating large displacements, can be fully or partially used to address the problem of the present paper, namely, resolving subpixel (≈ 0.01 pixel) displacements. Transfer learning and fine tuning are possibilities to respond to this question, as explained in Section 2.

When classification is the ultimate goal, transfer learning is carried out by replacing only the last fully-connected layer and training the model by updating the weights of the replaced layer, while maintaining the same weights for the convolutional layers. The idea behind this is that if the dataset at hand is similar to the training dataset, the output of the convolutional layers keeps on being relevant for the problem of interest. Since our dataset is completely different from the datasets used for training the CNNs described in the studies reviewed above, we proceeded to a fine tuning by updating the weights of all the layers and not only the last one. The weights found in the cases discussed above were considered as starting points for the fine-tuning performed on our own dataset.

In the following, we discuss the fine tuning of the networks considered as the most interesting ones in terms of accuracy and computational efficiency. We did not choose to fine-tune FlowNet 2.0 because of its high computational cost, as discussed in Section 3. Instead, we mainly relied on the FlowNet, PWC-Net, and LiteFlowNet models [12, 30, 29] because they exhibit better computational efficiency during both fine tuning and inference. In practice, we used the PyTorch implementations of these models provided in [40], [41], and [42], respectively.

5.1. Training strategy


The following strategy was used in order to fine-tune the networks proposed in [40, 41, 42]:
1. Keeping the network architecture, loss function, and optimization method of the original model.
2. Increasing the batch size to 16 in order to speed up the training process on our hardware configuration.
3. Initiating the learning rate (the specific machine learning vocabulary is defined in Section 2 above) with λ = 10⁻⁴, and then dividing this rate by 2 every 40 epochs (a code sketch of this schedule is given after this list). These values correspond to a good trade-off between computing time and accuracy of the results.
4. Fine-tuning each network for 300 epochs, because this is a value for which the value of the loss function of the considered models stabilizes.
With this strategy, fine-tuning each of the networks on an NVIDIA Tesla V100 GPU required between three and four days, depending on the model complexity.
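The schedule of item 3 corresponds to a standard step decay; the sketch below shows it with PyTorch's StepLR (the toy network, data, loss function and Adam optimizer are stand-ins, since item 1 keeps whatever the original models actually use):

import torch
import torch.nn as nn

# Stand-ins for a pre-trained network and Speckle dataset 1.0 (hypothetical).
model = nn.Conv2d(2, 2, kernel_size=3, padding=1)
dataset = torch.utils.data.TensorDataset(torch.randn(64, 2, 32, 32),
                                         torch.randn(64, 2, 32, 32))
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # lambda = 1e-4
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.5)

for epoch in range(300):
    for x, y in loader:
        optimizer.zero_grad()
        loss = nn.functional.l1_loss(model(x), y)            # stand-in loss
        loss.backward()
        optimizer.step()
    scheduler.step()        # divides the learning rate by 2 every 40 epochs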

5.2. Obtained results


The AEE value defined in Equation 1 was used to evaluate the fine-tuning results of the different networks. It was averaged over the test dataset made of 363 displacement fields deduced from the 363 pairs of reference and deformed speckle images considered for validation purposes. Figure 3 gives the average AEE value for each network trained on Speckle dataset 1.0. The main conclusion is that the average AEE value is much lower with FlowNetS, which hints that more accurate estimations are expected with this network. It has also been observed that the average AEE was about the same for both the train and the test datasets for all the networks under consideration (for this reason, Figure 3 only reports the AEE of the test dataset), suggesting that they do not suffer from overfitting. Finally, note that the AEE value is equal at best to 0.1 pixel, which is too high since a resolution of 0.01 pixel is expected to be obtained to reach a similar performance to DIC. Since fine-tuning was not sufficient, these preexisting networks had to be improved, as explained in Section 6.

8
Figure 3: Average AEE value for four networks trained on Speckle dataset 1.0 (FlowNetS: AEE = 0.107; LiteFlowNet: AEE = 0.35; PWC-Net: AEE = 0.356; FlowNetC: AEE = 0.367; values in pixels).

5.3. Assessing the results with a reference Star displacement field


Another synthetic displacement field than the interpolated random field described above has been proposed in recent papers to assess the metrological performance of various full-field measurement methods used in experimental mechanics [43, 7, 44, 10, 4]. This displacement field, named "Star displacement" in the following to be consistent with the name given in [4], is a synthetic vertical displacement modelled by a sine wave, whose period gently and linearly increases when going toward the right-hand side of the map. This vertical reference displacement field v_g is shown in Figure 4b (u_g is null in this case). In this figure, the green rectangle is the zone over which the results are numerically evaluated. It is not centered, to emphasize the influence of the high spatial frequencies and to account for some practical constraints imposed by the use of LiteFlowNet and PWC-Net. The speckle images deformed through this displacement field are named "Star images". They were obtained by using the generator described in [39]. It means that no interpolation was performed, which was not the case for the deformed images of the dataset. The parameters used to render these speckle images are given in Table 2, third column. It can be seen that the average number of disks per pixel is nearly the same for all types of images.

The benefit of using images deformed with this displacement field is that the effect of the spatial frequency on the measured displacement field can be easily assessed by observing the attenuation of the displacement amplitude when going toward the left-hand side of the map. In other words, this reference displacement field permits the characterization of the transfer function of the estimation process. Images deformed through this displacement field were therefore considered in this study to validate the different variants of the network we developed, and more precisely to observe the attenuation of the displacement rendered for the highest spatial frequencies located on the left.

A pair made of a reference image and of an image deformed through the Star displacement field² was used to evaluate the results obtained with the four fine-tuned CNNs selected above, namely FlowNetS, FlowNetC, LiteFlowNet and PWC-Net. In the following, the extension "-ft" is added to the names of the networks when they are fine-tuned. No artificial noise was added to these images. The obtained results were compared to those given by classic subset-based DIC. The latter was performed with two subset sizes, namely 2M + 1 = 11 and 21 pixels, where M is an integer which governs the size of the subset. Interpolation was performed by using spline functions. First-order subset shape functions, which are the most common ones in commercial codes [1], were considered here ("subset shape functions" are also called "matching functions" in the literature, but we adopt here the terminology proposed in [45]). Note that results obtained with second-order subset shape functions are also presented and discussed in Section 7.

²Star frames used here are available at github.com/DreamIP/StrainNet

Figure 4: (a) Reference image corresponding to (b) the Star displacement. The green rectangle is used in this work to evaluate the results. All dimensions are in pixels.

As in recent papers dealing with the metrological performance of full-field measurement systems [10, 46], the shift between two consecutive subsets is equal to one pixel, which puts us in a position where DIC is at its best performance level (regardless of computing time). The results obtained in these different cases are given in Figure 5.

The worst results are obtained with FlowNetC. Those obtained with LiteFlowNet and PWC-Net are better, but it is clear that FlowNetS provides the best results, with a displacement map which is similar to that obtained with DIC and 2M + 1 = 11 pixels. The poor accuracy obtained by the first three networks (FlowNetC, LiteFlowNet, and PWC-Net) may be due to the fact that these networks make use of predefined blocks such as correlation or regularization, and that they were originally developed for the determination of large displacements instead of subpixel ones.

As in previous studies dealing with quality assessment of full-field measurement techniques, such as [7, 46, 4] for instance, the displacement along the horizontal axis of symmetry, denoted here by Δ, is plotted in Figure 6 in order to clearly see the attenuation of the signal when going to the highest spatial frequencies of the Star displacement (thus when going to the left of the displacement map), as well as the high-frequency fluctuations affecting the results. It is worth remembering that the reference value of the displacement along Δ is equal to 0.5 pixel. A blue horizontal line of ordinate 0.5 pixel is therefore superimposed to these curves to visualize this reference value. The closer the curves to this horizontal line, the better the results. Displacements obtained with DIC are also superimposed for comparison purposes. The mean absolute error obtained column-wise is also given.

It is worth noting that at high spatial frequencies (left-hand part of the graphs), FlowNetS provides results which are similar to those given by DIC with a subset of size 2M + 1 = 11 pixels. At low spatial frequencies (right-hand part of the graphs), DIC provides smoother and more accurate results than FlowNetS. Calculating in each case the mean absolute error for the vertical displacement gives an estimation of the global error over the displacement map. This quantity is calculated over the green rectangle plotted in Figure 4b to remove boundary effects. This quantity, denoted by MAE in the following, is defined by the mean value of the absolute error throughout the field of interest. Thus

Figure 5: Star displacement obtained (a)(b) with DIC (subset sizes 2M + 1 = 21 and 11 pixels) and (c)-(f) with the four selected CNNs (FlowNetS-ft, FlowNetC-ft, LiteFlowNet-ft and PWC-Net-ft). In each case, the retrieved displacement field, the difference with the reference displacement field and the histogram of this difference are given in turn in the different sub-figures. All dimensions are in pixels.

MAE = \frac{1}{KL} \sum_{i=1}^{K} \sum_{j=1}^{L} |v_e(i,j) - v_g(i,j)|    (2)

where v_e and v_g are the estimated displacement and the ground truth, respectively. K and L are here the dimensions of the green rectangle over which the results are calculated. MAE is equal to the AEE defined in Equation 1, in which the horizontal displacement has been nullified. It is introduced here in order to focus on the error made along the vertical direction only, the relevant information being along this direction for the displacement maps extracted from the Star images. Table 3 gives this quantity calculated for DIC and FlowNetS-ft.
It is clear that the MAE value is the lowest for DIC with 2M + 1 = 11 pixels, followed by FlowNetS-ft.
Figure 6: Comparison between FlowNetS-ft and DIC. Top: displacement along the horizontal axis of symmetry of the Star displacement field. The horizontal blue line corresponds to the reference displacement along Δ. This reference displacement is equal to 0.5 pixels. Bottom: mean absolute error obtained column-wise. Close-up views of the rightmost part of the graphs are also inserted. All dimensions are in pixels.

Technique used | DIC, 2M + 1 = 21 pixels | DIC, 2M + 1 = 11 pixels | FlowNetS-ft
MAE | 0.0828 | 0.0365 | 0.0437

Table 3: MAE (in pixels) for DIC (subset sizes 2M + 1 = 11 and 21 pixels) and FlowNetS-ft.

These first results are promising, but they come from predefined networks which were therefore not specifically developed to resolve the problem at hand. The best candidate, namely FlowNetS-ft, was therefore selected for further improvement. This is the aim of the next section.

6. Tailoring FlowNetS to estimate displacement fields

Two types of modifications were proposed to enhance the accuracy of the results. The first one
concerns the network, the second the dataset.

6.1. Improvement of the network


First, let us examine in more detail the architecture of FlowNetS [12]. This network can be split into two different parts, as illustrated in Figure 1. The first one extracts the feature maps using 10 successive convolutional layers. The first layer uses 7 × 7 filters, the second and third layers use 5 × 5 filters, and the remaining layers use 3 × 3 filters. The second part predicts the optical flow at different levels and relies, in the case of FlowNetS, on 5 convolutional layers with 3 × 3 filters and 8 transposed convolutional layers. The transposed convolutional layers are used for the up-sampling process (see the right-hand side of the schematic view depicted in Figure 1). We refer the interested reader to [12] for more details on the architecture.

As mentioned in Section 3, the FlowNetS architecture provides an optical flow of 1/4 of the original image resolution. The full-resolution optical flow is obtained by applying an up-scaling step with a factor of 4 using a bilinear interpolation. This reduces not only the computational complexity but also the quality of the results. In order to enhance the prediction of the optical flow, it was therefore proposed to improve the network to directly output a higher-resolution optical flow, without interpolation.
In general, the deeper the network, the better the results. The dataset shall however be enlarged accordingly to avoid overfitting (see definition in Section 3 above). Two approaches were examined in this study in order to enhance the accuracy of the predicted optical flow. The first one consists in adding some levels to the network (thus increasing the number of layers), the second in changing the architecture.

6.1.1. First approach: adding one or two levels


The first approach consists in adding levels to the network. A first option consists in adding one level to the FlowNetS network while keeping the same architecture. Consequently, the new output optical flow has a resolution divided by 2 instead of 4. This method still requires interpolation in order to reach the same resolution as the input images. A second option has also been examined. It consists in adding one more level so that the full resolution is directly obtained without any interpolation.

The loss function defined in Section 2 is equal to the following quantity

Loss = \sum_{i=1}^{n} \lambda_i \epsilon_i    (3)

where n is the number of levels forming the network, λ_i is a weighting coefficient corresponding to the i-th level, and ε_i is the AEE between the output of the i-th level and the ground-truth displacement sampled at the same resolution. This loss function was adapted to the proposed networks by keeping the same λ_i coefficients as in [12] for the levels corresponding to the FlowNetS levels, and affecting a coefficient of 0.003 to each new level.
fine-tuned by updating all the weights of the networks. lu the second scern1rio, only the weigh.ts of the
new level8 were updated. The remaining weigh.ts were the same as those obtained after applying the
fine-tuning process described in the previous section.
By applying the first scenario for training the new network with one additional level only, the average AEE value defined in Section 5.2 increases from 0.107 to 0.141, which means that the results are worse. The same trend is observed with the MAE value deduced from the displacement map obtained with the Star images (MAE = 0.3334). The displacement map obtained in this case is shown in Figure 7. It can be seen that the error made is significant. Adding one more level and retraining all the network weights does not improve the quality of the results, which means that the first scenario is not a good option for training the networks.

On the contrary, the second scenario really improves the accuracy of the results when considering the random displacement fields used to generate the 363 deformed speckle images considered as test dataset, with a reduction of the average AEE of more than 50% compared to FlowNetS-ft, as reported in Table 4. The average AEE concerns the 363 displacement fields retrieved from the 363 pairs of images of the test dataset. When considering now the Star displacement, we can see that the MAE reported in Table 4 for each of these two new networks is nearly the same as the MAE obtained with FlowNetS-ft. This is confirmed by visually comparing the Star displacements reported in Figure 8, which are obtained with these two new networks, and the reference Star displacement depicted in Figure 4b. Besides, it can be seen in Figure 8 that almost the same accuracy as FlowNetS-ft is obtained when considering the Star displacement field of Figure 4b.

The displacements along the horizontal axis of symmetry Δ of the Star displacements which are obtained with these two networks, FlowNetS-ft, and DIC (2M + 1 = 11 and 2M + 1 = 21 pixels) are depicted in Figure 9 to more easily compare the different techniques. No real improvement is clearly observed with these curves. The mean absolute error estimated column-wise is lower for the proposed networks than for DIC (2M + 1 = 11 pixels) for medium-range spatial frequencies, between x = 200 and x = 400, but this error becomes higher for x > 600. The obtained results show that the networks
Figure 7: Star displacement obtained by adding one level only to FlowNetS, performing interpolation to reach full resolution and updating all the weights (MAE = 0.3334). All dimensions are in pixels.

Metric | FlowNetS-ft | DIC, 2M + 1 = 11 pixels | One additional level | Two additional levels
Average AEE | 0.1070 | - | 0.0560 | 0.0450
MAE | 0.0437 | 0.0365 | 0.0445 | 0.0471

Table 4: Comparison between i- FlowNetS-ft, ii- DIC with 2M + 1 = 11 pixels and iii- FlowNetS after adding one or two levels and updating only the weights of the new levels. Average AEE calculated over the whole images of the test dataset and MAE calculated with the Star displacement.

Figure 8: Star displacement obtained by adding one or two levels to FlowNetS and updating only the weights of the new levels. (a) Option 1: adding one level to FlowNetS, MAE = 0.0445. (b) Option 2: adding two levels to FlowNetS, MAE = 0.0471. All dimensions are in pixels.

proposed here enhance the learning accuracy on Speckle dataset 1.0 compared to FlowNetS-ft, but no improvement is observed with the Star displacement.

6.1.2. Second approach: simplifying the architecture of FlowNetS


In FlowNetS, the feature maps are down-sampled 6 times (6 strides of 2 pixels), which means that the resolution of the last feature maps is 2⁶ times smaller than the resolution of the images which are processed. Then, optical flow estimation and up-sampling processing are performed only 4 times, which leads to an output optical flow with a resolution divided by 4.
Figure 9: Comparison between i- FlowNetS-ft, ii- FlowNetS after adding one or two levels and updating only the weights of the new levels, and iii- DIC with 2M + 1 = 11 and 2M + 1 = 21 pixels. Displacements along Δ and mean absolute error per column. All dimensions are in pixels.

We also propose two alternative networks with this second approach. The first one, named StrainNet-f, is a full-resolution network obtained by applying 4 down-samplings followed by 4 up-samplings. The second network, named StrainNet-h, is a half-resolution network obtained by applying 5 down-samplings followed by 4 up-samplings. The same loss function as in FlowNetS is used in both cases. These two networks are trained by using the weights of the corresponding levels of FlowNetS-ft as starting values, and then by fine-tuning all the network weights. The same training strategy as that described in Section 5.1 was adopted; a simplified structural sketch of the full-resolution option is given below.
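The structural change can be summarized by counting strides. The following highly simplified sketch (illustrative channel widths; the skip connections and multi-level outputs of the real network are omitted) shows why 4 down-samplings followed by 4 up-samplings yield a full-resolution output:

import torch
import torch.nn as nn

def down(c_in, c_out):   # one feature-extraction level (stride-2 convolution)
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU())

def up(c_in, c_out):     # one prediction level (transposed convolution)
    return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4, stride=2, padding=1),
                         nn.ReLU())

# Full-resolution variant in the spirit of StrainNet-f: 4 down-samplings
# followed by 4 up-samplings, so the output size equals the input size.
net_f = nn.Sequential(down(2, 16), down(16, 32), down(32, 64), down(64, 128),
                      up(128, 64), up(64, 32), up(32, 16),
                      nn.ConvTranspose2d(16, 2, 4, stride=2, padding=1))

x = torch.randn(1, 2, 256, 256)   # two speckle images stacked channel-wise
print(net_f(x).shape)             # torch.Size([1, 2, 256, 256])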
The average AEE and MAE values reported in Table 5 clearly show that these two proposed networks outperform the previous ones, even though no real difference is visible to the naked eye between the Star displacements reported in Figure 10 and those reported in Figure 8. This is clearer when observing the displacements along Δ reported in Figure 11 and comparing them with their counterparts in Figure 9. Indeed, the sharp fluctuations visible in the closeup view embodied in Figure 11 are much smaller and smoother than those shown in Figure 9. In addition, the MAE per column is lower in the latter than in the former at high frequencies (for about 200 < x < 400). Finally, the main conclusion of the different metrics given in Tables 4 and 5 and of the maps or curves shown in Figures 10 and 11 is that the last two networks perform better than DIC (2M + 1 = 11 and 21 pixels, first order) at high spatial frequencies, and that they provide comparable results at low frequencies. Let us now examine how to further improve the results by changing the dataset used to train the networks.

Metric | FlowNetS-ft | DIC, 2M + 1 = 11 pixels | Half-resolution network | Full-resolution network
Average AEE | 0.1070 | - | 0.0352 | 0.0211
MAE | 0.0437 | 0.0365 | 0.0350 | 0.0361

Table 5: Comparison between i- FlowNetS-ft, ii- DIC with 2M + 1 = 11 pixels, and iii- FlowNetS after changing the architecture with two options: half-resolution (StrainNet-h) and full resolution (StrainNet-f), and updating all the weights. Average AEE and MAE (in pixels), calculated over the whole images of the test dataset.

Different methods were investigated in this section in order to improve the network. Table 6 gathers all these methods and the corresponding results in order to help the reader compare them and easily find the results obtained in each case. The results can also be improved by changing the dataset. This is the aim of the next section.

Figure 10: Star displacement obtained after changing the architecture of FlowNetS with two options: (a) half-resolution network StrainNet-h, MAE = 0.0350, and (b) full-resolution network StrainNet-f, MAE = 0.0361, and updating all the weights. All dimensions are in pixels. The abscissa of the vertical red line is such that the period of the sine wave is equal to 16 pixels, thus twice the size of the regions used to define the displacement maps gathered in Speckle dataset 1.0. The red square at the top left is the zone considered for plotting the closeup view in Figure 12.

Figure 11: Comparison between i- FlowNetS-ft, ii- DIC with 2M + 1 = 11 and 21 pixels, and iii- FlowNetS after changing the architecture with two options: half-resolution and full resolution, and updating all the weights. Displacements along Δ and mean absolute error per column. All dimensions are in pixels.

6.2. Improvement of the training dataset


6.2.1. Procedure
An interesting conclusion of the preceding simulations is that, regardless of the influence of sensor noise propagation to final displacement maps, the networks proposed in the first approach (see Section 6.1.1) enhance the average accuracy over the test dataset. Nevertheless, the accuracy of the Star displacement field returned by these networks is not improved, especially when high-frequency displacements are concerned. This suggests a limited generalization capability of the proposed networks, probably caused by the nature of the training dataset.

We now discuss how to change the speckle dataset so that a lower noise and a lower bias can be obtained with both StrainNet-f and -h for the highest spatial frequencies of the Star displacement.

Approach                                 Option            Training scenario            Results
1- Adding levels                         one level         Updating all the weights     Fig. 7
1- Adding levels                         one level         Updating new weights only    Table 4, Fig. 8a
1- Adding levels                         two levels        Updating new weights only    Table 4, Fig. 8b
2- Changing architecture (StrainNet-h)   Half-resolution   Updating all the weights     Table 5, Fig. 10a, Fig. 11
2- Changing architecture (StrainNet-f)   Full-resolution   Updating all the weights     Table 5, Fig. 10b, Fig. 11

Table 6: Methods and options investigated to improve the network in this study.

Observing the bias at high frequencies in the Star displacement map. Before improving the dataset, let us examine in detail the results obtained in the preceding section. For instance, the error maps depicted in Figure 10 show that the error suddenly increases for the highest frequencies (those on the left). Interestingly, the location of this sudden change in the response of the network substantially corresponds to the zone of the displacement map for which the spatial period of the vertical sine wave is equal to twice the size of the region used in Speckle dataset 1.0, namely 8 × 2 = 16 pixels. This is confirmed by superimposing on these maps a vertical red line at an abscissa which corresponds to this period of 16 pixels. The explanation is that the displacement fields considered in Speckle dataset 1.0 are linear interpolations of random values 8 pixels apart. They are therefore increasing or decreasing over an 8-pixel interval, and cannot correctly represent sine waves of period lower than 16 pixels.
In the same way, let us now enlarge the displacement field obtained with StrainNet-h trained on Speckle dataset 1.0. The zone under consideration is a small square portion of the displacement field (see precise location with the red square in Figures 10a and 13a). The result is given in Figure 12a. It can be observed that the network is impacted by the fact that the dataset used for training purposes is generated from 8 × 8 independent regions. Indeed, the network predicts the displacement at one point per region and then interpolates the results to obtain the full optical flow. This phenomenon is confirmed by down-sampling a predicted displacement with a factor of 8 and then up-sampling the result with the same factor. The resulting displacement is practically the same as the one given by StrainNet-h trained with Speckle dataset 1.0. The main conclusion is that the network cannot correctly predict the displacements on the left of the Star displacement because they occur at higher spatial frequencies than those used in Speckle dataset 1.0.
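This check is easy to reproduce. The following sketch (ours, not part of the original processing chain; `u_pred` is a placeholder for an actual StrainNet-h prediction) down-samples one component of a displacement map by a factor of 8 and bilinearly up-samples it again, so that the result can be compared with the raw network output:

```python
# Minimal sketch of the down/up-sampling check described above.
# `u_pred` is assumed to be a 2-D NumPy array holding one displacement
# component predicted by StrainNet-h trained on Speckle dataset 1.0.
import numpy as np
from scipy.ndimage import zoom

def downsample_upsample(u_pred, factor=8):
    """Down-sample then up-sample a displacement map with bilinear interpolation."""
    coarse = zoom(u_pred, 1.0 / factor, order=1)  # keep roughly one value per region
    fine = zoom(coarse, factor, order=1)
    return fine[:u_pred.shape[0], :u_pred.shape[1]]

u_pred = np.random.rand(400, 1840)  # placeholder for a real network prediction
u_check = downsample_upsample(u_pred)
# For an actual StrainNet-h output, this difference is observed to be very small:
print("mean abs difference:", np.abs(u_pred - u_check).mean())
```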

Figure 12: Results obtained with StrainNet-h trained on Speckle datasets 1.0 and 2.0: closeup view at the high spatial frequencies area of the Star displacement (see precise location in Figures 10 and 13). (a) StrainNet-h trained on Speckle dataset 1.0. (b) StrainNet-h trained on Speckle dataset 2.0. (c) Reference displacement. All dimensions are in pixels.

A consequence of the remarks above is that square regions smaller in size than 8 pixels should also be included in the speckle dataset to be able to retrieve displacement fields involving spatial frequencies higher than 1/8 pixel⁻¹. A second and more suitable dataset called Speckle dataset 2.0 was therefore generated, as explained below.

Generating Speckle dataset 2.0. Speckle dataset 2.0 was generated with the same principle as Speckle dataset 1.0, but by changing three design rules. First, regions of various sizes instead of a uniform size of 8 × 8 pixels were considered to define the random ground-truth displacement fields. On the one hand, the preceding remark motivates the use of smaller regions. On the other hand, the less accurate estimation than with DIC for low-frequency displacements (see for instance Figure 11) motivates the use of larger regions. We therefore considered regions of size equal to 4 × 4, 8 × 8, 16 × 16, 32 × 32, 64 × 64 and 128 × 128 pixels.
Second, the bilinear interpolation used in Speckle dataset 1.0 was replaced by a bicubic one to limit potential interpolation bias. Third, a noise was added to all the images of the dataset in order to simulate the effect of sensor noise, which always corrupts digital images, while only noiseless images were considered in Speckle dataset 1.0. This noise was heteroscedastic to mimic the typical sensor noise of actual linear cameras [47, 43]. With this model, the variance v of the camera sensor noise is an affine function of the brightness s, so v = a × s + b. We chose here a = 0.0342 and b = 0.2679 to be consistent with the values used to generate the noisy images of the DIC Challenge 2.0 [4].

The number of frames was the same for each region size. It was equal to 363 × 10 = 3,630 (363 reference images deformed 10 times). Since six different region sizes were considered, Speckle dataset 2.0 contains 6 × 3,630 = 21,780 different pairs of images.
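For illustration purposes, the following sketch summarizes two of these ingredients. It is only a schematic reconstruction under assumptions, not the code actually used to generate the dataset: the frame size, the displacement amplitude, and the warping step (left as a comment) are placeholders.

```python
# Illustrative sketch of Speckle dataset 2.0 ingredients: random ground-truth
# displacements defined on regions of various sizes with bicubic-like
# interpolation, and heteroscedastic sensor noise of variance v = a*s + b.
import numpy as np
from scipy.ndimage import zoom

A, B = 0.0342, 0.2679   # noise model coefficients quoted in the paper
H, W = 256, 256         # assumed frame size (divisible by all region sizes)

def random_displacement(region_size, amplitude=1.0):
    """One random value per region, interpolated cubically to full size."""
    coarse = np.random.uniform(-amplitude, amplitude,
                               (H // region_size, W // region_size))
    return zoom(coarse, region_size, order=3)  # order=3: cubic interpolation

def add_heteroscedastic_noise(img):
    """Sensor noise whose variance is an affine function of the brightness."""
    std = np.sqrt(A * img + B)
    return img + np.random.randn(*img.shape) * std

for region_size in (4, 8, 16, 32, 64, 128):
    u = random_displacement(region_size)  # one displacement component
    # ... warp the reference speckle with the (u, v) field, then:
    # noisy = add_heteroscedastic_noise(deformed)
```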

6.2.2. Results obtained with the Star images

Results obtained with noiseless images. StrainNet-h and StrainNet-f were trained again, but this time on Speckle dataset 2.0 instead of Speckle dataset 1.0. Processing then the noiseless Star images with these two networks gives the results illustrated in Figure 13-a and -b, respectively. It is worth remembering that StrainNet-f and -h are used here to determine a displacement field from noiseless frames after being trained on noisy frames.

(a) StrainNet-h, MAE = 0.0305. (b) StrainNet-f, MAE = 0.0266.

Figure 13: Star displacement obtained with StrainNet-f and StrainNet-h trained on Speckle dataset 2.0. The red square at the top left is the zone considered for plotting the closeup view in Figure 12. All dimensions are in pixels.

Results obtained at the right-hand side of the displacement map of Figure 13 are rather smooth. In addition, bearing in mind that the colorbars used in Figures 10 and 13 are the same, it can be seen by comparing the error maps that the high spatial frequencies are rendered in a more accurate way with the networks trained on Speckle dataset 2.0, in particular the high-frequency components in the left-hand side of the displacement maps.

Figure 14 compares the results obtained by StrainNet-h and StrainNet-f trained in turn on Speckle datasets 1.0 and 2.0. The results obtained by these networks trained on Speckle dataset 2.0 are also compared in Figure 15 with DIC (subset size 2M + 1 = 11 and 21 pixels). It is clear that the results obtained after training on Speckle dataset 2.0 are better and show smaller fluctuations of the error. Furthermore, the results shown in Figure 15 show that StrainNet-h and StrainNet-f have an accuracy similar to DIC at low frequencies and a better one at high frequencies.

Finally, it can be concluded that training the network with Speckle dataset 2.0 instead of Speckle dataset 1.0 leads to better results with the Star images. Let us now examine the influence of image noise on the results.

Figure 14: Comparison between the networks trained on Speckle dataset 1.0 and Speckle dataset 2.0.

Results obtained with noisy images. The previous results were obtained with noiseless images. We now consider noisy images to evaluate the dependency of the maps provided by StrainNet-f and -h to image noise. The Star images used in this case were obtained by adding a heteroscedastic noise similar to the one discussed above to the noiseless Star images.
Figure 16 shows the maps obtained with StrainNet-h and StrainNet-f with the Star images, as well as those obtained with DIC for comparison purposes (2M + 1 = 11 and 21 pixels). It is worth noting that StrainNet-h and StrainNet-f outperform DIC with 2M + 1 = 11 pixels at both high and low spatial frequencies. Indeed, the bias is lower for the high spatial frequencies and a smoother displacement distribution (thus a lower noise) is obtained.

7. Spatial resolution and metrological performance indicator

7.1. Spatial resolution

As in other studies dealing with the metrological performance of full-field measurement techniques [10, 46], the spatial resolution of each algorithm is estimated here by considering the displacements along the horizontal axis of symmetry Δ. The reference value is 0.5 pixels all along this line, but the measured one becomes lower when going toward the high frequencies of the star displacement, thus toward the left of the map. The attenuation of the signal, which directly reflects the bias, depends on the spatial frequency. The spatial resolution, denoted here by d, is generally defined as the inverse of the spatial frequency obtained for a given attenuation of the amplitude of the vertical sine displacement.
Figure 15: Comparison between the networks trained on Speckle dataset 2.0 and DIC (2M + 1 = 11 and 21 pixels). Top: displacements along Δ. Bottom: mean absolute error per column. All dimensions are in pixels.

This attenuation is generally equal to 10% = 0.1. In the present case, it means that the spatial resolution is the period of the vertical sine displacement for which the amplitude is equal to 0.5 − 0.5 × 0.1 = 0.45 pixel. The value of d must be as small as possible to reflect a small spatial resolution, thus the ability of the measuring technique of interest to distinguish close features in displacement and strain maps, and to return a value of the displacements and strains in these regions with a small bias. In certain cases, the displacement resolution can be predicted theoretically from the transfer function of the filter associated to the technique (Savitzky-Golay filter for DIC [5, 6], Gaussian filter for the Localized Spectrum Analysis [48]).

In the present case of CNNs however, no transfer function has been identified so far, so d can only be determined numerically or graphically, by seeking the intersection between the curve representing the error obtained along Δ with noiseless images on the one hand, and a horizontal line of equation y = 10% on the other hand, see Figure 17. Note that the curves were smoothed with a rectangular filter to remove the local fluctuations of the bias that could potentially disturb a proper estimation of this quantity. The spatial resolution found in each case is directly reported in each subfigure. We also considered here second-order subset shape functions, this case being more favorable for DIC [49]. Only the case 2M + 1 = 21 pixels is reported here, DIC diverging at some points with 2M + 1 = 11 pixels.

The value of d is smaller with both StrainNet-f and -h than with DIC used with first-order subset shape functions, even for 2M + 1 = 11 pixels, while d is nearly the same with DIC used with second-order subset shape functions. Indeed, Figure 18 (top) shows that the displacement along Δ is similar between StrainNet-f or -h on the one hand, and DIC with second-order subset shape functions on the other hand. Figure 19, where closeup views of the error map for the highest spatial frequencies are depicted, shows however that the way d is estimated is more favorable for DIC with second-order subset shape functions than for StrainNet-f and -h. Indeed, an additional bias occurs with DIC when going vertically away from the axis of symmetry Δ, as clearly illustrated in Figure 19a. Figures 19b-c show that it is not the case for StrainNet-f and -h. This phenomenon is not taken into account when estimating d since only the loss of amplitude along Δ is considered. Consequently, when considering the mean absolute error per column as in Figure 18 (bottom), it can be observed that this error is lower for StrainNet-f and -h than for DIC with second-order subset shape functions.
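The following lines give a possible numerical transcription of this graphical procedure; it is a sketch under assumptions, not the code used in this study. `wavelength` and `amplitude` are supposed to contain the period of the sine wave and the measured amplitude along Δ, sorted by increasing wavelength.

```python
# Numerical search for the spatial resolution d: smooth the bias curve with a
# rectangular filter and find where it crosses the 10% line.
import numpy as np

def spatial_resolution(wavelength, amplitude, ref=0.5, threshold=0.10, win=9):
    bias = 100.0 * (ref - amplitude) / ref           # bias in % of the amplitude
    kernel = np.ones(win) / win                      # rectangular smoothing filter
    bias = np.convolve(bias, kernel, mode="same")
    crossing = np.where(bias <= 100 * threshold)[0]  # smallest resolved wavelength
    return wavelength[crossing[0]] if crossing.size else np.nan
```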

(a) DIC, 2M + 1 = 21 pixels, MAE = 0.0834. (b) DIC, 2M + 1 = 11 pixels, MAE = 0.0417. (c) StrainNet-h, MAE = 0.0333. (d) StrainNet-f, MAE = 0.0299. (e) Displacement along Δ and mean absolute error per column.

Figure 16: Results given by DIC and StrainNet with noisy images. All dimensions are in pixels.

(a) DIC, 2M + 1 = 21 pixels, first-order subset shape functions (spatial resolution = 79 pixels). (b) DIC, 2M + 1 = 11 pixels, first-order subset shape functions. (c) DIC, 2M + 1 = 21 pixels, second-order subset shape functions (spatial resolution = 26.8 pixels). (d) StrainNet-h. (e) StrainNet-f (spatial resolution = 26.7 pixels).

Figure 17: Seeking the spatial resolution of each technique. In each subfigure, the bias given as a percentage of the displacement amplitude, which is equal to 0.5 pixel, is plotted against the spatial wavelength (in pixels), without filtering and after median filtering.

Figure 18: Comparison between StrainNet and DIC (2M + 1 = 21 pixels) with second-order subset shape functions (noisy images). Top: displacements along Δ. Bottom: mean absolute error per column. All dimensions are in pixels.

(a) DIC, 2M + 1 = 21 pixels, second-order subset shape functions. (b) StrainNet-h. (c) StrainNet-f.

Figure 19: Closeup view of the error map in pixels (for the high spatial frequencies) obtained with StrainNet and DIC with second-order subset shape functions.

7.2. Metrological performance indicator

A general remark holds for full-field measurement systems: the lower the value of d, the higher the noise in the displacement map. The noise level is quantified by the standard deviation of the noise. This quantity, denoted by σu, reflects the displacement resolution of the technique. Thus there is a link between d and σu. It has been rigorously demonstrated in [48, 43] that this product is constant for the Localized Spectrum Analysis, a spectral technique which minimizes the optical residual in the Fourier domain. It has also been observed in subsequent studies that it was also the case for DIC [7, 46], which minimizes the same residual, but in the spatial domain. Hence estimating the product between d and σu is a handy way to compare measurement systems for a given bias, but independently from any other parameter chosen by the user such as the subset size for DIC. This product, denoted by α and named "metrological performance indicator" in [7, 46], has been calculated here for the different algorithms. The value of σu is merely estimated by calculating the standard deviation of the difference between the displacement fields found by processing noisy and noiseless Star images. The values of α found for each algorithm are reported in Figure 20.
This indicator is nearly the same for DIC used with 2M + 1 = 11 pixels and 21 pixels (1st order), which is consistent with the conclusions of [7]. It is also almost identical for StrainNet-f and -h. Both lie between DIC used with first- and second-order subset shape functions. Since the spatial resolution estimated with d is nearly the same with DIC used with second-order subset shape functions and StrainNet, it means that the noise level is higher in the latter case. This can be observed in the noise maps depicted in Figure 21. In particular, the shape of the wave can be guessed in Figures 21b-c. A higher difference can also be observed on the left, so for the highest spatial frequencies. It means that a slight bias is induced by noise in these two cases. Further investigations should be undertaken to see how to eliminate this phenomenon, by changing the dataset and/or the layers of the network itself.
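In practice, this estimation reduces to a few operations, as in the following sketch (our illustration, not the authors' code): `u_noisy` and `u_noiseless` stand for the displacement fields returned when processing the noisy and noiseless Star images with the same technique, and `d` for the spatial resolution of Section 7.1.

```python
# Minimal sketch of the indicator alpha = d * sigma_u defined above.
import numpy as np

def metrological_indicator(u_noisy, u_noiseless, d):
    sigma_u = np.std(u_noisy - u_noiseless)  # displacement resolution
    return d * sigma_u
```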

DIC, 2M + 1 = 11 px, 1st-order: α = 0.62. DIC, 2M + 1 = 21 px, 1st-order: α = 0.59. StrainNet-h: α = 0.46. StrainNet-f: α = 0.49. DIC, 2M + 1 = 21 px, 2nd-order: α = 0.39.

Figure 20: Metrological efficiency indicator α for DIC (1st- and 2nd-order subset shape functions), StrainNet-h and StrainNet-f.

7.3. Pattern-induced bias

Pattern-induced bias (PIB) is a phenomenon which has only recently been observed and described in the DIC literature [50, 51]. This bias is induced by the pattern texture itself. It manifests itself by the presence of random fluctuations in the displacement and strain maps. These random fluctuations shall not be confused with sensor noise propagation since they are due to different causes, mainly the image gradient distribution and the difference between the true displacement field and its local approximation by subset shape functions, see [6] where a model for this phenomenon is proposed. These spatial fluctuations are randomly distributed because speckle patterns are randomly distributed. The aim here is to briefly examine whether displacement fields retrieved by StrainNet are also prone to this phenomenon. We performed for this the same two experiments as in [51]. These experiments also rely on synthetic images deformed through the Star displacement. The first one consists in considering a unique pair of reference/deformed speckle images, adding 50 times a different copy of heteroscedastic noise, retrieving the 50 corresponding displacement fields, and plotting the mean distribution along the horizontal axis of symmetry Δ of the displacement map. The second one consists in considering 50 different pairs of reference/deformed speckle images, adding heteroscedastic noise to each image, retrieving the 50 corresponding displacement fields, and plotting again the mean distribution along Δ.
(a) DIC, 2M + 1 = 21 pixels, 2nd-order subset shape functions. (b) StrainNet-h. (c) StrainNet-f.

Figure 21: Difference between displacement fields obtained with noisy and noiseless speckle images (in pixels). All dimensions are in pixels.

With the first experiment, the random fluctuations due to sensor noise are averaged out (or at least decrease in amplitude by averaging effect). However, PIB is constant over the dataset since the displacement is the same and the speckles are similar from one image to another, the only difference between them being due to noise. In the second experiment, all the speckles are different and they are noisy, so both the random fluctuations due to sensor noise on the one hand, and the random fluctuations caused by PIB on the other hand, are averaged out. Comparing the curves obtained in each of these two cases enables us to numerically illustrate the effect of PIB on the displacement field, and to study its properties.
Figure 22 shows on the left and on the right the curves obtained on average in the first and second cases, respectively. They are plotted in red. The curves obtained for the 50 different pairs of images are superimposed in each case to give an idea of the fluctuations which are observed with each pair of images. The results obtained with DIC (2M + 1 = 11 pixels, 1st order and 2M + 1 = 21 pixels, 2nd order) are also shown for comparison purposes. It is worth remembering that exactly the same sets of images are processed here by DIC and StrainNet. The main remark is that PIB also affects the results given by StrainNet, but averaging the results provided by 50 different patterns improves the results less than for DIC. The effect of PIB seems therefore to be less pronounced for StrainNet than for DIC, other phenomena causing these fluctuations. It is worth remembering that StrainNet-f and -h were trained on Speckle dataset 2.0, and that the deformed images contained in this dataset were obtained by interpolation. It would therefore be interesting to train StrainNet-f and -h with images obtained with the speckle generator described in [39]. Indeed, this latter generator does not perform any interpolation, so we could see if the errors observed in Figure 22f are due to the pattern alone or to both the pattern and the bias due to the interpolation used when generating the deformed images. A larger dataset with another random displacement generation scheme as in Speckle dataset 2.0 could also help smoothing out this bias.
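These two experiments can be summarized by the sketch below. It is schematic: `measure`, `add_noise` and `pairs` are placeholders for the tested technique (DIC or StrainNet), the heteroscedastic noise model, and the list of synthetic reference/deformed image pairs, respectively.

```python
# Schematic version of the two pattern-induced bias experiments.
import numpy as np

def experiment_1(ref, deformed, measure, add_noise, n=50):
    """Same pattern, n noise copies: sensor noise averages out, PIB remains."""
    fields = [measure(add_noise(ref), add_noise(deformed)) for _ in range(n)]
    return np.mean(fields, axis=0)

def experiment_2(pairs, measure, add_noise):
    """Different patterns: both sensor noise and PIB average out."""
    fields = [measure(add_noise(r), add_noise(d)) for r, d in pairs]
    return np.mean(fields, axis=0)
```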

(a) DIC, 2M + 1 = 11 pixels, 1st-order subset shape functions, 1 pattern with 50 noises. (b) DIC, 2M + 1 = 11 pixels, 1st-order subset shape functions, 50 patterns with a different noise. (c) DIC, 2M + 1 = 21 pixels, 2nd-order subset shape functions, 1 pattern with 50 noises. (d) DIC, 2M + 1 = 21 pixels, 2nd-order subset shape functions, 50 patterns with a different noise. (e) StrainNet-f, 1 pattern with 50 noises. (f) StrainNet-f, 50 patterns with a different noise.

Figure 22: Pattern-induced bias. Comparison between results obtained with DIC (2M + 1 = 11 pixels, 1st order and 2M + 1 = 21 pixels, 2nd order) and StrainNet-f.

8. Assessing the generalization capability

In deep learning, a key point is to validate any CNN with images different from those used to train the network in order to ensure a good generalization capability. In the preceding sections, we took care to use StrainNet on speckle images deformed with the Star displacement while this network was trained with Speckle dataset 2.0, which does not contain any image deformed in a similar way. This is, however, not sufficient because both the reference images in Speckle dataset 2.0 and the reference Star image were generated with the same speckle generator [39]. Two other examples were therefore considered in this study. Both involve speckles which are different from those obtained with the speckle generator [39], which generates the reference frames in the speckle datasets as well as the reference and deformed Star images. The first example concerns images of synthetic speckles from Sample 14 of the DIC Challenge 1.0 [4]; the second, real speckle images taken during a compression test performed on a wood specimen, as described in [44].

8.1. Example 1: Sample 14 of the DIC Challenge 1.0

In this section, we consider a pair of images from Sample 14 of the DIC Challenge 1.0 [4, 3]. The speckle pattern is obtained with TexGen, a speckle generator described in [52]. This pattern is deformed by using a standard FFT expansion. It can be checked that the visual aspect is different from that of the images used so far in this paper. In particular, large dots over which the image gradient is nearly null can be observed, see Figure 23a where a typical subset size of 2M + 1 = 21 pixels is superimposed. For comparison purposes, Figure 23b represents a closeup view of one of the speckle images of Speckle dataset 2.0. The displacement used to artificially deform the images is a sine wave along the horizontal axis, but with a frequency which increases when going to the right, the amplitude being constant and equal to 0.1 pixel. This example is discussed in [3], so the same colorbar as in this reference is used here to make comparison easier. Figures 23c-j show the displacement fields u along the horizontal direction obtained with the images from the "Sample 14 L5 Amp0.1" file available in [4]. DIC (2M + 1 = 11 and 21 pixels, first-order subset shape functions), StrainNet-h, and -f are used here. The mean value along the vertical direction of the horizontal displacement, denoted here by ūx, is also represented.
It can be seen that the results obtained with DIC and 2M + 1 = 21 pixels are the least affected by noise. Noise increases with 2M + 1 = 11 pixels, which is logical since the subset size is smaller. The displacement field obtained with StrainNet-h is similar to the one obtained with DIC and 2M + 1 = 11 pixels. Figure 24 shows the strain fields εxx corresponding to the displacement fields shown above. They were obtained by convolving the corresponding displacement fields with a derivative kernel, which enables us to perform at the same time both a smoothing of the noisy data and a differentiation. This kernel is the x-derivative of a Gaussian window of standard deviation equal to 6 pixels. Again, similar maps are obtained with StrainNet-f and -h trained on Speckle dataset 2.0 on the one hand, and DIC run with 2M + 1 = 11 pixels on the other hand. However, the main conclusion here is not really that StrainNet-f and -h give displacement and strain maps similar to DIC, but that these networks trained with speckles obtained with the speckle generator described in [39] are able to determine the displacement field used to deform speckle images obtained with another generator.
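This differentiation step can be reproduced, for instance, with the sketch below. It is an assumption of ours relying on scipy, not the authors' exact implementation:

```python
# Strain map eps_xx from a displacement field `u`: convolving with the
# x-derivative of a Gaussian window smooths and differentiates at once.
import numpy as np
from scipy.ndimage import gaussian_filter

def strain_xx(u, sigma=6.0):
    """eps_xx = du/dx, computed with a Gaussian derivative of std 6 pixels."""
    # order=(0, 1): derivative along the second (horizontal) axis only
    return gaussian_filter(u, sigma=sigma, order=(0, 1))
```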

(a) Close-up view of the images of Sample 14 of the DIC Challenge 1.0; a subset used in DIC is superimposed (size: 2M + 1 = 21 pixels). (b) Close-up view of one of the images of Speckle dataset 2.0 used to train the network. (c)-(d) ux, DIC, 2M + 1 = 21 pixels. (e)-(f) ux, DIC, 2M + 1 = 11 pixels. (g)-(h) ux, StrainNet-h. (i)-(j) ux, StrainNet-f.

Figure 23: Results obtained by processing images from Sample 14 of the DIC Challenge [4]. Top: closeup view of the speckle. Left: displacement fields. Right: profile of the average displacement over all the columns of the maps. See [3] to compare these maps to those obtained with other DIC packages. All dimensions are in pixels.

(a) DIC, 2M + 1 = 21 pixels. (b) DIC, 2M + 1 = 11 pixels. (c) StrainNet-h. (d) StrainNet-f.

Figure 24: Strain maps εxx deduced from the displacement fields depicted in Figure 23. All dimensions are in pixels.

8.2. Example 2: Compression test on a wood specimen

In this second example, we consider a real test performed on a wood specimen shown in Figure 25a, see [44] for more details about this specimen and the testing conditions. The interest here is twofold. First, real speckles instead of artificial ones are considered. Second, the constitutive material of the specimen is heterogeneous since it is a stack of early and late wood. A consequence is that the stiffness spatially changes, and so does the strain distribution if the rings are perpendicular to the loading force, which is the case here. We consider a typical pair of frames and applied StrainNet-f to determine the displacement field. Results obtained with DIC with 2M + 1 = 21 pixels (1st-order subset shape functions) are shown for comparison purposes. A convolution with a Gaussian derivative filter is then applied in all cases to deduce the vertical strain map εyy.
It can be seen that similar maps are obtained but, again, a more refined analysis should be performed to discuss the possible damping of the actual details in the strain map, as in [44], but this is out of the scope of the present paper. The main point here is that StrainNet is able to extract displacement and strain fields featuring rather high spatial frequencies from images different from those obtained with Speckle dataset 2.0, since this is here a real speckle pattern. It must be noted that the displacement is greater than one pixel over most of the front face of the specimen. This displacement was therefore split into two quantities. The first one is the displacement with a quite rough pixel resolution. The second one is the subpixel displacement. The images are therefore processed piecewise, in such a way that the round value of the displacement is the same throughout each of the elements of the mesh. A mesh of 11 horizontal bands is considered here. This round value of the displacement can easily be found by cross-correlation for instance. This point is however not really challenging and we applied here a rough DIC to get this integer value for the sake of simplicity, only the subpixel determination of the displacement being of interest in the present study. A consequence is the fact that on close inspection, slight straight lines can be seen in the strain maps, along the border between some of the elements. Nothing can be visually detected at the same place in the displacement maps.
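The sketch below illustrates this piecewise processing for one band of the mesh. It relies on assumed inputs: `shift` is the rough integer displacement of the band, found beforehand by cross-correlation or a rough DIC, and `strainnet` stands for the trained network.

```python
# Splitting the displacement into an integer part and a subpixel residual.
import numpy as np

def process_band(ref_band, def_band, strainnet, shift):
    """`shift` = (row, column) integer displacement of this band; the network
    then only sees a subpixel residual between the two frames."""
    recentered = np.roll(def_band, (-shift[0], -shift[1]), axis=(0, 1))
    u_sub, v_sub = strainnet(ref_band, recentered)  # subpixel fields from the CNN
    return u_sub + shift[1], v_sub + shift[0]       # total = integer + subpixel
```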

(a) Specimen before spray-painting, after [44]. (b) Speckled surface of the specimen after spray-painting, after [44]. (c)-(d) DIC, 2M + 1 = 21 pixels. (e)-(f) StrainNet-h. (g)-(h) StrainNet-f.

Figure 25: Results obtained by processing real images. Left: v displacement field in pixels. Right: εyy strain map. All dimensions are in pixels.

9. Computing time

Finally, we give some information on the computing time needed to perform the calculations. CNNs are well suited to be run on Graphics Processing Units (GPUs), which was the case here. This is even necessary to achieve training in a reasonable amount of time, as mentioned in Section 5.1. Once StrainNet was trained, we used it on the same GPU. The computing time needed to estimate the subpixel displacement field is reported in Table 7. Two typical examples are considered here. The first one corresponds to one pair of Star images, the second to Sample 14 of the DIC Challenge 1.0. The number of pixels of the frames and the computing time are also reported in this table. The number of Points Of Interest per second (POI/s) is also given in each case. This quantity represents the number of points per second at which a measurement is provided by the measuring system. It is used in [53] in order to measure the calculating speed of GPU-based DIC. In our case, using this quantity is a handy way to normalize the results obtained with different techniques and different frame sizes, and to fairly compare them.

Case        Frame size    StrainNet-h                      StrainNet-f
                          Computing time (s)   POI/s       Computing time (s)   POI/s
Example 1   2000 × 501    0.0081               1.24E+08    0.077                1.30E+07
Example 2   2048 × 589    0.0088               1.37E+08    0.092                1.31E+07

Table 7: Computing time and Points Of Interest per second (POI/s) for Examples 1 and 2.
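As a quick check, the POI/s values of Table 7 can be recovered from the frame size and the computing time, as in this short snippet (our illustration):

```python
# POI/s = number of points at which a measurement is returned / computing time.
# Reproducing the Example 1 values of Table 7 (frame size 2000 x 501 pixels):
for name, t in {"StrainNet-h": 0.0081, "StrainNet-f": 0.077}.items():
    print(name, f"POI/s = {2000 * 501 / t:.2e}")  # 1.24e+08 and 1.30e+07
```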

For StrainNet-f and -h, the value of POI/s is about ten times higher for the "h" version than for the "f" one, while the resolution before the final interpolation to reach full resolution is 4 times lower. Interestingly, Ref. [53] reports a POI/s equal to 1.66·10^5 and 1.13·10^5 for a parallel DIC code implemented on a GPU, which is nearly two orders of magnitude below. These values are given for information only: the GPU used in [53] (NVIDIA GTX 760, 2.3 TFLOPS) is indeed less powerful than the GPU used in the present study (NVIDIA TESLA V100, 14 TFLOPS). In addition, the reader should bear in mind that CNNs must be trained with a suitable dataset, which generally represents heavy calculations. Further work should therefore be undertaken to fairly compare StrainNet and a GPU-based DIC in terms of computing time. The conclusion is, however, that StrainNet provides pixel-wise displacement maps (and thus strain maps by convolution with suitable derivative filters) at a rate compatible with real-time measurements.

10. Conclusion

This paper presented a CNN dedicated to the measurement of displacement and strain fields. As for DIC, the surface to be investigated was marked with a speckle pattern. Various strategies deduced from the similar problem of optical flow determination were presented, and the best one has been adapted to give a network named StrainNet. This network was trained with two versions of a specific speckle dataset. The main result was to demonstrate, through some relevant examples, the feasibility of this type of approach for accurate pixelwise subpixel measurement over full displacement fields. As for other problems tackled with CNNs in engineering, the main benefit here is the very short computing time to obtain the sought quantities, here the displacement fields. Further studies remain necessary to investigate various problems, which are still open after this preliminary work. For instance, the dataset used in order to train the network directly influences the quality of the final results. The dataset developed here led to valuable results for speckles different from those of the images forming the dataset, in particular experimental ones. This observation should however be consolidated by considering a wider panel of speckles and thus by trying to reduce noise and bias in the final displacement maps. The generator free from any interpolation, which was used here only for generating the deformed Star images for a matter of time, could also be employed for the images of the dataset despite the computing cost. The networks discussed here were obtained by enhancing FlowNetS. A complete redesign should also be undertaken, in particular in order to simplify the network. This would certainly reduce both the training and the processing times. Finally, a model able to deal with displacements larger than one pixel while still giving accurate subpixel estimation should also be investigated further, for instance by training the network on a dataset containing deformed images involving displacements greater than one pixel.

Acknowledgments

This work has been sponsored by the French government research program "Investissements d'Avenir" through the IDEX-ISITE initiative 16-IDEX-0001 (CAP 20-25) and the IMobS3 Laboratory of Excellence (ANR-10-LABX-16-01).

References

[1] M. Sutton, J.-J. Orteu, and H. Schreier. Image Correlation for Shape, Motion and Deformation Measurements. Basic Concepts, Theory and Applications. Springer, 2009.

[2] B. Pan, K. Qian, H. Xie, and A. Asundi. Two-dimensional digital image correlation for in-plane displacement and strain measurement: a review. Measurement Science and Technology, 20(6):062001, 2009.

[3] P. L. Reu, E. Toussaint, E. Jones, H. A. Bruck, M. Iadicola, R. Balcaen, D. Z. Turner, T. Siebert, P. Lava, and M. Simonsen. DIC challenge: Developing images and guidelines for evaluating accuracy and resolution of 2D analyses. Experimental Mechanics, 58(7):1067-1099, 2018.

[4] DIC challenge: http://sem.org/dic-challenge/.

[5] H. W. Schreier and M. A. Sutton. Systematic errors in digital image correlation due to undermatched subset shape functions. Experimental Mechanics, 42(3):303-310, 2002.

[6] F. Sur, B. Blaysat, and M. Grédiac. On biases in displacement estimation for image registration, with a focus on photomechanics. Submitted for publication, 2020.

[7] M. Grédiac, B. Blaysat, and F. Sur. A critical comparison of some metrological parameters characterizing local digital image correlation and grid method. Experimental Mechanics, 57(6):871-903, 2017.

[8] G. F. Bomarito, J. D. Hochhalter, T. J. Ruggles, and A. H. Cannon. Increasing accuracy and precision of digital image correlation through pattern optimization. Optics and Lasers in Engineering, 91:73-85, 2017.

[9] M. Grédiac, B. Blaysat, and F. Sur. Extracting displacement and strain fields from checkerboard images with the localized spectrum analysis. Experimental Mechanics, 59(2):207-218, 2019.

[10] M. Grédiac, B. Blaysat, and F. Sur. On the optimal pattern for displacement field measurement: random speckle and DIC, or checkerboard and LSA? Experimental Mechanics, 60(4):509-534, 2020.

[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25, 1097-1105, 2012.

[12] A. Dosovitskiy, P. Fischer, E. Ilg, P. Häusser, C. Hazirbas, V. Golkov, P. van der Smagt, D. Cremers, and T. Brox. FlowNet: Learning optical flow with convolutional networks. In IEEE International Conference on Computer Vision (ICCV), pages 2758-2766, 2015.

[13] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, Cambridge, MA, USA, 2016.

[14] B. K. P. Horn and B. G. Schunck. Determining optical flow. Artificial Intelligence, 17(1-3):185-203, 1981.

[15] B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In International Joint Conference on Artificial Intelligence, volume 2, pages 674-679, 1981.

[16] S. Guan, H. Li, and W. Zheng. Unsupervised learning for optical flow estimation using pyramid convolution LSTM. In IEEE International Conference on Multimedia and Expo (ICME), pages 181-186, 2019.

[17] A. Ahmadi and I. Patras. Unsupervised convolutional neural networks for motion estimation. In IEEE International Conference on Image Processing (ICIP), pages 1629-1633, 2016.

[18] Y. Wang, Y. Yang, Z. Yang, L. Zhao, P. Wang, and W. Xu. Occlusion aware unsupervised learning of optical flow. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4884-4893, 2018.

[19] W.-S. Lai, J.-B. Huang, and M.-H. Yang. Semi-supervised learning for optical flow with generative adversarial networks. In Neural Information Processing Systems (NIPS), 2017.

[20] Y. Yang and S. Soatto. Conditional prior networks for optical flow. In IEEE European Conference on Computer Vision (ECCV), pages 282-298, 2018.

[21] J. Xu, R. Ranftl, and V. Koltun. Accurate optical flow via direct cost volume processing. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5807-5815, 2017.

[22] J. Wulff, L. Sevilla-Lara, and M. J. Black. Optical flow in mostly rigid scenes. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6911-6920, 2017.

[23] P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid. DeepFlow: Large displacement optical flow with deep matching. In IEEE International Conference on Computer Vision, pages 1385-1392, 2013.

[24] C. Bailer, B. Taetz, and D. Stricker. Flow fields: Dense correspondence fields for highly accurate large displacement optical flow estimation. In IEEE International Conference on Computer Vision (ICCV), pages 4015-4023, 2015.

[25] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), pages 234-241, 2015.

[26] P. Hu, G. Wang, and Y. Tan. Recurrent spatial pyramid CNN for optical flow estimation. IEEE Transactions on Multimedia, 20(10):2814-2823, 2018.

[27] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, and T. Brox. FlowNet 2.0: Evolution of optical flow estimation with deep networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1647-1655, 2017.

[28] A. Ranjan and M. J. Black. Optical flow estimation using a spatial pyramid network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2720-2729, 2017.

[29] T. Hui, X. Tang, and C. C. Loy. LiteFlowNet: A lightweight convolutional neural network for optical flow estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8981-8989, 2018.

[30] D. Sun, X. Yang, M. Liu, and J. Kautz. PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8934-8948, 2018.

[31] D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black. A naturalistic open source movie for optical flow evaluation. In European Conference on Computer Vision (ECCV), pages 611-625, 2012.

[32] M. Menze and A. Geiger. Object scene flow for autonomous vehicles. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3061-3070, 2015.

[33] S. Cai, S. Zhou, C. Xu, and Q. Gao. Dense motion estimation of particle images via a convolutional neural network. Experiments in Fluids, 60(4):73, 2019.

[34] S. Cai, J. Liang, Q. Gao, C. Xu, and R. Wei. Particle image velocimetry based on a deep learning motion estimator. IEEE Transactions on Instrumentation and Measurement, 2019.

[35] N. Mayer, E. Ilg, P. Häusser, P. Fischer, D. Cremers, A. Dosovitskiy, and T. Brox. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4040-4048, 2016.

[36] S. Baker, D. Scharstein, J. P. Lewis, S. Roth, M. J. Black, and R. Szeliski. A database and evaluation methodology for optical flow. International Journal of Computer Vision, 92(1):1-31, 2010.

[37] A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In IEEE Conference on Computer Vision and Pattern Recognition, pages 3354-3361, 2012.

[38] N. Mayer, E. Ilg, P. Häusser, P. Fischer, D. Cremers, A. Dosovitskiy, and T. Brox. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4040-4048, 2016.

[39] F. Sur, B. Blaysat, and M. Grédiac. Rendering deformed speckle images with a Boolean model. Journal of Mathematical Imaging and Vision, 60(5):634-650, 2018.

[40] C. Pinard. A reimplementation of FlowNet using PyTorch. https://github.com/ClementPinard/FlowNetPytorch, 2017.

[41] S. Niklaus. A reimplementation of PWC-Net using PyTorch. https://github.com/sniklaus/pytorch-pwc, 2018.

[42] S. Niklaus. A reimplementation of LiteFlowNet using PyTorch. https://github.com/sniklaus/pytorch-liteflownet, 2019.

[43] M. Grédiac and F. Sur. Effect of sensor noise on the resolution and spatial resolution of the displacement and strain maps obtained with the grid method. Strain, 50(1):1-27, 2014.

[44] M. Grédiac, B. Blaysat, and F. Sur. A robust-to-noise deconvolution algorithm to enhance displacement and strain maps obtained with local DIC and LSA. Experimental Mechanics, 59(2):219-243, 2019.

[45] E. M. C. Jones and M. A. Iadicola (Eds). A Good Practices Guide for Digital Image Correlation. International Digital Image Correlation Society, 2018. DOI: 10.32720/idics/gpg.ed1.

[46] B. Blaysat, J. Neggers, M. Grédiac, and F. Sur. Towards criteria characterizing the metrological performance of full-field measurement techniques. Application to the comparison between local and global versions of DIC. Experimental Mechanics, 60(3):393-407, 2020.

[47] A. Foi, M. Trimeche, V. Katkovnik, and K. Egiazarian. Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data. IEEE Transactions on Image Processing, 17(10):1737-1754, 2008.

[48] F. Sur and M. Grédiac. Towards deconvolution to enhance the grid method for in-plane strain measurement. Inverse Problems and Imaging, 8(1):259-291, 2014.

[49] L. Wittevrongel, P. Lava, S. V. Lomov, and D. Debruyne. A self adaptive global digital image correlation algorithm. Experimental Mechanics, 55(2):361-378, 2015.

[50] R. B. Lehoucq, P. L. Reu, and D. Z. Turner. The effect of the ill-posed problem on quantitative error assessment in digital image correlation. Experimental Mechanics, 2017. In press.

[51] S. S. Fayad, D. T. Seidl, and P. L. Reu. Spatial DIC errors due to pattern-induced bias and grey level discretization. Experimental Mechanics, 60(2):249-263, 2020.

[52] J.-J. Orteu, D. Garcia, L. Robert, and F. Bugarin. A speckle texture image generator. Proceedings SPIE: Speckle06: speckles, from grains to flowers, 6341:63410H 1-6, 2006.

[53] L. Zhang, T. Wang, Z. Jiang, Q. Kemao, Y. Liu, Z. Liu, L. Tang, and S. Dong. High accuracy digital image correlation powered by GPU-based parallel computing. Optics and Lasers in Engineering, 69:7-12, 2015.
