calibDB_enabling_web_based_computer_vision_through
(a) Calibrated camera matrix only (b) Rectified image (c) Fully calibrated camera
Figure 1: Effect of camera calibration on an augmented reality scene: Although a calibrated camera matrix is used in (a), the
misalignment is clearly visible. Using a complete distortion model allows rectifying the image (b). Together with an adapted
camera matrix, this results in a fully aligned augmentation (c).
ABSTRACT
For many computer vision applications, the availability of camera calibration data is crucial as overall quality heavily depends on it. While calibration data is available on some devices through Augmented Reality (AR) frameworks like ARCore and ARKit, for most cameras this information is not available. Therefore, we propose a web based calibration service that not only aggregates calibration data, but also allows calibrating new cameras on-the-fly. We build upon a novel camera calibration framework that enables even novice users to perform a precise camera calibration in about 2 minutes. This allows general deployment of computer vision algorithms on the web, which was previously not possible due to lack of calibration data.

CCS CONCEPTS
• Computing methodologies → Tracking.

KEYWORDS
computer vision, distributed systems, calibration, webxr

ACM Reference Format:
Pavel Rojtberg and Felix Gorschlüter. 2019. calibDB: enabling web based computer vision through on-the-fly camera calibration. In Web3D ’19: The 24th International Conference on 3D Web Technology (Web3D ’19), July 26–28, 2019, Los Angeles, CA, USA. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3329714.3338132

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Web3D ’19, July 26–28, 2019, Los Angeles, CA, USA
© 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-6798-1/19/07...$15.00
https://doi.org/10.1145/3329714.3338132

1 INTRODUCTION
Camera calibration in the context of computer vision is the process of determining the internal geometrical and optical camera characteristics (intrinsic parameters) and optionally the position and orientation of the camera frame in the world coordinate system (extrinsic parameters). The performance of many 3D vision algorithms directly depends on the quality of this calibration [Furukawa and Ponce 2008]. Furthermore, calibration is a recurring task that has to be performed each time the camera setup is changed. Even cameras of the same series can have different intrinsic parameters due to build inaccuracies.

Native applications can leverage frameworks like ARKit and ARCore which provide the camera intrinsic parameters per-frame. Alternatively developers use lower-level vision libraries like OpenCV [Bradski et al. 2005] and manually acquire and ship the calibration data specific to their setup.

For web-based computer vision solutions the WebXR Device API Draft [World Wide Web Consortium 2019] provides the intrinsic camera matrix through the XRView interface. However, the data is encoded into a projectionMatrix as used for rendering and needs special conversion to be used with vision algorithms. The lens distortion coefficients are completely absent, which drastically reduces precision (see Figure 1). These two aspects show that the existing API focuses on a camera representation primarily suited for rendering, likely due to its strong heritage from the WebVR API. Furthermore, the available WebXR polyfills either leverage
Figure 3: The REST protocol of our web-based camera calibration system

an overlay (see Figure 2b) to guide to specific poses. The whole process of capturing the images and computing a new calibration on average only requires 2 minutes, even if the user is not familiar with computer vision.

3 WEB BASED IMPLEMENTATION
In this section we describe our calibration service "calibDB" in detail. First we discuss the high-level architecture and internal protocol of the service. Then we describe the external API and data format used for calibration data retrieval and acquisition. Finally, we discuss how the current WebXR API should be extended to seamlessly provide calibration data to computer vision applications.

3.1 Efficient Client/Server separation
To bring our existing OpenCV based implementation to the Web, we utilize the OpenCV.js bindings, which wrap the C++ code with Emscripten [Zakai 2011] into a WebAssembly library. Here, we do not fully port our existing code to JavaScript to be executed in the browser. Instead, we introduce a client/server split, as the captured 2D measurements and the final calibration parameters will be transferred to the server anyway. Our architecture is split as follows:
• A web-based acquisition client that captures video using WebRTC [Burnett and Narayanan 2011] and performs low-level image processing directly on the device. This reduces latency and offloads the computation-heavy image processing from the server.
• The calibDB server component that receives the captured key-points and provides new target poses to the clients. This allows re-using most of our control logic and keeps the architecture extendable for multiple clients, as is useful with e.g. stereo camera calibration.

Figure 3 shows a sequence diagram of the REST based communication between browser and calibDB. As we want to provide our calibration service publicly on the internet, we employ API tokens to prevent abuse. After the client was authorized by calibDB, a

3.2 Calibration Database
The service can be queried for calibration data using a combination of userAgent, MediaStreamTrack and MediaTrackSettings [World Wide Web Consortium 2017] as the key:

    {
      "camera": "C922 Pro Stream Webcam (046d:085c)",
      "host": "Linux x86_64",
      "image_width": 1280,
      "image_height": 720,
      "zoom": 0
    }

Listing 1: Example calibration-data request

Here the camera property is used for differentiating multiple cameras attached to the PC or the front and back camera on mobile devices. The host property is mainly used to differentiate mobile devices where camera would only contain "front" or "back". The "zoom" property translates to the currently set focal length of the camera or zero if the focal length cannot be determined.

If no reliable calibration data is available, the server responds with the HTTP/307 status code, redirecting to the calibration-guidance landing page as described in Section 2.

To verify whether calibration data is reliable, we collect at least 5 different calibrations and compute the variance of the intrinsic parameters. Only if the variance is small compared to the parameter values do we consider the calibration data reliable. Here, we aim to enforce re-calibration for interchangeable lens cameras. These identify themselves using the same name, but have largely varying intrinsic properties. Notably, this also covers the use of manually operated lenses where the "zoom" property cannot be read automatically.

If reliable calibration data is available, it is returned in JSON encoding as:

    {
      "image_width": 1280,
      "image_height": 720,
      "camera_matrix": [[1.43e+03, 0.0, 9.52e+02],
                        [0.0, 1.43e+03, 5.05e+02],
                        [0.0, 0.0, 1.0]],
      "distortion_coefficients": [...],
      "distortion_model": "rectilinear",
      "avg_reprojection_error": 0.72
    }

Listing 2: Example calibration-data response
The message contains the parameters K and d as discussed in Section 2. Additionally, it provides the resolution at which the calibration was performed. This is useful when the exact requested resolution is not available. In this case the calibration for the closest resolution is returned. The client is now able to either adapt the capturing or redirect to the guidance page, if a specific resolution is crucial.

The client is also able to explicitly specify the desired distortion_model by adding it to the request (Listing 1), if only a specific model is supported. In case no calibration using the requested model is available for the specified camera, the server can transparently perform a new parameter fitting on-the-fly. This is made possible by storing the 2D key-points alongside the calibration results. For instance, if ∆R is requested but only calibrations for ∆F are available, the server can repeat the parameter fitting using the existing data. However, this is not always valid. In the example above the rectilinear model is not capable of explaining all measurements as produced by a fisheye lens. Therefore, the response also includes the avg_reprojection_error, which is the residual error on the measurements. The client is now again able to redirect to the guidance page to force a more precise calibration.

Our prototype implementation supports the "rectilinear" and "fisheye" distortion models and stores the calibration results as well as the key-points in a schema-less database [MongoDB 2019]. This makes it easy to extend the system to new distortion models as needed.

3.3 Extending the WebXR API
To provide the relevant calibration information through the WebXR API, it needs to be extended in several ways. We propose to extend the XRView interface, as it already contains the related projectionMatrix attribute. To this end, we suggest extending the WebXR matrix notion to 9-element 3x3 matrices to accommodate the K matrix. Although it duplicates some information, it can be passed to computer vision algorithms without conversion, similarly to how projectionMatrix can be directly passed to WebGL. Furthermore, an attribute storing d and the distortion model must be added.

The distortion model attribute should also be added to XRRenderState for allowing applications to request a specific model as discussed in the section above, similarly to how developers request a specific depthNear.

This would enable browsers to transparently provide calibration data as provided by our service through the WebXR API. Alternatively, browser vendors could opt to bundle a set of calibrations for popular cameras directly with the browser.

draft and suggested extensions that can make the whole process transparent for the end-user.

However, additional support by the browsers might be needed to allow matching AR visualization. One possibility is to support image remapping through the WebXR API to allow rectification as shown in Figure 1b. Alternatively, the WebGL API could be extended to support the reverse direction, namely distorted rendering. However, actual usage patterns should be analyzed to decide whether this would be beneficial or whether it is sufficient to offload these tasks to client libraries like OpenCV.js.

Furthermore, it needs to be evaluated whether our calibration key is sufficient to identify the various cameras and devices or if we have to use more sophisticated fingerprinting.

REFERENCES
Adobe Systems Inc. 2012. Digital Negative Specification. https://www.adobe.com/products/dng/pdfs/dng_spec_1_3_0_0.pdf
Gary Bradski, Adrian Kaehler, and Vadim Pisarevsky. 2005. Learning-Based Computer Vision with Intel’s Open Source Computer Vision Library. Intel Technology Journal 9, 2 (2005).
Tim Bray. 2017. The javascript object notation (json) data interchange format. Technical Report.
Daniel C Burnett and Anant Narayanan. 2011. getUserMedia: Getting access to local devices that can generate multimedia streams. W3C Editor’s Draft (2011).
Yasutaka Furukawa and Jean Ponce. 2008. Accurate camera calibration from multi-view stereo and bundle adjustment. In Computer Vision and Pattern Recognition, CVPR 2008. IEEE Conference on. IEEE, 1–8.
S Garrido-Jurado, Rafael Muñoz-Salinas, Francisco José Madrid-Cuevas, and Manuel Jesús Marín-Jiménez. 2014. Automatic generation and detection of highly reliable fiducial markers under occlusion. Pattern Recognition 47, 6 (2014), 2280–2292.
Christopher Geyer and Kostas Daniilidis. 2000. A unifying theory for central panoramic systems and practical implications. In Computer Vision – ECCV. Springer, 445–461.
Fabian Göttl, Philipp Gagel, and Jens Grubert. 2018. Efficient pose tracking from natural features in standard web browsers. arXiv preprint arXiv:1804.08424 (2018).
Richard Hartley and Andrew Zisserman. 2005. Multiple view geometry in computer vision. Robotica 23, 2 (2005), 271–271.
Juho Kannala and Sami S Brandt. 2006. A generic camera model and calibration method for conventional, wide-angle, and fish-eye lenses. IEEE transactions on pattern analysis and machine intelligence 28, 8 (2006), 1335–1340.
MongoDB Inc. 2019. MongoDB. https://www.mongodb.com/
Pavel Rojtberg. 2019. User Guidance for Interactive Camera Calibration. In 21st International Conference on Human-Computer Interaction (To appear).
P. Rojtberg and A. Kuijper. 2018. Efficient Pose Selection for Interactive Camera Calibration. In 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). 31–36. https://doi.org/10.1109/ISMAR.2018.00026
World Wide Web Consortium. 2017. Media Capture and Streams. https://www.w3.org/TR/mediacapture-streams/ Candidate Recommendation, 3 October 2017.
World Wide Web Consortium. 2019. WebXR Device API. https://www.w3.org/TR/webxr/ First Public Working Draft, 5 February 2019.
Alon Zakai. 2011. Emscripten: an LLVM-to-JavaScript compiler. In Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion. ACM, 301–312.
Zhengyou Zhang. 2000. A flexible new technique for camera calibration. Pattern Analysis and Machine Intelligence, IEEE Transactions on 22, 11 (2000), 1330–1334.
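
Supplementary sketch: the paper notes that the WebXR projectionMatrix needs special conversion before it can be used with vision algorithms, but does not spell the conversion out. The following is one possible version, not taken from the paper. It assumes a column-major 4x4 OpenGL-style perspective matrix with 16 elements, zero skew, a viewport that covers the whole image, and a pixel origin in the top-left corner with y pointing down. The function name is hypothetical, and the signs of the principal-point terms depend on these conventions, so the result should be checked against a known calibration.

```python
def camera_matrix_from_projection(projection, width, height):
    """Convert a column-major 4x4 OpenGL-style projection matrix (16 floats)
    into a 3x3 pinhole camera matrix K in pixel units.

    Assumptions (not from the paper): no skew, the viewport spans the full
    width x height image, pixel origin top-left, y axis pointing down.
    """
    # Column-major storage: element (row r, column c) lives at index c * 4 + r.
    p00 = projection[0]   # horizontal focal term
    p11 = projection[5]   # vertical focal term
    p02 = projection[8]   # horizontal principal-point offset
    p12 = projection[9]   # vertical principal-point offset

    fx = p00 * width / 2.0
    fy = p11 * height / 2.0
    cx = (1.0 - p02) * width / 2.0
    cy = (1.0 + p12) * height / 2.0
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]
```

For a symmetric frustum the offsets p02 and p12 are zero, so the principal point falls on the image center. The resulting K has the same layout as the camera_matrix field in Listing 2. Lens distortion is not recoverable this way, which is the gap the proposed API extension addresses.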