OpenImageIO
Programmer Documentation
(in progress)
Dan Wexler
1 Introduction

Welcome to OpenImageIO!
1.1 Overview
OpenImageIO provides simple but powerful ImageInput and ImageOutput APIs that abstract
the reading and writing of 2D image file formats. They don't support every possible way of
encoding images in memory, but for a reasonable and common set of desired functionality, they
provide an exceptionally easy way for an application using the APIs to support a wide and
extensible selection of image formats without knowing the details of any of these formats.
Concrete instances of these APIs, each of which implements the ability to read and/or write
a different image file format, are stored as plugins (i.e., dynamic libraries, DLLs, or DSOs)
that are loaded at runtime. The OpenImageIO distribution contains such plugins for several
popular formats. Any user may create conforming plugins that implement reading and writing
capabilities for other image formats, and any application that uses OpenImageIO will be able
to use those plugins.
The library also implements the helper class ImageBuf, which is a handy way to store and
manipulate images in memory. ImageBuf itself uses ImageInput and ImageOutput for its file
I/O, and therefore is also agnostic as to image file formats.
The ImageCache class transparently manages a cache so that it can access truly vast amounts
of image data (thousands of image files totaling tens of GB) very efficiently using only a tiny
amount (tens of megabytes at most) of runtime memory. Additionally, a TextureSystem class
provides filtered MIP-map texture lookups, atop the nice caching behavior of ImageCache.
Finally, the OpenImageIO distribution contains several utility programs that operate on im-
ages, each of which is built atop ImageInput and ImageOutput, and therefore may read or
write any image file type for which an appropriate plugin is found at runtime. Paramount
among these utilities is iv, a really fantastic and powerful image viewing application. Addi-
tionally, there are programs for converting images among different formats, comparing image
data between two images, and examining image metadata.
All of this is released as “open source” software using the very permissive BSD license. So
you should feel free to use any or all of OpenImageIO in your own software, whether it is private
or public, open source or proprietary, free or commercial. You may also modify it on your own.
You are also encouraged to contribute to the continued development of OpenImageIO and to
share any improvements that you make on your own, though you are by no means required to
do so.
1.2 Simplifying Assumptions
• OpenImageIO only deals with ordinary 2D images, and to a limited extent 3D volumes,
and image files that contain multiple (but finite) independent images within them. Open-
ImageIO does not deal with motion picture files. At least, not currently.
• Pixel data are 8-, 16-, or 32-bit int (signed or unsigned), or 16-, 32-, or 64-bit float. NOTHING
ELSE. No < 8 bit images, or pixel boundaries that aren't byte boundaries. Files with
< 8 bits will appear to the client as 8-bit unsigned grayscale images.
• Only fully elaborated, non-compressed data are accepted and returned by the API. Com-
pression or special encodings are handled entirely within an OpenImageIO plugin.
• Color space is grayscale or RGB. Non-spectral data, such as XYZ, CMYK, or YUV, are
converted to RGB upon reading.
• All color channels have the same data format. Upon read, an ImageInput ought to con-
vert all channels to the one with the highest precision in the file.
• All image channels in a subimage are sampled at the same resolution. For file formats
that allow some channels to be subsampled, they will be automatically up-sampled to the
highest resolution channel in the subimage.
• Color information is always in the order R, G, B, and the alpha channel, if any, always
follows RGB, and z channel (if any) always follows alpha. So if a file actually stores
ABGR, the plugin is expected to rearrange it as RGBA.
It’s important to remember that these restrictions apply to data passed through the APIs, not
to the files themselves. It’s perfectly fine to have an OpenImageIO plugin that supports YUV
data, or 4 bits per channel, or any other exotic feature. You could even write a movie-reading
ImageInput (despite OpenImageIO’s claims of not supporting movies) and make it look to
the client like it’s just a series of images within the file. It’s just that all the nonconforming
details are handled entirely within the OpenImageIO plugin and are not exposed through the
main OpenImageIO APIs.
Historical Origins
OpenImageIO is the evolution of concepts and tools I’ve been working on for two decades.
In the 1980’s, every program I wrote that output images would have a simple, custom format
and viewer. I soon graduated to using a standard image file format (TIFF) with my own library
implementation. Then I switched to Sam Leffler’s stable and complete libtiff.
In the mid-to-late-1990's, I worked at Pixar as one of the main implementors of PhotoRe-
alistic RenderMan, which had display drivers that consisted of an API for opening files and
outputting pixels, and a set of DSO/DLL plugins that each implemented image output for one of
a dozen or so different file formats. The plugins all responded to the same API, so the renderer
itself did not need to know the details of the image file formats, and users could (in
theory, but rarely in practice) extend the set of output image formats the renderer could use by
writing their own plugins.
This was the seed of a good idea, but PRMan’s display driver plugin API was abstruse and
hard to use. So when I started Exluna in 2000, Matt Pharr, Craig Kolb, and I designed a new
API for image output for our own renderer, Entropy. This API, called “ExDisplay,” was C++,
and much simpler, clearer, and easier to use than PRMan’s display drivers.
NVIDIA’s Gelato (circa 2002), whose early work was done by myself, Dan Wexler, Jonathan
Rice, and Eric Enderton, had an API called “ImageIO.” ImageIO was much more powerful and
descriptive than ExDisplay, and had an API for reading as well as writing images. Gelato was
not only “format agnostic” for its image output, but also for its image input (textures, image
viewer, and other image utilities). We released the API specification and headers (though not
the library implementation) using the BSD open source license, firmly repudiating any notion
that the API should be specific to NVIDIA or Gelato.
For Gelato 3.0 (circa 2007), we refined ImageIO again (by this time, Philip Nemec was
also a major influence, in addition to Dan, Eric, and myself¹). This revision was not a major
overhaul but more of a fine tuning. Our ideas were clearly approaching stability. But, alas, the
Gelato project was canceled before Gelato 3.0 was released, and despite our prodding, NVIDIA
executives would not open source the full ImageIO code and related tools.
After I left NVIDIA, I was determined to recreate this work once again – and ONLY once
more – and release it as open source from the start. Thus, OpenImageIO was born. I started with
the existing Gelato ImageIO specification and headers (which were BSD licensed all along),
and made some more refinements since I had to rewrite the entire implementation from scratch
anyway. I think the additional changes are all improvements. This is the software you have in
your hands today.
¹ Gelato as a whole had many other contributors; those I've named here are the ones I recall contributing to the
design or implementation of the ImageIO APIs.
Acknowledgments
OpenImageIO incorporates or depends upon several other open source packages:
• libtiff © 1988-1997 Sam Leffler and 1991-1997 Silicon Graphics, Inc. http://libtiff.org
• jpeg-6b © 1991-1998 Thomas G. Lane. http://www.ijg.org
• zlib © 1995-2005 Jean-loup Gailly and Mark Adler. http://www.zlib.net
• libpng © 1998-2008 Glenn Randers-Pehrson, et al. http://www.libpng.org
• Boost http://www.boost.org
• GLEW © 2002-2007 Milan Ikits, et al. http://glew.sourceforge.net
• Jasper © 2001-2006 Michael David Adams, et al. http://www.ece.uvic.ca/~mdadams/jasper/
These other packages are all distributed under licenses that allow them to be used by and
distributed with OpenImageIO.
2 Image I/O API
• Internal data is the pixel data as it lives in the memory of the calling application (i.e., on
"your side" of the abstraction layer that OpenImageIO provides).
• Native file data is what is stored in an image file itself (i.e., on the "other side" of the
abstraction layer that OpenImageIO provides).
Both internal and file data are stored in a particular data format that describes the numerical
encoding of the values. OpenImageIO understands several types of data encodings, and there
is a special type, TypeDesc, that is used to describe them. A TypeDesc describes a base data
format type, aggregation into simple vector and matrix types, and an array length (if it's an
array).
TypeDesc supports the following base data format types, given by the enumerated type
BASETYPE:
UINT8 8-bit integer values ranging from 0..255, corresponding to the C/C++
unsigned char.
INT8 8-bit integer values ranging from -128..127, corresponding to the C/C++ char.
UINT16 16-bit integer values ranging from 0..65535, corresponding to the C/C++
unsigned short.
INT16 16-bit integer values ranging from -32768..32767, corresponding to the C/C++
short.
UINT 32-bit integer values, corresponding to the C/C++ unsigned int.
INT signed 32-bit integer values, corresponding to the C/C++ int.
FLOAT 32-bit IEEE floating point values, corresponding to the C/C++ float.
DOUBLE 64-bit IEEE floating point values, corresponding to the C/C++ double.
HALF 16-bit floating point values in the format supported by OpenEXR and OpenGL.
A TypeDesc can be constructed using just this information, either as a single scalar value, or an
array of scalar values:
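For example (a sketch; elsewhere in this document the enumerants also appear unqualified, but here they are written with the TypeDesc:: scope used by the other code samples):

TypeDesc scalar (TypeDesc::FLOAT);        // a single 32-bit float value
TypeDesc array16 (TypeDesc::UINT8, 16);   // an array of sixteen 8-bit unsigned values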
In addition, TypeDesc supports certain aggregate types, described by the enumerated type
AGGREGATE:
SCALAR a single scalar value (such as a raw int or float in C). This is the default.
VEC2 two values representing a 2D vector.
VEC3 three values representing a 3D vector.
VEC4 four values representing a 4D vector.
MATRIX44 sixteen values representing a 4 × 4 matrix.
And optionally, several vector transformation semantics, described by the enumerated type
VECSEMANTICS:
NOXFORM indicates that the item is not a spatial quantity that undergoes any par-
ticular transformation.
COLOR indicates that the item is a “color,” not a spatial quantity (and of course
therefore does not undergo a transformation).
POINT indicates that the item represents a spatial position and should be trans-
formed by a 4 × 4 matrix as if it had a 4th component of 1.
VECTOR indicates that the item represents a spatial direction and should be
transformed by a 4 × 4 matrix as if it had a 4th component of 0.
NORMAL indicates that the item represents a surface normal and should be trans-
formed like a vector, but using the inverse-transpose of a 4 × 4 matrix.
Construct a type description of an aggregate (or array of aggregates), with optional vector
transformation semantics. For example, TypeDesc(HALF,COLOR) describes an aggregate
of 3 16-bit floats comprising a color, and TypeDesc(FLOAT,VEC3,POINT) describes an
aggregate of 3 32-bit floats comprising a 3D position.
Note that aggregates and arrays are different. A TypeDesc(FLOAT,3) is an array of three
floats, a TypeDesc(FLOAT,COLOR) is a single 3-channel color comprised of floats, and
TypeDesc(FLOAT,3,COLOR) is an array of 3 color values, each of which is comprised of
3 floats.
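For instance, restating those distinctions as code (a sketch following the constructor forms shown above):

TypeDesc a (TypeDesc::FLOAT, 3);                   // an array of three floats
TypeDesc b (TypeDesc::FLOAT, TypeDesc::COLOR);     // one color comprised of three floats
TypeDesc c (TypeDesc::FLOAT, 3, TypeDesc::COLOR);  // an array of three such colors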
Of these, the only ones commonly used to store pixel values in image files are scalars of
UINT8, UINT16, FLOAT, and HALF (the last only used by OpenEXR, to the best of our knowl-
edge).
Note that the TypeDesc (which is also used for applications other than images) can describe
many types not used by OpenImageIO. Please ignore this extra complexity; only the above
simple types are understood by OpenImageIO as pixel storage data types, though a few others,
including STRING and MATRIX44 aggregates, are occasionally used for metadata for certain
image file formats (see Sections 3.2.5, 4.2.4, and the documentation of individual ImageIO
plugins for details).
• The origin of the pixel data, if its upper left corner is not located at pixel (0,0).
• The full size and offset of an abstract “full” or “display” image, useful for describing
cropping or overscan.
• Whether the image is organized into tiles, and if so, the tile size.
• The native data format of the pixel values (e.g., float, 8-bit integer, etc.).
• The number of color channels in the image (e.g., 3 for RGB images), names of the chan-
nels, and whether any particular channels represent alpha and depth.
• Any presumed gamma correction or hints about color space of the pixel values.
• Quantization parameters describing how floating point values should be converted to in-
tegers (in cases where users pass real values but integer values are stored in the file). This
is used only when writing images, not when reading them.
Therefore the pixel data are defined over pixel coordinates [x ... x+width-1] horizontally,
[y ... y+height-1] vertically, and [z ... z+depth-1] in depth.
TypeDesc format
Indicates the native format of the pixel data values themselves, as a TypeDesc (see 2.1).
Typical values would be TypeDesc::UINT8 for 8-bit unsigned values, TypeDesc::FLOAT
for 32-bit floating-point values, etc.
NOTE: Currently, the implementation of OpenImageIO requires all channels to have the
same data format.
int nchannels
The number of channels (color values) present in each pixel of the image. For example,
an RGB image has 3 channels.
std::vector<std::string> channelnames
The names of each channel, in order. Typically this will be "R", "G","B", "A" (alpha),
"Z" (depth), or other arbitrary names.
int z_channel
The index of the channel that represents z or depth (from the camera). It defaults to -1 if
no depth channel is present, or if it is not known which channel represents depth.
LinearitySpec linearity
Describes the mapping of pixel values to real-world units. LinearitySpec is an enumer-
ated type that may take on the values UnknownLinearity (the default), Linear,
GammaCorrected, or sRGB, which are described under "Linearity hints" later in this document.
float gamma
The gamma exponent, if the pixel values in the image have already been gamma cor-
rected (indicated by linearity having a value of GammaCorrected). The default of 1.0
indicates that no gamma correction has been applied.
ImageSpec (int xres, int yres, int nchans, TypeDesc fmt = UINT8)
Constructs an ImageSpec with the given x and y resolution, number of channels, and
pixel data format.
All other fields are set to the obvious defaults – the image is an ordinary 2D image (not a
volume), the image is not offset or a crop of a bigger image, the image is scanline-oriented
(not tiled), channel names are "R", "G", "B", and "A" (up to and including 4 channels;
beyond that they are named "channel n"), the fourth channel (if it exists) is assumed to be
alpha, values are assumed to be linear, and quantization (if fmt describes an integer type)
is done in such a way that the maximum positive integer range maps to (0.0, 1.0).
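For example (a sketch):

ImageSpec spec (640, 480, 3, TypeDesc::UINT8);   // 640 x 480, 3 channels, 8-bit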
Sets the format as described, and also sets all quantization parameters to the default for
that data type (as explained in Section 3.2.6).
Sets the channelnames to reasonable defaults for the number of channels. Specifically,
channel names are set to "R", "G", "B", and "A" (up to and including 4 channels; beyond
that they are named "channel n").
static TypeDesc
format_from_quantize (int quant_black, int quant_white,
int quant_min, int quant_max)
Utility function that, given quantization parameters, returns a data type that may be used
without unacceptable loss of significant bits.
Returns the number of bytes comprising each channel of each pixel (i.e., the size of a
single value of the type described by the format field).
Returns the number of bytes comprising each pixel (i.e. the number of channels multi-
plied by the channel size).
Returns the number of bytes comprising each scanline (i.e., width pixels). This will return
std::numeric_limits<imagesize_t>::max() in the event of an overflow where it's
not representable in an imagesize_t.
Returns the number of pixels comprising an image tile (if it's a tiled image). This will return
std::numeric_limits<imagesize_t>::max() in the event of an overflow where it's
not representable in an imagesize_t.
Searches extra_attribs for an attribute matching name, returning a pointer to the at-
tribute record, or NULL if there was no match. If searchtype is TypeDesc::UNKNOWN,
the search will be made regardless of the data type, whereas other values of searchtype
will reject a matching name if the data type does not also match. The name compar-
ison will be exact if casesensitive is true, otherwise it will be done in a case-insensitive
manner.
int get_int_attribute (const std::string &name, int defaultval=0) const
Gets an integer metadata attribute (silently converting to int even if the data is really
int8, uint8, int16, uint16, or uint32), simply substituting the supplied default value if
no such metadata exists. This is a convenience function for when you know you are just
looking for a simple integer value.
Gets a string metadata attribute, simply substituting the supplied default value if no such
metadata exists. This is a convenience function for when you know you are just looking
for a simple string value.
For a given parameter (in this ImageSpec's extra_attribs field), format the value
nicely as a string. If human is true, use especially human-readable explanations (units, or
decoding of values) for certain known metadata.
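For example (a sketch; the attribute names are illustrative, and get_string_attribute is assumed to parallel get_int_attribute):

int orientation = spec.get_int_attribute ("Orientation", 1);
std::string desc = spec.get_string_attribute ("ImageDescription", "");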
3 ImageOutput: Writing Images

#include "imageio.h"
using namespace OpenImageIO;
...
• Search for an ImageIO plugin that is capable of writing the file ("foo.jpg"), deducing
the format from the file extension. When it finds such a plugin, it creates a subclass
instance of ImageOutput that writes the right kind of file format.
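In code, this step might look like (a sketch):

const char *filename = "foo.jpg";
ImageOutput *out = ImageOutput::create (filename);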
• Open the file, write the correct headers, and in all other important ways prepare a file
with the given dimensions (640 × 480), number of color channels (3), and data format
(unsigned 8-bit integer).
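For example (a sketch):

const int xres = 640, yres = 480;
const int channels = 3;   // RGB
ImageSpec spec (xres, yres, channels, TypeDesc::UINT8);
out->open (filename, spec);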
• Write the entire image, hiding all details of the encoding of image data in the file, whether
the file is scanline- or tile-based, or what is the native format of data in the file (in this
case, our in-memory data is unsigned 8-bit and we’ve requested the same format for disk
storage, but if they had been different, write image() would do all the conversions for
us).
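For example (a sketch, assuming pixels holds the xres*yres*channels bytes of image data):

unsigned char *pixels = new unsigned char [xres*yres*channels];
// ... fill pixels with image data ...
out->write_image (TypeDesc::UINT8, pixels);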
• Close the file, destroy and free the ImageOutput we had created, and perform all other
cleanup and release of any resources needed by the plugin.
out->close ();
delete out;
Individual scanlines may be written using the write_scanline() API call:
...
unsigned char scanline[xres*channels];
out->open (filename, spec);
int z = 0; // Always zero for 2D images
for (int y = 0; y < yres; ++y) {
... generate data in scanline[0..xres*channels-1] ...
out->write_scanline (y, z, TypeDesc::UINT8, scanline);
}
out->close ();
...
The first two arguments to write_scanline() specify which scanline is being written by
its vertical (y) scanline number (beginning with 0) and, for volume images, its slice (z) number
(the slice number should be 0 for 2D non-volume images). This is followed by a TypeDesc
describing the data you are supplying, and a pointer to the pixel data itself. Additional optional
arguments describe the data stride, which can be ignored for contiguous data (use of strides is
explained in Section 3.2.3).
All ImageOutput implementations will accept scanlines in strict order (starting with scan-
line 0, then 1, up to yres-1, without skipping any). See Section 3.2.7 for details on out-of-order
or repeated scanlines.
The full description of the write_scanline() function may be found in Section 3.3.
Not all image formats (and therefore not all ImageOutput implementations) support tiled im-
ages. If the format does not support tiles, then write_tile() will fail. An application using
OpenImageIO should gracefully handle the case that tiled output is not available for the chosen
format.
Once you create() an ImageOutput, you can ask if it is capable of writing a tiled image
by using the supports("tiles") query:
...
ImageOutput *out = ImageOutput::create (filename);
if (! out->supports ("tiles")) {
// Tiles are not supported
}
Assuming that the ImageOutput supports tiled images, you need to specifically request a
tiled image when you open() the file. This is done by setting the tile size in the ImageSpec
passed to open(). If the tile dimensions are not set, they will default to zero, which indicates
that scanline output should be used rather than tiled output.
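For example, something like the following (a sketch using 64 x 64 tiles, assuming xres, yres, and channels as in the earlier examples; tile_width and tile_height are the ImageSpec fields that hold the tile size):

ImageSpec spec (xres, yres, channels, TypeDesc::UINT8);
spec.tile_width  = 64;
spec.tile_height = 64;
out->open (filename, spec);

unsigned char *tile = new unsigned char [64*64*channels];
int z = 0;   // always zero for 2D (non-volume) images
for (int y = 0; y < yres; y += 64) {
    for (int x = 0; x < xres; x += 64) {
        // ... generate data in tile[0..64*64*channels-1] ...
        out->write_tile (x, y, z, TypeDesc::UINT8, tile);
    }
}
delete [] tile;
out->close ();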
In this example, we have used square tiles (the same number of pixels horizontally and
vertically), but this is not a requirement of OpenImageIO. However, it is possible that some
image formats may only support square tiles, or only certain tile sizes (such as restricting tile
sizes to powers of two). Such restrictions should be documented by each individual plugin.
The first three arguments to write_tile() specify which tile is being written by the pixel
coordinates of any pixel contained in the tile: x (column), y (scanline), and z (slice, which
should always be 0 for 2D non-volume images). This is followed by a TypeDesc describing
the data you are supplying, and a pointer to the tile’s pixel data itself, which should be ordered
by increasing slice, increasing scanline within each slice, and increasing column within each
scanline. Additional optional arguments describe the data stride, which can be ignored for
contiguous data (use of strides is explained in Section 3.2.3).
All ImageOutput implementations that support tiles will accept tiles in strict order of in-
creasing y rows, and within each row, increasing x column, without missing any tiles. See
Section 3.2.7 for details on out-of-order or repeated tiles.
The full description of the write_tile() function may be found in Section 3.3.
...
ImageOutput *out = ImageOutput::create (filename);
if (! out->supports ("rectangles")) {
// Rectangles are not supported
}
If rectangular regions are supported, they may be sent using the write_rectangle() API
call:
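For example (a sketch; the region bounds are illustrative values):

int xmin = 0, xmax = 31;   // a 32 x 32 region
int ymin = 0, ymax = 31;
unsigned char rect[32*32*channels];
// ... fill rect[] with the region's pixel data ...
out->write_rectangle (xmin, xmax, ymin, ymax, 0, 0,
                      TypeDesc::UINT8, rect);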
The first six arguments to write_rectangle() specify the region of pixels that is being
transmitted by supplying the minimum and maximum pixel indices in x (column), y (scanline),
and z (slice, always 0 for 2D non-volume images). The total number of pixels being transmitted
is therefore:

(xmax-xmin+1) * (ymax-ymin+1) * (zmax-zmin+1)
This is followed by a TypeDesc describing the data you are supplying, and a pointer to the
rectangle’s pixel data itself, which should be ordered by increasing slice, increasing scanline
within each slice, and increasing column within each scanline. Additional optional arguments
describe the data stride, which can be ignored for contiguous data (use of strides is explained in
Section 3.2.3).
The code examples of the previous sections all assumed that your internal pixel data is stored
as unsigned 8-bit integers (i.e., 0-255 range). But OpenImageIO is significantly more flexible.
You may request that the output image be stored in any of several formats. This is done
by setting the format field of the ImageSpec prior to calling open. You can do this upon
construction of the ImageSpec, as in the following example that requests a spec that stores data
as 16-bit unsigned integers:
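A one-line sketch:

ImageSpec spec (xres, yres, channels, TypeDesc::UINT16);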
Or, for an ImageSpec that has already been constructed, you may reset its format using the
set_format() method (which also resets the various quantization fields of the spec to the
defaults for the data format you have specified).
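For example (a sketch):

spec.set_format (TypeDesc::UINT16);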
Note that resetting the format must be done before passing the spec to open(), or it will
have no effect on the file.
Individual file formats, and therefore ImageOutput implementations, may only support
a subset of the formats understood by the OpenImageIO library. Each ImageOutput plugin
implementation should document which data formats it supports. An individual ImageOutput
implementation may choose to simply fail to open(), though the recommended behavior is
for open() to succeed but in fact choose a data format supported by the file format that best
preserves the precision and range of the originally-requested data format.
It is not required that the pixel data passed to write_image(), write_scanline(), write_tile(),
or write_rectangle() actually be in the same data format as that requested as the native for-
mat of the file. You can fully mix and match data you pass to the various write routines and
OpenImageIO will automatically convert from the internal format to the native file format. For
example, the following code will open a TIFF file that stores pixel data as 16-bit unsigned in-
tegers (values ranging from 0 to 65535), compute internal pixel values as floating-point values,
with write_image() performing the conversion automatically:
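A sketch of that scenario (the filename is illustrative):

ImageSpec spec (xres, yres, channels, TypeDesc::UINT16);   // file stores UINT16
ImageOutput *out = ImageOutput::create ("out.tif");
out->open ("out.tif", spec);
float *pixels = new float [xres*yres*channels];
// ... compute floating-point pixel values ...
out->write_image (TypeDesc::FLOAT, pixels);    // converted to UINT16 on write
out->close ();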
Note that write_scanline(), write_tile(), and write_rectangle() have a format parameter
that works in a corresponding manner.
Please refer to Section 3.2.6 for more information on how values are translated among the
supported data formats by default, and how to change the formulas by specifying quantization
in the ImageSpec.
• each pixel in memory consists of a number of data values equal to the declared number
of channels that are being written to the file;
• successive column pixels within a row directly follow each other in memory, with the first
channel of pixel x immediately following the last channel of pixel x − 1 of the same row;
• for whole images, tiles or rectangles, the data for each row immediately follows the pre-
vious one in memory (the first pixel of row y immediately follows the last column of row
y − 1);
• for 3D volumetric images, the first pixel of slice z immediately follows the last pixel
of slice z − 1.
Please note that this implies that data passed to write_tile() be contiguous in the shape
of a single tile (not just an offset into a whole image worth of pixels), and that data passed to
write_rectangle() be contiguous in the dimensions of the rectangle.
The write_scanline() function takes an optional xstride argument, and the write_image(),
write_tile(), and write_rectangle() functions take optional xstride, ystride, and zstride
values that describe the distance, in bytes, between successive pixel columns, rows, and slices,
respectively, of the data you are passing. For any of these values that are not supplied, or are
given as the special constant AutoStride, contiguity will be assumed.
By passing different stride values, you can achieve some surprisingly flexible functionality.
A few representative examples follow:
• Write a tile that is embedded within a whole image of pixel data, rather than having a
one-tile-only memory layout (see the first sketch below).
• Write only a subset of channels to disk. In this example, our internal data layout consists
of 4 channels, but we write just channel 3 to disk as a one-channel image (see the second
sketch below).
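Both can be expressed with strides. The following sketches assume the caller has defined the full-image buffers (image and pixels), the tile coordinates x and y, and xres and channels as in earlier examples:

// First sketch: write a tile whose pixels live inside a full-image buffer.
int pixelsize = channels * sizeof(unsigned char);
int scanlinesize = xres * pixelsize;
out->write_tile (x, y, 0, TypeDesc::UINT8,
                 (char *)image + y*scanlinesize + x*pixelsize,  // start of the tile
                 pixelsize,       // xstride: distance between pixels in a row
                 scanlinesize);   // ystride: distance between rows

// Second sketch: the buffer has 4 channels per pixel, but the file gets only
// channel 3, written as a one-channel image.
ImageSpec spec (xres, yres, 1, TypeDesc::UINT8);   // one channel in the file
out->open (filename, spec);
out->write_image (TypeDesc::UINT8,
                  (char *)pixels + 3*sizeof(unsigned char),   // point at channel 3
                  4*sizeof(unsigned char) /* xstride */,
                  AutoStride, AutoStride);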
Please consult Section 3.3 for detailed descriptions of the stride parameters to each write
function.
Not all image file formats have a way to describe display windows. An ImageOutput
implementation that cannot express display windows will always write out the width × height
pixel data, and may upon writing lose information about offsets or crop windows.
Here is a code example that opens an image file that will contain a 32 × 32 pixel crop
window within an abstract 640 × 480 full size image. Notice that the pixel indices (column,
scanline, slice) passed to the write functions are the coordinates relative to the full image, not
relative to the crop window, but the data pointer passed to the write functions should point to the
beginning of the actual pixel data being passed (not the hypothetical start of the full data, if
it was all present).
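A sketch of such a setup (the full_width/full_height field names and the particular offsets are assumptions for illustration):

ImageSpec spec (32, 32, 3, TypeDesc::UINT8);   // the 32 x 32 crop window
spec.x = 128;                 // offset of the crop window within the full image
spec.y = 96;
spec.full_width  = 640;       // the abstract "full" or display image size
spec.full_height = 480;
out->open (filename, spec);
out->write_image (TypeDesc::UINT8, pixels);    // pixels holds only the 32 x 32 region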
The ImageSpec passed to open() can specify all the common required properties that describe
an image: data format, dimensions, number of channels, tiling. However, there may be a variety
of additional metadata¹ that should be carried along with the image or saved in the file.
The remainder of this section explains how to store additional metadata in the ImageSpec.
It is up to the ImageOutput to store these in the file, if indeed the file format is able to accept the
data. Individual ImageOutput implementations should document which metadata they respect.
Channel names
In addition to specifying the number of color channels, it is also possible to name those channels.
Only a few ImageOutput implementations have a way of saving this in the file, but some do,
so you may as well do it if you have information about what the channels represent.
¹ Metadata refers to data about data; in this case, data about the image that goes beyond the pixel values and
description thereof.
By convention, channel names for red, green, blue, and alpha (or a main image) should be
named "R", "G", "B", and "A", respectively. Beyond this guideline, however, you can use any
names you want.
The ImageSpec has a vector of strings called channelnames. Upon construction, it starts
out with reasonable default values. If you use it at all, you should make sure that it contains the
same number of strings as the number of color channels in your image. Here is an example:
int channels = 4;
ImageSpec spec (width, length, channels, TypeDesc::UINT8);
spec.channelnames.clear ();
spec.channelnames.push_back ("R");
spec.channelnames.push_back ("G");
spec.channelnames.push_back ("B");
spec.channelnames.push_back ("A");
Here is another example in which custom channel names are used to label the channels in an
8-channel image containing beauty pass RGB, per-channel opacity, and texture s,t coordinates
for each pixel.
int channels = 8;
ImageSpec spec (width, length, channels, TypeDesc::UINT8);
spec.channelnames.clear ();
spec.channelnames.push_back ("R");
spec.channelnames.push_back ("G");
spec.channelnames.push_back ("B");
spec.channelnames.push_back ("opacityR");
spec.channelnames.push_back ("opacityG");
spec.channelnames.push_back ("opacityB");
spec.channelnames.push_back ("texture_s");
spec.channelnames.push_back ("texture_t");
The main advantage to naming color channels is that if you are saving to a file format that
supports channel names, then any application that uses OpenImageIO to read the image back
has the option to retain those names and use them for helpful purposes. For example, the iv
image viewer will display the channel names when viewing individual channels or displaying
numeric pixel values in “pixel view” mode.
Specially-designated channels
The ImageSpec contains two fields, alpha channel and z channel, which can be used to
designate which channel indices are used for alpha and z depth, if any. Upon construction, these
are both set to -1, indicating that it is not known which channels are alpha or depth. Here is an
example of setting up a 5-channel output that represents RGBAZ:
int channels = 5;
ImageSpec spec (width, length, channels, format);
spec.channelnames.push_back ("R");
spec.channelnames.push_back ("G");
spec.channelnames.push_back ("B");
spec.channelnames.push_back ("A");
spec.channelnames.push_back ("Z");
spec.alpha_channel = 3;
spec.z_channel = 4;
There are two advantages to designating the alpha and depth channels in this manner:
• Some file formats may require that these channels be stored in a particular order, with
a particular precision, or the ImageOutput may in some other way need to know about
these special channels.
• Certain operations that make sense for colors should not apply to alpha or z. For example,
if your call to write reduces precision (e.g., converts from float to integer pixels) it will
typically add random dither to eliminate banding artifacts in the quantization. But for a
variety of reasons, you want to add dither only to color channels and not to alpha. So
setting alpha_channel will cause the write functions to not dither that channel.
Linearity hints
We certainly hope that you are using only modern file formats that support high precision and
extended range pixels (such as OpenEXR) and keeping all your images in a linear color space.
But you may have to work with file formats that dictate the use of nonlinear color values. This
is prevalent in formats that store pixels only as 8-bit values, since 256 values are not enough to
linearly represent colors without banding artifacts in the dim values.
Since this can (and probably will) happen, the ImageSpec has fields that allow you to ex-
plain what color space your image pixels are in. Each individual ImageOutput should document
how it uses this (or not).
The ImageSpec field linearity can take on any of the following values:
ImageSpec::UnknownLinearity the default, indicates that you have made no claim about
the color space of your pixel data.
ImageSpec::Linear indicates that the pixel values you are passing represent linear values.
ImageSpec::GammaCorrected indicates that the color pixel values (but not alpha or z) that
you are passing have already been gamma corrected (raised to the power 1/γ), and that
the gamma exponent may be found in the gamma field of the ImageSpec.
ImageSpec::sRGB indicates that the color pixel values that you are passing are already in
sRGB color space.
Here is a simple example of setting up the ImageSpec when you know that the pixel values you
are writing are linear:
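A sketch:

ImageSpec spec (xres, yres, channels, TypeDesc::UINT8);
spec.linearity = ImageSpec::Linear;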
If a particular ImageOutput implementation is required (by the rules of the file format it
writes) to have pixels in a particular color space, then it will convert the color values of your
image to the right color space if it is not already in that space. For example, JPEG images must
be in sRGB space, so if you declare your pixels to be Linear, the JPEG ImageOutput will
convert to sRGB.
If you leave the linearity set to the default of UnknownLinearity, the values will not be
transformed, since the plugin can’t be sure that it’s not in the correct space to begin with.
The linearity only describes color channels. An ImageOutput plugin will assume that alpha
or depth (z) channels (designated by the alpha_channel and z_channel fields, respectively)
always represent linear values and should never be transformed.
Arbitrary metadata
For all other metadata that you wish to save in the file, you can attach the data to the ImageSpec
using the attribute() methods. These come in polymorphic varieties that allow you to attach
an attribute name and a value consisting of a single int, unsigned int, float, char*, or
std::string, as shown in the following examples:
unsigned int u = 1;
spec.attribute ("Orientation", u);
float f = 72.0f;
spec.attribute ("dotsize", f);
These are convenience routines for metadata that consist of a single value of one of these
common types. For other data types, or more complex arrangements, you can use the more
general form of attribute(), which takes arguments giving the name, type (as a TypeDesc),
number of values (1 for a single value, > 1 for an array), and then a pointer to the data values.
For example,
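The following sketch follows the argument order described above; the attribute name "worldtocamera" is illustrative:

float mtx[16] = { 1,0,0,0,  0,1,0,0,  0,0,1,0,  0,0,0,1 };
spec.attribute ("worldtocamera",
                TypeDesc (TypeDesc::FLOAT, TypeDesc::MATRIX44),
                1, mtx);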
In general, most image file formats (and therefore most ImageOutput implementations) are
aware of only a small number of name/value pairs that they predefine and will recognize. Some
file formats (OpenEXR, notably) do accept arbitrary user data and save it in the image file. If an
ImageOutput does not recognize your metadata and does not support arbitrary metadata, that
metadatum will be silently ignored and will not be saved with the file.
Each individual ImageOutput implementation should document the names, types, and
meanings of all metadata attributes that they understand.
It is possible that your internal data format (that in which you compute pixel values that you
pass to the write functions) is of greater precision or range than the native data format of the
output file. This can occur either because you specified a lower-precision data format in the
ImageSpec that you passed to open(), or else that the image file format dictates a particular
data format that does not match your internal format. For example, you may compute float
pixels and pass those to write image(), but if you are writing a JPEG/JFIF file, the values
must be stored in the file as 8-bit unsigned integers.
The conversion from floating-point formats to integer formats (or from higher to lower inte-
ger, which is done by first converting to float) is controlled by five fields within the ImageSpec:
quant_black, quant_white, quant_min, quant_max, and quant_dither. Float 0.0 maps
to the integer value given by quant_black, and float 1.0 maps to the integer value given by
quant_white. Then, for color channels only (not alpha or depth), a random amount is added
in the range (-quant_dither..quant_dither), in order to reduce banding artifacts. The re-
sult is then clamped to lie within the range of quant_min and quant_max, inclusive. Finally,
this result is truncated to its integer value for final output. Here is the code that implements this
transformation (T is the final output integer type):
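(The original listing is reconstructed here as a sketch; dither() and clamp() stand in for the library's internal helpers, and value is the floating-point value being quantized.)

float v = value * (quant_white - quant_black) + quant_black;
v += dither (-quant_dither, quant_dither);   // color channels only
v = clamp (v, (float)quant_min, (float)quant_max);
T result = (T) v;                            // truncate to the integer value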
The values of the quantization parameters are set in one of three ways: (1) upon construction
of the ImageSpec, they are set to the default quantization values for the given data format; (2)
upon call to ImageSpec::set_format(), the quantization values are set to the defaults for
the given data format; (3) or, after being first set up in this manner, you may manually change
the quantization parameters in the ImageSpec, if you want something other than the default
quantization.
Default quantization for each integer type is as follows:
Note that the default is to use the entire positive range of each integer type to represent the
floating-point (0..1) range. Floating-point types do not attempt to remap values, do not add
dither, and do not clamp (except to their full floating-point range).
The default will almost always be what you want. But just as an example, here’s how you
would specify a quantization for a 16-bit file in which 1.0 maps to 16383 (14 bits of positive
range) rather than filling the full 16 bits:
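A sketch (quant_min and quant_max are left at their defaults for the data type):

ImageSpec spec (xres, yres, channels, TypeDesc::UINT16);
spec.quant_black = 0;
spec.quant_white = 16383;   // 1.0 maps to 16383 rather than 65535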
All ImageOutput implementations accept scanlines and tiles in strict order of increasing
z slice, increasing y scanlines/rows within each slice, and increasing x column within each row.
It is generally not safe to skip scanlines or tiles, or transmit them out of order, unless the plugin
specifically advertises that it supports random access or rewrites, which may be queried using:
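if (out->supports ("random access"))
    ...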
Similarly, you should assume the plugin will not correctly handle repeated transmissions of a
scanline or tile that has already been sent, unless it advertises that it supports rewrites, which
may be queried using:
if (out->supports ("rewrite"))
...
Some image file formats support multiple discrete images to be stored in one file. Given a
created ImageOutput, you can query whether multiple images may be stored in the file:
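if (out->supports ("multiimage"))
    ...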
If you are working with an ImageOutput that supports multiple images, it is easy to write
these images. All you have to do is, after writing all the pixels of one image but before calling
close(), call open() again for the next image and pass true as the optional third append
argument. (See Section 3.3 for the full technical description of the arguments to open().) The
close() routine is called just once, after all subimages are completed.
Below is pseudocode for writing a MIP-map (a multi-resolution image used for texture
mapping) that shows how to use multi-image:
const char *filename = "foo.tif";
const int xres = 512, yres = 512;
const int channels = 3; // RGB
unsigned char *pixels = new unsigned char [xres*yres*channels];
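// A reconstruction sketch of the remainder of this example: write the base
// level, then append successively smaller levels to the same file by calling
// open() again with append=true before each one.
ImageOutput *out = ImageOutput::create (filename);
int w = xres, h = yres;
ImageSpec spec (w, h, channels, TypeDesc::UINT8);
out->open (filename, spec);
out->write_image (TypeDesc::UINT8, pixels);
while (w > 1 || h > 1) {
    w = (w > 1) ? w/2 : 1;
    h = (h > 1) ? h/2 : 1;
    // ... downsample the pixel data to the new w x h resolution ...
    ImageSpec levelspec (w, h, channels, TypeDesc::UINT8);
    out->open (filename, levelspec, true);   // append = true
    out->write_image (TypeDesc::UINT8, pixels);
}
out->close ();
delete out;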
In this example, we have used write_image(), but of course write_scanline(), write_tile(),
and write_rectangle() work as you would expect, on the current subimage.
A direct way to copy an image from one file to another is to open the source with an ImageInput,
open the destination with an ImageOutput, and call write_image() to output the pixels read from
the input image. However, for compressed images, this may be inefficient due to the unnecessary
decompression and subsequent re-compression. In addition, if the compression is lossy, the output
image may not contain pixel values identical to the original input.
A special copy_image method of ImageOutput is available that attempts to copy an image
from an open ImageInput (of the same format) to the output as efficiently as possible and
without altering pixel values, if at all possible.
Not all format plugins will provide an implementation of copy_image (in fact, most will
not), but the default implementation simply copies pixels one scanline or tile at a time (with
decompression/recompression) so it's still safe to call. Furthermore, even a provided copy_image
is expected to fall back on the default implementation if the input and output are not able
to do an efficient copy. Nevertheless, this method is recommended for copying images so that
maximal advantage will be taken in cases where savings can be had.
The following is an example use of copy_image to transfer pixels without alteration while
modifying the image description metadata:
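Here is a sketch (assuming ImageInput::create/open as described in the next chapter, that copy_image takes a pointer to the open ImageInput, and that "ImageDescription" is a metadata name the format understands):

ImageInput *in = ImageInput::create (srcfilename);
ImageSpec inspec;
in->open (srcfilename, inspec);

ImageSpec outspec = inspec;
outspec.attribute ("ImageDescription", "A copy, made with copy_image");

ImageOutput *out = ImageOutput::create (dstfilename);
out->open (dstfilename, outspec);
out->copy_image (in);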
// Clean up
out->close ();
delete out;
in->close ();
delete in;
check the directory "/home/tom/plugins" (assuming the user’s home directory is /home/tom),
and if not found there, will then check the directory "/shared/plugins".
The first search path it will check is that stored in the environment variable
IMAGEIO_LIBRARY_PATH. It will check each directory in turn, in the order that they are listed in the
variable. If no adequate plugin is found in any of the directories listed in this environment
variable, then it will check the custom searchpath passed as the optional second argument to
ImageOutput::create(), searching in the order that the directories are listed. Here is an
example:
#include "imageio.h"
using namespace OpenImageIO;
...
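// A hypothetical reconstruction of the elided portion of this example:
// create with a custom search path, then open with error checking
// (xres, yres, and channels are assumed to be set up as before).
const char *filename = "foo.tif";
const char *searchpath = "/home/tom/plugins:/shared/plugins";
ImageOutput *out = ImageOutput::create (filename, searchpath);
if (! out) {
    std::cerr << "Could not create an ImageOutput for "
              << filename << "\n";
    return;
}
ImageSpec spec (xres, yres, channels, TypeDesc::UINT8);
if (! out->open (filename, spec)) {
    std::cerr << "Error opening " << filename
              << ", error = " << out->geterror() << "\n";
    delete out;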
return;
}
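// ... write the image as in the earlier examples, again checking for errors ...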
if (! out->close ()) {
std::cerr << "Error closing " << filename
<< ", error = " << out->geterror() << "\n";
delete out;
return;
}
delete out;
"random access" May tiles or scanlines be written in any order? False indicates that
they must be in successive order.
"multiimage" Does this format support multiple images within a single file?
"volumes" Does this format support “3D” pixel arrays (a.k.a. volume images)?
"rewrite" Does this plugin allow the same scanline or tile to be sent more than once?
Generally this is true for plugins that implement some sort of interactive display,
rather than a saved image file.
"empty" Does this plugin support passing a NULL data pointer to the various write
routines to indicate that the entire data block is composed of pixels with value zero.
Plugins that support this achieve a speedup when passing blank scanlines or tiles
(since no actual data needs to be transmitted or converted).
This list of queries may be extended in future releases. Since this can be done simply by
recognizing new query strings, and does not require any new API entry points, addition
of support for new queries does not break “link compatibility” with previously-compiled
plugins.
bool open (const std::string &name, const ImageSpec &newspec, bool append=false)
Open the file with given name, with resolution, and other format data as given in newspec.
This function returns true for success, false for failure. Note that it is legal to call
open() multiple times on the same file without a call to close(), if it supports multi-
image and the append flag is true – this is interpreted as appending images (such as for
MIP-maps).
bool close ()
Closes the currently open file associated with this ImageOutput and frees any memory
or resources associated with it.
bool write_scanline (int y, int z, TypeDesc format, const void *data,
stride_t xstride=AutoStride)
Write a full scanline that includes pixels (∗, y, z). For 2D non-volume images, z is ignored.
The xstride value gives the distance between successive pixels (in bytes). Strides set to
the special value AutoStride imply contiguous data, i.e.,
xstride = spec.nchannels*format.size()
This method automatically converts the data from the specified format to the actual out-
put format of the file. Return true for success, false for failure. It is a failure to call
write_scanline() with an out-of-order scanline if this format driver does not support
random access.
bool write_tile (int x, int y, int z, TypeDesc format, const void *data,
stride_t xstride=AutoStride, stride_t ystride=AutoStride,
stride_t zstride=AutoStride)
Write the tile with (x, y, z) as the upper left corner. For 2D non-volume images, z is
ignored. The three stride values give the distance (in bytes) between successive pixels,
scanlines, and volumetric slices, respectively. Strides set to the special value AutoStride
imply contiguous data, i.e.,
xstride = spec.nchannels*format.size()
ystride = xstride*spec.tile_width
zstride = ystride*spec.tile_height
This method automatically converts the data from the specified format to the actual out-
put format of the file. Return true for success, false for failure. It is a failure to call
write_tile() with an out-of-order tile if this format driver does not support random
access.
bool write_rectangle (int xmin, int xmax, int ymin, int ymax, int zmin, int zmax,
TypeDesc format, const void *data,
stride_t xstride=AutoStride, stride_t ystride=AutoStride,
stride_t zstride=AutoStride)
Write pixels whose x coords range over xmin...xmax (inclusive), y coords over ymin...ymax,
and z coords over zmin...zmax. The three stride values give the distance (in bytes) be-
tween successive pixels, scanlines, and volumetric slices, respectively. Strides set to the
special value AutoStride imply contiguous data, i.e.,
xstride = spec.nchannels*format.size()
ystride = xstride*(xmax-xmin+1)
zstride = ystride*(ymax-ymin+1)
This method automatically converts the data from the specified format to the actual out-
put format of the file. Return true for success, false for failure. It is a failure to call
write_rectangle() for a format plugin that does not return true for supports("rectangles").
Because this may be an expensive operation, a progress callback may be passed. Period-
ically, it will be called as follows:
progress_callback (progress_callback_data, float done)
where done gives the portion of the image (between 0.0 and 1.0) that has been written
thus far.
std::string geterror ()
Returns the current error string describing what went wrong if any of the public methods
returned false indicating an error. (Hopefully the implementation plugin called error()
with a helpful error message.)
4 Image I/O: Reading Images

#include "imageio.h"
using namespace OpenImageIO;
...
• Search for an ImageIO plugin that is capable of reading the file ("foo.jpg"), first by
trying to deduce the correct plugin from the file extension, but if that fails, by opening
every ImageIO plugin it can find until one will open the file without error. When it finds
the right plugin, it creates a subclass instance of ImageInput that reads the right kind of
file format.
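In code (a sketch; ImageInput::create parallels ImageOutput::create):

const char *filename = "foo.jpg";
ImageInput *in = ImageInput::create (filename);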
• Open the file, read the header, and put all relevant metadata about the file in a specification
structure.
ImageSpec spec;
in->open (filename, spec);
• The specification contains vital information such as the dimensions of the image, number
of color channels, and data type of the pixel values. This is enough to allow us to allocate
enough space for the image.
xres = spec.width;
yres = spec.height;
channels = spec.nchannels;
pixels = new unsigned char [xres*yres*channels];
Note that in this example, we don’t care what data format is used for the pixel data in the
file — we allocate enough space for unsigned 8-bit integer pixel values, and will rely on
OpenImageIO’s ability to convert to our requested format from the native data format of
the file.
• Read the entire image, hiding all details of the encoding of image data in the file, whether
the file is scanline- or tile-based, or what is the native format of the data in the file (in this
case, we request that it be automatically converted to unsigned 8-bit integers).
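For example (a sketch):

in->read_image (TypeDesc::UINT8, pixels);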
• Close the file, destroy and free the ImageInput we had created, and perform all other
cleanup and release of any resources used by the plugin.
in->close ();
delete in;
Let’s walk through some of the most common things you might want to do, but that are more
complex than the simple example above.
The simple example of Section 4.1 read an entire image with one call. But sometimes you want
to read a large image a little at a time and do not wish to retain the entire image in memory as
you process it. OpenImageIO allows you to read images one scanline at a time or one tile at a
time.
Individual scanlines may be read using the read_scanline() API call:
...
in->open (filename, spec);
unsigned char *scanline = new unsigned char [spec.width*spec.nchannels];
for (int y = 0; y < yres; ++y) {
in->read_scanline (y, 0, TypeDesc::UINT8, scanline);
... process data in scanline[0..width*channels-1] ...
}
delete [] scanline;
in->close ();
...
The first two arguments to read_scanline() specify which scanline is being read by its
vertical (y) scanline number (beginning with 0) and, for volume images, its slice (z) number (the
slice number should be 0 for 2D non-volume images). This is followed by a TypeDesc describ-
ing the data type of the pixel buffer you are supplying, and a pointer to the pixel buffer itself.
Additional optional arguments describe the data stride, which can be ignored for contiguous
data (use of strides is explained in Section 4.2.3).
All ImageInput implementations will read scanlines in strict order (starting with scanline
0, then 1, up to yres-1, without skipping any). However, relatively few ImageInput
implementations properly support reading scanlines out of order, so do not rely on it.
The full description of the read_scanline() function may be found in Section 4.3.
Once you open() an image file, you can find out if it is a tiled image (and the tile size) by exam-
ining the ImageSpec's tile_width, tile_height, and tile_depth fields. If they are zero,
it's a scanline image and you should read pixels using read_scanline(), not read_tile().
...
in->open (filename, spec);
if (spec.tile_width == 0) {
... read by scanline ...
} else {
// Tiles
int tilesize = spec.tile_width * spec.tile_height;
unsigned char *tile = new unsigned char [tilesize * spec.nchannels];
for (int y = 0; y < yres; y += spec.tile_height) {
for (int x = 0; x < xres; x += spec.tile_width) {
in->read_tile (x, y, 0, TypeDesc::UINT8, tile);
... process the pixels in tile[] ..
}
}
delete [] tile;
}
in->close ();
...
The first three arguments to read_tile() specify which tile is being read by the pixel
coordinates of any pixel contained in the tile: x (column), y (scanline), and z (slice, which
should always be 0 for 2D non-volume images). This is followed by a TypeDesc describing the
data format of the pixel buffer you are supplying, and a pointer to the pixel buffer. Pixel data
will be written to your buffer in order of increasing slice, increasing scanline within each slice,
and increasing column within each scanline. Additional optional arguments describe the data
stride, which can be ignored for contiguous data (use of strides is explained in Section 4.2.3).
All ImageInput implementations are required to support reading tiles in arbitrary order
(i.e., not in strict order of increasing y rows, and within each row, increasing x column, without
missing any tiles).
The full description of the read_tile() function may be found in Section 4.3.
The code examples of the previous sections all assumed that your internal pixel data is stored
as unsigned 8-bit integers (i.e., 0-255 range). But OpenImageIO is significantly more flexible.
You may request that the pixels be stored in any of several formats. This is done merely
by passing the read function the data type of your pixel buffer, as one of the enumerated type
TypeDesc.
It is not required that the pixel data buffer passed to read_image(), read_scanline(),
or read_tile() actually be in the same data format as the data in the file being read. Open-
ImageIO will automatically convert from native data type of the file to the internal data format
of your choice. For example, the following code will open a TIFF and read pixels into your
internal buffer represented as float values. This will work regardless of whether the TIFF file
itself is using 8-bit, 16-bit, or float values.
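A sketch of that scenario:

ImageInput *in = ImageInput::create (filename);
ImageSpec spec;
in->open (filename, spec);
float *pixels = new float [spec.width * spec.height * spec.nchannels];
in->read_image (TypeDesc::FLOAT, pixels);   // converted from the file's native type
in->close ();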
Note that read_scanline() and read_tile() have a parameter that works in a correspond-
ing manner.
You can, of course, find out the native type of the file simply by examining spec.format.
If you wish, you may then allocate a buffer big enough for an image of that type and request
the native type when reading, therefore eliminating any translation among types and seeing the
actual numerical values in the file.
In the preceding examples, we have assumed that the buffer passed to the read functions (i.e.,
the place where you want your pixels to be stored) is contiguous, that is:
• each pixel in memory consists of a number of data values equal to the number of channels
in the file;
• successive column pixels within a row directly follow each other in memory, with the first
channel of pixel x immediately following the last channel of pixel x − 1 of the same row;
• for whole images or tiles, the data for each row immediately follows the previous one in
memory (the first pixel of row y immediately follows the last column of row y − 1);
• for 3D volumetric images, the first pixel of slice z immediately follows the last pixel
of slice z − 1.
Please note that this implies that read_tile() will write pixel data into your buffer so that
it is contiguous in the shape of a single tile, not just an offset into a whole image worth of pixels.
The read_scanline() function takes an optional xstride argument, and the read_image()
and read_tile() functions take optional xstride, ystride, and zstride values that de-
scribe the distance, in bytes, between successive pixel columns, rows, and slices, respectively,
of your pixel buffer. For any of these values that are not supplied, or are given as the special
constant AutoStride, contiguity will be assumed.
By passing different stride values, you can achieve some surprisingly flexible functionality.
A few representative examples follow:
• Read a tile into its spot in a buffer whose layout matches a whole image of pixel data,
rather than having a one-tile-only memory layout:
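A sketch, mirroring the write-side example (the full-image buffer image, the tile coordinates x and y, and spec are assumed to be set up by the caller):

int pixelsize = spec.nchannels * sizeof(unsigned char);
int scanlinesize = spec.width * pixelsize;
in->read_tile (x, y, 0, TypeDesc::UINT8,
               (char *)image + y*scanlinesize + x*pixelsize,  // the tile's spot
               pixelsize,       // xstride: distance between pixels in a row
               scanlinesize);   // ystride: distance between rows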
Please consult Section 4.3 for detailed descriptions of the stride parameters to each read
function.
The ImageSpec that is filled in by ImageInput::open() specifies all the common properties
that describe an image: data format, dimensions, number of channels, tiling. However, there
may be a variety of additional metadata that are present in the image file and could be queried
by your application.
The remainder of this section explains how to query additional metadata in the ImageSpec.
It is up to the ImageInput to read these from the file, if indeed the file format is able to carry
additional data. Individual ImageInput implementations should document which metadata
they read.
Channel names
In addition to specifying the number of color channels, the ImageSpec also stores the names of
those channels in its channelnames field, which is a vector<std::string>. Its length should
always be equal to the number of channels (it’s the responsibility of the ImageInput to ensure
this).
Only a few file formats (and thus ImageInput implementations) have a way of specifying
custom channel names, so most of the time you will see that the channel names follow the
default convention of being named "R", "G", "B", and "A", for red, green, blue, and alpha,
respectively.
Here is example code that prints the names of the channels in an image:
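A sketch:

for (int i = 0; i < spec.nchannels; ++i)
    std::cout << "Channel " << i << " is named \""
              << spec.channelnames[i] << "\"\n";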
Specially-designated channels
The ImageSpec contains two fields, alpha_channel and z_channel, which designate which
channel numbers represent alpha and z depth, if any. If either is set to -1, it indicates that it is
not known which channel is used for that data.
If you are doing something special with alpha or depth, it is probably safer to respect the
alpha_channel and z_channel designations (if not set to -1) rather than merely assuming
that, for example, channel 3 is always the alpha channel.
Linearity hints
We certainly hope that you are using only modern file formats that support high precision and
extended range pixels (such as OpenEXR) and keeping all your images in a linear color space.
But you may have to work with file formats that dictate the use of nonlinear color values. This
is prevalent in formats that store pixels only as 8-bit values, since 256 values are not enough to
linearly represent colors without banding artifacts in the dim values.
The ImageSpec has a field that reveals what color space the image file is using. Each
individual ImageInput is responsible for setting this properly.
The ImageSpec field linearity can take on any of the following values:
ImageSpec::UnknownLinearity indicates that it is unknown what color space the image
file is using.
ImageSpec::Linear indicates that the color pixel values are known to be linear.
ImageSpec::GammaCorrected indicates that the color pixel values (but not alpha or z) in
the file have already been gamma corrected (raised to the power 1/γ), and that the gamma
exponent may be found in the gamma field of the ImageSpec.
ImageSpec::sRGB indicates that the color pixel values in the file are in sRGB color space.
Arbitrary metadata
All other metadata found in the file will be stored in the ImageSpec’s extra_attribs field,
which is a ParamValueList, which is itself essentially a vector of ParamValue instances. Each
ParamValue stores one meta-datum consisting of a name, type (specified by a TypeDesc),
number of values, and data pointer.
If you know the name of a specific piece of metadata you want to use, you can find it
using the ImageSpec::find_attribute() method, which returns a pointer to the matching
ParamValue, or NULL if no match was found. An optional TypeDesc argument can narrow
the search to only parameters that match the specified type as well as the name. Below is an
example that looks for orientation information, expecting it to consist of a single integer:
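A sketch of such a query, assuming spec is the ImageSpec filled in by open() and using the
standard "Orientation" metadata name:
const ImageIOParameter *p = spec.find_attribute ("Orientation", TypeDesc::INT);
if (p) {
    int orientation = * (const int *) p->data();
    std::cout << "Orientation is " << orientation << "\n";
} else {
    std::cout << "No integer orientation was found\n";
}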
By convention, ImageInput plugins will save all integer metadata as 32-bit integers (TypeDesc::INT
or TypeDesc::UINT), even if the file format dictates that a particular item is stored in the file
as an 8- or 16-bit integer. This is just to keep client applications from having to deal with all
the types. Since there is relatively little metadata compared to pixel data, there’s no real mem-
ory waste of promoting all integer types to int32 metadata. Floating-point metadata and string
metadata may also exist, of course.
It is also possible to step through all the metadata, item by item. This can be accomplished
using the technique of the following example:
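One possible sketch, assuming spec is the open file's ImageSpec and handling only a few
common data types:
for (size_t i = 0; i < spec.extra_attribs.size(); ++i) {
    const ImageIOParameter &p (spec.extra_attribs[i]);
    std::cout << "    " << p.name().c_str() << ": ";
    if (p.type() == TypeDesc::TypeString)
        std::cout << * (const char **) p.data();
    else if (p.type() == TypeDesc::TypeFloat)
        std::cout << * (const float *) p.data();
    else if (p.type() == TypeDesc::TypeInt)
        std::cout << * (const int *) p.data();
    else
        std::cout << "<unhandled data type>";
    std::cout << "\n";
}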
Each individual ImageInput implementation should document the names, types, and mean-
ings of all metadata attributes that they understand.
The seek_subimage() function takes two arguments: the index of the subimage to switch
to (starting with 0), and a reference to an ImageSpec, into which will be stored the spec of
the new subimage. The seek_subimage() function returns true upon success, and false
if no such subimage existed. It is legal to visit subimages out of order; the ImageInput is
responsible for making it work properly. It is also possible to find out which subimage is
currently being viewed, using the current_subimage() function, which returns the index of
the current subimage.
Below is pseudocode for reading all the levels of a MIP-map (a multi-resolution image used
for texture mapping) that shows how to read multi-image files:
int num_subimages = 0;
while (in->seek_subimage (num_subimages, spec)) {
// Note: spec has the format spec of the current subimage
int npixels = spec.width * spec.height;
int nchannels = spec.nchannels;
unsigned char *pixels = new unsigned char [npixels * nchannels];
in->read_image (TypeDesc::UINT8, pixels);
delete [] pixels;
++num_subimages;
}
// Note: we break out of the while loop when seek_subimage fails
// to find a next subimage.
in->close ();
delete in;
In this example, we have used read_image(), but of course read_scanline() and read_tile()
work as you would expect, on the current subimage.
Please see Section 3.2.10 for discussion about search paths for finding plugins that implement
ImageOutput.
In a similar fashion, calls to ImageInput::create() will search for plugins in each di-
rectory listed in the environment variable IMAGEIO_LIBRARY_PATH, in the order that they are
listed. If no adequate plugin is found, then it will check the custom searchpath passed as the
optional second argument to ImageInput::create(). Here is an example:
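A sketch (the directory names are purely illustrative):
std::string mysearch = "/usr/myapp/lib:/home/myself/plugins";
ImageInput *in = ImageInput::create (filename, mysearch);
...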
Nearly every ImageInput API function returns a bool indicating whether the operation suc-
ceeded (true) or failed (false). In the case of a failure, the ImageInput will have saved an er-
ror message describing in more detail what went wrong, and the latest error message is accessi-
ble using the ImageInput method geterror(), which returns the message as a std::string.
The exception to this rule is ImageInput::create, which returns NULL if it could not create
an appropriate ImageInput. And in this case, since no ImageInput exists for which you can
call its geterror() function, there exists a global geterror() function (in the OpenImageIO
namespace) that retrieves the latest error message resulting from a call to create.
Here is another version of the simple image reading code from Section 4.1, but this time it
is fully elaborated with error checking and reporting:
#include "imageio.h"
using namespace OpenImageIO;
...
ImageSpec spec;
if (! in->open (filename, spec)) {
std::cerr << "Could not open " << filename
<< ", error = " << in->geterror() << "\n";
delete in;
return;
}
xres = spec.width;
yres = spec.height;
channels = spec.nchannels;
pixels = new unsigned char [xres*yres*channels];
if (! in->close ()) {
std::cerr << "Error closing " << filename
<< ", error = " << in->geterror() << "\n";
delete in;
return;
}
delete in;
bool close ()
Closes an open image.
bool read_tile (int x, int y, int z, TypeDesc format, void *data,
                stride_t xstride=AutoStride, stride_t ystride=AutoStride,
                stride_t zstride=AutoStride)
Read the tile that includes pixels (∗, y, z) into data, converting if necessary from the na-
tive data format of the file into the format specified (z = 0 for non-volume images). The
stride values give the data spacing of adjacent pixels, scanlines, and volumetric slices,
respectively (measured in bytes). Strides set to the special value of AutoStride imply
contiguous data, i.e.,
xstride = spec.nchannels*format.size()
ystride = xstride*spec.tile_width
zstride = ystride*spec.tile_height
The ImageInput is expected to give the appearance of random access — in other words,
if it can’t randomly seek to the given tile, it should transparently close, reopen, and se-
quentially read through prior tiles. The base ImageInput class has a default implemen-
tation that calls read_native_tile() and then does appropriate format conversion, so there is
no reason for each format plugin to override this method.
where done gives the portion of the image (between 0.0 and 1.0) that has been read thus
far.
1. Read the base class definition from imageio.h. It may also be helpful to use the
OpenImageIO namespace, just to make your code a little less verbose.
#include "imageio.h"
using namespace OpenImageIO;
2. Declare the following items:
(a) An integer called name_imageio_version that identifies the version of the Im-
ageIO protocol implemented by the plugin, defined in imageio.h as the constant
OPENIMAGEIO_PLUGIN_VERSION. This allows the library to be sure it is not loading
a plugin that was compiled against an incompatible version of OpenImageIO.
(b) A function named name_input_imageio_create that takes no arguments and
returns a new instance of your ImageInput subclass. (Note that name is the name
of your format, and must match the name of the plugin itself.)
(c) An array of char * called name_input_extensions that contains the list of file
extensions that are likely to indicate a file of the right format. The list is terminated
by a NULL pointer.
All of these items must be inside an ‘extern "C"’ block in order to avoid name man-
gling by the C++ compiler. Depending on your compiler, you may need to use special
commands to dictate that the symbols will be exported in the DSO; we provide a special
DLLEXPORT macro for this purpose, defined in export.h.
Putting this all together, we get the following for our JPEG example:
extern "C" {
DLLEXPORT int jpeg_imageio_version = OPENIMAGEIO_PLUGIN_VERSION;
DLLEXPORT JpgInput *jpeg_input_imageio_create () {
return new JpgInput;
}
DLLEXPORT const char *jpeg_input_extensions[] = {
"jpg", "jpe", "jpeg", NULL
};
};
3. The definition and implementation of an ImageInput subclass for this file format. It must
publicly inherit ImageInput, and must overload the following methods which are “pure
virtual” in the ImageInput base class:
(a) format_name() should return the name of the format, which ought to match the
name of the plugin and by convention is strictly lower-case and contains no whites-
pace.
(b) open() should open the file and return true, or should return false if unable to do
so (including if the file was found but turned out not to be in the format that your
plugin is trying to implement).
(c) close() should close the file, if open.
(d) read_native_scanline() should read a single scanline from the file into the ad-
dress provided, uncompressing it but keeping it in its native data format without any
translation.
(e) The virtual destructor, which should close() the file if it is still open, in addition to
performing any other tear-down activities.
Additionally, your ImageInput subclass may optionally choose to overload any of the
following methods, which are defined in the ImageInput base class and only need to be
overloaded if the default behavior is not appropriate for your plugin:
(f) seek_subimage(), only if your format supports reading multiple subimages within
a single file.
(g) read_native_tile(), only if your format supports reading tiled images.
Here is how the class definition looks for our JPEG example. Note that the JPEG/JFIF
file format does not support multiple subimages or tiled images.
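A condensed sketch of what such a class definition might look like (illustrative rather than a
verbatim copy of the real plugin; member names follow the implementation excerpts later in
this chapter):
class JpgInput : public ImageInput {
public:
    JpgInput () { init(); }
    virtual ~JpgInput () { close(); }
    virtual const char * format_name (void) const { return "jpeg"; }
    virtual bool open (const std::string &name, ImageSpec &newspec);
    virtual bool read_native_scanline (int y, int z, void *data);
    virtual bool close ();
private:
    FILE *m_fd;
    std::string m_filename;
    int m_next_scanline;      // Which scanline is the next to read?
    struct jpeg_decompress_struct m_cinfo;
    struct jpeg_error_mgr m_jerr;
    void init () { m_fd = NULL; }
};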
Your subclass implementations of open(), close(), and read_native_scanline() are
the heart of an ImageInput implementation. (So are read_native_tile() and seek_subimage(),
for those image formats that support them.)
The remainder of this section simply lists the full implementation of our JPEG reader, which
relies heavily on the open source jpeg-6b library to perform the actual JPEG decoding.
/*
Copyright 2008 Larry Gritz and the other authors and contributors.
All Rights Reserved.
Based on BSD-licensed software Copyright 2004 NVIDIA Corp.
...
*/
#include <cassert>
#include <cstdio>
extern "C" {
#include "jpeglib.h"
}
#include "imageio.h"
using namespace OpenImageIO;
#include "fmath.h"
#include "jpeg_pvt.h"
using namespace Jpeg_imageio_pvt;
bool
JpgInput::open (const std::string &name, ImageSpec &newspec,
const ImageSpec &config)
{
const ImageIOParameter *p = config.find_attribute ("_jpeg:raw",
TypeDesc::TypeInt);
m_raw = p && *(int *)p->data();
return open (name, newspec);
}
bool
JpgInput::open (const std::string &name, ImageSpec &newspec)
{
    ...
    // Request saving of EXIF and other special tags for later spelunking
    for (int mark = 0; mark < 16; ++mark)
        jpeg_save_markers (&m_cinfo, JPEG_APP0+mark, 0xffff);
    jpeg_save_markers (&m_cinfo, JPEG_COM, 0xffff);   // comment marker
    ...
std::string xml ((const char *)m->data, m->data_length);
OpenImageIO::decode_xmp (xml, m_spec);
}
else if (m->marker == (JPEG_APP0+13) &&
! strcmp ((const char *)m->data, "Photoshop 3.0"))
jpeg_decode_iptc ((unsigned char *)m->data);
else if (m->marker == JPEG_COM) {
if (! m_spec.find_attribute ("ImageDescription", TypeDesc::STRING))
m_spec.attribute ("ImageDescription",
std::string ((const char *)m->data));
}
}
newspec = m_spec;
return true;
}
bool
JpgInput::read_native_scanline (int y, int z, void *data)
{
if (m_raw)
return false;
if (y < 0 || y >= (int)m_cinfo.output_height) // out of range scanline
return false;
if (m_next_scanline > y) {
// User is trying to read an earlier scanline than the one we’re
// up to. Easy fix: close the file and re-open.
ImageSpec dummyspec;
int subimage = current_subimage();
if (! close () ||
! open (m_filename, dummyspec) ||
! seek_subimage (subimage, dummyspec))
return false; // Somehow, the re-open failed
assert (m_next_scanline == 0 && current_subimage() == subimage);
}
while (m_next_scanline <= y) {
// Keep reading until we’ve read the scanline we really need
jpeg_read_scanlines (&m_cinfo, (JSAMPLE **)&data, 1); // read one scanline
++m_next_scanline;
}
return true;
}
bool
JpgInput::close ()
{
if (m_fd != NULL) {
// N.B. don’t call finish_decompress if we never read anything
if (m_next_scanline > 0) {
// But if we’ve only read some scanlines, read the rest to avoid
// errors
std::vector<char> buf (spec().scanline_bytes());
char *data = &buf[0];
while (m_next_scanline < spec().height) {
jpeg_read_scanlines (&m_cinfo, (JSAMPLE **)&data, 1);
++m_next_scanline;
}
}
if (m_next_scanline > 0 || m_raw)
jpeg_finish_decompress (&m_cinfo);
jpeg_destroy_decompress (&m_cinfo);
fclose (m_fd);
m_fd = NULL;
}
init (); // Reset to initial state
return true;
}
void
JpgInput::jpeg_decode_iptc (const unsigned char *buf)
{
// APP13 blob doesn’t have to be IPTC info. Look for the IPTC marker,
// which is the string "Photoshop 3.0" followed by a null character.
if (strcmp ((const char *)buf, "Photoshop 3.0"))
return;
buf += strlen("Photoshop 3.0") + 1;
    ...
}
A plugin that writes a particular image file format must implement a subclass of ImageOutput
(described in Chapter 3). This is actually very straightforward and consists of the following
steps, which we will illustrate with a real-world example of writing a JPEG/JFIF plug-in.
1. Read the base class definition from imageio.h, just as with an image reader (see Sec-
tion 5.2).
2. Declare the following items:
(a) An integer called name_imageio_version that identifies the version of the Im-
ageIO protocol implemented by the plugin, defined in imageio.h as the constant
OPENIMAGEIO_PLUGIN_VERSION. This allows the library to be sure it is not load-
ing a plugin that was compiled against an incompatible version of OpenImageIO.
Note that if your plugin has both a reader and writer and they are compiled as sep-
arate modules (C++ source files), you don’t want to declare this in both modules;
either one is fine.
(b) A function named name_output_imageio_create that takes no arguments and
returns a new instance of your ImageOutput subclass. (Note that name is the name
of your format, and must match the name of the plugin itself.)
(c) An array of char * called name_output_extensions that contains the list of file
extensions that are likely to indicate a file of the right format. The list is terminated
by a NULL pointer.
All of these items must be inside an ‘extern "C"’ block in order to avoid name man-
gling by the C++ compiler. Depending on your compiler, you may need to use special
commands to dictate that the symbols will be exported in the DSO; we provide a special
DLLEXPORT macro for this purpose, defined in export.h.
Putting this all together, we get the following for our JPEG example:
extern "C" {
DLLEXPORT int jpeg_imageio_version = OPENIMAGEIO_PLUGIN_VERSION;
DLLEXPORT JpgOutput *jpeg_output_imageio_create () {
return new JpgOutput;
}
DLLEXPORT const char *jpeg_output_extensions[] = {
"jpg", "jpe", "jpeg", NULL
};
};
3. The definition and implementation of an ImageOutput subclass for this file format. It
must publicly inherit ImageOutput, and must overload the following methods which are
“pure virtual” in the ImageOutput base class:
(a) format_name() should return the name of the format, which ought to match the
name of the plugin and by convention is strictly lower-case and contains no whites-
pace.
(b) supports() should return true if its argument names a feature supported by your
format plugin, false if it names a feature not supported by your plugin. See Sec-
tion 3.3 for the list of feature names.
(c) open() should open the file and return true, or should return false if unable to do
so (including if the file was found but turned out not to be in the format that your
plugin is trying to implement).
(d) close() should close the file, if open.
(e) write_scanline() should write a single scanline to the file, translating from internal
to native data format and handling strides properly.
(f) The virtual destructor, which should close() the file if it is still open, in addition to
performing any other tear-down activities.
Additionally, your ImageOutput subclass may optionally choose to overload any of the
following methods, which are defined in the ImageOutput base class and only need to be
overloaded if the default behavior is not appropriate for your plugin:
(g) write_tile(), only if your format supports writing tiled images.
(h) write_rectangle(), only if your format supports writing arbitrary rectangles.
(i) write_image(), only if you have a more clever method of doing so than the default
implementation that calls write_scanline() or write_tile() repeatedly.
Here is how the class definition looks for our JPEG example. Note that the JPEG/JFIF
file format does not support multiple subimages or tiled images.
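A condensed sketch of the class shape (illustrative; the private members echo the fragment
shown in the full listing below):
class JpgOutput : public ImageOutput {
public:
    JpgOutput () { init(); }
    virtual ~JpgOutput () { close(); }
    virtual const char * format_name (void) const { return "jpeg"; }
    virtual bool supports (const std::string &feature) const { return false; }
    virtual bool open (const std::string &name, const ImageSpec &spec,
                       bool append=false);
    virtual bool write_scanline (int y, int z, TypeDesc format,
                                 const void *data, stride_t xstride);
    virtual bool close ();
private:
    FILE *m_fd;
    std::string m_filename;
    int m_next_scanline;      // Which scanline is the next to write?
    std::vector<unsigned char> m_scratch;
    struct jpeg_compress_struct m_cinfo;
    struct jpeg_error_mgr c_jerr;
    void init () { m_fd = NULL; }
};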
Your subclass implementations of open(), close(), and write_scanline() are the heart
of an ImageOutput implementation. (So is write_tile(), for those image formats that sup-
port tiled output.)
An ImageOutput implementation must properly handle all data formats and strides passed
to write_scanline() or write_tile(), unlike an ImageInput implementation, which only
needs to read scanlines or tiles in their native format and then have the super-class handle the
translation. But don’t worry, all the heavy lifting can be accomplished with the following helper
functions provided as protected member functions of ImageOutput:
const void * to_native_scanline (TypeDesc format, const void *data,
        stride_t xstride, std::vector<unsigned char> &scratch);
Convert a full scanline of pixels (pointed to by data) with the given format and strides
into contiguous pixels in the native format (described by the ImageSpec returned by the
spec() member function). The location of the newly converted data is returned, which
may either be the original data itself if no data conversion was necessary and the requested
layout was contiguous (thereby avoiding unnecessary memory copies), or may point into
memory allocated within the scratch vector passed by the user. In either case, the caller
doesn’t need to worry about thread safety or freeing any allocated memory (other than
eventually destroying the scratch vector).
const void * to_native_tile (TypeDesc format, const void *data,
        stride_t xstride, stride_t ystride, stride_t zstride,
        std::vector<unsigned char> &scratch);
Convert a full tile of pixels (pointed to by data) with the given format and strides into con-
tiguous pixels in the native format (described by the ImageSpec returned by the spec()
member function). The location of the newly converted data is returned, which may ei-
ther be the original data itself if no data conversion was necessary and the requested
layout was contiguous (thereby avoiding unnecessary memory copies), or may point into
memory allocated within the scratch vector passed by the user. In either case, the caller
doesn’t need to worry about thread safety or freeing any allocated memory (other than
eventually destroying the scratch vector).
const void * to_native_rectangle (int xmin, int xmax, int ymin, int ymax,
        int zmin, int zmax, TypeDesc format, const void *data,
        stride_t xstride, stride_t ystride, stride_t zstride,
        std::vector<unsigned char> &scratch);
Convert a rectangle of pixels (pointed to by data) with the given format, dimensions, and
strides into contiguous pixels in the native format (described by the ImageSpec returned
by the spec() member function). The location of the newly converted data is returned,
which may either be the original data itself if no data conversion was necessary and the
requested layout was contiguous (thereby avoiding unnecessary memory copies), or may
point into memory allocated within the scratch vector passed by the user. In either case,
the caller doesn’t need to worry about thread safety or freeing any allocated memory
(other than eventually destroying the scratch vector).
The remainder of this section simply lists the full implementation of our JPEG writer, which
relies heavily on the open source jpeg-6b library to perform the actual JPEG encoding.
/*
Copyright 2008 Larry Gritz and the other authors and contributors.
All Rights Reserved.
Based on BSD-licensed software Copyright 2004 NVIDIA Corp.
...
*/
#include <cassert>
#include <cstdio>
#include <vector>
extern "C" {
#include "jpeglib.h"
}
#include "imageio.h"
using namespace OpenImageIO;
#include "fmath.h"
#include "jpeg_pvt.h"
using namespace Jpeg_imageio_pvt;
private:
FILE *m_fd;
std::string m_filename;
int m_next_scanline; // Which scanline is the next to write?
std::vector<unsigned char> m_scratch;
struct jpeg_compress_struct m_cinfo;
struct jpeg_error_mgr c_jerr;
jvirt_barray_ptr *m_copy_coeffs;
struct jpeg_decompress_struct *m_copy_decompressor;
extern "C" {
DLLEXPORT ImageOutput *jpeg_output_imageio_create () {
return new JpgOutput;
}
DLLEXPORT const char *jpeg_output_extensions[] = {
"jpg", "jpe", "jpeg", NULL
};
};
bool
JpgOutput::open (const std::string &name, const ImageSpec &newspec,
bool append)
{
if (append) {
error ("JPG doesn’t support multiple images per file");
return false;
}
if (m_spec.nchannels == 3 || m_spec.nchannels == 4) {
m_cinfo.input_components = 3;
m_cinfo.in_color_space = JCS_RGB;
m_spec.nchannels = 3; // Force RGBA -> RGB
m_spec.alpha_channel = -1; // No alpha channel
} else if (m_spec.nchannels == 1) {
m_cinfo.input_components = 1;
m_cinfo.in_color_space = JCS_GRAYSCALE;
}
m_cinfo.density_unit = 2; // RESUNIT_INCH;
m_cinfo.X_density = 72;
m_cinfo.Y_density = 72;
m_cinfo.write_JFIF_header = true;
if (m_copy_coeffs) {
// Back door for copy()
jpeg_copy_critical_parameters (m_copy_decompressor, &m_cinfo);
DBG std::cout << "out open: copy_critical_parameters\n";
jpeg_write_coefficients (&m_cinfo, m_copy_coeffs);
DBG std::cout << "out open: write_coefficients\n";
} else {
// normal write of scanlines
jpeg_set_defaults (&m_cinfo); // default compression
DBG std::cout << "out open: set_defaults\n";
jpeg_set_quality (&m_cinfo, quality, TRUE); // baseline values
DBG std::cout << "out open: set_quality\n";
jpeg_start_compress (&m_cinfo, TRUE); // start working
DBG std::cout << "out open: start_compress\n";
}
m_next_scanline = 0; // next scanline we’ll write
if (m_spec.linearity == ImageSpec::sRGB)
m_spec.attribute ("Exif:ColorSpace", 1);
return true;
}
bool
JpgOutput::write_scanline (int y, int z, TypeDesc format,
const void *data, stride_t xstride)
{
y -= m_spec.y;
if (y != m_next_scanline) {
error ("Attempt to write scanlines out of order to %s",
m_filename.c_str());
return false;
}
if (y >= (int)m_cinfo.image_height) {
error ("Attempt to write too many scanlines to %s", m_filename.c_str());
return false;
}
assert (y == (int)m_cinfo.next_scanline);
    ...
    return true;
}
bool
JpgOutput::close ()
{
if (! m_fd)   // Already closed
        return true;
    ...
    return true;
}
bool
JpgOutput::copy_image (ImageInput *in)
{
    ...
    // Re-open the input spec, with special request that the JpgInput
    // will recognize as a request to merely open, but not start the
    // decompressor.
ImageSpec in_spec;
ImageSpec config_spec;
config_spec.attribute ("_jpeg:raw", 1);
in->open (in_name, in_spec, config_spec);
return true;
}
6.1 TIFF
6.2 JPEG
6.3 OpenEXR
6.4 HDR/RGBE
6.5 PNG
6.6 Zfile
6.7 Targa
6.8 ICO
6.9 BMP
• ImageCache presents an even simpler user interface than ImageInput — the only sup-
ported operations are asking for an ImageSpec describing a subimage in the file, re-
trieving a block of pixels, and locking/reading/releasing individual tiles. You refer
to images by filename only; you don’t need to keep track of individual file handles or
ImageInput objects. You don’t need to explicitly open or close files.
• The ImageCache is completely thread-safe; if multiple threads are accessing the same
file, the ImageCache internals will handle all the locking and resource sharing.
• No matter how many image files you are accessing, the ImageCache will maintain a
reasonable number of simultaneously-open files, automatically closing files that have not
been needed recently.
• No matter how much total pixel data is in all the image files you are dealing with, the
ImageCache will use only a small amount of memory. It does this by loading only the
individual tiles requested, and as memory allotments are approached, automatically re-
leasing the memory from tiles that have not been used recently.
In short, if you have an application that will need to read pixels from many large image files,
you can rely on ImageCache to manage all the resources for you. It is reasonable to access
thousands of image files totalling hundreds of GB of pixels, efficiently and using a memory
footprint on the order of 50 MB.
Below are some simple code fragments that show ImageCache in action:
#include "OpenImageIO/imagecache.h"
ImageCache *cache = ImageCache::create ();
...
// Request and hold a tile, do some work with its pixels, then release
ImageCache::Tile *tile;
tile = cache->get_tile ("file2.exr", 0, x, y, z);
// The tile won’t be freed until we release it, so this is safe:
TypeDesc format;
void *p = cache->tile_pixels (tile, format);
// Now p points to the raw pixels of the tile, whose data format
// is given by ’format’.
cache->release_tile (tile);
// Now cache is permitted to free the tile when needed
// Note that all files were referenced by name, we never had to open
// or close any files, and all the resource and memory management
// was automatic.
ImageCache::destroy (cache);
bool attribute (const std::string &name, TypeDesc type, const void *val)
Sets an attribute (i.e., a property or option) of the ImageCache. The name designates the
name of the attribute, type describes the type of data, and val is a pointer to memory
containing the new value for the attribute.
If the ImageCache recognizes a valid attribute name that matches the type specified,
the attribute will be set to the new value and attribute() will return true. If name
is not recognized as a valid attribute name, or if the types do not match (e.g., type is
TypeDesc::FLOAT but the named attribute is a string), the attribute will not be modified,
and attribute() will return false.
Here are examples:
ImageCache *cache;
...
int maxfiles = 50;
cache->attribute ("max_open_files", TypeDesc::INT, &maxfiles);
Note that when passing a string, you need to pass a pointer to the char*, not a pointer to
the first character. (Rationale: for an int attribute, you pass the address of the int. So
for a string, which is a char*, you need to pass the address of the string, i.e., a char**).
The complete list of attributes can be found at the end of this section.
Note that when passing a string, you need to pass a pointer to the char*, not a pointer
to the first character. Also, the char* will end up pointing to characters owned by the
ImageCache; the caller does not need to ever free the memory that contains the characters.
The complete list of attributes can be found at the end of this section.
string searchpath
The search path for images: a colon-separated list of directories that will be searched in
order for any image name that is not specified as an absolute path. (Default: no search
path.)
int autotile
This attribute controls how the image cache deals with images that are not “tiled” (i.e.,
are stored as scanlines).
If autotile is set to 0 (the default), an untiled image will be treated as if it were a single
tile of the resolution of the whole image. This is simple and fast, but can lead to poor
cache behavior if you are simultaneously accessing many large untiled images.
If autotile is nonzero (e.g., 64 is a good recommended value), any untiled images will
be read and cached as if they were constructed in tiles of size autotile × autotile.
This leads to slightly more expensive disk access if you are using only a few images, but
if you are using many untiled images, the caching will be much more efficient.
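For example, a sketch of turning autotiling on (assuming cache is an ImageCache pointer,
using the attribute() call described earlier):
int tilesize = 64;
cache->attribute ("autotile", TypeDesc::INT, &tilesize);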
int automip
If automip is set to 0 (the default), an untiled single-subimage file will only be able to
utilize that single subimage.
If automip is nonzero, any untiled, single-subimage (un-MIP-mapped) images will have
lower-resolution MIP-map levels generated on-demand if pixels are requested from the
lower-res subimages (that don’t really exist). Essentially this makes the ImageCache
pretend that the file is MIP-mapped even if it isn’t.
int forcefloat
If set to nonzero, all image tiles will be converted to float type when stored in the image
cache. This can be helpful especially for users of ImageBuf who want to simplify their
image manipulations to only need to consider float data.
The default is zero, meaning that image pixels are not forced to be float when in cache.
resolution The resolution of the image file, which is an array of 2 integers (described
as TypeDesc(INT,2)).
texturetype A string describing the type of texture of the given file, which describes
how the texture may be used (also which texture API call is probably the right one
for it). This currently may return one of: "unknown", "Plain Texture", "Volume
Texture", "Shadow", or "Environment".
textureformat A string describing the format of the given file, which describes the kind
of texture stored in the file. This currently may return one of: "unknown", "Plain
Texture", "Volume Texture", "Shadow", "CubeFace Shadow", "Volume Shadow",
"LatLong Environment", or "CubeFace Environment". Note that there are sev-
eral kinds of shadows and environment maps, all accessible through the same API
calls.
channels The number of color channels in the file (an integer).
format The native data format of the pixels in the file (an integer, giving the TypeDesc::BASETYPE
of the data). Note that this is not necessarily the same as the data format stored in
the image cache.
cachedformat The native data format of the pixels as stored in the image cache (an inte-
ger, giving the TypeDesc::BASETYPE of the data). Note that this is not necessarily
the same as the native data format of the file.
viewingmatrix The viewing matrix, which is a 4 × 4 matrix (an Imath::M44f, de-
scribed as TypeDesc(FLOAT,MATRIX)).
projectionmatrix The projection matrix, which is a 4 × 4 matrix (an Imath::M44f,
described as TypeDesc(FLOAT,MATRIX)).
Anything else – For all other data names, the metadata of the image file will be
searched for an item that matches both the name and data type.
Find a tile given by an image filename, subimage, and pixel coordinates. An opaque
pointer to the tile will be returned, or NULL if no such file (or tile within the file) exists or
can be read. The tile will not be purged from the cache until after release_tile() is
called on the tile pointer. This is thread-safe.
After finishing with a tile, release_tile() will allow it to once again be purged from
the tile cache if required.
For a tile retrieved by get_tile(), return a pointer to the pixel data itself, and also store in
format the data type that the pixels are internally stored in (which may be different than
the data type of the pixels in the disk file). This method should only be called on a tile that
has been requested by get_tile() but has not yet been released with release_tile().
Invalidate any loaded tiles or open file handles associated with the filename, so that any
subsequent queries will be forced to re-open the file or re-load any tiles (even those that
were previously loaded and would ordinarily be reused). A client might do this if, for
example, they are aware that an image being held in the cache has been updated on disk.
This is safe to do even if other procedures are currently holding reference-counted tile
pointers from the named image, but those procedures will not get updated pixels until
they release the tiles they are holding.
Invalidate all loaded tiles and open file handles, so that any subsequent queries will be
forced to re-open the file or re-load any tiles (even those that were previously loaded and
would ordinarily be reused). A client might do this if, for example, they are aware that
an image being held in the cache has been updated on disk. This is safe to do even if
other procedures are currently holding reference-counted tile pointers from the named
image, but those procedures will not get updated pixels until they release the tiles they
are holding. If force is true, everything will be invalidated, no matter how wasteful it is,
but if force is false, files will only be invalidated if their modification times have changed
since they were first opened.
std::string geterror ()
If any other API routines return false, indicating that an error has occurred, this routine
will retrieve the error and clear the error status. If no error has occurred since the last
time geterror() was called, it will return an empty string.
#include <ImathVec.h>
#include <ImathMatrix.h>
Please refer to the Ilmbase and OpenEXR documentation and header files for more complete
information about use of these types in your own application. However, note that you are not
strictly required to use these classes in your application — Imath::V3f has a memory layout
identical to float[3] and Imath::M44f has a memory layout identical to float[16], so as
long as your own internal vectors and matrices have the same memory layout, it’s ok to just cast
pointers to them when passing as arguments to TextureSystem methods.
This means that parameter x may either be a single value for use at each of the n texture lookups,
or it may have n different values of x.
If you want to pass a uniform value, you may do any of the following:
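For example, a sketch (assuming VaryingRef’s pointer-and-step constructor, where a step of
zero means the same value is reused for every lookup):
float s_uniform = 0.5f;
VaryingRef<float> s (&s_uniform, 0);              // step 0: one value for all points
float s_varying[100];                             // one value per point
VaryingRef<float> sv (s_varying, sizeof(float));  // step sizeof(float): varying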
8.2.3 TextureOptions
TextureOptions is a structure that holds many options controlling individual texture lookups.
Because each texture lookup API call takes a reference to a TextureOptions, the call signa-
tures remain uncluttered rather than having an ever-growing list of parameters, most of which
will never vary from their defaults. Here is a brief description of the data members of a
TextureOptions structure:
int nchannels
int firstchannel
The number of color channels to look up from the texture — for example, 1 (a single
channel) or 3 (an RGB triple) — and the index of the first channel to look up. The defaults
are firstchannel = 0, nchannels = 1.
Examples: To retrieve the first three channels (typically RGB), you should have nchannels
= 3, firstchannel = 0. To retrieve just the blue channel, you should have nchannels = 1,
firstchannel = 2.
Wrap swrap, twrap
Specify the wrap mode for 2D texture lookups (and 3D volume texture lookups, using the
additional zwrap field). These fields are ignored for shadow and environment lookups.
These specify what happens when texture coordinates are found to be outside the usual
[0, 1] range over which the texture is defined. Wrap is an enumerated type that may take
on any of the following values:
VaryingRef<float> bias
For shadow map lookups only, this gives the “shadow bias” amount.
VaryingRef<float> fill
Specifies the value that will be used for any color channels that are requested but not
found in the file. For example, if you perform a 3-channel lookup on a 1-channel texture,
the second two channels will get the fill value.
VaryingRef<int> samples
The number of samples to use for each lookup. Currently this only applies for certain
types of shadow maps.
VaryingRef<float> alpha
Specifies a destination for one additional channel to be looked up, the one immediately
following the return value (i.e., channel firstchannel + nchannels). The point of this is to
allow a 4-channel lookup, with the 4th channel put in an entirely different variable than
the 3-channel color. The default for alpha is to point to NULL, indicating that no extra
alpha channel should be retrieved.
Wrap zwrap
VaryingRef<float> zblur, zwidth
Specifies wrap, blur, and width for 3D volume texture lookups only.
bool attribute (const std::string &name, TypeDesc type, const void *val)
Sets an attribute (i.e., a property or option) of the TextureSystem. The name designates
the name of the attribute, type describes the type of data, and val is a pointer to memory
containing the new value for the attribute.
If the TextureSystem recognizes a valid attribute name that matches the type specified,
the attribute will be set to the new value and attribute() will return true. If name
is not recognized as a valid attribute name, or if the types do not match (e.g., type is
TypeDesc::FLOAT but the named attribute is a string), the attribute will not be modified,
and attribute() will return false.
Here are examples:
TextureSystem *ts;
...
int maxfiles = 50;
ts->attribute ("max_open_files", TypeDesc::INT, &maxfiles);
Note that when passing a string, you need to pass a pointer to the char*, not a pointer to
the first character. (Rationale: for an int attribute, you pass the address of the int. So
for a string, which is a char*, you need to pass the address of the string, i.e., a char**).
The complete list of attributes can be found at the end of this section.
Note that when passing a string, you need to pass a pointer to the char*, not a pointer
to the first character. Also, the char* will end up pointing to characters owned by the
TextureSystem; the caller does not need to ever free the memory that contains the char-
acters.
The complete list of attributes can be found at the end of this section.
matrix worldtocommon
The 4 × 4 matrix that provides the spatial transformation from “world” to a “common”
coordinate system. This is used for shadow map lookups, in which the shadow map itself
encodes the world coordinate system, but positions passed to shadow() are expressed in
“common” coordinates.
matrix commontoworld
The 4 × 4 matrix that is the inverse of worldtocommon — that is, it transforms points
from “common” to “world” coordinates.
You do not need to set commontoworld and worldtocommon separately; just setting either
one will implicitly set the other, since each is the inverse of the other.
This function returns true upon success, or false if the file was not found or could not
be opened by any available ImageIO plugin.
Perform filtered shadow map lookups on a collection of positions all at once, which may
be much more efficient than repeatedly calling the single-point version of shadow(). The
parameters P, dPdx, and dPdy are now VaryingRef’s that may refer to either a single or
an array of values, as are all the fields in the options.
Shadow lookups will be computed at indices beginactive through endactive (exclu-
sive of the end), but only at indices where runflags[i] is nonzero. Results will be stored
at corresponding positions of result, that is, result[i*n ... (i+1)*n-1] where n
is the number of channels requested by options.nchannels.
This function returns true upon success, or false if the file was not found or could not
be opened by any available ImageIO plugin.
resolution The resolution of the texture file, which is an array of 2 integers (described
as TypeDesc(INT,2)).
texturetype A string describing the type of texture of the given file, which describes
how the texture may be used (also which texture API call is probably the right one
for it). This currently may return one of: "unknown", "Plain Texture", "Volume
Texture", "Shadow", or "Environment".
textureformat A string describing the format of the given file, which describes the kind
of texture stored in the file. This currently may return one of: "unknown", "Plain
Texture", "Volume Texture", "Shadow", "CubeFace Shadow", "Volume Shadow",
"LatLong Environment", or "CubeFace Environment". Note that there are sev-
eral kinds of shadows and environment maps, all accessible through the same API
calls.
channels The number of color channels in the file (an integer).
viewingmatrix The viewing matrix, which is a 4 × 4 matrix (an Imath::M44f, de-
scribed as TypeDesc(FLOAT,MATRIX)).
projectionmatrix The projection matrix, which is a 4 × 4 matrix (an Imath::M44f,
described as TypeDesc(FLOAT,MATRIX)).
Anything else – For all other data names, the metadata of the image file will be
searched for an item that matches both the name and data type.
Return true if the file is found and could be opened by an available ImageIO plugin,
otherwise return false.
std::string geterror ()
If any other API routines return false, indicating that an error has occurred, this routine
will retrieve the error and clear the error status. If no error has occurred since the last
time geterror() was called, it will return an empty string.
Image Utilities
10 The iv Image Viewer
The iv program is a great interactive image viewer. Because iv is built on top of OpenImageIO,
it can display images in any format readable by the ImageInput plugins on hand.
More documentation on this later.
11 Getting Image Information with iinfo
The iinfo program will print either basic information (name, resolution, format) or detailed
information (including all metadata) found in images. Because iinfo is built on top of Open-
ImageIO, it will print information about images in any format readable by the ImageInput
plugins on hand.
The -s flag also prints the uncompressed sizes of each image file, plus a sum for all of the
images:
The -v option turns on verbose mode, which exhaustively prints all metadata about each
image:
$ iinfo -v img_6019m.jpg
If the input file has multiple subimages, extra information summarizing the subimages will
be printed:
$ iinfo img_6019m.tx
$ iinfo -v img_6019m.tx
...
Furthermore, the -a option will print information about all individual subimages:
$ iinfo -a ../sample-images/img_6019m.tx
$ iinfo -v -a img_6019m.tx
img_6019m.tx : 1024 x 1024, 3 channel, uint8 tiff
11 subimages: 1024x1024 512x512 256x256 128x128 64x64 32x32 16x16 8x8 4x4 2x2 1x1
subimage 0: 1024 x 1024, 3 channel, uint8 tiff
channel list: R, G, B
tile size: 64 x 64
...
subimage 1: 512 x 512, 3 channel, uint8 tiff
channel list: R, G, B
...
...
--help
Prints usage information to the terminal.
-v
Verbose mode: exhaustively print all metadata about each image.
-a
Print information about all individual subimages in each file.
-f
Print the filename as a prefix to every line. For example,
$ iinfo -v -f img_6019m.jpg
-m pattern
Match the pattern (specified as an extended regular expression) against metadata
field names and print only those fields whose names match. The default is to print all
metadata fields found in the file (if -v is given).
For example,
$ iinfo -v -f -m ImageDescription test*.jpg
Note: the -m option is probably not very useful without also using the -v and -f options.
--hash
Displays a SHA-1 hash of the pixel data of the image (and of each subimage if combined
with the -a flag).
-s
Show the image sizes, including a sum of all the listed images.
12 Converting Image Formats with iconvert
12.1 Overview
The iconvert program will read an image (from any file format for which an ImageInput
plugin can be found) and then write the image to a new file (in any format for which an
ImageOutput plugin can be found). In the process, iconvert can optionally change the file for-
mat or data format (for example, converting floating-point data to 8-bit integers), apply gamma
correction, switch between tiled and scanline orientation, or alter or add certain metadata to the
image.
The iconvert utility is invoked as follows:
iconvert [options] input output
Where input and output name the input image and desired output filename. The image files
may be of any format recognized by OpenImageIO (i.e., for which ImageInput plugins are
available). The file format of the output image will be inferred from the file extension of the
output filename (e.g., "foo.tif" will write a TIFF file).
Alternately, any number of files may be specified as follows:
iconvert --inplace [options] file1 file2 ...
When the --inplace option is used, any number of file names ≥ 1 may be specified, and
the image conversion commands are applied to each file in turn, with the output being saved
under the original file name. This is useful for applying the same conversion to many files,
or simply if you want to replace the input with the output rather than create a new file with a
different name.
Just use the -d option to specify a pixel data format. For example, assuming that in.tif uses
16-bit unsigned integer pixels, the following will convert it to 8-bit unsigned pixels:
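A representative command might be (the output file name is illustrative):
iconvert -d uint8 in.tif out.tif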
The following command writes a TIFF file, specifically using LZW compression:
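For example (file names are illustrative):
iconvert --compression lzw in.tif out.tif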
The following command writes its results as a JPEG file at a compression quality of 50
(pretty severe compression):
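For example (file names are illustrative):
iconvert --quality 50 in.tif out.jpg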
Gamma-correcting an image
The following gamma-corrects the pixels, raising all pixel values to x1/2.2 upon writing:
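For example (file names are illustrative):
iconvert -g 2.2 in.tif out.tif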
You can use the --inplace flag to cause the output to replace the input file, rather than create a
new file with a different name. For example, this will re-compress all of your TIFF files to use
ZIP compression (rather than whatever they currently are using):
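One way to do that might be:
iconvert --inplace --compression zip *.tif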
--help
Prints usage information to the terminal.
-v
Verbose status messages.
--inplace
Causes the output to replace the input file, rather than create a new file with a different
name.
Without this flag, iconvert expects two file names, which will be used to specify the
input and output files, respectively.
But when the --inplace option is used, any number of file names ≥ 1 may be specified, and
the image conversion commands are applied to each file in turn, with the output being
saved under the original file name. This is useful for applying the same conversion to
many files.
For example, the following example will add the caption “Hawaii vacation” to all JPEG
files in the current directory:
iconvert --inplace --adjust-time --caption "Hawaii vacation" *.jpg
-d datatype
Attempts to set the output pixel data type to one of: uint8, sint8, uint16, sint16,
half, float, double.
If the -d option is not supplied, the output data type will be the same as the data format
of the input file.
In either case, the output file format itself (implied by the file extension of the output
filename) may trump the request if the file format simply does not support the requested
data type.
-g gamma
Applies gamma correction to the pixel values as they are written, raising each value to
the power 1/gamma (as in the gamma-correction example above).
--sRGB
Explicitly tags the image as being in sRGB color space. Note that this does not alter pixel
values, it only marks which color space those values refer to (and only works for file
formats that understand such things). An example use of this command is if you have an
image that is not explicitly marked as being in any particular color space, but you know
that the values are sRGB.
--tile x y
Requests that the output file be tiled, with the given x × y tile size, if tiled images are
supported by the output format. By default, the output file will take on the tiledness and
tile size of the input file.
--scanline
Requests that the output file be scanline-oriented (even if the input file was tile-oriented),
if scanline orientation is supported by the output file format. By default, the output file
will be scanline if the input is scanline, or tiled if the input is tiled.
--separate
Requests “separate” (e.g., RRR...GGG...BBB) planar packing of channels in the output
file, if the output format supports it.
--contig
Requests contiguous (e.g., RGBRGBRGB...) interleaved packing of channels in the
output file, if the output format supports it.
--compression method
Sets the compression method for the output image. Each ImageOutput plugin will have
its own set of methods that it supports.
By default, the output image will use the same compression technique as the input image
(assuming it is supported by the output format, otherwise it will use the default compres-
sion method of the output plugin).
--quality q
Sets the compression quality, on a 1–100 floating-point scale. This only has an effect if
the particular compression method supports a quality metric (as JPEG does).
--no-copy-image
Ordinarily, iconvert will attempt to use ImageOutput::copy_image underneath to
avoid de/recompression or alteration of pixel values, unless other settings clearly contra-
dict this (such as any settings that must alter pixel values). The use of --no-copy-image
will force all pixels to be decompressed, read, and compressed/written, rather than copied
in compressed form. We’re not exactly sure when you would need to do this, but we put
it in just in case.
--adjust-time
When this flag is present, after writing the output, the resulting file’s modification time
will be adjusted to match any "DateTime" metadata in the image. After doing this, a
directory listing will show file times that match when the original image was created or
captured, rather than simply when iconvert was run. This has no effect on image files
that don’t contain any "DateTime" metadata.
--caption text
Sets the image metadata "ImageDescription". This has no effect if the output image
format does not support some kind of title, caption, or description metadata field. Be
careful to enclose text in quotes if you want your caption to include spaces or certain
punctuation!
--keyword text
Adds a keyword to the image metadata "Keywords". Any existing keywords will be
preserved, not replaced, and the new keyword will not be added if it is an exact duplicate
of existing keywords. This has no effect if the output image format does not support some
kind of keyword field.
Be careful to enclose text in quotes if you want your keyword to include spaces or certain
punctuation. For image formats that have only a single field for keywords, OpenImageIO
will concatenate the keywords, separated by semicolon (‘;’), so don’t use semicolons
within your keywords.
--clear-keywords
Clears all existing keywords in the image.
--orientation orient
Explicitly sets the image’s "Orientation" metadata to a numeric value (see Section B.2
for the numeric codes). This only changes the metadata field that specifies how the image
should be displayed, it does NOT alter the pixels themselves, and so has no effect for
image formats that don’t support some kind of orientation metadata.
--rotcw
--rotccw
--rot180
Adjusts the image’s "Orientation" metadata by rotating it 90◦ clockwise, 90◦ counter-
clockwise, or 180◦, respectively, compared to its current setting. This only
changes the metadata field that specifies how the image should be displayed, it does NOT
alter the pixels themselves, and so has no effect for image formats that don’t support some
kind of orientation metadata.
13 Searching Image Metadata with igrep
The igrep program searches one or more image files for metadata that match a string or regular
expression.
--help
Prints usage information to the terminal.
-d
Print directory names as it recurses. This only happens if the -r option is also used.
-E
Interpret the pattern as an extended regular expression (just like egrep or grep -E).
-f
Match the expression against the filename, as well as the metadata within the file.
-i
Ignore upper/lower case distinctions. Without this flag, the expression matching will be
case-sensitive.
-l
Simply list the matching files by name, suppressing the normal output that would include
the metadata name and values that matched. For example:
$ igrep Jack *.jpg
bar.jpg: Keywords = Carly; Jack
foo.jpg: Keywords = Jack
test7.jpg: ImageDescription = Jack on vacation
-r
Recurse into directories. If this flag is present, any files specified that are directories will
have all image files contained therein searched for a match (and so on, recursively).
-v
Invert the sense of matching, to select image files that do not match the expression.
14 Comparing Images with idiff
14.1 Overview
The idiff program compares two images, printing a report about how different they are and
optionally producing a third image that records the pixel-by-pixel differences between them.
There are a variety of options and ways to compare (absolute pixel difference, various thresholds
for warnings and errors, and also an optional perceptual difference metric).
Because idiff is built on top of OpenImageIO, it can compare two images of any format
readable by ImageInput plugins on hand. They may have any (or different) file formats, data
formats, etc.
The “mean error” is the average difference (per channel, per pixel). The “max error” is
the largest difference in any pixel channel, and will point out on which pixel and channel it
was found. It will also give a count of how many pixels were above the warning and failure
thresholds.
The metadata of the two images (e.g., the comments) are not currently compared; only
differences in pixel values are taken into consideration.
But what happens if just a few pixels are very different? Maybe you want that to fail, also.
The following adjustment will fail if at least 10% of pixels differ by 0.004, or if any pixel differs
by more than 0.25:
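One way to express that (the image names are illustrative) might be:
idiff -fail 0.004 -failpercent 10 -hardfail 0.25 a.jpg b.jpg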
If none of the failure criteria are met, and yet some pixels are still different, it will still give
a WARNING. But you can also raise the warning threshold in a similar way:
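For example, something like:
idiff -warn 0.004 -warnpercent 3 -fail 0.004 -failpercent 10 -hardfail 0.25 a.jpg b.jpg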
The above example will PASS as long as fewer than 3% of pixels differ by more than 0.004. If
it does, it will be a WARNING as long as no more than 10% of pixels differ by 0.004 and no
pixel differs by more than 0.25, otherwise it is a FAILURE.
The -abs flag saves the absolute value of the differences (i.e., all positive values or zero).
If you omit the -abs, pixels in which a.jpg have smaller values than b.jpg will be negative in
the difference image (be careful in this case of using a file format that doesn’t support negative
values).
You can also scale the difference image with the -scale option, making small differences
easier to see. And the -od flag can be used to output a difference image only if the comparison
fails, but not if the
images pass within the designated threshold (thus saving you the trouble and space of saving a
black image).
General options
--help
Prints usage information to the terminal.
-v
Verbose output — more detail about what it finds when comparing images. (Currently,
there is no extra info to print.)
-a
Compare all subimages. Without this flag, only the first subimage of each file will be
compared.
-fail A
-failpercent B
-hardfail C
Sets the threshold for FAILURE: if more than B% of pixels (on a 0-100 floating point scale)
are greater than A different, or if any pixels are more than C different. The defaults are to
fail if more than 0% (any) pixels differ by more than 0.000001 (1e-6), and C is infinite.
-warn A
-warnpercent B
-hardwarn C
Sets the threshold for WARNING: if more than B% of pixels (on a 0-100 floating point scale)
are greater than A different, or if any pixels are more than C different. The defaults are to
warn if more than 0% (any) pixels differ by more than 0.000001 (1e-6), and C is infinite.
-p
Does an additional test on the images to attempt to see if they are perceptually different
(whether you are likely to discern a difference visually), using Hector Yee’s metric. If
this option is enabled, the statistics will additionally show a report on how many pixels
failed the perceptual test, and the test overall will fail if more than the “fail percentage”
failed the perceptual test.
-o outputfile
Outputs a difference image to the designated file. Each pixel of the difference image is
the value of the corresponding pixel from image1 minus the value of the corresponding
pixel from image2.
The file extension of the output file is used to determine the file format to write (e.g.,
"out.tif" will write a TIFF file, "out.jpg" will write a JPEG, etc.). The data format
of the output file will be the format of whichever of the two input images has higher precision
(or the maximum precision that the designated output format is capable of, if that is less
than either of the input images).
Note that for pixels whose value is lower in image1 than in image2, this will result in
negative values (which may be clamped to zero if the image format does not support
negative values), unless the -abs option is also used.
-abs
Will cause the output image to consist of the absolute value of the difference between the
two input images (so all values in the difference image ≥ 0).
-scale factor
Scales the values in the difference image by the given (floating point) factor. The main
use for this is to make small actual differences more visible in the resulting difference
image by giving a large scale factor.
-od
Causes a difference image to be produced only if the image comparison fails. That is, even
if the -o option is used, images that are within the comparison threshold will not write
out a useless black (or nearly black) difference image.
15 Making Tiled MIP-Map Texture Files with maketx
15.1 Overview
The maketx program will read an image (from any file format for which an ImageInput plugin
can be found) and then write it in a form in which it will have high performance when used by
TextureSystem (Chapter 8). This involves converting it to tiled (versus scanline) orientation,
writing multiple subimages at different resolutions (MIP-map), and setting a variety of header
or metadata fields appropriately for texture maps.
The maketx utility is invoked as follows:
maketx [options] input... -o output
Where input and output name the input image and desired output filename. The input files
may be of any image format recognized by OpenImageIO (i.e., for which ImageInput plugins
are available). The file format of the output image will be inferred from the file extension of the
output filename (e.g., "foo.tif" will write a TIFF file).
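For example (hypothetical filenames), the following converts a scanline OpenEXR image into a
tiled, MIP-mapped TIFF texture:

    maketx lightprobe.exr -o lightprobe.tif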
--help
Prints usage information to the terminal.
-v
Verbose status messages, including runtime statistics and timing.
-o outputname
Sets the name of the output texture.
--format formatname
Specifies the image format of the output file (e.g., “tiff”, “OpenEXR”, etc.). If --format
is not used, maketx will guess based on the file extension of the output filename; if it is
not a recognized format extension, TIFF will be used by default.
-d datatype
Attempts to set the output pixel data type to one of: uint8, sint8, uint16, sint16,
half, float, double.
If the -d option is not supplied, the output data type will be the same as the data format
of the input file.
In either case, the output file format itself (implied by the file extension of the output
filename) may trump the request if the file format simply does not support the requested
data type.
--tile x y
Specifies the tile size of the output texture. If not specified, maketx will make 64 × 64
tiles.
--separate
Forces “separate” (e.g., RRR...GGG...BBB) packing of channels in the output file. With-
out this option specified, “contiguous” (e.g., RGBRGBRGB...) packing of channels will
be used for those file formats that support it.
--update
Ordinarily, textures are created unconditionally (which could take several seconds for
large input files if read over a network) and will be stamped with the current time.
The --update option enables update mode: if the output file already exists and has the
same time stamp as the input file, the texture will not be recreated. If the output file does
not exist or has a different time stamp than the input file, then the texture will be created
and given the time stamp of the input file.
--wrap wrapmode
--swrap wrapmode --twrap wrapmode
Sets the default wrap mode for the texture, which determines the behavior when the tex-
ture is sampled outside the [0, 1] range. Valid wrap modes are: black, clamp, periodic,
mirror. The default, if none is set, is black. The --wrap option sets the wrap mode in
both directions simultaneously, while the --swrap and --twrap options may be used to set them
individually in the s (horizontal) and t (vertical) directions.
Although this sets the default wrap mode for a texture, note that the wrap mode may have
an override specified in the texture lookup at runtime.
--noresize
Ordinarily, the input image will be resized (by rounding up) to be a power-of-two reso-
lution in each dimension. The --noresize option prevents this from happening, instead
keeping the highest resolution MIP-map level the same resolution as the original input
image.
--nomipmap
Causes the output to not be MIP-mapped, i.e., it will only contain the highest-resolution level.
--ingamma value
--outgamma value
Not currently implemented.
--hash
Computes a SHA-1 hash on the input file’s pixels and embeds this hash in the
“ImageDescription” metadata of the output texture. This is useful in helping the
TextureSystem identify duplicate textures at runtime.
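Putting several of these options together (hypothetical filenames and values), the following
builds a periodic-wrap, half-float texture with 64 × 64 tiles in update mode, embedding a
SHA-1 hash of the source pixels:

    maketx -v --update --tile 64 64 --wrap periodic -d half --hash env.exr -o env.tif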
Appendices
A Building OpenImageIO
B Metadata Conventions
The ImageSpec class, described thoroughly in Section 2.2, provides a description of those basic
properties of an image that are essential across all formats — resolution, number of channels,
pixel data format, etc. Individual images may have additional data, stored as name/value pairs
in the extra_attribs field. Though literally anything can be stored in extra_attribs — it’s
specifically designed for format- and user-extensibility — this chapter establishes some
guidelines and lays out all of the field names that OpenImageIO understands.
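As a brief sketch of how this looks in practice (hypothetical values; the usual header and
namespace boilerplate is omitted), an application attaches such metadata simply by calling
ImageSpec::attribute with the field name and value before passing the spec to ImageOutput::open:

    ImageSpec spec (640, 480, 3, TypeDesc::UINT8);
    spec.attribute ("ImageDescription", "Hypothetical example frame");
    spec.attribute ("Artist", "Jane Doe");
    spec.attribute ("DateTime", "2008:12:31 07:30:00");
    spec.attribute ("Orientation", 1);        // 1 = normal (top to bottom, left to right)
    spec.attribute ("compression", "lzw");    // request LZW, if the output format supports it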
"ImageDescription" : string
The image description, title, caption, or comments.
"Keywords" : string
Semicolon-separated keywords describing the contents of the image. (Semicolons are
used rather than commas because of the common case of a comma being part of a keyword
itself, e.g., “Kurt Vonnegut, Jr.” or “Washington, DC.”)
"Artist" : string
The artist, creator, or owner of the image.
"Copyright" : string
Any copyright notice or owner of the image.
"DateTime" : string
The creation date of the image, in the following format: YYYY:MM:DD HH:MM:SS (exactly
19 characters long, not including a terminating NULL). For example, 7:30am on Dec 31,
2008 is encoded as "2008:12:31 07:30:00".
"DocumentName" : string
The name of an overall document that this image is a part of.
"Software" : string
The software that was used to create the image.
"HostComputer" : string
The name or identity of the computer that created the image.
"Orientation" : int
By default, image pixels are ordered from the top of the display to the bottom, and within
each scanline, from left to right (i.e., the same ordering as English text and scan progres-
sion on a CRT). But the "Orientation" field can suggest that it should be displayed
with a different orientation, according to the TIFF/EXIF conventions:
1 normal (top to bottom, left to right)
2 flipped horizontally (top to bottom, right to left)
3 rotate 180◦ (bottom to top, right to left)
4 flipped vertically (bottom to top, left to right)
5 transposed (left to right, top to bottom)
6 rotated 90◦ clockwise (right to left, top to bottom)
7 transverse (right to left, bottom to top)
8 rotated 90◦ counter-clockwise (left to right, bottom to top)
"PixelAspectRatio" : float
The aspect ratio (x/y) of the individual pixels, with square pixels being 1.0 (the default).
"XResolution" : float
"YResolution" : float
"ResolutionUnit" : string
The number of horizontal (x) and vertical (y) pixels per resolution unit. This ties the
image to a physical size (where applicable, such as with a scanned image, or an image
that will eventually be printed).
Different file formats may dictate different resolution units. For example, the TIFF Im-
ageIO plugin supports "none", "in", and "cm".
"BitsPerSample" : int
"planarconfig" : string
"contig" indicates that the file has contiguous pixels (RGB RGB RGB...), whereas
"separate" indicate that the file stores each channel separately (RRR...GGG...BBB...).
Note that only contiguous pixels are transmitted through the OpenImageIO APIs, but this
metadata indicates how it is (or should be) stored in the file, if possible.
"compression" : string
Indicates the type of compression the file uses. Supported compression modes will vary
from plugin to plugin, and each plugin should document the modes it supports. If
ImageOutput::open is called with an ImageSpec that specifies a compression mode not
supported by that ImageOutput, it will choose a reasonable default. As an example, the
TIFF plugin supports "none", "lzw", "ccittrle", "zip" (the default), and "packbits".
"CompressionQuality" : int
Indicates the quality of compression to use (0–100), for those plugins and compression
methods that allow a variable amount of compression, with higher numbers indicating
higher image fidelity.
"Make" : string
For a captured or scanned image, the make of the camera or scanner.
"Model" : string
For a captured or scanned image, the model of the camera or scanner.
"ExposureTime" : float
The exposure time (in seconds) of the captured image.
"FNumber" : float
The f/stop of the camera when it captured the image.
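Conversely, here is a minimal sketch of querying some of these fields after opening a file for
reading (hypothetical filename; boilerplate omitted), using ImageSpec's get_int_attribute,
get_float_attribute, and get_string_attribute convenience methods, each of which returns a
default value if the attribute is not present:

    ImageInput *in = ImageInput::create ("photo.jpg");
    ImageSpec spec;
    in->open ("photo.jpg", spec);
    int orient         = spec.get_int_attribute ("Orientation", 1);        // default: normal
    float exposure     = spec.get_float_attribute ("ExposureTime", 0.0f);  // seconds
    std::string artist = spec.get_string_attribute ("Artist");
    in->close ();
    delete in;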
Several standard metadata items are very helpful for images that are intended to be used as
textures (especially for OpenImageIO's TextureSystem).
"textureformat" : string
The kind of texture that this image is intended to be. We suggest the following names:
"Plain Texture" Ordinary 2D texture
"Volume Texture" 3D volumetric texture
"Shadow" Ordinary z-depth shadow map
"CubeFace Shadow" Cube-face shadow map
"Volume Shadow" Volumetric (“deep”) shadow map
"LatLong Environment" Latitude-longitude (rectangular) environment map
"CubeFace Environment" Cube-face environment map
"wrapmodes" : string
Gives the intended texture wrap mode, indicating what happens with texture coordinates
outside the [0...1] range. We suggest the following names: "black", "periodic",
"clamp", "mirror". If the wrap mode is different in each direction, they should simply
be separated by a comma. For example, "black" means black wrap in both directions,
whereas "clamp,periodic" means to clamp in u and be periodic in v.
"fovcot" : float
The cotangent (x/y) of the field of view of the original image (which may not be the same
as the aspect ratio of the pixels of the texture, which may have been resized).
"worldtocamera" : matrix44
For shadow maps or rendered images this item (of type TypeDesc::PT_MATRIX) is the
world-to-camera matrix describing the camera position.
"worldtoscreen" : matrix44
For shadow maps or rendered images this item (of type TypeDesc::PT_MATRIX) is the
world-to-screen matrix describing the full projection of the 3D view onto a [−1...1] ×
[−1...1] 2D domain.
"updirection" : string
For environment maps, indicates which direction is “up” (valid values are "y" or "z"), to
disambiguate conventions for environment map orientation.
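For instance, a sketch of the attributes an application (or a tool like maketx) might set on an
output ImageSpec to mark it as an ordinary 2D texture (hypothetical values):

    spec.attribute ("textureformat", "Plain Texture");
    spec.attribute ("wrapmodes", "clamp,periodic");   // clamp in u, periodic in v
    spec.attribute ("fovcot", 1.0f);                   // field-of-view cotangent of the source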
"Exif:ExposureProgram" : int
The exposure program used to set exposure when the picture was taken:
0 unknown
1 manual
2 normal program
3 aperture priority
4 shutter priority
5 Creative program (biased toward depth of field)
6 Action program (biased toward fast shutter speed)
7 Portrait mode (closeup photo with background out of focus)
8 Landscape mode (background in focus)
"Exif:SpectralSensitivity" : string
The camera’s spectral sensitivity, using the ASTM conventions.
"Exif:ISOSpeedRatings" : int
The ISO speed and ISO latitude of the camera as specified in ISO 12232.
"Exif:DateTimeOriginal" : string
Date and time that the original image data was generated (in "YYYY:MM:DD HH:MM:SS"
format).
"Exif:DateTimeDigitized" : string
Date and time that the image was stored as digital data (in "YYYY:MM:DD HH:MM:SS"
format).
"Exif:CompressedBitsPerPixel" : float
The compression mode used, measured in compressed bits per pixel.
"Exif:ShutterSpeedValue" : float
"Exif:ApertureValue" : float
"Exif:BrightnessValue" : float
"Exif:ExposureBiasValue" : float
"Exif:MaxApertureValue" : float
"Exif:SubjectDistance" : float
"Exif:MeteringMode" : int
0 unknown
1 average
2 center-weighted average
3 spot
4 multi-spot
5 pattern
6 partial
255 other
"Exif:LightSource" : int
0 unknown
1 daylight
2 tungsten (incandescent light)
4 flash
9 fine weather
10 cloudy weather
11 shade
12 daylight fluorescent (D 5700-7100K)
13 day white fluorescent (N 4600-5400K)
14 cool white fluorescent (W 3900-4500K)
15 white fluorescent (WW 3200-3700K)
17 standard light A
18 standard light B
19 standard light C
20 D55
21 D65
22 D75
23 D50
24 ISO studio tungsten
255 other light source
"Exif:Flash" int
A sum of:
1 if the flash fired
0 no strobe return detection function
4 strobe return light was not detected
6 strobe return light was detected
8 compulsory flash firing
16 compulsory flash suppression
24 auto-flash mode
32 no flash function (0 if flash function present)
64 red-eye reduction supported (0 if no red-eye reduction mode)
"Exif:FocalLength" : float
Actual focal length of the lens, in mm.
"Exif:SubsecTime" : string
Fractions of a second to augment the "DateTime" (expressed as text of the digits to the
right of the decimal).
"Exif:SubsecTimeOriginal" : string
Fractions of a second to augment the "Exif:DateTimeOriginal" (expressed as text of
the digits to the right of the decimal).
"Exif:SubsecTimeDigitized" : string
Fractions of a second to augment the "Exif:DateTimeDigital" (expressed as text of
the digits to the right of the decimal).
"Exif:PixelXDimension" : int
"Exif:PixelYDimension" : int
The x and y dimensions of the valid (meaningful) pixel area of the image, regardless of any
padding that may be present in the file.
"Exif:FlashEnergy" : float
Strobe energy when the image was captured, measured in Beam Candle Power Seconds
(BCPS).
"Exif:FocalPlaneXResolution" : float
"Exif:FocalPlaneYResolution" : float
"Exif:FocalPlaneResolutionUnit" : int
The number of pixels in the x and y dimensions, per resolution unit, on the camera's focal
plane. The resolution unit code is 2 for inches.
"Exif:ExposureIndex" : float
The exposure index selected on the camera.
"Exif:SensingMethod" : int
The image sensor type on the camera:
1 undefined
2 one-chip color area sensor
3 two-chip color area sensor
4 three-chip color area sensor
5 color sequential area sensor
7 trilinear sensor
8 color trilinear sensor
"Exif:FileSource" : int
Set to 3 if the image was captured by a digital camera; otherwise, it should not be present.
"Exif:SceneType" : int
Set to 1 if the image was directly photographed; otherwise, it should not be present.
"Exif:CustomRendered" : int
Set to 0 for a normal process, 1 if some custom processing has been performed on the
image data.
"Exif:ExposureMode" : int
The exposure mode:
0 auto
1 manual
2 auto-bracket
"Exif:WhiteBalance" : int
Set to 0 for auto white balance, 1 for manual white balance.
"Exif:DigitalZoomRatio" : float
The digital zoom ratio used when the image was shot.
"Exif:FocalLengthIn35mmFilm" : int
The equivalent focal length of a 35mm camera, in mm.
"Exif:SceneCaptureType" : int
The type of scene that was shot:
0 standard
1 landscape
2 portrait
3 night scene
"Exif:GainControl" : float
The degree of overall gain adjustment:
0 none
1 low gain up
2 high gain up
3 low gain down
4 high gain down
"Exif:Contrast" : int
The direction of contrast processing applied by the camera:
0 normal
1 soft
2 hard
"Exif:Saturation" : int
The direction of saturation processing applied by the camera:
0 normal
1 low saturation
2 high saturation
"Exif:Sharpness" : int
The direction of sharpness processing applied by the camera:
0 normal
1 soft
2 hard
"Exif:SubjectDistanceRange" : int
The distance to the subject:
0 unknown
1 macro
2 close
3 distant
"Exif:ImageUniqueID" : string
A unique identifier for the image, as 32 ASCII hexadecimal digits representing a 128-bit
number.
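As a small sketch of examining a few of these Exif fields (assuming spec is an ImageSpec
already filled in by ImageInput::open), note in particular that "Exif:Flash" is a bit field
that must be decoded:

    float focal  = spec.get_float_attribute ("Exif:FocalLength", 0.0f);   // mm
    int metering = spec.get_int_attribute ("Exif:MeteringMode", 0);       // 0 = unknown
    int flash    = spec.get_int_attribute ("Exif:Flash", 0);
    bool flash_fired      = (flash & 1) != 0;    //  1: the flash fired
    bool redeye_supported = (flash & 64) != 0;   // 64: red-eye reduction supported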
"GPS:LatitudeRef" : string
Whether the "GPS:Latitude" tag refers to north or south: "N" or "S".
"GPS:Latitude" : float[3]
The degrees, minutes, and seconds of latitude (see also "GPS:LatitudeRef").
"GPS:LongitudeRef" : string
Whether the "GPS:Longitude" tag refers to east or west: "E" or "W".
"GPS:Longitude" : float[3]
The degrees, minutes, and seconds of longitude (see also "GPS:LongitudeRef").
"GPS:AltitudeRef" : string
A value of 0 indicates that the altitude is above sea level, 1 indicates below sea level.
"GPS:Altitude" : float
Absolute value of the altitude, in meters, relative to sea level (see "GPS:AltitudeRef"
for whether it’s above or below sea level).
"GPS:TimeStamp" : float[3]
Gives the hours, minutes, and seconds, in UTC.
"GPS:Satellites" : string
Information about what satellites were visible.
"GPS:Status" : string
"A" indicates a measurement in progress, "V" indicates measurement interoperability.
"GPS:MeasureMode" : string
"2" indicates a 2D measurement, "3" indicates a 3D measurement.
"GPS:DOP" : float
Data degree of precision.
"GPS:SpeedRef" : string
Indicates the units of the related "GPS:Speed" tag: "K" for km/h, "M" for miles/h, "N"
for knots.
"GPS:Speed" : float
Speed of the GPS receiver (see "GPS:SpeedRef" for the units).
"GPS:TrackRef" : string
Describes the meaning of the "GPS:Track" field: "T" for true direction, "M" for magnetic
direction.
"GPS:Track" : float
Direction of the GPS receiver movement (from 0–359.99). The related "GPS:TrackRef"
indicates whether it’s true or magnetic.
"GPS:ImgDirectionRef" : string
Describes the meaning of the "GPS:ImgDirection" field: "T" for true direction, "M" for
magnetic direction.
"GPS:ImgDirection" : float
Direction of the image when captured (from 0–359.99). The related "GPS:ImgDirectionRef"
indicates whether it’s true or magnetic.
"GPS:MapDatum" : string
The geodetic survey data used by the GPS receiver.
"GPS:DestLatitudeRef" : string
Whether the "GPS:DestLatitude" tag refers to north or south: "N" or "S".
"GPS:DestLatitude" : float[3]
The degrees, minutes, and seconds of latitude of the destination (see also "GPS:DestLatitudeRef").
"GPS:DestLongitudeRef" : string
Whether the "GPS:DestLongitude" tag refers to east or west: "E" or "W".
"GPS:DestLongitude" : float[3]
The degrees, minutes, and seconds of longitude of the destination (see also "GPS:DestLongitudeRef").
"GPS:DestBearingRef" : string
Describes the meaning of the "GPS:DestBearing" field: "T" for true direction, "M" for
magnetic direction.
"GPS:DestBearing" : float
Bearing to the destination point (from 0–359.99). The related "GPS:DestBearingRef"
indicates whether it’s true or magnetic.
"GPS:DestDistanceRef" : string
Indicates the units of the related "GPS:DestDistance" tag: "K" for km, "M" for miles,
"N" for knots.
"GPS:DestDistance" : float
Distance to the destination (see "GPS:DestDistanceRef" for the units).
"GPS:ProcessingMethod" : string
Processing method information.
"GPS:AreaInformation" : string
Name of the GPS area.
"GPS:DateStamp" : string
Date according to the GPS device, in format "YYYY:MM:DD".
"GPS:Differential" : int
If 1, indicates that differential correction was applied.
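Since "GPS:Latitude" and "GPS:Longitude" are stored as degree/minute/second triples, a small
hypothetical helper such as the following (boilerplate omitted) converts one to signed decimal
degrees, using the corresponding "...Ref" tag to choose the sign:

    // dms holds {degrees, minutes, seconds}; ref is "N"/"S" (latitude) or "E"/"W" (longitude).
    float gps_to_decimal_degrees (const float dms[3], const std::string &ref)
    {
        float deg = dms[0] + dms[1] / 60.0f + dms[2] / 3600.0f;
        return (ref == "S" || ref == "W") ? -deg : deg;
    }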
"IPTC:ObjectName" : string
The name of the object in the picture.
"IPTC:Instructions" : string
Special instructions for handling the image.
"IPTC:AuthorsPosition" : string
The job title or position of the creator of the image.
"IPTC:City" : string
"IPTC:State" : string
"IPTC:Country" : string
The city, state, and country of the location of the image.
"IPTC:Headline" : string
Any headline that is meant to accompany the image.
"IPTC:Provider" : string
The provider of the image, or credit line.
"IPTC:Source" : string
The source of the image.
"IPTC:Contact" : string
The contact information for the image (possibly including name, address, email, etc.).
"IPTC:CaptionWriter" : string
The name of the person who wrote the caption or description of the image.
Channel One of several data values present in each pixel. Examples include red, green, blue,
alpha, etc. The data in one channel of a pixel may be represented by a single number,
whereas the pixel as a whole requires one number for each channel.
Client A client (as in “client application”) is a program or library that uses OpenImageIO or
any of its constituent libraries.
Data format The type of numerical representation used to store a piece of data. Examples
include 8-bit unsigned integers, 32-bit floating-point numbers, etc.
Image File Format The specification and data layout of an image on disk. For example, TIFF,
JPEG/JFIF, OpenEXR, etc.
Metadata Data about data. As used in OpenImageIO, this means information about an image,
beyond describing the values of the pixels themselves. Examples include the name of the
artist that created the image, the date that an image was scanned, the camera settings used
when a photograph was taken, etc.
Native data format The data format used in the disk file representing an image. Note that with
OpenImageIO, this may be different than the data format used by an application to store
the image in the computer’s RAM.
Pixel One pixel element of an image, consisting of one number describing each channel of data
at a particular location in an image.
Scanline A single horizontal row of pixels of an image. See also tile.
Scanline Image An image whose data layout on disk is organized by breaking the image up
into horizontal scanlines, typically with the ability to read or write an entire scanline at
once. See also tiled image.
Tile A rectangular region of pixels of an image. A rectangular tile is more spatially coherent
than a scanline that stretches across the entire image — that is, a pixel’s neighbors are
most likely in the same tile, whereas a pixel in a scanline image will typically have most
of its immediate neighbors on different scanlines (requiring additional scanline reads in
order to access them).
Tiled Image An image whose data layout on disk is organized by breaking the image up into
rectangular regions of pixels called tiles. All the pixels in a tile can be read or written at
once, and individual tiles may be read or written separately from other tiles.
Volume Image A 3-D set of pixels that has not only horizontal and vertical dimensions, but
also a “depth” dimension.
Index
attribute, 71, 83
crop windows, 21
data formats, 7
getattribute, 72, 84
iconvert, 107
idiff, 115
igrep, 113
iinfo, 103
Image Cache, 69–77
Image I/O API, 7–48
ImageOutput, 15
ImageSpec, 9
iv, 101
maketx, 119
Orientation, 128
overscan, 21