Introduction To Programming With OpenCV
Gady Agam
Abstract:
The purpose of this document is to get you started quickly with OpenCV without having
to go through lengthy reference manuals. Once you understand these basics you will be
able to consult the OpenCV manuals as needed.
Contents
• Introduction
o Description of OpenCV
o Resources
o OpenCV naming conventions
o Compilation instructions
o Example C Program
• GUI commands
o Window management
o Input handling
• Basic OpenCV data structures
o Image data structure
o Matrices and vectors
o Other data structures
• Working with images
o Allocating and releasing images
o Reading and writing images
o Accessing image elements
o Image conversion
o Drawing commands
• Working with matrices
o Allocating and releasing matrices
o Accessing matrix elements
o Matrix/vector operations
• Working with video sequences
o Capturing a frame from a video sequence
o Getting/setting frame information
o Saving a video file
Introduction
Description of OpenCV
• General description
o Open source computer vision library in C/C++.
o Optimized and intended for real-time applications.
o OS/hardware/window-manager independent.
o Generic image/video loading, saving, and acquisition.
o Both low and high level API.
o Provides an interface to Intel's Integrated Performance Primitives (IPP) with
processor-specific optimization (Intel processors).
• Features:
o Image data manipulation (allocation, release, copying, setting,
conversion).
o Image and video I/O (file and camera based input, image/video file
output).
o Matrix and vector manipulation and linear algebra routines (products,
solvers, eigenvalues, SVD).
o Various dynamic data structures (lists, queues, sets, trees, graphs).
o Basic image processing (filtering, edge detection, corner detection,
sampling and interpolation, color conversion, morphological operations,
histograms, image pyramids).
o Structural analysis (connected components, contour processing, distance
transform, various moments, template matching, Hough transform,
polygonal approximation, line fitting, ellipse fitting, Delaunay
triangulation).
o Camera calibration (finding and tracking calibration patterns, calibration,
fundamental matrix estimation, homography estimation, stereo
correspondence).
o Motion analysis (optical flow, motion segmentation, tracking).
o Object recognition (eigen-methods, HMM).
o Basic GUI (display image/video, keyboard and mouse handling, scrollbars).
o Image labeling (line, conic, polygon, text drawing).
• OpenCV modules:
o cv - Main OpenCV functions.
o cvaux - Auxiliary (experimental) OpenCV functions.
o cxcore - Data structures and linear algebra support.
o highgui - GUI functions.
Resources
• Reference manuals:
o <opencv-root>/docs/index.htm
• Web resources:
o Official webpage:
http://www.intel.com/technology/computing/opencv/
o Software download:
http://sourceforge.net/projects/opencvlibrary/
• Books:
o Open Source Computer Vision Library by Gary R. Bradski, Vadim
Pisarevsky, and Jean-Yves Bouguet, Springer, 1st ed. (June, 2006).
• Sample programs for video processing (in <opencv-root>/samples/c/):
o color tracking: camshiftdemo
o point tracking: lkdemo
o motion segmentation: motempl
o edge detection: laplace
• Sample programs for image processing (in <opencv-root>/samples/c/):
o edge detection: edge
o segmentation: pyramid_segmentation
o morphology: morphology
o histogram: demhist
o distance transform: distrans
o ellipse fitting: fitellipse
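OpenCV naming conventions
• Function naming conventions:
cvActionTargetMod(...)
• Matrix data types:
CV_<bit_depth>(S|U|F)C<number_of_channels>
S = Signed integer
U = Unsigned integer
F = Float
E.g.: CV_32FC1 is a 32-bit float single-channel matrix.
• Image data types:
IPL_DEPTH_<bit_depth>(S|U|F)
E.g.: IPL_DEPTH_8U is an 8-bit unsigned image.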
• Header files:
#include <cv.h>
#include <cvaux.h>
#include <highgui.h>
#include <cxcore.h> // unnecessary - included in cv.h
Compilation instructions
• Linux:
g++ hello-world.cpp -o hello-world \
-I /usr/local/include/opencv -L /usr/local/lib \
-lm -lcv -lhighgui -lcvaux
• Windows:
Example C Program
////////////////////////////////////////////////////////////////////////
//
// hello-world.cpp
//
// This is a simple, introductory OpenCV program. The program reads an
// image from a file, inverts it, and displays the result.
//
////////////////////////////////////////////////////////////////////////
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <cv.h>
#include <highgui.h>
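int main(int argc, char *argv[])
{
IplImage* img = 0;
int height,width,step,channels;
uchar *data;
int i,j,k;
if(argc<2){
printf("Usage: main <image-file-name>\n\7");
exit(0);
}
// load an image
img=cvLoadImage(argv[1]);
if(!img){
printf("Could not load image file: %s\n",argv[1]);
exit(0);
}
// get the image data
height = img->height;
width = img->width;
step = img->widthStep;
channels = img->nChannels;
data = (uchar *)img->imageData;
// create a window
cvNamedWindow("mainWin", CV_WINDOW_AUTOSIZE);
cvMoveWindow("mainWin", 100, 100);
// invert the image
for(i=0;i<height;i++) for(j=0;j<width;j++) for(k=0;k<channels;k++)
data[i*step+j*channels+k]=255-data[i*step+j*channels+k];
// show the image, wait for a key, and release the image
cvShowImage("mainWin", img);
cvWaitKey(0);
cvReleaseImage(&img);
return 0;
}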
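The inversion loop in the program indexes pixels through the row step rather than the image width, which matters whenever rows are padded. The following is a minimal sketch in plain C, independent of OpenCV, that applies the same indexing to a raw buffer. The helper name invert_image and its parameters are assumptions made for this illustration only.

```c
#include <stddef.h>

/* Invert every sample of an interleaved 8-bit image in place.
 * step is the row length in bytes (the role played by widthStep);
 * it can exceed width*channels when rows are padded for alignment.
 * invert_image is a hypothetical helper, not an OpenCV function. */
void invert_image(unsigned char *data, int height, int width,
                  int channels, int step)
{
    for (int i = 0; i < height; i++) {
        unsigned char *row = data + (size_t)i * step;  /* start of row i */
        for (int j = 0; j < width * channels; j++)
            row[j] = (unsigned char)(255 - row[j]);    /* padding bytes are not touched */
    }
}
```

Called on a 2-row, 3-pixel-wide single-channel buffer with a step of 4, the sketch inverts the three pixels of each row and leaves the fourth (padding) byte of each row unchanged.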
GUI commands
Window management
• Create and position a window:
cvNamedWindow("win1", CV_WINDOW_AUTOSIZE);
cvMoveWindow("win1", 100, 100); // offset from the UL corner of the screen
• Load an image:
IplImage* img=0;
img=cvLoadImage(fileName);
if(!img) printf("Could not load image file: %s\n",fileName);
• Display an image:
cvShowImage("win1",img);
A byte image is assumed to have values in the range [0, 255]. A float image is assumed to have values in the range [0, 1].
• Close a window:
cvDestroyWindow("win1");
• Resize a window:
Input handling
• Handle mouse events:
o Define a mouse handler:
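void mouseHandler(int event, int x, int y, int flags, void* param)
{
switch(event){
case CV_EVENT_LBUTTONUP:
printf("Left button up\n");
break;
}
}
o Register the handler:
int mouseParam=5;
cvSetMouseCallback("win1",mouseHandler,&mouseParam);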
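• Handle keyboard events:
o Get keyboard input without blocking:
int key;
key=cvWaitKey(10); // wait 10ms for input
o Get keyboard input with blocking:
int key;
key=cvWaitKey(0); // wait indefinitely for input
o The main keyboard event loop:
while(1){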
key=cvWaitKey(10);
if(key==27) break;
switch(key){
case 'h':
...
break;
case 'i':
...
break;
}
}
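• Handle trackbar events:
o Register the handler (trackbarHandler is a function of the form void trackbarHandler(int pos)):
int trackbarVal=25;
int maxVal=100;
cvCreateTrackbar("bar1", "win1", &trackbarVal, maxVal, trackbarHandler);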
Basic OpenCV data structures
Image data structure
• IPL image:
IplImage
|-- int nChannels; // Number of color channels (1,2,3,4)
|-- int depth; // Pixel depth in bits:
| // IPL_DEPTH_8U, IPL_DEPTH_8S,
| // IPL_DEPTH_16U,IPL_DEPTH_16S,
| // IPL_DEPTH_32S,IPL_DEPTH_32F,
| // IPL_DEPTH_64F
|-- int width; // image width in pixels
|-- int height; // image height in pixels
|-- char* imageData; // pointer to aligned image data
| // Note that color images are stored in BGR order
|-- int dataOrder; // 0 - interleaved color channels,
| // 1 - separate color channels
| // cvCreateImage can only create interleaved images
|-- int origin; // 0 - top-left origin,
| // 1 - bottom-left origin (Windows bitmaps style)
|-- int widthStep; // size of aligned image row in bytes
|-- int imageSize; // image data size in bytes = height*widthStep
|-- struct _IplROI *roi; // image ROI. When not NULL, specifies the image
| // region to be processed.
|-- char *imageDataOrigin; // pointer to the unaligned origin of image data
| // (needed for correct image deallocation)
|
|-- int align; // Alignment of image rows: 4 or 8 byte alignment
| // OpenCV ignores this and uses widthStep instead
|-- char colorModel[4]; // Color model - ignored by OpenCV
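The relation between width, widthStep, and imageSize can be made concrete with a little arithmetic. The following is a minimal sketch in plain C, independent of OpenCV, that pads a row to a chosen alignment and derives the buffer size from it. The helper names aligned_row_bytes and image_bytes are assumptions for illustration, and the alignment actually used by a given OpenCV build should be read from widthStep rather than recomputed.

```c
#include <stddef.h>

/* Row size in bytes after padding each row up to a multiple of align.
 * align must be a power of two (4 and 8 are the values listed for
 * the align field). Hypothetical helper, not an OpenCV function. */
size_t aligned_row_bytes(int width, int channels, int bytes_per_channel,
                         int align)
{
    size_t raw = (size_t)width * channels * bytes_per_channel;
    return (raw + (size_t)align - 1) & ~((size_t)align - 1);
}

/* Total buffer size, mirroring imageSize = height*widthStep. */
size_t image_bytes(int height, size_t width_step)
{
    return (size_t)height * width_step;
}
```

For example, a 3-pixel-wide, 3-channel byte image has 9 bytes of pixel data per row, which pads to 12 bytes with 4-byte alignment and to 16 bytes with 8-byte alignment. This is why pixel addressing should use widthStep instead of width*nChannels.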
Matrices and vectors
• Matrices:
CvMat // 2D array
|-- int type; // elements type (uchar,short,int,float,double) and flags
|-- int step; // full row length in bytes
|-- int rows, cols; // dimensions
|-- int height, width; // alternative dimensions reference
|-- union data;
|-- uchar* ptr; // data pointer for an unsigned char matrix
|-- short* s; // data pointer for a short matrix
|-- int* i; // data pointer for an integer matrix
|-- float* fl; // data pointer for a float matrix
|-- double* db; // data pointer for a double matrix
• Generic arrays:
• Scalars:
CvScalar
|-- double val[4]; //4D vector
Initializer function:
CvScalar s = cvScalar(double val0, double val1=0, double val2=0, double val3=0);
Example:
CvScalar s = cvScalar(20.0);
s.val[0]=10.0;
Note that the initializer function has the same name as the data structure, except that
it starts with a lowercase character. It is not a C++ constructor.
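Other data structures
• Points:
CvPoint2D32f p = cvPoint2D32f(0.0, 0.0); // 2D point with float coordinates
p.x=5.0;
p.y=5.0;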
• Rectangular dimensions:
CvSize size = cvSize(width,height);
Working with images
Allocating and releasing images
• Release an image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,1);
cvReleaseImage(&img);
• Clone an image:
IplImage* img1=cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,1);
IplImage* img2;
img2=cvCloneImage(img1);
Reading and writing images
• Read an image from a file:
IplImage* img=0;
img=cvLoadImage(fileName);
if(!img) printf("Could not load image file: %s\n",fileName);
Supported image formats: BMP, DIB, JPEG, JPG, JPE, PNG, PBM, PGM, PPM,
SR, RAS, TIFF, TIF
By default, the loaded image is forced to be a 3-channel color image. This default
can be modified by passing a flag:
img=cvLoadImage(fileName,flag); // flag>0: 3-channel color, flag=0: grayscale, flag<0: load as is
• Write an image to a file with cvSaveImage. The output file format is determined by the file name extension.
Accessing image elements
• Assume that you need to access the k-th channel of the pixel at the i-th row and
j-th column.
• Indirect access: (General, but inefficient, access to any type of image)
o For a single-channel byte image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,1);
CvScalar s;
s=cvGet2D(img,i,j); // get the (i,j) pixel value
printf("intensity=%f\n",s.val[0]);
s.val[0]=111;
cvSet2D(img,i,j,s); // set the (i,j) pixel value
o For a multi-channel float (or byte) image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_32F,3);
CvScalar s;
s=cvGet2D(img,i,j); // get the (i,j) pixel value
printf("B=%f, G=%f, R=%f\n",s.val[0],s.val[1],s.val[2]);
s.val[0]=111;
s.val[1]=111;
s.val[2]=111;
cvSet2D(img,i,j,s); // set the (i,j) pixel value
• Direct access: (Efficient access, but error prone)
o For a single-channel byte image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,1);
((uchar *)(img->imageData + i*img->widthStep))[j]=111;
o For a multi-channel byte image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,3);
((uchar *)(img->imageData + i*img->widthStep))[j*img->nChannels + 0]=111; // B
((uchar *)(img->imageData + i*img->widthStep))[j*img->nChannels + 1]=112; // G
((uchar *)(img->imageData + i*img->widthStep))[j*img->nChannels + 2]=113; // R
o For a multi-channel float image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_32F,3);
((float *)(img->imageData + i*img->widthStep))[j*img->nChannels + 0]=111; // B
((float *)(img->imageData + i*img->widthStep))[j*img->nChannels + 1]=112; // G
((float *)(img->imageData + i*img->widthStep))[j*img->nChannels + 2]=113; // R
• Direct access using a pointer: (Simplified and efficient access under limiting
assumptions)
o For a single-channel byte image:
IplImage* img = cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,1);
int height = img->height;
int width = img->width;
int step = img->widthStep/sizeof(uchar);
uchar* data = (uchar *)img->imageData;
data[i*step+j] = 111;
o For a multi-channel byte image:
IplImage* img = cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,3);
int height = img->height;
int width = img->width;
int step = img->widthStep/sizeof(uchar);
int channels = img->nChannels;
uchar* data = (uchar *)img->imageData;
data[i*step+j*channels+k] = 111;
o For a multi-channel float image:
IplImage* img = cvCreateImage(cvSize(640,480),IPL_DEPTH_32F,3);
int height = img->height;
int width = img->width;
int step = img->widthStep/sizeof(float);
int channels = img->nChannels;
float * data = (float *)img->imageData;
data[i*step+j*channels+k] = 111;
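The pointer-based float access divides widthStep by sizeof(float), which is exact only when the row size in bytes is a multiple of the element size. That is one of the limiting assumptions mentioned above. The following is a minimal sketch in plain C, independent of OpenCV, that avoids the division by applying the byte offset before the cast. The helper name float_sample and its parameters are assumptions made for this illustration.

```c
#include <stddef.h>

/* Address of channel k of the pixel at row i, column j in an
 * interleaved float image. width_step is in bytes, so the row offset
 * is applied to a byte pointer before the cast to float*.
 * float_sample is a hypothetical helper, not an OpenCV function. */
float *float_sample(char *image_data, int width_step, int channels,
                    int i, int j, int k)
{
    float *row = (float *)(image_data + (size_t)i * width_step);
    return &row[j * channels + k];
}
```

With a 3-channel image whose rows occupy 32 bytes (8 floats), the sample at row 1, column 1, channel 2 is element 8 + 1*3 + 2 = 13 of the underlying float buffer.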
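• Direct access using a C++ wrapper:
o Define a C++ wrapper (a minimal sketch; the names Image and imgp are illustrative) for single-channel byte images, multi-channel byte images, and multi-channel float images:
template<class T> class Image {
IplImage* imgp;
public:
Image(IplImage* img=0) : imgp(img) {}
T* operator[](int rowIndx) { // pointer to the first element of row rowIndx
return (T*)(imgp->imageData + rowIndx*imgp->widthStep); }
};
typedef struct{ unsigned char b,g,r; } RgbPixel;
typedef struct{ float b,g,r; } RgbPixelFloat;
typedef Image<RgbPixel> RgbImage;
typedef Image<RgbPixelFloat> RgbImageFloat;
typedef Image<unsigned char> BwImage;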
o For a single-channel byte image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,1);
BwImage imgA(img);
imgA[i][j] = 111;
o For a multi-channel byte image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_8U,3);
RgbImage imgA(img);
imgA[i][j].b = 111;
imgA[i][j].g = 111;
imgA[i][j].r = 111;
o For a multi-channel float image:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_32F,3);
RgbImageFloat imgA(img);
imgA[i][j].b = 111;
imgA[i][j].g = 111;
imgA[i][j].r = 111;
Image conversion
• Convert to a grayscale or color byte-image:
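• Convert a color image to grayscale with a direct conversion (cimg is a 3-channel byte image, gimg a single-channel byte image of the same size):
RgbImage cimgA(cimg);
BwImage gimgA(gimg);
for(i=0;i<cimg->height;i++) for(j=0;j<cimg->width;j++)
gimgA[i][j]= (uchar)(cimgA[i][j].b*0.114 +
cimgA[i][j].g*0.587 +
cimgA[i][j].r*0.299);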
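The direct conversion is a weighted sum of the three color channels. The following is a minimal sketch in plain C, independent of OpenCV, of that per-pixel computation with the same weights. The helper name bgr_to_gray is an assumption for illustration, and the sketch rounds to the nearest integer, whereas the loop above truncates.

```c
/* Weighted sum of blue, green and red using the weights of the direct
 * conversion (0.114, 0.587, 0.299). Adding 0.5 before the cast rounds
 * to the nearest integer; plain truncation can turn a value that is
 * mathematically 255 into 254 because of floating-point error.
 * bgr_to_gray is a hypothetical helper, not an OpenCV function. */
unsigned char bgr_to_gray(unsigned char b, unsigned char g, unsigned char r)
{
    double y = b * 0.114 + g * 0.587 + r * 0.299;
    return (unsigned char)(y + 0.5);
}
```

Because the three weights sum to 1, a gray input pixel (equal B, G, and R) keeps its intensity, and green contributes most to the result.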
Drawing commands
• Draw a box:
• Draw a circle:
• Draw a polygon or a set of polygons:
cvPolyLine(img,curveArr,nCurvePts,nCurves,isCurveClosed,cvScalar(0,255,255),lineWidth);
• Draw a set of filled polygons:
cvFillPoly(img,curveArr,nCurvePts,nCurves,cvScalar(0,255,255));
• Add text:
CvFont font;
double hScale=1.0;
double vScale=1.0;
int lineWidth=1;
cvInitFont(&font,CV_FONT_HERSHEY_SIMPLEX|CV_FONT_ITALIC,
hScale,vScale,0,lineWidth);
Other possible fonts:
CV_FONT_HERSHEY_SIMPLEX, CV_FONT_HERSHEY_PLAIN,
CV_FONT_HERSHEY_DUPLEX, CV_FONT_HERSHEY_COMPLEX,
CV_FONT_HERSHEY_TRIPLEX, CV_FONT_HERSHEY_COMPLEX_SMALL,
CV_FONT_HERSHEY_SCRIPT_SIMPLEX, CV_FONT_HERSHEY_SCRIPT_COMPLEX
Working with matrices
Allocating and releasing matrices
• General:
o OpenCV has a C interface to matrix operations. There are many
alternatives that have a C++ interface (which is more convenient) and are
as efficient as OpenCV.
o Vectors are represented in OpenCV as matrices with one dimension
equal to 1.
o Matrices are stored row by row, with each row aligned to 4 bytes.
• Allocate a matrix:
Example:
CvMat* M = cvCreateMat(4,4,CV_32FC1);
• Release a matrix:
CvMat* M = cvCreateMat(4,4,CV_32FC1);
cvReleaseMat(&M);
• Clone a matrix:
CvMat* M1 = cvCreateMat(4,4,CV_32FC1);
CvMat* M2;
M2=cvCloneMat(M1);
• Initialize a matrix:
double a[] = { 1, 2, 3, 4,
5, 6, 7, 8,
9, 10, 11, 12 };
CvMat Ma=cvMat(3, 4, CV_64FC1, a);
Alternatively:
CvMat Ma;
cvInitMatHeader(&Ma, 3, 4, CV_64FC1, a);
• Initialize a matrix to identity:
CvMat* M = cvCreateMat(4,4,CV_32FC1);
cvSetIdentity(M); // does not seem to be working properly
Accessing matrix elements
• Direct matrix element access assuming rows are stored without padding:
CvMat* M = cvCreateMat(4,4,CV_32FC1);
int n = M->cols;
float *data = M->data.fl;
data[i*n+j] = 3.0;
• Direct matrix element access allowing for alignment gaps between rows:
CvMat* M = cvCreateMat(4,4,CV_32FC1);
int step = M->step/sizeof(float);
float *data = M->data.fl;
(data+i*step)[j] = 3.0;
• Direct matrix element access of an initialized matrix:
double a[16];
CvMat Ma = cvMat(3, 4, CV_64FC1, a);
a[i*4+j] = 2.0; // Ma(i,j)=2.0;
Matrix/vector operations
• Matrix-matrix operations:
• Vector products:
Note that Va, Vb, and Vc must be 3-element vectors in a cross product.
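The 3-element restriction follows from the definition of the cross product. The following is a minimal sketch in plain C, independent of OpenCV, of the dot and cross products on 3-element arrays. The helper names cross3 and dot3 are assumptions for illustration and are not OpenCV functions.

```c
/* c = a x b for 3-element vectors. The output array c must not
 * overlap a or b. cross3 is a hypothetical helper name. */
void cross3(const double a[3], const double b[3], double c[3])
{
    c[0] = a[1] * b[2] - a[2] * b[1];
    c[1] = a[2] * b[0] - a[0] * b[2];
    c[2] = a[0] * b[1] - a[1] * b[0];
}

/* Dot product of two 3-element vectors. dot3 is a hypothetical name. */
double dot3(const double a[3], const double b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

A quick sanity check is that the cross product of the x and y unit vectors is the z unit vector, and that the result is orthogonal to both inputs (both dot products are zero).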
• Inhomogeneous linear system solver:
CvMat* A = cvCreateMat(3,3,CV_32FC1);
CvMat* x = cvCreateMat(3,1,CV_32FC1);
CvMat* b = cvCreateMat(3,1,CV_32FC1);
cvSolve(A, b, x); // solve (Ax=b) for x; A, b, x are already pointers
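To make the meaning of solving Ax=b concrete, the following is a minimal sketch in plain C, independent of OpenCV, that solves a 3x3 system with Cramer's rule. This is a technique chosen for brevity and is not how cvSolve is implemented. The helper names det3 and solve3, the row-major array layout, and the singularity threshold are all assumptions made for this illustration.

```c
/* Determinant of a 3x3 matrix stored row by row in m[9]. */
static double det3(const double m[9])
{
    return m[0] * (m[4] * m[8] - m[5] * m[7])
         - m[1] * (m[3] * m[8] - m[5] * m[6])
         + m[2] * (m[3] * m[7] - m[4] * m[6]);
}

/* Solve A x = b for a 3x3 row-major A using Cramer's rule.
 * Returns 0 on success and -1 if A is (numerically) singular.
 * solve3 is a hypothetical helper, not an OpenCV function. */
int solve3(const double A[9], const double b[3], double x[3])
{
    double d = det3(A);
    double mag = d < 0 ? -d : d;
    if (mag < 1e-12)            /* illustrative threshold */
        return -1;
    for (int col = 0; col < 3; col++) {
        double m[9];
        for (int r = 0; r < 3; r++)
            for (int c = 0; c < 3; c++)   /* replace column col by b */
                m[r * 3 + c] = (c == col) ? b[r] : A[r * 3 + c];
        x[col] = det3(m) / d;
    }
    return 0;
}
```

Cramer's rule is only practical for very small systems. For larger or ill-conditioned systems a decomposition-based solver, such as the one OpenCV provides, is the better choice.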
• Eigen analysis (of a symmetric matrix):
CvMat* A = cvCreateMat(3,3,CV_32FC1);
CvMat* E = cvCreateMat(3,3,CV_32FC1);
CvMat* l = cvCreateMat(3,1,CV_32FC1);
cvEigenVV(A, E, l); // l = eigenvalues of A (descending order)
// E = corresponding eigenvectors (rows)
• Singular value decomposition:
CvMat* A = cvCreateMat(3,3,CV_32FC1);
CvMat* U = cvCreateMat(3,3,CV_32FC1);
CvMat* D = cvCreateMat(3,3,CV_32FC1);
CvMat* V = cvCreateMat(3,3,CV_32FC1);
cvSVD(A, D, U, V, CV_SVD_U_T|CV_SVD_V_T); // A = U D V^T
The flags cause U and V to be returned transposed (it does not work well without
the transpose flags).
Working with video sequences
Capturing a frame from a video sequence
• Capturing a frame:
IplImage* img = 0;
// capture is a CvCapture* opened beforehand (e.g. with cvCaptureFromCAM)
if(!cvGrabFrame(capture)){ // capture a frame
printf("Could not grab a frame\n\7");
exit(0);
}
img=cvRetrieveFrame(capture); // retrieve the captured frame
To obtain images from several cameras simultaneously, first grab an image from
each camera. Retrieve the captured images after the grabbing is complete.
• Release the capture source:
cvReleaseCapture(&capture);
Note that the image captured by the device is allocated/released by the capture
function. There is no need to release it explicitly.
Getting/setting frame information
• Get capture device properties:
The total frame count is relevant for video files only. It does not seem to
work properly.
• Get frame information:
Get the position of the captured frame in [msec] with respect to the first frame, or
get its index where the first frame starts with an index of 0. The relative position
(ratio) is 0 in the first frame and 1 in the last frame. This ratio is valid only for
capturing images from a file.
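The three position measures described above are related by simple arithmetic. The following is a minimal sketch in plain C, independent of OpenCV, that assumes a constant frame rate and a known frame count. The helper names frame_msec and frame_ratio are assumptions for illustration, and the values a capture backend actually reports may differ.

```c
/* Position in milliseconds of the frame with the given 0-based index,
 * assuming a constant frame rate. Hypothetical helper name. */
double frame_msec(int index, double fps)
{
    return index * 1000.0 / fps;
}

/* Relative position: 0 for the first frame and 1 for the last one.
 * Returns 0 when there are fewer than two frames.
 * Hypothetical helper name. */
double frame_ratio(int index, int frame_count)
{
    if (frame_count < 2)
        return 0.0;
    return (double)index / (frame_count - 1);
}
```

For a 50-frame file at 25 frames per second, frame index 25 lies one second (1000 ms) after the first frame, and the last frame (index 49) has a ratio of 1.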
• Setting the capture position applies only when capturing from a file. It does not
seem to work properly.
Saving a video file
• Initialize a video writer:
CvVideoWriter *writer = 0;
int isColor = 1;
int fps = 25; // or 30
int frameW = 640; // 744 for firewire cameras
int frameH = 480; // 480 for firewire cameras
writer=cvCreateVideoWriter("out.avi",CV_FOURCC('P','I','M','1'),
fps,cvSize(frameW,frameH),isColor);
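The codec is selected with a four-character code. The following is a minimal sketch in plain C, independent of OpenCV, of how four characters can be packed into a single integer with the first character in the lowest byte, which is the layout the CV_FOURCC macro is understood to use. The function name fourcc is an assumption and stands in for the macro only for illustration.

```c
/* Pack four characters into one 32-bit code, with the first character
 * in the lowest byte. fourcc is a hypothetical stand-in for the
 * CV_FOURCC macro and is not part of OpenCV. */
unsigned int fourcc(char c1, char c2, char c3, char c4)
{
    return ((unsigned int)(c1 & 255))
         | ((unsigned int)(c2 & 255) << 8)
         | ((unsigned int)(c3 & 255) << 16)
         | ((unsigned int)(c4 & 255) << 24);
}
```

With ASCII characters, the code 'P','I','M','1' used above packs to the hexadecimal value 0x314D4950. Which codes are accepted depends on the codecs installed on the system.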
• Write the video file:
IplImage* img = 0;
int i, nFrames = 50;
for(i=0;i<nFrames;i++){
cvGrabFrame(capture); // capture a frame
img=cvRetrieveFrame(capture); // retrieve the captured frame
cvWriteFrame(writer,img); // add the frame to the file
}
To view the captured frames during capture, add the following in the loop:
cvShowImage("mainWin", img);
key=cvWaitKey(20); // wait 20 ms
Note that without the 20 ms delay the captured sequence is not displayed
properly.
• Release the video writer:
cvReleaseVideoWriter(&writer);