
ECEN 448: Real-time DSP, Final Project, Texas A&M University (Fall 2010), Instructor: Dr. D. Kundur

Final Project: Face Recognition


Objectives of this Project
This project introduces students to the practical capabilities and challenges of automated facial recognition. Good luck!

Instructions
This project can be conducted in groups of at most two people. It is acceptable to discuss the project with other groups and to help other groups. However, the report has to be done separately for each group and no copying of code is allowed between groups. You should use the last lab session on December 2, 2010 to ask the TA any questions you may have and to attempt to conduct the project. You must attend the December 2, 2010 lab session for full points.

Deliverables
Please provide a group report presenting the following:
o A picture of the *.mdl and/or *.m file(s) that you used to generate the results. Please note that you can use either MATLAB or Simulink to conduct this project.
o A picture of your test and reference image(s) that you used to generate the results.
o Documentation and explanation of your results.
o A discussion of the challenges you encountered and any changes in models you made to obtain good results.

You will be graded on the completeness, accuracy and presentation quality of your report.

Grading and Due Date


This project report is due at noon on Tuesday December 14, 2010. Please note that this is a strict deadline, as the grades must be reported shortly after this date.

Acknowledgment


William Luh developed a preliminary version of this project description.





Introduction and Background
Nowadays when you get your passport photo taken, instead of saying "smile!" the photographer will say "don't smile!" Why are these photographers so cranky? Hey, don't blame them; blame the DSP used in airport face recognition devices that can't handle a smile. If you haven't guessed what this project is about from the above introduction (and the title), then be prepared to smile, because you are actually going to implement a very rudimentary face recognition device! Most face recognition devices follow the same block diagram as in Figure 1.
[Figure 1: General Block Diagram for Face Recognition. The reference image of the face passes through feature extraction; the test image of the face passes through feature extraction and then image registration; the two results feed a comparison block.]

Before you dive into the details, note that most face recognition devices that search huge databases are implemented via software, so that the software can access face-image databases stored locally or over some network. Since software implementations allow greater computational power (compare your Pentium to the DSP you're using for your labs!), we must water down the algorithms used substantially, and thus your implementation will lose the efficacy enjoyed by commercial face recognition software. For example, the face recognition software used in Las Vegas casinos can recognize the faces of patrons while they move across the gambling floor or stand at the craps table. For our algorithm, we must capture only the face, looking straight at the camera, with no make-up, and of course no smile! This guarantees higher detectability and a lower false-alarm rate. In addition, the background should be completely black [1], and the camera should always be at the same distance from the face; pretty much like taking a photo for your driver's license or passport. After learning the basics in this project, you can take these ideas and implement more complex and more effective algorithms.

Images as Matrices
Before we begin describing each block used in this project, let's take a moment to review images. Recall that a sampled time-series signal can be considered a 1-D vector. Naturally, a digital grayscale image can be considered a matrix. If you want to get fancier and include color images, then you would have a 2-D matrix for every color plane: red, green, blue (RGB). Different formats represent color images differently. For example, instead of using the RGB format, one may use the YCbCr format, which also represents color using 3 planes: luminance (which describes the intensity), and chrominance for blue and red. In this project we'll deal only with grayscale images, and thus consider one plane.
[1] You may want to take the picture initially with a white background, and then use photo-editing software such as Photoshop to make the background black. Lighting from various sources, including your flash, may give the background an unwanted luster. The reason we want a black background will become apparent soon.


An 8-bit grayscale image is thus a matrix whose values take on the integers 0 to 255, in other words 256 = 2^8 shades of gray. The minimum value 0 corresponds to black, since zero intensity is darkness, while the maximum 255 corresponds to white, as the brightest scenario is all white.
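As a tiny illustration in the command window (the pixel values here are arbitrary):

    I = uint8([0 64; 192 255]);   % a 2x2 grayscale "image": black, dark gray, light gray, white
    imshow(I)                     % displays the matrix as an image

Here uint8 is MATLAB's unsigned 8-bit integer type, which is also how imread returns an 8-bit image, as we will see later.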

Extracting Facial Features


What are facial features? These turn out to be items such as your eyes, nose, and mouth. Most of us can see one another's eyes, nose, and mouth because there are clear visual boundaries or contours that outline these items. We can exploit these contours using edge detection in order to identify facial features. Recall that in edge detection, if you input an 8-bit grayscale image, the output will be a binary image (only 0s and 1s) highlighting the edge contours of the original image. Each pixel of the output will be either solid black or solid white; the white parts of the image correspond to the edges of the image. In this project, instead of using Simulink's edge detection algorithm, we will use MATLAB's function edge.
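You can get a feel for this in the command window before building anything in Simulink. A quick sketch, assuming I is a double-precision grayscale image matrix (prepared as described later in this handout):

    E = double(edge(I));   % binary edge map: 1 along contours, 0 elsewhere
    imshow(E)              % white pixels mark the detected edges

By default edge uses the Sobel method; see help edge for the other methods and their threshold parameters.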

Image Registration (Alignment)


The reason that facial feature extraction is so important is that it allows one to register, or align, two facial images for more accurate comparison. Often the comparison is based on the geometry between the features, so registration is important in removing any differences that are due to differences in the photographic angle on the face. During the registration process, one image is matched to the other's geometry. In our algorithm, the test image features are registered to the reference image features, as shown in Figure 1. Registration may be conducted using MATLAB or Simulink, and we provide a *.m (MATLAB) file to help you with this. Feel free to look through the file if you want the details of registration.

Statistical Test for Comparison [2]


After registration, you may see right away whether the two eye shapes and the distances between the eyes match. However, machines do not have the brilliance of our vision and brain, and need a more systematic way to compare images. There are many ways to do this, and much face recognition software requires a training face. Those algorithms are based on correlation detectors, and to enhance their performance you need to train your algorithm on several faces. We shall use a simple method of comparison, which is not so effective, but is easy to explain, and easy to implement and test without training! Common sense tells you that if two features are exactly the same, then subtracting one from the other (pixel by pixel) will yield a difference image that consists of all 0-valued pixels. Common sense also tells you that unless you are comparing exactly the same image file, the difference image will not be all 0-valued pixels, even when the two photos are of the same face! Why? Each time you snap a shot of someone's face, lighting conditions, movement of the facial muscles, etc., will result in a photo that may look the same but whose pixel values may be somewhat different. Hence, in reality, the difference image of two faces that are the same should contain very few pixels that are nonzero. In other words, the mean of the difference pixels should be 0 (in a perfect world where lighting conditions are constant and facial expressions are mannequin-still), but not the individual pixel values themselves.
[2] More advanced face recognition software will also compute the geometries between sets of features to test whether the two faces in question may be the same or completely different. For example, your eyes and nose can be considered the vertices of a triangle. Some triangles may be equilateral, while others will have different angles. Of course, two different faces may share the same triangle type, and thus this test is not the only test to be performed.


If we simply take the sample average of all difference pixels, it is unlikely that the resulting value will be identical to 0. Say we compute a sample average of 1.5. Do we decide same face or different face? Where does the threshold lie for the comparison stage? For this, we rely on statistics. Let X denote the reference feature-extracted (i.e., edge-detected) image, and Y the test feature-extracted (i.e., edge-detected) image after registration, as shown in Figure 1. Denote the pixels of each image by X_{i,j} and Y_{i,j}. Let both images be M x N matrices, and define the sample mean (a.k.a. average) and sample variance of the difference image X - Y as in Equations (1) and (2), respectively:

mean(X - Y) = \frac{\sum_{i,j} (X_{i,j} - Y_{i,j})}{M N}    (1)

var(X - Y) = \frac{\sum_{i,j} \left[ (X_{i,j} - Y_{i,j}) - mean(X - Y) \right]^2}{M N - 1}    (2)

Notice that the minus one in the denominator of Equation (2) is not a typo. The minus one makes the sample variance an unbiased estimate of the true variance; the details are beyond the scope of this project. The question we ask during this comparison phase of the facial recognition is whether the sample mean of Equation (1) is close enough to zero that the faces in the reference and test images appear to be the same. Our approach is to model the image pixels as random variables and then apply a confidence-interval test to give a probability measure that can be used to compare the facial similarities. If we model the pixels X_{i,j} and Y_{i,j} as independent random variables, then the sample mean and variance, denoted mean(X - Y) and var(X - Y), respectively, are also random variables. By the Central Limit Theorem, we can assume that mean(X - Y) is a Gaussian random variable, with its own mean equal to the true mean of the difference image X - Y, denoted m, and its variance equal to the true variance of X - Y divided by M N (don't worry about why this is true, but we will use it in the next paragraph).

Without access to the true mean and true variance, we instead employ the sample mean of Equation (1), i.e., Equation (1) with the actual image pixel values substituted, which we denote mean(x - y) so as not to confuse it with the random-variable version, and we substitute the sample variance appropriately into the true variance of X - Y divided by M N. So now we have a random variable mean(X - Y) that has a Gaussian distribution with an estimate of its mean and variance. We can use probability theory to test whether it is likely that 0 is the true mean. We form a confidence interval and check whether 0 is inside this confidence interval. Specifically, we ask the question: is 0 in the interval [mean(x - y) - b, mean(x - y) + b]? As discussed before, mean(x - y) is the actual sample mean computed using the registered feature-image data. The parameter b is a number chosen so that, for a user-defined p:

\Pr\{ mean(X - Y) \in [\, mean(x - y) - b, \; mean(x - y) + b \,] \} = 1 - p    (3)


For example, if we choose p = 0.05, and the corresponding parameter b is such that 0 is not in the interval [mean(x - y) - b, mean(x - y) + b], then we can say that with probability 0.95 (or confidence 95%) the two faces are not the same. Another way to look at this: if p = 0.05, then the corresponding value of b is set so that, with probability 0.95, all occurrences of the sample mean should fall in this interval. Thus if we get a sample mean outside of this interval, we know this had only a 0.05 chance of happening, so it must be an anomaly, and we reject the notion that the true mean can be 0. Let us consider a pictorial representation, as in Figure 2.

[Figure 2: Tail Probabilities as an Indicator of Whether 0 is an Anomaly. The figure shows a Gaussian density for the estimated mean, centered at mean(x - y); the interval [mean(x - y) - b, mean(x - y) + b] is unshaded, and the two tails outside it are shaded.]

Figure 2 shows a Gaussian probability density function for the estimated-mean random variable mean(X - Y). Recall from probability theory that integrating a density function over a portion of the horizontal axis, say [a, c], gives the probability of the random variable falling in that particular range of values. Recall also that integrating over the entire probability density function gives one. The shaded region of Figure 2, called the tail probabilities, corresponds to the event that mean(X - Y) does not fall in the interval [mean(x - y) - b, mean(x - y) + b]; since the interval itself has probability 1 - p, the tails have probability p. Why is this? Recall our definition of the relationship between p and b in Equation (3).

You can see in Figure 2 that the shaded region corresponds precisely to p (because the unshaded region corresponds to 1 - p and the overall area under the function is one), and thus it is commonly called the p-value. The smaller the p-value, the more likely 0 is an anomaly when it is not in the interval [mean(x - y) - b, mean(x - y) + b]. Analytically, the p-value is given by Equation (4):

\text{p-value} = 1 - erf\left( \frac{ |\, mean(X - Y) \,| }{ \sqrt{ 2 \, var(X - Y) / (M N) } } \right)    (4)

The erf or error function is used to find the probabilities for the Gaussian distribution. It is calculated based on a look-up table.
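As a quick numerical illustration with made-up numbers (not from any real image pair): suppose the difference image of two 100x100 feature images has sample mean 1.5 and sample variance 5000. Then, in MATLAB:

    m  = 1.5;          % sample mean, Equation (1)
    v  = 5000;         % sample variance, Equation (2)
    MN = 100*100;      % number of pixels
    p  = 1 - erf( abs(m) / sqrt(2*v/MN) )   % Equation (4): p is approximately 0.034

Since 0.034 < 0.05, at the 95% confidence level we would reject the idea that the true mean is 0, and declare the two faces different.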

Design and Implementation


Implementing a Simulink model for Figure 1 may look simple enough, but when you actually try to build it, the details of doing so are not inherently obvious. Even though you'll be performing the face recognition in Simulink, you'll have to write a couple of MATLAB function m-files to help you out. You'll also have to do some preliminary work in the MATLAB command window to prepare the images for the Simulink model. In addition, keep in mind some general guidelines for doing the face recognition.

Acquiring the Test Images


You have been provided with some test images to get you started. However, in this project you and your partner should also use images of yourselves. As discussed, take passport-type photos. If you crop them, please make sure they are all the same size. Keep in mind that there are some best practices for taking your test and reference images that can help immensely with the performance of your face recognition algorithm. It is important that your face be against a black background. Some people in the past have taken a picture against a white background (say, against a whiteboard) and then changed it to black. If you are stuck in the lab, try taking a picture of yourself under the lab table; yes, this works sometimes, but it can obscure your facial features. The best option, though, is to get a black screen behind you and take the picture.

Preparing the Simulink Model


One of the first problems you may recognize with face recognition is that it is not time-based; in other words, face recognition is like a one-time calculation. It doesn't really matter how long that calculation takes (within reason, of course). On the other hand, Simulink models are usually time-based (i.e., you specify a start time and a stop time). In order to work around this apparent incompatibility, we're going to trick Simulink by setting both the start and stop times to 0 seconds. This is equivalent to sending only one sample from the input, through the block diagram, and to the output. However, in our case, this sample at time 0 will not be just a scalar value; rather, it will be an entire image matrix (in fact, there will be two inputs and thus two such images). The requisite calculations will be performed on these two input samples, all at time 0. Open a new Simulink model and go to Simulation → Configuration Parameters. Select the Solver pane. Set the Start time and Stop time to 0. Under Solver options, set Type to Fixed-step and Solver to discrete (no continuous states). Press OK. Even though you haven't added any blocks to your model, go ahead and save it.
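If you prefer to set these options from the command window instead of the dialog, something like the following should be equivalent (a sketch; 'facerec' stands in for whatever you named your model, and the parameter names are standard Simulink model parameters):

    open_system('facerec');    % the model must be open (or at least loaded)
    set_param('facerec', 'StartTime', '0', 'StopTime', '0', ...
              'SolverType', 'Fixed-step', 'Solver', 'FixedStepDiscrete');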

Loading the Images to Simulink


First, you'll have to load the images into the MATLAB workspace and change them from color to grayscale. From there, you'll have to put them into a structure-with-time format (we discuss this more specifically below). After this, you'll be ready to send the images to Simulink.

1. Save your reference and test images in the same directory as your Simulink model. Please note that you are also given sample images to help you get started on this project. Make sure that the MATLAB Current Directory is set to this same directory. Also, make sure that the images are the same size (in pixels); if not, crop them in a program such as Microsoft Office Picture Manager or Photoshop until they are the same size.

2. In the MATLAB command window, type the following command, replacing name.jpg with the name of one of your pictures:

anchor = imread('name.jpg');


This command reads a picture from a file and stores it in an M×N×3 array, where M×N is the image's dimensions in pixels. The third dimension (3) holds the red, green, and blue intensities (the image is read as a color image). The array is stored in the variable called anchor. We call this picture the anchor because it is the reference face to which all others will be compared.

3. Use the imread command again to read your second picture into the MATLAB workspace. Save it in a variable named target. The target picture will be compared to the anchor picture from step 2.

4. The first thing you need to do to the pictures is convert them to grayscale. This is quite easy. Type anchor = rgb2gray(anchor) into the command window; do the same for the target. This is the red-green-blue-to-grayscale command. If you type whos into the command window, you will see that the third dimension of the arrays has been removed; anchor and target are now M×N matrices. Also, note that the data type is an unsigned 8-bit integer.

5. In anticipation of the edge detection that will be done later, you need to convert the image matrices from unsigned 8-bit integers to double-precision floating-point numbers (MATLAB's edge command only operates on double-precision numbers). Type anchor = double(anchor); do the same for the target.

6. In order to load the image matrices into Simulink, they must be put into a structure format. In MATLAB, a structure is a data type that stores more elementary data types (i.e., strings, double-precision floating-point numbers, etc.) in an organized fashion. Specifically, structures consist of fields and values. A field is like a category, and values are the actual data within those categories. If you're confused, here's an example that should make more sense. Suppose you wanted to store the data for a sine wave that lasts over a period of ten seconds. You could make a structure called S that has two fields: the first field, called signal, could store a vector of the actual values of the sine wave (let's call this vector x); the second field, called time, could store a vector of the corresponding time values from 0 to 10 seconds (let's call this vector t). If you wanted to create this structure, the syntax is pretty straightforward (don't actually type this command):

S = struct('signal', x, 'time', t);

A structure, then, is just an organized way of storing data types that you already know about. The complexity of structures is almost limitless because you can even put structures within other structures (which is what we're about to do). If you've learned a programming language like C, you might have recognized by now that a MATLAB structure is much like a struct in C.

7. In order to be compatible with Simulink, the structures you make must be exactly the way Simulink wants. Here's Simulink's way. Within the structure are two fields. The first field is called time and contains, as its name suggests, a vector of time values; for you, of course, the time vector is a single value: 0. The second field is called signals; it actually contains another structure. This substructure has two fields of its own: the first, called values, contains the actual data in question (in your case, the image matrix); the second, called dimensions, specifies the dimensions of the data (in your case, this would be a vector [M N]). To make this a little clearer, here's a depiction of the structure's organization:

I1
  .time            a single time value: 0
  .signals
      .values      the M×N image matrix
      .dimensions  the vector [M N]


8. Based on the description in step 7 and the discussion of structures in step 6, create a structure for the anchor and a structure for the target that are compatible with Simulink. Call the anchor's structure I1 and the target's structure I2 (one way to do this is sketched after this list).

9. Now you are ready to load the images into Simulink. In your Simulink model, add two From Workspace blocks from Simulink → Sources. In the Block Parameters window, change the Data parameter to I1 and I2, respectively.
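For concreteness, here is a minimal sketch of step 8 in the command window, assuming anchor and target are the double-precision grayscale matrices from steps 4 and 5:

    % Structure-with-time format expected by the From Workspace block:
    I1.time = 0;                           % a single sample, at time 0
    I1.signals.values = anchor;            % the image matrix itself
    I1.signals.dimensions = size(anchor);  % the vector [M N]

    I2.time = 0;
    I2.signals.values = target;
    I2.signals.dimensions = size(target);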

Edge Detection
The first step in our face recognition process is to extract the facial features (represented by the Feature Extraction blocks in Figure 1). As stated in the Introduction, you're going to do this with MATLAB's edge function. In Simulink, the easiest way to use a function that you would normally use in the command window is an Fcn block from Simulink → User-Defined Functions. The Fcn block is a one-input, one-output block in which you can specify pretty much any one-input, one-output function that is defined in MATLAB. It can even be used for functions that you have written yourself in an m-file. In the Block Parameters of the Fcn block, all that you have to do for edge detection is type double(edge(u)) into the MATLAB function parameter. Here, u stands for the input to the block. The double() command is necessary because, due to a peculiarity in Simulink, the block will not support a binary output (the edge command produces a binary output, as explained in the Introduction). Make two such Fcn blocks and place them in your Simulink model according to Figure 1.

Registering the Images


1. The next step in the facial recognition process is the registration of the images (represented by the Image Registration block in Figure 1). We've provided you with a registration function in an m-file. Download im_reg_MI.m and save it to the same directory as your Simulink model.

2. To use the function m-file in your Simulink model, you'll have to use an Fcn block again. However, as you might have noticed, the registration function is a multi-input, multi-output function, which is not compatible with the Fcn block. Namely, it has four inputs: image1 (the anchor), image2 (the target), step (the maximum vertical or horizontal translation allowed, in pixels), and angle (a vector of the possible angles of rotation allowed). It has five outputs: im_matched (the registered target image; this is the output you want), h, theta, I, and J. In order to work around this problem, you're going to have to write your own function m-file (a one-input, one-output function) that calls on im_reg_MI and extracts only the necessary output.


3. Open a new m-file and write a one-input, one-output function called register.m. Here are some ideas on how to proceed (a minimal sketch follows this list):
a. Hardcode the step and angle inputs within your function. Namely, define angle = [-2:0.01:2] and step = 50. This gets rid of two of the four inputs to im_reg_MI.
b. In your Simulink model, concatenate the anchor and target images into one M×2N matrix (use the edge-detected images, NOT the original anchor and target). This combined matrix will be the only input to your function. You'll have to do this with a Matrix Concatenate block (in the Block Parameters, set Number of inputs to 2, Mode to Multidimensional array, and Concatenate dimension to 2).
c. Inside your function, extract the anchor and target images from the concatenated input of part b (i.e., into two separate matrices).
d. Inside your function, call on im_reg_MI, using the angle and step variables you defined in part a, as well as the anchor and target images you extracted in part c.
e. Make the single output of your function equal to the second output of im_reg_MI; you can ignore the other outputs.

4. Save your function m-file as register.m in the same directory as your Simulink model.

5. In your Simulink model, place an Fcn block that calls on the register function you just created.

6. Connect the blocks that you have so far to reflect the flow of data shown in Figure 1.
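Here is a minimal sketch of such a wrapper. It assumes im_reg_MI takes its inputs in the order described in step 2 above (image1, image2, step, angle) and returns the registered image as its second output (per step 3e); confirm both against the header comments of the provided im_reg_MI.m before relying on it:

    function im_out = register(im_both)
    % REGISTER  One-input, one-output wrapper around im_reg_MI for an Fcn block.
    % im_both is the M x 2N concatenation of the edge-detected anchor (left
    % half) and the edge-detected target (right half).
    angle = [-2:0.01:2];          % allowed rotations, hardcoded (step 3a)
    step  = 50;                   % max translation in pixels, hardcoded (step 3a)
    N = size(im_both, 2) / 2;     % each image is M x N
    anchor = im_both(:, 1:N);     % step 3c: split the concatenated input
    target = im_both(:, N+1:end);
    [h, im_out] = im_reg_MI(anchor, target, step, angle);  % step 3e: keep the 2nd output (h is ignored)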

Comparison of Images
1. The last step in the face recognition process is to compare the anchor to the registered target image. To do this, of course, you need to calculate the difference image. In your Simulink model, subtract the edge-detected, registered target image from the anchor (the edge-detected anchor, NOT the original anchor).

2. Now, you need to calculate the p-value of the difference image. To do this, you'll once again need an Fcn block. And, once more, you'll have to write your own function m-file to put in this Fcn block.

3. Write a one-input, one-output function m-file that calculates the p-value from a difference image (a minimal sketch follows this list). As you may have guessed, the input to this function should be the difference image and the output should be the p-value itself. Here are some things to consider as you write your function:
a. Notice that to calculate the p-value in Equation (4), you need the two-dimensional sample mean and sample variance, as defined in Equations (1) and (2). CAUTION: MATLAB's built-in functions mean and var are written for one-dimensional data and will not work as you want on your two-dimensional images. Instead, you'll have to write some code to implement those formulae before you can compute Equation (4). You may consider writing separate function m-files for Equations (1) and (2) and have your p-value function call on them.
b. The error function, erf, in Equation (4) can be computed in MATLAB using, you guessed it, the erf() command.

4. If you haven't already done so, implement your p-value function from step 3 in an Fcn block in your Simulink model. At the output of this Fcn block, attach a Display block from Simulink → Sinks to view the p-value.
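A minimal sketch of such a function (the file name pvalue.m is our choice here; any one-input, one-output name works in the Fcn block):

    function p = pvalue(D)
    % PVALUE  p-value of Equation (4) for a difference image D = X - Y.
    % Implements the 2-D sample mean (1) and sample variance (2) directly,
    % since MATLAB's mean and var work column-wise on matrices.
    [M, N] = size(D);
    m = sum(D(:)) / (M*N);                    % Equation (1): sample mean
    v = sum((D(:) - m).^2) / (M*N - 1);       % Equation (2): unbiased sample variance
    p = 1 - erf( abs(m) / sqrt(2*v/(M*N)) );  % Equation (4)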


Viewing Images
Although your face recognition algorithm is now complete, it would be nice to view some of the images at various stages in the process. To do this, use a Matrix Viewer block from Signal Processing Blockset → Signal Processing Sinks. In the Block Parameters window, set the Colormap matrix to gray(256) and uncheck the box labeled Display colorbar. Also:

1. If you want to view any image after edge detection, set the Minimum input value to 0 and the Maximum input value to 1. This is because edge detection produces a binary output.

2. If you want to view any image before edge detection, set the Minimum input value to 0 and the Maximum input value to 255. This is because before edge detection, the intensity values of the image are integers between 0 and 255.

When you run your Simulink model, one window will appear for each Matrix Viewer in your model.

Running the Model


1. Run your Simulink model. It may take a while to run (perhaps close to a minute), so be patient. Note that because this is a non-real-time application, your simulation runs in Normal mode (as opposed to External mode), and you don't have to build real-time code with Real-Time Workshop.

2. Test different sets of images in your model; specifically, use the test images provided as well as your own images. Remember, each time you want to test a new image, you'll have to load it into the MATLAB workspace, convert it to grayscale and double-precision floating point, and put it in a structure format. Suggestion: don't delete or overwrite any of the structures you create in the MATLAB workspace. Instead, change the names of the variables in the From Workspace blocks in your model.

3. Try adjusting the angle and step parameters in your register function to see how they affect both the speed of the calculation and the resulting p-value. If your range of angles is too large, the program will take much longer to complete, so keep it to a minimum.

