Finger Tracking in Real-Time Human-Computer Interaction
ABSTRACT:
For a long time, research on human-computer interaction (HCI) has been restricted to techniques based on the use of a monitor, keyboard and mouse. Recently this paradigm has changed: techniques such as vision, sound, speech recognition, projective displays and location-aware devices allow for a much richer, multi-modal interaction between man and machine. Finger tracking is the use of the bare hand to operate a computer, with the goal of making human-computer interaction faster and easier. Fingertip finding deals with the extraction of information about hand features and positions; in this method, the positions of the fingertips and the directions of the fingers are used to obtain the required segmented region of interest.
CONTENTS:
INTRODUCTION
METHODS OF FINGER TRACKING
  COLOR TRACKING SYSTEMS
  CORRELATION TRACKING SYSTEMS
  CONTOUR-BASED TRACKING SYSTEMS
CONCLUSIONS
REFERENCES
INTRODUCTION:
Finger-pointing systems aim to replace pointing and clicking devices such as the mouse with the bare hand. These applications require robust localization of the fingertip plus recognition of a limited number of hand postures for clicking commands. Finger-tracking systems can be considered a specialized type of hand posture/gesture recognition system. The typical specializations are:
1) Only the simplest hand postures are recognized.
2) The hand usually covers only a part of the screen.
3) The finger positions are found in real time.
4) Ideally, the system works with all kinds of backgrounds.
5) The system does not restrict the speed of hand movements.
The real-time constraint is also what currently rules out sophisticated approaches such as 3D-model matching or Gabor wavelets.
Figure 2: (a) FingerPaint system (from [Crowley 95]) (b) The Digital Desk (from [Well 93]) (c) Television control with the hand (from [Freeman 95])
FingerPaint was inspired by the Digital Desk described in [Well 93], which also uses a combination of projector and camera to create an augmented reality (see Figure 2.b). Well's system used image differencing to find the finger; its big drawback is that it does not work well when the finger is not moving. Freeman used correlation to track the whole hand and to discriminate simple gestures, and applied the system to build a gesture-based television control ([Freeman 95]). In his setup the search region was simply restricted to a fixed rectangle. As soon as the user moves a hand into this rectangle, the television screen is turned on. Some graphical controls allow manipulation of the channel and volume with a pointer controlled by the hand (Figure 2.c).
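To illustrate the image-differencing idea (and its drawback), here is a minimal sketch, not Well's actual implementation; the threshold value and camera index are assumptions chosen for illustration:

```python
import cv2

def finger_mask_by_differencing(prev_gray, curr_gray, thresh=25):
    """Binary mask of pixels that changed between two frames.

    A moving finger shows up as a blob of changed pixels; a resting
    finger produces no difference and is lost -- the drawback noted
    above. The threshold (25) is an illustrative assumption.
    """
    diff = cv2.absdiff(prev_gray, curr_gray)            # per-pixel change
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask

# Usage with a live camera (device index 0 is an assumption):
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) if ok else None
while ok:
    ok, frame = cap.read()
    if not ok:
        break
    curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    mask = finger_mask_by_differencing(prev, curr)      # finger candidates
    prev = curr
```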
Figure 3: Contour-based tracking with condensation (a, b) Hand contour recognition against complex backgrounds (c) Finger drawing with different line strengths (from [MacCormick 00])
MacCormick uses a combination of several techniques to achieve robustness. Color segmentation yields the initial position of the hand. Contours are found by matching a set of pre-calculated contour segments (such as the contour of a finger) against the output of an edge-detection filter applied to the input image. Finally, the contours found are tracked with an algorithm called condensation. Condensation is a statistical framework that allows the tracking of objects with high-dimensional configuration spaces without incurring the large computational cost that would normally be expected in such problems. If a hand is modeled, for example, by a b-spline curve, the configuration space could be the positions of its control points.
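Condensation is better known today as a particle filter. The following is a minimal sketch of one condensation cycle for a single 2D point, under simplifying assumptions (a Gaussian random-walk motion model and a caller-supplied `measure` scoring function); MacCormick's actual state space is the set of b-spline control points, not a point:

```python
import numpy as np

def condensation_step(particles, weights, measure, motion_noise=5.0):
    """One cycle of a minimal condensation (particle filter) tracker.

    particles: (N, 2) array of hypothesized 2D positions.
    weights:   (N,) probabilities from the previous cycle (sum to 1).
    measure:   callable scoring how well a position matches the image,
               e.g. local edge strength (a placeholder assumption here).
    """
    n = len(particles)
    # 1. Resample hypotheses in proportion to their previous weights.
    particles = particles[np.random.choice(n, size=n, p=weights)]
    # 2. Predict: diffuse each hypothesis with the motion model.
    particles = particles + np.random.normal(0.0, motion_noise, particles.shape)
    # 3. Measure: re-weight each hypothesis against the current image.
    weights = np.array([measure(p) for p in particles])
    return particles, weights / weights.sum()

# Toy usage: 500 hypotheses pulled toward a target at (100, 100).
particles = np.random.uniform(0, 200, (500, 2))
weights = np.full(500, 1.0 / 500)
measure = lambda p: np.exp(-((p - 100.0) ** 2).sum() / (2 * 20.0 ** 2))
for _ in range(10):
    particles, weights = condensation_step(particles, weights, measure)
```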
Finger Finding:
In order to find regions of interest in video images we need to take a closer look at those regions and extract relevant information about hand features and positions. Both the positions of the fingertips and the directions of the fingers are used to obtain a fairly clean segmented region of interest.
Applications often need only a limited number of commands (e.g. simulating the mouse buttons, or next-slide/previous-slide commands during a presentation). Such commands can be controlled by the number of fingers presented to the camera.
Motivation
First of all, the method we choose has to work in real time, which eliminates 3D-model and wavelet-based techniques. Secondly, it should only extract parameters that are of interest for human-computer interaction purposes. Many parameters could possibly be relevant for HCI applications; the most important ones, in order of importance, are:
Number of fingers presented: In combination with some constraints derived from the hand geometry, it is possible to decide which fingers are presented to the camera. Theoretically, thirty-two different finger configurations can be detected with this information. For non-piano players, though, only a subset of about 13 postures will be easy to use.
Position of the fingertips: Many applications only require this simple parameter. Examples: a finger-driven mouse pointer, recognition of space-time gestures, moving projected objects on a wall, etc.
Fingertip positions plus palm position: As shown by [Lee 95], those parameters uniquely define a hand pose, so they can be used to extract complicated postures and gestures. An important application is automatic recognition of hand sign languages.
The list above shows that most human-computer interaction tasks can be fulfilled with the knowledge of 12 parameters: the 2D positions of the five fingertips of a hand plus the position of the center of the palm. A sketch of a data structure holding these parameters follows below.
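As an illustration, a hypothetical container for exactly these 12 parameters might look as follows; the class and field names are assumptions, not part of the original system:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]  # 2D image coordinates

@dataclass
class HandState:
    """The 12 parameters named above: five 2D fingertips plus the palm.

    A fingertip is None when that finger is not presented to the
    camera, so the visible/hidden pattern of the five fingers encodes
    one of 2^5 = 32 finger configurations.
    """
    fingertips: Tuple[Optional[Point], ...]  # thumb .. little finger
    palm: Point

    def configuration(self) -> int:
        # Bit i is set iff finger i is visible; yields a code in 0..31.
        return sum(1 << i for i, tip in enumerate(self.fingertips) if tip)

# Example: thumb and forefinger outstretched -> configuration 0b00011.
pose = HandState(((10.0, 20.0), (30.0, 5.0), None, None, None), (22.0, 60.0))
assert pose.configuration() == 0b00011
```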
In a segmented image, fingertips show two characteristic features (see Figure 5.2):
1) At the center of the fingertip there is a circle of filled pixels (the inner circle).
2) Along a square outside the inner circle, fingertips are surrounded by a long chain of non-filled pixels and a shorter chain of filled pixels.
To build an algorithm that searches for these two features, several parameters have to be derived first:
Diameter of the little finger (d1): This value usually lies between 5 and 10 pixels and can be calculated from the distance between the camera and the hand.
Diameter of the thumb (d2): Experiments show that this diameter is about 1.5 times the diameter of the little finger.
Figure 5.1: Typical finger shapes (a) Clean segmentation (b) Background clutter (c) Sparsely segmented fingers
Figure 5.2: A simple model of the fingertip
Maximum number of filled pixels along the search square (max_pixel): Geometric considerations show that this value is twice the width of the thumb.
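A minimal sketch of a fingertip test built from these two criteria is shown below. It checks the inner circle exactly but, for brevity, only counts the filled pixels along the search square instead of verifying that they form one contiguous chain; the default values for d1 and d2, and the use of d2 as the half-size of the search square, are illustrative assumptions based on the figures quoted above:

```python
import numpy as np

def is_fingertip(mask, x, y, d1=8, d2=12):
    """Test the two fingertip criteria at (x, y) in a binary hand mask.

    mask: 2D array with 1 for filled (hand) pixels, 0 otherwise.
    d1:   little-finger diameter (5-10 px per the text above).
    d2:   thumb diameter (about 1.5 * d1).
    """
    r, s = d1 // 2, d2
    h, w = mask.shape
    if not (s <= x < w - s and s <= y < h - s):
        return False  # search square would leave the image

    # Criterion 1: a circle of filled pixels around the candidate center.
    ys, xs = np.ogrid[-r:r + 1, -r:r + 1]
    circle = xs * xs + ys * ys <= r * r
    if not mask[y - r:y + r + 1, x - r:x + r + 1][circle].all():
        return False

    # Criterion 2: count filled pixels along the search square; a
    # fingertip contributes only the short chain where the finger stem
    # crosses the square (corners excluded to avoid double counting).
    filled = int(mask[y - s, x - s:x + s + 1].sum()
                 + mask[y + s, x - s:x + s + 1].sum()
                 + mask[y - s + 1:y + s, x - s].sum()
                 + mask[y - s + 1:y + s, x + s].sum())
    max_pixel = 2 * d2  # twice the thumb width, as derived above
    return 0 < filled <= max_pixel
```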
Applications:
Three applications based on finger-tracking systems are: FingerMouse, FreeHandPresent and BrainStorm.
FingerMouse
The FingerMouse system makes it possible to control a standard mouse pointer with the bare hand. If the user moves an outstretched forefinger in front of the camera, the mouse pointer follows the finger in real time. Keeping the finger in the same position for one second generates a single mouse click. An outstretched thumb invokes the double-click command; the mouse wheel is activated by stretching out all five fingers (see Figure 6.1). The application mainly demonstrates the capabilities of the tracking mechanism. The mouse pointer is a simple and well-known feedback system that permits us to show the robustness and responsiveness of the finger tracker. Also, it is interesting to compare the finger-based mouse-pointer control with the standard mouse as a reference; this way the usability of the system can easily be tested.
Figure 6.1: The FingerMouse on a projected screen (a) Moving the mouse pointer (b) Double-clicking with an outstretched thumb (c) Scrolling up and down with all five fingers outstretched
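The mapping just described (dwell to click, thumb for double-click, five fingers for scrolling) can be made concrete with a small, hypothetical piece of event logic; the dwell time and jitter tolerance here are assumptions, not the original system's tuned values:

```python
import time

class FingerMouseLogic:
    """Turns per-frame tracking results into mouse-event names.

    update() takes the forefinger position `pos`, the number of
    outstretched fingers, and whether the thumb is outstretched, and
    returns one of "move", "click", "double_click" or "scroll".
    """
    def __init__(self, dwell_s=1.0, jitter_px=5):
        self.dwell_s = dwell_s      # hold time for a click (1 s above)
        self.jitter_px = jitter_px  # tolerated fingertip wobble
        self.anchor = None          # where the current dwell started
        self.since = 0.0

    def update(self, pos, fingers, thumb, now=None):
        now = time.monotonic() if now is None else now
        if fingers == 5:
            return "scroll"         # wheel mode: all fingers outstretched
        if thumb:
            return "double_click"   # outstretched thumb
        # Dwell clicking: keep the fingertip still for dwell_s seconds.
        if self.anchor is None or \
           max(abs(pos[0] - self.anchor[0]),
               abs(pos[1] - self.anchor[1])) > self.jitter_px:
            self.anchor, self.since = pos, now
            return "move"
        if now - self.since >= self.dwell_s:
            self.anchor, self.since = pos, now  # re-arm after the click
            return "click"
        return "move"
```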
There are two scenarios where tasks might be better solved with the FingerMouse than with a standard mouse:
Projected Screens: Similar to the popular touch-screens, projected screens could become touchable with the FingerMouse. Several persons could work simultaneously on one surface and logical objects, such as buttons and sliders, could be manipulated directly without the need for a physical object as intermediary.
Navigation: For standard workplaces it is hard to beat the point-and-click feature of the mouse. But for other mouse functions, such as navigating a document, the FingerMouse could offer additional usability. It is easy to switch between the different modes by stretching out fingers, and the hand movement is similar to the one used to move around papers on a table (with a larger possible magnitude than with a standard mouse). For projected surfaces the FingerMouse is easier to use because the fingertip and the mouse pointer are always in the same place. Figure 6.5 shows such a setup: a user paints directly onto the wall with his/her finger by controlling the Windows Paint application with the FingerMouse.
FreeHandPresent: The second system is built to demonstrate how simple hand gestures can be used to control an application. A typical scenario where the user needs to control the computer from a certain distance is during a presentation. Several projector manufacturers have recognized this need and built remote controls for projectors that can also be used to control applications such as Microsoft PowerPoint. Our goal is to build a system that can do without remote controls. The user's hand will become the only necessary controlling device.
The interaction between human and computer during a presentation is focused on navigating between a set of slides. The most common command is Next Slide. From time to time it is necessary to go back one slide or to jump to a certain slide within the presentation. The FreeHandPresent system uses simple hand gestures for the three described cases. Two fingers shown to the camera invoke the Next Slide command; three fingers mean Previous Slide; and a hand with all five fingers stretched out opens a window that makes it possible to directly choose an arbitrary slide with the fingers.
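The resulting command dispatch is simple enough to sketch in a few lines. The debounce (requiring the posture to be held for a number of consecutive frames) is an assumption added so that a hand moving through the image does not trigger slide changes; the frame threshold is illustrative:

```python
def presentation_command(finger_count, held_frames, min_hold=15):
    """Map a held finger posture to a FreeHandPresent command.

    At an assumed 30 fps, min_hold=15 means the posture must be held
    for about half a second before a command fires.
    """
    if held_frames < min_hold:
        return None  # posture not stable yet
    return {2: "next_slide",
            3: "previous_slide",
            5: "open_slide_chooser"}.get(finger_count)
```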
BrainStorm:
The BrainStorm system is built for the described scenario. During the idea-generation phase, users can type their thoughts on a wireless keyboard and attach colors to their input. The computer automatically distributes the user input on the screen, which is projected onto the wall. The resulting picture on the wall resembles the old paper-pinning technique but has the big advantage that it can be saved at any time. For the second phase of the process, the finger-tracking system comes into action. To rearrange the items on the wall the participants just walk up to the wall and move the text lines around with the finger. Figures 6.2b-d show the arranging process. First an item is selected by placing a finger next to it for a second. The user is notified of the selection with a sound and a color change. Selected items can be moved freely on the screen. To let go of an item the user has to stretch out the outer fingers, as shown in Figure 6.2d.
Figure 6.2: The BrainStorm system (a) Idea-generation phase with projected screen and wireless keyboard (b) Selecting an item on the wall (c) Moving the item and (d) Unselecting the item
CONCLUSIONS:
The result of this work is a finger-tracking system with the following properties:
- The system works on a light background with small amounts of clutter.
- The maximum size of the search area is about 1.5 x 1 m but can easily be increased with additional processing power.
- The system works under different lighting situations and adapts automatically to changing conditions.
- No set-up stage is necessary; the user can just walk up to the system and use it at any time.
- There are no restrictions on the speed of finger movements.
- No special hardware, markers or gloves are necessary.
- The system works at latencies of around 50 ms, thus allowing real-time interaction.
- Multiple fingers and hands can be tracked simultaneously.
The BrainStorm system in particular demonstrated how finger tracking can be used to create added value for the user. We know of no other system that allows bare-hand manipulation of items projected onto a wall, as done with BrainStorm, or presentation control with hand postures, as done with FreeHandPresent. It is possible, though, that the same applications could have been built with other finger-tracking systems.
REFERENCES:
[Bérard 99] Bérard, F., Vision par ordinateur pour l'interaction homme-machine fortement couplée, Doctoral Thesis, Université Joseph Fourier, Grenoble, 1999.
[Card 83] Card, S., Moran, T. and Newell, A., The Psychology of Human-Computer Interaction, Lawrence Erlbaum Associates, 1983.
[Castleman 79] Castleman, K., Digital Image Processing, Prentice-Hall Signal Processing Series, 1979.