
Teaching Natural User Interaction Using OpenNI and the Microsoft Kinect Sensor

Norman Villaroman
Brigham Young University, Provo, UT
[email protected]

Dale Rowe, Ph.D.
Brigham Young University, Provo, UT
[email protected]

Bret Swan, Ph.D.
Brigham Young University, Provo, UT
[email protected]

ABSTRACT
The launch of the Microsoft Kinect for Xbox (a real-time 3D imaging device) and its supporting libraries has spurred a flurry of development, including natural user interfaces for computer applications. Using the Kinect offers opportunities for novel approaches to classroom instruction on natural user interaction. With the launch of this sensor came the establishment of development platforms that can collect and process the data the sensor provides, one of which is OpenNI. We evaluate the current state of this technology and present examples of how Kinect-enabled user interfaces can provide tremendous opportunities for students in Human-Computer Interaction (HCI) courses. We present sample learning activities that target various HCI learning outcomes listed in IT 2008, and we discuss the advantages of using the Kinect as a tool in the classroom.


Categories and Subject Descriptors


K.3.1 [Computers and Education]: Computer Uses in Education – Computer-Assisted Instruction

General Terms
Design, Experimentation, Human Factors

Keywords
HCI, Kinect, Natural User Interface, Education

1. INTRODUCTION

Human-computer interaction (HCI) is a fundamental pillar of the Information Technology discipline [1]. Any interactive computing system involves one or more interfaces through which a user can issue commands and gather results. Interactive computer systems began with command-line interfaces, which are still widely used today. The development of graphical user interfaces allowed users with varying levels of computer skill to operate a wide variety of software applications. Recent advancements in HCI have facilitated the development of natural user interfaces, which provide more intuitive ways of interacting with the computing device. Natural user interfaces are expected to let people learn how to use an interface in the quickest possible way.¹

The desire to develop natural user interfaces has existed for decades. Since World War II, professional and academic groups have been formed to enhance interaction between human and machine [2, 3]. While computer user interfaces started with punch cards and keyboard-like devices, attempts to develop interfaces that process more natural movements emerged relatively early. One such attempt studied pattern recognition of natural written language using a pen-like interface, the RAND tablet and stylus, in 1967 [4, 5]. An estimate of the value of a machine's ability to process audio commands was published in 1968 [6]. The first touch device, the Elograph, was created in 1971, with further advancements from the same company continuing in the following decades. In the late 1970s, VIDEOPLACE was developed: a system of video cameras and projectors that allowed graphical interaction and communication between users without the aid of user-attached or user-controlled devices. The 1980s saw the beginnings of multi-touch systems with a master's thesis by Nimish Mehta that resulted in the Flexible Machine Interface. In the same decade, touch-screen interfaces began to be prevalent in commercial establishments, and the Hewlett-Packard 150, a personal computer with basic touch input, was launched. In the 1990s, touch-screen interfaces made their way to mobile devices with Simon, launched by IBM and BellSouth in 1994 [7]. Touch screens continued to spread across consumer applications. In the first decade of this century, more advanced gestural technology and accompanying applications surfaced in a plethora of PDAs, smartphones, and multimedia devices. Natural interfaces have become especially prevalent in recent years in gaming systems such as the Nintendo Wii Remote, the PlayStation Eye/Move, and the Microsoft Kinect for Xbox.

The emergence of these natural user interface and gestural technologies has motivated researchers and hardware enthusiasts to look for novel ways they can be used. In addition to their use in consumer applications, more of these technologies are finding their way into classrooms, particularly for the study of HCI. We suggest that the Kinect is an excellent addition to the toolset for teaching such topics, for reasons discussed in this paper. Our paper begins with a discussion and assessment of the state of Kinect-enabled technology. We then make specific recommendations for how this technology can be applied in HCI course instruction. Finally, we discuss the advantages and disadvantages of a Kinect-enabled approach to HCI instruction.

¹ The exact scope of command-line, graphical, and natural user interfaces may vary from one expert to another, and the categories may even overlap. Terminology also differs, with terms such as tangible and gestural user interfaces used alongside those mentioned here. Suffice it to say that there have been continuous advancements in making the user interface easier and more intuitive to learn.





2. DEVELOPMENT FRAMEWORK
Before presenting ideas for how the Kinect's gestural, natural user interface technology can be used in the classroom, we discuss the state of the hardware and software technology as it exists today.

2.1 Hardware
The Kinect is based on a sensor design developed by PrimeSense Ltd.; the technical specifications described here are taken from the reference design of the PrimeSense sensor [8]. Light Coding is the technology that allows it to construct 3D depth maps of a scene in real time at the level of detail that it does. Structured near-infrared light is cast on a region of space, and a standard CMOS image sensor receives the reflected light. The PS1080 SoC (System on a Chip) controls the structured light with the aid of a light diffuser and processes the data from the image sensor to provide real-time depth data [9].

The depth image from the PS1080 has a maximum resolution of 640 x 480. At 2 m from the sensor, it can resolve down to 3 mm in height and width and 1 cm in depth, and it operates at a range between 0.8 m and 3.5 m. Experimentation has shown that the Kinect processes depth data at a frame rate of 30 fps. The sensor also has an integrated RGB camera with a maximum resolution of 1600 x 1200 (UXGA) to match the depth data with real images, and two built-in microphones for audio input. Connectivity is provided by a USB 2.0 interface.

While it should be noted that the Kinect is not the only device that uses the PrimeSense reference design (e.g., the ASUS Xtion Pro), all related experiments and activities in this paper were accomplished using the Kinect. Its basic applications do not require special or powerful computer hardware; a dual-core machine with 2 GB of RAM and a standard graphics processor handles these applications without difficulty.

2.2 Software

At the time of this writing, there are three major projects with freely available libraries that can be used to collect and process data from a Kinect sensor: OpenKinect [10], OpenNI [11], and CL NUI [12].

Before the official release of the Microsoft Kinect on November 4th, 2010, Adafruit Industries announced a monetary reward for hardware enthusiasts who could write software enabling a PC to access and process the data coming from the sensor [13]. A developer with the alias AlexP was the first to accomplish the challenge, two days after the launch, but Hector Martin was recognized as the winner because the former was unwilling to open-source his code at the time [14]. The efforts of these two individuals led to the beginning of two of the projects mentioned. Hector Martin released his code on November 10th, which marks the beginning of OpenKinect [15]. Among other things, this project provides drivers for the Kinect, wrappers for different languages and other projects, and an analysis library, all of which are open-source. The project is distributed under both the Apache 2.0 and GPL v2 licenses [16]. AlexP's code was used in the development of Code Laboratories' Windows Kinect Driver/SDK, the CL NUI Platform, version 1.0 of which was officially released on December 8th. The driver and the SDK, with C/C++/C# libraries, are freely available to the public.

In the same month as the Kinect's launch, OpenNI [11, 17] was formed by a group of companies, including PrimeSense Ltd., as a not-for-profit organization that aims to set an industry-standard framework for the interoperability of natural interaction devices. With the framework they released came middleware libraries that can convert raw sensor data from a compliant device into application-ready data. These libraries are available for applications written in C, C++, and C#. Because the Kinect is an OpenNI-compliant device, this software can be used to build applications for it. OpenNI is written in C/C++ so that, among other reasons, it can be used across different platforms. While OpenNI is officially supported on Windows (XP and later) and Ubuntu (10.10 and later) [18], its use on other Linux distributions and on Mac OS X has been documented to work in online forums.

Microsoft Research has announced the release of a non-commercial Kinect for Windows SDK in the spring of 2011, with a commercial version to come later [19]. The initial release promises, among other things, audio processing with source localization [20]. Processing audio signals from the sensor's microphones is not yet enabled and is currently under development in the three projects previously mentioned. The official Microsoft SDK may also provide the support necessary to accomplish the learning outcomes discussed here.

3. EDUCATIONAL USE

Familiarity with a Kinect-related SDK will help students develop and understand applications that work with the Kinect. Any of the three software packages mentioned can be used to create HCI learning activities. For the purposes of this paper, OpenNI, with its accompanying middleware library NITE, was investigated for its academic value. OpenNI was selected because it was established as an industry standard and because it can be further improved by its professional maintainers and the open-source community. This section discusses the advantages and limitations of using Kinect-based applications to teach various topics in HCI, enumerates some of those topics, and outlines sample learning activities.

3.1 Suggestions for HCI Instruction Using Kinect


HCI is a required and fundamental area in the IT discipline. Gesture-based, natural user interaction can and should be part of the context of the different topics in HCI. Many existing applications (e.g., games, presentations, multimedia, drawing programs) can be controlled using gestural interfaces alone or in combination with other interaction technologies. Learning activities using the Kinect could easily be developed to fulfill core and advanced learning outcomes outlined in IT 2008: Curriculum Guidelines for Undergraduate Degree Programs in Information Technology [1]. The following subsections discuss examples of IT 2008 HCI learning-outcome topics and suggest learning activities using Kinect-based gesture recognition. Note that, for the sake of brevity and clarity, the device-free, natural bodily-gesture and possibly multimodal interfaces enabled by the Kinect are called "Kinect-controlled" or "Kinect-enabled" interfaces in the following subsections.

3.1.1 Human Factors

Sample activities for core learning outcomes:
- Study how cognitive principles, affordance, and feedback should influence the design of Kinect-controlled interfaces on desktop computers.
- Discuss if and how Kinect-enabled interfaces can cause repetitive stress syndrome.
- Analyze the abilities of users of different age groups to use Kinect-controlled interfaces.

Sample activities for advanced learning outcomes:
- Design and implement a Kinect-controlled interface accounting for appropriate cognitive, physical, cultural, and affective ergonomic considerations.
- Experiment with various conditions that would increase or decrease the usability and user experience of a particular Kinect-controlled interface.

3.1.2 HCI Aspects of Application Domains

Sample activities for core learning outcomes:
- Describe the advantages and disadvantages of using Kinect control in web browsing.
- Describe conditions of a Kinect-enabled interface that would enhance usability and user experience in web browsing.

Sample activities for advanced learning outcomes:
- Design and implement a Kinect-controlled interface for a specific application domain such as web browsing.
- Discuss how a particular application can benefit from multimodal interfaces that include Kinect-enabled gesture interactions.

3.1.3 Human-Centered Evaluation

Sample activities for core learning outcomes:
- Perform usability and user experience evaluations of a commercial Kinect-enabled application.
- Perform a heuristic evaluation of the usability of a Kinect-enabled application.

Sample activities for advanced learning outcomes:
- Come up with usability guidelines and standards for Kinect-enabled interfaces.
- Determine and design performance, usability, and user experience metrics for Kinect-enabled interfaces.
- Evaluate the appropriateness of traditional usability and user experience testing methods for assessing Kinect-enabled interactions (both single-modal and multimodal).

3.1.4 Developing Effective Interfaces

Sample activities for core learning outcomes:
- Determine the types of user processes, workflows, and tasks that are more or less appropriate for Kinect-enabled interfaces.
- Apply different cognitive models or interaction frameworks to assess Kinect-enabled interface design.

Sample activities for advanced learning outcomes:
- Implement a Kinect-enabled user interface that integrates and triggers typical cursor, mouse, and keyboard events.
- Study how Kinect-enabled interfaces can be adapted to adjust to different characteristics of user personas and demographics (e.g., age, size). If appropriate, test and redesign available calibration methods for such a purpose.
- Implement a Kinect-enabled interface that uses skeleton detection, pose detection, or depth-map processing (a minimal skeleton-tracking sketch appears at the end of this section).

3.1.5 Accessibility

Sample activities for core learning outcomes:
- Identify different ways that a Kinect sensor can assist users with certain disabilities. Consider the limits and capabilities of the Kinect for disabled users and analyze the risks of such assistance.
- Discuss how 3D depth maps or gestures can improve computer use for special-purpose or disabled computer users.

3.1.6 Emerging Technologies

Sample activities for core learning outcomes:
- As an emerging technology, explore how the Kinect can help advance the field of gesture-based, natural user interaction.
- Give examples of how Kinect-controlled interfaces can be used in ubiquitous computing environments.
- Compare the Kinect as a video game input device with comparable devices marketed by its competitors, such as the Nintendo Wii Remote and the PlayStation Eye/Move, as well as with traditional (e.g., mouse, keyboard, and game controllers) and emerging (e.g., voice recognition) interaction technologies.

Sample activities for advanced learning outcomes:
- Illustrate how existing applications can be improved upon by Kinect-enabled interfaces.

3.1.7 Human-Centered Computing

Sample activities for core learning outcomes:
- Test and analyze whether usability and user experience requirements can be met with the current capabilities of Kinect-enabled user interfaces. Determine what technological improvements are necessary to meet those requirements, if they are not currently met.

Sample activities for advanced learning outcomes:
- Test an existing Kinect-based application in a user environment, and redesign the interface for that application to better satisfy user experience and usability requirements.
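For the skeleton-detection activity in Section 3.1.4, the following is a minimal sketch of a skeleton-tracking loop using the OpenNI 1.x C++ API, with NITE supplying the skeleton capability. It is our illustration, not code from the paper's experiments, and for brevity it omits the new-user and calibration callbacks that a complete OpenNI/NITE program of this era must register before tracking actually begins.

    #include <XnCppWrapper.h>
    #include <cstdio>

    // Minimal skeleton-tracking loop: prints the 3D position of each
    // tracked user's head joint. Calibration callbacks are omitted, so a
    // full program must add them for IsTracking() to ever become true.
    int main()
    {
        xn::Context context;
        context.Init();

        xn::UserGenerator user;
        user.Create(context);

        // Track the full set of joints that the middleware supports.
        user.GetSkeletonCap().SetSkeletonProfile(XN_SKEL_PROFILE_ALL);
        context.StartGeneratingAll();

        while (true) {
            context.WaitOneUpdateAll(user);

            XnUserID ids[8];
            XnUInt16 n = 8;
            user.GetUsers(ids, n);
            for (XnUInt16 i = 0; i < n; ++i) {
                if (!user.GetSkeletonCap().IsTracking(ids[i])) continue;
                XnSkeletonJointPosition head;
                user.GetSkeletonCap().GetSkeletonJointPosition(
                    ids[i], XN_SKEL_HEAD, head);
                // Joint positions are reported in millimeters.
                printf("user %u head: (%.0f, %.0f, %.0f) mm\n", ids[i],
                       head.position.X, head.position.Y, head.position.Z);
            }
        }
    }

Mapping a tracked joint such as a hand to on-screen cursor coordinates is one straightforward way to realize the cursor-event activity listed under Section 3.1.4.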

3.2 Advantages & Limitations

The following discusses some of the advantages of using the Kinect in the classroom, along with some cautions on its use.

The Kinect is revolutionary as a depth sensor because of the level of detail of the depth data it can generate in real time. This is achieved without user-attached devices and at significantly lower cost than earlier approaches. Until a few years ago, computer vision techniques for 3D processing on readily available hardware were virtually non-existent, and even as such techniques have developed over the last several years, they are usually computationally expensive and difficult to implement because the 3D reconstruction from 2D sensors is done in software. Inherent in the Kinect's technology is its ability to reduce such computation and implementation effort on the software side.

Some time-of-flight 3D sensors can be, and have been, used in academia in the fields of computer vision and HCI to generate depth data from a scene. While they have the advantage of doing in hardware what would otherwise have to be done in software, these sensors can cost tens of times more than the Kinect while providing depth data at a much lower resolution. Clearly, the availability and affordability of Kinect sensors make it a good learning tool for courses where it will be useful.

If resources limit the number of sensors available to students, OpenNI allows the recording of sensor data streams that can be played back and processed without an actual sensor. These recordings also provide a standard data set for students to process where such a condition is required.
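As one way to set that up, here is a minimal sketch of playing back a recorded OpenNI (.oni) file with the OpenNI 1.x C++ wrapper. It is our illustration; the file name is a placeholder, and error handling is abbreviated.

    #include <XnCppWrapper.h>
    #include <cstdio>

    // Plays back a previously recorded .oni file so depth data can be
    // processed without a physical sensor attached.
    int main()
    {
        xn::Context context;
        if (context.Init() != XN_STATUS_OK) return 1;

        // Opening the recording creates "mock" production nodes that
        // replay the recorded streams as if a sensor were present.
        if (context.OpenFileRecording("class_dataset.oni") != XN_STATUS_OK) {
            printf("Could not open recording.\n");
            return 1;
        }

        // Find the depth node that the recording supplies.
        xn::DepthGenerator depth;
        if (context.FindExistingNode(XN_NODE_TYPE_DEPTH, depth) != XN_STATUS_OK) {
            printf("Recording contains no depth stream.\n");
            return 1;
        }

        context.StartGeneratingAll();
        for (int i = 0; i < 100; ++i) {
            context.WaitOneUpdateAll(depth);
            xn::DepthMetaData md;
            depth.GetMetaData(md);
            printf("frame %u (%ux%u)\n", md.FrameID(), md.XRes(), md.YRes());
        }
        return 0;
    }

Because the playback node exposes the same generator interface as a live sensor, student code written against a recording should run unchanged against real hardware.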

Because the Kinect makes the user the controller, as opposed to having the user physically manipulate an input device, it requires a greater level of involvement on the part of the user or the developer testing it. This can make any learning activity that uses it quite engaging. Graphical toolkits can easily be used to display the processed data, as shown in the included sample applications, which makes it visually engaging as well.

Another advantage of its use in the classroom is the open-source nature of its supporting platforms. In particular, OpenNI, as an industry standard, is stable and designed to be robust. It is very well documented, which is very helpful in self-directed study, a crucial skill for IT students. The libraries and middleware that process the raw data into more useful, application-ready information are already available (e.g., skeleton tracking, object segmentation, gesture and pose recognition, sensor data recording, etc.).

A simple Internet search on Kinect-enabled applications will show scores of applications that were developed soon after the Kinect's release. Even though they are based on the same sensor technology, these applications vary widely in purpose and usage. Having a plethora of explored and unexplored applications provides a great opportunity for students to exercise creative thought and innovation.

Students who have gone through an undergraduate advanced programming course in C/C++ will find it relatively easy to develop applications directly from the project libraries. If students have only basic programming skills, wrapper classes or partial implementations can be written to simplify its use for them. Even just using the sample applications out of the box could be useful in various topics of HCI.

The NITE libraries provide a wide range of functionality that is more than sufficient for a structured course on natural user interaction, and these libraries are well documented. The functionality provided out of the box includes (though this list is by no means complete):

- Recognition of push, steady, swipe, wave, and other gestures (a minimal sketch using one of these detectors appears at the end of this section)
- Skeleton detection and tracking of individual skeleton joints
- Pose detection
- User segmentation and multiple user detection
- Access to depth and video data
- Multiple point tracking
- Various calibration and smoothing functions to enhance recognition
- Sample applications of gesture-controlled interfaces, user segmentation (Figure 1), point tracking (Figure 2), skeleton tracking (Figure 3), and the other functionality previously mentioned

Figure 1. Multiple user segmentation
Figure 2. Multiple point tracker
Figure 3. Skeleton tracking

These and the other standard features included in OpenNI/NITE can be used in learning activities for HCI. While these could very well be sufficient, there are many other open-source and commercial applications that can be used if necessary. The technology behind the Kinect also has the potential to allow significant advances in natural user interaction technology, as evidenced by the wide variety of applications developed from the very beginning of its launch. Students interested in the fields of computer vision and/or natural user interaction will benefit from keeping up with these advances.

While Kinect-enabled technology can be a very helpful aid in learning HCI topics, caution should be taken to keep its use appropriate and balanced with other technologies that are also useful for HCI instruction.
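To illustrate the gesture detectors listed above, the following is a minimal sketch in the style of the NITE 1.x sample code. It is our illustration under the assumption that NITE's session manager and wave detector are used as in those samples; the focus-gesture strings ("Click,Wave" and "RaiseHand") follow the conventions used there.

    #include <XnCppWrapper.h>       // OpenNI
    #include <XnVSessionManager.h>  // NITE
    #include <XnVWaveDetector.h>
    #include <cstdio>

    // Invoked by NITE when a wave gesture is recognized on the tracked hand.
    void XN_CALLBACK_TYPE OnWave(void* /*userCxt*/)
    {
        printf("Wave detected!\n");
    }

    int main()
    {
        xn::Context context;
        context.Init();

        // The session manager watches for a focus gesture and then tracks
        // the gesturing hand as a stream of hand points.
        XnVSessionManager session;
        if (session.Initialize(&context, "Click,Wave", "RaiseHand") != XN_STATUS_OK) {
            printf("Could not initialize NITE session.\n");
            return 1;
        }
        context.StartGeneratingAll();

        // Route hand points from the session into a wave detector.
        XnVWaveDetector wave;
        wave.RegisterWave(NULL, OnWave);
        session.AddListener(&wave);

        while (true) {
            context.WaitAnyUpdateAll();
            session.Update(&context);  // feed new frames to NITE
        }
    }

Swapping the wave detector for a push, swipe, or steady detector follows the same listener pattern, which keeps classroom exercises on different gestures structurally identical.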

4. CONCLUSION

The Microsoft Kinect sensor and its supporting development platforms provide significant learning opportunities in various topics of Human-Computer Interaction. A number of such platforms are currently available; in this paper, we evaluated OpenNI. OpenNI and its libraries can be very effective, robust, and flexible in providing natural user interface functionality, which can be used to give students hands-on experience with this gesture-based, natural user interaction technology. Since gesture-based interaction technologies are becoming a standard part of commercial systems, IT students will benefit from integrating this technology into their education, such as in HCI courses, in order to learn how to keep up with advancements in this exciting technology.

5. REFERENCES

[1] Lunt, B., Ekstrom, J., Reichgelt, H., et al. IT 2008. Communications of the ACM 53, 12 (2010), 133.
[2] Birmingham, H. P. Human Factors in Electronics – Historical Sketch. Proceedings of the IRE 50, 5 (1962), 1116–1117.
[3] Sanders, M. S. and McCormick, E. J. Human Factors in Engineering and Design. McGraw-Hill, New York, 1987.
[4] Hornbuckle, G. D. The Computer Graphics User/Machine Interface. IEEE Transactions on Human Factors in Electronics HFE-8, 1 (1967), 17–20.
[5] Davis, M. R. and Ellis, T. O. The RAND tablet: a man-machine graphical communication device. In Proceedings of the October 27–29, 1964, Fall Joint Computer Conference, Part I (San Francisco, CA, 1964). ACM.
[6] Lea, W. Establishing the value of voice communication with computers. IEEE Transactions on Audio and Electroacoustics 16, 2 (1968), 184–197.
[7] Saffer, D. Designing Gestural Interfaces. O'Reilly Media, Sebastopol, CA, 2008.
[8] PrimeSense Ltd. The PrimeSensor Reference Design 1.08. http://www.primesense.com/files/FMF_2.PDF (last accessed April 2011).
[9] Zalevsky, Z., Shpunt, A., Maizels, A., et al. Method and System for Object Reconstruction. 2006.
[10] OpenKinect. OpenKinect Main Page. http://openkinect.org/ (last accessed April 2011).
[11] OpenNI. OpenNI. http://openni.org/ (last accessed April 2011).
[12] Code Laboratories. About: CL NUI Platform. http://codelaboratories.com/kb/nui (last accessed April 2011).
[13] Adafruit Industries. The Open Kinect project - THE OK PRIZE - get $3,000 bounty for Kinect for Xbox 360 open source drivers. http://www.adafruit.com/blog/2010/11/04/the-openkinect-project-the-ok-prize-get-1000-bounty-for-kinect-for-xbox360-open-source-drivers/ (last accessed February 2011).
[14] Giles, J. Inside the Race to Hack the Kinect. New Scientist, 2010. http://www.newscientist.com/article/dn19762-inside-the-race-to-hack-the-kinect.html?full=true
[15] OpenKinect. OpenKinect History. http://openkinect.org/wiki/History (last accessed April 2011).
[16] OpenKinect. OpenKinect Policies. http://openkinect.org/wiki/Policies (last accessed April 2011).
[17] OpenNI. About. http://openni.org/about (last accessed May 2011).
[18] OpenNI. OpenNI User Guide. 2011. http://openni.org/images/stories/pdf/OpenNI_UserGuide_v3.pdf
[19] Knies, R. Academics, Enthusiasts to Get Kinect SDK. Microsoft Research. http://research.microsoft.com/en-us/news/features/kinectforwindowssdk-022111.aspx (last accessed April 2011).
[20] Microsoft Research. Kinect for Windows SDK beta. http://research.microsoft.com/en-us/um/redmond/projects/kinectsdk/ (last accessed May 2011).
