Smart Classroom - An Intelligent Environment for Tele-Education
Weikai Xie¹, Yuanchun Shi¹, Guanyou Xu¹, Dong Xie²

¹ Dept. of Computer Science and Technology, Tsinghua Univ., Beijing, China
[email protected], {shiyc, xgy-dcs}@tsinghua.edu.cn
² IBM China Research Lab, Beijing, China
[email protected]
Abstract. The Smart Classroom project explores the challenges and potential of the Intelligent Environment as a new human-computer interaction paradigm. By constructing an intelligent classroom for tele-education, we try to give teachers the same experience in a tele-education lesson as in an ordinary classroom. The Smart Classroom can actively observe, listen to and serve the teacher: the teacher can write on a wall-size media-board with a bare hand, or use speech and gestures to conduct class discussions involving the distant students. This paper discusses the advantages and the main underlying technologies of this system.
1 Introduction
We are steadily moving into a new age of information technology known as ubiquitous computing (pervasive computing), in which computation power and network connectivity will be embedded and available in our environments, on our bodies, and in numerous handheld information appliances [1]. The human-computer interaction paradigm we currently use on desktop computers will not be sufficient there [2]. Instead of operating individual computers and dispatching many trivial commands to separate applications through keyboard and mouse, we should be able to interact with all related computation devices as a whole, and express our intended tasks at a high level of abstraction, in ways as natural as those we use to communicate with other people in everyday life.
The research on Intelligent Environments is motivated by exactly this vision. Generally speaking, an Intelligent Environment is an augmented living or working environment that can actively watch and listen to its occupants, recognize their requirements and attentively provide services for them. The occupants can use ordinary human interaction methods such as gesture and voice to interact with the computer system embedded in the environment. Research in this field was brought into the mainstream in the late 1990s by several first-class research groups, such as the AI Lab and the Media Lab at MIT, Xerox PARC, IBM and Microsoft. Currently there are
dozens of related projects carried out by research groups all over the world. The most famous are the Intelligent Room project from the MIT AI Lab [3, 4], the Aware Home project from GIT [5] and the EasyLiving project from Microsoft [6, 7].

H.-Y. Shum, M. Liao, and S.-F. Chang (Eds.): PCM 2001, LNCS 2195, pp. 662–668, 2001.
© Springer-Verlag Berlin Heidelberg 2001
The Smart Classroom project in our group is also a research effort on Intelligent Environments. It demonstrates an intelligent classroom for teachers involved in tele-education, in which teachers can have the same experience as in a real classroom.
This paper is organized as follows. In Section 2 we present the scenario of the Smart Classroom. In Section 3 we discuss the main technologies involved in implementing the scenario. Finally, we give a summary and outline future work.
2 The Scenario of the Smart Classroom

Almost all tele-education systems developed today are desktop-computing based: the teacher is required to sit down in front of a desktop computer and use the keyboard and mouse to give a tele-education class. This experience is very different from teaching in an ordinary classroom, where the teacher can write on the blackboard and use speech, gesture and other body language to draw the students into class discussions. The different experience often makes the teacher feel uncomfortable and reduces the efficiency of the course as well.
[Figure: MediaBoard]
The scenario might seem to have no significant differences from many other whiteboard-based tele-education systems. However, the magic of the Smart Classroom is the way teachers use the system: teachers are no longer tied to a desktop computer, nor to the cumbersome keyboard and mouse. Making annotations on the courseware is just as easy as writing on an ordinary classroom blackboard, i.e. the teacher only needs to move his finger, and the stroke will be displayed on the media-board, overlaid on the displayed courseware. The teacher can also point to an object displayed on the media-board and say a predefined command to do what he wants. For example, to follow a hyperlink in the courseware, the teacher just needs to point at the hyperlink and tell the Smart Classroom to "follow this link". The way to let a remote student speak in class is just as intuitive, and can also be done by speech. For example, suppose the teacher wants a student named Peter to answer a question. He can simply say, "Peter, could you answer this question?" Whenever a student requests to speak, the corresponding image will start to blink to alert the teacher.
Actually, this system blurs the border between ordinary classroom education and tele-education: the teacher can give a class to the local students in the Smart Classroom and to the remotely attending students at the same time.
3 Main Technologies

The Smart Classroom, just like many other similar Intelligent Environments, is an assembly of many different kinds of hardware and software modules, such as projectors, cameras, sensors, a face recognition module, a speech recognition module and an eye-gaze recognition module. It is impractical to install all these components on one computer, due to the limited computation power and the burdensome maintenance requirements. Thus, a distributed computing platform is a must for an Intelligent Environment.
There are a handful of distributed computing platforms from commercial organizations and research groups. They can be classified into two types according to their structural model. One type is based on the distributed-component model, such as CORBA from the OMG and DCOM from Microsoft. The other is the multi-agent system, such as OAA from SRI and Aglets from IBM. Although the distributed-component model is currently more prevalent and mature than the multi-agent model, we consider the latter more suitable for Intelligent Environment systems, including our Smart Classroom, for the reasons explained below.
The distributed-component model implies monolithic control logic and a tightly coupled system structure. Put another way, the software in a distributed-component system is composed of a central logic and several peripheral objects that offer services to the central logic. There is only one thread of execution in the system, and the objects run only when invoked by the central logic. By contrast, the multi-agent model, which has been invented and used for years in the AI domain, implies a loosely coupled system structure. Under this model, a system is divided into many individual autonomous software modules called agents, each with its own executing process. Each agent has limited capabilities and limited knowledge of the functions of the whole system, but through communication and cooperation the community of agents exhibits a high degree of intelligence and can achieve very complex functions.
A loosely coupled system is far more suitable than a tightly coupled one for an Intelligent Environment:
1) The scenarios of an Intelligent Environment are usually very complex, so developing a central logic for it is very difficult, even impractical.
2) The scenarios and configurations of an Intelligent Environment are highly dynamic. New functions are added and old modules are revised, and these changes happen frequently. A monolithic system structure is very inflexible in this situation, because even a trivial modification requires the whole system to be shut down and all modules to be re-linked.
3) The central logic is likely to become the bottleneck of the system.
After a survey of several multi-agent systems, we finally adopted OAA (Open Agent Architecture), a publicly available multi-agent system, as the software platform for the Smart Classroom. It was developed by SRI and has been used by many research groups [10]. (We fixed some errors in the implementation provided by SRI to make it more robust.)
All software modules in the Smart Classroom are implemented as OAA agents. At start-up, they register their capabilities with a central coordinating logic called the "facilitator". When they want services from other agents, they just send a message encoded in the Interagent Communication Language (ICL) to the facilitator, and need not know which agent actually provides the service. The ICL is essentially an extension of the Prolog language.
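The facilitator pattern described above can be sketched in a few lines. The following is a toy illustration only, not the real OAA API; the agent, goal and message names are invented for the example:

```python
# Toy sketch of the facilitator pattern: agents register capabilities with a
# central facilitator and request services by goal name, without knowing
# which agent will actually serve the request.

class Facilitator:
    def __init__(self):
        self.capabilities = {}  # goal name -> handler of the providing agent

    def register(self, goal, handler):
        self.capabilities[goal] = handler

    def solve(self, goal, *args):
        # Route the request to whichever agent registered this capability.
        return self.capabilities[goal](*args)

class SpeechAgent:
    """A stand-in for a recognition agent; registers itself at start-up."""
    def __init__(self, facilitator):
        facilitator.register("recognize_speech", self.recognize)

    def recognize(self, audio):
        return "follow this link"  # stubbed recognition result

fac = Facilitator()
SpeechAgent(fac)
# Another agent asks for the service without knowing who provides it.
print(fac.solve("recognize_speech", b"raw-audio"))
```

In the real system the facilitator additionally matches goals expressed in the Prolog-based ICL rather than plain strings, but the decoupling idea is the same.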
People use multiple modalities, such as speech, pointing and gesture, to communicate with each other in everyday life. Multi-modal interaction capability is a fundamental requirement for an Intelligent Environment, since any single modality is often semantically incomplete. For example, when someone says "move it to the right", without recognition of the pointing gesture we cannot tell which object the speaker is referring to. Another benefit of multi-modal processing is that information from other modalities often helps to improve the recognition accuracy of a single modality. For example, in a noisy environment a not-so-smart speech recognition algorithm may have difficulty deciding whether the user said "move to right" or "move to left", but after consulting the information from the hand gesture recognition module, the system can eventually make the correct choice. This also happens in the Smart Classroom: in the scenario we designed, the teacher can use speech and hand gestures to make annotations on the media-board, to manage the content on the media-board, and to pass the floor to a remote student.
The formal architecture we use to address this issue is based on the unification-based multimodal parsing presented in [9]. It essentially treats the multi-modality integration process as a kind of language parsing: each separate action in a single modality is considered a phrase structure in a multi-modal language grammar, and these structures are grouped and combined to generate new, higher-level phrase structures. The process is repeated until a semantically complete sentence is found. The process can also help to correct a wrong recognition result in one modality: if a phrase structure cannot be grouped with any other phrase structure according to the grammar, it is regarded as a wrong recognition result and abandoned.
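The core of unification-based integration can be illustrated with a minimal sketch (hypothetical feature names, not the system's actual code): each modality yields a partial feature structure, and two structures merge only if none of their features conflict.

```python
# Toy unification of partial feature structures from two modalities.
# A command "sentence" is complete once action, object and direction are known.

def unify(a, b):
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None  # conflicting features: structures cannot be grouped
        merged[key] = value
    return merged

def complete(struct):
    return all(k in struct for k in ("action", "object", "direction"))

speech = {"action": "move", "direction": "right"}   # "move it to the right"
gesture = {"object": "slide-3"}                     # pointing at an object

command = unify(speech, gesture)
if command and complete(command):
    print(command)  # fully specified multimodal command
```

A structure that conflicts with every candidate (e.g. a misrecognized direction contradicted by the gesture) unifies with nothing and is discarded, which is exactly the error-correction behavior described above.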
Several cameras installed in the room, together with a skin-color-consistency based algorithm, are used to recognize the 3D movement parameters of the teacher's hand, as well as some simple actions of the teacher's palm, such as open, close and push [14]. These modules function as a virtual mouse for the teacher, i.e. the movement of the teacher's hand is interpreted as dragging the pointer on the screen, and the actions of his palm are interpreted as clicks of the mouse button. In this way, the teacher can easily make annotations on the media-board or select a remote student on the student-board with nothing but his hand.
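One plausible way to realize such a virtual mouse (a sketch under our own assumptions; event names and the palm-state encoding are invented, not taken from the system) is a small state machine that turns tracked hand samples into pointer events:

```python
# Map tracked hand states to pointer events: hand position drives the
# pointer, and open -> closed palm transitions act as button presses.

def hand_to_pointer_events(samples):
    """samples: iterable of (x, y, palm) tuples, palm in {'open', 'closed'}."""
    events, prev_palm = [], "open"
    for x, y, palm in samples:
        events.append(("move", x, y))
        if prev_palm == "open" and palm == "closed":
            events.append(("button_down", x, y))   # palm closed: press
        elif prev_palm == "closed" and palm == "open":
            events.append(("button_up", x, y))     # palm opened: release
        prev_palm = palm
    return events

# Closing the palm at (12, 11), dragging to (30, 25), then releasing
# corresponds to drawing one annotation stroke on the media-board.
track = [(10, 10, "open"), (12, 11, "closed"), (30, 25, "closed"), (30, 25, "open")]
for ev in hand_to_pointer_events(track):
    print(ev)
```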
Ever since we started to design the system structure, we have kept in mind that it should be flexible enough to support gradually extending the scenario of the Smart Classroom. The speech recognition framework we developed is one example.
At the core of the framework is an enhanced speech grammar, which not only describes the syntax of legal phrases but also marks the semantic effect of each phrase when recognized. Specifically, the semantic effect is an Interagent Communication Language message that triggers the desired actions of other agents in the system. Using this grammar, other agents can add vocabularies to the speech recognition agent at runtime. Moreover, vocabularies can be dynamically enabled and disabled by other agents according to their knowledge of the current context, in order to keep the effective vocabulary at a minimum size, which is very important for improving recognition speed and accuracy.
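The enhanced-grammar idea can be sketched as follows. This is a hypothetical illustration, not the framework's actual interface; the vocabulary name and the ICL-style message string are invented for the example:

```python
# Sketch of a speech grammar whose phrases carry the agent message they
# should trigger, with vocabularies that can be enabled or disabled at
# runtime to keep the active grammar small.

class SpeechGrammar:
    def __init__(self):
        self.rules = {}      # vocabulary name -> {phrase: message}
        self.enabled = set()

    def add_vocabulary(self, name, rules):
        self.rules[name] = rules          # added at runtime by another agent

    def enable(self, name):
        self.enabled.add(name)

    def disable(self, name):
        self.enabled.discard(name)

    def interpret(self, phrase):
        # Only phrases from currently enabled vocabularies are recognized.
        for name in self.enabled:
            if phrase in self.rules[name]:
                return self.rules[name][phrase]
        return None

g = SpeechGrammar()
# Hypothetical ICL-style message; the real message format is Prolog-based.
g.add_vocabulary("board", {"follow this link": "solve(follow_link(Current))"})
g.enable("board")
print(g.interpret("follow this link"))  # recognized while "board" is enabled
g.disable("board")
print(g.interpret("follow this link"))  # no longer recognized
```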
The tele-education system behind the scenes is based on the Same View system from another group in our lab [11, 12, 13]. The system consists of three layers. The topmost layer is a multimedia whiteboard application we call the media-board, together with an associated floor-control mechanism. As mentioned above, one interesting feature of the media-board is that it can record all the actions performed on it, which can be used to aid the creation of courseware. The middle layer is an adaptive content-transformation layer, which can automatically re-author or transform the content sent to each user according to the user's bandwidth and device capability [11]. The media-board uses this feature to ensure that students at different sites get the contents of the media-board at the maximum quality their different network and hardware conditions allow. The lowest layer is a reliable multicast transport layer called Totally Ordered Reliable Multicast (TORM), which can be used in a WAN environment where sub-networks capable and incapable of multicast coexist [12, 13]. The media-board uses this layer to improve its scalability across large networks such as the Internet.
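The TORM protocol itself is described in [12, 13]; as background, one common way to obtain a total order over multicast messages, a central sequencer that stamps every message with a global sequence number, can be sketched as follows (a toy model, not the actual TORM design):

```python
# Toy sequencer-based total ordering: every message gets a global sequence
# number, and receivers deliver strictly in sequence-number order even when
# the network reorders messages.

class Sequencer:
    def __init__(self):
        self.next_seq = 0

    def stamp(self, message):
        seq, self.next_seq = self.next_seq, self.next_seq + 1
        return (seq, message)

class Receiver:
    def __init__(self):
        self.expected, self.pending, self.delivered = 0, {}, []

    def receive(self, stamped):
        seq, message = stamped
        self.pending[seq] = message           # may arrive out of order
        while self.expected in self.pending:  # deliver only in global order
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1

sequencer = Sequencer()
a, b, c = (sequencer.stamp(m) for m in ("stroke-1", "stroke-2", "stroke-3"))
r = Receiver()
for stamped in (c, a, b):  # the network delivers the strokes out of order
    r.receive(stamped)
print(r.delivered)         # strokes come out in the original total order
```

For a shared whiteboard this guarantee matters: every site replays the annotation strokes in the same order, so all copies of the media-board stay consistent.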
The system can also automatically resume the context (such as the content on the media-board) where the teacher stopped in the last class.
4 Conclusion

We believe the Intelligent Environment will be the right metaphor for people to interact with computer systems in the ubiquitous computing age. The Smart Classroom is our test-bed for research on Intelligent Environments, as well as an illustration of their application.
References
1. Weiser, M.: The Computer for the 21st Century. Scientific American, September (1991) 94–100
2. Weiser, M.: The World Is Not a Desktop. Interactions, January (1994) 7–8
3. Coen, M.: Design Principles for Intelligent Environments. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, Madison, Wisconsin (1998)
4. Coen, M.: The Future of Human-Computer Interaction, or How I Learned to Stop Worrying and Love My Intelligent Room. IEEE Intelligent Systems, March/April (1999)
5. Kidd, C.D., Orr, R.J., Abowd, G.D., et al.: The Aware Home: A Living Laboratory for Ubiquitous Computing Research. In Proceedings of the Second International Workshop on Cooperative Buildings, October (1999)
6. Shafer, S., et al.: The New EasyLiving Project at Microsoft Research. In Proceedings of the 1998 DARPA/NIST Smart Spaces Workshop, July (1998) 127–130
7. Brumitt, B.L., Meyers, B., Krumm, J., et al.: EasyLiving: Technologies for Intelligent Environments. In Handheld and Ubiquitous Computing, 2nd Intl. Symposium, September (2000) 12–27
8. Abowd, G.D.: Classroom 2000: An Experiment with the Instrumentation of a Living Educational Environment. IBM Systems Journal, Vol. 38, No. 4
9. Johnston, M.: Unification-based Multimodal Parsing. In Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics, ACL Press, August (1998) 624–630
10. OAA web site: http://www.ai.sri.com/~oaa/
11. Liao, C.Y., Shi, Y.C., Xu, G.Y.: AMTM – An Adaptive Multimedia Transport Model. In Proceedings of the SPIE International Symposium on Voice, Video and Data Communications, Boston, Nov. (2000)
12. Pei, Y.Z., Liu, Y., Shi, Y.C., et al.: Totally Ordered Reliable Multicast for Whiteboard Application. In Proceedings of the 4th International Workshop on CSCW in Design, Compiègne, France (1999)
13. Tan, K., Shi, Y.C., Xu, G.Y.: A Practical Semantic Reliable Multicast Architecture. In Proceedings of the Third International Conference on Multimodal Interfaces, Beijing, China (2000)
14. Ren, H.B., Zhu, Y.X., Xu, G.Y., et al.: Spatio-temporal Appearance Modeling and Recognition of Continuous Dynamic Hand Gestures. Chinese Journal of Computers (in Chinese), Vol. 23, No. 8, Aug. (2000) 824–828