An Efficient Text Input Method For Pen-Based Computers
Toshiyuki Masui
Sony Computer Science Laboratory Inc.
3-14-13 Higashi-Gotanda
Shinagawa, Tokyo 141-0022, Japan
+81-3-5448-4380
Published in: Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’98) (April 1998), ACM Press, pp. 328–335.

ABSTRACT
Pen-based computing has not yet taken off, partly because of the lack of fast and easy text input methods. The situation is even worse for people using East Asian languages, where thousands of characters are used and handwriting recognition is extremely difficult. In this paper, we propose a new fast text input method for pen-based computers, where text is not composed by entering characters one by one, but by selecting words from a menu of candidates created by filtering the dictionary and predicting from context. Using our approach, users can enter Japanese text more than twice as fast as recognition-based and other existing text input methods. User studies and detailed analysis of the method are also given.

KEYWORDS: Input devices, Pen-based input, Predictive interface, Hand-held devices, International interfaces, POBox

INTRODUCTION
Although a variety of pen-based computers are available these days, they are not as widely used as keyboard-based computers, partly because entering text is much harder on pen-based machines. Traditionally, handwriting recognition and the soft keyboard (a virtual keyboard displayed on the tablet of a pen computer) have been the main techniques for entering characters on pen-based computers, although other techniques have also been proposed[4][6]. However, entering text with any of these techniques takes much longer than with a standard keyboard.

The situation is worse for East Asian languages such as Chinese and Japanese. These, unlike European languages, have thousands of character faces. Even with a keyboard, it is not easy to enter a character. A variety of techniques for entering text into computers have been investigated. The most widely-used Japanese input technique is “Roman-Kanji conversion” (RKC), in which a user specifies the pronunciation of a word with an ASCII keyboard, and the system shows the user a word with the specified pronunciation (see footnote 1). If the word was not the one that the user intended to use, the user presses a “next candidate key” until the correct word appears as the candidate.

Footnote 1: Japanese characters consist of two character sets. Kanji characters, imported from China, contain both meaning and pronunciation, while Kana characters only represent pronunciation.

On almost all the pen-based computers available in Japan, either RKC or handwriting recognition is supported. Text input is slow and tiring using either of the techniques, for the following reasons. Specifying the pronunciation of every input word using a soft keyboard takes a lot of time, and the user must convert the pronunciation to the desired Kanji strings with extra keystrokes. Handwriting recognition has more problems. First, the recognizer has to distinguish between thousands of characters, often making errors. Many of the characters in the character sets have similar shapes, so it is inherently difficult to make recognition reliable. Second, in many cases, users do not remember the shape or the stroke order of Kanji characters, even when they have no problem reading them. Finally, writing many characters with many strokes on a tablet is very tiring. With these difficulties, it is believed to be difficult to enter Japanese text faster than 30 characters a minute on pen-based computers, which is several times slower than using keyboards.

We have developed a new pen-based text input method called POBox (Pen-Operation Based On eXample), where users can efficiently enter text in any language, using menus, word prediction and approximate pattern matching. The remainder of this paper describes the details of POBox.

STRATEGIES FOR RAPID TEXT ENTRY
There is a big difference between the speed of typing on keyboards and pointing to characters on soft keyboards of pen-based computers. Computer users can easily type more than five characters per second, while it is very difficult to touch three character keys per second accurately on the soft keyboard of a pen-based computer. In contrast, the speed of selecting an item from a list is faster with a pointing device, and many keyboard-oriented text editors (e.g. Emacs) now have mouse interfaces. For this reason, forcing the user to enter many characters should be avoided on pen-based computers, while a better approach should allow the user to select a word from a list of candidates, in a minimum number of penstrokes. We took the following approach.
Figure 1: Initial display.
Figure 4: Selecting “first” after releasing the pen from the tablet.
Figure 8: Selecting the “E” key.
Figure 9: Moving to the “N” key and selecting “entering”.
Figure 11: After specifying “comple”.
Figure 12: After specifying “cplm”.

Figure 1 shows the startup display of POBox. When the user touches the “F” key, the display changes to Figure 2, showing the frequently used words that start with “F” in a pulldown menu. Since the word “first” is a frequently used word and is found in the menu, the user can drag the pen and highlight the word “first” as shown in Figure 3, and then take the pen off the tablet to complete the selection. Alternatively, if the user does not make a selection from the pulldown menu of Figure 3, he can choose the desired word from the popup menu as shown in Figure 4.

After selecting “first”, the display changes to Figure 5. In the menu at the bottom, the words that often come after “first” are listed in order of frequency. The word combination “first the” appears 27 times in the CHI’95 CD-ROM, “first and” and “first time” appear 20 times, etc. Since the next word, “we,” happens to be in the list because “first we” appears 13 times in the CD-ROM, the user can directly select “we” by touching it in the menu. After selecting “we”, the display changes to Figure 6. In this case, “show” is not found in the menu, but it can be selected from the pulldown menu by touching the “S” key as shown in Figure 7.

After this, “our”, “technique” and “for” can be selected in a similar manner. Touching the “E” key does not make the system display the next intended word (“entering”) as shown in Figure 8, but touching the “N” key next narrows the search space of the dictionary and “entering” then appears in the menu for the selection (Figure 9).

From start to finish, the user only had to tap the tablet 15 times to enter the phrase “First, we show our technique for entering.” Notice that the user made no spelling errors with this method, since all the input words were taken from the dictionary.

Using Approximate String Matching
With the approximate string matching feature, even when the user does not specify the correct spelling of a word, there is a good chance of finding the desired word among the candidates. Also, the user can specify only part of the spelling to find the desired word. For example, if the user does not remember the spelling of “Mediterranean,” he can specify “mdtrn” to see the list of words which are close to the pattern and then can find the right word in the list (Figure 10).

The same technique can be used to enter a word that has a common prefix. If the user tries to enter “complementary” and specifies “comple,” he still cannot find the word in the candidates in Figure 11, since there are many commonly used words that begin with “comple.” Instead, the user can specify the characters that better represent the word. As shown in Figure 12, the user can obtain “complementary” by specifying “cplm,” although other patterns such as “cpmt” will also work.

Entering Japanese Text
With POBox, users can enter Japanese text much more easily than RKC and handwriting recognition systems. We show the example by using “

Figure 13: Initial display in Japanese input mode.

Word        Spelling/Pronunciation
the         THE
of          OF
to          TO
and         AND
...         ...

Context     Word    Spelling/Pronunciation
of          the     THE
in          the     THE
to          the     THE
...         ...     ...
as well     as      AS
into        the     THE
...         ...     ...
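
The word and phrase dictionaries above suggest how a candidate menu could be produced. The following is a minimal sketch under stated assumptions, not POBox's actual implementation: the function names, the menu limit, and every dictionary entry beyond the four words listed in the table are hypothetical, and approximate matching is replaced here by a plain subsequence test, which is a simpler stand-in for the approximate pattern matching the paper refers to. The counts for “first” are the CHI’95 CD-ROM counts quoted in the walkthrough.

```python
# Word dictionary: (word, spelling key), most frequent first.
# Only the first four entries come from the table; the rest are hypothetical.
WORD_DICT = [
    ("the", "THE"), ("of", "OF"), ("to", "TO"), ("and", "AND"),
    ("first", "FIRST"), ("for", "FOR"), ("entering", "ENTERING"),
    ("complementary", "COMPLEMENTARY"), ("Mediterranean", "MEDITERRANEAN"),
]

# Phrase dictionary: context word -> counts of the words that follow it.
# The counts for "first" are the ones quoted in the text.
PHRASE_COUNTS = {"first": {"the": 27, "and": 20, "time": 20, "we": 13}}


def by_prefix(pattern, limit=10):
    """Words whose spelling key starts with the pattern, in frequency order."""
    key_prefix = pattern.upper()
    return [w for w, key in WORD_DICT if key.startswith(key_prefix)][:limit]


def is_subsequence(pattern, key):
    """True if the characters of pattern appear in key in the same order."""
    remaining = iter(key)
    return all(ch in remaining for ch in pattern)


def approximate(pattern, limit=10):
    """Stand-in for approximate matching: subsequence test on the key."""
    p = pattern.upper()
    return [w for w, key in WORD_DICT if is_subsequence(p, key)][:limit]


def predict_next(context, limit=10):
    """Words that followed the context word, most frequent first."""
    counts = PHRASE_COUNTS.get(context, {})
    return sorted(counts, key=counts.get, reverse=True)[:limit]


def candidates(context, pattern, limit=10):
    """Menu contents: prediction when nothing is typed, else filtering."""
    if not pattern:
        return predict_next(context, limit)
    return by_prefix(pattern, limit) or approximate(pattern, limit)


print(candidates("first", ""))    # prediction from context
print(candidates("", "f"))        # prefix filtering
print(candidates("", "mdtrn"))    # fallback to the subsequence match
```

Falling back to the looser match only when the prefix search returns nothing is one possible design choice; it keeps exact prefixes at the top of the menu while still tolerating abbreviated patterns such as “cplm”.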

Footnote 5: Zaurus was the most popular PDA in Japan at the time this experiment was performed.

Figure 23: Probability of finding the desired word in the menu (English text).

[Figure: hit ratio plots with curves for i = 0 to 5 over menu sizes from 1 to 100. Panels: (a) Without prediction, (b) With prediction, (c) With prediction and dictionary adaptation. Legend: i: number of penstrokes; P(i,n): hit ratio before showing the menu.]

[Figure: hit ratio plot with curves for i = 0 to 5; x-axis n: number of candidates in the menu (1 to 100).]

Japanese text input, most of the desired words can be found in the menu after two or three penstrokes, while more than four penstrokes are required using ordinary Kanji-conversion methods.

Dynamic Analysis
A more accurate hit ratio for POBox menus can be calculated by simulating the prediction and adaptation mechanisms of POBox with real English text. Figure 25(a) shows the hit ratio calculated by using all the texts in the CHI’95 CD-ROM. The hit ratio with the prediction from context feature is shown in Figure 25(b), and the hit ratio with prediction and dictionary adaptation is shown in Figure 25(c). Prediction from context is effective for increasing the hit ratio, especially when no input is specified for selecting words (i = 0). In this case, POBox displays the correct word among its 10 candidates 38% of the time, whereas this number drops to 26% when prediction is not used.

Input Speed Estimation
Text input speed can also be estimated by dynamic analysis if the character input speed using the soft keyboard and the speed of menu selection are known.

From the dynamic analysis shown above, the hit ratio P(i, n) of finding a word in the menus with n items after selecting i characters is known. If it takes Tk for a user to input one character and it takes Ts(n) to select an item from the menu with n items, the average total time for entering a word, T(i, n), can be calculated by the following formula:

\[
\begin{aligned}
T(i, n) &= T_s(n) + (T_k + T_s(n))(1 - P(0, n)) + (T_k + T_s(n))(1 - P(1, n)) + \cdots \\
        &= T_s(n) + \sum_{j=0}^{\infty} (T_k + T_s(n))(1 - P(j, n))
\end{aligned}
\]

If the user starts using the menu after entering at least i characters, the average total time T(i) is calculated by the following formula:

We assume that Ts(n) is proportional to n and Tk is a constant value, since POBox shows a menu of candidates according to the probability of the words, and the user cannot tell the ordering of the words in the menu beforehand. We calculated T(i, n) using P(i, n) for the two cases of slow and fast character input.

Slow Character Input: Figure 26 shows the calculated average time for entering a word where character input speed is slow and Ts(n) can be estimated to be n/10 and Tk is the constant 1. In this case, without prediction, the minimum text input time is obtained when i = 1 and n = 3, which means using a three-entry menu after one penstroke without a menu. With prediction, the input time is minimized when i = 0 and n = 3, which means using a three-entry menu from the start. This is because frequently-used words are displayed at the top of the menu even before the user specifies characters for filtering the dictionary. The estimated average time for entering words is smaller with prediction than without prediction.

Faster Character Input: Figure 27 shows the average time for entering a word, where character input speed is faster than the previous example and Ts(n) is estimated to be n/3. In this case, minimum input time is obtained when i = 0 and n = 1, which means predicting one word every time after entering a character.
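
The expected-time formula for the case where the menu is consulted from the start can be checked numerically. The sketch below evaluates Ts(n) plus the sum of (Tk + Ts(n))(1 - P(j, n)) for a fixed menu size n = 10 under the slow-input setting (Tk = 1, Ts(n) = n/10). It is only an illustration: the function name is hypothetical, the sum is truncated at the end of the supplied list (the word is assumed to be found by then), and only the hit ratios for j = 0 (0.38 with prediction, 0.26 without) come from the text. All other hit ratios are made-up placeholders, so the printed times are not the paper's results.

```python
def expected_entry_time(hit_ratios, t_k, t_s):
    """Average time to enter one word when the menu is used from the start.

    hit_ratios[j] is P(j, n) for one fixed menu size n. The sum stops at the
    end of the list, i.e. the word is assumed to have been found by then.
    """
    return t_s + sum((t_k + t_s) * (1.0 - p) for p in hit_ratios)


# Slow character input as described in the text: Tk = 1, Ts(n) = n / 10.
N = 10
T_K = 1.0
T_S = N / 10

# P(0, 10) is taken from the text; the later entries are hypothetical.
WITH_PREDICTION = [0.38, 0.70, 0.90, 1.00]
WITHOUT_PREDICTION = [0.26, 0.60, 0.85, 1.00]

t_with = expected_entry_time(WITH_PREDICTION, T_K, T_S)
t_without = expected_entry_time(WITHOUT_PREDICTION, T_K, T_S)
print(f"with prediction:    {t_with:.2f}")
print(f"without prediction: {t_without:.2f}")
```

Because every term (Tk + Ts(n))(1 - P(j, n)) shrinks as the hit ratio grows, any mechanism that raises P(j, n) at small j, such as prediction from context, lowers the estimate directly.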

[Plot: T(i,n) versus n for i = 0 to 5. Panels: Without prediction, With prediction.]
Figure 26: Text input speed estimation with slow character input. (Tk = 1, Ts(n) = n/10)

[Plot: T(i,n) versus n for i = 0 to 5. Panels: Without prediction, With prediction.]
Figure 27: Text input speed estimation with faster character input. (Tk = 1, Ts(n) = n/3)

In this manner, the fastest method for entering text depends on the relation between Tk/Ts(n) and P(i, n). Roughly speaking, when Tk/Ts(n) is very small (character input is very fast) as with a keyboard, the fastest way of entering text is entering characters without the use of menus. On the other hand, if Tk/Ts(n) is very large (character input is very slow), using menus with many entries is faster. The two cases shown in Figure 26 and Figure 27 are between these extremes, and POBox supports the entire spectrum.

Related Work
Darragh’s Reactive Keyboard[2] predicts the user’s next keystrokes from the statistical information gathered by the user’s previous actions and shows the predicted data for the selection. Unfortunately, the Reactive Keyboard is not usually useful for experienced computer users, since they can type much faster than selecting candidates from the menu. On pen-based computers, however, people cannot enter characters as fast as with keyboards, thus predictive methods like POBox and the Reactive Keyboard are useful. By integrating existing common GUI tools with the prediction mechanism, POBox can greatly reduce the time for text input on pen-based computers, especially for Japanese and other languages where direct text input is not possible.

Greenberg[5] argued that it is convenient to put frequently used tools close at hand, and showed that this technique is useful for issuing text commands in his WORKBENCH system. POBox resembles the WORKBENCH system in that both frequently used words and recently used words always appear close at hand at the top of the candidate list for quick selection.

Fukushima et al.[3] showed that input word prediction can reduce the search space and the number of penstrokes for handwriting recognition of Japanese texts. Although they reported that their prediction system could reduce input penstrokes by 10 to 40 percent, problems with handwriting recognition still remain and the text input speed does not increase dramatically.

CONCLUSIONS
We developed a new fast text input method for pen-based computers based on dynamic query of the dictionary and word prediction from context. With our method, the speed of text input on pen-based computers greatly increases and for the first time, pen computing becomes a viable alternative to keyboard-based input methods.

ACKNOWLEDGEMENTS
We would like to thank Jun Rekimoto and Jeremy Cooperstock for giving us many valuable suggestions. We also thank many POBox users who actually used it, sent comments to us, and performed the evaluation tests.

REFERENCES
1. Baeza-Yates, R. A., and Gonnet, G. H. A new approach to text searching. Communications of the ACM 35, 10 (October 1992), 74–82.
2. Darragh, J. J., Witten, I. H., and James, M. L. The Reactive Keyboard: A predictive typing aid. IEEE Computer 23, 11 (November 1990), 41–49.
3. Fukushima, T., and Yamada, H. A predictive pen-based Japanese text input method and its evaluation. Transactions of Information Processing Society of Japan 37, 1 (January 1996), 23–30. In Japanese.
4. Goldberg, D., and Richardson, C. Touch-typing with a stylus. In Proceedings of ACM INTERCHI’93 Conference on Human Factors in Computing Systems (CHI’93) (April 1993), Addison-Wesley, pp. 80–87.
5. Greenberg, S. The Computer User as Toolsmith. Cambridge Series on Human-Computer Interaction. Cambridge University Press, March 1993.
6. Venolia, D., and Neiberg, F. T-Cube: A fast, self-disclosing pen-based alphabet. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’94) (April 1994), Addison-Wesley, pp. 265–270.
7. Wu, S., and Manber, U. Agrep - a fast approximate pattern-matching tool. In Proceedings of USENIX Technical Conference (San Francisco, CA, January 1992), pp. 153–162.