Robot_arm_controller_using_fuzzy_speech_recognition
from the machine to human might be required, using synthetic speech. Thus, the study of speech recognition is part of a quest for "artificially intelligent" machines which can "hear", "understand", and "act upon" spoken information, and "speak" in completing the exchange of information. As anyone who has read or watched science fiction knows, the

2. Complement: The complement of A is a fuzzy set, denoted $\bar{A}$, whose membership function is defined as

$\mu_{\bar{A}} = c(\mu_A)$ (3)

3. Union (s-norm, t-conorm): The union of A and B is a fuzzy set, denoted $A \cup B$, and the
(4)
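The complement and union definitions above can be illustrated on discrete membership vectors. This is a minimal sketch, not the paper's own formulation: it assumes the standard complement c(a) = 1 - a and the maximum as the s-norm, since the text only names a generic function c and the union formula is not shown. The function names `fuzzy_complement` and `fuzzy_union` are hypothetical.

```python
def fuzzy_complement(mu_a):
    """Complement of a fuzzy set given as a list of membership grades.

    Assumption: the standard complement c(a) = 1 - a is used.
    """
    return [1.0 - m for m in mu_a]


def fuzzy_union(mu_a, mu_b):
    """Union of two fuzzy sets defined over the same universe.

    Assumption: the s-norm is the maximum operator.
    """
    if len(mu_a) != len(mu_b):
        raise ValueError("both sets must be defined over the same universe")
    return [max(a, b) for a, b in zip(mu_a, mu_b)]


if __name__ == "__main__":
    a = [0.0, 0.25, 0.5, 1.0]
    b = [0.5, 0.0, 0.75, 0.25]
    print(fuzzy_complement(a))
    print(fuzzy_union(a, b))
```

Other s-norms (for example the algebraic sum a + b - ab) could be substituted in `fuzzy_union` without changing the rest of the sketch.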
1997 First International Conference on Knowledge-Based Intelligent Electronic Systems, 21-23 May 1997, Adelaide, Australia. Editor, L.C. Jain
[Figure: block diagram; labels: Input, Speech, Recognition, Dictionary, Result, Output]
[Figure: processing flow; blocks: Speech signal, Sampling, Quantization, Endpoint detection, Windowing, Preemphasis, Analysis of feature, Feature extraction, Output]
Figure 4 Model of speech production [blocks: impulse generator, white-noise, voiced/unvoiced switch, gain estimate, all-pole filter, speech signal S(n)]
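The blocks named in Figure 4 (an impulse generator or a white-noise source selected by a voiced/unvoiced switch, a gain, and an all-pole filter producing S(n)) can be sketched in a few lines. This is only an illustration under stated assumptions: the recursion s(n) = G u(n) + sum_k a_k s(n-k), its sign convention, the pitch period, and the names `excitation` and `all_pole_filter` are hypothetical and not taken from the paper.

```python
import random


def excitation(n_samples, voiced, pitch_period=80, seed=0):
    """Excitation u(n): an impulse train when voiced, white noise otherwise.

    Assumption: pitch_period (in samples) and the Gaussian noise source
    are illustrative choices.
    """
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(u, a, gain):
    """All-pole synthesis: s(n) = gain * u(n) + sum_k a[k-1] * s(n-k).

    Assumption: the coefficients a are given with this sign convention.
    """
    s = []
    for n, u_n in enumerate(u):
        acc = gain * u_n
        for k, a_k in enumerate(a, start=1):
            if n - k >= 0:
                acc += a_k * s[n - k]
        s.append(acc)
    return s


if __name__ == "__main__":
    u = excitation(8, voiced=True, pitch_period=4)
    print(all_pole_filter(u, a=[0.5], gain=2.0))
```

With a single coefficient the output is a decaying exponential restarted at every impulse, which makes the role of each block easy to check by hand.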
[Figure: 32 x 32 matrix of integer pattern values in the range 0 to 10; the same pattern scaled to the range 0.0 to 1.0 follows]
0.5 0.3 0.5 0.2 0.7 0.1 0.8 0.3 0.0 0.5 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.2 0.1 0.2 0.5 1.0 0.5 0.6 0.6 0.7 0.5
0.3 0.1 0.5 0.3 0.2 0.3 0.7 0.1 0.2 0.5 0.7 0.6 0.1 0.2 0.3 0.0 0.1 0.0 0.0 0.1 0.1 0.1 0.4 0.3 0.3 0.7 0.9 0.7 0.0 0.2 0.7 0.3
0.2 0.3 0.3 0.4 0.1 0.3 0.6 0.3 0.2 0.8 0.7 0.6 0.1 0.0 0.0 0.1 0.0 0.1 0.0 0.0 0.1 0.2 0.5 0.2 0.4 0.8 0.7 0.5 0.3 0.3 0.6 0.3
0.3 0.0 0.4 0.3 0.1 0.1 0.6 0.3 0.4 1.0 0.5 0.3 0.0 0.0 0.0 0.0 0.1 0.1 0.0 0.1 0.0 0.1 0.4 0.4 0.4 0.9 0.8 0.7 0.1 0.5 0.1 0.4
0.4 0.4 0.5 0.5 0.4 0.1 0.4 0.4 0.2 0.7 0.7 0.6 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.1 0.3 0.3 0.2 0.8 0.5 0.9 0.3 0.4 0.2 0.5
0.3 0.4 0.3 0.4 0.1 0.3 0.3 0.4 0.4 0.9 0.4 0.7 0.0 0.0 0.2 0.0 0.0 0.1 0.2 0.1 0.0 0.1 0.5 0.2 0.4 0.9 0.6 0.9 0.1 0.5 0.2 0.1
0.4 0.2 0.2 0.6 0.3 0.3 0.2 0.5 0.4 0.7 0.6 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.3 0.2 0.6 0.9 0.4 0.9 0.4 0.3 0.5 0.4
0.4 0.2 0.1 0.4 0.1 0.3 0.3 0.6 0.4 0.6 0.3 0.5 0.0 0.1 0.2 0.1 0.0 0.1 0.4 0.0 0.0 0.1 0.6 0.6 0.4 0.9 0.4 0.6 0.1 0.6 0.3 0.3
0.4 0.1 0.4 0.5 0.0 0.6 0.3 0.6 0.4 0.1 0.3 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.1 0.2 0.2 0.5 1.0 0.6 1.0 0.3 0.6 0.3 0.2
0.3 0.4 0.1 0.3 0.2 0.2 0.3 0.3 0.5 0.9 0.6 0.7 0.0 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.5 0.4 0.5 0.8 0.6 0.4 0.3 0.7 0.5 0.2
0.3 0.1 0.0 0.3 0.3 0.0 0.3 0.4 0.6 0.9 0.6 0.7 0.0 0.0 0.0 0.0 0.1 0.2 0.4 0.2 0.0 0.1 0.2 0.4 0.8 1.0 0.3 0.8 0.1 0.3 0.4 0.2
0.2 0.3 0.3 0.5 0.1 0.1 0.4 0.3 0.6 0.8 0.7 0.5 0.0 0.2 0.0 0.1 0.0 0.1 0.3 0.0 0.0 0.3 0.2 0.3 0.6 0.9 0.7 0.8 0.2 0.1 0.2 0.2
0.2 0.0 0.1 0.6 0.1 0.4 0.3 0.8 0.5 0.7 0.4 0.5 0.0 0.1 0.2 0.1 0.0 0.0 0.3 0.1 0.0 0.2 0.4 0.4 0.4 0.9 0.4 0.9 0.1 0.2 0.3 0.4
0.4 0.5 0.3 0.5 0.2 0.1 0.2 0.2 0.4 0.9 0.4 0.7 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.1 0.0 0.2 0.2 0.4 0.6 1.0 0.6 0.9 0.1 0.3 0.4 0.2
0.2 0.3 0.3 0.4 0.1 0.1 0.3 0.4 0.4 0.7 0.5 0.6 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.2 0.3 0.4 0.5 1.0 0.6 1.0 0.2 0.4 0.6 0.3
0.1 0.4 0.1 0.6 0.2 0.3 0.5 0.4 0.4 0.8 0.3 0.5 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.1 0.0 0.2 0.5 0.2 0.4 1.0 0.7 0.7 0.3 0.2 0.6 0.3
0.4 0.2 0.1 0.5 0.2 0.6 0.4 0.6 0.1 0.9 0.5 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.2 0.3 0.1 0.4 1.0 0.5 1.0 0.3 0.4 0.5 0.4
0.1 0.2 0.3 0.5 0.3 0.2 0.2 0.6 0.2 1.0 0.5 0.4 0.0 0.0 0.0 0.0 0.1 0.3 0.1 0.0 0.1 0.1 0.3 0.3 0.2 1.0 0.5 0.1 0.3 0.5 0.3 0.7
0.2 0.4 0.0 0.3 0.0 0.5 0.3 0.7 0.3 0.8 0.5 0.6 0.0 0.0 0.3 0.0 0.0 0.2 0.3 0.0 0.0 0.1 0.5 0.1 0.3 0.9 0.3 0.9 0.3 0.6 0.2 0.4
0.1 0.3 0.3 0.5 0.2 0.3 0.0 0.5 0.1 0.9 0.5 0.8 0.0 0.2 0.3 0.1 0.1 0.0 0.2 0.2 0.0 0.3 0.4 0.1 0.2 1.0 0.3 0.9 0.2 0.4 0.3 0.3
0.2 0.2 0.1 0.4 0.1 0.5 0.0 0.7 0.4 0.7 0.7 0.6 0.0 0.1 0.0 0.0 0.0 0.2 0.2 0.0 0.1 0.0 0.4 0.1 0.5 1.0 0.5 0.9 0.1 0.6 0.4 0.3
0.4 0.4 0.0 0.2 0.2 0.2 0.0 0.5 0.3 0.6 0.9 0.8 0.1 0.0 0.0 0.0 0.1 0.0 0.1 0.2 0.0 0.0 0.7 0.1 0.2 0.9 0.4 1.0 0.3 0.5 0.5 0.4
0.3 0.3 0.2 0.5 0.5 0.1 0.2 0.7 0.1 0.7 0.7 0.4 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.2 0.0 0.1 0.3 0.1 0.3 1.0 0.7 1.0 0.1 0.4 0.4 0.6
0.2 0.1 0.2 0.1 0.4 0.4 0.2 0.8 0.1 1.0 0.7 0.7 0.0 0.0 0.2 0.0 0.0 0.1 0.1 0.4 0.0 0.2 0.5 0.2 0.2 1.0 0.5 1.0 0.2 0.3 0.2 0.0
0.2 0.4 0.1 0.3 0.2 0.4 0.4 0.8 0.5 1.0 0.8 0.5 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.2 0.1 0.0 0.0 0.1 0.4 1.0 0.6 0.9 0.1 0.3 0.2 0.3
0.3 0.3 0.3 0.6 0.0 0.3 0.3 0.5 0.4 0.7 0.5 0.4 0.0 0.0 0.2 0.0 0.0 0.2 0.0 0.0 0.0 0.1 0.6 0.0 0.6 1.0 0.6 0.7 0.1 0.2 0.7 0.4
0.1 0.2 0.1 0.5 0.2 0.0 0.4 0.7 0.7 0.8 0.4 0.1 0.0 0.0 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.6 0.2 0.1 1.0 0.6 0.7 0.3 0.3 0.6 0.5
0.4 0.0 0.1 0.3 0.1 0.4 0.5 0.6 0.6 0.8 0.7 0.6 0.0 0.0 0.1 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.5 0.1 0.7 1.0 0.5 0.7 0.0 0.4 0.3 0.5
0.1 0.1 0.3 0.4 0.2 0.1 0.2 0.5 0.5 0.9 0.8 0.4 0.0 0.0 0.0 0.0 0.1 0.0 0.2 0.0 0.0 0.1 0.7 0.0 0.4 1.0 0.8 0.9 0.2 0.3 0.6 0.2
0.4 0.0 0.1 0.1 0.1 0.0 0.6 0.3 0.6 1.0 0.8 0.8 0.0 0.0 0.1 0.1 0.0 0.2 0.2 0.0 0.0 0.0 0.2 0.1 0.6 0.9 0.9 0.9 0.1 0.2 0.4 0.3
0.2 0.1 0.1 0.4 0.1 0.1 0.1 0.3 0.7 0.8 0.5 0.7 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.0 0.0 0.4 0.3 0.9 0.9 0.7 0.8 0.2 0.2 0.5 0.2
0.9 0.2 0.4 0.4 0.0 0.0 0.5 0.2 0.5 1.0 0.4 0.3 0.1 0.0 0.1 0.0 0.2 0.0 0.1 0.0 0.0 0.1 0.4 0.3 0.4 0.8 0.9 0.7 0.2 0.2 0.6 0.1
Figure 7 The block diagram of speech recognition controller system [blocks: microphone, amplifier, lowpass filter, TMS320C25 microprocessor system, interface circuit, seven-segment display, robot arm system, robot arm action]
description for each block is explained in the following statements.
(1) Interface circuit: It is combined with the microprocessor system and the output devices.
(2) Robot arm system: This experiment adopts the TICR type A system by Tatung Company, with a five-axis robot arm.

First, speech patterns recorded on a SUN workstation are simulated in a program on a personal computer, using the words for the numbers 0 to 9. Every speech signal is divided into 16 equal frames, and after the DFT each of these 16 frames is divided into 16 sections. Then the main peak is selected from every frame. When a peak lies in a section, the bit is set to '1'; otherwise it is set to '0'. In this way, the recognition rate is 0. After that, the two largest peaks are selected and set to '1' in the BTFSP; the recognition rate is again 0. Then the four or six largest peaks are selected with the same procedure, and the recognition rate rises to 40%. When the number of peaks is increased to N = 8, the rate is still 40%. With this pattern size, the recognition rate does not improve further as the number of peaks increases.

Next, the word pattern is extended to 32 frames, each with 32 sections. The vocabulary size is 10 words. The rate now varies with the number of peaks. With N = 1 and N = 2, the rate is 0. When N = 4, the rate is 40%; when N = 6, the rate is 50%; and when N = 8, the rate rises to 65%. Finally, with N = 10, the speech recognition rate is 80%. As the number of peaks increases further, the rate tends to level off and may decrease: with N = 12, the rate drops to 70%. So the number of peaks should not be greater than one third of the number of sections. When the vocabulary size is 5 words, the speech recognition rate is over 95%. The recognition rates are listed in table 1. When this method is applied to the actual on-line hardware system, it achieves a satisfactory recognition rate.

6 Conclusion

In this paper, a detailed fuzzy analysis of speech patterns and an independent firmware system for speech recognition have been accomplished successfully. Both the simulation and the actual independent hardware system achieve a good recognition rate. This algorithm is suitable for small-vocabulary speech recognition systems. If the number of recognition words is to be increased, the number of frames, sections, or peaks in the speech spectrum pattern must be increased. Conversely, if the number of recognition words is to be decreased, the speech spectrum pattern can be reduced. This method can be used in a speaker-independent or a speaker-dependent system, depending on the training method.

Acknowledgments

This work was supported by the National Science Council of the Republic of China, under contract NSC 86-2213-E-036-009.
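The pattern-building procedure described in the experiment (split the signal into frames, take the DFT of each frame, divide the spectrum into sections, and set a bit to '1' for each section containing one of the N largest peaks) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it treats the N largest-magnitude DFT bins as the "peaks", uses only the first half of the spectrum, and the names `dft_magnitudes` and `binary_pattern` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the first half of the DFT of one frame (naive O(n^2))."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def binary_pattern(signal, n_frames=16, n_sections=16, n_peaks=1):
    """Build an n_frames x n_sections matrix of 0/1 bits.

    Assumption: a "peak" is simply one of the n_peaks largest DFT bins of
    the frame; a bit is 1 when its section contains at least one such bin.
    """
    frame_len = len(signal) // n_frames
    pattern = []
    for f in range(n_frames):
        frame = signal[f * frame_len:(f + 1) * frame_len]
        mags = dft_magnitudes(frame)
        top_bins = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:n_peaks]
        row = [0] * n_sections
        for k in top_bins:
            row[k * n_sections // len(mags)] = 1
        pattern.append(row)
    return pattern


if __name__ == "__main__":
    # Two frames of 64 samples, each holding a sinusoid centred on DFT bin 20.
    signal = [math.sin(2 * math.pi * 20 * t / 64) for t in range(128)]
    for row in binary_pattern(signal, n_frames=2, n_sections=4, n_peaks=1):
        print(row)
```

Following the rule of thumb stated in the text, `n_peaks` would be kept at or below one third of `n_sections`, for example at most 10 peaks for 32 sections.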