1997 First International Conference on Knowledge-Based Intelligent Electronic Systems, 21-23 May 1997, Adelaide, Australia. Editor, L.C. Jain

ROBOT ARM CONTROLLER USING FUZZY SPEECH RECOGNITION


Ta-Hsiung Hung
Hung-Ching Lu
Department of Electrical Engineering
Tatung Institute of Technology
40 Chungshan North Road, 3rd Sec., Taipei, 10451,
Taiwan

Keywords: Fuzzy speech recognition, robot arm controller.

Abstract

In this paper, fuzzy set theory techniques are employed to develop a speech recognition system. The idea is to generate a control signal for driving a robot arm system using fuzzy speech recognition. First, we design an independent microprocessor system combined with the control circuit of the robot arm. Then, the speech signal is analyzed in accordance with fuzzy set logic. The speech signal is divided into several units, which produce the feature parameters in accordance with the locations of the frequency spectrum peaks. By way of training, the system generates the speech reference pattern, which can be transformed into a membership function. After the calculation of pattern similarity, the recognition results and the output control signal are produced.

1 Introduction

For most people, speech is the most natural and efficient manner of exchanging information. The goal of speech recognition technology, in a broad sense, is to create machines which can receive spoken information and act appropriately upon that information. Furthermore, information exchange from the machine to the human might be required, using synthetic speech. Thus, the study of speech recognition is part of a quest for "artificially intelligent" machines which can "hear", "understand", and "act upon" spoken information, and "speak" to complete the exchange of information. As anyone who has read or watched science fiction knows, humans are capable of imagining forms and applications of such systems that are seemingly unbounded. Our imagination, however, far surpasses our technical abilities in this domain.

Many speech recognition techniques have been developed [1]-[3]. In general, a speech recognition system consists of three parts: feature extraction, dictionary establishment, and similarity calculation. As shown in figure 1, features are extracted from every utterance, and many types of features exist. The training process establishes a reference dictionary of template patterns, which are combined with the best criterion voice. The recognition result is then calculated from the similarity between the unknown pattern and the dictionary.

2 Fuzzy sets theory

Let A and B be two fuzzy sets in a universal set U with membership functions $\mu_A$ and $\mu_B$, respectively. Then the following operations are defined:

1. Fuzzy set:
$$A = \int_U \mu_A(u)/u \quad (U \text{ is continuous}) \tag{1}$$
$$A = \sum_{i=1}^{n} \mu_A(u_i)/u_i \quad (U \text{ is discrete}) \tag{2}$$
2. Complement: The complement of A is a fuzzy set, denoted $\bar{A}$, and its membership function is defined as
$$\mu_{\bar{A}}(u) = c(\mu_A(u)) \tag{3}$$
3. Union (s-norm, t-conorm): The union of A and B is a fuzzy set, denoted $A \cup B$.


(Block diagram; surviving labels: speech, dictionary, recognition result.)

Figure 1 The basic structure of speech recognition system

(Block diagram; surviving labels: input, output.)

Figure 2 Basic structure of fuzzy logic system
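
The complement, union, and intersection operators of this section are stated in terms of generic functions c, s, and t. As a minimal sketch, assuming the common choices c(a) = 1 - a, s = max, and t = min (the paper does not fix these), and representing a discrete fuzzy set as a Python dict (a representation chosen here only for illustration), the operations might look like this:

```python
# Discrete fuzzy sets as dicts: element u -> membership grade in [0, 1].
# Assumed operators (not fixed by the paper): c(a) = 1 - a, s = max, t = min.

def complement(a):
    """Membership of the complement: c(mu_A(u)) with c(a) = 1 - a."""
    return {u: 1.0 - mu for u, mu in a.items()}


def union(a, b):
    """Membership of the union: s{mu_A(u), mu_B(u)} with s = max."""
    return {u: max(a.get(u, 0.0), b.get(u, 0.0)) for u in set(a) | set(b)}


def intersection(a, b):
    """Membership of the intersection: t{mu_A(u), mu_B(u)} with t = min."""
    return {u: min(a.get(u, 0.0), b.get(u, 0.0)) for u in set(a) | set(b)}


# Toy sets over a three-element universe (values are made up).
A = {"u1": 0.2, "u2": 0.7, "u3": 1.0}
B = {"u1": 0.5, "u2": 0.4, "u3": 0.0}

print(union(A, B))
print(intersection(A, B))
print(complement(A))
```

Elements missing from a dict are treated as having grade 0, which keeps the union and intersection defined when the two sets list different elements.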

The membership function of the union $A \cup B$ is defined as
$$\mu_{A \cup B}(u) \equiv s\{\mu_A(u), \mu_B(u)\} \tag{4}$$
4. Intersection (t-norm): The intersection of A and B is a fuzzy set, denoted $A \cap B$, and its membership function is defined as
$$\mu_{A \cap B}(u) \equiv t\{\mu_A(u), \mu_B(u)\} \tag{5}$$

The fuzzy logic system has four principal elements: fuzzifier, inference engine, defuzzifier, and data base. The basic structure is shown in figure 2. Each element is described as follows:
(1) Fuzzifier: The fuzzifier measures the values of the input variables.
(2) Inference engine: The inference engine has the capability of simulating human decision-making based on fuzzy concepts.
(3) Defuzzifier: The defuzzifier is a scale mapping, which converts the output of the inference engine to an active system value.
(4) Data base: The data base holds the transfer rules between the active system and the fuzzy system.

A fuzzy system is characterized by a set of linguistic description rules based on expert knowledge. The expert knowledge is used in the form of IF-THEN rules such as
IF (conditions are satisfied)
THEN (results are received),
where the conditions and the results can involve multiple variables.

3 Speech recognition

A speech signal is a series of sound data. Through sampling, quantization, endpoint detection, preemphasis, and windowing, the speech signal becomes a series of frames. After feature analysis, the extracted features are obtained, as shown in figure 3.

In all practical signal processing applications, it is necessary to work with short terms or frames of the signal, unless the signal is of short duration. In short-time analysis, a general approach is to multiply the speech signal by a window function w(n), as in equation (6).


(Block diagram; blocks: speech signal, sampling, quantization, endpoint detection, preemphasis, windowing, analysis of feature, feature extraction output.)

Figure 3 Disposal of speech signal
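
The processing chain of Figure 3 can be sketched in a few lines of Python. This is an illustrative sketch only: the pre-emphasis coefficient 0.95, the amplitude-threshold endpoint detector, the non-overlapping equal-length frames, and all function names are assumptions made here, not details given in the paper. Sampling and quantization are taken as already done, so the input is a plain list of samples.

```python
import math


def trim_endpoints(signal, threshold):
    """Crude endpoint detection (assumed): keep the span between the first
    and last samples whose magnitude reaches the threshold."""
    active = [i for i, s in enumerate(signal) if abs(s) >= threshold]
    if not active:
        return []
    return signal[active[0]:active[-1] + 1]


def preemphasize(signal, alpha=0.95):
    """First-order pre-emphasis y(n) = s(n) - alpha * s(n-1); alpha is assumed."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def hamming(length):
    """Hamming window w(n) = 0.54 - 0.46 cos(2 pi n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def split_frames(signal, num_frames):
    """Cut the signal into equal, non-overlapping frames; leftover samples
    at the end are dropped."""
    size = len(signal) // num_frames
    return [signal[i * size:(i + 1) * size] for i in range(num_frames)]


def window_frames(frames):
    """Multiply every frame sample by the matching window sample."""
    return [[s * w for s, w in zip(frame, hamming(len(frame)))]
            for frame in frames]


# Toy input: silence, a short sine burst, silence.
samples = ([0.0] * 8
           + [math.sin(2.0 * math.pi * 5 * n / 64) for n in range(64)]
           + [0.0] * 8)
speech = trim_endpoints(samples, threshold=0.05)
frames = window_frames(split_frames(preemphasize(speech), num_frames=4))
print(len(frames), len(frames[0]))
```

Replacing `hamming` with a function that returns all ones gives the rectangular window, the other window type named in the text.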

(Block diagram; blocks: impulse generator, white noise, voiced/unvoiced switch, gain estimate, all-pole filter, speech signal s(n).)
Figure 4 Model of speech production
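
The production model of Figure 4 can be illustrated with a direct-form all-pole recursion. The sketch below assumes the difference equation s(n) = a_1 s(n-1) + ... + a_p s(n-p) + G u(n), with an impulse train as the voiced excitation and uniform white noise as the unvoiced excitation. The coefficient values, the pitch period, and the function names are hypothetical and serve only to show the structure.

```python
import random


def all_pole_filter(excitation, coeffs, gain=1.0):
    """Run s(n) = sum_k coeffs[k-1] * s(n-k) + gain * u(n) over the input."""
    out = []
    for n, u in enumerate(excitation):
        acc = gain * u
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out


def impulse_train(length, period):
    """Voiced excitation: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def white_noise(length, seed=0):
    """Unvoiced excitation: uniform noise in [-1, 1] from a seeded generator."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def synthesize(length, voiced, coeffs, gain=1.0, pitch_period=40):
    """Select the excitation with the voiced/unvoiced switch, then filter it."""
    if voiced:
        excitation = impulse_train(length, pitch_period)
    else:
        excitation = white_noise(length)
    return all_pole_filter(excitation, coeffs, gain)


# Hypothetical second-order filter (stable: poles inside the unit circle).
voiced_segment = synthesize(200, voiced=True, coeffs=[1.2, -0.72])
unvoiced_segment = synthesize(200, voiced=False, coeffs=[1.2, -0.72])
print(len(voiced_segment), len(unvoiced_segment))
```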


$$x(n) = \begin{cases} s(n)\,w(n), & n \text{ within the window} \\ 0, & \text{otherwise} \end{cases} \tag{6}$$
where s(n) denotes the speech signal, w(n) denotes the window function, x(n) is the resulting output, and n runs over the length of the window. Two windows are commonly used: the rectangular window and the Hamming window.

In discrete time, the speech production model (shown in figure 4) is hypothesized, and the task is the identification of the parameters associated with the all-pole system function
$$H(z) = \frac{S(z)}{U(z)} = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}} \tag{7}$$
where S(z) is the Z-transform of the output signal (speech signal), U(z) is the Z-transform of the input signal, G is the gain, $a_k$ are the filter coefficients, and p is the order.

4 Algorithms

In this paper, membership functions are held as templates, and the similarity between the unknown pattern and each word is calculated using the membership function. In other words, the similarities between the unknown pattern and the known patterns are calculated, and the unknown pattern is then classified into the category that yields the largest similarity. This method is only practical for isolated words or discrete utterances when the vocabulary is of reasonable size. The degree D of membership is defined as
$$D_k = m_k(f) \wedge r(f) \tag{8}$$
where f is frequency and r(f) indicates the locations of the maximum peaks at frequencies $f_1, f_2, \ldots$:
$$r(f) = \delta(f - f_1) + \delta(f - f_2) + \cdots \tag{9}$$
where $\delta(p)$ is defined as
$$\delta(p) = \begin{cases} 1, & p = 0 \\ 0, & \text{otherwise} \end{cases} \tag{10}$$
The membership function $m_k(f,t)$ of the word pattern $x_k$ indicates the degree of likeness to the word pattern $x_k$. The characteristics of the fluctuations among many speakers must be known in order to decide $m_k(f,t)$. Furthermore, it is desirable that the membership function be constructed by superimposing many binarized time-frequency spectrum patterns (BTFSP) for each word.

First, two BTFSPs are superimposed and the corresponding elements in each pattern are added. Next, the third BTFSP is superimposed on the added pattern by the same procedure. Repeating this procedure in the same way, the membership can be obtained. The membership of the word '8' is shown in figure 5 and the membership function is shown in figure 6. The membership function consists of ten BTFSPs; in this case, the maximum value of the membership is 10. The calculation of pattern parameters is a very important task in the recognition process. The features of a word are replaced by the membership function.

For easy calculation by computer, the degree D of membership can be written as

[Equation (11)]

where f is frequency and r indicates the locations of the maximum peaks. It is difficult to normalize the number of peaks N in an actual digital system, so it is desirable to define the similarity to be independent of N.

Let $m_k'(f,t)$ be the denormalized membership function; then
$$m_k(f,t) = \frac{m_k'(f,t)}{N} \tag{12}$$
Substituting it into equation (11), the degree of membership is rewritten as

[Equation (13)]

The similarity is defined as
$$E_k = D_k. \tag{14}$$
We obtain the similarity for one word from (14) and use the same method to calculate it for the pattern of each word. Using
$$k = \max_{k \in N}\{E_k\} \tag{15}$$
where $E_k$ is the similarity between the unknown pattern x and the word pattern $x_k$, the recognition result k is finally obtained.

5 Hardware system and software system

The block diagram of the speech recognition controller hardware system is shown in figure 7.


(32 × 32 array of integer membership values from 0 to 10, indexed by frequency f and time t; Figure 6 gives the same array scaled to the range 0.0 to 1.0.)

Figure 5 Membership $m_k(f,t)$ of the word ‘8’
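
The construction of the membership in Figure 5 (binarize each utterance into a BTFSP, then add the patterns element by element) can be sketched as follows. The sketch assumes each utterance is already available as a grid of spectral magnitudes, one row per frame and one column per frequency section, and it treats the `num_peaks` largest sections in a frame as its peaks, which simplifies true peak picking. The function names and toy values are illustrative.

```python
def binarize_frame(section_magnitudes, num_peaks):
    """Mark the num_peaks largest sections of one frame with 1, the rest with 0.
    Ties are broken in favour of the lower section index."""
    order = sorted(range(len(section_magnitudes)),
                   key=lambda i: (-section_magnitudes[i], i))
    chosen = set(order[:num_peaks])
    return [1 if i in chosen else 0 for i in range(len(section_magnitudes))]


def make_btfsp(spectrogram, num_peaks):
    """Binarized time-frequency spectrum pattern: one binarized row per frame."""
    return [binarize_frame(frame, num_peaks) for frame in spectrogram]


def superimpose(patterns):
    """Add corresponding elements of all BTFSPs (all must share one shape)."""
    rows, cols = len(patterns[0]), len(patterns[0][0])
    total = [[0] * cols for _ in range(rows)]
    for pattern in patterns:
        for t in range(rows):
            for f in range(cols):
                total[t][f] += pattern[t][f]
    return total


# Two toy utterances of the same word: 2 frames x 4 sections each.
utterance_a = [[0.1, 0.9, 0.3, 0.2], [0.8, 0.1, 0.7, 0.0]]
utterance_b = [[0.2, 0.6, 0.1, 0.5], [0.9, 0.2, 0.1, 0.3]]

patterns = [make_btfsp(u, num_peaks=2) for u in (utterance_a, utterance_b)]
counts = superimpose(patterns)
print(counts)  # [[0, 2, 1, 1], [2, 0, 1, 1]]
```

With ten training utterances per word, as in the paper, each cell of the result would lie between 0 and 10.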

0.5 0.3 0.5 0.2 0.7 0.1 0.8 0.3 0.0 0.5 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.2 0.1 0.2 0.5 1.0 0.5 0.6 0.6 0.7 0.5
0.3 0.1 0.5 0.3 0.2 0.3 0.7 0.1 0.2 0.5 0.7 0.6 0.1 0.2 0.3 0.0 0.1 0.0 0.0 0.1 0.1 0.1 0.4 0.3 0.3 0.7 0.9 0.7 0.0 0.2 0.7 0.3
0.2 0.3 0.3 0.4 0.1 0.3 0.6 0.3 0.2 0.8 0.7 0.6 0.1 0.0 0.0 0.1 0.0 0.1 0.0 0.0 0.1 0.2 0.5 0.2 0.4 0.8 0.7 0.5 0.3 0.3 0.6 0.3
0.3 0.0 0.4 0.3 0.1 0.1 0.6 0.3 0.4 1.0 0.5 0.3 0.0 0.0 0.0 0.0 0.1 0.1 0.0 0.1 0.0 0.1 0.4 0.4 0.4 0.9 0.8 0.7 0.1 0.5 0.1 0.4
0.4 0.4 0.5 0.5 0.4 0.1 0.4 0.4 0.2 0.7 0.7 0.6 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.1 0.3 0.3 0.2 0.8 0.5 0.9 0.3 0.4 0.2 0.5
0.3 0.4 0.3 0.4 0.1 0.3 0.3 0.4 0.4 0.9 0.4 0.7 0.0 0.0 0.2 0.0 0.0 0.1 0.2 0.1 0.0 0.1 0.5 0.2 0.4 0.9 0.6 0.9 0.1 0.5 0.2 0.1
0.4 0.2 0.2 0.6 0.3 0.3 0.2 0.5 0.4 0.7 0.6 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.3 0.2 0.6 0.9 0.4 0.9 0.4 0.3 0.5 0.4
0.4 0.2 0.1 0.4 0.1 0.3 0.3 0.6 0.4 0.6 0.3 0.5 0.0 0.1 0.2 0.1 0.0 0.1 0.4 0.0 0.0 0.1 0.6 0.6 0.4 0.9 0.4 0.6 0.1 0.6 0.3 0.3
0.4 0.1 0.4 0.5 0.0 0.6 0.3 0.6 0.4 0.1 0.3 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.1 0.2 0.2 0.5 1.0 0.6 1.0 0.3 0.6 0.3 0.2
0.3 0.4 0.1 0.3 0.2 0.2 0.3 0.3 0.5 0.9 0.6 0.7 0.0 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.5 0.4 0.5 0.8 0.6 0.4 0.3 0.7 0.5 0.2
0.3 0.1 0.0 0.3 0.3 0.0 0.3 0.4 0.6 0.9 0.6 0.7 0.0 0.0 0.0 0.0 0.1 0.2 0.4 0.2 0.0 0.1 0.2 0.4 0.8 1.0 0.3 0.8 0.1 0.3 0.4 0.2
0.2 0.3 0.3 0.5 0.1 0.1 0.4 0.3 0.6 0.8 0.7 0.5 0.0 0.2 0.0 0.1 0.0 0.1 0.3 0.0 0.0 0.3 0.2 0.3 0.6 0.9 0.7 0.8 0.2 0.1 0.2 0.2
0.2 0.0 0.1 0.6 0.1 0.4 0.3 0.8 0.5 0.7 0.4 0.5 0.0 0.1 0.2 0.1 0.0 0.0 0.3 0.1 0.0 0.2 0.4 0.4 0.4 0.9 0.4 0.9 0.1 0.2 0.3 0.4
0.4 0.5 0.3 0.5 0.2 0.1 0.2 0.2 0.4 0.9 0.4 0.7 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.1 0.0 0.2 0.2 0.4 0.6 1.0 0.6 0.9 0.1 0.3 0.4 0.2
0.2 0.3 0.3 0.4 0.1 0.1 0.3 0.4 0.4 0.7 0.5 0.6 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.2 0.3 0.4 0.5 1.0 0.6 1.0 0.2 0.4 0.6 0.3
0.1 0.4 0.1 0.6 0.2 0.3 0.5 0.4 0.4 0.8 0.3 0.5 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.1 0.0 0.2 0.5 0.2 0.4 1.0 0.7 0.7 0.3 0.2 0.6 0.3
0.4 0.2 0.1 0.5 0.2 0.6 0.4 0.6 0.1 0.9 0.5 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.2 0.3 0.1 0.4 1.0 0.5 1.0 0.3 0.4 0.5 0.4
0.1 0.2 0.3 0.5 0.3 0.2 0.2 0.6 0.2 1.0 0.5 0.4 0.0 0.0 0.0 0.0 0.1 0.3 0.1 0.0 0.1 0.1 0.3 0.3 0.2 1.0 0.5 0.1 0.3 0.5 0.3 0.7
0.2 0.4 0.0 0.3 0.0 0.5 0.3 0.7 0.3 0.8 0.5 0.6 0.0 0.0 0.3 0.0 0.0 0.2 0.3 0.0 0.0 0.1 0.5 0.1 0.3 0.9 0.3 0.9 0.3 0.6 0.2 0.4
0.1 0.3 0.3 0.5 0.2 0.3 0.0 0.5 0.1 0.9 0.5 0.8 0.0 0.2 0.3 0.1 0.1 0.0 0.2 0.2 0.0 0.3 0.4 0.1 0.2 1.0 0.3 0.9 0.2 0.4 0.3 0.3
0.2 0.2 0.1 0.4 0.1 0.5 0.0 0.7 0.4 0.7 0.7 0.6 0.0 0.1 0.0 0.0 0.0 0.2 0.2 0.0 0.1 0.0 0.4 0.1 0.5 1.0 0.5 0.9 0.1 0.6 0.4 0.3
0.4 0.4 0.0 0.2 0.2 0.2 0.0 0.5 0.3 0.6 0.9 0.8 0.1 0.0 0.0 0.0 0.1 0.0 0.1 0.2 0.0 0.0 0.7 0.1 0.2 0.9 0.4 1.0 0.3 0.5 0.5 0.4
0.3 0.3 0.2 0.5 0.5 0.1 0.2 0.7 0.1 0.7 0.7 0.4 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.2 0.0 0.1 0.3 0.1 0.3 1.0 0.7 1.0 0.1 0.4 0.4 0.6
0.2 0.1 0.2 0.1 0.4 0.4 0.2 0.8 0.1 1.0 0.7 0.7 0.0 0.0 0.2 0.0 0.0 0.1 0.1 0.4 0.0 0.2 0.5 0.2 0.2 1.0 0.5 1.0 0.2 0.3 0.2 0.0
0.2 0.4 0.1 0.3 0.2 0.4 0.4 0.8 0.5 1.0 0.8 0.5 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.2 0.1 0.0 0.0 0.1 0.4 1.0 0.6 0.9 0.1 0.3 0.2 0.3
0.3 0.3 0.3 0.6 0.0 0.3 0.3 0.5 0.4 0.7 0.5 0.4 0.0 0.0 0.2 0.0 0.0 0.2 0.0 0.0 0.0 0.1 0.6 0.0 0.6 1.0 0.6 0.7 0.1 0.2 0.7 0.4
0.1 0.2 0.1 0.5 0.2 0.0 0.4 0.7 0.7 0.8 0.4 0.1 0.0 0.0 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.6 0.2 0.1 1.0 0.6 0.7 0.3 0.3 0.6 0.5
0.4 0.0 0.1 0.3 0.1 0.4 0.5 0.6 0.6 0.8 0.7 0.6 0.0 0.0 0.1 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.5 0.1 0.7 1.0 0.5 0.7 0.0 0.4 0.3 0.5
0.1 0.1 0.3 0.4 0.2 0.1 0.2 0.5 0.5 0.9 0.8 0.4 0.0 0.0 0.0 0.0 0.1 0.0 0.2 0.0 0.0 0.1 0.7 0.0 0.4 1.0 0.8 0.9 0.2 0.3 0.6 0.2
0.4 0.0 0.1 0.1 0.1 0.0 0.6 0.3 0.6 1.0 0.8 0.8 0.0 0.0 0.1 0.1 0.0 0.2 0.2 0.0 0.0 0.0 0.2 0.1 0.6 0.9 0.9 0.9 0.1 0.2 0.4 0.3
0.2 0.1 0.1 0.4 0.1 0.1 0.1 0.3 0.7 0.8 0.5 0.7 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.0 0.0 0.4 0.3 0.9 0.9 0.7 0.8 0.2 0.2 0.5 0.2
0.9 0.2 0.4 0.4 0.0 0.0 0.5 0.2 0.5 1.0 0.4 0.3 0.1 0.0 0.1 0.0 0.2 0.0 0.1 0.0 0.0 0.1 0.4 0.3 0.4 0.8 0.9 0.7 0.2 0.2 0.6 0.1

Figure 6 Membership function of the word ‘8’
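
Scaling the superimposed counts by the number of training patterns gives grades between 0 and 1, as in Figure 6. The sketch below also shows one plausible way to score an unknown BTFSP against each word template and pick the word with the largest score. The scoring rule used here, the mean membership grade over the cells where the unknown pattern has a peak, is an assumption chosen so that the score does not depend on the number of peaks; it is not taken from the paper's equations. The template words and values are made up.

```python
def normalize(counts, num_patterns):
    """Scale superimposed BTFSP counts to membership grades in [0, 1]."""
    return [[c / num_patterns for c in row] for row in counts]


def similarity(membership, unknown_btfsp):
    """Assumed score: average membership grade at the unknown pattern's peaks."""
    hits = [membership[t][f]
            for t, row in enumerate(unknown_btfsp)
            for f, bit in enumerate(row) if bit]
    return sum(hits) / len(hits) if hits else 0.0


def recognize(templates, unknown_btfsp):
    """Return the word whose template gives the largest similarity, plus all scores."""
    scores = {word: similarity(m, unknown_btfsp) for word, m in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy templates built from 10 training patterns each (2 frames x 3 sections).
templates = {
    "one": normalize([[0, 10, 2], [9, 1, 0]], num_patterns=10),
    "two": normalize([[8, 1, 1], [0, 2, 10]], num_patterns=10),
}
unknown = [[0, 1, 0], [1, 0, 0]]  # one peak per frame

word, scores = recognize(templates, unknown)
print(word, scores)
```

Because the score is an average over the unknown pattern's peaks, adding more peaks per frame changes which cells are sampled but not the scale of the result.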


(Block diagram; blocks: microphone, amplifier, lowpass filter, TMS320C25 microprocessor system, interface circuit, seven-segment display, robot arm system, action.)
Figure 7 The block diagram of speech recognition controller system

Table 5.2 The recognition rate of this experiment.
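
The recognition rates quoted in Section 5 for the 10-word vocabulary with 32 frames and 32 sections can be tabulated and checked against the paper's observation that the number of peaks should not exceed one third of the number of sections. The short script below only restates those reported figures; the variable names are illustrative.

```python
# Recognition rate (%) by number of peaks N, as reported in the text for
# the 10-word vocabulary with 32 frames x 32 sections.
rates_32x32 = {1: 0, 2: 0, 4: 40, 6: 50, 8: 65, 10: 80, 12: 70}

sections = 32
limit = sections / 3  # "one third of the number of sections"

best_n = max(rates_32x32, key=rates_32x32.get)

print("N   rate(%)  relative to sections/3")
for n, rate in sorted(rates_32x32.items()):
    position = "within" if n <= limit else "above"
    print(f"{n:<3} {rate:<8} {position}")

print(f"best N = {best_n}, sections/3 = {limit:.2f}")
```

The best reported setting, N = 10, is the largest tested value below 32/3, and the first value above that bound, N = 12, is where the reported rate drops.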


The blocks are described in more detail as follows.
(1) Interface circuit: It is combined with the microprocessor system and the output devices.
(2) Robot arm system: This experiment adopts the TICR type A system by Tatung Company, with a five-axis robot arm.

First, speech patterns recorded on a SUN workstation are simulated in a program on a personal computer, using the number words from 0 to 9. Every speech signal is divided into 16 equal frames, and each of these 16 frames is divided into 16 sections after the DFT. Then the main peak is selected from every frame. When a peak is located in a section, the bit is set to '1'; otherwise it is set to '0'. In this way, the resulting recognition rate is 0. After that, the two maximum peaks are selected and set to '1' in the BTFSP. The recognition rate is again 0. Then four or six maximum peaks are selected with the same procedure, and the recognition rate rises to 40%. When the number of peaks is increased to N = 8, the rate is still 40%. It is found that the recognition rate does not improve as the number of peaks is increased further.

Next, the word pattern is extended to 32 frames, each with 32 sections. The vocabulary size is 10 words. The rate varies with the number of peaks. When the number of peaks is set to N = 1 and N = 2, the rate is 0. When N = 4, the rate is 40%; when N = 6, the rate is 50%; and when N = 8, the rate rises to 65%. Finally, when the number of peaks is N = 10, the speech recognition rate is 80%. As the number of peaks increases further, the rate tends to level off and may decrease. When the number of peaks is N = 12, the rate is down to 70%. This indicates that the number of peaks should not be greater than one third of the number of sections. When the vocabulary size is 5 words, the speech recognition rate is over 95%. The recognition rate is listed in table 1. When this method is applied to the actual on-line hardware system, it gives a satisfactory recognition rate.

6 Conclusion

In this paper, a detailed fuzzy analysis of speech patterns and an independent firmware system for speech recognition have been accomplished successfully. Both the simulation and the actual independent hardware system give good recognition rates. This algorithm is suitable for small-vocabulary speech recognition systems. To increase the number of recognized words, the number of frames, sections, or peaks of the speech spectrum pattern must be increased. Conversely, if the number of recognized words is to be decreased, the speech spectrum pattern can be reduced. This method can be used in a speaker-independent system or a speaker-dependent system, depending on the training method.

References

[1] Deller J. R., Proakis J. G., and Hansen J. H., Discrete-time processing of speech signals, Macmillan Publishing Company, New York, 1993.
[2] Huang X. and Lee K. F., On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition, IEEE Transactions on Speech and Audio Processing, 2:150-157, April 1993.
[3] Pan K. C., Soong F. K., and Rabiner L. R., A vector-quantization-based preprocessor for speaker-independent isolated word recognition, IEEE Transactions on Speech and Audio Processing, 3:546-560, June 1985.
[4] Ainsworth W. A., Speech Recognition by Machine, Peter Peregrinus Ltd., 1988.
[5] Zadeh L. A., Fuzzy sets, Information and Control, 8:338-353, 1965.
[6] Lee C. C., Fuzzy logic in control system: fuzzy logic controller - part I, IEEE Transactions on Systems, Man, and Cybernetics, 2:404-418, March/April 1990.
[7] Lee C. C., Fuzzy logic in control system: fuzzy logic controller - part II, IEEE Transactions on Systems, Man, and Cybernetics, 2:419-435, March/April 1990.
[8] Braae M. and Rutherford D. A., Selection of parameters for a fuzzy logic controller, Fuzzy Sets and Systems, 2:185-199, 1979.
[9] Bedrosian S., Designing with fuzzy logic, IEEE Spectrum, 42-44, November 1990.

Acknowledgments

This work was supported by the National Science Council of Republic of China, under contract NSC 86-2213-E-036-009.
