Infant Cry Language Analysis and Recognition: An Experimental Approach
Abstract—Recently, lots of research has been directed towards natural language processing. However, the baby's cry, which serves as the primary means of communication for infants, has not yet been extensively explored, because it is not a language that can be easily understood. Since cry signals carry information about a baby's wellbeing and can be understood by experienced parents and experts to an extent, recognition and analysis of an infant's cry is not only possible, but also has profound medical and societal applications. In this paper, we obtain and analyze audio features of infant cry signals in the time and frequency domains. Based on the related features, we can classify given cry signals to specific cry meanings for cry language recognition. Features extracted from the audio feature space include linear predictive coding (LPC), linear predictive cepstral coefficients (LPCC), Bark frequency cepstral coefficients (BFCC), and Mel frequency cepstral coefficients (MFCC). A compressed sensing technique was used for classification, and practical data were used to design and verify the proposed approaches. Experiments show that the proposed infant cry recognition approaches offer accurate and promising results.

Index Terms—Compressed sensing, feature extraction, infant cry signal, language recognition.

I. INTRODUCTION

… head restraints [3]. Previously, in [4], [5], we proposed a preliminary approach which can recognize the cry signals of a specific infant. However, only limited normal cry signals, such as hunger, a wet diaper, and attention, have been studied, and the algorithms work only for the specific infants in the study, in a controlled lab environment. Nevertheless, an abnormal cry can be associated with severe or chronic illness, so the detection and recognition of abnormal cry signals are of great importance. Compared with normal cry signals, abnormal cry signals are more intense, requiring further evaluation [6]. An abnormal cry is often related to medical problems such as infection, central nervous system abnormality, pneumonia, sepsis, laryngitis, pain, hypothyroidism, trauma to the hypopharynx, and vocal cord paralysis. Therefore, approaches which can identify and recognize both normal and abnormal cry signals in practical scenarios are of extreme importance. In this paper, we propose a novel cry language recognition algorithm which can distinguish the meanings of both normal and abnormal cry signals in a noisy environment. Additionally, the proposed algorithm is independent of the individual crier. Hence, this algorithm can be widely used in practical scenarios to recognize and classify various cry features.
… which are related with different cry reasons. In this paper, the short-time Fourier transform (STFT) is used to analyze the cry signals. Recently, speech recognition and acoustic signal classification techniques have been widely used in many areas such as manufacturing, communication, consumer electronics, and medical care [10]−[12]. Speech recognition is a signal processing procedure that transfers speech signal waveforms in a spatial domain into a series of coefficients, called a feature, which can be recognized by a computer [10]−[13]. Infant cry signals are time-varying, non-stationary random signals similar to speech signals, and the stimulus for an infant cry signal is the same as that for a voiced speech signal. In this paper, we use techniques originally designed for automatic speech recognition to detect and recognize the features of infant cry signals, and use compressed sensing to analyze and classify those signals. Fig. 1 shows the procedure of cry signal recognition, which consists of the following steps (a minimal code skeleton of this pipeline is sketched below):

Step 1: Cry unit detection;
Step 2: Feature extraction;
Step 3: Analysis and classification.

In the first few weeks after birth, crying has a reflexive-like quality and is most likely tied to the regulation of physiological homeostasis, as the neonate is balancing internal demands with external demands [15].

As physiological processes stabilize, periods of alertness and attention increase, which place additional demands on regulatory functions. Crying can occur when the system becomes overloaded due to external stimulation. Crying is also considered a mechanism for discharging energy or tension. The need for tension reduction is especially acute at times of major developmental upheavals and shifts. Unexplained fussiness and sudden increases in crying occur between 3 and 12 weeks of age due to maturational changes in brain structure and shifts in the organization of the central nervous system. Physiological and anatomical changes that occur around 1 to 2 months result in more control over vocalization; thus crying becomes more differentiated. At the age of 7−9 months there is a second bio-behavioral shift, characterized by major cognitive and affective changes that are also thought to reflect central nervous system reorganization. Crying now occurs for additional reasons, such as fear and frustration [15].
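The three-step procedure above can be summarized as the following minimal skeleton. The function names are illustrative (the paper publishes no code); each stub corresponds to one of the sketches given in the sections that follow.

import numpy as np

def detect_cry_units(x: np.ndarray, fs: int):
    """Step 1: segment a recording into voiced cry units (see Section III)."""
    raise NotImplementedError("see the STE/STZC detection sketch below")

def extract_features(unit: np.ndarray, fs: int):
    """Step 2: compute LPC/LPCC/MFCC/BFCC features (see the feature section)."""
    raise NotImplementedError("see the feature extraction sketch below")

def classify(features):
    """Step 3: map a feature vector to a cry meaning, e.g., via compressed sensing."""
    raise NotImplementedError("see the compressed sensing sketch below")

def recognize(x: np.ndarray, fs: int):
    """End-to-end pipeline: detection -> features -> classification."""
    return [classify(extract_features(u, fs)) for u in detect_cry_units(x, fs)]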
III. CRY SIGNAL TIME-FREQUENCY ANALYSIS AND DETECTION

After obtaining cry signals, we analyze the recorded signals by using waveform and time-frequency analysis. Then we conduct signal detection and segmentation for later pattern extraction. Signal detection processes instances of voiced activity instead of spending computational time during silent periods. To accurately detect potential periods of voiced activity, two short-time signal detection techniques are used.

A. Short-Time Fourier Analysis

In this section, we use time-frequency analysis to analyze the infant cry signals. It is well known that the discrete Fourier transform (DFT) of a long sequence is an estimate of the power spectral density (PSD), called a periodogram [11]. Different cry signals from different infants would produce similar gross PSDs. Therefore, we use the STFT to obtain the time-varying properties of cry signals. The STFT is defined as

X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} x(m) w(n-m) e^{-j\omega m}    (1)

where w(n-m) is a real window sequence that determines the portion of the signal x(n) receiving emphasis at a particular time index n. The STFT is a time-dependent complex function of the time index n and the frequency \omega.

We can view the STFT as the discrete-time Fourier transform (DTFT) of the sequence x(m)w(n-m). An alternative interpretation is to consider X_n(e^{j\omega}) as a function of n at a given frequency; it then becomes a discrete-time convolution and can be considered as linear filtering.

The shape of the window sequence has an important effect on this time-dependent Fourier transform. The STFT of a given signal is

X_n(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} W(e^{-j\tilde{\omega}}) e^{-j\tilde{\omega} n} X(e^{j(\omega-\tilde{\omega})}) \, d\tilde{\omega}.    (2)

That is, the Fourier transform (FT) of the input signal sequence is convolved with the FT of the shifted window. To represent X(e^{j\omega}) using the STFT X_n(e^{j\omega}), we choose a window function with a spectrum highly concentrated around the origin. In this paper, a Hamming window is used to conduct the STFT.

B. Short-Time Energy

Short-time energy (STE) performs well as a cry detector because there is a noticeable difference in average energy between voiced and unvoiced cry signals, and between crying and silence. This technique is usually paired with short-time zero crossing for a robust detection scheme.

C. Short-Time Zero Crossing

Short-time zero crossing (STZC) is defined as the rate of signal sign change [11]:

Z(n) = \frac{1}{N} \sum_{m=0}^{N-1} \left| \mathrm{sign}(x(n-m)) - \mathrm{sign}(x(n-m-1)) \right|    (4)

where sign(x(m)) = 1 for x(m) >= 0, and -1 for x(m) < 0.

STZC estimation works well in the crying detector because there are noticeably fewer zero crossings in voiced crying as compared with unvoiced crying. Short-time zero crossing can predict the start and end points of cry signals, as shown in Fig. 8. The STZC approach can effectively obtain the envelope of a non-silent signal and, combined with short-time energy, can effectively track instances of potentially voiced signals, which are the signals of interest for analysis.

Not all signals bounded by the STZC boundary contain cries. Large STZC envelopes with low energy tended to contain cry precursors such as whimpers and breathing events. Nor did all signals with non-negligible STE contain cries: infant coughing can also produce similar STZC envelopes and noticeable STE values. In this research, crying is defined as a high-energy segment of sufficiently long duration.

In this research, we use both STE and STZC to detect cry units. As the normal infant cry duration is around 1.6 s, the two quantifiable threshold conditions for a desired voiced cry are [14]:
1) normalized energy > 0.05 (to eliminate non-voiced artifacts such as breathing/whimpering and to supersede cry precursors);
2) signal envelope period > 0.1 s (to eliminate impulsive voiced artifacts such as coughing).
A sketch of such a detector is given below.
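The two conditions translate directly into a frame-based detector. The following is a minimal sketch, assuming NumPy: the frame and hop sizes and the STZC ceiling zc_max are illustrative assumptions, while the energy and duration thresholds come from the conditions above.

import numpy as np

def frame_apply(x, frame, hop, fn):
    """Apply fn to successive analysis frames of x (no padding, for brevity)."""
    return np.array([fn(x[i:i + frame]) for i in range(0, len(x) - frame + 1, hop)])

def detect_cry_units(x, fs, frame_ms=25.0, hop_ms=10.0, zc_max=0.35):
    """Return (start, end) sample indices of candidate voiced cry units,
    combining short-time energy (STE) with short-time zero crossing (STZC)."""
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    ste = frame_apply(x, frame, hop, lambda f: np.sum(f ** 2))
    ste = ste / (ste.max() + 1e-12)              # normalized energy in [0, 1]
    sgn = np.where(x >= 0, 1, -1)                # sign() as defined for (4)
    zc = np.abs(np.diff(sgn)) / 2.0              # 1 at each sign change; (4) scaled to [0, 1]
    stzc = frame_apply(zc, frame, hop, np.mean)
    n = min(len(ste), len(stzc))
    voiced = (ste[:n] > 0.05) & (stzc[:n] < zc_max)   # condition 1 + "few zero crossings"
    units, start = [], None
    for i in range(n + 1):                       # extra pass at i == n closes open segments
        if i < n and voiced[i]:
            start = i if start is None else start
        elif start is not None:
            if (i - start) * hop / fs > 0.1:     # condition 2: envelope longer than 0.1 s
                units.append((start * hop, (i - 1) * hop + frame))
            start = None
    return units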
… brain will actively process those analog auditory qualities and make decisions regarding the sound.

Audio feature extraction hinges upon digital signal processing of audio signals to quantize acoustic information in a manner that makes classification practical and tractable. The comparison of time-domain waveforms can be used as a measure for signal classification. On the other hand, time-domain signals can also be segmented and processed in smaller time windows to generate frequency-domain snapshots of the segments through the Fourier transform. The frequency-domain analysis of signals yields information closely tied to timbre and pitch. In this paper, we leverage both time- and frequency-domain analysis to cover all four primary auditory qualities.

In this section, linear predictive coding (LPC), linear predictive cepstral coefficients (LPCC), mel frequency cepstral coefficients (MFCC), and bark frequency cepstral coefficients (BFCC) are extracted from cry signals as features. Additionally, compressed sensing (CS) is used for cry feature recognition in this paper.

B. Linear Predictive Coding

The waveforms of two similar sounds will also be similar. If two infant cries have very similar waveforms, it indicates that they should possess the same impetus. However, it is impractical to conduct a full sample-by-sample comparison between cry signals due to the complexity of the sampled audio signals. For better performance in the time-domain comparison of infant cry signals, linear predictive coding (LPC) is applied.

There are two acoustic sources associated with voiced and unvoiced speech. Voiced crying is produced by the vibration of the vocal cords caused by the airflow from the lungs, and this vibration is periodic in nature; unvoiced crying is produced by constrictions in the vocal tract resulting in random airflow [12]. The basis of the source-filter model of speech is that crying can be synthesized by generating an acoustic source and passing it through an all-pole filter.

LPC produces a vector of coefficients that represent a spectral shaping filter [11]. The input signal to this filter is either a pulse train for voiced sounds or white noise for unvoiced sounds. This shaping filter is an all-pole filter represented as [11]

H(z) = \frac{1}{1 - \sum_{i=1}^{M} a_i z^{-i}}    (5)

where the a_i are the linear prediction coefficients and M is the number of poles. The present sample of the cry signal can then be described as a linear combination of the past M samples of the cry signal.

The coefficients {a_i} can be estimated by either the autocorrelation or the covariance method [10]. Effectively, the purpose of LPC is to take a large waveform and compress it into coefficients, a more manageable form. Because similar waveforms will also result in similar acoustic output, LPC serves as a time-domain measure of how close two different waveforms are.

Linear predictive cepstral coefficients (LPCC) represent the LPC coefficients in the cepstral domain [12]. This feature reflects the differences in the biological structure of the human vocal tract [9]. LPCC derives from LPC recursively as [11]

LPCC_1 = LPC_1
LPCC_i = LPC_i + \sum_{k=1}^{i-1} \frac{k}{i} LPCC_{i-k} LPC_k,  1 < i \le M    (6)

where M is the order of the LPCC coefficients.

C. Mel Frequency Cepstral Coefficients

Mel frequency cepstral coefficients (MFCC) are coefficients that describe the mel frequency cepstrum [13], [18]. In sound processing, the mel frequency cepstrum is a representation of the short-time power spectrum of a sound, based on a linear cosine transform of a log spectrum on a non-linear mel scale of frequency. The mel frequency cepstrum is obtained with the following steps. The short-time Fourier transform of the signal is taken to obtain the quasi-stationary short-time power spectrum F(f) = F{f(t)}. The frequency axis is then mapped to the mel-scale perceptual filter bank with 18 triangular band-pass filters equally spaced on the mel range of frequency, F(m). These triangular band-pass filters smooth the magnitude spectrum such that the harmonics are flattened to obtain the envelope of the spectrum. The log of the filtered spectrum is obtained, and the Fourier transform of the log spectrum squared then yields the power cepstrum of the signal. The mel mapping is

Mel(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right).    (7)

At this point, the discrete cosine transform (DCT) of the power cepstrum is taken to obtain the MFCC, a tool commonly used to measure audio signal similarity. The DCT coefficients are retained, as they represent the power amplitudes of the mel frequency cepstrum.

D. Bark Frequency Cepstral Coefficients

Similar to MFCC, BFCC warps the power cepstrum in such a way that it matches the human perception of loudness. The method of obtaining BFCC is similar to that of MFCC [12]. In BFCC, frequencies are converted to the bark scale as follows:

Bark(f) = 13 \arctan(0.00076 f) + 3.5 \arctan\left(\left(\frac{f}{7500}\right)^2\right)    (8)

where Bark denotes the bark frequency and f is the frequency in Hertz. The mapped bark frequency is passed through 18 triangular band-pass filters. The center frequencies of these triangular band-pass filters correspond to the first 18 of the 24 critical frequency bands of hearing.

BFCC is obtained by applying the DCT to the bark frequency cepstrum, and the 10 DCT coefficients describe the amplitudes of the cepstrum. The power cepstrum possesses the same sampling rate as the signal, so the BFCC is obtained by performing the LPC algorithm on the power cepstrum in 128-sample frames. BFCC encodes the cepstrum waveform in a compact fashion that makes it suitable for classification schemes.
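For concreteness, the following is a hedged sketch of these four extractors operating on a single windowed frame. The LPC order, FFT size, and triangular filter-bank construction are assumptions chosen to match the description above (18 band-pass filters, 10 retained DCT coefficients); the LPCC recursion implements (6), and the frequency warpings implement (7) and (8).

import numpy as np
from scipy.fft import dct

def lpc(frame, order=12):
    """LPC coefficients {a_i} of the all-pole model (5), estimated with the
    autocorrelation method and the Levinson-Durbin recursion."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12
    for i in range(1, order + 1):
        k = -(r[i] + a[1:i] @ r[i - 1:0:-1]) / err    # reflection coefficient
        a[1:i] += k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return -a[1:]                                     # so x(n) ~ sum_i a_i x(n - i)

def lpcc(a):
    """LPCC from LPC via the recursion in (6); arrays treated as 1-based."""
    M = len(a)
    c = np.zeros(M + 1)
    ap = np.concatenate(([0.0], a))
    c[1] = ap[1]
    for i in range(2, M + 1):
        c[i] = ap[i] + sum(k / i * c[i - k] * ap[k] for k in range(1, i))
    return c[1:]

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)         # mel map, (7)

def hz_to_bark(f):
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)  # bark map, (8)

def warped_cepstrum(frame, fs, warp, n_filters=18, n_ceps=10, nfft=512):
    """MFCC/BFCC-style coefficients: power spectrum -> 18 triangular band-pass
    filters equally spaced on the warped axis -> log -> DCT."""
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), nfft)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    # Filter edges equally spaced on the warped (mel or bark) axis:
    edges = np.interp(np.linspace(warp(freqs[0]), warp(freqs[-1]), n_filters + 2),
                      warp(freqs), freqs)
    fb = np.zeros((n_filters, len(freqs)))
    for j in range(n_filters):
        lo, mid, hi = edges[j:j + 3]
        fb[j] = np.maximum(0.0, np.minimum((freqs - lo) / (mid - lo + 1e-12),
                                           (hi - freqs) / (hi - mid + 1e-12)))
    return dct(np.log(fb @ spec + 1e-12), norm="ortho")[:n_ceps]

On a 256-sample frame of a cry unit sampled at fs, one would compute a = lpc(frame), c = lpcc(a), mfcc = warped_cepstrum(frame, fs, hz_to_mel), and bfcc = warped_cepstrum(frame, fs, hz_to_bark).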
… infant has hearing impairment. Cry signals were filed under five different causes: needing a diaper (6 observations), being hungry (16 observations), needing attention (8 observations), needing sleep (8 observations), and being in discomfort (10 observations), which included injection, sputum induction, and blood tests. For the 20−40 second recording time, we assumed that an infant would not change his/her mood or desire within the recording period.

Based on the data obtained from the babies and known facts from the experts, we listed the "discomfort" cry in the abnormal category of crying and the other cries in the normal category. We used "hungry", "diaper", "attention", "sleepy" and "discomfort" as pilot features, but more features, such as tired, cold or hot, or needing a burp, can easily be added.

Signal acquisition and numbering of cry audio files with the associated infant is shown in Table I. In the Age column of the table, d stands for day, w stands for week, and m stands for month.

TABLE I
CRY SIGNAL INFORMATION

No.  Cause              Sex  Age  Race       File
1    Diaper             F    2w   Asian      T07
2    Attention          F    -    Asian      T10A
3    Attention          -    -    -          T34
4    Attention          -    -    -          T105
5    Hungry             M    1w   Asian      T11
6    Attention          -    -    -          T33
7    Hungry             -    -    -          T35
8    Hungry             M    1w   Asian      T19
9    Sleepy             M    3m   Asian      T20
10   Disturbed          -    -    -          T32
11   Sleepy             F    -    Asian      T21
12   Sleepy             -    -    -          T23
13   Diaper             F    3d   Asian      T22
14   Inject             M    1w   Asian      T24
15   Sputum induction   M    2w   -          T110
16   Sleepy             M    1w   Asian      T25
17   Hungry             M    2w   -          T113
18   Hungry             F    3d   Asian      T26
19   Attention          F    1w   -          T104
20   Hungry             F    1w   -          T122
21   Attention          F    8d   Asian      T27
22   Uncomfortable      M    2w   Asian      T28
23   Blood test         M    3w   Asian      T109
24   Diaper             F    2w   Asian      T29
25   Attention          M    9d   Asian      T30
26   Attention          -    -    -          T31
27   Diaper             M    2d   Asian      T36
28   Diaper             M    6d   -          T116
29   Diaper             F    9d   Asian      T37
30   Hungry             F    2w   -          T117
31   Other              M    1w   Asian      T106
32   Hungry             M    2w   -          T121
33   Hungry             M    2w   Asian      T107
34   Blood test         F    2w   Asian      T108
35   Uncomfortable      F    1m   Asian      T111
36   Hungry             F    -    -          T124
37   Hungry             M    5d   Asian      T112
38   Hungry             M    11d  Asian      T114
39   Hungry             M    2w   -          T115
40   Hungry             M    2w   -          T120
41   Sleepy             F    8d   Asian      T118
42   Sleepy             F    -    -          T119
43   Hungry             M    1w   Caucasian  T123
44   Uncomfortable      M    14w  Asian      T125
45   Sleepy             M    -    -          T126
46   Hungry             M    -    -          T127
47   Sleepy             M    -    -          T128
48   Uncomfortable      M    -    -          T129

We analyzed the different cry signals by using time-frequency analysis. A Hamming window of length 256 was used, the overlap was 128, and a 512-point fast Fourier transform (FFT) was used for calculating the STFT.

Figs. 2−6 show the waveforms and STFTs (spectrograms) of the different cry signals. It is obvious that different catalogs of cry signals have different waveform and spectrum characteristics.

Diaper-related crying is considered a normal cry and has a pattern of crying and silence, as shown in Fig. 2. This kind of crying starts with a cry coupled with a briefer silence, which is followed by a short high-pitched inspiratory whistle. Then there is a brief silence followed by another cry. Fig. 3 shows attention-related crying, which is also a normal cry. This type of cry is characterized by a similar temporal sequence but can be distinguished by differences in the length of the various frequency components.

Hunger-related crying, which is also a normal cry, is the most general cry. The duration of crying is not only longer, but it is also followed by a longer silence, as shown in Fig. 4. Typically, this cry is louder and more abrupt compared with attention- or diaper-related crying.

Fig. 2. Diaper: signal waveform (upper) and spectrogram (lower).

Fig. 3. Attention: signal waveform (upper) and spectrogram (lower).
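The spectrogram settings quoted above map directly onto SciPy's STFT: a Hamming window of length 256, overlap 128, and a 512-point FFT. A minimal sketch follows; the sampling rate and the random stand-in signal are assumptions for the demo.

import numpy as np
from scipy.signal import stft

fs = 8000                                       # assumed sampling rate (Hz)
x = np.random.randn(4 * fs)                     # stand-in for a recorded cry signal

f, t, Zxx = stft(x, fs=fs, window="hamming",
                 nperseg=256, noverlap=128, nfft=512)
spec_db = 20.0 * np.log10(np.abs(Zxx) + 1e-12)  # magnitude spectrogram in dB
print(spec_db.shape)                            # (257 frequency bins, n frames)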
784 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 6, NO. 3, MAY 2019
Fig. 5. Sleepy: waveform (upper) and spectrogram (lower). Fig. 7. Baby cry signal, short time energy and detected cry unit for cry file
T19.wav.
Fig. 5 shows sleep-related crying, which is also a normal Fig. 8. Baby cry signal, short time zero-crossing and detected cry for T19.
cry. However, it is quite different from the previous normal wav.
LIU et al.: INFANT CRY LANGUAGE ANALYSIS AND RECOGNITION: AN EXPERIMENTAL APPROACH 785
Fig. 9. BFCC features for attention, diaper, hungry and discomfort cry signals.
… segments were removed from the cry signals. For all those 48 recording files, we got 151 "attention" cry units, 137 "diaper change needed" cry units, 422 "hungry" cry units, 79 "sleepy" cry units, and 182 "discomfort" cry units.

Fig. 9 shows the BFCC features for different catalogs from different infants. BFCC features for attention from 4 different babies are shown in subplot (a). Features from one infant are similar to those from other infants when they had a similar reason to cry. Subplot (b) shows the BFCC features of "diaper change needed" cry units from 4 different cry files. Again, the results show similar features for needing a diaper change across different infants. Since attention-related crying and diaper-related crying are both characterized as normal crying, their intensity levels are similar, but lower than that of hunger-related crying. Fig. 9 (c) shows the BFCC features of "hungry" cry units from 4 different babies. Hunger-related crying had the highest intensity level in the normal cry catalog. The BFCC features obtained from "hungry" crying are quite different from those of "attention" crying and "diaper" crying. It is shown that the BFCC patterns changed from the low stress level cries to the high stress level cries. There is an abrupt jump from coefficient 1 to coefficient 2, which is close to the trend of abnormal cry signals. Fig. 9 (d) shows discomfort-related crying from 4 files associated with 4 different babies. The BFCC features show a similar trend among those infants. They are quite different from normal cry signals, especially the low intensity level cries, such as diaper-related and attention-related crying. Even compared with hunger-related cry signals, the values of the coefficients were higher, which means discomfort-related crying produced higher energy cry signals, which matches the experts' experience.

Cry units from each class (100 "attention" cry units, 50 "diaper change needed" cry units, 120 "discomfort" cry units, and 200 "hungry" cry units) were used as training signals, and the rest of the data (51 attention, 87 diaper, and 222 hungry) were used for testing purposes. Fig. 10 shows the cry units for 3 different cry signals using LPC, LPCC, MFCC and BFCC features. It is obvious that different cry signals have different features.

The compressed sensing technique was used to conduct recognition and classification, and the classification rate, used to evaluate the performance, was defined as

P_c = \frac{N_{right}}{N_{total}} \times 100\%    (14)

where N_{right} is the number of correct classifications and N_{total} is the total number of test cry units. We used sleep-related crying and hunger-related crying to test the performance of CS: 79 sleepy cry units and 200 hungry cry units were used as training data, and 100 sleepy cry units and 222 hungry cry units were used as testing data.

Comparing Fig. 5 and Fig. 6, the differences between the waveforms of the two types of cries are more pronounced in the frequency domain than in the time domain. Since LPC features are obtained only in the time domain, CS cannot distinguish hungry from sleepy accurately based on LPC features, as shown in Table II. However, LPCC represents the LPC coefficients in the cepstral domain to reflect the differences in the biological structure of the vocal tract, and the LPCC algorithm produces different features for hungry and sleepy cries. Furthermore, since the BFCC and MFCC algorithms capture both time- and frequency-domain information of the cry signals, BFCC and MFCC produce clearly different features for hunger-related and sleep-related crying, as shown in Fig. 10. As a result, BFCC and MFCC features outperform LPC and LPCC features, with a classification rate of around 70% (Table II).

TABLE II
INFANT CRY RECOGNITION CORRECT RATE WITH COMPRESSED SENSING TECHNIQUE AND DIFFERENT FEATURES FOR CRY SIGNALS (SLEEPY AND HUNGRY)

           Data ratio used to construct the matrix
Features   0.4     0.5     0.6     0.7     0.8     0.9
BFCC       0.6991  0.6915  0.7067  0.6842  0.7105  0.6842
LPC        0.5133  0.4681  0.4933  0.4737  0.4211  0.5789
LPCC       0.6018  0.6064  0.6267  0.5965  0.5789  0.4737
MFCC       0.6814  0.6596  0.6767  0.7018  0.7105  0.6842

Experimental results on hungry, discomfort, attention, and diaper cry signals are shown in Fig. 11 and Table III. For the same reason mentioned above, BFCC and MFCC features outperform LPC and LPCC features.
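These classification experiments can be mimicked with a sparse-representation classifier plus the rate in (14). The paper does not name its sparse recovery solver; the sketch below uses orthogonal matching pursuit from scikit-learn as one standard choice, and the dictionary layout (one column per training feature vector) is an assumption.

import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def cs_classify(A, labels, y, n_nonzero=10):
    """A: (d, n) dictionary whose n columns are training feature vectors;
    labels: (n,) class of each column; y: (d,) test feature vector.
    Recover a sparse code c with y ~ A c, then pick the class whose
    training columns give the smallest reconstruction residual."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                    fit_intercept=False).fit(A, y)
    c = omp.coef_
    best, best_res = None, np.inf
    for cls in np.unique(labels):
        mask = labels == cls
        res = np.linalg.norm(y - A[:, mask] @ c[mask])   # class-wise residual
        if res < best_res:
            best, best_res = cls, res
    return best

def classification_rate(pred, truth):
    """P_c = N_right / N_total * 100%, as in (14)."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    return 100.0 * np.mean(pred == truth)

With this setup, the training cry-unit features form the columns of A, each test unit is labeled by cs_classify, and classification_rate reproduces P_c.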
TABLE III
INFANT CRY RECOGNITION CORRECT RATE WITH COMPRESSED SENSING TECHNIQUE AND DIFFERENT FEATURES

           Data ratio used to construct the matrix
Features   0.4     0.5     0.6     0.7     0.8     0.9
BFCC       0.5701  0.5393  0.5742  0.5754  0.5111  0.6842
LPC        0.6131  0.6225  0.5854  0.5688  0.5419  0.4667
LPCC       0.5009  0.4989  0.4986  0.4944  0.5028  0.4889
MFCC       0.5907  0.5910  0.5938  0.5502  0.5140  0.5333

We also investigated the performance in terms of the recognition rate for different features and different popular classification methods, as shown in Table IV. We found that MFCC and BFCC outperform the other features, and that the ANN and CS techniques provide higher recognition rates. Different combinations achieve different performance: for example, LPC works well with NN, while MFCC and LPCC achieve higher recognition rates when combined with CS than with NN. The highest recognition correct rate for the infant cry application, 76.47%, is achieved by using the BFCC feature with an ANN.

It is obvious that there are universal, individual-independent patterns in infant cry signals. Based on the time and frequency features, it is feasible to discern between different cry units. BFCC features and CS algorithms can provide reasonable and accurate recognition capabilities. The experimental results of the proposed approach match experts' knowledge and judgments very well.
Fig. 11. Features for attention, diaper, hungry and discomfort cry signals.

TABLE IV
INFANT CRY RECOGNITION CORRECT RATE BY USING DIFFERENT FEATURES AND RECOGNITION TECHNIQUES

Technique                         LPC     LPCC    MFCC    BFCC
Nearest neighborhood (NN)         0.6384  0.4795  0.6389  0.6522
Artificial neural network (ANN)   0.5455  0.5188  0.6045  0.7647
Compressed sensing (CS)           0.5789  0.6267  0.7105  0.7064
VI. CONCLUSION

This paper presents a novel detection and recognition method for individual-independent infant cries in a noisy environment. Audio features of infant cry signals were obtained in the time and frequency domains and were used to perform infant cry language recognition. Practical data from hospitals were used to design and verify the proposed approaches. Experiments proved that the proposed infant cry unit recognition models offer accurate and promising results with far-reaching medical and societal applications. Our future research includes taking multiple features into consideration and applying reinforcement learning to improve the performance. We plan to collect more data and include more cry reasons as well.

REFERENCES

[3] Y. Kheddache and C. Tadj, "Acoustic measures of the cry characteristics of healthy newborns and newborns with pathologies," Journal of Biomedical Science and Engineering, vol. 6, no. 8, 9 pages, 2013.

[4] L. Liu, K. Kuo, and S. M. Kuo, "Infant cry classification integrated ANC system for infant incubators," in Proc. IEEE Int. Conf. on Networking, Sensing and Control, Paris, France, 2013, pp. 383−387.

[5] L. Liu and K. Kuo, "Active noise control systems integrated with infant cry detection and classification for infant incubators," in Proc. Acoustics, 2012, pp. 1−6.

[6] L. LaGasse, A. Neal, and M. Lester, "Assessment of infant cry: acoustic cry analysis and parental perception," Ment. Retard. Dev. Disabil. Res. Rev., vol. 11, no. 1, pp. 83−93, 2005.

[7] G. Várallyay Jr., "Future prospects of the application of the infant cry in the medicine," Periodica Polytechnica Ser. El. Eng., vol. 50, no. 1−2, pp. 47−62, 2006.

[8] G. Buonocore and C. V. Bellieni, Neonatal Pain: Suffering, Pain and Risk of Brain Damage in the Fetus and Newborn. Berlin, Germany: Springer, 2008.

"Deep scalogram representations for acoustic scene classification," IEEE/CAA J. Autom. Sinica, vol. 5, no. 3, pp. 662−669, May 2018.

[12] D. Yu and J. Li, "Recent progresses in deep learning based acoustic models," IEEE/CAA J. Autom. Sinica, vol. 4, no. 3, pp. 396−409, Apr. 2017.

[13] B. Gold and N. Morgan, Speech and Audio Signal Processing. New York, NY, USA: John Wiley & Sons, 2011.

[14] V. R. Fisichelli, S. Karelitz, C. F. Z. Boukydis, and B. M. Lester, "The cry latencies of normal infants and those with brain damage," in Infant Crying, Plenum Press, 1985.

Lichuan Liu (M'06–SM'11) received the B.S. and M.S. degrees in electrical engineering from the University of Electronic Science and Technology of China in 1995 and 1998, respectively, and the Ph.D. degree in electrical engineering from the New Jersey Institute of Technology, Newark, NJ, in 2006. She joined Northern Illinois University in 2007 and is currently an Associate Professor of Electrical Engineering and the Director of the Digital Signal Processing Laboratory. Her current research includes digital signal processing, real-time signal processing, and wireless communication and networking. She has over 70 publications, including 30 journal papers and one book chapter, and has three patents awarded. She has led and participated in many research grants from sponsors such as NSF, NASA, and NIH.