An Automatic Arabic Sign Language Recognition System (ArSLRS)
Article info

Article history:
Received 5 June 2017
Revised 12 September 2017
Accepted 26 September 2017
Available online 5 October 2017

Keywords:
Gesture recognition
Arabic sign language recognition
Isolated word recognition
Image-based recognition

Abstract

Sign language recognition system (SLRS) is one of the application areas of human computer interaction (HCI) where signs of hearing impaired people are converted to text or voice of the oral language. This paper presents an automatic visual SLRS that translates isolated Arabic word signs into text. The proposed system has four main stages: hand segmentation, tracking, feature extraction and classification. A dynamic skin detector based on the face color tone is used for hand segmentation. Then, a proposed skin-blob tracking technique is used to identify and track the hands. A dataset of 30 isolated words used in the daily school life of hearing impaired children was developed for evaluating the proposed system, taking into consideration that 83% of the words have different occlusion states. Experimental results indicate that the proposed system has a recognition rate of 97% in signer-independent mode. In addition, the proposed occlusion resolving technique outperforms other methods by accurately specifying the positions of the hands and the head, with an improvement of 2.57% at s = 5, which aids in differentiating between similar gestures.

© 2017 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction

Hearing impairment is a broad term that refers to partial or complete loss of hearing in one or both ears. The level of impairment varies between mild, moderate, severe or profound.

According to the World Health Organization (WHO), in the year 2017 over 5% of the world's population – 360 million people – had disabling hearing loss (328 million adults and 32 million children). Roughly one-third of people over 65 years of age are affected by disabling hearing loss. The majority of people with disabling hearing loss live in low- and middle-income countries (Center, 2017).

SLRS is one of the application areas of HCI. The main goal of SLRS is to recognize the signs of hearing impaired people and convert them to text or voice of the oral language, and vice versa. These systems use either isolated or continuous signs. The performer in isolated systems signs only one letter or word at a time, while in continuous systems the performer signs one or more complete sentences. Further, an SLRS can be categorized as signer-dependent or signer-independent. Systems that rely on the same signers to perform in both the training and testing phases are signer-dependent, and this affects the recognition rate positively. On the other hand, in signer-independent systems the signers who performed the training stage are not admitted in the testing stage, which adds the challenge of adapting the system to accept any signer. The goal of SLRS can be achieved by either a sensor-based or an image-based system.

The sensor-based approach employs some variety of electromechanical devices that are incorporated with many sensors to recognize signs, e.g., data gloves (Shukor et al., 2015), power gloves (Mohandes et al., 2004), cyber gloves (Mohandes, 2013), and dexterous master gloves (Hoshino, 2006). Sadek et al. (2017) designed a smart glove using a few sensors depending on a statistical analysis of the anatomical shape of the hands when performing the 1300 words of the Arabic sign language (ArSL). The glove costs about $65, which is, according to the authors, 5% less than the cost of commercial smart gloves. The high cost and low normality of this method contributed to the appearance of the image-based approach, where one or more cameras are employed to capture the signs. Classification can be done either by marker-based or visual-based techniques.

In marker-based techniques, markers with predefined colors or colored gloves are placed on the fingertips and wrist. These
predefined colors are then detected and segmented from an image captured by a 2D camera using image processing methods, but these techniques also lack normality (Wang and Popović, 2009; El-Bendary et al., 2010). On the other hand, visual-based techniques use bare hands without any markers. These techniques have higher normality and higher mobility than the other types of SLRS. Visual-based SLRS have low cost, as only one camera can be used, but these techniques suffer from changes in illumination. Hand occlusion, either between the hands or with the face, is another drawback, as 2D images lack the depth information that aids in solving occlusion. This paves the way for depth sensors that depend on the RGB-D image technique, which gives the depth of each pixel in the image and helps in constructing a 3D model of the objects in the scene. Till now it is still an open field of research. In most of the research, vision-based refers to a visual-based vision system. A further discussion and detailed overview of related work in the field of SLR is given in (Cooper et al., 2011; Mohandes et al., 2014; Rautaray and Agrawal, 2015; Agrawal et al., 2016). This paper focuses on ArSL. Recent vision-based isolated-word ArSLR systems are pointed out below.

Al-Rousan et al. (2009) developed a system that automatically recognizes 30 isolated ArSL words using the discrete cosine transform (DCT) to extract features and a Hidden Markov Model (HMM) as the recognition method. The system obtained a word recognition rate of 94.2% in signer-independent off-line mode. Due to the nature of DCT, the observation features produced by the DCT algorithm misclassified similar gestures. Also, the system did not address the occlusion problem.

To overcome the misclassification of similar gestures, Al-Rousan et al. (2010) developed a system that used a two-stage hierarchical scheme of HMM classifiers. The system overcomes the occlusion state by treating the occluded objects as one object or by taking the preceding features of the objects before occlusion. In a real situation, this is not the case.

Another technique for solving the occlusion states was developed by El-Jaber et al. (2010), where stereo vision is applied to estimate and segment out the signer's body using its depth information to recognize 23 isolated gestures in signer-dependent mode. It suffers from high cost, as more than one camera is needed to construct the stereo vision. Disparity maps are computationally expensive, as any change in the distance between the two cameras and the object will affect the performance of solving the correspondence problem.

In Elons et al. (2013), a 3D model of the hand posture is generated from two 2D images from two perspectives that are weighted and linearly combined to produce single 3D features, trying to classify 50 isolated ArSL words using a hybrid pulse-coupled neural network (PCNN) as a feature generation technique, followed by a non-deterministic finite automaton (NFA). Then, a "best-match" algorithm is used to find the most probable meaning of a gesture. The recognition accuracy reaches 96%. The misclassification comes from the fact that the NFA of some gestures may be wholly included in the NFA of another gesture.

Ahmed and Aly (2014) used a combination of local binary patterns (LBP) and principal component analysis (PCA) to extract features that are fed into an HMM to recognize a lexicon of 23 isolated ArSL words. Occlusion is not resolved, as any occlusion state is handled as one object and recognition goes on. The system achieves a recognition rate of 99.97% in signer-dependent mode. However, LBP may not work properly on areas of constant gray level because of the thresholding scheme of the operator (Ahmed and Aly, 2014).

Obviously, most vision systems suffer from two main problems: confusing similar gestures in motion, and resolving the occlusion problem. The aim of the research documented in this paper is to decrease the misclassification rate for similar gestures and to resolve all occlusion states using only one camera, without any complicated environment to compute the disparity map.

This paper presents an automatic visual SLRS that translates isolated Arabic word signs into text. The proposed system has four primary stages: hand segmentation, tracking, feature extraction and classification. Hand segmentation is performed utilizing a dynamic skin detector based on the color of the face (Ibrahim et al., 2012). Then, the segmented skin blobs are used in identifying and tracking the hands with the help of the head. Geometric features of the hands are employed to formulate the feature vector. Finally, a Euclidean distance classifier is applied in the classification stage. A dataset of 30 isolated words used in the daily school life of hearing impaired children was developed. The experimental results indicate that the proposed system has a recognition rate of 97%, taking into consideration that 83% of the words cover mainly all the occlusion states, which proves the robustness of the system.

The upcoming sections are arranged as follows: the dataset description is illustrated in Section 2. The proposed approach, including a novel identifying and tracking method, is described in Section 3. Results and evaluation are outlined in Section 4. Finally, a conclusion is given in Section 5.

2. ArSLRS dataset

A unified Arabic sign language dictionary was published in two editions in 2008. Despite that, there are no common databases available for researchers in the area of Arabic sign language recognition. Thus, each researcher has to establish his own database with reasonable size.

The dataset used is an ArSL video database collected at Benha University. The database consists of 450 colored ArSL videos captured at a rate of 30 fps. These videos represent 30 Arabic words which were selected as commonly used words in daily school life. 300 videos are used for training while 150 are used for testing. The signers performing the testing clips are different from those who performed the training clips, to guarantee the signer-independency of the designed system. The videos are gathered under different illumination, backgrounds, and clothing. The signer is asked to face the camera with no orientation, then starts signing from a silence state, where both hands are placed beside the body, and ends again with a silence state.

It was ensured that the database contains words performed with one hand or both hands, with occlusion of the hands with each other or with the face, to test the validity of the system in solving different occlusion states.

The list of the used words and their description is given in Table 1. The Occlusion column indicates that the performed sign has an occlusion state between the face and either one or both hands. The RH and LH columns show whether the sign is executed with the right hand or left hand, respectively. The R-L H column indicates that the sign is performed with both hands. The last row illustrates the estimated percentage of occlusion states in the built database.

3. The proposed ArSLRS

As shown in Fig. 1, the SLRS has two modes. The first mode is from the hearing-impaired people to the vocal people, where a video of the sign language (SL) is translated into the oral language, either in the form of text or voice. This mode is called vision-based SLRS. On the other hand, the second mode is from the vocal people to the hearing-impaired people, where the oral language voice record is converted into an SL video. The vision-based SLRS mode is the focus of this paper. Each stage is illustrated in detail in the upcoming sub-sections.
Table 1
List of dataset words and their description (columns: Occlusion, R-L H, LH, RH).
Words: Peace be upon you, Thank you, Telephone, I, Eat, Sleep, Drink, Prayer, To go, Bathroom, Ablution, Tomorrow, Today, Food, Water, To love, To hate, Money, Where're you going?, Where, Why, How much, Yes, No, Want, School, Teacher, Help, Sick, Friend.
Percentage: Occlusion 83%, R-L H 33%, LH 27%, RH 40%.

… includes the eyes and the mouth, which are non-skin, non-smooth regions, and is used to calculate the probability of a pixel being a skin or non-skin pixel. This affected the results of that approach by detecting the mouth, the eyes and the brows as skin regions, which is not true. In Bilal et al. (2015), a 10 × 10 window around the center pixel of the face, which in most cases is the nose tip, is used to distinguish the skin tone pixels. However, this region suffers from the effect of illumination and may give wrong indications.

In this paper, a dynamic skin detector based on the face skin tone color is used in segmenting the hands (Ibrahim et al., 2012). The YCbCr color space is used after discarding the luminance channel. A face detector is applied to the first frame. The probability distribution function (PDF) histogram bins are calculated and trimmed at 0.005. To avoid the eye and mouth regions being recognized as skin, a threshold is applied to the remaining PDF values after trimming. The pixels along the major and minor axes of the bounding rectangle of the detected face are used to calculate a dynamic threshold. This threshold is applied to the face image to identify skin pixels. Then, the threshold is updated by including more pixels around the axes until 95% of the face pixels are recognized as skin. Finally, this threshold is applied to the entire image. This method is employed due to its adaptive nature, which makes it applicable to different races. In addition, using the YCbCr color space dramatically reduces the effect of illumination on the segmentation. The outcome of this phase is a binary image that holds the hands and the face as white pixels and all other objects as dark pixels.
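A simplified sketch of this kind of face-driven skin segmentation is given below. It is illustrative only and is not the authors' implementation: it omits the PDF-histogram trimming at 0.005 and instead grows a simple Cb/Cr range taken from the face's major and minor axes until roughly 95% of the face pixels are covered; the function name `skin_mask` and the growth step are assumptions.

```python
# Sketch of a face-driven dynamic skin detector in YCbCr (luminance discarded).
# Simplified reading of the description above; details marked as assumptions.
import cv2
import numpy as np

def skin_mask(frame_bgr, face_cascade):
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    cr, cb = ycrcb[:, :, 1], ycrcb[:, :, 2]          # discard the Y (luminance) channel

    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return np.zeros(gray.shape, dtype=np.uint8)
    x, y, w, h = faces[0]                             # bounding rectangle of the face

    # Seed the skin-chroma range from the pixels along the face rectangle's
    # major and minor axes, then grow it until ~95% of the face pixels fall
    # inside (the growth step and the coverage test are assumptions).
    face_cr, face_cb = cr[y:y + h, x:x + w], cb[y:y + h, x:x + w]
    axis_cr = np.concatenate([face_cr[h // 2, :], face_cr[:, w // 2]])
    axis_cb = np.concatenate([face_cb[h // 2, :], face_cb[:, w // 2]])
    lo = np.array([axis_cr.min(), axis_cb.min()], dtype=np.int32)
    hi = np.array([axis_cr.max(), axis_cb.max()], dtype=np.int32)
    while True:
        inside = ((face_cr >= lo[0]) & (face_cr <= hi[0]) &
                  (face_cb >= lo[1]) & (face_cb <= hi[1]))
        if inside.mean() >= 0.95:
            break
        lo -= 1
        hi += 1

    # Apply the learned chroma range to the whole frame: white = skin, black = rest.
    mask = ((cr >= lo[0]) & (cr <= hi[0]) & (cb >= lo[1]) & (cb <= hi[1]))
    return (mask * 255).astype(np.uint8)

# Usage: cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
#        mask = skin_mask(first_frame, cascade)
```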
3.2. Tracking

… the hands. In this technique, it is estimated that the hand shape changes are small. A translation of the former position of the head and the hands is done to occupy the bounding rectangle of the occluded objects. The TER at s = 20 is 0.08%. This technique is a signer-independent and model-free technique. It uses a forward tracking path, in addition to the preceding information about the tracked object, to decide its next location.

3.2.1. Head tracking

The head can be easily localized by using a cascade boosting algorithm (Viola and Jones, 2004), but it is computationally very expensive to apply this algorithm on all frames to detect the head, especially if the application is a real-time one. Consequently, the cascade boosting algorithm is applied to the first frame only to obtain the bounding rectangle of the head. Since the position of the head during the signing process mainly does not change and keeps nearly the same location, the Euclidean distance to the preceding frames is applied to recognize the head skin blob. When more than one skin blob appears, the head is distinguished as the skin blob with the smallest Euclidean distance from the former position of the head. Let H_p = (x_p, y_p) be the center of the previous head bounding rectangle, while B_i = (x_i, y_i) is the center of a current skin blob, where i = {1, 2, 3}. The Euclidean distance n_{HB_i} between H_p and B_i is given by:

n_{HB_i} = \sqrt{(x_p - x_i)^2 + (y_p - y_i)^2}    (1)

The skin blob with the minimum n_{HB} is the current head (H_c). As shown in Fig. 2a, the head is marked with a solid rectangle. In Fig. 2b, the previous head bounding rectangle is marked with a dotted rectangle, while the new skin blobs are marked with solid rectangles. Euclidean distances between the center of the previous location of the head and the centers of the current skin blobs are calculated. The blob with the minimum Euclidean distance is recognized and marked as the new head with a solid rectangle, as shown in Fig. 2c.

3.2.2. Hands tracking

The center of the bounding rectangle of the head can be used as a reference point to define the hands. Let B be a skin blob. To identify the skin blob as right hand (RH) or left hand (LH), the difference (Δx) between the x-coordinates of the centers of the current head (H_c) and the skin blob (B) must be calculated as follows:

Δx = x_{H_c} - x_B    (2)

Then, the skin blob is identified according to the following conditions:

B = RH if Δx > 0; B = LH otherwise    (3)

Identifying the first appearance of the right and left hand skin blobs is shown in Fig. 3 (Fig. 3. Identifying the first appearance of the right and the left hands).

After localizing the first appearance of the head and the hands, the Euclidean distance is used to keep track of them, as shown in Fig. 4. This will work well until an occlusion takes place.

Occlusion resolving. Occlusion is the overlapping of one or more of the tracked objects, where one object may cover some or the whole …

(Fig. 5. Corners of the bounding rectangle of an object.)

… Then,

if Δy ≥ 0: move RU_{PRH} to RU_B and move LL_{HP} to LL_B; if Δy < 0: move RL_{PRH} to RL_B and move LU_{HP} to LU_B    (5)

On the other hand, for the occlusion of the head and the left hand, the difference (Δy) between the y-coordinates of the RU corners of the current skin blob (B) bounding rectangle and the previous left hand (PLH) bounding rectangle is calculated as in Eq. (6). Then,

if Δy ≥ 0: move LU_{PLH} to LU_B and move RL_{HP} to RL_B; if Δy < 0: move LL_{PLH} to LL_B and move RU_{HP} to RU_B    (7)

The case of occlusion between the head and the left hand, and how to resolve it, is shown in detail in Fig. 6 (Fig. 6. Hands tracking). In Fig. 6a, the head and the left hand are marked with solid rectangles. Then, the Euclidean distance between the current skin blobs and the previous head and left hand is calculated, as shown in Fig. 6b. These calculations indicate that the head and the left hand share the same skin blob, as illustrated in Fig. 6c. Δy is calculated, and the arrow shown in Fig. 6d indicates the translation of the head and the left hand from the previous locations to the new locations. Finally, the occlusion is resolved, and the location of the new head and hand is shown in Fig. 6e.

Finally, occlusion between the head and both hands is solved by calculating the Euclidean distance between both hands and the head. If this distance is less than a predefined threshold, then the occlusion is removed as mentioned previously for both hands, while keeping the head position at its previous location. On the other hand, if the distance is greater than the predefined threshold, then this hand is removed and the occlusion is resolved using the previous methods for the remaining hand and the head.

Occlusion between hands. The partial occlusion of a hand is indicated by its area increasing by one half. If the occlusion takes place, then it is solved as if it were an occlusion between the hands and the head.

Fig. 7a is the frame that contains the head and both hands before occlusion. In Fig. 7b, the hands have moved and their region has increased by more than a half, which indicates an occlusion situation. The head area does not increase by more than a third; therefore, the occlusion happened between the hands only. As shown in Fig. 7c, the head is identified and the other skin blob is recognized as the hands. Δy is calculated for the hands as shown in Fig. 7d using Eq. (4) and Eq. (6). The arrow in Fig. 7d shows the movement of the previous bounding rectangles of both hands. The RU of the right hand is moved to the RU of the current skin blob, while the LL of the left hand is moved to the LL of the current skin blob. The result of the tracking is shown in Fig. 7e.

For full occlusions between both hands, it is indicated when the previous location of both hands points to the same skin blob as the …
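The blob-assignment rule of Eqs. (1)–(3) and the corner-translation idea of Eqs. (5) and (7) can be sketched in a few lines. The snippet below is an illustrative reading of those rules, not the authors' implementation: blob extraction via connected components, the helper names, and the exact way the previous rectangles are slid onto the merged blob are assumptions.

```python
# Illustrative sketch of the skin-blob tracking rules (Eqs. (1)-(3)) and of the
# corner-translation idea used for occlusion resolving (Eqs. (5)/(7)).
# Helper names and the occlusion handling details are assumptions.
import numpy as np
import cv2

def blob_boxes(mask, min_area=200):
    """Bounding boxes (x, y, w, h) of skin blobs in a binary mask."""
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    return [tuple(stats[i, :4]) for i in range(1, n)
            if stats[i, cv2.CC_STAT_AREA] >= min_area]

def center(box):
    x, y, w, h = box
    return np.array([x + w / 2.0, y + h / 2.0])

def assign_head(prev_head, boxes):
    """Eq. (1): the blob whose center is nearest to the previous head is the head."""
    dists = [np.linalg.norm(center(prev_head) - center(b)) for b in boxes]
    return boxes[int(np.argmin(dists))]

def label_hand(head_box, blob_box):
    """Eqs. (2)-(3): right hand if the blob center lies to the left of the head
    center (positive x-difference), otherwise left hand."""
    dx = center(head_box)[0] - center(blob_box)[0]
    return "RH" if dx > 0 else "LH"

def resolve_right_hand_head_occlusion(merged_box, prev_rh, prev_head, dy):
    """Corner translation in the spirit of Eq. (5): slide the previous right-hand
    and head rectangles so they occupy corners of the merged blob's rectangle."""
    mx, my, mw, mh = merged_box
    _, _, rw, rh = prev_rh
    _, _, hw, hh = prev_head
    if dy >= 0:
        new_rh = (mx + mw - rw, my, rw, rh)            # RU of PRH -> RU of B
        new_head = (mx, my + mh - hh, hw, hh)          # LL of head -> LL of B
    else:
        new_rh = (mx + mw - rw, my + mh - rh, rw, rh)  # RL of PRH -> RL of B
        new_head = (mx, my, hw, hh)                    # LU of head -> LU of B
    return new_rh, new_head
```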
… the hands, the velocity of the hand movement, and the orientation of the main axis of the hand. The feature vector of any sign is represented as follows: …
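The exact feature vector is defined in a part of the paper not reproduced in this excerpt. The sketch below only illustrates how geometric quantities of the kind just listed (hand position relative to the head, per-frame velocity, and the orientation of the hand's main axis) could be computed from a tracked hand blob; the function name and the use of image moments are assumptions.

```python
# Illustrative geometric features for one tracked hand blob; this is not the
# paper's feature vector. Names and the use of image moments are assumptions.
import numpy as np
import cv2

def hand_features(hand_mask, prev_center, head_center, fps=30.0):
    """hand_mask: binary image containing the hand blob only."""
    m = cv2.moments(hand_mask, binaryImage=True)
    if m["m00"] == 0:
        return None
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]      # hand centroid

    # Position relative to the head keeps the features roughly signer-independent.
    rel = np.array([cx, cy]) - np.asarray(head_center, dtype=float)

    # Velocity from the centroid displacement between consecutive frames.
    vel = (np.array([cx, cy]) - np.asarray(prev_center, dtype=float)) * fps

    # Orientation of the main axis from the second-order central moments.
    theta = 0.5 * np.arctan2(2.0 * m["mu11"], m["mu20"] - m["mu02"])

    return np.array([rel[0], rel[1], vel[0], vel[1], theta])
```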
3.4. Recognition
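The recognition subsection itself is not reproduced in this excerpt; the paper states that a Euclidean distance classifier assigns each gesture video to the nearest reference sign. A minimal sketch of such nearest-neighbour matching is given below, assuming each video has already been reduced to a fixed-length feature vector (how the per-frame features are combined is not shown here and is an assumption).

```python
# Minimal Euclidean (nearest-neighbour) classification sketch, assuming each
# sign video has been reduced to one fixed-length feature vector.
import numpy as np

def classify(feature_vector, references):
    """references: dict mapping a word label to its reference feature vector."""
    best_word, best_dist = None, float("inf")
    for word, ref in references.items():
        dist = np.linalg.norm(np.asarray(feature_vector) - np.asarray(ref))
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word, best_dist

# Example with toy vectors (real vectors come from the feature-extraction stage):
refs = {"eat": np.array([0.1, 0.5, 0.0]), "drink": np.array([0.4, 0.2, 0.3])}
print(classify([0.38, 0.22, 0.28], refs))   # -> ('drink', ...)
```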
Table 2
TER (%) for the proposed tracking technique and other state-of-the-art tracking methods from SIGNSPEAK (2012).

Tracking method     s = 5    s = 10   s = 15   s = 20
DPT + PCA           26.77    17.32    12.7     10.86
DPT + VJ            10.06     0.4      0.02     0
VJD                  9.75     1.23     1.09     1.07
VJT                 10.04     0.81     0.73     0.68
FJAAM               10.17     6.85     6.82     6.81
FJAAM               10.92     7.92     7.88     7.76
POICAAM              3.54     0.12     0.08     0.08
Proposed             0.97     0.70     0.22     0.08
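Table 2 reports the tracking error rate (TER) at several tolerances s. The snippet below shows one common way such a tolerance-based error rate is computed: the share of annotated frames whose tracked position deviates from the ground truth by more than s pixels. This reading of the metric is an assumption based on the cited tracking benchmark, not a definition taken from this paper.

```python
# Hedged sketch of a tolerance-based tracking error rate (TER): the share of
# ground-truth frames where the tracked point is farther than s pixels from
# the annotation. This interpretation of the metric is an assumption.
import numpy as np

def tracking_error_rate(tracked, ground_truth, s):
    """tracked, ground_truth: arrays of shape (n_frames, 2) with (x, y) positions."""
    tracked = np.asarray(tracked, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    errors = np.linalg.norm(tracked - ground_truth, axis=1)
    return 100.0 * np.mean(errors > s)          # TER in percent

# Example: positions off by 3 pixels count as errors at s = 2 but not at s = 5.
gt = np.zeros((4, 2))
tr = np.array([[0, 0], [3, 0], [0, 3], [1, 1]])
print(tracking_error_rate(tr, gt, s=2), tracking_error_rate(tr, gt, s=5))  # 50.0 0.0
```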
… appearance model (AAM), Project-Out Inverse Compositional AAM (POICAAM) and Fixed Jacobian active appearance model (FJAAM). Most of these methods are model-based, signer-dependent tracking approaches and have been evaluated on all 15,732 ground-truth annotated frames of the RWTH-BOSTON-104 dataset. The results of these different algorithms are compared to the results of the proposed technique for evaluation. The proposed technique has the lowest TER at s = 5, as shown in Table 2. By increasing s, the TER decreases, but the proposed technique shows only small changes, unlike the other algorithms. This demonstrates the robustness of the proposed occlusion resolving technique, as it can accurately specify the positions of the hands and the head, with an improvement of 2.57% compared to the result of POICAAM at s = 5. The proposed technique is not a model-based technique, in contrast to the other methods, which guarantees less computation and the signer independency of the method. Integrating the proposed technique with other methods may improve the accuracy, especially for the tolerance range 10 ≤ s ≤ 20.

The third scenario is evaluating the whole system by considering the percentage of the total number of Arabic sign words that were correctly recognized. Euclidean distance is used to classify the video of each gesture. The signer of the tested gesture is asked to face the camera with no orientation and to freely perform the sign (remember that the signer must begin and end with a silence state). The environment is controlled, as one signer at a time performs with a stationary environment around him. The system attains a recognition rate of 97% in signer-independent mode. From the confusion matrix illustrated in Fig. 8, it was indicated that gestures (2, 8, 12, 19 and 26) were confused with gestures (24, 4, 28, 20 and 3), respectively. These gestures have great similarity in either partial or full hand movement. Despite that, only one gesture is recognized wrongly. This indicates the ability of the proposed ArSLRS to differentiate between gestures with high similarity.

Finally, it is clear that the proposed system has no need for more than one camera or for complicated calculations to adjust two cameras in order to achieve high recognition rates, as in El-Jaber et al. (2010). The proposed system did not have to group similar gestures to increase its recognition rate, as in AL-Rousan et al. (2007), where gestures that have common parts dramatically decrease its rate. The developed system does not construct a 3D model of the hand posture that needs two cameras and a sensitive calculation to weight the two views from both cameras. The system proves its robust occlusion resolving technique, which outperforms the other methods.

5. Conclusions

This paper presents an automatic visual SLRS that translates isolated Arabic word signs into text. The proposed system is a signer-independent system that utilizes a single camera, and the signer does not employ any type of gloves or markers. The system has four primary stages: hand segmentation, hand tracking, hand feature extraction and classification. Hand segmentation is performed utilizing a dynamic skin detector based on the color of the face. Then, the segmented skin blobs are used to identify and track the hands with the aid of the head. The system proved its robust performance against all states of occlusion, as 83% of the words in the dataset have different occlusion states. Geometric features are employed to construct the feature vector. Finally, a Euclidean distance classifier is applied in the classification stage. A dataset of 30 isolated words that are utilized in the daily school life of hearing-impaired children was developed. The experimental results indicate that the proposed system has a recognition rate of 97% with a low misclassification rate for similar gestures. In addition, the system proved its robustness against different cases of occlusion with a minimum TER of 0.08%.
References

Agrawal, S.C., Jalal, A.S., Tripathi, R.K., 2016. A survey on manual and non-manual sign language recognition for isolated and continuous sign. Int. J. Appl. Pattern Recognit. 3 (2), 99–134.
Ahmed, A.A., Aly, S., 2014. Appearance-based Arabic sign language recognition using hidden Markov models. In: Engineering and Technology (ICET), 2014 International Conference on, IEEE, pp. 1–6.
AL-Rousan, M., Al-Jarrah, O., Nayef, N., 2007. Neural networks based recognition system for isolated Arabic sign language. In: Information Technology, 2007. ICIT 2007. The 3rd International Conference on, AL-Zaytoonah University.
Al-Rousan, M., Assaleh, K., Tala'a, A., 2009. Video-based signer-independent Arabic sign language recognition using hidden Markov models. Appl. Soft Comput. 9 (3), 990–999.
Al-Rousan, M., Al-Jarrah, O., Al-Hammouri, M., 2010. Recognition of dynamic gestures in Arabic sign language using two stages hierarchical scheme. Int. J. Knowl.-Based Intell. Eng. Syst. 14 (3), 139–152.
Asaari, M.S.M., Suandi, S.A., 2010. Hand gesture tracking system using adaptive Kalman filter. In: Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on, IEEE, pp. 166–171.
Assaleh, K., Shanableh, T., Fanaswala, M., Amin, F., Bajaj, H., et al., 2010. Continuous Arabic sign language recognition in user dependent mode. JILSA 2 (1), 19–27.
Baskaran, J., Subban, R., 2014. Compressive object tracking – a review and analysis. In: Computational Intelligence and Computing Research (ICCIC), 2014 IEEE International Conference on, IEEE, pp. 1–7.
Bilal, S., Akmeliawati, R., Salami, M.J.E., Shafie, A.A., 2015. Dynamic approach for real-time skin detection. J. Real-Time Image Proc. 10 (2), 371–385.
Center, M., 2017. Deafness and hearing loss. Tech. rep., World Health Organization.
Cooper, H., Holt, B., Bowden, R., 2011. Sign language recognition. In: Visual Analysis of Humans. Springer, pp. 539–562.
Dreuw, P., Deselaers, T., Rybach, D., Keysers, D., Ney, H., 2006. Tracking using dynamic programming for appearance-based sign language recognition. In: Automatic Face and Gesture Recognition, 2006. FGR 2006. 7th International Conference on, IEEE, pp. 293–298.
Dreuw, P., Forster, J., Ney, H., 2010. Tracking benchmark databases for video-based sign language recognition. In: ECCV Workshops (1), pp. 286–297.
El-Bendary, N., Zawbaa, H.M., Daoud, M.S., Hassanien, A.E., Nakamatsu, K., 2010. ArSLAT: Arabic sign language alphabets translator. In: Computer Information Systems and Industrial Management Applications (CISIM), 2010 International Conference on, IEEE, pp. 590–595.
El-Jaber, M., Assaleh, K., Shanableh, T., 2010. Enhanced user-dependent recognition of Arabic sign language via disparity images. In: Mechatronics and its Applications (ISMA), 2010 7th International Symposium on, IEEE, pp. 1–4.
Elons, A.S., Abull-Ela, M., Tolba, M.F., 2013. A proposed PCNN features quality optimization technique for pose-invariant 3D Arabic sign language recognition. Appl. Soft Comput. 13 (4), 1646–1660.
Gianni, F., Collet, C., Dalle, P., 2007. Robust tracking for processing of videos of communication's gestures. In: International Gesture Workshop, Springer, pp. 93–101.
Holden, E.-J., Lee, G., Owens, R., 2005. Australian sign language recognition. Mach. Vis. Appl. 16 (5), 312.
Hoshino, K., 2006. Dexterous robot hand control with data glove by human imitation. IEICE Trans. Inf. Syst. 89 (6), 1820–1825.
Ibrahim, N.B., Selim, M.M., Zayed, H.H., 2012. A dynamic skin detector based on face skin tone color. In: Informatics and Systems (INFOS), 2012 8th International Conference on, IEEE, pp. MM-1.
Kawulok, M., 2008. Dynamic skin detection in color images for sign language recognition. Image Signal Process., 112–119.
Lee, C.-Y., Lin, S.-J., Lee, C.-W., Yang, C.-S., 2012. An efficient continuous tracking system in real-time surveillance application. J. Network Comput. Appl. 35 (3), 1067–1073.
Li, Y.-B., Shen, X.-L., Bei, S.-S., 2011. Real-time tracking method for moving target based on an improved CamShift algorithm. In: Mechatronic Science, Electric Engineering and Computer (MEC), 2011 International Conference on, IEEE, pp. 978–981.
Mohandes, M., A-Buraiky, S., Halawani, T., Al-Baiyat, S., 2004. Automation of the Arabic sign language recognition. In: Information and Communication Technologies: From Theory to Applications. Proceedings. 2004 International Conference on, IEEE, pp. 479–480.
Mohandes, M.A., 2013. Recognition of two-handed Arabic signs using the CyberGlove. Arabian J. Sci. Eng., 1–9.
Mohandes, M., Deriche, M., Liu, J., 2014. Image-based and sensor-based approaches to Arabic sign language recognition. IEEE Trans. Hum.-Mach. Syst. 44 (4), 551–557.
Rautaray, S.S., Agrawal, A., 2015. Vision based hand gesture recognition for human computer interaction: a survey. Artif. Intell. Rev. 43 (1), 1–54.
Sadek, M.I., Mikhael, M.N., Mansour, H.A., 2017. A new approach for designing a smart glove for Arabic sign language recognition system based on the statistical analysis of the sign language. In: Radio Science Conference (NRSC), 2017 34th National, IEEE, pp. 380–388.
Shukor, A.Z., Miskon, M.F., Jamaluddin, M.H., bin Ali, F., Asyraf, M.F., bin Bahar, M.B., et al., 2015. A new data glove approach for Malaysian sign language detection. Proc. Comput. Sci. 76, 60–67.
SIGNSPEAK, 2012. Scientific understanding and vision-based technological development for continuous sign language recognition and translation. Tech. rep., European Commission.
Viola, P., Jones, M.J., 2004. Robust real-time face detection. Int. J. Comput. Vision 57 (2), 137–154.
Wang, R.Y., Popović, J., 2009. Real-time hand-tracking with a color glove. In: ACM Transactions on Graphics (TOG), vol. 28. ACM, p. 63.
Yang, H., Shao, L., Zheng, F., Wang, L., Song, Z., 2011. Recent advances and trends in visual tracking: a review. Neurocomputing 74 (18), 3823–3831.
Yilmaz, A., Javed, O., Shah, M., 2006. Object tracking: a survey. ACM Computing Surveys (CSUR) 38 (4), 13.
Zaki, M.M., Shaheen, S.I., 2011. Sign language recognition using a combination of new vision based features. Pattern Recogn. Lett. 32 (4), 572–577.