A Deep Learning Approach for Recognizing the Noon Rule for Reciting Holy Quran

Abstract – Ahkam Al-Tajweed represents a most precious religious heritage that is in critical need of being preserved for the next generation. This study tackles the challenge of learning Ahkam Al-Tajweed by developing a model that considers one of the rules encountered by early learners of the Holy Quran. The proposed model focuses specifically on the "Hukm Al-Noon Al-Mushaddah," which pertains to the proper pronunciation of the letter "Noon" when it is accompanied by a Shaddah symbol in Arabic. By incorporating this rule, the proposed model helps learners improve their Tajweed skills and facilitates the learning process for those who do not have access to private tutors or experts. The proposed approach involved three models, namely Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Random Forest models, in the context of a classification task. The models were evaluated based on their validation accuracy, and the results indicate that the CNN model achieved the highest validation accuracy of 0.8613. The other contribution of this work is collecting a novel dataset for this kind of study. Overall, the findings show that the CNN model outperformed the RNN and Random Forest models in terms of validation accuracy.

Keywords: Artificial Intelligence, Deep Learning, Quran Recitation, Ahkam Al-Tajweed, Hukm Al-Noon Al-Mushaddah.

I. INTRODUCTION
This section provides a comprehensive description of and background on Quran recitation and Ahkam Al-Tajweed. It also reviews the works related to this study and states the problem considered alongside the contribution of this work.

A. Overview
The Holy Quran is the main sacred book of Islam, divided into 30 parts and composed of 6236 Verses grouped into 114 chapters called "Surahs." Correct pronunciation during recitation is called "Tajweed," and its rules must be observed to ensure the correct meaning is delivered. However, there are many issues in teaching Ahkam Al-Tajweed, including the requirement of an expert for private tutoring called "Talqeen," which is not always available [1]. To tackle this issue, researchers have turned to Machine Learning (ML) techniques, aiming to develop computerized systems that check the proper application of Ahkam Al-Tajweed based on audio recordings. However, existing systems are limited in the rules they consider or the Quran Verses they cover [2]. Moreover, to detect errors in this rule and other rules, we utilized state-of-the-art deep learning techniques, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), to analyze audio recordings of Quran recitations.

Furthermore, one of the remarkable aspects of the Quran is its linguistic structure and the way its Verses are composed. The study of the phonetics of the Quran, also known as the science of Tajweed, involves the analysis of the sounds, rhythm, and melody of the Quranic text [3]. The Quranic text is written in Arabic, a language that is highly complex and rich in its phonetic and linguistic features. The proper recitation of the Quranic text is considered to be of great importance in Islam, and a skilled reciter of the Quran is highly respected and admired within the Muslim community. The phonetics of the Quran are, therefore, [...] laypeople alike [3]. The phonetics of the Quran are also of interest to linguists and scholars of Arabic language and literature. The rhythmic and melodic patterns of the Quranic text have been the subject of much study and analysis, and the beauty and complexity of these patterns continue to fascinate scholars today. The phonetics of the Quran also offer insights into the development of the Arabic language and its evolution over time [4].

In addition, several audio feature extraction techniques, such as MFCCs, Spectral Contrast, Chroma STFT, and Spectrograms, are available for speech and music analysis. These features have been shown to be effective in capturing the spectral content of
audio signals and can provide valuable information for tasks such as speech recognition, music genre classification, and emotion recognition. The features and their effectiveness in improving the performance of the system are described as follows [5]:
- MFCC (Mel-Frequency Cepstral Coefficients): MFCCs are widely used in speech and music analysis. They are computed by first applying a mel filter bank to the power spectrum of an audio signal to obtain the mel-scaled power spectrum. This is followed by taking the logarithm of the mel-scaled power spectrum and then performing the Discrete Cosine Transform (DCT) on the resulting log energies. The first few coefficients are typically the most informative, capturing the overall spectral shape of the signal, while higher coefficients capture more detailed spectral information [6].
- Spectral Contrast: a feature that captures the difference in energy between spectral peaks and valleys within the frequency sub-bands of a spectrogram. It is calculated by dividing the spectrum into sub-bands and computing, in each sub-band, the difference between the highest and lowest energy levels. Spectral Contrast can be useful for speech and music classification, as it can capture the distinctive spectral features of different types of sounds [7].
- Chroma STFT: Chroma features are a way of summarizing the pitch content of an audio signal. They are computed by first calculating the short-time Fourier transform (STFT) of the signal, and then projecting the magnitude spectrum onto a set of pitch classes. Each pitch class corresponds to a particular musical note, and the value of the chroma feature for each pitch class is the sum of the magnitudes of the corresponding spectral components. Chroma features are often used in music information retrieval tasks, such as genre classification or chord recognition [8].
- Spectrogram: A spectrogram is a visual representation of the frequency content of an audio signal over time. It is computed by dividing the signal into overlapping frames, computing the magnitude of the Fourier transform for each frame, and then plotting the resulting magnitudes as a function of frequency and time. Spectrograms can be useful for visualizing the spectral content of an audio signal, and can also be used as input to machine learning models for tasks such as speech or music classification [9].

B. Literature Review
The literature includes several studies that aimed to develop an intelligent model for recognizing the rules of Holy Quran recitation and tracking reading errors using automatic speech recognition techniques. One of the early studies was that of Muhammad et al. [10], who developed an intelligent system called "Hafize" for tracking Tajweed rules. The system recognized the recitation of 10 different reciters, including men, women, and children, and could identify mistakes at both Verse and word levels. However, its limitation is that it was based on phonetic rules that are not provided. Another study called "Makhraj" was introduced in [11] to make the recitation of the Holy Quran less dependent on expert reciters. The authors used MFCC for feature extraction and tested the system in two modes: one-to-one and one-to-many. The system achieved 98% accuracy in the one-to-one mode, although this result is not considered very reliable because a simple matching technique was used. Moreover, the authors in [12] introduced an intelligent tutoring system for teaching Tajweed. It was evaluated by recitation teachers and students, and the results were promising. However, the system was limited to teaching Tafkhim and Tarqiq in Tajweed for the Holy Quran recitation with the Rewaya of Hafs from 'Aasem. Another intelligent recognition model was proposed in [13] to recognize Qira'ah from the corresponding Holy Quran acoustic wave. The model was built upon three phases: 1) feature extraction, 2) training the SVM learning model, and 3) recognizing Qira'ah based on the trained model. The SVM-based recognition model achieved a success rate of 96%.

More studies are available in the literature in this area. For instance, the authors in [14] used traditional audio processing techniques for feature extraction and classification on an in-house dataset of thousands of audio recordings covering all occurrences of the rules under consideration in the entire Holy Quran. The work showed how to enhance the classification accuracy to surpass 97.7% by incorporating deep learning techniques. The researchers in [15] proposed a machine learning approach for recognizing the reciter of the Holy Quran. The system achieved an accuracy of 97.62% for chapter 18 and 96.7% for chapter 36 using the ANN classifier. In [16], the authors addressed the problem of identifying the correct usage of Ahkam Al-Tajweed in the entire Quran, focusing on eight Ahkam Al-Tajweed faced by early learners of recitation. The results showed that the highest accuracy achieved was 94.4%, which was obtained when bagging was applied to SVM with all features except for the LPC features. Finally, the work in [17] proposed a deep learning model using MFCCs to distinguish between trustworthy and fraudulent reciters of the Qur'an. It compared the deep learning approach to machine learning methods and determined the optimal segment length and number of features. The proposed model achieved high accuracy and outperformed other models, while a future direction includes creating a dataset encompassing the entire Qur'an for further research on recitation rules using deep learning techniques. The data used in the study is available from the corresponding author upon request.
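As a rough illustration of the spectrogram and MFCC computations described in the Overview, the following NumPy sketch follows the stated steps: overlapping frames, Fourier transform per frame, mel filter bank, logarithm, and DCT. It is not the feature extraction code used in this study. The function names, the frame length of 400 samples, the hop of 160 samples, the 512-point FFT, the 26 mel filters, the 13 retained coefficients, and the synthetic 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def power_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Power spectrum of each Hann-windowed frame, shape (n_frames, n_fft // 2 + 1)."""
    frames = frame_signal(signal, frame_len, hop) * np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fbank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fbank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sr, n_mels=26, n_mfcc=13, frame_len=400, hop=160, n_fft=512):
    """MFCCs: power spectrum -> mel filter bank -> logarithm -> DCT."""
    power = power_spectrogram(signal, frame_len, hop, n_fft)
    mel_power = power @ mel_filterbank(n_mels, n_fft, sr).T
    log_mel = np.log(mel_power + 1e-10)  # small offset avoids log(0)
    return dct_ii(log_mel, n_mfcc)

# Synthetic one-second 440 Hz tone standing in for a recitation recording.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

spec = power_spectrogram(tone)   # rows are time frames, columns are frequency bins
features = mfcc(tone, sr)        # one row of 13 coefficients per frame
print(spec.shape, features.shape)
```

In practice, an audio library would normally supply these features; the explicit version is shown only to make each step of the description concrete. Either array could then serve as input to a classifier such as the CNN, RNN, or Random Forest models compared in this work.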
Figure 3: (A) CNN model accuracy, and (B) evaluation

According to the results in Figures 2, 3, and 4, the CNN is better suited to the purpose of this work. The RNN, which is designed for sequential data processing, underperformed the CNN, as did Random Forest, which is a general-purpose method for classification and regression tasks. This is because the CNN uses convolutional layers to efficiently extract features from data, while the RNN uses recurrent layers to maintain a memory of the previous inputs and outputs, and Random Forest uses decision trees to make predictions. However, Random Forest is a simpler algorithm that can be used for smaller datasets or when computational resources are limited [21-22].

IV. CONCLUSIONS
This work addressed the issue of learning Ahkam Al-Tajweed by developing a model that considers one of the rules encountered by early learners of the Holy Quran. The proposed model focuses specifically on the "Hukm Al-Noon Al-Mushaddah," which pertains to the proper pronunciation of the letter "Noon" when it is accompanied by a Shaddah symbol in Arabic. By incorporating this rule, the proposed model helps learners improve their Tajweed skills and facilitates the learning process for those who do not have access to private tutors or experts. The proposed approach involved three AI models, namely CNN, RNN, and Random Forest. The other contribution of this work is collecting a novel dataset for this kind of study. The collected data was used by the three models. The findings show that the CNN model outperformed the other models in terms of validation accuracy. In addition, the test accuracy varies from verse to verse across the models, with some test verses showing more promising results than others. Finally, this study is ongoing and will continue until it covers all the available aspects of Ahkam Al-Tajweed.

Furthermore, this work is considered an approach for cultural and religious heritage preservation, which contributes to building sustainable communities, one of the goals declared by the United Nations. Future works can build upon these findings to develop more effective

REFERENCES
[1] Samara G, Al-Daoud E, Swerki N, Alzu’bi D. The Recognition of Holy Qur’an Reciters Using the MFCCs’ Technique and Deep Learning. Advances in Multimedia. 2023 Mar 21;2023. https://doi.org/10.1155/2023/2642558
[2] Al-Ayyoub M, Damer NA, Hmeidi I. Using deep learning for automatically determining correct application of basic quranic recitation rules. Int. Arab J. Inf. Technol. 2018 Apr;15(3A):620-5.
[3] Nasallah MK. The Importance Of Tajweed In The Recitation Of The Glorious Qur’an: Emphasizing Its Uniqueness As A Channel Of Communication Between Creator And Creations. IOSR Journal Of Humanities And Social Science (IOSR-JHSS). 2016;21(2):55-61.
[4] Ahmad M. Literary Miracle of the Quran. Ar-Raniry: International Journal of Islamic Studies. 2020 Jul 28;3(1):205-20.
[5] Nogueira AF, Oliveira HS, Machado JJ, Tavares JM. Sound Classification and Processing of Urban Environments: A Systematic Literature Review. Sensors. 2022 Nov 8;22(22):8608. doi: 10.3390/s22228608
[6] Ayvaz U, et al. Automatic speaker recognition using mel-frequency cepstral coefficients through machine learning. CMC-Computers Materials & Continua. 2022;71(3).
[7] Su Y, et al. Performance analysis of multiple aggregated acoustic features for environment sound classification. Applied Acoustics. 2020;158:107050.
[8] Kumar N, et al. CNN based approach for Speech Emotion Recognition Using MFCC, Croma and STFT Hand-crafted features. 2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N). IEEE; 2021.
[9] Gong Y, et al. Ssast: Self-supervised audio spectrogram transformer. Proceedings of the AAAI Conference on Artificial Intelligence. 2022;36(10).