Fully Automated Deep Learning Approach To Dental Development Assessment in Panoramic Radiographs
Abstract
Background Dental development assessment is an important factor in dental age estimation and dental maturity
evaluation. This study aimed to develop and evaluate the performance of an automated dental development staging
system based on Demirjian’s method using deep learning.
Methods The study included 5133 anonymous panoramic radiographs obtained from the Department of Pediatric
Dentistry database at Seoul National University Dental Hospital between 2020 and 2021. The proposed methodology
involves a three-step procedure for dental staging: detection, segmentation, and classification. The panoramic data
were randomly divided into training and validation sets (8:2), and YOLOv5, U-Net, and EfficientNet were trained and
employed for each stage. The models’ performance, along with the Grad-CAM analysis of EfficientNet, was evaluated.
Results The mean average precision (mAP) was 0.995 for detection, and the segmentation achieved an accuracy
of 0.978. The classification performance showed F1 scores of 69.23, 80.67, 84.97, and 90.81 for the Incisor, Canine,
Premolar, and Molar models, respectively. In the Grad-CAM analysis, the classification model focused on the apical
portion of the developing tooth, a crucial feature for staging according to Demirjian’s method.
Conclusions These results indicate that the proposed deep learning approach for automated dental staging can
serve as a supportive tool for dentists, facilitating rapid and objective dental age estimation and dental maturity
evaluation.
Keywords Dental development, Artificial intelligence, Deep learning, Demirjian method, Panoramic radiographs
*Correspondence: Young-Jae Kim, [email protected]
1 Department of Pediatric Dentistry, School of Dentistry, Seoul National University, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea
of a fully automated deep learning approach for dental development assessment based on Demirjian's staging system in panoramic radiographs.

Materials and methods

Dataset collection
The panoramic radiograph datasets used in this study were obtained retrospectively from the 2020–2021 database of the Department of Pediatric Dentistry at Seoul National University Dental Hospital. The subjects' ages ranged from 4 to 16 years, and they were of Korean ethnicity. For the utilization of dental developmental staging with Demirjian's method, panoramic images with low resolution, or from subjects with a pathologic condition affecting the maturity of teeth, missing permanent teeth in the left mandible, a history of orthodontic treatment, apical lesions, or eruption disturbances of teeth, were excluded from the study.
This study was conducted in accordance with the principles of the Declaration of Helsinki and was approved by the Institutional Review Board of the Seoul National University Dental Hospital, Seoul, Korea (Ethics Code: ERI23026). Informed consent was waived by the Ethics Committee of Seoul National University Dental Hospital for this retrospective study, as the data and patient details were anonymized.

Proposed methodology
In this study, a novel approach for an automated dental development stage classification system based on panoramic images was proposed. The proposed methodology includes three key procedures using CNN models: detection, segmentation, and classification. First, the YOLOv5 detection model automatically detected and individually cropped the seven permanent teeth of the left mandible in sequence, starting from the front. Second, the cropped images were processed with the U-Net model to segment each tooth from its surrounding background. Finally, the seven segmented teeth were assigned in sequence to the EfficientNet classification models (Incisor, Canine, Premolar, and Molar) and classified into dental development stages based on Demirjian's method.
Fig. 1 Workflow of the proposed fully automated dental development assessment system including three procedures: (A) Detection, (B) Segmentation, and (C) Classification
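To make the three-step flow concrete, the sketch below shows how such a pipeline could be wired together in Python; the callables it accepts (the detector, the U-Net masking step, and the per-tooth-type classifiers) are illustrative placeholders under assumed names, not the authors' released code.

```python
# A minimal orchestration sketch of the detect -> segment -> classify
# pipeline; all callables are assumed placeholders for the trained models.
from typing import Callable, Dict, List

def assess_development(
    panoramic,                            # one panoramic radiograph
    detect_and_crop: Callable,            # returns 7 crops, front to back
    segment: Callable,                    # U-Net masking of one crop
    classifiers: Dict[str, Callable],     # tooth type -> stage predictor
) -> List[str]:
    # Tooth positions 1-7 of the left mandible map onto the four models.
    tooth_type = {1: "incisor", 2: "incisor", 3: "canine",
                  4: "premolar", 5: "premolar", 6: "molar", 7: "molar"}
    crops = detect_and_crop(panoramic)            # step A: detection
    stages = []
    for pos, crop in enumerate(crops, start=1):
        masked = segment(crop)                    # step B: segmentation
        stage = classifiers[tooth_type[pos]](masked)  # step C: classification
        stages.append(stage)                      # e.g. "E" per Demirjian
    return stages
```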
The performances of the models used in each procedure were analyzed. Figure 1 illustrates the workflow of our proposed methodology for a fully automated dental developmental stage assessment system. Gradient-weighted class activation mapping (Grad-CAM) was employed to analyze the heatmap images of the model for each developmental stage.
Tooth detection using YOLOv5
The You-Only-Look-Once (YOLO) v5 model was used for detecting the seven permanent teeth of the left mandible. The YOLO system is a fast and accurate object detection model that uses a single neural network and predicts bounding boxes and class probabilities directly from full images in one evaluation [24]. The YOLO network consists of three main parts. Backbone: a pre-trained convolutional neural network used to extract feature representations from images. Neck: connects the backbone and the head, mixing and combining the features formed in the backbone. Head: responsible for generating the final output; it applies anchor boxes to feature maps and renders the final output. The panoramic radiographs were resized to 1000 pixels in width and 500 pixels in height. 80% of the 5133 panoramic samples were randomly allocated to the training dataset, and the remaining 20% to the validation dataset. The seven teeth of the left mandible were manually annotated with bounding boxes and labeled 'target', and the rest of the teeth were labeled 'no_target'. When both a primary tooth and its succeeding permanent tooth were present, the primary tooth was annotated as 'no_target' to ensure that only the permanent teeth were recognized. The image size was set to 640 × 640 for YOLOv5; the training images were rotated from −30 to 30 degrees, and the brightness and contrast were randomly changed within 30%. Transfer learning with the YOLOv5l (large) pre-trained model was used to accelerate training and improve performance. Transfer learning is a useful way to quickly retrain a model on new data without having to retrain the entire network.
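As an illustration of this detection step, the following is a possible inference-time sketch using YOLOv5's public torch.hub interface; the weights file name, confidence threshold, and the left-to-right ordering heuristic are assumptions, not details taken from the paper.

```python
# A sketch of running a trained YOLOv5 detector and cropping the seven teeth.
# 'best.pt' and the 0.5 confidence threshold are illustrative assumptions.
import torch
from PIL import Image

model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")
model.conf = 0.5  # confidence threshold (assumed value)

pano = Image.open("panoramic.png").resize((1000, 500))  # 1000 x 500 as in the text
det = model(pano).pandas().xyxy[0]  # columns: xmin, ymin, xmax, ymax, name, ...

# Keep only permanent-tooth boxes and order them front to back; on a
# panoramic image the patient's left mandible runs from the midline toward
# the right edge, so ascending x roughly matches tooth positions 1..7.
teeth = det[det["name"] == "target"].sort_values("xmin").head(7)
crops = [pano.crop((r.xmin, r.ymin, r.xmax, r.ymax))
         for r in teeth.itertuples()]
```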
Tooth segmentation using U-Net
Tooth segmentation was performed to extract accurate and distinctive features of teeth and to improve the accuracy of the dental development classification model by removing the surrounding background from the cropped image. The U-Net model was employed to segment teeth in the cropped images obtained from the previous tooth detection stage. The U-Net architecture consists of a contracting path (left side) to capture context and a symmetric expanding path (right side) that enables precise localization [25]. The contracting path consisted of repeated applications of two convolutional layers with a kernel size of 3 × 3 and a stride of 1, each followed by a rectified linear unit (ReLU), and a max-pooling layer with a window size of 2 × 2 and a stride of 2 for down-sampling. The expansive path was composed of repeated applications of a transposed convolutional layer with a kernel size of 2 × 2 and a stride of 2 for up-sampling the feature map, followed by concatenation with the corresponding feature map from the contracting path and two convolutional layers with a kernel size of 3 × 3 and a stride of 1, each followed by a ReLU. The final convolutional layer, with a kernel size of 1 × 1 and a stride of 1, mapped a 64-component feature vector to the desired number of classes (tooth region: 1, other region: 0). U-Net has been widely used in biomedical segmentation applications, and its application to tooth segmentation in X-ray images has demonstrated superior results [26]. 80% of the cropped tooth images were randomly allocated to the training dataset, and the remaining 20% to the validation dataset. As the contour of the tooth is important for stage determination and U-Net might not accurately segment tooth edge details [27], our study intentionally extended the segmentation beyond the exact tooth contour. The image size was set to 128 × 128 for U-Net. To minimize unnecessary variance and improve the performance of the model, training images were rotated from −15 to 15 degrees, and size changes within a 10% range were applied for augmentation.
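The described blocks can be rendered in PyTorch roughly as follows; this is a condensed sketch (fewer resolution levels than a full U-Net, with assumed channel widths), not the exact network used in the study.

```python
# A condensed PyTorch rendering of the U-Net blocks described above
# (3x3 convs + ReLU, 2x2 max-pooling, 2x2 transposed convs with skip
# connections, final 1x1 conv). Channel widths are assumptions.
import torch
import torch.nn as nn

def double_conv(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, stride=1, padding=1), nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.pool = nn.MaxPool2d(2, stride=2)
        self.down1 = double_conv(1, 64)
        self.down2 = double_conv(64, 128)
        self.down3 = double_conv(128, 256)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = double_conv(256, 128)   # 128 upsampled + 128 skip
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = double_conv(128, 64)    # 64 upsampled + 64 skip
        self.head = nn.Conv2d(64, 1, 1, stride=1)  # tooth vs. background logit

    def forward(self, x):                   # x: (N, 1, 128, 128)
        d1 = self.down1(x)
        d2 = self.down2(self.pool(d1))
        d3 = self.down3(self.pool(d2))
        u2 = self.dec2(torch.cat([self.up2(d3), d2], dim=1))
        u1 = self.dec1(torch.cat([self.up1(u2), d1], dim=1))
        return self.head(u1)                # use with BCEWithLogitsLoss

mask_logits = UNet()(torch.randn(1, 1, 128, 128))  # -> (1, 1, 128, 128)
```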
Dental development classification with EfficientNet
The EfficientNet model was employed to develop the dental development classification models. EfficientNets are a family of image classification models built on a scaling method that uniformly scales all dimensions of depth, width, and resolution using a compound coefficient. This compound scaling method makes it easy to scale a baseline convolutional neural network up to any target resource constraint in a principled way while maintaining model efficiency [28]. EfficientNet-B0 is the base model, and EfficientNet-B1 to B7 are scaled variants of the base model. Transfer learning with the pre-trained EfficientNet-B7 model was used to accelerate training and improve performance. Four types of classification models (Incisor, Canine, Premolar, and Molar) were devised according to Demirjian's method for dental development staging. The seven cropped and segmented tooth images were assigned to the classification models in order: the first and second tooth images to the Incisor model, the third tooth image to the Canine model, the fourth and fifth tooth images to the Premolar model, and the sixth and seventh tooth images to the Molar model. Each image was then labeled with the corresponding tooth development stage. The Incisor and Canine models classify their corresponding teeth into stages C to H, while the Premolar and Molar models classify their respective teeth into stages A to H. The development stage of each segmented tooth image from the panoramic radiographs was labeled by one skilled pediatric dental specialist and set as the reference for classification model training and evaluation. The intraobserver reliability of the developmental stage labeling of each tooth based on Demirjian's method was assessed using weighted Cohen's kappa analysis with MedCalc® Statistical Software (version 20.100; MedCalc Software Ltd, Ostend, Belgium). The developmental stage of each tooth was re-examined using 200 randomly selected panoramic radiographs at 3-week intervals, and the calculated weighted Cohen's kappa value was 0.93, indicating 'almost perfect' agreement. Due to significant variation in the number of images for each developmental stage within each model, the data for training and validation in each development stage were randomly allocated at an 80:20 ratio, and the maximum number of training images was capped to prevent significant training bias between categories. The image size was set to 224 × 224 for EfficientNet, and various data augmentation techniques were applied to increase the amount of data, avoid overfitting, and optimize the results. Training images were randomly flipped horizontally; brightness, contrast, saturation, and hue values were randomly changed within 30%; image movement and size changes within a 10% range were applied; and random rotation within 360 degrees was used.
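As an illustration of this setup, the sketch below loads a pre-trained EfficientNet-B7 and composes augmentations mirroring those listed above. The `efficientnet_pytorch` package is one possible implementation; the paper does not name the library it used, and the exact augmentation parameters are assumptions.

```python
# A sketch of the classification setup: EfficientNet-B7 transfer learning
# plus augmentations matching the text; library choice is an assumption.
from efficientnet_pytorch import EfficientNet
from torchvision import transforms

# Premolar/Molar models output 8 stages (A-H); Incisor/Canine output 6 (C-H).
model = EfficientNet.from_pretrained("efficientnet-b7", num_classes=8)

train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.3, contrast=0.3,
                           saturation=0.3, hue=0.3),   # changes within 30%
    transforms.RandomAffine(degrees=360,               # random rotation
                            translate=(0.1, 0.1),      # movement within 10%
                            scale=(0.9, 1.1)),         # size change within 10%
    transforms.ToTensor(),
])
```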
Model training options and evaluations
The study was performed on an NVIDIA Tesla K80 24 GB GPU, and Python, an open-source programming language (version 3.8.13; Python Software Foundation, Wilmington, DE, USA), with the PyTorch library (version 1.9.1), was used for model development.
For the development of the automated tooth development staging system proposed in this study, a detection and segmentation procedure for the seven left mandibular teeth in panoramic radiographs was needed prior to the tooth classification. A total of 5133 panoramic images were randomly split into a training dataset (80%) and a validation dataset (20%), and YOLOv5 was trained for tooth detection. The training of the detection model with YOLOv5 used the Adam optimizer with an initial learning rate of 1e-3 and a batch size of 4. The GIoU loss function was adopted, and the model was trained for 100 epochs, selecting the model with the best performance. The performance of the detection model was evaluated with recall, precision, and mAP (mean average precision); the corresponding equations are shown in (1), (2), and (3).

mAP = \frac{1}{N} \sum_{k=1}^{N} AP_k \quad (1)
The U-Net model was trained for the segmentation process. 80% of the cropped tooth images from the detection procedure were randomly split and assigned to the training dataset, while the remaining 20% were allocated to the validation dataset. For the training of the segmentation model with U-Net, the Adam optimizer and binary cross-entropy loss function were used, with an initial learning rate of 1e-4 and a batch size of 10. The model was trained for 1000 epochs, and the model with the best performance was selected. The segmentation model was evaluated for accuracy.
The classification procedure for the dental developmental stages was performed using EfficientNet, and four types of classification models were developed based on Demirjian's method: the Incisor model (central and lateral incisors), the Canine model (canine), the Premolar model (first and second premolars), and the Molar model (first and second molars). Segmented images from U-Net were labeled with the corresponding tooth development stage. For each development class, the datasets were randomly split, with 80% allocated to the training dataset and 20% to the validation dataset. The classification model with EfficientNet was trained for 1000 epochs using the Adam optimizer, and the best model was selected. The initial learning rate was set to 1e-4 and the batch size to 10, with the cross-entropy loss function employed.
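A minimal training loop matching the stated options for the segmentation model might look as follows; the random tensors stand in for the real cropped-tooth dataset, which is not public.

```python
# Training-loop sketch for the U-Net (Adam, lr 1e-4, batch size 10,
# binary cross-entropy, 1000 epochs). Dummy data replaces the study dataset.
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = UNet().to(device)  # the sketch from the segmentation section

dummy = TensorDataset(torch.randn(40, 1, 128, 128),                    # images
                      torch.randint(0, 2, (40, 1, 128, 128)).float())  # masks
loader = DataLoader(dummy, batch_size=10, shuffle=True)

opt = torch.optim.Adam(net.parameters(), lr=1e-4)
loss_fn = torch.nn.BCEWithLogitsLoss()  # binary cross-entropy on logits

for epoch in range(1000):
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device)
        opt.zero_grad()
        loss = loss_fn(net(images), masks)
        loss.backward()
        opt.step()
    # After each epoch, the validation split would be scored here and the
    # best-performing checkpoint kept, as described above.
```

The classification models would follow the same pattern, with `torch.nn.CrossEntropyLoss` and integer stage labels in place of the mask loss.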
A performance matrix was constructed to summarize the performance of the classification models. The recall (classification accuracy), precision, and F1 score for each classification model were calculated using the validation dataset, as shown in Eqs. (2) to (4).

Recall = \frac{TP}{TP + FN} \quad (2)

Precision = \frac{TP}{TP + FP} \quad (3)

F1\ score = 2 \times \frac{Recall \times Precision}{Recall + Precision} \quad (4)

TP: true positive, FP: false positive, FN: false negative.
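As a check on Eqs. (2) to (4), the per-stage metrics can be computed directly from a confusion matrix; the 2 × 2 matrix below is a made-up example, not study data.

```python
# Computing Eqs. (2)-(4) per class from a confusion matrix.
import numpy as np

def per_class_metrics(cm: np.ndarray):
    """cm[i, j] = count of samples with true stage i predicted as stage j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as the class, but wrong
    fn = cm.sum(axis=1) - tp          # belong to the class, but missed
    recall = tp / (tp + fn)           # Eq. (2)
    precision = tp / (tp + fp)        # Eq. (3)
    f1 = 2 * recall * precision / (recall + precision)  # Eq. (4)
    return recall, precision, f1

cm = np.array([[80, 20],              # hypothetical 2-stage example
               [10, 90]])
print(per_class_metrics(cm))
```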
Results
A total of 5133 panoramic radiograph images, from 2825 males and 2308 females, were retrospectively collected from the database of the Department of Pediatric Dentistry at Seoul National University Dental Hospital between 2020 and 2021. The age and gender distributions are presented in Table 1, with chronologic age calculated by subtracting the date of birth from the date the panoramic radiograph was taken.
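For illustration, this decimal-age computation could be implemented as follows; dividing by 365.25 days per year is an assumption about how the fractional part was handled.

```python
# One possible way to compute decimal chronologic age from the two dates.
from datetime import date

def chronologic_age(birth: date, radiograph: date) -> float:
    return (radiograph - birth).days / 365.25  # assumed year length

print(chronologic_age(date(2014, 5, 2), date(2021, 3, 15)))  # ~6.87 years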
Table 1 Age and sex distribution of the panoramic radiograph samples

Chronologic age   Boys   Girls
4–4.99            175    144
5–5.99            286    201
6–6.99            366    272
7–7.99            365    316
8–8.99            343    287
9–9.99            359    293
10–10.99          311    276
11–11.99          217    191
12–12.99          174    133
13–13.99          111    106
14–14.99          74     53
15–15.99          44     36
Total             2825   2308
Performance of the detection and segmentation model
The performance of the YOLOv5 model was as follows: recall: 0.991, precision: 0.994, and mAP: 0.995. Recall measures how many true positives (TP) are found out of all actual positives (TP + FN), and precision measures how many true positives (TP) are found out of all positive predictions (TP + FP) [29]. The mean average precision (mAP) is a commonly used metric to analyze the performance of an object detection model; a high mAP indicates that the model is more precise and has higher recall. The process of tooth detection with YOLOv5 is shown in Fig. 1A.
The performance of U-Net was evaluated with accuracy, sensitivity, and specificity, which all showed the same value. This is because the results of U-Net segmentation and the ground truth contain only two grayscale intensity values, 0 and 255 [26]. The accuracy of the U-Net segmentation model was found to be 0.978, and the visualized images resulting from the U-Net can be seen in Fig. 1B.

Performance of the classification model
The confusion matrix with recall (classification accuracy), precision, and F1 score for each classification model on the validation dataset is presented in Tables 2 and 3. The confusion matrix depicts a summary of the prediction results of a classification model. The F1 score combines precision and recall into a single metric and provides a balanced evaluation of a model's performance. The F1 score ranges between 0 and 1, with 1 indicating perfect precision and recall and 0 representing poor performance [29]. The processes of fully automated classification are shown in Fig. 1C.
The Incisor model exhibited the highest classification accuracy in stage H (99.22) and the lowest in stage C (34.78), with the highest F1 score achieved in stage H (96.49). The Canine model demonstrated the highest classification accuracy in stage F (94.04), the lowest in stage G (65.89), and the highest F1 score in stage F (91.09). The Premolar model showed the highest classification accuracy in stage F (92.28), the lowest in stage G (73.37), and the highest F1 score in stage F (92.28). Lastly, the Molar model showed the highest classification accuracy in stage B (96.49) and the lowest in stage A (82.35), with the highest F1 score in stage D (94.08).
Table 2 Evaluation metrics of each classified stage in the incisor and canine classification models using EfficientNet

Stage   | Incisor: Recall / Precision / F1 (%) | Canine: Recall / Precision / F1 (%)
C       | 34.78 / 61.54 / 44.44                | 77.78 / 83.05 / 80.33
D       | 66.67 / 60.95 / 63.68                | 70.00 / 77.78 / 73.68
E       | 80.72 / 75.00 / 77.75                | 80.82 / 82.71 / 81.76
F       | 81.79 / 87.16 / 84.39                | 94.04 / 88.32 / 91.09
G       | 35.74 / 75.91 / 48.60                | 65.89 / 81.73 / 72.96
H       | 99.22 / 93.90 / 96.49                | 92.04 / 77.61 / 84.21
Average | 66.49 / 75.74 / 69.23                | 80.09 / 81.87 / 80.67
Table 3 Evaluation metrics of each classified stage in the premolar and molar classification models using EfficientNet

Stage   | Premolar: Recall / Precision / F1 (%) | Molar: Recall / Precision / F1 (%)
A       | 80.00 / 88.89 / 84.21                 | 82.35 / 100.00 / 90.32
B       | 89.74 / 89.74 / 89.74                 | 96.49 / 85.94 / 90.91
C       | 87.96 / 88.79 / 88.37                 | 91.41 / 90.85 / 91.13
D       | 84.16 / 84.16 / 84.16                 | 92.98 / 95.21 / 94.08
E       | 87.05 / 87.52 / 87.29                 | 92.70 / 94.07 / 93.38
F       | 92.28 / 92.28 / 92.28                 | 90.51 / 90.73 / 90.62
G       | 73.37 / 69.86 / 71.57                 | 85.58 / 79.26 / 82.30
H       | 81.18 / 83.13 / 82.14                 | 92.12 / 95.45 / 93.76
Average | 84.47 / 85.55 / 84.97                 | 90.52 / 91.44 / 90.81
Table 4 Cross-tabulation of the classified stages of the incisor, assigned by the expert (rows) and by the proposed automated staging method (columns) (%)

Stages   C     D     E     F     G     H
C      0.35  0.65    –     –     –     –
D      0.05  0.67  0.28    –     –     –
E        –   0.11  0.81  0.07    –     –
F        –     –   0.09  0.82  0.05  0.04
G        –     –     –   0.07  0.36  0.57
H        –     –     –     –   0.01  0.99
Table 5 Cross-tabulation of the classified stages of the canine, assigned by the expert (rows) and by the proposed automated staging method (columns) (%)

Stages   C     D     E     F     G     H
C      0.78  0.21  0.01    –     –     –
D      0.08  0.70  0.22    –     –     –
E        –   0.05  0.81  0.14    –     –
F        –     –   0.02  0.94  0.03  0.01
G        –     –     –   0.12  0.66  0.22
H        –     –     –   0.01  0.07  0.92
Table 6 Cross-tabulation of the classified stages of the premolar, assigned by the expert (rows) and by the proposed automated staging method (columns) (%)

Stages   A     B     C     D     E     F     G     H
A      0.80  0.20    –     –     –     –     –     –
B      0.02  0.90  0.08    –     –     –     –     –
C        –   0.01  0.88  0.11    –     –     –     –
D        –     –   0.07  0.84  0.09    –     –     –
E        –     –     –   0.05  0.87  0.08    –     –
F        –     –     –     –   0.04  0.92  0.04    –
G        –     –     –     –     –   0.13  0.73  0.14
H        –     –     –     –     –   0.01  0.18  0.81
Table 7 Cross-tabulation of the classified stages of the molar, assigned by the expert (rows) and by the proposed automated staging method (columns) (%)

Stages   A     B     C     D     E     F     G     H
A      0.82  0.18    –     –     –     –     –     –
B        –   0.97  0.03    –     –     –     –     –
C        –   0.04  0.91  0.05    –     –     –     –
D        –     –   0.04  0.93  0.03    –     –     –
E        –     –     –   0.02  0.93  0.05    –     –
F        –     –     –     –   0.02  0.91  0.07    –
G        –     –     –     –     –   0.07  0.86  0.07
H        –     –     –     –     –     –   0.08  0.92
Among the four classification models, the Molar model exhibited the best performance, with the highest classification accuracy (90.97) and F1 score (90.81), while the Incisor model showed the lowest accuracy (66.49) and lowest F1 score (69.23). Cross-tabulations of the stages assigned within the validation dataset, using the ground truth data labeled by one skilled pediatric dentist (rows) and the classification model (columns), are shown in Tables 4, 5, 6, and 7. In cases of misclassification, most misclassified stages were seen only in the neighboring stages.

Visualization of Grad-CAM for the classification model
Gradient-weighted class activation mapping (Grad-CAM) was applied to the classification model results to create a visual explanation of the regions on which the EfficientNet model concentrated for each tooth developmental stage. The areas that had the most influence on the classification evaluation of the model are highlighted and presented as a heatmap [30]. Figure 2 illustrates the Grad-CAM heatmaps for each dental development stage. The classification model seemed to effectively focus on the features of each stage, mostly concentrating on the apical portion of the tooth.
Fig. 2 Grad-CAM heatmaps of the classification according to dental development stage by Demirjian’s method
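For readers who wish to reproduce such heatmaps, a minimal Grad-CAM routine along the lines of [30] can be hooked onto a trained classifier as sketched below; the hook-based implementation and the choice of target layer (e.g., the final convolutional block of the EfficientNet) are assumptions about one workable setup, not the authors' code.

```python
# A minimal Grad-CAM sketch using forward/backward hooks on a chosen
# convolutional layer; layer choice and preprocessing are assumptions.
import torch
import torch.nn.functional as F

def grad_cam(model, x, target_layer, class_idx):
    feats, grads = [], []
    h1 = target_layer.register_forward_hook(
        lambda m, i, o: feats.append(o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: grads.append(go[0]))
    try:
        score = model(x)[0, class_idx]   # logit of the stage of interest
        model.zero_grad()
        score.backward()
    finally:
        h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)   # GAP over gradients
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True))
    return F.interpolate(cam, size=x.shape[2:], mode="bilinear",
                         align_corners=False)  # heatmap at input resolution
```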
Discussion
With the advancement of AI technology, there has been increased interest in its application to dentistry. AI models serve as supportive tools, providing more precise, rapid, and consistent diagnoses while enhancing the accuracy of prognostic predictions, particularly in the analysis and diagnosis of radiographic images [16, 21, 31]. In forensic odontology, the estimation of age groups using AI has shown promising results, with high accuracy and precision [14, 15]. However, studies on the developing dentition of adolescents and children have been insufficient.
The present study devised an automated dental developmental staging system in panoramic radiographs using deep learning models and evaluated the performance of each process. The proposed methodology has potential applications in estimating dental age for forensic odontology and in treatment planning for orthodontics and pediatric dentistry, by providing dental professionals with ease and efficiency in dental staging.
Previous studies utilizing deep learning to classify dental development stages with panoramic radiographs have primarily focused on evaluating one or two teeth rather than the lower left quadrant teeth commonly examined in traditional methods [15, 32]. Mohammad et al. assessed the left mandibular first and second permanent premolars from stages C to H with a deep learning model [12], and Merdietio Boedi et al. devised an automated tooth developmental staging system for the segmented left mandibular third molar [33]. However, determining dental age based on the development stage of a single tooth or a few teeth may result in a broad age range. A comprehensive evaluation of multiple teeth, similar to the currently used manual methods, would enhance the accuracy and practical utility of age determination. In this study, we designed a fully automated dental development classification system using deep learning based on Demirjian's method and evaluated the performance of the stage classification. Our proposed method comprises three stages: detection, segmentation, and classification, with the aim of automatically classifying the dental development stages in panoramic radiographs.
For the classification of individual teeth, it was necessary to detect each tooth sequentially. YOLO, a fast real-time object detection model known for its high mean average precision, was utilized to detect permanent teeth in panoramic radiographs. YOLOv4 has previously demonstrated high performance in detecting permanent tooth germs on panoramic radiographs [34] and has also shown accurate and fast performance for automated tooth detection and numbering in panoramic radiographs [35]. In this study, YOLOv5 showed promising results, demonstrating high recall, precision, and mean average precision for the detection of permanent teeth in the lower left quadrant of panoramic radiographs. However, since only panoramic samples with all seven teeth intact were included for training and evaluation, excluding images with missing or supernumerary teeth, the model's detection performance may have shown higher values than would be obtained on unselected data.
The segmentation procedure was conducted after detecting the seven teeth with bounding boxes. Segmenting the tooth from the surrounding background can enhance the stage classification performance of the model, as the remaining surrounding tissues may obscure correct stage allocation [33]. U-Net, known for its high performance in segmenting teeth in panoramic and periapical images, as well as different features of teeth in periapical images [26, 27, 36], was employed to segment the detected teeth in this study, achieving a high accuracy of 0.978. For tooth development staging, Merdietio Boedi et al. suggested the full tooth segmentation type, which includes only the developing tooth structure [33]. However, in this study, rough segmentation including the surrounding pixels was implemented to reduce misclassification caused by under-segmentation of the tooth edge [12, 26], as the obscurity of the boundary between the tooth root and alveolar bone may be a critical issue in tooth segmentation [27]. Since Demirjian's method classifies teeth based on the apical portion of the developing tooth, it was necessary to prevent inadvertent cutting of the tooth and minimize background interference as much as possible.
Following detection and segmentation, each tooth was categorized into four types (incisor, canine, premolar, and molar) based on its tooth number. Subsequently, four separate models were trained using EfficientNet, each corresponding to one of these categories and referencing the dental development stage according to Demirjian's method. The EfficientNet model family is smaller and faster than previous models thanks to its compound scaling technique [28] and has shown promising results in the classification of dental images [37, 38]. The models' performance in distinguishing between the developmental stages of each tooth was assessed, with the F1 score, precision, and classification accuracy (recall) being highest in the Molar model, followed by the Premolar, Canine, and Incisor models (Tables 2 and 3).
The Incisor model effectively distinguished developmental stages, particularly the E, F, and H stages. However, the overall model performance was poor due to low classification accuracy in the C, D, and G stages, resulting in an F1 score of 69%. The low F1 scores of the C and D stages in the Incisor model can be attributed to the limited number of panoramic radiograph samples from young children, leading to underfitting caused by the insufficient number of samples. Moreover, stages C and D often overlap with primary teeth or appear rotated on radiographs, making it challenging for the model to accurately learn and distinguish these stages. In stage G, a considerable number of cases were misclassified as stage H, contributing to low accuracy (Table 4). The blurred, shortened, or unclear perspective of the lower incisors in panoramic radiographs with mixed dentition, which could result from improper positioning of the patient [39], may also contribute to the low performance of the Incisor model. Positioning errors are a common issue in panoramic radiography, causing image distortions where the apexes of the lower incisors may appear out of focus, impacting diagnostic accuracy [40].
Such errors are more prevalent among younger individuals, who may not remain calm and motionless during the radiographic procedure, leading to challenges in proper positioning [39].
The Canine model exhibited higher classification performance than the Incisor model, with no significant differences between stages and an average F1 score of 80%. However, similar to the Incisor model, classification accuracy was low in stage G, which was often misclassified as stage H (Table 5). The Premolar and Molar models demonstrated the highest performance in distinguishing developmental stages overall, with average F1 scores of 85% and 90%, respectively (Table 3). The highest F1 score was observed in the F stage for the Premolar model and the D stage for the Molar model. The performance between stages did not exhibit substantial differences in either model. However, both the Premolar and Molar models showed the lowest F1 score in the G stage, and misclassified cases were assigned to the E and H stages in similar proportions.
The important features for the dental developmental stages in the classification models were highlighted through heatmaps using gradient-weighted class activation mapping (Grad-CAM) in Fig. 2 to improve the interpretability of the classification models. The classification models specifically focused on the apical portion of the developing tooth, which is considered an important feature in distinguishing between the stages based on Demirjian's method.
In this study, we proposed a three-step procedure for the automated classification of dental development stages in panoramic radiographs using deep learning. Preceding the classification, tooth detection and segmentation would enhance the overall performance of stage classification compared to the classification procedure alone. While deep learning models have demonstrated high accuracy in tooth detection and segmentation [26, 27, 35, 36], their performance for dental developmental stage classification remains insufficient. Previous studies on deep learning models for development stage classification have primarily focused on premolars or molars [12, 32, 33], with research on incisors and canines lacking. Therefore, the results of this study could provide ideas for further research in devising more accurate classification models for comprehensive automated dental age and maturity analysis. The four types of classification models exhibited differences in accuracy and performance, with the Incisor and Canine models showing lower performance than the Premolar and Molar models. It remains challenging to classify all seven lower left teeth individually using deep learning without manual interpretation to estimate dental age or evaluate dental maturity according to Demirjian's method. Manual intervention is still necessary to minimize errors from the deep learning model, and completely relying on decisions from deep learning models is insufficient. However, considering that the misclassified cases were predominantly categorized into neighboring stages (Tables 4, 5, 6, and 7), the deep learning models can effectively play a supportive role in classifying tooth development stages.
The use of deep learning in radiograph analysis can reduce observer fatigue and bias and handle large samples in a short amount of time, thus shortening the time to diagnosis and increasing the efficiency of clinicians [14, 21, 33]. In contrast to manual interpretation, disagreements between observers are eliminated, and the results are independent of the skills or experience of the observers. Furthermore, with ongoing technological advancements, new CNN architectures are continually being developed, leading to gradual improvements in the performance of deep learning models. This enhanced performance is expected to further increase their effectiveness and broaden their application in medical image analysis in the future [41, 42].
There are still a few limitations to this study. First, panoramic radiographs with low resolution or patient positioning errors were included as long as they were distinguishable by a pediatric dental specialist. This inclusion criterion may have resulted in a particularly lower performance of the anterior tooth models, as these errors are more common in pediatric patients. Further studies considering positioning errors in panoramic radiographs are necessary to enhance the model's performance, particularly for anterior teeth. Second, as the four classification models were trained with seven teeth from the same panoramic samples, the number of datasets varied for each tooth stage. The imbalanced datasets between the developmental stages may introduce bias in the classification model, necessitating additional research to address class imbalance in the developing dentition. Third, the number of samples for early developmental stages was limited, as panoramic radiographs are not routinely taken at a young age. Studies with a larger number of samples for early developmental stages are needed to improve the model's performance for this phase. Furthermore, with the advancement of deep learning models, additional studies would be needed to investigate the potential for achieving more precise and accurate detection, segmentation, and classification performance than demonstrated in this study.

Conclusion
In this study, we proposed a fully automated dental development staging system based on Demirjian's method using deep learning. The proposed method consists of three stages: detection, segmentation, and classification. YOLOv5, U-Net, and EfficientNet were employed for each stage, and the models' performance was evaluated, demonstrating good results across various metrics.
The detection and segmentation procedures yielded promising results, with a mAP of 0.995 for the detection model and an accuracy of 0.978 for the segmentation model. The classification models demonstrated F1 scores of 69.23, 80.67, 84.97, and 90.81 for the Incisor, Canine, Premolar, and Molar models, respectively. In the Grad-CAM analysis, the classification model focused on the apical portion of the developing tooth, a crucial feature for staging according to Demirjian's method. Further studies are needed to enhance the model's dental staging accuracy for anterior teeth. The proposed method holds great promise for future use in forensic odontology and clinical practice, serving as a supportive tool for rapid and objective dental age estimation and dental maturity evaluation.

Abbreviations
AI: Artificial intelligence
ML: Machine learning
DL: Deep learning
CNN: Convolutional neural network
YOLO: You Only Look Once
ReLU: Rectified linear unit
mAP: Mean average precision
Grad-CAM: Gradient-weighted class activation mapping

Acknowledgements
Not applicable.

Author contributions
SHO and YJK conceived the ideas and established the experimental setup. HTK, JSS, TJS, and HKH collected and generated data. KTJ assisted in planning the study and reviewed the manuscript. SHO and YJK wrote the first manuscript. All authors analyzed and interpreted the data. All authors read and approved the manuscript.

Funding
This research was not supported by any funding.

Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Declarations

Ethics approval and consent to participate
This retrospective study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of the Seoul National University Dental Hospital, Seoul, Korea (Ethics Code: ERI23026). There was no need for individual consent, and the need for informed consent was waived by the Ethics Committee of Seoul National University Dental Hospital for this retrospective study because the data and patient details were anonymized.

Consent for publication
Not applicable.

Competing interests
The authors declare no competing interests.

Received: 24 November 2023 / Accepted: 18 March 2024

References
1. Khorate MM, Dinkar A, Ahmed J. Accuracy of age estimation methods from orthopantomograph in forensic odontology: a comparative study. Forensic Sci Int. 2014;234(184):e1–8.
2. Chaillet N, Willems G. Dental maturity in Belgian children using Demirjian's method and polynomial functions: new standard curves for forensic and clinical use. J Forensic Odontostomatol. 2004;22(2):18–27.
3. Leurs I, Wattel E, Aartman I, Etty E, Prahl-Andersen B. Dental age in Dutch children. Eur J Orthod. 2005;27(3):309–14.
4. Shi L, Zhou Y, Lu T, Fan F, Zhu L, Suo Y, Chen Y, Deng Z. Dental age estimation of Tibetan children and adolescents: comparison of Demirjian, Willems methods and a newly modified Demirjian method. Leg Med. 2022;55:102013.
5. Moness Ali AM, Ahmed WH, Khattab NM. Applicability of Demirjian's method for dental age estimation in a group of Egyptian children. BDJ Open. 2019;5(1):2.
6. Priyadarshini C, Puranik MP, Uma SR. Dental age estimation methods: a review. Int J Adv Health Sci. 2015;1(12):19–25.
7. Panchbhai A. Dental radiographic indicators, a key to age estimation. Dentomaxillofac Radiol. 2011;40(4):199–212.
8. Demirjian A, Goldstein H, Tanner JM. A new system of dental age assessment. Hum Biol. 1973;45(2):211–27.
9. Demirjian A, Goldstein H. New systems for dental maturity based on seven and four teeth. Ann Hum Biol. 1976;3(5):411–21.
10. Wang J, Dou J, Han J, Li G, Tao J. A population-based study to assess two convolutional neural networks for dental age estimation. BMC Oral Health. 2023;23(1):109.
11. Milošević D, Vodanović M, Galić I, Subašić M. Automated estimation of chronological age from panoramic dental X-ray images using deep learning. Expert Syst Appl. 2022;189:116038.
12. Mohammad N, Muad AM, Ahmad R, Yusof MYPM. Accuracy of advanced deep learning with TensorFlow and Keras for classifying teeth developmental stages in digital panoramic imaging. BMC Med Imaging. 2022;22(1):66.
13. Jain V, Kapoor P, Miglani R. Demirjian approach of dental age estimation: abridged for operator ease. J Forensic Dent Sci. 2016;8(3):177.
14. Khanagar SB, Vishwanathaiah S, Naik S, Al-Kheraif AA, Divakar DD, Sarode SC, Bhandi S, Patil S. Application and performance of artificial intelligence technology in forensic odontology: a systematic review. Leg Med. 2021;48:101826.
15. Vila-Blanco N, Varas-Quintana P, Tomás I, Carreira MJ. A systematic overview of dental methods for age assessment in living individuals: from traditional to artificial intelligence-based approaches. Int J Legal Med. 2023;137:1117–46.
16. Vishwanathaiah S, Fageeh HN, Khanagar SB, Maganur PC. Artificial intelligence: its uses and application in pediatric dentistry: a review. Biomedicines. 2023;11(3):788.
17. Ongsulee P. Artificial intelligence, machine learning and deep learning. Proc 15th Int Conf ICT Knowl Eng (ICT&KE). 2017:1–6.
18. El Joudi NA, Othmani MB, Bourzgui F, Mahboub O, Lazaar M. Review of the role of artificial intelligence in dentistry: current applications and trends. Procedia Comput Sci. 2022;210:173–80.
19. Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electron Mark. 2021;31(3):685–95.
20. Anwar SM, Majid M, Qayyum A, Awais M, Alnowami M, Khan MK. Medical image analysis using convolutional neural networks: a review. J Med Syst. 2018;42:1–13.
21. Khanagar SB, Al-Ehaideb A, Maganur PC, Vishwanathaiah S, Patil S, Baeshen HA, Sarode SC, Bhandi S. Developments, application, and performance of artificial intelligence in dentistry: a systematic review. J Dent Sci. 2021;16(1):508–22.
22. Guo YC, Han M, Chi Y, Long H, Zhang D, Yang J, Yang Y, Chen T, Du S. Accurate age classification using manual method and deep convolutional neural network based on orthopantomogram images. Int J Legal Med. 2021;135:1589–97.
23. Kahaki SM, Nordin MJ, Ahmad NS, Arzoky M, Ismail W. Deep convolutional neural network designed for age assessment based on orthopantomography data. Neural Comput Appl. 2020;32:9357–68.
24. Redmon J, Divvala S, Girshick R, Farhadi A. You Only Look Once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:779–88.
25. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. 2015:234–41.
26. Fariza A, Arifin AZ, Astuti ER. Automatic tooth and background segmentation in dental X-ray using U-Net convolution network. In: 2020 6th International Conference on Science in Information Technology (ICSITech). 2020:144–9.
27. Nishitani Y, Nakayama R, Hayashi D, Hizukuri A, Murata K. Segmentation of teeth in panoramic dental X-ray images using U-Net with a loss function weighted on the tooth edge. Radiol Phys Technol. 2021;14:64–9.
28. Tan M, Le Q. EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning. 2019:6105–14.
29. Powers DM. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv preprint. 2020;arXiv:2010.16061.
30. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision. 2017:618–26.
31. Carrillo-Perez F, Pecho OE, Morales JC, Paravina RD, Della Bona A, Ghinea R, Pulgar R, Pérez MM, Herrera LJ. Applications of artificial intelligence in dentistry: a comprehensive review. J Esthet Restor Dent. 2022;34(1):259–80.
32. Pintana P, Upalananda W, Saekho S, Yarach U, Wantanajittikul K. Fully automated method for dental age estimation using the ACF detector and deep learning. Egypt J Forensic Sci. 2022;12(1):54.
33. Merdietio Boedi R, Banar N, De Tobel J, Bertels J, Vandermeulen D, Thevissen PW. Effect of lower third molar segmentations on automated tooth development staging using a convolutional neural network. J Forensic Sci. 2020;65(2):481–6.
34. Kaya E, Gunec HG, Aydin KC, Urkmez ES, Duranay R, Ates HF. A deep learning approach to permanent tooth germ detection on pediatric panoramic radiographs. Imaging Sci Dent. 2022;52(3):275–81.
35. Putra RH, Astuti ER, Putri DK, Widiasri M, Laksanti PAM, Majidah H, Yoda N. Automated permanent tooth detection and numbering on panoramic radiograph using a deep learning approach. Oral Surg Oral Med Oral Pathol Oral Radiol. 2023:1–8.
36. Ari T, Sağlam H, Öksüzoğlu H, Kazan O, Bayrakdar İŞ, Duman SB, Çelik Ö, Jagtap R, Futyma-Gąbka K, Różyło-Kalinowska I. Automatic feature segmentation in dental periapical radiographs. Diagnostics. 2022;12(12):3081.
37. Deepak GD, Krishna Bhat S. Optimization of deep neural networks for multiclassification of dental X-rays using transfer learning. Comput Methods Biomech Biomed Eng Imaging Vis. 2023:1–20.
38. Hasnain MA, Malik H, Asad MM, Sherwani F. Deep learning architectures in dental diagnostics: a systematic comparison of techniques for accurate prediction of dental disease through X-ray imaging. Int J Intell Comput Cybern. 2023.
39. Peretz B, Gotler M, Kaffe I. Common errors in digital panoramic radiographs of patients with mixed dentition and patients with permanent dentition. Int J Dent. 2012;584138.
40. Rondon RHN, Pereira YCL, do Nascimento GC. Common positioning errors in panoramic radiography: a review. Imaging Sci Dent. 2014;44(1):1–6.
41. Fatima A, Shafi I, Afzal H, Díez IDLT, Lourdes DR-SM, Breñosa J, Espinosa JCM, Ashraf I. Advancements in dentistry with artificial intelligence: current clinical applications and future perspectives. Healthcare. 2022;10:2188.
42. Razzak MI, Naz S, Zaib A. Deep learning for medical image processing: overview, challenges and the future. In: Classification in BioApps: automation of decision making. Cham: Springer; 2018. pp. 323–50.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.