AI in Orthodontics
January 2020
Yi-Chin Wang
Department of Craniofacial Orthodontics, Chang-Gung memorial hospital, Taoyuan, Taiwan
Yu-Chih Wang
Department of Craniofacial Orthodontics, Chang-Gung memorial hospital, Taoyuan, Taiwan
Recommended Citation
Hung, Hsien-Ching; Wang, Yi-Chin; and Wang, Yu-Chih (2020) "Applications of Artificial Intelligence in
Orthodontics," Taiwanese Journal of Orthodontics: Vol. 32: Iss. 2, Article 3.
DOI: 10.38209/2708-2636.1005
Available at: https://www.tjo.org.tw/tjo/vol32/iss2/3
This Review Article is brought to you for free and open access by Taiwanese Journal of Orthodontics. It has been
accepted for inclusion in Taiwanese Journal of Orthodontics by an authorized editor of Taiwanese Journal of
Orthodontics.
Applications of Artificial Intelligence in Orthodontics
Abstract
Artificial intelligence (AI) technology is a tool for finding insights in different kinds of information in
the medical field. The purpose of this article is to give a brief introduction to the applications of AI in
orthodontic treatment. The reviewed literature was categorized into (1) extraction or non-extraction
therapy in orthodontic treatment, (2) orthognathic surgery, (3) segmentation and landmark
identification, (4) growth prediction, (5) cleft-related studies, and (6) TMD classification.
Keywords
artificial intelligence; machine learning; orthodontic treatment
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0
License.
(Taiwanese Journal of Orthodontics. 32(2): 85-92, 2020)
Received: April 15, 2020 Revised: May 11, 2020 Accepted: June 7, 2020
Reprints and correspondence to: Dr. Hsien-Ching Hung
10F., No. 6, Ln. 284, Wuxing St., Xinyi Dist., Taipei City 110, Taiwan (R.O.C.)
Tel: +886-958-156902 E-mail: [email protected]
turn new input data, such as sounds and images, into valuable output information, such as speech recognition and facial recognition. Machine learning can be divided into a training process and a testing process (Figure 1). In the training process, a set of functions, also known as a model, is evaluated with training data and a computer algorithm to arrive at the best function. In the subsequent testing process, the chosen function is evaluated again with testing data to see whether it really works.

The state-of-the-art machine learning method, ‘deep learning’, is based on the artificial neural network (ANN) (Figure 2). The ANN was inspired by the biological neural networks that help us sense the world and learn from it. The basic unit of an ANN is called an artificial neuron.
Figure 1. All samples are first divided into a training set and a testing set. The training set is further
divided into a training (learning) set and a validation set to prevent overfitting. Finally, the testing
set is used for model evaluation.
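The partitioning described in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_samples`, the 10% testing and validation fractions, and the fixed random seed are hypothetical choices, not values taken from any of the reviewed studies.

```python
import random


def split_samples(samples, test_fraction=0.1, validation_fraction=0.1, seed=0):
    """Shuffle samples, then split them into training, validation, and testing sets.

    The fractions and the seed are illustrative assumptions.
    """
    if not 0 <= test_fraction < 1 or not 0 <= validation_fraction < 1:
        raise ValueError("fractions must be in the range [0, 1)")

    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible

    # Step 1: hold out the testing set; it is only used for the final evaluation.
    n_test = int(len(shuffled) * test_fraction)
    testing = shuffled[:n_test]
    remaining = shuffled[n_test:]

    # Step 2: split the rest into a validation set and the actual training set.
    n_validation = int(len(remaining) * validation_fraction)
    validation = remaining[:n_validation]
    training = remaining[n_validation:]
    return training, validation, testing


# Hypothetical example: 200 sample identifiers.
training, validation, testing = split_samples(range(200))
print(len(training), len(validation), len(testing))
```

Keeping the testing set untouched until the end is the point of the design: the validation set absorbs the decisions made during training (for example when to stop), so the final evaluation is made on data the model has never influenced.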
Figure 2. An artificial neural network (ANN) consists of an input layer, an output layer, and hidden layers in between.
Artificial neurons in different layers are connected. Each connection is assigned a weight
representing its relative importance.
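The structure described in Figure 2 can be illustrated with a minimal forward pass written in plain Python. This is a sketch under assumptions: the sigmoid activation, the helper names (`neuron`, `layer`, `forward`), and every weight and bias value are hypothetical and are not taken from any reviewed study.

```python
import math


def sigmoid(x):
    """A common non-linear activation function (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias, activation=sigmoid):
    """One artificial neuron: weighted sum of the inputs, then an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def layer(inputs, weight_rows, biases, activation=sigmoid):
    """One layer: every neuron receives the same input vector."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Propagate an input vector through the layers in order."""
    values = inputs
    for weight_rows, biases in layers:
        values = layer(values, weight_rows, biases)
    return values


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0]),  # hidden layer weights and biases
    ([[1.0, -1.0]], [0.0]),                   # output layer weights and biases
]
output = forward([1.0, 1.0], network)
print(output)
```

Training, which is not shown here, would adjust the weights and biases so that the output vector moves closer to the desired answer for each training sample.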
An ANN can be described as a function whose input and output values are both vectors. The first layer of an ANN is called the input layer; it is constituted by the features of the sample as a vector. Between the input layer and the output layer, we can design different numbers of layers called hidden layers. Within each hidden layer, different numbers of artificial neurons are designed. Within an artificial neuron, input values can be transformed either linearly or non-linearly through different weights and an activation function, and its output propagates to the input of the next layer of the ANN. A deep learning neural network consists of at least 3 hidden layers.

The convolutional neural network (CNN) is a kind of deep learning neural network. It has shown outstanding performance in image recognition and classification [4]. Within a convolutional layer, multiple filters extract different patterns in the image and produce multiple feature maps. This is followed by a pooling process, which reduces the size of the image and the amount of computation. After iterative convolutional and pooling processes, the output is connected to fully connected layers to classify the image. Medical fields such as dermatology and radiology have shown good results applying CNN as an assisting diagnostic tool [5, 6], yet in the orthodontic field it has only gradually started to gain attention.

This article aims to provide an insight into applications of AI related to orthodontic diagnosis and treatment planning.

Extraction or non-extraction therapy in orthodontic treatment

Xie et al. constructed a decision-making expert system for orthodontic treatment of patients aged between 11 and 15 years to decide whether tooth extraction is needed, using a back-propagation ANN model [7]. The ANN model simulates the human neural system, with neural networks that can process nonlinear relationships and exhibit learning ability. Two hundred subjects were chosen, 120 receiving extraction therapy and 80 receiving non-extraction therapy. Twenty-three indices were selected as input data for each patient, and extraction or non-extraction was calculated as output data. Among the 200 subjects, 180 were used as training data and 20 as testing data. The constructed ANN showed 80% accuracy in the testing set. Moreover, lip incompetence and IMPA (L1-MP) were the two indices that contributed most to the output data.

Jung et al. also constructed a neural network model combined with a back-propagation algorithm [8]. The purpose of the study was to construct an AI expert system for deciding on extraction therapy and extraction pattern. The study included 156 patients. Twelve cephalometric variables and 6 indexes were selected as input data. Extraction or non-extraction and extraction pattern were set as output data. The treatment plans were determined by one orthodontic specialist. Unlike Xie’s study [7], it further divided the training data into training data and validation data. Iterative learning was stopped at the minimum error point of the validation set to prevent overfitting. The success rate of the models was 93% for the diagnosis of extraction or non-extraction therapy and 84% for the selection of extraction pattern.

Orthognathic surgery

Great investment has been made in research and development of digital orthodontics and 3D simulation of orthognathic surgery [9]. In addition, automated treatment planning and customized surgical setup planning lead to improved diagnostic precision, especially among inexperienced doctors [10, 11]. Knoops et al. developed a machine learning framework for automated diagnosis and computer-assisted planning in plastic and reconstructive surgery [12]. They presented a large-scale clinical 3D morphable model (3DMM), a machine-learning framework including supervised learning constructed with surface 3D scans. The model was trained with 4261 faces of healthy volunteers and orthognathic surgery patients. Through automated image processing, it provides a binary outcome, whether someone should be referred to a specialist, with 95.5% sensitivity and 95.2% specificity.
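Sensitivity and specificity for a binary referral decision of this kind are derived from the counts of correct and incorrect predictions. The sketch below shows that calculation; the function name `sensitivity_specificity` and the ten labels in the example are hypothetical and do not reproduce any data from the cited study.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels, where 1 means 'refer'."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == 1 and predicted == 1:
            tp += 1  # patient correctly referred
        elif truth == 1 and predicted == 0:
            fn += 1  # patient missed
        elif truth == 0 and predicted == 0:
            tn += 1  # healthy volunteer correctly not referred
        else:
            fp += 1  # healthy volunteer referred unnecessarily
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in true_labels")
    sensitivity = tp / (tp + fn)  # share of true patients that were detected
    specificity = tn / (tn + fp)  # share of healthy subjects that were cleared
    return sensitivity, specificity


# Hypothetical labels for ten subjects (1 = should be referred).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sensitivity, specificity = sensitivity_specificity(truth, predicted)
print(sensitivity, specificity)
```

Reporting both values matters because a model could reach high sensitivity simply by referring everyone; specificity shows how many healthy subjects it correctly leaves out.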
Then, a specialist can automatically produce a 3D simulation of the post-surgical outcome with a mean accuracy of 1.1±0.3 mm, without the need for conventional time-consuming computer-assisted surgical simulation. However, only surface scans were used in this study, so the underlying bone movement needs to be calculated from the soft tissue movement, which still remains a major challenge.

Weichel et al. developed a computer-assisted planning system based on CT, cephalometric, and plaster model data [13]. The system referred to a knowledge base built in the semantic web standard Resource Description Framework Schema, which transferred human knowledge into machine-readable data. A gradient descent algorithm was used to find a local minimum of the loss function. The loss function describes how well the current regression fits: the more the calculated result deviates from the optimal result, the larger the loss function becomes. Good general agreement was found between the automatically generated planning proposal and the planning result of a maxillofacial expert. However, it is a preliminary study in which only 5 cases were evaluated.

In contrast to Knoops’s study, which used 3DMM to arrive at a diagnosis, Choi et al. applied an ANN built from 12 measurement values of the lateral cephalogram and 6 additional indexes [11]. The machine learning model consisted of a 2-layer neural network with one hidden layer. The sample included 316 patients, of whom 160 were planned for surgical treatment and 156 for non-surgical treatment. The success rate of the model was 96% for whether the patient needed surgical treatment and 91% for the detailed diagnosis of surgery type and extraction decision. The success rates of these two studies are comparable.

Niño-Sandoval et al. tried to predict mandibular bone morphology based on maxillary morphology using ANN [14]. A total of 299 lateral cephalograms were obtained from Colombian patients, with 19 landmarks in X and Y coordinates. The results showed high predictability of the selected mandibular variables, which might be quite helpful for craniofacial reconstruction.

Patcas et al. conducted an interesting study assessing the impact of orthognathic treatments on facial attractiveness and estimated age using AI technologies [15]. For age estimation, the convolutional neural network (CNN) model was trained with >0.5 million facial images with age labels acquired from the Internet Movie Database and Wikipedia. For attractiveness prediction, the CNN model was trained on data from a dating site with >13000 face images and >17 million ratings for attractiveness. Presurgical and postsurgical photos of 146 consecutive orthognathic patients were collected for this single-center study. According to the algorithm, 66.4% of the patients improved with the treatment, resulting in an appearance nearly 1 year younger. The study showed that AI might be an objective way of evaluating treatment outcome in terms of aesthetic improvement.

Segmentation and landmark identification

Image segmentation is the process by which we isolate the pixels of target organs or lesions from medical images such as X-rays, CT, or MRI [16]. Image segmentation plays an important role in automated or semi-automated computer-aided diagnosis systems, and it is also important in volumetric medical image analysis. Landmark identification in lateral cephalometric X-rays has been of paramount importance for diagnosis and treatment planning in orthodontic treatment for decades [17]. In this section, we review studies that applied machine learning to perform segmentation and landmark identification.

Wang et al. developed a method for automated segmentation of the maxilla and mandible from CBCT [18]. They applied a learning-based framework to simultaneously segment both the maxilla and mandible from CBCT based on random forest [19]. The Dice ratio is a popular way of evaluating volumetric segmentation of medical images. It is defined as two times the number of intersecting voxels of the learned and ground truth sets, divided by the sum of the respective voxel counts. When the Dice ratio equals
1, the learned and ground truth sets match perfectly, and zero indicates no similarity. In this study, 30 CBCT scans were validated based on manually labeled ground truth. The average Dice ratios of the mandible and maxilla were 0.94 and 0.91, respectively.

Chen et al. used a machine learning algorithm based on Wang’s technique [18] to assess maxillary structure variation in unilateral canine impaction [20]. Subjects included 30 study group patients with unilateral maxillary canine impaction and 30 healthy control group subjects. The maxillary structure was auto-segmented, and no significant difference in bone volume was found between the impacted side and the non-impacted side in the study group. The study group had a significantly smaller maxillary volume than the control group. Segmentation efficiency was greatly improved by the automatic algorithm.

Several studies have looked into automated landmark identification on lateral cephalograms [21-25]. Arik [22] first applied CNNs for automated lateral cephalometric landmark identification. Park and Hwang [23, 25] used the deep-learning method You-Only-Look-Once version 3, trained on 1028 cephalograms. The mean detection error for a total of 80 landmarks between AI and human was 1.46±2.97 mm. Kunz [24] used open-source CNN deep learning algorithms (Keras & Google Tensorflow) for automatic identification of 12 commonly used orthodontic parameters. A set of 50 cephalometric X-rays was analyzed. The mean differences between AI and the humans’ gold standard were less than 0.37° for angular parameters, less than 0.20 mm for metric parameters, and less than 0.25% for the proportional parameter facial height. Nishimoto [21] used CNNs with a personal computer and lateral cephalometric X-rays gathered through the internet and still obtained results without significant difference between AI and hand-traced cephalometric landmarks.

Growth prediction

Timing is one of the key points to be considered during treatment planning, especially among growing patients. Several methods have been proposed for growth prediction, such as chronological age, menarche, change in voice and body height, and bone age. The gold standard for assessing bone age is the hand-wrist radiograph; however, Lamparski reported that by reading cervical vertebrae stages, similar accuracy could be attained while avoiding additional radiation [26, 27]. Spampinato used deep learning approaches to assess bone age through hand-wrist radiographs [28]. The dataset contained 1391 X-ray left-hand scans of children up to 18 years old, with bone age values provided by two expert radiologists. The results showed an average discrepancy between manual and automatic evaluation of about 0.8 years. Kok et al. compared different AI algorithms for determination of growth by cervical vertebrae stages [29]. K-nearest neighbors, Naive Bayes, decision tree, artificial neural networks, support vector machine, random forest, and logistic regression algorithms were tested for accuracy. ANN showed the most stable results and was suggested as the preferred method for determining cervical vertebrae stage.

Cleft related studies

Zhang collected blood samples from healthy control infants and infants with non-syndromic cleft lip and palate (NSCL/P) in Han and Uyghur Chinese populations to validate the diagnostic effectiveness of 43 single nucleotide polymorphisms (SNPs) previously detected using genome-wide association studies [30]. Different machine learning algorithms were used to build predictive models with those SNPs, and their prediction performance was evaluated. The results showed that logistic regression had the best performance for risk assessment. Defective variants in MTHFR and RBP4, two genes involved in folic acid and vitamin A biosynthesis, were found to contribute highly to NSCL/P incidence based on feature importance evaluation with logistic regression. This is consistent with the impression that folic acid and vitamin A are essential for reducing the risk of conceiving an NSCL/P baby.

Patcas et al. used a CNN model previously trained on a dataset from a dating site with >13000 face images and
>17 million ratings for attractiveness to compare facial attractiveness between treated cleft patients and controls [31]. Humans rated the attractiveness of controls significantly higher than AI did. Attractiveness scores of treated cleft patients were comparable between AI and human raters. The results suggested that AI still needs to improve its interpretation of cleft features affecting facial attractiveness to become a better tool for evaluating aesthetics.

TMD classification

Shoukri et al. applied a neural network to stage condylar morphology in temporomandibular joint osteoarthritis (TMJOA) [32]. The neural network was trained on 259 condyles to detect and classify the stage of TMJOA, and the result was compared with clinical experts’ classification. Condylar morphology was classified into 6 groups from CBCT images. Predictive analytics of the AI’s staging of TMJOA compared with the repeated clinicians’ consensus showed 73.5% and 91.2% accuracy. The results suggest that TMJOA condylar morphology can be comprehensively classified by AI.

AI has been applied to robotic surgery in neurological, gynecological, cardiothoracic, and numerous general surgical procedures [2]. It is quite promising that AI robotic technologies could be applied to orthognathic surgery in the near future as well. This could reduce the infection rate because only the robot has contact with the patient. Higher precision of jaw movement can be expected at the same time. Last but not least, thanks to the power of technology, diagnostic and therapeutic philosophy is going through a paradigm shift from the traditional ‘signs and symptoms’ approach to the ‘precision medicine’ approach [33, 34]. This starts with deep phenotyping of patients, which gathers not only clinical data but also genetic and biomarker information, and even lifestyle and environmental conditions. Then data cleaning, exploratory data analysis, and feature engineering are conducted by data scientists. Applying AI technology, we can build a diagnostic/prognostic model based on the ‘big data’ and predict treatment results. It is not merely a one-way process: taking the goodness of the predictive result as feedback, we can further fine-tune the previous model and feature engineering process to create a positive feedback loop. In the orthodontic field, the concept of precision medicine means a more complex diagnostic process, more personalized treatment planning, and a more sophisticated treatment process, which might lead to more efficient treatment with fewer side effects and shorter treatment duration. Hopefully, medical quality can be raised while medical costs are decreased through the application and development of AI technology.

CONCLUSION

AI technologies have been increasingly applied to the field of orthodontic treatment. AI has proved to be a reliable and time-saving tool in many aspects. Future effort could be made on creating cloud-based platforms for data integration and sharing. Given that data is the foundation of well-constructed models, with data of high quality and quantity, higher accuracy of predictive results and image interpretation could be achieved through the machine learning process. In terms of orthodontic research, a well-trained AI model can help not only with landmark identification but also with all kinds of linear, angular, and volumetric measurements. Fully automated AI measurements can save tremendous time, so researchers will have more energy to find new insights within clinical examinations.

REFERENCES

1. Nilsson NJ. Artificial Intelligence: A New Synthesis. 1st ed. San Francisco, CA: Morgan Kaufmann; 1998. p. 1-7.
2. Scerri M, Grech V. Artificial intelligence in medicine. Early Hum Dev. 2020 Mar 20:105017. doi: 10.1016/j.earlhumdev.2020.105017
3. Sharkey NE, Ziemke T. Mechanistic versus phenomenal embodiment: Can robot embodiment lead
to strong AI? Cogn Syst Res. 2001;2(4):251-62.
4. Schwendicke F, Golla T, Dreher M, Krois J. Convolutional neural networks for dental image diagnostics: A scoping review. J Dent. 2019;91:103226.
5. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-8.
6. Saba L, Biswas M, Kuppili V, Cuadrado-Godia E, Suri HS, Edla DR, et al. The present and future of deep learning in radiology. Eur J Radiol. 2019;114:14-24.
7. Xie X, Wang L, Wang A. Artificial neural network modeling for deciding if extractions are necessary prior to orthodontic treatment. Angle Orthod. 2010;80(2):262-6.
8. Jung SK, Kim TW. New approach for the diagnosis of extractions with neural network machine learning. Am J Orthod Dentofacial Orthop. 2016;149(1):127-33.
9. Han S. The fourth industrial revolution and oral and maxillofacial surgery. J Korean Assoc Oral Maxillofac Surg. 2018;44(5):205-6.
10. Bouletreau P, Makaremi M, Ibrahim B, Louvrier A, Sigaux N. Artificial intelligence: Applications in orthognathic surgery. J Stomatol Oral Maxillofac Surg. 2019;120(4):347-54.
11. Choi HI, Jung SK, Baek SH, Lim WH, Ahn SJ, Yang IH, et al. Artificial intelligent model with neural network machine learning for the diagnosis of orthognathic surgery. J Craniofac Surg. 2019;30(7):1986-9.
12. Knoops PGM, Papaioannou A, Borghi A, Breakey RWF, Wilson AT, Jeelani O, et al. A machine learning framework for automated diagnosis and computer-assisted planning in plastic and reconstructive surgery. Sci Rep. 2019;9(1):13597.
13. Weichel F, Eisenmann U, Richter S, Hagen N, Rückschloß T, Freudlsperger C, et al. A computer-assisted optimization approach for orthognathic surgery planning. Curr Dir Biomed Eng. 2019;5(1):41-4.
14. Niño-Sandoval TC, Guevara Pérez SV, González FA, Jaque RA, Infante-Contreras C. Use of automated learning techniques for predicting mandibular morphology in skeletal class I, II and III. Forensic Sci Int. 2017;281:187.e1-e7.
15. Patcas R, Bernini DAJ, Volokitin A, Agustsson E, Rothe R, Timofte R. Applying artificial intelligence to assess the impact of orthognathic treatment on facial attractiveness and estimated age. Int J Oral Maxillofac Surg. 2019;48(1):77-83.
16. Hesamian MH, Jia W, He X, Kennedy P. Deep learning techniques for medical image segmentation: achievements and challenges. J Digit Imaging. 2019;32(4):582-96.
17. Broadbent BH. A new X-ray technique and its application to orthodontia. Angle Orthod. 1931;1(2):45-66.
18. Wang L, Gao Y, Shi F, Li G, Chen KC, Tang Z, et al. Automated segmentation of dental CBCT image with prior-guided sequential random forests. Med Phys. 2016;43(1):336-46.
19. Tin Kam Ho. Random decision forests. In: Kavanaugh M, Storms P, editors. Proceedings of the 3rd International Conference on Document Analysis and Recognition; 1995 Aug 14-16; Montreal, Quebec, Canada. Piscataway, NJ: IEEE; 1995. p. 278-282.
20. Chen S, Wang L, Li G, Wu TH, Diachina S, Tejera B, et al. Machine learning in orthodontics: Introducing a 3D auto-segmentation and auto-landmark finder of CBCT images to assess maxillary constriction in unilateral impacted canine patients. Angle Orthod. 2020;90(1):77-84.
21. Nishimoto S, Sotsuka Y, Kawai K, Ishise H, Kakibuchi M. Personal computer-based cephalometric landmark detection with deep learning, using cephalograms on the internet. J Craniofac Surg. 2019;30(1):91-5.
22. Arık SÖ, Ibragimov B, Xing L. Fully automated quantitative cephalometry using convolutional neural networks. J Med Imaging (Bellingham). 2017;4(1):014501.
23. Park JH, Hwang HW, Moon JH, Yu Y, Kim H, Her SB, et al. Automated identification of cephalometric landmarks: Part 1-Comparisons between the latest deep-learning methods YOLOV3 and SSD. Angle Orthod. 2019;89(6):903-9.
24. Kunz F, Stellzig-Eisenhauer A, Zeman F, Boldt J. Artificial intelligence in orthodontics: Evaluation of a fully automated cephalometric analysis using a customized convolutional neural network. J Orofac Orthop. 2020;81(1):52-68.
25. Hwang HW, Park JH, Moon JH, Yu Y, Kim H, Her SB, et al. Automated identification of cephalometric landmarks: Part 2-Might it be better than human? Angle Orthod. 2020;90(1):69-76.
26. Cericato GO, Bittencourt MAV, Paranhos LR. Validity of the assessment method of skeletal maturation by cervical vertebrae: a systematic review and meta-analysis. Dentomaxillofac Radiol. 2015;44(4):20140270.
27. Malina RM, Beunen GP. Assessment of skeletal maturity and prediction of adult height (TW3 method). Am J Hum Biol. 2002;14:788-9.
28. Spampinato C, Palazzo S, Giordano D, Aldinucci M, Leonardi R. Deep learning for automated skeletal bone age assessment in X-ray images. Med Image Anal. 2017;36:41-51.
29. Kök H, Acilar AM, Izgi MS. Usage and comparison of artificial intelligence algorithms for determination of growth and development by cervical vertebrae stages in orthodontics. Prog Orthod. 2019;20(1):41.
30. Zhang SJ, Meng P, Zhang J, Jia P, Lin J, Wang X, et al. Machine learning models for genetic risk assessment of infants with non-syndromic orofacial cleft. Genomics Proteomics Bioinformatics. 2018;16(5):354-64.
31. Patcas R, Timofte R, Volokitin A, Agustsson E, Eliades T, Eichenberger M, et al. Facial attractiveness of cleft patients: a direct comparison between artificial-intelligence-based scoring and conventional rater groups. Eur J Orthod. 2019;41(4):428-33.
32. Shoukri B, Prieto JC, Ruellas A, Yatabe M, Sugai J, Styner M, et al. Minimally invasive approach for diagnosing TMJ osteoarthritis. J Dent Res. 2019;98(10):1103-11.
33. König IR, Fuchs O, Hansen G, von Mutius E, Kopp MV. What is precision medicine? Eur Respir J. 2017;50(4):1700391.
34. Jheon AH, Oberoi S, Solem RC, Kapila S. Moving towards precision orthodontics: An evolving paradigm shift in the planning and delivery of customized orthodontic therapy. Orthod Craniofac Res. 2017;20(Suppl 1):106-13.