International Journal of Information Management Data Insights 3 (2023) 100175

How machine learning is used to study addiction in digital healthcare: A systematic review
Bijoy Chhetri a, Lalit Mohan Goyal a, Mamta Mittal b,∗
a Department of Computer Engineering, J C Bose University of Science and Technology, YMCA, Faridabad, India
b Delhi Skill and Entrepreneurship University, New Delhi, India

Article info

Keywords: Machine learning; Alcohol addiction; Random forest; Risk; Digital healthcare

Abstract

Long-term use of drugs can sometimes result in brain damage that greatly affects a person's psychology and can sometimes make their behaviour indecent. This paper examines psychological disorders caused by substance abuse by reviewing literature that involved machine learning (ML) models. Brain imaging, behavioural kinematics, and memory analysis are studied to gain insights into substance use and its disorders. The review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. To help better screen, diagnose and monitor such disorders, ML identifies early onset of substance intake as a predictor of disorders. The study measures identified in the articles (N=26) illustrate the exclusive use of ML to bring out insights into substance use disorders. Brain-related factors, behavioural phenotypes, and functional differentiation of the brain can express a great deal about disorders. The findings also give insights into various research levels, classification techniques, performance measures, challenges, and future directions related to the use of ML. Random forest models are the most widely used for better performance. In addition, the diversity of interviews, questionnaires, brain imaging and the latest digital tools is part of this review. A longitudinal study with clinical validation could open up new models to explore substance use disorders.

Received 27 June 2022; Received in revised form 28 March 2023; Accepted 29 March 2023

1. Introduction

Substance abuse is one of the causes of many mental health issues that are becoming increasingly common. The dictionary suggests that substance abuse can be a fact or a state of habit. Addiction makes it difficult to control the desire to engage in an activity, whether that is using a smartphone, eating food or drinking alcohol. Behavioural dependence occurs when a person is involved in activities such as social networking, games, overeating, and watching movies. Drugs lead to addiction and chemical dependency when consumed beyond the prescribed amount. This issue is dealt with in two diagnostic manuals, the Diagnostic and Statistical Manual of Mental Disorders (DSM) and the International Classification of Diseases (ICD) (Diagnostic & Statistical Manual of Mental Disorders, 2021). Addiction typically results in mental instability and psychological problems. According to the global report on drugs, opioids, alcohol and tobacco are among the most commonly used substances worldwide (United Nations Office on Drugs and Crime (UNODC), World Drug Report 2017). The 43.4 million Americans who use drugs suffer from depression, other mental illnesses and even death (Abuse, 2020). Meanwhile, 14.6% of the Indian population use alcohol, 2.8% use cannabis and 1.14% use opioids such as heroin, chemical opioids and opium (Ambekar et al., 2019).

With concurrent mental illness, substance use is dangerous due to mortality rates and complexity (Peacock et al., 2018). It would save many lives if a variety of measures were taken early on to diagnose and isolate patients. In general, clinical evaluation under expert supervision is the most commonly used diagnostic method (Joseph, 2019). However, this type of examination takes time and requires human observation with special equipment and tools. Other constraints like stigma and budget keep most patients from accessing medical facilities. Advanced technologies such as Artificial Intelligence (AI) can therefore play a critical role in the practical evaluation and diagnosis of dependencies (Sarker, 2021). Clinical diagnoses can be made more accurate with the help of digital healthcare. Machine learning (ML) is one of the most versatile AI methods and is beneficial for processing large data sets under austere conditions (Baker et al., 2018). Substance abuse can also be interpreted and modelled with ML (Xie et al., 2021).

A Substance-Induced Disorder (SID) causes abnormal thoughts, feelings and behaviours, resulting in high levels of stress and confusion in daily life. Such disorders range from intellectual difficulties to mood swings, anxiety, depression and schizophrenia, and are detected and monitored by the specific treatment mechanisms that are available (Mauri et al., 2017; Symons et al., 2019). Healthcare professionals use questionnaires and interviews (e.g., structured or semi-structured) to assess a patient's physical, behavioural, and emotional characteristics (Lynskey & Strang, 2013). With these methods, a variety of factors contribute to the classification of disorders, and major features are sometimes neglected through human error (Ferreri et al., 2018). These disorders can also be identified through images, scans, games, or even electroencephalogram (EEG) and Magnetic Resonance Imaging (MRI). Medical informatics remains on the cutting edge of modern and emerging technologies. The medical datasets produced are multifaceted, unstructured and variable. An analysis of how computer methods such as ML can help health care workers shows how important they are in digital health care (Thieme et al., 2020).

The collection, clean-up, transformation, analysis and storage of such multi-dimensional, unstructured and variable datasets are critical components of the technological response. ML models learn trends from existing data and predict the results of future samples. By exploring different developmental horizons, it is found that the use of ML in addiction studies is still in its infancy and has enormous potential. A variety of parameters and attributes are selected from the data during training, after which the algorithm is optimized and tested on subsets of the same dataset. Samples from the case study are used for validating the algorithm and the model generated from the data set. Performance measures such as accuracy, sensitivity and specificity are commonly used for measuring model effectiveness. This helps to analyse big data in distinct forms using high-performance computing algorithms to derive diagnostic solutions, early prediction, and personalized treatment (Rehman et al., 2022). In diagnostics, the challenge always lies in model efficiency, monitoring strategies and accuracy. An early intervention and recovery support system is used for a range of substance use disorders. Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), and neural-network-based Deep Learning (DL) models can effectively understand and extract patterns from patients' information (Yahyaoui et al., 2019).

Further research in this area will make it easier to treat mental health problems, relapses and follow-ups. A review of the digital healthcare literature using ML in the synthesis of disorders is presented in this article to explore the breadth of knowledge. It also covers the effects of drugs (legal/illicit), as well as the various disorders that may occur. Therefore, this article looks at the available drugs and their effects on human consumption, as well as how ML strategies can be applied to detect and monitor drug use. Additionally, it describes ML approaches for unravelling addiction diseases, their benefits, challenges, datasets, and applications. Health professionals should recognize the complex challenges associated with developing ML applications. To underline the lack of research in this field, the authors have focused on existing developments in order to monitor future work and encourage greater participation. Throughout the review, the following research questions have been addressed:

RQ1: Which substances are classified as licit/illicit drugs and how are they consumed?
RQ2: What are the different methodologies used to understand substance abuse?
RQ3: Do articles that use ML methods to study substance abuse publish often?
RQ4: What are the various ML methods used and how do they work in addiction studies?

The organization of this article is as follows. Section 2 presents the impact of substance consumption as background for the review process. Section 3 presents the review methodology followed for this study. Section 4 gives a comprehensive review of the methodologies used in addiction studies and their characteristics, and explains how machine learning models reveal substance abuse. Section 5 presents a comparison of the articles based on parameters identified by the authors. Finally, Section 6 discusses the review analysis and its limitations, and Section 7 presents the conclusion.

2. Substance types and impacts

There are many reasons why people consume drugs, including pleasure, amusement, recreation, and medicine (Kalant, 2001). Amongst pleasure chemicals, alcohol has existed since ancient times (Crocq, 2007), and even chewing gum and cigarettes contain caffeine, cocaine and nicotine (BHATT, 2019). In addition, painkillers, cough soothers and recreational drugs are sold by prescription but are used illegally, and some drugs may be purchased over the counter (Manchikanti, 2007; Schaper & Ebbecke, 2017). These substances trigger arousal by diverting neuro-receptors and transmitters. A substance of abuse takes the place of the natural enjoyment of simple activities like shopping or eating. Consequently, addicts show abnormal behaviour and physical changes.

Drugs like tetrahydrocannabinol (THC), cannabinoids, opioids and ethanol disturb normal brain function. In addition to boosting dopamine levels in the body, these chemical compounds also cause cravings by transmitting signals of pleasure and satisfaction. Many studies have shown that such psychoactive substances may cause a variety of diseases and psychological imbalances, under their influence or during abstinence (Daley, 2016). Globally, cocaine with THC is the second most widely used illegal drug after cannabis with cannabinoids; they have stimulant-like effects on the brain and are prone to addiction (Walker & Nestler, 2018). In almost every country, opiates are used illegally as painkillers and cough suppressants. Youth aged 18 to 25 use these drugs most often, increasing the death rate associated with dependence (Koechl et al., 2012).

Substance use disorders are observed throughout the lives of those affected by them. Some diseases are predominantly mental illnesses, also referred to as psychological disorders (Chhetri et al., 2020). Most early adopters tend to go beyond the normal intake and become addicted, which can lead to a variety of psychological disorders. In addition, substance abuse primarily results in brain damage, mental trauma, excessive expectations and obsessions, all of which contribute to mental illness. Many of these conditions are assessed or classified through clinical evaluation as psychological disorders based on substance abuse. Various studies have shown that such behaviour disorders are not caused by drug abuse, but by the disorder itself (Alavi et al., 2012; Castillo-Carniglia et al., 2019). There are three categories of illnesses: Parental Child-Induced (PCI), addicted adults who exhibit Substance Use Disorders (SUD), and recovering addicts with Substance Withdrawal Symptoms (SWS) (Ooms et al., 2021). In the ICD, these disorders are referred to as psychological disorders due to substance abuse and are also diagnosed based on the DSM diagnostic criteria. Table 1 provides details on the types of drugs, their effects and the classification of disorders according to the DSM and ICD handbooks.

3. Review methodology

3.1. Document search

Document selection was performed through three different library sources, namely MEDLINE (PubMed), Google Scholar and Scopus, with a search query containing Boolean operators. The search terms "substance use disorder AND machine learning", "addiction AND psychological disorders AND machine learning", and "machine learning in substance use" were used. The article selection criteria follow the PRISMA guidelines (Moher et al., 2009).

3.2. Inclusion and exclusion criteria

A wide variety of articles on human psychological disorders resulting from substance abuse, drug addiction, and alcohol abuse are included. The authors consider publications in English with full text available. EEG analogue signals, MRI, and functional MRI (fMRI) brain imaging, as well


Table 1
Classification of drugs and their ill effects on the human body.

| Sl No | Author name | Types of drug | Psychoactive constituent | Ill-effects | Disorders |
|---|---|---|---|---|---|
| 1 | Kalant (2004), Hall & Degenhardt (2009), Volkow et al. (2016), Ishida et al. (2020), Castaño et al. (2019) | Cannabis (legal for medicinal purposes) | THC, cannabinoids | Euphoria, dysphoria, sedation, changes in the perception of time, sensory functions, impairment in motor control and short-term memory, dry mouth, tachycardia, postural hypotension | Obsessive Compulsive Disorder, Anxiety Disorder |
| 2 | Gahlinger (2004), Shand et al. (2011), Rummans et al. (2018), Cheng et al. (2018), Lalic et al. (2019), Degenhardt et al. (2019), Dydyk et al. (2020), Jin et al. (2020) | Natural opioids (morphine, codeine, thebaine); semi-synthetic opioids (hydrocodone, oxycodone, and oxymorphone, morphine derivative); synthetic opioids | Heroin, morphine, codeine, and thebaine, methadone, tramadol | Feeling rush, hypoxia, insomnia, muscle & bone pain | Personality Disorder, Anxiety Disorder |
| 3 | Gouzoulis-Mayfrank & Daumann (2009), Steinkellner et al. (2011), Sitte & Freissmuth (2015), McKetin et al. (2019), Van Amsterdam et al. (2020) | Amphetamine derivatives and ecstasy | Stimulants, ephedrine, pseudoephedrine | Stimulating and hallucinatory activity, mood control, increased heartbeat, hypertension, cerebral cell degeneration | Addictive Disorder, Personality Disorder |
| 4 | Nestler (2005), Reed & Evans (2016), Pasha et al. (2020) | Cocaine | Cocaine, stimulants, caffeine, amphetamines, methylphenidate, fentanyl, carfentanyl | Stimulating central nervous system (loss of contact with reality, an intense feeling of happiness, or agitation), anxiety, increased motor activities | Bipolar Disorder, Psychotic Disorder |
| 5 | Cargiulo (2007), Guggenmos et al. (2020), Afzali et al. (2019) | Alcohol | Ethanol | Depression, hypomania, phobias, hypertension, mood disorder, anxiety, personality disorders | Depressive Disorder, Trauma & Stress Disorder |

as behavioural kinetics-based studies, are included. Furthermore, peer-reviewed articles on the prevalence and severity of substance abuse were analysed. Articles published before 2012, reviews, opinion articles, and editorial letters were excluded. In addition, studies on the tobacco content of livestock have been excluded.

3.3. Article selection

There were 551 studies found after the initial search (n = 269 by PubMed, n = 153 by Google Scholar, n = 129 by Scopus). A careful review of the title and summary of each study found that 424 of them were ineligible, 17 were removed as duplicates, and three did not have enough data. 107 studies were therefore screened further. Of these, 48 were excluded due to animal studies (n = 13), unpublished status (n = 7), case reports (n = 18), and nicotine studies (n = 10). Ultimately, 40 items met the eligibility criteria for the review process. However, only 26 studies are included in the final review since they are based on genomics, neuropsychiatry and neurological studies (Fig. 1).

3.4. Article distribution

Figure 2 represents the categories of substance types and the ML techniques and algorithms that researchers have implemented. To address the ML research gap on substance use, it is important to understand the model implementation procedure and its effectiveness. Different ML model studies, including supervised, unsupervised, and deep learning, have been incorporated. Based on detailed examples from relevant literature, the findings provide a quantitative and descriptive summary. In addition to identifying gaps in the literature, PRISMA was chosen to reflect the scope and depth of the reported trends and challenges. Through the selection and analysis of articles, the authors found that articles were distributed by substance type. The key items considered for the review process are alcohol-based, followed by polydrug, opioid, and cannabis. For the most part, RF and SVM models are used for both text and clinical observations. A large portion of the research involved neural network architectures applied to brain image analysis.

Figure 3 shows the distribution of articles by data source and by the outcomes of the ML model. The largest group of articles is separated according to data collection strategy, including clinical evaluation, in which patients are evaluated psychologically through a variety of tests and interviews. The latest advances in ML image analysis are also being applied to MRI, fMRI, EEG and various social media images. The outcomes predicted by the ML algorithms mostly concern early prediction of misuse, with anxiety and depression as the major co-occurring conditions in substance use.

3.5. Article publication frequency

The publication rate of addiction articles is determined through keyword searches on PubMed and other databases. Many search queries have been sorted and filed manually to explore details about these disorders. Figure 4 shows the network diagrams of substance abuse and ML applications to identify associated conditions. Drug and alcohol addiction, mental health issues, substance abuse, and seizures are all related to addictions. The database search results depict the diagnosis of substance use disorder over the last ten years, as well as its association with the use of ML. There is a variety of study types, samples, demographic data, and substance-specific research and publication trends. However, the focus here is addiction and its effects as studied with ML techniques, although some articles refer to disorders with other comorbidities. ML has been found to be more effective in behavioural studies of drug users. When categorizing subjects into different classes by study purpose, RF and SVM are often used in addiction research. Another application of neural networks is the use of psychological tools for the evaluation of medical conditions.


Fig. 1. PRISMA flow diagram for new systematic reviews which included searches of databases.
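The selection counts reported in Section 3.3 can be re-added with a few lines of Python. The sketch below is only a bookkeeping aid: the dictionary keys and variable names are illustrative labels, not terms from the PRISMA statement, and the numbers are those quoted in the text. Subtracting the four stated exclusion groups from the 107 screened studies leaves 59, whereas the text reports 40 eligible items; the text does not itemise the remaining difference of 19.

```python
# Counts quoted in Section 3.3; keys are illustrative labels only.
identified = {"PubMed": 269, "Google Scholar": 153, "Scopus": 129}
removed_at_title_summary = {
    "ineligible": 424,
    "duplicates": 17,
    "insufficient data": 3,
}
excluded_after_screening = {
    "animal studies": 13,
    "unpublished status": 7,
    "case reports": 18,
    "nicotine studies": 10,
}

# Totals derived from the stated counts.
total_identified = sum(identified.values())
screened_further = total_identified - sum(removed_at_title_summary.values())
excluded_total = sum(excluded_after_screening.values())
left_after_stated_exclusions = screened_further - excluded_total

# Figures reported in the text for the last two stages.
reported_eligible = 40
reported_included = 26

print("identified:", total_identified)
print("screened further:", screened_further)
print("excluded after screening:", excluded_total)
print("left after stated exclusions:", left_after_stated_exclusions)
print("not itemised in the text:", left_after_stated_exclusions - reported_eligible)
print("included in final review:", reported_included)
```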

Fig. 2. Article distribution based on substance consumption (left) and ML models involved (right).

Fig. 3. Article distribution based on the data source (left) and model outcomes (right).

Fig. 4. Publication rate of articles with ML applications (left) and substance addiction (right).

4. Outcomes of the review

In this section, the authors discuss the outcome of the review of articles. Table 2 presents the main characteristics of the studies that are reviewed. It includes authors, objectives of study, sample size, sample characteristics, addiction measuring tools and the performance metrics of each article. It is evident from the review that addiction due to substance consumption is studied with four different methodologies, namely brain imaging, brain electrical signals, memory-based assessments and behaviour-based assessments. In addition, clinical observation and counselling are also relevant methods, as shown in Fig. 5. Drug use can also be detected with retinal scans (Peragallo et al., 2013), face recognition (Menon et al., 2019), momentary assessments (Kim et al., 2019), speech interpretation (JOHNSTON, 2016), and handwriting analysis (Phillips et al., 2009). The authors have classified the adopted studies into four categories (brain images, electrical signals, memory, behavioural) based on their characteristics and derived objectives. Further, an explanation of how ML methodologies are used to analyse substance addiction is also presented. The following subsections explain each of these categories along with the usage of ML methods.

4.1. Brain images

Brain images classify patients according to their cellular structures and determine how dependency affects the density, thickness, colour, size and volume of brain tissue. However, there is little overlap among psychiatric disorders, as demonstrated by the results of cross-disorder studies (Navarri et al., 2022). It is reported that people with addictive disorders show low gray matter volume in various brain structures: the prefrontal cortex, dorsal striatum, insula, and posterior cingulate cortex (Xiao et al., 2015). In the central nervous system, gray matter is composed of neural cells, tissues and synapses that determine the level of substance. Similar results were reported for persons with chemical dependence, identifying six striatal regions of interest, low functional connectivity strength and high family risk (Ersche et al., 2020).

Brain circuits and their abnormalities are key concepts for substance use studies. In a person with SID, the frontal and dorsal striatal circuits, which process habits and regulate executive function, are functionally deficient (Klugah-Brown et al., 2020). Drug abuse also increases the levels of dopamine in these pleasure pathways in a significant and rapid manner (Parvaz et al., 2017). The addiction working group of the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) consortium, the world's largest working group on substance abuse, has found that people who use drugs have smaller brains (Mackey et al., 2019). An ENIGMA mega-analysis of cocaine-specific addiction data concludes that there are changes in the brain power of cocaine users (Rabin et al., 2020). The T1-weighted morphometric analysis in this study demonstrates changes in gray matter.

In adolescents exposed to cannabinoids, the volume of gray matter varies (Demirakca et al., 2011). It is not surprising that heroin addiction is associated with changes in gray matter and with damage or maladaptation affecting inhibitory control, reward, visual functions and other brain functions (Shi et al., 2020). Another study revealed a reduction in gray matter volumes and poor connectivity in prefrontal areas (Parvaz et al., 2016). The fMRI examinations of cocaine users investigate the position and function of their brain structures to determine how substance use affects the size and function of the brain. However, the study suggests further research into the cognitive correlates of substances.

4.2. Brain electrical signals

The study of electrical signals in the brain can also shed light on substance abuse. Studies of electrical signals in the brain have set a reference point for detecting drug-induced deficiencies in people. One model matches electrical signals from the brain to behavioural patterns in order to categorize individuals as substance users or non-users (Turnip et al., 2018). Another study showed evidence of classifying a patient into a group of depressive or non-depressive individuals using EEG signals (Hosseinifard et al., 2013).

The linear and non-linear characteristics of EEG electrical signals create a parametric system that describes how the brain responds to drugs. Furthermore, the pleasure circuit, sensation and severity of the damage were revealed (Acharya et al., 2012). Likewise, an EEG-based study identified clusters of depressive and non-depressive patients with SID (Ding et al., 2019). Based on historical and clinical data, a multi-modal decision-making system that includes one of the methods described above can be built. ML models are also enhanced with clinical information such as urine test results, blood sample reports, temperature, and pressure.
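To make the distinction between linear and non-linear EEG characteristics concrete, the sketch below computes one feature of each kind from a one-dimensional signal. The review does not name the specific features used in the cited studies, so both choices are assumptions made for illustration: variance stands in for a linear feature and sample entropy for a non-linear one. The function names, the parameters `m` and `r`, and the synthetic signals are likewise hypothetical; no real EEG data is used.

```python
import math
import random
import statistics


def variance_feature(signal):
    """Linear feature: population variance of the signal."""
    return statistics.pvariance(signal)


def sample_entropy(signal, m=2, r=0.2):
    """Non-linear feature: sample entropy with template length m.

    The tolerance is r times the signal's standard deviation, and
    template distance is the maximum absolute difference (Chebyshev).
    """
    n = len(signal)
    tolerance = r * statistics.pstdev(signal)

    def matching_pairs(length):
        # Use n - m templates for both lengths so the counts are comparable.
        templates = [signal[i:i + length] for i in range(n - m)]
        pairs = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= tolerance:
                    pairs += 1
        return pairs

    b = matching_pairs(m)
    a = matching_pairs(m + 1)
    if a == 0 or b == 0:
        return math.inf  # no matches: entropy is undefined, treated as maximal
    return -math.log(a / b)


# Synthetic stand-ins for EEG traces: a regular oscillation and random noise.
rng = random.Random(0)
regular = [math.sin(2 * math.pi * i / 20) for i in range(200)]
noisy = [rng.uniform(-1.0, 1.0) for _ in range(200)]

for name, trace in (("regular", regular), ("noisy", noisy)):
    print(name, variance_feature(trace), sample_entropy(trace))
```

A regular oscillation repeats its own patterns, so its sample entropy is lower than that of the noise trace. Feature vectors of this kind are what a classifier such as RF or SVM would receive as input.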


Table 2
Major characteristics of the ML-based substance articles.

No Author(s) Substance Objective of Study Sample Sample Characteristics Measurement of Addiction Model
Studied Size Performance

1 Beattie & Nicholson (2021) Heroin Feature extraction for heroin use 56,897 Participant’s responses to Variable importance scores Precision = 0.69
classification questions in the survey F- measure = 0.53
2 Marcon et al. (2021) Alcohol To understand the pattern of 4840 Medical students AUDIT questionnaire Precision = 0.70
high-risk drinking Sensitivity = 0.75
Specificity =0.79
3 Jing et al. (2020) Illicit drug To test whether an ML can 700 Individual subjects were Drug use screening tools, P < 001
identify health, psychological, recruited and assessments sensation, impulsivity & AUC: 0.85
psychiatric, and contextual were conducted at 10–12, personality test scales
features to predict SUD. 12–14, 16, 19, and 22 years
of age.
4 Segal et al. (2020) Opioid To create a prediction model and 550 000 Individuals with OUD DSM criteria for OUD AUC: 0.95
algorithm for early diagnosis of Sensitivity: 0.85
opioid use disorder (OUD) Specificity: 0.88
5 Wadekar (2020) Opioid To understand the OUD of the 56,000 National drug survey Response on the use of AUC:.89
general population respondents opioids, cannabis & Sensitivity: 0.81
alcohol. Specificity: 0.81
6 Dou et al. (2021) Poly Harness a risk of Substance use 158 Individuals who are homeless Bag of words with word AUC: 0.74
Substance among homeless youth who own social network use and answer of a survey r value: +ve
profiles & Posts on substance use.
7 Nasir et al. (2021a,b) Poly Hypothesis generation for 226,940 Discharge report of Treatment outcomes, AUC: 0.89
Substance substance use disorder treatment individuals 12 yrs. and older Interaction effects among Precision: 0.77
success and effects. who are SUD patients. variables, Recall: 0.91
8 Kamarajan et al. (2020) Alcohol Identify specific features of brain 68 Individuals with lifetime Connectivity across the AUC: 0.76
connectivity and classify them as AUD as prescribed by region in the Brain, P<.001
alcohol used disorders (AUD) DSM-IV criteria. neuropsychological score,
Impulsiveness & visual
span test scores
9 Lee et al. (2019) Alcohol To predict whether a person with 778 Individuals with AUD In-person clinical screen, Accuracy: 0.86
AUD seeks treatment. the structured clinical Kappa: 0.57
interview for DSM
disorders, bio markers
10 (Mudalige Dhanushka S. Cannabis To create a risk prediction model 94 Regular cannabis user Sensation, impulsivity, and AUC: 0.65
Rajapaksha et al., 2020) for the cannabis user personality scales Sensitivity: 0.79
Specificity: 0.50
11 Guggenmos et al. (2020) Alcohol To classify the alcoholic and 119 Participants meeting criteria Composite international AUC: 0.76
healthy subjects using an image of alcohol dependence diagnostic interview, gray Sensitivity: 0.79
classifier according to ICD-10 and matter density, and Specificity: 0.74
DSM-IV increased cerebral fluid
12 Mackey et al. (2019) Alcohol, Identify general & 2140 Substance dependent Thickness and volume of P<.05
nicotine, substance-specific regional effects individual gray matter in the brain AUC: 0.86
cocaine, of the substance in brain cells region Cohen’s D:
metham- using gray matter study. Negative
13 Symons et al. (2019) Alcohol To predict the outcome of alcohol 780 Patients who had undertaken ML model supersedes the AUC: 0.74
dependence treatment a 12-week, abstinence-based clinicians in predicting the Sensitivity: 0.73
CBT for alcohol dependence treatment outcome. Specificity: 0.77
14 Kinreich et al. (2021) Alcohol To predict alcohol remission 1376 Patients with AUD Neuroticism, depression, Accuracy: 0.86
aggression, years of
education, and alcohol
consumption phenotypes
predict the remission of
alcohol dependent.
15 Panlilio et al. (2020) Cocaine and A clinical trial to observe the use 358 Participants who are Self-report DSM Cohen’s Kappa: 84
opioid of cocaine and opioid during the receiving outpatient questionnaire and urine
treatment. treatment with methadone analysis
16 Kalyanam et al. (2016) Prescription Identify, analyze, and understand – Tweets from the social Degree of Polydrug use –
Drugs the trends in the use of networking user @Twitter
non-medical use of drugs.
17 Zoboroski et al. (2021) Marijuana To determine the risk of 1885 Respondents were recruited THC Accuracy: 0.86
marijuana use to respond to a questionnaire Sensitivity: 0.87
about their substance use. Specificity: 0.79
18 Dipnall et al. (2017) Opioid To detect a depression cluster in – National health and nutrition PHQ, survey item on –
life style, environ variables examination survey substance intake
(NHANES) survey
(continued on next page)

B. Chhetri, L.M. Goyal and M. Mittal International Journal of Information Management Data Insights 3 (2023) 100175

Table 2 (continued)

| No | Author(s) | Substance Studied | Objective of Study | Sample Size | Sample Characteristics | Measurement of Addiction | Model Performance |
|---|---|---|---|---|---|---|---|
| 19 | Capecci et al. (2015) | Opioid | To understand the functional changes in the brain of patients with opioid intake | – | Consented participants who are patients and healthy controls | DSM questionnaire | – |
| 20 | Dong et al. (2021) | Opioid | To predict OUD for patients on opioid medications using electronic health records and deep learning methods | – | Patients under medication containing active opioid ingredients | ICD codes for opioid disorder | Precision: 0.81; Sensitivity: 0.78; F-measure: 0.80; Optimizer: ADAM |
| 21 | Mukhtar et al. (2021) | Alcohol | Analysis of the brain EEG signal for binary classification of alcoholic and non-alcoholic subjects | 122 | Alcoholic and control group of samples | Raw EEG signal | Precision: 1; Sensitivity: 0.98; F-measure: 0.98; Optimizer: |
| 22 | Ebrahimi et al. (2021) | Poly drug | Classify the subjects into normal, hazardous, and harmful drinkers using EHR and relay study dataset | 2571 | Patients diagnosed with substance use disorder | AUDIT questionnaire | Precision: 0.88; Sensitivity: 0.88; F-measure: 0.88; Optimizer: ADAM |
| 23 | Hassanpour et al. (2019) | Alcohol, prescription & illicit drug | Identify substance use risk among individuals | 2287 | Active social network users | NIDA Modified ASSIST Risk Class | Precision: 0.68; Sensitivity: 0.76; F-measure: 0.72; Stochastic GD |
| 24 | Menon et al. (2019) | Alcohol | To recognize the sobriety of a driver using a thermal image of the face | 41 | Male and female face images taken with a series of red wine intake | Temperature pattern around eyes and nose | Precision: 0.87; Optimizer: EM |
| 25 | Wang et al. (2018) | Alcohol | Detecting alcoholism using MRI scan images | 235 | Long-term chronic alcoholic persons and non-alcoholic control persons | AUDIT questionnaire | Precision: 0.97; Sensitivity: 0.96; Optimizer: IPSO |
| 26 | Zhou et al. (2017) | Poly drug | Predicting any risky behavior in the future using a social network dataset | 2362 | Social network Instagram users | Text containing positive risky drug consumption matters, hash tags | Precision: 0.89; Optimizer: MSE |
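
The precision, sensitivity, and F-measure columns of Table 2 are related to one another. As a minimal sketch, assuming the reported F-measure is the usual F1 score (the harmonic mean of precision and sensitivity, which the reviewed articles may or may not have used in exactly this form), the value can be recomputed from the other two columns. The function name `f_measure` is illustrative and not taken from any of the reviewed studies.

```python
def f_measure(precision: float, sensitivity: float) -> float:
    """Return the F1 score, the harmonic mean of precision and sensitivity."""
    total = precision + sensitivity
    if total == 0:
        return 0.0  # both inputs are zero; the score is taken as 0 in this sketch
    return 2 * precision * sensitivity / total


# (precision, sensitivity) pairs copied from rows 22 and 23 of Table 2
reported = {22: (0.88, 0.88), 23: (0.68, 0.76)}

for row, (precision, sensitivity) in reported.items():
    print(f"row {row}: F-measure = {f_measure(precision, sensitivity):.2f}")
```

Under this assumption, the recomputed values for rows 22 and 23 (0.88 and 0.72) agree with the F-measure column. Small differences for other rows would be expected if the original studies computed the score from unrounded precision and sensitivity.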

4.3. Memory-based activities

Memory-based activities identify deficits in sensitivity and attention in people with SID. The ability to pay attention, solve problems, reason abstractly, and process information is measured using memory-based activities and assessments. Research has used a variety of performance tasks to investigate working memory and its brain mapping operations. This is to understand what is happening within an individual when that individual is influenced by a drug (Sweeney et al., 2018). Additionally, cannabis users showed diminished memory performance during associative memory tasks (Jager et al., 2007). The Tower of London test has been used extensively in research to understand memory performance, impulsivity, and psychological performance (Kamarajan et al., 2020). Patients with SID can also benefit from memory-based tests in identifying and evaluating treatment effectiveness.
To study the psychology of addiction, one must measure urge and impulse (Miranda Jr et al., 2019). In addition, the digit span, reading span, Stroop, and symmetry tasks help understand substance consumption (Wanmaker et al., 2018). Other decision-making tasks and risk adjustment procedures help researchers assess the memory activities of the person with SID. The memory-based evaluation showed better results for the technological intervention involving ML (Ahn & Vassileva, 2016). In cocaine-addicted patients with SID, ML approaches identify impulsiveness as the main predictor (Ahn et al., 2016).

4.4. Behavioral

Psychological and physical assessments are critical for medical research, treatment planning and referrals. Based on the DSM, substance-induced disorders are evaluated in different ways depending on the objective. Informal clinical interviews can be used for many purposes, including third-party diagnosis and formal physical counselling. At the same time, structured interviews are reliable and provide adequate information for numerous clinical and research purposes.
Table 3 presents several well-structured, standardized, reliable and validated evaluation tools. In Hassanpour et al. (2019), substance use was discussed on social media. Qualitative and quantitative behaviours are analysed through interviews and questionnaires. By studying behavioral factors in the brain, responders can be alerted to the risks, treatment and substance abuse (Baldacchino et al., 2015). A behavioural measure called temporal discounting can provide insight into drug consumption habits, addiction severity, and treatment changes (Estevez et al., 2017). The development of addictive behaviour seems to imply emotional control and attachment (Rajapaksha et al., 2020).

4.5. ML role in addiction studies

It is essential to use the latest technologies, such as ML methods, when diagnosing psychological disorders caused by substance abuse. ML applications require data sets obtained from the various models discussed previously. Images, texts, electrical signals, surveys, and reports are commonly used. Figure 6 illustrates the taxonomy of ML approaches, which are computer-aided approaches for synthesizing and analyzing datasets faster and more effectively. ML interventions aim to identify, validate, and classify diseases during the three phases of substance intake. Models of diagnosis and treatment include personalized intervention strategies, long-term remote monitoring via technology, and personalized medical models. The data are cleaned, and the ML models are trained and tested. A correlation is observed between brain functionalities, thought provocation, behavioral changes, and severity and craving of substance use to observe a profile or pattern (Jardine & Lindner, 2020). ML architecture can predict the severity and prevalence of substance abuse and addiction in some national and international surveys. It is reported that few such models identify demographic, nationality, and gender factors that


Fig. 5. Addiction Analysis Models as per the literature review outcome.

Table 3
The battery of Psychological and SID assessment tools.

| Tools | Purpose |
|---|---|
| NIDA modified ASSIST (Hassanpour et al., 2019) | Quick screening of drug intake, like the Alcohol, Smoking, and Substance Involvement Screening Test (ASSIST) |
| Alcohol Use Disorders Identification Test (AUDIT) (Ebrahimi et al., 2021) | Screening test for alcohol use disorders |
| Opioid risk screening tool (Dong et al., 2021) | Quick risk assessment tool for opioid abuse among individuals prescribed opioids |
| Drug use screening questionnaire (Jing et al., 2020) | Questionnaire to screen drug problems |
| Drug abuse screening test (Fehrman et al., 2019) | Screening tool to assess drug use |
| Psychopathological rating scale (Lee et al., 2019) | Measures a psychological parameter of the subject |
| Screening questions on drug consumption (Dou et al., 2021) | To determine the frequency of substance use to categorize the severity |
| Cocaine or opioid dependence questionnaires (Panlilio et al., 2020) | Consists of frequency-of-use questions to identify substances like cocaine |
| Barratt impulsiveness scale (Kamarajan et al., 2020; Lee et al., 2019) | A questionnaire designed to assess the personality/behavioral construct of impulsiveness |
| Generalized anxiety disorder (Chhetri et al., 2020; Dong et al., 2021) | Measures or assesses the severity of generalized anxiety disorder |
| Patient health questionnaire (PHQ) (Dipnall et al., 2017; Marcon et al., 2021) | A screening and diagnostic tool for mental health disorders |
| Conner's behavioral rating scale (Jing et al., 2020) | Used for the assessment of behavior through a questionnaire |
| Psychometric assessment (Symons et al., 2019) | Used to test cognitive function along with tests of orientation, attention, and memory |
| Disruptive behavior scale (Jing et al., 2020) | Assessment tool for disruptive behavior |
| Zuckerman-Kuhlman personality questionnaire (Rajapaksha et al., 2020) | Used to reliably measure a person's level of consciousness using a personality questionnaire |
| Sensation-seeking scale (Zoboroski et al., 2021) | Questionnaire to understand the tendency to seek new and risky experiences |

influence the detection of depression clusters due to substance consumption (Dipnall et al., 2017). Based on lifestyle and environment variables, this study identifies groups of depressed patients. Five-factor personality traits were synthesized using ML to classify individuals with a particular drug intake (Fehrman et al., 2019).

4.5.1. Supervised machine learning algorithms
Supervised machine learning techniques are attractive because they can be applied in almost any setting. These algorithms use labelled input and output data to create a model for predicting the response, particularly in substance use studies. Depending on subjects' consumption patterns, history of disorders, and symptoms, the models classify them into different addiction types and SID classes. Clinicians can develop effective digital health care systems and strategies using data extracted from patients' information through such algorithms. Data from the National Survey on Drug Use and Health (NSDUH) is analysed with ML to determine heroin intake (Beattie & Nicholson, 2021). While heroin use among respondents is highly imbalanced, RF techniques show distinctive attributes that emerge from classification models. The web-based survey revealed that high-risk drinking causes disorders among medical students (Marcon et al., 2021). In a longitudinal study, an RF algorithm predicted a substance use disorder from thirty psychological,


Fig. 6. ML Approaches Taxonomy to classify the review.

social, environmental and health behaviours (Jing et al., 2020). Participant phenotypes and ecological factors play an important role in disease forecasting.
Through conversations on social networks and survey responses, the SVM classifier determined that homeless youth were at risk of substance abuse (Dou et al., 2021). Using a hospital discharge dataset with RF enhancement, a study developed a novel hypothesis about SID treatment that could classify interaction effects with an accuracy of up to 89% (Nasir et al., 2021a). A similar study on SID patients mentioned a more accurate ML model including SVM, RF, DT, and fuzzy unordered rule induction algorithms (FURIA) (Pandey et al., 2018). Researchers compared the clinical profiling of addicts with ML model prediction to predict treatment outcomes. Various ways of pre-processing data have been developed for research and analytics. In voxel-based volumetric studies, the brain's structure, reward, and response activities are explored. Psychological tests were predicted more accurately by RF classifiers than by traditional algorithms (Kamarajan et al., 2020). Functional connectivity of the brain and neuropsychological and impulsivity factors are considered predictor variables.
The effects of alcohol on brain cells have been studied with structural MRI and voxel-based morphometry data using several classifiers, such as SVM and weighted robust distance (Guggenmos et al., 2020). The experiments identified a grey matter deficit in people who drank alcohol with an accuracy of 76%, indicating a volumetric reduction in brain density. An SVM-based study predicts remission of alcohol consumption from EEG signals (Kinreich et al., 2021). An overall accuracy of 86% is driven by factors such as neuroticism, depression, aggression, years of education, and alcohol consumption phenotypes. It is shown that there is synergy between therapists and ML models (Symons et al., 2019). A proposed DT-based classifier explains whether or not people with alcohol consumption disorders seek treatment (Lee et al., 2019). With an Area Under Curve (AUC) of 83.6%, a tree-based classifier implemented on a national survey of drug use reveals the risk of opioid use among youth (Wadekar, 2020). An ML model with 436 predictor variables provides an average 14-month reduction in the time it takes to diagnose opioid use disorder (Segal et al., 2020). According to a study of insurance claims, the model is effective in diagnosing problems with substances.

4.5.2. Unsupervised machine learning algorithms
Unsupervised learning algorithms take a set of inputs without labelled outputs, derive clusters from them, and assign each input to a cluster based on its characteristics. K-means and hierarchical clustering were applied to analyse four drug use patterns, including opioid consumption, cocaine consumption, and dual-use (opioid and cocaine) (Panlilio et al., 2020). The proportion of cocaine and opioid-positive results is better calculated by K-means clustering. The researchers used the biterm topic model, an unsupervised ML model, to detect themes and identify a range of nonmedical prescription drug use activities (Kalyanam et al., 2016). The study identified tweets from three of the most popular prescription drugs, and the clusters that form suggest patterns of abuse of more than one prescription drug. The Neural Network (NN) model with hyperparameter settings among the models indicated that marijuana use is more likely in younger, less educated individuals (Zoboroski et al., 2021). The study claims to be far superior to existing literature with 87% sensitivity of the NN model. The combination of heterogeneous data made it possible to form ordered depressive clusters and to better understand the relationship between depressive episodes and addiction (Dipnall et al., 2017).
One study explored neural and cognitive activity connectivity to detect any changes caused by drug use or overdose (Capecci et al., 2015). This study looks at functional brain changes related to opioid substitution therapy (methadone) using a connectivity analysis of spiking neural network models trained on EEG data.

4.5.3. Deep learning algorithms
In deep learning, neural networks are employed to learn from large amounts of data to derive features and formulate a learning model to predict responses to new data. Using the tag-search API, social media


image posts and captions are searched for probable words and phrases about substance use with the help of a Convolutional Neural Network (CNN) (Zhou et al., 2017). Clusters include alcoholism, substance abuse, depression, sleep disorders and eating disorders, yielding 90% accuracy. Using a thermal image, the CNN algorithm detected sobriety in a driver with 97% precision (Menon et al., 2019). MRI images of patients suffering from alcoholism are subjected to three different CNN models based on max, average and stochastic pooling techniques (Wang et al., 2018). A similar CNN model is used in the brain study to identify dopamine release phases with an accuracy of over 95%.
The use of Long Short-Term Memory (LSTM) in deep learning architecture has also been widely investigated in substance addiction and disorders (Hassanpour et al., 2019). In addition, the study used EEG signals to classify alcoholics as well as non-alcoholics. In analysing an individual's EEG signals, another study proposed an LSTM architecture capable of detecting alcoholism with 98% accuracy and classifying subjects as alcoholics or controls (Singhal et al., 2021). Electronic health records and deep learning are used to predict opioid use disorder (Dong et al., 2021). An LSTM model outperformed DT and RF in identifying opioid use disorder risks among opioid medication users (Mukhtar et al., 2021). Based on the patient's historical data in medical records, the LSTM model achieves 93% accuracy in assessing opioid use disorder risk. Historic electronic health records have also been analysed to detect alcohol-related disorders through a Deep NN [86]. Using four layers of feed-forward Deep NN, it can predict different drinking patterns with an accuracy of 80%.

Table 4
Values assigned to the metrics for comparison of the studies.

| Sl. No | Metric Types | Linguistic Type | Value Assigned |
|---|---|---|---|
| 1 | Type of Study | Observational | O |
| | | Longitudinal Study | L |
| | | Case Control Study | C |
| | | Randomized Clinical Trial | R |
| 2 | Number of Cases | Less than 100 | - |
| | | Between 100–500 | 0 |
| | | More than 500 | + |
| 3 | Types of cases | In Patients | IP |
| | | Out patients | OP |
| | | Both | BT |
| | | Others | OT |
| 4 | Type of Features | Categorical | C |
| | | Numerical | N |
| | | Both | B |
| 5 | Type of Dataset | Image Dataset | I |
| | | Video Dataset | V |
| | | Behavioral Assessment | B |
| | | All other | A |
| 6 | Study purpose | Addiction Risk | A |
| | | Substance abuse disorder | S |
| | | Co-Morbidity | C |
| 7 | Complexity of Model | Multi model | M |
| | | Singular | S |
| 8 | Type of analysis | Statistical | S |
| | | Analytical | A |
| | | Both | B |
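
The value assignment in Table 4 can be read as a simple lookup. The following is a minimal sketch of the first two metrics only (type of study and number of cases). The function and dictionary names are hypothetical, and the treatment of the boundary values 100 and 500 (both placed in the middle band) is an assumption, since Table 4 does not state which band they fall into.

```python
# Codes for the "Type of Study" metric, copied from Table 4.
STUDY_TYPE_CODES = {
    "Observational": "O",
    "Longitudinal Study": "L",
    "Case Control Study": "C",
    "Randomized Clinical Trial": "R",
}


def encode_number_of_cases(n_cases: int) -> str:
    """Map a sample size to the '-', '0', '+' codes of Table 4.

    Assumption: exactly 100 and exactly 500 cases both count as
    'Between 100-500'; the table leaves these boundaries unspecified.
    """
    if n_cases < 0:
        raise ValueError("number of cases cannot be negative")
    if n_cases < 100:
        return "-"
    if n_cases <= 500:
        return "0"
    return "+"


def encode_study(study_type: str, n_cases: int) -> str:
    """Combine the study-type code and the case-count code."""
    return f"{STUDY_TYPE_CODES[study_type]} {encode_number_of_cases(n_cases)}"


print(encode_study("Observational", 2362))
```

A study would receive one such code per metric, which gives a compact string that can be compared across articles.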
5. Parametric attributes evaluation for the addiction studies

Across the reviewed studies, the authors identified parameters that may affect the design of substance addiction studies. These parametric attributes are the data collection strategies, the input features considered, and eight metrics that have been used extensively in the reviewed articles. These attributes are:

1. Data collection strategies: Every study that involves human subjects must collect data, either in real time or from retrospective datasets. Thus, choosing an appropriate strategy makes the research work more effective and focused.
2. Input features: In ML techniques, the selection and extraction of features is most important for obtaining a better outcome from the model; thus, the features that have higher impact may be selected and used.
3. Metrics: There are eight distinct parameters, such as the number of cases, types of cases, type of dataset, and study type. Table 4 displays the values the authors assigned to the linguistic variables that correspond to each metric employed in the studies. Each parametric attribute's values are designated in accordance with the linguistic variables for better understanding of the metrics. These metric values aid in classifying the parameters into their possible subtypes.

In addition, Table 5 summarises the results of a thorough comparison of different methods using the aforementioned parametric characteristics. Researchers may quickly gain perspective on which methods are feature dependent or independent, which can handle small datasets, and how analysis of datasets can be performed. The outcomes and constraints are included to assist the researcher in choosing the best approach.

6. Discussion

There are different data collection sources for substance abuse disorders. For the most part, data-based approaches like this do not allow for statistical analysis. Rather than analysing the high-dimensional data directly, pattern recognition can be used to reduce the dimensionality, perform principal component analysis and factor analysis, extract features, and remove outliers. Feature selection prevents irrelevant or noisy variables from affecting predictive outcomes. To produce efficient features, filter, wrapper, and embedded techniques are applied when a diagnostic system has the appropriate features. Images, especially MRIs, were used to diagnose disorders in certain cases. As in supervised ML, manual extraction of features may create a classification bias. Unsupervised machine learning categorizes samples by identifying their inherent characteristics. ML approaches based on deep learning derive these features from the input dataset. Pooling, rectified linear units, and soft-max layers are core to deep learning architecture. For example, a well-trained model might consider 100% of the inherent features required in the model design, which would not be possible with manual extraction. CNN outperformed other imaging classification models for such implementation. However, fMRI images are not as predictive as textual, numeric and categorical data.
A systematic review of mental health and the use of ML algorithms has been outlined in Thieme et al. (2020). The bibliographic representation of published articles was explored using a keyword-based search in an article (Verma et al., 2022). The topic of AI was addressed in an important marketing article, which prompted further ML research to detect fake news (Verma et al., 2021; Nasir et al., 2021b). In addition to using ML methodologies for determining suicidal ideation, the multimodal approach for clustering various stress management strategies is also considered critical for the direction of and motivation for conducting substance use behaviour analysis (Chatterjee et al., 2022; Mittal et al., 2015). Another study examining stress and its management for psychological disorders has done similar research (Mittal et al., 2022).

6.1. Theoretical contribution

Substance abuse can cause psychological problems, according to the best available evidence, and is diagnosed and classified as a mental disorder by the DSM and ICD. Under normal circumstances, such psychological behaviour is not a disease, but it raises levels of stigma, as it is unlike what society expects. Neurologists, psychologists and clinical health workers can provide clinical insights into this using a digital health system. Drug use causes structural, behavioural and perceptual changes in the brain. Using social media, images, and textual content, a model to

Table 5
Exploring Machine Learning Models in Substance Consumption & addiction studies, their outcome and limitations (N = 26).

| No | Article | Model | Data Collection Strategy | 8-Metrics | Major Input parameters/features | Outcome | Limitations |
|---|---|---|---|---|---|---|---|
| 1 | Dong et al. (2021) | LSTM, BERT | Cerner's health facts database of EHR, drug bank to retrieve opioid ingredients | O + IP B | Symptoms, laboratory tests, medication features as per NDC codes, clinical events, and demographic information. A total of 1468 features were extracted. | The dose of opioid medication, pain related diagnoses like dorsalis, anxiety and alcohol dependence are the top predictive features to have opioid misuse disorder. (AUC: .93) | Low in performance. Less applicable for the new samples |
| 2 | Mukhtar et al. (2021) | Regularized CNN | EEG dataset from the UCI repository. | O – OT B | EEG raw signal | Classify the alcoholic and non-alcoholic successfully, data segmentation, normalization, and regularized model has improved performance (AUC: 0.98) | Data segmentation choices are limited. No perturbation of input signals and weight initialization process not performed. |
| 3 | Ebrahimi et al. (2021) | Feedforward DNN | Relay study & EHR from Odense University Hospital. | O + IP B | AUDIT scores, clinical details. | Multiclass classification, balancing dataset using SMOTE to have better performance | Small-size datasets suffer from bias |
| 4 | Hassanpour et al. (2019) | Deep CNN, LSTM | ASSIST, Instagram data. | O + OT B | Posted images, related captions/comments, and face features. | Alcohol has the highest risk; intervention is possible based on behavioral health predictive analysis of features generated (AUC: 0.65) | No explicit insight into specific features of increased substance use risk |
| 5 | Menon et al. (2019) | CNN, GMM | Dataset from the University of Patras | O 0 OT B | Facial thermal localized features | Heat across the eye and nose is primarily low during non-sobriety thus successful classification was validated. (AUC: 0.87) | Only alcohol is considered with a limited time frame and quantity on account of consumption. |
| 6 | Wang et al. (2018) | CNN | The applicant was identified through forms. AUDIT survey response, MRI scans. | O 0 OT I | AUDIT response, MRI scans | Classifications based on the features are successfully achieved. (AUC: 0.87) | Accelerated training may not be sufficient with a low number of input datasets. |
| 7 | Zhou et al. (2017) | CNN, RNN | Social network multimedia and text contents | O + OT B | Instagram tags, images, posts, demographic information | Positive and negative classes of tags, behaviors are classified into multi classes (AUC: 0.89) | Low in accuracy, real-time data are missing |
| 8 | Panlilio et al. (2020) | Hierarchical Clustering, K Means, Multinomial LR | Cocaine or opioid dependence questionnaires, self-reported survey, urine analysis | R 0 OP B | Biomarkers, DSM questionnaire score, interview scores | Four clusters of users, and non-user, both have been identified. Person-level output is interpreted and action initiated. | The external validity of clusters was not done to evaluate the ground truth. |
| 9 | Kalyanam et al. (2016) | Biterm Topic Model | Tweets from Twitter handle | O + OT B | Corpus of text, INN or slang names, mentions of other illicit drugs, mentions of identified substance abuse risk behavior, contains adjectives related to prescription drug abuse behavior. | The filter of text is based on three themes, polydrug abuse is predominantly associated with Twitter prescription drug abuse discussions | Only texts are analyzed, the content of hyperlinks is not considered. Alphanumeric codes are omitted which might have missed out on important correlations. |
| 10 | Zoboroski et al. (2021) | Multi-dimensional NN | Response from the survey administered to a sample population | O + OT B | Scores of personality tests, substance intake frequencies, use of polydrugs, impulsivity & sensation scores. | Optimizing hyperparameters, marijuana consumption is more likely in younger, less educated individuals. Openness to new experiences also increases cannabis use. | Inter-drug correlation is not performed to derive the consumption of polydrugs. |
Table 5 (continued)

| No | Article | Model | Data Collection Strategy | 8-Metrics | Major Input parameters/features | Outcome | Limitations |
|---|---|---|---|---|---|---|---|
| 11 | Dipnall et al. (2017) | SOM | Ninety-six "lifestyle-environment" variables were used from the national health and nutrition examination study. | O + OT B | Lifestyle environmental variables including responses from the psychological assessment are taken as the input features. | Members were more likely to have problems sleeping; unhealthy eating, an old home; perceived depression, positive relationship depression, and substance consumption. | Specific substance use disorder is not considered. |
| 12 | Capecci et al. (2015) | NN | EEG data from consented participants | C – OT B | Time series EEG signal | Classify healthy and patients group, and differences in functional connectivity observed in patients. | Deep-in-time spatiotemporal data may be incorporated for further study. |
| 13 | Beattie & Nicholson (2021) | RF | The National Survey on Drug Use and Health survey data | O + OT B | Drug flags, first use age, reasons, risks, treatments, health variables | The features can predict the risk involved in taking poly-drug. The early onset of marijuana is the gateway to heroin. | Less structured survey data and timeline factors are not taken into consideration. |
| 14 | Marcon et al. (2021) | ANN, RF, GLMNET | Web-based survey questionnaire | O + OT A | Sociodemographic response to PHQ-2, trauma, AUDIT response, suicide ideation and attempt, and family suicide history. | Use of tobacco, cannabis, family income, single marital status, bisexuality, and less physical activity are major features of high-risk drinking. | Factors like personality traits, distress, alcohol use, etc. are not considered in the survey. |
| 15 | Jing et al. (2020) | LR, AB, NB, SVM, RF, DNN | Battery of assessment | L + OT B | Features include health, psychological, psychiatric, and socio-demographic details. | Individual phenotypic characteristics, environmental factors, psychological dysregulation, and health problems are prime predictors. (ACC = 0.86) | Oversampling, random sampling would have produced better results. |
| 16 | Segal et al. (2020) | Word2Vec, Gradient Boosting Tree | 10 million medical insurance claims | O + IP B | Demographics, psychological conditions, diagnosis and procedures features, medication features, episode counts, medical costs | Significant differences between OUD and non-OUD in the number of opioid use days, and number of opioid prescriptions per year. OUD can be detected 14.4 months earlier (ACC: 0.95). | Use of only billable medical history. |
| 17 | Wadekar (2020) | LR, RF | National survey response | O + OT B | Opioid dependence, demographic & socio-economic details. Physical and psychological details. The first use of | Early initiation of marijuana, age between 18 and 34, less income, probationary and easy access to drugs. (ACC = 0.89) | Use of secondary data health conditions like neuropathy not considered. Family history could generate a better result. |
| 18 | Dou et al. (2021) | RF, NLTK, SVM | Social media posts, and survey questions on substance consumption. | O + OT B | Posts and comments associated with survey answers, a bag of words. | Association between words and substance use (e.g.: sucking, grab, slap under substance user category). Sentiments of the post as anger, disgust, sad. (ACC: | The reason for dependencies is not covered, and emotional outbursts due to family relationships are overlooked. |
| 19 | Nasir et al. (2021a,b) | RF, ANN, Boosting | Treatment discharge dataset | O + OT B | Treatment facility characteristics, demographic characteristics, substance usage, length of stay, and the treatment outcome, drug offense, vagrancy | Interaction effect on length of stay, substance use, and changes in self-help. (ACC: 0.89) | Change in model behavior not tested with a new dataset, external factors are not included |
Table 5 (continued)

| No | Article | Model | Data Collection Strategy | 8-Metrics | Major Input parameters/features | Outcome | Limitations |
|---|---|---|---|---|---|---|---|
| 20 | Kamarajan et al. (2020) | RF | fMRI with DMN, Tower of London test, visual span test, BIS | R – OT B | Demographic and clinical characteristics, network functional connectivity of fMRI, neuropsychological scores & impulsivity factors | Hyperconnectivity across the bilateral and prefrontal cortex, poor neuropsychological performance, and increased impulsivity (ACC: .76) | Overlooked neuropsychiatric conditions, family history, and their influence are not considered. |
| 21 | Lee et al. (2019) | ADT, RF, RT, LR | The structured clinical interview, intelligence scale, psychopathological rating scale, BIS, NEO-5-PI-R, early life stress, and childhood trauma. | C + BT B | Drinking history, cognitive, mood, aggression, impulsivity, trauma, personality, biomarkers. | Classification of drinking behavior, depression, and psychological problems & substance dependence. Treatment-seeking predictor: drinking level, substance dependence, depression, IQ. (ACC: 0.86) | Imbalanced dataset with more male subjects. No insights on the comorbid risk of alcohol intake. |
| 22 | Guggenmos et al. (2020) | SVM | MRI, fMRI images | O 0 OT A | Grey-matter density, weird cerebrospinal fluid csf, reward response values | Diminished gray matter density, thickness, less sensation on the reward response circuit, and low fluids predict alcohol dependence | Low in generalizability, no validation samples, least exhaustive image modality |
| 23 | Mackey et al. (2019) | SVM | MRI images | O + OT B | Behavioral phenotypes, demographic characteristics, thickness & volume of ROI | Volumetric differences due to substance-induced disorders. Alcohol showed low volume followed by cannabis (ACC: .78). | Cross-study comparison is not performed, and co-occurring substance uses are not analyzed. |
| 24 | Symons et al. (2019) | FURIA, SVM, RF | Demographic and psychometric assessment data, interviews | C 0 OT B | Dependence severity, craving scale, self-reported health scores, demographic information | 28 different models tested, FURIA gained high accuracy, and the prediction is more accurate than the clinicians. | Accuracy and sensitivity are low compared to the literature. |
| 25 | Kinreich et al. (2021) | SVM | EEG level brain connectivity, polygenic risk scores (PRS), medications, demographic information | O + I A | PRS, EEG functional connectivity, marital and employment status | Collections of PRS related to neuroticism, depression, aggression, years of education, and alcohol consumption | Validation not performed; symptomatic & psychosocial features not included. |
| 26 | Rajapaksha et al. (2020) | LASSO, KNN, SVM, RF, GB | Impulsive Sensation Scale, Zuckerman-Kuhlman personality questionnaire, Barratt impulsivity scale, NEO five-factor inventory. | O – OT C | Age of first onset, level of enjoyment, impulsivity, personality traits (negative feeling, openness, conscientiousness). | Risk factor analysis (early onset, high impulsivity, high neuroticism) (ACC: 0.93) | Multiple drugs aren't taken into consideration and factors vary over time. Better insights would come from multi-modeling. |
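
Several outcomes in Table 5 are reported as AUC values. As a rough illustration of what that number measures, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, counting ties as one half. The labels and scores are hypothetical toy values, not data from any reviewed study, and the reviewed articles may have computed AUC with other tooling.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for a positive case (e.g. disorder present), 0 for a negative case.
    scores: model outputs, where a higher score means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: six subjects with known labels and model scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC = {auc(toy_labels, toy_scores):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values such as 0.65 and 0.98 in the table indicate very different levels of discrimination.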

address the specific point of substance use and screen for possible abuse may be developed.

By using ML methods, authors have been able to prepare data, classify it, categorize it, and predict uncertainty easily. Behavioural mapping has been studied very rarely, whereas neurology and neuroscience have been studied more. The Internet of Things (IoT), big data, and nature-inspired computing are emerging computational techniques that can assist in finding correlations and associations. This systematic review of addiction studies found that ML techniques can assist researchers in making better decisions about addiction. A combination of ML and digital health data has provided experts with an excellent solution for determining the most appropriate addiction treatment. RF methods have been shown to produce positive results in studies. Researchers and doctors alike benefit from applying ML to substance addiction research, since there are too many variables that may otherwise be overlooked. Therefore, the extracted inherent predictors will help in making decisions regarding the classification and prediction of disorders.

Recent research on ML has highlighted the efficacy of supervised, unsupervised, and deep learning in decoding substance abuse (Cho et al., 2019). SID also uses AI algorithms and applications to identify psychiatric disorders (Liu et al., 2020). It is reported that illicit substance use during adolescence is heterogeneous, ranging from normative to pathological, and is associated with significant acute and long-term health risks (Gray & Squeglia, 2018). To deal effectively with substance-related problems in adolescents, it is essential to understand the underlying neurobiology. Recent reviews have focused on the most recent evidence on the clinical use of electronic cigarettes (Becker & Rice, 2022).

6.2. Implications to practice

With the implementation of image processing, segmentation and ML techniques, the identification of drug-related disorders from brain images can be improved. It is also critical to develop a customized model for these psychiatric conditions with behavioural analysis. The state of an individual can be diagnosed quickly and accurately using ML technology. Psychology and brain studies can help to diagnose such conditions based on differences in structure and function. Moreover, the integration of a digital helpline would further reduce the cost and time factor.

Psychological disorders caused by substance abuse can be associated with neurobiological conditions, although some factors disagree. The multi-model approach produces better results when disease diagnosis does not always involve highly distinct qualitative and subjective characteristics. Substance-induced disorderly behaviours include control and sensation, time actuation, reward circuits, impulses, compulsive behaviour, gray matter deposition, cerebral density and neural dysfunctional networks.

6.3. Limitations and future directions

The researchers found some research gaps and challenges based on the work model they carried out and the results they presented. In the future, machine learning technology could be used to meet these challenges and bring about changes in addiction treatment. While there are few papers on addiction studies, researchers are required to develop a strategy for longitudinal studies with multiple multivariate endpoints that lead to addiction disorders. Clinical investigations are the principal and only method for diagnosing substance use disorders. To develop a practical solution to the problem, researchers can use ML tools and other AI technologies. The use of state-of-the-art methods for the identification, validation and extraction of features, as well as the use of empirical data from different sources, showed a difference in ML model performance. It will also be useful to share resolutions from other research areas such as COVID-19, retinopathy, and neurological and psychiatric conditions. Each researcher uses their own data set, prepared in a different way, to build and test models. The challenge is to accommodate the performance of all existing models and prepare a master dataset for transfer learning in ML. As a result, early screening for substance abuse is more effective in its treatment.

ML models have improved our understanding of multi-dimensional patterns of medicine and behaviour. They are used because they are scalable and suitable for engineering and computing. It seems obvious that computer science can address drug-related disorders that cause psychiatric problems. The following are some limitations of ML approaches that could guide future research:

Lack of a reliable dataset: A digital healthcare system cannot automate and utilize a data-driven diagnosis due to the lack of a reliable substance addiction dataset. It is a new way to gain insights from data using digital interventions.

Unavailability of pre-trained ML models: It is complex and time-consuming to process medical images, but prebuilt models can be adapted and integrated into diagnostic systems. It would lead to lower costs, less time and fewer resources if medical images were distributed and preformed.

Limited research on early-onset prediction: Based on existing data, it is possible to predict the next phase of substance abuse. For this reason, there is a need for on-site, localized, gendered and geographic studies. Validation of multi-disciplinary knowledge is critical at different stages of the conclusion process.

Need for more longitudinal studies: When imaging and other data are integrated, analysis can be more efficient. Furthermore, longitudinal studies provide important information despite the difficulty of labelling people.

Errors and validation: Drug use patterns are not generalizable, and this has proved difficult for authors. There will always be variation and jitter in small datasets and case-based studies, no matter how rigorously they are validated. When deployed in practice, such models can perform poorly, which tends to lead to misdiagnosis.

Specific group-based studies: Many psychiatric disorders are associated with substance abuse, so focusing on a particular substance would be ideal. Psyches of all classes and socioeconomic levels can be affected by substance abuse. By looking at specific demographic groups and data, it is better understood how addiction affects society. In the review, ML applications are strongly emphasized; their continuous improvement permits such gaps in research to be addressed.

Ethics and interpretability: The overriding element of ethics is sometimes overlooked in the search for more powerful calculations and results. ML prediction can be erroneous when applied to studies of human substance abuse, wrongly labelling people as drug users. As a result, clinical practices and tools were too distant from research areas for ML results to be interpreted.

7. Conclusions

This review summarizes EEG, brain imaging, behavioural, and kinematic studies on ML applications in addiction research. By employing well-defined inclusion and exclusion criteria, the authors avoided review bias. The main findings suggest that drug-related disorders require a disease-based diagnostic approach. That brain activity is affected by psychoactive substances, and that behavioural changes are associated with substance use, are the most striking findings. There will be no progress in research on substance abuse, its harms, treatment, or economic loss. Those looking for SID treatment and suspicious people are routinely monitored. Recently, ML algorithms have gained traction in addiction research that seeks to classify, identify and predict addictions. In addition to technology-based interventions, planning, monitoring and treatment management can reduce the burden of disease from addiction. Disorders are associated with geographical variations that can include or exclude substance use, but symptoms are often linked. It would show the geographic, social, economic, and other variables that lead to addictive behaviours. Integrating multiple spheres and comprehensive clinical data into a model can further improve the capacity to predict vulnerabilities and risk behaviours. People interested in early treatment can largely benefit
