IT Professional
www.computer.org/itpro
Technology Solutions for the Enterprise
GUEST EDITORS’ INTRODUCTION
IT Professional Special Issue on Security
and Data Protection During the COVID-19
Pandemic and Beyond
José L. Hernández-Ramos, Paolo Bellavista, Georgios Kambourakis,
Jason R.C. Nurse, and J. Morris Chang
September/October 2023
Theme Articles
Cybersecurity
Labeling Software Security Vulnerabilities
Irena Bojanova and John J. Guerrerio
IT Economics
Generative Artificial Intelligence in Marketing
Nir Kshetri
ISSN: 1520-9202
EDITOR IN CHIEF
Charalampos Z. Patrikakis, [email protected]

ASSOCIATE EDITORS IN CHIEF
Regular Papers: Reza Djavanshir, Johns Hopkins University; [email protected]
Columns and Departments: George Hurlburt, STEMCorp; [email protected]
Special Issues: Tim Weil, SecurityFeeds LLC; [email protected]
Outreach and Awards: Saeid Abolfazli, TELUS Canada

COLUMN/DEPARTMENT EDITORS
Cybersecurity: Irena Bojanova, NIST; [email protected]
Formal Methods in Industry: Tiziana Margaria, University of Limerick; [email protected]
IT Economics: Nir Kshetri, University of North Carolina at Greensboro; [email protected]
IT and Future Employment: George Strawn, National Academy of Sciences; [email protected]
IT Trends: Amir Dabirian, California State University, Fullerton; [email protected]
Life in the C-Suite: Stephen J. Andriole, Villanova University; [email protected]
Mastermind: George Strawn, National Academy of Sciences; [email protected]
Software Technology: Markus Schordan, Lawrence Livermore National Lab.; [email protected]

ADVISORY BOARD
Irena Bojanova, NIST
Wushow Chou*, North Carolina State University
Simon Liu*, Agricultural Research Service
San Murugesan*, BRITE Professional Services
Sorel Reisman (Chair), California State University
Henry Schaffer, North Carolina State University
George Strawn, National Academy of Sciences
*EIC Emeritus

EDITORIAL BOARD
Shireen Atabaki, George Washington University
J. Morris Chang, University of South Florida
Aswani Kumar Cherukuri, Vellore Institute of Technology, India
Claudio Giovanni Demartini, Politecnico di Torino
Aline Eid, University of Michigan
María José Escalona Cuaresma, University of Seville
Hassan Keshavarz, PAYBACK GmbH
Rick Kuhn, NIST
Maria R. Lee, Shih Chien University
Helen C. Leligou, University of West Attica
Gianfranco Politano, Politecnico di Torino
Georgia Sakellari, University of Greenwich
Hiroyuki Sato, University of Tokyo
Jilei Tian, BMW Technology Chicago
Eirini Eleni Tsiropoulou, University of New Mexico
Bhuvan Unhelkar, University of South Florida
Bo Yu, PerceptIn

CS MAGAZINE OPERATIONS COMMITTEE
Irena Bojanova (Chair), Lorena Barba, Lizy K. John, Fahim Kawsar, San Murugesan, Ipek Ozkaya, George Pallis, Charalampos (Babis) Z. Patrikakis, Sean Peisert, Balakrishnan (Prabha) Prabhakaran, Andre Stork, Ramesh Subramanian, Jeff Voas

CS PUBLICATIONS BOARD
Greg Byrd (Interim VP of Publications), Terry Benzel, Irena Bojanova, David Ebert, Dan Katz, Shixia Liu, Dimitrios Serpanos, Jaideep Vaidya, Ex officio: Robin Baldwin, Nita Patel, Melissa Russell

IT PROFESSIONAL STAFF
Senior Journals Production Manager: Kristin Falco LaFleur
Peer Review Administrator: [email protected]
Periodicals Portfolio Senior Manager: Kimberly Sperka
Content Quality Assurance Manager: Jennifer Carruth
Director of Periodicals & Special Projects: Robin Baldwin
Senior Advertising Coordinator: Debbie Sims
IEEE Computer Society Executive Director: Melissa Russell

IEEE PUBLISHING OPERATIONS
Senior Director, Publishing Operations: Dawn Melley
Director, Editorial Services: Kevin Lisankie
Director, Production Services: Peter M. Tuohy
Associate Director, Editorial Services: Jeffrey E. Cichocki
Associate Director, Information Conversion and Editorial Support: Neelam Khinvasara
Senior Manager, Journals Production: Patrick Kempf

COMPUTER SOCIETY OFFICE
IT PROFESSIONAL
c/o IEEE Computer Society
10662 Los Vaqueros Circle, Los Alamitos, CA 90720 USA
Phone +1 714 821 8380; Fax +1 714 821 4010
Website: www.computer.org/itpro
Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author’s or firm’s opinion. Inclusion in
IT Professional does not necessarily constitute endorsement by IEEE or the IEEE Computer Society. All submissions are subject to editing for
style, clarity, and length. IEEE prohibits discrimination, harassment, and bullying. For more information, see
https://www.ieee.org/about/corporate/governance/p9-26.html. Circulation: IT Professional (ISSN 1520-9202) is published bimonthly by the IEEE Computer Society. IEEE Headquarters,
Three Park Ave., 17th Floor, New York, NY 10016 USA; IEEE Computer Society Publications Office, 10662 Los Vaqueros Cir., Los Alamitos, CA
90720 USA, phone +1 714 821 8380; IEEE Computer Society Headquarters, 2001 L St., Ste. 700, Washington, D.C. 20036 USA. For missing/
damaged copies, contact: [email protected]. Subscribe to IT Professional by visiting www.computer.org/itpro. Reuse Rights and
Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2)
includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-
party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their
own web servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen
of the posted copy. An accepted manuscript is a version that has been revised by the author to incorporate review suggestions, but not the
published version with copyediting, proofreading, and formatting added by IEEE. For more information, please go to:
https://www.ieee.org/publications/rights/author-posting-policy.html. Permission to reprint/republish this material for commercial, advertising, or promotional
purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual
Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854 USA or [email protected]. Copyright © 2023 IEEE. All rights reserved.
Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of
patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222
Rosewood Dr., Danvers, MA 01923 USA. Postmaster: Send undelivered copies and address changes to Internet Computing, 445 Hoes Ln.,
Piscataway, NJ 08855 USA. Periodicals postage paid at New York, NY, and at additional mailing offices. Canadian GST #125634188. Canada
Post Corporation (Canadian distribution) publications mail agreement number 40013885. Return undeliverable Canadian addresses to PO
Box 122, Niagara Falls, ON L2E 6S8 Canada. Printed in the USA.
EDITOR: Amir Dabirian, [email protected]
DEPARTMENT: IT TRENDS
1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3321926
Date of current version 3 November 2023.

One of the most crucial resources that technology companies depend on is a talented team. Dealing with the high turnover rate in the IT industry makes it increasingly difficult to maintain the talent level required for those organizations to prosper [9], [13]. Fierce competition in the IT sector has led to a high turnover rate. Due to the rapid pace of technical breakthroughs and the ongoing evolution of the IT business, there is a significant demand for IT specialists. With such a high level of mobility in the workforce due to this demand, IT workers can readily explore new career prospects. Wilson and Furlonger [18] also emphasized how employees may look for new challenges and greater pay due to technology’s tendency to change quickly.

On top of that, IT professionals are more than likely to leave their current job when they don’t like the firm’s working conditions. This has particularly become an issue in the COVID-19 era [15] as more employees have transitioned to remote work, allowing them to manage their time on their own terms. The transition created an environment that has made attracting and retaining employees much more difficult. Thus, IT firms are constantly looking for effective strategies to attract and recruit new talent and retain their current employees.

Organizations have experienced the positive and negative effects of this constant hunt for greater opportunities. It damages organizations that lose talent but benefits those that can attract talented workers. This new talent pool consists of first-time applicants and seasoned experts looking for new possibilities. As a result, businesses need to find strategies to draw in new employees and retain their current ones. In this study, the focus is on how companies draw in new employees.

When assessing IT employers, IT professionals typically consider eight values [8]. These eight values were identified through a content analysis of 26,460 new employee reviews. The employee reviews were written by recent hires who had been with their company for less than a year. The study tried to determine which values are most significant to these workers and make suggestions to IT companies on how to use employer brand value propositions to draw in and keep IT talent after COVID-19.

EMPLOYER BRANDING
Employer branding is the development of a unique and attractive organizational identity that attracts, retains, and motivates employees. It refers to developing and promoting an organization’s distinctive image and reputation as an employer of choice. It encompasses a firm’s efforts to communicate to existing and prospective staff members that it is a desirable workplace [2]. Effective employer branding leads to luring top talent, enhancing employee engagement and retention, boosting organizational performance, and more. Employer branding refers to creating and maintaining an organization’s reputation as an employer of choice [2], [7], [8]. An organization does this by crafting a distinct identity that represents the organization’s culture, values, and mission and showcasing this to potential employees, current employees, and the public. Employer branding is a strategic approach aimed at attracting and retaining top talent in a competitive job market by creating a positive image of the organization as a great workplace. This can include promoting the organization’s employee benefits, perks, and development opportunities and highlighting its diversity and inclusion initiatives, social responsibility efforts, and workplace culture.

Effective employer branding can result in higher levels of employee engagement, improved retention rates, and a more robust talent pipeline [1]. It can also help organizations differentiate themselves from their competitors and position themselves as leaders in their industry.

In addition to the brand image an employer wishes to convey, there is the reality of what existing and future employees perceive about the employer. The degree to which a firm’s brand identity [14] matches those perceptions determines the brand’s impact in the market. Employees use technologies such as Web 2.0 [12] in the form of electronic word-of-mouth on social media and online employer review sites to discuss work-related experiences across organizations, which changes the expectations and assessments of their workplace. IT firms must understand this new world of online employer branding to manage their public brand to attract new talent. As such, analyzing the perceptions and responses of first-year employees can be a valuable approach to identifying how well the intended brand perception matches perceived reality.

Studying the reactions and opinions of newly acquired IT talent, particularly first-year employees, is a vital endeavor that can offer valuable insights into the effectiveness of an organization’s employer branding efforts and the alignment between employee expectations and actual experiences in the workplace. Research has shown that employees’ initial impressions of an organization, formed through its reputation and employer branding, significantly influence their work environment expectations. Berthon et al. [6] emphasized that effective employer branding shapes candidate perceptions and sets expectations for their workplace experiences. Investigating whether first-year IT employees’ initial impressions align with their experiences would provide a comprehensive understanding of the extent to which employer branding accurately portrays the organization’s culture, values, and opportunities. By examining whether work environment expectations have been met, organizations can identify gaps, strengths, and areas for improvement in their onboarding and employer branding strategies. This research can enhance employee satisfaction, engagement, and retention, ultimately creating a positive and productive work environment that aligns with the expectations set through effective employer branding efforts, thereby contributing to employees’ willingness to stay with the firm [3].

EMPLOYEE ACQUISITION, RETENTION, AND COVID-19
Companies face several obstacles when acquiring and retaining talent; IT companies are no exception. As discussed by Beechler and Woodward [5], these challenges are due to a combination of influencing factors related to the quantity, quality, and characteristics of said talent. These factors include global demographic and economic trends, increasing mobility of people and organizations, transformational changes to business environments, skills and cultures, and growing levels of workforce diversity.

In addition, the IT industry has other challenges, primarily due to the nature of the industry. The IT sector suffers from a skill shortage [19], which is a global issue; competition for talent among employers; changing employee expectations, particularly after COVID-19; skill mismatch; job hopping, which is prevalent in the IT sector; heavy workload and its resulting burnout; and professional development.

High demand for IT professionals is another challenge that has affected salaries and benefits packages by raising costs. As a result, small and medium-sized IT companies may need help to attract and retain top talent because they may not be able to offer the same level of compensation as larger companies. A Society for Human Resource Management [16] study found that 83% of IT companies reported difficulty recruiting suitable candidates due to competition from other employers.

In addition to these challenges, the COVID-19 pandemic has produced other problems. Instead of face-to-face interactions and collaboration being the norm in traditional work environments, the COVID-19 pandemic necessitated a rapid shift to remote work arrangements, challenging these established norms and expectations. In addition, employees’ expectations have evolved to include greater flexibility, technology-enabled remote collaboration tools, and adaptability. These pandemic-induced changes accelerated the acceptance of virtual work environments, prompting organizations to modify their structures and policies to accommodate hybrid work models. The transition to remote work in response to the pandemic has significantly affected recruitment and retention efforts. Remote work may make retaining existing employees more challenging because they may feel isolated and less engaged with the organization. Remote work has also lifted limitations related to the need for physical proximity to the work location. As such, the competition for top talent has become more intense because geography is no longer a limiting factor.

Although the challenge of increased competition for talent is due to high demand, as mentioned earlier, it should also be noted that the pandemic has added to these difficulties as organizations across industries have shifted to remote work, creating a more global talent pool. As a result, IT organizations face increased competition for talent from other organizations in the industry and different sectors.

These challenges highlight the need for IT organizations to adapt their recruitment and retention strategies in response to these postpandemic realities. This may include rethinking traditional recruitment methods, finding new ways to engage remote workers, and developing strategies to compete for talent in a more global market. This creates additional emphasis on each company’s branding.
DATA COLLECTION AND ANALYSIS
Glassdoor.com is an online labor market intermediary that allows its users to review companies they have worked for and their management anonymously [17]. Glassdoor is currently the most popular company review site. As in previous academic studies, Glassdoor was used as a data source [17]. Glassdoor’s “Company Reviews” section features six mandatory fields that reviewers complete. In addition, they are encouraged to enter text information, including “pros” (likes, praises) and “cons” (dislikes, complaints). Glassdoor has a five-star rating system for each company in the following categories: career opportunities, compensation and benefits, work–life balance, senior management, culture and values, diversity and inclusion, and an overall rating. The diversity and inclusion rating is new (added after our initial study before COVID-19); therefore, before COVID-19, data on this issue were unavailable.

Similar to a previous study, Glassdoor.com was used [10] to identify the 10 top IT firms that constantly scored with high ratings in the past five years (2017–2022) and the five IT firms with the lowest ratings (see Table 1). Those companies were analyzed to understand the factors that IT employees care about in their evaluation of an employer brand. Glassdoor reports on the best and worst companies annually.

Data scraped using a web crawler for these companies included all positive and negative reviews, containing pros and cons. The resulting dataset featured 94,365 employee reviews. The reviews were split into two groups: before COVID-19 (1 January 2017–29 February 2020) and after COVID-19 (1 March 2020–31 December 2021). Of these 94,365 reviews, 73,456 were related to the top companies, with 31,235 reviews before COVID-19 and 42,221 after COVID-19. In addition, there were 20,909 reviews of bottom companies, with 8042 before COVID-19 and 12,867 after COVID-19. For this study, the data for new employees (those who worked in their respective workplaces for less than a year) were separated.

Similarly, these reviews by new employees were split into two groups, before and after COVID-19. Of the 26,460 reviews associated with new employees, 21,127 involved the top companies, further broken down into two sets: 4691 reviews before COVID-19 and 16,436 after COVID-19. The remaining 5333 reviews were about the bottom companies, again broken down into 1825 reviews before COVID-19 and 3508 after COVID-19. To effectively analyze this volume of data, the use of technology was necessary. The artificial intelligence engine IBM Watson Explorer uses natural language processing and deep learning to analyze text [11]. It was used to analyze the data (the content of the collected reviews), developing a dictionary based on eight employer value propositions (Table 2) informed by the existing literature [3], [4], [6], [7], [8], [20].
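The selection and classification steps described above can be illustrated with a minimal Python sketch. It is not the IBM Watson Explorer pipeline used in the study: the `Review` record, its field names, and the keyword sets are hypothetical assumptions chosen for illustration. Only the 1 March 2020 cutoff, the less-than-one-year tenure filter, and the category names come from the article.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Cutoff from the article: reviews dated 1 March 2020 or later are "after COVID-19".
COVID_CUTOFF = date(2020, 3, 1)

# Hypothetical keyword dictionary. The category names appear in the article,
# but these keywords are illustrative assumptions, not the study's dictionary.
VALUE_KEYWORDS = {
    "Social": {"team", "colleagues", "culture"},
    "Work-Life Balance": {"flexible", "hours", "remote"},
    "Economical": {"salary", "pay", "benefits"},
}


@dataclass
class Review:
    posted: date          # date the review was posted (hypothetical field)
    tenure_years: float   # reviewer's time at the company (hypothetical field)
    text: str             # free-text "pros" or "cons"


def period(review: Review) -> str:
    """Assign a review to the before or after COVID-19 group."""
    return "after" if review.posted >= COVID_CUTOFF else "before"


def tag(review: Review) -> list[str]:
    """Return every value proposition whose keywords occur in the text."""
    words = {w.strip(".,!?;:()").lower() for w in review.text.split()}
    return sorted(cat for cat, kws in VALUE_KEYWORDS.items() if words & kws)


def summarize(reviews: list[Review]) -> Counter:
    """Count tagged categories per period, keeping only new employees."""
    counts: Counter = Counter()
    for r in reviews:
        if r.tenure_years >= 1:   # the study kept reviewers with under a year
            continue
        for cat in tag(r):
            counts[(period(r), cat)] += 1
    return counts
```

Calling `summarize` on a list of `Review` objects yields counts keyed by `(period, category)`, which can then be turned into the kind of per-category percentages the article reports. A real dictionary would need stemming and phrase matching, which this sketch leaves out.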
TABLE 1. Glassdoor’s top and bottom IT firms. (Source: Glassdoor.com.)
IT Companies
Top: Adobe, Apple, Cisco Systems, DocuSign, Google, LinkedIn, Microsoft, Nvidia, Salesforce, Zoom Video Communications
Bottom: Spectrum, CDK Global, CompuCom, Frontier Communications, Xerox

Table 3 demonstrates top- and bottom-rated IT companies’ ratings before and after COVID-19. It is important to note that Glassdoor only requested responses covering five categories for new hires before COVID-19, but after COVID-19, Glassdoor added another category (diversity and inclusion). Therefore, there are no results available for this category before COVID-19. One result that can be easily observed is that employees’ satisfaction with their work and their employer has increased overall and in every category for the bottom IT companies, even slightly higher when compared to our previous study results. However, this was not the case for the top companies. They showed a decline in every category analyzed, demonstrated by negative changes in their ratings. Therefore, the employees of the bottom-rated companies expressed more positivity after COVID-19 in almost every category and overall, in contrast to the top companies, which received more negative reviews after COVID-19. In addition, the “Career Opportunities (Development)” category displayed the highest positive change (7%) among categories for the bottom companies. Conversely, it had the opposite outcome for the top companies, displaying the most significant negative change alongside the “Work–Life Balance” category: a 4% decrease.
A total of 21,127 user reviews regarding the top companies and 5333 reviews for the bottom companies were analyzed. The results were grouped and classified based on the eight value propositions explained in this article and extensively in previous studies [8]. Table 4 compiles the percentage calculations for the top and bottom companies, including praise and complaints, before and after COVID-19. The results are displayed in two sections. The values on the left side are based on the top-performing IT companies, and those on the right are based on the bottom-performing IT companies. In each section, the data have been split between praise
TABLE 3. Star rating comparison of top and bottom IT companies before and after COVID-19.
[Table residue; cell values not recoverable. Recoverable column labels: Complaints, Before COVID-19 (%), Difference (%). Recoverable row labels: Brand image, Economical, Application, Employees.]
after COVID-19 (1 March 2020–31 December 2021). As expected, the results for the top and bottom com-
ries with the most significant changes (identified by different colors). They demonstrate the percentage of
were related to the “Social” and “Work–Life Balance” categories and, to a lesser degree, the “Interest” and “Application” categories. For top-performing compa-
This article introduces the concept of statistical model checking (SMC). This approach
elegantly combines the concepts of formal verification and formal models with those
of simulation. The article also illustrates the potential of the approach in various
applications, ranging from validating complex requirements to secure goal planning.
1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3314327
Date of current version 3 November 2023.

Computer systems play a major role in our everyday life. We can observe this directly through the computers and phones present in our homes. However, the major role of information systems is in decision-making processes ranging from payroll management and employee evaluations to the automation of strategic processes for companies and public authorities. With the arrival of generative artificial intelligence (AI), this situation will amplify. For all these reasons, it is essential that information systems behave correctly and that this is duly verifiable and provable.

To achieve this objective, European and American agencies (among others) are developing various certification processes that depend on the type and sensitivity of the system. All these methodologies have one thing in common: the deployment of techniques for validating the functional and nonfunctional requirements of the system from the design stage.

It is true that certain problems can only be detected during implementation (for example, security flaws due to the choice of language) and that the compliance of the implementation with its specifications must be validated by means of tests. It is, however, strategic to validate (part of) the system as soon as it is designed. Indeed, detecting all the problems during implementation would be too expensive and would require reversing the entire system design process.

The first approach used to validate the conformity of a system is simply to test it. This approach is widely deployed in companies, and it is often combined with powerful AI techniques. However, it is considered insufficient in the process of strategic certifications because 1) it makes it possible to detect bugs but gives little certainty about their possible absence; 2) it allows us to validate only trivial properties and it ignores, for example, causalities between events; and 3) the results vary as soon as parallelization is present, which is common for “multicomponent” systems.

Another way to validate a system is to use formal methods whose objective is to detect all the bugs or even to prove their total absence. In this context, model checking (MC) [2] plays a major role. This approach explores all behaviors of the system via complex enumeration techniques. To achieve this objective, MC represents the behaviors of the components of the system by a directed graph called a transition system (TS). There, nodes represent states and links represent transitions between states. Those links can be decorated with information, which makes it possible, for example, to synchronize the TS of several components of the same system, or to represent the time that passes to change the state and/or the probability that this change occurs. The requirements of the system are then represented by so-called linear temporal logic (LTL) formulas, which make it possible to issue hypotheses on the (sequences of) states of the system.

As an example, Figure 1 shows an extended TS where behaviors depend on both nondeterministic and stochastic actions. The system is equipped with two states (S0, labeled by local property w, and S1, labeled by local property w) and five transitions. The edge between S0 and S1 means that the environment follows an action a1, and then the system goes to state S1 with probability 1 − p1 (which may, for example, represent the guarantee on the hardware). The temporal
highlighted with engineering inspections. Other examples include railway, automotive, and space applications. In addition to being used in industry (formally or not), the approach has been promoted in many European projects, where it has helped create a link between academic contributions and industrial applications, and thus consolidate the bridges between these two worlds.

The SMC approach is considered to be so efficient that it has even been promoted for the validation of biological systems (e.g., for the prediction of treatment results depending on the chosen potiology) in CMACS (see http://cmacs.cs.cmu.edu/), one of the largest projects in the National Science Foundation, led by Ed Clarke (recipient of the 2007 ACM A.M. Turing Award). Beyond its purely validating character, SMC can also be used in more exotic processes. For example, the devices for assisted living (DALI) project [5] used a combination of SMC with planning processes to guide an elderly person in a shopping mall (see Figure 2). The objective of DALI was to ensure that the person would visit all the stores they wanted (long-term objectives), but also that they would not encounter any disturbing events in their environment. This short-term objective includes any user preference, such as avoiding getting too close to a group of young people or a dog. In this context, SMC is used throughout the guidance to verify on the fly that the long-term planner guarantees the preferences of the person within a short neighborhood (5 m) with the highest possible probability: in short, that the best comfort is guaranteed for a person during their entire visit to the shopping center. Other similar experiments use SMC to verify on the fly that the decision made by an autonomous vehicle is the least risky possible.

However, despite its obvious usefulness, SMC itself is the target of criticism and faces difficulties. As an example, the number of simulations needed to achieve good accuracy may be too high, which might rule SMC out for sensitive applications such as automotive guidance, where decisions need to be taken instantaneously. Moreover, simulating a system can take time, even if it is a model. For example, simulating the life of a biological organism to the nearest thousandth of a second is not easy. There are also industries that remain resistant to SMC because they think they are already doing it and that there is no reason to invest in new research in this area. This perception is quite understandable. Indeed, when one reads the functional description of SMC, one might wonder what novelty is brought to manufacturers and what the differences are compared to well-adopted simulation processes. The answer lies in the fact that the SMC approach exploits two important characteristics from the formal world, namely, the TS and the runtime monitoring of LTL properties.
First, these representations make it possible to define new processes that increase the efficiency of the algorithm by quickly and efficiently generating simulations according to the properties to be validated. This, for example, is the case with importance splitting techniques, which guide the simulation by using the structure of the property and the causality between the interactions within the system (for example, the relationship among components involved in the property) to “go” toward the places where the bug is most probable. Currently, much research focuses on the efficiency of the statistical algorithm. In addition to the exploitation made from the link with a TS and LTL, research is also being carried out to replace the Chernoff bound, the basis of Monte Carlo, with more efficient approaches such as stopping algorithms. Other work focuses on speeding up individual simulations as well as exploiting distributed architectures to produce more simulations without biasing the process (which could happen if a machine were either faster or more oriented than another one).

A second strength of SMC, which is common to MC, lies in the power of the expressiveness of TS and LTL formulas. Indeed, through extensions such as timed automata or Markov decision processes, a TS makes it possible to represent in a standardized way an abstraction of the behavior of complex systems. It follows that any result obtained on a TS can be verified by other teams independently and indisputably. Currently, there is a lot of work that focuses on the application of SMC to extensions of a TS. A strong focus is on systems that combine probabilistic actions with nondeterministic choices coming from an unknown environment. Current work is all moving toward combinations of SMC to process system validation with AI algorithms to predict the most (un)favorable nondeterministic behavior to the system. This prediction makes it possible to synthesize in which context the system can be used in a safe way.

A third strength of SMC is that it relies on runtime verification techniques to verify LTL and LTL extensions. These techniques are robust, subject to spinoffs (which guarantee long-term studies), and apply to both models and code. It also means that SMC is constantly adapting to new types of properties (such as inclusion of real time, cost of execution, and so on). Moreover, these runtime techniques quickly make it possible to either produce a state sequence, which shows a failure

Given its simplicity and the services it provides, this approach has been widely adopted by the industry and promoted by the academic community. This provides a solid foundation to allow SMC to evolve in the future.

THE FUTURE OF SMC
In the previous section, the application of SMC on formal models was extensively discussed. However, nothing prevents the application of SMC directly on implementations. The only prerequisite is that one can give semantics in the form of a TS to the implementation (the latter does not need to be calculated and already exists from testing/MC methods applied to code) and that the runtime validation process exists for the language under consideration. There exists initial work on applying SMC to validate C code. Thus, SMC can be the engineer’s companion, both at the level of validation to design and validation to implementation.

Many of the benefits of SMC and much future work have already been covered. As with any formal-based approach, the biggest challenges of the approach will remain its industry acceptance and continued integration into verification processes. These challenges cannot be solved by research alone. Collaboration, dissemination, and training will be the keywords to achieve its objective.

REFERENCES
1. P. R. D'Argenio, A. Legay, S. Sedwards, and L.-M. Traonouez, “Smart sampling for lightweight verification of Markov decision processes,” Int. J. Softw. Tools Technol. Transfer, vol. 17, no. 4, pp. 469–484, Aug. 2015, doi: 10.1007/s10009-015-0383-0.
2. E. M. Clarke, O. Grumberg, D. Kroening, D. A. Peled, and H. Veith, Model Checking, 2nd ed. Cambridge, MA, USA: MIT Press, 2018.
3. A. Legay, A. Lukina, L.-M. Traonouez, J. Yang, S. A. Smolka, and R. Grosu, “Statistical model checking,” in Computing and Software Science. Springer-Verlag, 2019, pp. 478–504.
4. K. G. Larsen and A. Legay, “30 years of statistical model checking,” ISoLA, vol. 12476, pp. 325–330, 2020.
5. A. Colombo, D. Fontanelli, A. Legay, L. Palopoli, and
of the system, or to extract a set of sequences con- S. Sedwards, “Efficient customisable dynamic motion
forming to the specification of the latter. Currently, the planning for assistive robots in complex human
work focuses on AI processes which, from a set of bug- environments,” J. Ambient Intell. Smart Environ., vol. 7,
free traces, can synthesize an LTL that represents no. 5, pp. 617–634, 2015, doi: 10.3233/AIS-150338.
them. Such a formula can then be used to refine the
requirements satisfied by the system. AXEL LEGAY is a professor of cybersecurity and software
As seen in this section, SMC combines the best engineering at Catholic University of Louvain, 1343, Louvain-
of formal methods, runtime verification, and statistics. La-Neuve, Belgium. Contact him at [email protected].
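As an illustration of the Chernoff bound that the SMC discussion names as the basis of Monte Carlo estimation, the following minimal Python sketch computes how many simulations are needed for a given precision and confidence, and then estimates the probability that a property holds. It assumes the standard Chernoff-Hoeffding form N >= ln(2/delta) / (2 * epsilon^2); the function names and the toy system `toy_run` are hypothetical and are not taken from the article or from any SMC tool.

```python
import math
import random


def chernoff_sample_size(epsilon, delta):
    """Number of runs N so that P(|estimate - p| > epsilon) <= delta.

    Uses the Chernoff-Hoeffding bound N >= ln(2/delta) / (2 * epsilon^2).
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_probability(simulate_once, epsilon, delta, rng):
    """Monte Carlo estimate of the probability that one run satisfies the property."""
    n = chernoff_sample_size(epsilon, delta)
    successes = sum(1 for _ in range(n) if simulate_once(rng))
    return successes / n, n


def toy_run(rng):
    """Hypothetical system: 20 steps, each failing with probability 0.01.

    The checked property is "no failure occurs during the run".
    """
    return all(rng.random() >= 0.01 for _ in range(20))


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    estimate, runs = estimate_probability(toy_run, epsilon=0.05, delta=0.01, rng=rng)
    print(f"{runs} runs, estimated probability {estimate:.3f}")
```

The number of runs depends only on the requested precision and confidence, not on the size of the model, which is why a fixed-size bound is simple to apply. Stopping algorithms, mentioned in the text as a more efficient alternative, instead decide after each run whether enough evidence has been gathered.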
1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3314326
Date of current version 3 November 2023.
All executives are competitive. They generally see “business” as a blood sport where the best products and services “win” or “lose.” But the marketplace is not the only place where product and service wars are won. Sometimes they’re won by just watching and studying what their competitors are doing.
Here are five questions:
1) Can you name all of your competitors?
2) Do you know what your competitors are doing?
3) Do you know what your competitors are planning?
4) Do you know who the “new entrants” are?
5) Does “competitive intelligence” influence your strategy and tactics?
If you answer “no” to any of these questions, you’re missing huge tactical and strategic opportunities. You’re also risking some unpleasant calls from board members when they see the competition doing some things that make them nervous. Major risk comes from “new entrants,” the competitors who sneak around until they’re ready to pounce on market leaders.
COMPETITIVE INTELLIGENCE
To answer the aforementioned five questions, companies need to treat “competitive intelligence” as an ongoing part of their strategic planning process. But what is competitive intelligence?
Bloomenthal1 defines it this way:
“Competitive intelligence, sometimes referred to as corporate intelligence, refers to the ability to gather, analyze, and use information collected on competitors, customers, and other market factors that contribute to a business’s competitive advantage. Competitive intelligence is important because it helps businesses understand their competitive environment and the opportunities and challenges it presents. Businesses analyze the information to create effective and efficient business practices.”
He also recognizes two “silos” of competitive intelligence:
“Competitive intelligence activities can be grouped into two main silos: tactical and strategic. Tactical intelligence is shorter-term and seeks to provide input into issues such as capturing market share or increasing revenues. Strategic intelligence focuses on longer-term issues, such as key risks and opportunities facing the enterprise.”
Now that the stage has been set, let’s talk about what C-suites should do to conduct competitive intelligence.
PROCESSES
There’s advice everywhere about how to conduct competitive intelligence. Let’s start with how the U.S. intelligence community collects intelligence on its friends and adversaries. Executives unfamiliar with the Central Intelligence Agency, National Security Agency, and Defense Intelligence Agency, among other U.S. intelligence
agencies, should note the intelligence process they employ; it consists of five steps:
1) Identify the intelligence targets.
2) Collect data and information.
3) Process the data and information.
4) Analyze and produce intelligence.
5) Disseminate the intelligence.
These steps become the framework for corporate competitive intelligence:
- Targeting is important. Four of the five questions at the beginning of this article apply: Who are your competitors? Do you know what your competitors are doing? Do you know what your competitors are planning? Do you know who the “new entrants” are? The answers to these questions are the targets of the competitive intelligence process.
- Collection is the process of finding data and information about what existing competitors and potential new entrants are doing and planning. Here’s where the synergism is clear between what U.S. intelligence agencies and companies do. Open source data (earnings calls, press releases, product announcements, social media posts, sales processes, job announcements, and so on) are all sources of potentially valuable data. But companies can also collect data and information in other ways, such as through employees, ex-employees, Wall Street analysts, litigators, venture capitalists, start-ups, and so on.
- Processing is data/information integration and analysis (also known as analytics), with which companies are very familiar. Analytics is now the lifeblood of many companies as it is with the U.S. intelligence community. Note that processing includes all “hard” and “soft” data, otherwise known as structured and unstructured data, respectively (see the “Competitive Intelligence Methods, Tools, Technologies, and Platforms” section for a discussion of the technologies that enable processing).
- Production refers to the reports that answer the targeting questions: Who are your competitors? Do you know what your competitors are doing? Do you know what your competitors are planning? Do you know who the “new entrants” are? These “work products” are the output of the competitive intelligence process.
- Dissemination refers to the distribution of the intelligence to all interested parties, especially those in the C-suite.
The corporate competitive intelligence process resembles the U.S. intelligence process, although it falls well short of applying some of the tools, techniques, and technologies (including balloons) that countries use to “spy” on each other.
COMPETITIVE INTELLIGENCE METHODS, TOOLS, TECHNOLOGIES, AND PLATFORMS
Which methods, tools, technologies, and platforms enable the corporate competitive intelligence process? Because intelligence is anchored in data, all of the current and emerging data analysis tools, technologies, and platforms are necessary for analysis and production. But, as always, it all begins with targeting. Companies should rank-order their tactical and strategic targets. In an era of disruption, companies should pay special attention to new entrants or competitive products and services that leave their swim lanes. For example, some companies add services to their products even though their brand is product based. Some companies add new services over time, like Amazon and all of the cloud computing providers. When Amazon started, it focused on books. Now it hosts a pharmacy. Airbnb, Uber, and Lyft came out of nowhere, at least from the perspective of the hospitality and transportation industries.
Other technologies are especially important, such as artificial intelligence (AI); machine learning (especially natural language processing); generative AI; social media analytics; explanatory, descriptive, predictive, and prescriptive analytics; and data science, among other existing and emerging technologies that focus on the collection, analysis, production, and dissemination of intelligence.
Note also that there are lots of tools and platforms that help support the competitive intelligence process. “Platforms” embed methods, tools, techniques, and technologies into applications that companies can use to conduct the intelligence process. Some of these platforms are designed around function, like marketing intelligence; some around data, like social media intelligence; and some around specific vertical industries, like health care. The Gartner Group,2 for example, lists the competitive analysis tools and platforms that many companies use. Some companies specialize in vertical industry competitive intelligence, such as pharmaceutical and life science competitive intelligence. These platforms collect and analyze data, and then generate “dashboards” that present intelligence work products that can be disseminated with a keystroke. Companies should avail themselves of these methods, tools, techniques, technologies, and platforms where it all comes together.
Digital Object Identifier 10.1109/MITP.2023.3322609
The global landscape has been reshaped by the COVID-19 pandemic, causing unprecedented health, economic, and societal disruptions. In this transformative period, technology emerged as a vital lifeline, serving as both a shield against the virus and a catalyst for adapting to new challenges. Digital contact tracing frameworks rapidly emerged and became popular worldwide, thus enabling swift identification of potential exposure and containment of infections. In addition, the subsequent development of digital COVID certificates for vaccinations, immunity, and testing facilitated the safe resumption of daily activities and cross-border travel.
In fact, in recent years, technological advancements stemming from AI, as well as the use of technologies like the Internet of Things (IoT) or blockchain, have propelled the development of innovative solutions to combat COVID-19. Such technologies have facilitated real-time monitoring of infected individuals and the identification of outbreaks, enhancing our capacity to respond effectively. However, the massive deployment of such technological solutions exacerbates security and data protection challenges. That is, ensuring the privacy of individuals, safeguarding data integrity, and enhancing cybersecurity to adapt to evolving work paradigms have become paramount concerns. The rapid digital transformation that accompanied the pandemic’s challenges brought about not only opportunities but also significant cybersecurity and data protection challenges. As societies globally navigated these uncharted waters, it became evident that addressing these issues thoughtfully and comprehensively was essential. Indeed, the pandemic demonstrated that cybersecurity and data protection aspects go beyond technology and have legal and social implications that will continue evolving in the future.
While the World Health Organization declared “with great hope” an end to COVID-19 as a public health emergency in 2023, our society will have to face new emergency situations in the coming decades. We hope that the technological advancements of recent years and those to come, as well as the lessons learned from the COVID-19 pandemic, will serve to develop effective responses while cybersecurity and data protection are still properly addressed from a more multidisciplinary perspective, considering social and legal aspects. Indeed, we must ensure that cybersecurity and data protection remain at the forefront of our strategies, considering the complex interplay between technology, societal state transitions, and emergency preparedness.
IN THIS ISSUE
The articles in this special issue highlight recent contributions in the areas of cybersecurity and data protection related to COVID-19 and the postpandemic world.
“Blockchain-Based Mechanism for Smart Record Monitoring During and After the COVID-19 Pandemic,” by Geetanjali Rathee et al.,A1 provides an overview of security and privacy mechanisms for facilitating efficient communication, decision making, planning, information recording, and management using smart devices in postpandemic scenarios. Furthermore, the authors explore the utilization of blockchain technology to establish a secure and trustworthy communication network.
In “The Changing Landscape of Privacy-Countermeasures in the Era of the COVID-19 Pandemic,” Sherali Majeed and Seong Oun Hwang1 analyze the aspects related to data privacy due to the COVID-19 pandemic and present developed countermeasures to address these issues. The importance of implementing privacy strategies in epidemic-handling systems is emphasized, and key lessons for future similar pandemics are provided. This article was inadvertently published in the July/August edition of IT Professional. As such, it is available there to round out this special issue with a premature preview.
In “Enhancing Communication Among Remote Cybersecurity Analysts With Visual Traces,” Chen Zhong et al.A2 introduce an approach to enhance communication in collaborative cybersecurity analysis during the postpandemic era. This is done by utilizing visual traces of experts’ analytical processes. Their method addresses the challenges of remote work and demonstrates its benefits through a case study. This provides insights for organizations aiming to improve communication in remote teams during collaborative problem solving in the postpandemic era.
All transformations that accompanied the pandemic’s challenges brought about not only opportunities but also significant cybersecurity and data protection challenges. As societies globally navigated these uncharted waters, it became evident that addressing these issues thoughtfully and comprehensively was essential. Indeed, the pandemic demonstrated that cybersecurity and data protection aspects go beyond technology and have legal and social implications that will continue evolving in the future.
“Ransomware Attacks of the COVID-19 Pandemic: Novel Strains, Victims, and Threat Actors,” by Zubair Baig et al.,A3 analyzes the organizational vulnerabilities to threats exposed by the COVID-19 pandemic, focusing on the rise of ransomware attacks. It analyzes popular ransomware attacks during the pandemic and highlights the importance of preventing malware spread in corporate networks. The work identifies impactful ransomware strains and emphasizes the need for preventive and security measures.
REFERENCE
1. A. Majeed and S. O. Hwang, “The changing landscape of privacy–Countermeasures in the era of the COVID-19 pandemic,” IT Prof., vol. 25, no. 4, pp. 52–60, Jul./Aug. 2023, doi: 10.1109/MITP.2023.3287876.
PAOLO BELLAVISTA is a full professor with the Department of Computer Science and Engineering, Alma Mater Studiorum, University of Bologna, 40136, Bologna, Italy. Contact him at [email protected].
GEORGIOS KAMBOURAKIS is a full professor with the Department of Information and Communication Systems Engineering, University of the Aegean, 83200, Karlovasi, Greece. Contact him at [email protected].
JASON R.C. NURSE is a reader in cybersecurity with the School of Computing at the University of Kent, CT2 7NF, Kent, U.K., and the Institute of Cyber Security for Society, U.K. Contact him at [email protected].
The COVID-19 pandemic has highlighted the importance of introducing
intelligent devices to improve living standards. Technological advancements and
inventions have proven particularly beneficial during times when face-to-face contact is
limited. Machine vision, as a cutting-edge paradigm, has emerged as a significant tool in
handling various COVID-19 and post-COVID-19 situations. Intelligent devices aim to
automate processes and enhance society’s quality of life by reducing reliance on human
intervention. However, security concerns and technological advancements require
organizations and businesses to adopt robust security and privacy mechanisms instead
of solely relying on intelligent and smart measurement methods. This article aims to present
a comprehensive overview of security and privacy mechanisms for facilitating efficient
communication, decision making, planning, information recording, and management using
smart devices in postpandemic scenarios. Additionally, the article offers a summary of
potential techniques and basic solutions for secure communication in the future.
Digital Object Identifier 10.1109/MITP.2023.3310650
The COVID-19 pandemic has brought about a paradigm shift in the daily lives of individuals.1,2 It provided a taste and tempering of technology in a world where individuals were afraid of communicating with each other physically. Technological advancements and innovations have enabled people to continue working remotely through smart and technological means, even in the midst of widespread shutdowns. Moreover, technology has played a vital role in effectively managing post-COVID-19 situations and preventing the transmission of infections from one individual to another.3,4 To control the spread of the virus, smart devices and measurements have been deployed to diagnose and identify infected individuals globally. These smart measurements represent the latest and most significant advancements in communication strategies.5
Organizations are converting their existing communication technologies to adopt intelligent, robotic, and smart-based phenomena to gain more benefits. They have benefited and improved their communication processes through technology by adopting smart and intelligent communication devices in various fields.6 The smart technologies are being adopted by various organizations in almost every field, such as health care, industries, Society 5.0, etc. In this article, we consider health-care applications and their communication processes using smart devices.
Smart measurements have been adopted in health-care systems, where the recording, storing, and analysis of information are performed via smart devices. The prescription written by a doctor, a patient’s visit to the doctor, and the corresponding medical tests are recorded and analyzed by smart devices. Machine vision (MV) is one of the technologies integrated with smart measurements to provide automatic decision making while connecting with sensors.7,8 MV attempts to integrate with existing technologies and capture real-time images for automatic analysis and
information sensing in their surroundings. Accurately processing the recognized images enables more precise analysis and surveillance recordkeeping within the network.
In the context of health-care applications, patient information, such as doctor visits, prescriptions, medical tests, and operations, is efficiently managed using a range of smart and intelligent devices. E-health-care systems play a pivotal role in enabling seamless access to and organization of this information through the Internet. This integration ensures the streamlined and secure management of health-care data, promoting effective communication and coordination among patients, health-care providers, and medical facilities.9 A complete health-care mechanism having a number of smart devices that communicate and make independent and autonomous decisions is depicted in Figure 1.
Figure 1 represents the secure communication and transmission of information among entities when the record of a patient is being transferred from one place to another. Whenever it is needed, the patient may refer to another doctor for his/her satisfaction to change their prescriptions, or, sometimes, doctors may refer the patient to another doctor located in a different hospital at a different place. Instead of subjecting patients to redundant tests or starting the diagnosis process from scratch, the comprehensive patient report can be effortlessly and securely handled and transferred within the network using smart devices.
This advancement in technology eliminates the need for repetitive procedures, saving time and resources while ensuring the secure exchange of crucial medical information. In addition, the processing and recording, along with integration with the intermediates, can be done via smart e-health-care systems. Further, the involvement of the blockchain network in the communication process strengthens e-health management by providing high-end security and transparency
while moving the information from one location to another. Smooth and secure communication among intelligent devices can be easily provided by ensuring a trust-based transmission process among the devices.
SIGNIFICANCE OF THE RESEARCH
MV is defined as one of the latest technologies that helps in dealing with and handling the post-COVID-19 world, where the recording of a patient’s data can be done via smart systems. However, the smart recording, management, and storing of information can be further altered and modified by a number of intruders, present inside the network, for their own benefit. Let us consider a case study where the record of a patient, including a doctor visit, a prescription, a medication, and an operation, is handled via the Internet and smart devices. The security, in this case, can be easily breached by altering one of the sensors present in the network to degrade the network performance in a variety of ways.10 The breached information can be in the hands of insurance companies, whose goal is to target those persons whose ages are between 30 and 40 years and who have no major medical issues. In addition, the security can be breached in the situation where a doctor, for their personal benefit, forces their patients to go to their recommended lab or medical store even though the facility does not have great reviews.11 Therefore, it is necessary to determine a secure and efficient communication mechanism by legitimating the smart devices in the network.
CONTRIBUTIONS
To analyze the legitimacy of each smart device involved in the communication/transmission of information and responsible for handling and dealing with post-COVID-19 situations, it is necessary to focus on the major criticality and concerns corresponding to it. A single wrong decision or incorrect identification of a person may lead to the rapid spread of a virus within the coming 24 hours.12,13 This article presents a valuable contribution in the form of a comprehensive survey of various security schemes and proposals aimed at establishing an efficient communication mechanism. Moreover, the integration of existing technologies with MV has proven to be highly beneficial for COVID-19 patients. This integration allows for the capture of real-time images, enabling automatic analysis and information sensing to accurately process the recognized images.
With this approach, patients are spared from redundant tests, and the diagnosis process becomes more streamlined by eliminating the need to start from scratch each time. The entire patient report can now be effortlessly and securely handled and transferred within the network using smart devices, ensuring the quick and reliable dissemination of crucial medical information. This advancement greatly enhances the overall efficiency and effectiveness of health-care services for COVID-19 patients. In addition, the contributions of the entire article are structured as follows:
- A number of security schemes and proposals that ensure secure communication among intelligent devices are discussed along with their limitations.
- A number of open challenges and future directions for ensuring secure communication by understanding the behavior of each device are presented in this article.
- A secure and transparent intelligent communication framework for an e-health-care system using the blockchain mechanism is further elaborated, and its accuracy metrics that directly affect the overall performance of the network are shown.
This article aims to emphasize the existing research and its limitations while proposing a transparent and secure communication mechanism for COVID-19 patients. The objective is to establish an accurate and expedited diagnosis process. By addressing the shortcomings of current approaches, this article seeks to enhance the efficiency and reliability of communication channels, ensuring transparency and security in the diagnosis of COVID-19 cases.
The remainder of the article is organized as follows. The related work is discussed in the next section; a number of scientists and researchers have proposed numerous security measures. The open challenges and a proposed framework are provided in the “Open Challenges” and “Proposed Solution” sections. Finally, the “Conclusion and Future Directions” section concludes the article and discusses the future scope of the work.
RELATED WORK
The present section presents a number of security and privacy proposals suggested by various researchers, scientists, and engineers. The security schemes of smart devices in health-care mechanisms using blockchain technology are considered as some of the latest solutions proposed by various scientists. Table 1 depicts a literature survey of various security schemes in health-care applications using smart devices.
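To make the tamper-evidence behind the blockchain-based schemes discussed in this section more concrete, the following minimal Python sketch links patient records into a hash chain and detects later alteration. It is an illustrative assumption only, not the framework proposed in this article or in any of the surveyed works: it has no consensus, no network, and no access control, and the names `block_hash`, `append_record`, and `verify_chain` as well as the record fields are hypothetical.

```python
import hashlib
import json


def block_hash(index, record, prev_hash):
    """SHA-256 digest over the block contents, serialized deterministically."""
    payload = json.dumps(
        {"index": index, "record": record, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain, record):
    """Append a record, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "record": record,
        "prev_hash": prev_hash,
        "hash": block_hash(index, record, prev_hash),
    })


def verify_chain(chain):
    """Return True only if every block is unmodified and correctly linked."""
    prev_hash = "0" * 64
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["record"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    chain = []
    append_record(chain, {"patient": "P-001", "event": "doctor visit"})
    append_record(chain, {"patient": "P-001", "event": "prescription"})
    print(verify_chain(chain))   # chain as written
    chain[0]["record"]["event"] = "operation"  # simulated alteration by an intruder
    print(verify_chain(chain))   # chain after tampering
```

Because each block stores the hash of its predecessor, changing any earlier record invalidates that block's stored hash, so an altered record is detectable by anyone who re-verifies the chain.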
Haddad et al.14 focused on blockchain technology for solving the data management issues of health care by incorporating machine learning techniques and features. The authors used convolutional neural networks along with blockchain technology to improve the data management results. The authors claimed 98.67% and 96.02% accuracy and recall, respectively, in learning the high-level semantics of medical records for undertaking and assisting with valuable diagnoses.
Lavigne et al.15 proposed a real-time health care using blockchain techniques by supporting the lifecycle of the patient and improving its quality. The proposed approach uses consensus and a centralized prescription method for collecting health data. The authors provided data access control for restricting the persons granted access. The performance of the proposed mechanism was analyzed using the Hyperledger Fabric tool to investigate the communication features of the network.
Younis et al.16 provided a holistic solution for securely preserving and accessing data information in health-care applications. The security of Internet of Things devices was measured through blockchain by granting and revoking their permissions and privacy regulations. The authors further proposed a novel authentication and secure communication protocol for interacting with patients and cloud services through smart contracts. The proposed mechanism was analyzed using the automated validation of Internet security-sensitive protocols and applications.
Wang et al.17 investigated data efficiency and enhancement criteria by following three lightweight approaches. The authors used a lightweight machine learning method along with a trained model on the server layer to ensure data efficiency. In addition, they used a streaming layer to score the incoming data by improving the analytical framework using a trusted vision system through blockchain.
Liu et al.18 analyzed scientific publications and patents to identify the trends of blockchain in health care over the last five years. The authors adopted quantitative and qualitative approaches to discover the temporal and theme dynamics in industry and academia. The output results shed light on MV and challenges for future works.
Al-Jaroodi et al.19 defined the Health 4.0 objectives by discussing advances in traditional health-care sectors. The authors categorized health-care applications into four groups: targeted, professional, system, and resource management. They conducted a study of diverse applications aimed at supporting services in achieving their specific goals. The authors also suggested a creative middleware framework to offer service-oriented applications to facilitate services.
Yang et al.20 proposed a cyberphysical system (CPS)-based home-care robotic system for reviewing the related technologies, including sensing, artificial intelligence, cloud computing, and materials and machines for mapping and capturing information. The authors discussed future perspectives related to the issues and challenges faced by robotic systems in Health Care 4.0.
Even though a number of security-based mechanisms and approaches have been proposed by various
OPEN CHALLENGES
In this section, we discuss a number of security and privacy concerns when transmitting sensitive information among clients in the network. The information is transmitted among various devices or clients in the network that are already authenticated by the trusted authorities. After a successful authentication process, the devices can share both regular and sensitive information in the network. The regular information is the identification of the device, such as its ID, name, organization, department, etc., while the sensitive information is what the devices share among themselves by broadcasting or sharing the message (via text, audio, video, etc.) in the network. Even though the communicating devices are authentic, the information transmitted in the network can still be breached and compromised by intermediate parties. Many security concerns can further arise while sharing, analyzing, or transmitting the patient’s records in the network.
In Figure 2, we observe the exchange of device information and sensitive records among multiple entities, accompanied by potential security threats. The depicted threats encompass data alteration, replay attacks, data falsification, masquerading, and distributed denial of service (DDoS). These threats pose significant risks and can be perpetrated by intruders, leading to potential disruptions or unauthorized manipulations of the data.
Whether occurring during network transmission or introduced later by malicious actors, these threats present serious concerns for the overall security of the system. The visual representation in Figure 2 offers insight into the possible methods utilized by attackers to execute these threats when transmitting information across the network and among various clients. Understanding these possibilities is crucial for devising robust security measures to safeguard against potential breaches and ensure the integrity and confidentiality of the shared information.
FIGURE 2. Possibility of threats during information transmission.
DDoS is considered one of the significant types of threat, where heterogeneous devices that are recording a huge amount of information automatically can be easily captured by intruders. The intruders may compromise one device, which may lead to other hops to distribute their load and perform jamming or network congestion by simply consuming resources in the network. Later on, the data falsification threat is considered to be one of the dangerous attacks, where the altered smart devices modify the information or simply divert the data to any third party that can benefit from them for their own purpose. The following list discusses a number of open challenges and future directions that need focus to ensure a secure and private remote communication mechanism in health-care applications using smart devices.
- Sensitive information to insurance companies: Insurance companies make money by insuring the young age group with no serious medical issues. Sometimes, they make a deal with one of the intermediates in the hospital who can easily provide a patient’s records to them. Once they have sensitive information about a client, they can easily target them and get them to agree to purchase insurance by offering the best plans and ideas.
- Money margins by intermediators: Sometimes, for tests and medicines, doctors recommend medical labs or medical stores that have low grades but make their huge margins in the market. This may directly affect stakeholders, and patients may suffer greatly.
Data falsification threat: This can be considered as one of the significant types of attacks, where smart devices are compromised by intruders and act maliciously while analyzing or recording information in the network.

Masquerade threat: In the masquerade threat, an intruder device assumes the identity of a legitimate device to gain network services.

Energy consumption: The amount of energy required to transfer information from one place to another may further consume resources in the network. The compromised devices consume too much energy and degrade the overall performance of the network.

Data accuracy: Data accuracy is considered to be another significant threat, where devices may not be able to provide accuracy while sharing information if they are compromised by intruders.

Information modification: The recorded information of a patient can be further compromised or breached by intermediators who sell the information to any third party, such as insurance people or scientists for their research, to make profits. People who are delivering medicines, those maintaining patient records, or those doing the patient tests prescribed by doctors can further utilize such benefits for their growth.

While researchers and scientists have proposed numerous cryptographic, algorithmic, and analytical trusted methods to enhance security, the advancements in today’s environment, where wireless networks and smart devices dominate communication procedures, present new challenges. The involvement of smart devices reduces human efforts by correctly automating the surroundings and taking actions depending on them, but this also leads to a number of security concerns in the network. The smart devices can be compromised by intruders for their own benefit. Further, the amount of time and anonymous intermediate entities involved during the communication and decisions represent additional security gaps that an intruder can use for malicious and dishonest purposes. Therefore, it is much needed to focus on current scenarios. Blockchain technology, currently, can be considered as one of the latest and most significant methods for analyzing the trust and communication process transparently. A number of security schemes have already been highlighted by scientists and researchers for maintaining transparency and improving decision accuracy using blockchain. However, there are still a number of security concerns that need to be treated while using blockchain techniques for maintaining the records of patients in health-care systems.

Validation procedure: Though blockchain can be considered as one significant solution that provides transparency and security in the network, the validation time by the nodes may invite a number of security issues in the network.

Blockchain handling permission: The administrator who is allowed to modify or surveil the blockchain should be managed at various levels to distribute the load in the network.

Storage of information in blockchain: The block of information in a blockchain should be handled in such a way that it does not further cause any kind of storage overhead or congestion in the network.

Though various techniques have been proposed by various researchers and scientists, it is necessary to propose some security solutions to further improve the communication process in a very secure manner.

PROPOSED SOLUTION
To maintain a very secure and transparent communication mechanism, including reduced time and storage overhead, this article presents a blockchain-based record maintenance framework for health-care systems. The proposed mechanism is depicted in Figure 3, which shows a number of smart devices that are maintained using blockchain at various levels.

Blockchain of Entities
Figure 3 shows a number of entities that are involved in maintaining the blockchain: patient records, doctors, pathology labs, and so on. The present blockchain is permissioned and private, where the complete blockchain is maintained by an owner having subcategories who gives the permission to do updates in the corresponding blockchain. The patient, doctor, medical stores, and pathology labs may maintain a blockchain that is further separated into other blockchains. All of the blockchains are dependent on each other, and the patient is prioritized: before any modifications are made, everyone has to inform or receive permission from the patient.

Figure 3 shows a permissioned blockchain where doctors are directly connected with their patients. Whenever it is necessary to move or change the location of a patient, either as prescribed by the doctor or because of the patient’s wish to change his/her doctor, permission will be asked of the owners (patients) when moving or altering any record in the blockchain. A change in the record, such as the
address, prescriptions, or any other necessary things, will be reflected by maintaining consistency in the network. This leads to efficient communication and transparency while communicating among each other. In addition, the doctors are connected with medical labs and medical stores for their respective prescriptions. Furthermore, another blockchain may be maintained where intermediate entities may communicate among each other via blockchain technology before accessing a patient’s record.

Updates and Verification of Records
The verification or addition of a block in the existing blockchain network is done by some of the elected miners in the existing chain of blocks. During the network establishment, where each and every entity is considered as ideal or legitimate, an intruder may compromise the devices upon communication in the network. As soon as the transmission of information proceeds, intruders can easily cause devices to malfunction or have records altered by intermediate entities. The aim of intruders is to hack the respective information for their own gain. The intruders may be intermediate entities who seek to profit by forcing a patient to test at a specific test lab. The intruders may also sell the entire record of a patient or hospital to any other party to gain some extra money. The blockchain network is maintained to prevent such types of alterations and modifications. The validation and verification of each and every entity is maintained by the miners, who are elected from among the existing blocks to keep surveillance in the network. A single alteration in the record can be easily traced and re-examined by the intermediate parties. The defective intelligent device may further be kept on regular surveillance and may be completely removed from the existing network to prevent further harm of any type.

Baseline Approaches (BAs)
The proposed approach is validated and verified against two existing BAs, BA1 (ref. 16) and BA2 (ref. 17). To simplify the presentation, Figure 4 uses the labels BA1, BA2, and PA, where BA1 is existing approach 1, BA2 is existing approach 2, and PA is the proposed approach (the blockchain framework).

FIGURE 4. Accuracy of the proposed framework (having blockchain) versus existing approaches (BA1 and BA2). BA: baseline approach; PA: proposed approach.

The simulation of the proposed framework is also depicted in Figure 4 against the accuracy parameter. Accuracy is defined as the decisions made by smart sensors without any human participation. The proposed mechanism shows accuracy along with a reduced delay, as independent blockchains are managed and handled at separate levels in the network. The proposed mechanism outperforms BA1 and BA2 because of the involvement of the blockchain mechanism, which ensures a secure and accurate decision
mechanism while transmitting the information in the network. Furthermore, the proposed framework can be improved by integrating with existing cryptographic or trust-based schemes while understanding the behavior of communication devices. The validation and online storage of large blockchains can be easily managed by limiting the amount of information that can be further achieved by identifying the legitimacy of a device. The proposed framework can be further measured and validated against more security measures in the future scope of this work.

CONCLUSION AND FUTURE DIRECTIONS
Although MV represents one of the latest technologies for recording patient data and mitigating postpandemic situations, organizations must continue to enhance the intelligence and effectiveness of current systems to improve quality of life. The presence of intruders within the network poses security and privacy concerns that need to be addressed. This article presents a survey of security schemes and explores the utilization of blockchain technology to establish a secure and trustworthy communication network for post-COVID-19 environments. The proposed mechanism addresses open challenges and introduces a blockchain-based security scheme that enhances security and efficiency. The accuracy of the system is compared against conventional approaches, and the proposed framework can be further enhanced by integrating existing cryptographic or trust-based schemes. Additionally, future research should aim to validate the proposed framework against a broader range of security measures.

REFERENCES
1. S. Pokhrel and R. Chhetri, “A literature review on impact of COVID-19 pandemic on teaching and learning,” Higher Educ. Future, vol. 8, no. 1, pp. 133–141, Jan. 2021, doi: 10.1177/2347631120983481.
2. B. Pranggono and A. Arabo, “COVID-19 pandemic cybersecurity issues,” Internet Technol. Lett., vol. 4, no. 2, Mar./Apr. 2021, Art. no. e247, doi: 10.1002/itl2.247.
3. M. Chopra, S. K. Singh, A. Gupta, K. Aggarwal, B. B. Gupta, and F. Colace, “Analysis & prognosis of sustainable development goals using big data-based approach during COVID-19 pandemic,” Sustain. Technol. Entrepreneurship, vol. 1, no. 2, Mar. 2022, Art. no. 100012, doi: 10.1016/j.stae.2022.100012.
4. A. Cheriguene, T. Kabache, A. Adnane, C. A. Kerrache, and F. Ahmad, “On the use of blockchain technology for education during pandemics,” IT Prof., vol. 24, no. 2, pp. 52–61, Mar./Apr. 2022, doi: 10.1109/MITP.2021.3066252.
5. A. Castiglione, M. Umer, S. Sadiq, M. S. Obaidat, and P. Vijayakumar, “The role of Internet of Things to control the outbreak of COVID-19 pandemic,” IEEE Internet Things J., vol. 8, no. 21, pp. 16,072–16,082, Nov. 2021, doi: 10.1109/JIOT.2021.3070306.
6. S. S. Kamble, A. Gunasekaran, A. Ghadge, and R. Raut, “A performance measurement system for industry 4.0 enabled smart manufacturing system in SMMEs - A review and empirical investigation,” Int. J. Prod. Econ., vol. 229, Nov. 2020, Art. no. 107853, doi: 10.1016/j.ijpe.2020.107853.
7. M. Masud et al., “A lightweight and robust secure key establishment protocol for Internet of Medical Things in COVID-19 patients care,” IEEE Internet Things J., vol. 8, no. 21, pp. 15,694–15,703, Dec. 2020, doi: 10.1109/JIOT.2020.3047662.
8. Z. Ren, F. Fang, N. Yan, and Y. Wu, “State of the art in defect detection based on machine vision,” Int. J. Precis. Eng. Manuf.-Green Technol., vol. 9, no. 2, pp. 661–691, Mar. 2022, doi: 10.1007/s40684-021-00343-6.
9. S. S. Kute, A. K. Tyagi, and S. Aswathy, “Industry 4.0 challenges in e-healthcare applications and emerging technologies,” in Intelligent Interactive Multimedia Systems for e-Healthcare Applications, A. K. Tyagi, A. Abraham, and A. Kaklauskas, Eds. Singapore: Springer, 2022, pp. 265–290.
10. K. Shankar, E. Perumal, M. Elhoseny, F. Taher, B. Gupta, and A. A. A. El-Latif, “Synergic deep learning for smart health diagnosis of COVID-19 for connected living and smart cities,” ACM Trans. Internet Technol., vol. 22, no. 3, pp. 1–14, Nov. 2021, doi: 10.1145/3453168.
11. M. A. Rahman, M. S. Hossain, N. A. Alrajeh, and B. Gupta, “A multimodal, multimedia point-of-care deep learning framework for COVID-19 diagnosis,” ACM Trans. Multimedia Comput. Commun. Appl., vol. 17, no. 1s, pp. 1–24, Mar. 2021, doi: 10.1145/3421725.
12. W. He, Z. J. Zhang, and W. Li, “Information technology solutions, challenges, and suggestions for tackling the COVID-19 pandemic,” Int. J. Inf. Manage., vol. 57, Apr. 2021, Art. no. 102287, doi: 10.1016/j.ijinfomgt.2020.102287.
13. K. Intawong, D. Olson, and S. Chariyalertsak, “Application technology to fight the COVID-19 pandemic: Lessons learned in Thailand,” Biochemical Biophysical Res. Commun., vol. 534, pp. 830–836, Jan. 2021, doi: 10.1016/j.bbrc.2020.10.097.
14. A. Haddad, M. H. Habaebi, M. R. Islam, and S. A. Zabidi, “Blockchain for healthcare medical records management system with sharing control,” in Proc. IEEE 7th Int. Conf. Smart Instrum., Meas. Appl. (ICSIMA), 2021, pp. 30–34, doi: 10.1109/ICSIMA50015.2021.9526301.
15. T. Lavigne, B. Mbarek, and T. Pitner, “A real time healthcare tracking system based on blockchain application,” in Proc. IEEE/ACS 18th Int. Conf. Comput. Syst. Appl. (AICCSA), 2021, pp. 1–8, doi: 10.1109/AICCSA53542.2021.9686880.
16. M. Younis, W. Lalouani, N. Lasla, L. Emokpae, and M. Abdallah, “Blockchain-enabled and data-driven smart healthcare solution for secure and privacy-preserving data access,” IEEE Syst. J., vol. 16, no. 3, pp. 3746–3757, Sep. 2022, doi: 10.1109/JSYST.2021.3092519.
17. T. Wang, M. Du, X. Wu, and T. He, “An analytical framework for trusted machine learning and computer vision running with blockchain,” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit. Workshops, 2020, pp. 32–38, doi: 10.1109/CVPRW50498.2020.00011.
18. C. Liu et al., “Blockchain technology in healthcare: A scientific and technological driving force,” in Proc. IEEE 34th Int. Symp. Comput.-Based Med. Syst. (CBMS), 2021, pp. 550–555, doi: 10.1109/CBMS52027.2021.00036.
19. J. Al-Jaroodi, N. Mohamed, and E. Abukhousa, “Health 4.0: On the way to realizing the healthcare of the future,” IEEE Access, vol. 8, pp. 211,189–211,210, Nov. 2020, doi: 10.1109/ACCESS.2020.3038858.
20. G. Yang et al., “Homecare robotic systems for healthcare 4.0: Visions and enabling technologies,” IEEE J. Biomed. Health Inform., vol. 24, no. 9, pp. 2535–2549, Sep. 2020, doi: 10.1109/JBHI.2020.2990529.

GEETANJALI RATHEE is an assistant professor with the Department of Computer Science and Engineering, Netaji Subhas University of Technology, Dwarka, New Delhi, India. Her research interests include handoff security, cognitive networks, and blockchain technology. Rathee received her Ph.D. degree in computer science and engineering from Jaypee University of Information Technology. Contact her at [email protected].

CHAKER ABDELAZIZ KERRACHE is an associate professor with the Department of Computer Science and the head of the Informatics and Mathematics Laboratory at the University of Laghouat, Algeria. His research interests include trust and risk management, secure multihop communications, and vehicular networks. Kerrache received his Ph.D. degree in computer science from the University of Laghouat. Contact him at [email protected].

ANISSA CHERIGUENE is an associate professor with the Department of English, Ecole Normale Supérieure of Laghouat, Laghouat, Algeria. Her research interests include English as a Foreign Language instruction, educational technology, and linguistics and writing studies. Cheriguene received her Ph.D. degree in English language and literature from the University of Ouargla. Contact her at a.cheriguene@ens-lagh.dz.
1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3318485
Date of current version 3 November 2023.

The rise in online activities and the move to remote work as a response to the COVID-19 pandemic have created new challenges for cybersecurity professionals.9 Many organizations combat the increased frequency and complexity of cyberattacks in their cybersecurity operations centers (CSOCs). Cybersecurity analysts in CSOCs work as a team and employ various monitoring and detection tools, such as intrusion detection/prevention systems and security information and event management (SIEM) systems. They bear the critical responsibility of investigating incidents: a task demanding a detailed analysis and correlation of alerts, various system logs, reports, and network traffic, with the goal of determining the root cause, impacts, and potential scope of each security incident.

In CSOCs, analysts usually communicate by creating incident reports that include important incident details, supporting evidence, and recommended response actions.10 They also share the results of their ongoing investigations during regular meetings and briefings. In urgent situations, they contact each other directly for immediate assistance. In fact, the interdependency of the analysis of each analyst requires high levels of team collaboration to respond to sophisticated cyberattacks. Current incident detection and response tools integrate functions like instant messaging and ticketing features to enable such collaborative analysis.

In recent years, the global transition to remote work settings for CSOCs has become prevalent, driven primarily by the COVID-19 pandemic. In 2022, more than half of the global cybersecurity professionals worked remotely or had the option to choose their work location, up from only one quarter having the remote option before the pandemic.7 Additionally, the number of cybersecurity professionals worldwide working fully remote has quadrupled, forming almost one quarter of the whole cybersecurity workforce.7 However, remote work introduces complexities in communication for effective collaborative cybersecurity analysis.3,13 In remote teams, inefficient communication practices can lead to delayed or ineffective responses to cybersecurity incidents. Consequently, as the number of remote analysts continues to rise, it becomes increasingly crucial to address and overcome the communication challenges to ensure a prompt and effective cybersecurity incident response.

This article tackles the global challenges arising from the growing trend of remote work in collaborative cybersecurity analysis and introduces a novel visual trace approach designed to enhance communication among remote analysts. Initially, we identify the significant challenges encountered during collaborative cybersecurity analysis within remote teams. Furthermore, drawing upon communication theories, we propose a method that traces analysts’ analytical processes and visually represents them in an interactive concept map called an Action-Observation-Hypothesis (AOH)-Map. We outline the key characteristics of an AOH-Map and provide a case study to demonstrate how it can enhance
both the conveyance (information transmission from the analyst to the team level) and convergence (information transmission from the team to the analyst level) processes. Support for these communication processes is crucial for effective collaborative cybersecurity analysis.

This article makes a significant contribution to research and practice by recognizing the global communication challenges in collaborative cybersecurity analysis among remote teams and proposing a novel approach that leverages visual traces of analysts’ analytical processes to enhance communication. The visual traces contain contextual information about the analysts’ analytical processes, such as their analytical strategies and supporting evidence. This enables team members to achieve a better understanding of the findings. The novelty of the proposed approach is its use of visual traces for communicating incident findings accompanied by automatically captured contextual information, marking an advancement over current cybersecurity tools that rely on manual reporting of findings. Moreover, visual traces, serving as communication media, facilitate a two-way synchronization between individual and collective cyber situation awareness (Cyber SA), thereby enabling remote analysts to collaboratively construct a shared mental model. Although our primary focus is on cybersecurity analysis, the potential benefits of visual traces in supporting information transmission and processing can be applied to other collaborative problem-solving tasks in remote teams across various domains.

COLLABORATIVE CYBERSECURITY ANALYSIS IN THE POSTPANDEMIC WORLD
Cyber threats, with their complexity and multifaceted nature, necessitate teamwork among analysts. This collaboration allows continuous monitoring and fosters more effective problem solving by leveraging diverse expertise and viewpoints within the team. As discussed, the COVID-19 pandemic has accelerated the remote work trend in the cybersecurity field, resulting in the emergence of virtual team environments for collaborative cybersecurity analysis. However, these environments pose significant challenges to effective communication and teamwork.13 Research indicates that task complexity combined with high levels of virtuality increases the potential for misconceptions and errors, making efficient communication patterns vital for team effectiveness in remote teams.11 To improve remote team performance, it is recommended to promote openness in information sharing by providing access to various tools, such as videoconferencing, e-mail, and shared databases.12

During cybersecurity analysis, analysts investigate alerts, search for evidence, and generate hypotheses for further analysis. This complex analytical process involves 1) analysis actions like event searching, 2) observations of suspicious evidence resulting from the analysis actions, and 3) hypotheses regarding potential attack events generated based on the observations.17 The analytical process of each analyst results in an individual cyber defense SA, which includes the perception of suspicious events, comprehension of the tactics, techniques, and procedures used in an attack, and prediction of the attacker’s future actions.17

Collaborative cybersecurity analysis, on the other hand, is a progressive and iterative process where individual analysts perform a targeted analysis, communicate their findings, and leverage the collective intelligence of the team for subsequent analysis. To accomplish team objectives, analysts continuously share their individual Cyber SA with their team, thereby building and updating the collective Cyber SA. Subsequently, each analyst reviews the collective Cyber SA, adjusts their individual Cyber SA accordingly, and then proceeds with their subsequent analysis. As such, collaborative cybersecurity analysis is fundamentally a collaborative problem-solving process, necessitating efficient coordination, continuous information exchange, and clear communication.

Given that remote work may alter the communication dynamics among analysts, we have explored the challenges cybersecurity analysts face within remote teams using common existing communication tools, as summarized in Table 1. To address these challenges, organizations must comprehend analysts’ needs for information transmission and processing in the collaborative problem-solving process.

Media Capability in the Collaborative Communication Process
Cybersecurity analysts typically employ diverse communication media to disseminate potential incident findings and collaboratively construct a collective Cyber SA. The capabilities of the chosen media play a significant role in influencing communication performance,4 particularly within multicultural and remote teams.14 Certain collaborative tasks may necessitate a spectrum of media capabilities to support the communication processes of remote workers.4 This is primarily because interpersonal communication and teamwork are greatly impacted by both synchronous (e.g., video calls) and asynchronous (e.g., discussion forums) communication methods.2 Media synchronicity theory (MST) studies how diverse communication media support the information processing needs of individuals and teams.4 MST suggests that
different tasks necessitate varying levels of “media synchronicity,” a dimension of media capability defined as the extent to which the capabilities of a communication medium enable individuals to achieve synchronicity.4 According to MST, communication involves two critical processes: conveyance and convergence. Conveyance focuses on the transmission of a substantial amount of information and subsequent retrospective analysis, while convergence focuses on the transmission of “high-level abstraction of information and negotiation of these abstractions to existing mental models.”4 Incorporating media capabilities that can accommodate both processes is critical, given that cybersecurity analysis tasks demand analysts to exchange information and collaborate to achieve a collective Cyber SA.

TABLE 1. Major challenges of collaborative cybersecurity analysis in remote teams and the solutions provided by the AOH-Map.

OUR APPROACH WITH VISUAL TRACES
Figure 1 demonstrates how communication among remote analysts can be enhanced by visualizing the traces of their analytical processes. As previously mentioned, collaborative cybersecurity analysis requires individual analysis and regular communication regarding potential threat findings. The upper section of Figure 1 illustrates how individual analysts typically filter various data sources to triage alerts, confirm event occurrences, and gather pertinent evidence. These analysts acquire findings of potential attack events through a series of analytical operations. The traces of these operations reflect the analysts’ analytical process, providing vital context for their findings. Such context includes the methods used by the analysts to reach these findings and the supporting evidence. Hence, these traces give other analysts a more detailed understanding of the findings, delivering valuable contextual insights and deepening their comprehension.

FIGURE 1. The communication process within a team of analysts through the aid of tracing individual analysts’ analytical processes. IDS: intrusion detection system.

Individual analysis traces are captured within a predetermined sliding time window. As each time window concludes, these traces are automatically integrated into a visual map, showcasing the team’s collective analytical operation traces in real time, referred to as the AOH-Map in Figure 1. As highlighted in Figure 1, the communication between analysts aligns with the two fundamental processes identified by MST:4 1) conveyance, during which analysts share their findings and merge their individual Cyber SA with the team’s collective Cyber SA, and 2) convergence, during which analysts evaluate and comprehend the findings of others to refine their own Cyber SA and plan for subsequent analyses. We consider the varying needs for
information transmission and processing that analysts encounter during these processes and discuss how visual traces can support both the conveyance and convergence processes.

Visual Traces
Visualizing the analysis traces in the AOH-Map is advantageous for two main reasons. First, visualization has been proven effective in navigating cognitive capacity constraints when analysts deal with complex or large datasets.6 Visual analytics is commonly deployed to aid analysts in conveying findings regarding potential attack events.1,18 Second, the traces of analytical processes intrinsically provide the context for findings, aligning with the concept of provenance in visual analytics.15 Provenance includes both the analytical results (findings) and the analytical process from data to findings.15 Concept maps are frequently associated with collaborative provenance analytics.15 Studies suggest that utilizing advanced digital concept maps for knowledge and information visualization can enhance knowledge and information awareness, consequently improving collaborative problem solving among team members who are geographically dispersed.5 Therefore, an interactive visual map of the traces can facilitate finding exchanges among analysts.

AOH-Map Framework
The AOH-Map framework, as shown in the top right corner of Figure 2, consists of three main functional components: tracing modules, a trace parser, and visual map. The tracing modules chronicle each analyst’s analytical process during individual analysis. Subsequently, the trace parser automatically translates these traces into visual elements on the AOH-Map. For real-time analysis, a time window is established, within which the parser processes the incoming traces. The time-window duration is adjustable to the team’s needs.

The AOH-Map works as an interactive concept map designed to weave together the traces of analysts’ analytical processes. It seamlessly integrates the traces from each team member, representing the team’s collective Cyber SA, as depicted in Figure 1. Updated by the parser, the visual map encapsulates every recorded visual element. In the map, nodes represent analysts’ actions, observations, and hypotheses, while edges depict the relationships among these nodes. By default, these relationships represent the chronological order of the nodes, but analysts have the flexibility to manually modify the edges by linking nodes in a way that best represents the causal relationships.

Given the need for an interactive visual map, we identified five key user interactions integral to the design of the AOH-Map:

Searching: Analysts can navigate through the extensive map using keywords or regular expressions, which enables the efficient location of relevant information.

Highlighting: Analysts can highlight important findings or observations and annotate them, thereby indicating the areas they intend to focus on in future analyses.

Linking: This empowers analysts to forge links manually between distinct nodes. For instance, an analyst might connect one hypothesis relating
FIGURE 2. The AOH-Map framework (top right) outlines system components, with a case study showing the visual map’s evolu-
tion during collaborative cybersecurity analysis among three analysts: S1, S2, and S3.
to an attack event with another, denoting a causal relationship between two separate attack events.
Tagging: Analysts can designate tags to nodes to indicate their priority or status. For instance, should the certainty of a hypothesis be in doubt, an analyst could assign a question mark as its tag.
Merging: To maintain the clarity and conciseness of the map, analysts can consolidate duplicate actions or findings into a single node.

To achieve visual traces of analytical processes with minimal manual effort, it is crucial to implement the AOH-Map with careful consideration of its compatibility with current analysis tools. A prime example of such a visual analytics system is an implementation that leverages Analytical Reasoning Support for Cyber Analysis,18 a tracing tool embedded within an analysis application.16 This tool captures individual analysts’ analytical processes into traces by automatically logging their actions and observations, while hypotheses are captured based on self-reporting.16 These traces could subsequently be processed by a Python-based parser to transform them into visual elements on the map.18 The visual map itself could be constructed using FreeMind,a an open source mind-mapping software that facilitates the hierarchical arrangement of nodes to accurately represent the relationships among actions, observations, and hypotheses. Moreover, the versatility of the AOH-Map design allows for potential implementation with existing SIEM systems, like Splunk, which typically maintain a history of user search queries, which could serve as traces of the analytical process.

Case Study
We use a case to illustrate how the AOH-Map can facilitate communication among remote analysts. Consider a scenario where three remote analysts, denoted as S1, S2, and S3, were assigned to examine a cyber incident. The case under investigation revolves around a workstation, referred to as desk, that has reportedly been hit with a ransomware attack.

a http://freemind.sourceforge.net/wiki/index.php
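To make the trace-to-map step described in the framework section more concrete, the following is a minimal sketch of how a Python parser might turn traces into a FreeMind-style map. It is not the authors' parser: the trace records, their field names, the tag-to-icon mapping, and the function name traces_to_map are all hypothetical, and the element and attribute names (map, node, TEXT, icon, BUILTIN) are assumed to follow FreeMind's XML layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical trace records; the field names are illustrative, not ARSCA's format.
TRACES = [
    {"id": "a1", "kind": "action", "text": "Search logs for connections from desk", "parent": None},
    {"id": "o1", "kind": "observation", "text": "desk connected to a file server", "parent": "a1"},
    {"id": "h1", "kind": "hypothesis", "text": "desk reached file servers", "parent": "o1", "tag": "confirmed"},
    {"id": "h2", "kind": "hypothesis", "text": "desk visited a suspicious domain", "parent": None, "tag": "unconfirmed"},
]

# Assumed mapping from analyst tags to FreeMind built-in icon names.
TAG_ICONS = {"confirmed": "button_ok", "unconfirmed": "help"}


def traces_to_map(traces, root_text="Incident: desk"):
    """Build a FreeMind-style <map> tree; parents must appear before their children."""
    map_el = ET.Element("map", {"version": "1.0.1"})
    root = ET.SubElement(map_el, "node", {"TEXT": root_text})
    by_id = {}
    for trace in traces:
        # Traces without a known parent hang directly off the root node.
        parent = by_id.get(trace["parent"], root)
        label = f"{trace['kind']}: {trace['text']}"
        node = ET.SubElement(parent, "node", {"ID": trace["id"], "TEXT": label})
        icon = TAG_ICONS.get(trace.get("tag"))
        if icon:
            ET.SubElement(node, "icon", {"BUILTIN": icon})
        by_id[trace["id"]] = node
    return map_el


if __name__ == "__main__":
    print(ET.tostring(traces_to_map(TRACES), encoding="unicode"))
```

Nesting observations under actions and hypotheses under observations mirrors the hierarchical arrangement described above; manual interactions such as linking or merging would need additional structure that this sketch does not attempt.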
As illustrated in phase 1 of Figure 2, the analysts initiated their investigation from different perspectives. Analyst S1 prioritized determining whether the workstation, “desk,” had established connections with any file servers. Upon confirmation, S1 tagged the corresponding hypothesis as confirmed on the visual map. Concurrently, S2 investigated suspicious Windows Registry events and discovered that a USB drive had been attached to “desk.” As a result, he tagged the associated hypothesis as confirmed on the map. S2 also hypothesized possible connections between “desk” and file servers but didn’t investigate it immediately. Hence, he marked this hypothesis on the map as “unconfirmed.” Meanwhile, S3 decided to explore whether “desk” had accessed any suspicious domains on the day of the incident. Upon discovering a potentially suspicious domain, but remaining uncertain of its malicious nature, S3 tagged this hypothesis to indicate that it requires further verification.
Following the initial analysis, the analysts consulted the AOH-Map to understand each other’s findings, as shown in phase 2 in Figure 2. S1 recognized that S2’s recent hypothesis about the connection between “desk” and other servers had already been addressed in her analysis. To avoid redundancy, S1 merged these two hypotheses on the map. Moreover, S1 noticed that S3’s hypothesis concerning a suspicious domain had not yet been verified. Consequently, S1 highlighted this hypothesis on the map, indicating her intent to explore it further. In the subsequent investigation (phase 3 in Figure 2), S3 confirmed the malicious nature of the domain and discovered an image downloaded from the domain containing an embedded executable program. S3 tagged an unconfirmed hypothesis that the executable program could be malware and hypothesized that the infection of “desk” by this malware could cause its connection to the file server (a previous finding of S3’s). Consequently, S3 returned to the map and linked these two hypotheses.
During S2’s map review, S2 determined a possible sequence of events: the connection of “desk” to a suspicious domain, as discovered by S3, could have occurred after the attachment of the USB drive to “desk.” Therefore, S2 linked his hypothesis about the USB drive to S3’s hypothesis concerning the suspicious domain, illustrating a potential chain of events. Later, upon noticing S3’s discovery of a suspicious executable, and with his expertise in malware analysis, S2 highlighted the hypothesis and proceeded to conduct a malware analysis to confirm it.

Supporting Conveyance Processes
According to MST, the conveyance process can benefit from lower levels of media synchronicity as individuals do not need to work together to transmit information simultaneously.4 As illustrated by the case, the analysts could report their findings using the AOH-Map without composing a report or engaging in direct communication with fellow analysts. Due to the tracing module and parser, the process of visualizing traces is automated. The only components requiring manual input were interactions with the map, such as tagging and highlighting.
Moreover, the visual traces provide context for how these findings were obtained. This context allows analysts to make more informed decisions regarding the trustworthiness of the findings. For example, S1 was able to connect her discovery of the USB drive’s attachment to S3’s finding of “desk” visiting a suspicious domain after reviewing S3’s analysis trace. Additionally, analysts could highlight others’ findings to signal their intent for further investigation. AOH-Map’s interactive functions, such as tagging, linking, and highlighting, support retrospective analysis and assist analysts in processing information gleaned from others’ findings. These interactions offer a low level of media synchronicity, meeting analysts’ needs for information transmission and processing when sharing findings with the team.

Supporting Convergence Processes
In the case, S1’s detection of a suspicious image file downloaded from a malicious domain stemmed from S3’s investigation of the domains that “desk” had visited. Acknowledging S3’s finding allowed S1 to connect her previous discovery of “desk” connecting to a file server as a potential subsequent event to the suspicious image download. Such instances emphasize the importance of analysts maintaining synchronicity with their team’s collective Cyber SA throughout their tasks.
Gathering the team’s Cyber SA represents the convergence process, as highlighted in Figure 1, where team members comprehend each other’s findings and update their individual Cyber SA. According to MST, higher levels of media synchronicity can enhance the convergence process by minimizing delays and fostering a shared understanding among team members.4
The case study demonstrated that the AOH-Map enables analysts to communicate and collaborate effectively by interacting with the visual map (through highlighting, tagging, and linking) even without real-time meetings. Moreover, the AOH-Map can supplement synchronous communication in virtual team discussions during regular meetings, where analysts can use the map to navigate and present their findings.

Benefits and Novelty of the AOH-Map
In Table 1, we summarize how the AOH-Map addresses the major challenges of collaborative cybersecurity
analysis in remote teams. The novelty of the AOH-Map lies in its use of visual traces as a communication medium and its capability to automate the capture of these visual traces, which are fundamental for successful collaborative problem solving. Even though the current cybersecurity tools aim to facilitate collaboration within teams, they are still ineffective because they rely on analysts to report their findings and explain associated contextual information through channels like instant messaging and ticketing. In contrast, the visual traces reflect individual analysts’ mental models. Through interaction with the visual map, analysts can construct a shared mental model, demonstrating the AOH-Map’s unique capability to transform individual Cyber SA into collective Cyber SA.

CONCLUSION
Our article addressed the global communication challenges that collaborative cybersecurity analysts encounter when working remotely, exacerbated by the postpandemic era. To overcome these challenges, we proposed a novel method that leverages visual traces of analysts’ analytical processes to support various communication processes in collaborative cybersecurity analysis. By demonstrating how analysts collectively construct Cyber SA with the aid of visual traces, this article offered valuable insights for organizations that have difficulty managing remote teams in the postpandemic era. Although our proposed method addresses the challenges faced by remote cybersecurity analysts worldwide, we recognize the impact of cultural differences on the effectiveness of the AOH-Map, particularly in establishing trust and ensuring team goal alignment.

REFERENCES
1. M. Angelini, N. Prigent, and G. Santucci, “Percival: Proactive and reactive attack and response assessment for cyber incidents using visual analytics,” in Proc. IEEE Symp. Visualization Cyber Secur. (VizSec), 2015, pp. 1–8, doi: 10.1109/VIZSEC.2015.7312764.
2. K. Burke and L. Chidambaram, “How much bandwidth is enough? A longitudinal examination of media characteristics and group outcomes,” MIS Quart., vol. 23, no. 4, pp. 557–579, Dec. 1999, doi: 10.2307/249489.
3. O.-K. Choi and E. Cho, “The mechanism of trust affecting collaboration in virtual teams and the moderating roles of the culture of autonomy and task complexity,” Comput. Hum. Behav., vol. 91, no. 4, pp. 305–315, Feb. 2019, doi: 10.1016/j.chb.2018.09.032.
4. A. R. Dennis, R. M. Fuller, and J. S. Valacich, “Media, tasks, and communication processes: A theory of media synchronicity,” MIS Quart., vol. 32, no. 3, pp. 575–600, Sep. 2008, doi: 10.2307/25148857.
5. T. Engelmann, S.-O. Tergan, and F. W. Hesse, “Evoking knowledge and information awareness for enhancing computer-supported collaborative problem solving,” J. Exp. Educ., vol. 78, no. 2, pp. 268–290, Dec. 2009, doi: 10.1080/00220970903292850.
6. W. Huang, P. Eades, and S.-H. Hong, “Measuring effectiveness of graph visualizations: A cognitive load perspective,” Inf. Visualization, vol. 8, no. 3, pp. 139–152, Sep. 2009, doi: 10.1057/ivs.2009.10.
7. “The (ISC)2 cybersecurity workforce study: A critical need for cybersecurity professionals persists amidst a year of cultural and workplace evolution,” (ISC)2 Inc., Alexandria, VA, USA, Rep. (ISC)2, 2022. [Online]. Available: https://media.isc2.org/-/media/Project/ISC2/Main/Media/documents/research/ISC2-Cybersecurity-Workforce-Study-2022.pdf?rev=1bb9812a77c74e7c9042c3939678c196
8. F. B. Kokulu et al., “Matched and mismatched SOCs: A qualitative study on security operations center issues,” in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., 2019, pp. 1955–1970, doi: 10.1145/3319535.3354239.
9. M. Lang and L. Connolly. “Managing the cybersecurity risks of teleworking in the post-pandemic ‘new normal’.” SSRN. Accessed: Sep. 25, 2023. [Online]. Available: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4146506
10. J. T. Luttgens, M. Pepe, and K. Mandia, Incident Response & Computer Forensics. New York, NY, USA: McGraw-Hill, 2014, pp. 45–77.
11. S. L. Marlow, C. N. Lacerenza, and E. Salas, “Communication in virtual teams: A conceptual framework and research agenda,” Hum. Resour. Manage. Rev., vol. 27, no. 4, pp. 575–589, Jan. 2017, doi: 10.1016/j.hrmr.2016.12.005.
12. J. R. Mesmer-Magnus, L. A. DeChurch, M. Jimenez-Rodriguez, J. Wildman, and M. Shuffler, “A meta-analytic investigation of virtuality and information sharing in teams,” Organizational Behav. Hum. Decis. Processes, vol. 115, no. 2, pp. 214–225, Jul. 2011, doi: 10.1016/j.obhdp.2011.03.002.
13. S. Morrison-Smith and J. Ruiz, “Challenges and barriers in virtual teams: A literature review,” SN Appl. Sci., vol. 2, no. 6, pp. 1–33, May 2020, doi: 10.1007/s42452-020-2801-5.
14. J. Schulze and S. Krumm, “The ‘virtual team player’: A review and initial model of knowledge, skills, abilities, and other characteristics for virtual collaboration,” Organizational Psychol. Rev., vol. 7, no. 1, pp. 66–95, 2017, doi: 10.1177/2041386616675522.
15. K. Xu, A. Ottley, C. Walchshofer, M. Streit, R. Chang, and J. Wenskovitch, “Survey on the analysis of user interactions and visualization provenance,” Comput. Graph. Forum, vol. 39, no. 3, pp. 757–783, Jul. 2020, doi: 10.1111/cgf.14035.
16. C. Zhong, J. Yen, P. Liu, R. Erbacher, R. Etoty, and C. Garneau, “ARSCA: A computer tool for tracing the cognitive processes of cyber-attack analysis,” in Proc. IEEE Int. Multi-Disciplinary Conf. Cogn. Methods Situation Awareness Decis., 2015, pp. 165–171, doi: 10.1109/COGSIMA.2015.7108193.
17. C. Zhong, J. Yen, P. Liu, R. F. Erbacher, C. Garneau, and B. Chen, “Studying analysts’ data triage operations in cyber defense situational analysis,” in Proc. Theory Models Cyber Situation Awareness, 2017, pp. 128–169, doi: 10.1007/978-3-319-61152-5_6.
18. C. Zhong, A. Alnusair, B. Sayger, A. Troxell, and J. Yao, “AOH-map: A mind mapping system for supporting collaborative cyber security analysis,” in Proc. IEEE Conf. Cogn. Comput. Aspects Situation Manage. (CogSIMA), 2019, pp. 74–80, doi: 10.1109/COGSIMA.2019.8724159.

CHEN ZHONG is an assistant professor of cybersecurity in the Sykes College of Business, the University of Tampa, Tampa, FL, 33606, USA. Her current research interests include cybersecurity analytics, intelligent systems, and explainable artificial intelligence. Zhong received her Ph.D. degree in information sciences and technology from The Pennsylvania State University. Contact her at [email protected].

J. B. (JOO BAEK) KIM is an assistant professor in the Sykes College of Business, the University of Tampa, Tampa, FL, 33606, USA. His research interests include emerging technology and media adoption in organizations and the use of gamification in business training/education. Kim received his Ph.D. degree in information systems and decision sciences from Louisiana State University. Contact him at [email protected].

ALPER YAYLA is an associate professor and the director of the cybersecurity programs in the Sykes College of Business, the University of Tampa, Tampa, FL, 33606, USA. His research interests include information technology leadership and cybersecurity. Yayla received his Ph.D. degree in management information systems from Florida Atlantic University. Contact him at [email protected].
1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3297085
Date of current version 3 November 2023.

Ransomware is a global cyber threat. It has disrupted digital services and businesses alike with impact on human lives, reputation, finances, and service restoration abilities. Contemporary ransomware strains are capable of circumventing intelligent cybersecurity controls. During the early stages of the COVID-19 pandemic (February–March 2020), an exponential increase in phishing and ransomware attacks was reported globally, with weekly attacks in June 2020 surpassing 20,000 compared with fewer than 2000 in the previous year.1 According to the 2022 Verizon Data Breach Investigations Report,2 the number of ransomware incidents during 2020–2021 increased by 13% when compared to the cumulative total of the previous five years. Additionally, a survey conducted by Sophos3 revealed that 66% of organizations experienced ransomware attacks in 2021, representing a significant increase from 37% in 2020, and indicating a 78% rise in just one year. Notably, midsized organizations paid an average ransom of $170,404 to retrieve their data.3 Moreover, a CrowdStrike intelligence report4 highlighted an 82% increase in ransomware data leaks, with instances rising from 1474 in 2020 to 2686 instances in 2021. In 2020, the average cost of ransom payments reached $1.1 million. Furthermore, according to a report by Palo Alto Networks,5 average ransom payments in 2021 rose by 71%, reaching $1 million.
Figure 1 illustrates the global distribution of attack targets during the COVID-19 pandemic for the period March 2020 to July 2022.
Two popular attacker tactics for ransomware have appeared during the pandemic. The first is the spray-and-pray tactic, wherein, the adversary distributes malware (ransomware) through e-mail and malicious web advertisements. Inadvertent victim action (of clicking a link or downloading a payload) infects the target device. The second approach comprises an in situ presence of the adversary at the victim’s network to initially chart out the organizational network topology, and subsequently issue a data encryption command. Such an attack entails a deep-rooted and sustained adversarial presence in the network, possibly through
were disrupted; the cost was approximately $600 million to restore.
Fresenius (May 2020)11 | Snake | Health care, Germany | N/A | N/A | Patients’ details such as name, phone number, gender, date of birth, nationality, address, test results, and doctor details were made public.
Waikato hospitals (May 2021)12 | Zeppelin | Health care, New Zealand | Microsoft Windows/phone lines/payroll systems | N/A | Data related to patients, staff, and financial information were released online (dark web); IT services that control more than 600 servers were shut down; laboratory services were halted.
Benesov Hospital (December 2019)13 | Emotet/TrickBot/Ryuk | Health care, Czech Republic | IT services | N/A | Hospital services were halted; patient data were made inaccessible; the financial loss was $1.68 billion.
University of California, San Francisco (June 2020)14 | Ransomware (phishing) | Education, USA | IT network | Netwalker/MailTo | Attackers gained access to a limited number of servers, but not the main server; data related to students, staff, and some information about academic research were stolen.
Hammersmith Medical Research (March 2020)15 | Maze | Research organization, U.K. | VPN system | Maze Group | COVID-19 test lab was targeted; patient data were stolen and published on the dark web.
Cognizant (April 2020)15 | Maze | IT services, USA | Citrix devices | Maze Group | Internal network was affected, including work-from-home setups; billing and customer services were halted; employee details such as identity numbers, passport IDs, tax IDs, and social security numbers were compromised; the financial loss was $50–70 million.
Canadian COVID-19 contact tracing app (June 2020)16 | CryCryptor | Digital Services, Canada | N/A | CryDroid open source ransomware | The operational and financial impacts, as well as loss of data/information, were minimal.
CWT (July 2020)17 | Ragnar Locker | Travel agency, USA | IT network | N/A | Two terabytes of company data (financial reports, employee e-mail addresses, salary information, and security documents) were stolen; the financial loss was $4.5 million.
N/A: not applicable; VPN: virtual private network.
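The year-over-year percentages quoted in the introduction can be checked with a short calculation. The sketch below is only an illustration; the helper name pct_change is hypothetical, and the inputs are the figures cited from the Sophos and CrowdStrike reports.

```python
def pct_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# Share of organizations hit by ransomware: 37% (2020) to 66% (2021).
attacked_share = pct_change(37, 66)
# Ransomware data-leak instances: 1474 (2020) to 2686 (2021).
data_leaks = pct_change(1474, 2686)

if __name__ == "__main__":
    print(f"attacked share: {attacked_share:.1f}%")
    print(f"data leaks: {data_leaks:.1f}%")
```

The results round to 78% and 82%, matching the cited figures. Note that the 78% rise is relative; in absolute terms, the share of affected organizations grew by 29 percentage points.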
SECURITY AND DATA PROTECTION DURING THE COVID-19 PANDEMIC AND BEYOND
e-mail, and data exfiltration, and deploying of ransomware strains Ryuk and Conti. Once the malware has been successfully dropped in a victim’s machine, TrickBot copies itself as an executable file and often stores itself within these Windows Operating System paths: C:\Windows\, C:\Windows\SysWoW64\, and C:\Users\[Username]\AppData\Roaming\.18
BazarLoader/BazarBackdoor, on the other hand, is an extension to TrickBot. It has become one of the most widely used strains for deploying ransomware in a victim’s machine. Phishing e-mails are the popular technique for dropping these payloads onto a victim’s machine.18

Play Ransomware
Play was first identified in 2022 and is a sophisticated crypto virus, which is hard to trace on a victim’s machine. Like Conti and Ryuk, the Play ransomware deploys with the intent to establish remote connectivity between the victim and a command-and-control server. It is reported to have been distributed via phishing e-mails, fake system updates, malicious advertisement links, executable files like JavaScript files, PDFs, and Microsoft Office files. Additionally, the ransomware gains initial access to targeted systems by exploiting ProxyNotShell vulnerabilities in Microsoft Exchange. Upon successful deployment of the Play variant into a victim’s machine, it first exfiltrates data, then encrypts files and automatically changes file extensions to “.PLAY.” Unlike other variants, the Play variant does not provide victims with the usual information regarding what files are encrypted, the ransom amount, and payment instructions. A read.txt file is generated and comprises merely an e-mail address of the attacker.19

Netwalker/MailTo Ransomware
The Netwalker ransomware strain, also referred to as MailTo, represents a highly sophisticated Russian ransomware group that surfaced in 2019. Notably, this group has gained significant attention due to its involvement in various high-profile attacks on global organizations, including the University of California, San Francisco; Argentina’s immigration department; and the Toll group in Australia. This sophisticated ransomware strain adopts robust encryption standards such as Rivest-Shamir-Adleman (RSA) and Advanced Encryption Standard (AES) to render the victim’s data inaccessible and compel compliance with the stipulated ransom payment as a means to regain access.14
The Netwalker group adheres to a multistage attack process, beginning with the infiltration of initial access points within the victim’s network infrastructure through techniques such as phishing e-mails, exploit tools, remote desktop protocol vulnerabilities, and brute-force methods. The strain proliferates through lateral movements, strategically identifying valuable data assets for subsequent encryption. It follows a double-extortion technique, whereby in addition to encrypting the victims’ data, it exfiltrates sensitive information from the compromised network. These illicitly obtained data are then stored within the attacker’s infrastructure, creating a dual threat to the victim.

Maze Ransomware
The Maze ransomware group, which emerged in 2019, is a highly sophisticated ransomware attack group, known for its advanced attack techniques and implementation of the double-extortion strategy. Similar to other ransomware strains such as Netwalker, Maze follows a multistage attack process, leveraging various entry points to gain initial access to the victim’s network through phishing e-mails, vulnerabilities of remote desktop protocol, and exploit kits.15
Notable incidents involving Maze include high-profile attacks against renowned organizations such as Cognizant and Xerox in the United States and LG Corporation in South Korea. These attacks involved substantial ransom demands of approximately $70 million, illustrating the group’s financial motivations and potential impact on targeted entities.15

Snake Ransomware
The Snake ransomware strain, also referred to as Ekans, first emerged in 2020 as a sophisticated form of malware rather than a distinct ransomware group. This strain targets critical infrastructure including manufacturing, health care, smart grids, and transport networks. Its primary objective is to infiltrate and encrypt critical files within industrial control system environments responsible for controlling and monitoring the operational processes in critical sectors. Notable incidents involving the deployment of the Snake variant include attacks on Fresenius Healthcare in Germany, Honda Motors in Japan, and Enel Argentina, all of which entailed undisclosed ransom demands.11
In line with other ransomware strains, the Snake ransomware also implements a double-extortion technique, encompassing data exfiltration alongside the encryption process. Rather than targeting specific files or systems, Snake has been designed to target the entire network.

Ragnar Locker
Ragnar Locker is a ransomware strain that first emerged in 2020 and gained popularity because of its targeted
approach against high-profile and large organizations. Attacks against CWT in the United States and the North American network of Energias de Portugal were attributed to the Ragnar Locker ransomware strain, where significant ransom amounts of $10 million and $11 million, respectively, were requested.17,20 Ragnar Locker also adopts a double-extortion technique that encompasses data exfiltration alongside the encryption of sensitive data or files.
To gain initial access to targeted networks, Ragnar Locker conducts meticulous reconnaissance to detect critical assets and potential vulnerabilities. The threat actors of the Ragnar Locker strain often exploit vulnerabilities of the remote desktop protocol by using brute-force methods to guess usernames and passwords, employing stolen credentials or weak passwords, or exploiting unpatched software. Subsequently, to escalate privileges within the compromised network, the attacker exploits the CVE-2017-0213 vulnerability in the Windows Component Object Model aggregate Marshaler, which enables the execution of arbitrary code with privilege escalation. In addition, it employs robust obfuscation techniques such as deploying a crafted virtual machine (VM) Windows XP image to the VirtualBox VM, mapping all local drives as read and writable into the VM to evade detection. Additionally, the attackers terminate specific processes associated with security tools and backup systems to hinder their functionality.20 Moreover, it leverages Salsa20, a robust custom encryption algorithm with custom matrix.20

CryCryptor
CryCryptor is a ransomware strain based on the open source CryDroid ransomware variant that primarily targets Android mobile devices. The goal is to encrypt files on Android devices and demand ransom. A notable example is a CryCryptor variant that disguises itself as a COVID-19 tracing app of Canadian health-care services. The variant is usually distributed via malicious apps or websites. Once it infiltrates a targeted Android device, it elevates its privileges and encrypts files that are stored on the device’s local memory or external Secure Digital card with an “.enc” extension appended to the file names.8,16
Figure 2 shows the adoption of various attacker tactics mapped to the MITRE ATT&CK threat intelligence framework.

PREVENTION AND MITIGATION TACTICS
A shortcoming of small and medium-sized organizations is limited resourcing for safeguarding data and adoption of poor cyber hygiene practices. With proper employee training processes and programs in place, the level of awareness regarding attacker tactics can be raised. The consequent upliftment of holistic organizational security would help thwart adversarial attempts to carry out phishing-based ransomware attacks. A cybersecurity framework encompassing good cybersecurity governance through the involvement of higher management in policy drafting and decision making for all cybersecurity-related events and comprehensive cyber risk management through employee training would benefit an organization in safeguarding against ransomware attacks.
Mitigation tactics for ransomware strains can include system resets or reformatting to eliminate the ransomware strain, recovery from the data backups in place onto “clean” systems, and reporting of the incident to pertinent authorities. A system reset is a procedure that entails restoring a computer system to its original factory settings, effectively removing all existing data and software, including the ransomware strain. This approach serves as a means to reinstate the system’s integrity by removing unauthorized ransomware privileges and reverting any altered system configurations. Additionally, it aids in the elimination of potential backdoors or vulnerabilities that may have been exploited by the ransomware strain. According to the report by Sophos,3 data backup is the most widely used method for restoring data after a ransomware attack.
Data encryption is widely recognized as a countermeasure against ransomware attacks, offering essential safeguards for data confidentiality, integrity, availability, and protection against unauthorized access.6,7 Encryption not only provides data confidentiality but also enables the detection of unauthorized modifications and tampering attempts, thus ensuring data integrity. Various robust cryptographic encryption algorithms, including AES, RSA, Triple Data Encryption Standard (3DES), Elliptic Curve Cryptography (ECC), ChaCha20, and Twofish alongside secure key management practices, contribute to the strength of data protection against ransomware attacks. These encryption algorithms are designed to make it challenging for an attacker to decrypt or gain access to the encrypted data. According to the report by Sophos,3 97% of organizations that had data encrypted got their data back.
To establish a comprehensive data protection strategy, it is recommended that higher management defines minimum data confidentiality standards based on asset and data classifications. This includes crafting policies for data protection, proposing and analyzing best practices for data encryption, and ratifying minimum data encryption standards. By categorizing critical data based
on risk levels, organizations can tailor their encryption practices accordingly. For instance, low-risk data may have optional encryption, while classified/critical data elements should be subject to mandatory encryption systems, adhering to minimum baseline encryption standards, as stipulated by higher management.
Restricting the execution of programs from the “temp” folder is another means of protection against ransomware attacks. Typically, the initial execution of ransomware tries to copy its malware payload to the user’s temp folder to continue its execution. Restricting the usage of “temp” folders such as “C:\users\<user>\appdata\temp” and the utilization of group policy objects and software restriction policies will help prevent the spread of ransomware strains.
Network segmentation is a widely recommended prevention and mitigation tactic that serves as a strong defense mechanism against ransomware attacks. It involves segregating the network into distinct segments, typically based on criticality, to limit the lateral
movement of ransomware strains within the network. A significant strength of network segmentation lies in its ability to enforce access control mechanisms for each segment.7
Network segmentation can be implemented through a structured approach that involves identifying critical and noncritical network assets, defining segmentation goals, and deploying the appropriate mechanisms. To facilitate this process, firewalls, intrusion detection systems and intrusion prevention systems, virtual local area networks, and software-defined networking (SDN) solutions could be leveraged. SDNs, for instance, offer centralized monitoring and management of network infrastructure, providing dynamic and flexible network segmentation. By leveraging SDN controllers, organizations can establish granular segmentation policies, isolate critical systems, and control the flow of network traffic.
A zero-trust architecture (ZTA) is increasingly useful to safeguard against ransomware attacks. The effectiveness of a ZTA solution lies in the assumption that there is no inherent trust toward any user, device, or network, regardless of their location or security boundaries. In the context of ransomware attacks, ZTA fosters security through microsegmentation of networks, thereby limiting the lateral movement of ransomware strains across the network. By implementing strict access control and by promoting the principle of least privilege, ZTA helps minimize the attack surface and reduces the spread of ransomware.
Furthermore, ZTA emphasizes the adoption of multifactor authentication to strengthen security. Thus, organizations can mitigate the risk of unauthorized access using compromised/stolen credentials.
To address the critical aspect of data protection, ZTA incorporates data classification and categorization based on sensitivity levels. This enables organizations to define appropriate access control and encryption requirements for different types of data. Notably, it helps leverage robust encryption algorithms such as AES, ECC, RSA, and 3DES to bolster data protection against ransomware. Data loss-prevention solutions, as a proactive measure, form part of ZTA solutions and help monitor and prevent unauthorized data transfers or critical data leakage. These solutions facilitate identification of potential ransomware attack patterns or activity indicative of mass data encryption and data exfiltration.
CONCLUSION
The threat of ransomware will only worsen over time because of advancing adversarial capabilities, ready access to open source ransomware tools, sophisticated ability to laterally move within a corporate network, and protection of threat groups based in less cooperative geojurisdictions. Although prevention is the best way to mitigate the threat, it is imperative that a victim of the ransomware attack prevent the spread of the malware from the compromised machines of the network to other vulnerable devices with due diligence by securing a corporate network from potential lateral movement of malicious payloads.
ACKNOWLEDGMENTS
We thank the anonymous reviewers for their valuable comments, which helped us improve the organization, clarity, content, and presentation of this article.
REFERENCES
1. D. Braue, “Ransomware isn’t going away, are you ready for it?” Cyber Today, pp. 45–47, 2022.
2. “Data breach investigations report,” Verizon, pp. 1–88, 2022. [Online]. Available: https://www.verizon.com/business/en-au/resources/reports/dbir/
3. “The state of ransomware 2023,” Sophos, pp. 1–26, 2023. [Online]. Available: https://www.sophos.com/en-us/content/state-of-ransomware
4. “CrowdStrike 2023 global threat report,” CrowdStrike, pp. 1–42, 2021. [Online]. Available: https://www.crowdstrike.com/global-threat-report/
5. “2022 unit 42 incident response report,” Palo Alto Netw., Santa Clara, CA, USA, pp. 1–42, 2022. [Online]. Available: https://start.paloaltonetworks.com/2022-unit42-incident-response-report
6. M. Alawida, A. E. Omolara, O. I. Abiodun, and M. Al-Rajab, “A deeper look into cybersecurity issues in the wake of COVID-19: A survey,” J. King Saud Univ.-Comput. Inf. Sci., vol. 34, no. 10, pp. 8176–8206, Nov. 2022, doi: 10.1016/j.jksuci.2022.08.003.
7. B. Pranggono and A. Arabo, “COVID-19 pandemic cybersecurity issues,” Internet Technol. Lett., vol. 4, no. 2, Mar./Apr. 2021, Art. no. e247, doi: 10.1002/itl2.247.
8. H. Saleous et al., “COVID-19 pandemic and the cyberthreat landscape: Research challenges and opportunities,” Digit. Commun. Netw., vol. 9, no. 1, pp. 211–222, Feb. 2023, doi: 10.1016/j.dcan.2022.06.005.
9. I. C. Eian, L. K. Yong, M. Y. X. Li, and Y. H. Qi, “Cyber attacks in the era of COVID-19 and possible solution domains,” Preprints, doi: 10.20944/preprints202009.0630.v1.
10. J. Kolouch, T. Zahradnický, and A. Kučínský, “Cyber security: Lessons learned from cyber-attacks on hospitals in the COVID-19 pandemic,” Masaryk Univ. J. Law Technol., vol. 15, no. 2, pp. 301–341, Sep. 2021, doi: 10.5817/MUJLT2021-2-7.
11. B. Krebs, “Europe’s largest private hospital operator Fresenius hit by ransomware,” Krebs on Security, May 2020. [Online]. Available: https://krebsonsecurity.com/2020/05/europes-largest-private-hospital-operator-fresenius-hit-by-ransomware/
12. “Waikato District Health Board (WDHB) incident response analysis,” Te Whatu Ora, pp. 1–51, Dec. 2022. [Online]. Available: https://www.tewhatuora.govt.nz/publications/waikato-district-health-board-wdhb-incident-response-analysis/
13. O. Filipec and D. Plasil, “The cybersecurity of healthcare: The case of the Benešov hospital hit by Ryuk ransomware, and lessons learned,” Obrana a Strategie-Defence Strategy, vol. 21, no. 1, pp. 27–51, Jun. 2021, doi: 10.3849/1802-7199.21.2021.01.027-052.
14. H. Landi, “UCSF pays hackers $1.1M to regain access to medical school servers,” Fierce Healthcare, Jul. 2020. [Online]. Available: https://www.fiercehealthcare.com/tech/ucsf-pays-hackers-1-14m-to-regain-access-to-medical-school-servers
15. D. Belair, “Maze ransomware,” Augment, Nov. 2020. [Online]. Available: https://www.augmentt.com/security/threats/ransomware/maze-ransomware
16. “CryCryptor ransomware,” NHS 75 Digital, Leeds, U.K., Jun. 2020. [Online]. Available: https://digital.nhs.uk/cyber-alerts/2020/cc-3523
17. J. Stubbs, “‘Payment sent’ - Travel giant CWT pays $4.5 million ransom to cyber criminals,” Reuters, Jul. 2020. [Online]. Available: https://www.reuters.com/article/us-cyber-cwt-ransom-idUSKCN24W25W
18. “Ransomware activity targeting the healthcare and public health sector,” Cybersecurity and Infrastructure Security Agency, Arlington, VA, USA, Nov. 2020. [Online]. Available: https://www.cisa.gov/news-events/cybersecurity-advisories/aa20-302a
19. T. Meskauskas, PLAY (.PLAY) Ransomware Virus - Removal and Decryption Options. (2022). PC Risk. [Online]. Available: https://www.pcrisk.com/removal-guides/24571-play-ransomware
20. “Ragnar locker ransomware (Everything You Need to Know),” Avertium, May 2022. [Online]. Available: https://explore.avertium.com/resource/ragnar-locker-ransomware
ZUBAIR BAIG is an associate professor of cybersecurity with the School of Information Technology, Deakin University, Geelong, Vic., 3216, Australia. His research interests include cybersecurity, integrated system security architecture, digital forensics, and machine learning. Baig received his Ph.D. degree in computer science from Monash University. He is a Senior Member of IEEE. Contact him at zubair.baig@deakin.edu.au.
SRI HARSHA MEKALA is currently pursuing his doctoral degree in cybersecurity at Deakin University, Geelong, Vic., 3216, Australia. His research interests include cybersecurity, digital forensics, and the Internet of Things (IoT). Harsha received his master of science degree in cybersecurity from Deakin University. Contact him at [email protected].
SHERALI ZEADALLY is a professor in the College of Communication and Information at the University of Kentucky, Lexington, KY, 40506, USA. His research interests include cybersecurity, privacy, and computer networks. Zeadally received his Ph.D. degree in computer science from the University of Buckingham, England. He is a Senior Member of IEEE. Contact him at [email protected].
The Internet of Things (IoT) is receiving increasing attention from academia and
industry. However, improving the security of the IoT environment is critical for fostering
trust in it and contributing to its growth in the manufacturing market. This study
comparatively analyzes current methods for detecting intruders and malicious activities
in IoT networks by introducing tree-based machine learning algorithms. It presents a
research gap analysis of the current literature. Furthermore, an empirical evaluation
study is presented to explore the potential of tree-based approaches to detect intruders
in IoT networks. It compares the performance of bagging and boosting techniques in
botnet detection by conducting an extensive experimental benchmarking.
The Internet of Things (IoT) represents the capability of heterogeneously distributed objects such as devices, users, gateways, services, infrastructure, and so on. The IoT is widely used in smart homes, smart offices, automated industry, smart cities, smart agriculture, smart transportation systems, supply chains, smart medical care, and so on. IoT environments include a large number of devices, most of which have limited and heterogeneous configurations. Because of this, each layer of the three-tier IoT environment becomes a potential target for attackers and is exposed to various security threats. As ever more machines and smart devices are connected to the network, IoT security vulnerabilities are gradually increasing.
Limited resources on IoT devices create constraints when installing traditional security software. Intrusion detection is an essential part of network security, providing real-time protection against internal and external attacks. The main task of intrusion detection is to classify normal and abnormal network behavior. Intrusion detection techniques can be divided into three categories: signature-based, knowledge-based, and anomaly- or behavior-based techniques. Due to the heterogeneous nature of the IoT, signature-based techniques are ineffective. The lack of a dedicated anomaly detection system for such a heterogeneous and diverse network leads to many vulnerabilities, such as data leakage, spoofing, denial of service (DoS), man in the middle (MITM), energy drain, insecure gateways, and so on.
In 2016, the United States Computer Emergency Readiness Team reported that botnet malware disrupted service at a major U.S. Internet service provider. The name of the botnet is Mirai. It caused several major websites to be disrupted by a series of massive, distributed DoS (DDoS) attacks. It spread quickly and infected 100,000 endpoints. The source code is designed to disrupt BusyBox systems (often used for IoT devices) and ultimately initiate a very large-scale DDoS.
Recently, there has been an increased focus on building “intelligence” into defensive architectures and leveraging machine learning (ML) techniques to classify and detect malicious traffic. To implement an effective network intrusion detection system (NIDS), artificial intelligence (AI) techniques are of great value. AI techniques have great potential for building and training models by learning traffic patterns and providing heuristic solutions for IoT environments. These techniques are able to intelligently learn the underlying data properties without explicitly describing normal and malicious activities, overcoming the limitations of traditional schemes. AI has been used for many valid security issues in the IoT; however, this does not mean that IoT security issues are properly addressed. Many challenges remain in terms of the data, algorithms, and architecture.
1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3303919
Date of current version 3 November 2023.
ensemble learning (EL). They have been successfully applied to solve challenging problems in a number of fields.
EL refers to a group of aggregated models that work collectively to achieve a better final prediction by reducing bias. Bagging and boosting are two main types of EL methods. The key difference between them is the training method or how the trees are built and combined. Bagging adopts parallel training, while boosting adopts sequential learning.4 Both the bagging meta classifier (BMC) and an RF use bagging techniques to build full DTs in parallel. The difference lies in the prediction method. RF implies averaging for the final output, while the BMC implies a linear voting combination. Gradient boosting is an extension of boosting where the process of additively generating weak models is formalized as a gradient descent algorithm. The final prediction is a weighted sum of all of the tree predictions. EXtreme gradient boosting (XGB) is an implementation of gradient-boosted DTs (GBDTs).5 It uses a boosting technique that aggregates all the predictions from its constituent learners in a sequential manner. In such a way, each tree corrects the errors of its previous trees, reducing the residual error.
RELATED WORK
A number of studies have been conducted for achieving efficient AI-based intrusion detection in an IoT environment. Numerous research works have proposed adopting the DT algorithm for network intrusion detection in an IoT environment. Bahsi et al.6 applied feature selection to minimize the number of features to achieve higher accuracy. They evaluated two ML methods for classification, i.e., DT and k-nearest neighbor (KNN). They reported that the classification accuracy of KNN (94.97%) is less than that of DT (98.97%). They created a simulated IoT network consisting of nine IoT devices, including a security camera, webcam, baby monitor, thermostat, and doorbell. The dataset contained three labels: normal, Bashlite, and Mirai. The dataset packets were 502,605 normal, 2,835,317 Bashlite, and 2,935,131 Mirai records. Illy et al.7 combined different ML algorithms and built ensemble classifiers for intrusion detection. They adopted a DT bagging ensemble technique. The NSL-KDD dataset8 was used for model training and evaluation. They achieved accuracy rates of 85.81% and 84.25% for binary classification and attack classification, respectively.
Goyal et al.9 presented an approach for detecting botnets based on behavioral analysis. They evaluated logistic regression (LR), support vector machines (SVMs), artificial neural networks (ANNs), and DTs. The accuracy rates were reported as 99.23%, 99.86%, and 99.74%, respectively, while no accuracy was reported for DTs. Chaudhary and Gupta10 presented an ML-based framework for detecting DDoS attacks that works in two phases: the detection phase and the mitigation phase. Using Wireshark, they created a dataset by capturing the traffic of an IoT environment consisting of PCs and Raspberry Pi. It contained a total of 114,565 packets, of which 10,061 were benign. For classification, they evaluated four algorithms: RF, SVM, LR, and DT. They reported accuracy rates of 99.17%, 98.06%, 97.5%, and 98.34%, respectively.
Anthi et al.11 proposed a supervised approach consisting of three layers for detecting and classifying intrusions in the IoT. The system performed three main functions: 1) it formulated a profile for the normal behavior of each IoT device, 2) it identified malicious packets on the network in case of an attack, and 3) it classified the type of attack. They built a smart home testbed consisting of eight IoT devices. Twelve attacks were injected with four main categories, i.e., DoS, MITM, reconnaissance, and replay. To classify unseen data, nine classifiers were selected. The selection of classifiers was based on their classification time, ability to support multiclass classification, and high-dimensional feature space. The nine classifiers were naïve Bayes (NB), Bayesian network, DT J48, Zero R, One R, simple logistic, SVMs, multilayer perceptron (MLP), and RFs. The results showed that the DT J48 model achieved the best performance. Reported evaluations for the three functions of device profiling, detecting wireless attacks, and attack type classification were 96.2%, 90%, and 98%, respectively.
Aloqaily et al.12 adopted deep belief and DT ML mechanisms for intrusion detection in connected vehicle environments. The proposed model achieved an accuracy rate of 99.43%. They used a simulation dataset of 22,544 records.
Another research direction proposed the adoption of an RF algorithm for network intrusion detection in the IoT. Manimurugan et al.13 suggested a deep belief model for intrusion detection in smart medical environments. They used the CICNIDS 2017 dataset.14 Their model achieved good accuracy for a benign class (99.37%), but unsatisfactory accuracy for anomalies. The highest detection accuracy was for a web attack: 98.37%. Both brute force and port scan were detected at a rate of 97.71%. The lowest accuracy rates were for DoS/DDoS, with 96.67%, and infiltration, with 96.37%. Alsamiri and Alsubhi15 evaluated seven ML algorithms in terms of detecting IoT network attacks. They used Bot-IoT10 for evaluation and implementation. The reported highest detection accuracy rate was 99% for KNN. A lower performance of 97% was reported for RFs, Iterative
Dichotomizer 3, and adaptation boosting (ADB). Quadratic discriminant analysis, MLP, and NB achieved unsatisfactory accuracy rates of 87%, 84%, and 79%, respectively.
Doshi et al.16 generated a labeled training dataset consisting of benign and malicious traffic by simulating a local network of consumer IoT devices. They used this simulated and labeled dataset to test five different ML classifiers, i.e., KNN, SVM with linear kernel (LSVM), NN four-layer fully connected feedforward, DT, and RF using Gini impurity scores. They reported that incorporating stateful features improved accuracy over stateless features alone. For all the features, RF achieved the highest accuracy of 99.8% and 99.5% against both KNN and DT, respectively, and of 98.9% against NNs, while LSVMs had the worst accuracy of 92.1%. Dwyer et al.17 provided a Domain Name System (DNS)-based profiling scheme to detect Mirai-like botnet activities. Their approach depended on the contents of DNS queries and using RF classifiers. They evaluated their approach over real honeypot datasets and compared it with Bayesian-based classifiers and KNN. The highest accuracy was achieved by the RF classifier, at 99%.
Alrashdi et al.18 proposed an NIDS for the IoT in a smart city based on the RF algorithm and extra tree. To evaluate their model, the UNSW-NB1519 dataset was used. They reported a detection accuracy of 99.34% with the lowest false-positive (FP) rate. Hasan et al.20 performed a comparative implementation study to choose the best algorithm for detecting and classifying intrusions in the IoT. They considered LRs, SVMs, DTs, RFs, and ANNs. They used the Pahl open source dataset21 that contains the synthetic data of the Distributed Smart Space Orchestration System IoT environment. They reported that the best accuracy performance was of RFs, with 99.4%. However, their study was not applied to big data and other unknown problems.
Eskandari et al.22 presented an intelligent NIDS for the IoT using lean, one-class ML algorithms. They called their approach Passban, and it consisted of two one-class classification techniques, i.e., iForest and a local outlier factor. They built an IoT testbed to resemble a typical smart home automation environment. They evaluated two scenarios, i.e., an NIDS direct deployment on the IoT gateway and a separate, independent NIDS. They evaluated it against four attacks, i.e., port scanning, HTTP brute force, Secure Shell (SSH) brute force, and a synchronize (SYN) flood attack. The reported detection accuracy was between 79% and 99%. Thamilarasu et al.23 demonstrated a mobile agent-based intrusion detection system for the medical IoT. They simulated the topology of a hospital network on the medical IoT. They trained five supervised ML algorithms, i.e., SVM, DT, NB, KNN, and RF. They reported unsuitable performance for KNN and NB algorithms, while SVM, DT, and RF performed well. The best reported performance was for RF, which achieved an approximate accuracy of 100%.
A small number of studies reported the adoption of boosting techniques for network intrusion detection in the IoT. Alqahtani et al.24 presented an approach based on the XGB algorithm for detecting intrusions in the IoT. After reducing the number of features in the N-BaIoT dataset,25 the reported accuracy in multiclass classification was 99.97%.
Table 1 presents a comparative analysis for the previous related work in tabular form.
RESEARCH GAP ANALYSIS
After discussing previous efforts and contributions of adopting tree-based ML algorithms for network intrusion detection in the IoT, this section addresses the research gap analysis. Over the past six years, a number of methodologies and techniques have been proposed; however, this section analyzes the literature presented in the “Background” section to highlight research gaps and identify strengths and weaknesses of previous studies related to adopting tree-based ML approaches.
Lack of Using the Benchmark Dataset
Benchmarking plays an important role in improving research. Referring to Table 1, the majority of the presented methods were tested using simulated datasets. A few of them have been tested using standard, well-known datasets (i.e., NSL-KDD,1 CICNIDS,26 and UNSW-NB1519). This leaves no common reference point for evaluating the performance of the proposed approaches. As each of them depends on a different simulated dataset, the reported high accuracy values cannot be considered when benchmarking with other approaches.
Lack of Adopting Boosting-Based Approaches
Most of the previous studies adopted bagging-based tree approaches such as RF algorithms for intrusion detection in IoT networks, and only a small number of the previous studies reported extending this area by evaluating several boosting-based algorithms. Considering the potential for boosting-based techniques to reduce the models’ residual error, more attention should be directed toward studying their implementation for detecting intrusions in the IoT.
Lack of Evidence of Reported Results
Most of the surveyed studies reported high accuracy rates for their approaches, but no evidence of such
TABLE 1. Comparative analysis of previous related work.
Algorithm | Author | Reference | Year | Dataset | Features | Objective | Classes | Accuracy (%)
DT | Bahsi et al. | [6] | 2018 | Simulated | 10 | Reduce dimensionality of ML-based IoT botnet detection | Three | 98.97
DT (BE) | Illy et al. | [7] | 2019 | NSL-KDD | 38 | Securing fog to things | Five | 85.81
DT | Goyal et al. | [9] | 2019 | Simulated | Three | Detecting botnets based on behavioral analysis in the IoT | Two | 87.15
DT | Chaudhary and Gupta | [10] | 2019 | Simulated | N/A | DDoS detection in the IoT | Two | 98.34
DT (J48) | Anthi et al. | [11] | 2019 | Simulated | 121 | Intrusion detection in the smart medical IoT | Two | 99
 | | | | | 121 | | Four | 98
DT | Aloqaily et al. | [12] | 2019 | NSL-KDD | 122 | NIDS in connected vehicles | Five | 99.43
DT | Manimurugan et al. | [13] | 2020 | CICNIDS | 80 | NIDS in the smart medical IoT | Six | 98.37
RF | Doshi et al. | [16] | 2017 | Simulated | 11 | DDoS detection in the IoT | Two | 99.8
RF | Dwyer et al. | [17] | 2018 | Real dataset | Six | Profiling IoT botnet traffic using the DNS | Five | 99
RF | Chaudhary and Gupta | [10] | 2019 | Simulated | N/A | DDoS detection in the IoT | Two | 99.17
RF + ET | Alrashdi et al. | [18] | 2019 | UNSW-NB15 | 49 | NIDS for the IoT | Two | 99.34
RF | Hasan et al. | [20] | 2019 | Pahl | 13 | Intrusion detection and classifying in the IoT | Eight | 99.4
RF | Eskandari et al. | [22] | 2020 | Simulated | 24 | NIDS for the IoT | Five | 99
RF | Thamilarasu et al. | [23] | 2020 | Simulated | N/A | NIDS for the medical IoT | Two | 100
XGB | Alqahtani et al. | [24] | 2020 | N-BaIoT | Three | IoT botnet attack detection | Three | 99.96
BE: bagging ensemble; ET: extra tree; N/A: not applicable.
values was presented. Only a few studies shared the implementation of their approaches in public repositories. Sharing the implementation can provide evidence of the reported values and give other researchers the chance to check and improve.
EMPIRICAL STUDY
This study investigates the performance of adopting several tree-based algorithms for detecting IoT attacks. The detection algorithms are evaluated using the most recent standard dataset of IoT networks, along with different types of attacks.
Attack Modeling
This empirical study employed IoT botnet attacks, i.e., Gafgyt and Mirai. These two attacks are considered the two most common IoT botnet attacks. They attack IoT devices and turn them into a network of remotely controlled bots, called a botnet. Gafgyt is a DDoS attack. It is developed using C programming for infecting Linux systems and designed to easily cross compile to various computer architectures. It is also known as Bashlite, Bashdoor, Q-Bot, Torlus, Lizard Stresser, and Lizkebab.
Supported by a built-in dictionary of common usernames and passwords, it propagates by brute forcing default credentials of devices with open telnet ports. It uses a client–server model for command and control. If the default credentials combo is unchanged, it can log in to the device and infect it. It allows targeting of a network prefix or multiple addresses in a single attack. Both attacks disable the victim’s IoT device’s telnet and SSH services. The attacks executed by botnets include scanning, which can discover vulnerable devices; flooding, which makes use of SYN, acknowledgment, User Datagram Protocol, and TCP flooding; and combo attacks, which open connections and send junk data. Table 2 describes the subattacks of each.
Dataset Description
This section describes the dataset used in the experiments. The N-BaIoT25 dataset was selected for training and evaluation purposes as it is widely accepted as a benchmark sequential dataset. It contains realistic network traffic and a variety of attack traffic. It was collected by gathering the traffic of nine commercial IoT devices authentically infected by Mirai and Bashlite malware. The devices were two smart doorbells, one smart thermostat, one smart baby monitor, four security cameras, and one webcam. The traffic was captured when the devices were in normal execution and after infection with malware. The traffic was captured using a network-sniffing utility in a raw network traffic packet capture format. This can be achieved through port mirroring. Five features were extracted from the network traffic, as abstracted in Table 3. Three or more statistical measures were computed for each of these five features for data aggregation, resulting in a total of 23 features. These 23 distinct features were computed over five separate time windows (100 ms, 500 ms, 1.5 s, 10 s, and 1 min), as demonstrated in Figure 1. Using time windows makes this dataset appropriate for stateful IDS, resulting in a total of 115 features.
N-BaIoT contains instances of network traffic data divided into three categories: normal traffic (benign
TABLE 3. Abstracted attribute information of the N-BaIoT dataset.
Columns Weight through Correlation coefficient are the stream characteristics (statistical aggregation functions).
Stream aggregation designation | Stream aggregation description | Weight | Mean | Variance/standard deviation | Magnitude | Radius | Covariance | Correlation coefficient | Count | Time frame | Features
H | Host-source IP | ✓ | ✓ | Variance | x | x | x | x | Three | Five | 15
MI | Host-source IP + MAC | ✓ | ✓ | Variance | x | x | x | x | Three | Five | 15
HH | Host-to-host channel (source IP to destination IP) | ✓ | ✓ | Std Dev | ✓ | ✓ | ✓ | ✓ | 7 | Five | 35
HH_Jit | Host-to-host channel jitter | ✓ | ✓ | Variance | x | x | x | x | Three | Five | 15
HP HP | Host-port-to-host port channel (IP + socket) | ✓ | ✓ | Std Dev | ✓ | ✓ | ✓ | ✓ | 7 | Five | 35
Total traffic characteristics: 23. Tot. features: 115.
IP: Internet Protocol; MAC: media access control; Std Dev: standard deviation.
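The feature arithmetic summarized in Table 3 can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary and variable names are hypothetical, while the counts per stream aggregation and the five time windows are taken from the dataset description.

```python
# Number of statistical measures per stream aggregation, as listed in Table 3.
# The variable names below are illustrative and do not come from the dataset.
STATS_PER_AGGREGATION = {"H": 3, "MI": 3, "HH": 7, "HH_Jit": 3, "HP HP": 7}

# The five time windows over which every measure is computed.
TIME_WINDOWS = ["100 ms", "500 ms", "1.5 s", "10 s", "1 min"]

# Total number of distinct traffic characteristics (one per statistical measure).
characteristics = sum(STATS_PER_AGGREGATION.values())

# Features contributed by each stream aggregation across all time windows.
features_per_aggregation = {
    name: count * len(TIME_WINDOWS)
    for name, count in STATS_PER_AGGREGATION.items()
}

# Total number of features per data instance.
total_features = characteristics * len(TIME_WINDOWS)

print(characteristics, total_features)
```

The per-aggregation values (15, 15, 35, 15, 35) match the Features column, and their sum matches the stated total.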
FIGURE 1. Feature extraction. L1: level 1; L2: level 2; L3: level 3; L4: level 4; L5: level 5; H: host-source Internet Protocol; MI: host-source Internet Protocol + media access control; HH: host-to-host channel; HH_Jit: host-to-host channel jitter; HP HP: host-port-to-host-port channel.
data), Bashlite-infected traffic, and Mirai-infected traffic. Each data instance consists of 115 features represented by 23 different traffic characteristics in five different time frames. Table 3 presents an abstracted demonstration of the dataset’s attribute information. Figure 2 shows the data exploration for the dataset collected by three labeled types, i.e., benign, Mirai, and Gafgyt. Figure 3 shows the dataset’s individual distribution of 10 malware classes in addition to the benign traffic.
Model Implementation and Parameters
This study considers six tree-based algorithms for empirical evaluation: DT, RF, BMC, ADB, gradient descent boosting (GDB), and XGB. The experiments are conducted in a Colab interactive notebook environment. For the sake of providing an evidence-based evaluation, the project, along with the datasets, has been uploaded to and shared on GitHub(a) and Kaggle(b). All the configured parameters for each model are presented in Table 4. The number of estimators is the most critical parameter for all tree-based algorithms because it represents the number of trees to be used. Increasing this number generally improves the algorithm’s accuracy while also increasing the model’s time complexity. For a fair evaluation, the number of estimators is set to 100 for all algorithms.
(a) https://github.com/MohamedSaiedEssa/TreeBasedIoTNIDS
(b) https://www.kaggle.com/MohamedSaiedEssa/TreeBasedIoTNIDS
EVALUATION RESULTS
This section explains the confusion matrix and the evaluation metrics used for comparison. The evaluation results are then presented, and a further discussion is conducted.
Confusion Matrix
The confusion matrix is used to visualize the performance of a classification technique. It is a table that is often used to describe the performance on a set of test data. It allows for easy identification of confusion between classes. The classification is evaluated through four indicators that are used to calculate other performance measures:
- True positives (TP): packets are predicted as malicious, and their ground truth is malicious.
- True negatives (TN): packets are predicted as benign, and their ground truth is benign.
- False positives (FP): packets are predicted as malicious, while their ground truth is benign.
- False negatives (FN): packets are predicted as benign, while their ground truth is malicious.
A successful detection requires correct attack identification while minimizing the number of false alarms.
Evaluation Metrics
To perform a comprehensive performance assessment and objective evaluation, several metrics are addressed to indicate how the model performs. Accuracy alone is not sufficient for an imbalanced dataset. Four metrics are widely used for evaluating ML models: accuracy, precision, recall, and F1 score. These four measurements are defined through the following equations, respectively:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall (Sensitivity)} = \frac{TP}{TP + FN}$$
$$\text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
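As a minimal sketch of how these four metrics follow from the confusion-matrix counts, the function below computes them in plain Python. The function name and the example counts are hypothetical and chosen only for illustration; they are not results from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 score from
    true/false positive and negative counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

The zero-denominator guards are a design choice of this sketch: they return 0.0 when a class is never predicted or never present, rather than raising an error.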
The goal is to maximize all measurements, which range from zero to one. Higher values correspond to better classification performance.
For a further fair comparison, a computational cost analysis is conducted. This analysis includes the following measurements:
- Training time: Training time is the time it takes to build the model from the training data. Training time depends on the size of the dataset, complexity of the algorithm, and the available resources. For a fair evaluation, the same train-
TABLE 4. Configured parameters for each model.
Model | Parameter | Value
DT | criterion | gini
 | splitter | best
 | min_samples_split | 2
 | min_samples_leaf | 1
 | random_state | 0
BMC | estimator | DTClassifier
 | n_estimators | 100
 | random_state | 0
RF | criterion | gini
 | n_estimators | 100
 | min_samples_split | 2
 | min_samples_leaf | 1
 | random_state | 0
ADB | learning_rate | 0.1
 | n_estimators | 100
 | random_state | 0
GDB | n_estimators | 100
 | random_state | 0
XGB | eval_metric | mlogloss
 | n_estimators | 100
 | random_state | 0
GDB: gradient descent boosting.
TABLE 5. Evaluation results.
Technique: BMC, GDB, ADB, XGB, RF
Class: Malicious, Benign
Accuracy: 0.999982, 0.999964, 0.999968, 0.999892, 0.999941, 0.999914
Precision: 0.999964, 0.999946, 0.999946, 0.999946, 0.999936, 0.999945, 0.999883, 0.999784, 0.999981, 0.99999, 1, 1
Recall: 0.999963, 0.999946, 0.999945, 0.999945, 0.999945, 0.999982, 0.999937, 0.999882, 0.999783, 0.999991, 1
F1 Score: 0.999982, 0.999969, 0.999963, 0.999968, 0.999945, 0.999981, 0.999892, 0.999941, 0.999941, 0.999914, 0.999914, 0.999891
Training time (s): 427.58, 59.06, 504.89, 360.14, 480.41, 2307.51
Predicting time (s): 1.58, 0.69, 0.95, 3.62, 1.37, 0.1
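The parameter settings listed in Table 4 could be expressed with scikit-learn and XGBoost roughly as follows. This is a sketch under stated assumptions, not the authors' notebook: the names `MODEL_PARAMS` and `build_models` are hypothetical, the keyword arguments use the libraries' own spellings (for example `min_samples_split` and `mlogloss`), and the `estimator` argument of `BaggingClassifier` assumes scikit-learn 1.2 or later.

```python
# Parameter values transcribed from Table 4. The dictionary and function
# names are illustrative assumptions, not taken from the article's code.
MODEL_PARAMS = {
    "DT": {"criterion": "gini", "splitter": "best",
           "min_samples_split": 2, "min_samples_leaf": 1, "random_state": 0},
    "BMC": {"n_estimators": 100, "random_state": 0},  # base estimator: a DT
    "RF": {"criterion": "gini", "n_estimators": 100,
           "min_samples_split": 2, "min_samples_leaf": 1, "random_state": 0},
    "ADB": {"learning_rate": 0.1, "n_estimators": 100, "random_state": 0},
    "GDB": {"n_estimators": 100, "random_state": 0},
    "XGB": {"eval_metric": "mlogloss", "n_estimators": 100, "random_state": 0},
}


def build_models():
    """Instantiate the six classifiers. Imports are deferred so that the
    parameter table can be inspected without the libraries installed."""
    from sklearn.ensemble import (
        AdaBoostClassifier,
        BaggingClassifier,
        GradientBoostingClassifier,
        RandomForestClassifier,
    )
    from sklearn.tree import DecisionTreeClassifier
    from xgboost import XGBClassifier

    return {
        "DT": DecisionTreeClassifier(**MODEL_PARAMS["DT"]),
        # `estimator=` requires scikit-learn >= 1.2 (older releases used
        # `base_estimator=`).
        "BMC": BaggingClassifier(estimator=DecisionTreeClassifier(),
                                 **MODEL_PARAMS["BMC"]),
        "RF": RandomForestClassifier(**MODEL_PARAMS["RF"]),
        "ADB": AdaBoostClassifier(**MODEL_PARAMS["ADB"]),
        "GDB": GradientBoostingClassifier(**MODEL_PARAMS["GDB"]),
        "XGB": XGBClassifier(**MODEL_PARAMS["XGB"]),
    }
```

Calling `build_models()` returns a dictionary of unfitted estimators, each of which exposes the usual `fit` and `predict` methods, so the same training and timing loop can be applied to all six.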
Empirical Study
collecting the 10 malicious classes, a subset of the dataset is selected to form a balanced binary-labeled dataset. All the benign traffic is considered to contain 555,932 instances.
The remaining malicious traffic datasets are merged into two collective subsets, i.e., Mirai and Gafgyt, then randomly shuffled, with half of the benign number of instances selected from each of them. In this way, the total number of instances in the malicious subset equals the benign instances, representing a balanced dataset of 1,111,864 total instances. The six algorithms are fitted with the formed dataset. The performance evaluation metrics identified in the “Evaluation Metrics” section are calculated and documented in Table 5.

Discussion
The empirical evaluation results showed significant potential for the tree-based ML algorithms in detecting network intrusions in IoT environments. The RF algorithm achieved the best performance, with an accuracy rate of 0.999982. It achieved the highest results in all other measures. Figure 4 shows its confusion matrix. The GDB algorithm showed the longest training time due to it not supporting multithreading. The XGB algorithm, an implementation of GDB, supports multithreading.

FIGURE 4. Confusion matrix for an RF classifier.

CONCLUSION AND FUTURE WORK
This article presented an empirical evaluation for adopting ML tree-based algorithms for detecting network intrusions in the IoT. Six tree-based ML algorithms were implemented and tested using the well-known N-BaIoT dataset for benchmarking. The results demonstrated the significant potential of tree-based ML algorithms. The best performance was achieved using the RF algorithm (accuracy rate of 0.999982) and relatively reasonable training and predicting times.
There are several potential areas for future research in IoT intrusion detection considering the capabilities provided by edge AI. Edge computing is an enabling technology that enhances IoT security performance by introducing edge AI. By enabling AI algorithms to run on the edge, edge AI technology opens future research directions and challenges. To prevent attacks in real time, edge AI algorithms need to be fast and accurate. Future research could focus on developing edge AI algorithms that can quickly detect and respond to attacks in real time without compromising performance.

REFERENCES
1. M. Almiani, A. Abughazleh, A. Al-Rahayfeh, S. Atiewi, and A. Razaque, “Deep recurrent neural network for IoT intrusion detection system,” Simul. Model. Pract. Theory, vol. 101, May 2020, Art. no. 102031, doi: 10.1016/j.simpat.2019.102031.
2. M. Dimolianis, G. S. Member, A. Pavlidis, and V. Maglaris, “Signature-based traffic classification and mitigation for DDoS attacks using programmable network data planes,” IEEE Access, vol. 9, pp. 113,061–113,076, Aug. 2021, doi: 10.1109/ACCESS.2021.3104115.
3. J. Alzubi, A. Nayyar, and A. Kumar, “Machine learning from theory to algorithms: An overview,” J. Phys. Conf. Ser., vol. 1142, Dec. 2018, Art. no. 012012, doi: 10.1088/1742-6596/1142/1/012012.
4. C. Bentéjac, A. Csörgő, and G. Martínez, “A comparative analysis of gradient boosting algorithms,” Artif. Intell. Rev., vol. 54, no. 3, pp. 1937–1967, Mar. 2021, doi: 10.1007/s10462-020-09896-5.
5. “XGBoost introduction.” Pythongeeks. Accessed: Jul. 17, 2023. [Online]. Available: https://pythongeeks.org/xgboost-introduction/
6. H. Bahsi, S. Nomm, and F. B. La Torre, “Dimensionality reduction for machine learning based IoT botnet detection,” in Proc. 15th Int. Conf. Contr., Automat., Robot. Vision (ICARCV), 2018, pp. 1857–1862, doi: 10.1109/ICARCV.2018.8581205.
7. P. Illy, G. Kaddoum, C. M. Moreira, K. Kaur, and S. Garg, “Securing fog-to-things environment using intrusion detection system based on ensemble learning,” in Proc. IEEE Wireless Commun. Netw. Conf., 2019, pp. 1–7, doi: 10.1109/WCNC.2019.8885534.
8. “NSL-KDD dataset,” University of New Brunswick, Fredericton, NB, Canada, 2009. Accessed: Jul. 30, 2023. [Online]. Available: https://www.unb.ca/cic/datasets/nsl.html
9. M. Goyal, I. Sahoo, and G. Geethakumari, “HTTP botnet detection in IOT devices using network traffic analysis,” in Proc. Int. Conf. Recent Adv. Energy-Efficient Comput. Commun. (ICRAECC), 2019, pp. 1–6, doi: 10.1109/ICRAECC43874.2019.8995160.
10. P. Chaudhary and B. B. Gupta, “DDoS detection framework in resource constrained Internet of Things domain,” in Proc. IEEE 8th Global Conf. Consum. Electron. (GCCE), 2019, pp. 675–678, doi: 10.1109/GCCE46687.2019.9015465.
11. E. Anthi, L. Williams, M. Słowi, G. Theodorakopoulos, and P. Burnap, “A supervised intrusion detection system for smart home IoT devices,” IEEE Internet Things J., vol. 6, no. 5, pp. 9042–9053, Oct. 2019, doi: 10.1109/JIOT.2019.2926365.
12. M. Aloqaily, S. Otoum, I. A. Ridhawi, and Y. Jararweh, “An intrusion detection system for connected vehicles in smart cities,” Ad Hoc Netw., vol. 90, Jul. 2019, Art. no. 101842, doi: 10.1016/j.adhoc.2019.02.001.
13. S. Manimurugan, S. Al-Mutairi, M. Aborokbah, N. Chilamkurti, S. Ganesan, and R. Patan, “Effective attack detection in internet of medical things smart environment using a deep belief neural network,” IEEE Access, vol. 8, pp. 77,396–77,404, Apr. 2020, doi: 10.1109/ACCESS.2020.2986013.
14. D. Stiawan, M. Yazid, and A. M. Bamhdi, “CICIDS-2017 dataset feature analysis with information gain for anomaly detection,” IEEE Access, vol. 8, pp. 132,911–132,921, Jul. 2020, doi: 10.1109/ACCESS.2020.3009843.
15. J. Alsamiri and K. Alsubhi, “Internet of Things cyber attacks detection using machine learning,” Int. J. Adv. Comput. Sci. Appl., vol. 10, no. 12, pp. 627–634, Jan. 2019.
16. R. Doshi, N. Apthorpe, and N. Feamster, “Machine learning DDoS detection for consumer Internet of Things devices,” in Proc. Deep Learn. Secur. Workshop (DLS), 2018, pp. 29–35, doi: 10.1109/SPW.2018.00013.
17. O. P. Dwyer, A. K. Marnerides, V. Giotsas, and T. Mursch, “Profiling IoT-based botnet traffic using DNS,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), 2018, pp. 1–6, doi: 10.1109/GLOBECOM38437.2019.9014300.
18. I. Alrashdi, A. Alqazzaz, E. Aloufi, R. Alharthi, M. Zohdy, and H. Ming, “AD-IoT: Anomaly detection of IoT cyberattacks in smart city using machine learning,” in Proc. IEEE 9th Annu. Comput. Commun. Work. Conf., 2019, pp. 305–310, doi: 10.1109/CCWC.2019.8666450.
19. N. Moustafa and J. Slay, “UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 Network Data Set),” in Proc. Mil. Commun. Inf. Syst. Conf. (MilCIS), 2015, pp. 1–6, doi: 10.1109/MilCIS.2015.7348942.
20. M. Hasan, M. M. Islam, M. I. I. Zarif, and M. M. A. Hashem, “Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches,” Internet Things, vol. 7, Sep. 2019, Art. no. 100059, doi: 10.1016/j.iot.2019.100059.
21. F.-X. A. M.-O. Pahl. “DS2OS traffic traces.” Kaggle. Accessed: Jun. 20, 2023. [Online]. Available: https://www.kaggle.com/datasets/francoisxa/ds2ostraffictraces
22. M. Eskandari, Z. H. Janjua, M. Vecchio, and F. Antonelli, “Passban IDS: An intelligent anomaly-based intrusion detection system for IoT edge devices,” IEEE Internet Things J., vol. 7, no. 8, pp. 6882–6897, Aug. 2020, doi: 10.1109/JIOT.2020.2970501.
23. G. Thamilarasu, A. Odesile, and A. Hoang, “An intrusion detection system for Internet of Medical Things,” IEEE Access, vol. 8, pp. 181,560–181,576, Sep. 2020, doi: 10.1109/ACCESS.2020.3026260.
24. M. Alqahtani, H. Mathkour, and M. M. Ben Ismail, “IoT botnet attack detection based on optimized extreme gradient boosting and feature selection,” Sensors, vol. 20, no. 21, Nov. 2020, Art. no. 6336, doi: 10.3390/s20216336.
25. Y. Meidan et al., “N-BaIoT: Network-based detection of IoT botnet attacks using deep autoencoders,” IEEE Pervasive Comput., vol. 17, no. 3, pp. 12–22, Jul./Sep. 2018, doi: 10.1109/MPRV.2018.03367731.
26. R. Panigrahi and S. Borah, “A detailed analysis of CICIDS2017 dataset for designing intrusion detection systems,” Int. J. Eng. Technol., vol. 7, no. 3, pp. 479–482, Jan. 2018.

MOHAMED SAIED is a Ph.D. candidate in information technology in the Department of Information Technology, the Institute of Graduate Studies and Research, Alexandria University, Alexandria, 21526, Egypt, and is the automation manager of EZDK Steel Company, Alexandria, 21537, Egypt. His research interests include software engineering, software flexibility, and artificial intelligence. Saied received his M.Sc. degree in information technology from the Institute of Graduate Studies and Research, Alexandria University. Contact him at [email protected].

SHAWKAT KAMAL GUIRGUIS is a professor of computer science and informatics in the Department of Information Technology, the Institute of Graduate Studies and Research, Alexandria University, Alexandria, 21526, Egypt. His research interests include computer networks, information security, and databases. Shawkat received his Ph.D. degree in electronics and communication from the University of London. Contact him at [email protected].
In recent years, we have witnessed growing interest in using deep learning to detect
misinformation. This increased attention is being driven by deep learning technologies’
ability to accurately detect this misinformation. However, there is a diverse array of
content that can be considered misinformation, such as fake news and satire. Similarly,
in the field of deep learning, there are several architectures with variable efficacy
depending on the context and data involved. This study aims to highlight the various
types of misinformation attacks and deep learning architectures that are used to detect
them. Based on our selection of the recent literature, we present a classification of deep
learning approaches and their relative effectiveness in detecting misinformation, along
with their limitations in terms of accuracy as well as computational overhead. Finally,
we discuss some challenges and limitations that arise from the use of deep learning
architectures in misinformation detection.
Misinformation detection research has been on the rise over the past decade, yet misinformation is still prevalent today on the Internet. Most recent efforts have focused on automation to easily detect misinformation and compensate for the inability of company employees to deal with the large volumes of data that are being generated on the Internet. Machine learning, and in particular, deep learning solutions, have shown promise in tackling misinformation issues.8 Yet, several years of research have demonstrated high accuracy in misinformation detection and revealed that the problem is not monolithic in nature. Attacks such as fake news, satire, and hate news can be classified as misinformation, yet the intent and effect differ for each one of these examples. As such, the effectiveness of deep learning approaches may vary depending on the misinformation that it aims to detect. In fact, many studies develop, train, and use deep learning methods to detect misinformation content by using binary labels (e.g., fake or not fake text).6 The outcome is that assumptions are made about the accuracy of deep learning models that may not reflect reality, such as effectiveness in real-world application and portability across social media platforms.
In this article, we discuss the various types of misinformation and highlight representative cases of deep learning models associated with different types of misinformation detection. In particular, although larger survey articles have focused on providing an exhaustive list of research efforts and results, this article aims to provide a succinct representative critical review of the existing direction of misinformation detection and provide possible recommendations for future research in the area. We organize the rest of the article as follows. The “Definition” section provides an overview of the various types of misinformation as defined in the literature. In the “Deep Learning Approaches” section, we present the deep learning approaches that have been used to detect different types of misinformation and highlight their strengths and weaknesses. Finally, in the “Challenges and Opportunities” section, we identify some challenges and opportunities for future research in the domain of misinformation detection using deep learning.

1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3314752
Date of current version 3 November 2023.

DEFINITION
Misinformation can be defined as a variant of otherwise content deception. Proper communication involves a sender $S$ that transmits a message $M$ over a communication channel $C$ to a recipient $R$.21 The sender and receiver utilize functions $f(t)$ and $d(M', e)$ so that the lossless function $f(t) = M$ and $d(M, e) = t$, where $t$ is some truth, $e$ is a person’s bias, and $M$ is an encoded message form for $t$. Rarely does a receiver not possess a bias, so $t$ can potentially be distorted to $t_d$, a distorted
truth, the degree of which may vary depending on the bias. Other instances that can result in $t_d$ are when the original $t$ is not used and a generated (altered) version $t'$ is used instead by $S$ (e.g., completely fake content and otherwise not based on any truth), or when the lossless function $f$ is replaced by a lossy function $g$ when encoding the message (i.e., skewed the original truth). Manipulation of this process leads to several categories of misinformation, from which intent can often be attributed. That is, some misinformation has malicious intent, while in other cases it is meant to be benign. For example, a study15 used different types of misinformation but did not formally clarify how the attacker would generate each type of misinformation. Thus, in this article, we elaborate more on this aspect and fit these categories in the aforementioned definition that we provided in the previous paragraph. Later in this section, we elaborate on the existing classification as well as provide our revised classification for misinformation attacks.
The study15 released a dataset that was labeled based on multiple categories, the origin of which was from a heuristically crafted typology posted on a blog from Claire Wardle. The original categories were satire, false connection, misleading content, false context, imposter content, manipulated content, and fabricated content. Some of these can be identical to others based on the definitions provided earlier, while others can be distinctly different. Satire can alter $t$, replace $f$ with $g$, or expect a recipient’s $e$ to influence the decoding process of a message. A combination of these can also be used to achieve satire. However, these approaches exist in multiple other classifications of misinformation. For example, false connection misinformation, where there exists a mismatch between the content (e.g., the text and the image do not reflect the same truth), can be achieved by using $g$. In other words, it is not necessary to alter the truth or rely on the recipient’s bias to achieve the delivery of misinformation.
A similar approach is used with false context, where $g$ is used to misrepresent otherwise true information in a biased manner. Manipulated content also utilizes $g$ in such a way that the truth $t$ is distorted. The recipient’s bias ($e$) may also influence the effectiveness of such misinformation. In contrast, misleading content mixes elements of the original truth along with an altered truth such that $t$ $t'$. Imposter content utilizes a fake truth $t'$ but utilizes a function $g$ so that it mimics a credible source. Based on a person’s bias $e$, the fake truth will be accepted as legitimate. Finally, fabricated content uses a fake truth $t'$ but does not otherwise rely on a specialized function to encode the message or the recipient’s bias. It is worth pointing out that there exist further categories such as extreme bias and hate news, which can also be categorized as misinformation. We posit that these can be fitted into the referenced classifications mentioned previously. For example, extreme bias can be fitted under the manipulated content category and hate news under the false context category.
The definitions mentioned earlier demonstrate the variety of approaches to generate misinformation and that the aforementioned categories are not uniform in their application. Further, their purpose focuses more on explaining intent, which is difficult to measure in practice. Additional aspects can also be subjective, such as the cost to the victim as well as to the attacker. Therefore, for the purposes of matching the effectiveness of deep learning as a defensive strategy, we focus on classifying the following misinformation attacks into corresponding larger categories based on how the attacker achieves the specific misinformation attack:

Generated content: This misinformation attack focuses on cases where primarily the truth $t$ is manipulated. It includes fabricated content and misleading content.
Altered content: This misinformation attack focuses on cases where the function $g$ is the primary means for the attack. It includes manipulated content, false context, and false connection.
Hybrid approach: This misinformation attack focuses on utilizing various approaches between the encoding function $g$, truth $t$, and even a person’s bias $e$. It includes satire and imposter content.

DEEP LEARNING APPROACHES
The definition of deep learning varies slightly in the literature, but deep learning models are effectively neural network models.18 In turn, neural network models can be divided into discriminative and generative. Discriminative models utilize a bottom-up approach, where input flows through several hidden layers and onto an output layer. They are used in classification and regression problems and the data must be labeled (i.e., supervised learning). On the other hand, generative models utilize a top-down approach for unsupervised problems, pretraining and probabilistic distribution problems. In the literature, a third category, hybrid models, are meant for classification problems and primarily utilize discriminative models, but assist them through data generated by generative models.3 However, the categories we mentioned earlier are too broad, so often in the literature these categories are further expanded.
There are several feedforward neural network architectures such as convolutional neural networks (CNNs),
autoencoders, adversarial networks, and restricted Boltzmann machines (RBMs). Further architectures include recurrent neural networks (RNNs), radial basis function (RBF) neural networks, and Kohonen self-organizing neural networks.18 Feedforward neural networks consist of one-directional flow models and can be unsupervised as well as supervised. In contrast, RNNs have a cyclical flow where the output feeds into the input and are supervised. RBF neural networks operate similarly to feedforward neural networks, with the main difference being the utilization of an RBF (i.e., a Gaussian function). They can operate in either a supervised or unsupervised manner. Kohonen self-organizing neural networks focus on unsupervised problems to self-organize the input data. They are primarily used for reducing the dimensionality of data.

MISINFORMATION DETECTION USING DEEP LEARNING
For the purposes of our analysis, we collected a representative nonexhaustive sample of research studies that made use of the different deep learning architectures in misinformation detection. The selection was made based on the deep learning architectures that were used in each paper, and the most relevant results retrieved from Google Scholar formed the basis for the sample that we used. In the next step of the selection process, we retained the most representative articles for each deep learning architecture. We examined them in conjunction with the datasets that were used, especially in relation to the type of misinformation content that they attempted to detect. Table 1 summarizes our critical review of the methods based on how they have been used in the current literature. We also present the associated computational requirement for each architecture. We did this qualitatively because the time complexity for architectures differs depending on the deep learning parameters, complexity of the model, and data as well as the implementation. Finally, we also provide our interpretation of effectiveness related to the use of these techniques in detecting different misinformation content.

CNNs
Numerous studies have utilized feedforward neural network techniques in misinformation detection. For example, a study utilized CNNs as a means to both identify features and in turn identify fake news.10 The accuracy of fake news detection on benchmarked datasets was 98%. Similarly, several other studies have also utilized CNNs and demonstrated high detection accuracy (higher than 94%) for fake news.12,14,24 Such studies involved binary classification and the datasets comprise mainly generated content. The focus of these models is on identifying fake content through lexical features, and consequently, given the lack of any metadata, they are unable to be used as tools for identifying altered content or mixed-approach misinformation.
However, studies that utilize CNNs to detect altered content can be found in cases where different communication modalities may exist. For example, one recent study has demonstrated the use of regions with CNN models to detect altered images.28 The approach yielded higher than 90% accuracy in detecting altered images in the NIST16 dataset, part of the Open Media Forensics Challenge provided by the National Institute of Standards and Technology.
Another study also combined text and image training at the same time to identify altered content using a proposed text and image information-based CNN approach.29 The study not only demonstrated high detection rates (F1 was 0.921), but it also showed that CNN approaches that focused on just images or text were inferior to methods that combined the two. In addition, the datasets contain mixed content of misinformation (i.e., both altered and generated content in images or text), showing the potential for such a method, when used appropriately, to detect a large variety of misinformation. CNNs tend to be highly efficient to train. Depending on their configuration, they can have a linear time complexity. Due to their
TABLE 1. Deep learning architectures and associated papers, data, and relative computational overhead.
feedforward nature and lack of backpropagation, they are also not as memory intensive as other deep learning methods (e.g., RNNs). The data used by most of the studies presented in this section include publicly available datasets that increase the reproducibility of results. The studies used a combination of large as well as small datasets and one study included the use of short texts (tweets).

RNNs
The primary architecture utilized under this type of deep learning network is long short-term memory (LSTM). The use of such an architecture on its own is rare because it does not yield high detection accuracy. For example, a study that used LSTM on text yielded accuracy comparable to just using CNNs.29 Deep, stacked LSTM networks yield much higher accuracy, especially when multiple word-embedding techniques are used on a text.17 However, such techniques perform well when layers of CNNs and LSTM are combined. One such study16 applied such a combination to binary text classification with variable success. In one dataset, the hybrid model of a CNN and LSTM performed as well as the two models separately (although all cases had F1 scores above 0.98), whereas in another dataset, the hybrid model performed better than the stand-alone CNN and LSTM approaches.
All of the approaches mentioned previously were applied to misinformation that was generated content or did not include other metadata to determine whether part of the text has a basis on true text. LSTM is primarily used for detecting altered content in manipulated images. Studies have demonstrated higher accuracy results with LSTM over other methods, especially compared to nondeep learning approaches. For example, a study1 has shown accuracy rates between 70% and 80% for a variety of existing forged-image datasets (e.g., NIST16). Uses for detecting a false connection (altered content) also exist. In one study,22 articles with unrelated titles and article contents were detected with 97.8% accuracy using a combination of a CNN and LSTM. It is worth pointing out that the computational overhead of LSTM, along with the higher number of parameters that need to be configured, makes it a computationally expensive solution compared to its counterparts. Studies that utilized this technique used large, publicly available datasets. Their performance for short texts or small datasets is unknown because they were not used in any of the studies. Reproducibility remains high because the datasets are publicly available.

Autoencoders
The deep learning architectures mentioned previously often require a large set of labeled data to detect misinformation. As a result, some researchers have proposed approaches to misinformation detection that use unsupervised deep learning architectures. Autoencoders often require much fewer data than their supervised counterparts. A study13 on fake news detection has demonstrated that autoencoder use is viable and can often perform better than a technique such as CNNs in detecting fake news. In particular, the study trained its models on real Twitter and Weibo posts, with no fakes included in the training. Fake cases were only used during the testing phase, where the efficacy of the approach was measured. Another study11 that utilized autoencoders on images also demonstrated that the method can be highly effective in detecting altered content, such as altering people’s faces using deepfake approaches.
Given that autoencoders work similarly to anomaly detection approaches, the likelihood of identifying deviations from the norms makes them good candidates for detecting both generated and altered content. Further, autoencoders are, to some degree, similar to CNNs, and therefore their computational cost is relatively low compared to other deep learning methods. However, unsupervised learning often requires larger datasets compared to supervised learning, and that is also true for the aforementioned studies. As a method, autoencoders are also used for pretraining of unlabeled data, which are then passed to other supervised deep learning networks (e.g., CNNs). The studies in this category used small, publicly available datasets. The use of short texts yielded high accuracy, although it was domain specific (e.g., Twitter and Weibo). Given the public nature of datasets, reproducibility for all these studies is convenient for future researchers.

Adversarial Networks
Adversarial networks are the primary approach for generating fake videos and images (otherwise called deepfakes). In particular, generative adversarial networks (GANs) are designed so that a generator and discriminator component interact to produce better generated images that the discriminator component cannot distinguish from untampered content.27 However, the same technique can be used in an unsupervised manner to improve the detection of content that appears to deviate from some baseline (i.e., anomaly detection). One study utilized an event adversarial neural network to detect fake news on Twitter and Weibo datasets.25 These studies demonstrate that GANs are an effective approach in detecting fake images, compared with RNNs.
In general, given the ability to perform in nonlabeled datasets, the technique can detect generated and altered content more effectively than other supervised techniques. It is also an area where limited research
now exists, but the potential for using a GAN architecture for misinformation detection has been highlighted in the literature.2 The computational requirements are higher for training adversarial networks because both a generator and a discriminator model need to be trained. Thus, they may not be an ideal or affordable solution for all misinformation detection contexts. These studies used smaller datasets focused on shorter texts (e.g., tweets). It could be a challenge to reproduce the aforementioned studies because the datasets are not available directly, as an application programming interface (API) must be used. Moreover, the data may have been deleted since the studies were conducted.

RBMs
The use of RBMs in fake news detection is another underutilized deep learning architecture. One recent work7 that made use of RBMs for detecting hate speech on Twitter has demonstrated that RBMs can be used to pretrain word vectors before they can be passed on to other hidden layers in a deep learning model. The results obtained with RBMs against traditional machine learning architectures (e.g., random forest) have demonstrated a higher accuracy for the RBM neural network architecture. As with other unsupervised approaches, RBMs can be effective in detecting generated as well as altered content. In particular, the stochastic nature of the architecture can help uncover hidden features that a deterministic architecture, such as autoencoders, may not. However, an RBM’s stochastic nature, which is due to sampling via Markov chains, makes RBMs more expensive to train compared to other deep learning approaches. The work in this category utilized small datasets and focused on short texts. Our primary concern is in terms of reproducibility because custom data, which were retrieved using a Twitter API, were used. These data may have been altered or deleted since the study was conducted.

CHALLENGES AND OPPORTUNITIES
We identified a few challenges and opportunities during the review of the current literature on misinformation detection. These include dataset considerations, underutilized deep learning architectures, lack of simulations, and policy and regulatory considerations. We discuss these next.

Dataset Considerations
Many studies on misinformation detection utilize the same datasets to demonstrate their efficacy. For example, the Kaggle dataset consists of reliable and unreliable news articles, a binary classification, and primarily consists of generated content (i.e., completely manufactured misinformation).9 The McIntire dataset is similar to Kaggle and focuses on fake news and primarily generated content. Similarly, the FakeNews dataset focuses on just content and no associated metadata, where the misinformation is generated content. Finally, the WELFake dataset was funded by the European Commission and consists of an aggregate of multiple datasets such as Kaggle and McIntire, among others.23 The purpose was to make a larger dataset that can aid in the prevention of overfitting of classifiers. However, the labels are binary and consist of generated content. Even in cases where we may be looking at altered content (i.e., a mix of truth and lie), there is no associated metadata to help train the classifier in identifying such metadata. Further, the classification of misinformation is based solely on lexical features.
A better example of a useful, publicly available dataset that we found that has been used in misinformation detection was the PolitiFact LIAR dataset.4 It uses text to identify the degree of a lie, with six levels of identification regarding the credibility of the text. This opens the possibilities of enabling the creation of new deep learning models that can understand the egregiousness of misinformation, as opposed to just a binary classification of whether something is or is not misinformation. We posit that new, complex datasets must be created and made available for the deep learning misinformation detection field to advance and become more accurate. At present, only trivial gains are being made with each new study that applies a new model on these existing datasets.

Underutilized Deep Learning Architectures
There are some underutilized deep learning architectures used in misinformation classification studies. Such architectures include RBFs and a Kohonen self-organizing neural network. However, studies that utilize these models for misinformation detection are limited. For example, an RBF was used in conjunction with support vector machines with a relatively high success rate by Song et al.19 and Wang et al.26 The deep learning variant of an RBF neural network was also used for classification, but no noteworthy example uses can be found in the literature for misinformation detection. Similarly, to date, a Kohonen self-organizing neural network has not been used in the domain of misinformation detection. These examples highlight a research gap that exists in this area as well as a missed opportunity to identify how these perform in this domain.

Lack of Simulations
Although most studies divide the datasets into training and testing, to estimate the real-world efficacy for their
11. H. Khalid and S. S. Woo, “OC-FakeDect: Classifying deepfakes using one-class variational autoencoder,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), 2020, pp. 2794–2803, doi: 10.1109/CVPRW50498.2020.00336.
12. M. Z. Khan and O. H. Alhazmi, “Study and analysis of unreliable news based on content acquired using ensemble learning (Prevalence of Fake News on Social Media),” Int. J. Syst. Assurance Eng. Manage., vol. 11, no. S2, pp. 145–153, 2020, doi: 10.1007/s13198-020-01016-4.
13. D. Li, H. Guo, Z. Wang, and Z. Zheng, “Unsupervised fake news detection based on autoencoder,” IEEE Access, vol. 9, pp. 29,356–29,365, Feb. 2021, doi: 10.1109/ACCESS.2021.3058809.
14. M. Mersinias, S. Afantenos, and G. Chalkiadakis, “CLFD: A novel vectorization technique and its application in fake news detection,” in Proc. 12th Lang. Resour. Eval. Conf. (LREC), 2020, pp. 3475–3483.
15. K. Nakamura, S. Levy, and W. Y. Wang, “Fakeddit: A new multimodal benchmark dataset for fine-grained fake news detection,” in Proc. 12th Lang. Resour. Eval. Conf., 2020, pp. 6149–6157.
16. J. A. Nasir, O. Subhani Khan, and I. Varlamis, “Fake news detection: A hybrid CNN-RNN based deep learning approach,” Int. J. Inf. Manage. Data Insights, vol. 1, no. 1, 2021, Art. no. 100007, doi: 10.1016/j.jjimei.2020.100007.
17. P. K. Roy, A. K. Tripathy, T.-H. Weng, and K.-C. Li, “Securing social platform from misinformation using deep learning,” Comput. Standards Interfaces, vol. 84, Mar. 2023, Art. no. 103674, doi: 10.1016/j.csi.2022.103674.
18. A. Shrestha and A. Mahmood, “Review of deep learning algorithms and architectures,” IEEE Access, vol. 7, pp. 53,040–53,065, Apr. 2019, doi: 10.1109/ACCESS.2019.2912200.
19. C. Song, K. Shu, and B. Wu, “Temporally evolving graph neural network for fake news detection,” Inf. Process. Manage., vol. 58, no. 6, 2021, Art. no. 102712, doi: 10.1016/j.ipm.2021.102712.
20. M. Tsikerdekis and S. Zeadally, “Multiple account identity deception detection in social media using nonverbal behavior,” IEEE Trans. Inf. Forensics Security, vol. 9, no. 8, pp. 1311–1321, Aug. 2014, doi: 10.1109/TIFS.2014.2332820.
21. M. Tsikerdekis and S. Zeadally, “Detecting online content deception,” IT Prof., vol. 22, no. 2, pp. 35–44, Mar./Apr. 2020, doi: 10.1109/MITP.2019.2961638.
22. M. Umer, Z. Imtiaz, S. Ullah, A. Mehmood, G. S. Choi, and B.-W. On, “Fake news stance detection using deep learning architecture (CNN-LSTM),” IEEE Access, vol. 8, pp. 156,695–156,706, Aug. 2020, doi: 10.1109/ACCESS.2020.3019735.
23. P. K. Verma, P. Agrawal, I. Amorim, and R. Prodan, “WELFAKE: Word embedding over linguistic features for fake news detection,” IEEE Trans. Computat. Social Syst., vol. 8, no. 4, pp. 881–893, Aug. 2021, doi: 10.1109/TCSS.2021.3068519.
24. P. K. Verma, P. Agrawal, V. Madaan, and R. Prodan, “MCred: Multi-modal message credibility for fake news detection using BERT and CNN,” J. Ambient Intell. Humanized Comput., vol. 14, no. 8, pp. 10,617–10,629, 2022, doi: 10.1007/s12652-022-04338-2.
25. Y. Wang et al., “EANN: Event adversarial neural networks for multi-modal fake news detection,” in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2018, pp. 849–857, doi: 10.1145/3219819.3219903.
26. Z. Wang, Z. Yin, and Y. A. Argyris, “Detecting medical misinformation on social media using multimodal deep learning,” IEEE J. Biomed. Health Inform., vol. 25, no. 6, pp. 2193–2203, Jun. 2021, doi: 10.1109/JBHI.2020.3037027.
27. D. Yadav and S. Salmani, “Deepfake: A survey on facial forgery technique using generative adversarial network,” in Proc. Int. Conf. Intell. Comput. Control Syst. (ICCS), 2019, pp. 852–857, doi: 10.1109/ICCS45141.2019.9065881.
28. C. Yang, H. Li, F. Lin, B. Jiang, and H. Zhao, “Constrained R-CNN: A general image manipulation detection model,” in Proc. IEEE Int. Conf. Multimedia Expo (ICME), 2020, pp. 1–6, doi: 10.1109/ICME46284.2020.9102825.
29. Y. Yang, L. Zheng, J. Zhang, Q. Cui, Z. Li, and P. S. Yu, “TI-CNN: Convolutional neural networks for fake news detection,” 2018, arXiv:1806.00749.

MICHAIL TSIKERDEKIS is an associate professor with the Computer Science Department, Western Washington University, Bellingham, WA, 98225, USA. His research interests include deception, data mining, and cybersecurity. Tsikerdekis received his Ph.D. degree in informatics from Masaryk University, Czechia. He is a Senior Member of IEEE. Contact him at [email protected].

SHERALI ZEADALLY is a university research professor at the University of Kentucky, Lexington, Kentucky, 40506, USA, and the University of Kentucky Alumni Association Endowed Professor. His research interests include cybersecurity, privacy, and the Internet of Things. Zeadally received his doctoral degree in computer science from the University of Buckingham, England. He is a Senior Member of IEEE and a fellow of the British Computer Society and the Institution of Engineering Technology. Contact him at szeadally@uky.edu.
DEPARTMENT: CYBERSECURITY

1520-9202 © 2023 Crown Copyright
Digital Object Identifier 10.1109/MITP.2023.3314368
Date of current version 3 November 2023.
Disclaimer: Certain equipment, instruments, software, or materials, commercial or noncommercial, are identified in this paper to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement of any product or service by NIST, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.

There are more than 228,000 (as of August 2023) publicly disclosed cybersecurity vulnerabilities in the Common Vulnerabilities and Exposures (CVE)1 repository, with more than 25,000 documented in 2022 alone. Systematic labeling of this huge set of software security vulnerabilities would enable advances in modern AI cybersecurity research (e.g., Malzahn et al.2). The current state of the art are the National Vulnerability Database (NVD)3 mappings of CVEs to Common Weakness Enumeration (CWE) entries and assignments of CVE severity scores according to the Common Vulnerability Scoring System (CVSS).4 However, deeper analysis of the CWE entries shows that many are overly specific, ambiguous, or overlapping, complicating the CWE–CVE assignment.

The Bugs Framework (BF)5 formalism ensures precise descriptions of software security vulnerabilities,6 which is instrumental in gaining a deeper understanding of the CVEs. The existing NVD mappings, although flawed, offer an opportunity to gain high-level insights into the CVEs through the CWEs. Mapping the CWEs to BF and then using the NVD CWE–CVE assignments allows us to take advantage of BF’s formal model in the context of the CVEs. This application of BF can capture the primary concepts (e.g., recurrent operations and consequences) expressed by the overwhelming number of CVEs and ultimately inform an automated labeling approach based on machine learning (ML). We can learn, for example, that thousands of CVEs relate to erroneous read, while there are none relating to erroneous reallocate. The CWE–BF mappings also provide formal BF descriptions that at least partially fit many CVEs, which can aid in their annotation.

BF’s formalism allows us to specify each CWE as a (cause, operation, consequence) BF weakness or a chain of BF weaknesses.6 These specifications reveal the underlying tacit model of the CWEs and allow us to identify similarities and overlaps in the CWE. The latter are part of the issues that introduce ambiguities into the CWE to CVE assignments.

This work is our first exploration of utilizing BF–CWE–CVE mappings to specify/label CVEs via BF. We focus on memory-related CWEs, as there are a vast number of memory-related CVEs corresponding to a relatively small number of memory-related CWEs. We specify them via BF weaknesses and examine the suitability of these formalisms to describe corresponding CVEs. We also discuss identified similarities and overlaps among the CWEs, elucidating the NVD’s and the security community’s struggles with assigning CWEs to CVEs.
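To make the (cause, operation, consequence) notation concrete, here is a minimal Python sketch of one way a BF weakness and a chain of weaknesses could be represented. The class and method names (`Weakness`, `Chain`, `operations`) are illustrative assumptions, not part of BF or any NIST tooling; the example values are the CWE-761 and CWE-459 specifications given in this article.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Weakness:
    """One BF weakness as a (cause, operation, consequence) triple.

    Hypothetical helper type for illustration only.
    """
    cause: str
    operation: str
    consequence: str

    def __str__(self) -> str:
        return f"({self.cause}, {self.operation}, {self.consequence})"


@dataclass(frozen=True)
class Chain:
    """An ordered chain of one or more weaknesses."""
    weaknesses: Tuple[Weakness, ...]

    def __post_init__(self) -> None:
        if not self.weaknesses:
            raise ValueError("a chain needs at least one weakness")

    def operations(self) -> List[str]:
        """Operations in chain order."""
        return [w.operation for w in self.weaknesses]

    def __str__(self) -> str:
        return " → ".join(str(w) for w in self.weaknesses)


# Values copied from this article's specifications of CWE-761 and CWE-459.
cwe_761 = Chain((
    Weakness("Wrong Index", "Reposition", "Wrong Position Pointer"),
    Weakness("Wrong Position Pointer", "Deallocate", "Object Corruption"),
))
cwe_459 = Chain((Weakness("Erroneous Code", "Clear", "Not Cleared Object"),))

print(cwe_761)
print(cwe_459.operations())
```

Keeping the triples immutable makes them hashable, which is convenient if specifications are later compared or grouped.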
588: (Erroneous Code, Cast, Wrong Type) → (Casted Pointer, Read/Write/Dereference, Type Confusion)
672: (Erroneous Code, Read/Write/Dereference, Use After Free)
843: (Wrong Object Type Resolved, Coerce, Wrong Type) → (Casted Pointer, Read/Write/Dereference, Type Confusion)
119: (Missing Code, Verify, Wrong Value) → (Wrong Index, Reassign, Over Bounds Pointer/Under Bounds Pointer) → (Over Bounds Pointer/Under Bounds Pointer, Read/Write, Buffer Overflow/Buffer Underflow/Buffer Over–Read/Buffer Under–Read)
118: (Missing Code/Erroneous Code, Verify, Wrong Value) → (Wrong Index, Reassign, Over Bounds Pointer/Under Bounds Pointer) → (Over Bounds Pointer/Under Bounds Pointer, Read/Write, Buffer Overflow/Buffer Underflow/Buffer Over–Read/Buffer Under–Read)
120: (Missing Code, Verify, Wrong Value) → (Wrong Size, Allocate, Not Enough Memory Allocated) → (Not Enough Memory Allocated, Write, Buffer Overflow)
680: (Erroneous Code, Calculate, Wrap Around) → (Wrong Size, Allocate, Not Enough Memory) → (Not Enough Memory, Write, Buffer Overflow)
459: (Erroneous Code, Clear, Not Cleared Object)
404: (Missing Code/Erroneous Code, Deallocate, Memory Leak/Object Corruption)
761: (Wrong Index, Reposition, Wrong Position Pointer) → (Wrong Position Pointer, Deallocate, Object Corruption)
772: (Missing Code, Deallocate, Memory Overflow)
1091: (Missing Code, Deallocate, None)
460: (Missing Code/Erroneous Code, Deallocate, Memory Leak)
586*: (Erroneous Code, Deallocate, None)
568*: (Missing Code, Deallocate, None)

The order is by the BF Memory Bugs Model operation flow.7 Causing weaknesses are in italics. Arrows (→) depict chaining. Asterisks (*) mark CWEs not assigned to any CVEs. Ellipses (. . .) depict many possible operations. BF: Bugs Framework; CVE: Common Vulnerabilities and Exposures; CWE: Common Weakness Enumeration; n/a: not applicable.
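The notation in the table above is regular enough to be read mechanically. The following Python sketch parses one printed row into triples and treats each slash-separated entry as a set of alternatives. The function name `parse_chain` and the tuple layout are assumptions made for illustration; this is not an official BF parser.

```python
import re
from typing import List, Tuple

Alternatives = Tuple[str, ...]
Triple = Tuple[Alternatives, Alternatives, Alternatives]

# Each weakness is printed as "(cause, operation, consequence)".
_TRIPLE = re.compile(r"\(([^()]*)\)")


def parse_chain(text: str) -> List[Triple]:
    """Parse a printed BF chain into (cause, operation, consequence) triples.

    Each field becomes a tuple of alternatives, split on "/".
    Whatever separates the parenthesized triples (an arrow, for
    example) is ignored.
    """
    chain: List[Triple] = []
    for body in _TRIPLE.findall(text):
        fields = [field.strip() for field in body.split(",")]
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {body!r}")
        cause, operation, consequence = (
            tuple(alt.strip() for alt in field.split("/")) for field in fields
        )
        chain.append((cause, operation, consequence))
    return chain


# Rows copied from the table above (CWE-404 and CWE-761).
row_404 = "(Missing Code/Erroneous Code, Deallocate, Memory Leak/Object Corruption)"
row_761 = ("(Wrong Index, Reposition, Wrong Position Pointer) → "
           "(Wrong Position Pointer, Deallocate, Object Corruption)")

print(parse_chain(row_404))
print([triple[1] for triple in parse_chain(row_761)])
```

Splitting on "/" is only safe because no cause, operation, or consequence name in these rows contains a slash or a comma; a different table might need a stricter grammar.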
are both specified as (Single Owned Address, Reassign, Memory Leak).

Other CWEs have slight differences only by BF attributes, which inform about the severity of the weakness and not about its nature. For example, CWE-121 describes buffer overflow on the stack, and CWE-122 describes buffer overflow on the heap. Once the stack versus heap difference is accounted for by a BF attribute, these two entries are specified by the same instance of a BF weakness type: (Over Bounds Pointer/Under Bounds Pointer, Write, Buffer Overflow/Buffer Underflow).

The CWE hierarchical relationships between CWEs with the same BF chain (see the third column in Table 3) reveal that most of them have either ChildOf or PeerOf relations. Some of them, such as CWEs 121, 122, and 123 as well as CWEs 170, 463, and 464, are siblings with a common parent. However, there are also instances of CWEs without direct relationships that have identical chains, such as CWE-401 and CWE-771. An interesting observation is that, while CWE-123, CWE-415, and CWE-416 are specified with very different BF weakness triples, they are listed as peers in the CWE.

Of special note are parent–child CWE pairs that share a BF chain. Per the CWE, a parent entry is supposed to be more abstract than its child entry. In BF terms, this is expressed by having multiple possible causes, consequences, and/or operations (e.g., Buffer Overflow/Buffer Underflow versus only Buffer Overflow). One would expect that parent/child CWEs would have slightly different BF chains or that the main weakness for the child would be contained within the main weakness for the parent. For example, (Overbounds Pointer, Read, Buffer Over–Read) is contained within (Over Bounds Pointer/Under Bounds Pointer, Read, Buffer Over–Read/Buffer Under–Read). The fact that a parent and a child have the same chain (including both an identical main weakness and causing weakness) means this difference in ambiguity is missing and highlights an area of overlap within the CWE.

BF also reveals missing relationships within the CWE. CWEs that have differences only by BF attributes (e.g., CWE-121 and CWE-122) should have some relationship (e.g., PeerOf) within the CWE.

LABELING MEMORY-RELATED CVEs VIA BF SPECIFICATIONS
There are 60,426 memory-related CVEs as of August 2023. We queried the CVE repository for entries with CWEs assigned by NVD that map by main operation to the _MEM BF class type. We ordered them by the NVD-assigned CVSS severity scores and selected a maximum of 10 CVEs per operation, thus reducing the count to 91 observable CVEs for this exploratory analysis. Then, we examined the groups of CVEs mapped to CWEs with identical causing BF chains and of CVEs mapped to CWEs with entirely identical BF chains. From the latter group, we also identified the CVEs that map to CWEs with parent–child relationships.

Analyzing this subset of CVEs, we find that it covers well the BF memory operations Reposition, Reassign, Verify, Initialize Object, Read, Write, Dereference, Clear,
401, 771: (Single Owned Address, Reassign, Memory Leak)
170, 463*, 464*: (Erroneous Code, Write, Object Corruption)
121, 122, 123: (Overbounds Pointer/Under Bounds Pointer, Write, Buffer Overflow/Buffer Underflow)
416, 825: (Dangling Pointer, Read/Write/Dereference, Use After Free)
415, 1341: (Erroneous Code, Deallocate, Double Free)
226, 244*, 1239*, 1272*: (Missing Code, Clear, Not Cleared Object)
590, 762, 763: (Mismatched Operation, Deallocate, Object Corruption)

The order is by the BF Memory Bugs Model operation flow.7 Listing the operation is sufficient, as BF classes are orthogonal by operation. Causing weaknesses are in italics. Arrows (→) depict chaining. Asterisks (*) mark CWEs not assigned to any CVEs. Ellipses (. . .) depict many possible operations and consequences.
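The two comparisons discussed in this section, finding CWEs whose BF specifications coincide and checking whether a child's main weakness is contained in its parent's, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`identical_groups`, `is_contained`) are hypothetical, `SPECS` holds a handful of single-weakness specifications copied from this article's tables, and the containment example spells the cause as "Over Bounds Pointer" throughout and uses plain hyphens so that plain string comparison works.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (cause, operation, consequence)

# A small sample of CWE -> BF specification pairs from this article.
SPECS: Dict[str, Triple] = {
    "401": ("Single Owned Address", "Reassign", "Memory Leak"),
    "771": ("Single Owned Address", "Reassign", "Memory Leak"),
    "415": ("Erroneous Code", "Deallocate", "Double Free"),
    "1341": ("Erroneous Code", "Deallocate", "Double Free"),
    "459": ("Erroneous Code", "Clear", "Not Cleared Object"),
}


def identical_groups(specs: Dict[str, Triple]) -> List[List[str]]:
    """Return the groups of CWEs that share exactly the same specification."""
    by_spec: Dict[Triple, List[str]] = defaultdict(list)
    for cwe, spec in specs.items():
        by_spec[spec].append(cwe)
    return [cwes for cwes in by_spec.values() if len(cwes) > 1]


def _alternatives(field: str) -> Set[str]:
    """Split a slash-separated field into its set of alternatives."""
    return {alt.strip() for alt in field.split("/")}


def is_contained(child: Triple, parent: Triple) -> bool:
    """True if every alternative of the child appears in the parent, field by field."""
    return all(_alternatives(c) <= _alternatives(p) for c, p in zip(child, parent))


child = ("Over Bounds Pointer", "Read", "Buffer Over-Read")
parent = ("Over Bounds Pointer/Under Bounds Pointer", "Read",
          "Buffer Over-Read/Buffer Under-Read")

print(identical_groups(SPECS))
print(is_contained(child, parent), is_contained(parent, child))
```

When `is_contained` holds in both directions for a parent–child pair, the two specifications are identical, which is the overlap case highlighted in the text.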
and Deallocate. However, although there may be CVEs related to the BF memory operations Initialize Pointer, Extend, Reallocate–Extend, Reduce, and Reallocate–Reduce, they are not identifiable via CWEs in the entire CVE repository. This indicates gaps in CWEs or issues with the CWE assignments. In any case, we would need different methods to identify and specify CVEs related to these operations.

Examining further this subset of CVEs, we confirm that, overall, memory-related CWE to CVE assignments are almost completely correct by BF operation. For example, the CVEs mapped to CWE-126 and CWE-788 (see Table 2) correctly distinguish between the Read only and Read/Write operations, respectively. We only sporadically find examples of incorrect CWE to CVE assignments by operation, such as CWE-123 to
CVE-2018-12036. The confusion for this CVE must relate to the use of “writes” in its description, while, in fact, it is a BF _INP class type vulnerability: (Missing Code, Validate, File Injection).

However, when examining the CVEs by the CWE BF weaknesses or chains of weaknesses, which cover not only operations but also causes and consequences, we find that parts of these BF weakness specifications may not fit all of the corresponding CVEs. For example, the BF CWE-126 chain (see Table 2) completely fits the (Overbounds Pointer, Read, Buffer Over–Read) main weakness of CVEs, such as CVE-2014-0160 (Heartbleed), as well as their (Wrong Index, Reposition, Overbounds Pointer) direct causing weakness. However, most of the CVEs with CWE-126 assigned have as an initial weakness Missing Code for a Verify operation and not Erroneous Code in a Calculate operation; see the first Heartbleed weakness in Bojanova and Galhardo.8

We conclude that the causal chains from Table 2 may be helpful for describing some CVEs but are too specific to fit other CVEs. This can be explained by CWE’s lack of flexibility to describe all possible security weaknesses: the entries could be too specific to be reused, some may be missing, and some may be overlapping. We will have to use entirely different methods to identify and specify the parts of the BF description for a CVE that do not fit the overly specific CWE variation assigned to that CVE.

We also find that, from the CWEs with different BF specifications (see Table 1), CWE-586 and CWE-568 are not assigned to any CVE. Then, from the CWE groups with identical BF specifications (see Table 3), the following CWEs were not assigned to any CVE at all: CWE-806 from the 805, 806 pair; CWE-1341 from the 415, 1341 pair; CWE-463 and CWE-464 from the 170, 463, 464 triple; and CWE-244, CWE-1239, and CWE-1272 from the 226, 244, 1239, 1272 quadruple. It is interesting to explore if there are CVEs that are better described by the unused CWEs (marked with asterisks in Tables 1 and 3). One such example is CVE-2023-38434 (although outside of our 91-CVE set), which describes a double free when closing a web connection. NVD assigns it CWE-415 (Double Free) of a memory resource instead of the more general CWE-1341 (Multiple Releases of Same Resource or Handle). These two CWEs share the (Erroneous Code, Deallocate, Double Free) BF weakness triple (see Table 3); their similarity and the vague memory versus resource distinction introduce errors and ambiguities in CWE–CVE assignments that would also affect our efforts for CVE labeling.

The rest of the similar CWEs (see Table 3) lead, in many cases, to ambiguous CWE–CVE assignments. One such example is CVE-2022-0519, which describes a buffer access with an incorrect length value leading to a buffer overread. NVD assigns this CVE to CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer) and CWE-805 (Buffer Access with Incorrect Length Value). However, CWE-126 (Buffer Over-read) describes this CVE more accurately than the abstract CWE-119.

Another area of ambiguity among CWEs relates to BF attributes. For example, CVE-2023-40295 describes a buffer overflow on the heap. NVD assigns CWE-787 (Out-of-bounds Write) to this CVE, which accurately describes the vulnerability. However, CWE-122 is also a suitable assignment, as it describes buffer overflow specifically on the heap. Apart from this minor difference (captured in BF by an address-related attribute), CWE-122 and CWE-787 share an identical main weakness: (Over Bounds Pointer/Under Bounds Pointer, Write, Buffer Overflow/Buffer Underflow). While CWE-122 is the more specific mapping, the similarities create difficulty in deciding which CWE should be assigned to this CVE.

In many cases, CWE–CVE assignments capture either the cause of a vulnerability or its consequence but not both. CVE-2022-34399 and CVE-2022-32454 describe a buffer overread and a buffer overwrite vulnerability, respectively. The BF chain for CVE-2022-34399 is (Missing Code, Verify, Wrong Value) → (Wrong Size, Read, Buffer Over–Read), and the BF chain for CVE-2022-32454 is (Missing Code, Verify, Inconsistent Value) → (Wrong Index, Reposition, Overbounds Pointer) → (Overbounds Pointer, Write, Buffer Overflow). NVD assigns CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer) and CWE-805 (Buffer Access with Incorrect Length Value) to CVE-2022-34399 and CWE-121 (Stack Based Buffer Overflow) to CVE-2022-32454. However, the CWEs assigned to CVE-2022-34399 only describe the cause of the vulnerability, and the CWE assigned to CVE-2022-32454 only describes its consequence. One must assign CWE-126 (Buffer Over-read) to CVE-2022-34399 and CWE-112 (Missing XML Validation) to CVE-2022-32454 to capture the full story of each weakness underlying the vulnerability. This inconsistency in capturing either the cause or the consequence of a vulnerability further complicates the CWE–CVE assignment.

The findings from this work show areas that would require additional analysis to create precise BF CVE descriptions. We would have to examine deeply corresponding vulnerability reports, source code with bugs, source code with fixes, and other available related resources. Utilizing the BF vulnerability model, the BF cybersecurity concepts definitions, and BF taxons definitions5,6 as well as any of their synonyms in use, we can employ modern ML and AI approaches toward
automatic CVE analysis and generation of BF CVE specifications.

CONCLUSION
Labeled vulnerability descriptions are in great demand for advanced AI research related to cybersecurity vulnerabilities, attacks, and mitigation techniques. As a formal bugs/weaknesses model,6 BF has the ability to unambiguously describe the chains of underlying weaknesses for any software security vulnerability. With this work, we begin specifying the detailed information provided for each CWE. We use the CWEs as a bridge to the corresponding CVEs and explore to what extent the BF CWE descriptions may aid in the manual creation of BF CVE descriptions. We invite you to collaborate with us in this direction by joining the BF CVE specification challenge at https://samate.nist.gov/BF/.5

Our future goal is to employ ML and AI approaches for automated generation of BF CVE descriptions. The result would be a reference dataset of labeled vulnerability specifications for use by AI algorithms. The labels will constitute the exhaustive sets of causes, operations, consequences, and attribute values, precisely defined as BF class taxonomies. The BF CVE reference dataset will be a great source not only for research but also for cybersecurity education and guidance.

1. “Metrics,” MITRE Corp., US, 2023. Accessed: Jul. 7, 2023. [Online]. Available: https://www.cve.org/About/Metrics
2. D. Malzahn, Z. Birnbaum, and C. Wright-Hamor, “Automated vulnerability testing via executable attack graphs,” in Proc. IEEE Int. Conf. Cyber Secur. Protection Digit. Services (Cyber Secur.), 2020, pp. 1–10, doi: 10.1109/CyberSecurity49315.2020.9138852.
3. “National vulnerability database,” Nat. Inst. Standards Technol., Gaithersburg, MD, USA, 2023. Accessed: Mar. 2, 2023. [Online]. Available: https://nvd.nist.gov
4. “Common vulnerability scoring system special interest group.” FIRST. Accessed: Jul. 7, 2023. [Online]. Available: https://www.first.org/cvss
5. I. Bojanova, “The bugs framework,” Nat. Inst. Standards Technol., Gaithersburg, MD, USA, 2023. Accessed: Aug. 20, 2023. [Online]. Available: https://samate.nist.gov/BF/
6. I. Bojanova and C. E. Galhardo, “Bug, fault, error, or weakness: Demystifying software security vulnerabilities,” IT Prof., vol. 25, no. 1, pp. 7–12, Jan. 2023, doi: 10.1109/MITP.2023.3238631.
7. I. Bojanova and C. E. Galhardo, “Classifying memory bugs using bugs framework approach,” in Proc. IEEE 45th Annu. Comput., Softw., Appl. Conf., 2021, vol. 1, pp. 1157–1164, doi: 10.1109/compsac51774.2021.00159.
8. I. Bojanova and C. E. Galhardo, “Heartbleed revisited: Is it just a buffer over-read?” IT Prof., vol. 25, no. 2, pp. 83–89, Mar. 2023, doi: 10.1109/MITP.2023.3259119.

IRENA BOJANOVA is a computer scientist at NIST, Gaithersburg, MD, 20899, USA. Contact her at [email protected].

where he is majoring in computer science and mathematical data science, and a Summer Undergraduate Research Fellowship student at NIST. Contact him at john.j.guerrerio.26@dartmouth.edu.
DEPARTMENT: IT ECONOMICS

Generative artificial intelligence tools have found a wide range of uses in marketing. This article delves into how these tools are transforming three key areas of marketing: personalization, insight generation, and content creation.

1520-9202 © 2023 IEEE
Digital Object Identifier 10.1109/MITP.2023.3314325
Date of current version 3 November 2023.

The arrival of generative artificial intelligence (GAI) has triggered a truly enterprisewide adoption of AI. Although all functional areas in an enterprise are benefitting from this breakthrough innovation,1 marketing has been particularly positively affected by the recent developments in GAI.

Recent surveys have shown that there is widespread use of GAI among marketers and growing demand for this innovation (Table 1). Midlevel and junior marketers in particular are reported to have higher GAI adoption rates compared to senior marketers (https://tinyurl.com/3u63k2vr).

Marketers view GAI as a productivity booster. For instance, in a survey conducted by the Conference Board, a nonprofit business membership and research organization (Table 1), more than four-fifths (82%) of respondents expected that productivity will improve with further adoption of GAI. Just 4% expected productivity to decline with GAI use (https://tinyurl.com/3u63k2vr). Technology company Salesforce’s May 2023 survey, which was a part of its Generative AI Snapshot Series (Table 1), found that GAI saves marketers more than 5 h of work per week, which translates to more than a month per year [(5 h × 52 weeks)/8-h workday = 32.5 days] (https://www.salesforce.com/news/stories/generative-ai-for-marketing-research/). GAI can be used for tasks such as analyzing customer reviews, social media comments, or any other text data and providing a summary of positive and negative comments.2 This allows marketing professionals to shift their attention to more strategic tasks. Likewise, management consulting firm McKinsey & Company’s estimate suggests that GAI could increase the productivity of the marketing function, saving between 5% and 15% of total marketing spending.3

The benefits of GAI tools in marketing, however, extend far beyond just cost cutting or time saving. GAI tools can be used to do things that are not otherwise possible. For instance, GAI enables brands to take marketing and sales to the next level and deliver truly one-to-one personalized experiences in a way that is not possible by humans alone. GAI brings benefits such as dynamic messaging capabilities by leveraging personalization to send the right message at the right time to customers based on detailed and valuable data. Relevant real-time imagery and predictive recommendations can help scale personalization across key moments in a cost-effective way.4

In this article, an overview of how GAI is transforming key areas and activities in marketing is provided. Specifically, the focus is on the top three uses of GAI in marketing as identified by Boston Consulting Group’s April 2023 survey: personalization, insight generation, and content creation. It should be noted that these uses are interrelated. Insight generated by GAI, for instance, can be used as a basis to develop better marketing content that enhances user engagement. Likewise, personalization is a key part of content creation. Indeed, roughly one-third of respondents in the Conference Board survey used GAI to personalize customer/user content (Table 1).

PERSONALIZATION
Personalization leads to improved marketing performance. Nonetheless, the process involved in personalization is challenging and hard to apply.5 For instance, firms need to gather and analyze information about the customer and their interactions from internal and external sources. Then, key marketing activities such as advertising, product design, packaging, pricing, and distribution need to be customized based on a
Survey conducted by: Salesforce in partnership with YouGov (https://tinyurl.com/5efr7rey)
Conducted/released in (respondents): Conducted from 18–25 May 2023 (1029 marketers)
Key findings: Fifty-one percent were using GAI and an additional 22% had plans to use the tools “very soon.”
Additional findings: GAI saves marketers more than 5 h of work per week.

Survey conducted by: Boston Consulting Group (https://tinyurl.com/puwykkj4)
Conducted/released in (respondents): Conducted in April 2023 (more than 200 chief marketing officers from several sectors in eight countries in North America, Europe, and Asia)
Key findings: Seventy percent were using GAI and 19% were testing. Only 3% had no plans to use the technology.
Additional findings: Top three uses: personalization (67%), insight generation (51%), and content creation (49%).

Survey conducted by: The Conference Board, in collaboration with Ragan Communications (https://tinyurl.com/3u63k2vr)
Conducted/released in (respondents): August 2023 (287 marketers and communicators)
Key findings: Eighty-seven percent of marketers had used/experimented with AI tools.
Additional findings: Top applications: summarizing content (44%), doing the legwork/stimulating creative thinking (41%), personalizing of customer/user content (33%), research (30%), generating content faster (30%), and enhancing customer service (17%).
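As a quick check of the Salesforce time-saving figure listed above, the conversion from hours per week to workdays per year can be written out explicitly. It assumes 52 weeks per year and an 8-h workday, as the article's own parenthetical does; because the survey reports "more than" 5 h per week, the result is a lower bound.

```latex
\[
\frac{5\ \text{h/week} \times 52\ \text{weeks/year}}{8\ \text{h/workday}}
  = \frac{260\ \text{h/year}}{8\ \text{h/workday}}
  = 32.5\ \text{workdays/year}
\]
```

Since a calendar month typically contains roughly 21 to 22 workdays (an assumption not stated in the survey), 32.5 workdays is consistent with the claim of more than a month per year.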
customer profile.6 Personalized response is thus often too time consuming for marketing teams.

Most of the traditional marketing tools fail to effectively personalize marketing efforts. For instance, content recommendation systems rely on generic algorithms such as collaborative filtering, which predicts a user’s preference and filters out items based on reactions by similar users. This technique involves searching a large group of people and identifying a smaller set of users with preferences similar to a particular user. Such algorithms often fail to capture an individual’s preferences (https://tinyurl.com/2xhy65fr).

Marketers can utilize GAI tools to generate sophisticated responses at the individual level, which can be used to develop and automate an effective personalized marketing strategy at scale (https://tinyurl.com/2uhhjrzs). These tools can be utilized to deliver hyperpersonalized content, product recommendations, experiences, and offers at just the right time. This is illustrated with an example of a lawnmower shopper. GAI tools such as ChatGPT can analyze a user’s language patterns, behavior, and other data points to make recommendations that are highly tailored to a potential shopper (https://tinyurl.com/2xhy65fr). When the shopper starts to search for a lawnmower, ads can be personalized and delivered based on their profile.

Contextualizing the experience is an important way to improve personalization. Marketing efforts and recommendations can be specifically tailored toward local conditions. GAI can be used to create images and videos and to show products in real time in locations and situations that are relevant to potential customers. For instance, backyard pictures can be uploaded on the product page and the vendor can show how the lawnmower could work there, or show a product that has been designed for the shopper’s use case. This means that a user shopping for a lawnmower in San Francisco, California, is delivered a different experience compared to one in Syracuse, New York.4

Moreover, GAI tools such as ChatGPT learn and adapt over time. As users interact with the recommended content, the system can refine its recommendations based on more data. This ensures that future recommendations are even more tailored to the shopper’s preferences. All of this can create a virtuous cycle of feedback and improvement that can lead to more accurate suggestions (https://tinyurl.com/2xhy65fr).

Unsurprisingly, surveys have revealed that personalization is among the most frequent GAI use cases in marketing (Table 1). Chief marketing officers are already taking advantage of GAI to better personalize products and services. Banks use GAI to analyze customer data and offer personalized investment advice based on customers’ risk profiles. Likewise, some retailers use GAI to create personalized recommendations that would influence customers to buy more. The goals of these
personalization efforts include better engagement, improved conversion rates, and increased customer loyalty (https://tinyurl.com/puwykkj4).

INSIGHT GENERATION
Companies are using GAI to conduct market and consumer research in a reliable manner to gain insights, which can be used to improve products and services (https://tinyurl.com/puwykkj4). In general, AI has reduced the time taken to generate insights from raw data. AI solutions can produce insights in seconds that often take human teams days or even weeks. However, a large proportion of data was unreadable to the existing AI solutions. This means that organizations could not make use of such data in the past. Fortunately, the rise of GAI has changed this dramatically.

Leading software vendors such as Microsoft and Salesforce have incorporated GAI capabilities into their products, which can be used to gain customer insight. Note that customer insight is knowledge about the customer that is valuable for a company [7]. For instance, the Copilot feature in Microsoft’s Dynamics 365 Customer Insights has added GPT-4 to popular Office products such as Word, Excel, Teams, PowerPoint, and Outlook. Marketers can use Copilot’s GAI capabilities to gain insights from their customer data platform. Additionally, companies can learn about new customer groups to target (https://tinyurl.com/mw3txfuk).

Another notable example of a GAI tool being used to generate customer insight and make effective use of such insight is Salesforce’s Einstein GPT. Einstein GPT combines Salesforce’s proprietary AI models with OpenAI’s GPT model. The goal is to have layers on top of OpenAI’s GPT model and use content stored in the Salesforce cloud to fine-tune the model (https://tinyurl.com/yzxt3bfv). Einstein GPT is available across the company’s entire Customer 360 customer relationship management platform (Salesforce’s portfolio of products). It is integrated into Salesforce cloud, Tableau (a visual analytics platform), Slack (a messaging app), and MuleSoft (an integration platform that helps businesses connect their data, devices, and applications) (https://tinyurl.com/yzxt3bfv).

Salesforce’s Customer 360 can conduct research on a prospective new customer and provide an overview of the customer. A new pane, Einstein Assistant, opens on the screen. The overview of the prospect company appears as text. The assistant can also find news articles about the company (e.g., plans to move into a new market). It can then write a letter to the prospect acknowledging the new plan. It can be asked to rewrite the letter in a different tone. Einstein Assistant can also create a message for another salesperson in Slack who had dealt with the prospect company in the past. It can access Tableau to learn more about specific products that the prospective customer had purchased in the past (https://tinyurl.com/yw2cdk34).

In this way, Einstein GPT generates customer insight and acts rapidly and effectively on the insight. The insight can help companies develop and introduce new products and services to address unmet needs of the customer. All these can lead to an improvement in customer experience.

CONTENT MARKETING
Content marketing focuses on creating and distributing marketing content such as videos, photos/images, presentations, flyers, blogs, infographics, social media posts, and tweets to enhance customer engagement and increase sales (https://tinyurl.com/fzswj968). GAI is a valuable way to come up with new ideas and work more quickly and thus create effective marketing content faster and at a low cost.

By making use of a huge amount of data about human language and aesthetic styles, GAI tools can generate sophisticated, personalized, and contextually relevant marketing content quickly. GAI thus drastically cuts the time it takes to produce marketing content. With appropriate prompts, GAI can create 10 different ad angles for a product based on customer reviews or other factors [2]. It is possible to create an ad for a product in the style of famous artists such as Salvador Dalí, Pablo Picasso, and Vincent van Gogh. If a company is planning to launch a new product, GAI tools can write effective social media copy that can be used on social media sites such as Facebook, Twitter, and LinkedIn [2].

Marketers’ use of GAI in content creation is also facilitated by digital advertising giants such as Meta, which have made such tools available. In May 2023, Meta announced a GAI Sandbox that offers advertisers access to three tools. The first tool would help advertisers create different variations of ad copy in the company’s Facebook or Instagram platform. When advertisers enter an ad’s copy, GAI will suggest several variations of the copy to test. The second tool would help generate backgrounds through text prompts. An advertiser can use text prompts to describe the background appearance or style. The GAI tool offers various images, and advertisers can test their impact on performance. The third tool involves an image cropping feature to create visuals in different aspect ratios for various media such as social posts, stories, or short videos (https://tinyurl.com/5n8wk5pr). Meta made the
Sandbox features available to a small group of advertisers with a plan to roll out the new program in the future. The makeup brand Jones Road Beauty was one of the early testers of the Sandbox (https://tinyurl.com/5t3dehfa).

CHALLENGES AND BARRIERS HINDERING THE USE OF GAI IN MARKETING
There are several barriers and challenges to the widespread use of GAI in marketing. The most important barrier, especially for small and medium-sized enterprises, centers on costs. For instance, to get access to Copilot, Microsoft 365 businesses need to pay an additional $30 per user per month [8]. Likewise, as of August 2023, access to GPT-4 cost $20 per month through a ChatGPT Plus subscription. Although ChatGPT can be used for free, its training data go up only until 2021. This means that it cannot access real-time content and it does not search the Internet for answers. Thus, companies cannot rely on it to get timely insights such as how consumers might be thinking about a brand. Likewise, ChatGPT cannot identify current trends that potentially impact marketing activities.

Second, marketers are worried about data security and privacy in GAI tools (https://tinyurl.com/3u63k2vr). There are potential privacy violations if marketers enter sensitive customer data as a part of their prompts. Surveys have reported that instances of companies’ employees putting confidential client data into ChatGPT are more common than most people think [9]. Many surveys have found that consumers are increasingly concerned about organizations’ data protection measures, and they are less likely to do business with a company that fails to protect their sensitive data [10]. For these reasons, several companies have banned ChatGPT to prevent the worst-case scenario of an employee uploading sensitive data into the chatbot while seeking help at work [11].

Third, the perceived risks and costs associated with transitioning to a new way of performing marketing tasks may lead to businesses’ and employees’ unwillingness to use GAI for such tasks. There are often substantial immediate costs involved in learning new ways of doing things. These include learning to write effective prompts that produce the desired results, time and frustration costs, and the cost of correcting mistakes caused by unfamiliarity with how GAI works in marketing [12]. There are also concerns about how GAI would fit in the corporate culture. In general, organizations are still trying to figure out how to adapt team culture and processes around GAI (https://tinyurl.com/477b3xvu). Marketers maintain a gloomy view regarding the impact of GAI on organizational culture. For instance, in the Conference Board survey (Table 1), only 16% expected improvements in team culture while 22% expected negative effects on it (https://tinyurl.com/3u63k2vr).

Finally, there are concerns about GAI-led job losses. According to the consulting organization Challenger, Gray & Christmas, GAI use in organizations resulted in roughly 4000 job losses in May 2023 alone (https://tinyurl.com/45dux8s8). In the Conference Board survey (Table 1), only 4% of respondents expected that GAI’s use would lead to an increase in the number of marketing jobs, whereas 40% expected a decline in such jobs (https://tinyurl.com/3u63k2vr).

CONCLUSION
Human teams’ skills, points of view, and experiences are still needed in most marketing activities, and thus, GAI will not completely replace them. However, as discussed in this article, GAI can perform a number of marketing tasks at a lower cost, at a higher speed, and more effectively than human teams or other existing tools. For instance, on the personalization front, past algorithms were ineffective. Likewise, AI models lacked scalability when it came to personalization. GAI tools such as ChatGPT can help develop a marketing automation program that can produce hyperpersonalized content based on an individual customer’s profile, journey, and interactions with a company and its products and services. These tools thus have the potential to help achieve personalization at scale for marketing organizations. GAI is also transforming other key marketing activities, such as insight generation and content marketing.

REFERENCES
1. “The great acceleration: CIO perspectives on generative AI: How technology leaders are adopting emerging tools to deliver enterprise-wide AI,” MIT Technol. Rev., Jul. 2023. Accessed: Aug. 11, 2023. [Online]. Available: https://www.technologyreview.com/2023/07/18/1076423/the-great-acceleration-cio-perspectives-on-generative-ai/
2. M. Graham, “Five things marketers should know about generative AI in advertising,” Wall Street J., Mar. 2023. Accessed: Aug. 11, 2023. [Online]. Available: https://www.wsj.com/articles/five-things-marketers-should-know-about-generative-ai-in-advertising-5381c1d0
3. “The economic potential of generative AI: The next productivity frontier,” McKinsey & Company,
New York, NY, USA, Jun. 2023. [Online]. Available: https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier#introduction
4. A. Bernard. “AI in ecommerce: True one-on-one personalization is coming.” CMSWire. Accessed: Aug. 11, 2023. [Online]. Available: https://www.cmswire.com/customer-experience/ai-in-ecommerce-true-one-on-one-personalization-is-coming/
5. J. Vesanen, “What is personalization? A conceptual framework,” Eur. J. Marketing, vol. 41, nos. 5–6, pp. 409–418, 2007, doi: 10.1108/03090560710737534.
6. J. Vesanen and M. Raulas, “Building bridges for personalization: A process model for marketing,” J. Interactive Marketing, vol. 20, no. 1, pp. 5–20, Winter 2006, doi: 10.1002/dir.20052.
7. B. Smith, H. N. Wilson, and M. Clark, “Creating and using customer insight: 12 Rules of best practice,” J. Med. Marketing, vol. 6, no. 2, pp. 135–139, 2006, doi: 10.1057/palgrave.jmm.5050013.
8. T. Warren. “Microsoft puts a steep price on Copilot, its AI-powered future of Office documents.” The Verge. Accessed: Aug. 11, 2023. [Online]. Available: https://www.theverge.com/2023/7/18/23798627/microsoft-365-copilot-price-commercial-enterprise
9. N. Kshetri, “Cybercrime and privacy threats of large language models,” IT Prof., vol. 25, no. 3, pp. 9–13, May/Jun. 2023, doi: 10.1109/MITP.2023.3275489.
10. N. Kshetri, Cybersecurity Management: An Organizational and Strategic Approach. Toronto, ON, Canada: The University of Toronto Press, 2021.
11. T. Telford and P. Verma, “Employees want ChatGPT at work. Bosses worry they’ll spill secrets,” Washington Post, Jul. 2023. Accessed: Aug. 11, 2023. [Online]. Available: https://www.washingtonpost.com/business/2023/07/10/chatgpt-safe-company-work-ban-lawyers-code/
12. H. Margetts and P. Dunleavy, “Cultural barriers to e-government: Better,” National Audit Office, Public Services through e-government, London, U.K., 2002. [Online]. Available: https://www.nao.org.uk/publications/nao-reports/01-02

NIR KSHETRI is a professor in the Bryan School of Business and Economics, the University of North Carolina at Greensboro, Greensboro, NC, 27412, USA, and IT Professional’s IT Economics editor. Contact him at nbkshetr@uncg.edu.