
Journal of Computer Science Research - Vol.5, Iss.3 July 2023

Similarity Intelligence: Similarity Based Reasoning, Computing, and Analytics; Innovating Pedagogical Practices through Professional Development in Computer Science Education; Development of New Machine Learning Based Algorithm for the Diagnosis of Obstructive Sleep Apnea from ECG Data; Enhancing Human-Machine Interaction: Real-Time Emotion Recognition through Speech Analysis; Expert Review on Usefulness of an Integrated Checklist-based Mobile Usability Evaluation Framework

Editor-in-Chief

Dr. Lixin Tao


Pace University, United States

Dr. Jerry Chun-Wei Lin


Western Norway University of Applied Sciences, Norway

Editorial Board Members

Yuan Liang, China
Michalis Pavlidis, United Kingdom
Chunqing Li, China
Dileep M R, India
Roshan Chitrakar, Nepal
Jie Xu, China
Omar Abed Elkareem Abu Arqub, Jordan
Qian Yu, Canada
Lian Li, China
Paula Maria Escudeiro, Portugal
Zhanar Akhmetova, Kazakhstan
Mustafa Cagatay Korkmaz, Turkey
Hashiroh Hussain, Malaysia
Mingjian Cui, United States
Imran Memon, China
Besir Dandil, Turkey
Aylin Alin, Turkey
Jose Miguel Canino-Rodríguez, Spain
Xiqiang Zheng, United States
Lisitsyna Liubov, Russian Federation
Manoj Kumar, India
Chen-Yuan Kuo, United States
Awanis Romli, Malaysia
Antonio Jesus Munoz Gallego, Spain
Manuel Jose Cabral dos Santos Reis, Portugal
Ting-Hua Yi, China
Zeljen Trpovski, Serbia
Yuren Lin, Taiwan, Province of China
Degan Zhang, China
Lanhua Zhang, China
Shijie Jia, China
Samer Al-khateeb, United States
Marbe Benioug, China
Neha Verma, India
Saddam Hussain Khan, Pakistan
Viktor Manahov, United Kingdom
Xiaokan Wang, China
Gamze Ozel Kadilar, Turkey
Rodney Alexander, United States
Aminu Bello Usman, United Kingdom
Hla Myo Tun, Myanmar
Vijayakumar Varadarajan, Australia
Nur Sukinah Aziz, Malaysia
Patrick Dela Corte Cerna, Ethiopia
Shumao Ou, United Kingdom
Dariusz Jacek Jakóbczak, Poland
Serpil Gumustekin Aydin, Turkey
Danilo Avola, Italy
Nitesh Kumar Jangid, India
Xiaofeng Yuan, China
Volume 5 Issue 3 • July 2023 • ISSN 2630-5151 (Online)

Journal of
Computer Science
Research
Volume 5 | Issue 3 | July 2023 | Page1-73
Journal of Computer Science Research

Contents

Articles
Similarity Intelligence: Similarity Based Reasoning, Computing, and Analytics
Zhaohao Sun
Development of New Machine Learning Based Algorithm for the Diagnosis of Obstructive Sleep Apnea from ECG Data
Erdem Tuncer
Enhancing Human-Machine Interaction: Real-Time Emotion Recognition through Speech Analysis
Dominik Esteves de Andrade, Rüdiger Buchkremer
Expert Review on Usefulness of an Integrated Checklist-based Mobile Usability Evaluation Framework
Hazura Zulzalil, Hazwani Rahmat, Abdul Azim Abd Ghani, Azrina Kamaruddin

Review
Innovating Pedagogical Practices through Professional Development in Computer Science Education
Xiaoxue Du, Ellen B Meier
Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023

Journal of Computer Science Research


https://journals.bilpubgroup.com/index.php/jcsr

ARTICLE

Similarity Intelligence: Similarity Based Reasoning, Computing, and Analytics
Zhaohao Sun

Department of Business Studies, PNG University of Technology, Private Mail Bag, Lae 411, Morobe, Papua New Guinea

ABSTRACT
Similarity has been playing an important role in computer science, artificial intelligence (AI) and data science. However, similarity intelligence has been ignored in these disciplines. Similarity intelligence is a process of discovering intelligence through similarity. This article explores similarity intelligence, similarity-based reasoning, similarity computing and analytics. More specifically, it looks at similarity as an intelligence and its impact on a few areas in the real world. It explores how similarity intelligence accompanies experience-based intelligence, knowledge-based intelligence, and data-based intelligence in playing an important role in computer science, AI, and data science. It then explores similarity-based reasoning (SBR) and proposes three similarity-based inference rules, and examines similarity computing and analytics and a multiagent SBR system. The main contributions of this article are: 1) Similarity intelligence is discovered from experience-based intelligence consisting of data-based intelligence and knowledge-based intelligence. 2) Similarity-based reasoning, computing and analytics can be used to create similarity intelligence. The proposed approach will facilitate research and development of similarity intelligence, similarity computing and analytics, machine learning and case-based reasoning.
Keywords: Similarity intelligence; Similarity computing; Similarity analytics; Similarity-based reasoning; Big data analytics; Artificial intelligence; Intelligent agents

*CORRESPONDING AUTHOR:
Zhaohao Sun, Department of Business Studies, PNG University of Technology, Private Mail Bag, Lae 411, Morobe, Papua New Guinea; Email:
[email protected]
ARTICLE INFO
Received: 19 March 2023 | Revised: 20 May 2023 | Accepted: 26 May 2023 | Published Online: 9 June 2023
DOI: https://doi.org/10.30564/jcsr.v5i3.5575
CITATION
Sun, Zh.H., 2023. Similarity Intelligence: Similarity Based Reasoning, Computing, and Analytics. Journal of Computer Science Research. 5(3): 1-14. DOI: https://doi.org/10.30564/jcsr.v5i3.5575
COPYRIGHT
Copyright © 2023 by the author(s). Published by Bilingual Publishing Group. This is an open access article under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. (https://creativecommons.org/licenses/by-nc/4.0/).


1. Introduction

Similarity, similarity relations, and similarity metrics have been playing an important role in computer science, artificial intelligence (AI), and data science [1-6]. Similarity has also played an important role in machine learning and case-based reasoning (CBR) [7,8]. Machine learning, including deep learning, has had important impacts on our life and work [7,9]. CBR as an AI technique has also played a significant role in experience-based reasoning and experience management [8-10]. From a meta viewpoint, what are the relationships between machine learning and CBR?

Intelligence has also been playing an important role in AI, business intelligence, machine learning, and CBR [3,8,11,12]. Intelligence can be defined as the collection, analysis, interpretation, visualization, and dissemination of strategic data, information, and knowledge for discovering and using the knowledge patterns and insights at the right time in the decision-making process [13]. Are machine learning and CBR related to similarity intelligence? That this question remains open suggests that similarity intelligence has been ignored in these disciplines. More specifically, research issues in this direction are:
1) Why is similarity intelligence important?
2) What are the relationships between similarity intelligence and experience intelligence, knowledge-based intelligence and data-based intelligence?
3) What are similarity computing and analytics and their impacts on similarity intelligence?

This article will explore similarity intelligence, similarity-based reasoning, similarity computing and analytics, and their relationships. To address the first question, this article looks at similarity as an intelligence and its impact on a few areas in the real world. To address the second question, it explores similarity intelligence that has been accompanying experience-based intelligence, knowledge-based intelligence and data-based intelligence to play an important role in computer science, AI, and data science. After reviewing the fundamentals of similarity, this article explores similarity-based reasoning (SBR) and proposes three similarity-based inference rules. This article then examines similarity computing and analytics, and a multiagent SBR system. The main contributions of this article are: 1) Similarity intelligence is discovered from experience-based intelligence consisting of data-based intelligence and knowledge-based intelligence. 2) Similarity-based reasoning, computing and analytics can be used to create similarity intelligence.

The rest of this article is organized as follows: Section 2 looks at why similarity intelligence is important. Section 3 examines machine learning and CBR as experience-based intelligence. Section 4 examines the fundamentals of similarity. Section 5 explores similarity-based reasoning and proposes similarity-based inference rules for conducting SBR. Section 6 examines similarity computing and analytics. Section 7 proposes a multiagent architecture for an SBR system, and Section 8 ends this article with some concluding remarks.

2. Why is similarity intelligence important?

This section highlights why similarity intelligence is important.

Similarity has been playing an important role in mathematics, computer science, AI, and data science. Similarity has also played a significant role in fuzzy logic [2] and big data [5]. However, similarity intelligence has been ignored in these disciplines.

Similarity intelligence is a process for discovering intelligence from two or more objects or cases using similarity algorithms and techniques. The Turing test [14] already suggests that the intelligence of computing machinery is similar to that of human beings. This is a kind of similarity intelligence. Similarity intelligence includes similar relationships consisting of patterns and insights between machines, human beings, and software apps [14,5]. In other words, similarity intelligence comes not only from human beings, but also from machines, software, or apps.

Similarity also plays an important role in ChatGPT, because similarity is crucial in natural language understanding and processing. A study analyzing 1000 texts produced by ChatGPT found that, on average, the similarity varies between 70%


and 75% [15]. Therefore, one of the important tasks of ChatGPT (http://www.openAi.com) is to discover similarity intelligence from two or more objects, texts, and cases.

Similarity intelligence is important because it enables us to identify similarities and patterns in data sets, which can be used to make more informed decisions and predictions. By identifying similarities between different sets of data, objects, and cases [8], we can better understand relationships and draw insights that might not be immediately apparent. At the least, similarity intelligence allows us to select one member of a similarity class as a representative and then analyze it as characteristic of that class [16].

For example, in the field of customer relationship management [12], intelligence can be used to identify patterns and preferences in consumer behaviors through similarity metrics. The patterns and preferences can then be used to develop targeted advertising and product recommendations that are more likely to appeal to specific groups of consumers.

Therefore, similarity intelligence is important not only for computer science, AI, big data, and data science, but also for businesses and organizations in a wide range of industries, enabling decision makers to make more informed decisions in an intelligent experience-based, knowledge-driven, and data-driven world.

3. Experience-based intelligence

Experience-based intelligence is a process of discovering intelligence from experiences, based on experience-based reasoning [17]. Experience-based intelligence consists of data-based intelligence and knowledge-based intelligence. This section looks at similarity intelligence from experience-based intelligence using two examples, machine learning and CBR. Machine learning is data-based intelligence. CBR is knowledge-based intelligence.

3.1 Machine learning

Similarity has always been important in pattern recognition, graphical pattern recognition, and machine learning [7,18], because as soon as we have created patterns, we have to use similarity to match what was input to the system and compare it with our patterns.

Machine learning is about how to build computers and apps that improve automatically through experience [19]; that is, machine learning is a process of discovering intelligence from experience using computers and software. Therefore, machine learning is an experience-based intelligence.

Machine learning is about how a computer can use a model and algorithm to observe some data about the world, adapt to new circumstances, and detect and extrapolate patterns [11]. Therefore, machine learning is data-based intelligence, a process of discovering intelligence from data, because it is a process of using probabilistic models and algorithms on data to create intelligence through data [11].

Clustering is one form of unsupervised machine learning [7]. How we calculate the similarity between two clusters or two objects is important for clustering [4,7,18]. A few methods are used to calculate this similarity, for example, Min, Max, the distance between centroids, and other similarity metrics mentioned in Section 4.4. Therefore, machine learning is similarity intelligence, a process for creating intelligence through similarity.

Overall, machine learning is an experience-based intelligence, a process of discovering intelligence through experience [4]. Machine learning is data-based intelligence, a process of discovering intelligence from data. Machine learning is also similarity intelligence, a process for creating intelligence through similarity.

3.2 Case-based reasoning

CBR is a process of discovering similarity intelligence from a case base, just as data mining is a process of discovering data intelligence from a large DB [12]. Similarity intelligence includes the exact case that has been used in the past for solving the problem encountered recently.
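As a rough illustration of how a CBR system might discover a similar case in a case base, the following Python sketch retrieves the stored case most similar to a new problem. The `Case` structure, the feature names, the sample case base, and the similarity function (one minus the mean absolute feature difference) are all assumptions made for illustration; they are not taken from the article or from the cited CBR literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A hypothetical case: a problem description and its known solution."""
    problem: dict   # feature name -> value scaled to [0, 1]
    solution: str


def similarity(a, b):
    """Assumed similarity: 1 minus the mean absolute difference over shared features."""
    keys = sorted(set(a) & set(b))
    if not keys:
        return 0.0
    return 1.0 - sum(abs(a[k] - b[k]) for k in keys) / len(keys)


def retrieve(case_base, new_problem):
    """Return the most similar stored case together with its similarity score."""
    best = max(case_base, key=lambda c: similarity(c.problem, new_problem))
    return best, similarity(best.problem, new_problem)


# Hypothetical case base, loosely inspired by pricing cars from quality features.
case_base = [
    Case({"engine": 0.9, "comfort": 0.8}, "price band: high"),
    Case({"engine": 0.5, "comfort": 0.5}, "price band: medium"),
    Case({"engine": 0.2, "comfort": 0.3}, "price band: low"),
]

exact_case, exact_score = retrieve(case_base, {"engine": 0.5, "comfort": 0.5})
near_case, near_score = retrieve(case_base, {"engine": 0.85, "comfort": 0.75})
```

When the new problem is already stored, the exact case comes back with a similarity of 1; otherwise the closest case serves as the starting point for a solution.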


CBR is a reasoning paradigm based on previous experiences or cases [12,8]. CBR is based on two principles about the nature of the world [8]. First, the types of problems an agent encounters tend to recur. Hence, future problems are likely to be similar to current problems. Second, the world is regular: similar problems have similar solutions, or similar causes bring similar effects [8]. Consequently, solutions to similar prior problems are a useful starting point for new problem solving. The first principle implies that CBR is a kind of experience-based reasoning (EBR), while the second principle is the guiding principle underlying most approaches to similarity-based reasoning (SBR) [8].

“Two cars with similar quality features have similar prices” is one application of the above-mentioned second principle, and also a popular experience principle summarizing many individual experiences of buying cars. It is a kind of SBR. In other words, SBR is a concrete realization of CBR. The CBR system (CBRS) is an intelligent system based on CBR, which can be modelled as [8]:

CBRS = Case Base + CBRE (1)

where the case base (CB) is a set of cases, each of which consists of a previously encountered problem and its solution. CBRE is a CBR engine, which is the inference mechanism for performing CBR, in particular for performing SBR. The SBR can be formalized as:

$$\frac{P', \quad P' \sim P, \quad P \to Q}{\therefore\ Q'} \tag{2}$$

where $P$, $P'$, $Q$, and $Q'$ represent compound propositions, and $P' \sim P$ means that $P'$ and $P$ are similar (in terms of similarity relations, metrics, and measures; see Section 4). If $P'$ and $P$ are similar, then $Q$ and $Q'$ are also similar. (2) is called generalized modus ponens; that is, (2) is one of the inference rules for performing modus ponens based on SBR. Typical reasoning in CBR, known as the CBR cycle, consists of (case) Repartition, Retrieve, Reuse, Revise and Retain [8]. Each of these five stages is a complex process. SBR dominates all five stages [16]. Therefore, CBR is a process for discovering intelligence through SBR, because similar problems have similar solutions.

One significant contribution of CBR research and development is that it points out the importance of experience and similarity [9,16]. CBR is experience-based intelligence, a process for discovering intelligence based on experience. Because a case base is a kind of knowledge base [10,8], CBR is also knowledge-based intelligence [11].

Overall, similarity intelligence accompanies experience-based intelligence [10,8], data-based intelligence [4] and knowledge-based intelligence [11] to provide constructive insights and decision support for businesses and organizations.

4. Fundamentals of similarity

Similarity is a fundamental concept for many fields in mathematics, mathematical logic, computer science, AI, data science, and other sciences [16,9,20,21]. This section first briefly looks at similarity and then focuses on similarity relations, fuzzy similarity relations, and similarity metrics.

4.1 Introduction

The concept of similarity has been studied by numerous researchers from different disciplines such as mathematics [20], big data [5], computer science [22,23], AI and fuzzy logic [1,2,21], to name a few. For example, Klawonn and Castro [24] examined similarity in fuzzy reasoning and showed that similarity is inherent to fuzzy sets. Fontana and Formato [25] extended the resolution rule as the core of a logic programming language based on similarity and discussed similarity in deductive databases. The concepts of similarity and similarity relations play a fundamental role in many fields of pure and applied science [26,20]. The notion of a metric or distance between objects has long been used in many contexts as a measure of similarity or dissimilarity between elements of a set [27,22,18]. Thus, there exists a wide variety of techniques for dealing with problems involving similarity, similarity relations, similarity measures, and similarity metrics [21,23]. For example, fuzzy logic [1,2], databases [5], data mining [18] and CBR [8] provide a number of concepts and techniques for dealing with similarity relations, similarity measures, and similarity metrics.
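The idea of a distance serving as a measure of dissimilarity, and the cluster-level measures named in Section 3.1 (Min, Max, and the distance between centroids), can be illustrated with a minimal Python sketch. The Euclidean distance, the conversion 1 / (1 + d) from distance to similarity, the function names, and the sample points are assumptions chosen for illustration; the article does not prescribe any of them.

```python
import math
from itertools import product


def distance(p, q):
    """Euclidean distance between two points (assumed dissimilarity measure)."""
    return math.dist(p, q)


def to_similarity(d):
    """One common, assumed way to map a distance in [0, inf) to a similarity in (0, 1]."""
    return 1.0 / (1.0 + d)


def min_link(a, b):
    """Min: distance between the closest pair of points, one from each cluster."""
    return min(distance(p, q) for p, q in product(a, b))


def max_link(a, b):
    """Max: distance between the farthest pair of points, one from each cluster."""
    return max(distance(p, q) for p, q in product(a, b))


def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def centroid_link(a, b):
    """Distance between the centroids of the two clusters."""
    return distance(centroid(a), centroid(b))


# Two hypothetical clusters of 2-D points.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]

d_min = min_link(cluster_a, cluster_b)
d_max = max_link(cluster_a, cluster_b)
d_cen = centroid_link(cluster_a, cluster_b)
s_min = to_similarity(d_min)
```

The three measures can disagree on the same pair of clusters, so the choice among them affects which clusters are judged most similar.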


In what follows, we briefly introduce similarity relations, fuzzy similarity relations, and similarity metrics.

4.2 Similarity relations

The concept of a similarity relation is a natural generalization of similarity between two triangles and two matrices in mathematics [16,18]. More precisely:

Definition 1. A binary relation $S$ on a non-empty set $X$ is called a similarity relation if it satisfies:
(1) $\forall x$, $xSx$,
(2) If $xSy$, then $ySx$,
(3) If $xSy$ and $ySz$, then $xSz$.

The conditions (1), (2), and (3) are the reflexive, symmetric, and transitive laws. If $xSy$, we say that $x$ and $y$ are similar [20,16].

Example 1. Matrices $B$ and $C$ in $M_{n,n}$ are similar if $C = PBP^{-1}$ for an invertible $P$, in which case we write $B \sim C$. It is easy to prove that $\sim$ is a similarity relation in $M_{n,n}$ [20].

This example implies that the concept of a similarity relation here is a generalization of the similarity between matrices in $M_{n,n}$.

Similarity relations can be used for classification through partition [16] and clustering [18].

4.3 Fuzzy similarity relations

As an extension of similarity relations, fuzzy similarity relations were introduced by Zadeh in 1971 [2] and have attracted much attention since then [1,27,21]. For example, fuzzy similarity relations have been used in CBR [16]. For the sake of brevity, we use standard fuzzy set theory notation for the operations min and max, although there are many alternative choices for these operations available in fuzzy set theory (Zimmermann, 1996). $S$ is still used to denote a fuzzy similarity relation when no confusion arises.

Definition 2. A fuzzy binary relation $S$ on a non-empty set $X$ is a fuzzy similarity relation on $X$ if it is reflexive, symmetric, and transitive [28,2], that is:

$$S(x, x) = 1 \tag{3}$$

$$S(x, y) = S(y, x) \tag{4}$$

$$S \geq S \circ S \tag{5}$$

where $\circ$ is the composition operation of fuzzy binary relations based on min and max operations. A more explicit form of Equation (5) is

$$S(x, z) \geq \bigvee_{y} \big( S(x, y) \wedge S(y, z) \big) \tag{6}$$

Equation (6) is called max-min transitivity [2]. The revised form of this definition was given by Ovchinnikov in 1991 [28]. The main difference between the definition of Zadeh and that of Ovchinnikov lies in that, instead of Equation (6), Ovchinnikov viewed the following model as max-min transitivity:

$$S(x, z) \geq S(x, y) \wedge S(y, z) \tag{7}$$

4.4 Similarity metrics

Generally speaking, similarity in mathematics is considered as a relation, while similarity in CBR is considered both a relation and a measure, a function, and a metric [16,8].

Definition 3. A relation, denoted by $S_m$, on a non-empty set $X$ is a similarity metric if it satisfies [16]:
1) $S_m$ is a similarity relation on $X$;
2) $1 - S_m$ is a metric on $X$; that is, $S_m$ is a function from $X \times X$ to $[0, 1]$, provided that:
● For any $x \in X$, $S_m(x, x) = 1$
● For all $x, y \in X$, $S_m(x, y) = S_m(y, x)$
● For all $x, y, z \in X$,

$$S_m(x, z) \geq S_m(x, y) \wedge S_m(y, z) \tag{8}$$

where $\wedge$ is the min operator. Equation (8) in this definition is called the similarity inequality. It should be noted that the similarity metric here, $S_m$, does not directly satisfy the triangle inequality [16]. Equation (8) is based on Ovchinnikov's concept of fuzzy similarity relations [28].

In comparison with the definition of fuzzy similarity relations, we emphasize that the similarity metric here is first a traditional similarity relation and, to some extent, also a metric, because the similarity between two objects is the necessary condition for further discussing how similar they are in the context [16].
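On a finite set, the conditions of Definition 2 and the similarity inequality (8) can be checked mechanically. The Python sketch below assumes that a fuzzy relation is stored as a square list of lists of membership degrees in [0, 1]; this representation, the function names, and the two sample matrices are illustrative assumptions rather than anything specified in the article.

```python
from itertools import product


def is_reflexive(S):
    """Equation (3): S(x, x) = 1 for every x."""
    return all(S[x][x] == 1.0 for x in range(len(S)))


def is_symmetric(S):
    """Equation (4): S(x, y) = S(y, x) for every x, y."""
    n = len(S)
    return all(S[x][y] == S[y][x] for x, y in product(range(n), repeat=2))


def satisfies_similarity_inequality(S):
    """Equations (7)/(8): S(x, z) >= min(S(x, y), S(y, z)) for every x, y, z."""
    n = len(S)
    return all(
        S[x][z] >= min(S[x][y], S[y][z])
        for x, y, z in product(range(n), repeat=3)
    )


def max_min_compose(S, T):
    """Max-min composition used in Equations (5) and (6)."""
    n = len(S)
    return [
        [max(min(S[x][y], T[y][z]) for y in range(n)) for z in range(n)]
        for x in range(n)
    ]


def is_fuzzy_similarity_relation(S):
    return (
        is_reflexive(S)
        and is_symmetric(S)
        and satisfies_similarity_inequality(S)
    )


# Hypothetical relations on a three-element set.
good = [
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.4],
    [0.4, 0.4, 1.0],
]
bad = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.9],
    [0.2, 0.9, 1.0],
]

good_ok = is_fuzzy_similarity_relation(good)
bad_ok = is_fuzzy_similarity_relation(bad)
```

In the second matrix the first and third elements are each strongly similar to the second but only weakly similar to each other, so the similarity inequality fails even though reflexivity and symmetry hold.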
similarity in CBR is considered both a relation and a measure, a function, and a metric [16,8].
Definition 3. A relation, denoted by $S$, on a non-empty set $X$ is a similarity metric if it satisfies [16]:
1) $S$ is a similarity relation on $X$;
2) $1 - S$ is a metric on $X$; that is, $S$ is a function from $X \times X$ to $[0,1]$, provided that:
• For any $x \in X$, $S(x, x) = 1$
• For all $x, y \in X$, $S(x, y) = S(y, x)$
• For all $x, y, z \in X$,

$$S(x, z) \geq S(x, y) \wedge S(y, z) \tag{8}$$

where $\wedge$ is the min operator. Equation (8) in this definition is called the similarity inequality. It should be noted that the similarity metric here, $S$, cannot directly satisfy the triangle inequality [16]. Equation (8) is based on Ovchinnikov's concept of fuzzy similarity relations [28].

In comparison with the definition of fuzzy similarity relations, we emphasize that the similarity metric here is first a traditional similarity relation, and also just a metric, maybe to some extent, because the similarity between two objects is the necessary condition to further discuss how similar they are in the context [16].

5. Similarity-based reasoning and inference rules

This section highlights similarity-based reasoning and its three inference rules.

5.1 Similarity-based reasoning

Similarity-based reasoning (SBR) has been studied by many researchers from different fields. For example, Sun [9] examined the integration of rule-based reasoning and SBR from an AI viewpoint. He considered SBR as a reasoning-based similarity matching. Bogacz and Giraud-Carrier considered SBR as "reasons from similarity" from a neural network viewpoint [29]. The relationship between CBR and SBR has drawn some attention [8,16]. However, what is similarity-based reasoning? There is still no definition of it, to our knowledge. In fact, many methods of SBR seem to lack a sound theoretical or logical basis [30]. We need a relatively precise definition of SBR, in order to investigate similarity-based approaches to SBR.

Definition 4. Let $P$, $P'$, $Q$, and $Q'$ represent compound propositions, and let $P \rightarrow Q$ be a production rule, denoting if $P$ then $Q$. A proposition $Q'$ can be inferred from propositions $P'$ and $P \rightarrow Q$, provided that $P$ and $P'$ are similar ($P' \sim P$), and then $Q$ and $Q'$ are also similar; that is:

$$\frac{P', \quad P' \sim P, \quad P \rightarrow Q}{\therefore\ Q'} \tag{9}$$

Then, this reasoning paradigm is called similarity-based reasoning (SBR).

More generally, a proposition $Q'$ can be similarity-based inferred from propositions $P_1', P_2', \cdots, P_n'$, provided $P_1, P_2, \cdots, P_n \rightarrow Q$ and $P_i' \sim P_i$ ($i \in \{1, 2, \ldots, n\}$), and $Q$ and $Q'$ are also similar; that is:

$$\frac{P_1', P_2', \cdots, P_n' \qquad P_1, P_2, \cdots, P_n \rightarrow Q}{\therefore\ Q'} \tag{10}$$

It should be noted that the definition is based on modus ponens [31]. Therefore, the reasoning defined above can be considered as a kind of SBR with respect to modus ponens (which will be examined further in the next subsection). It can also be considered as a composite reasoning paradigm. Furthermore, the above definition is general, and its generality lies in that we have not assigned any special meaning or semantics to the similarity used in the definition.

Example 4. Google Chrome as a search engine is based on similarity-based reasoning. For instance, a search for "similarity-based reasoning" at https://www.google.com.au/ found 50,000 results (on 02 March 2023). However, not every one of the found results is related to "similarity-based reasoning"; some are related only to "similarity" or "reasoning". This is the reason why we regard Google Chrome as performing similarity-based reasoning. It does not really comply with the inference rule of modus ponens (see later). This is also the reason why some hope to get exact results rather than 50,000 or millions of found results searched by Chrome and many other search engines.

In order to perform SBR, it is necessary to note the following three points:
1) What we examined in the previous subsections, namely similarity relations, fuzzy similarity relations, and similarity metrics, are concrete forms of the similarity used in the above definition. In other words, each of them can lead to a class of SBR. We can, therefore, examine SBR from a viewpoint of either similarity relations or fuzzy similarity relations or similarity metrics. If so, we will make our investigation very complex, although the corresponding research results are of significance in applications.
2) SBR is treated in a more general way in this article; that is, two different forms, $\sim$ and $\approx$, of similarity (e.g., similarity relations, fuzzy similarity relations, and similarity metrics) are used in the context. The first, $\sim$, is associated with the similarity between $P$ and $P'$, while the second, $\approx$, is associated with the similarity between $Q$ and $Q'$. In the context of CBR, these two similarity relations (or fuzzy relations or metrics) are in different worlds [8]; that is, the first, $\sim$, is associated with the possible world of problems, while the second, $\approx$, is associated with the possible world of solutions.
3) The readers can consider $\sim$ and $\approx$, from now on in this article, as either similarity relations or fuzzy similarity relations or similarity metrics in a consistent way in a real-world application. In other words, for a real-world problem, if one considers them similarity metrics, then she/he should use them consistently.
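Point 1) names similarity metrics as one concrete form of similarity. As a minimal sketch of what Definition 3 asks for on a finite set, the following Python function checks reflexivity, symmetry, and the similarity inequality (8). The matrix encoding, the function name `is_similarity_metric`, and the sample values are assumptions made for illustration and do not come from the article.

```python
from itertools import product


def is_similarity_metric(S, tol=1e-12):
    """Check the bulleted conditions of Definition 3 on a finite set.

    S is a square list of lists; S[i][j] is the similarity between the
    i-th and j-th elements of X (an illustrative encoding).
    """
    idx = range(len(S))
    # Values must lie in [0, 1].
    in_range = all(0.0 <= S[i][j] <= 1.0 for i, j in product(idx, idx))
    # S(x, x) = 1 for every x.
    reflexive = all(abs(S[i][i] - 1.0) <= tol for i in idx)
    # S(x, y) = S(y, x) for all x, y.
    symmetric = all(abs(S[i][j] - S[j][i]) <= tol for i, j in product(idx, idx))
    # Similarity inequality (8): S(x, z) >= min(S(x, y), S(y, z)).
    inequality = all(
        S[i][k] + tol >= min(S[i][j], S[j][k])
        for i, j, k in product(idx, idx, idx)
    )
    return in_range and reflexive and symmetric and inequality


# Hypothetical sample data.
GOOD = [
    [1.0, 0.8, 0.5],
    [0.8, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
BAD = [  # violates (8): S(0, 2) = 0.2 < min(S(0, 1), S(1, 2)) = 0.9
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.9],
    [0.2, 0.9, 1.0],
]
```

Under these assumptions, `is_similarity_metric(GOOD)` holds while `is_similarity_metric(BAD)` does not, because the third pair in `BAD` breaks the similarity inequality.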
In order to perform SBR using Equation (9), we define a degree of similarity between propositions $P$ and $P'$ [9] as

$$S(P, P') = \frac{2\,|F_P \cap F_{P'}|}{|F_P| + |F_{P'}|} \tag{11}$$

where $F_x$ is the set of the features of proposition $x$, and $|F_x|$ is the size of $F_x$. The degree of similarity $S(P, P')$ has the following properties [18]:
1) $0 \leq S(P, P') \leq 1$. Note that a similarity function is a special similarity relation.
2) If $P \equiv P'$, that is, $P$ and $P'$ are the same propositions, then $S(P, P') = 1$.

5.2 Similarity-based inference rules

This section will look at similarity-based reasoning (SBR) and its three inference rules.

Similarity-based modus ponens

In the previous section, we defined SBR concerning modus ponens. In order to emphasize the importance of similarity between $P$ and $P'$, and between $Q$ and $Q'$, and to show the difference of the inference rule of SBR from the generalized modus ponens of fuzzy reasoning [1], we call Equation (9) similarity-based modus ponens (SMP), and its form will be replaced with the following:

$$\frac{P', \quad P' \sim P, \quad P \rightarrow Q, \quad Q \approx Q'}{\therefore\ Q'} \tag{12}$$

The reasoning paradigm of the similarity-based deductive system and similarity-based agent [32] is based on Equation (12), because similarity-based resolution [32] is an alternative form of Equation (12). Equation (12) is also a logical foundation for CBR [8]. SMP is one of the most important reasoning paradigms in many other disciplines, because it is the basic form of any similarity-based deductive reasoning paradigm [29,30,9].

In a CBR customer support system [8], $P'$ is the problem description of the customer, $P' \sim P$ means that $P'$ and $P$ are similar, and $P \rightarrow Q$ is the case retrieved from the case base $C$ based on a similarity-based retrieval algorithm. $Q \approx Q'$ means that $Q$ and $Q'$ are similar, and $Q'$ is the satisfactory solution to the requirement of the customer.

In the context of fuzzy similarity relations [22] and similarity metrics, we assume that $P'$ corresponds to $\tilde{P}_0$, $P' \sim P$ corresponds to $\tilde{F}_{01}$, $P \rightarrow Q$ corresponds to $\tilde{F}_{11}$, $Q \approx Q'$ corresponds to $\tilde{F}_{10}$, and $Q'$ corresponds to $\tilde{Q}_1$. Then, using the compositional rule of inference [1,33], we obtain:

$$\tilde{Q}_1 = \tilde{P}_0 \circ \tilde{F}_{01} \circ \tilde{F}_{11} \circ \tilde{F}_{10} \tag{13}$$

where $\tilde{P}_0$ is a fuzzy set in $W_P$; $\tilde{F}_{01}$, $\tilde{F}_{11}$, and $\tilde{F}_{10}$ are a similarity metric, a fuzzy rule, and a fuzzy similarity metric in $W_P \times W_Q$ respectively; and $\tilde{Q}_1$ is a fuzzy set on $W_Q$. This is a computational foundation for similarity-based modus ponens [8]. In the case of $Q \approx Q'$, $\tilde{F}_{10}$ is a unit metric, and Equation (13) is then simplified into:

$$\tilde{Q}_1 = \tilde{P}_0 \circ \tilde{F}_{01} \circ \tilde{F}_{11} \tag{14}$$

When $\tilde{P}_0$, $\tilde{F}_{01}$, $\tilde{F}_{11}$, and $\tilde{Q}_1$ are each only a numerical similarity measure, Equation (14) essentially degenerates into the computational form.

In fact, many other reasoning paradigms also follow, in some sense, Equation (14), for example, analogical reasoning [1], although they have different semantics and operational algorithms for performing their own reasoning based on different real-world scenarios.

While fuzzy reasoning is essentially computational reasoning, SBR can be considered as both symbolic reasoning and computational reasoning [8]. If we regard SBR as computational reasoning, then we can consider it as a special kind of fuzzy reasoning, to some extent, because the similarity between $P$ and $P'$, $P \sim P'$, and the similarity between $Q$ and $Q'$, $Q \approx Q'$, are replaced by the fuzziness between them in the context of fuzzy logic. This is the reason why we can use fuzzy reasoning to examine similarity-based modus ponens in CBR [8].
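To make the symbolic side of SMP concrete, here is a minimal Python sketch that combines the degree of similarity of Equation (11) with the rule form of Equation (12), loosely following the customer support reading above. The rule base, the feature names, the function names, and the `threshold` used to decide whether $P' \sim P$ holds are all hypothetical; the article does not prescribe a cut-off or a retrieval algorithm.

```python
def degree_of_similarity(features_p, features_q):
    """Equation (11): 2 * |F_P & F_P'| / (|F_P| + |F_P'|)."""
    fp, fq = set(features_p), set(features_q)
    if not fp and not fq:
        # Assumption: two empty feature sets are treated as identical.
        return 1.0
    return 2 * len(fp & fq) / (len(fp) + len(fq))


def similarity_based_modus_ponens(observed, rules, threshold=0.5):
    """Return (conclusion, score) for the most similar rule, or None.

    `observed` plays the role of P'. `rules` maps the features of a
    premise P to its conclusion Q. The returned conclusion stands for Q',
    which is only claimed to be similar to Q. `threshold` is an
    illustrative cut-off for accepting P' ~ P.
    """
    best = None
    for premise, conclusion in rules.items():
        score = degree_of_similarity(observed, premise)
        if score >= threshold and (best is None or score > best[1]):
            best = (conclusion, score)
    return best


# Hypothetical case base: problem features -> solution.
RULES = {
    frozenset({"printer", "offline", "usb"}): "reinstall the driver",
    frozenset({"screen", "flicker"}): "replace the cable",
}

result = similarity_based_modus_ponens({"printer", "offline", "wifi"}, RULES)
```

Here the observed problem shares two of three features with the first premise, so its degree of similarity is 2·2/(3+3) = 2/3, which passes the assumed threshold, and the first rule's conclusion is returned as the candidate solution.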
Similarity-based modus tollens

Similarity-based modus tollens (SMT) is another inference rule for SBR. From a traditional viewpoint, we can consider SMT as an integration of SBR and modus tollens. The general form of SMT is as follows:

$$\frac{\neg Q', \quad Q' \approx Q, \quad P \rightarrow Q, \quad P \sim P'}{\therefore\ \neg P'} \tag{15}$$

Although fuzzy modus tollens has not been investigated in fuzzy logic [1], this is the first time that similarity-based modus tollens is discussed. With the increasing importance of similarity, SMT and its corresponding SBR will find their applications in business and mathematics.

Example 5. Similarity-based modus tollens. Let:
● RAT: The applicant has a good credit rating,
● REP: The applicant has a good financial reputation,
● The loan officer has an experienced rule, RAT→REP: If the applicant has a good credit rating, then the applicant has a good financial reputation.

In this case, the loan officer knows the information from applicant A, ¬REP′: The applicant has an unsatisfactory financial reputation. Because "a satisfactory financial reputation" is similar to "a good financial reputation", that is, REP′ ≈ REP, and given the rule RAT→REP, the loan officer uses the above SMT to make the decision and obtains ¬RAT′: The applicant has an unsatisfactory credit rating, since "a good credit rating" is similar to "a satisfactory credit rating".

In the context of fuzzy similarity relations [22] or similarity metrics, applying the compositional rule of inference [1] to the above Equation (15), we obtain:

$$\tilde{P}_0 = 1 - (1 - \tilde{Q}_1) \circ \tilde{F}_{10} \circ \tilde{F}_{11} \circ \tilde{F}_{01} \tag{16}$$

This is a computational foundation for similarity-based modus tollens. In the case of $Q' \approx Q$, $\tilde{F}_{10}$ is a unit metric, and Equation (16) is then simplified into:

$$\tilde{P}_0 = 1 - (1 - \tilde{Q}_1) \circ \tilde{F}_{11} \circ \tilde{F}_{01} \tag{17}$$

Similarity-based abduction

Abduction has been used in system diagnosis or medical diagnosis [8] and scientific discovery [34]. Abduction is an important reasoning paradigm in SBR. Similarity-based abductive reasoning (SAR) is a natural development of abductive reasoning [35], or an application of SBR in abductive reasoning. Its general form is as follows:

$$\frac{Q', \quad Q' \approx Q, \quad P \rightarrow Q, \quad P \sim P'}{\therefore\ P'} \tag{18}$$

Example 6. Similarity-based abductive reasoning. As in Example 5, let:
● RAT: The applicant has a good credit rating,
● REP: The applicant has a good financial reputation,
● The loan officer has an experienced rule, RAT→REP: If the applicant has a good credit rating, then the applicant has a good financial reputation.

In this case, the loan officer knows the information from applicant A, REP′: The applicant has a satisfactory financial reputation. Because "a satisfactory financial reputation" is similar to "a good financial reputation", that is, REP′ ≈ REP, the loan officer uses the above similarity-based abductive reasoning to make the decision and obtains RAT′: The applicant has a satisfactory credit rating, because "a good credit rating" is similar to "a satisfactory credit rating". It is obvious that "The applicant has a satisfactory credit rating" is an explanation for "The applicant has a satisfactory financial reputation." Therefore, similarity-based abductive reasoning can also be used for the generation of explanations, as abductive reasoning does for scientific discovery [34,36].

In the context of fuzzy similarity relations and similarity metrics [22], applying the compositional rule of inference [1] to the above Equation (18), we obtain:

$$\tilde{P}_0 = \tilde{Q}_1 \circ \tilde{F}_{10} \circ \tilde{F}_{11} \circ \tilde{F}_{01} \tag{19}$$

This is a computational foundation for similarity-based abductive reasoning. In the case of $Q' \approx Q$, $\tilde{F}_{10}$ is a unit metric, and Equation (19) is then simplified into:

$$\tilde{P}_0 = \tilde{Q}_1 \circ \tilde{F}_{11} \circ \tilde{F}_{01} \tag{20}$$
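The computational forms in Equations (13), (16), and (19) can be sketched numerically. The Python block below is one possible reading, not the article's own procedure, and it rests on several assumptions: the composition $\circ$ is taken to be max-min composition (the usual instance of the compositional rule of inference); when reasoning from $W_Q$ back to $W_P$, the fuzzy rule $\tilde{F}_{11}$ is used in transposed form; Equation (16) is read as the complement of the whole composed expression; and all numeric values are invented.

```python
def compose(vec, rel):
    """Max-min composition of a fuzzy set (list) with a fuzzy relation (matrix)."""
    cols = range(len(rel[0]))
    return [max(min(v, row[j]) for v, row in zip(vec, rel)) for j in cols]


def transpose(rel):
    return [list(col) for col in zip(*rel)]


def complement(vec):
    return [1.0 - v for v in vec]


def smp(p0, f01, f11, f10):
    """Equation (13): Q1 = P0 o F01 o F11 o F10."""
    return compose(compose(compose(p0, f01), f11), f10)


def smt(q1, f10, f11, f01):
    """Equation (16), read as P0 = 1 - ((1 - Q1) o F10 o F11^T o F01)."""
    back = compose(compose(compose(complement(q1), f10), transpose(f11)), f01)
    return complement(back)


def abduction(q1, f10, f11, f01):
    """Equation (19), read as P0 = Q1 o F10 o F11^T o F01."""
    return compose(compose(compose(q1, f10), transpose(f11)), f01)


# Hypothetical data: W_P has two elements, W_Q has three.
F01 = [[1.0, 0.5], [0.5, 1.0]]                      # similarity on W_P
F11 = [[0.75, 0.25, 0.0], [0.0, 0.5, 1.0]]          # fuzzy rule W_P x W_Q
F10 = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.25], [0.25, 0.25, 1.0]]  # similarity on W_Q
UNIT = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]     # unit metric

P0 = [1.0, 0.25]
Q1 = [1.0, 0.0, 0.25]

q_from_smp = smp(P0, F01, F11, F10)
p_from_smt = smt(Q1, F10, F11, F01)
p_from_abduction = abduction(Q1, F10, F11, F01)
```

Passing `UNIT` in place of `F10` leaves the intermediate fuzzy set unchanged, which corresponds to the simplifications in Equations (14), (17), and (20).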
Similarity-based abduction 5.3 Summary (17)
Summaryhas been used in system diagnosis or medicalTable
5.3Abduction 1 summarizes
diagnosis [8] the well-known
and scientific discovery inference rules: M
Similarity-based
[34]
. abduction abduction,
8 and proposes three inference rules with respect
Table 1
Abduction summarizes
is an the well-known inference rules:
traditional Modus
forms: ponens,
Modus modus abductive
ponens, tollens,
modus tollens, and abduc
Abduction has been usedimportant
in systemreasoning paradigm
diagnosis or in SBR. [8]Similarity-based
medical diagnosis and scientific discovery
abduction, and proposes three inference rules with respect to SBR,
examinedreasoning
[34]reasoning (SAR) is a natural development of abductive corresponding
three different
[35]
, orinference to thefor
rules
an application of SBR (see Table 1
.
traditional forms: Modus ponens, modus tollens, and abduction
SBR in abductive reasoning. Its general form is asoffollows:
[37,31]
. So far, we
them has been thoroughly used in computer have science, mathem
In the context of fuzzy similarity relations and similarity metrics, using the compositional
rule of inference [1] to the above Equation (18), we obtain:
�0 = �1 ∘ � � �
10 ∘11 ∘ 01 (19)
This is a computational foundation for similarity-based abductive reasoning. In the case of
Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023
' ≈ , � 10 is a unit metric, and Equation (19) is then simplified into:

0 = 1 ∘ �
� �
11 ∘ 01(20)
5.3 Summary

Table 1 summarizes the well-known inference rules (modus ponens, modus tollens, and abduction) and proposes three inference rules for SBR that correspond to these traditional forms [37,31]. So far, we have examined three different inference rules for SBR (see Table 1) from a unified viewpoint. Each of them has been thoroughly used in computer science, mathematics, mathematical logic [38], and other sciences [30,39,34]. However, they are all abstractions and summaries of SBR, natural reasoning, and ordinary reasoning in the real world. Furthermore, CBR has been based on only one of modus ponens, modus tollens, or abduction [33,8], whereas SBR is based on all three of the mentioned inference rules. It should be noted that reasoning paradigms can be classified into simple (atomic or first-level) reasoning paradigms and composite (second-level) reasoning paradigms [40], just as propositions can be divided into simple (atomic) propositions and compound propositions [39]. The simplest reasoning paradigm is an inference rule, which is the basis for any reasoning paradigm.

A composite reasoning paradigm consists of more than one inference rule. For example, fuzzy modus ponens [2] is a composite reasoning paradigm that integrates modus ponens and fuzzy rules. Any process model of a reasoning paradigm in AI is a method for obtaining composite reasoning paradigms. For example, the simplest rule-based expert system (RBES) mainly consists of a knowledge base (KB) and an inference engine (IE), where the IE is an inference mechanism for performing modus ponens, modus tollens, or abduction. However, in order to manipulate the knowledge in the KB, the RBES must deal with knowledge representation, knowledge explanation, and knowledge utility, which are the main components of the process model [11,8]. Therefore, the reasoning involved in an RBES can be considered a composite reasoning paradigm. In this way, we can differentiate all the reasoning paradigms in mathematical logic and AI. What we have examined in this article are only simple or atomic inference rules for SBR. In future work, we will examine composite reasoning paradigms for SBR, which constitute a “reasoning chain” [3], “reasoning network” or “reasoning tree” with some depth, and correspond to natural reasoning in human professional activities.

It should be noted that the above-mentioned abductive reasoning and its similarity-based form are unsound reasoning paradigms from a logical viewpoint [31]. However, like nonmonotonic reasoning, which is also unsound [8], this inference rule and its similarity-based form summarize the SBR used by people in real-world situations.

Table 1. Three inference rules for similarity-based reasoning.

                         Modus ponens       Modus tollens       Abduction
Traditional form         P, P → Q ∴ Q       ¬Q, P → Q ∴ ¬P      Q, P → Q ∴ P
Similarity-based form    […]                […]                 […]

6. Similarity computing and analytics

Similarity computing and analytics are the science, technology, systems, and tools used in data, information, and knowledge analysis to measure and compare the similarity between different data, information, and knowledge sets. They are used in various fields such as AI (including machine learning), data science, natural language understanding and processing, image recognition, and information retrieval. This section will examine similarity computing and


analytics in some detail.

Similarity computing is a science, technology, system, and tool for determining the degree of similarity or dissimilarity between two or more objects to create intelligence. That is, based on the research of Sun [41],

Similarity computing = Similarity science
+ Similarity engineering
+ Similarity technology
+ Similarity system
+ Similarity tools (21)

Similarity relations, fuzzy similarity relations [2], and similarity metrics [1] such as cosine similarity, Jaccard similarity, Euclidean distance, and the Pearson correlation coefficient are fundamental to measuring the similarity or dissimilarity between two or more objects in similarity computing [18].

Analytics is the science, technology, systems, and tools for mining data, information, and knowledge to discover meaningful intelligence, insights, patterns, and knowledge from big data in a database or data warehouse, or from knowledge in a knowledge base, using similarity [42]. This can be achieved using database and data warehouse techniques, statistical techniques, knowledge base techniques, data visualization techniques, machine learning algorithms, and other data and knowledge processing tools [4,43]. Similarity analytics can be represented as follows [41]:

Similarity analytics = Similarity science
+ Similarity engineering
+ Similarity technology
+ Similarity system
+ Similarity tools (22)

Basically, similarity analytics is a part of similarity computing, just as analytics is a part of computing [41]. Both aim to discover similarity intelligence in the domain. Both similarity computing and similarity analytics enable the analysis of large datasets, information sets, and knowledge sets to identify and discover intelligence, patterns, knowledge, and insights, and to predict outcomes or cases. For example, in machine learning, similarity computing is used to find similarities between different data points, and analytics is used to train models that can make predictions based on those similarities [4,18].

Although similarity science has not been proposed in academia, similarity engineering, similarity technology, similarity systems (see the next section), and similarity tools based on similarity models, methods, and algorithms are well known in the market [1,7,44].

Overall, similarity computing and analytics are the science, technology, and systems in modern data, information, and knowledge analysis that enable researchers and practitioners to gain similarity intelligence, knowledge, and insights, and to make predictions in various fields.

7. A multiagent SBR system

In AI, a reasoning paradigm usually corresponds to an intelligent system. This section proposes a multiagent SBR system as an example, which constitutes an important basis for developing any multiagent SBR system (MSBRS).

7.1 A general architecture of an SBR system

The similarity case base (SCB) is similar to a case base in a case-based system [8], as illustrated in Figure 1. The SCB is a text case base in natural language processing systems [4] and an insight base in data mining and data analytics systems [41]. The SCB consists of all the cases that the SBR system collects periodically. A user interface is used to interact with the SCB and the MIE in the SBR system (see Section 7.3). The MIE is a multi-inference engine that consists of the mechanisms for implementing three reasoning paradigms based on the above-mentioned three similarity-based inference rules and their algorithms for SBR; it manipulates the SCB to carry out the similarity-based problem solving and decision making requested by the user. The remarkable difference between the SBR system (SBRS) and the traditional CBR system (CBRS) is that the latter’s inference engine is based on a single reasoning paradigm (or inference rule), while the MIE is based on several different reasoning paradigms. This implies that a CBRS is only a subsystem of the SBRS. Therefore,


this SBR system is an extension of the CBRS and of similarity-based reasoning [8,31].

Figure 1. A general architecture for an SBR system.

7.2 MEBIE: A multiagent framework for similarity based inference engine

As mentioned in the previous subsection, the MIE is a multi-inference engine for the SBRS [45]. Ideally, the MIE could automatically adapt itself to the changing situation and perform one of the mentioned similarity-based inference rules for SBR (Figure 2). However, no existing intelligent system has reached such a high level [46]. The alternative strategy is to use multiagent technology to implement the MIE. Based on this idea, we propose a multiagent framework for a similarity-based inference engine (MABIE for short), which is a core part of a multiagent SBR system (MSBRS), as shown in Figure 1. In this framework, three rational agents (from the SMP agent to the SAR agent) are semiautonomous [8]. These three agents are mainly responsible for performing SBR corresponding to the three similarity-based inference rules in the SBRS, respectively. In what follows, we discuss each of them in some detail.

1) The SMP agent in the MABIE is responsible for manipulating the SCB based on similarity-based modus ponens and its algorithm (see also Section 5.1) to infer the similarity-based problems and solutions requested by the user. This agent can be considered an agentization of the inference engine in a traditional CBR system. The function of the SMP agent can be extended to infer the cases in the SCB based on fuzzy modus ponens [46,23].

2) The SMT agent manipulates the SCB to infer the case requested by the user based on similarity-based modus tollens and its algorithms (see Section 5.2).

3) The SAR agent is responsible for manipulating the SCB to infer the case requested by the user based on similarity-based abductive reasoning and its algorithm (see Section 5.3). This agent can generate the explanation for the experience-based reasoning inferred by the MEBIE. This agent can be considered an agentization of the inference engine in an abductive CBR system [33].

Figure 2. MIE and other agents in an MSBRS.

7.3 Some other agents in MSBRS

For the proposed MSBRS, there are some other intelligent agents, shown in Figure 2: an interface agent, an analysis assistant, and an SCB manager. In what follows, we look at them in some depth [45].

The SBRS interface agent is an advisor that helps the MSBRS user to know which reasoning agent she/he should ask for help. Otherwise, the SBRS interface agent forwards the user's problem to all agents in the MIE for further processing.

The output provided by the MIE can be considered a sub-output. The final output, as the solutions to the similarity-based problem of the user, will be processed with the help of the analysis assistant. Because different agents in the MIE use different inference rules, they may produce different, conflicting results with knowledge inconsistency. How to resolve such knowledge inconsistency is a critical issue for the


MSBRS. This issue will be resolved by the analysis assistant of the MSBRS. The analysis assistant will:
● Rank the degree of importance of the sub-outputs from the MAMIE, taking into account the knowledge inconsistency,
● Give an explanation for each of the outputs from the MIE and how the different results conflict,
● Combine or vote to establish the best solutions,
● Forward them to the SBRS interface agent, which then forwards them to the user.

The SCB manager is responsible for administering the SCB. Its main tasks are SCB creation and maintenance, similarity case base evaluation, reuse, revision, and retention. Therefore, the roles of the SCB manager are an extended form of the functions of a CBR system [8], because case base creation, case retrieval, reuse, revision, and retention are the main tasks of the CBR system [16].

7.4 Workflows of agents in MSBRS

Now let us have a look at how the MSBRS works. The user, U, asks the SBRS interface agent to solve the problem, p. The SBRS interface agent asks U whether a special reasoning agent is needed [45]. U does not know. Thus, the SBRS interface agent forwards p (after formalizing it) to all agents in the MIE for further processing. Each agent in the MIE manipulates the cases in the SCB based on p and its corresponding reasoning mechanism, and then obtains a solution, which is forwarded to the analysis assistant. After the analysis assistant receives all solutions to p, it ranks the degree of importance of the solutions, gives an explanation for each of the solutions and how the results are conflicting or inconsistent, and then forwards them (with p) to the SBRS interface agent, which then forwards them to U. If U accepts one of the solutions to the problem, then the MSBRS completes this mission. In this case, the SCB manager checks whether this case is a new one. If yes, it adds it to the SCB. Otherwise, it keeps some routine records to update the SCB. If U does not accept the solution provided, the SBRS interface agent asks U to adjust some aspects of the problem p, which is changed into pꞌ; the SBRS interface agent then forwards the revised problem pꞌ to the MIE for further processing.

8. Conclusions

Artificial intelligence (AI) addressed experience-based intelligence and knowledge-based intelligence in its early stages. As big data has made significant progress over the past 10 years, AI has been developing machine learning and deep learning to address data-based intelligence. In fact, similarity intelligence has accompanied experience-based intelligence, knowledge-based intelligence, and data-based intelligence, and plays an important role in computer science, AI, and data science in general and in similarity computing and analytics in particular. The main contributions of this article are:

1) It explored similarity intelligence, based on the similarity discovered from experience-based intelligence in machine learning and CBR. Similarity intelligence will be developed and created by many systems and algorithms in AI, computer science, and data science.

2) It explored similarity-based reasoning and proposed its three different rules, which constitute the fundamentals for all SBR paradigms.

3) It highlighted similarity-based reasoning, computing, and analytics to create similarity intelligence.

As an example, the article also proposed a multiagent architecture for an SBR system (MSBRS).

Overall, similarity intelligence is discovered from big data, information, and knowledge using similarity relations, fuzzy similarity relations and metrics, SBR, similarity computing and semantics.

Furthermore, the similarity-based approach to similarity intelligence, SBR, similarity computing and analytics proposed in the article opens a new way to integrate machine learning (e.g., machine learning algorithms such as instance-based learning and the k-Nearest Neighbor (kNN) classifier) and experience-based reasoning based on SBR, which will be examined in future work. Knowledge management


and experience management have drawn increasing attention in business, e-commerce, and computer science. Their correspondence to intelligent systems is similarity-based systems such as CBR systems and machine learning. How to apply similarity intelligence in knowledge management, experience management, and similarity-based systems will also be examined in future work.

Measurement of intelligence is based on the ability to solve difficult problems. How to define the measurement of similarity intelligence is still a weakness of this article. In future work, we will explore the measurement of similarity intelligence.

Conflict of Interest

There is no conflict of interest.

References

[1] Zimmermann, H.J., 2011. Fuzzy set theory—and its applications. Springer Science & Business Media: Berlin.
[2] Zadeh, L.A., 1971. Similarity relations and fuzzy orderings. Information Sciences. 3(2), 177-200.
[3] Minsky, M., 1988. Society of mind. Simon and Schuster: New York.
[4] Aroraa, C., Chitra, L., Munish, J., 2022. Data analytics: Principles, tools, and practices. BPB Publications: New Delhi.
[5] Sun, Z., 2022. A mathematical theory of big data. Journal of Computer Science Research. 4(2), 13-23.
[6] Zhang, D.G., Ni, C.H., Zhang, J., et al., 2022. A novel edge computing architecture based on adaptive stratified sampling. Computer Communications. 183, 121-135.
[7] Milošević, P., Petrović, B., Jeremić, V., 2017. IFS-IBA similarity measure in machine learning algorithms. Expert Systems with Applications. 89, 296-305.
[8] Finnie, G., Sun, Z., 2004. Intelligent techniques in E-commerce: A case based reasoning perspective. Springer-Verlag: Berlin.
[9] Sun, R., 1995. Robust reasoning: Integrating rule-based and similarity-based reasoning. Artificial Intelligence. 75(2), 241-295.
[10] Bergmann, R., 2002. Experience management: Foundations, development methodology, and internet-based applications. Springer: Berlin.
[11] Russell, S., Norvig, P., 2020. Artificial intelligence: A modern approach (4th Edition). Prentice Hall: Upper Saddle River.
[12] Laudon, K.G., Laudon, K.C., 2020. Management information systems: Managing the digital firm (16th Edition). Pearson: Harlow.
[13] López-Robles, J.R., Otegi-Olaso, J.R., Gómez, I.P., et al., 2019. 30 years of intelligence models in management and business: A bibliometric review. International Journal of Information Management. 48, 22-38.
[14] Turing, A., 1950. Computing machinery and intelligence. Mind. 59(236), 433-460.
[15] Schwab, P.N., 2023. ChatGPT: 1000 Texts Analyzed and up to 75,3% Similarity [Internet] [cited 2023 Mar 17]. Available from: https://www.intotheminds.com/blog/en/chatgpt-similarity-with-plan/
[16] Sun, Z., Finnie, G., Weber, K., 2004. Case base building with similarity relations. Information Sciences. 165(1-2), 21-43.
[17] Finnie, G., Sun, Z., 2003. R5 model for case-based reasoning. Knowledge-Based Systems. 16(1), 59-65.
[18] Kantardzic, M., 2011. Data mining: Concepts, models, methods, and algorithms. John Wiley & Sons: Hoboken.
[19] Jordan, M.I., Mitchell, T.M., 2015. Machine learning: Trends, perspectives, and prospects. Science. 349(6245), 255-260.
[20] Epp, S.S., 2010. Discrete mathematics with applications. Cengage Learning: Boston.
[21] Zhang, D.G., Ni, C.H., Zhang, J., et al., 2022. New method of vehicle cooperative communication based on fuzzy logic and signaling game strategy. Future Generation Computer Systems. 142, 131-149.
[22] Finnie, G., Sun, Z., 2002. Similarity and metrics in case-based reasoning. International Journal of Intelligent Systems. 17(3), 273-287.
[23] Zhang, D., Wang, W., Zhang, J., et al., 2023. Novel edge caching approach based on multi-agent deep reinforcement learning for Internet of vehicles. IEEE Transactions on Intelligent Transportation Systems. 24(6), 1-16.
[24] Klawonn, F., Castro Peña, J.L., 1995. Similarity in fuzzy reasoning. Mathware & Soft Computing. 2(3), 197-228.
[25] Fontana, F.A., Formato, F., 2002. A similarity-based resolution rule. International Journal of Intelligent Systems. 17(9), 853-872.
[26] Biacino, L., Gerla, G., Ying, M., 2000. Approximate reasoning based on similarity. Mathematical Logic Quarterly. 46(1), 77-86.
[27] Kundu, S., 2000. Similarity relations, fuzzy linear orders, and fuzzy partial orders. Fuzzy Sets and Systems. 109(3), 419-428.
[28] Ovchinnikov, S., 1991. Similarity relations, fuzzy partitions, and fuzzy orderings. Fuzzy Sets and Systems. 40(1), 107-126.
[29] Bogacz, R., Giraud-Carrier, C., 2000. A novel modular neural architecture for rule-based and similarity-based reasoning. Hybrid neural systems. Springer: Berlin. pp. 63-77.
[30] Hüllermeier, E., 2001. Similarity-based inference as evidential reasoning. International Journal of Approximate Reasoning. 26(2), 67-100.
[31] Sun, Z., 2017. A logical approach to experience-based reasoning. New Mathematics and Natural Computation. 13(1), 21-40.
[32] Loia, V., Senatore, S., Sessa, M.I., 2004. Combining agent technology and similarity-based reasoning for targeted E-mail services. Fuzzy Sets and Systems. 145(1), 29-56.
[33] Sun, Z., Finnie, G., Weber, K., 2005. Abductive case-based reasoning. International Journal of Intelligent Systems. 20(9), 957-983.
[34] Magnani, L., 2011. Abduction, reason and science: Processes of discovery and explanation. Springer Science & Business Media: Berlin.
[35] Baral, C., 2000. Abductive reasoning through filtering. Artificial Intelligence. 120(1), 1-28.
[36] Console, L., Dupré, D.T., Torasso, P., 1991. On the relationship between abduction and deduction. Journal of Logic and Computation. 1(5), 661-690.
[37] Sun, Z., Finnie, G., Sun, J. (editors), 2005. Four new fuzzy inference rules for experience based reasoning. Fuzzy Logic, Soft Computing and Computational Intelligence (IFSA2005); 2005 May 30; Beijing. p. 188-193.
[38] Reeves, S., Clarke, M., 1990. Logic for computer science. Addison-Wesley: Wokingham.
[39] Hurley, P.J., 2000. A concise introduction to logic. Thomson Learning: Wadsworth.
[40] Dosen, K., 1993. A historical introduction to substructural logics. Substructural logics. Clarendon Press: Oxford. pp. 1-30.
[41] Sun, Z., 2022. Problem-based Computing and Analytics. International Journal of Future Computer and Communication. 11(3), 52-60.
[42] Sun, Z., Stranieri, A., 2021. The nature of intelligent analytics. Intelligent analytics with advanced multi-industry applications. IGI-Global: Hershey. pp. 1-22.
[43] Sun, Z., Pambel, F., Wu, Z., 2022. The elements of intelligent business analytics: Principles, techniques, and tools. Handbook of research on foundations and applications of intelligent business analytics. IGI-Global: Hershey. pp. 1-20.
[44] Iantovics, L.B., Kountchev, R., Crișan, G.C., 2019. ExtrIntDetect—A new universal method for the identification of intelligent cooperative multiagent systems with extreme intelligence. Symmetry. 11(9), 1123.
[45] Sun, Z., Finnie, G., 2016. A Similarity Based Approach to Experience Based Reasoning (Preprint). Available: https://ro.uow.edu.au/
[46] Sun, Z., Finnie, G., 2005. MEBRS: A multiagent architecture for an experience based reasoning system. Knowledge-based intelligent information and engineering systems. Springer: Berlin. pp. 972-978.


Journal of Computer Science Research

https://journals.bilpubgroup.com/index.php/jcsr

ARTICLE

Development of New Machine Learning Based Algorithm for the Diagnosis of Obstructive Sleep Apnea from ECG Data

Erdem Tuncer

Faculty of Technology, Biomedical Eng. Department of Kocaeli University, Kocaeli, 41001, Turkey

ABSTRACT
In this study, a machine learning algorithm is proposed for the detection of Obstructive Sleep Apnea (OSA) from single-channel ECG recordings. Eighteen ECG recordings from the PhysioNet Apnea-ECG dataset were used. In the feature extraction stage, dynamic time warping and median frequency features were computed from the coefficients of different frequency bands of the ECG data, obtained with a wavelet transform-based algorithm. In the classification stage, OSA and normal ECG recordings were classified using Random Forest (RF) and Long Short-Term Memory (LSTM) classifiers. The classifiers were evaluated with a 90% training and 10% testing split. In this evaluation, the accuracy of the RF classifier was 82.43% and that of the LSTM classifier was 77.60%. These results suggest that the proposed features and classifiers may be usable in OSA classification and may offer an alternative to existing machine learning methods. The proposed method and feature set are promising because their low computing overhead allows an efficient implementation.
Keywords: ECG; Sleep apnea; Classification; Dynamic time warping; Median frequency

*CORRESPONDING AUTHOR:
Erdem Tuncer, Faculty of Technology, Biomedical Eng. Department of Kocaeli University, Kocaeli, 41001, Turkey; Email: [email protected]
ARTICLE INFO
Received: 6 June 2023 | Revised: 4 July 2023 | Accepted: 5 July 2023 | Published Online: 14 July 2023
DOI: https://doi.org/10.30564/jcsr.v5i3.5762
CITATION
Tuncer, E., 2023. Development of New Machine Learning Based Algorithm for the Diagnosis of Obstructive Sleep Apnea from ECG Data. Journal of Computer Science Research. 5(3): 15-21. DOI: https://doi.org/10.30564/jcsr.v5i3.5762
COPYRIGHT
Copyright © 2023 by the author(s). Published by Bilingual Publishing Group. This is an open access article under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. (https://creativecommons.org/licenses/by-nc/4.0/).

1. Introduction

Obstructive sleep apnea (OSA) is a sleep-related breathing disorder. It manifests as obstructions of the upper respiratory tract during sleep and the waking periods that follow these obstructions. OSA can seriously reduce a person’s quality of daily life and cause the development of many cardiovascular diseases. Therefore, early diagnosis and treatment of obstructive sleep apnea is important. Electrocardiography (ECG) is the process of recording the electrical activity of the heart. Today, ECG signals are used in the diagnosis of OSA. Apnea diagnosis from the ECG signal is based on heart rate variability. Determining whether a person has OSA syndrome with the proposed machine learning technique using single-channel ECG recordings would be economical and practical, because such a system removes the need for environments such as sleep laboratories [1,2]. There are many studies in the literature on the detection of OSA from ECG using different methods. In the study conducted by Yildiz [3], obstructive sleep apnea data from ECG recordings were classified. Twelve features were obtained using the wavelet transform, and the highest success rate of 98.3% was achieved with support vector machine and artificial neural network classifiers. Faal et al. [4] presented a new feature generation method using an autoregressive integrated moving average and exponential generalized autoregressive conditional heteroscedasticity model in the time domain from ECG signals. ECG signals were analyzed in one-minute segments. The results were evaluated using five different classifiers (support vector machine, neural network, quadratic discriminant analysis, linear discriminant analysis, and k-nearest neighbor). The classification achieved a success rate of 81.43%. Tyagi et al. [5] proposed a new approach that cascades two different types of restricted Boltzmann machines in a deep belief network for sleep apnea classification using electrocardiogram signals. They achieved a success rate of 89.11% on ECG data examined in one-minute epochs. Yang et al. [6] proposed a one-dimensional squeeze-and-excitation residual group network for sleep apnea detection. The proposed method achieved an accuracy of 90.3%. Thus, they argued that cheap and useful sleep apnea detectors can be integrated with wearable devices.

The aim of this study is to present an automatic machine learning method that can detect OSA from ECG recordings. The proposed method is based on a wavelet transform algorithm. Unlike the studies in the literature, this study examines the effect of different features on apnea data instead of the features frequently used in the literature.

2. Materials and methods

2.1 Data set

The ECG recordings used in the study were taken from the PhysioNet Apnea-ECG dataset. There are 70 ECG recordings in total. Recordings can be up to 10 hours long. All of the sleep recordings were taken from 32 subjects, aged between 27 and 63 years. The standard V2 lead was used for the placement of the electrodes on the body surface during recording. The ECGs were digitized at a sampling rate of 100 Hz with 16-bit resolution. Whether an ECG recording belongs to a person with obstructive sleep apnea was determined according to the sleep study technique [7]. In this study, 18 ECG recordings of 10 randomly selected patients (a01, a02, a03, a04, a05, a06, a07, a08, a09, a10) were used. The signal forms of randomly selected apnea and normal ECG data are given in Figure 1. In Figure 1(a), heart rate variability is visually striking after the 4000th sample. Figure 1(b) shows a normal one-minute ECG signal.

Figure 1. (a) ECG signal with apnea, (b) Normal ECG signal.

2.2 Feature selection

Discrete wavelet transform

The discrete wavelet transform aims to solve the fixed-width window problem of the Fourier transform by using a scalable wavelet function. Thus, optimum time-frequency resolution is provided in different frequency ranges for the biomedical signals to be analyzed. The discrete wavelet transform is also intended to eliminate excessive computational load: because it relies on an efficient filter-based algorithm, the wavelet coefficients are calculated only for discrete values at certain points. This algorithm, called multiresolution analysis, consists of sequential high-pass and low-pass filter pairs [8,9]. The frequency sub-bands of the ECG data used in the study are given in Table 1. As shown in Table 1, a six-level wavelet transform is used.

Table 1. Ranges of frequency bands in wavelet transform decomposition of ECG signal.

Sub-bands    Frequency ranges (Hz)
D1           25-50
D2           12.5-25
D3           6.25-12.5
D4           3.125-6.25
D5           1.5625-3.125
D6           0.78125-1.5625
A6           0-0.78125

Dynamic time warping algorithm

The dynamic time warping algorithm measures the similarity between time series and is used in classification. Biomedical signals sampled over a period of time form a time series. The similarity between two discrete time series can be calculated by summing the Euclidean distances between their aligned elements. The closer this sum is to zero, the more similar the time series are. Today, the dynamic time warping algorithm is used in many areas from image processing to audio processing [10,11].

$Q = q_1, q_2, \ldots, q_{n-1}, q_n$ (1)

$C = c_1, c_2, \ldots, c_{m-1}, c_m$ (2)

Q and C in Equation (1) and Equation (2) represent two different signals or data; n and m indicate the lengths of these signals. The distance between the elements of the Q and C signals is calculated using the squared Euclidean distance as in Equation (3).

$d(q_i, c_j) = (q_i - c_j)^2$ (3)

After the (i, j) distance matrix is obtained for Q and C, the accumulated distance matrix is calculated from it. The accumulated cost matrix is calculated recursively [12].

Median frequency

Power spectral density is the frequency-domain representation of the power content of the signal. It is used to characterize broadband random signals. The median frequency is the midpoint of the power spectral density distribution: the frequencies above it and the frequencies below it each make up 50% of the total power in the ECG [13,14].

2.3 Classification

Random forest

Random Forest (RF) is a very popular learning algorithm for classification and regression problems.


The RF algorithm generates a large number of unbiased decision trees, each of which votes for a class. The Gini index is used to construct the decision trees and to determine the final class in each tree. The Gini index Gini(v) at node v measures the purity of v and is given by Equation (4) [15,16].
$$\mathrm{Gini}(v) = \sum_{i=1}^{C} f_i (1 - f_i) \quad (4)$$
Here f_i is the fraction of class i recorded at node v, and C is the number of classes.

Long short-term memory
Introduced by Hochreiter and Schmidhuber, Long Short-Term Memory (LSTM) is an advanced variant of the Recurrent Neural Network (RNN) architecture. The basic idea of LSTM is that it uses a memory cell to remember and explicitly carry unit outputs across different time steps. The memory cell uses cell states to remember the information of temporal contexts. It has a forget gate, an input gate and an output gate to control the flow of information between different time steps. These three gates make it easy to organize long-term memory. LSTM models can learn the temporal dependence between data. Because they can learn long-term correlations in a sequence, LSTM networks are capable of accurately modeling complex multivariate sequences such as the ECG signal [17,18].

2.4 Evaluation of classification models

One of the performance measures for a machine learning classification problem is the confusion matrix. The table containing the four possible combinations of predicted and actual values is called the confusion matrix (Table 2) [19].

Table 2. Confusion matrix.
               Predicted: No     Predicted: Yes
Actual: No     True Negative     False Positive
Actual: Yes    False Negative    True Positive

Here, TP: True positive, TN: True negative, FP: False positive, FN: False negative. Some of the metrics that can be calculated with the terms in Table 2 are accuracy, precision and recall. Their mathematical equations are given in Equations (5), (6) and (7).
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \quad (5)$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (6)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (7)$$
In addition, the test performance of each classifier was evaluated by calculating statistical parameters.

3. Results

In this study, a machine learning method is presented that automatically detects OSA, a disease that is time-consuming and costly to diagnose, from single-channel ECG recordings. The flow chart of the proposed method is shown in Figure 2.

Figure 2. Flow chart of the proposed model.

In the presented method, ECG data were analyzed in one-minute windows. The coefficients of the lower frequency bands were obtained from each window by using the wavelet transform (6-level Symlet2 wavelet). The dynamic time warping algorithm was then applied to the wavelet coefficients in different frequency bands, and the results were recorded in the feature matrix. The relationship of the A6 coefficients with the other coefficients was evaluated with the dynamic time warping algorithm. Another parameter calculated as a feature is the median frequency. The median frequency values of the wavelet coefficients obtained from all lower frequency bands were calculated. As shown in Table 3, a total of 13 features were extracted and given as input to the classifier algorithms.

Table 3. Feature list.
No    Feature name            Wavelet coefficients
1     Dynamic Time Warping    A6, D6
2     Dynamic Time Warping    A6, D5
3     Dynamic Time Warping    A6, D4
4     Dynamic Time Warping    A6, D3
5     Dynamic Time Warping    A6, D2
6     Dynamic Time Warping    A6, D1
7     Median frequency        D1
8     Median frequency        D2
9     Median frequency        D3
10    Median frequency        D4
11    Median frequency        D5
12    Median frequency        D6
13    Median frequency        A6

In this study, two different classifier algorithms were evaluated. One is the deep learning architecture LSTM and the other is the traditional learning algorithm RF. The architecture of the model created for the LSTM classifier is shown in Figure 3. The LSTM architecture is composed of an input layer, LSTM layer, dropout layer, LSTM layer, dropout layer and output layer, respectively. Each LSTM layer contains 50 units. These units use the Rectified Linear Unit (ReLU) activation function and give a different output for each time step. ReLU is used because it generally makes the model less costly to train in terms of computational load and can achieve better performance than other activation functions. In addition, ReLU can avoid the vanishing gradient problem, which is an advantage over the tanh function.
After the first LSTM layer, a dropout layer (with a value of 0.2) is applied to reduce overfitting. The next layer is a new LSTM layer containing 50 units and ReLU activation functions, followed by another dropout layer. Finally, the output layer uses a sigmoid activation function to estimate the classification result.

Figure 3. Diagram of the LSTM architecture (input layer, LSTM layer with units 1 to 50, dropout layer, LSTM layer with units 1 to 50, dropout layer, output layer).

The classification accuracy of the RF method depends on user-defined parameters such as the number of trees and the number of parameters. Therefore, selecting the most appropriate parameters for the data increases the classification accuracy. In the study, multiple combinations were tested to find the optimum numbers of trees and parameters. The success rates obtained for the different combinations are shown in Table 4. As can be seen in Table 4, the highest success rate was obtained with 250 trees and two parameters for the classification of apnea data. Since increasing the number of trees further does not increase the performance of the model, the model with the highest performance and the least number of trees was selected.

Table 4. RF algorithm success results by parameters.
Number of trees    Number of parameters    Accuracy rate (%)
10                 2                       78.15
20                 2                       80.13
30                 2                       80.46
40                 2                       80.57
50                 2                       80.79
70                 2                       81.22
100                2                       81.66
150                2                       81.99
200                2                       81.88
250                2                       82.43
500                2                       82.43

The success rates obtained with the LSTM architecture and the RF algorithm are given in Table 5. As can be seen from Table 5, the optimized RF algorithm performed better than the LSTM architecture. Therefore, the RF algorithm has the best performance.
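The two feature families listed in Table 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `dtw_distance`, `median_frequency` and `build_features` are hypothetical, the wavelet decomposition itself is assumed to have been done beforehand (for example with a wavelet library), and a single sampling rate `fs` is passed for all sub-bands as a simplification. The accumulated cost uses the squared difference of Equation (3) as the local distance, and the median frequency is read from a simple one-sided periodogram computed with a direct DFT.

```python
import cmath
import math


def dtw_distance(q, c):
    """Accumulated dynamic-time-warping cost between sequences q and c."""
    n, m = len(q), len(c)
    # acc[i][j]: minimal accumulated cost of aligning q[:i] with c[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (q[i - 1] - c[j - 1]) ** 2  # local distance, Equation (3)
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def median_frequency(x, fs):
    """Frequency (Hz) below which half of the spectral power of x lies.

    x must be non-empty; fs is the sampling rate assumed for x.
    """
    n = len(x)
    power = []
    for k in range(n // 2 + 1):  # one-sided spectrum via a direct DFT
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    half_power = sum(power) / 2
    cumulative = 0.0
    for k, p in enumerate(power):
        cumulative += p
        if cumulative >= half_power:
            return k * fs / n
    return (n // 2) * fs / n


def build_features(bands, fs):
    """Assemble 13 features from a dict with keys 'A6' and 'D1'..'D6'.

    Assumption: `bands` already holds the wavelet coefficients of one window.
    """
    details = ["D1", "D2", "D3", "D4", "D5", "D6"]
    dtw = [dtw_distance(bands["A6"], bands[d]) for d in details]      # 6 features
    med = [median_frequency(bands[b], fs) for b in details + ["A6"]]  # 7 features
    return dtw + med
```

The direct DFT is quadratic in the window length and is used here only to keep the sketch dependency-free; an FFT routine would normally replace it for one-minute ECG windows.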

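Equations (4) to (7) are simple enough to be checked by hand. The sketch below writes them out in Python; the function names are hypothetical and the counts in the usage note are invented for illustration, not taken from the study.

```python
def gini(fractions):
    """Gini index of a node, Equation (4); fractions are class shares summing to 1."""
    return sum(f * (1 - f) for f in fractions)


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision and recall from confusion-matrix counts, Equations (5)-(7)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall
```

For example, with illustrative counts TP = 40, TN = 45, FP = 5 and FN = 10, the accuracy is 85/100 = 0.85, the precision is 40/45 and the recall is 40/50 = 0.8. A pure node gives `gini([1.0, 0.0]) == 0.0`, while an evenly split two-class node gives 0.5.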

Table 5. Classifier performances.
Dataset      Accuracy (%)        Precision (%)       Recall (%)
             RF       LSTM       RF       LSTM       RF       LSTM
ECG Apnea    82.43    77.60      82.10    76.70      82.40    77.50

4. Discussion
The results show that the RF algorithm gives the highest accuracy, 82.43%. High classification performance was achieved with thirteen features obtained by using two feature types from ECG data. In the study conducted by Yildiz [3], the norm entropy values of each wavelet level were calculated by using the twelve-level wavelet transform of the obstructive sleep apnea data from ECG recordings. The obtained features were applied to support vector machine and artificial neural network classifier algorithms, and the highest success rate of 98.3% was obtained. Faal et al. [4] presented a new feature generation method and achieved a success rate of 81.43% with five different classifier algorithms. Tyagi et al. [5] proposed a new approach and achieved a success rate of 89.11%. Yang et al. [6] proposed a one-dimensional squeeze-and-excitation residual group network and achieved 90.3% accuracy. In the study by Razi et al. [20], ten time-domain features were extracted and reduced to five features. Principal component analysis and linear discriminant analysis were used for dimensionality reduction. The RF algorithm was proposed for classification and the results were compared with other classifier algorithms. The highest success rate obtained was 95.01%.
When the studies in the literature are examined, it is observed that the success rates are generally higher than that of this study. Most of the studies aimed to reach a higher success rate by using similar methods and techniques. However, in the field of machine learning, the goal is not only to increase classification success but also to develop different features and methods. From this point of view, our article differs from the studies in the literature. A previously unused feature set is suggested on ECG apnea data. At the same time, the change in success rates with the optimization of the classifier algorithms was examined. It is possible to reach higher success rates by diversifying and optimizing the parameters of the machine learning model.

5. Conclusions
This article discusses the estimation of apnea diagnosis from ECG data. We propose a binary classification machine learning method to support physicians’ decisions in clinical practice. For decision support applications, modeling with the RF algorithm as a classifier and classification of patients’ apnea data are recommended. The feature method selected with the RF algorithm proved successful. In the classification made with the feature set used and RF algorithm optimization, a successful prediction was made with 13 features at an accuracy rate of 82.43%. The feature set and method we used in our study give hope for higher future success rates. In further studies, we aim to evaluate the efficiency of the feature set by expanding the dataset.

Conflict of Interest
The author has no conflicts of interest to declare.

Funding
This research received no external funding.

References
[1] Wiegand, L., Zwillich, C.W., 1994. Obstructive sleep apnea. Disease-a-Month. 40(4), 202-252. DOI: https://doi.org/10.1016/0011-5029(94)90013-2
[2] Paiva, T., Attarian, H., 2014. Obstructive sleep apnea and other sleep-related syndromes. Handbook of clinical neurology. Elsevier: Amsterdam. pp. 251-271.
[3] Yildiz, A., 2017. Tek kanallı EKG kayıtları analizinden uyku apne tespiti (Turkish) [Detection of sleep apnea from analysis of single-channel ECG recordings]. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi. 8(1), 111-122.


[4] Faal, M., Almasganj, F., 2021. Obstructive sleep apnea screening from unprocessed ECG signals using statistical modelling. Biomedical Signal Processing and Control. 68, 102685. DOI: https://doi.org/10.1016/j.bspc.2021.102685
[5] Tyagi, P.K., Agrawal, D., 2023. Automatic detection of sleep apnea from single-lead ECG signal using enhanced-deep belief network model. Biomedical Signal Processing and Control. 80(2), 104401. DOI: https://doi.org/10.1016/j.bspc.2022.104401
[6] Yang, Q., Zou, L., Wei, K., et al., 2022. Obstructive sleep apnea detection from single-lead electrocardiogram signals using one-dimensional squeeze-and-excitation residual group network. Computers in Biology and Medicine. 140, 105124. DOI: https://doi.org/10.1016/j.compbiomed.2021.105124
[7] Penzel, T., Moody, G.B., Mark, R.G., et al. (editors), 2000. The Apnea-ECG database. Computers in Cardiology. 2000 September 24-27; Cambridge. USA: IEEE. p. 255-258.
[8] Tuncer, E., Bolat, E.D., 2022. Destek Vektör Makinaları ile EEG Sinyallerinden Epileptik Nöbet Sınıflandırması (Turkish) [Epileptic seizure classification from EEG signals with support vector machines]. Politeknik Dergisi. 25(1), 239-249. DOI: https://doi.org/10.2339/politeknik.672077
[9] Mallat, S.G., 1989. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 11(7), 674-693. DOI: http://dx.doi.org/10.1109/34.192463
[10] Zhang, Z., Tavenard, R., Bailly, A., et al., 2017. Dynamic time warping under limited warping path length. Information Sciences. 393, 91-107. DOI: https://doi.org/10.1016/j.ins.2017.02.018
[11] Jeong, Y.S., Jeong, M.K., Omitaomu, O.A., 2011. Weighted dynamic time warping for time series classification. Pattern Recognition. 44(9), 2231-2240. DOI: https://doi.org/10.1016/j.patcog.2010.09.022
[12] Bakir, C., 2016. Automatic speaker gender identification for the German language. Balkan Journal of Electrical and Computer Engineering. 4(2), 79-83. DOI: https://doi.org/10.17694/bajece.43067
[13] Brown, C.G., Griffith, R.F., Ligten, P.V., et al., 1991. Median frequency - a new parameter for predicting defibrillation success rate. Annals of Emergency Medicine. 20(7), 787-789. DOI: https://doi.org/10.1016/S0196-0644(05)80843-1
[14] Tonner, P.H., Bein, B., 2006. Classic electroencephalographic parameters: Median frequency, spectral edge frequency etc. Best Practice & Research Clinical Anaesthesiology. 20(1), 147-159. DOI: https://doi.org/10.1016/j.bpa.2005.08.008
[15] Masetic, Z., Subasi, A., 2016. Congestive heart failure detection using random forest classifier. Computer Methods and Programs in Biomedicine. 130, 54-64. DOI: https://doi.org/10.1016/j.cmpb.2016.03.020
[16] Coskun, G., Aytekin, I., 2021. Early detection of mastitis by using infrared thermography in holstein-friesian dairy cows via classification and regression tree (CART) Analysis. Selcuk Journal of Agriculture and Food Sciences. 35(2), 118-127.
[17] Tuncer, E., Bolat, E.D., 2022. Classification of epileptic seizures from electroencephalogram (EEG) data using bidirectional short-term memory (Bi-LSTM) network architecture. Biomedical Signal Processing and Control. 73, 103462. DOI: https://doi.org/10.1016/j.bspc.2021.103462
[18] Sowmya, S., Jose, D., 2022. Contemplate on ECG signals and classification of arrhythmia signals using CNN-LSTM deep learning model. Measurement: Sensors. 24, 100558. DOI: https://doi.org/10.1016/j.measen.2022.100558
[19] Dağli, E., Büber, M., Taspinar, Y.S., 2022. Detection of accident situation by machine learning methods using traffic announcements: The case of metropol Istanbul. International journal of applied mathematics electronics and computers. 10(3), 61-67.
[20] Razi, A.P., Einalou, Z., Manthouri, M., 2021. Sleep Apnea classification using random forest via ECG. Sleep and Vigilance. 5, 141-146. DOI: https://doi.org/10.1007/s41782-021-00138-4


Journal of Computer Science Research


https://journals.bilpubgroup.com/index.php/jcsr

ARTICLE

Enhancing Human-Machine Interaction: Real-Time Emotion Recognition through Speech Analysis
Dominik Esteves de Andrade, Rüdiger Buchkremer*

Institute of IT Management and Digitization Research (IFID), FOM University of Applied Sciences, Dusseldorf,
40476, Germany

ABSTRACT
Humans, as intricate beings driven by a multitude of emotions, possess a remarkable ability to decipher and
respond to socio-affective cues. However, many individuals and machines struggle to interpret such nuanced signals,
including variations in tone of voice. This paper explores the potential of intelligent technologies to bridge this
gap and improve the quality of conversations. In particular, the authors propose a real-time processing method that
captures and evaluates emotions in speech, utilizing a terminal device like the Raspberry Pi computer. Furthermore,
the authors provide an overview of the current research landscape surrounding speech emotion recognition and
delve into their methodology, which involves analyzing audio files from renowned emotional speech databases. To aid
comprehension, the authors present visualizations of these audio files in situ, employing dB-scaled Mel spectrograms
generated through TensorFlow and Matplotlib. The authors use a support vector machine kernel and a Convolutional
Neural Network with transfer learning to classify emotions. Notably, the classification accuracies achieved are 70%
and 77%, respectively, demonstrating the efficacy of our approach when executed on an edge device rather than relying
on a server. The system can evaluate pure emotion in speech and provide corresponding visualizations to depict the
speaker’s emotional state in less than one second on a Raspberry Pi. These findings pave the way for more effective
and emotionally intelligent human-machine interactions in various domains.
Keywords: Speech emotion recognition; Edge computing; Real-time computing; Raspberry Pi

*CORRESPONDING AUTHOR:
Rüdiger Buchkremer, Institute of IT Management and Digitization Research (IFID), FOM University of Applied Sciences, Dusseldorf, 40476,
Germany; Email: [email protected]
ARTICLE INFO
Received: 7 June 2023 | Revised: 7 July 2023 | Accepted: 10 July 2023 | Published Online: 21 July 2023
DOI: https://doi.org/10.30564/jcsr.v5i3.5768
CITATION
Esteves de Andrade, D., Buchkremer, R., 2023. Enhancing Human-Machine Interaction: Real-Time Emotion Recognition through Speech Analysis. Journal of Computer Science Research. 5(3): 22-45. DOI: https://doi.org/10.30564/jcsr.v5i3.5768
COPYRIGHT
Copyright © 2023 by the author(s). Published by Bilingual Publishing Group. This is an open access article under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. (https://creativecommons.org/licenses/by-nc/4.0/).


1. Introduction

A phenomenon that transcends both professional and personal domains is the growing integration of machines, which aims to foster human connection. Numerous individuals cannot decipher socio-affective cues, such as nuances in tone of voice. The challenge is to ensure that gestures, facial expressions, and paralinguistic information such as volume, frequency, and intonation that make up communication are not lost. Therefore, incorporating emotions into interface design becomes indispensable, as people tend to exhibit social behaviors during interactions with machines. Nonverbal communication often carries pivotal information in a typical conversation, revealing the speaker’s intentions. Apart from the semantic content conveyed through text, how words are expressed imparts significant nonverbal cues. The precise delivery of spoken words, accompanied by appropriate emotions, can convey a different message altogether.
Consequently, El Ayadi et al. [1] elucidate that humanity remains distant from achieving natural interaction with machines, particularly in comprehending the emotional states of counterparts. At this juncture, the emergence of emotion recognition technologies becomes pivotal, encompassing various methods and technologies that enable the recognition of emotions beyond human perception. The primary objective of emotion recognition is to allow a system to adapt its response when certain emotions, such as frustration or anger, are detected. In 2001, Cowie et al. [2] expounded on the two channels of communication present in every human interaction: the explicit and the implicit. While the explicit channel conveys messages, the implicit channel reveals the speaker’s underlying feelings and moods. They explain that extensive research has been conducted to comprehend the explicit channel, whereas the implicit channel, though less explored, holds great significance in understanding speakers and their emotional states.
Machines typically exhibit neutral behavior, which humans may perceive as indifference. Hence, devices need to recognize emotions conveyed through speech to interact effectively. This ability is commonly referred to as Speech Emotion Recognition (SER). Schuller [3] states that even animals can perceive the tonality of human speech, suggesting that the time has come for machines to possess this capability. Kraus [4] asserts that distinguishing pure voice communication from visual or audiovisual communication is vital in determining a person’s empathy.
In 2020, Akçay and Oğuz [5] commented that although real-time emotion recognition through SER systems is technically feasible, it has not yet become a ubiquitous part of daily life, unlike speech recognition systems. Implementing a computer system necessitates considering both economic and ecological factors. Energy supply and usage issues are intertwined with global warming and environmental concerns [6]. Thus, an edge emotion recognition machine must consume minimal energy while operating in real time.
To achieve near real-time operation, a machine requires highly optimized processor performance, short transmission paths, and low latency times. These three conditions constitute essential components of edge computing. Unlike cloud computing, which transfers data to a centralized location for processing, edge computing brings computational power closer to the data. Mao et al. [7] argue that cloud computing is unsuitable for latency-critical mobile applications due to the distance between the user and the data center, resulting in significant delays. Abbas et al. [8] explain that cloud computing is unsuitable for real-time applications such as augmented reality or car-to-car communication and thus support the edge computing approach. Cao et al. [9] report that over 50 billion end devices are connected to the Internet and thus to each other, producing a data volume of 40 zettabytes. These include mobile and ambient end devices such as smartphones, smart speakers, or Raspberry Pis. For all these network participants to act and communicate with each other in near real-time, computing power must be shifted closer to the data.

2. Related work

The introduction of cloud computing was a milestone in the early 2000s, enabling new business models and innovations. However, the era of cloud computing seems to be ending as the edge computing paradigm is increasingly replacing the cloud computing paradigm due to new requirements. Edge computing can support the new requirements for low latency, increased data security, mobility support, and real-time processing. The literature divides edge computing into the sub-areas of fog computing, cloudlet, and mobile edge computing (MEC). While the first two approaches mentioned are hardly found in practice, MEC is ubiquitous. In MEC, computationally intensive cloud servers are stationed in mobile base stations at the network’s edge and thus close to the end devices, ensuring daily use of this technology. As Shi et al. [10] stated, MEC means data processing in the immediate vicinity of the end device and on the device itself. In addition to MEC, mobile cloud computing (MCC) is based on the principle that end devices perform the processing and only send the result or partial result to the MEC server or the MCC server. However, none of these approaches can be found in pure form in practice. Instead, cloud and edge computing techniques are combined to cover various use cases and exploit their advantages.
The topic of speech emotion recognition (SER) and its feature extraction and pattern recognition are a constant part of current research. The recent literature shows that SER relies especially on the continuous and the spectral features of speech, since these reflect the characteristics of emotions most appropriately. Priority is given to the course of the primary speech frequency or loudness, the temporal ratios, pauses, and spectral features such as the Mel frequency cepstral coefficient (MFCC) and the Mel spectrograms [3]. The most common classification techniques used in speech recognition in recent years are the Gaussian Mixture Model (GMM) in combination with the Hidden Markov Model (HMM), the support vector machine (SVM) (Cortes and Vapnik 1995), and more recently, neural networks [11,12]. Consequently, the successes achieved in this regard also inspired using these techniques in SER, but with a focus on neural networks, SVM, or any combination of these two. Studies reveal that even pure emotion determination by humans is not accurate in all cases, so the focus is on the use and further development of neural networks [13]. In the field of neural networks, recurrent neural networks (RNNs) such as Long Short-Term Memory (Hochreiter and Schmidhuber [14]) were initially used because their feedback loops make them more suitable for processing continuous inputs such as speech signals [15,16]. RNNs have been superseded by convolutional neural networks (CNNs) such as AlexNet, VGG16, ResNet, or MobileNetV2 because of the high resource and memory requirements of RNNs and the continued success of CNNs. Furthermore, MFCC or Mel spectrograms have been used as input to a Convolutional Neural Network (CNN). Moreover, the everyday use of transfer learning and Multitask Learning methods makes the CNN deployment even more efficient [17].
Every pattern recognition is based on previously extracted features in considerable quantity and quality. Due to this diversity, selecting suitable features is relevant in classification. The method generally used in machine learning for feature extraction is the framework open-source Speech and Music Interpretation by Large-space Extraction (openSMILE) [18], which in turn includes the feature sets extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS) and ComParE. In deep learning, recent literature has increasingly used CNN for this purpose. In this approach, the output layer is either preserved as a classifier or replaced by, for example, an SVM.
In the phase of emotion classification, different sets of emotions are used, each containing a different number of emotions. The sets can vary from five to 20 emotions. The most common set of emotions in the literature refers to the six basic emotions, according to Ekman [19], which are happiness, sadness, anger, fear, disgust, and surprise, including a seventh neutral emotion.
In the mainstream literature, descriptions of the hardware on which a neural network is trained or executed are scarce. However, Tariq et al. [15] describe that neural networks, especially deep neural


networks run in cloud-like data centers. The locally 4) EMOVO [24]


collected data is transferred to these servers, deleted 5) eNTERFACE’05 [25]
on the local device, processed on the servers, and The eNTERFACE’05 database, in particular,
only the result is sent back to the end device. Thus, holds data in an audiovisual format, whereas the
applying neural networks in the context of MEC and remaining databases are in pure auditory waveform
real-time capability represents a novelty. Despite the format (WAV). The audio part of the database eN-
intensive research on these topics, no everyday emo- TERFACE’05 is extracted from the audiovisual files
tion recognition products currently exist. to use the required audio data.
In Table 1, a detailed overview of databases is
3. Materials and methods presented. While RAVDESS, eNTERFACE’05,
TESS, and EMOVO reflect the six basic emotions,
We employ labeled emotional speech data for Emo-DB contains only five. Except for eNTER-
the prototypical implementation. Audio files with FACE’05, all other databases have a neutral emotion.
a minimum length of one second but a maximum Only RAVDESS and Emo-DB include any other
length of 20 seconds are considered for use. Most emotions beyond these seven listed. However, since
emotion databases refer to six basic emotions [20]. most databases contain the six primary and neutral
Considering arousal and valence dimensions is not emotions, the prototype will classify only those.
part of our work, which is why these criteria are ne- Hence, our dataset comprises 6656 audio files
glected in data acquisition. Since part of this work is from 140 different references, amounting to a cu-
the emotion recognition in speech, but human speech is divided into sentences from which the emotions emerge, the audio length of one to 20 seconds is subjectively chosen since most sentences are spoken within this period. Thus, the audio files must still contain spoken sentences without singing, noise, or the like. However, the native language is not a selection criterion since emotions are expressed in any language. Even though the speaking gender is not an immediate selection criterion, the totality of all databases must contain both male and female spoken sentences to allow for the generalization of the data. In addition, the stored channel number or sampling rate is irrelevant in data acquisition, as these are standardized in the training process. Finally, the audio files and databases must be freely accessible and identified by labels. Thus, the following audio databases were selected because they meet the given quality criteria:
1) Ryerson Audiovisual Database of Emotional Speech and Song (RAVDESS) [21]
2) Berlin Database of Emotional Speech (Emo-DB) [22]
3) Toronto Emotional Speech Set (TESS) [23]

mulative playback time of 5.14 hours. On average, each file has a duration of 2.97 seconds. Within the file names, emotions are encoded either as complete textual representations, abbreviations, or numerical values. The number of files and their total length per database vary considerably. Figure 1 illustrates the distribution of emotions across all the acquired databases, demonstrating a relatively balanced allocation of emotion labels. While the neutral emotion category is slightly underrepresented, this discrepancy is compensated for during model training by appropriately adjusting the hyperparameters.

Moreover, the distribution of audio file durations per database is depicted in Figure 2 using boxplots that exclude outliers. Notably, the eNTERFACE’05 database exhibits six outliers surpassing the upper threshold of 20 seconds. To preserve the readability of the boxplot representation, these outliers are omitted. However, it is worth mentioning that most files in the eNTERFACE’05 database are shorter than 20 seconds. Consequently, this database remains a foundational component for the prototype.
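
Since the emotions are encoded in the file names in different ways, a small lookup per database is enough to derive uniform labels. The following sketch illustrates this for three of the databases, assuming their publicly documented naming schemes (RAVDESS: numeric code in the third field; Emo-DB: German letter code at the sixth character; TESS: textual suffix). The function name and the unified label set are illustrative assumptions, not the implementation used in this study.

```python
from pathlib import Path

# Numeric emotion codes in the third field of a RAVDESS file name.
RAVDESS = {"01": "neutral", "02": "calm", "03": "joy", "04": "sadness",
           "05": "anger", "06": "fear", "07": "disgust", "08": "surprise"}
# German letter codes at the sixth character of an Emo-DB file name.
EMODB = {"W": "anger", "L": "boredom", "E": "disgust", "A": "fear",
         "F": "joy", "T": "sadness", "N": "neutral"}
# Textual suffixes used in TESS file names ("ps" = pleasant surprise).
TESS = {"angry": "anger", "disgust": "disgust", "fear": "fear", "happy": "joy",
        "neutral": "neutral", "ps": "surprise", "sad": "sadness"}


def emotion_from_filename(database: str, filename: str) -> str:
    """Return a unified emotion label for one audio file (hypothetical helper)."""
    stem = Path(filename).stem
    if database == "RAVDESS":
        return RAVDESS[stem.split("-")[2]]
    if database == "Emo-DB":
        return EMODB[stem[5]]
    if database == "TESS":
        return TESS[stem.split("_")[-1].lower()]
    raise ValueError(f"no naming scheme known for {database!r}")
```

Keeping one table per database makes it easy to add further corpora without touching the parsing of the existing ones.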


Table 1. Overview of the speech audio databases employed.

Database       Number of files  Min. length (s)  Max. length (s)  Average length (s)  Total length (min)  Language  Emotions
RAVDESS        1440             2.94             5.27             3.7                 88.82               English   Neutral, calm, joy, sadness, anger, fear, disgust, surprise
Emo-DB         535              1.23             8.98             2.78                24.79               German    Neutral, joy, sadness, anger, fear, disgust, boredom
TESS           2800             1.25             2.98             2.06                95.91               English   Neutral, joy, sadness, anger, fear, disgust, surprise
EMOVO          588              1.29             13.99            3.12                30.59               Italian   Neutral, joy, sadness, anger, fear, disgust, surprise
eNTERFACE’05   1293             1.12             106.92           3.17                68.37               English   Joy, sadness, anger, fear, disgust, surprise
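
The per-database figures in Table 1 can be derived directly from the audio files. As a rough sketch, assuming uncompressed WAV input that Python's standard `wave` module can read, durations could be collected and summarized as follows; the function names are illustrative.

```python
import wave
from statistics import mean


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file: number of frames divided by the frame rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize(durations: list[float]) -> dict[str, float]:
    """Min, max and mean length in seconds, total length in minutes (as in Table 1)."""
    return {
        "files": len(durations),
        "min_s": min(durations),
        "max_s": max(durations),
        "mean_s": mean(durations),
        "total_min": sum(durations) / 60,
    }
```

Calling `summarize` on the list of durations of one database yields one row of such an overview.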

In recent studies, audio files with a sampling rate of 16000 hertz and mono tracks are primarily used [11,26-28]. As a result, the audio files of our data corpus are transformed to that uniform format.
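
To illustrate what this conversion involves, the following sketch downmixes interleaved samples to mono and resamples them to 16000 Hz. It is written in plain Python for readability and uses simple linear interpolation without an anti-aliasing filter, which is an assumption made for brevity; a production pipeline would normally rely on the resampler of an audio library. The function names are hypothetical.

```python
def to_mono(samples: list[float], channels: int) -> list[float]:
    """Average interleaved channel samples into one mono track."""
    return [sum(samples[i:i + channels]) / channels
            for i in range(0, len(samples) - channels + 1, channels)]


def resample_linear(mono: list[float], src_rate: int, dst_rate: int = 16000) -> list[float]:
    """Resample by linear interpolation (illustrative; no anti-aliasing filter)."""
    if not mono or src_rate == dst_rate:
        return list(mono)
    n_out = round(len(mono) * dst_rate / src_rate)
    out = []
    for k in range(n_out):
        pos = k * src_rate / dst_rate      # position in the source signal
        i = min(int(pos), len(mono) - 1)   # left neighbour
        j = min(i + 1, len(mono) - 1)      # right neighbour (clamped at the end)
        frac = pos - i
        out.append(mono[i] * (1 - frac) + mono[j] * frac)
    return out
```

Applying `to_mono` first and `resample_linear` second gives every file the same channel count and sampling rate, regardless of how it was originally stored.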
Figure 1. Overall distribution of the seven emotions across the databases, visualized with Matplotlib.

A model is developed through algorithms that employ pattern recognition to classify recorded audio inputs into specific activities or emotions. Specifically, two distinct models are utilized: one uses a machine learning algorithm, while the other uses a deep learning algorithm. It is important to note that, as indicated by Shinde and Shah [29], these algorithms are not synonymous; instead, the latter is a subtype of the former. A noteworthy distinction between the two lies in the requirement for specifying hyperparameters in the machine learning algorithm, whereas the deep learning algorithm automatically determines and optimizes these hyperparameters. Once trained, the model is deployed on ambient devices to assess the feasibility of implementing such an application using edge computing. The real-time capability of the prototype is determined by measuring its processing time.
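
A processing-time measurement of this kind can be sketched with Python's `time.perf_counter`. The helpers below are assumptions for illustration: `classify` stands for any callable that runs one classification cycle, and the time budget that counts as real-time is passed in as a parameter rather than fixed in the code.

```python
import time
from statistics import mean


def mean_latency_ms(classify, inputs) -> float:
    """Average wall-clock time of `classify` over all inputs, in milliseconds."""
    timings = []
    for item in inputs:
        start = time.perf_counter()
        classify(item)                     # one classification cycle
        timings.append((time.perf_counter() - start) * 1000)
    return mean(timings)


def is_real_time(latency_ms: float, budget_ms: float) -> bool:
    """A cycle counts as real-time capable if it stays below the given budget."""
    return latency_ms < budget_ms
```

Averaging over several cycles smooths out one-off effects such as model loading or operating system scheduling.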
Figure 2. Boxplot representation of the databases used, without outliers, visualized with Matplotlib.

The implementation of the prototypes in this research is carried out using Python [30]. As mentioned earlier, an essential aspect of Speech Emotion Recognition (SER) is the availability of suitable language data within the system. Consequently, implementing SER necessitates an initial filtering process that distinguishes between speech and non-speech audio. For this purpose, Hershey et al. [31] describe a neural network called YAMNet, which is specifically trained for audio classification using the AudioSet database [32]. YAMNet can distinguish between 521 audio classes from human, animal, machine, and natural sources. This publicly available Convolutional Neural Network (CNN) serves as the upstream filter for SER and is utilized in both methods described in this work. However, since YAMNet does not impact the main SER algorithm developed in this research, further details regarding its functionality or structure are not provided here.

Distinct terminal configurations are employed during the creation and execution of the prototypes. The machine learning and deep learning models are trained on a Windows server. This step focuses on processing the five databases using the respective method, which requires hardware with appropriate performance capabilities. It should be noted that servers do not possess microphone inputs due to their general structure, broad physical localization, and clustering; such inputs are also irrelevant for training purposes. Table 2 lists the hardware components used in the training process.

Once the functional models are prepared, they are transferred to ambient end devices where real-time classification occurs. Typically, these end devices have lower computational power and memory than servers, rendering them unsuitable for training machine learning or deep learning models. However, these devices’ internal processors and microphones are well-suited for executing such models. The performance achieved is contingent upon the specific hardware components of each device. Table 3 presents the ambient terminals employed in this study and their respective specifications. The table encompasses two distinct types of devices, each chosen to represent its category.

The selection of end devices spans various categories, covering multiple operating systems, performance levels, and storage capacities. As a result, the chosen range serves as a representative cross-section of the available ambient end devices.

Due to these end devices’ distinct architectures and operating systems, specific methods and requirements are necessary for utilizing the trained models. The fundamental prerequisites for deploying the ported models on the end devices include the frameworks openSMILE and TensorFlow/TensorFlow Lite alongside the Python programming language.

Table 2. Server hardware used for model training.

Hardware component      Designation
Rack Server             HPE ProLiant DL380 Gen10
Operating system name   Microsoft Windows Server 2016 Standard
Processor               Intel® Xeon® Gold 6226 CPU @ 2.70GHz
Installed memory        256 GB
System type             64-bit operating system, x64-based processor

Table 3. Terminal devices and their components.

ID  Category      Terminal                             Operating system          Processor                          Working memory  Battery
1   Notebook      HP Envy x360 Convertible 15-cn0xxx   64-bit Windows 10 Home,   Intel® Core™ i7-8550U CPU          16 GB           3-cell, 52-Wh, 4.55-Ah,
                                                       version 21H1              @ 1.80 GHz                                         11.55V, Li-ion battery
2   Raspberry Pi  Raspberry Pi 4 Model B including an  32-bit Raspbian           Broadcom BCM2711 (Cortex-A72,      4 GB            External power supply
                  additional USB mini microphone       GNU/Linux 11              ARM v8), 4-core CPU with 1.5 GHz
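
The upstream speech filter described in this section amounts to a gate in front of the emotion classifier. A minimal sketch, assuming the audio classifier returns one score per class for each analysis frame (as YAMNet does) and that the frame scores are averaged before the top class is taken; the three-entry class list in the usage note is a shortened, hypothetical stand-in for the 521 AudioSet classes.

```python
def top_class(frame_scores: list[list[float]], class_names: list[str]) -> str:
    """Average the per-frame scores and return the best-scoring class name."""
    n_frames = len(frame_scores)
    means = [sum(frame[c] for frame in frame_scores) / n_frames
             for c in range(len(class_names))]
    return class_names[means.index(max(means))]


def speech_gate(frame_scores, class_names, speech_label="Speech") -> bool:
    """True if the clip should be passed on to emotion classification."""
    return top_class(frame_scores, class_names) == speech_label
```

With `class_names = ["Speech", "Dog", "Music"]`, a clip whose averaged scores peak at "Speech" passes the gate, while any other clip is discarded before feature extraction.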
To measure and evaluate the performance of the prototypes, appropriate metrics are employed, focusing on real-time capability and classification success rate. Real-time capability is assessed by measuring the response times of the prototypes in seconds. These measurements are conducted on the end devices, commencing immediately after the recording and storing of speech and concluding after classification. It is important to note that the model training and recording time are not considered during this evaluation. However, the exact time frame within which a response must occur to satisfy the criteria for real-time capability in machine processes is not explicitly defined in the literature.

In contrast, the ISO/IEC 2382: 2015 standard defines real-time as the “processing of data by a computer in connection with another process outside the computer according to time requirements imposed by the outside process” (ISO/IEC JTC 2015). Thus, it is apparent from this definition that specifying an exact time in seconds or milliseconds is not feasible. Instead, the external process defines the real-time capability, which may include human perception.

Human perception is sensitive to delays in linguistic communication, as pauses of a few milliseconds can be subjectively perceived as interruptions. Vogt et al. [33] suggest that subjective interruption is perceived after 1000 milliseconds. Zhang et al. [34] report that neural networks for image classification require a range of 15.2 to 184 milliseconds for processing, with input dimensions similar to the Deep Learning method utilized in this study (224 × 224 × 3). Furthermore, Liu et al. [35] state that compressed neural networks require only 103 to 189 milliseconds for processing on ambient devices such as smartphones. Consequently, in this prototyping without employing compressed methods, a measured duration of fewer than 1000 milliseconds is considered real-time.

Confusion matrices are commonly employed for representing and evaluating classification problems in machine learning. These matrices juxtapose the model’s predictions with the actual states. Figure 3 illustrates an example of a confusion matrix, depicting the four potential outcomes. The primary distinction lies in whether the model’s prediction aligns with reality or deviates from it.

Figure 3. Example of a confusion matrix based on Davis and Goadrich [36].

The confusion matrix, also named the four-field matrix, does not represent a key figure in the narrower sense but provides the basis for its creation. Thus, the overall accuracy of the respective machine learning system is calculated from the confusion matrix and represented in decimal numbers, where the value 1.0 represents the maximum, and the value 0.0 is the minimum. The accuracy indicates the proportion of correct predictions among all predictions of the model and is determined using the following formula:

\[
\text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{True Positives} + \text{False Positives} + \text{True Negatives} + \text{False Negatives}}
\]

For comparison, the previously mentioned CNNs are used, whereby the highest accuracy achieved in each case is listed in Table 4. The CNNs mentioned are sorted chronologically by publication date within the table. Besides MobileNetV2, the cited papers do not specify the machine used to generate results. Therefore, for the time being, it is assumed that the results of the CNNs were generated on cloud-like servers, similar to what is described by Tariq et al. (2019). Since a direct comparison of server-generated results with terminal device-generated results is not possible, the subsequent interpretation of the results of this study is limited.

Additional metrics are utilized for neural networks to measure and evaluate training results. These include training and validation accuracy as well as training and validation losses. Training and validation accuracy are represented as decimal values, ranging between 0.0 and 1.0, where


1.0 signifies the highest accuracy. Similarly, training and validation losses are expressed as decimal numbers, with no upper limit but a minimum value of 0.0 representing the optimal loss. Consequently, Table 4 can also serve as a reference for assessing the accuracy of the neural networks in this context. However, in evaluating CNNs, the focus is primarily on accuracy, rendering an evaluation or classification of training losses unnecessary. Accuracy measurement and the creation of confusion matrices occur immediately after training on the server, unlike the measurement of classification time, which takes place on the end devices.

Table 4. Comparison of prediction accuracy of known CNN models.

CNN          Accuracy  Source
LeNet        82%       LeCun et al. 1998 [37]
AlexNet      84.6%     Krizhevsky et al. 2017 [38]
VGG          93.2%     Simonyan and Zisserman 2015 [39]
ResNet-152   96.43%    He et al. 2016 [40]
MobileNetV2  75.32%    Sandler et al. 2018 [41]

However, assessing the accuracy and elapsed time alone is insufficient to achieve the objectives. It is also essential to determine whether the described methods can be executed on ambient end devices and whether the results are comparable. While the training of the models does not occur on the end devices, the classification process does. To evaluate the performance of a general machine learning method on an end device, Liu et al. proposed criteria such as accuracy, delay, memory requirements, and power requirements. Accuracy is assessed using the confusion matrix and the resulting accuracy score, as mentioned earlier. As explained previously, the delay or temporal duration is determined by the model itself and is expressed in seconds, indicating the time required for one classification cycle. Memory requirements are measured in gigabytes, representing the average memory allocation needed for a cycle, calculated by comparing the working memory usage before and during classification. However, measuring the energy consumption of an application directly is not feasible since an application does not exclusively run on a single system. As a result, the strict identification of the energy demand for a specific application is challenging. Energy consumption can be estimated indirectly by measuring processor utilization. Thus, the difference in measured processor utilization, represented as a percentage, before and during classification is utilized as a metric in this context.

The overall accuracy metric is related to the model and, therefore, independent of the hardware utilized. However, metrics such as time, memory consumption, energy consumption, and processor utilization are hardware-dependent. The metrics mentioned above are evaluated using the hardware listed in Table 3 in the subsequent analysis.

4. Results

To implement Speech Emotion Recognition (SER), an upstream filter is required to distinguish between speech and non-speech. Hershey et al. [31] introduced a neural network called YAMNet, trained on the AudioSet database [32], which classifies audio into 521 different audio classes, including human, animal, machine, and natural sounds. YAMNet, a freely available Convolutional Neural Network (CNN), serves as the upstream filter for SER in this work without affecting the main SER algorithm. As YAMNet’s functionality and structure have been described elsewhere, further details are not provided in this study.

The traditional machine learning algorithm employed in this research follows a supervised learning approach using a data corpus of the five mentioned databases. The objective is to generate a model that can be ported to an end device for classification purposes. The openSMILE framework is utilized for feature extraction in this method, while a Support Vector Machine (SVM) is employed as the classifier. The SVM is trained on the eGeMAPS features extracted from the audio files using openSMILE. The extraction and training processes are applied to the entire data corpus rather than individual databases. After extraction, the parameter dataset is normalized by removing the mean and scaling it to unit variance. Subsequently, the normalized dataset is divided into training and test partitions in an 80:20 ratio. The


training partition is used for model training, while the test partition validates the training results.

The relevant hyperparameters for training are optimized and determined by the algorithm itself. Initially, four hyperparameters with specified value ranges are provided. These include the selection of available SVM kernels (polynomial, linear, sigmoid, and radial basis function), a regularization parameter ranging from 10⁻³ to 10², and a degree parameter ranging from zero to nine for the polynomial kernel. The algorithm optimizes and applies various combinations of hyperparameters during training on the training partition. Following the training phase, validation is performed using the training dataset.

Upon completion of training, the machine learning system can classify new, unknown data based on the learned generalization. The system is connected to a microphone, which records human speech in three-second segments at 1024 frames per buffer. The recorded audio is stored locally in a 16-bit WAV format with a sampling rate of 16000 Hz and a mono channel. The stored file is then read by the machine learning system and processed using YAMNet. If the classification result from YAMNet indicates “human speech”, the file is further processed using openSMILE to obtain eGeMAPS features. Similar to the training phase, this dataset is normalized and passed to the SVM for emotion classification. The classification is performed immediately, and the process continues in a continuous three-second cycle, processing the following files until manually terminated.

Figure 4 provides a schematic overview of the mentioned processing steps, indicating the sequence of individual actions. Not all processing steps are executed on the same hardware, and the figure specifies which steps are performed on the server and which on the end device.

Figure 4. Schematic representation of the processing steps of the machine learning method.

The deep learning algorithm is based on the same data corpus to ensure a subsequent like-for-like comparison of both approaches. The goal is to generate an executable model for subsequent porting to the end devices. As an alternative to the machine learning method, a CNN acts as both feature extractor and classifier. In this context, the creation and training of the CNN are based on TensorFlow [42]. Since a CNN expects image files instead of audio files as input, it is first required to generate corresponding representative spectrograms from audio data.

Input for the CNNs is Mel spectrograms derived from the spectrogram audio representations. Speech recordings of different lengths result in spectrograms of various sizes. However, since it is necessary to always use identically sized spectrograms for training the CNN, the audio files must be read in and processed with a fixed window. To ensure a subsequent comparison, the first three seconds of the audio files are read in, of which only two seconds are processed with an offset of half a


second. If an audio file is shorter than three seconds, the content of the file is duplicated until the minimum size is reached. To generate the spectrograms, the final two-second audio file is transformed using Fast Fourier Transform with a window size of 512 milliseconds and a hop size of 256 milliseconds between windows. From this spectrogram, the Mel spectrogram is derived with 128 Mel filters, a minimum frequency of 0 Hertz, and a maximum frequency of 8000 Hertz. Finally, this Mel spectrogram is plotted on a dB scale with 80 dB and the magma color scheme and is available for subsequent classification. The generated dB-scaled Mel spectrogram, including its intermediate stages, is visually presented in Figure 5. This way, the entire data corpus is preprocessed and then split again into a training and a test data set in a ratio of 80 to 20.

The Mel spectrograms must be normalized before they can serve as input data to the CNN. In this method, normalization consists of importing the image files with fixed dimensions of 224 × 224 × 3 pixels and then dividing each pixel value by a factor of 255. The dimensions of 224 × 224 × 3 pixels have been established in CNN-based image recognition since AlexNet, which is why they are also used here. The division by 255 is necessary because neural networks typically operate on input values between zero and one, and thus the pixel values are normalized.

Training a neural network from scratch is computationally intensive, time-consuming, and requires large amounts of data, so transfer learning is used here. Transfer learning for neural networks consists of removing the output layer of a pre-trained neural network and replacing it with new output layers of its own, which act as classifiers. MobileNetV2, which is pre-trained on ImageNet^a, is this method’s base model for transfer learning. MobileNetV2 was initially designed for object recognition and execution on mobile devices. The base model is then extended by three additional output layers: a GlobalAveragePooling2D layer, a dropout layer with a rate of 0.2, and a fully connected layer with a softmax activation function, which is used when the number of classes is greater than two. The neural network is then optimized using the Adam optimization algorithm [43] and an initial learning rate of 10⁻⁵. Furthermore, categorical cross-entropy is used as a loss function, which quantifies the difference between two probability distributions in prediction. Finally, the model is trained in two phases of equal size to apply transfer learning. First, the new model is trained for 50 initial epochs with the 154 layers of the base model and their weights untrainable, which transfers the experience of the base model to the task. Only the three newly added layers are trainable in this phase. Subsequently, the model is trained for another 50 epochs, this time with 54 trainable and 100 untrainable layers, which is called fine-tuning in the corresponding literature. Each epoch is run with 100 training steps and ten validation steps. The training and validation data are read into the model training with a batch size of 16.

^a ImageNet is an image database consisting of over 15 million labeled and high resolution natural images with approximately 22000 categories.

Figure 5. Mel spectrogram generation in individual steps visualized with TensorFlow and Matplotlib.

Figure 6 schematically visualizes the described sequence of the deep learning process. In addition to the individual processing steps and their arrangement, it is also apparent here which processing steps are executed on the server or the end device. The


similarities and differences between this graphic and Figure 4 become apparent. The diagrams outline the appropriate processing steps, sequence, and execution location. Furthermore, the optimization of the hyperparameters and the general parameterization of the models are part of the training and are, therefore, not listed in either diagram. It can also be seen from the comparison that additional work steps are necessary for the deep learning method before the neural network training is started. The effects of the extra steps on the result will be discussed later.

Figure 6. Processing steps of the deep learning method.

While describing the results of both prototypes, a distinction is made between the generation of the executable model, including its training, and the real-time classification by the same model.

The prototype is a supervised machine learning method using a support vector machine as a classifier for emotion determination. The algorithm is trained on the five databases to generate an executable and portable model. The training of this algorithm, including the optimization of the hyperparameters, takes about 96 hours. The hyperparameters selected and optimized by the algorithm are the radial basis function kernel, the regularization parameter with a value of 10¹, the gamma with 10⁻², and the degree parameter with a zero value.

The accuracy after the model training is indicated by the confusion matrix, shown in Figure 7, for the test partition of the trained model and the classification report based on it. In the former, the seven considered emotions are listed vertically on the left edge as actual values and horizontally on the bottom edge as values predicted by the model. Furthermore, it can be seen from the marked green fields that the model’s prediction agrees with the actual values in most cases. Those correct predictions represent the true positives. The remaining whitish areas represent the false positives since the predicted emotion classes do not match the real-world conditions.

Figure 7. Confusion matrix of the machine learning process.

The model’s overall accuracy is calculated from the confusion matrix using the above formula, which is


already included in the classification report and visualized in Figure 8, together with other metrics. For example, in addition to the overall accuracy rates, accuracy rates for individual emotions are also present. Figure 8 shows that the overall accuracy of this procedure is 0.77. With an achieved value of 0.77 and 77%, respectively, the model is ranked between the CNN MobileNetV2 and LeNet based on Table 4. The confusion matrix and the classification report are valid for the machine learning model and thus independent of the end device used.

Figure 8. Classification report of the machine learning method.

An exemplary metrics measurement is performed on the notebook mentioned in Table 3. The estimated time is measured within the model between two cycles. A cycle consists of a speech recording, an audio classification, and, depending on its result, an emotion classification and its output. The average estimated time for 15 observed cycles is 0.799 seconds. For comparison purposes, processes without audio classification are also performed, where emotion classification is applied to each incoming audio signal. The average time required here is 0.114 seconds, calculated from 15 observed cycles.

In conclusion, based on the criteria set, 144 milliseconds for emotion classification alone and 799 milliseconds for emotion classification, including previous audio classification, are declared to be real-time capable. The memory requirement increases from 9.8 gigabytes to 10.1 gigabytes after starting the classification, which corresponds to an increase from 62% to 64% utilization. On the other hand, the processor utilization increases by 17 percentage points after starting the classification, from an average of 11% to around 28%.

Furthermore, measurement is also performed on the Raspberry Pi. Here, the arithmetic mean of 15 observed cycles is 4.33 seconds for audio and emotion classification. Also, at this point, the result is compared with a pure emotion classification without prior audio classification. This cycle takes an average of 0.337 seconds, calculated from 15 observed cycles. In conclusion, based on the set benchmarks, the emotion-only classification is declared real-time capable, but the combined audio and emotion classification with a time of 4330 milliseconds is not. Further memory measurement indicates an increase in memory usage after starting classification from 415 megabytes to 586 megabytes, an increase of 4.4 percentage points for a total availability of 3838 megabytes, from 10.8% utilization before to 15.2% utilization during classification. On the other hand, processor utilization increases by 26.4 percentage points during execution, from 0.7% utilization before to 27.1%.

The second method described in 4.3, a CNN, is used based on the pre-trained MobileNetV2 network. The CNN developed in this method is also trained with the same intention on the five databases. The training time of the neural network with a total of 100 epochs is about six hours. As described above, the training proceeds in two phases of 50 epochs each, one phase for initial learning and one phase for fine-tuning the model. Following each training epoch, the training and validation accuracy and the training and validation loss are reported. The preliminary result after the first 50 epochs is graphically visualized in Figure 9. On the left side is the training and validation accuracy course. The training and validation loss, each for 50 epochs, is shown on the right side. Each of these four parameters is shown as a separate curve.

The accuracy curve shows that the training accuracy starts at around 0.16 and increases to approximately 0.42 by the 50th epoch. The validation accuracy also starts at about 0.16 and reaches an accuracy of about 0.5 at the 50th epoch. It is noticeable here that the validation accuracy is always above the training accuracy. This phenomenon is due to the peculiarity of transfer learning. When training a neural


network without transfer learning, the training accuracy is always above the validation accuracy.

Figure 9. Training result of the CNN after initial 50 epochs with transfer learning.

Similar behavior can be observed in the course of the loss curve. The training loss curve starts at a loss of around 2.3 and drops to about 1.55 by completing the 50th epoch. On the other hand, the validation loss curve begins at 2.2 and drops to about 1.4 by the 50th epoch. Once again, it is characteristic of transfer learning that the validation curve always lies below the training curve.

During the subsequent fine-tuning of the CNN, more trainable layers and, thus, more trainable weights are available. The model therefore has more possibilities to optimize performance. The result of the fine-tuning is illustrated in Figure 10.

Figure 10. Training result of the CNN after 100 epochs using transfer learning.

Fifty further fine-tuning epochs supplement the result in the previous Figure 9. A vertical straight line at the 50th epoch shows at which point the fine-tuning phase starts. After the beginning of the fine-tuning stage, the training accuracy curve drops to 0.35 but then takes a steeper course than before and reaches the maximum of 1.0 from the 90th epoch, at which point the curve stagnates until the 100th epoch. On the other hand, the validation accuracy curve initially drops to around 0.45 after the start of the fine-tuning phase. Still, it rises again and reaches an accuracy rate of about 0.7 by the 100th epoch.

A change can also be observed in the loss curves after the start of fine-tuning. For example, the training loss curve rises to 1.75 after the start of fine-tuning but then drops more sharply than before, reaching a value of 0.1 at the 100th epoch. The validation loss curve does not rise after the start of fine-tuning but drops to a value of 0.8 by the 80th epoch, where the local minimum of the curve is located. By completing the 100th epoch, the curve rises to about 1.0.

Ultimately, it is not the training accuracy but the validation accuracy that is decisive for the correct classification. With an accuracy of 0.7 and 70%, respectively, this result ranks below MobileNetV2 based on Table 4.

Analogously, the CNN of this method is trained a second time on the five databases but without applying transfer learning. In this training, the CNN is also trained for 100 epochs, but in only one phase and with all layers trainable. With its 100 epochs, this training has a running time of about six hours, as before. The result of this training of 100 epochs without transfer learning is visualized in Figure 11. Here, it can be seen that the training accuracy curve starts at a value of 0.22 and rises to the maximum of 1.0 by the 50th epoch. There the curve remains until the 100th epoch. The validation accuracy curve also starts at a value of 0.22 and increases to 0.7 by the 50th epoch, where it remains with fluctuations until the 100th epoch. The training loss curve begins at the value of 2.0 and steadily decreases until it reaches the minimum of 0.0 at about the 70th epoch and remains there.

On the other hand, the validation loss curve starts with a value of about 2.2 and falls with a fluctuating downward trend until the 35th epoch. There, the curve has reached its local minimum of 0.9. However, the curve rises again until the 100th epoch to about 1.2.

The advantages of transfer learning become apparent when comparing Figure 10 with Figure 11. Counting from the start of fine-tuning, the validation accuracy curve has already reached the value of 0.7 after 20 epochs. In contrast, the validation accuracy curve without transfer learning has only reached this value after about 50 epochs. The advantage can also be seen in that the validation accuracy curve for the method with transfer learning has a higher slope than the validation accuracy curve without transfer learning. Finally, it can be seen that the validation accuracy curve for the process with transfer learning starts higher on the Y axis, with a value of 0.45, than the curve without transfer learning, with a value of 0.22.

In contrast to an SVM, the classification result in a neural network does not output a single value but an array with seven entries corresponding to the number of classes present. The entries in this array represent the probabilities with which the model predicts one class each. The individual entries

Figure 11. Training result of the CNN after 100 epochs without using transfer learning.
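
As a minimal sketch of the output format described in this passage (seven class probabilities that sum to 1.0, with the largest entry giving the classified emotion), the following Python snippet converts raw network scores into probabilities and selects the most probable class. It assumes a softmax output layer, which the text implies but does not name. The label order in EMOTIONS, the helper names softmax and classify, and the example scores are hypothetical and chosen only for illustration; the prototype itself runs a TensorFlow CNN rather than this standalone function.

```python
import math

# Hypothetical class order: the paper does not state the index-to-label mapping.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise", "neutral"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1.0."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable emotion and the full probability vector."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs


# Illustrative scores only, not model output from the study.
label, probs = classify([0.2, -1.0, 0.1, 2.3, 0.4, 0.9, 1.1])
print(label, [round(p, 3) for p in probs])
```

Because the full probability vector is returned alongside the top label, the same sketch could also be used to inspect the second or third most probable emotion for a recording.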

can assume a value between 0.0 and 1.0, and the sum of all entries in the value range is again 1.0. The emotion class with the highest probability value is the classified emotion.
An exemplary metrics measurement is also performed on the notebook listed in Table 3. Here, too, the time is measured between two cycles, where one cycle consists of the audio and emotion classification, including the output. The arithmetic mean of the measured time is 0.856 seconds over 15 observed cycles. For comparison, a cycle without prior audio classification takes 0.119 seconds, also averaged over 15 observed cycles. At 119 milliseconds for a cycle without audio classification and 856 milliseconds for a cycle with audio classification, both results are below the set benchmarks and are therefore considered real-time capable. The memory requirement increases from 9.9 gigabytes to 10.6 gigabytes once classification starts. Relative to the total available memory, this is an increase of 4 percentage points, from 63% to 67% utilization. The processor utilization also increases, by 16 percentage points, from 15% to 31%.
For the implementation of the prototype on the Raspberry Pi, the average time required for a cycle, including audio and emotion classification, is around 4.43 seconds, calculated from 15 observed cycles. In comparison, emotion recognition without prior audio classification requires an arithmetic mean of only 0.393 seconds, again calculated from 15 observed cycles. At 393 milliseconds, the latter result is below the set benchmarks, but the former result of 4427 milliseconds is not. Thus, the implementation of the prototype on the Raspberry Pi lacks real-time capability for the full cycle. The memory used increases from 563 megabytes to 675 megabytes during runtime. With 3838 megabytes available in total, this is a rise of 2.9 percentage points, from 14.7% to 17.6% utilization. Processor utilization increases by 22 percentage points during execution, from 2.9% to 24.9%.
In this study, four core findings are to be noted. First, a tabular comparison of the results of the two methods is provided in Table 5, where the listed metrics represent the arithmetic mean across all measurements.

Table 5. Comparison of the results of the two methods.

Metrics                                              Machine learning method    Deep learning method
Training duration                                    96 hours                   6 hours
Accuracy                                             77%                        70%
Working memory requirement increase                  10.7 percentage points     3.45 percentage points
Processor load increase                              21.7 percentage points     19 percentage points
Time consumption, emotion classification             225.5 milliseconds         256 milliseconds
Time consumption, audio and emotion classification   2565 milliseconds          2642 milliseconds

It can be deduced from the previous table that an SER system can distinguish between speech, non-speech, and silence. To this end, the YAMNet neural network, which is not a primary component of this work and was not developed within this research, is used within the prototypes. Nevertheless, the YAMNet neural network is part of both prototypes, which are thus able to classify audio inputs into different categories, such as music, meowing, barking, silence, or speech.
Concerning the databases used in this research, it was shown that they contain various emotional audio files, including the six basic emotions mentioned by Ekman (1971), plus further emotional stimuli such as tiredness or boredom. Neutral emotion can also be found in the majority of the databases. The prototypes trained on these databases are thus able to distinguish between the seven emotions. Therefore, an SER system can distinguish between positive, negative, and neutral emotions but is not limited to this. Instead, such a system can perform a more detailed categorization of speech input into individual emotions with an accuracy of 77% for machine learning and 70% for deep learning.
The third finding relates to the feasibility of an SER system on ambient terminals, which must be differentiated by phase and by the ambient end devices listed in Table 3. Due to the intensive computing power and high runtime, the one-time model training step must be executed on a server. It is therefore not feasible on an end device. The subsequent real-time classification phase is based on the trained model and can be performed multiple times on a terminal device. Porting the prototypes to a notebook is feasible since notebooks generally support corresponding Python runtime environments. Thus, running the emotion classification is possible on a notebook regardless of the method used. Porting the prototypes to a Raspberry Pi, on the other hand, is more complex: Python runtime environments are supported in principle, but the necessary frameworks, openSMILE and TensorFlow, are not available for the Raspberry Pi. For TensorFlow, the Deep Learning procedure is instead ported with TensorFlow Lite, which runs the trained model on the end device. Because openSMILE is not compatible with 32-bit operating systems and no alternative of comparable quality is available, the porting of the machine learning procedure is omitted at this point. In summary, it can be stated that realizing an SER system using edge computing is only possible to a limited extent. While end devices support the execution of deep learning approaches and neural networks, this does not always apply to the machine learning method.
Regarding the real-time capability of the classification system, it is necessary to differentiate which method is used and whether only SER or SER plus prior audio classification is considered. Concerning the machine learning method, the SER system requires an average of 114 milliseconds for pure SER without prior audio classification and is thus below the set benchmark of 1000 milliseconds. Additionally, the SER including initial audio classification, with an average time of 799 milliseconds, is also below the 1000-millisecond benchmark. For the Deep Learning method, the average time for a cycle without audio classification is 256 milliseconds. Meanwhile, the average time for a cycle with audio and emotion classification is 2642 milliseconds. According to the results, the fourth finding is that the choice of the method determines whether real-time capability is given or not. However, since the machine learning method was not ported and therefore produced no results on the Raspberry Pi, this last finding could still be refuted.

5. Discussion
Both machine learning and deep learning approaches in this study rely on a shared data corpus, which is obtained and selected based on predefined criteria outlined in the existing literature. These criteria encompass several factors, including a minimum audio duration of one second and a maximum duration of 20 seconds. The choice of one second as the minimum duration is justified because shorter audio files generally lack sufficient information. Conversely, selecting a maximum duration of 20 seconds is somewhat arbitrary, as durations of 10, 15, or 30 seconds could also have been considered. However, as depicted in Figure 2, most audio files in the chosen databases have durations of less than 10 seconds.
Another criterion is the exclusion of non-spoken sentences since the prototypes focus on speech emotion recognition (SER) rather than general audio classification. Therefore, the audio files must exclusively contain speech, even though some music may include spoken segments accompanied by instruments.
Furthermore, it is essential to note that this study does not address emotion recognition in music, although it could be a potential avenue for future research. Including audio files with background noise is essential, as real-life communication often occurs in noisy environments. While background noises are prominent in music, they play a secondary role in speech-related scenarios. Therefore, incorporating
databases containing audio data with such background noise would be valuable for enhancing the research.
The criteria specify that the files must be in an auditory or audiovisual format, but there are no restrictions regarding file type, sampling rate, or dubbing. Although limiting the selection to purely auditory file formats may influence the choice of databases, it would not affect the subsequent procedures, as all audio files are converted to a standardized format and file type before model training.
The native language used in the audio files is also not a selection criterion, as indicated in Table 1, which shows the inclusion of German, English, and Italian data in the selected databases as well as audio files in other languages such as Turkish, Danish, or Chinese. Since the six basic emotions described by Darwin (1873) and Ekman (1971) are expressed similarly across cultures, the spoken language does not significantly affect the Mel spectrograms, model training, or results. However, it is crucial to include both male and female voices when selecting databases. Failing to meet this criterion could impede generalization and lead to overfitting or underfitting of the model.
Open accessibility and availability of labeled data are mandatory for data collection. Without open access to the databases, it would be impossible for third parties to reproduce the procedures and results of this study. Moreover, the absence of labeled data would render supervised machine learning algorithms infeasible. Investigating the impact of different database quantities or compositions, including language variations, on the outcomes of this research can be pursued in future investigations.
Other methodological choices that equally affect both procedures include dividing the data corpus into training and validation sets using an 80 to 20 ratio. Preprocessing the audio files commonly entails converting them to a 16,000-hertz format with a mono channel, as frequently reported in the literature.
In the machine learning method, the hyperparameters and their value ranges are defined at the start of the training process. The algorithm then independently determines the optimal values within these predefined ranges. The selection of these hyperparameters is based on the research conducted by Mao et al. [27]. However, alternative parameters or value ranges described by Wang et al. [44] or Cummins et al. [45] could also be considered. Feature extraction uses the openSMILE framework, specifically the eGeMAPS feature set, which is in line with its use in the work of Cummins et al.
In contrast, the deep learning method employs explicit hyperparameters. The training process consists of two phases, each comprising 50 epochs, as established by Tan et al. Alternatively, Zhang et al. utilized a batch size of 30, SGD as the optimization algorithm, and a learning rate of 10^-3 as hyperparameters. However, standard hyperparameters employed by Lim et al. include SGD with a learning rate of 10^-2, a dropout of 0.25, and a Rectified Linear Unit activation function. Discrepancies also exist in the generation of Mel spectrograms, as mentioned in section 2.3.3 and the relevant literature. For instance, Zhang et al. used 64 Mel filters for audio classification within a frequency range of 20 to 8000 hertz, utilizing a 25-millisecond Hamming window with a ten-millisecond overlap for each window. The variation in Mel spectrogram generation can be justified since speech emotion recognition (SER) and speech recognition are distinct processes, as noted by Zhang et al. Nonetheless, the use of Mel spectrograms aligns with the current state of research. Alternatively, MFCC can also be applied within the deep learning procedure.
The base model used for transfer learning in this study is MobileNetV2. However, the literature also suggests CNNs such as ResNet50 or SqueezeNet [46]. In this study, only the last 54 layers out of 154 are fine-tuned. Optionally, different numbers of trainable layers, such as 32 or 16, can be considered. Exploring the impact of modifications to these hyperparameters on the results can be a subject of future research.
A preference for one method over the other cannot be derived solely from comparing the approaches and their results. Table 5 does
not provide conclusive evidence to support the dominance of either procedure. Examining the training time reveals that the Machine Learning approach requires approximately 16 times the duration of the Deep Learning approach. However, the Machine Learning model exhibits higher accuracy, faster classification, and lower increases in processor load and memory requirements.
The speed advantage of the machine learning method in real-time classification arises from the use of distinct emotion recognition algorithms. In this case, audio classification is not considered, as it is identical in both approaches and precedes emotion recognition. In the Machine Learning model, speech input undergoes openSMILE processing, normalization, and subsequent classification using SVM. Conversely, in the Deep Learning model, the speech input is first transformed into a spectrogram, stored, normalized, and processed through all neural network layers. The storage and retrieval of spectrograms involve additional read and write operations that are not required in the machine learning method, which slows down the Deep Learning model. However, the difference in speed is marginal and imperceptible to humans, as it amounts to only milliseconds.
Furthermore, Table 5 highlights that pure emotion classification operates, on average, roughly ten times faster than the combined audio and emotion classification, regardless of the chosen method. This discrepancy significantly influences the assessment of real-time capability, particularly concerning porting to the Raspberry Pi. While pure emotion classification can be deemed real-time capable, the same cannot be said for the combined type due to its extended runtime. The disparity likely stems from the YAMNet CNN employed for audio classification, which was developed externally and falls outside the scope of this study. Consequently, a comprehensive analysis of the time difference and its origin cannot be provided. Optimizing the audio classification, which is not extensively examined in this paper, could therefore considerably reduce the overall process latency.
When examining the individual process steps in Figures 4 and 6, no direct conclusion regarding training duration is apparent. However, the Deep Learning method necessitates more process steps than the Machine Learning method. As mentioned earlier, the models’ parameterization is not depicted in these figures, as it is part of the mapped training. Specifically, the choice between fixed hyperparameters and ranges of hyperparameter values significantly impacts training time. In the Deep Learning method, training duration also varies with parameters such as batch size, number of epochs, and number of steps per epoch. In the Machine Learning method, training duration is determined by the number of hyperparameters and their value ranges. The resulting hyperparameters are determined through the algorithm’s processing and optimization of potential combinations. In contrast to explicit parameterization, processing all conceivable combinations is likely responsible for the disparity in training duration. This also suggests that the higher overall accuracy of the machine learning model compared with the deep learning model is due to the processing of all combinations.
Moreover, the results in Figure 10 reveal that the validation loss curve increases again after epoch 80. A similar phenomenon is observed in Figure 11 from epoch 50 onwards. However, the validation accuracies in these figures do not exhibit the same increase. The rising course of these loss curves may indicate overfitting, which warrants further investigation in future research.
The higher memory requirement in the Deep Learning model is attributed to the necessity of storing the generated spectrogram, in addition to the primary audio file, for emotion recognition. Another contributing factor is the higher number of parameters in the CNN than in an SVM, which are also stored in memory.
Contrary to our expectations, the Machine Learning method exhibits higher processor utilization than the Deep Learning method. This phenomenon is likely due to the improper timing of the recording. However, considering the marginal variances, the
2.7 percentage-point difference in processor utilization between the two methods is negligible.
It should be noted that both models consider the presence of a microphone as an essential prerequisite. Unlike multimodal emotion recognition, a unimodal SER system does not require additional provisions such as cameras. Most ambient terminals are equipped with native microphones but lack native cameras, as seen in smart speakers or smart TVs. Therefore, the prototypes developed in this study are suitable for porting to such devices.
Regarding the theoretical foundations of this research, SER plays an increasingly crucial role in human-computer interaction (HCI). HCI occurs within the context of remote participation, which is a component of the growing computer-supported hybrid communication in everyday life. Consequently, SER holds greater significance in everyday life and is the subject of ongoing research. Similar to this study, there are investigations into real-time SER [33] and edge computing. However, no research exists on SER applications on edge devices as defined by Shi et al. (2016). Thus, the combination of SER, edge computing, and real-time processing explored in this study represents a novel research extension. To maintain the focus of this work, restrictions are deliberately made. However, other external limitations also limit this work, which will be explained in more detail below. Following Ekman (1971), only the six basic emotions plus a seventh, neutral emotion are considered, which is why emotions such as tiredness or boredom are excluded in this work. Accordingly, the data acquisition is restricted to these seven emotions, further limiting the selection of suitable databases.
Furthermore, the dimensions of arousal and valence are also omitted. These dimensions can be considered in continuing work but do not play a role in the pure emotion recognition of this research. This study therefore acknowledges that these dimensions exist but does not address them further.
A technical limitation, however, is the mapping of processor and memory utilization. Since the system constantly updates these two indicators, it is impossible to identify the exact utilization. Thus, the documented processor and RAM utilization only represents a snapshot, not a calculated average value. Furthermore, the maximum number of simultaneously recognizable emotions is another technical limitation. This paper assumes that only one emotion is contained in a sentence or voice recording. As the length of the sentences and audio files increases, the probability that multiple emotions are contained increases. However, the machine learning method using the SVM can only classify one emotion, which is why the length of the voice recording is limited to three seconds.
On the other hand, the CNN in the Deep Learning method calculates a probability for each of the seven emotions. For this reason, this method can potentially identify multiple emotions within one speech recording. Another technical limitation is the applicability of the prototypes to only one person. The model training is based on emotional content in the audio files of the acquired databases. A single individual can be heard in each audio file, so the prototypes can apply emotion recognition only to individuals. When multiple individuals speak simultaneously, the prototypes cannot distinguish between individuals and their emotions. The extension to multi-person recognition goes beyond the definition of these prototypes and therefore needs to be investigated in further work.
Furthermore, the porting of prototypes is also limited. Only two device categories were selected since porting to more devices would exceed the scope of this paper. For this reason, porting to smartphones or tablets, for example, is not included.

6. Conclusions
The outcomes and interpretations presented in this study provide compelling evidence that the developed prototypes are functional and well-suited for practical applications. These Speech Emotion Recognition (SER) systems have the potential for various use cases and can offer extensions to existing products and services. Here are some examples of application areas and their resulting benefits.
(i) Universal Application: SER applications can be utilized wherever speech plays a central role, such as call centers, radio broadcasts, podcasts, and television shows. New business models can be developed that merge physical and virtual presence. For instance, an SER system implemented in a smart speaker can detect vocal activity and emotions in a home environment. These emotions can be visually presented to the user and, with their consent, transmitted to the producer for product optimization, offering the user a premium in return. Similarly, SER can automate the editing of highlights in a broadcast sports game based on detected emotions. Such scenarios can be extended to internet-based broadcasts like Twitch or Netflix.
(ii) Real-time Audience Mood Capture: SER applications can capture the current mood of an audience in real time. Unlike the previous use case, where emotions are summarized over time, this approach focuses on determining emotion levels at precise moments. This can be valuable in political talks or product presentations, where immediate feedback on expressed emotions is crucial. By providing unbiased input to speakers, SER enables them to gauge audience response accurately. These techniques apply to physical, virtual, or hybrid forms of communication, further emphasizing the increasing trend of remote participation.
(iii) Individual-focused Applications: SER applications can cater to individual users, tailoring experiences based on their detected emotions. For example, a smart speaker or automobile with an SER system can adjust music or lighting according to the user’s emotional state. In gaming, the algorithm can offer in-game relief when anger is detected. Individualized advertising in social media or e-commerce platforms, varying prices dynamically based on emotional states (e.g., increasing costs for joyful emotions), is also a potential application.
This research investigated the ability of an SER system to distinguish speech, non-speech, and silence, as well as to classify different emotions. The study involved a systematic literature review, the development of two prototypes using machine learning and deep learning approaches, and the training of the models using a data corpus comprising five audio databases. Before being used for model training, the audio files underwent preprocessing, including conversion to a sampling rate of 16000 Hz and a mono channel.
In the machine learning approach, the openSMILE framework was employed for feature extraction, generating eGeMAPS features that were normalized and used for classification. A Support Vector Machine (SVM) served as the classifier. The model training took approximately 96 hours on a server, and while porting to a notebook was successful, porting to a Raspberry Pi was not. The prototype demonstrated the capability to identify different sounds in under 1000 milliseconds and to classify seven emotions in the case of speech.
In the deep learning approach, audio files were transformed into Mel spectrograms, normalized, and used as input for a CNN implemented using TensorFlow. The CNN performed feature extraction and classification. The model training, including transfer learning, took around six hours on the server. The completed model was successfully ported to a notebook and a Raspberry Pi. The notebook achieved classification below 1000 milliseconds, while the Raspberry Pi required approximately 4427 milliseconds. The models’ computation time and classification accuracy were evaluated using the provided formula.
SER systems are embedded in Human-Computer Interaction (HCI) systems and can potentially be applied in everyday scenarios. The results of this study demonstrate that practical implementation is technically feasible, and several use cases described in this research can find real-world applications. Moreover, these findings highlight the growing relevance of SER in everyday communication, where remote participation is increasingly combined with a physical presence. Such SER systems have the potential to enhance human-machine interaction, making communication more human-like and intuitive. Based on this research, the acceptance and utilization of SER-enabled remote participation applications can be considered an extension of the fourth criterion of emotion recognition.
The results and discussions presented in this study can be further enriched and expanded through future research. Additional investigations could explore other emotions or broaden the scope of the utilized databases. Furthermore, examining arousal and valence dimensions would be valuable. Investigating the machine’s subsequent actions linked to recognized emotions within the SER system is another avenue worth exploring. For example, it could be interesting to study the most suitable lighting settings, color combinations, or music choices to support or counteract specific recognized emotions. This could involve studying music’s genre, volume, and beat rate and its relation to emotion recognition within songs. Combining both approaches, selecting songs based on identified emotions and playing them in response to human emotions, could provide an intriguing direction for further exploration.
Further research could also focus on the Deep Learning method, exploring different hyperparameters for model training and investigating modified transfer learning techniques. Multitask or semi-supervised learning could offer new perspectives in advancing SER research. Additionally, the limitations identified in this study open up opportunities for independent research and raise further questions. For instance, future work could investigate whether an SER system can differentiate between multiple individuals based on speech or identify several emotions within a sentence. Exploring subjective user perception and experience could also be valuable. Lastly, the prototypes developed in this study were ported to two device categories, raising the question of whether they can be extended to other device categories, such as smartphones or tablets, each with its diverse range of devices and operating systems.
Regarding the real-time capability of the prototype, it would be worthwhile to explore the use of digital signal processors and their potential for optimizing runtimes. Utilizing digital signal processors optimized for real-time functions like the Fast Fourier Transform in mobile devices like the Raspberry Pi could further enhance the prototypes’ real-time capability and overall performance.
In conclusion, this study demonstrates the theoretical and practical feasibility of real-time speech-based emotion recognition through edge computing. The implications of this research extend to practical applications and provide a foundation for future investigations.

Author Contributions
D.E.d.A. conceived the idea of researching a real-time processing method that captures and evaluates emotions in speech. R.B. and D.E.d.A. conceived the study. R.B. served as D.E.d.A.’s graduate advisor on his graduate thesis at the FOM University of Applied Sciences. All authors reviewed and approved the final manuscript.

Conflict of Interest
There is no conflict of interest.

Funding
This research received no external funding.

References
[1] El Ayadi, M., Kamel, M.S., Karray, F., 2011. Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern Recognition. 44(3), 572-587.
DOI: https://doi.org/10.1016/j.patcog.2010.09.020
[2] Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., et al., 2001. Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine. 18(1), 32-80.
DOI: https://doi.org/10.1109/79.911197
[3] Schuller, B.W., 2018. Speech emotion recognition: Two decades in a nutshell, benchmarks, and ongoing trends. Communications of the ACM. 61(5), 90-99.
DOI: https://doi.org/10.1145/3129340
[4] Kraus, M.W., 2017. Voice-only communication enhances empathic accuracy. American Psychologist. 72(7), 644.
DOI: https://doi.org/10.1037/amp0000147
[5] Akçay, M.B., Oğuz, K., 2020. Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers. Speech Communication. 116, 56-76.
DOI: https://doi.org/10.1016/j.specom.2019.12.001
[6] Dincer, I., 2000. Renewable energy and sustainable development: A crucial review. Renewable and Sustainable Energy Reviews. 4(2), 157-175.
DOI: https://doi.org/10.1016/S1364-0321(99)00011-8
[7] Chao, K.M., Hardison, R.C., Miller, W., 1994. Recent developments in linear-space alignment methods: A survey. Journal of Computational Biology. 1(4), 271-291.
DOI: https://doi.org/10.1089/cmb.1994.1.271
[12] Nassif, A.B., Shahin, I., Attili, I., et al., 2019. Speech recognition using deep neural networks: A systematic review. IEEE Access. 7, 19143-19165.
DOI: https://doi.org/10.1109/ACCESS.2019.2896880
[13] Schuller, B., Batliner, A., Steidl, S., et al., 2011. Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge. Speech Communication. 53(9-10), 1062-1087.
DOI: https://doi.org/10.1016/j.specom.2011.01.011
[14] Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Computation. 9(8), 1735-1780.
DOI: https://doi.org/10.1162/neco.1997.9.8.1735
[15] Khalil, R.A., Jones, E., Babar, M.I., et al., 2019. Speech emotion recognition using deep learning techniques: A review. IEEE Access. 7, 117327-117345.
DOI: https://doi.org/10.1109/ACCESS.2019.2936124
[16] Hinton, G., Deng, L., Yu, D., et al., 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine. 29(6), 82-97.
DOI: https://doi.org/10.1109/MSP.2012.2205597
[8] Abbas, N., Zhang, Y., Taherkordi, A., et al., 2017. Mobile edge computing: A survey. IEEE
Internet of Things Journal. 5(1), 450-465. [17] Torrey, L., Shavlik, J., Walker, T., et al., 2010.
DOI: https://doi.org/10.1109/JIOT.2017.2750180 Transfer learning via advice taking. Advances in
[9] Cao, K., Liu, Y., Meng, G., et al., 2020. An over- machine learning. Springer: Berlin.
view on edge computing research. IEEE Access. DOI: https://doi.org/10.1007/978-3-642-05177-7_7
8, 85714-85728. [18] Eyben, F., Scherer, K.R., Schuller, B.W., et al.,
DOI: https://doi.org/10.1109/ACCESS.2020. 2015. The Geneva minimalistic acoustic pa-
2991734 rameter set (GeMAPS) for voice research and
[10] Shi, W., Cao, J., Zhang, Q., et al., 2016. Edge affective computing. IEEE Transactions on Af-
computing: Vision and challenges. IEEE Inter- fective Computing. 7(2), 190-202.
net of Things Journal. 3(5), 637-646. DOI: https://doi.org/10.1109/TAFFC.2015.2457417
DOI: https://doi.org/10.1109/JIOT.2016.2579198 [19] Ekman, P., 1971. Universals and cultural differ-
[11] Lin, Y.L., Wei, G. (editors), 2005. Speech emo- ences in facial expressions of emotion. Nebraska
tion recognition based on HMM and SVM. 2005 Symposium on Motivation. University of Ne-
International Conference on Machine Learning braska Press: Nebraska.
and Cybernetics; 2005 Aug 18-21; Guangzhou, [20] Siedlecka, E., Denson, T.F., 2019. Experimental
China. New York: IEEE. methods for inducing basic emotions: A qualita-
DOI: https://doi.org/10.1109/icmlc.2005.1527805 tive review. Emotion Review. 11(1), 87-97.

43
Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023

DOI: https://doi.org/10.1177/1754073917749016 works. IEEE Transactions on Multimedia. 16(8),


[21] Livingstone, S.R., Russo, F.A., 2018. The Ryer- 2203-2213.
son Audio-Visual Database of Emotional Speech DOI: https://doi.org/10.1109/TMM.2014.2360798
and Song (RAVDESS): A dynamic, multimod- [28] Tzirakis, P., Zhang, J., Schuller, B.W. (editors),
al set of facial and vocal expressions in North 2018. End-to-end speech emotion recognition
American English. PloS One. 13(5), e0196391. using deep neural networks. 2018 IEEE Inter-
DOI: https://doi.org/10.1371/journal.pone.0196391 national Conference on Acoustics, Speech and
[22] Burkhardt, F., Paeschke, A., Rolfes, M., et al. Signal Processing (ICASSP); 2018 Apr 15-20;
(editors), 2005. A database of German emotion- Calgary, AB, Canada. New York: IEEE.
al speech. 9th European Conference on Speech DOI: https://doi.org/10.1109/ICASSP.2018.8462677
Communication and Technology; 2005 Sep 4-8; [29] Shinde, P.P., Shah, S. (editors), 2018. A review
Lisbon, Portugal. of machine learning and deep learning applica-
DOI: https://doi.org/10.21437/interspeech.2005-446 tions. 2018 Fourth International Conference on
[23] Choudhury, A.R., Ghosh, A., Pandey, R., et Computing Communication Control and Au-
al. (editors), 2018. Emotion recognition from tomation (ICCUBEA); 2018 Aug 16-18; Pune,
speech signals using excitation source and spec- India. New York: IEEE.
tral features. 2018 IEEE Applied Signal Pro- DOI: https://doi.org/10.1109/ICCUBEA.2018.8697857
cessing Conference (ASPCON); 2018 Dec 7-9; [30] Adetiba, E., Adeyemi-Kayode, T.M., Akinrin-
Kolkata, India. New York: IEEE. made, A.A., et al., 2021. Evolution of artificial
DOI: https://doi.org/10.1109/ASPCON.2018.8748626 intelligence programming languages-a system-
[24] Costantini, G., Iadarola, I., Paoloni, A., et al. atic literature review. Journal of Computer Sci-
(editors), 2014. EMOVO corpus: An Italian ence. 17(11), 1157-1171.
emotional speech database. Proceedings of the DOI: https://doi.org/10.3844/JCSSP.2021.1157.1171
Ninth International Conference on Language [31] Hershey, S., Chaudhuri, S., Ellis, D.P.W., et al.
Resources and Evaluation (LREC’14); 2014 (editors), 2017. CNN architectures for large-
May; Reykjavik, Iceland. scale audio classification. 2017 IEEE Interna-
[25] Martin, O., Kotsia, I., Macq, B., et al. (editors), tional Conference on Acoustics, Speech and
2006. The eNTERFACE’05 Audio-Visual emo- Signal Processing (ICASSP); 2017 Mar 5-9;
tion database. 22nd International Conference New Orleans, LA, USA. New York: IEEE.
on Data Engineering Workshops (ICDEW’06); DOI: https://doi.org/10.1109/ICASSP.2017.7952132
2006 Apr 3-7; Atlanta, GA, USA. New York: [32] Gemmeke, J.F., Ellis, D.P.W., Freedman, D. et
IEEE. al. (editors), 2017. Audio set: An ontology and
DOI: https://doi.org/10.1109/ICDEW.2006.145 human-labeled dataset for audio events. 2017
[26] Lim, W., Jang, D., Lee, T. (editors), 2016. IEEE International Conference on Acoustics,
Speech emotion recognition using convolutional Speech and Signal Processing (ICASSP); 2017
and Recurrent Neural Networks. 2016 Asia-Pa- Mar 5-9; New Orleans, LA, USA. New York:
cific Signal and Information Processing Associ- IEEE.
ation Annual Summit and Conference (APSIPA); DOI: https://doi.org/10.1109/ICASSP.2017.7952261
2016 Dec 13-16; Jeju, Korea (South). New [33] Vogt, T., André, E., Wagner, J., 2008. Auto-
York: IEEE. matic recognition of emotions from speech: A
DOI: https://doi.org/10.1109/APSIPA.2016.7820699 review of the literature and recommendations
[27] Mao, Q., Dong, M., Huang, Z., et al., 2014. for practical realisation. Affect and emotion in
Learning salient features for speech emotion human-computer interaction. Springer: Berlin.
recognition using convolutional neural net- pp. 75-91.

44
Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023

DOI: https://doi.org/10.1007/978-3-540-85099-1_7 2016 IEEE Conference on Computer Vision and


[34] Zhang, S., Zhang, S., Huang, T., et al., 2017. Pattern Recognition (CVPR); 2016 Jun 27-30;
Speech emotion recognition using deep convo- Las Vegas, NV, USA. New York: IEEE.
lutional neural network and discriminant tem- DOI: https://doi.org/10.1109/CVPR.2016.90
poral pyramid matching. IEEE Transactions on [41] Sandler, M., Howard, A., Zhu, M., et al. (edi-
Multimedia. 20(6), 1576-1590. tors), 2018. MobileNetV2: Inverted residuals
DOI: https://doi.org/10.1109/TMM.2017.2766843 and linear bottlenecks. 2018 IEEE/CVF Confer-
[35] Liu, S., Nan, K., Lin, Y., et al. (editors), 2018. ence on Computer Vision and Pattern Recogni-
On-demand deep model compression for mobile tion; 2018 Jun 18-23; Salt Lake City, UT, USA.
devices: A usage-driven model selection frame- New York: IEEE.
work. Proceedings of the 16th Annual Interna- DOI: https://doi.org/10.1109/CVPR.2018.00474
tional Conference on Mobile Systems, Appli- [42] Abadi, M., Barham, P., Chen, J. et al. (editors),
cations, and Services; 2018 Jun 10-15; Munich 2016. TensorFlow: A system for large-scale
Germany. machine learning. Proceedings of the 12th USE-
DOI: https://doi.org/10.1145/3210240.3210337 NIX Symposium on Operating Systems Design
[36] Davis, J., Goadrich, M. (editors), 2006. The and Implementation (OSDI’16); 2016 Nov 2-4;
relationship between precision-recall and ROC Savannah, GA, USA.
curves. Proceedings of the 23rd International [43] Kingma, D.P., Ba, J.L. (editors), 2015. Adam: A
Conference on Machine Learning; 2006 Jun 25- method for stochastic optimization. 3rd Interna-
29; Pittsburgh Pennsylvania USA. tional Conference for Learning Representations;
DOI: https://doi.org/10.1145/1143844.1143874 2015 May 7-9; San Diego, CA, USA.
[37] LeCun, Y., Bottou, L., Bengio, Y., et al., 1998. [44] Wang, X., Han, Y., Leung, V.C., et al., 2020.
Gradient-based learning applied to document Convergence of edge computing and deep learn-
recognition. Proceedings of the IEEE. 86(11), ing: A comprehensive survey. IEEE Communi-
2278-2324. cations Surveys & Tutorials. 22(2), 869-904.
DOI: https://doi.org/10.1109/5.726791 DOI: https://doi.org/10.1109/COMST.2020.2970550
[38] Krizhevsky, A., Sutskever, I., Hinton, G.E. [45] Cummins, N., Amiriparian, S., Hagerer, G., et
(editors), 2012. Imagenet classification with al. (editors), 2017. An image-based deep spec-
deep convolutional neural networks. Advances trum feature representation for the recognition
in Neural Information Processing Systems 25: of emotional speech. Proceedings of the 25th
26th Annual Conference on Neural Information ACM international conference on Multimedia;
Processing Systems; 2012 Dec 3-6; Lake Tahoe, 2017 Oct 23-27; Mountain View California
Nevada, United States. USA.
[39] Simonyan, K., Zisserman, A. (editors), 2015. DOI: https://doi.org/10.1145/3123266.3123371
Very deep convolutional networks for large- [46] Ottl, S., Amiriparian, S., Gerczuk, M., et al. (ed-
scale image recognition. The 3rd Internation- itors), 2020. Group-level speech emotion recog-
al Conference on Learning Representations nition utilising deep spectrum features. Proceed-
(ICLR2015); 2015 May 7-9; San Diego, CA, ings of the 2020 International Conference on
USA. Multimodal Interaction; 2020 Oct 25-29; Virtual
[40] He, K., Zhang, X., Ren, S. et al. (editors), 2016. Event Netherlands.
Deep residual learning for image recognition. DOI: https://doi.org/10.1145/3382507.3417964

45
Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023

Journal of Computer Science Research


https://journals.bilpubgroup.com/index.php/jcsr

REVIEW

Innovating Pedagogical Practices through Professional Development in Computer Science Education

Xiaoxue Du1*, Ellen B Meier2
1 MIT Media Lab, MIT, Cambridge, MA 02139, USA
2 Teachers College, Columbia University, New York City, NY 10027, USA

ABSTRACT
Recent advancements in technology have opened up new avenues for educators to facilitate teaching and broaden
access to learning in the digital age. As the demand for computational skills continues to grow in preparation
for future careers, both teachers and students face the challenge of developing problem-solving, critical thinking,
communication, and collaboration skills within an emerging digital landscape. Technology adoption, big data,
cloud computing and artificial intelligence pose ongoing challenges for both teachers and students in adapting to
the changing workforce development landscape. To tackle these challenges, the paper highlights the importance of
exploring the implications of learning sciences in classroom teaching, developing a holistic vision for professional
development in education, and understanding the complexities of teacher change. To effectively implement these
components, it is crucial to adopt design approaches that prioritize student ownership in education and embrace the
principles of inclusive education to reconceptualize the teaching practices in education and technology.
Keywords: Education; Computational thinking; Teacher education; Professional development; Design; Equity

*CORRESPONDING AUTHOR:
Xiaoxue Du, MIT Media Lab, MIT, Cambridge, MA 02139, USA; Email: [email protected]
ARTICLE INFO
Received: 30 May 2023 | Revised: 9 July 2023 | Accepted: 13 July 2023 | Published Online: 20 July 2023
DOI: https://doi.org/10.30564/jcsr.v5i3.5757
CITATION
Du, X.X., Meier, E.B., 2023. Innovating Pedagogical Practices through Professional Development in Computer Science Education. Journal of Computer Science Research. 5(3): 46-56. DOI: https://doi.org/10.30564/jcsr.v5i3.5757
COPYRIGHT
Copyright © 2023 by the author(s). Published by Bilingual Publishing Group. This is an open access article under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. (https://creativecommons.org/licenses/by-nc/4.0/).

1. Introduction

As computers continue to automate our routine and complex tasks, equity in technology access, content, and use becomes a key barrier to future learning opportunities. In particular, a recent emphasis on education requires the development of intellect, ethical judgment, societal understanding, and creativity [1]. The technological challenges raise the critical question of how to prepare teachers to face
the unprecedented changes in the immediate future of education [2]. Teachers must acquire fundamental knowledge and adopt innovative teaching methods in order to effectively incorporate technology into their instruction and meet both the academic and social-emotional requirements of students in the realm of technology [3,4]. Drawing upon insights from the learning sciences and teacher education literature, technology possesses the capability to pave the way for groundbreaking teaching approaches within classrooms. By harnessing the power of technology, educators can unlock diverse learning opportunities that cater to the needs of all students, including those from diverse cultural and language backgrounds [5]. Emerging technologies could serve as cognitive tutors, peer learners, and conversational agents, in order to introduce students to novel methods of reflection, reasoning, and learning in their everyday lives [6,7].

The growing demands in computer science (CS) education among schools and educational entities have shown the need to strengthen students’ knowledge and skills in problem-solving and analytical thinking [8,9]. Therefore, establishing meaningful pedagogical practices and fostering a culture of lifelong learning are crucial aspects when it comes to computer science education. The basic premise of the paper is that integrating CS is a complex process, which requires much more than simply “shoehorning” a new curriculum into the school day. Teachers need to cultivate the skills to design and establish a student-centered learning environment, purposefully enabling the effective integration of computer science education. Moreover, it is essential to foster a rigorous community space that encourages learners to make connections between their acquired knowledge and other areas within the computing field. This approach helps foster a sense of belonging to the wider computing community. In this paper, we will first synthesize key challenges and opportunities identified in computer science education. Then, we will introduce the key components of a research-based, systematic professional development approach to build teachers’ capacity to design a student-centered learning environment. Finally, we will discuss the implications of strengthening teachers’ professional development in the computing science education community.

2. Key challenges and opportunities

Implementing CS in school settings presents new challenges for educators for at least five key reasons. First, there is a lack of “shared meaning” [10] for computer science as an academic discipline in K-12 education. Teachers should strive to develop a shared comprehension of both content knowledge and effective pedagogical practices in order to seamlessly integrate them into their curriculum planning. Second, computational thinking is increasingly considered a foundational skill in the 21st century, but is not systematically addressed in the curriculum. It serves as a process for recognizing aspects of computation in the surroundings and introduces techniques from CS to understand both natural and artificial systems and processes [11]. Third, there is not a clear scope and sequence for standards in each grade, which creates challenges for educators interested in developing student learning plans to integrate CS across disciplines. Because of the lack of a scope and sequence, there is insufficient empirical evidence for student learning and a lack of clear assessment objectives to support content definition and sequencing. Fourth, teachers’ professional development in computer science is a new process, which requires more empirical evidence and research to identify the core professional development content material and resources needed to prepare educators for designing student-centered learning experiences in education [12]. Lastly, recent research has emphasized the significance of nurturing young people’s capacity to create through the acquisition of computing skills. The development of these skills holds substantial implications for their personal lives and the betterment of their communities. To ensure that students develop essential computing design skills, it is imperative to create an inclusive, motivating, and empowering learning environment. This could provide students with greater autonomy to code, break down complex problems, and apply their learning across various
contexts to solve real-world problems. However, in many cases, the curriculum may focus exclusively on technical challenges and entry points, requiring students to have prior programming experience. This limits opportunities for students to engage critically with a broader curriculum and participate in a larger computing community. To address the challenge, it is important to create a wide and deep learning space for learners, allowing them to connect what they have learned to other computing and content fields and fostering a sense of belonging to the greater computing community [13].

One way to address the challenges faced by both educators and students in computer science education is to provide more professional development opportunities that position teachers as designers and effectively integrate computation-oriented curricula into daily learning and teaching. Teachers need the essential knowledge and skills to plan enriched lessons, select the most relevant use cases, and design curricula to develop students’ skills in problem-solving, computational thinking, and critical awareness in education. In addition, in the design process, teachers can develop student-centered learning environments that allow students with different background knowledge to engage with the curriculum, develop their interests, and build the confidence that empowers them to learn and grow in the computer science curriculum. Finally, more research should be conducted to develop high-quality professional development, which could, in turn, prepare a cohort of change leaders to innovate pedagogical practices in computer science beyond programming, while building local community networks to sustain the change and innovation in daily teaching.

3. Applying the innovating instruction model in education

The Innovating Instruction © model has been developed by the Center for Technology and School Change, Teachers College, Columbia University. The model builds upon the theory of change, learning science, professional development theories, and the emerging capabilities of technology. In this model, teachers take on the role of designers collaborating with facilitators to co-design projects that can be implemented in their real school settings. Technology plays a pivotal role as a catalyst for driving pedagogical innovation and motivates teachers as changemakers to advocate and sustain change in classroom teaching [14,15] (see Table 1, CTSC Professional Development model: Innovating Instruction model).

The model comprises three fundamental components: Design, Situate, and Lead, all aimed at assisting teachers in transforming their teaching and learning approaches. It is imperative for teachers to grasp effective teaching practices aligned with principles derived from the learning sciences. This understanding enables them to design environments that foster meaningful learning experiences and facilitate opportunities for students to deepen their understanding of the subject. Equipping teachers with the capacity to design curriculum goals, employ formative assessments, and engage diverse students in inquiry-based learning environments holds great promise in supporting their professional growth [15].

The Situate component plays a crucial role in customizing the learning experience for each teacher’s classroom and their students. It not only showcases engaging pedagogical practices through a hands-on approach but also offers personalized support to teachers. Incorporating insights from the learning sciences, it establishes a foundation for comprehending the intricacies of learning and thinking. The science of learning and development (SoLD) approach has been utilized to expound upon the “whole child model”, which underscores the necessity of addressing various aspects of students’ academic, cognitive, ethical, physical, psychological, and socio-emotional well-being. Specifically, creating a supportive environment fosters strong relationships and a sense of community among students. The situated approach encourages teachers to position students as active “knowledge-builders” within an inquiry-based learning environment [16,17].

Finally, the Lead component of the model focuses on preparing teachers to become leaders and on collaborating with building administrators to empower individual leadership and foster a culture of change and innovation. The guiding theoretical framework is the theory of change, which recognizes the complexity of the learning environment and the essential components required to facilitate transformative shifts in education. Strategic planning, the effective implementation of new teaching approaches, and the continuous development of teachers’ beliefs about teaching and learning are emphasized as pivotal factors for driving meaningful change within the school system. The model also underscores the importance of creating “shared meaning” among key stakeholders, considering the institutional, historical, and cultural perspectives that shape relationships and language in the field of education [10,17].

Table 1. Innovating instruction model.

DESIGN: Engage teachers as designers of student-centered, authentic learning experiences
1. Embrace a Design Approach: Model and support a backwards design approach to project planning that creates meaningful learning experiences for students.
2. Enrich Content Knowledge: Provide opportunities for deepening teachers’ understanding of content, including cross-curricular connections, learning standards, and student misconceptions.
3. Integrate Assessment Practices: Facilitate the design of authentic assessment and data use to identify and respond to student needs.
4. Leverage Digital Tools: Teach the integration of digital tools as part of the design process to facilitate interactive student learning and to enrich content.

SITUATE: Provide learning experiences for teachers that respect them as professionals and adapt the learning for their particular school and situation
1. Contextualize Teacher Learning: Situate the design work in the professional lives of teachers in order to connect deeply to the realities of teachers’ classrooms and their students.
2. Model Effective Practice: Provide interactive, hands-on professional development that engages teachers and models project-based learning with available tools and resources.
3. Individualize Support: Co-construct project plans based on student and curricular needs, provide ongoing support for classroom implementation, and facilitate reflection on teaching and learning.

LEAD: Support leaders in guiding and sustaining change initiatives, while positioning teachers as agents of change
1. Envision Change: Prioritize instructional leadership and develop actionable goals to promote change in self-identified areas of need.
2. Empower Leadership at All Levels: Provide a forum for identifying leaders (administrators, teachers, and community members) who can spearhead efforts that contribute to the common vision.
3. Sustain A Culture for Innovation: Scaffold educators’ efforts toward instructional innovation to realize goals beyond the immediate scope of the professional development.
4. Research: Lead research that informs the transformative use of technology in existing and emerging practices in schools, while contributing to evolving scholarship on innovations for teaching and learning.

Figure. Situate. Design. Lead. © CTSC’s Professional Development Model for Innovating Instruction, Detailed

Two recent National Science Foundation (NSF) grants, the Systemic Transformation of Inquiry Learning Environments (STILE 1.0) for STEM (Exploratory Award No. DRL-1238643) and STILE 2.0 (Early-Stage Design and Development Award No. DRL-1621387), have established the model’s positive impact on teachers’ ability to design projects, to shift from disciplinary to transdisciplinary project design, and to shift instructional thinking to include inquiry-based approaches. The research findings from the STILE initiatives demonstrate the positive impacts of the model on teachers’ pedagogical change as defined by shifts in STEM perspectives, STEM design practices, and STEM classroom practices [11]. The research in thirteen diverse school contexts included 169 classroom visitations, 372 planning meetings, and over 51 hours of administrator interaction. The total average dosage was estimated at 61 hours per teacher, supporting 169 teachers in the New York City public school systems and ultimately serving over 7536 students. The research identified positive changes that teachers have made under the STILE 2.0 program, specifically in teachers’ ability to design projects, shift from a disciplinary/subject-orientation to a more sophisticated transdisciplinary focus, and broaden their instructional thinking to include more inquiry-based approaches [11].

Two recent grants from the National Science Foundation (NSF), namely STILE 1.0 (Systemic Transformation of Inquiry Learning Environments for STEM) and STILE 2.0, have demonstrated the positive impact of the model on teachers’ capacity to design projects, transition from disciplinary to transdisciplinary project design, and adopt inquiry-based approaches in their instruction. The research findings from these initiatives highlight the constructive influence of the model on teachers’ pedagogical change, as evidenced by shifts in STEM perspectives, STEM design practices, and STEM classroom practices. The research study took place in thirteen different school settings, comprising 169 classroom observations, 372 planning meetings, and over 51 hours of engagement with administrators. On average, each teacher received around 61 hours of support from the program, benefiting a total of 169 teachers in the New York City public school systems and ultimately influencing over 7536 students. The research findings revealed significant improvements in teachers’ abilities to design projects, shift from a narrow disciplinary focus to a broader transdisciplinary perspective, and enhance their instructional thinking by integrating more inquiry-based methods. These positive changes were observed as a result of the STILE 2.0 program [11].

4. Visionary goals in professional development and CS education

4.1 A grand vision for professional development in CS education

To effectively introduce computer science curriculum in schools, professional development plays a crucial role for teachers to adopt a broader vision for the teaching profession that prioritizes learning and the needs of learners from diverse communities [18]. For example, professional development programs can provide teachers with training and resources to enhance their pedagogical skills in computer science education. This could include workshops on integrating technology into lessons, designing engaging coding activities, or implementing project-based learning in the computer science classroom. By equipping teachers with the necessary knowledge and tools, professional development empowers them to create meaningful learning experiences and cater to the diverse needs of their students in the realm of computer science [19,20].

With access to resources in computer science (CS) education, teachers receive valuable support in designing student-centered learning experiences that foster the development of students’ identity and their willingness to actively engage in the broader computing community [21]. For instance, through professional development, teachers can learn about various tools, platforms, and instructional strategies that enable them to create interactive coding projects, collaborative programming exercises, or real-world CS applications. By incorporating these resources into their teaching, teachers can empower students to explore their interests, develop problem-solving skills, and cultivate a sense of belonging within the computing field [22]. This not only enhances students’ learning experiences but also nurtures their enthusiasm and motivation to actively participate in the wider CS community beyond the classroom [23]. As a result, students will have more ownership and responsibility to explore concepts beyond essential programming ideas (e.g., loops, arrays, conditional statements). They will utilize their skills to build a deeper understanding of how these concepts apply to broader social and cultural contexts. This expanded perspective encourages students to consider the practical applications of computer science in various domains, such as healthcare, environmental sustainability, or social justice [24,25]. By connecting programming skills to real-world contexts, students develop a more comprehensive understanding of the societal
impact and significance of computer science, empowering them to become critical thinkers and active contributors to their communities [26,27].

Finally, CS professional development should deepen teachers’ content knowledge and provide curriculum resources, encouraging teachers to utilize different means, access, and representations to simulate abstract concepts in order to develop students’ interests and curiosity in the field [28,29]. Specifically, ongoing research in professional development should prepare teachers to continuously innovate pedagogical practices to design, pilot and implement computer science curriculum in classroom settings [30]. It should invite educators to consider a broader, culturally relevant approach that designs curriculum situated for a range of learners, especially students who have been traditionally under-represented in the computing fields [31]. The CS literature has shown that women and students of color have been overlooked and excluded by the wider computing community [8]; therefore, it is critical for teachers to identify effective approaches for engaging all students in the field of computer science education [32,33].

4.2 The complexity of the teacher changes in CS education

Research findings demonstrate that thoughtful professional development can significantly impact teachers’ ability to incorporate technology into their classroom practices [34]. Studies indicate that teachers’ beliefs and practices can evolve when teachers are provided with clear and specific instructions during professional development sessions in CS education [35]. In the context of computer science education, ongoing research should place a strong emphasis on exposing teachers to the design process and enabling them to explore the integration of technology in projects that promote a shift in instructional thinking [36]. For example, professional development programs can provide teachers with opportunities to engage in coding and computational thinking activities themselves, allowing them to experience firsthand the creative process involved in designing and developing computer programs. These programs can also offer teachers access to various technology tools and platforms specifically designed for computer science education. This includes platforms for creating interactive programming projects, virtual environments for exploring computer science concepts, and collaborative coding platforms. Through hands-on workshops and training sessions, teachers can gain practical knowledge of these tools and learn effective strategies for incorporating them into their teaching. To summarize, professional development initiatives offer teachers increased opportunities to gain familiarity with a variety of technology tools, demonstrate their use in classroom instruction, and provide checkpoints for reflection on the implementation process [37,38]. By successfully employing new practices and research-based methods, teachers receive further support in assimilating innovative approaches into their existing belief systems [39,40].

4.3 Reconceptualizing teaching practices in CS education

In the realm of computer science education, it is crucial to critically examine practices and emerging research in general education [41]. For example, teacher education programs can incorporate pedagogical strategies that promote hands-on learning experiences, such as coding workshops or robotics projects. By engaging in these practical activities, students can develop a deeper understanding of programming concepts and gain valuable problem-solving skills. Additionally, project-based approaches can be implemented in computer science education [42]. For instance, students can work on real-world projects like designing a mobile application or creating a website for a local business. These projects not only reinforce technical skills but also encourage critical thinking and creativity as students navigate challenges and make design decisions [43].

Collaborative problem-solving is another important strategy to consider. An example of this could be organizing group activities where students collaborate to solve complex coding problems or develop a software solution together [44]. Through teamwork, students learn how to communicate effectively, share


ideas, and leverage each other’s strengths to achieve a common goal [45]. To stay up-to-date with the latest advancements, it is important for educators to explore research and developments in educational technology, computational thinking, and computer science instruction [46]. For instance, they can learn about new tools and platforms that facilitate interactive learning experiences or discover innovative teaching methods that enhance student engagement and understanding [47].

By reconceptualizing teacher education in computer science, educators can better prepare future teachers to design and deliver engaging and meaningful learning experiences [48]. For example, they can develop new curricular materials that incorporate coding exercises, multimedia resources, and interactive simulations to make learning more interactive and dynamic [49]. Furthermore, assessment approaches can be adapted to evaluate students’ computational thinking skills, creativity, and problem-solving abilities [50]. This could involve designing coding challenges or projects that require students to apply their knowledge in practical contexts, as opposed to traditional exams or quizzes that solely focus on theoretical concepts [51]. By focusing on these aspects, students not only gain technical knowledge, but also develop the necessary skills to thrive in an ever-changing digital landscape. They become equipped with computational thinking skills, creativity, and problem-solving abilities, which are highly sought-after in the industry [52,53].

4.4 Embracing computer science education for all students with real-world connections

Specifically, effective professional development should provide teachers with more accessible resources that reduce barriers for teachers to learn, adopt, and integrate into the daily curriculum [54]. For instance, providing teachers with sample curriculum that they can adapt and integrate into current lesson plans could help teachers develop capacity in computer science disciplines, develop students’ interests in exploring CS topics, and encourage educators to understand the value of designing situated, culturally relevant computer science curriculum [55]. In addition, professional development should provide teachers with an easy-to-use platform that could encourage students to quickly build prototypes and implement solutions without writing complicated programming syntax [56]. For instance, the growing usage of block-based programming languages (e.g., PoseBlocks, App Inventor) has shown the value of letting students build solutions, implement designs, and create functional mobile applications without a complex debugging and programming process [57,58]. The ongoing research study could further explore core processes and components that prepare teachers to use, adapt, and implement computer science curriculum with technology integration across diverse classroom settings, domestically and internationally [59,60].

5. Conclusions

To enhance teacher education in the field of computer science, it is crucial to equip teachers with the skills to strategically utilize technology in designing engaging curriculum that promotes deep learning in computer science education. Given the complexity of school systems, collaborative efforts involving researchers, scientists, and professionals are necessary to drive these transformative shifts. To prepare for the change, the Innovating Instruction model has offered teachers an effective way to incorporate interactive and hands-on activities, project-based learning, and real-world applications of computer science concepts, tailor their instruction to meet the diverse needs and interests of their students, and constantly refine their instructional techniques to become more effective educators. By nurturing a community of practice and fostering collaboration among teachers, researchers, and professionals, the field can collectively drive the adoption of innovative technology and create a more engaging and impactful learning environment for students.

Author Contributions

Both authors made equal contributions to the


manuscript.

Conflict of Interest

The authors have no conflicts of interest to declare.

Data Availability Statement

Situate. Design. Lead. © CTSC’s Professional Development Model for Innovating Instruction is available at the website from Center for Technology and School Change, Teachers College, Columbia University. https://ctsc.tc.columbia.edu/

References

[1] Breazeal, C., 2022. AI Literacy for All with Prof. Cynthia Breazeal [Internet]. Available from: https://openlearning.mit.edu/news/ai-literacy-all-prof-cynthia-breazeal
[2] Darling-Hammond, L., Oakes, J., 2019. Preparing teachers for deeper learning. Harvard Education Press: Cambridge, MA.
[3] Podolsky, A., Kini, T., Darling-Hammond, L., 2019. Does teaching experience increase teacher effectiveness? A review of US research. Journal of Professional Capital and Community. 4(4), 286-308.
[4] Sutcher, L., Darling-Hammond, L., Carver-Thomas, D., 2019. Understanding teacher shortages: An analysis of teacher supply and demand in the United States. Education Policy Analysis Archives. 27(35).
[5] National Academies of Sciences, Engineering, and Medicine, 2018. How people learn II: Learners, contexts, and cultures. National Academies Press: Washington, D.C.
[6] Papadopoulos, I., Lazzarino, R., Miah, S., et al., 2020. A systematic review of the literature regarding socially assistive robots in pre-tertiary education. Computers & Education. 155, 103924.
[7] Rosenberg-Kima, R.B., Koren, Y., Gordon, G., 2020. Robot-supported collaborative learning (RSCL): Social robots as teaching assistants for higher education small group facilitation. Frontiers in Robotics and AI. 6, 148.
[8] Grover, S., Pea, R., Cooper, S., 2015. Designing for deeper learning in a blended computer science course for middle school students. Computer Science Education. 25(2), 199-237.
[9] Hsu, T.C., Chang, S.C., Hung, Y.T., 2018. How to learn and how to teach computational thinking: Suggestions based on a review of the literature. Computers & Education. 126, 296-310.
[10] Fullan, M., 2016. The new meaning of educational change. Teachers College Press: New York.
[11] Grover, S., Pea, R., 2013. Computational thinking in K-12: A review of the state of the field. Educational Researcher. 42(1), 38-43.
[12] Webb, M., Bell, T., Davis, N., et al. (editors), 2017. Computer science in the school curriculum: Issues and challenges. Tomorrow’s Learning: Involving Everyone. Learning with and about Technologies and Computing: 11th IFIP TC 3 World Conference on Computers in Education, WCCE 2017; 2017 Jul 3-6; Dublin, Ireland. p. 421-431.
[13] Du, X., Parks, R., Tezel, S., et al. (editors), 2023. Designing a computational action program to tackle global challenges. SIGCSE 2023: Proceedings of the 54th ACM Technical Symposium on Computer Science Education; 2023 Mar 15-18; Toronto ON, Canada. New York: Association for Computing Machinery. p. 1320.
[14] Meier, E.B., Mineo, C., 2021. Pedagogical challenges during COVID: Opportunities for transformative shifts. Handbook of research on transforming teachers’ online pedagogical reasoning for engaging K-12 students in virtual learning. IGI Global: Hershey. pp. 86-108.
[15] Meier, E.B., 2021. Designing and using digital platforms for 21st century learning. Educational Technology Research and Development. 69(1), 217-220.
[16] Darling-Hammond, L., Flook, L., Cook-Harvey, C., et al., 2020. Implications for educational practice of the science of learning and development. Applied Developmental Science. 24(2), 97-140.
[17] Scardamalia, M.B., 2014. Knowledge building and knowledge creation. Cambridge handbook of the learning sciences. Cambridge University Press: Cambridge. pp. 297-417.
[18] Harju, V., Niemi, H., 2020. Newly qualified teachers’ support needs in developing professional competences: The principal’s viewpoint. Teacher Development. 24(1), 52-70.
[19] Ng, D.T.K., Lee, M., Tan, R.J.Y., et al., 2022. A review of AI teaching and learning from 2000 to 2020. Education and Information Technologies. 1-57.
[20] Dash, B.B., 2022. Digital tools for teaching and learning English language in 21st century. International Journal of English and Studies. 4(2), 8-13.
[21] Biswas, S., Benabentos, R., Brewe, E., et al., 2022. Institutionalizing evidence-based STEM reform through faculty professional development and support structures. International Journal of STEM Education. 9(1), 1-23.
[22] McGill, M.M., Reinking, A., 2022. Early findings on the impacts of developing evidence-based practice briefs on middle school computer science teachers. ACM Transactions on Computing Education. 22(4), 1-29.
[23] Apiola, M., Sutinen, E., 2021. Design science research for learning software engineering and computational thinking: Four cases. Computer Applications in Engineering Education. 29(1), 83-101.
[24] Casey, E., Jocz, J., Peterson, K.A., et al., 2023. Motivating youth to learn STEM through a gender inclusive digital forensic science program. Smart Learning Environments. 10(1), 2.
[25] Tissenbaum, M., Weintrop, D., Holbert, N., et al., 2021. The case for alternative endpoints in computing education. British Journal of Educational Technology. 52(3), 1164-1177.
[26] Schaper, M.M., Smith, R.C., Tamashiro, M.A., et al., 2022. Computational empowerment in practice: Scaffolding teenagers’ learning about emerging technologies and their ethical and societal impact. International Journal of Child-Computer Interaction. 34, 100537.
[27] Tsortanidou, X., Daradoumis, T., Barberá, E., 2019. Connecting moments of creativity, computational thinking, collaboration and new media literacy skills. Information and Learning Sciences. 120(11/12), 704-722.
[28] Alfaro-Ponce, B., Patiño, A., Sanabria-Z, J., 2023. Components of computational thinking in citizen science games and its contribution to reasoning for complexity through digital game-based learning: A framework proposal. Cogent Education. 10(1), 2191751.
[29] Ketelhut, D.J., Mills, K., Hestness, E., et al., 2020. Teacher change following a professional development experience in integrating computational thinking into elementary science. Journal of Science Education and Technology. 29, 174-188.
[30] Bragg, L.A., Walsh, C., Heyeres, M., 2021. Successful design and delivery of online professional development for teachers: A systematic review of the literature. Computers & Education. 166, 104158.
[31] Mystakidis, S., Fragkaki, M., Filippousis, G., 2021. Ready teacher one: Virtual and augmented reality online professional development for K-12 school teachers. Computers. 10(10), 134.
[32] Li, M., 2020. Multimodal pedagogy in TESOL teacher education: Students’ perspectives. System. 94, 102337.
[33] Kafai, Y.B., Baskin, J., Fields, D., et al. (editors), 2020. Looking ahead: Professional development needs for experienced CS teachers. SIGCSE’20: Proceedings of the 51st ACM Technical Symposium on Computer Science Education; 2020 Mar 11-14; Portland OR, USA. New York: Association for Computing Machinery. p. 1118-1119.
[34] Tshukudu, E., Cutts, Q., Goletti, O., et al. (editors), 2021. Teachers’ views and experiences on teaching second and subsequent programming languages. ICER 2021: Proceedings of the 17th ACM Conference on International Computing Education Research; 2021 Aug 16-19; Virtual Event, USA. New York: Association for Computing Machinery. p. 294-305.
[35] Rich, P.J., Larsen, R.A., Mason, S.L., 2021. Measuring teacher beliefs about coding and computational thinking. Journal of Research on Technology in Education. 53(3), 296-316.
[36] Bereczki, E.O., Kárpáti, A., 2021. Technology-enhanced creativity: A multiple case study of digital technology-integration expert teachers’ beliefs and practices. Thinking Skills and Creativity. 39, 100791.
[37] Griful-Freixenet, J., Struyven, K., Vantieghem, W., 2021. Exploring pre-service teachers’ beliefs and practices about two inclusive frameworks: Universal design for learning and differentiated instruction. Teaching and Teacher Education. 107, 103503.
[38] Dignath, C., Rimm-Kaufman, S., van Ewijk, R., et al., 2022. Teachers’ beliefs about inclusive education and insights on what contributes to those beliefs: a meta-analytical study. Educational Psychology Review. 34(4), 2609-2660.
[39] Almazroa, H., Alotaibi, W., 2023. Teaching 21st century skills: Understanding the depth and width of the challenges to shape proactive teacher education programmes. Sustainability. 15(9), 7365.
[40] Bhutoria, A., 2022. Personalized education and artificial intelligence in the United States, China, and India: A systematic review using a human-in-the-loop model. Computers and Education: Artificial Intelligence. 3, 100068.
[41] Bozkurt, A., 2020. Educational technology research patterns in the realm of the digital knowledge age. Journal of Interactive Media in Education. (1).
[42] Çiftci, S., Bildiren, A., 2020. The effect of coding courses on the cognitive abilities and problem-solving skills of preschool children. Computer Science Education. 30(1), 3-21.
[43] Saad, A., Zainudin, S., 2022. A review of Project-Based Learning (PBL) and Computational Thinking (CT) in teaching and learning. Learning and Motivation. 78, 101802.
[44] Wang, Y., 2023. The role of computer supported project-based learning in students’ computational thinking and engagement in robotics courses. Thinking Skills and Creativity. 48, 101269.
[45] Bers, M.U., Blake-West, J., Kapoor, M.G., et al., 2023. Coding as another language: Research-based curriculum for early childhood computer science. Early Childhood Research Quarterly. 64, 394-404.
[46] Huang, W., Looi, C.K., 2021. A critical review of literature on “unplugged” pedagogies in K-12 computer science and computational thinking education. Computer Science Education. 31(1), 83-111.
[47] Yildiz Durak, H., Atman Uslu, N., Canbazoğlu Bilici, S., et al., 2022. Examining the predictors of TPACK for integrated STEM: Science teaching self-efficacy, computational thinking, and design thinking. Education and Information Technologies. 1-28.
[48] Lee, S.W.Y., Liang, J.C., Hsu, C.Y., et al., 2023. Students’ beliefs about computer programming predict their computational thinking and computer programming self-efficacy. Interactive Learning Environments. 1-21.
[49] Lee, S.J., Francom, G.M., Nuatomue, J., 2022. Computer science education and K-12 students’ computational thinking: A systematic review. International Journal of Educational Research. 114, 102008.
[50] Ung, L.L., Labadin, J., Mohamad, F.S., 2022. Computational thinking for teachers: Development of a localised E-learning system. Computers & Education. 177, 104379.
[51] Kallia, M., van Borkulo, S.P., Drijvers, P., et al., 2021. Characterising computational thinking in mathematics education: A literature-informed Delphi study. Research in Mathematics Education. 23(2), 159-187.
[52] Ogegbo, A.A., Ramnarain, U., 2022. A systematic review of computational thinking in science classrooms. Studies in Science Education. 58(2), 203-230.
[53] Chen, C.H., Liu, T.K., Huang, K., 2023. Scaffolding vocational high school students’ computational thinking with cognitive and metacognitive prompts in learning about programmable logic controllers. Journal of Research on Technology in Education. 55(3), 527-544.
[54] Gao, X., Li, P., Shen, J., et al., 2020. Reviewing assessment of student learning in interdisciplinary STEM education. International Journal of STEM Education. 7(1), 1-14.
[55] Madkins, T.C., Martin, A., Ryoo, J., et al. (editors), 2019. Culturally relevant computer science pedagogy: From theory to practice. 2019 Research on Equity and Sustained Participation in Engineering, Computing, and Technology (RESPECT); 2019 Feb 27; Minneapolis, MN, USA. New York: IEEE. p. 1-4.
[56] Guskey, T.R., 2002. Does it make a difference? Evaluating professional development. Educational Leadership. 59(6), 45-51.
[57] Li, L., Ruppar, A., 2021. Conceptualizing teacher agency for inclusive education: A systematic and international review. Teacher Education and Special Education. 44(1), 42-59.
[58] Tissenbaum, M., Sheldon, J., Abelson, H., 2019. From computational thinking to computational action. Communications of the ACM. 62(3), 34-36.
[59] Madkins, T.C., Howard, N.R., Freed, N., 2020. Engaging equity pedagogies in computer science learning environments. Journal of Computer Science Integration. 3(2).
[60] Morales-Chicas, J., Castillo, M., Bernal, I., et al., 2019. Computing with relevance and purpose: A review of culturally relevant education in computing. International Journal of Multicultural Education. 21(1), 125-155.


Journal of Computer Science Research


https://journals.bilpubgroup.com/index.php/jcsr

ARTICLE

Expert Review on Usefulness of an Integrated Checklist-based Mobile Usability Evaluation Framework

Hazura Zulzalil1*, Hazwani Rahmat2, Abdul Azim Abd Ghani1, Azrina Kamaruddin1
1 Department of Software Engineering and Information System, Universiti Putra Malaysia, UPM Serdang, Selangor, 43400, Malaysia
2 Department of Information Technology, Centre for Diploma Studies, Universiti Tun Hussein Onn Malaysia, Pagoh Higher Education Hub, Pagoh, Johor, 84600, Malaysia

ABSTRACT
Previous mobile usability studies are pertinent only in the context of ergonomics, physical user interface, and mobility aspects. In addition, much of the previous mobile usability conception was built on desktop computing measurements, such as desktop and web application checklists, or scarcely addressed the mobile user interface. Moreover, the studies focus mainly on interface features for desktop applications and do not reflect comprehensive mobile interface features such as navigation drawers and spinners. Therefore, conducting usability evaluation using conventional usability measurement would produce irrelevant results. In addition, the resulting works are tailored for usability testing, which requires highly skilled evaluators and usability specialists (e.g., usability testers and user experience designers), who are rarely integrated into a development team. The lack of expertise could lead to unreliable usability evaluations. This paper presents a review from industrial experts on a comprehensive and feasible usability evaluation framework developed in our previous work. The framework is dedicated to smartphone apps and integrates evaluator skills and design concerns. However, there is no evidence of its usefulness in practice. Therefore, the usefulness of the framework measurement for evaluating apps’ usability in the eyes of non-usability specialists is empirically assessed in this paper through an expert review. The expert review involved eleven industrial developers and was complemented by a semi-structured interview. The method is replicated in comparison with a framework from another study. The findings show that, in terms of usefulness, the formulated framework significantly outperformed the framework from the other study (p = 0.0286) with a large effect size (r = 1.81).
Keywords: Usability framework; Mobile usability; Usability evaluation; Expert review; Heuristic walkthrough

*CORRESPONDING AUTHOR:
Hazura Zulzalil, Department of Software Engineering and Information System, Universiti Putra Malaysia, UPM Serdang, Selangor, 43400, Malaysia; Email: [email protected]
ARTICLE INFO
Received: 28 June 2023 | Revised: 24 July 2023 | Accepted: 26 July 2023 | Published Online: 31 July 2023
DOI: https://doi.org/10.30564/jcsr.v5i3.5816
CITATION
Zulzalil, H., Rahmat, H., Ghani, A.A.A., et al., 2023. Expert Review on Usefulness of an Integrated Checklist-based Mobile Usability Evaluation Framework. Journal of Computer Science Research. 5(3): 57-73. DOI: https://doi.org/10.30564/jcsr.v5i3.5816
COPYRIGHT
Copyright © 2023 by the author(s). Published by Bilingual Publishing Group. This is an open access article under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. (https://creativecommons.org/licenses/by-nc/4.0/).


1. Introduction

The user interface serves as a conduit between human and computer interactions. The evolution of the user interface has progressed from the command line interface (CLI) of a console to the graphical user interface (GUI), and later to the web user interface and mobile user interface. The evolution of the user interface has characterized usability dimensions differently for mobile applications (apps). CLI requires high memorability for competence and knowledge of entering massive commands. On the other hand, a GUI adopts a graphical representation for user interaction. Thus, learnability, effectiveness, and efficiency come first in a usable application. The Web user interface operates mostly on hypertext and multimedia elements, forming the navigational system and interconnected content. Consequently, consistency, simplicity, and information architecture play an important role in the effectiveness, efficiency, and navigability of web applications.

Likewise, a mobile user interface is shaped by its technological features. Physical device features and limitations, integrated sensors such as proximity, tactile, or image recognition sensors, and the context of use have introduced emergent usability properties in characterizing the mobile usability dimension. Additionally, unique data entry models, such as the use of a stylus, gestures, and the virtual keyboard, have replaced the mouse and keyboard as the input device.

Mobile applications (apps) are used on-the-go, thus opening them to divided attention while used in different mobility conditions (e.g., sitting, walking, and driving). Apps offer support for a broad range of tasks (e.g., streaming online movies, browsing information, and performing online transactions) without the need for a computer. However, as mobile operating systems advanced, the user interface of mobile applications was rapidly enhanced through software updates. The updates involve logical user interfaces (LUI) and graphical user interfaces (GUI), which affect apps’ usability, rather than physical user interfaces (PUI), which affect the device’s usability. New features and functionalities are introduced to enhance the mobile user interface with constant version updates. Interface features such as the navigation drawer and expansion panel are used to maximize the limited screen size. Meanwhile, snackbars and toasts are used to deliver a prompt visual response when handling divided attention and mobility conditions, which could result in an accidental activation. Consequently, usability criteria such as connectivity, relevance, and responsiveness are the highlights of conceptualizing app usability. However, the smartphone is used by users of all ages with various levels of computing background. Thus, usability criteria such as familiarity, flexibility, and appropriateness are also highlighted in denoting mobile usability.

The conceptualization of the mobile usability dimension has been widely studied [1,2]. Numerous attempts have been made to characterize usability in view of performance-based measures [3], the physical user interface [4,5], mobile device concerns [6], usability principles [7], usability criteria [8], and interface features [9-12].

This study presents a measurement for evaluating the usability of mobile applications in the context of integrated evaluator skills. The measurements are developed by capturing the interface features, usability criteria, and design pattern, which augment the evaluation basis from multiple evaluators’ viewpoints. The integration involves a comprehensive bridging of the semantic gap between different abstraction levels of usability constructs (interface features, usability features, and usability criteria) into one integrated framework.

The remainder of this article is organized as follows. Section 2 reviews the existing mobile usability evaluation frameworks. Section 3 describes how the framework measurement is formulated. The resulting framework is presented in Section 4. The evaluation of the framework’s usefulness is described in Section 5. Meanwhile, the results and discussions, and threats to validity are discussed in Sections 6 and 7. Finally, in Section 8, conclusions and future studies are presented.
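
As a rough illustration of the three abstraction levels named in this introduction (interface features, usability features, and usability criteria), the Python sketch below links each interface feature to the purpose it serves and to one or more criteria. The class `InterfaceFeature`, the list `FEATURES`, and the helpers `criteria_covered` and `features_for` are hypothetical names chosen for this sketch. The purposes follow the text above, but the feature-to-criterion pairings are assumptions made only for illustration and are not the framework presented in Section 4.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InterfaceFeature:
    """One interface feature linked upward through the two higher abstraction levels."""
    name: str                  # lowest level: concrete interface feature
    usability_feature: str     # middle level: what the feature is used for
    criteria: Tuple[str, ...]  # top level: usability criteria (hypothetical pairing)


# Purposes follow the introduction; the criteria pairings are illustrative assumptions.
FEATURES: List[InterfaceFeature] = [
    InterfaceFeature("navigation drawer", "maximize limited screen size", ("familiarity",)),
    InterfaceFeature("expansion panel", "maximize limited screen size", ("flexibility",)),
    InterfaceFeature("toast", "prompt visual response", ("responsiveness", "relevance")),
]


def criteria_covered(features: List[InterfaceFeature]) -> List[str]:
    """Return, in alphabetical order, every criterion reached by at least one feature."""
    return sorted({c for f in features for c in f.criteria})


def features_for(criterion: str, features: List[InterfaceFeature]) -> List[str]:
    """Return the names of the interface features linked to the given criterion."""
    return [f.name for f in features if criterion in f.criteria]


if __name__ == "__main__":
    print(criteria_covered(FEATURES))
    print(features_for("responsiveness", FEATURES))
```

Keeping the three levels in one record means an evaluator who only recognizes the concrete feature can still trace it to a criterion, which is one possible reading of the "bridging" idea described above.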


2. Related works

Mobile usability evaluation methods such as field testing and lab experiments have been introduced for evaluating mobile applications. However, the limitations and difficulties resulting from these methods favor traditional usability evaluation methods such as usability testing and inspection methods. Inspection methods such as heuristic evaluation (HE) have gained wide acceptance in industrial practice due to their simplicity, low cost, and short time, with no additional equipment required. Hence, a wide variety of heuristic evaluation methods besides Nielsen and Molich’s ten heuristics have been developed, such as checklist-based heuristics [6]. Consequently, the use of a checklist was extended to frameworks in an effort to characterize usability. This has benefits for usability conception, design, and evaluation purposes.

This section discusses the checklist-based frameworks in four categories: attribute-based frameworks, integrated-based frameworks, theoretical-based frameworks, and decision-based frameworks. The first and third categories serve to conceptualize the usability dimension. Meanwhile, the second category focuses on the interface feature, and the last category is specific to decision-making and prioritizing usability constructs. For each framework, the structural base, evaluator viewpoint, intended platform, and scope of evaluation items were compared with respect to the aim of the work.

2.1 Attribute-based frameworks

The increasing capabilities of mobile phones have encouraged several usability investigations to characterize the usability dimension of mobile phones. Initially, as smartphones started to emerge in 2009, Hussain and Kutar adopted a Goal Question Metric (GQM) approach for their framework in conceptualizing usability dimensions [3]. Based on ISO 9241-11 usability criteria (effectiveness, efficiency, and satisfaction) as usability parameters for the goal, a set of questions was associated with each goal, in a checklist form. The questions were further used to derive metrics as a performance indicator for each goal and question. However, since mobile-specific interface features for smartphones were just being launched at the time, the outcome focuses more on the logical user interface.

Further, another study by Saleh et al. [8] adopted the same approach, GQM, in constructing their framework. In contrast to Hussain and Kutar’s approach [3], they developed a more comprehensive set of usability criteria denoting mobile applications (apps) by extending the PACMAD model [5] as the base of their framework structure.

Though both studies managed to conceptualize usability, the use of the GQM approach resulted in a metrics-oriented performance-based checklist that scarcely acknowledges the characteristics of smartphones, such as screen size and interaction method, which reflect the interface features of apps in detail.

2.2 Integrated-based frameworks

Insufficient literature on mobile phone characteristics concerning interface features has inspired efforts toward an in-depth comprehension of the mobile interface feature. As a result, the abstraction levels for this type of framework are realized as an organization of mobile interface features.

In addressing the comprehensive aspect of usability issues on a mobile phone, Mugisha et al. articulate their framework in view of mobile phone UI practitioners [13]. Based on a review of usability principles, they defined five categories of UIs tailored for a feature phone. A pairwise comparison approach was used in mapping the UIs to usability principles. A checklist relevant to the UIs was developed to match the usability principles.

As a continuation, Xu and Jonsson [14] devised their framework by determining common interface features for tablet applications. The identified interface features were grouped into three categories: UI input, UI components, and UI characteristics. Each UI, which was paired with a developed checklist, was mapped to the usability principles based on their effect and relationship. Though tailored for tablet applications, and acknowledging smartphone charac-

59
Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023

teristics in their work, the UIs mainly reflect desktop usability measurement for mobile applications,
and web application UIs and features such as input, Hoehle, Aljafari, and Venkatesh proposed a set of
hardware, bookmarks, and headers. measurements for mobile applications in view of in-
terface features based on measurement theory [12]. A
2.3 Theoretical-based frameworks content analysis approach was used to relate the con-
structs and variables. Though their work explicitly
The usability framework, which was developed
focused on apps, the measurements were tailored for
based on usability conceptions of principles and
Microsoft-based apps, and mobile interface features
criteria, mainly revolves around the effectiveness of
are not well addressed in their work. Instead, they
existing usability measurements for evaluating apps.
emphasize aspects such as usability principles, aes-
For example, Dubey and Rana [9] acknowledged
thetics, and navigation.
the characteristics and features of mobile devices.
They doubt the effectiveness of existing usability
2.4 Decision-based framework
measurements on mobile phones. By hierarchically
organizing usability indicators (principles), criteria, The primary purpose of adopting a decision-based
and properties based on a goal-mean relationship framework is to determine a usable mobile applica-
between the parameters, they formulated a frame- tion. Lachgar and Abdelmounaim pursued an analyt-
work for usability specialists to conduct an analytical ic hierarchy process in developing their framework.
evaluation of mobile phones. While focusing on the Grounded in measurement theory, he developed
parameters of each abstraction level and all three cat- usability constructs and variables to facilitate the se-
egories of UIs (PUI, LUI, and GUI), their checklist lection of usable mobile phones [15]. Table 1 summa-
suffers from redundancy, ambiguity, confusion, and rizes the literature review.
indirectly measurable issues. Earlier mobile usability studies emphasized the
Pursuing a different approach, Gómez, Caballero, physical user interface. While the logical user inter-
and Sevillano formed their framework by formu- face persists across most computing platforms, rapid
lating a structure of heuristics and sub-heuristics, updates in smartphone technologies highlight the im-
paired with a checklist based on their semantic rela- portance of the graphical user interface, particularly
tions [6]. They achieved excellent results in address- interface features. The coverage of IU studied previ-
ing mobile-specific usability issues while focusing ously conforms to the scope of UI covered in the re-
on LUI and GUI. Unfortunately, though they argue viewed framework from the age of feature phones to
for the effectiveness of a desktop-centered checklist handhelds until smartphones, where PUI is scarcely
for evaluating apps, a portion of their checklist stems studied in recent works.
from a web-based checklist that appears irrelevant
for apps.
3. Formulation of framework meas-
Judging by the limitations of mobile devices, Fa-
urement
tih Nayebi developed a heuristic-based framework
for app evaluation [7]. A set of usability criteria estab- Representative definitions of usability by the
lished based on his review of academic and industri- industry (i.e., ISO 9241-11 [16], ISO 9126 [17]) and
al heuristics, theories, and guidelines were assigned academia [5,18-22] are usually referred to most studies.
to the most relevant logical groups of the reviewed In the context of mobile usability studies, Harrison
bibliographic references. Although he managed to et al. [5] work, which extends Nielsen’s usability
address the characteristics of mobile devices, the conception in view of the ISO 9241-11 context, is
proposed criteria were ambiguous and hardly ad- deemed as a comprehensive reference [23]. However,
dressed mobile interface features. neither metrics nor checklists are associated with
Further, arguing for the effectiveness of current their work, thus leaving little support for usability

Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023

Table 1. Literature review summary.

Types of Attribute-based Integrated-based Theoretical-based Decision-based


Framework frameworks frameworks frameworks framework

Hoehle, Aljafari Lachgar and


Hussain and Saleh, Ismail, Mugisha et al. Xu and Jonsson Gómez, Caballero
Authors Dubey et al. [9] Fatih Nayebi [7] and Venkatesh Abdelmounaim
Kutar [3] and Fabil [8] [13] [14]
and Sevillano [6]
[12] [15]

Usability Usability Usability


Viewpoint developer developer developer developer Non-expert Non-expert
specialist specialist specialist

Bridging Bridging Bridging Bridging Select best


Conceptualise Conceptualise Conceptualise
different groups different groups different groups Addressing mobile different groups alternatives
Aims of research usability usability usability
of usability of usability of usability usability issues of usability among available
dimension dimension dimension
constructs constructs constructs constructs usability criteria

Usability Usability Usability Usability Usability


Base structure ISO 9241-11 PACMAD Mobile constraints ISO 9241-11
principles principles principles principles principles

Mapping of Analytic
Goal Question Goal Question Pairwise- Pairwise- Content Content
abstraction levels Hierarchy Content analysis Content analysis
Metric Metric comparison comparison analysis analysis
components Process

Understanding Understanding Prioritizing Prioritizing Correlating Correlating Correlating Correlating Decision


Context of use
measurement measurement constructs constructs constructs constructs constructs constructs making

Explicit Explicit Thorough Thorough Thorough Thorough


Consistent Consistent Consistent
Benefit measurement measurement construct construct construct construct
judgement judgement judgement
interpretation interpretation classification classification classification classification

Tunnel vision Tunnel vision Large number Large number Tunnel vision Tunnel vision Tunnel vision Large number of
Drawback Tunnel vision bias
bias bias of evaluations of evaluations bias bias bias evaluations

Establishing Establishing
Countermeasure
Not applicable Not applicable selection selection Expert review Not applicable Not applicable Expert review Content rating
implemented
criteria criteria

Platform smartphone smartphone Feature phone Tablet Feature phone Smartphone, tablet smartphone smartphone Handhelds

Scope LUI LUI LUI, GUI LUI, GUI PUI, LUI, GUI LUI, GUI LUI, GUI GUI PUI, LUI, GUI


inspection.

Consequently, this study formulates a framework for usability conception by reviewing checklist-related bibliographic references. This approach has been demonstrated in previous studies by conducting a bibliographic search when constructing their checklist-based framework [6,9,24]. They restricted their search scope by covering only the relevant and most influential references. In contrast, this study exhaustively examined relevant bibliographic sources for possible quality criteria denoting usability, such as standards, guidelines, and requirements in their work. This study reviewed requirements up to the evaluation life cycle to obtain a comprehensive description of apps' usability [25]. This process, however, is restricted to mobile and app-related sources.

Eleven relevant bibliographic references were reviewed for possible usability criteria. As a result, a collection of 572 measures was compiled. Measures irrelevant in the context of app usability were excluded to ensure mobile-specific measurements. Table 2 highlights the distribution of redacted measures.

Table 2. Distribution of redacted measures.

Exclusion criteria | No. of excluded measures
Design-related measures (Ex: image size) | 34
Web-specific design elements (Ex: refresh button, wish list) | 15
User-impaired related measures (Ex: visually impaired) | 13
Game-based measures | 10
Physical user interface related measure (Ex: widget, soft keys, notification drawer) | 11
Programming related measures (Ex: naming convention) | 9
Performance-based measures (Ex: time taken, number of successful task) | 9
Input and output devices (Ex: Desktop based input hardware, wearable) | 8
Miscellaneous (Ex: conflicting measure between the same bibliographic reference, application purpose, design statement or fact) | 8
Cross-domain concern (Ex: user experience, interaction design) | 4
Technical-related measures (Ex: installation, system resource) | 3
Device specific features (Ex: shared device, tablet specific) | 3
No. of redacted measures | 127

Measures referring to desktop-based input devices such as mouse and keyboard; web-related user interfaces such as a link to related content, breadcrumb, and splash screen; physical user interfaces such as a widget, soft keys, and notification drawer; shared-device concerns; and performance-based checklists such as task completion time, loading time, download speed, and installation were removed from the collection. In addition, cross-domain concerns such as user experience and interaction design are excluded from the collection of candidate checklists. Technical and design aspects, such as naming convention and image size, which require coding inspection, are also removed. Any game-specific measure was removed due to the exceptional design objective, which distinguishes games from general apps such as banking, utilities, etc. [26,27].

Measures specifically for the impaired user, such as blind users, are also removed due to their exceptional design concerns. Conflicting measures within the same bibliographic reference were both excluded because they offer no concrete design decision. Measures referring to application purpose, e.g., "Application's purpose is understandable at first sight", were excluded, although they refer to the usability criterion of understandability. The rationale is that the measure does not contribute towards achieving user goals in operating an app and is thus irrelevant in representing usability for apps.

Additionally, given that the measures consist of different forms such as usability requirements, heuristics, checklists, guidelines, recommendations, and usability problems, it is not possible to review the bibliography in terms of quality criteria that share a similar meaning, the same name, or both. Instead, regardless of their original form, the measures were rephrased into a checklist.

These measures were reviewed using a content analysis technique to develop usability constructs for apps. Content analysis of the measures developed relevant emergent quality attributes and interface features, which later resulted in a paired usability checklist. Initially, a conceptual definition for each usability criterion and interface feature is established. The conceptual definitions are made as unambiguous as possible in the context of apps. Conceptually similar items and repeated items referring to the same usability criteria are grouped together and rephrased to homogenize the resulting usability checklist. In the case of conflicting items, items that coincide with other items are retained, and the conflicting items are excluded from the checklist pool. Finally, the usability criteria are examined for similarities and differences in terms of their design patterns. The usability criteria are then grouped under conceptual units of similar apps' design patterns and usability features.

4. An integrated usability evaluation framework

Characterizing usability solely on usability principles or usability attributes suffers from a lack of reflection on interface features in detail such as notification and interaction method, which is another aspect influencing mobile usability. On the other hand, depending solely on the UI component for the evaluation would be inappropriate for measuring the usability factor. In addition, considering apps' short time-to-market, where usability specialists are rarely involved during the usability evaluation, there is a need to support non-usability specialists in conducting reliable usability evaluations from their point of view. These suggest a mobile usability framework that integrates multiple evaluator viewpoints. However, this would result in different evaluation criteria, such as interface features and usability features, in contrast to usability specialists and developers, who mostly view usability in terms of usability heuristics and quality criteria. Figure 1 illustrates the conceptual framework.

The usability constructs are abstracted into three tiers of abstraction levels: usability feature level, usability criteria level, and interface feature level. Each abstraction level of the framework denotes a construct that consists of a group of framework components. The framework components are paired with the usability checklists for usability inspection.

Figure 1. The conceptual framework.


Each framework tier reflects the different viewpoints of the usability evaluator and their level of expertise. The identified interface feature serves as the framework measurement, which formed the interface feature component for the lowest abstraction level in the framework. The usability criteria are tied to the middle tier of the framework, the usability criteria component. Components for the top abstraction level, the usability features, were identified by formulating conceptual units with similar usability criteria. Figure 2 illustrates the framework abstraction level.

4.1 Usability features

Usability is commonly viewed by specialists in terms of constructs such as heuristics, principles, and guidelines, which are generally abstract. However, the mobile context of use, such as the interaction and operating environment, of the application on the intended platform has been regarded as an emergent property that affects usability [9,10,26]. Functional features of technology have been addressed in usability studies through design patterns [28-31]. The design pattern of app functionalities demonstrates the interaction complexities of smartphone apps. In this study, the design patterns are formulated as usability features to meet the viewpoint of usability specialists in conceptualizing usability as an emergent property of app interaction complexities. Table 3 presents the elicited usability features in this study.

The usability features level denotes a collection of smartphone characteristics. These features are characterized by the attributes in the usability criteria level. It is formulated to meet the usability specialist's viewpoint in conceptualizing usability as an emergent property of app interaction complexities. It serves as an evaluation basis for both 1) specialists who view usability in terms of design patterns and 2) non-usability specialists, such as developers and designers, who could benefit from understanding usability in terms of design functionalities, in conducting usability evaluation.

4.2 Usability criteria

Characterizing usability solely by either usability principles or usability attributes suffers from a lack of reflection on interface features in detail such as notification and interaction methods, which is another aspect influencing mobile usability. However, on

Figure 2. Framework abstraction level.

Table 3. Components of the usability features.


● F01 Interaction ● F04 Signifiers ● F07 Navigation ● F10 Data Entry
● F02 Notification ● F05 Aesthetic ● F08 Information Architecture ● F11 Workflow
● F03 Permission ● F06 Presentation ● F09 Search ● F12 Selection


the other hand, depending solely on the UI component for the evaluation would be inappropriate for measuring the usability factor. Therefore, this study bridged usability constructs, usability criteria, and interface features together with usability features.

The usability criteria level consists of a collection of usability attributes addressing the corresponding usability feature in the top tier. It emphasizes usability evaluation from a software engineering perspective. Table 4 lists the components of the usability criteria.

The label next to each usability criterion denotes the usability features they are associated with. The usability criteria and interface features in the next tier facilitate usability evaluation and the perception of the evaluator in the domain of software engineering and development.

4.3 Interface features

The interface feature level defines components that are tied to the usability criteria in the middle tier. This level facilitates technical evaluators (e.g., analysts, designers, etc.) who perceive usability in view of the design context approach. It evaluates usability in view of design elements. Table 5 lists the components of the interface features.

In the formulated framework, each usability feature is decomposed into several usability criteria (as in a one-to-many relationship). However, a usability criterion is tied to more than one checklist, assessing different UI elements. Likewise, it is also possible for the UI elements to be associated with more than one usability criteria (as in a many-to-many relationship). Table 6 exhibits the partial list of the paired usability checklist.

Table 4. Components of the usability criteria.


● Responsiveness (F01) ● Connectivity (F03) ● Readability (F06) ● Conciseness (F08)
● Interactivity (F01) ● Flexibility (F03) ● Relevance (F06) ● Structuredness (F08)
● Playability (F01) ● Security (F03) ● Accessibility (F06) ● Formality (F08)
● Ease of Use (F01) ● Visibility (F04) ● Trustworthy (F06) ● Effectiveness (F09)
● Safety (F01) ● Discoverability (F04) ● Navigability (F07) ● Accuracy (F10)
● Completeness (F02) ● Consistency (F05) ● Complexity (F07) ● Customisation (F10)
● Promptness (F02) ● Appropriateness (F05) ● Linkage (F07) ● Operability (F11)
● Reliability (F03) ● Familiarity (F05) ● Understandability (F08) ● Efficiency (F12)
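
The labels in Table 4 can be read mechanically into the one-to-many relationship between usability features and usability criteria. The following is a minimal sketch, assuming the feature codes from Table 3 and a handful of labelled criteria copied from Table 4; the names `FEATURES`, `LABELLED_CRITERIA`, and `group_by_feature` are hypothetical and are not part of the framework itself.

```python
import re
from collections import defaultdict

# Feature codes and names as listed in Table 3.
FEATURES = {
    "F01": "Interaction", "F02": "Notification", "F03": "Permission",
    "F04": "Signifiers", "F05": "Aesthetic", "F06": "Presentation",
    "F07": "Navigation", "F08": "Information Architecture",
    "F09": "Search", "F10": "Data Entry", "F11": "Workflow", "F12": "Selection",
}

# A subset of the labelled criteria from Table 4 (not the full list).
LABELLED_CRITERIA = [
    "Responsiveness (F01)", "Interactivity (F01)", "Ease of Use (F01)",
    "Completeness (F02)", "Promptness (F02)",
    "Visibility (F04)", "Discoverability (F04)",
    "Navigability (F07)", "Linkage (F07)",
]

# "<criterion name> (<feature code>)", e.g. "Ease of Use (F01)".
LABEL = re.compile(r"^(?P<criterion>.+?)\s*\((?P<code>F\d{2})\)$")


def group_by_feature(labels):
    """Map each usability feature name to the criteria labelled with its code."""
    grouped = defaultdict(list)
    for label in labels:
        match = LABEL.match(label)
        if match is None:
            raise ValueError(f"unrecognised label: {label!r}")
        grouped[FEATURES[match["code"]]].append(match["criterion"])
    return dict(grouped)


criteria_by_feature = group_by_feature(LABELLED_CRITERIA)
```

Keeping the code-to-name table separate from the labelled criteria means a criterion that carries an unknown feature code fails loudly with a `KeyError` instead of being grouped silently.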

Table 5. Components of the interface features.


● Action ● Button ● Icons
● Spinner
● Content ● Color ● Layout
● Snackbars
● Menus ● Default ● List
● Switches
● Layout ● Dialogue ● Navigation drawer
● System bar
● Steppers ● Expansion panels ● Picker
● Tabs
● Media ● Gestures ● Progress bar
● Text fields
● Action bar ● Grid list ● Slider sub-headers
● Typography
● Activity bar / circle ● Indicator ● Sub-screen


Table 6. Partial list of the paired usability checklist.

Usability features | Usability criteria | Interface features | Checklist items
Navigation | Navigability | Action bar | Tabs or spinner in the top bar is used for quick view change
Navigation | Linkage | Action bar | Shortcut to most frequent task is provided
Notification | Completeness | Dialogue | There are at most 3 possible actions in a notification
Notification | Promptness | Dialogue | Notification is not created if it is possible for the app to recover from the error without user action
Presentation | Accessibility | Content | Content is structurally separated from navigational elements
Presentation | Relevance | Action bar | Unavailable action in the current context is hidden instead of disabled
Permission | Flexibility | Snackbars | The app allows to revert accidental activation
Permission | Security | Content | User's data are kept private and safe (encrypted in the event of loss or malfunction)
Signifiers | Visibility | Button | The UI Buttons are visible
Signifiers | Discoverability | Gestures | The user interface gives visual clues if something can be used with Pinch-To-Zoom gesture
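
The pairing in Table 6 can be sketched as a small data structure to make the many-to-many relationship between usability criteria and interface features concrete. This is only an illustration under stated assumptions: the class `ChecklistItem`, the list `ITEMS`, and the helper `criteria_by_interface_feature` are hypothetical names, and the rows are a subset of Table 6 with the checklist wording copied as printed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    """One row of the paired usability checklist (columns as in Table 6)."""
    feature: str            # top tier: usability feature
    criterion: str          # middle tier: usability criterion
    interface_feature: str  # bottom tier: interface feature
    text: str               # the checklist item itself


# A few rows taken from Table 6.
ITEMS = [
    ChecklistItem("Navigation", "Navigability", "Action bar",
                  "Tabs or spinner in the top bar is used for quick view change"),
    ChecklistItem("Navigation", "Linkage", "Action bar",
                  "Shortcut to most frequent task is provided"),
    ChecklistItem("Presentation", "Relevance", "Action bar",
                  "Unavailable action in the current context is hidden instead of disabled"),
    ChecklistItem("Presentation", "Accessibility", "Content",
                  "Content is structurally separated from navigational elements"),
    ChecklistItem("Permission", "Security", "Content",
                  "User's data are kept private and safe"),
]


def criteria_by_interface_feature(items):
    """Collect, for every interface feature, the set of criteria it is checked against."""
    index = defaultdict(set)
    for item in items:
        index[item.interface_feature].add(item.criterion)
    return dict(index)


index = criteria_by_interface_feature(ITEMS)
```

In this subset the action bar is assessed under three different criteria, which is the interface-feature side of the many-to-many relationship described in Section 4.3; indexing the same rows by `criterion` instead would expose the other side once the full checklist is loaded.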

5. Evaluating the framework's usefulness

In our previous work, we validated the comprehensiveness of the framework components among academicians in Malaysia's public universities [32]. The components were refined based on the survey responses. Subsequently, the components were evaluated for their feasibility in real practice among software engineering practitioners in Malaysia and refined once again based on the survey response [33]. In this paper, we conducted an expert review and a semi-structured interview to evaluate the framework's usefulness in comparison to existing usability evaluation frameworks.

Usefulness is characterized in most usability studies as a composition of usability and as-is utility [34,35]. Likewise, available usefulness questionnaires (e.g., USE and TAM) measure usefulness in the same dimension. The dimension includes a composition of several usability criteria, such as ease of use, learnability, and satisfaction, in addition to as-is utility. This section demonstrates the framework's evaluation in terms of its usefulness in comparison to the selected study.

The usability evaluation framework to be compared to the one from this study was selected through an exhaustive search of existing work on online databases subscribed to by Universiti Putra Malaysia (UPM) and accessed publications. The search was performed using Google Scholar to review the recently proposed checklist-based frameworks published during the development of the framework in this study. The query returned 424 results in the English language. Any matching results that have been adopted in developing the framework in this study are omitted to avoid bias. Subsequently, publications on checklist-based frameworks were filtered for selection. The process ends with two relevant search results. Since the work of Joseph [36] is more about usability heuristics, we have selected the work of Thitichaimongkhol and Senivongse [37] as a comparison against the formulated framework.

Methods and material

Prior to the evaluation, the participants were given a demographic form to record their background experience, the specifications of the smartphone used during the evaluation, such as brand and operating system, and their experience using apps in the dominant category in the marketplace.

Evaluating the entire framework measurement (373 checklists) from this study in comparison with the previous work is inefficient in terms of time and resources. Therefore, the evaluation scope covers usability measurements from both sets that match the


ISO/IEC 25010 product quality model. Although the usability criteria corresponding to both checklist sets for the evaluation have a different name compared to the ISO/IEC 25010 quality criteria, the conceptual definition for the corresponding usability criteria shares the same description as the ISO/IEC 25010 usability criteria.

Three apps from different categories commonly used by Malaysians (from survey responses in our previous study) are selected from the Play Store. Task analysis is performed on the apps to identify the primary task and the interface feature associated with each task. Usability criteria from both studies (this study and the other) corresponding to the interface features associated with the primary task are selected for the heuristic walkthrough. Figure 3 exhibits an excerpt from the identified checklist from this study, corresponding to the usability criteria from ISO/IEC 25010 and interface features of the primary task for Lazada apps.

Likewise, the checklist for Set 2 is prepared using the same task and interface features of the same apps, corresponding to the same usability criteria as in Set 1. Figure 4 exhibits an excerpt from the identified checklist from another study, corresponding to the usability criteria from ISO/IEC 25010 and interface features of the primary task for Lazada apps.

The participants were given two sets of checklists (76 items from the formulated framework and 39 items from the other framework), which correspond to the ISO/IEC 25010 usability criteria and interface features of primary tasks from selected apps. The evaluators were required to perform a heuristic walkthrough on three apps (Google+, Viber, and Lazada) using both checklist sets. Subsequently, they are required to review both checklist sets. The checklist sets are given in random order. The first evaluator is given Set 1, followed by Set 2. Meanwhile, the next evaluator is given Set 2, followed by Set 1.

Finally, both frameworks were rated for their usefulness using the USE questionnaires. The evaluators were given two sets of USE questionnaires, one for each framework. The questionnaire includes 30 checklist items on a 7-point Likert scale. The scale ranges from 1 (strongly disagree) to 7 (strongly agree). The resulting USE score was analyzed using paired t-test to determine if there was any difference between the compared frameworks. A post hoc test, Cohen's D, is used to investigate the effect size on the significance of the compared framework. Equation (1) explains Cohen's D measure of effect size.
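
The analysis described above can be sketched with the Python standard library. This is a minimal illustration rather than the authors' procedure: the function names are hypothetical, the two score lists are made-up placeholders (not the study's data), and the effect size is computed as the difference of the two sample means divided by the square root of the average of the two sample variances, which is how Equation (1) is read here. The paired t statistic is the usual mean of the paired differences divided by its standard error.

```python
import math
import statistics


def cohens_d(first, second):
    """Effect size: (mean2 - mean1) / sqrt((S1^2 + S2^2) / 2)."""
    pooled_sd = math.sqrt(
        (statistics.variance(first) + statistics.variance(second)) / 2
    )
    return (statistics.mean(second) - statistics.mean(first)) / pooled_sd


def paired_t(first, second):
    """Paired t statistic (degrees of freedom = n - 1)."""
    diffs = [b - a for a, b in zip(first, second)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def effect_label(d):
    """Thresholds quoted in the text; 'negligible' below 0.2 is an added assumption."""
    size = abs(d)
    if size >= 1.3:
        return "very large"
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical overall USE totals for five evaluators (30 items, each rated 1 to 7,
# so a total lies between 30 and 210). These values are placeholders only.
hypothetical_other = [150, 162, 171, 158, 166]
hypothetical_formulated = [176, 185, 190, 172, 180]

d = cohens_d(hypothetical_other, hypothetical_formulated)
t = paired_t(hypothetical_other, hypothetical_formulated)
print(f"d = {d:.2f} ({effect_label(d)}), t = {t:.2f}")
```

The t statistic still has to be compared against the critical value of the t distribution with n - 1 degrees of freedom at the chosen significance level; a statistics package would normally report the corresponding p-value directly.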

Figure 3. Checklist for Set 1.


Figure 4. Checklist for Set 2.

$$ d = \frac{\bar{x}_2 - \bar{x}_1}{\sqrt{\dfrac{S_1^2 + S_2^2}{2}}} \qquad (1) $$

where $\bar{x}$ = sample mean, $S$ = sample standard deviation.

Effect size is categorized into small, medium, large, and very large impacts [38]. An effect size of 0.2 indicates a small magnitude of the effect. A medium effect size lies in the vicinity of 0.5, and a large effect size starts from 0.8. Meanwhile, a very large effect size is indicated by values larger than or equal to 1.3.

A semi-structured interview is conducted after the experiment to clarify why the experts rated one framework better than the other. The identity of each treatment, which framework was the one from this study, and which framework was from the previous study were not revealed until the end. The rationale is to have an expert's honest opinion on the framework.

6. Results and discussions

We approached eleven industrial experts, ranging from mobile developers to mobile testers and UX designers. However, five of them repeatedly rescheduled the deadline to complete the evaluation and failed to complete the requested evaluation even after more than three follow-up reminders.

Only six of the experts managed to complete the experiment. A difference in the overall USE scores rated by the six experts for both frameworks was computed. However, one of them gave an unreliable rating, even after reviewing the score given. It was not feasible to set up an upfront meeting with that expert. The expert's background profile showed that this expert is the only participant to select gaming apps as the most frequently used apps. Since gaming apps have different designs and purposes compared to other categories of apps, the expert's perception of usability might skew away from the other five participants, who were not familiar with mobile gaming. Thus, it is reasonable that the expert gave a contradictory score compared to the other participants. Therefore, the response by this expert was excluded from the analysis. The overall USE score collected from each expert is analyzed to evaluate the usefulness of the frameworks. Table 7 exhibits the mean differences in the overall USE score for both treatments.

Table 7. Overall USE score for both frameworks.
Paired Samples Statistics

A paired t-test is used to determine the significance of the usefulness score at a significance level of 0.05. The distribution of the USE score differences between both compared groups of experts is normally distributed, thus making it appropriate for conducting a paired t-test. The mean indicates that the five experts (N = 5) gave a larger USE score for the formulated framework (mean = 180.60) compared to the other
and failed to complete Paired evaluation
Samples Statistics
Mean theNrequested Std. deviation even
Std. error mean framework (mean = 156.60). In addition, a smaller
Formulated framework USE score 180.60 5 10.991Mean N4.915Std. deviation Std. error mean
after more
Previous framework USE Pair 1 than
score
three
Formulated
156.60
follow-up
framework reminders.
5 USE score 15.143 180.60 5
6.772 standard
10.991deviation 4.915 compared to the other framework
Previous framework USE score 156.60 5 15.143 6.772
Only six of the experts managed to complete the indicates that the USE scores among the experts
paired t-test is used to determine the significance of the usefulness score with a p-value of
distribution of theexperiment.
USE scoreA pairedA difference
t-test is used
differences betweenin the
bothoverall
to determine groups USE
the significance
of scores
experts were more
isof normally
the usefulness scoreconsistent
with a p-valuein the
of formulated framework.
0.05. Thefordistribution ofa paired
the USE score differences between bothfivegroups of experts is normally
rated
thus making it appropriate by the six
conductingexperts for both
t-test. Theframeworks
mean indicates was
that the Tablemean 8 exhibits the results of the paired t-test.
= 5) gave a larger distributed,
USE score for thusthemaking
formulatedit appropriate
framework for (mean
conducting a paired
= 180.60) t-test. The
compared to indicates that the five
computed.
experts (N = However,
5) gave a one
larger USEof them
score gives
for the
amework (mean = 156.60). In addition, a smaller standard deviation compared to the other an unreli-
formulated framework USE
(mean =scores
180.60) for the formulated
compared to framework are
indicates that the the
USEother framework
scores among the(mean = 156.60).
experts In addition,
were more a smaller
consistent in the standard deviation compared to the other
formulated
framework
Table 8 exhibits the results ofindicates that
the paired the USE scores among the experts were more consistent in the formulated
t-test.
framework. Table 8 exhibits the results of the paired t-test. 68
Table 8. Results of the paired t-test.
Paired samples test Table 8. Results of the paired t-test.
Paired samples test
Journal of Computer Science Research | Volume 05 | Issue 03 | July 2023

Table 7. Overall USE score for both frameworks.


Paired Samples Statistics
Mean N Std. deviation Std. error mean
Formulated framework USE score 180.60 5 10.991 4.915
Pair 1
Previous framework USE score 156.60 5 15.143 6.772

Table 8. Results of the paired t-test.


Paired samples test
Paired differences Sig.
t df
Mean Std. deviation Std. error mean (1-tailed)
Formulated framework USE score –
Pair 1 24.000 20.273 9.066 2.647 4 .0286
Previous framework USE score

24 points higher (mean paired difference = 24) than USE scores for the previous framework. There is enough evidence to claim that the mean USE score given by the experts for the formulated framework is greater than that for the previous framework, t(4) = 2.647, p = 0.0286. Thus, the null hypothesis is rejected since the p-value is less than 0.05. Figure 5 illustrates the position of the calculated t statistic (within the H0 rejection region), the t-value, and the p-value in a graph.

Figure 5. Position of calculated t statistics.

The learning experience gained in conducting the heuristic walkthrough using the checklist facilitates comprehension of the framework measurement, thus ensuring reliable scoring of the framework's usefulness. However, although the result indicates that the difference in USE scores between the two frameworks is unlikely to have occurred by chance, it does not explain the magnitude of the effect of the treatment (the formulated framework) over the other framework. Cohen's d is therefore used as a post hoc measure of the effect size. The result of 1.81 indicates a very large effect size, implying a meaningful difference in the USE score between both frameworks.

Discussions

The demographic distribution of the experts indicates that they were appropriate participants to evaluate the formulated framework measurement from the perspective of developers. Only two of the experts are experienced usability practitioners with seven years or more of experience in the field. The two were of different genders and used different mobile operating systems. Additionally, the experts span different age groups, from their twenties to their forties.

The formulated framework proved to be more useful than the previously proposed framework. The semi-structured interview revealed that the experts came to a consensus, agreeing that the formulated framework is more useful for usability evaluation compared to the other framework. The reason lies in the fact that the formulated framework measurement is much simpler, UI-oriented, and less ambiguous for experts, both developers and usability practitioners. Both frameworks were compared using the same baseline measurement: the ISO/IEC 25010 product quality model (usability component) in conjunction with a common interface feature for the primary task in the evaluated apps. Nevertheless, the learning gained from experiencing the framework measurement reveals the usefulness of each framework.

The formulated framework was criticized for its large number of checklists. However, it is not practical to inspect every available criterion. Usability evaluation is commonly conducted based on an
evaluation plan established beforehand, which determines the criteria to be evaluated during an inspection. The formulated framework came in handy due to its support for usability evaluators of different backgrounds through the abstraction level. In fact, restricting usability evaluation to the usability criteria of interest will eventually reduce the number of checklists to be used during the usability evaluation.

7. Threats to validity

Threats are inevitable yet manageable in research. In this section, threats to internal, external, conclusion, and construct validity are discussed. The selection of associated usability criteria tied to UI elements was determined by adopting the ISO/IEC 25010 product quality model (usability component) as a benchmark criterion in comparing the formulated framework with previously proposed frameworks. However, the experiment is still vulnerable to the order effect. Thus, when the experiment was repeated with the comparison framework, two sets of evaluation plans representing the formulated framework and the previously proposed framework were given to the evaluator in random order.

Regarding the external threat, the respondent's expertise and experience in using the evaluated app might affect the validity of the result. The respondents consist of field experts from various branches of software engineering disciplines and app development stages, with different years of experience. In addition, they might use their experience of using a particular type of app, e.g., transactional, communication, or games, as a benchmark in scoring UI elements. Altogether, the respondents might perceive usability differently based on their background of expertise and experience with the app, thus affecting their subjective judgement. These threats are controlled through two countermeasures. Firstly, a conceptual definition of the evaluated interface feature was established. There is a possibility that an interface feature is recognized by a different name in academia and industry. For example, a drop-down menu is well-known in desktop computing. On some occasions, it is used as a jump menu. However, in mobile computing, the drop-down menu is recognized as a list. In addition, the jump menu is known as a spinner in mobile computing. Furthermore, elements such as sub-screen and gesture are interface features that are absent in desktop computing and could be interpreted differently by individuals. This necessitates a further description of a UI element's operation or behavior in view of desktop computing to assist an inexperienced evaluator in this case. Secondly, the experiment is designed as a repeated measure to reduce variability across participants.

The main threat to the conclusion validity of the result is statistical power. This threat is alleviated by applying the most common statistical test appropriate for a within-subject design. Moreover, the significance level was 5%. Hence, the chance of a Type I error is small.

A checklist from the previous study is used in comparison to the checklist in this study to manage construct validity. The scores of both checklists in measuring the ISO/IEC 25010 product quality model (usability component) were correlated in conjunction with the use of an established questionnaire to measure the framework's usefulness. In addition, a well-established usability questionnaire was carefully selected for this study to measure usefulness appropriately.

8. Conclusions

This study empirically evaluates the usefulness of an integrated usability evaluation framework through an expert review. The framework measurement is reviewed and compared against a framework from another study. Both frameworks were compared based on the ISO/IEC 25010 product quality model (usability component). Hypothesis testing was conducted to investigate the significance and effect size of the response from the expert review. The results of the statistical test showed that the formulated framework had a statistically significant and large effect and was more useful compared to the other framework. In the future, we plan to improve the effectiveness of this framework by comparing the results of using it
in usability testing against usability inspection. The rationale is to alleviate a possible false alarm in the formulated framework measurement and capture the true usability problem. Consequently, an additional checklist could be proposed based on the usability testing result to complement the developed usability measurement.

Author Contributions

H. R. conceived the idea and study of proposing a usability evaluation framework for mobile apps that incorporates the usability criteria and interface features in conjunction with different evaluator viewpoints into a framework abstraction level. H. Z., A. K. and the late A. A. A. A. served as H.R.'s supervisor and co-supervisors on her Ph.D. thesis at the Universiti Putra Malaysia. All authors reviewed and approved the final manuscript.

Conflict of Interest

There is no conflict of interest.

Funding

This research was partially funded by the Research University Grant Scheme (RUGS), Universiti Putra Malaysia (UPM).
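The statistics reported in Tables 7 and 8 can be cross-checked from the published summary values alone. The following is a minimal sketch, not the authors' analysis script: the variable and function names are illustrative, the p-value is obtained by numerically integrating the Student's t density rather than by a statistics package, and the effect-size formula (mean difference divided by the root mean square of the two standard deviations) is an assumption, chosen because it reproduces the reported value of 1.81. The paper does not state which variant of Cohen's d was used.

```python
import math

# Summary statistics as reported in Tables 7 and 8 (N = 5 experts).
n = 5
mean_formulated, sd_formulated = 180.60, 10.991
mean_previous, sd_previous = 156.60, 15.143
sd_diff = 20.273  # standard deviation of the paired differences (Table 8)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=10_000):
    """One-tailed p-value P(T > t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Paired t-test from the summary values.
mean_diff = mean_formulated - mean_previous
se_diff = sd_diff / math.sqrt(n)  # standard error of the mean difference
t_stat = mean_diff / se_diff
df = n - 1
p_one_tailed = upper_tail_p(t_stat, df)

# Effect size. ASSUMPTION: d = mean difference / sqrt((sd1^2 + sd2^2) / 2).
sd_avg = math.sqrt((sd_formulated ** 2 + sd_previous ** 2) / 2)
cohens_d = mean_diff / sd_avg

print(f"standard error of the difference = {se_diff:.3f}")
print(f"t({df}) = {t_stat:.3f}, one-tailed p = {p_one_tailed:.4f}")
print(f"Cohen's d (assumed formula) = {cohens_d:.2f}")
```

Under these assumptions the sketch agrees with the reported t(4) = 2.647, one-tailed p = 0.0286, and effect size of 1.81. Dividing the mean difference by the standard deviation of the paired differences instead (24 / 20.273) would give roughly 1.18, so the choice of formula matters when the effect size is compared against a threshold.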