Article
Efficient Ontology Meta-Matching Based on Interpolation
Model Assisted Evolutionary Algorithm
Xingsi Xue 1, * , Qi Wu 2 , Miao Ye 3 and Jianhui Lv 4
1 Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology,
Fuzhou 350118, China
2 College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030000, China
3 School of Information and Communication, Guilin University of Electronic Technology, Guilin 540014, China
4 Pengcheng Laboratory, Shenzhen 518038, China
* Correspondence: [email protected]
Abstract: Ontology is the kernel technique of the Semantic Web (SW), which models domain knowledge in a formal and machine-understandable way. To enable communication between different ontologies, the cutting-edge technique is to determine the heterogeneous entity mappings through the ontology matching process. During this procedure, it is of utmost importance to integrate different similarity measures to distinguish heterogeneous entity correspondences. Finding the most appropriate aggregating weights to enhance the ontology alignment's quality is called the ontology meta-matching problem, and recently, the Evolutionary Algorithm (EA) has become a popular methodology for addressing it. The classic EA-based meta-matching technique evaluates each individual by traversing the reference alignment, which increases the computational complexity and the algorithm's running time. To overcome this drawback, an Interpolation Model assisted EA (EA-IM) is proposed, which introduces the IM to predict the fitness value of each newly generated individual. In particular, we first divide the feasible region into several uniform sub-regions using the lattice design method, and then precisely evaluate the Interpolating Individuals (INIDs). On this basis, an IM is constructed for each new individual to forecast its fitness value with the help of its neighborhood. To test EA-IM's performance, we use the Ontology Alignment Evaluation Initiative (OAEI) Benchmark in the experiment. The final results show that EA-IM is capable of improving EA's search efficiency without sacrificing the solution's quality, and the alignment's f-measure values of EA-IM are better than those of OAEI's participants.

Keywords: ontology matching; evolutionary algorithm; interpolation model; lattice design
“coronaviruses” is a subclass of RNA viruses. An instance is represented with a circular oval, e.g., “glucocorticoids” is an instance of the class medicine.
With the rapid growth of ontologies, an ontology might own thousands or even more entities, and their semantic relationships become more and more complicated [9], which in turn makes the ontology matching process very complicated. In this process, how to measure the similarity of two entities to distinguish the accurate matching elements is a key step, which is usually addressed by similarity measures. A similarity measure calculates the degree of similarity of two entities; such measures can be divided into two broad categories, based respectively on two entities' syntax information and linguistic information.
Different similarity measures have their own advantages, disadvantages, and applicable scopes. Since using only one similarity measure is not enough to obtain satisfying ontology matching results, it is required to integrate multiple measures to enhance the result's confidence. Ontology meta-matching investigates how to find the optimal integrating weights for similarity measures to improve the ontology alignment's quality [10], which is an open challenge due to the complex heterogeneous context of entities and the high computational complexity of the matching process [11]. The Evolutionary Algorithm (EA) has become a popular methodology for addressing the ontology meta-matching problem [12–14], due to the following two characteristics: (1) the potential parallel search mechanism enables EA to effectively explore all the feasible regions; (2) the strong exploration ability helps prevent the algorithm from falling into a local optimum and lets it converge to the global optimum.
With respect to the EA-based ontology meta-matching technique, the population's evaluation is critical for its performance. However, an expensive evaluation, i.e., one in which evaluating an individual requires large computational resources, deteriorates the algorithm's performance. In our empirical experiment, the classic EA might take about 30 s to evaluate an individual's fitness. To improve the algorithm's efficiency, this work proposes an Interpolation Model assisted EA (EA-IM), which forecasts a newly generated individual's fitness value with a problem-specific IM to save running time. In particular, we first use the lattice design method [15] to divide the feasible region into several uniform sub-regions and precisely evaluate the representative solutions. After determining which region a newly generated individual lies in, an IM is built from its neighborhood to calculate its fitness value. The contributions made in this work are as follows:
• a mathematical optimization model for the ontology meta-matching problem is constructed;
• a binomial IM based on lattice design is presented to forecast the fitness of the individuals, which is constructed according to the relationship between an ontology alignment's two evaluation metrics;
• an EA-IM is proposed to efficiently address the ontology meta-matching problem.
The rest of the paper is organized as follows: Section 2 presents the related work on ontology meta-matching; Section 3 gives the definitions of ontology matching and the similarity measures; Section 4 presents the construction of the Interpolation Model (IM) and the IM-assisted EA; Section 5 shows the experimental results; Section 6 draws the conclusion.
2. Related Work
A similarity measure determines to what extent two entities are similar, and the combination of multiple similarity measures can enhance the quality of an alignment. Ontology meta-matching is dedicated to investigating how to find the integrating weights of similarity measures that enhance the ontology alignment's quality. EA is an outstanding algorithm for the ontology meta-matching problem due to its parallel search mechanism and strong exploration ability, and in recent years much work on EA-based ontology meta-matching techniques has been carried out. Next, we review the EA-based ontology meta-matching techniques in chronological order.
Naya et al. [16] first introduced EA into the field of ontology meta-matching to
enhance ontology alignment’s quality. They investigated how to use EA to aggregate
multiple similarity measures to optimize the quality of matching results. Starting from
the initial population, each individual represented a particular combination of measures, and the algorithm iterated to generate the best combination. This work was impressive for the development of ontology meta-matching study. Martinez-Gil et al. [17] also proposed an EA-based approach to address the ontology meta-matching problem, called Genetics for Ontology Alignments (GOAL). Specifically, GOAL described the feasible domain as parameters that were encoded as a chromosome, and the authors devised a way to translate the decimal numbers into a set of floating-point numbers in the range [0, 1]. The authors then constructed a fitness function to select which individuals in the population were more likely to be retained. The experiment proved that GOAL had better scalability and could optimize the matching process. To effectively optimize the
weight of similarity aggregation without knowing the ontology features, Giovanni et al. [18]
proposed Memetic Algorithm (MA) to perform the ontology meta-matching to find the sub-
optimal alignments. Specifically, MA brought a local search strategy into EA's evolutionary process, improving the converging speed while ensuring the quality of the solution. This
work had shown that the memetic method was an effective way of improving the classic EA-
based meta-matching techniques. On this basis, Giovanni et al. [19] proposed an ontology
alignment system based on MA, which adjusted its specific instance parameters adaptively
with the FML-based fuzzy adjustment to improve the algorithm’s performance. To match
several pairs of ontologies at the same time, and overcome the shortcomings of f-measure,
Xue et al. [20] proposed MatchFmeasure, a rough evaluation metric that does not require a reference alignment, and the Uniform Improvement Ratio (UIR), a metric to complement MatchFmeasure. This method was able to align multiple pairs of ontologies simultaneously and avoided biased improvements on the solutions. To further enhance the efficiency of the ontology
meta-matching process, the Compact EA (CEA) was proposed and used to optimize the
aggregating weights [21]. Experimental results showed that CEA could greatly reduce the
running time and increase the efficiency. Later on, Parallel CEA (PCEA) [22] was presented
to address the meta-matching problem, which combined the parallel technique and compact
encoding mechanism. Compared with CEA, PCEA could further decrease the execution time and main memory consumption of the tuning process, without sacrificing the quality of
alignment. Lv et al. [23] proposed a new meta-matching technique for ontology alignment with grasshopper optimization (GSOOM), which used the Grasshopper Optimization Algorithm (GOA) to find the correspondences between the source ontology and the target ontology by optimizing the weights of multiple similarity measures. They modeled the ontology meta-matching problem as a GOA individual fitness optimization problem
with two objective functions. More recently, Lv et al. [24] introduced an adaptive selection
strategy to overcome the premature convergence, which was able to dynamically adjust
the selection pressure of the population by changing individual fitness values.
One of the drawbacks that makes the existing EA-based matching techniques unable to be widely used in practical scenarios is their solving efficiency, i.e., they need a long running time to find the final alignment, especially when evaluating the population. In this work, to address the issue of expensive evaluation, an EA-IM based ontology meta-matching technique is proposed, which makes use of a problem-specific IM to save the algorithm's running time. In particular, the lattice design is introduced to divide the feasible region into several parts, which ensures the accuracy of the approximate evaluation.
3. Preliminaries
3.1. Ontology, Ontology Alignment and Ontology Matching Process
In this work, ontology is defined as follows:
Definition 1. An ontology can be seen as a 6-tuple O = (C, P, I, ϕCP, ϕCI, ϕPI) [25], where:
• C is a nonempty set of classes;
• P is a nonempty set of properties;
• I is a nonempty set of instances;
• ϕCP : P → C × C associates a property p ∈ P with two classes;
• ϕCI : C → φ(I) associates a class c ∈ C with a subset of I, which represents the instances of the concept c;
• ϕPI : P → φ(I²) associates a property p ∈ P with a subset of the Cartesian product I × I, which represents the pairs of instances related through the property p.
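For illustration only, the 6-tuple of Definition 1 can be held in a small container such as the following Python sketch (the field names are ours and not part of the formalism):

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    """A minimal container mirroring Definition 1; names are illustrative."""
    classes: set = field(default_factory=set)      # C
    properties: set = field(default_factory=set)   # P
    instances: set = field(default_factory=set)    # I
    phi_cp: dict = field(default_factory=dict)     # p -> (c1, c2)
    phi_ci: dict = field(default_factory=dict)     # c -> subset of I
    phi_pi: dict = field(default_factory=dict)     # p -> subset of I x I
```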
To address the ontology heterogeneity issue, the most common method is executing the
ontology matching process to determine ontology alignment, which is defined as follows:
Definition 2. Given two ontologies O1 and O2, an ontology alignment is a set of matched elements, where a matched element can be seen as a 5-tuple (id, e1, e2, confidence, relation):
• id is the identifier of the matched element;
• e1 and e2 are entities of ontology O1 and O2, respectively;
• confidence is the confidence value of the matched element (generally in the range [0, 1]);
• relation represents the matching relation between the entities e1 and e2, such as an equivalence relation or a generalization relation.
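Such matched elements map naturally onto a record type; a minimal sketch (field names ours):

```python
from typing import NamedTuple

class Correspondence(NamedTuple):
    """A matched element as in Definition 2."""
    id: str
    e1: str             # entity from O1
    e2: str             # entity from O2
    confidence: float   # generally in [0, 1]
    relation: str       # e.g., "=" (equivalence) or "<" (generalization)

# An ontology alignment is then a set of such tuples.
```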
Figure 3 shows an illustration of two heterogeneous ontologies and their alignment. These two ontologies contain descriptions of concepts, properties, and instances, and the concepts also have inclusion relationships. In this figure, a class is depicted as a rectangle with rounded corners, e.g., class “Chairman” is a specialization (subclass) of class “Person”. The relations between entities include equivalence and inclusion; an entity correspondence is denoted by a thick arrow that links an entity of O1 with an entity of O2 and is labeled with the relationship reflected by the correspondence, e.g., “Author” in O1 is more general than “Regular author” in O2, while “SubjectArea” in O1 and “Topic” in O2 are a pair of heterogeneous entities that are equivalent. An entity is connected with its attributes by dotted lines, e.g., “has email” is a property of the entity “Human” which is defined on the string field.
Measuring two entities' similarity to distinguish the correct entity correspondences is critical for the ontology matching process. A similarity measure can be used to evaluate the similarity value of two entities to distinguish the correct matching elements. In the ontology matching domain, syntax-based and linguistic-based similarity measures are frequently used [27]; in this work, we select two syntax-based similarity measures, i.e., SMOA [28] and N-Gram [29], and one linguistic-based similarity measure, i.e., the Wu and Palmer method [30].
SMOA calculates two strings' similarity by taking into account both their commonalities and their differences, as defined in Equation (1):

SMOA(r1, r2) = com(r1, r2) − dif(r1, r2) + winklerImpr(r1, r2)    (1)

where com(r1, r2) is the commonality between the two strings r1 and r2, dif(r1, r2) is their difference, and winklerImpr(r1, r2) is the result's optimization using the method introduced by Winkler.
Specifically, com(r1, r2) iteratively extracts the maximum common character substring between the strings r1 and r2 until no common character substring remains. Whenever a maximum common character substring is found, it is removed from both strings, and the search continues for the next maximum common character substring. Finally, the total length of the common substrings found is divided by the sum of the lengths of the strings r1 and r2 to get the commonality between them. In particular, their commonality is defined as follows:

com(r1, r2) = 2 × Σ_i |maxComString_i| / (|r1| + |r2|)    (2)
where maxComString_i is the i-th longest common substring between r1 and r2, and |r1| and |r2| are the lengths of r1 and r2. dif(r1, r2) is determined by the length of the character substrings that do not match in the first iteration of com(r1, r2), and is defined in Equation (3):

dif(r1, r2) = (1/2) × (|d(r1)| · |d(r2)|) / (p + (1 − p) × (|d(r1)| + |d(r2)|) − |d(r1)| · |d(r2)|)    (3)

where d(r1) = (|r1| − |maxComString_i|)/|r1| and d(r2) = (|r2| − |maxComString_i|)/|r2|, respectively, and p is a parameter used to assign a different importance to the difference component of SMOA (typically p = 0.6). Next, we show an example of calculating the SMOA value between the two strings “14522345345667890” and “1234567890”. First, their longest common substring is “67890”, and thus |maxComString_i| = 5. The lengths |r1| and |r2| are 17 and 10, respectively, and the values of com(r1, r2) and winklerImpr(r1, r2) are 0.38 and 0.68, respectively. Since |d(r1)| = 0.71 and |d(r2)| = 0.5, we obtain dif(r1, r2) = 0.24 according to Equation (3). Finally, the two strings' SMOA value is 0.38 − 0.24 + 0.68 = 0.82 according to Equation (1).
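The following Python sketch reproduces this computation. Two caveats: the winklerImpr term is omitted because its definition is not reproduced here, and, following the worked example, only the first (longest) common substring is used, although Equation (2) iterates in general:

```python
def longest_common_substring(s1: str, s2: str) -> str:
    """Dynamic-programming search for the longest common substring."""
    best = ""
    table = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            if s1[i - 1] == s2[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
                if table[i][j] > len(best):
                    best = s1[i - table[i][j]:i]
    return best

def smoa(r1: str, r2: str, p: float = 0.6) -> float:
    m = len(longest_common_substring(r1, r2))
    com = 2 * m / (len(r1) + len(r2))                          # Equation (2)
    d1 = (len(r1) - m) / len(r1)                               # unmatched share of r1
    d2 = (len(r2) - m) / len(r2)                               # unmatched share of r2
    dif = 0.5 * d1 * d2 / (p + (1 - p) * (d1 + d2) - d1 * d2)  # Equation (3)
    return com - dif                                # Equation (1), Winkler term omitted

print(round(smoa("14522345345667890", "1234567890"), 2))
# 0.13; the paper adds the winklerImpr value of 0.68 to reach the reported 0.82
```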
According to [31], N-gram is also a good syntax-based similarity measure because it is able to analyze the similarity between two strings at a fine granularity. Given a string, its N-grams are the segments of the original word sliced by length N, that is, all the N-length substrings of the string. Given two strings and their N-grams, the N-gram similarity between them can be defined in terms of the number of substrings they have in common, as in Equation (4):

N-gram(r1, r2) = 2 · comm(r1, r2) / (Nr1 + Nr2)    (4)

where r1 and r2 are the two strings to be compared, each of which is divided according to certain rules. In the experiment, we set N to 3, so that the strings are segmented into groups of three letters. In addition, comm(r1, r2) represents the number of substrings that are identical between the strings r1 and r2, and Nr1 and Nr2 represent the numbers of substrings into which r1 and r2 are segmented, respectively. For example, the word “platform” can be cut into six substrings: “pla”, “lat”, “atf”, “tfo”, “for”, and “orm”. The word “plat” can be cut into two substrings: “pla” and “lat”. The common substrings of r1 and r2 are “pla” and “lat”. When calculating the N-gram similarity between “platform” and “plat”, the number of r1 substrings is Nr1 = 6, the number of r2 substrings is Nr2 = 2, and the number of common substrings is comm(r1, r2) = 2; substituting these into Equation (4) gives a similarity of 2 × 2/(6 + 2) = 0.5.
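The same arithmetic in a few lines of Python (how duplicated n-grams are counted is our assumption, as the text does not specify it):

```python
def ngram_similarity(r1: str, r2: str, n: int = 3) -> float:
    """N-gram similarity of Equation (4)."""
    grams1 = [r1[i:i + n] for i in range(len(r1) - n + 1)]
    grams2 = [r2[i:i + n] for i in range(len(r2) - n + 1)]
    comm = sum(1 for g in set(grams1) if g in grams2)  # shared substrings
    return 2 * comm / (len(grams1) + len(grams2))

print(ngram_similarity("platform", "plat"))  # 2 * 2 / (6 + 2) = 0.5
```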
Different from the above two measures, Wu and Palmer's method uses WordNet [32] to measure the semantic distance between two words. WordNet is an online English vocabulary retrieval system; as a linguistic ontology and semantic dictionary, it is widely used in natural language processing. Here, the closer two terms are to their common parent in semantic depth in WordNet, the more similar they are. Given two words r1 and r2, their linguistic similarity is calculated as follows:

Wup(r1, r2) = 2 · depth(LCA(r1, r2)) / (depth(r1) + depth(r2))    (5)

where LCA(r1, r2) is the closest common parent concept of r1 and r2, depth(LCA(r1, r2)) represents the depth of that common parent, and depth(r1) and depth(r2) represent the depths of r1 and r2 in the WordNet hierarchy, respectively. The smaller the gap between depth(LCA(r1, r2)) and depth(r1) and depth(r2), the closer the kinship between the common parent LCA(r1, r2) and r1 and r2, that is, the more similar r1 and r2 are. Figure 4 shows
an example. The “Animal” in the figure is located in the first layer of the network, which is the lowest layer. According to the Wup calculation rule, both “Bird” and “Fish” are in the second layer, and their nearest common parent is the “Animal” in the first layer; therefore, the similarity between “Bird” and “Fish” is 2/(2 + 2) = 0.5. The concepts “Sparrow” and “Parrot” are both in the third layer, and their common parent is the “Bird” in the second layer, so the similarity between “Sparrow” and “Parrot” is 4/(3 + 3) ≈ 0.67. Such results are consistent with the human perception of the world, in that “Sparrow” and “Parrot” are more similar than “Bird” and “Fish”.
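A short sketch reproduces Figure 4's numbers on a toy taxonomy (the PARENT dictionary stands in for WordNet; in practice the depths come from WordNet's hypernym hierarchy):

```python
# Parent links of the toy taxonomy in Figure 4; "Animal" is the root (depth 1).
PARENT = {"Animal": None, "Bird": "Animal", "Fish": "Animal",
          "Sparrow": "Bird", "Parrot": "Bird"}

def ancestors(c: str) -> list:
    """The chain from a concept up to the root, including the concept itself."""
    chain = [c]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain

def depth(c: str) -> int:
    return len(ancestors(c))

def wup(c1: str, c2: str) -> float:
    """Wu and Palmer similarity, Equation (5)."""
    lca = next(a for a in ancestors(c1) if a in ancestors(c2))
    return 2 * depth(lca) / (depth(c1) + depth(c2))

print(wup("Bird", "Fish"))       # 2 * 1 / (2 + 2) = 0.5
print(wup("Sparrow", "Parrot"))  # 2 * 2 / (3 + 3) ≈ 0.67
```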
The selected similarity measures are aggregated through a weighted sum:

sim(e_i, e_j) = Σ_k w_k × simm_k(e_i, e_j)    (6)

where e_i and e_j are two entities from two different ontologies, and w_k is the aggregating weight for the k-th similarity measure simm_k. For example, assuming there are three similarity measures whose similarity values on two entities are, respectively, simm_1 = 0.75, simm_2 = 0.62 and simm_3 = 0.83, given the aggregating weight vector (w1, w2, w3) = (0.2, 0.3, 0.5)^T where Σ w_i = 1, the final similarity value is Σ w_i × simm_i = 0.2 × 0.75 + 0.3 × 0.62 + 0.5 × 0.83 = 0.75.
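This weighted aggregation is a one-liner; the function below simply reproduces the arithmetic above:

```python
def aggregate(sims: list, weights: list) -> float:
    """Weighted aggregation of similarity values; the weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * s for w, s in zip(weights, sims))

print(aggregate([0.75, 0.62, 0.83], [0.2, 0.3, 0.5]))  # 0.751, i.e., ~0.75
```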
The quality of an alignment is measured with precision, recall and f-measure:

precision = |R ∩ A| / |A|    (7)

recall = |R ∩ A| / |R|    (8)

f-measure = 2 · precision · recall / (precision + recall)    (9)

where R is the reference alignment and A is the alignment to be evaluated.
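Treating alignments as sets of correspondences, Equations (7)–(9) translate directly into code; this is exactly the evaluation the classic EA performs for every individual against the reference alignment:

```python
def evaluate(alignment: set, reference: set):
    """Precision, recall and f-measure of Equations (7)-(9)."""
    correct = len(alignment & reference)
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall > 0 else 0.0)
    return precision, recall, f

print(evaluate({("a", "x"), ("b", "y")}, {("a", "x"), ("c", "z")}))  # (0.5, 0.5, 0.5)
```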
Given two ontologies O1 and O2, and supposing that the best alignment of O1 and O2 is a one-to-one relationship, the more correspondences found between O1 and O2 and the higher their similarity values, the better the quality of the alignment. Therefore, the ontology alignment quality measure can be defined as follows:

I(A) = α × F(A) + (1 − α) × (Σ_{i=1}^{|A|} δ_i) / |A|    (10)
The meta-matching problem is then to optimize this measure over the decision variable X, which is the parameter set, e.g., the weights for aggregating multiple similarity measures and the threshold for filtering the aggregated alignment.
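Assuming δ_i denotes the similarity value of the i-th correspondence in A (its definition is not reproduced in this excerpt), Equation (10) can be sketched as follows, with f_a standing for whatever quality score F(A) is instantiated with:

```python
def alignment_quality(f_a: float, similarities: list, alpha: float = 0.5) -> float:
    """Hedged reading of Equation (10): alpha trades off F(A) against the mean
    similarity of the found correspondences (delta_i assumed to be the i-th
    correspondence's similarity value)."""
    return alpha * f_a + (1 - alpha) * sum(similarities) / len(similarities)
```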
Before initializing the population, the lattice design is used to divide the feasible domain, and 16 standard individuals, i.e., the INIDs, are set up for calculating individual fitness. In the population fitness evaluation, the three INIDs that are most similar to a new individual (i.e., those with the closest distance) are first found, an interpolation prediction model is constructed from these three INIDs, and the individual's fitness is then obtained from the model. After that, the individuals are updated by the selection, crossover and mutation operators of the evolutionary algorithm. The algorithm iterates until the maximum number of generations is reached, and finally outputs the individual representing the optimal solution (see the sketch after this paragraph). The following sections focus on the algorithm's coding mechanism, the lattice design of the feasible domain, the EA-IM based ontology matching and the evolutionary operators.
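The flow just described can be condensed into a self-contained sketch; exact_fitness and the uniformly drawn INIDs below are toy stand-ins for the real reference-based evaluation and the lattice design:

```python
import math
import random

DIM = 4  # dimension of the decision variable

def exact_fitness(x):
    """Toy stand-in for the expensive evaluation against the reference."""
    return 1.0 - sum((v - 0.5) ** 2 for v in x)

def im_predict(x, inids, fit):
    """Toy IM: inverse-distance interpolation over the 3 nearest INIDs."""
    nearest = sorted(inids, key=lambda i: math.dist(x, i))[:3]
    w = [1.0 / (math.dist(x, i) + 1e-9) for i in nearest]
    return sum(wi * fit[i] for wi, i in zip(w, nearest)) / sum(w)

def tournament(population, scores):
    a, b = random.sample(range(len(population)), 2)
    return population[a] if scores[a] >= scores[b] else population[b]

def ea_im(pop=20, gens=1000, cp=0.6, mp=0.01):
    inids = [tuple(random.random() for _ in range(DIM)) for _ in range(16)]
    fit = {i: exact_fitness(i) for i in inids}   # evaluate the INIDs precisely
    population = [tuple(random.random() for _ in range(DIM)) for _ in range(pop)]
    for _ in range(gens):
        scores = [im_predict(x, inids, fit) for x in population]
        nxt = []
        while len(nxt) < pop:
            p1, p2 = tournament(population, scores), tournament(population, scores)
            if random.random() < cp:             # one-point crossover
                cut = random.randrange(1, DIM)
                p1 = p1[:cut] + p2[cut:]
            nxt.append(tuple(random.random() if random.random() < mp else g
                             for g in p1))       # uniform mutation
        population = nxt
    return max(population, key=lambda x: im_predict(x, inids, fit))

print(ea_im(gens=50))  # best individual under the surrogate fitness
```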
Since p − 1 bits are needed to indicate the split points and 1 bit to indicate the threshold, p is the length of an individual's code. Figure 6 shows an instance of weight encoding and decoding in which 6 weights are used to integrate 6 different similarity measures. According to the five segmentation points, the decoded weights are w1 = s1 − 0 = 0.175, w2 = s2 − s1 = 0.166, w3 = s3 − s2 = 0.124, w4 = s4 − s3 = 0.301, w5 = s5 − s4 = 0.166, w6 = 1 − s5 = 0.068.
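A minimal decoding sketch: the sorted cut points partition [0, 1], and each weight is the length of one segment, reproducing the numbers above up to floating-point rounding:

```python
def decode_weights(cuts: list) -> list:
    """Decode p-1 cut points in [0, 1] into p weights that sum to 1."""
    bounds = [0.0] + sorted(cuts) + [1.0]
    return [b - a for a, b in zip(bounds, bounds[1:])]

print(decode_weights([0.175, 0.341, 0.465, 0.766, 0.932]))
# [0.175, 0.166, 0.124, 0.301, 0.166, 0.068]
```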
Given a new individual, let d1 and d2 denote its distances to the two nearest INIDs, INID1 and INID2. Its recall is interpolated as

recall_predict = (d1 × INID2_recall + d2 × INID1_recall) / (d1 + d2)    (13)

precision_predict = a × recall_predict² + b × recall_predict + c    (14)

where INID1_recall and INID2_recall are the recall values of INID1 and INID2, respectively, and a < 0, b > 0, c > 0 are the coefficients of a quadratic function determined by INID1, INID2, and INID3.
The smaller the distance between two individuals, the more similar they are. In this work, we use the Euclidean distance to calculate two individuals' distance, which is defined in Equation (15):

d(p1, p2) = √( Σ_{i=1}^{F} (p1i − p2i)² )    (15)

where p1 and p2 are two individuals, and F is the number of their features.
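Putting Equations (13)–(15) together, the prediction step can be sketched as follows; each INID is stored with its decision vector and its precisely evaluated precision and recall, and taking the f-measure as the final fitness is our assumption:

```python
import numpy as np

def im_predict_fitness(x, inids):
    """IM prediction per Equations (13)-(15); `inids` is a list of dicts
    {"x": vector, "precision": p, "recall": r} of evaluated INIDs."""
    # Equation (15): Euclidean distances to all INIDs.
    d = np.array([np.linalg.norm(np.asarray(x) - np.asarray(i["x"])) for i in inids])
    k1, k2, k3 = np.argsort(d)[:3]           # indices of the three nearest INIDs
    i1, i2, i3 = inids[k1], inids[k2], inids[k3]
    d1, d2 = d[k1], d[k2]
    # Equation (13): inverse-distance interpolation of the recall.
    recall = (d1 * i2["recall"] + d2 * i1["recall"]) / (d1 + d2 + 1e-12)
    # Equation (14): quadratic precision model through the three nearest INIDs
    # (their recall values must be distinct for the fit to be well-posed).
    a, b, c = np.polyfit([i["recall"] for i in (i1, i2, i3)],
                         [i["precision"] for i in (i1, i2, i3)], 2)
    precision = a * recall ** 2 + b * recall + c
    return 2 * precision * recall / (precision + recall + 1e-12)
```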
5. Experiment
5.1. Experimental Configuration
In the experiments, we used the well-known Benchmark provided by the Ontology Alignment Evaluation Initiative (OAEI) [38] to test EA-IM's performance. OAEI is an international ontology alignment competition designed to evaluate various ontology alignment algorithms, for the purpose of evaluating, comparing, communicating and promoting ontology alignment. OAEI's Benchmark covers a wide range of heterogeneity features. In particular, it contains 51 ontologies from the same domain, which are modified manually: some change the natural language labels and comments, while others replace concepts with random strings. This makes it possible to fully measure the strengths and weaknesses of different ontology matching algorithms. Specifically, the testing cases are divided into three categories according to their ID prefixes, i.e., 1XX, 2XX and 3XX. The 1XX cases (two identical ontologies) are usually used for concept testing; the 2XX cases (two ontologies with different lexical or structural features) are usually used for comparing different modifications; and the 3XX cases (two real-world ontologies) are developed by different organizations and come from the same domain in the real world. The 16 INIDs of the lattice design are shown in Table 1.
First, we compare the matching results and running time of our algorithm with the classic EA-based ontology meta-matching technique to show that our algorithm greatly improves the efficiency of ontology matching while maintaining good matching results. Second, we compare the matching results of our algorithm with those of OAEI's participants, further illustrating the superiority of our matching results. To evaluate our algorithm more comprehensively, recall, precision and f-measure are used, as well as the algorithm's running time. As mentioned above, recall measures the ratio of correct correspondences found among all those in the reference alignment, precision measures how many of the correspondences predicted as positive are truly positive, and f-measure is the harmonic mean of recall and precision. The algorithm's running time refers to the time it takes for the algorithm to complete the number of generations we set in advance.
To make a fair comparison, EA-IM's and EA's parameters are set to the same values, as follows:
• Population size PopNum = 20;
• Crossover probability CP = 0.6;
• Mutation probability MP = 0.01;
• Maximum generation MaxGen = 1000.
The above configuration follows these principles:
• Population size. The setting of the population size depends on the complexity of the individual; according to previous studies [39], the population size should be in the range [4×n, 6×n], where n is the dimension of the decision variable. In this work, the decision variable has 4 dimensions, so the population size should be in the range [16, 24]. The larger the population size, the longer the population might take to converge, while the smaller it is, the higher the probability that the algorithm suffers from premature convergence [40]. Since ontology meta-matching is a small-scale problem, we set the population size to 20.
• Crossover and mutation probability. For the crossover and mutation probabilities, small values will decrease the diversity of the population, while large values may cause the optimal individuals to be missed [41]. Their suggested ranges are, respectively, [0.6, 0.8] and [0.01, 0.05], and since the problem in this work is low-dimensional, we select CP = 0.6 and MP = 0.01, whose effectiveness is also verified in the experiment.
• Maximum generation. In EA, the maximum number of generations is directly proportional to the scale of the problem [42], and the suggested range is [800, 2000]. Since the ontology meta-matching problem in this work is a 4-dimensional problem whose search region is not very large, the maximum generation should be a relatively small value; in the experiment, MaxGen = 1000 is robust on all testing cases.
In the experiment, we first compare EA-IM with the classic EA-based ontology meta-matching technique in Table 2 in terms of precision, recall and f-measure, where the symbols P, R and F represent precision, recall and f-measure, respectively. Then, we show the corresponding box-and-whisker plots in Figures 9–11. After that, we compare their running times in Table 3, and finally, we compare EA-IM with OAEI's participants in terms of f-measure and running time in Tables 4 and 5. The results shown in the tables and figures are the mean values of 30 independent runs.
two methods; the median of EA-IM is 1.000, while the median of EA is also 1.000. This visually illustrates that EA-IM and EA are highly close in terms of precision. In Figure 10, the upper edge of both methods is 1.000; the lower edge of EA-IM is 0.770, while the lower edge of EA is 0.918, a difference of 16.1% between the results of the two methods. This gap is caused by the low results of EA-IM in testing cases 248, 302 and 303, owing to the more complex lexical information of these ontologies. However, this does not affect the excellent performance of EA-IM in terms of the final result (f-measure);
the median of EA-IM is 0.990, while the median of EA is 1.000, with a difference of 1.0%.
In Figure 11, the upper edge of both methods is 1.000; the lower edge of EA-IM is 0.875,
while the lower edge of EA is 0.952, with a difference of 8.1% between the results of the two
methods; the median of EA-IM is 0.995, while the median of EA is 1.000, with a difference
of 0.5%. The experimental results shown in these figures further demonstrate the effectiveness of the IM.
Figure 10. Comparison of EA-IM and EA on the Box-and-whisker Plot in terms of recall.
Figure 11. Comparison of EA-IM and EA on the Box-and-whisker Plot in terms of f-measure.
In Table 3, the average running time of EA-IM is 1826 milliseconds, while the average running time of EA is 29,395 milliseconds, an improvement of 93.79%. In the classic EA-based matching technique, each individual needs to be evaluated by comparing its corresponding alignment with the reference one, which consumes a huge amount of running time. With the introduction of the IM, we construct a problem-specific mathematical model to forecast an individual's fitness value, which decreases the computational complexity and therefore the running time. From Table 4, EA-IM's f-measure values are higher than those of OAEI's participants, which shows that the iterative refining mechanism can effectively improve the alignment's quality. From the above results, we can draw the conclusion that EA-IM can efficiently address the ontology meta-matching problem and determine high-quality alignments.
Table 4. Comparison among EA-IM and OAEI’s participants in terms of f-measure on Benchmark.
Testing Case Edna AgrMaker AROMA ASMOV CODI Ef2Match Falcon GeRMeSMB MapPSO RiMOM SOBOM TaxoMap EA-IM
101 1.00 0.99 0.98 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.51 1.00
103 1.00 0.99 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.51 1.00
104 1.00 0.99 0.99 1.00 0.99 1.00 1.00 1.00 1.00 1.00 1.00 0.51 1.00
201 0.04 0.92 0.95 1.00 0.13 0.77 0.97 0.94 0.42 1.00 0.95 0.51 0.95
203 1.00 0.98 0.80 1.00 0.86 1.00 1.00 0.98 1.00 1.00 1.00 0.49 0.99
204 0.93 0.97 0.97 1.00 0.74 0.99 0.96 0.98 0.98 1.00 0.99 0.51 0.99
205 0.34 0.92 0.95 0.99 0.28 0.84 0.97 0.99 0.73 0.99 0.96 0.51 0.88
206 0.54 0.93 0.95 0.99 0.39 0.87 0.94 0.92 0.85 0.99 0.96 0.51 0.93
207 0.54 0.93 0.95 0.99 0.42 0.87 0.96 0.96 0.81 0.99 0.96 0.51 0.94
221 1.00 0.97 0.99 1.00 0.98 1.00 1.00 1.00 1.00 1.00 1.00 0.51 0.99
222 0.98 0.98 0.99 1.00 1.00 1.00 1.00 0.99 1.00 1.00 1.00 0.46 1.00
223 1.00 0.95 0.93 1.00 1.00 1.00 1.00 0.96 0.98 0.98 0.99 0.45 0.99
224 1.00 0.99 0.97 1.00 1.00 1.00 0.99 1.00 1.00 1.00 1.00 0.51 1.00
225 1.00 0.99 0.99 1.00 0.99 1.00 1.00 1.00 1.00 1.00 1.00 0.51 1.00
228 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
230 0.85 0.90 0.93 0.97 0.98 0.97 0.97 0.94 0.98 0.97 0.97 0.49 0.97
231 1.00 0.99 0.98 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.51 1.00
232 1.00 0.97 0.97 1.00 0.97 1.00 0.99 1.00 1.00 1.00 1.00 0.51 1.00
233 1.00 1.00 1.00 1.00 0.94 1.00 1.00 0.98 1.00 1.00 1.00 1.00 1.00
236 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
237 0.98 0.98 0.97 1.00 0.99 1.00 0.99 1.00 0.99 1.00 1.00 0.46 1.00
238 1.00 0.94 0.92 1.00 0.99 1.00 0.99 0.96 0.97 0.98 0.98 0.45 0.98
239 0.50 0.98 0.98 0.98 0.98 0.98 1.00 0.98 0.98 0.98 0.98 0.94 1.00
240 0.55 0.91 0.83 0.98 0.95 0.98 1.00 0.85 0.92 0.94 0.98 0.88 0.95
241 1.00 0.98 0.98 1.00 0.94 1.00 1.00 0.98 1.00 1.00 1.00 1.00 1.00
246 0.50 0.98 0.97 0.98 0.98 0.98 1.00 0.98 0.98 0.98 0.95 0.94 0.98
247 0.55 0.88 0.80 0.98 0.98 0.98 1.00 0.91 0.89 0.94 0.98 0.88 0.95
248 0.03 0.72 0.00 0.87 0.00 0.02 0.00 0.37 0.05 0.64 0.48 0.02 0.02
301 0.59 0.59 0.73 0.86 0.38 0.71 0.78 0.71 0.64 0.73 0.84 0.43 0.88
302 0.43 0.32 0.35 0.73 0.59 0.71 0.71 0.41 0.04 0.73 0.74 0.40 0.73
303 0.00 0.78 0.59 0.83 0.65 0.83 0.77 0.00 0.00 0.86 0.50 0.36 0.82
Average 0.75 0.92 0.88 0.97 0.81 0.92 0.94 0.90 0.85 0.96 0.94 0.59 0.93
Table 5 shows the comparison between EA-IM and OAEI's participants in terms of running time. In Table 5, a matcher's f-measure per second is calculated by dividing its average f-measure by its average running time, which is a metric used by OAEI to assess matcher efficiency. As can be seen, our algorithm is faster than the other matchers, because we have introduced the IM into EA to improve the efficiency of ontology matching.
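For instance, using the averages reported above (an f-measure of 0.93 from Table 4 and a running time of 1826 ms from Table 3), EA-IM's value works out to roughly:

```python
avg_f, avg_seconds = 0.93, 1.826      # Table 4 average; Table 3 average in seconds
print(round(avg_f / avg_seconds, 2))  # 0.51 f-measure per second
```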
Table 5. Comparison among EA-IM and OAEI’s participants in terms of running time.
Author Contributions: Conceptualization, X.X. and Q.W.; methodology, X.X. and Q.W.; software,
Q.W.; validation, M.Y. and J.L.; formal analysis, X.X.; investigation, X.X. and Q.W.; resources, M.Y.;
data curation, J.L.; writing—original draft preparation, X.X. and Q.W.; writing—review and editing,
M.Y. and J.L.; funding acquisition, X.X. All authors have read and agreed to the published version of
the manuscript.
Funding: This work is supported by the National Natural Science Foundation of China (No.
62172095), the Natural Science Foundation of Fujian Province (Nos. 2020J01875 and 2022J01644) and
the Scientific Research Foundation of Fujian University of Technology (No. GY-Z17162).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
References
1. Guarino, N.; Oberle, D.; Staab, S. What is an ontology? In Handbook on Ontologies; Springer: Berlin/Heidelberg, Germany, 2009;
pp. 1–17.
2. Arens, Y.; Chee, C.Y.; Knoblock, C.A. Retrieving and Integrating Data from Multiple Information Sources. Int. J. Coop. Inf. Syst.
1993, 2, 127–158. [CrossRef]
3. Baumbach, J.; Brinkrolf, K.; Czaja, L.F.; Rahmann, S.; Tauch, A. CoryneRegNet: An ontology-based data warehouse of
corynebacterial transcription factors and regulatory networks. BMC Genom. 2006, 7, 24. [CrossRef] [PubMed]
4. Wang, X.; Ni, Z.; Cao, H. Research on association rules mining based-on ontology in e-commerce. In Proceedings of the 2007
International Conference on Wireless Communications, Networking and Mobile Computing, Shanghai, China, 21–25 September
2007; pp. 3549–3552.
5. Tu, S.W.; Eriksson, H.; Gennari, J.H.; Shahar, Y.; Musen, M.A. Ontology-based configuration of problem-solving methods and
generation of knowledge-acquisition tools: Application of PROTÉGÉ-II to protocol-based decision support. Artif. Intell. Med.
1995, 7, 257–289. [CrossRef]
6. Gruber, T.R. A translation approach to portable ontology specifications. Knowl. Acquis. 1993, 5, 199–220. [CrossRef]
7. Kashyap, V.; Sheth, A. Semantic heterogeneity in global information systems: The role of metadata, context and ontologies. Coop.
Inf. Syst. Curr. Trends Dir. 1998, 139, 178.
8. Doan, A.; Madhavan, J.; Domingos, P.; Halevy, A. Ontology matching: A machine learning approach. In Handbook on Ontologies;
Springer: Berlin/Heidelberg, Germany, 2004; pp. 385–403.
9. Verhoosel, J.P.; Van Bekkum, M.; van Evert, F.K. Ontology matching for big data applications in the smart dairy farming domain.
In Proceedings of the OM, Bethlehem, PA, USA, 11–12 October 2015; pp. 55–59.
10. Martinez-Gil, J.; Aldana-Montes, J.F. An overview of current ontology meta-matching solutions. Knowl. Eng. Rev. 2012, 27,
393–412. [CrossRef]
11. Xue, X.; Huang, Q. Generative adversarial learning for optimizing ontology alignment. Expert Syst. 2022, e12936. [CrossRef]
12. Arulkumaran, K.; Cully, A.; Togelius, J. AlphaStar: An Evolutionary Computation Perspective. In Proceedings of the Genetic
and Evolutionary Computation Conference Companion, GECCO ’19, Prague, Czech Republic, 13–17 July 2019; Association for
Computing Machinery: New York, NY, USA, 2019; pp. 314–315. [CrossRef]
13. Vikhar, P.A. Evolutionary algorithms: A critical review and its future prospects. In Proceedings of the 2016 International
Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC), Jalgaon, India,
22–24 December 2016; pp. 261–265. [CrossRef]
14. Eiben, A.E.; Smith, J.E. What is an evolutionary algorithm? In Introduction to Evolutionary Computing; Springer: Berlin/Heidelberg,
Germany, 2015; pp. 25–48.
15. Jiao, Y.; Xu, G. Optimizing the lattice design of a diffraction-limited storage ring with a rational combination of particle swarm
and genetic algorithms. Chin. Phys. C 2017, 41, 027001. [CrossRef]
16. Naya, J.M.V.; Romero, M.M.; Loureiro, J.P.; Munteanu, C.R.; Sierra, A.P. Improving ontology alignment through genetic
algorithms. In Soft Computing Methods for Practical Environment Solutions: Techniques and Studies; IGI Global: New York, NY, USA,
2010; pp. 240–259.
17. Martinez-Gil, J.; Aldana-Montes, J.F. Evaluation of two heuristic approaches to solve the ontology meta-matching problem.
Knowl. Inf. Syst. 2011, 26, 225–247. [CrossRef]
18. He, H.; Tan, Y. A two-stage genetic algorithm for automatic clustering. Neurocomputing 2012, 81, 49–59. [CrossRef]
19. Huang, H.D.; Acampora, G.; Loia, V.; Lee, C.S.; Kao, H.Y. Applying FML and fuzzy ontologies to malware behavioural analysis.
In Proceedings of the 2011 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2011), Taipei, Taiwan, 27–30 June 2011;
pp. 2018–2025.
20. Xue, X.; Wang, Y.; Ren, A. Optimizing ontology alignment through memetic algorithm based on partial reference alignment.
Expert Syst. Appl. 2014, 41, 3213–3222. [CrossRef]
21. Xue, X.; Liu, J.; Tsai, P.W.; Zhan, X.; Ren, A. Optimizing Ontology Alignment by Using Compact Genetic Algorithm. In
Proceedings of the 2015 11th International Conference on Computational Intelligence and Security (CIS), Shenzhen, China, 19–20
December 2015; pp. 231–234. [CrossRef]
22. Xue, X.; Jiang, C. Matching sensor ontologies with multi-context similarity measure and parallel compact differential evolution
algorithm. IEEE Sens. J. 2021, 21, 24570–24578. [CrossRef]
23. Lv, Z.; Peng, R. A novel meta-matching approach for ontology alignment using grasshopper optimization. Knowl.-Based Syst.
2020, 201–202, 106050. [CrossRef]
24. Lv, Q.; Zhou, X.; Li, H. Optimizing Ontology Alignments Through Evolutionary Algorithm with Adaptive Selection Strategy. In
Advances in Intelligent Systems and Computing, Proceedings of the International Conference on Advanced Machine Learning Technologies
and Applications, Cairo, Egypt, 20–22 March 2021; Springer: Cham, Switzerland, 2021; pp. 947–954.
25. Xue, X.; Yao, X. Interactive ontology matching based on partial reference alignment. Appl. Soft Comput. 2018, 72, 355–370.
[CrossRef]
26. Xue, X. Complex ontology alignment for autonomous systems via the Compact Co-Evolutionary Brain Storm Optimization
algorithm. ISA Trans. 2022, in press. [CrossRef]
27. Xue, X.; Pan, J.S. A segment-based approach for large-scale ontology matching. Knowl. Inf. Syst. 2017, 52, 467–484. [CrossRef]
28. Winkler, W.E. The State of Record Linkage and Current Research Problems; Statistical Research Division, US Census Bureau:
Suitland-Silver Hill, MD, USA, 1999.
29. Mascardi, V.; Locoro, A.; Rosso, P. Automatic ontology matching via upper ontologies: A systematic evaluation. IEEE Trans.
Knowl. Data Eng. 2009, 22, 609–623. [CrossRef]
30. Wu, Z.; Palmer, M. Verbs semantics and lexical selection. In Proceedings of the 32nd annual meeting on Association for
Computational Linguistics (COLING-94), Las Cruces, NM, USA, 27–30 June 1994.
31. Ferranti, N.; Rosário Furtado Soares, S.S.; de Souza, J.F. Metaheuristics-based ontology meta-matching approaches. Expert Syst.
Appl. 2021, 173, 114578. [CrossRef]
32. Fellbaum, C. WordNet. In Theory and Applications of Ontology: Computer Applications; Springer: Berlin/Heidelberg, Germany, 2010;
pp. 231–243.
33. Goldberg, D.E. Genetic Algorithms in Search, Optimization, and Machine Learning; Addison-Wesley: Reading, MA, USA, 1989.
34. Ehrig, M.; Euzenat, J. Relaxed precision and recall for ontology matching. In Proceedings of the K-Cap 2005 Workshop on
Integrating Ontology, Banff, AB, Canada, 2 October 2005; pp. 25–32.
35. Faria, D.; Pesquita, C.; Santos, E.; Palmonari, M.; Cruz, I.F.; Couto, F.M. The agreementmakerlight ontology matching system. In
Lecture Notes in Computer Science, Proceedings of the OTM Confederated International Conferences “On the Move to Meaningful Internet
Systems”, Graz, Austria, 9–13 September 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 527–541.
36. Acampora, G.; Loia, V.; Vitiello, A. Enhancing ontology alignment through a memetic aggregation of similarity measures. Inf. Sci.
2013, 250, 1–20. [CrossRef]
37. Yates, F. A new method of arranging variety trials involving a large number of varieties. J. Agric. Sci. 1936, 26, 424–455. [CrossRef]
38. Achichi, M.; Cheatham, M.; Dragisic, Z.; Euzenat, J.; Faria, D.; Ferrara, A.; Flouris, G.; Fundulaki, I.; Harrow, I.; Ivanova, V.; et al.
Results of the ontology alignment evaluation initiative 2016. In Proceedings of the OM: Ontology Matching, Kobe, Japan, 18
October 2016; pp. 73–129.
39. Liu, X. A research on Population Size Impaction on the Performance of Genetic Algorithm. Ph.D. Thesis, North China Electric
Power University, Beijing, China, 2010.
40. Mirjalili, S. Genetic algorithm. In Evolutionary Algorithms and Neural Networks; Springer: Berlin/Heidelberg, Germany, 2019;
pp. 43–55.
41. Xue, X.; Wang, Y. Using memetic algorithm for instance coreference resolution. IEEE Trans. Knowl. Data Eng. 2015, 28, 580–591.
[CrossRef]
42. Xue, X.; Chen, J. Matching biomedical ontologies through Compact Differential Evolution algorithm with compact adaption
schemes on control parameters. Neurocomputing 2021, 458, 526–534. [CrossRef]