Open Phys. 2018; 16:1139–1148

Research Article

Deng Pan* and Hyunho Yang

A text-image feature mapping algorithm based on transfer learning
https://doi.org/10.1515/phys-2018-0134
Received Oct 09, 2018; accepted Nov 14, 2018

Abstract: The traditional uniform distribution algorithm does not filter the image data when extracting the approximate features of text-image data under an event, so the similarity between the image data and the text is low, which leads to low accuracy. This paper proposes a text-image feature mapping algorithm based on transfer learning. The existing data is filtered by clustering technology to obtain data similar to the target data. The significant text features are calculated through a latent Dirichlet allocation (LDA) model based on Gibbs sampling together with information gain. The bag of visual words (BOVW) model and the Naive Bayesian method are used to model the image data. With the help of the text-image co-occurrence data under the same event, the text feature distribution is mapped onto the image feature space, and the feature distribution of image data under the same event is approximated. Experimental results show that the proposed algorithm can obtain the feature distribution of image data under different events, with an average cosine similarity as high as 92% and an average dispersion as low as 0.06, so the accuracy of the algorithm is high.

Keywords: Transfer learning, text-image, feature mapping, clustering, LDA model, BOVW model

PACS: 07.05.Kf, 07.05.Pj, 07.05.Tp

*Corresponding Author: Deng Pan: College of Computer Information and Communication Engineering, Jiujiang University, Jiujiang, 332005, China; Email: [email protected]
Hyunho Yang: School of Computer Information and Communication Engineering, Kunsan National University, 54150, South Korea; Email: [email protected]

Open Access. © 2018 D. Pan and H. Yang, published by De Gruyter. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

1 Introduction

With the development of online information dissemination technology, the amount of event information accompanied by text and images is increasing. Traditional text mining technology can no longer satisfy people's needs for learning from multimedia information. However, it is still difficult to develop knowledge models directly in the feature space of multimedia data, especially in the image feature space. Whether mature text mining technology, together with the abundant text information on the Internet, can be used to assist the knowledge learning of image data is a hotspot of current research.

Reference [1] proposes an adaptively controlled scale-invariant feature transform (SIFT) feature uniform distribution algorithm (called the uniform distribution algorithm below) based on the characteristics of synthetic aperture radar (SAR) image data. By using local texture features combined with an optimized screening strategy, SIFT feature points can be reasonably distributed in image space and scale space: the distribution of features in the different spaces is controlled adaptively while the stability and accuracy of the feature points are preserved. However, because the algorithm does not consider the timeliness of image data, its accuracy is not high. Reference [2] proposes a fast connected component labeling algorithm implemented on a field-programmable gate array (FPGA). Run-length encoding is used to optimize image annotation, which reduces the number and length of tags and extracts the features of components during run-length encoding; due to the complexity of the algorithm, its efficiency is low. Reference [3] computes image features and delays from delayed-enhancement magnetic resonance imaging (DE-MRI), analyzes them with heterogeneous machine learning, and constructs an uncertainty assessment framework with potential ablation target recognition. However, because of the coupling between image features and delay, the data available for the heterogeneous machine learning analysis are insufficient and the analysis efficiency is low, which limits the approach.

The transfer learning method develops a compact and effective representation from the annotated data of a source domain and a few annotated or unannotated data of the target domain, and then applies the learned feature representation to the learning task of the target domain.
In this method, not only self-annotated data but also unannotated data are used, so it is neither supervised learning [4] nor unsupervised or semi-supervised learning, but a new machine learning method. During feature migration, even if the data in the source data space and the target data space do not intersect at the instance level, they may be related at the feature level [5]. Data with two feature perspectives can be used to establish a link between two different feature spaces. These data are not necessarily used as training data for knowledge learning, but they can act as a dictionary. Taking a subject event as the background, the abundant text-image information about the event on the Internet is used as a basis for knowledge migration.

Aiming at these problems, this paper proposes a text-image feature mapping algorithm based on transfer learning, which uses clustering technology to filter the existing data and find the data most similar to the target data [6]. The significant text features are calculated by an LDA model based on Gibbs sampling and by information gain. The BOVW model and the Naive Bayesian method are used to model the subject of image data. With the help of the text-image co-occurrence data [7] under the same event, the text feature distribution is mapped onto the image feature space, and the feature distribution of image data under the same event is approximated.

2 A text-image feature mapping algorithm based on transfer learning

2.1 Transfer learning algorithm for clustering text

Although the existing auxiliary data is out of date, some of it should be very similar to the test data and can be used to help learn the target task [8]. Therefore, clustering technology is used to find the data that is most similar to the test data within the existing data.

2.1.1 Introduction to clustering

Clustering is an important form of data mining. The purpose of text clustering is to group a large-scale text dataset into multiple classes such that texts in the same class have a high degree of similarity while texts in different classes differ considerably [9]. As a data mining function, clustering can be used as an independent tool to obtain the data distribution, observe the characteristics of each cluster, and focus on specific clusters for further analysis. Clustering can also serve as a pre-processing step that effectively improves the performance of other algorithms [10].

2.1.2 Text representation and text similarity formula

According to the traditional vector space model (VSM) representation, text content can be expressed as a weighted feature vector. Let D be a text set, d_i a text in the set, t a feature word, t_i the i-th feature word, and w_i the weight of the i-th feature word:

d_i = (t_1, w_1; t_2, w_2; \ldots; t_n, w_n)   (1)

The weight w_i can be represented by the tf-idf weight of each feature. The tf-idf formula is as follows:

\mathrm{tf\text{-}idf} = \sum_{d \in D} tf(d, t) \cdot \log \frac{|D|}{df(t)}   (2)

where tf(d, t) is the word frequency of word t in text d, df(t) is the number of texts containing word t in text set D, and |D| is the number of texts contained in text set D.

The similarity between two texts can be calculated by the cosine of the angle α between their vectors. Assuming two texts d_1 = (t_1, w_1; t_2, w_2; \ldots; t_n, w_n) and d_2 = (t_1, \sigma_1; t_2, \sigma_2; \ldots; t_n, \sigma_n), the similarity between d_1 and d_2 is expressed as follows:

\sim(d_1, d_2) = \cos\alpha = \frac{\sum_{i=1}^{n} w_i \sigma_i}{\left(\sum_{i=1}^{n} w_i^2 \times \sum_{i=1}^{n} \sigma_i^2\right)^{1/2}}   (3)

The greater the value of \sim(d_1, d_2), the more similar the two texts, where w is the weight of the feature and σ is the approximate text weight.
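As a concrete illustration of Eqs. (1)-(3), the sketch below builds tf-idf weighted vectors over a toy tokenized corpus and scores two of them with the cosine measure. The toy corpus, the tokenization, and all identifier names are illustrative assumptions rather than details taken from the paper.

```python
import math
from collections import Counter

def tf_idf_vector(doc, corpus, vocab):
    """Weight each vocabulary word in `doc` by tf * log(|D| / df), cf. Eq. (2)."""
    tf = Counter(doc)
    n_docs = len(corpus)
    vec = []
    for term in vocab:
        df = sum(1 for d in corpus if term in d)   # number of texts containing the term
        idf = math.log(n_docs / df) if df else 0.0
        vec.append(tf[term] * idf)
    return vec

def cosine_similarity(w, sigma):
    """Eq. (3): cosine of the angle between two weighted feature vectors."""
    dot = sum(wi * si for wi, si in zip(w, sigma))
    norm = math.sqrt(sum(wi * wi for wi in w) * sum(si * si for si in sigma))
    return dot / norm if norm else 0.0

# Toy corpus: each "text" is already tokenized into feature words (assumed data).
corpus = [["milk", "powder", "safety"],
          ["milk", "recall", "event"],
          ["image", "feature", "mapping"]]
vocab = sorted({t for d in corpus for t in d})
d1 = tf_idf_vector(corpus[0], corpus, vocab)
d2 = tf_idf_vector(corpus[1], corpus, vocab)
print(cosine_similarity(d1, d2))   # higher value -> more similar texts
```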
2.1.3 Algorithm principle

Firstly, the auxiliary training data are clustered together with the target training data [11]. Clustering produces high intra-cluster similarity and low inter-cluster similarity. Therefore, after clustering, any auxiliary data that does not fall into the same cluster as some target training data is filtered out. What remains is data with high similarity to the target data, and training on it together with the target data will greatly improve the performance of the classifier [12]. The definitions of some basic symbols used in the paper are given below.

Definition 2.1.1 Let Xb be the target sample space, Xa the auxiliary sample space, and Y = {0, 1} the class space.

Definition 2.1.2 (test data set) S = {(x_i^t)}, where x_i^t ∈ Xb, i = 1, 2, ..., k, and k is the number of elements of set S.

Definition 2.1.3 (training data set) The training dataset consists of two parts: Tb = {(x_j^b, c(x_j^b))}, where x_j^b ∈ Xb, j = 1, 2, ..., m; and Ta = {(x_i^a, c(x_i^a))}, where x_i^a ∈ Xa, i = 1, 2, ..., n. Here c(x) is the real class label of an instance, t is a feature word, i and j are enumeration indices, Tb is the target training data set, Ta is the auxiliary training data set, and m and n are the sizes of the target and auxiliary training datasets respectively.

2.1.4 Algorithm steps

Input: two training datasets Ta and Tb, and a test data set S.
Output: classification result h_t(X_t).
Read the training data Ta and Tb.
The training data are classified into N classes according to the class labels: T_i (i = 1, ..., N), where T_i is the set of instances labeled i.
For i ← 1 to N:
a. call a basic clustering algorithm to cluster T_i and return the clustering results;
b. scan T_i and delete the auxiliary instances that are not clustered together with target data.
End for.
Call a basic classification algorithm and obtain a classification model h_t : X → Y from the filtered training data and the test data S.
Test the performance of the classification model on S and output it [13].
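The filtering procedure of Section 2.1.4 can be sketched as follows. Since the paper leaves the "basic clustering algorithm" and "basic classification algorithm" unspecified, k-means and a multinomial Naive Bayes classifier are assumed stand-ins here, and all identifiers are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import MultinomialNB

def filter_auxiliary(Xa, ya, Xb, yb, n_clusters=5):
    """Keep only auxiliary instances that share a cluster with target data.

    Implements steps a/b of Section 2.1.4 per class label; KMeans is an
    assumed stand-in for the unspecified basic clustering algorithm.
    """
    kept_X, kept_y = [Xb], [yb]
    for label in np.unique(np.concatenate([ya, yb])):
        Ti_a, Ti_b = Xa[ya == label], Xb[yb == label]
        if len(Ti_a) == 0 or len(Ti_b) == 0:
            continue
        Ti = np.vstack([Ti_a, Ti_b])
        clusters = KMeans(n_clusters=min(n_clusters, len(Ti)), n_init=10).fit_predict(Ti)
        target_clusters = set(clusters[len(Ti_a):])   # clusters that hold target data
        keep = np.isin(clusters[:len(Ti_a)], list(target_clusters))
        kept_X.append(Ti_a[keep])
        kept_y.append(np.full(keep.sum(), label))
    return np.vstack(kept_X), np.concatenate(kept_y)

# Usage sketch: train the final model on the filtered data (MultinomialNB as
# the assumed basic classifier) and evaluate it on the test set S.
# X_train, y_train = filter_auxiliary(Xa, ya, Xb, yb)
# model = MultinomialNB().fit(X_train, y_train)
# print(model.score(X_test, y_test))
```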
2.2 A text-image feature mapping algorithm based on transfer learning

Based on the previous section, the existing data is filtered by the clustering technique [14] and the data most similar to the target data is obtained. Data with two feature perspectives are then used to establish a link, so that the two different feature spaces are connected. These data are not necessarily used as training data for knowledge learning, but they can act as a dictionary. Taking a subject event as the background, the abundant text-image information about the event on the Internet is used as the basis for knowledge migration.

2.2.1 Text-image co-occurrence data constrained by events

In the heterogeneous spatial learning model, the difficulty of the whole learning process is greatly reduced if data with two feature-space perspectives is available as an aid [15]. The heterogeneous spatial learning model under event constraints provides this possibility. The text-image co-occurrence data under event constraints are defined here. E is an event set, with event e ∈ E; V is the whole image data set, with the relevant images {v} ⊆ V under event e; D is the whole text data set, with the text set {d} ⊆ D under event e; U_v is the image feature space and U_D is the text feature space. A text-image co-occurrence data instance is vd ∈ S, where S is the co-occurrence data set, and u_v ∈ U_v and u_d ∈ U_D are the corresponding features of the image data instance and the text data instance, respectively. Under the constraint of events, the text-image co-occurrence data vd is formally described at the feature level as follows:

P(u_v, u_d) = \int_{D} P(u_v, d) P(u_d | d) \, dd   (4)

P(u_v, u_d) = \int_{V} P(v, u_d) P(u_v | v) \, dv   (5)

P(u_v, u_d) = \iint_{v,d} P(v, d) P(u_d | d) P(u_v | v) \, dv \, dd   (6)

where P(u_d | d) and P(u_v | v) are feature extraction processes.

2.2.2 Text subject modeling

The LDA model based on Gibbs sampling is used to extract subject information from the text sets for modeling [16]; the probability model is:

w_i | z_i, \varphi^{(z_i)} \sim \mathrm{disc}(\varphi^{(z_i)}), \quad \varphi \sim \mathrm{dir}(\beta)   (7)
z_i | \theta^{(d_i)} \sim \mathrm{disc}(\theta^{(d_i)}), \quad \theta \sim \mathrm{dir}(\alpha)

In order to handle new text outside the event training text and to facilitate parameter inference, symmetric dir(α) and dir(β) prior probability assumptions are made for θ^{(d)} and φ^{(z)}. To obtain the probability distribution of text subjects, the posterior probability P(w|z) of lexicon w under the text subjects is calculated instead of φ and θ, and then φ and θ are computed indirectly by Gibbs sampling. By calculating the most discriminative features in each subject feature space, the features with the highest information gain are taken as the significant text features.
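Eq. (7) only states the generative model, so the following minimal collapsed Gibbs sampler is one standard way to estimate θ and φ; the hyperparameters and iteration budget are assumptions, not values from the paper.

```python
import numpy as np

def lda_gibbs(docs, n_topics, n_words, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA with symmetric dir(alpha)/dir(beta) priors.

    `docs` is a list of word-id lists. Returns theta (doc-topic) and
    phi (topic-word) estimates recovered from the sampled assignments.
    """
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), n_topics))   # doc-topic counts
    nkw = np.zeros((n_topics, n_words))     # topic-word counts
    nk = np.zeros(n_topics)                 # per-topic totals
    z = [rng.integers(n_topics, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):          # initialize counts from random topics
        for i, w in enumerate(doc):
            ndk[d, z[d][i]] += 1; nkw[z[d][i], w] += 1; nk[z[d][i]] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                 # remove the current assignment
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # full conditional: (n_dk + a) * (n_kw + b) / (n_k + W*b)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_words * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k                 # resample the topic, restore counts
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    theta = (ndk + alpha) / (ndk.sum(1, keepdims=True) + n_topics * alpha)
    phi = (nkw + beta) / (nk[:, None] + n_words * beta)
    return theta, phi
```

The significant features of Section 2.2.2 would then be selected by ranking the vocabulary by information gain against the event categories, using the recovered phi and theta.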

2.2.3 Image data modeling

A Naive Bayesian model is used to model the image data. Firstly, the speeded-up robust features (SURF) are computed and the bag of visual words (BOVW) model is established. An image v is considered a set of visual words; each visual word f comes from the visual vocabulary F, v = {f | f ∈ F}, and F represents the whole image feature space. According to the feature independence hypothesis, the image classification model is defined as follows: an event category c determines an image feature distribution P(f ∈ F | c). Through this model, the maximum a posteriori rule is used to infer the image classification objective function h_NB : V → C, which completes the image subject category modeling. For a target image v, the subject category is:

h_{NB} = \arg\max_{c} \, p(c) \prod_{f \in v} P(f | c)   (8)
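A possible realization of the BOVW plus Naive Bayes pipeline behind Eq. (8) is sketched below. SIFT is used in place of SURF (which is non-free in stock OpenCV builds), the vocabulary size of 200 is an assumption, and `images`/`labels` are placeholders for the event image data.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import MultinomialNB

def build_bovw_model(images, labels, vocab_size=200):
    """Fit the visual vocabulary F and the Naive Bayes model of Eq. (8).

    `images` are grayscale uint8 arrays and `labels` their event categories;
    SIFT stands in here for the SURF descriptors named in the paper.
    """
    detector = cv2.SIFT_create()
    des_list = [detector.detectAndCompute(im, None)[1] for im in images]
    all_des = np.vstack([d for d in des_list if d is not None]).astype(np.float64)
    vocab = KMeans(n_clusters=vocab_size, n_init=10).fit(all_des)

    def histogram(des):
        # Quantize local descriptors to visual words and count them.
        h = np.zeros(vocab_size)
        if des is not None:
            for w in vocab.predict(des.astype(np.float64)):
                h[w] += 1
        return h

    X = np.array([histogram(d) for d in des_list])
    nb = MultinomialNB().fit(X, labels)   # learns P(f|c); predict() applies Eq. (8)
    return vocab, nb

# Usage sketch (event image set assumed loaded elsewhere):
# vocab, nb = build_bovw_model(images, labels)
# c_hat = nb.predict(new_histograms)
```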
age feature appearing is proportional to the probability of
it appearing in the text-image co-occurrence data associ-
2.2.4 Text-image feature mapping ated with each significant text feature if the event category
c is given [19]. At the same time, the probability of specific
Both text subject modeling and image subject modeling image features is related to the importance of each signifi-
belong to discrete object models. The feature indepen- cant text feature for the target concept [20]. Next, the cal-
dence hypothesis [17] can be applied to their features, that culations of P (f |w, c, S ) and P (w |c, D ) are elaborated.
is, each feature independently affects the posterior proba- Firstly, the text feature distribution P (w |c, D ) is com-
bility of an instance under a given event category. In the puted for each event category concept c ∈ C, and the most
process of text-image feature migration, the problem of significant event text feature set W (c) is calculated, and
feature migration can be greatly simplified by separating n is the operation coefficient. The LDA model is used to
text features and image features for mapping [18]. Figure 1 model the event text set, and Laplace smoothing is used
is a schematic diagram of text-image feature migration: to solve the sparse problem of text subject features.
The category label of each text in D under event con- P (w |c, D ) = [1 + n (w, c, D)] / [|W | + n + (c, D)] (10)
straint is the same as that of the image target category
c. Text d is represented by a subject feature word bag as ∑︁
d = {t |t ∈ T }. Thematic feature dictionary T is the subject n (w, c, D) = n (w |d )P (c |d ) (11)
vocabulary in a text feature space. At the same time, there d∈D

is a S = {(v, d)} set of text- image co-occurrence data un- ∑︁


n (c, D) = n (n |d )P (c |d ) (12)
der corresponding event. To infer the image feature distri- d∈D
bution P (f |c ) under event category c, the most significant
Then, the image feature conditional distribution
text features in text set D are first computed, and then the
P (f |w, c, S ) in the text-image co-occurrence dataset is
most significant text features are mapped to the image fea-
computed, and Laplacian smoothing is still used.
ture space by means of text-image co-occurrence data set
S. Distribution of image features under the target category P (f |w, c, S ) = (13)
A text-Image feature mapping algorithm based on transfer learning | 1143

[1 + n (f , w, c, S)] / [|F | + n (w, c, S)] bird’s nest incidents. Depending on the duration of the in-
cidents [24], the number of related text downloads ranged
∑︁ from 800 to 2000, with text-image accompanying text ac-
n (f , w, c, S) = n (f |v )P (w, c |d ) (14)
counting for about one-third to one-second.A text-image
(v,d)∈S
accompanying sample is regarded as a co-occurrence data
∑︁
n (w, c, S) = n (v)P (w, c |d ) (15) instance, but in the case of multiple images in a sample, it
(v,d)∈S is considered that one image corresponds to the same ac-
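The mapping defined by Eqs. (9)-(15) can be organized as the sketch below for a single event category c. The factorization P(w,c|d) ≈ P(w|d)P(c|d) is an assumption made here for concreteness, since the paper does not spell out how this joint posterior is computed; all array names are illustrative.

```python
import numpy as np

def map_text_to_image_features(Wc, n_w_d, p_c_d, n_f_v, n_v, pairs, V_text, V_img):
    """Estimate P(f|c) per Eqs. (9)-(15) for one event category c.

    Wc     : indices of the significant text features W(c)
    n_w_d  : (docs x V_text) word counts n(w|d)
    p_c_d  : (docs,) P(c|d) for the category under consideration
    n_f_v  : (imgs x V_img) visual-word counts n(f|v)
    n_v    : (imgs,) total visual words per image n(v)
    pairs  : list of (img_idx, doc_idx) co-occurrence instances from S
    """
    # Eqs. (10)-(12): Laplace-smoothed text feature distribution P(w|c,D)
    n_wc = n_w_d.T @ p_c_d                            # n(w,c,D) for every w
    p_w_c = (1.0 + n_wc) / (V_text + n_w_d.sum(1) @ p_c_d)

    p_f_c = np.zeros(V_img)
    for w in Wc:
        # Eqs. (13)-(15): P(f|w,c,S) over the co-occurrence pairs, weighting
        # each pair by P(w,c|d) ~= P(w|d) P(c|d)  (independence assumed here)
        num = np.zeros(V_img)
        den = 0.0
        for v, d in pairs:
            p_wc_d = (n_w_d[d, w] / max(n_w_d[d].sum(), 1)) * p_c_d[d]
            num += n_f_v[v] * p_wc_d
            den += n_v[v] * p_wc_d
        p_f_w = (1.0 + num) / (V_img + den)
        p_f_c += p_f_w * p_w_c[w]                     # Eq. (9), before normalization
    return p_f_c / p_f_c.sum()                        # N_c turns it into a distribution
```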
companying text, and the number of co-occurrence data
instances is calculated according to the number of images.
2.2.5 Evaluation Criteria Based on artificially collecting, the image data of each food
safety event from the Internet search engine and related
The goal of the text-to-image feature mapping algorithm is web pages are searched. For each event, 200~400 images
to estimate the feature distribution of image information are collected separately. The BOVW model is used to repre-
under event categories [21]. According to the feature inde- sent each image in a bag of visual word, and the histogram
pendence hypothesis of the BOVW model, image features vector expression of each image is obtained.
are regarded as random variables which appear indepen- Firstly, the feature distribution of the reference im-
dently. Image feature distribution can be represented as a age is constructed. Using all the images under each event
vector with the same size as the bag of character words. category c, an image feature distribution is obtained as
|F|−1
∑︁ the base feature distribution by the Naive Bayesian clas-
P (f |c ) = ⟨P i ≥ 0⟩i=0 , Pj = 1 (16)
i
sifier. Theoretically, the Naive Bayesian classifier can cal-
culate the real image feature distribution under the tar-
Cosine similarity and K-L (Kullback-Leibler) diver-
get category when the training data is sufficient. The two
gence dispersion is used as the performance evaluation
intuitive methods are compared with the text-image fea-
scale [22], it is assumed that p probability distribution is
ture mapping algorithm. The first method is the uniform
the datum distribution, and the other probability distribu-
distribution algorithm, assuming that each image feature
tion q is the approximation of distribution p. The greater
appears randomly under the concept of each event tar-
the cosine similarity of the two approximations is, the
get with the same probability. The second method is the
closer the two feature distributions are and the higher the
tagged query algorithm, which uses the name of category
approximation degree is. The formula for cosine similarity
c as the query keyword, searches in the Internet search en-
is as follows:
⎛ ⎞ gine [25], and uses the returned K image to train the Naive
√︃∑︁ √︃∑︁ Bayesian model to get the image feature distribution. The
Pi 2 +
∑︁
CS (P, q) = Pi qi / ⎝ qi 2⎠ (17) K value of the experiment is 50 based on experience [26–
i i i
31].
K-L dispersion is an asymmetric measure to evaluate the
difference between two probability distributions. Its value
reflects the approximation of distribution q to distribution
3 Results
p. In the determination of the characteristic distribution of
the reference image data, the K-L dispersion is defined as:
The K value of this experiment is 50 according to experi-
∑︁ P ence. The comparison between the three algorithms under
KL (P ‖q ) = P i 1b i (18)
qi cosine similarity is shown in Figure 2, 3, and 4.
i
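A minimal implementation of the two evaluation measures in Eqs. (17)-(18) might look as follows; the epsilon guard against empty bins of q is an implementation convenience, not part of the paper.

```python
import numpy as np

def cosine_similarity(p, q):
    """Eq. (17): closeness of an estimated distribution q to the datum p."""
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

def kl_divergence(p, q, eps=1e-12):
    """Eq. (18) with base-2 log; eps guards zero entries of q (assumed guard)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / np.maximum(q[mask], eps))))

p = np.array([0.5, 0.3, 0.2])   # reference (benchmark) distribution
q = np.array([0.4, 0.4, 0.2])   # approximation produced by an algorithm
print(cosine_similarity(p, q), kl_divergence(p, q))
```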
Analyzing the results of the three algorithms in the
Based on the above methods, 15 categories of video secu-
cosine similarity, we can see that the maximum value of
rity incidents on the Internet are analyzed as data sets [23],
the uniform distribution algorithm is 0.94%, the minimum
corresponding categories are: E1: Sanlu milk powder inci-
value is 0.74%, the maximum value of the label query algo-
dents; E2: red-cored duck egg incidents; E3: Turbot inci-
rithm is 0.97%, the minimum value is 0.76%, and the max-
dents; E4: Jinhao tea oil incidents; E5: Maile chicken inci-
imum value of the proposed algorithm is 0.99% and the
dents; E6: Plasticizer incidents; E7: Clenbuterol incidents;
minimum value is 0.76%.
E8: paraffin wax in hot pot incidents; E9: Gutter oil event;
Through data comparison, the cosine similarity of the
E10: Crayfish event; E11: Fushou snail incidents; E12: Poi-
proposed algorithm is always higher than that of the uni-
sonous steamed bread incidents; E13: Maggot citrus inci-
form distribution algorithm and the label query algorithm.
dents; E14: Bursting watermelon incidents; E15: Poisonous
1144 | D. Pan and H. Yang

Figure 4: Algorithm for estimating distribution effect under cosine


similarity

Figure 2: The effect of uniform distribution algorithm on estimation


of distribution under cosine similarity

Figure 5: Uniform distribution algorithm under K-L discrete value to


estimate distribution effect diagram

proposed algorithm is 0.17% and the minimum value is


0.03%.
Figure 3: Estimation of distribution effect under tag cosine similarity
algorithm Through data comparison, the K-L dispersion of the
proposed algorithm is always lower than that of the uni-
form distribution algorithm and the label query algorithm.
The larger the similarity value is, the more accurate the ap- Discreteness is an asymmetric metric measure to evaluate
proximate feature extraction is. Therefore, the accuracy of the difference of two probability distributions. Its value re-
the proposed algorithm is higher than the other two algo- flects the approximations. The smaller the dispersion is,
rithms. the smaller the difference is. Therefore, the difference of
As shown in Figure 5, 6, and 7, the prediction results of the algorithm in this paper is lower than that of the other
the three algorithms under K-L dispersion are compared: two algorithms.
Analysis of Figure 5, 6, and 7 shows that the results From the comparison of the effects of different algo-
of the three algorithms are comparable under K-L discrete rithms on estimating the distribution under the above dif-
values. The maximum value of the uniform distribution al- ferent metrics, it can be seen that the image feature dis-
gorithm is 0.27%, the minimum value is 0.05%, the max- tribution generated by the text-image feature mapping al-
imum value of the tagged query algorithm is 0.30%, the gorithm is the closest to its benchmark distribution under
minimum value is 0.04%, and the maximum value of the most event categories, while the uniform distribution algo-
rithm is only close to the results of other algorithms under
From the comparison of the effects of the different algorithms on estimating the distribution under the above metrics, the image feature distribution generated by the text-image feature mapping algorithm is closest to its benchmark distribution under most event categories, while the uniform distribution algorithm comes close to the results of the other algorithms under only one category (E6). Checking the data under this category shows that this is due to the large differences among its image data. The label query algorithm is equivalent to the proposed text-image feature mapping algorithm under three categories (E1, E9, E11). For these event categories, the event category name entered into the search engine as the query keyword returns images closely related to the event category, so the distribution approximation of the label query algorithm is better there.

In addition to the above direct method, the approximation degree of a similar image feature distribution to the benchmark distribution can be measured at different training data scales. Each time, N images are randomly selected from each category of the collected event image data set and the Naive Bayesian model is trained; this is repeated 100 times. The feature distribution and the reference distribution are compared in each round, and the results of all repeated rounds are arithmetically averaged. The number of images randomly selected for each event category is 20, 40, 60, 80, 100, 120, 140, 160 in turn. The approximate results of the uniform distribution algorithm, the label query algorithm and the feature mapping algorithm under each category are averaged and then compared as above. Figures 8 and 9 show the average difference between the image feature distribution obtained by these approximation methods and the reference distribution under the two measurement scales.

Figure 8: Comparison of different algorithms for estimating the distribution under cosine similarity

Figure 9: Comparison of different algorithms for estimating the distribution under K-L divergence
As can be seen from Figures 8 and 9, the text-image feature mapping algorithm comes close to the feature distribution obtained by training with 100 labeled images. The average cosine similarity of the proposed algorithm is 92%, that of the uniform distribution algorithm is 76%, and that of the label query algorithm is 84%; the average cosine similarity of the proposed algorithm is the largest in every category. The average dispersion of the proposed algorithm is 0.06, that of the uniform distribution algorithm is 0.17, and that of the label query algorithm is 0.09; the average dispersion of the proposed algorithm is the smallest in each category. These data show that the proposed text-image feature mapping algorithm based on transfer learning can effectively learn the image feature distribution under the target event category from the text data of related events and the text-image co-occurrence data.

Under the 100 events, the similarity distribution of text-image data is simulated, and the proposed algorithm is compared in simulation with the uniform distribution algorithm and the label query algorithm. The average optimal fitness and the average operation time of the algorithms are obtained; the detailed results are given in Table 1.

Table 1: Simulation results of the approximate distribution of image data under 100 events

Type of algorithm                 Average optimum fitness /%   Mean operation time /s
Uniform distribution algorithm    7.51                         54.09
Tagged query algorithm            8.22                         53.69
Algorithm in this paper           9.85                         34.72

From the analysis of Table 1, the optimal fitness of the proposed algorithm is 9.85%, that of the uniform distribution algorithm is 7.51%, and that of the label query algorithm is 8.22%; the fitness of the proposed algorithm is the highest. In the comparison of average operation time, the proposed algorithm takes 34.72 s, the uniform distribution algorithm takes 54.09 s, and the label query algorithm takes 53.69 s, indicating that the proposed algorithm takes the shortest time and has the highest efficiency. The algorithm quickly and effectively extracts the approximate feature distribution of text-image data under the 100 events: it not only extracts the approximate feature distribution effectively, but also consumes less time and has high efficiency.

4 Discussion

In the traditional machine learning framework, the task is to learn a classification model from given, sufficient training data, and then use this model to classify and predict test documents. However, machine learning algorithms face a key problem in current Internet mining research: large amounts of training data in some emerging areas are difficult to obtain. The development of Internet applications is very fast, and a large number of new areas keep emerging, from traditional news to web pages, pictures, blogs, podcasts and so on. Traditional machine learning needs to calibrate a large amount of training data in each field, which consumes manpower and material resources; without a large amount of annotated data, much learning-related research and many applications cannot be carried out. Secondly, traditional machine learning assumes that training data and test data obey the same distribution, but in many cases this same-distribution hypothesis is not satisfied. In addition, training data often goes out of date, which requires re-labelling a large volume of training data to meet training needs, and labeling new data is expensive in manpower and material resources. On the other hand, if a lot of training data with different distributions is available, discarding it completely would be wasteful. Making rational use of such data is the aim of transfer learning.
These are the main problems that transfer learning solves. Transfer learning can transfer knowledge from existing data to help future learning. The goal of transfer learning is to use knowledge learned in one environment to assist learning tasks in a new environment; therefore, transfer learning does not make the same-distribution assumption of traditional machine learning. At present, work on transfer learning can be divided into two parts: case-based transfer learning in isomorphic space and feature-based transfer learning in isomorphic space. It has been pointed out that case-based transfer learning has the stronger knowledge transfer ability, while feature-based transfer learning has the wider knowledge transfer ability; the two methods have their own merits. Transfer learning is a relatively new research direction in machine learning, and current research mainly focuses on data mining, natural language processing, information retrieval and image classification. Machine learning has provided extensive research findings and results, but research into transfer learning is still limited. Features and samples are two important aspects of text categorization, and it is important to consider these two factors comprehensively. Sample-based transfer learning is another method for solving the transfer learning problem. Traditional methods use feature-based or sample-based transfer learning, but lack a comprehensive use of both. The algorithm proposed in this paper can find the data most similar to the test data within the existing data and improve the accuracy of the model.

5 Conclusions

In this paper, a text-image feature mapping algorithm based on transfer learning is proposed. Firstly, clustering technology is used to filter the existing data and find the data most similar to the target data, so as to help the learning of the target task and improve the performance of the classifier. Then, the event text data is modeled by the latent Dirichlet allocation method, and the most prominent text features are selected by calculating the information gain of the topic features; the event images are modeled using the bag of visual words model and the Naive Bayesian method. The approximate extraction of the image feature distribution is realized through the text data feature distribution and the text-image co-occurrence data feature distribution under the same event. Compared with the traditional uniform distribution algorithm and the labeled query algorithm, the average cosine similarity of the proposed algorithm is 92%, versus 76% for the uniform distribution algorithm and 84% for the labeled query algorithm; the average dispersion of the proposed algorithm is 0.06, versus 0.17 for the uniform distribution algorithm and 0.09 for the labeled query algorithm. The experimental data show that the proposed algorithm has the advantages of high cosine similarity and low dispersion.

References

[1] Wang F., Youh J., Fux Y., Auto-Adaptive Well-Distributed Scale-Invariant Feature for SAR Images Registration, Geomat. Inform. Sci. Wuhan Univ., 2015, 40(2), 159-163.
[2] Wang K., Shil Z., Design and Implementation of Fast Connected Component Labeling Algorithm based on FPGA, Comp. Eng. Appl., 2016, 52(18), 192-198.
[3] Lozoya R.C., Berte B., Cochet H., Model-based Feature Augmentation for Cardiac Ablation Target Learning from Images, IEEE Trans. Biomed. Eng., 2018, (99), 1-1.
[4] Cazade P.A., Zheng W., Pradagracia D., A Comparative Analysis of Clustering Algorithms: O2 Migration in Truncated Hemoglobin I from Transition Networks, J. Chem. Phys., 2015, 142(2), 025103.
[5] Wan S., Niu Z., A Learner Oriented Learning Recommendation Approach based on Mixed Concept Mapping and Immune Algorithm, Knowledge-Based Syst., 2016, 103(C), 28-40.
[6] Han X.H., Xiong X., Duan F., A New Method for Image Segmentation based on BP Neural Network and Gravitational Search Algorithm Enhanced by Cat Chaotic Mapping, Appl. Intel., 2015, 43(4), 855-873.
[7] Zhou T., Hu W., Ning J., An Efficient Local Operator-based Q-compensated Reverse Time Migration Algorithm with Multistage Optimization, Geophys., 2018, 83(3), S249-S259.
[8] Gorodnitskiy E., Perel M., Geng Y., Depth Migration with Gaussian Wave Packets based on Poincaré Wavelets, Geophys. J. Int., 2016, 205(1), 301-318.
[9] Rastogi R., Srivastava A., Khonde K., An Efficient Parallel Algorithm: Poststack and Prestack Kirchhoff 3D Depth Migration Using Flexi-depth Iterations, Comp. Geosci., 2015, 80, 1-8.
[10] Tosun S., Ozturk O., Ozkan E., Application Mapping Algorithms for Mesh-based Network-on-chip Architectures, J. Supercomp., 2015, 71(3), 995-1017.
[11] Kalantar B., Mansor S.B., Sameen M.I., Drone-based Land-cover Mapping Using a Fuzzy Unordered Rule Induction Algorithm Integrated into Object-based Image Analysis, Int. J. Remote Sens., 2017, 38(8-10), 2535-2556.
[12] Mackenzie C., Pichara K., Protopapas P., Clustering Based Feature Learning on Variable Stars, Astrophys. J., 2016, 820(2), 138.
[13] Li H., Zhu G., Cui C., Energy-efficient Migration and Consolidation Algorithm of Virtual Machines in Data Centers for Cloud Computing, Comput., 2016, 98(3), 303-317.
[14] Xiang T., Yan L., Gao R., A Fusion Algorithm for Infrared and Visible Images based on Adaptive Dual-channel Unit-linking PCNN in NSCT Domain, Infrared Phys. Technol., 2015, 69, 53-61.
[15] Dong J., Xiao X., Menarguez M.A., Mapping Paddy Rice Planting Area in Northeastern Asia with Landsat 8 Images, Phenology-based Algorithm and Google Earth Engine, Remote Sens. Envir., 2016, 185, 142-154.
[16] Li Q., Zhou H., Zhang Q., Efficient Reverse Time Migration based on Fractional Laplacian Viscoacoustic Wave Equation, Geophys. J. Int., 2016, 204(1), 488-504.
[17] Medrano E.A., Wiel B.J.H.V.D., Uittenbogaard R.E., Simulations of the Diurnal Migration of Microcystis Aeruginosa, based on a Scaling Model for Physical-biological Interactions, Ecolog. Mod., 2016, 337, 200-210.
[18] Matsubayashi A., Asymptotically Optimal Online Page Migration on Three Points, Algorithmica, 2015, 71(4), 1035-1064.
[19] Yap W.S., Phan C.W., Yau W.C., Cryptanalysis of a New Image Alternate Encryption Algorithm based on Chaotic Map, Nonlin. Dyn., 2015, 80(3), 1483-1491.
[20] Rastogi R., Londhe A., Srivastava A., 3D Kirchhoff Depth Migration Algorithm, Comp. Geosci., 2017, 100(C), 67-75.
[21] Thierry P., Lambaré G., Podvin P., 3-D Preserved Amplitude Prestack Depth Migration on a Workstation, Geophys., 2015, 64(1), 222-229.
[22] Zheng X.W., Lu D.J., Wang X.G., A Cooperative Coevolutionary Biogeography-based Optimizer, Appl. Intel., 2015, 43(1), 1-17.
[23] Wang M., Study on Operation Reliability of Transfer System of Urban Transportation Hub based on Reliability Theory, Automat. Instrument., 2016, (1), 418-534.
[24] Cong S., Gao M.Y., Cao G., Ultrafast Manipulation of a Double Quantum-Dot Charge Qubit Using Lyapunov-Based Control Method, IEEE J. Quant. Electr., 2015, 51(8), 1-8.
[25] Yan X., Yang S., Hong H.E., Load Adaptive Control Based on Frequency Bifurcation Boundary for Wireless Power Transfer System, J. Pow. Supp., 2017, 43(4), 1025-1084.
[26] Lokesha V., Deepika T., Ranjini P.S., Cangul I.N., Operations of Nanostructures Via SDD, ABC4 and GA5 Indices, Appl. Math. Nonlin. Sci., 2017, 2(1), 173-180.
[27] Molinos-Senante M., Guzman C., Benchmarking Energy Efficiency in Drinking Water Treatment Plants: Quantification of Potential Savings, J. Clean. Prod., 2018, 176, 417-425.
[28] Gao W., Farahani M.R., Aslam A., Hosamani S., Distance Learning Techniques for Ontology Similarity Measuring and Ontology Mapping, Cluster Computing - The J. Net. Soft. Tools Appl., 2017, 20(2SI), 959-968.
[29] Ge S.B., Ma J.J., Jiang S.C., Liu Z., Peng W.X., Potential Use of Different Kinds of Carbon in Production of Decayed Wood Plastic Composite, Arabian J. Chem., 2018, 11(6), 838-843.
[30] Singh K., Gupta N., Dhingra M., Effect of Temperature Regimes, Seed Priming and Priming Duration on Germination and Seedling Growth on American Cotton, J. Envir. Biol., 2018, 39(1), 83-91.
[31] Hosamani S.M., Correlation of Domination Parameters with Physicochemical Properties of Octane Isomers, Appl. Math. Nonlin. Sci., 2016, 1(2), 345-352.
