A Text-Image Feature Mapping Algorithm Based on Transfer Learning

D. Pan and H. Yang

Research Article, 2018; 16:1139–1148
Transfer learning is neither supervised learning [4], nor unsupervised or semi-supervised learning, but a new machine learning method. During feature migration, even if the data in the source data space and the target data space do not intersect at the instance level, they may be related at the feature level [5]. Data with two feature perspectives can be used to establish a link between two different feature spaces. These data are not necessarily used as training data for knowledge learning, but they can act as a dictionary. Taking a subject event as the background, sufficient text-image information about the event on the Internet is used as a basis for knowledge migration.

Aiming at these problems, a text-image feature mapping algorithm based on transfer learning is proposed in this paper. It uses clustering technology to filter the existing data and find the data that is very similar to the target data [6]. The significant text features are calculated by the LDA model based on Gibbs sampling and information gain. The BOVW model and the naive Bayesian method are used to model the subject of the image data. With the help of the text-image co-occurrence data [7] under the same event, the text feature distribution is mapped to the image feature space, and the feature distribution of the image data under the same event is approximated.

2 A text-image feature mapping algorithm based on transfer learning

2.1 Transfer learning algorithm for clustering text

Although the existing auxiliary data is out of date, some of the existing data should still be very similar to the test data and can be used to help target task learning [8]. Therefore, clustering technology is used to find data that is very similar to the test data from the existing data. [...] each cluster, and focus on some specific clusters for further analysis. At the same time, clustering technology can also be used as a pre-processing step for other algorithms to effectively improve their performance [10].

2.1.2 Text representation and text similarity formula

According to the traditional vector space model (VSM) representation, the text content can be expressed as a weighted feature vector. Let D be a text set, d_i a text in the set, t a feature word, t_i the i-th feature word, and w_i the weight of the i-th feature word:

$$d_i = (t_1, w_1; t_2, w_2; \dots; t_n, w_n) \tag{1}$$

The weight w_i can be represented by the tf-idf weight of each feature. The tf-idf formula is as follows:

$$\text{tf-idf} = \sum_{d \in D} tf(d, t) \cdot \log \frac{|D|}{df(t)} \tag{2}$$

where tf(d, t) is the frequency of word t in text d, df(t) is the number of texts containing word t in text set D, and |D| is the number of texts in text set D.

The similarity between two texts can be calculated by the cosine of the angle α between their vectors. Assuming the two texts are d_1 = (t_1, w_1; t_2, w_2; ...; t_n, w_n) and d_2 = (t_1, σ_1; t_2, σ_2; ...; t_n, σ_n), the similarity between d_1 and d_2 is expressed as follows:

$$\mathrm{sim}(d_1, d_2) = \cos\alpha = \frac{\sum_{i=1}^{n} w_i \sigma_i}{\left(\sum_{i=1}^{n} w_i^2 \times \sum_{i=1}^{n} \sigma_i^2\right)^{1/2}} \tag{3}$$

The greater the value of sim(d_1, d_2), the more similar the two texts are, where w is the feature weight and σ is the weight in the approximate text.
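To make Eqs. (1)–(3) concrete, the snippet below is a minimal Python sketch (not the authors' implementation) that builds tf-idf weight vectors and computes the cosine similarity of two texts. It uses the per-document weight tf(d, t) · log(|D|/df(t)) for each w_i; the toy corpus, vocabulary and function names are assumptions introduced purely for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(corpus, vocab):
    """Weight each text: tf(d, t) * log(|D| / df(t)), per Eq. (2)."""
    tokenized = [doc.split() for doc in corpus]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))                 # document frequency of each word
    n_docs = len(corpus)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)                   # term frequency within this text
        vectors.append([tf[t] * math.log(n_docs / df[t]) if df[t] else 0.0
                        for t in vocab])
    return vectors

def cosine_similarity(w, sigma):
    """Eq. (3): sim = sum(w_i * sigma_i) / (sum(w_i^2) * sum(sigma_i^2))^(1/2)."""
    num = sum(a * b for a, b in zip(w, sigma))
    den = math.sqrt(sum(a * a for a in w) * sum(b * b for b in sigma))
    return num / den if den else 0.0

# illustrative toy corpus
corpus = ["milk powder safety incident report",
          "duck egg dye incident news",
          "milk powder incident follow up"]
vocab = sorted({t for doc in corpus for t in doc.split()})
vecs = tfidf_vectors(corpus, vocab)
print(round(cosine_similarity(vecs[0], vecs[2]), 3))
```

In the clustering-based filtering described in Section 2.1, similarities of this kind would be the basis for deciding which existing texts are close enough to the target data to be retained.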
2.1.3 Algorithm principle

Definition 2.1.1. Set Xb as the target sample space and Xa as the auxiliary sample space, and set Y = {0, 1} as the class space.

Definition 2.1.2 (test data set). S = {(x_i^t)}, where x_i^t ∈ Xb, i = 1, 2, ..., k, and k is the number of elements of set S.

Definition 2.1.3 (training data set). The training data set consists of two parts: Tb = {(x_j^b, c(x_j^b))}, where x_j^b ∈ Xb, j = 1, 2, ..., m, and Ta = {(x_i^a, c(x_i^a))}, where x_i^a ∈ Xa, i = 1, 2, ..., n. Here c(x) is the real class label of an instance, t is the feature word, i and j are index numbers, Tb is the target training data set, Ta is the auxiliary training data set, and m and n are the sizes of the target and auxiliary training data sets, respectively.

2.1.4 Algorithm steps

Input: two training data sets Ta and Tb, and a test data set S.
Output: the classification result h_t(X_t).
Read the training data Ta and Tb.
Classify the training data into N classes according to the class labels: T_i (i = 1, ..., N), where T_i is the set of instances labeled i.
Train the classification model h_t : X → Y.
Test the performance of the classification model on S and output the result [13].

2.2 A text-image feature mapping algorithm based on transfer learning

Based on the previous section, the existing data is filtered by the clustering technique [14], and the data which is very similar to the target data is obtained. Data with two feature perspectives are used to establish a link, and the two different feature spaces are connected. These data are not necessarily used as training data for knowledge learning, but they can act as a dictionary. Taking a subject event as the background, sufficient text-image information about the event on the Internet is used as a basis for knowledge migration.

2.2.1 Text-image co-occurrence data constrained by events

In the heterogeneous spatial learning model, the difficulty of the whole learning process is greatly reduced if data with two feature-space perspectives are used as an aid [15]. The heterogeneous spatial learning model under event constraints provides this possibility. The text-image co-occurrence data under event constraints are given here. E is an event set, with event e ∈ E; V is the whole image data set, and {v} ⊆ V is the set of relevant images under event e; D is the whole text data set, and the text set under event e is {d} ⊆ D; U_v is the image feature space, and U_D is the text feature space. A text-image co-occurrence data instance is vd ∈ S, where S is the co-occurrence data set, and u_v ∈ U_v and u_d ∈ U_D are the corresponding features of the image data instance and the text data instance, respectively. Under the constraint of events, the text-image co-occurrence data vd is formally described at the feature level.

2.2.2 Text subject modeling

The LDA model based on Gibbs sampling is used to extract subject information from the text sets for modeling [16], and the probability model is:

$$w_i \mid z_i, \varphi^{(z_i)} \sim \mathrm{disc}\left(\varphi^{(z_i)}\right), \quad \varphi \sim \mathrm{dir}(\beta) \tag{7}$$
$$z_i \mid \theta^{(d_i)} \sim \mathrm{disc}\left(\theta^{(d_i)}\right), \quad \theta \sim \mathrm{dir}(\alpha)$$

In order to deal with new text outside the event training text and to facilitate parameter inference, symmetric dir(α) and dir(β) prior probability assumptions are made.
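As an illustration of the topic model in Eq. (7), the following is a minimal collapsed Gibbs sampler for LDA with the symmetric dir(α) and dir(β) priors assumed above. It is a sketch rather than the authors' code: the hyperparameters, iteration count and the toy corpus of integer word ids are illustrative assumptions.

```python
import numpy as np

def lda_gibbs(docs, n_topics, n_vocab, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA with symmetric dir(alpha), dir(beta) priors."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # document-topic counts
    n_kw = np.zeros((n_topics, n_vocab))     # topic-word counts
    n_k = np.zeros(n_topics)                 # total words assigned to each topic
    z = [rng.integers(n_topics, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):           # initialize counts from random assignments
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                  # remove current assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # p(z_i = k | rest) proportional to (n_dk + alpha)(n_kw + beta)/(n_k + V*beta)
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + n_vocab * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k                  # resample and restore counts
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    phi = (n_kw + beta) / (n_k[:, None] + n_vocab * beta)       # topic-word distributions
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + n_topics * alpha)
    return phi, theta

# toy corpus: each document is a list of integer word ids over a vocabulary of size 5
docs = [[0, 1, 2, 1], [2, 3, 3, 4], [0, 1, 4, 2]]
phi, theta = lda_gibbs(docs, n_topics=2, n_vocab=5, n_iter=100)
print(theta.round(2))
```

The returned φ (topic-word) and θ (document-topic) distributions are the quantities on which a subsequent information-gain ranking of topic features could operate.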
[...]

$$\frac{1 + n(f, w, c, S)}{|F| + n(w, c, S)}$$

where

$$n(f, w, c, S) = \sum_{(v,d) \in S} n(f \mid v)\, P(w, c \mid d) \tag{14}$$

$$n(w, c, S) = \sum_{(v,d) \in S} n(v)\, P(w, c \mid d) \tag{15}$$

2.2.5 Evaluation criteria

The goal of the text-to-image feature mapping algorithm is to estimate the feature distribution of the image information under each event category [21]. According to the feature-independence hypothesis of the BOVW model, image features are regarded as random variables that appear independently. The image feature distribution can be represented as a vector with the same size as the visual word bag:

$$P(f \mid c) = \langle P_i \ge 0 \rangle_{i=0}^{|F|-1}, \quad \sum_i P_i = 1 \tag{16}$$

Cosine similarity and the K-L (Kullback-Leibler) divergence are used as the performance evaluation measures [22]. It is assumed that the probability distribution p is the reference (datum) distribution and the other probability distribution q is the approximation of p. The greater the cosine similarity of the two distributions, the closer the two feature distributions are and the higher the degree of approximation. The formula for cosine similarity is as follows:

$$CS(P, q) = \sum_i P_i q_i \Big/ \left( \sqrt{\textstyle\sum_i P_i^2} \cdot \sqrt{\textstyle\sum_i q_i^2} \right) \tag{17}$$

The K-L divergence is an asymmetric measure of the difference between two probability distributions. Its value reflects how well distribution q approximates distribution p. In the determination of the feature distribution of the reference image data, the K-L divergence is defined as:

$$KL(P \,\|\, q) = \sum_i P_i \log_2 \frac{P_i}{q_i} \tag{18}$$

Based on the above methods, 15 categories of food safety incidents on the Internet are analyzed as data sets [23]. The corresponding categories are: E1: Sanlu milk powder incident; E2: red-cored duck egg incident; E3: turbot incident; E4: Jinhao tea oil incident; E5: Maile chicken incident; E6: plasticizer incident; E7: clenbuterol incident; E8: paraffin wax in hot pot incident; E9: gutter oil incident; E10: crayfish incident; E11: Fushou snail incident; E12: poisonous steamed bread incident; E13: maggot citrus incident; E14: bursting watermelon incident; E15: poisonous bird's nest incident. Depending on the duration of the incident [24], the number of related text downloads ranged from 800 to 2000, with texts accompanied by images accounting for about one-third to one-half of them. A text-image accompanying sample is regarded as a co-occurrence data instance; in the case of multiple images in one sample, each image is considered to correspond to the same accompanying text, and the number of co-occurrence data instances is counted according to the number of images. The image data of each food safety event are collected manually from Internet search engines and related web pages; for each event, 200~400 images are collected. The BOVW model is used to represent each image as a bag of visual words, and the histogram vector of each image is obtained.

Firstly, the feature distribution of the reference image data is constructed. Using all the images under each event category c, an image feature distribution is obtained as the reference feature distribution by the naive Bayesian classifier. Theoretically, the naive Bayesian classifier can calculate the real image feature distribution under the target category when the training data is sufficient. Two intuitive methods are compared with the text-image feature mapping algorithm. The first is the uniform distribution algorithm, which assumes that each image feature appears randomly, with the same probability, under each event target concept. The second is the tagged query algorithm, which uses the name of category c as the query keyword, searches in an Internet search engine [25], and uses the returned K images to train the naive Bayesian model to obtain the image feature distribution. The K value of the experiment is 50, based on experience [26–31].

3 Results

The comparison of the three algorithms under cosine similarity is shown in Figures 2, 3 and 4. Analyzing these results, we can see that the maximum value of the uniform distribution algorithm is 0.94 and the minimum value is 0.74; the maximum value of the label query algorithm is 0.97 and the minimum value is 0.76; and the maximum value of the proposed algorithm is 0.99 and the minimum value is 0.76. Through data comparison, the cosine similarity of the proposed algorithm is always higher than that of the uniform distribution algorithm and the label query algorithm.

Figure 9: Comparison of different algorithms for estimating the distribution under K-L divergence values.

The proposed algorithm can approximately estimate the image feature distribution of each event category from the text data of related events and the text-image co-occurrence data.

Under the 100 events, the similarity distribution of the text-image data is simulated. The proposed algorithm is compared in simulation with the uniform distribution algorithm and the label query algorithm, and the average optimal fitness and average operation time of each algorithm are obtained. The detailed results are given in Table 1.

Table 1: Simulation results of the approximate distribution of image data under 100 events

Type of algorithm                Average optimum fitness /%    Mean operation time /s
Uniform distribution algorithm   7.51                          54.09
Tagged query algorithm           8.22                          53.69
Algorithm in this paper          9.85                          34.72

From the analysis of Table 1, we can see that the optimal fitness of the proposed algorithm is 9.85%, that of the uniform distribution algorithm is 7.51%, and that of the label query algorithm is 8.22%, so the fitness of the proposed algorithm is the highest. In the comparison of average operation time, the proposed algorithm takes 34.72 s, the uniform distribution algorithm takes 54.09 s, and the label query algorithm takes 53.69 s, indicating that the proposed algorithm takes the shortest time and has the highest efficiency. This algorithm can quickly and effectively extract the approximate feature distribution of the text-image data under the 100 events.
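The two evaluation measures of Section 2.2.5 can be sketched in a few lines of Python; the reference distribution p and the two approximations q below are invented toy values, not the paper's experimental data.

```python
import numpy as np

def cosine_similarity(p, q):
    """Eq. (17): CS(P, q) = sum(P_i * q_i) / (sqrt(sum P_i^2) * sqrt(sum q_i^2))."""
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

def kl_divergence(p, q, eps=1e-12):
    """Eq. (18): KL(P || q) = sum_i P_i * log2(P_i / q_i); eps guards empty bins."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log2(p / q)))

# reference feature distribution p (e.g. a naive Bayesian estimate as in Eq. (16))
p = np.array([0.40, 0.30, 0.20, 0.10])
# approximations: a mapped estimate and a uniform baseline
q_mapped = np.array([0.35, 0.32, 0.22, 0.11])
q_uniform = np.full(4, 0.25)

for name, q in [("mapped", q_mapped), ("uniform", q_uniform)]:
    print(name, round(cosine_similarity(p, q), 3), round(kl_divergence(p, q), 3))
```

A higher CS value and a lower K-L value both indicate that q approximates the reference distribution p more closely, which is the sense in which the proposed algorithm outperforms the two baselines above.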
4 Discussion

In the traditional machine learning framework, the task of learning is to learn a classification model from given, sufficient training data, and then use this model to classify and predict test documents. However, machine learning algorithms face a key problem in current Internet mining research: a large amount of training data in some emerging areas is difficult to obtain. The development of Internet applications is very fast, and a large number of new areas are emerging, from traditional news to web pages, pictures, blogs, podcasts and so on. Traditional machine learning needs to calibrate a large amount of training data in each field, which consumes manpower and material resources. Without a large amount of annotated data, many learning-related studies and applications cannot be carried out. Secondly, traditional machine learning assumes that training data and test data obey the same data distribution. However, in many cases, this same-distribution hypothesis is not satisfied. In addition, training data is often out of date. This often requires re-labelling a large volume of training data to meet our training needs, but labeling new data is expensive and requires manpower and material resources. On the other hand, if we have a lot of training data with different distributions, it would be wasteful to discard it completely. How to make rational use of such data is the problem that transfer learning aims to solve.

Transfer learning can transfer knowledge from existing data to help future learning. The goal of transfer learning is to use the knowledge learned from one environment to assist learning tasks in a new environment. Therefore, transfer learning does not make the same-distribution assumption of traditional machine learning. At present, work on transfer learning can be divided into two parts: case-based transfer learning in isomorphic space and feature-based transfer learning in isomorphic space. It is pointed out that case-based transfer learning has stronger knowledge transfer ability, while feature-based transfer learning has wider knowledge transfer ability; the two methods have their own merits. Transfer learning is a relatively new research direction in machine learning, and current research mainly focuses on data mining, natural language processing, information retrieval and image classification.
Machine learning has provided extensive research findings and results, but research into transfer learning is minimal. Features and samples are two important aspects of text categorization, and it is important to consider these two factors comprehensively. Sample-based transfer learning is another method to solve the problem of transfer learning. Traditional methods also use feature-based or sample-based transfer learning methods, but there is a lack of comprehensive use of the two. The algorithm proposed in this paper can find the data very similar to the test data from the existing data and improve the accuracy of the model.

5 Conclusions

In this paper, a text-image feature mapping algorithm based on transfer learning is proposed. Firstly, clustering technology is used to filter the existing data and find the data similar to the target data, in order to help the learning of the target task and improve the performance of the classifier. Then, the event text data is modeled by the latent Dirichlet allocation method, and the most prominent text features are selected by calculating the information gain of the topic features; the event images are modeled using the visual word bag model and the naive Bayesian method. The approximate extraction of the image feature distribution is realized through the text data feature distribution and the text-image co-occurrence data feature distribution under the same event. Compared with the traditional uniform distribution algorithm and the labeled query algorithm, the average cosine similarity of the proposed algorithm is 92%, that of the uniform distribution algorithm is 76%, and that of the labeled query algorithm is 84%. The average dispersion of the proposed algorithm is 0.06%, that of the uniform distribution algorithm is 0.17%, and that of the labeled query algorithm is 0.09%. The experimental data show that the proposed algorithm has the advantages of high cosine similarity and low dispersion.

References

[1] Wang F., Youh J., Fux Y., Auto-Adaptive Well-Distributed Scale-Invariant Feature for SAR Images Registration, Geomat. Inform. Sci. Wuhan Univ., 2015, 40(2), 159-163.
[2] Wang K., Shil Z., Design and Implementation of Fast Connected Component Labeling Algorithm based on FPGA, Comp. Eng. Appl., 2016, 52(18), 192-198.
[3] Lozoya R.C., Berte B., Cochet H., Model-based Feature Augmentation for Cardiac Ablation Target Learning from Images, IEEE Trans. Biomed. Eng., 2018, PP(99), 1-1.
[4] Cazade P.A., Zheng W., Pradagracia D., A Comparative Analysis of Clustering Algorithms: O2 Migration in Truncated Hemoglobin I from Transition Networks, J. Chem. Phys., 2015, 142(2), 025103.
[5] Wan S., Niu Z., A Learner Oriented Learning Recommendation Approach based on Mixed Concept Mapping and Immune Algorithm, Knowledge-Based Syst., 2016, 103(C), 28-40.
[6] Han X.H., Xiong X., Duan F., A New Method for Image Segmentation based on BP Neural Network and Gravitational Search Algorithm Enhanced by Cat Chaotic Mapping, Appl. Intel., 2015, 43(4), 855-873.
[7] Zhou T., Hu W., Ning J., An Efficient Local Operator-based Q-compensated Reverse Time Migration Algorithm with Multistage Optimization, Geophys., 2018, 83(3), S249-S259.
[8] Gorodnitskiy E., Perel M., Geng Y., Depth Migration with Gaussian Wave Packets based on Poincaré Wavelets, Geophys. J. Int., 2016, 205(1), 301-318.
[9] Rastogi R., Srivastava A., Khonde K., An Efficient Parallel Algorithm: Poststack and Prestack Kirchhoff 3D Depth Migration Using Flexi-depth Iterations, Comp. Geosci., 2015, 80, 1-8.
[10] Tosun S., Ozturk O., Ozkan E., Application Mapping Algorithms for Mesh-based Network-on-chip Architectures, J. Supercomp., 2015, 71(3), 995-1017.
[11] Kalantar B., Mansor S.B., Sameen M.I., Drone-based Land-cover Mapping Using a Fuzzy Unordered Rule Induction Algorithm Integrated into Object-based Image Analysis, Int. J. Remote Sens., 2017, 38(8-10), 2535-2556.
[12] Mackenzie C., Pichara K., Protopapas P., Clustering Based Feature Learning on Variable Stars, Astrophys. J., 2016, 820(2), 138.
[13] Li H., Zhu G., Cui C., Energy-efficient Migration and Consolidation Algorithm of Virtual Machines in Data Centers for Cloud Computing, Comput., 2016, 98(3), 303-317.
[14] Xiang T., Yan L., Gao R., A Fusion Algorithm for Infrared and Visible Images based on Adaptive Dual-channel Unit-linking PCNN in NSCT Domain, Infrared Phys. Technol., 2015, 69, 53-61.
[15] Dong J., Xiao X., Menarguez M.A., Mapping Paddy Rice Planting Area in Northeastern Asia with Landsat 8 Images, Phenology-based Algorithm and Google Earth Engine, Remote Sens. Envir., 2016, 185, 142-154.
[16] Li Q., Zhou H., Zhang Q., Efficient Reverse Time Migration based on Fractional Laplacian Viscoacoustic Wave Equation, Geophys. J. Int., 2016, 204(1), 488-504.
[17] Medrano E.A., Wiel B.J.H.V.D., Uittenbogaard R.E., Simulations of the Diurnal Migration of Microcystis Aeruginosa, based on a Scaling Model for Physical-biological Interactions, Ecolog. Mod., 2016, 337, 200-210.
[18] Matsubayashi A., Asymptotically Optimal Online Page Migration on Three Points, Algorithmica, 2015, 71(4), 1035-1064.
[19] Yap W.S., Phan C.W., Yau W.C., Cryptanalysis of a New Image Alternate Encryption Algorithm based on Chaotic Map, Nonlin. Dyn., 2015, 80(3), 1483-1491.
[20] Rastogi R., Londhe A., Srivastava A., 3D Kirchhoff Depth Migration Algorithm, Comp. Geosci., 2017, 100(C), 67-75.
[21] Thierry P., Lambaré G., Podvin P., 3-D Preserved Amplitude Prestack Depth Migration on a Workstation, Geophys., 2015, 64(1), 222-229.
[22] Zheng X.W., Lu D.J., Wang X.G., A Cooperative Coevolutionary Biogeography-based Optimizer, Appl. Intel., 2015, 43(1), 1-17.
[23] Wang M., Study on Operation Reliability of Transfer System of Urban Transportation Hub based on Reliability Theory, Automat. Instrument., 2016, (1), 418-534.
[24] Cong S., Gao M.Y., Cao G., Ultrafast Manipulation of a Double Quantum-Dot Charge Qubit Using Lyapunov-Based Control Method, IEEE J. Quant. Electr., 2015, 51(8), 1-8.
[25] Yan X., Yang S., Hong H.E., Load Adaptive Control Based on Frequency Bifurcation Boundary for Wireless Power Transfer System, J. Pow. Supp., 2017, 43(4), 1025-1084.
[26] Lokesha V., Deepika T., Ranjini P.S., Cangul I.N., Operations of Nanostructures Via SDD, ABC4 and GA5 Indices, Appl. Math. Nonlin. Sci., 2017, 2(1), 173-180.
[27] Molinos-Senante M., Guzman C., Benchmarking Energy Efficiency in Drinking Water Treatment Plants: Quantification of Potential Savings, J. Clean. Prod., 2018, 176, 417-425.
[28] Gao W., Farahani M.R., Aslam A., Hosamani S., Distance Learning Techniques for Ontology Similarity Measuring and Ontology Mapping, Cluster Computing - The J. Net. Soft. Tools Appl., 2017, 20(2SI), 959-968.
[29] Ge S.B., Ma J.J., Jiang S.C., Liu Z., Peng W.X., Potential Use of Different Kinds of Carbon in Production of Decayed Wood Plastic Composite, Arabian J. Chem., 2018, 11(6), 838-843.
[30] Singh K., Gupta N., Dhingra M., Effect of Temperature Regimes, Seed Priming and Priming Duration on Germination and Seedling Growth on American Cotton, J. Envir. Biol., 2018, 39(1), 83-91.
[31] Hosamani S.M., Correlation of Domination Parameters with Physicochemical Properties of Octane Isomers, Appl. Math. Nonlin. Sci., 2016, 1(2), 345-352.