Parallel Knowledge Embedding with MapReduce on a Multi-core Processor

Fan, Miao; Zhou, Qiang; Zheng, Thomas Fang; Grishman, Ralph

Abstract: This article is a first attempt to explore parallel algorithms for learning distributed representations of both entities and relations in large-scale knowledge repositories, using the MapReduce programming model on a multi-core processor. We accelerate the training process of a canonical knowledge embedding method, the translating embedding (TransE) model, by dividing the whole knowledge repository into several balanced subsets and feeding each subset to an individual core, where local embeddings are updated concurrently during the Map phase. However, this approach usually yields inconsistent low-dimensional vector representations of the same key, collected from different Map workers, which leads to conflicts when the Reduce phase merges the vectors associated with that key. We therefore try several strategies for obtaining merged embeddings that not only retain the performance of single-thread TransE on entity inference, relation prediction, and triplet classification, evaluated on several well-known knowledge bases such as Freebase and NELL, but also scale the learning speed with the number of cores in the processor. The empirical studies so far show that we can achieve results comparable to those of single-thread TransE by using the stochastic gradient descent (SGD) algorithm, and increase the training speed several times by adapting the batch gradient descent (BGD) algorithm to the MapReduce paradigm.
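The Map/Reduce split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the abstract does not say which merge strategies were tried; plain averaging in the Reduce step is an assumption made here for illustration. The function names (`split_balanced`, `map_worker`, `reduce_merge`, `train_round`), the squared L2 score, tail-only corruption, the learning rate, and the margin are likewise hypothetical choices. Entity normalization is omitted for brevity.

```python
import random
from collections import defaultdict

DIM = 8  # embedding dimensionality (illustrative value)


def split_balanced(triples, n_workers):
    # Round-robin dealing keeps subset sizes within one triple of each other.
    return [triples[i::n_workers] for i in range(n_workers)]


def sq_dist(h, r, t):
    # Squared L2 distance ||h + r - t||^2 used as the translation score.
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def map_worker(job):
    # Map: run local SGD on a private copy and emit (key, vector) pairs
    # for every key this worker updated.
    subset, emb, entities, seed = job
    rng = random.Random(seed)
    local = {k: list(v) for k, v in emb.items()}  # private copy
    touched = set()
    lr, margin = 0.05, 1.0  # assumed hyperparameters
    for h, r, t in subset:
        t_neg = rng.choice(entities)  # corrupt the tail (assumed scheme)
        if t_neg in (h, t):
            continue
        kh, kr, kt, kn = ("e", h), ("r", r), ("e", t), ("e", t_neg)
        vh, vr, vt, vn = local[kh], local[kr], local[kt], local[kn]
        # Margin ranking loss: skip triples that already satisfy the margin.
        if margin + sq_dist(vh, vr, vt) - sq_dist(vh, vr, vn) <= 0:
            continue
        for i in range(DIM):
            g_pos = 2 * (vh[i] + vr[i] - vt[i])
            g_neg = 2 * (vh[i] + vr[i] - vn[i])
            vh[i] -= lr * (g_pos - g_neg)
            vr[i] -= lr * (g_pos - g_neg)
            vt[i] += lr * g_pos
            vn[i] -= lr * g_neg
        touched.update((kh, kr, kt, kn))
    return [(k, local[k]) for k in touched]


def reduce_merge(mapped):
    # Reduce: average all vectors sharing a key (one assumed merge strategy).
    groups = defaultdict(list)
    for pairs in mapped:
        for key, vec in pairs:
            groups[key].append(vec)
    return {k: [sum(col) / len(vs) for col in zip(*vs)]
            for k, vs in groups.items()}


def train_round(triples, emb, entities, n_workers):
    jobs = [(subset, emb, entities, i)
            for i, subset in enumerate(split_balanced(triples, n_workers))]
    # Run sequentially here; a process pool's map could run these jobs
    # concurrently, one per core.
    mapped = [map_worker(job) for job in jobs]
    emb.update(reduce_merge(mapped))
    return emb
```

Because each worker updates its own copy, two workers that touch the same entity or relation return different vectors for it. That is the conflict the Reduce step has to resolve, and averaging is only one of several conceivable ways to do so.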

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Databases (cs.DB)
Cite as:	arXiv:1509.01183 [cs.DC]
	(or arXiv:1509.01183v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1509.01183
