SEMIE: SEMantically Infused Embeddings with Enhanced Interpretability for Domain-specific Small Corpus

Gupta, Rishabh; Rao, Rajesh N

arXiv:2103.11431 (cs)

[Submitted on 21 Mar 2021]

Abstract: Word embeddings are a basic building block of modern NLP pipelines. Efforts have been made to learn rich, efficient, and interpretable embeddings for large generic datasets available in the public domain. However, these embeddings have limited applicability for small corpora from specific domains such as automotive, manufacturing, and maintenance and support. In this work, we present a comprehensive notion of interpretability for word embeddings and propose a novel method to generate highly interpretable and efficient embeddings for a domain-specific small corpus. We report the evaluation results of our resulting word embeddings and demonstrate their novel features for enhanced interpretability.

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2103.11431 [cs.CL] (or arXiv:2103.11431v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2103.11431

Submission history

From: Rishabh Gupta
[v1] Sun, 21 Mar 2021 16:28:08 UTC (7,337 KB)
