Clustering technique in multi-document personal name disambiguation

C Chen, J Hu, H Wang
Proceedings of the ACL-IJCNLP 2009 Student Research Workshop, 2009 - aclanthology.org
Abstract
Focusing on multi-document personal name disambiguation, this paper develops an agglomerative clustering approach to the problem. We start from an analysis of pointwise mutual information between features and the ambiguous name, which leads to a novel method for computing feature weights in clustering. Then a trade-off measure between within-cluster compactness and among-cluster separation is proposed as the stopping criterion for clustering. After that, we apply a labeling method to find representative features for each cluster. Finally, experiments on word-based clustering are conducted on a Chinese dataset, and the results show that the approach is effective.
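
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give their formulas, so the document-frequency PMI estimate, the choice to keep only positive-PMI features, cosine similarity with average linkage, and the "mean within-cluster similarity minus mean between-cluster similarity" stopping score are all assumptions made for illustration. The function names and the toy English token lists are hypothetical (the paper's experiments use Chinese data), and the labeling step is omitted.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_weights(collection):
    """collection: list of (tokens, mentions_name) pairs.
    Returns {feature: PMI(feature, name)} estimated from document frequencies."""
    n_docs = len(collection)
    n_name = sum(1 for _, has_name in collection if has_name)
    df_all, df_name = Counter(), Counter()
    for tokens, has_name in collection:
        for f in set(tokens):
            df_all[f] += 1
            if has_name:
                df_name[f] += 1
    # PMI = log( P(f, name) / (P(f) * P(name)) )
    return {f: math.log(df_name[f] * n_docs / (df_all[f] * n_name))
            for f in df_name}

def vectorize(tokens, weights):
    # Assumption: keep only features positively associated with the name.
    return {f: weights[f] for f in set(tokens) if weights.get(f, 0.0) > 0.0}

def cosine(u, v):
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def tradeoff(clusters, sim):
    """Assumed stand-in for the stopping measure:
    mean within-cluster similarity minus mean between-cluster similarity."""
    owner = {i: k for k, members in enumerate(clusters) for i in members}
    within, between = [], []
    for i, j in combinations(sorted(owner), 2):
        (within if owner[i] == owner[j] else between).append(sim[i][j])
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(within) - mean(between)

def cluster(vectors):
    """Average-link agglomerative clustering; returns the partition
    with the highest trade-off score seen during merging."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    best, best_score = [list(c) for c in clusters], tradeoff(clusters, sim)
    while len(clusters) > 1:
        def link(pair):
            a, b = pair
            total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))
        a, b = max(combinations(range(len(clusters)), 2), key=link)
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
        score = tradeoff(clusters, sim)
        if score > best_score:
            best, best_score = [list(c) for c in clusters], score
    return best

# Toy data (made up): True marks documents that mention the ambiguous name.
collection = [
    (["basketball", "team", "coach", "game"], True),
    (["basketball", "game", "season", "team"], True),
    (["physics", "quantum", "lab", "paper"], True),
    (["physics", "paper", "quantum", "university"], True),
    (["game", "weather", "news"], False),
    (["paper", "news", "market"], False),
    (["news", "weather", "market"], False),
]
weights = pmi_weights(collection)
name_docs = [tokens for tokens, has_name in collection if has_name]
result = cluster([vectorize(tokens, weights) for tokens in name_docs])
```

On this toy input the first two name-bearing documents share sports vocabulary and the last two share physics vocabulary, so `result` groups them into two clusters. Words that also appear in documents without the name (here `game` and `paper`) receive lower PMI weights and therefore contribute less to the similarity.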