Computer Science (计算机科学), 2017, Vol. 44, Issue (Z6): 80-83. doi: 10.11896/j.issn.1002-137X.2017.6A.016
LI Ying (李颖), HAO Xiao-yan (郝晓燕) and WANG Yong (王勇)
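Abstract: Traditional information extraction targets a specific domain; moving to a new domain requires manually writing new extraction rules and manually labeling new training samples. Open information extraction removes this limitation. Most existing open information extraction systems target English, whereas research on Chinese is comparatively scarce and focuses mainly on extracting triples; there is no method for extracting n-ary tuples from Chinese text. This paper therefore proposes a dependency-parsing-based method for Chinese open n-ary entity relation extraction. First, the text collection is preprocessed and its dependency relations are analyzed. Next, each verb is treated as a candidate relation word, and each base noun phrase connected to that verb by a valid dependency path satisfying the conditions is treated as an entity word; a relation word associated with two or more entity words is combined with those entity words to form a candidate n-ary entity relation tuple. Finally, a trained logistic regression classifier filters the candidate tuples. Extraction results on a Baidu Baike dataset show that the proposed method reaches an accuracy of up to 81% when extracting a large number of n-ary entity relation tuples.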
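The candidate-generation step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `Token` structure, the `candidate_tuples` function, the part-of-speech set `NOUN_POS`, and the choice of "valid" dependency paths (direct `SBV`/`VOB`/`IOB`/`FOB` arcs, plus a preposition attached by `ADV` whose object is attached by `POB`) are all assumptions made for the example, written in the style of LTP labels. The abstract does not list the actual path conditions, and the logistic regression filtering stage is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # part-of-speech tag (LTP-style tags assumed)
    head: int  # index of the head token, 0 for the root
    rel: str   # dependency relation to the head


# Assumed for illustration only: which tags count as entity words and
# which arcs count as a valid path from a verb to an entity word.
NOUN_POS = {"n", "nh", "ns", "ni", "nt", "nz"}
DIRECT_RELS = {"SBV", "VOB", "IOB", "FOB"}


def candidate_tuples(tokens):
    """Return (relation_word, entity_words) pairs with at least two entities."""
    children = {}
    for tok in tokens:
        children.setdefault(tok.head, []).append(tok)

    results = []
    for verb in tokens:
        if verb.pos != "v":
            continue  # only verbs are candidate relation words
        entities = []
        for child in children.get(verb.idx, []):
            if child.rel in DIRECT_RELS and child.pos in NOUN_POS:
                entities.append(child)
            elif child.rel == "ADV" and child.pos == "p":
                # Follow preposition -> object (e.g. "在 北京").
                for grandchild in children.get(child.idx, []):
                    if grandchild.rel == "POB" and grandchild.pos in NOUN_POS:
                        entities.append(grandchild)
        if len(entities) >= 2:
            entities.sort(key=lambda t: t.idx)  # keep sentence order
            results.append((verb.word, tuple(e.word for e in entities)))
    return results


# Hand-written parse of "李明 在 北京 创办 了 公司"
# ("Li Ming founded a company in Beijing"); not produced by a real parser.
sentence = [
    Token(1, "李明", "nh", 4, "SBV"),
    Token(2, "在", "p", 4, "ADV"),
    Token(3, "北京", "ns", 2, "POB"),
    Token(4, "创办", "v", 0, "HED"),
    Token(5, "了", "u", 4, "RAD"),
    Token(6, "公司", "n", 4, "VOB"),
]

print(candidate_tuples(sentence))
```

In this sketch the verb 创办 is linked to three entity words, so the result is a ternary tuple rather than a triple of the subject-verb-object kind. In the pipeline the abstract outlines, such candidates would then be passed to the trained classifier for filtering.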