How well conditional random fields can be used in novel term recognition

Xing Zhang, Yan Song, Alex Chengyu Fang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

In this paper, we describe the construction of a machine learning framework that exploits syntactic information in the recognition of biomedical terms, and we present the limits of machine learning in generating a list of novel term candidates. Conditional random fields (CRF) are used as the basis of this framework. We seek an appropriate use of syntactic information, including parent nodes, syntactic paths and term ratios, within this framework. The experimental results show that the CRF model can achieve good precision in term recognition when trained with a known term list. However, when the goal is to discover potential novel terms for terminology lexicon editors, the CRF model performs poorly if it is trained only on a known term list and then used to predict novel terms in the test corpus. This result suggests that more semantic information may be needed to determine whether a word is a novel term during a specific period.

Original language: English
Title of host publication: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation
Pages: 583-592
Number of pages: 10
Publication status: Published - 2010
Event: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24 - Sendai, Japan
Duration: 4 Nov 2010 to 7 Nov 2010

Publication series

Name: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Conference

Conference: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24
Country/Territory: Japan
City: Sendai
Period: 4/11/10 to 7/11/10

Keywords

  • Conditional random fields
  • Novel term recognition
  • Term recognition
