An Information Extraction Approach for Building Vocabulary and Domain Specific Ontology in Information Technology

  • Ta Cong Duy Chien
  • Phan Thi Tuoi

Abstract

This paper introduces the HETEONTO system, which extracts concepts from text files and from existing ontologies and resources such as Wikipedia, ACM, and WordNet in order to build a vocabulary and a domain-specific ontology for the Information Technology domain. It also describes how concepts are imported into the domain ontology. The data sources and techniques that HETEONTO uses for ontology learning from texts, Wikipedia, and WordNet are briefly presented. The paper then focuses on evaluating the generated domain ontology with selected methods. One of the methods introduced here is comparative evaluation, which in this study uses the same corpus to contrast the results produced by HETEONTO. The results of these experiments show that HETEONTO yields superior performance, especially regarding semantic relation
Published
2014-10-28
Section
Regular Articles