Domain Extraction From Research Papers
Keywords:
Text Mining, Information Extraction, Domain keyword extraction, Term Frequency and Inverse Document Frequency (TF – IDF)Abstract
Automatically finding domain specific key terms from a given set of research paper is a challenging task and research papers to a particular area of research is a concern for many people including students, professors and researchers. A domain classification of papers facilitates that search process. That is, having a list of domains in a research field, we try to find out to which domain(s) a given paper is more related. Besides, processing the whole paper to read take a long time. In this paper, using domain knowledge requires much human effort, e.g., manually composing a set of labeling a large corpus. In particular, we use the abstract and keyword in research paper as the seeing terms to identify similar terms from a domain corpus which are then filtered by checking their appearance in the research papers. Experiments show the TF –IDF measure and the classification step make this method more precisely to domains. The results show that our approach can extract the terms effectively, while being domain independent.