Automatic Generation of a Large Scale Semantic Search Evaluation Data-Set

  • QA076 Computer Software
  • Linguistics


Comparing the performance of information retrieval techniques in different settings requires data-sets that model those settings. Existing collections, such as those used in the TREC conference series, support the evaluation of various retrieval tasks, but there is a lack of collections developed specifically for evaluating the effectiveness of semantically enhanced text retrieval techniques. In this paper, we propose an approach for the automatic generation of such data-sets from search engine query logs and data from human-edited web directories. The evaluation is performed by comparing the performance of Lucene, a popular syntactic search engine, with that of Concept Search, a search engine that extends Lucene's syntactic search with semantics.
