A Concept-Based Approach for Generating Better Topics for Web Search Results

  • Mehala, N. (1)
  • Bhatia, Divyansh (2)
  • (1) PES University, Bengaluru, India
  • (2) eBay Inc., San Jose, CA, USA
Published Article
SN Computer Science
Springer Singapore
Publication Date
Sep 08, 2020
DOI: 10.1007/s42979-020-00311-y
Springer Nature


The web is accessible to a vast population around the globe, and its users pose a large number of queries through web search tools, often with dynamic, vague, or unclear intentions. As a consequence, organizing search results has become an increasingly challenging task. Such queries make it difficult for web search tools to infer the exact user context, so they retrieve an extensive volume of results, a significant portion of which is irrelevant to the user. One answer to this problem is search result clustering (SRC), which groups the search results and presents them to users as a set of alternative topics for the query. In this work, we propose an approach that first identifies the related topics and lays them out in the form of concepts, then builds search result clusters by assigning each result to its relevant topic, and finally provides relevant labels for these topics. We examine the effectiveness of our approach by comparing it against the two most popular non-commercial methods in this field, Lingo and STC, on two standard datasets, ODP and Ambient, and on a newly developed dataset, Ex-Ambient, which is a rigorously extended version of the Ambient dataset. We performed the analysis along both qualitative and quantitative dimensions. We define the qualitative dimension as the expressiveness of the generated cluster labels, while the quantitative dimension concerns the correctness of the documents assigned to each cluster. The experimental results of the proposed method were encouraging in comparison with Lingo and STC on all the datasets and along both dimensions.
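
The three stages named in the abstract (topics laid out as concepts, results assigned to topics, topics labeled) can be illustrated with a toy sketch. This is not the authors' method: the keyword-overlap scoring, the function names `tokenize` and `cluster_by_concept`, the fallback label "Other", and the sample concepts and snippets are all assumptions made up for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase a snippet and split it into a set of punctuation-free tokens."""
    return {w.strip(".,;:!?()").lower() for w in text.split()} - {""}


def cluster_by_concept(snippets, concepts):
    """Assign each snippet to the concept whose keywords it overlaps most.

    `concepts` maps a human-readable label to a set of keywords
    (a hypothetical stand-in for the paper's concept representation).
    Snippets that match no concept go to a fallback "Other" cluster.
    """
    clusters = defaultdict(list)
    for snippet in snippets:
        tokens = tokenize(snippet)
        best_label, best_overlap = "Other", 0
        for label, keywords in concepts.items():
            overlap = len(tokens & keywords)
            if overlap > best_overlap:
                best_label, best_overlap = label, overlap
        clusters[best_label].append(snippet)
    return dict(clusters)


# Hypothetical concepts and search-result snippets for the query "jaguar".
concepts = {
    "Jaguar (animal)": {"cat", "wildlife", "rainforest", "predator"},
    "Jaguar (car)": {"car", "luxury", "sedan", "dealer"},
}
snippets = [
    "Jaguar is a big cat living in the rainforest.",
    "Find a Jaguar luxury sedan at your local dealer.",
    "Jaguar release notes for version 2.",
]
clusters = cluster_by_concept(snippets, concepts)
```

In this sketch the concept label doubles as the cluster label, so the qualitative dimension (label expressiveness) depends entirely on how the concepts are named, while the quantitative dimension corresponds to whether each snippet lands in the right cluster.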
