A Concept-Based Approach for Generating Better Topics for Web Search Results

Authors
  • Mehala, N. (1)
  • Bhatia, Divyansh (2)
  • (1) PES University, Bengaluru, India
  • (2) eBay Inc., San Jose, CA, USA
Type
Published Article
Journal
SN Computer Science
Publisher
Springer Singapore
Publication Date
Sep 08, 2020
Volume
1
Issue
5
Identifiers
DOI: 10.1007/s42979-020-00311-y
Source
Springer Nature

Abstract

Because the web is accessible to a vast population around the globe, users today pose a large number of queries to web search tools, often with dynamic, vague, or unclear intentions, and organizing search results has consequently become an increasingly challenging task. Such queries make it difficult for search tools to comprehend the exact user context, so they retrieve an extensive volume of results, a significant portion of which is irrelevant to the user. One answer to this problem is search result clustering (SRC), which groups the search results and presents them to users as a set of options for the query. In this work, we propose an approach that first classifies the related topics and lays them out in the form of concepts, then builds search result clusters by assigning each result to the relevant topic, and finally provides relevant labels for these topics. We examine the effectiveness of our approach by comparing it against the two most popular non-commercial methods in this field, Lingo and STC, on two standard datasets, ODP and Ambient, and a newly developed dataset, Ex-Ambient, which is a rigorously extended version of the Ambient dataset. We performed the analysis along both qualitative and quantitative dimensions. We define the qualitative dimension as the expressiveness of the generated cluster labels, while the quantitative dimension concerns the correctness of the documents assigned to each cluster. The experimental results of the proposed method were encouraging in comparison with Lingo and STC on all the datasets and along both dimensions.
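
To make the three stages named in the abstract concrete (concepts, assignment of results to topics, labeling), the following is a minimal illustrative sketch and not the authors' method, which the abstract does not specify. It assumes that concepts are already available as hypothetical keyword sets, uses plain keyword overlap as a stand-in for the assignment step, and labels each cluster with its most frequent non-query term. All names (`cluster_results`, `label_cluster`, `concepts`, the example snippets) are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "and", "in", "for", "is", "to", "on"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cluster_results(snippets, concepts):
    """Assign each snippet to the concept whose keywords it overlaps most.

    `concepts` maps a hypothetical concept id to a set of keywords.
    Snippets that overlap with no concept go to an "other" cluster.
    """
    clusters = defaultdict(list)
    for snippet in snippets:
        tokens = set(tokenize(snippet))
        best, best_overlap = "other", 0
        for concept_id, keywords in concepts.items():
            overlap = len(tokens & keywords)
            if overlap > best_overlap:
                best, best_overlap = concept_id, overlap
        clusters[best].append(snippet)
    return dict(clusters)


def label_cluster(snippets, query):
    """Label a cluster with its most frequent term, excluding query terms."""
    query_terms = set(tokenize(query))
    counts = Counter(
        t for s in snippets for t in tokenize(s) if t not in query_terms
    )
    return counts.most_common(1)[0][0] if counts else ""


# Toy data for the ambiguous query "jaguar" (invented for this example).
query = "jaguar"
snippets = [
    "Jaguar car dealer and luxury car prices",
    "Jaguar cat habitat in the rainforest",
    "New Jaguar car models",
    "The jaguar is a big cat",
]
concepts = {
    "c1": {"car", "dealer", "models"},
    "c2": {"cat", "habitat", "rainforest"},
}

clusters = cluster_results(snippets, concepts)
labels = {cid: label_cluster(docs, query) for cid, docs in clusters.items()}
```

In this toy run the two automotive snippets fall under one concept and the two animal snippets under the other. A real system would derive the concepts from the results themselves and use a stronger similarity measure than raw keyword overlap.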
