HoloSelecta dataset: 10’035 GTIN-labelled product instances in vending machines for object detection of packaged products in retail environments

  • Fuchs, K.1
  • Grundmann, T.1
  • Haldimann, M.2
  • Fleisch, E.1
  • 1 ETH Zurich, Switzerland
  • 2 D ONE, Switzerland
Published Article
Data in Brief
Publication Date
Sep 08, 2020
DOI: 10.1016/j.dib.2020.106280
PMID: 32984473
PMCID: PMC7494663
PubMed Central


To assess the potential of current neural network architectures to reliably identify packaged products within a retail environment, we created an open-source dataset of 295 shelf images of vending machines with 10’035 labelled instances of 109 products. The dataset contains photos of vending machines operated by Selecta, the largest European operator of vending machines. The machines are located in both public spaces and private offices, and they contain food as well as beverage products. Each product instance in the images is labelled with a bounding box that encapsulates the entire product with as little overlap as possible. The label attached to each bounding box is structured and human-readable: it includes the brand, product name and size, as well as the GTIN (Global Trade Item Number) of the product. The GTIN is the global standard for identifying products in retail and therefore increases the value of the dataset for the retail industry. In contrast to typical object detection datasets, which use higher-level labels such as "can" or "bottle" for a much wider variety of objects, this dataset uses far more detailed labels that depend less on the shape of a product than on its exact design. The dataset belongs to the category of object detection datasets with a large number of objects (on average about 34 labelled instances per image), which, alongside the GTIN labels, is a main differentiator from other object detection datasets.
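As an illustration of the label structure described above, the following Python sketch models one labelled product instance and validates its GTIN with the standard GS1 check-digit rule. It is a minimal sketch, not the dataset's actual file format: the class name `ProductInstance`, its field names, the pixel-coordinate convention and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass


def gtin_is_valid(gtin: str) -> bool:
    """Return True if `gtin` is a GTIN-8/12/13/14 with a correct GS1 check digit."""
    if not (gtin.isascii() and gtin.isdigit()) or len(gtin) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in gtin]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3, 1, 3, ... starting at the digit next to the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


@dataclass(frozen=True)
class ProductInstance:
    """One labelled product in a shelf image (hypothetical field names)."""
    brand: str
    product_name: str
    size: str
    gtin: str
    # Bounding box in pixel coordinates (assumed convention: top-left, bottom-right).
    xmin: int
    ymin: int
    xmax: int
    ymax: int

    def __post_init__(self) -> None:
        if not gtin_is_valid(self.gtin):
            raise ValueError(f"invalid GTIN: {self.gtin!r}")
        if self.xmin >= self.xmax or self.ymin >= self.ymax:
            raise ValueError("bounding box must have positive width and height")


# Illustrative values only; this is not an entry from the dataset.
example = ProductInstance(
    brand="ExampleBrand",
    product_name="Example Cola",
    size="500ml",
    gtin="4006381333931",
    xmin=120, ymin=80, xmax=210, ymax=340,
)
```

Validating the GTIN when a record is constructed means a mistyped identifier is rejected immediately rather than silently becoming a new product class.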
