Mathonat, Romain; Nurbakova, Diana; Boulicaut, Jean-François; Kaytoue, Mehdi
It is extremely useful to exploit labeled datasets not only to learn models and perform predictive analytics but also to improve our understanding of a domain and its available targeted classes. The subgroup discovery task has been considered for more than two decades. It concerns the discovery of patterns covering sets of objects having interestin...
Millot, Alexandre; Cazabet, Rémy; Boulicaut, Jean-François
Subgroup discovery in labeled data is the task of discovering patterns in the description space of objects to find subsets of objects whose labels show an interesting distribution, for example the disproportionate representation of a label value. Discovering interesting subgroups in purely numerical data (attributes and target label) has received lit...
Bariatti, Francesco; Cellier, Peggy; Ferré, Sébastien
Many graph pattern mining algorithms have been designed to identify recurring structures in graphs. The main drawback of these approaches is that they often extract too many patterns for human analysis. Recently, pattern mining methods using the Minimum Description Length (MDL) principle have been proposed to select a characteristic subset of patte...
Soulet, Arnaud
Pattern mining is an enumeration technique used to discover knowledge from databases. This Habilitation thesis summarizes our main contributions regarding user-centric pattern mining. First, we introduce the pattern-oriented relational algebra (PORA), which is the formalism used throughout the thesis. We add a domain operator to the relational alge...
Mathonat, Romain; Nurbakova, Diana; Boulicaut, Jean-François; Kaytoue, Mehdi
It is extremely useful to exploit labeled datasets not only to learn models but also to improve our understanding of a domain and its available targeted classes. The so-called subgroup discovery task has been considered for a long time. It concerns the discovery of patterns or descriptions, the set of supporting objects of which have interesting pr...
Bendimerad, Anes; Lijffijt, Jefrey; Plantevit, Marc; Robardet, Celine; De Bie, Tijl
Concepts are often described in terms of positive integer-valued attributes that are organized in a hierarchy. For example, cities can be described in terms of how many places there are of various types (e.g. nightlife spots, residences, food venues), and these places are organized in a hierarchy (e.g. a Portuguese restaurant is a type of food venu...
Tatti, Nikolaj; Mampaey, Michael
Published in Data Mining and Knowledge Discovery
Assessing the quality of discovered results is an important open problem in data mining. Such assessment is particularly vital when mining itemsets, since commonly many of the discovered patterns can be easily explained by background knowledge. The simplest approach to screen uninteresting patterns is to compare the observed frequency against the i...
Adriaens, Florian; Lijffijt, Jefrey; De Bie, Tijl
Consider a large graph or network, and a user-provided set of query vertices between which the user wishes to explore relations. For example, a researcher may want to connect research papers in a citation network, an analyst may wish to connect organized crime suspects in a communication network, or an internet user may want to organize their bookm...
Boeckling, Toon; Bronselaer, Antoon; De Tré, Guy
Since their introduction in 1976, edit rules have been a standard tool in statistical analysis. Basically, edit rules are a compact representation of non-permitted combinations of values in a dataset. In this paper, we propose a technique to automatically find edit rules by use of the concept of T-dependence. We first generalize the traditional not...
Gautrais, Clément