Solving multiple-instance and multiple-part learning problems with decision trees and decision rules. Application to the mutagenesis problem
- Publication Date: May 31, 2000
- Source: HAL-Descartes
- Language: English
- License: Unknown
Abstract
In recent work, Dietterich et al. (1997) presented the problem of supervised multiple-instance learning and showed how to solve it by building axis-parallel rectangles. This problem arises in contexts where an object may have several alternative configurations, each of which is described by a vector. This paper introduces the multiple-part problem, which is more general than the multiple-instance problem, and shows how it can be solved using the multiple-instance algorithms. These two so-called "multiple" problems could play a key role both in the development of efficient algorithms for learning the relations between the activity of a structured object and its structural properties and in inductive logic programming. This paper analyzes and tries to clarify multiple-problem solving. It goes on to propose multiple-instance extensions of classical learning algorithms to solve "multiple" problems by learning multiple-decision trees (ID3-M, C4.5-M) and multiple-decision rules (AQ-M, CN2-M, Ripper-M). In particular, it suggests a new multiple-instance entropy function and a multiple-instance coverage function. Finally, it successfully applies the multiple-part framework to the well-known mutagenesis prediction problem.
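To make the multiple-instance setting concrete, the following minimal sketch may help. It assumes the standard multiple-instance convention from Dietterich et al. (1997): an object (a "bag") is covered by a hypothesis when at least one of its instance vectors falls inside an axis-parallel rectangle. The function names (`in_rectangle`, `bag_covered`, `bag_entropy`) and the toy data are hypothetical, and the entropy shown simply counts covered bags rather than covered instances; it is one plausible reading of a bag-level entropy, not necessarily the exact function defined in the paper.

```python
from math import log2

# A bag is a list of instances; each instance is a tuple of numeric features.
# A rectangle is a list of (low, high) bounds, one pair per feature.

def in_rectangle(instance, rectangle):
    """True if every feature of the instance lies within its bounds."""
    return all(low <= x <= high for x, (low, high) in zip(instance, rectangle))

def bag_covered(bag, rectangle):
    """Multiple-instance coverage: one instance inside the rectangle suffices."""
    return any(in_rectangle(inst, rectangle) for inst in bag)

def bag_entropy(bags, labels, rectangle):
    """Entropy of the class labels among covered bags (assumed form).

    Bags, not instances, are counted, so a bag with many instances
    weighs the same as a bag with a single instance.
    """
    pos = sum(1 for b, y in zip(bags, labels) if y and bag_covered(b, rectangle))
    neg = sum(1 for b, y in zip(bags, labels) if not y and bag_covered(b, rectangle))
    total = pos + neg
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            entropy -= p * log2(p)
    return entropy

# Toy data (hypothetical): two positive bags and one negative bag.
bags = [
    [(1.0, 2.0), (5.0, 5.0)],
    [(0.9, 2.1)],
    [(4.0, 4.0), (6.0, 1.0)],
]
labels = [True, True, False]

narrow = [(0.5, 1.5), (1.5, 2.5)]   # covers only the positive bags
wide = [(0.0, 10.0), (0.0, 10.0)]   # covers every bag

covered_narrow = [bag_covered(b, narrow) for b in bags]
entropy_narrow = bag_entropy(bags, labels, narrow)
entropy_wide = bag_entropy(bags, labels, wide)
```

In this toy case the narrow rectangle covers both positive bags and no negative bag, so its bag-level entropy is zero, while the wide rectangle covers all three bags and yields a nonzero entropy.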