Contributions to unsupervised domain adaptation : Similarity functions, optimal transport and theoretical guarantees

  • Dhouib, Sofiane
Publication Date
Nov 23, 2020
The surge in the quantity of data produced nowadays has made Machine Learning, a subfield of Artificial Intelligence, a vital tool for extracting valuable patterns from that data, and has allowed it to be integrated into almost every aspect of our everyday activities. Concretely, a machine learning algorithm learns such patterns by being trained on a dataset called the training set, and its performance is assessed on a different dataset called the testing set. Domain Adaptation is an active research area of machine learning in which, unlike in standard Supervised Learning, the training and testing sets are not assumed to stem from the same probability distribution. In this setting, the distributions generating the training and testing data are called the source and target domains, respectively. Our contributions focus on three theoretical aspects related to domain adaptation for classification tasks. The first is learning with similarity functions, which deals with classification algorithms that compare an instance to other examples in order to decide its class. The second is large-margin classification, which concerns learning classifiers that maximize the separation between classes. The third is Optimal Transport, which formalizes the principle of least effort for transporting probability mass between two distributions. At the beginning of the thesis, we were interested in learning with so-called (epsilon, gamma, tau)-good similarity functions in the domain adaptation framework, since these functions were introduced in the literature in the classical framework of supervised learning. This is the subject of our first contribution, in which we theoretically study the performance of a similarity function on the target distribution, given that it is suitable for the source one. Then, we tackle the more general topic of large-margin classification in domain adaptation, under weaker assumptions than those adopted in the first contribution.
In this context, we propose a new theoretical study and a domain adaptation algorithm, which together form our second contribution. We derive novel bounds that take the classification margin on the target domain into account, and we convexify them by leveraging Optimal Transport theory, which yields a domain adaptation algorithm based on an adversarial variation of the classic Kantorovich problem. Finally, after noticing that our adversarial formulation can be generalized to include several other cases of interest, we dedicate our last contribution to adversarial, or minimax, variations of the optimal transport problem, where we demonstrate the versatility of our approach.
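
To make the "least effort" principle behind the classic Kantorovich problem concrete, the following is a minimal illustrative sketch and not the algorithm developed in the thesis. It assumes two point sets of equal size with uniform weights, in which case an optimal transport plan can be found among the one-to-one assignments, so a brute-force search over permutations is enough for tiny inputs. The names `kantorovich_uniform` and `squared_distance` and the sample points are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def squared_distance(a, b):
    """Ground cost between two scalars (an illustrative choice)."""
    return (a - b) ** 2


def kantorovich_uniform(source, target, cost):
    """Brute-force optimal transport between two equal-size point sets.

    Assumes every point carries the same mass 1/n, so that an optimal
    plan is a one-to-one assignment. Only practical for very small n,
    since all n! assignments are enumerated.
    """
    n = len(source)
    if n != len(target):
        raise ValueError("this sketch assumes point sets of equal size")

    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Average cost of sending source[i] to target[perm[i]].
        total = sum(cost(source[i], target[perm[i]]) for i in range(n)) / n
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm


# Hypothetical one-dimensional samples standing in for the source
# and target domains.
source_points = [0.0, 1.0, 3.0]
target_points = [2.0, 0.5, 4.0]

ot_cost, assignment = kantorovich_uniform(
    source_points, target_points, squared_distance
)
print(ot_cost, assignment)
```

Here `assignment[i]` is the index of the target point that receives the mass of `source_points[i]`. Realistic problems with non-uniform weights or many points call for a linear programming or dedicated optimal transport solver rather than enumeration.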
