In terms of capability, there is still a huge gap between the human visual system and existing computer vision algorithms. To achieve results of sufficient quality, these algorithms are generally highly specialised for the task they were designed for. All the knowledge available at implementation time is used to bias the output and/or to facilitate the initialisation of the system. This increases robustness but reduces the reusability of the code. In most cases, it also severely limits the freedom of users by constraining them to a limited set of possible interactions.

In this thesis, we propose to go in the opposite direction by developing a general framework capable of both tracking and learning objects as complex as articulated objects. Robustness is achieved by using one task to assist the other. The method should be completely unsupervised, with no prior knowledge about the appearance or shape of the objects encountered (although we chose to focus on rigid and articulated objects). With this framework, we hope to provide directions towards a more difficult and distant goal: completely eliminating the time-consuming prior design of object models in computer vision applications. Reaching this long-term goal would reduce the time and cost of implementing computer vision applications. It would also provide greater freedom in the range of objects that a program can handle.

Our research focuses on three main aspects of this framework. The first is to create an object description that is effective on a wide variety of complex objects and able to assist object tracking while it is being learnt. The second is to provide tracking and learning methods that can be executed simultaneously in real time. This is particularly challenging for tracking when a large number of features are involved.
Finally, our most challenging task, and the core of this thesis, is to design robust tracking and learning solutions that can assist each other without introducing counter-productive bias when one of them fails.