In this abstract, we combine work from [Lagarde et al., 2010] and [Calinon et al., 2009] on learning and reproducing, respectively, navigation tasks on a mobile robot and gestures with a robot arm. Both approaches build a sensory-motor map under human guidance to learn the desired behavior. When several actions are possible at the same time, action selection becomes a real issue. Several solutions to this problem exist: hierarchical architectures, parallel modules including subsumption architectures, or even a mix of both [Bryson, 2000]. In navigation, a temporal sequence learner or a state-action association learner [Lagarde et al., 2010] makes it possible to learn a sequence of directions in order to follow a trajectory. These solutions can be extended to action sequence learning. In this paper, we propose a simple perception-action architecture that can produce complex behaviors from the incremental learning of simple tasks. We then discuss the advantages and limitations of this architecture, which raises many questions.