Recent studies have highlighted the influence of multisensory integration mechanisms in the processing of motion information. One central issue in this research area concerns the extent to which the behavioral correlates of these effects can be attributed to late post-perceptual (i.e., response-related or decisional) processes rather than to perceptual mechanisms of multisensory binding. We investigated the influence of various top-down factors on the phenomenon of crossmodal dynamic capture, whereby the perceived direction of motion in one sensory modality (audition) is strongly influenced by motion presented in another sensory modality (vision). In Experiment 1, we introduced extensive feedback in order to manipulate the motivation level of participants and the extent of their practice with the task. In Experiment 2, we reduced the variability of the irrelevant (visual) distractor stimulus by making its direction predictable beforehand. In Experiment 3, we investigated the effects of changing the stimulus-response mapping (task). None of these manipulations exerted any noticeable influence on the overall pattern of crossmodal dynamic capture that was observed. We therefore conclude that the integration of multisensory motion cues is robust to a number of top-down influences, thereby revealing that the crossmodal dynamic capture effect reflects the relatively automatic integration of multisensory motion information.