Current accounts of attentional capture predict the most salient stimulus to be invariably selected first. However, existing salience and visual search models assume noise in the map computation or selection process. Consequently, they predict the first selection to be stochastically dependent on salience, implying that attention could even be captured first by the second most salient (instead of the most salient) stimulus in the field. Yet, capture by less salient distractors has not been reported and salience-based selection accounts claim that the distractor has to be more salient in order to capture attention. We tested this prediction using an empirical and modeling approach of the visual search distractor paradigm. For the empirical part, we manipulated salience of target and distractor parametrically and measured reaction time interference when a distractor was present compared to absent. Reaction time interference was strongly correlated with distractor salience relative to the target. Moreover, even distractors less salient than the target captured attention, as measured by reaction time interference and oculomotor capture. In the modeling part, we simulated first selection in the distractor paradigm using behavioral measures of salience and considering the time course of selection including noise. We were able to replicate the result pattern we obtained in the empirical part. We conclude that each salience value follows a specific selection time distribution and attentional capture occurs when the selection time distributions of target and distractor overlap. Hence, selection is stochastic in nature and attentional capture occurs with a certain probability depending on relative salience.