Abstract. This paper presents a novel approach for automatically extracting salient objects from video, based on a visual attention mechanism and a seeded object growing technique. First, a dynamic visual attention model is constructed that captures object motion through global motion estimation and compensation. Combining it with a static attention model yields a saliency map. Then, using a modified inhibition of return (MIOR) strategy, a winner-take-all (WTA) neural network scans the saliency map for the most salient locations, which are selected as attention seeds. Finally, the particle swarm optimization (PSO) algorithm is employed to grow the attention objects, modeled by a Markov random field (MRF), from the seeds. Experiments verify that the proposed approach can efficiently extract both stationary and moving salient objects.
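
To make the seed selection step concrete, the following is a minimal sketch of winner-take-all scanning with a plain inhibition of return rule. It is not the MIOR strategy of this paper, whose details are not given in the abstract. The function name `select_attention_seeds`, the parameters `num_seeds` and `inhibit_radius`, and the disk-shaped suppression region are all assumptions made for illustration. The WTA network is approximated by a repeated arg-max over the saliency map.

```python
def select_attention_seeds(saliency, num_seeds=3, inhibit_radius=1):
    """Pick seed locations by repeated arg-max with inhibition of return.

    `saliency` is a list of equal-length rows of non-negative floats.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    # Work on a copy so the caller's saliency map is left unchanged.
    work = [row[:] for row in saliency]
    height, width = len(work), len(work[0])
    seeds = []

    for _ in range(num_seeds):
        # Winner-take-all step: the most salient remaining location wins.
        # Ties are broken in favour of the larger (row, col) index.
        best_value, best_row, best_col = max(
            (work[r][c], r, c) for r in range(height) for c in range(width)
        )
        if best_value <= 0:
            break  # nothing salient is left to attend to

        seeds.append((best_row, best_col))

        # Inhibition of return: suppress a disk around the winner so the
        # next scan moves on to a different location.
        for r in range(max(0, best_row - inhibit_radius),
                       min(height, best_row + inhibit_radius + 1)):
            for c in range(max(0, best_col - inhibit_radius),
                           min(width, best_col + inhibit_radius + 1)):
                if (r - best_row) ** 2 + (c - best_col) ** 2 <= inhibit_radius ** 2:
                    work[r][c] = 0.0

    return seeds


# Toy 4x4 saliency map with two well-separated peaks.
toy_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.1, 0.6],
    [0.0, 0.0, 0.5, 0.7],
]
seeds = select_attention_seeds(toy_map, num_seeds=2, inhibit_radius=1)
print(seeds)  # [(1, 1), (3, 3)]
```

In this sketch the suppression radius is fixed. A practical system would likely tie it to the expected object size, and the returned seeds would then be handed to the object growing stage.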