Abstract

When do multiple representations of information in second-language learning help, and when do they hinder learning? English-speaking college students (N = 152) enrolled in a second-year German course read a 762-word German story presented by a multimedia computer program. Students received no annotations, verbal annotations, visual annotations, or both verbal and visual annotations for 35 key words in the story. Recall of word translations was worse for low-verbal and low-spatial ability students than for high-verbal and high-spatial ability students, respectively, when they received visual annotations for vocabulary words, but did not differ when they received verbal annotations. Text comprehension was worst for all learners when they received visual annotations. Results are consistent with a generative theory of multimedia learning and with cognitive load theory, both of which assume that multimedia learning processes are executed under the constraints of limited working memory.