Given the pervasive reach of cell phones and the potential of mobile learning for language teaching and learning, employing cell phones in language learning seems indispensable. Exploiting the inherent capabilities of such devices, this study investigated the efficacy of multimodal representation of L2 vocabulary items for 158 pre-intermediate L2 learners aged 18-23. Since short-term memory plays an important role in vocabulary learning, the learners were placed into four short-term memory (STM) ability groups on the basis of visual and verbal STM tests. Cell phone-based vocabulary presentations with different types of annotation, i.e. pictorial vs. written, were adapted to the cell phone screen and delivered to the learners' cell phones via Bluetooth. Finally, the participants took English vocabulary recognition and recall tests. Statistical analysis of the results showed that, for learners with high-visual and high-verbal abilities, presenting learning materials with pictorial or written annotations resulted in better learning than presenting them without annotations. In addition, learners with high-visual ability learned better from materials with pictorial annotation, and learners with high-verbal ability learned better from materials with written annotation. The low-visual and low-verbal ability groups showed better results under the no-annotation condition. The findings can provide an appropriate model for designing learning materials for L2 learners.