The (mnemonic) interactions between auditory, visual, and the semantic systems have been investigated using structurally complex auditory stimuli (i.e., product sounds). Six types of product sounds (air, alarm, cyclic, impact, liquid, mechanical) that vary in spectral-temporal structure were presented in four label type conditions: self-generated text, text, image, and pictogram. A memory paradigm that incorporated free recall, recognition, and matching tasks was employed. The results for the sound type suggest that the amount of spectral-temporal structure in a sound can be indicative for memory performance. Findings related to label type suggest that 'self' creates a strong bias for the retrieval and the recognition of sounds that were self-labeled; the density and the complexity of the visual information (i.e., pictograms) hinders the memory performance ('visual' overshadowing effect); and image labeling has an additive effect on the recall and matching tasks (dual coding). Thus, the findings suggest that the memory performances for product sounds are task-dependent.