Cross-modal retrieval is an important research area with a wide range of applications, and several algorithms have been proposed to address it. We feel that it is the right time to take a step back and analyze the current state of research in this area. Because new object classes are continuously discovered over time, algorithms must generalize to data from previously unseen classes. Towards that goal, our first contribution is to establish protocols for generalized zero-shot cross-modal retrieval and to analyze the generalization ability of standard cross-modal algorithms. Second, we propose a semantic-aware ranking algorithm that can be used as an add-on to any existing cross-modal approach to improve its performance on both seen and unseen classes. Finally, we propose a modification of the standard evaluation metrics (MAP for single-label data and NDCG for multi-label data), which we feel gives a more intuitive measure of cross-modal retrieval performance. Extensive experiments on two single-label and three multi-label cross-modal datasets show the effectiveness of the proposed approach.
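
For reference, the two standard metrics named above can be sketched as follows. This is a minimal illustration of the unmodified textbook definitions, not of the modification proposed in this work, which the abstract does not specify. The function names are hypothetical. The sketch assumes that each ranked list covers the full retrieval gallery, that single-label relevance is binary (same class or not), and that NDCG uses a linear gain; using the number of labels shared between query and retrieved item as the multi-label gain is one common convention and is likewise an assumption here.

```python
import math


def average_precision(relevance):
    """AP for one query. `relevance` is the ranked list of 0/1 flags
    (1 = retrieved item shares the query's class)."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: the list is the full ranking, so `hits` equals the
    # total number of relevant items for this query.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(relevance_lists):
    """MAP: mean of per-query AP values."""
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)


def dcg(gains):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """NDCG for one query. `gains` is the ranked list of graded
    relevance values (assumed: number of labels shared with the query)."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    print(average_precision([1, 0, 1, 0]))
    print(ndcg([1, 2, 0]))
```

Both functions score a single ranked list; a dataset-level score is obtained by averaging over all queries, as `mean_average_precision` does for AP.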