Unsupervised algorithms which do not make use of labels are commonly found in computer vision and are widely applicable to all problem settings.In the presence of expert-labeled ground truth information, however, these algorithms are not optimal. Altering the unsupervised models to include labelsis not always a straight forward modification. In this dissertation, we explore various ways to incorporate human supervision.We first start with the task of visual sequence recognition and demonstrate ways to effectively make use of temporal information. Next, we tackle the problem of scene segmentation anddevise a novel framework to discriminatively train a generative hierarchical model with nonparametric Bayesian priors; the methodology can be easily appliedto other nonparametric Bayesian models. Finally, we approach the difficult problem of object segmentation and describe how shape priors can be infused intoa generative Bayesian segmentation model. We demonstrate the effectiveness of our models and algorithms on datasets which are widely used by the research community and universallyregarded as difficult. The dissertation concludes with active venues for future research.