Abstract

The current gold standard for the clinical assessment of swallowing is visual inspection of videofluoroscopic frames. Specific clinical measurements are estimated from anatomical and bolus position information with respect to time (or frame number). However, because visual inspection is subjective, clinicians face intra- and inter-observer repeatability issues and bias when making these estimates. Correctly demarcated reference lines highlighting the positions of important anatomical landmarks would serve as a visual aid and could also be used in conjunction with bolus detection methods to determine these measurements objectively. In this paper, we introduce a 16-point Active Shape Model, applied as a deformable template to demarcate the boundaries of salient anatomical structures with minimal user input, and we test its reliability. A robust end and corner point detection algorithm provides the image information used to suggest movements of the template during the fitting stage. Results show that the model deformation constraints calculated from a training set of images are clinically coherent. Fitting accuracy was measured as the Euclidean distance between each fitted model point and its corresponding target point. Test images were taken from two data sets of frames acquired using two different videofluoroscopy units. Overall, fitting was more reliable on the vertebrae and the inferior points of the larynx than on the superior laryngeal points and the hyoid bone; the model always fitted the C7 vertebra with discrepancies of no more than 23 pixels (3.2% of the image width, approximately 7.6 mm).
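The evaluation metric described above, the Euclidean distance between each fitted model point and its target point, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `point_errors` and `summarize_error` are hypothetical, and the image width (720 pixels) and pixel spacing (0.33 mm per pixel) are assumptions back-calculated from the reported figures of 23 pixels, 3.2% of the image width and approximately 7.6 mm.

```python
import numpy as np


def point_errors(fitted, target):
    """Per-landmark Euclidean distance (in pixels) between two (N, 2) point sets."""
    fitted = np.asarray(fitted, dtype=float)
    target = np.asarray(target, dtype=float)
    if fitted.shape != target.shape or fitted.ndim != 2 or fitted.shape[1] != 2:
        raise ValueError("expected two arrays of shape (N, 2)")
    # Row-wise norm of the coordinate differences gives one distance per landmark.
    return np.linalg.norm(fitted - target, axis=1)


def summarize_error(pixels, image_width_px, mm_per_px):
    """Express a pixel distance as a percentage of image width and in millimetres."""
    return {
        "pixels": float(pixels),
        "percent_of_width": 100.0 * pixels / image_width_px,
        "mm": pixels * mm_per_px,
    }


# Toy example: the second landmark is displaced by (3, 4) pixels from its target.
errors = point_errors([[10, 10], [13, 14]], [[10, 10], [10, 10]])

# ASSUMED values, inferred from the reported 23 px = 3.2% of width = ~7.6 mm.
ASSUMED_WIDTH_PX = 720
ASSUMED_MM_PER_PX = 0.33
worst_case = summarize_error(23, ASSUMED_WIDTH_PX, ASSUMED_MM_PER_PX)
```

Under these assumed values, the 23-pixel worst case for C7 corresponds to roughly 3.2% of the image width and about 7.6 mm, consistent with the figures quoted in the abstract.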