The aim of this systematic review was to assess the accuracy and reliability of automatic landmarking for cephalometric analysis of three-dimensional craniofacial images. We searched MEDLINE, Embase and Web of Science up to March 2019 for studies that reported results of automatic landmarking and/or measurements on human head computed tomography or cone beam computed tomography scans. Two authors independently screened articles for eligibility. Risk of bias and applicability concerns for each included study were assessed using the QUADAS-2 tool. Eleven studies, with test dataset sample sizes ranging from 18 to 77 images, were included. They used knowledge-, atlas- or learning-based algorithms to landmark 2 to 33 points of cephalometric interest. Ten studies measured mean localization errors between manually and automatically detected landmarks. Depending on the study and the landmark, mean errors ranged from <0.50 mm to >5 mm. The two best-performing algorithms used a deep learning method and reported mean errors <2 mm for every landmark, approximating the operator variability observed in manual landmarking. Risks of bias regarding patient selection and implementation of the reference standard were found; the studies might therefore have yielded overoptimistic results. The robustness of these algorithms needs to be tested more thoroughly in challenging clinical settings. PROSPERO registration number: CRD42019119637. Copyright © 2020 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
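As an illustration of the outcome measure summarized above, the following is a minimal sketch of how a mean localization error per landmark could be computed, assuming it is defined as the Euclidean distance between manually and automatically placed 3D coordinates, averaged over test images. The function name `mean_localization_error`, the array layout and the use of millimetres are assumptions made for this example; the included studies may have defined or computed the metric differently.

```python
import numpy as np


def mean_localization_error(manual, automatic):
    """Per-landmark mean Euclidean distance between two sets of 3D landmarks.

    Hypothetical helper for illustration only.
    Both inputs are assumed to have shape (n_images, n_landmarks, 3),
    with coordinates expressed in millimetres.
    """
    manual = np.asarray(manual, dtype=float)
    automatic = np.asarray(automatic, dtype=float)
    if manual.shape != automatic.shape or manual.ndim != 3 or manual.shape[-1] != 3:
        raise ValueError("expected two arrays of shape (n_images, n_landmarks, 3)")

    # Distance between paired landmarks: shape (n_images, n_landmarks)
    distances = np.linalg.norm(manual - automatic, axis=-1)

    # Average over images to obtain one mean error per landmark
    return distances.mean(axis=0)


# Toy data (not taken from any study): 2 images, 2 landmarks
manual = np.zeros((2, 2, 3))
automatic = np.array([
    [[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0], [2.0, 0.0, 0.0]],
])
errors_mm = mean_localization_error(manual, automatic)
```

Comparing each entry of `errors_mm` against a threshold such as 2 mm would then indicate which landmarks fall within the range the abstract reports for the best-performing algorithms.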