The aim of this study was to explore the inter- and intra-observer reliability of the Bath Ankylosing Spondylitis Metrology Index (BASMI) across raters from different clinical centres using a consensus-based, standardised approach to assessment. One hundred and thirty BASMI assessments were completed on the same day using a partially balanced incomplete block design. Thirteen physiotherapists from 10 hospitals assessed 26 participants (19 patients, 7 healthy volunteers). Each physiotherapist assessed six participants and, to allow intra-observer reliability to be estimated, repeated the assessment on four of them. Overall, the mean (standard deviation; SD) BASMI total score was 3.11 (2.04). The constituent components of the SD were 0.37 ('residual' inconsistency, i.e. between observers), 0.34 (between replicates), at least 0.06 (between observer means) and 2.03 (between participants). This suggests that the repeatability of BASMI assessments is 0.95 if the same observer is used and 1.05 if different observers are used. Inter-observer residual SDs for the individual component scores were largest for the modified Schober measurement and lumbar side flexion; intra-observer SDs showed a similar pattern, although they were smaller for tragus to wall and lumbar side flexion. We found excellent inter-observer and intra-observer reliability, with most of the variability in BASMI scores lying between participants. However, differences in BASMI of 1.0 or less are within the bounds of error, both for repeat assessments of the same participant by the same physiotherapist and for assessments performed by different physiotherapists. Only changes above these limits can be confidently interpreted as true clinical changes.
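The link between the SD components and the repeatability figures quoted above can be illustrated with a short calculation. The abstract does not state the formula used, so the sketch below assumes the conventional repeatability coefficient, 1.96 × √2 × SD, and further assumes that the different-observer case pools the 'residual' between-observer SD with the between-observer-means SD by summing variances. The function and constant names are hypothetical. Under these assumptions the results come out close to, but not identical with, the reported 0.95 and 1.05, which would be consistent with the authors working from unrounded components or a slightly different pooling.

```python
import math

# SD components as quoted in the abstract (BASMI units).
SD_BETWEEN_REPLICATES = 0.34  # same observer, repeat assessment
SD_RESIDUAL_OBSERVER = 0.37   # 'residual' between-observer inconsistency
SD_OBSERVER_MEANS = 0.06      # quoted as a lower bound ("at least 0.06")


def repeatability_coefficient(*sds, z=1.96):
    """Assumed formula (not stated in the abstract): z * sqrt(2) * pooled SD.

    The pooled SD is the square root of the summed variances of the
    components passed in.
    """
    pooled_sd = math.sqrt(sum(sd ** 2 for sd in sds))
    return z * math.sqrt(2) * pooled_sd


# Same observer: only the between-replicates component is assumed to contribute.
same_observer = repeatability_coefficient(SD_BETWEEN_REPLICATES)

# Different observers: assumed pooling of the two between-observer components.
different_observers = repeatability_coefficient(
    SD_RESIDUAL_OBSERVER, SD_OBSERVER_MEANS
)

print(f"same observer:       {same_observer:.2f}")
print(f"different observers: {different_observers:.2f}")
```

Both values round to about 1.0, in line with the threshold of 1.0 BASMI units given above for interpreting a difference as a true clinical change.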