Abstract

The in vitro reassembly of tobacco mosaic virus (TMV) begins with the specific recognition by the viral coat protein disk aggregate of an internal TMV RNA sequence, known as the assembly origin (Oa). This RNA sequence contains a putative stem-loop structure (loop 1), believed to be the target for disk binding in assembly initiation, which has the characteristic sequence AAGAAGUCG exposed as a single strand at its apex. We show that a 75-base RNA sequence encompassing loop 1 is sufficient to direct the encapsidation by TMV coat protein disks of a heterologous RNA fragment. This RNA sequence and its structure, which are sufficient to elicit TMV assembly in vitro, were explored by site-directed mutagenesis. Structure analysis of the RNA identified mutations that appear to affect assembly via a perturbation in RNA structure, rather than by a direct effect on coat protein binding. The binding of the loop 1 apex RNA sequence to coat protein disks was shown to be due primarily to its regularly repeated G residues. Sequences such as (UUG)3 and (GUG)3 are equally effective at initiating assembly, indicating that the other bases are less functionally constrained. However, substitution of the sequences (CCG)3, (CUG)3 or (UCG)3 reduced the assembly initiation rate, indicating that C residues are unfavourable for assembly. Two additional RNA sequences within the 75-base Oa sequence, both of the form (NNG)3, may play subsidiary roles in disk binding. RNA structure plays an important part in permitting selective protein-RNA recognition, since altering the RNA folding close to the apex of the loop 1 stem reduces the rate of disk binding, as does shortening the stem itself. Whereas the RNA sequence making up the hairpin does not in general affect the specificity of the protein-RNA interaction, it is required to present the apex signal sequence in a special conformation. Mechanisms for this are discussed.