Group B Streptococcus (GBS) is the leading cause of neonatal sepsis and meningitis in the United States. The surface-associated C protein alpha antigen of GBS is thought to have a role in both virulence and immunity. We previously cloned the C protein alpha antigen structural gene (named bca for group B, C protein, alpha) into Escherichia coli. Western blots of both the native alpha antigen and the cloned gene product show a regularly laddered pattern of heterogeneous polypeptides. The nucleotide sequence of the bca locus reveals an open reading frame of 3060 nucleotides encoding a precursor protein of 108,705 Da. Cleavage of a putative signal sequence of 41 amino acids yields a mature protein of 104,106 Da. The 20,417-Da N-terminal region of the alpha antigen shows no homology to previously described protein sequences and is followed by a series of nine tandem repeating units that make up 74% of the mature protein. The repeating units are identical; each consists of 82 amino acids (molecular mass 8665 Da) and is encoded by 246 nucleotides. The size of the repeating units corresponds to the observed size differences in the heterogeneous ladder of alpha C proteins expressed by GBS. The C-terminal region of the alpha antigen contains a membrane anchor domain motif that is shared by a number of Gram-positive surface proteins. The large region of identical repeating units in bca defines protective epitopes and may play a role in generating phenotypic and genotypic diversity of the alpha antigen.
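The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the constant and function names are hypothetical, the mass bookkeeping is naive (it adds and subtracts the quoted masses directly, ignoring the water gained or lost at peptide bonds), and the ladder model, in which each rung differs by one whole repeat, is an assumption made only to illustrate the reported step size.

```python
# Figures quoted in the abstract (all names here are illustrative).
ORF_NT = 3060            # open reading frame length, nucleotides
PRECURSOR_DA = 108_705   # precursor protein mass
MATURE_DA = 104_106      # mature protein mass after signal cleavage
N_TERMINAL_DA = 20_417   # N-terminal region mass
REPEAT_COUNT = 9         # number of tandem repeating units
REPEAT_AA = 82           # amino acids per repeat
REPEAT_DA = 8_665        # mass per repeat
REPEAT_NT = 246          # nucleotides per repeat


def repeat_summary():
    """Derive consistency checks from the quoted figures (naive arithmetic)."""
    repeat_region_da = REPEAT_COUNT * REPEAT_DA
    return {
        # Whether this count includes the stop codon is not stated in the text.
        "codons_in_orf": ORF_NT // 3,
        # 3 nucleotides per amino acid should reproduce the quoted 246.
        "nt_per_repeat_from_aa": REPEAT_AA * 3,
        "repeat_region_nt": REPEAT_COUNT * REPEAT_NT,
        "repeat_region_da": repeat_region_da,
        "repeat_fraction_of_mature": repeat_region_da / MATURE_DA,
        # Simple difference of the two quoted masses.
        "precursor_minus_mature_da": PRECURSOR_DA - MATURE_DA,
    }


def ladder_masses():
    """Hypothetical ladder: mature mass minus k whole repeats, k = 0..9.

    This is an assumed model for illustration, not a result from the text.
    """
    return [MATURE_DA - k * REPEAT_DA for k in range(REPEAT_COUNT + 1)]


if __name__ == "__main__":
    for key, value in repeat_summary().items():
        print(f"{key}: {value}")
    print("ladder (Da):", ladder_masses())
```

Under these assumptions the nine repeats total 77,985 Da, roughly 0.749 of the mature protein mass, which agrees with the reported 74% if that figure was truncated rather than rounded. The 82 amino acids per repeat also reproduce the quoted 246 nucleotides.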