The nucleotide sequence of the gene (engXCA) encoding the major extracellular endoglucanase (ENGXCA) of the phytopathogenic bacterium Xanthomonas campestris pv. campestris (X. c. campestris) was determined and compared with the N-terminal amino acid (aa) sequence of the purified enzyme. An open reading frame of 1479 bp encoding 493 aa was identified, of which the N-terminal 25 aa represent a potential signal peptide. Determination of the exact position of a Tn5 insertion within engXCA that did not reduce the encoded enzyme activity indicated that the C-terminal region of the protein is not crucial for ENGXCA activity. Comparison of the complete deduced aa sequence with those deduced from other endoglucanase- and exoglucanase-encoding genes revealed a region with a high degree of homology, located towards the C terminus of the protein. These data indicate that the X. c. campestris ENGXCA may have a domain structure similar to that of many other bacterial and fungal cellulolytic enzymes. Hydrophobic cluster analysis was performed on the deduced aa sequence. Comparison of this analysis with those of 30 other cellulase sequences belonging to six different families indicated that the X. c. campestris enzyme can be classified in family A. The two aa residues previously identified as 'potentially catalytic' within this family of cellulases are conserved in the X. c. campestris ENGXCA.
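
The reading-frame figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions not stated in the abstract: that the 1479 bp count excludes the stop codon (1479 / 3 is exactly 493), and that the potential signal peptide is cleaved after residue 25. The function and constant names are hypothetical.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the abstract.
ORF_LENGTH_BP = 1479       # open reading frame length reported in the abstract
SIGNAL_PEPTIDE_AA = 25     # potential N-terminal signal peptide (abstract)
CODON_LENGTH = 3           # nucleotides per codon


def codons_in_orf(orf_length_bp: int) -> int:
    """Return the number of codons in an ORF whose length is a multiple of 3."""
    if orf_length_bp <= 0 or orf_length_bp % CODON_LENGTH != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_bp // CODON_LENGTH


def mature_length(precursor_aa: int, signal_peptide_aa: int) -> int:
    """Return the length left after removing an N-terminal signal peptide."""
    if not 0 <= signal_peptide_aa <= precursor_aa:
        raise ValueError("signal peptide cannot be longer than the precursor")
    return precursor_aa - signal_peptide_aa


# Assumption: the stop codon is not counted in the 1479 bp.
precursor_aa = codons_in_orf(ORF_LENGTH_BP)
# Assumption: cleavage occurs exactly after residue 25.
mature_aa = mature_length(precursor_aa, SIGNAL_PEPTIDE_AA)

print(f"precursor: {precursor_aa} aa, mature (if cleaved): {mature_aa} aa")
```

Under these assumptions the precursor length matches the 493 aa given in the abstract, and the processed enzyme would be 468 aa; the latter figure is a derived estimate, not a value reported in the source.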