Sequencing techniques for single- and double-stranded DNA were used to determine the nucleotide sequence of the gene encoding P2, the major outer membrane (porin) protein of Haemophilus influenzae type b (Hib). The open reading frame encoding the P2 protein comprised 361 amino acid codons. Comparison of the inferred amino acid sequence with the N-terminal amino acid sequence of the mature (fully processed) P2 protein revealed that the protein is synthesized with a signal peptide of 20 amino acids. N-terminal amino acid sequencing of tryptic peptides derived from purified P2 allowed direct identification of 158 of the 341 amino acids in the fully processed P2 protein; these peptide sequences agreed completely with the sequence inferred from the nucleotide sequence. The amino acid sequence of the Hib P2 protein had 23 to 25% homology with the sequence of the OmpF porin of Escherichia coli and with that of the Neisseria gonorrhoeae porin P.IA. Codon usage in the Hib P2 gene differed significantly from that observed for a gene encoding a porin of E. coli. DNA hybridization studies indicated that the Hib chromosome contains a single copy of the P2 gene. The availability of the nucleotide and amino acid sequences for the Hib P2 protein will facilitate investigation of the antigenic characteristics and structure-function relationships of this porin.
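The residue counts reported above can be cross-checked with a short calculation. The Python sketch below is illustrative only: the variable and function names are assumptions made for this example, and the three input numbers are taken directly from the text.

```python
# Figures quoted in the text; names are illustrative, not from the source.
ORF_CODONS = 361                   # amino acid codons in the open reading frame
SIGNAL_PEPTIDE_RESIDUES = 20       # residues removed during processing
DIRECTLY_SEQUENCED_RESIDUES = 158  # residues identified by peptide sequencing


def mature_length(orf_codons: int, signal_residues: int) -> int:
    """Length of the processed protein after signal peptide removal."""
    return orf_codons - signal_residues


def coverage_percent(sequenced: int, total: int) -> float:
    """Percentage of the protein confirmed by direct sequencing."""
    return 100.0 * sequenced / total


mature = mature_length(ORF_CODONS, SIGNAL_PEPTIDE_RESIDUES)
coverage = coverage_percent(DIRECTLY_SEQUENCED_RESIDUES, mature)

print(f"Mature P2 length: {mature} residues")
print(f"Directly sequenced: {coverage:.1f}% of the mature protein")
```

The calculation gives a mature protein of 341 residues, matching the figure stated in the text, and shows that slightly under half of the processed protein was confirmed by direct peptide sequencing.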