Abstract

Heterogeneous nuclear ribonucleoprotein (hnRNP) core protein A1 is a major component of mammalian hnRNP 40 S particles. We describe the structure of an active A1 gene and report on the partial characterization of the A1 gene family. About 30 A1-specific sequences are present per haploid human genome; 15 such sequences were isolated from a human genomic DNA library. Many corresponded to pseudogenes of the processed type, but by selecting for actively transcribed regions we isolated an active A1 gene. The gene spans a region of 4.6 × 10³ base-pairs and is split into ten exons that encode the 320 amino acid residues of the protein. The amino acid sequence derived from the exon sequences is identical with that deduced from cDNA and reported for the protein. One intron exactly separates the two structural domains that constitute the protein. Each of the two RNA-binding domains in protein A1 is encoded by one exon. Experimental evidence indicates that the A1 gene can encode more than one protein by alternative splicing. The gene is preceded by a strong promoter that contains at least two CCAAT boxes and two possible Sp1 binding sites, but it lacks a TATA box.