Abstract
The genes for long noncoding RNAs (lncRNAs) are widespread in mammals, but their functions remain largely unknown. One way to study lncRNAs is to compare the characteristics of lncRNA genes in detail with those of protein-coding genes, whose functions are extensively documented. A feature of protein-coding genes in mammals is the high evolutionary conservation of their primary exon sequences. Although the primary exon sequences of lncRNA genes are less conserved than those of protein-coding genes, they are nevertheless significantly more conserved than the intron sequences of lncRNA genes. We assessed conservation using multiple alignments of four mammalian genomes: human, chimpanzee, mouse, and rat. The conservation rate was estimated for individual gene segments, such as exons, introns, and promoter segments (300 bp upstream of the transcription start site). A study of the relationship between the lncRNA gene conservation rate and the presence or absence of CpG islands (CGIs) found greater conservation of lncRNA genes located next to CGIs. This trend may explain the previously identified association between conservation and lncRNA expression level. A separate task was to annotate the types of lncRNA within the samples (sense, antisense, and intergenic lncRNAs, and pseudogenes). Sense lncRNAs, which reside preferentially in coding loci, have the highest proportion of promoter CGIs. Pseudogenes and antisense lncRNAs (AS-lncRNAs) rank second and third, respectively, in the occurrence of CGIs, and intergenic lncRNAs are the least CGI-enriched. This indicates that CGIs are more characteristic of the promoters of coding genes than of noncoding ones. The overall conservation rate of promoter CGIs across all lncRNA classes was estimated at 45%. The study highlights the presence of gene-specific signals in noncoding RNAs.
To our knowledge, this is the first study to extend the spectrum of (coding) gene-specific signals with an analysis of promoter CGIs.
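
As an illustration only, the promoter definition used above (300 bp upstream of the transcription start site) and a simple CpG content check can be sketched in Python. The abstract does not state which CGI criteria were applied. The thresholds below (length of at least 200 bp, GC fraction above 0.5, observed/expected CpG ratio above 0.6) are the commonly cited Gardiner-Garden and Frommer criteria and are an assumption here, as are the function names and the 0-based coordinate convention.

```python
def promoter_window(tss, strand, length=300):
    """Return the 0-based, half-open (start, end) interval covering
    `length` bp upstream of the TSS. `tss` is the 0-based position of
    the first transcribed base (hypothetical convention for this sketch)."""
    if strand == "+":
        # Upstream lies at lower coordinates; clip at the chromosome start.
        return max(0, tss - length), tss
    if strand == "-":
        # On the minus strand, upstream lies at higher genome coordinates.
        return tss + 1, tss + 1 + length
    raise ValueError("strand must be '+' or '-'")


def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected CpG = (#CpG * N) / (#C * #G).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cgi(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Assumed Gardiner-Garden and Frommer style test, not the paper's stated method."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

For example, `promoter_window(1000, "+")` covers positions 700 to 999, and a 300 bp promoter sequence extracted for that window could then be passed to `looks_like_cgi`. A real analysis would more likely scan with a sliding window or rely on an existing CGI annotation than test the whole promoter at once.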