Structure and genetics of the partially duplicated gene RP located immediately upstream of the complement C4A and the C4B genes in the HLA class III region. Molecular cloning, exon-intron structure, composite retroposon, and breakpoint of gene duplication
The correlation of many HLA-associated autoimmune and genetic diseases with the polymorphic complement C4 genes may be attributed to the presence of disease susceptibility genes in close proximity to C4. We have cloned and characterized a pair of partially duplicated genes, RP1 and RP2, located 611 base pairs upstream of the human C4A and C4B genes, respectively. The putative RP protein, consisting of 364 amino acid residues, is basic and highly hydrophilic. A bipartite nuclear localization signal is present at residues 114-131, so RP may be a nuclear protein. Northern blot analysis suggested that RP is ubiquitously expressed. The 5' region of the RP1 gene is CpG rich, a characteristic of housekeeping genes. The RP1 gene contains nine exons. Located in the fourth intron are a cluster of Alu elements and a newly defined composite retroposon, SVA, which consists of a SINE, multiple copies of GC-rich VNTRs, and an Alu element, all enclosed by direct terminal repeats. Members of SVA are also present in the complement C2 gene, located about 20 kilobases upstream of RP1 in the HLA, and in the cytochrome CYP1A1 gene. Determination of the DNA sequences for RP2 from two different HLA haplotypes revealed identical hybrid sequences, which resulted from fusion of RP with the tenascin-like Gene X and truncation of the 5' regions of both genes. Cumulative data suggest that the four tandemly arranged genes RP, complement C4, steroid 21-hydroxylase (CYP21), and Gene X together form a modular structure, RCCX. The number of RCCX modules varies from one to three or more in the population. Absence of the truncated genes RP2 and Gene XA has been detected in genomes with single RCCX modules. Duplication of the RCCX modules probably occurred before the speciation of great apes and humans, because both contain the same breakpoint region of the RP and Gene X gene duplication.