Abstract

Glycans are biologically important structures synthesised by glycosyltransferase (GT) enzymes. Disruptive genetic null variants in GT genes can lead to serious illness, but benign phenotypes are also seen, including antigenic differences on the red blood cell (RBC) surface that give rise to blood groups. To characterise known and potential carbohydrate blood group antigens without a known underlying gene, we searched public databases for human GT loci and investigated their variation in the 1000 Genomes Project (1000 G). We found 244 GT genes, distributed over 44 families. All but four GT genes had missense variants or other variants predicted to alter the amino acid sequence, and 149 GT genes (61%) had variants expected to cause null alleles, which are often associated with antigen-negative blood group phenotypes. In RNA-Seq data generated from erythroid cells, 155 GT genes were expressed at a transcript level comparable to, or higher than, that of known carbohydrate blood group loci. After filtering for GT genes predicted to cause a benign phenotype, 30 genes remained, 16 of which had variants in 1000 G expected to result in null alleles. Our results identify potential blood group loci and could serve as a basis for characterising the genetic background underlying carbohydrate RBC antigens.