Genome-Wide Association Study on the Hematological Phenotypic Characteristics of the Han Population from Northwest China

Introduction

Measurement of hematological components provides useful reference information for many diseases in the clinic. Hematological characteristics mainly involve three cell lineages: red blood cells (RBC), white blood cells (WBC) and platelets (PLT).1 Phenotypic indicators related to these lineages are commonly used as clinical parameters and can be used to monitor immune function. They can also serve as biomarkers for monitoring disease severity.2,3 The way hematological phenotypic indicators deviate from the normal range varies with the type of disease, such as immune diseases, cancer, inflammation, and cardiovascular diseases.3 Studies have shown that hematological characteristics are highly heritable.4,5

Genome-wide association studies (GWASs) have become the main method to study complex diseases and their susceptibility genes because of their ability to encompass single nucleotide polymorphisms (SNPs) for the whole genome.6 This type of study can efficiently find the gene loci associated with the occurrence and development of a disease. GWASs conduct a population-level statistical analysis of genotypes and phenotypes to determine the phenotypic changes associated with gene loci. Therefore, it is highly feasible to identify genetic polymorphisms associated with hematological phenotypic indicators through a GWAS, which will help us understand the genetic structure of hematological characteristics at a deeper level.

To date, GWASs of hematological phenotypic indicators have been performed among populations with diverse genetic backgrounds, including European,3 Korean,7 Caucasian and African American,8,9 and Japanese populations.10,11 However, the genetic structure of the hematological characteristics of the Han population from northwest China has not been described thus far.

Therefore, we performed a GWAS on 20 phenotypic indicators of hematological characteristics in the Han population from northwest China to identify the gene loci associated with hematological phenotypic indicators. Our study will supplement data on genetic variation associated with hematological characteristics, which will help to further explore the genetic structure of these hematological characteristics. Our study will provide a valuable reference for the clinical monitoring of human diseases.

Materials and Methods

Study Subjects and DNA Extraction

The study group consisted of 1005 participants (494 men and 511 women) recruited from the health examination center of the Affiliated Hospital of Xizang Minzu University. The inclusion criteria were as follows: healthy, with no family history of disease, no medication use for at least two weeks, and not pregnant. We orally informed each participant of the purpose of the study, and all participants signed informed consent forms covering the background, purpose, methods, significance and privacy policy of the study. After written informed consent was obtained from all participants, whole blood was collected. The study was conducted under the standard approved by the ethics committee of the Affiliated Hospital of Xizang Minzu University. We extracted genomic DNA according to the kit instructions (GoldMag, Xi’an). Subsequently, a GWAS was performed on 20 hematological phenotypic indicators: white blood cell count (WBC), percent lymphocytes (LYMPH%), percent mononuclear cells (MONO%), percent neutrophils (NEUT%), percent eosinophils (EO%), percent basophils (BASO%), absolute monocyte count (MONO), absolute neutrophil count (NEUT), absolute eosinophil count (EO), absolute basophil count (BASO), platelet count (PLT count), platelet distribution width (PDW), mean platelet volume (MPV), red blood cell count (RBC count), hemoglobin (HGB), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), percent red cell distribution width (RDW%), and red blood cell distribution width (RDW).

Genotyping and Quality Control

A Thermo Scientific Genotyping Chip (Applied Biosystems, Axiom Precision Medicine Diversity Array, PMDA) was used. We used a GeneTitan multichannel instrument and Axiom Analysis Suite 6.0 software for genotyping. Across our samples, we performed a genome-wide scan with the Axiom platform and detected a total of 874,190 loci. Indels, copy number variants, sex-chromosome loci and duplicate sites were excluded, and quality control was then applied to the remaining sites (sample call rate > 0.95, marker call rate > 0.90, and HWE p value > 5×10−6). Ultimately, 796,288 loci remained before imputation.
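The marker-level filters described above can be illustrated with a minimal Python sketch. It is not the Axiom Analysis Suite procedure: the function names are hypothetical, genotypes are assumed to be coded as 0/1/2 copies of one allele with `None` for a missing call, and a 1-df chi-square test stands in for whatever HWE test the software applies (the text does not say which one was used).

```python
import math

def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    return sum(c is not None for c in calls) / len(calls)

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic site: no deviation can be tested
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_marker_qc(calls, min_call_rate=0.90, min_hwe_p=5e-6):
    """Apply the marker call rate and HWE thresholds quoted in the text."""
    if call_rate(calls) <= min_call_rate:
        return False
    observed = [c for c in calls if c is not None]
    counts = [observed.count(g) for g in (0, 1, 2)]
    return hwe_chi2_p(*counts) > min_hwe_p
```

A marker with genotype counts in exact Hardy-Weinberg proportions (for example 25/50/25) passes, whereas a marker with no heterozygotes at all among 100 samples is rejected. For small genotype counts an exact test is usually preferred over the chi-square approximation used here.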

Imputation and Quality Control

We used IMPUTE2 software with the haplotypes of the 1000 Genomes Project (phase 3) as the reference panel for imputation. After imputation, sites that met the following conditions were retained: sample call rate > 95%, marker call rate > 90%, HWE-control > 5 × 10−6, and allele = 2. Ultimately, a total of 9,336,679 SNVs were used for the subsequent analysis.

Statistical Analysis

An automatic Triad hematology analyzer (Mindray; BC-2800) was used to measure the levels of 20 hematological phenotypic indicators. The levels of all hematological phenotypic indicators were expressed as the mean ± SD, and relevant statistical analysis was conducted using SPSS 22.0 software (SPSS Inc., Chicago, IL, USA). We used the single-locus mixed model algorithm implemented in SNP & Variation Suite v 8.7 (Golden Helix Inc., Bozeman, MT) to perform the genome-wide association study.12 Following the SNP & Variation Suite manual (https://doc.goldenhelix.com/SVS/latest/svs_index.html), we used the mixed linear additive genetic model and added the IBD matrix to the model to detect SNVs associated with the phenotypic indicators. To remove the influence of confounding factors, all results were adjusted for age and sex. Manhattan plots and quantile–quantile (QQ) plots were drawn for each indicator. In this study, a p value < 5E-8 was considered to indicate a genome-wide significant association between an SNV and a phenotypic indicator. A p value < 5E-6 was considered suggestive of a genome-wide association between the SNV and the phenotypic indicator.13
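The two significance thresholds can be made concrete with a short Python sketch. The tier labels and function names are hypothetical and are used here only for illustration; the thresholds themselves are the ones stated in the text.

```python
import math

GENOME_WIDE = 5e-8  # genome-wide significance threshold stated in the text
SUGGESTIVE = 5e-6   # suggestive threshold stated in the text

def classify(p_value):
    """Assign an association p value to one of three illustrative tiers."""
    if p_value < GENOME_WIDE:
        return "genome-wide"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"

def manhattan_height(p_value):
    """Height of a point on a Manhattan plot, i.e. -log10(p)."""
    return -math.log10(p_value)
```

Under this scheme a p value of 4.21E-08 falls in the genome-wide tier, and the suggestive cutoff of 5E-6 corresponds to a height of about 5.30 on the −log10 scale of a Manhattan plot.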

Verification of Replication

After the completion of the GWAS, SNVs with p value < 5×10−7, which showed a suggestive genome-wide association with phenotypic indicators, were selected as top SNVs for replication testing. We recruited 2047 participants at the Affiliated Hospital of Xizang Minzu University for replication verification. Extreme measurement values of each phenotypic indicator (more than 3 SD from the mean) were excluded, and the remaining values were normalized by rank-based inverse normal transformation (INT). The association analysis between the top SNVs and phenotypic indicators was then performed to select SNVs that were still significantly associated with phenotypic indicators in replication testing. Finally, SPSS software was used to draw box plots of the distribution of phenotypic indicators under different genotypes of significant gene loci. In the replication test, a p value < 0.05 was considered significant.
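The phenotype preprocessing for replication can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the sample standard deviation is used for the 3 SD rule, tied values receive their average rank, and the Blom offset (c = 3/8) is assumed because the text does not specify which offset was applied.

```python
from statistics import NormalDist, mean, stdev

def drop_outliers(values, n_sd=3.0):
    """Keep only values within n_sd sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * s]

def rank_based_int(values, c=3.0 / 8.0):
    """Rank-based inverse normal transformation (Blom offset assumed)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # Map each rank to a quantile of the standard normal distribution.
    return [nd.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]
```

The transformation depends only on the ranks of the measurements, so the transformed phenotype is approximately standard normal regardless of the original scale or skewness.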

Results

A total of 1005 Chinese Han people were recruited as the research subjects. The basic characteristics of the research subjects, the ratio of males to females and the average levels of various phenotypic indicators are summarized in Table 1.

Table 1 Sample Information of Different Hematological Phenotypic Indicators

GWAS Appraisal results

The GWAS results showed that a total of 90 gene loci were associated with the level of hematological phenotypic indicators investigated in this study (Tables 2–5). The association between 89 SNVs and hematological phenotypic indicator levels reached suggestive genome-wide significance (p value < 5E-06). Only the association between CCDC157-rs35289401 and the level of RDW was significant genome-wide (p value = 4.21E-08).

Table 2 Genetic Loci Significantly Associated with 10 Phenotypic Indicators Related to White Blood Cell Identified by the GWAS After Imputation Analysis

Table 3 Genetic Loci Significantly Associated with 3 Phenotypic Indicators Related to Platelets Identified by the GWAS After Imputation Analysis

Table 4 Genetic Loci Significantly Associated with 4 Phenotypic Indicators Related to Red Blood Cells Identified by the GWAS After Imputation Analysis

Table 5 Genetic Loci Significantly Associated with 3 Phenotypic Indicators Related to Hemoglobin Identified by the GWAS After Imputation Analysis

In this study, hematological characteristics were divided into four categories for analysis: WBC, PLT, RBC and hemoglobin. We constructed Manhattan plots (Figures 1–4), in which the red line represents the suggestive cutoff value for genome-wide significance (5E-06). Quantile–quantile (QQ) plots are shown in Figures 5–8; the x-coordinate represents the expected p value, and the y-coordinate represents the actual p value. The QQ plots showed that the distribution of p values for the association tests had no systematic bias.
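The construction of a QQ plot, and one common numerical check for systematic bias, can be sketched in Python. The function names are hypothetical; the sketch assumes both axes are on the −log10 scale, and the genomic inflation factor is a standard diagnostic that the text does not report.

```python
import math
from statistics import NormalDist, median

def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs for a QQ plot."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Under the null, the i-th smallest p value is expected near (i - 0.5) / n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))

def genomic_inflation(p_values):
    """Median observed 1-df chi-square statistic over its null median."""
    nd = NormalDist()
    # A two-sided p value maps to a 1-df chi-square via the squared z score.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    null_median = nd.inv_cdf(0.25) ** 2  # about 0.455
    return median(chi2) / null_median
```

When the p values follow the null distribution, the points lie along the diagonal and the inflation factor is close to 1; values well above 1 would point to confounding such as population stratification.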

Figure 1 Manhattan plot of the results of the genome-wide association study (10 phenotypic indicators related to white blood cells). The phenotypic indicators from top to bottom in the figure are as follows: (A) WBC, (B) LYMPH%, (C) MONO%, (D) NEUT%, (E) EO%, (F) BASO%, (G) MONO, (H) NEUT, (I) EO, and (J) BASO. The x-axis represents chromosomes, whereas the y-axis represents the −log10 of the p value. The red line represents the suggested cutoff value for genome-wide significance (5.0×10−6).

Figure 2 Manhattan plot of the results of the genome-wide association study (3 phenotypic indicators related to platelets). The phenotypic indicators from top to bottom in the figure are as follows: (A) PLT, (B) PDW, and (C) MPV. The x-axis represents chromosomes, whereas the y-axis represents the −log10 of the p value. The red line represents the suggested cutoff value for genome-wide significance (5.0×10−6).

Figure 3 Manhattan plot of the results of the genome-wide association study (4 phenotypic indicators related to red blood cells). The phenotypic indicators from top to bottom in the figure are as follows: (A) RBC, (B) RDW%, (C) RDW, and (D) MCV. The x-axis represents chromosomes, whereas the y-axis represents the −log10 of the p value. The red line represents the suggested cutoff value for genome-wide significance (5.0×10−6).

Figure 4 Manhattan plot of the results of the genome-wide association study (3 phenotypic indicators related to hemoglobin). The phenotypic indicators from top to bottom in the figure are as follows: (A) HGB, (B) MCH, and (C) MCHC. The x-axis represents chromosomes, whereas the y-axis represents the −log10 of the p value. The red line represents the suggested cutoff value for genome-wide significance (5.0×10−6).

Figure 5 Quantile–quantile plots of the results of the GWAS (10 phenotypic indicators related to white blood cells). (A) WBC, (B) LYMPH%, (C) MONO%, (D) NEUT%, (E) EO%, (F) BASO%, (G) MONO, (H) NEUT, (I) EO, and (J) BASO. The x-coordinate represents the expected p value, and the y-coordinate represents the actual p value.

Figure 6 Quantile–quantile plots of the results of the GWAS (3 phenotypic indicators related to platelets). (A) PLT, (B) PDW, and (C) MPV. The x-coordinate represents the expected p value, and the y-coordinate represents the actual p value.

Figure 7 Quantile–quantile plots of the results of the GWAS (4 phenotypic indicators related to red blood cells). (A) RBC, (B) RDW%, (C) RDW, and (D) MCV. The x-coordinate represents the expected p value, and the y-coordinate represents the actual p value.

Figure 8 Quantile–quantile plots of the results of the GWAS (3 phenotypic indicators related to hemoglobin). (A) HGB, (B) MCH, and (C) MCHC. The x-coordinate represents the expected p value, and the y-coordinate represents the actual p value.

WBC

The GWAS results (Table 2) showed that a total of 39 gene loci were associated with phenotypic indicators related to white blood cells (WBC, LYMPH%, MONO%, NEUT%, EO%, BASO%, MONO, NEUT, EO, and BASO). These associations reached the suggestive threshold for genome-wide significance (p < 5E-06). Four SNVs were selected as top SNVs for subsequent replication verification: LINC02101-PLK2 rs2964173 (WBC count), UNC5C rs78843681 (NEUT), XYLT1-NPIPA7 rs7195345 (NEUT), and EMSY-LRRC32 rs144735144 (BASO).

PLT

The GWAS results (Table 3) showed that 5 gene loci were associated with 3 phenotypic indicators related to platelets (PLT count, PDW, and MPV) at the suggestive threshold for genome-wide significance.

RBC

The GWAS results (Table 4) showed that one gene locus, CCDC157 rs35289401, was associated with RDW at genome-wide significance (p = 4.21E-08). In addition, the associations between 27 SNVs and 4 phenotypic indicators related to RBCs (RBC count, RDW%, RDW, and MCV) reached the suggestive threshold for genome-wide significance. Finally, 5 gene loci were selected as top SNVs for replication verification: STAP1 rs191799779 (RBC), C15orf53-C15orf54 rs2912390 (RBC), LOC100506474-LINC00276 rs118103202 (RDW%), LINC00578 rs1875098 (RDW), and CCDC157 rs35289401 (RDW).

Hemoglobin

The GWAS results (Table 5) showed a suggestive genome-wide association between 18 SNVs and 3 phenotypic indicators related to hemoglobin (HGB, MCH, and MCHC). Four gene loci were selected as top SNVs for replication verification: ARHGAP25 rs10208669 (MCH), NRIP1-USP25 rs12482879 (MCH), HBS1L-MYB rs1331309 (MCH), and CBLN1-C16orf78 rs148933121 (MCHC).

Replication Verification

In this study, a total of 13 genetic variants were selected as top SNVs for replication verification. Supplemental Figures 13 show a map of the regions associated with each hematological phenotypic indicator on different chromosomes. The replication verification results (Table 6) showed that only HBS1L-MYB rs1331309 (p = 6.42E-07) was still significantly associated with the level of MCH in an independent set of participants. In addition, the results (Table 7) showed that the level of MCH under genotypes GG/GT of HBS1L-MYB rs1331309 was significantly higher than that under genotype TT. Figure 9 shows the box plot of the MCH distribution under different genotypes of rs1331309.

Table 6 SNVs with p Value < 5E-07 in the GWAS That Were Selected for Replication Testing in Another Population

Table 7 The Level of MCH Under Different Genotypes of rs1331309

Figure 9 Box plot of MCH levels under different genotypes of HBS1L-MYB rs1331309.

Discussion

Hematological characteristics are very important for assessing health status and diagnosing diseases. Studies have reported that hematological characteristics have a degree of heritability, and genetic factors play a very important role in the variation in hematological characteristics among individuals.1 Although some identified genetic variants associated with hematological phenotypic indicators are shared among different ethnic groups, a large number of studies have confirmed the existence of racial differences.8–10,14,15 To date, there are relatively few GWASs on the hematological characteristics of the Han population from northwest China, and the genetic structure of these hematological characteristics is still unclear. Our study conducted a more comprehensive genome-wide association study of hematological characteristics in the Han population from northwest China, covering 20 phenotypic indicators rather than focusing on individual ones.

A total of 90 genetic variants were identified that were associated with hematological phenotypic indicators. The association between CCDC157 rs35289401 and RDW reached genome-wide significance, whereas the remaining 89 variants reached only the suggestive threshold for genome-wide significance. In addition, the results of the replication test showed that HBS1L-MYB rs1331309 was still significantly associated with MCH in an independent set of participants.

RDW can be used to determine the degree of red blood cell heterogeneity and is widely used in the clinical diagnosis of blood system diseases or anemia.16,17 Red blood cell distribution width is a common phenotypic indicator in clinical practice. In recent years, a number of studies have found that elevated RDW levels can help to clinically diagnose a variety of diseases and predict the severity of diseases such as cardiovascular disease,18 ischemic stroke, carotid atherosclerosis,19 and hepatitis B virus-related liver disease.20 We found a genome-wide significant association between CCDC157 rs35289401 and RDW. To our knowledge, we are the first to report this association. However, it is worth noting that in our replication test, CCDC157 rs35289401 was not associated with RDW. We speculate that the reason for this discrepancy may be the small sample size. The current evidence is therefore insufficient to identify CCDC157 rs35289401 as a genetic signal of RDW among the Han population from northwest China. It is necessary to expand the sample size and perform multiple verification tests. Regardless, our findings provide a valuable reference for follow-up studies of the genetic structure of RDW in the Han population from northwest China. They also provide supplemental data for GWASs of hematological characteristics.

In addition, the significant association between rs1331309 in the HBS1L-MYB intergenic region and MCH passed the replication verification. The level of MCH in carriers of the rs1331309 GG/GT genotypes was significantly higher than that in carriers of the wild-type TT genotype. These results are consistent with those of previous studies. A whole-genome sequencing and imputation study conducted by Southam et al in isolated populations from the island of Crete identified a significant association between rs1331309 and MCH levels (p value = 2.00E-9).21 In addition, Chen et al found a significant association between rs1331309 and white blood cell count in a trans-ethnic GWAS of blood cell traits (p value = 1.00E-34).22 The above evidence suggests that rs1331309 may serve as a biomarker for monitoring MCH levels. Studies have reported that low levels of MCH are important phenotypic indicators that can be used to diagnose many diseases.23,24 Combined with the results of our study, we speculate that rs1331309 genetic variants can be used to monitor changes in MCH levels and to facilitate the diagnosis or prediction of the occurrence and development of many diseases. Our study lays the foundation for the study of the genetic structure of hematological characteristics in the Han population from northwest China. It also provides a valuable reference for the clinical diagnosis or prediction of a variety of diseases.

Our study has certain limitations. The associations of CCDC157 rs35289401 and HBS1L-MYB rs1331309 with hematological phenotypic indicators in the Han population from northwest China have not been reported previously. Therefore, further functional analysis is necessary to verify the relationship between these two genetic variants and hematological phenotypic indicators. In addition, a larger sample size is necessary to confirm the results and allow new discoveries.

Conclusions

In summary, 90 genetic variants were associated with hematological phenotypic indicators among the Han population from northwest China. In particular, rs1331309 (HBS1L-MYB) was significantly associated with MCH levels, passed the replication test, and may serve as a biomarker for monitoring the dynamics of MCH levels. This study provides a reference for the study of the genetic structure of hematological characteristics. It also provides a valuable reference for the clinical diagnosis or prediction of a variety of diseases.

Data Sharing Statement

The datasets used and analyzed in the current study are available from the corresponding author on reasonable request.

Ethics Approval and Consent to Participate

This study was conducted under the standard approved by the ethics committee of the Affiliated Hospital of Xizang Minzu University. The study conformed to the ethical principles for medical research involving humans of the World Medical Association Declaration of Helsinki. All participants provided written informed consent before participating in this study.

Acknowledgments

We thank all the authors for their contributions and support. In addition, we also thank Hongyan Lu, Yuliang Wang and Zhanhao Zhang for their assistance in this study.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This study was supported by the Key R&D Program of Xizang (Tibet) Autonomous Region (XZ202101ZY0018G), the Natural Science Foundation of Tibet Autonomous Region (XZ 2019 ZR G-42(Z)), a Talent Development Supporting Project entitled “Tibet-Shaanxi Himalaya of Xizang Minzu University” (2020 Plateau Scholar), and the Innovation and Entrepreneurship Program for College Students of Tibet Nationalities University (MD202010695093).

Disclosure

The authors declare that they have no conflicts of interest in this work.

References

1. Okada Y, Kamatani Y. Common genetic factors for hematological traits in humans. J Hum Genet. 2012;57(3):161–169. doi:10.1038/jhg.2012.2

2. Zhang Z, Hong Y, Gao J, et al. Genome-wide association study reveals constant and specific loci for hematological traits at three time stages in a White Duroc × Erhualian F2 resource population. PLoS One. 2013;8(5):e63665. doi:10.1371/journal.pone.0063665

3. Soranzo N, Spector TD, Mangino M, et al. A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium. Nat Genet. 2009;41(11):1182–1190. doi:10.1038/ng.467

4. Evans DM, Frazer IH, Martin NG. Genetic and environmental causes of variation in basal levels of blood cells. Twin Res. 1999;2(4):250–257. doi:10.1375/twin.2.4.250

5. Garner C, Tatu T, Reittie JE, et al. Genetic influences on F cells and other hematologic variables: a twin heritability study. Blood. 2000;95(1):342–346. doi:10.1182/blood.V95.1.342

6. Hirschhorn JN, Daly MJ. Genome-wide association studies for common diseases and complex traits. Nat Rev Genet. 2005;6(2):95–108. doi:10.1038/nrg1521

7. Kim YK, Oh JH, Kim YJ, et al. Influence of genetic variants in EGF and other genes on hematological traits in Korean populations by a genome-wide approach. Biomed Res Int. 2015;2015:914965. doi:10.1155/2015/914965

8. Li J, Glessner JT, Zhang H, et al. GWAS of blood cell traits identifies novel associated loci and epistatic interactions in Caucasian and African-American children. Hum Mol Genet. 2013;22(7):1457–1464. doi:10.1093/hmg/dds534

9. Lo KS, Wilson JG, Lange LA, et al. Genetic association analysis highlights new loci that modulate hematological trait variation in Caucasians and African Americans. Hum Genet. 2011;129(3):307–317. doi:10.1007/s00439-010-0925-1

10. Kamatani Y, Matsuda K, Okada Y, et al. Genome-wide association study of hematological and biochemical traits in a Japanese population. Nat Genet. 2010;42(3):210–215. doi:10.1038/ng.531

11. Seiki T, Naito M, Hishida A, et al. Association of genetic polymorphisms with erythrocyte traits: verification of SNPs reported in a previous GWAS in a Japanese population. Gene. 2018;642:172–177. doi:10.1016/j.gene.2017.11.031

12. Mucha S, Mrode R, Coffey M, Kizilaslan M, Desire S, Conington J. Genome-wide association study of conformation and milk yield in mixed-breed dairy goats. J Dairy Sci. 2018;101(3):2213–2225. doi:10.3168/jds.2017-12919

13. Son HY, Hwangbo Y, Yoo SK, et al. Genome-wide association and expression quantitative trait loci studies identify multiple susceptibility loci for thyroid cancer. Nat Commun. 2017;8:15966. doi:10.1038/ncomms15966

14. Yasukochi Y, Sakuma J, Takeuchi I, et al. Identification of nine novel loci related to hematological traits in a Japanese population. Physiol Genomics. 2018;50(9):758–769. doi:10.1152/physiolgenomics.00088.2017

15. van Rooij FJA, Qayyum R, Smith AV, et al. Genome-wide trans-ethnic meta-analysis identifies seven genetic loci influencing erythrocyte traits and a role for RBPMS in erythropoiesis. Am J Hum Genet. 2017;100(1):51–63. doi:10.1016/j.ajhg.2016.11.016

16. Fava C, Cattazzo F, Hu ZD, Lippi G, Montagnana M. The role of red blood cell distribution width (RDW) in cardiovascular risk assessment: useful or hype? Ann Transl Med. 2019;7(20):581. doi:10.21037/atm.2019.09.58

17. Salvagno GL, Sanchis-Gomar F, Picanza A, Lippi G. Red blood cell distribution width: a simple parameter with multiple clinical applications. Crit Rev Clin Lab Sci. 2015;52(2):86–105. doi:10.3109/10408363.2014.992064

18. Li N, Zhou H, Tang Q. Red blood cell distribution width: a novel predictive indicator for cardiovascular and cerebrovascular diseases. Dis Markers. 2017;2017:7089493. doi:10.1155/2017/7089493

19. Feng GH, Li HP, Li QL, Fu Y, Huang RB. Red blood cell distribution width and ischaemic stroke. Stroke Vasc Neurol. 2017;2(3):172–175. doi:10.1136/svn-2017-000071

20. Fan X, Deng H, Wang X, et al. Association of red blood cell distribution width with severity of hepatitis B virus-related liver diseases. Clin Chim Acta. 2018;482:155–160. doi:10.1016/j.cca.2018.04.002

21. Southam L, Gilly A, Süveges D, et al. Whole genome sequencing and imputation in isolated populations identify genetic associations with medically-relevant complex traits. Nat Commun. 2017;8:15606. doi:10.1038/ncomms15606

22. Chen MH, Raffield LM, Mousas A, et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 Individuals from 5 global populations. Cell. 2020;182(5):1198–213.e14. doi:10.1016/j.cell.2020.06.045

23. Elkhalifa AME, Abdul-Ghani R. Hematological indices and abnormalities among patients with uncomplicated falciparum malaria in Kosti city of the White Nile state, Sudan. 2021;21(1):507.

24. Beyan C, Kaptan K, Beyan E, Turan M. The platelet count/mean corpuscular hemoglobin ratio distinguishes combined iron and vitamin B12 deficiency from uncomplicated iron deficiency. Int J Hematol. 2005;81(4):301–303. doi:10.1532/IJH97.E0311
