Peripheral whole blood microRNA expression in relation to vascular function: a population-based study

Data availability

The data from the Rhineland Study are not publicly available due to data protection regulations. Access to data can be provided to scientists in accordance with the Rhineland Study’s Data Use and Access Policy. Requests for additional information or to access to the Rhineland Study’s datasets can be send to RS-DUAC@dzne.de.

Study population

We used data from the Rhineland Study, an ongoing population-based prospective cohort study in Bonn, Germany. All inhabitants of two geographically-defined areas of Bonn are invited to participate in the Rhineland Study. There are no specific selection criteria. The only requirements are that participants are aged 30 years and above, and have sufficient command of the German language to provide informed consent.

Approval to undertake the study was obtained from the ethics committee of the University of Bonn, Medical Faculty. The study is carried out in accordance with the recommendations of the International Conference on Harmonization Good Clinical Practice standards. We obtained written informed consent from all participants in accordance with the Declaration of Helsinki.

In this study, we used baseline data of the first 3000 consecutive participants of the Rhineland Study. MicroRNA expression data were unavailable for 33 individuals due to technical issues, with an additional 38 participants being excluded during quality control procedures. Moreover, 323 participants were excluded because of missing cardiovascular data due to contraindication/exclusion criteria (n = 246), participant refusal (n = 8), technical problems (n = 12), exclusion during quality assurance (n = 51) and for other/unknown reasons (n = 6). Our final analysis was conducted in a subset of 2606 participants for whom both microRNA expression data and cardiovascular measurements were available.

Vascular function measurementsBlood pressure

Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured in a sitting position, using an oscillometric blood pressure device (Omron 705 IT) [22, 23]. The measurements were performed thrice, separated by ten minutes, by experienced study technicians while participants were sitting in a resting chair in a quiet environment. Cuff size was determined by measuring participants’ arm circumference in the middle of the upper arm between the acromion and olecranon on the right arm of the participant sitting in the measuring position. Measurements were preferably performed in the right arm. In cases where the measurements were not possible on the right arm, the left arm was used. The measured arm was always placed in a resting position at heart level, with the palms facing upward, the shoulders in a horizontal position, and both legs resting on the ground. The mean of the second and third measurement was used in the analyses. Mean arterial pressure (MAP) was calculated as (SBP + 2 × DBP)/3 [24].

Hemodynamic measurements

Impedance cardiography was performed with the CardioScreen 2000 device (Medis, Germany), by experienced study technicians, in a temperature-controlled room. Before the examination, the participants were placed in a supine position and allowed to rest for five minutes. Electrodes as well as arm, ankle and thigh cuffs were placed as per the device manufacturer’s recommendations [25]. All hemodynamic measurements were calculated by in-developed software, based on simultaneously registered electrocardiography signals and blood pressures with 2-minute intervals. We set central venous pressure (CVP) at 6 mmHg as recommended by the producer. Whereas normal CVP reportedly can vary from 5 to 15 mmHg [26], a recent study has shown that setting CVP at certain values does not impact systemic vascular resistance-related outcomes [26].

Stroke volume (mL) was defined as the volume of blood pumped from the left ventricle to systemic circulation in each heartbeat and computed using beat-to-beat for approximately 8 min with an impedance cardiography device (CardioScreen 2000, Medis, Germany) [27]. Cardiac output (L/min) was calculated as stroke volume multiplied by heart rate (beats per minute), and systemic vascular resistance (dynes/sec/cm5) as MAP divided by cardiac output, multiplied by 80. We divided stroke volume, cardiac output and systemic vascular resistance by body surface area (BSA, m2), calculated based on the participant’s height and weight using the Du Bois formula [28], to obtain the stroke index (mL/m2), cardiac index (L/min/m2) and systemic vascular resistance index (dynes/sec/cm5/m2), respectively.

Arterial stiffness measurements

As measures of arterial stiffness, we included aorta-femoral pulse wave velocity (m/s) and total arterial compliance (mL/mmHg). The propagation time of the pulse wave was defined as the difference between the opening of the aortic valve, assessed through impedance cardiography waves, and the arrival of the pulse wave to the mid-femoral cuff. We calculated the pulse wave velocity as the ratio between the distance between the supra-sternal notch and the mid-femoral cuff and the propagation time. Total arterial compliance was obtained by dividing stroke volume by pulse pressure, which is the difference between SBP and DBP. To obtain the total arterial compliance index (mL/mmHg/m2), total arterial compliance was divided by BSA.

Microvascular function

Microvascular function was measured through the reactive skin hyperemia parameter. The examination lasted a total of 26 min, during which time the skin blood flow was measured on the ventral surface of the forearm through a laser Doppler flowmetry device (Moors, UK). In the first 2 min, the skin blood flow at baseline was measured. Afterwards, the temperature in the defined region was increased to 40 °C using a heating probe (following a local thermal heating protocol). Reactive skin hyperemia was defined as the percentage increase between the skin blood flow at baseline and the last 2 min of the plateau level.

White matter hyperintensities

WMHs in the brain were assessed through 3T MRI, which included T1- and T2-weighted, and fluid-attenuated inversion recovery (FLAIR) sequences. WMHs were determined as the hyperintense signal of the white matter tracts in the supratentorial region on T2-weighted images, automatically outlined using an in-house developed algorithm based on DeepMedic [29], utilizing image information from T1- and T2-weighted, and FLAIR sequences. A subset of the automatically segmented images was manually checked for quality assurance. White matter volume was extracted using the FreeSurfer automated segmentation pipeline [30]. To account for the effect of brain size, WMH burden was defined as the ratio of WMHs to white matter volume.

MicroRNA and gene expression profiling

Fasting blood samples were collected between 7:00 to 9:45 in the morning from an antecubital or dorsal hand vein and stored in PAXgene Blood RNA tubes (PreAnalytix/Qiagen). Total RNA was isolated according to the manufacturer’s instructions using PAXgene Blood miRNA Kit and following the semi-automatic purification protocol (PreAnalytix/Qiagen). NEBNext® small RNA library preparation kit was used to generate sequencing libraries. Briefly, we used 100 ng of isolated total RNA as starting material for adapter ligation, primer hybridization, cDNA generation. 150 base pairs band was cut and chosen as insert size for library quantification. The microRNA sequencing was performed on the Illumina HiSeq 2000 platform using a 50-bp single read setup. The quality of the sequencing reads was evaluated through FastQC v0.11.9 software. Sequencing adapters, low-quality score reads (i.e., reads with the average quality per 4-base wide below 15) and reads shorter than 18 base pairs were discarded using Trimmomatic v0.39 software. The alignment and the quantification of microRNA data were performed by Mirdeep2 v2.0.1.2 tool [31], using Human Genome GRCh38.p13 provided by Ensembl and microRNA mature and precursor sequences obtained from miRBase v22.1 [32].

The gene expression profiling was performed on 750 ng of isolated total RNA. After checking RNA integrity and quantity (TapeStation 4200, Agilent), NGS libraries for total RNA sequencing were generated (TruSeq stranded total RNA kit with Ribo-Zero Globin, Illumina). Library size distribution (D1000 assay on TapeStation 4200, Agilent) and quantification (Qubit HS dsDNA assay, Invitrogen) were checked and libraries were clustered at a final 250 pM concentration. The sequencing was performed in paired-end mode (2*50 cycles) on a NovaSeq6000 instrument (Illumina) using S2 v1 chemistry in XP mode, followed by demultiplexing using bcl2fastq2 v2.20. After checking the reads sequence quality with FastQC v0.11.9, the sequencing reads were aligned to the human reference genome GRCh38.p13 provided by Ensembl using STAR v2.7.1 [33]. The count matrix was generated with STAR –quantMode GeneCounts using the human gene annotation version GRCh38.101.

MicroRNAs and genes with an overall mean expression greater than 15 reads and expressed in at least 5% of the participants were used for subsequent analyses. Finally, to normalize for library size and to log-transform the raw data, we applied the varianceStabilizingTransformation function from DESeq2 v1.30.1 R package [32].

Genetics profiling

Genotyping was performed on the Omni 2.5 Exome Array, while GenomeStudio (version 2.0.5) was used for genotype calling. Quality control was performed through PLINK (version 1.9) by checking for poor genotyping rate (< 99%), Hardy-Weinberg disequilibrium (p < 1e − 6), poor sample call rate (< 95%), abnormal heterozygosity, cryptic relatedness, and sex mismatch. Population sub-structure was analyzed through EIGENSOFT v7.2.1.0 [34]. To account for systematic variations in allele frequencies due to different ethnic backgrounds, we excluded participants of non-Caucasian descent. Genotype imputation was performed using IMPUTE v2 [35] with 1000 Genomes phase 3 v5 as the reference panel [36].

Demographic and health variables

Information on age and sex was collected through a questionnaire. Body mass index (BMI) was calculated as weight (kg) divided by height squared (m2). Cardiovascular conditions, including stroke, heart failure, coronary artery disease, arrhythmia, heart valve disease, myocardial infarction and peripheral arterial disease were defined as a self-reported physician diagnosis. Hypertension status was defined based on the mean measured systolic blood pressure ≥ 140 mmHg and/or diastolic blood pressure ≥ 90 mmHg, or antihypertensive drug use. Differential blood cell counts (i.e., erythrocytes, leukocytes, and platelets) were assessed at the Central Laboratory of the University Hospital in Bonn, using EDTA-whole blood samples on a hematological analyzer Sysmex XN9000.

Statistical analysisWeighted microRNA co-expression network analysis

We clustered microRNAs with highly correlated expression levels with the weighted gene co-expression network analysis (WGCNA) method, using the WGCNA R package v1.70-3 [37]. The WGCNA approach assumes that microRNAs that present similar expression patterns tend to be involved in similar biological functions [38]. The clustering analysis was conducted on residual microRNA levels after regressing out the effects of sequencing batch. Firstly, the adjacency matrix was constructed using the threshold power of 7, which was the first power value to reach 0.90 in the scale-free topology and corresponded to a relatively low mean connectivity value [37] (Additional file 1: Fig. S1). The similarity matrix was generated based on biweight midcorrelation and with the signed parameter. The topological overlap matrix (TOM) and the corresponding dissimilarity (1-TOM) were then calculated. MicroRNA modules were identified by first generating a hierarchical clustering tree using the “average” algorithm and then by clustering microRNAs using the Hybrid Dynamic Tree Cut method, setting the minimum cluster size to 5. Finally, highly correlated modules were merged using 0.2 as the “cutHeight” parameter, corresponding to a correlation coefficient of 0.8. We then used the module eigen-microRNA vector, calculated as the first principal component of each module, to represent the expression profile of all microRNAs within a given module. Moreover, we computed the module membership measurement for each microRNA, which describes the correlation between a given microRNA’s expression level and the corresponding module eigen-microRNA vector [37].

Identification of key modules related to vascular function

Before further analysis, we applied a log-transformation for pulse wave velocity and reactive skin hyperemia and a logit transformation for WMH burden to account for their skewed distributions. All the numerical variables were standardized to a mean of 0 and a standard deviation of 1 to allow for better comparison of the effect sizes across the different cardiovascular traits.

As a first exploratory analysis, we evaluated the relationship between age and cardiovascular measurements, as well as microRNA modules.

Next, to identify key modules related to vascular function, we investigated the association of module eigen-microRNA vectors (independent variables) with indexed cardiovascular measurements and WMH burden (dependent variables), by running multivariable linear regression models, adjusting for age and sex.

To examine whether indexing the cardiovascular measurements for BSA affected our results, we also conducted a sensitivity analysis using the non-indexed cardiac output, stroke volume, total arterial compliance and systemic vascular resistance measurements as dependent variables, adjusting for age and sex. Moreover, we examined whether our results were affected when adjusting for BMI rather than BSA, by additionally including BMI as a covariate in the analysis of the non-indexed measurements.

Multiple testing correction was applied using the Benjamini-Hochberg false discovery rate (FDR) method [39]. We reported results both at the FDR < 0.1 and FDR < 0.05 thresholds. This was done to avoid rejecting as statistically non-significant results that were very close to the common FDR < 0.05 threshold, thus reducing the possibility of false negatives (Type II error).

Identification of hub-microRNAs for each module

For each key module, defined as a module significantly associated with at least one of the investigated traits, hub-microRNAs were identified. We defined hub-microRNAs as microRNAs that (a) had module membership greater than 0.8, and (b) were significantly associated with the selected trait (p-value < 0.05). Therefore, hub-microRNAs served as module representatives and were the main drivers of the association between the module and the candidate trait.

Mediation analysis

Additionally, we investigated whether the associations between module eigen-microRNA vectors or hub-microRNA expression levels (exposures) and WMH burden (outcome) were mediated through cardiovascular function. We performed structural equation modeling using the R package laavan v0.6-11. The mediation models were adjusted for age and sex, with the threshold for statistical significance set at p-value < 0.05.

Age and sex modifications

For each hub-microRNA significantly associated with a trait, we assessed whether the association was modified by age or sex through interaction analyses. Specifically, we used linear regression models, adjusting for age and sex and including interaction terms of hub-microRNA with age and, in separate models, with sex. In case of significant interaction terms (p-value < 0.05), we conducted further stratified analyses by age quartiles and sex.

Exploratory analysis of hub-microRNAs

To investigate the overall role of hub-microRNAs in cardiovascular health, they were also evaluated in relation to all other cardiovascular traits. The regression models were adjusted for age and sex and the statistical significance threshold was set at p-value < 0.05.

Hub-microRNA – target genes and functional analysis

To map genes that are potentially regulated by the identified hub-microRNAs, we integrated microRNA and gene expression data. The analysis was conducted on 2181 participants of the Rhineland Study for which microRNA expression data, gene expression data and blood cell counts were available. First, we collected data on experimentally validated target genes and predicted target genes using the R package multiMir v1.12.0. Specifically, we queried MirTarBase [40] and TargetScan [41], widely recognized online databases for microRNA target gene prediction, and miRDB [42] a database providing experimentally validated microRNA-target interactions. Subsequently, we confirmed the relation between each putative microRNA-target gene pair. For this, after correction of microRNA and gene expression levels for batch effects, we created a model using gene expression levels as the dependent variable and microRNA expression levels as the main independent variable, adjusting for age, sex, erythrocyte, leukocyte, and platelet counts. A confirmed target gene was defined as negatively associated (beta estimate < 0 and p-value < 0.05) with the targeting microRNA.

To gain insights into the biological processes regulated by the hub-microRNAs and the corresponding target genes, we performed over-representation analyses on confirmed target genes, separately for each key module. Specifically, we first determined overrepresented Gene Ontology: Biological Process (GO: BP) terms, using the R package clusterProfiler v3.18.1. Next, we aggregated semantically similar terms to facilitate the identification of key terms through rrvgo v1.2.0. Briefly, rrvgo computes the semantic similarity matrix between pairs of GO terms and aggregates them together through complete linkage. We set a low similarity threshold of 0.5 to only merge highly similar terms. The p-values of the terms within each group were combined using the Fisher method and multiple testing correction was applied using the Benjamini-Hochberg false discovery rate (FDR) method [39]. Only terms that reached the significant level at FDR < 0.05 were reported.

Genome-wide miR-eQTL analysis

To identify potential microRNA expression quantitative trait loci (miR-eQTLs), we conducted a genome-wide miR-eQTL analysis on 2456 participants of the Rhineland Study with both genetic and microRNA expression data available. Specifically, we first adjusted microRNA expression levels for age, sex, the first 10 genetic principal components and batch effects, and extracted the residuals. Subsequently, linear regression analysis was used to assess the relation between each SNP (independent variable) and these hub-microRNA residuals (dependent variable). Cis-SNPs were defined as those located within 1 Mb of the mature microRNA, whereas trans-SNPs were defined as those positioned elsewhere [43]. The genome-wide significance level for miR-eQTLs was set at p-value < 5e-8. SNPs with relatively high linkage disequilibrium (i.e., r2 > 0.6) with nearby SNPs were clumped to define genomic risk loci, using the Functional Mapping and Annotation of Genome-Wide Association Studies (GWAS) platform [44]. Lead SNPs in these genomic risk loci were defined as those independent significant SNPs that were in approximate linkage disequilibrium with each other at r2 < 0.1.

留言 (0)

沒有登入
gif