Unsupervised cardiac MRI phenotyping with 3D diffusion autoencoders reveals novel genetic insights

Abstract

Biobank-scale imaging provides a unique opportunity to characterise structural and functional cardiac phenotypes and how they relate to disease outcomes. However, deriving specific phenotypes from MRI data requires time-consuming expert annotation, which limits scalability and fails to exploit the information density of such image acquisitions. In this study, we applied a 3D diffusion autoencoder to temporally resolved cardiac MRI data from 71,021 UK Biobank participants to derive latent phenotypes representing the human heart in motion. These phenotypes were reproducible, heritable (h² = 4 to 18%), and significantly associated with cardiometabolic traits and outcomes, including atrial fibrillation (P = 8.5 × 10⁻²⁹) and myocardial infarction (P = 3.7 × 10⁻¹²). Using latent space manipulation techniques, we directly interpreted and visualised what specific latent phenotypes were capturing in a given MRI. To establish the genetic basis of these traits, we performed a genome-wide association study, identifying 89 significant common variants (P < 2.3 × 10⁻⁹) across 42 loci, including seven novel loci. Extensive multi-trait colocalisation analyses (PP.H4 > 0.8) linked these variants to various cardiac traits and diseases, revealing a shared genetic architecture spanning phenotypic scales. Polygenic Risk Scores (PRS) derived from latent phenotypes demonstrated predictive power for a range of cardiometabolic diseases, and high-risk individuals had substantially increased cumulative hazard rates across a range of diseases. This study showcases the use of diffusion autoencoding methods as powerful tools for unsupervised phenotyping, genetic discovery and disease risk prediction using cardiac MRI data.
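
For readers unfamiliar with polygenic risk scores, the general idea can be sketched as a weighted sum of effect-allele dosages, standardised across a cohort. The sketch below is illustrative only and is not the pipeline used in this study: the function names, the three-variant weights and the toy cohort are all hypothetical, and a real PRS would typically involve LD-aware weighting and many more variants.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardise(scores):
    """Centre and scale scores to mean 0 and SD 1 across the cohort."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


# Hypothetical per-variant effect sizes from a GWAS of one latent phenotype.
weights = [0.5, -0.25, 1.0]

# Hypothetical effect-allele dosages: one row per individual, one column per variant.
cohort = [
    [0, 1, 2],
    [2, 0, 0],
    [1, 1, 1],
]

raw = [polygenic_score(d, weights) for d in cohort]
z = standardise(raw)

# Index of the individual with the highest standardised score.
top = max(range(len(z)), key=lambda i: z[i])
```

In this toy cohort the first individual carries two copies of the most heavily weighted allele and therefore receives the highest score; in practice, "high risk" would be defined by a cohort-level cut-off such as a top percentile of the standardised score.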

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study did not receive any funding.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This research has been conducted using the UK Biobank Resource under application number 82779.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

All summary statistics will be deposited with the GWAS Catalog. All data used to conduct the study are available to any UK Biobank-approved researcher. Information and URLs for the publicly available GWAS summary statistics used for colocalisation analysis are detailed in Supplementary Table 15. The full cis-eQTL summary statistics from GTEx v8 were obtained from https://www.gtexportal.org/home/protectedDataAccess on 11/06/2024 under dbGaP application phs000424.v2.p1. Single-nucleus RNA sequencing data objects were downloaded via the https://www.heartcellatlas.org/ web portal.

https://github.com/GlastonburyGroup/CardiacDiffAE_GWAS
