MOLGENIS VIP: an open-source and modular pipeline for high-throughput and integrated DNA variant analysis

Abstract

In silico variant interpretation pipelines are an integral part of research and genome diagnostics. However, challenges remain for automated variant interpretation and candidate shortlisting. For instance, reliability is affected by variability in input data caused by different sequencing platforms, erroneous nomenclature and changing experimental conditions. Similarly, differences between predictive algorithms can produce discordant results, and scalability is essential to accommodate large amounts of input data, such as Whole Genome Sequencing (WGS) data. To accelerate causal variant detection and innovation in genome diagnostics and research, we developed the MOLGENIS Variant Interpretation Pipeline (VIP). VIP is a flexible, open-source computational pipeline that generates interactive reports of variants in Whole Exome Sequencing (WES) and WGS data for expert interpretation. VIP is applicable to short- and long-read data from different platforms and offers tools for increased sensitivity. For this purpose, a configurable decision tree and filters based on Human Phenotype Ontology (HPO) and gene inheritance can be used to pinpoint disease-causing variants or to fine-tune a query for specific variants. Additionally, we present a step-by-step protocol describing how to use VIP to annotate, classify and filter genetic variants of patients with a rare disease that has a suspected genetic cause. Finally, we demonstrate how VIP performs using 25,664 previously classified variants from the Data Sharing initiative of the Vereniging van Klinisch Genetische Laboratoriumdiagnostiek (VKGL), a cohort of 18 patients from routine diagnostics and a cohort of 41 patients with rare diseases (RD) who were not diagnosed in routine diagnostics but were solved within the EU-wide project to solve rare diseases (EU-Solve-RD) using novel omics approaches.
Configuring the protocol requires bioinformatic knowledge; once it is configured, any diagnostic professional can perform an analysis within 5 hours.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study received funding from the EU projects Solve-RD, EJP-RD and CINECA Project (H2020 779257, H2020 825575, H2020 825775, respectively) and NWO grant numbers 917.164.455 and 184.034.019.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Prof. H.P.H. Kremer MD PhD, chair of the Medical Ethics Review Board, University Medical Center Groningen, declares that the submission entitled: MOLGENIS VIP: an open-source and modular pipeline for high-throughput and integrated DNA variant analysis, by Maassen et al., fulfils all the requirements for patient anonymity and is in agreement with the regulations of our University Hospital for publication of patient data.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

The previously classified variants from the VKGL can be downloaded from https://vkgl.molgeniscloud.org/. The routine diagnostic cohort contains data from patients of the UMCG and can therefore only be shared upon request. The Solve-RD research cohort is available as a dataset (EGAD50000000390) in the European Genome-Phenome Archive (EGA) and can be accessed by sending a data access request to the Data Access Committee (DAC) of the Solve-RD project.

