The landscape of cytogenetic and molecular genetic methods in diagnostics for hematologic neoplasia

Patients with hematologic neoplasia are classified according to the World Health Organization (WHO) classification systems, which are based on morphology, immunophenotyping, cytogenetics and molecular genetics [1,2]. Over the last decades, a variety of numerical and structural chromosomal abnormalities, gene fusions and mutations have been detected in hematologic neoplasia. Recurrent genetic aberrations with prognostic significance are taken into account in current risk-adapted therapy protocols. A combination of diagnostic techniques is used to detect the different types of genetic change (e.g. chromosome loss/gain, chromosome translocations, and single nucleotide variants). In recent years, new technologies such as high-throughput sequencing have been developed and are increasingly used in routine diagnostics. In this review, we provide an overview of common current cytogenetic and molecular genetic methods and an outlook on methods that could be of value in the future. We also summarize key cytogenetic and molecular genetic nomenclature and databases and briefly outline standards and guidelines for the interpretation of sequence variants.

Since the detection of an abnormally short chromosome 22 in leukemia patients in 1959 [3], known today as the Philadelphia-chromosome, a large number of other leukemia-related genetic aberrations in patients with hematologic neoplasia have been discovered. Conventional karyotyping has played an important role in diagnostics of these aberrations, and is still the gold standard in hematologic neoplasia [1,2,4,5]. For conventional karyotyping, viable cells from bone marrow or peripheral blood are cultured overnight. Colcemid solution is used to arrest mitotic cells in the metaphase. In the majority of laboratories, metaphase chromosomes are afterwards treated with trypsin and stained with Giemsa stain (G-Banding). As karyotyping is a whole genome analysis at very low resolution level, genetic changes on all chromosomes may be visible if they exceed a resolution of 5–10 Mb (Table 1) [6,7]. An advantage of this method is the fast turnaround time and the detection of aberrations on a single cell level, allowing discrimination between different leukemic clones as well as tracking of clonal evolution. Using karyotyping, clinically relevant alterations can be detected, such as hyperdiploid and hypodiploid karyotypes (acute lymphoblastic leukemia (ALL)), Philadelphia-translocation resulting in BCR::ABL1 fusion (ALL, chronic myeloid leukemia (CML) and acute myeloid leukemia (AML)), or rearrangements of the KMT2A and TCF3 genes (ALL and AML) [1,2]. Furthermore, according to WHO-HAEM5, detection of karyotypic complexity is a desirable additional investigation in chronic lymphocytic leukemia (CLL) [2,8]. Cytogenetics also plays a central role in diagnostics for myelodysplastic syndrome (MDS), in which chromosomal abnormalities are found in 50% of patients [9]. These abnormalities, such as deletion of 5q, can be entity-determining in MDS, are used to assess prognosis (IPSS-R and IPSS-M), and are important for therapy stratification [1,[9], [10], [11]]. 
Clonal complexity, evolution and heterogeneity in myeloid neoplasms, including AML, MDS and myeloproliferative neoplasms (MPN), are associated with disease progression and adverse prognosis and can be well characterized by chromosome analysis at the single-cell level.

To date, several known recurrent chromosomal aberrations cannot be detected using conventional karyotyping for various technical and biological reasons. Molecular cytogenetics via fluorescence in situ hybridization (FISH) is a complementary method that allows the detection of cryptic numerical and structural aberrations in interphase nuclei as well as on metaphase chromosomes. Fluorescence-labeled probes for specific genomic loci bind to the genomic DNA in the cells and give a signal that provides information on the number, chromosomal location and configuration of the analyzed loci. FISH analysis therefore provides increased sensitivity to detect cytogenetic changes compared to conventional karyotyping [12]. Owing to its fast turnaround time, FISH is the method of choice in the majority of diagnostic laboratories for the detection of the most common fusions and translocations in hematologic neoplasia (e.g. ALL, AML, myeloid/lymphoid neoplasia with eosinophilia). A rush FISH test within a few hours is needed to confirm acute promyelocytic leukemia (APL), because of its life-threatening association with disseminated intravascular coagulation (DIC). Depending on the specific probes that are used, there are limits in both detection level and detectable clone size. Increasing the number of analyzed cells per sample can help to detect smaller clones but needs to be kept at a diagnostically feasible level in high-throughput settings. Because of the low percentage of bone marrow infiltration with plasma cells in multiple myeloma (MM), a recommended step in the Revised International Staging System (R-ISS) is to preselect CD138+ plasma cells by magnetic cell sorting (MACS). This increases the proportion of tumor cells in the sample and enables the detection of characteristic and high-risk aberrations in smaller clones or in already treated patients [13,14].
An important advantage of FISH is that the results can be correlated with cell morphology or combined with immunophenotyping and lineage assignment. This supports a precise diagnosis, for example in distinguishing de novo B-ALL with t(9;22) from CML in lymphoid blast crisis, and improves our understanding of the mechanisms of genetic alterations in cell differentiation and lineages.

An even higher resolution in the detection of deletions and duplications at the whole-genome level can be achieved using array CGH and SNP microarrays. Array comparative genomic hybridization (CGH) is a method to analyze genome-wide copy number variations, as well as ploidy, by competitive hybridization of sample DNA and reference DNA to the probes on the array. A prognostic impact of different copy number profiles in pediatric B-ALL has been extensively described in recent years [15,16]. The most frequently deleted regions include BTG1, CDKN2A/B, EBF1, ETV6, IKZF1, PAX5, RB1, and the pseudoautosomal region (PAR1) involving the CRLF2 and P2RY8 genes [17]. Using array CGH, changes in chromosome ploidy can be assessed together with the above-mentioned deletions. Hypodiploid karyotypes, characterized by the loss of different chromosomes, are known to be associated with adverse prognosis in pediatric and adult B-cell ALL [18]. Ploidy status can alternatively be analyzed by measuring the DNA index, where a defined number of cells is fixed and stained with propidium iodide for fluorescence-activated cell sorting (FACS) analysis of DNA content [19].
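As a rough illustration of how copy number is read from array CGH data, the following Python sketch computes the signal expected for a region when only a fraction of the cells carries an altered copy number. The log2(sample/reference) scale, the diploid reference and the function name are assumptions made for illustration and are not taken from the cited studies.

```python
import math

def expected_log2_ratio(copy_number: int, clone_fraction: float, ref_copies: int = 2) -> float:
    """Expected log2(sample/reference) signal for one genomic region.

    Assumes (for illustration only) that a fraction `clone_fraction` of the
    cells carries `copy_number` copies and all other cells carry `ref_copies`.
    """
    if not 0.0 <= clone_fraction <= 1.0:
        raise ValueError("clone_fraction must be between 0 and 1")
    mean_copies = clone_fraction * copy_number + (1.0 - clone_fraction) * ref_copies
    if mean_copies <= 0:
        return float("-inf")  # homozygous deletion in every cell
    return math.log2(mean_copies / ref_copies)

# Heterozygous deletion present in all cells vs. in half of the cells
print(round(expected_log2_ratio(1, 1.0), 2))  # -1.0
print(round(expected_log2_ratio(1, 0.5), 2))  # -0.42
```

The second value shows why deletions in small clones are harder to call: the signal shrinks towards zero as the clone fraction decreases.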

In addition to the above-mentioned copy number analysis, single nucleotide polymorphism (SNP) analysis on SNP arrays enables the detection of copy number-neutral loss of heterozygosity (cnLOH), haploidy and masked hypodiploidy. Patients with masked hypodiploidy do not have a visible clone with fewer than 43 chromosomes in conventional karyotyping. Instead, their karyotypes mimic hyperdiploidy as a result of a doubling of the previously hypodiploid clone [20]. Combined array CGH and SNP array analysis can reveal cnLOH in these cases and facilitates appropriate treatment according to the protocols of different studies. Array CGH can also improve the detection of rare cases with high-risk genetic features, such as ALL with intrachromosomal amplification of chromosome 21 (iAMP21), that do not meet the conventional criteria determined by FISH. For example, a small percentage of iAMP21 cases can be clearly identified by array CGH but do not meet the common definition of 5 or more copies of the RUNX1 gene per cell in interphase FISH, with 3 or more copies on a single aberrant chromosome 21 [21,22,23,24]. Gene fusions and translocations that do not result in the loss of genetic material cannot be assessed with array CGH. Combined with enrichment of the relevant cells (e.g. CD138-positive or CD19/CD20-positive cells), genomic microarrays are very useful for detecting genomic alterations and complexity in multiple myeloma and CLL, where tumor cells often do not divide actively in culture for conventional karyotyping, or where samples contain only a limited number of plasma cells or CLL cells.
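The doubling of a hypodiploid clone described above can be made concrete with a small toy model in Python. The chromosome counts are hypothetical, sex chromosomes are left out, and the model simply assumes that chromosomes present in two copies before doubling keep both parental homologs; it is a sketch of the reasoning, not a diagnostic algorithm.

```python
def double_clone(hypodiploid_copies: dict[str, int]) -> dict[str, dict]:
    """Model the doubling of a hypodiploid clone.

    A chromosome with one copy before doubling ends up with two identical
    copies (copy number-neutral LOH). A chromosome with two copies ends up
    with four and is assumed to keep both parental homologs.
    """
    return {
        chrom: {"copies": 2 * n, "loh": n == 1}
        for chrom, n in hypodiploid_copies.items()
    }

# Hypothetical near-haploid clone: autosomes 1-22, four of them retained in two copies
retained = {"10", "14", "18", "21"}
hypodiploid = {str(c): (2 if str(c) in retained else 1) for c in range(1, 23)}

doubled = double_clone(hypodiploid)
total = sum(v["copies"] for v in doubled.values())
cnloh = [c for c, v in doubled.items() if v["copies"] == 2 and v["loh"]]

print(sum(hypodiploid.values()), total)  # 26 52
print(len(cnloh))                        # 18
```

In this toy model the doubled clone carries 52 autosomes instead of the normal 44 and would look hyperdiploid by chromosome count alone, while the 18 chromosomes with two identical copies would show up as cnLOH on a SNP array.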

In addition to cytogenetic methods, the detection of genetic aberrations can be performed using molecular methods based on RNA or DNA (Table 1). These methods do not require the use of viable cells, which makes them accessible across a broader range of scenarios.

A very fast and sensitive method for fusion detection is the multiplex reverse transcription polymerase chain reaction (RT-PCR) assay, which detects different fusion transcripts at the RNA level. Using reverse transcriptase (RT), cDNA is synthesized from isolated RNA. A combination of different primer sets is then used to amplify different gene fusions in the cDNA in parallel, and the results can either be visualized using gel electrophoresis or analyzed quantitatively using commercial or in-house real-time PCR assays. Various in-house multiplex RT-PCR protocols suitable for the detection of fusions with frequent breakpoints and known partner genes have been described [25,26,27].

Chronic myeloid leukemia (CML) is a clonal myeloproliferative disorder of older adulthood. This disease entity is characterized by the reciprocal balanced translocation t(9;22)(q34;q11), which results in a BCR::ABL1 fusion as the pathognomonic hallmark of the disease. In most patients with CML, the resulting BCR::ABL1 fusion gene, which encodes a constitutively active protein kinase, can be successfully targeted by selective BCR::ABL1 tyrosine kinase inhibitors (TKIs) [28,29,30,31]. Different breakpoints in the BCR and ABL1 genes have been described, and detection of the specific fusion transcript is important for follow-up analysis during treatment with TKIs. The translocation t(9;22) is also a recurrent, therapy-relevant alteration in patients with ALL. Multiplex RT-PCR is the method of choice for the detection of the most common fusion transcripts at diagnosis in patients with CML or ALL in the majority of diagnostic laboratories [32,33,34]. Once the distinct fusion transcript has been identified, a transcript-specific RT-PCR (e.g. for p190 or p210 BCR::ABL1) can be applied during the clinical course.

Multiplex ligation-dependent probe amplification (MLPA) is used in the diagnosis of copy number-related genetic diseases [35]. This method allows the simultaneous detection and quantification of copy number changes at different loci with a short turnaround time [35,36]. This is helpful when the diagnostic question concerns a specific set of genes or loci rather than the whole genome, for example the determination of IKZF1plus status in pediatric ALL [37]. In brief, each probe consists of two components that are ligated after binding to the target. The ligated probes, which contain specific primer sequences, can then be amplified, quantified and compared to a reference sample to detect copy number variations (CNVs) [35,36]. This method can be used as an adjunct to conventional cytogenetics, as well as a fast first-line screening approach in a variety of hematological diseases [35,38,39].
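The comparison with a reference sample can be sketched as a ratio of normalized probe signals. The Python example below is a minimal illustration under stated assumptions: the probe names, the peak values, the use of control probes for within-sample normalization and the cut-offs of 0.75 and 1.25 are all hypothetical and would need to be replaced by the validated settings of the assay in use.

```python
def dosage_quotients(sample: dict, reference: dict, control_probes: set) -> dict:
    """Ratio of normalized probe signals (sample vs. reference sample)."""
    def normalize(peaks: dict) -> dict:
        # Within-sample normalization against the summed control probe signal
        control_sum = sum(peaks[p] for p in control_probes)
        return {p: value / control_sum for p, value in peaks.items()}

    s, r = normalize(sample), normalize(reference)
    return {p: s[p] / r[p] for p in sample if p not in control_probes}

def call_copy_number(dq: float, loss_below: float = 0.75, gain_above: float = 1.25) -> str:
    """Illustrative cut-offs only; not validated thresholds."""
    if dq < loss_below:
        return "loss"
    if dq > gain_above:
        return "gain"
    return "normal"

# Hypothetical peak values (arbitrary fluorescence units)
controls = {"ctrl_1", "ctrl_2"}
reference = {"IKZF1_probe": 1000, "PAX5_probe": 1000, "ctrl_1": 1000, "ctrl_2": 1000}
sample = {"IKZF1_probe": 600, "PAX5_probe": 1200, "ctrl_1": 1200, "ctrl_2": 1200}

for probe, dq in dosage_quotients(sample, reference, controls).items():
    print(probe, dq, call_copy_number(dq))
# IKZF1_probe 0.5 loss
# PAX5_probe 1.0 normal
```

The sample was loaded at a higher overall signal than the reference, which is why the values are normalized within each sample before the two are compared.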

Fragment (length) analysis is a sensitive and fast DNA- or RNA-based method used to detect specific mutations and duplications/deletions (in particular tandem repeats). The method comprises a series of techniques: (1) The DNA/RNA fragments are fluorescently labeled using primers designed for the gene/gene region of interest; (2) the fragments are separated by capillary electrophoresis; and (3) the fragments are sized by comparison to an internal standard [40]. The fragment (length) analysis can provide different information, e.g. fragment size (size of tandem duplication) and relative quantitation (measurement of mutant allele burden). Pathogenic variants in the FMS-like tyrosine kinase 3 (FLT3) gene occur in approximately 30% of all AML cases. Internal tandem duplications (ITD) are the most common type of pathogenic variants in FLT3 and occur in approximately 25% of all AML cases [[41], [42], [43]]. The FLT3-ITD status is important for genetic risk classifications in AML patients according to European LeukemiaNet (ELN) [44]. Fragment (length) analysis in combination with Sanger sequencing is also used to determine the somatic hypermutation (SHM) status of clonotypic immunoglobulin heavy variable (IGHV) in CLL. SHM status of IGHV is also an important prognostic biomarker of therapy response [45]. Methodological recommendations on how to obtain optimal and comparable results including starting material, template production, primer sequences and reporting output have been developed, published and updated over the past decades [[45], [46], [47]]. Fluorescence labeled primers can be used for detection of monoclonal B-cell populations via capillary electrophoresis. After identification of clonal populations, unlabeled amplified PCR products are used for high-quality sequencing on both strands. 
The sequences obtained are aligned to public database sequences from the international ImMunoGeneTics information system (www.imgt.org) to determine SHM status and identify stereotyped subsets.
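Returning to the FLT3-ITD example, the two read-outs of fragment analysis mentioned above (size of the tandem duplication and mutant allele burden) can be derived from the wild-type and mutant peaks. The Python sketch below uses hypothetical peak sizes and areas, and the class and function names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Peak:
    size_bp: float  # fragment size from comparison with the internal standard
    area: float     # peak area in arbitrary fluorescence units

def summarize_itd(wild_type: Peak, mutant: Peak) -> dict:
    """Derive ITD length and relative mutant burden from two peaks."""
    return {
        # The duplicated segment makes the mutant fragment longer than wild type
        "itd_length_bp": mutant.size_bp - wild_type.size_bp,
        # Mutant-to-wild-type ratio of peak areas
        "allelic_ratio": mutant.area / wild_type.area,
        # Mutant share of the total signal
        "mutant_fraction": mutant.area / (mutant.area + wild_type.area),
    }

# Hypothetical electropherogram peaks
result = summarize_itd(Peak(size_bp=330, area=8000), Peak(size_bp=381, area=2000))
print(result)
# {'itd_length_bp': 51, 'allelic_ratio': 0.25, 'mutant_fraction': 0.2}
```

Note that the allelic ratio and the mutant fraction are different quantities, so a report should state which one is given.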

For the detection of rare fusions, rare breakpoints, or fusions with unknown partner genes, targeted RNA sequencing is the method of choice in many diagnostic laboratories [48,49]. As in RT-PCR assays, RNA is reverse transcribed into cDNA, which is then analyzed by next-generation sequencing. For the detection of a fusion gene, only one of the two affected genes needs to be on the panel. Targeted RNA panels can increase sensitivity by providing higher coverage in cancer-specific gene regions without raising costs. Limitations of targeted RNA sequencing include the restricted set of fusion partners covered by the panel, detection limits for some fusions (e.g. those involving IGH or EPOR) and the focus on fusion detection alone.

DNA-based next generation sequencing (NGS) can be performed using either a panel-approach (targeted DNA sequencing) or a broader genomic analysis, like whole exome sequencing (WES) or whole genome sequencing (WGS). Targeted DNA sequencing and WES follow the same principle: (1) library preparation, (2) sequencing and (3) data analysis. The enrichment of specific genomic regions reduces analysis costs and therefore can be used to increase sequencing depth. A high sequencing depth is important for patients with a low blast count, e.g. MDS [50]. This can also be especially important when searching for aberrations in minor clones at diagnosis, which can rapidly grow into major clones in relapse [[51], [52], [53]]. In comparison to targeted DNA sequencing, the WES approach provides the ability to analyze all coding regions. Both targeted DNA sequencing and WES can be used for the detection of SNVs (e.g. SRSF2:c.284C > G) and CNVs (e.g. FLT3-ITD).
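Why sequencing depth matters for minor clones can be illustrated with a simple binomial model. The Python sketch below estimates the probability of seeing at least a minimum number of variant reads at a given depth and variant allele fraction (VAF). The model ignores sequencing errors and coverage variability, and the depths, the VAF of 2% and the threshold of five reads are hypothetical values chosen for illustration.

```python
from math import comb

def prob_detect(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least `min_alt_reads` variant reads) under a binomial model.

    Each read is assumed to show the variant independently with
    probability `vaf`; sequencing errors are ignored.
    """
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below

# Compare hypothetical depths for a variant at 2% VAF, requiring 5 variant reads
for depth in (100, 500, 2000):
    print(depth, round(prob_detect(depth, vaf=0.02, min_alt_reads=5), 3))
```

Under these assumptions, the detection probability rises steeply with depth, which is the rationale for spending the sequencing budget on a restricted set of regions when low-level variants are expected.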

In particular, SNVs have been detected in patients with myeloid neoplasia (e.g. AML, MDS, MPN) and are important for diagnosis and treatment stratification. For example, polycythemia vera (PV) belongs to the group of myeloproliferative BCR::ABL1-negative neoplasms (MPN). According to the WHO classification, the presence of JAK2 p.V617F (c.1849G > T p.Val617Phe) or pathogenic variants in JAK2 exon 12 in combination with one additional major criterion (e.g., elevated hemoglobin concentration) are sufficient for the diagnosis of PV. Furthermore, SNVs play an important role in therapy resistance. Pathogenic variants in the ABL1 gene are known to decrease drug binding affinity and therefore result in TKI resistance in CML and BCR::ABL1-positive ALL patients [33,54]. As for newly diagnosed MM, although molecular genetic analyses are not included in the R–ISS–Score calculation [13], recurrent SNVs are described in genes of the MAPK signaling pathway like KRAS, NRAS and BRAF, as well as TP53 mutations [55]. While SNVs in KRAS or NRAS are associated with specific treatment response or disease progression [56,57] mutations in TP53 can worsen the prognosis especially in cases of double-hits together with the R–ISS relevant high risk marker of a 17p deletion [58]. This also applies to TP53 mutations in CLL, as those are associated with unfavorable prognosis even if only present in minor subclones [59].

The development of optical genome mapping (OGM) has enabled various cytogenetic analyses to be combined into one genome-wide method [60,61]. This analysis requires the isolation of ultra-high molecular weight DNA. The DNA is labeled with fluorescent dyes at regions containing repetitive DNA motifs, after which the linearized DNA backbone and the resulting molecular barcode are detected using fluorescence imaging. The data obtained are aligned against a reference genome and analyzed for structural variations (SVs) and copy number variations (CNVs) [60,61]. In comparative analyses, OGM showed high concordance with classical (molecular) cytogenetic methods in various cohorts of different entities [61,62,63]. In addition, the genome-wide approach also enables the detection of new aberrations that would have been overlooked in classical diagnostic routines [61,64]. Nevertheless, compared with conventional karyotyping and FISH, this method is limited by higher costs and longer analysis times.

Whole transcriptome sequencing (WTS) follows the same methodological principle as targeted RNA sequencing. WTS allows the detection of fusions, gene expression and SNVs in one approach, and it can be used to analyze both coding RNA and multiple forms of noncoding RNA. The simultaneous identification of these alterations in a single assay gives WTS a broader scope than the RNA-based methods discussed earlier (e.g. targeted RNA sequencing or multiplex RT-PCR). Different studies have shown that known risk-stratifying fusions can be detected using WTS [65,66], as well as new fusions with as yet unknown therapeutic potential [67,68,69]. For example, B-ALL is currently subdivided into 23 subtypes, and the all-in-one approach of WTS allows the detection of novel biological ALL subtypes such as BCR::ABL1-like ALL, ETV6-like ALL and DUX4-rearranged ALL, as well as subtypes defined by sequence mutations [49,70,71,72]. However, high costs, long turnaround times and complex data handling that requires trained bioinformaticians limit the broader use of WTS.

Whole genome sequencing (WGS) follows the same methodological principle as targeted DNA sequencing and WES. WGS allows the analysis of the whole genome, including coding, non-coding and mitochondrial DNA. In contrast to targeted DNA sequencing and WES, WGS has the advantage that all types of genetic alteration (SVs, CNVs and SNVs) can be analyzed simultaneously. However, as with WTS, the current use of WGS in routine diagnostics is limited by high costs (particularly when the aim is to detect small clones), long turnaround times and complex data handling. Nevertheless, because of their comprehensive coverage of the genome, genome-wide analysis approaches have recently been adopted for many entities of hematologic neoplasia [37,73,74,75,76].

Long-read sequencing is a DNA- or RNA-based method that sequences much longer DNA or RNA fragments than conventional short-read (∼150 bp) sequencing techniques [77]. Its advantages are the detection of complex structural variants and the analysis of highly repetitive elements [78]. Long-read sequencing has been used in patients with hematological malignancies such as AML, CLL and CML [79,80,81]. For example, in the analysis of patients with BCR::ABL1-positive leukemia, long-read sequencing demonstrated advantages in sensitivity, cost and time over Sanger sequencing [80]. Long-read sequencing can also be used to detect pathogenic variants in the ABL1 kinase domain and makes it possible to determine the clonal configuration of multiple pathogenic variants.

Numerous sequence variants can be identified using the molecular genetic methods described above. To evaluate the identified variants, the American College of Medical Genetics and Genomics (ACMG) has established standards and guidelines for the interpretation of sequence variants. Variants are classified into five categories: pathogenic, likely pathogenic, uncertain significance (VUS), likely benign and benign. This classification system is based on typical types of variant evidence (e.g. functional data, population data) [82,83]. The ACMG criteria were designed specifically for the evaluation of germline variants in Mendelian disorders; however, they are also used for the evaluation of somatic variants.

In recent years, different publications have provided further classification criteria for the evaluation of somatic variants. For example, Li et al. proposed a four-tier system based on clinical significance, in which somatic sequence variants are categorized as variants with strong clinical significance (tier I), variants with potential clinical significance (tier II), variants of unknown clinical significance (tier III) and variants considered benign or likely benign (tier IV) [83]. Horak et al. provided a standard operating procedure for classifying the oncogenicity of somatic variants. Similar to the ACMG criteria, variants are classified into five categories: oncogenic, likely oncogenic, uncertain significance (VUS), likely benign and benign [84].

In summary, depending on the question to be investigated and the respective disease entity, a range of suitable methods exists to detect different clinically relevant genetic/genomic features. Especially in developing countries, increasing use of robust and cost-effective methods can lead more quickly to improved quality of treatment and care for patients. It also enables larger-scale international studies, because methods become more comparable across larger patient populations. Nevertheless, WTS, WGS and long-read sequencing provide the greatest overall insight into the spectrum of genetic alterations in hematological malignancies, albeit with a greater financial and time burden and a need for specific expertise. Thus, particularly in a research context, WTS, WGS and long-read sequencing represent the best basis for finding new therapeutic approaches, subgroup-specific differences and prognostic markers.
