Cancers, Vol. 14, Pages 5782: Non-Invasive Biomarkers for Early Lung Cancer Detection

Cancer is a leading cause of death worldwide, accounting for nearly 10 million deaths in 2020. Of all cancers, lung cancer (LC) is the second most common cancer type with 2.21 million new cases and 1.8 million deaths reported globally [1,2]. The overall 5-year survival rate remains low at 20%, mostly because of the advanced stage at the time of diagnosis [3], as early LC is asymptomatic in most cases. Patients with early disease often present with lung nodules or a mass revealed incidentally on chest X-rays or CT scans. In advanced stages, LC can cause symptoms due to local tumour invasion, loco-regional spread, distant metastasis, and in some cases paraneoplastic syndromes. Common symptoms include cough (50–75%), haemoptysis (25–50%), shortness of breath (25%), and chest pain (20%) [4].

According to the World Health Organization (WHO), 85% of lung cancers are classified as non-small cell lung cancer (NSCLC) while the remainder are diagnosed as small cell lung cancer (SCLC). Histologically, lung cancer is classified as adenocarcinoma (AC), squamous cell carcinoma (SCC, also abbreviated SqCLC), large cell cancer, and other/unspecified subtypes (Figure 1).

In addition to these subtypes, SqCLC (about 20%) is classified into keratinizing, non-keratinizing, and basaloid types. Neuroendocrine tumours comprise four categories: SCLC, large cell neuroendocrine carcinoma (LCNEC), typical carcinoid, and atypical carcinoid. SCLC (14% of all LCs) is further categorized into small cell carcinoma and combined small cell carcinoma. Both SCLC and LCNEC are high grade tumours. Carcinoid tumours are commonly located in central airways and further divided into two categories: typical carcinoid (low grade) and atypical carcinoid (intermediate grade). Adenosquamous carcinoma is rare and comprises 0.4–4% of LC cases [5].
The complexity of lung cancer is further increased by its highly heterogeneous nature.

The most important risk factor for LC is smoking, which accounts for approximately 90% of all cases of LC [6,7]. Therefore, most research in LC screening focuses on the early detection of LC in current and former smokers [8], in whom catching LC early yields the greatest benefit in terms of life expectancy and quality of life [8]. However, it is important to emphasise that LC screening is not an alternative to smoking cessation campaigns and that former smokers should continue to be encouraged to remain abstinent from smoking [9]. The discovery, development, and validation of different biomarkers for early detection of LC is likely to result in saving more lives in the not-too-distant future [10].

Common methods used for diagnosis and prediction of treatment response and disease progression include imaging and tissue biopsy. Both of these methodologies have their own limitations including cost, extensive patient preparation, risk of injury, invasiveness, exposure to radiation, and diagnostic bias due to heterogeneity. Therefore, it is imperative that non-invasive biomarkers are developed for screening/diagnosis, disease monitoring/prognostication, and prediction of response to treatments [11,12]. Numerous non-invasive biomarkers, such as DNA originating from tumour cells and circulating tumour cells (CTCs), proteins, lipids, RNAs, and microRNAs (miRNA), can be detected in bodily fluids such as plasma, serum, urine, saliva, ascites fluid, and CSF. Such cellular biomarkers are an area of extensive research mainly due to the ease of sampling and availability of validated sensitive technologies such as enzyme-linked immuno-sorbent assay (ELISA), polymerase chain reaction (PCR), next generation sequencing, colorimetric/electrochemical assays, and fluorescence methods [13,14,15,16].
However, due to the genomic instability and continuous evolution in lung cancer cells, a wide variation in expression of specified biomarkers is expected, and the ideal biomarker has not been found yet (Box 1). Therefore, researchers are continuously searching for robust biomarkers with high sensitivity (SN) and specificity (SP) that can be used in clinical settings for early diagnosis. This will help in effective timely interventions and better patient management [17]. Following the principles summarised in Box 2 will foster progress in the biomarker field.
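
Sensitivity (SN) and specificity (SP) recur throughout this review, alongside the positive and negative predictive values (PPV, NPV). As a minimal sketch, the standard definitions can be computed from the four cells of a 2x2 table. The class name, function names, and counts below are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from comparing a biomarker test against a reference diagnosis."""
    tp: int  # diseased, test positive
    fp: int  # disease-free, test positive
    fn: int  # diseased, test negative
    tn: int  # disease-free, test negative


def diagnostic_metrics(c: Confusion) -> dict[str, float]:
    """Return SN, SP, PPV and NPV using their standard definitions."""
    return {
        "SN": c.tp / (c.tp + c.fn),   # fraction of diseased who test positive
        "SP": c.tn / (c.tn + c.fp),   # fraction of disease-free who test negative
        "PPV": c.tp / (c.tp + c.fp),  # fraction of positives who are diseased
        "NPV": c.tn / (c.tn + c.fn),  # fraction of negatives who are disease-free
    }


def npv_from_prevalence(sn: float, sp: float, prevalence: float) -> float:
    """NPV via Bayes' rule for a given pre-test probability of disease."""
    true_neg = sp * (1.0 - prevalence)
    false_neg = (1.0 - sn) * prevalence
    return true_neg / (true_neg + false_neg)


# Hypothetical counts for illustration only (100 diseased, 100 disease-free).
example = Confusion(tp=80, fp=10, fn=20, tn=90)
metrics = diagnostic_metrics(example)
```

Because PPV and NPV depend on prevalence, the same SN and SP give different predictive values in a screening population than in a symptomatic one; `npv_from_prevalence` makes that dependence explicit.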

This review aims to provide information on several biomarkers that are in use or under investigation for the screening and early detection of lung cancer.

1.1. DNA Methylation in Sputum and Plasma for Early LC Detection

Because epigenetic changes are common in LC, they offer several targets that can be probed concurrently [18]. LC genome analysis reports global hypomethylation that results in the destabilisation of DNA with the exception of CpG dense regions [19,20,21]. In NSCLC, epigenetic changes are associated with cigarette smoking and aggressive tumour behaviour, and as such these changes can be used for risk stratification and histological and molecular characterisation [22,23,24,25,26,27,28,29]. Non-invasive, sputum-based testing for epigenetic changes/promoter DNA hypermethylation at early stages of tumorigenesis is well documented. Palmisano et al. showed that in sputum samples, collected 3 years prior to clinically detectable lung cancer, the hypermethylation of MGMT and/or CDKN2A genes could be effectively detected, indicating that epigenetic markers can indeed play a role in early cancer diagnosis [30]. This was validated in other studies as well [31,32,33]. Moreover, in a study of five participants, RASSF1A methylation, detected in sputum samples, correlated with the development of LCs within 12 to 14 months from the sputum test in three patients [34]. Similarly, a prospective study on 92 high risk individuals and a matched control group identified promoter methylation of 14 genes in the sputum that can be used for risk stratification. It was found that 6 of 14 genes correlated with a >50% increased LC risk. Furthermore, simultaneous methylation of three or more of these six genes correlated with 6.5-fold increased risk of LC [35]. These detected genes are involved in many important biological functions, such as cell cycle regulation (p16 and PAX5 β), apoptosis (DAPK and RASSF1A), signal transduction (GATA5), and DNA repair (MGMT) [35,36,37,38,39].

The detection of DNA methylation in plasma, as a tool for screening and diagnostic purposes in LC, has also shown promise.
Bearzatto et al. reported an increased frequency in p16INK4A methylation in plasma samples of early-stage adenocarcinoma [40]. Similarly, methylations of RASSF1A and CDKN2A detected in blood samples were frequently identified in early-stage LC with a reported sensitivity of 22 to 66% and specificity of 57–100% [40,41,42]. Another study on 70 participants showed significant differences in the methylation pattern between LC and benign lung lesions. The participants who developed lung cancer showed methylation changes in four tumour suppressor genes, i.e., Kif1a, DCC, RARB, and NISCH. The differences were correlated with LC diagnosis, and it was observed that participants who were finally diagnosed with LC exhibited significant differences in methylation pattern [43]. Another, larger study on 360 participants showed similar results. The methylation status of PTGER4 and SHOX2 genes detected in the plasma of patients with indeterminate pulmonary nodules was distinct as compared to participants with benign lung nodules [44]. Therefore, integrating DNA methylation patterns (in plasma/sputum) as a screening tool in national LC screening programs is now needed to progress to novel algorithms for early LC detection. To this end, Kang et al. developed a probabilistic method called Cancer Locator, based on cfDNA detected in blood samples. The study utilized data from a genome-wide DNA methylation profile and DNA methylation microarrays of solid tumour samples to train the model. The model was able to identify the histological type and the site of the tumour together with cancer load in NSCLC [45]. The study could not offer firm conclusions because of small sample numbers; however, the authors foresaw that when more paired samples (tumour sample and the matched adjacent non-tumour sample) become available, Cancer Locator could identify not just the existence but also the location of the tumour [45].

1.2. The Role of microRNAs in LC Detection

miRNAs are small non-coding RNAs of 18–25 nucleotides in length which are involved in the post-transcriptional regulation of gene expression [46,47]. They are found to be aberrantly expressed in many pathological conditions, including cancer, and can be detected in bodily fluids including urine, sputum, and blood, making them exciting biomarkers for cancer detection [48,49]. In 2002, their role in LC pathogenesis (proliferation of LC cells, invasion of basement membrane, and metastasis) was reported by Calin et al. [50]. Interestingly, depending on the cellular context, miRNAs can act as tumour suppressors, as oncogenes, or as both [51,52]. Moreover, miRNAs preserve their stability throughout cancer progression from initiation to metastasis, because their small size makes them resistant to degradation, and some miRNAs are further protected in exosomes. Hence, miRNAs are considered an appealing biomarker for cancer diagnosis and monitoring [53]. Several studies listed in Table 1 investigated miRNAs from different biofluid sources including sputum and serum/plasma for LC biomarker detection [54]. Previous studies of miRNAs in sputum showed that four miRNAs, miR-486, miR-21, miR-200b, and miR-375, can differentiate lung adenocarcinoma patients from healthy individuals with an SN of 80.6% and SP of 91.7% [55]. Furthermore, a combination of miR-205, miR-210, and miR-708 in sputum samples was able to discriminate squamous cell carcinoma patients from healthy controls with an SN of 73% and SP of 96% [56].

In an early study by Yanaihara et al., 12 microRNAs including miR-17-3p, miR-21, miR-106a, miR-146, miR-155, miR-191, miR-192, miR-203, miR-205, miR-210, miR-212, and miR-214 were identified as potential biomarkers for distinguishing cancers from benign lung tissues and as molecular markers, as they are expressed differently in different malignant tissues [66]. Another study by Wozniak et al.
[64] showed that a combination of 24 miRNAs was able to discriminate LC cases from healthy controls. The authors suggested that the overexpression of the above miRNAs in plasma can serve as a biomarker for the early detection of NSCLC and should be investigated further [64].

On the other hand, a study on circulating miRNA profile in the serum samples of 82 pre-operative LC patients, paired samples taken 10 days post-operatively (before and after tumour removal), and 50 healthy participants showed increased expression of four miRNAs (miR-21, miR-205, miR-30d, miR-24) before surgery compared to after surgery and healthy participants. The researchers proposed that these four miRNAs have the potential to be used as biomarkers for post-operative disease relapse [61]. The same miRNAs were upregulated in the serum of early-stage LC patients in comparison to normal volunteers, suggesting that measuring their serum levels could potentially be extended for screening of high-risk subjects. As the serum levels of miR-21 and miR-24 were lower in post-operative compared to pre-operative patients, this feature should be investigated as a tool for monitoring disease recurrence in the post-operative setting [61]. In a similar study by Leidinger et al. [63], plasma miRNA levels were measured before surgery and at subsequent regular intervals up to 18 months post-surgery, with a reported significant correlation between miRNA expression level and time elapsed since surgery. The study indicated that, over time, the expression of specific miRNAs decreased. The post-surgery analysis of all miRNAs revealed a general reduction shortly after surgery and then a rise at disease progression. A network analysis showed that 12 miRNAs involved in controlling the regulation of 48 genes were deregulated in LC tissue and the level of miRNA expression change after surgery correlated with post-operative patients’ outcome and presence or absence of metastatic disease [63,67].
Therefore, due to the ability of miRNAs to change according to treatment dynamics, it is postulated that miRNA can be used for LC monitoring and can provide prognostic information. A major issue with studies of miRNA as a tool for LC screening and early detection is the differences in protocols for sample collection and processing, combined with different assays for measuring miRNA expression employed by different studies, which result in variability in the obtained results. These differences in the methodologies should be taken into consideration as they potentially underlie the general lack of overlap in the miRNAs identified among the above-mentioned studies.

Another non-coding RNA type, circRNAs, which have a stable covalently closed circular structure and show a specific expression pattern in different tissues and cells, have also been implicated in LC growth and progression [7]. However, the exact mechanisms remain poorly understood and require more in-depth studies [8]. Using technologies such as RNA-seq and Ribo-Zero, thousands of circRNAs have been discovered [7], and it is predicted that valid circRNA biomarkers for diagnosis, prognosis, and therapy in LC will increasingly be found. A better understanding of the exact role of circRNAs in the pathogenesis of LC will likely also lead to improvement of the detection of “clinically significant” circRNAs and understanding of the temporal relationship between such circRNAs and the development of preinvasive or early LC.

1.3. The Role of Circulating Tumour DNA (ctDNA) in LC

Circulating tumour DNA (ctDNA) includes both encapsulated (in circulating vesicles) and non-encapsulated free DNA in the blood or other body fluids [68]. ctDNA escapes cancer cells via several mechanisms, namely apoptosis, necrosis, and secretion from extracellular vesicles as well as from CTCs [69,70].
Therefore, analysing ctDNA is a promising approach that could accelerate efforts for body fluid-based LC detection and overcome some of the challenges posed by invasive tissue biopsy, as summarised in Table 2.

An important feature of ctDNA is that it can be found in blood prior to clinical diagnosis [80]. Advances in DNA sequencing technologies have made it possible to detect ctDNA before clinically evident LC [81]. However, a major challenge in using ctDNA is that most patients have ctDNA levels of less than 0.1% [82,83]. Nonetheless, new techniques have continuously been developed and tested to improve the detection of ctDNA in low concentrations in plasma. There is also evidence of a positive correlation between disease burden and the plasma concentration of ctDNA [81]. A study by Jacob et al. [80] used deep sequencing (CAPP-Seq) with an improved protocol for recovering unique cfDNA fragments and cfDNA duplexes, allowing both strands to be sequenced [80]. The authors genotyped tumour tissue, analysed pre-treatment cfDNA in plasma and leukocyte DNA from 85 subjects diagnosed with stage I–III NSCLC using targeted deep sequencing of 255 frequently mutated genes in NSCLC, and reported that most somatic mutations in the cfDNA of LC patients and of risk-matched cohorts reflect clonal haematopoiesis and are non-recurrent. In contrast with tumour-derived mutations, clonal haematopoiesis mutations are present on longer cfDNA fragments and do not show mutational signatures that correlate with tobacco smoking. Incorporating these results with other tumour characteristics such as cell proliferation and lymphovascular invasion, the authors applied and prospectively validated a machine-learning-based method called “LC likelihood in plasma” (Lung-CLiP) [82].
Three control groups were used as a validation cohort: a low-risk group of 42 adult blood donors, a risk-matched control group of 56 adults matched for age, sex, and smoking status who had negative low-dose CT (LDCT) screening scans, and a third group comprising 48 risk-matched participants receiving LDCT screening recruited prospectively at a different centre. The study reported that Lung-CLiP successfully differentiates early-stage LC patients from risk-matched cohorts, with an overall 80% SP and SN of 63% in stage I, 69% in stage II, and 75% in stage III patients. Lung-CLiP performance was comparable to that of tumour-informed ctDNA detection, allowing tuning of assay specificity for the screening and early diagnosis of LC. The authors concluded that the potential of cfDNA for LC screening is strongly emerging and highlighted the significance of risk-matching LC cases and control groups in studies utilising cfDNA-based screening to account for hidden biases. The study proposed that Lung-CLiP could be used for high-risk subjects who decline LDCT due to concerns regarding false positives, limited access, and radiation exposure, by referring only individuals with a positive Lung-CLiP test for further LDCT screening. One study suggested that this approach of integrating Lung-CLiP with LDCT could increase the number of lives saved in the US from LC from about 600 to approximately 12,000 by increasing the sensitivity of Lung-CLiP in detecting early lung cancer [84]. The study also noted a correlation between the pre-treatment levels of ctDNA and clinical outcomes, which might signify micro-metastasis even in early stages of LC, indicating the benefit of neoadjuvant and adjuvant systemic therapies. Another study reported a novel plasma-based assay for the diagnosis of early-stage LC exploiting high-throughput targeted DNA methylation sequencing of ctDNA [85].
The researchers established a methylation profile by high-throughput DNA bisulfite sequencing in tissue samples (nodule diameter of less than 85]. One of the key shortcomings of molecular analysis by studying ctDNA is that it provides no information on histology; therefore, invasive biopsy will be required to make a histological diagnosis of LC. False-negative results from analysing ctDNA are a further important issue in the context of low tumour load or low rate of shedding of ctDNA to the systemic circulation [86]. Moreover, the precision of the data acquired by analysing ctDNA is affected by the location of the metastatic disease. A pooled analysis of EGFR-mutated NSCLC revealed that the detection rate of ctDNA EGFR mutation was considerably higher in patients with extrathoracic compared to intrathoracic lesions [79]. Furthermore, false-positive results can arise from ctDNA analysis, as mentioned above (molecular alterations originating from clonal haematopoiesis rather than the tumour) [87]. Incidental identification during ctDNA evaluation of germline mutations that are not linked to the pathogenesis of LC is not infrequent, and it mandates disclosure to the patient and referral to genetic counselling clinics [88]. For example, in the molecular analysis using ctDNA of 10,888 unselected patients with metastatic cancer (41% were lung malignancies), 1.4% were discovered to have possible hereditary cancer mutations in 11 genes [88]. Finally, technical aspects in relation to ctDNA specimen acquisition and handling can affect the quality of the data. Despite the many advantages of liquid biopsies (LBs) compared to tissue biopsies, the SN and SP of detecting specific molecular changes in NSCLC from LB remain affected by technology, clinical trial methodologies, and logistics, which in turn affect the safe and effective integration of LB into clinical practice [89].
In the first published systematic review of 34 studies involving 1141 patients with NSCLC, Esagian et al. reported the positive percent agreement (PPA) between LB and tissue biopsy in detecting common mutations using targeted NGS [90]. The authors stated that they used PPA rather than SN, SP, PPV, and NPV because NGS was not validated in all the studies they reviewed, and hence PPA was deemed more appropriate. The calculated PPA rates were 53.6% (45/84) for ALK, 53.9% (14/26) for BRAF, 56.5% (13/23) for ERBB2, 67.8% (428/631) for EGFR, 64.2% (122/190) for KRAS, 58.6% (17/29) for MET, 54.6% (12/22) for RET, and 53.3% (8/15) for ROS1. The above findings are consistent with other publications that concluded that the detection of specific mutations via NGS from LB is less sensitive compared to tissue biopsy [91,92].

1.4. Urine Cell-Free DNA (ucfDNA) in the Diagnosis of LC

Improvements in the knowledge and the technologies for the isolation and analysis of biomarkers from urine provide novel opportunities for the clinical applications of cancer urine biomarkers. The presence of biomarkers such as exfoliated bladder cancer cells, ctDNA, proteins, miRNAs, and exosomes in the urine has been investigated in the context of different primary cancers such as bladder, prostate, pancreas, and lung; the cost-effectiveness and convenience of use make urine biomarkers attractive choices for patients and physicians alike [93,94,95,96]. Using urine biomarkers for assessing treatment efficacy and resistance is a major advantage when compared to tissue biopsies and radiological imaging [97]. Furthermore, another advantage of urine biomarker analysis is that cfDNA extraction is technologically easier than from plasma [97,98,99], as urine contains a lower concentration of interfering proteins [100].
The evidence for the reliability and sensitivity of the detection of gene mutations and DNA methylation in the urine is growing, especially as the technologies used are consistently undergoing refinement [101,102,103].

Methods for the extraction and classification of urinary constituents are diverse and range from protein and genomic profiling to microfluidic techniques [104]. In recent years, the use of EGFR mutation detection and subsequent mutation profiling in patients with metastatic NSCLC who might be eligible to receive first and second lines of anti-EGFR tyrosine kinase inhibitors (TKIs) has grown rapidly. A study by Reckamp et al. showed that EGFR mutations (T790M, L858R, and exon 19 deletions) were successfully identified in the urine of NSCLC patients and the results were congruent with the EGFR mutation state identified through tissue biopsy [105]. A comparative study was reported by Ren et al., who measured the concentration of ucfDNA, using qPCR, in 55 LC patients and a cohort of 35 healthy participants [106]. The study reported that the concentration of ucfDNA is consistently higher in LC patients, especially with lymph node involvement, compared to the healthy cohort, suggesting that ucfDNA could potentially play a role in the early diagnosis of LC [106]. Another study compared the ucfDNA of 55 NSCLC patients of different disease stages with 35 healthy volunteers by means of quantitative real-time PCR (qPCR) [107]. The study showed that concentrations of ucfDNA were considerably greater in individuals with stage III/IV than in those with stage I/II and the disease-free cohort. The receiver operating characteristic curves (ROCs) for distinguishing participants with stage III/IV from disease-free volunteers showed areas under the curve (AUCs) of 0.84 and 0.88, respectively.
In another study [106], ucfDNA concentration and integrity indexes were explored as biomarkers for early LC detection. The cohort included 55 LC patients and 35 healthy participants. The study found that concentration and integrity indexes of ucfDNA were considerably higher in LC patients compared to the healthy individuals. Moreover, the ucfDNA integrity indexes in patients with metastasis to lymph nodes were significantly higher compared with patients without lymph node involvement, suggesting that ucfDNA could potentially play a role in the early diagnosis of LC [106].

1.5. RNA Airway and Nasal Signature

The analysis of RNA acquired from airway samples centres on gene expression profiles of cancer-associated processes affecting the tracheobronchial tree [108]. A study identified a 23-gene biomarker panel from endobronchial brushings of patients who underwent bronchoscopy to investigate LC [109]. Subsequently, two separate prospective cohorts showed an SN of 88% to 89% and an SP of 48% for such a gene-expression classifier. As biomarkers, these 23 genes were especially indicative of possible underlying cancer in patients with an intermediate (10–60%) pre-test risk of LC (91% negative predictive value, NPV). These results suggest that the NPV of a negative bronchoscopy could be improved if combined with the 23-gene panel, which could potentially circumvent the need for invasive lung biopsy by monitoring such patients with less invasive tests such as follow up CT scans [110]. In another study by the Aegis Study Team [111], the same concept of “field of injury” was used to investigate samples of nasal epithelial cells. The main advantage of this approach is bypassing the need for bronchoscopy. The investigators developed a 30-gene nasal expression panel for the detection of LC among smokers with suspected LC. This approach showed improvement in AUC, SN, and NPV if combined with clinical risk models.
The study showed that combining clinical factors (age, smoking status, time since smoking cessation, tumour mass size) and the expression of the 30 genes from nasal cavity had a statistically significantly higher AUC (0.81; 95% confidence interval (CI) = 0.74 to 0.89, p = 0.01) and SN (0.91; 95% CI = 0.81 to 0.97, p = 0.03) than a clinical-factor only model [111].

1.6. Radiomics Signatures of Primary and Secondary Pulmonary Malignant Lesions

In the past decade, medical imaging has progressed from chiefly being a primary diagnostic tool to acquiring an important role in providing vital molecular data required for targeted therapy through the adoption of advanced hardware, novel imaging agents, streamlined scanning protocols, and improvements in computational power [112]; thus, we will briefly discuss its role here. The technological advances have enabled the extraction and processing of a large amount of data from quantitative imaging, in a process called radiomics [112]. By utilising a characterisation algorithm, radiomics has the potential to unveil disease features that cannot be seen by the naked eye [113]. The process of radiomics involves obtaining sub-visual, yet quantitative, image characteristics in order to produce usable datasets from radiological films [114]. Radiomics data extracted from medical scans (e.g., CT and MRI scans) can be utilised to discover diagnostic, predictive, and prognostic data in patients with malignancy through comparison with objective response criteria such as overall and progression-free survival, and can also be combined with tumour molecular and genetic profile (genotype); the latter is referred to as radiogenomics [115].
The process of converting medical imaging into meaningful data typically involves four steps: (a) image acquisition and reconstruction, (b) region of interest segmentation, (c) feature extraction and quantification, and (d) building predictive and prognostic models, as illustrated in Figure 2.

As a new technology, radiomics is in its infancy; therefore, its clinical application is still limited. In the context of primary LC, there is significant interest in using radiomics to predict the histological and molecular characteristics, response to treatment, and overall prognosis. Several studies have been able to identify specific radiomics signatures that differentiate NSCLC from other benign and pre-invasive lesions, including the prediction of EGFR status and response to treatment with TKI [116,117,118,119,120,121,122,123], as well as histological subtype. For example, a retrospective study of 148 patients with histologically confirmed NSCLC found thirteen radiomics features that predict histological subtype (ALC vs. SqCLC) with AUCs of 0.819 and 0.824, respectively [124]. Several studies of radiomics signatures have reported features distinguishing benign from cancerous lung pathologies and are shown in Table 3.
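
The four radiomics steps described above (acquisition, segmentation, feature extraction, modelling) can be sketched on a toy example. This is only an illustration under strong assumptions: the 5x5 grid stands in for a CT slice, a simple intensity threshold replaces real segmentation, the features are basic first-order statistics, and the model weights are hypothetical and untrained. None of it reflects the methods of the cited studies.

```python
import math
from statistics import mean, pvariance

# (a) Image acquisition: a toy 5x5 intensity grid standing in for a CT slice.
image = [
    [0, 0, 0, 0, 0],
    [0, 5, 6, 0, 0],
    [0, 7, 9, 8, 0],
    [0, 0, 6, 0, 0],
    [0, 0, 0, 0, 0],
]


def segment(img, threshold):
    """(b) Region of interest: pixels whose intensity exceeds the threshold."""
    return [(r, c) for r, row in enumerate(img)
            for c, value in enumerate(row) if value > threshold]


def extract_features(img, roi):
    """(c) First-order features computed over the segmented region."""
    values = [img[r][c] for r, c in roi]
    return {
        "area": float(len(values)),       # number of pixels in the ROI
        "mean": mean(values),             # average intensity
        "variance": pvariance(values),    # population variance of intensity
        "max": float(max(values)),        # peak intensity
    }


def predict(features, weights, bias):
    """(d) A logistic model mapping features to a score between 0 and 1."""
    z = bias + sum(weights[name] * features[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


roi = segment(image, threshold=4)
features = extract_features(image, roi)
# Hypothetical, untrained weights chosen only to make the example run.
score = predict(features, weights={"area": 0.1, "mean": 0.2}, bias=-2.0)
```

In practice each step is far more involved (for example, volumetric segmentation and hundreds of texture features), and the model in step (d) would be fitted and validated on labelled patient data rather than given fixed weights.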

To conclude, radiomics offers a tangible opportunity for even wider use of medical imaging in oncology, especially for difficult-to-access lesions or lesions in patients in whom invasive lung biopsy could be detrimental.
