Lung cancer remains the leading cause of cancer-related mortality worldwide, with non-small cell lung cancer (NSCLC) accounting for approximately 85% of all lung cancers[1]. Despite advances in immunotherapy and targeted therapies, overall survival has historically been low. Late diagnosis remains a major driver of poor outcomes, while treatment resistance, tumor heterogeneity, and immunologically “cold” tumor microenvironments further limit the effectiveness of available therapies.
This landscape is beginning to shift with the introduction of low-dose computed tomography (LDCT) screening in high-risk populations. Large screening trials have demonstrated that detecting lung cancer earlier can drastically improve long-term survival, with 20-year survival rates reaching up to 81% among patients diagnosed with early-stage (stage I) disease[2]. While adoption varies across healthcare systems, LDCT screening programs are increasingly being incorporated into clinical guidelines and national screening initiatives.
As screening becomes more widespread, clinicians are increasingly confronted with a new challenge: interpreting large numbers of imaging studies, characterizing pulmonary nodules, and integrating imaging findings with pathology and molecular data to guide treatment decisions.
In this evolving landscape, AI-driven approaches are beginning to support multiple steps in the NSCLC care pathway, from improving the detection and characterization of lung nodules on imaging to enhancing pathology analysis and enabling more precise risk prediction. As AI becomes increasingly integrated into oncology workflows, balancing innovation with rigorous validation and human oversight will be essential to ensuring that these technologies deliver meaningful clinical benefit.
Smarter Imaging and Earlier Detection
As LDCT screening expands, clinicians are increasingly faced with large volumes of imaging data and pulmonary nodules that must be detected, characterized, and monitored over time. In reality, imaging is almost always the first step in the diagnostic pathway, whether lung cancer is detected through screening, incidental findings, or clinical symptoms, further compounding the volume of imaging data that must be interpreted in clinical practice.
CT, PET-CT, and MRI all play a central role in detecting pulmonary nodules, assessing malignancy risk, and guiding decisions about biopsy and further diagnostic evaluation. AI models trained on large imaging datasets are now demonstrating performance that rivals, and in some cases surpasses, experienced radiologists.[2]
Deep learning systems applied to low-dose CT screening have achieved area-under-the-curve (AUROC) values as high as 0.94 for lung cancer detection. AUROC is a statistical measure of how well a model can distinguish between two outcomes (in this case, whether a lung nodule is malignant or benign), with values ranging from 0.5 (no better than chance) to 1.0 (perfect discrimination). In practical terms, stronger model discrimination translates into fewer missed cancers and fewer unnecessary follow-up imaging studies or invasive diagnostic procedures. Some models reduce false positives (benign nodules incorrectly flagged as suspicious) by approximately 11%, while reducing false negatives by 5% compared with conventional radiologist-led screening workflows.[2]
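To make the AUROC concept concrete: it equals the probability that a randomly chosen malignant nodule receives a higher model score than a randomly chosen benign one. The short sketch below computes it directly from that definition; the scores and labels are invented for illustration, not drawn from any real model.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical malignancy scores for six nodules (1 = malignant, 0 = benign)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.90, 0.80, 0.45, 0.55, 0.30, 0.20]
print(auroc(labels, scores))
```

A perfect model that ranks every malignant nodule above every benign one scores 1.0; random guessing scores 0.5, which is why 0.94 represents strong discrimination.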
Beyond detection, AI is also beginning to help estimate an individual’s future risk of developing lung cancer (risk stratification), and support decisions about how closely suspicious findings should be monitored over time (longitudinal monitoring).
One example is the Sybil model, a deep learning system designed to analyze low-dose CT scans from lung cancer screening programs. Unlike conventional approaches that focus only on visible nodules, Sybil evaluates patterns across the entire CT scan to estimate a person’s likelihood of developing lung cancer in the coming years, even when no obvious tumor is present.
Using a single CT scan, the model can predict lung cancer risk one to six years into the future. External validation studies report AUROCs between 0.75 and 0.81, suggesting that AI-based forecasting could help personalize screening intervals by identifying individuals who require closer monitoring while allowing others to safely extend the time between scans.[2]
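As a toy illustration of how a risk estimate like Sybil's could inform screening cadence, the sketch below maps a predicted one-year cancer risk to a follow-up interval. The thresholds and intervals are entirely hypothetical and are not clinical guidance or part of any published protocol.

```python
def screening_interval_months(risk_score):
    """Map a model's estimated 1-year lung cancer risk to a follow-up
    interval. Thresholds are hypothetical, for illustration only."""
    if risk_score >= 0.05:
        return 6   # high risk: short-interval follow-up
    if risk_score >= 0.01:
        return 12  # intermediate risk: annual screening
    return 24      # low risk: extended interval between scans
```

The point is structural rather than numerical: a calibrated continuous risk score lets screening programs allocate scanner capacity to the patients most likely to benefit from closer surveillance.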
Taken together, these advances have the potential to transform how imaging is used in lung cancer care. By improving early detection, reducing diagnostic uncertainty, and supporting more personalized screening strategies, AI-assisted imaging may help clinicians identify disease earlier while minimizing unnecessary procedures.
Transforming Pathology
Once a suspicious lesion is identified through imaging, diagnosis relies on tissue analysis. Biopsy samples are examined to determine tumor histology and to perform molecular testing for clinically relevant mutations. Increasingly, AI is being used to assist pathologists in this complex and time-intensive process.[3][2]
Deep learning models applied to digital pathology slides can distinguish between major NSCLC subtypes, including adenocarcinoma and squamous cell carcinoma, with approximately 95% accuracy, comparable to expert human interpretation. These models are trained on large collections of high-resolution digital scans of entire pathology slides (whole-slide images), allowing them to learn and recognize subtle differences in tumor cell morphology and tissue architecture that characterize different lung cancer subtypes.[3]
Beyond tumor classification, AI models are beginning to predict molecular alterations directly from routine hematoxylin and eosin (H&E) slides, the standard stains used in pathology to visualize cell morphology and tissue architecture under a microscope. Because H&E staining is performed for nearly all biopsy samples, these slides represent one of the most widely available sources of diagnostic information in cancer care.
One example is the EAGLE model, a deep learning system trained to analyze digital H&E pathology slides and identify visual patterns associated with specific genetic mutations. In NSCLC, the model has been used to predict the presence of EGFR mutations, which play an important role in determining eligibility for targeted therapies.
Studies report AUROCs ranging from 0.85 to 0.89 for EGFR mutation prediction. While molecular testing remains the gold standard for confirming these mutations, AI-based approaches may help identify which cases are most likely to harbor EGFR alterations, allowing clinicians to prioritize testing for this biomarker. In real-world evaluations, the implementation of EAGLE has reduced reflex molecular testing by 43% while maintaining high sensitivity, enabling faster results, and lowering diagnostic costs without compromising reliability.[3]
AI has similarly improved the consistency of PD-L1 immunohistochemistry scoring, achieving a correlation of 0.94 with expert pathologist scores. This improved reproducibility is particularly important because PD-L1 expression plays a critical role in determining eligibility for immunotherapy.[3]
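Agreement between AI-derived and pathologist PD-L1 scores is commonly quantified with a correlation coefficient such as Pearson's r (the source does not specify the exact metric used). A minimal sketch of that computation follows; the tumor proportion scores below are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two paired score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical PD-L1 tumor proportion scores (%): AI model vs. pathologist
ai_scores = [5, 20, 50, 80, 95]
pathologist_scores = [10, 25, 45, 75, 90]
print(pearson_r(ai_scores, pathologist_scores))
```

Values near 1.0 indicate that the two readers rank and scale cases almost identically, which is what matters when a fixed PD-L1 cutoff gates access to immunotherapy.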
Importantly, these technologies are not intended to replace pathologists. Instead, they function as precision tools that enhance diagnostic consistency, reduce variability, and support more efficient pathology workflows.
Beyond Staging: Risk Stratification and Survival Prediction
Traditional staging in NSCLC relies heavily on the TNM system, which categorizes disease based on tumor size, lymph node involvement, and metastasis. While TNM staging remains essential for clinical decision-making, it does not fully capture the biological heterogeneity of lung tumors.
AI-driven models are now integrating imaging, pathology, and molecular data to generate more nuanced predictions of disease progression and survival.
For example, CT-based models that analyze imaging features have achieved AUROCs of approximately 0.70 for predicting survival outcomes, compared with roughly 0.60 for models based only on traditional clinical variables such as stage and patient characteristics. Pathomics models, which extract quantitative features from digital pathology slides, have reported five-year survival prediction AUROCs between 0.64 and 0.85.[4]
One particularly promising development is the use of federated learning, an approach that allows models to be trained across multiple institutions without sharing sensitive patient data. The FedCPI model achieved an AUROC of 0.9255 and an accuracy of 89.09% for predicting early-stage disease progression at one center, outperforming both traditional TNM staging and clinical logistic regression models.[4]
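FedCPI's internals are not detailed here, but the core mechanism behind federated learning can be sketched with the standard federated averaging (FedAvg) step: each hospital trains on its own patients, and only model weights, never raw patient data, are sent to a coordinating server, which averages them weighted by local cohort size. All names and numbers below are illustrative.

```python
def federated_average(client_updates):
    """Aggregate locally trained model weights without pooling raw data.
    client_updates: list of (num_local_samples, weight_vector) pairs."""
    total = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    return [
        sum(n * w[i] for n, w in client_updates) / total
        for i in range(dim)
    ]

# Three hypothetical hospitals with different cohort sizes,
# each contributing its locally trained two-parameter model
updates = [
    (100, [0.2, 0.4]),  # hospital A
    (300, [0.1, 0.5]),  # hospital B
    (600, [0.3, 0.6]),  # hospital C
]
print(federated_average(updates))
```

Larger cohorts pull the global model more strongly toward their local solution, which is why the averaging is weighted by sample count rather than treating each site equally.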
For patients and clinicians alike, improved risk prediction could support more personalized treatment decisions and better informed discussions about prognosis.
Improving Workflows and Efficiency
Beyond diagnostic accuracy, AI is also improving efficiency across oncology workflows.
In radiotherapy planning, AI-assisted segmentation and treatment planning can significantly reduce preparation times for image-guided radiotherapy. In pathology laboratories, AI-assisted screening tools can streamline diagnostic workflows and reduce unnecessary molecular testing, in some cases by as much as 43%.
Federated learning frameworks further enable hospitals to collaborate on model development without directly exchanging sensitive patient data, improving model generalizability while maintaining patient privacy. These efficiency gains are particularly valuable in busy oncology centers where timely diagnosis and treatment decisions are critical.[5][2]
The Limitations We Cannot Ignore
Despite its promise, AI in NSCLC faces several important challenges.
Data bias remains a major concern. Many models are trained using datasets from single institutions or relatively homogeneous populations. When applied to new patient groups or imaging systems, model performance can decline by 5 to 10 AUROC points due to differences in scanners, patient demographics, or clinical practices.
Explainability also remains an ongoing challenge. Many AI systems operate as “black boxes,” generating predictions without transparent reasoning. Tools such as saliency maps attempt to visualize model decisions, but these methods sometimes highlight irrelevant artifacts rather than biologically meaningful features.
Finally, clinical integration presents practical hurdles. Regulatory approval, workflow compatibility, and clinician trust all influence whether AI tools are adopted in routine practice. Demonstrating consistent value will be essential before widespread implementation becomes feasible.[5][2]
The Path Forward
For AI to deliver equitable and reliable impact in NSCLC care, rigorous validation standards will be essential.
Models must undergo external validation, calibration testing, and decision curve analysis to ensure reliability across diverse clinical settings. Fairness audits are also necessary to ensure that performance remains consistent across patient populations.
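Decision curve analysis, one of the validation tools mentioned above, compares a model's clinical usefulness against "treat all" and "treat none" strategies at a chosen risk threshold. The sketch below implements the standard net-benefit formula; the cohort counts are hypothetical.

```python
def net_benefit(true_pos, false_pos, n_total, threshold):
    """Net benefit at a given risk threshold: the per-patient benefit of
    true positives minus the harm-weighted cost of false positives,
    where the weight is threshold / (1 - threshold)."""
    weight = threshold / (1.0 - threshold)
    return (true_pos - false_pos * weight) / n_total

# Hypothetical screening cohort of 1,000 patients evaluated at a
# 10% risk threshold (the odds at which a clinician would act)
print(net_benefit(true_pos=80, false_pos=150, n_total=1000, threshold=0.10))
```

A model only adds clinical value at thresholds where its net benefit exceeds both default strategies, which is why decision curve analysis complements discrimination metrics like AUROC rather than replacing them.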
Improved explainability methods, including deletion curves and concept-based analyses, may help increase transparency and clinician confidence. Most importantly, prospective clinical studies will be required to demonstrate real-world benefit.
AI is unlikely to replace physicians, but when deployed responsibly it can augment clinical judgment, reduce diagnostic variability, and support more personalized care at scale.
At Helix BioPharma, we believe that integrating advanced analytics with strong validation frameworks and human expertise is essential to unlocking AI’s potential in lung cancer care. By combining innovation with scientific rigor, the goal is to help advance smarter, safer, and more equitable approaches to NSCLC diagnosis and treatment.
References:
1. https://pmc.ncbi.nlm.nih.gov/articles/PMC10308181/
2. https://pubs.rsna.org/doi/10.1148/radiol.231988
3. Li Y, Chen D, Wu X, Yang W, Chen Y. A narrative review of artificial intelligence-assisted histopathologic diagnosis and decision-making for non-small cell lung cancer: achievements and limitations. J Thorac Dis. 2021;13(12):7006-7020. doi:10.21037/jtd-21-806
4. Huang Z, Feng B, Chen Y, et al. Risk stratification for early-stage NSCLC progression: a federated learning framework with large-small model synergy. Front Oncol. 2025;15:1719433. Published 2025 Dec 16. doi:10.3389/fonc.2025.1719433