Abstract
This study addresses data scarcity in medical AI, focusing on the diagnosis of Non-Small-Cell Lung Cancer (NSCLC) from Positron Emission Tomography (PET) and Computed Tomography (CT) images. We introduce MedScanGAN, a conditional Generative Adversarial Network that generates high-fidelity synthetic PET and CT images of Solitary Pulmonary Nodules (SPNs) for use in computer-aided diagnosis systems. The architecture combines residual blocks and spectral normalization with training-stabilization strategies. MedScanGAN produces images, particularly PET images, realistic enough that medical professionals could plausibly mistake them for real scans. More importantly, augmenting the training sets of established deep learning models (YOLOv8, VGG-16, ResNet, and MobileNet) with the synthetic data significantly improves NSCLC classification performance. Accuracy increased by up to 5.8 percentage points, and YOLOv8 achieved the best results on the augmented dataset: 94.14% accuracy, 93.12% specificity, and 95.33% sensitivity. Because generation is conditioned on the class label, the model can synthesize images of underrepresented classes on demand, which directly addresses class imbalance. Overall, this work demonstrates state-of-the-art medical image synthesis and its practical value for improving real-world diagnostic systems, bridging generative AI research and clinical pulmonary oncology.
