Abstract:
Accurate classification of solitary pulmonary nodules (SPNs) as benign or malignant is critical for lung cancer diagnosis and timely treatment. Traditional machine learning approaches rely on hand-crafted features for this task. This study explores the use of large language models (LLMs) for automated feature engineering to improve SPN malignancy classification with a Random Forest classifier. A Random Forest model was first trained on a baseline dataset containing five standard radiological features. Multiple LLMs, including GPT-4.0, Gemini, and others, were then prompted to propose up to five new, clinically plausible features derived from or related to the original features. The suggested features were added to the baseline features to form new feature sets, each of which was evaluated by accuracy, sensitivity, and specificity. All LLM-enhanced feature sets improved on the baseline classifier (88.52% accuracy), with the best results achieved using features proposed by GPT-4.0: 94.64% accuracy, 96.16% sensitivity, and 93.54% specificity. Recurrent high-impact features included the SUVmax-to-Diameter Ratio, Margin Irregularity Index, and Nodule Growth Rate. These results suggest that LLMs hold considerable promise for automated feature engineering in clinical machine learning. Their ability to generate medically interpretable, performance-enhancing features can accelerate model development and improve diagnostic accuracy in lung cancer screening.
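
To make the derived features and evaluation metrics concrete, the following is a minimal sketch, not the study's implementation. It assumes the SUVmax-to-Diameter Ratio is simply SUVmax divided by nodule diameter in millimetres, and that labels are coded 1 for malignant and 0 for benign. The `Nodule` class, its field names, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Nodule:
    # Hypothetical field names; the abstract does not list the baseline features.
    suv_max: float      # maximum standardized uptake value
    diameter_mm: float  # nodule diameter in millimetres


def suvmax_to_diameter_ratio(nodule: Nodule) -> float:
    """Assumed definition: SUVmax divided by diameter (mm)."""
    if nodule.diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    return nodule.suv_max / nodule.diameter_mm


def evaluate(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity); 1 = malignant, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, missed
    accuracy = (tp + tn) / len(pairs)
    # Undefined when a class is absent; report NaN rather than raise.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(suvmax_to_diameter_ratio(Nodule(suv_max=8.0, diameter_mm=20.0)))
    print(evaluate([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

In a full pipeline, a derived column such as this ratio would be appended to the baseline feature matrix before the Random Forest is retrained, and `evaluate` would be applied to predictions on held-out cases.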
