Predictive Analysis Of OSCC Patient Status Using Machine Learning: A Focus On Lifestyle And Demographic Factors

Authors

  • Wan Muhamad Amir W Ahmad*, Mohamad Nasarudin Adnan, Farah Muna Mohamad Ghazali, Nor Azlida Aleng, Nurfadhlina Abdul Halim, Mohamad Shafiq Mohd Ibrahim

Abstract

Oral Squamous Cell Carcinoma (OSCC) is a major public health concern, with patient outcomes influenced by lifestyle and demographic factors. Accurate insight into these variables is essential to improve prognostic models and guide targeted therapies. Advanced computational methods can now predict OSCC outcomes accurately enough to support clinical decision-making.

Objective: This study aims to construct and validate a predictive model for survival outcomes in OSCC patients. By analyzing the influence of key variables, namely smoking status, age, betel quid use, and alcohol consumption, the study seeks to quantify each factor's contribution to survival probability in OSCC.

Materials and Methods: Conducted as a retrospective analysis, this study employed a machine learning framework, specifically a Multilayer Feedforward Neural Network (MLFFNN), to evaluate data from OSCC patients. Survival status (alive or dead) served as the outcome variable, while the predictors were smoking status, age, betel quid use, alcohol consumption, and sex. Model development was performed in R, using data normalization, bootstrap resampling, and systematic partitioning of the data into training, testing, and validation sets. The neural network's architecture was refined with hidden layers and a logistic activation function to achieve optimal predictive accuracy.

Results: The MLFFNN model identified smoking status as the most influential predictor of OSCC survival (24%), followed by age (16.91%), betel quid use (15.37%), sex (10.21%), and alcohol consumption (8.57%). On the validation data, the model achieved a Mean Absolute Error (MAE) of 0.2628, a Root Mean Squared Error (RMSE) of 0.3466, and an accuracy of 73.715%, indicating reliable predictive performance. The Mean Squared Error (MSE) on the testing data was 0.1638, further supporting the model's effectiveness in predictive tasks.

Conclusion: This study demonstrates the efficacy of an MLFFNN model in evaluating factors affecting OSCC survival outcomes. The results highlight the pronounced impact of lifestyle behaviours, especially smoking, on survival prognosis. The successful application of a neural network-based approach in R underscores the potential of computational models to contribute meaningfully to OSCC management, offering clinicians a data-driven tool to guide treatment decisions and intervention strategies.
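For readers less familiar with the error metrics reported in the abstract, the following is a minimal sketch of how MAE, MSE, and RMSE are computed for a binary outcome. It is written in Python for illustration only; the authors worked in R, and this is not their code. The function names and the five example patients are hypothetical, and the abstract does not state how the reported accuracy was derived, so accuracy is not computed here.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute gap between outcome and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error: average squared gap between outcome and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE on the same data."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical example (not study data): survival status coded 1 or 0,
# with predicted probabilities from some model with a logistic output.
y_true = [1, 0, 1, 1, 0]
y_pred = [0.9, 0.2, 0.6, 0.7, 0.4]

print("MAE :", round(mae(y_true, y_pred), 4))
print("MSE :", round(mse(y_true, y_pred), 4))
print("RMSE:", round(rmse(y_true, y_pred), 4))
```

Because RMSE is the square root of MSE on the same data, the two are directly comparable only within one dataset; the abstract reports RMSE on the validation data and MSE on the testing data, so those two figures describe different partitions.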

Published

2024-12-05

Section

Articles