Dr. Gordon S. Doig has produced an outstanding and comprehensive study comparing a very popular statistical method, logistic regression, with our neural networks. His study, Severity of Illness Scoring in the Intensive Care Unit: A Comparison of Logistic Regression and Artificial Neural Networks, was completed in April 1999 as his doctoral thesis at the University of Western Ontario, London, Ontario, Canada. Dr. Doig used genetic adaptive and backprop nets to make ICU outcome predictions. Rarely have we seen such a complete work. Dr. Doig is now at the London Health Sciences Centre, and we have reproduced the abstract of his work below.
Abstract
Purpose: To compare the predictive performance of a series of logistic regression models (LMs) to a corresponding series of back-propagation artificial neural networks (ANNs).
Location: A 30-bed adult general intensive care unit (ICU) that serves a 600-bed tertiary care teaching hospital.
Patients: Consecutive patients with a duration of ICU stay greater than 72 hours.
Outcome: ICU-based mortality.
Methods: Data were collected on day one and day three of stay using a modified APACHE III methodology. A randomly generated 811-patient developmental database was used to build models using day one data (LM1 and ANN1), day three data (LM2 and ANN2) and a combination of day one and day three data (LMOT and ANNOT). Primary comparisons were based on the area under the receiver operating characteristic curve (aROC) as measured on a 338-patient validation database. Outcome predictions were also obtained from experienced ICU clinicians on a subset of patients.
Results: Of the 3,728 patients admitted to the ICU during the period from March 1, 1994 through February 28, 1996, 1,181 qualified for entry into the study. There was no significant difference between LM and ANN models developed using day one data. The ANN developed using day three data performed significantly better than the corresponding LM (aROC LM2 0.7158 vs. ANN2 0.7845, p=0.0355). The time-dependent ANN model also performed significantly better than the corresponding LM (aROC LMOT 0.7342 vs. ANNOT 0.8095, p=0.0140).
The predictions obtained from ICU consultants (aROC 0.8210) discriminated significantly better than LMOT (aROC 0.6814, p=0.0015) but there was no difference between the consultants and ANNOT (aROC 0.8094, p=0.7684).
Conclusion: Although the 1,181 patients who became eligible for entry into this study represented only 32 percent of all ICU admissions, they accounted for 80 percent of the resources (costs) expended. ANNs demonstrated significantly better predictive performance in this clinically important group of patients. Four potential reasons are discussed: 1) ANNs are insensitive to problems associated with multicollinearity; 2) ANNs place importance on novel predictors; 3) ANNs automatically model nonlinear relationships; and 4) ANNs implicitly detect all possible interaction terms.
Keywords: intensive care, critical care, severity-of-illness, logistic regression, artificial neural networks, genetic algorithms, back-propagation, receiver operating characteristic, predictive model building.
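For readers less familiar with the aROC figures quoted in the abstract, here is a minimal Python sketch of how an area under the ROC curve can be computed from predicted risks and observed outcomes. It is not Dr. Doig's code, and the function name and the eight-patient example data are hypothetical, made up purely for illustration. The sketch uses the rank (Mann-Whitney) interpretation: the aROC is the probability that a randomly chosen non-survivor received a higher predicted risk than a randomly chosen survivor, with ties counted as one half.

```python
def area_under_roc(outcomes, scores):
    """Return the area under the ROC curve.

    outcomes: 1 for the event (here, ICU death), 0 otherwise.
    scores:   predicted risk of the event for the same patients.
    """
    # Split the predicted risks by observed outcome.
    event_scores = [s for y, s in zip(outcomes, scores) if y == 1]
    other_scores = [s for y, s in zip(outcomes, scores) if y == 0]
    if not event_scores or not other_scores:
        raise ValueError("need at least one patient in each outcome group")

    # Compare every event patient with every non-event patient.
    # A correctly ordered pair counts 1, a tied pair counts 0.5.
    concordant = 0.0
    for e in event_scores:
        for o in other_scores:
            if e > o:
                concordant += 1.0
            elif e == o:
                concordant += 0.5

    return concordant / (len(event_scores) * len(other_scores))


# Hypothetical example data (NOT from the study): 3 deaths, 5 survivors.
example_outcomes = [1, 1, 0, 0, 1, 0, 0, 0]
example_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.3, 0.6, 0.1]

if __name__ == "__main__":
    print(area_under_roc(example_outcomes, example_scores))
```

An aROC of 0.5 means a model discriminates no better than chance and 1.0 means perfect separation, which puts values such as 0.7158 and 0.7845 in context.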