Data-driven Diagnostic Decision Support

We are implementing a prototype next-generation decision support system called SmartDx, which learns the optimal sequence of diagnostic tests tailored to a patient’s unique characteristics and circumstances. The goal of this project is to optimize clinical pathways for diagnostic accuracy, timeliness, and cost for individual patients, using routine data collected longitudinally in the electronic health record. We are developing and validating machine learning algorithms that adapt by discovering the clinical features that are most informative in identifying the appropriate diagnostic test for the individual. The tool personalizes the sequence of tests as new information becomes available, re-optimizing for the cost, timeliness, and accuracy of test results. Our efforts are driven by practical decision support questions related to optimal paradigms for breast and lung cancer screening, which are at the forefront of radiology today.

The project is organized around three key tasks: 1) development of novel adaptive learning methods to discover, in real time, the most informative features that are predictive of subsequent actions; 2) exploration of deep reinforcement learning approaches that not only discover the next best diagnostic test to order but also identify additional information that is needed to make a definitive diagnosis; and 3) assessment of methods for making decision support models more transparent by communicating the rationale and uncertainty associated with model predictions. Our objectives are to: 1) determine what combination of diagnostic procedures (e.g., imaging, labs, biopsy) should be used, and in what sequence, to achieve an accurate and timely diagnosis; and 2) demonstrate that such pathways can be learned from real-world clinical data, allowing our methodology to be applied in realistic scenarios that require learning from incomplete and inconsistent data.
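
To make the trade-off between accuracy, timeliness, and cost concrete, the sketch below picks the next test for a single binary diagnosis by weighing the expected reduction in diagnostic uncertainty against cost and turnaround time. It is an illustration only and not the SmartDx method: it uses a greedy one-step rule instead of deep reinforcement learning, and the `Test` fields, the weights `cost_weight` and `delay_weight`, and all numbers in the usage lines are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Test:
    """Hypothetical description of a diagnostic test (illustrative fields)."""
    name: str
    sensitivity: float  # P(positive result | disease present)
    specificity: float  # P(negative result | disease absent)
    cost: float         # arbitrary cost units
    days: float         # turnaround time in days


def entropy(p):
    """Binary entropy, in bits, of a disease probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def posterior(prior, test, positive):
    """Bayes update of the disease probability after one test result."""
    if positive:
        num = test.sensitivity * prior
        den = num + (1 - test.specificity) * (1 - prior)
    else:
        num = (1 - test.sensitivity) * prior
        den = num + test.specificity * (1 - prior)
    return num / den if den > 0 else prior


def expected_information_gain(prior, test):
    """Expected drop in entropy (bits) from running the test."""
    p_pos = test.sensitivity * prior + (1 - test.specificity) * (1 - prior)
    h_after = (p_pos * entropy(posterior(prior, test, True))
               + (1 - p_pos) * entropy(posterior(prior, test, False)))
    return entropy(prior) - h_after


def next_test(prior, tests, cost_weight=0.001, delay_weight=0.01):
    """Greedy choice: highest information gain net of cost and delay penalties."""
    def utility(t):
        return (expected_information_gain(prior, t)
                - cost_weight * t.cost
                - delay_weight * t.days)
    return max(tests, key=utility)


# Hypothetical usage: the numbers are made up for illustration.
candidates = [
    Test("imaging", sensitivity=0.85, specificity=0.90, cost=100, days=1),
    Test("biopsy", sensitivity=0.98, specificity=0.99, cost=1500, days=5),
]
choice = next_test(prior=0.10, tests=candidates)
print(choice.name)
```

After a result comes back, `posterior` gives the updated probability, which can be passed back to `next_test` as the new prior. A learned policy would replace this fixed utility with one estimated from longitudinal record data.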