Posts classified under: Biomedical Knowledge Representation

Data-driven Diagnostic Decision Support

We are building a prototype of SmartDx, a next-generation decision support system that learns the optimal sequence of diagnostic tests for a patient’s unique characteristics and circumstances. The goal of this project is to optimize clinical pathways for individual patients with respect to diagnostic accuracy, timeliness, and cost, using routine data collected longitudinally in the electronic health record. We are developing and validating machine learning algorithms that adapt by discovering the clinical features most informative for identifying the appropriate diagnostic test for the individual. As new information becomes available, the tool updates the personalized sequence of tests, again weighing the cost, timeliness, and accuracy of test results. Our efforts are driven by practical decision support questions about optimal paradigms for breast and lung cancer screening, which are at the forefront of radiology today.

The project is organized around three key tasks: 1) development of novel adaptive learning methods that discover, in real time, the features most predictive of the subsequent actions taken; 2) exploration of deep reinforcement learning approaches that not only discover the next best diagnostic test to order but also identify the additional information needed to make a definitive diagnosis; and 3) assessment of methods for making decision support models more transparent by communicating the rationale and uncertainty associated with model predictions. Our objectives are to: 1) determine what combination of diagnostic procedures (e.g., imaging, labs, biopsy) should be used, and in what sequence, to achieve an accurate and timely diagnosis; and 2) demonstrate that such pathways can be learned from real-world clinical data, allowing our methodology to be applied in realistic scenarios that require learning from incomplete and inconsistent data.
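
To make the idea of choosing a "next best diagnostic test" concrete, the sketch below ranks candidate tests by expected information gain per unit cost. It is a deliberately simple greedy stand-in, not the deep reinforcement learning approach described above and not SmartDx's actual method. The `DiagnosticTest` class, the function names, and the sensitivity, specificity, and cost figures are all hypothetical, and timeliness is not modeled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticTest:
    """Hypothetical description of a binary diagnostic test."""
    name: str
    sensitivity: float  # P(positive result | disease present)
    specificity: float  # P(negative result | disease absent)
    cost: float         # arbitrary cost units (illustrative only)


def entropy(p: float) -> float:
    """Binary entropy, in bits, of a disease probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def posterior(prior: float, test: DiagnosticTest, positive: bool) -> float:
    """Bayes update of the disease probability after one test result."""
    if positive:
        like_disease, like_healthy = test.sensitivity, 1 - test.specificity
    else:
        like_disease, like_healthy = 1 - test.sensitivity, test.specificity
    evidence = like_disease * prior + like_healthy * (1 - prior)
    return like_disease * prior / evidence if evidence > 0 else prior


def expected_information_gain(prior: float, test: DiagnosticTest) -> float:
    """Expected reduction in diagnostic uncertainty (bits) from running the test."""
    p_pos = test.sensitivity * prior + (1 - test.specificity) * (1 - prior)
    expected_after = (
        p_pos * entropy(posterior(prior, test, True))
        + (1 - p_pos) * entropy(posterior(prior, test, False))
    )
    return entropy(prior) - expected_after


def next_best_test(prior: float, tests: list[DiagnosticTest]) -> DiagnosticTest:
    """Greedy choice: the test with the most expected information per unit cost."""
    return max(tests, key=lambda t: expected_information_gain(prior, t) / t.cost)


if __name__ == "__main__":
    candidates = [
        DiagnosticTest("imaging", sensitivity=0.80, specificity=0.80, cost=1.0),
        DiagnosticTest("biopsy", sensitivity=0.95, specificity=0.95, cost=10.0),
    ]
    choice = next_best_test(prior=0.5, tests=candidates)
    print(choice.name)
```

A greedy rule like this looks only one step ahead. A sequential method such as reinforcement learning can instead account for how one result changes the value of later tests, which is the gap the second task targets.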

Making Biomedical ML Reproducible

The confluence of data-driven machine learning (ML) approaches and increased computational power, alongside access to the wealth of electronic health records (EHRs) and other emergent types of data (e.g., omics, imaging, mHealth), is accelerating the development of biomedical predictive models. Such models range from traditional statistical approaches (e.g., regression) to more advanced deep learning techniques (e.g., convolutional neural networks, CNNs), and span different tasks (e.g., biomarker/pathway discovery, diagnosis, prognosis). Two issues have become evident: 1) because there are no comprehensive standards to support the dissemination of these models, scientific reproducibility (vs. replicability) is problematic, given challenges in interpretation and implementation; and 2) as new models are put forth, methods to assess differences in performance, as well as insights into external validity (i.e., transportability), are necessary. Tools are needed that move beyond data sharing and model “executables” to capture the information required to fully reproduce a model and its evaluation. The objective of this R01 is the development of PREMIERE (PREdictive Model Index and Exchange REpository), an informatics standard that captures the information required for scientific reproducibility of statistical and ML-based biomedical predictive models.