Predicting Clinical Outcomes Across Changing Electronic Health Record Systems

Tuesday, 11/14/2017 4:00pm to 5:00pm
Computer Science Building, Room 150/151
Data Science Tea
Speakers: Jen Gong & Tristan Naumann (MIT)

Existing machine learning methods typically assume consistency in how semantically equivalent information is encoded. However, the way information is recorded in databases differs across institutions and over time, often rendering potentially useful data obsolete. To address this problem, we map database-specific representations of information to a shared set of semantic concepts, allowing models to be built from, or transitioned across, different databases. We demonstrate our method on machine learning models developed in a healthcare setting. In particular, we evaluate our method on two different intensive care unit (ICU) databases and two clinically relevant tasks: in-hospital mortality and prolonged length of stay. For both outcomes, a feature representation that maps EHR-specific events to a shared set of clinical concepts yields better results than using EHR-specific events alone.
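
As a rough illustration of the mapping idea described above (not the speakers' actual implementation), the sketch below translates events from two databases into a shared concept vocabulary before counting them as features. The event codes, concept names, and the `concept_features` helper are all hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical lookup tables: local event code -> shared concept name.
# All codes and concept names here are invented for illustration.
DB_A_TO_CONCEPT = {
    "A:211": "heart_rate",
    "A:618": "respiratory_rate",
    "A:50912": "creatinine",
}
DB_B_TO_CONCEPT = {
    "B:220045": "heart_rate",
    "B:220210": "respiratory_rate",
    "B:1525": "creatinine",
}


def concept_features(events, mapping):
    """Count how often each shared concept occurs in a patient's events.

    Events with no entry in the mapping are skipped.
    """
    return Counter(mapping[e] for e in events if e in mapping)


# Two patients with the same clinical events, recorded under different codes.
patient_a = ["A:211", "A:211", "A:50912", "A:999"]  # "A:999" is unmapped
patient_b = ["B:220045", "B:220045", "B:1525"]

features_a = concept_features(patient_a, DB_A_TO_CONCEPT)
features_b = concept_features(patient_b, DB_B_TO_CONCEPT)
```

Because both patients end up in the same concept space, a model trained on features from one database could, under these assumptions, be applied to features from the other. With raw database-specific codes, the two patients would share no features at all.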

Jen Gong is a Ph.D. candidate in Electrical Engineering and Computer Science at MIT working with Professor John Guttag in CSAIL's Data Driven Inference group. Her research focuses on transfer learning and multi-modal learning methods for improving clinical decision-making aids. In particular, her research looks at how different modalities of health care data (e.g., unstructured clinical notes, physiological time-series) and auxiliary sources (e.g., data from similar patient populations, expert-encoded ontologies) can be leveraged to improve risk models for adverse clinical outcomes.

Tristan Naumann is a Ph.D. candidate in Electrical Engineering and Computer Science at MIT working with Professor Peter Szolovits in CSAIL's Clinical Decision Making group. His research includes exploring relationships in complex, unstructured healthcare data using natural language processing and unsupervised learning techniques. He has been an organizer for workshops and datathon events, which bring together participants with diverse backgrounds in order to address biomedical and clinical questions in a manner that is reliable and reproducible.