Data Science Hub Machine learning/AI
Our interdisciplinary team combines expertise in machine learning, natural language processing, artificial intelligence, and healthcare to develop and apply innovative tools for analyzing clinical notes. Leveraging state-of-the-art algorithms and techniques, we aim to extract valuable insights from unstructured clinical text data, facilitating improved patient care, clinical decision-making, and medical research. Our efforts enhance the efficiency, accuracy, and efficacy of healthcare delivery by unlocking the wealth of information contained in electronic health records and other clinical documentation.
Meet the team
Yi Zhong
Hyun-Hwan Jeong
Hu Chen
A recent case study
Diagnosing genetic disorders requires extensive manual curation and interpretation of candidate variants, a labor-intensive task even for trained geneticists. Although artificial intelligence (AI) shows promise in aiding these diagnoses, existing AI tools have achieved only moderate success for primary diagnosis. We developed AI-MARRVEL (AIM), which uses a random-forest machine-learning classifier trained on over 3.5 million variants from thousands of diagnosed cases. It also incorporates expert-engineered features into training to recapitulate the intricate decision-making processes in molecular diagnosis. In this study, we benchmarked AIM on diagnosed patients from three independent cohorts and found it was more accurate than existing methods for genetic diagnosis. We also demonstrated its potential for novel disease gene discovery by correctly predicting two newly reported disease genes from the Undiagnosed Diseases Network. This study was published in NEJM AI.
Our tools:
CLAMP (the Clinical Language Annotation, Modeling, and Processing Toolkit) — a natural language processing tool specifically designed to analyze clinical text, enabling tasks such as named entity recognition, relation extraction, and classification in clinical notes.
ClinPhen — a fast, high-accuracy algorithm that scans clinical notes and generates a prioritized list of patient phenotypes.
ChatGPT — a conversational artificial intelligence model built on the Generative Pre-trained Transformer (GPT) architecture that can be adapted to analyze clinical notes, assisting with tasks such as summarization, classification, and question answering in healthcare documentation.
PhenoGPT — a specialized version of the GPT language model designed to analyze clinical text, enabling tasks such as phenotype extraction, disease coding, and clinical decision support in electronic health records.
AI-MARRVEL — a knowledge-driven AI system for diagnosing Mendelian disorders.
Get in touch with us
Texas Children's Hospital researchers can submit a ticket here.
Baylor College of Medicine researchers can submit a ticket here.
Researchers from other institutions can contact us at researchdata@texaschildrens.org