PyHealth

Comprehensive PyHealth automation and integration for healthcare AI and data science

PyHealth is a community skill for building clinical prediction models using the PyHealth library, covering health record processing, clinical task definition, deep learning models, evaluation metrics, and dataset management for healthcare machine learning applications.

What Is This?

Overview

PyHealth provides tools for developing machine learning models on electronic health record (EHR) data, with standardized pipelines for clinical prediction tasks. It covers five areas: health record processing, which transforms raw clinical data into patient-level feature sequences built from visits, diagnoses, and medications; clinical task definition, which configures prediction targets such as mortality, readmission, and drug recommendation; deep learning models, including RNN, transformer, and graph-based architectures for sequential clinical data; evaluation metrics with clinical relevance, including AUROC and precision at recall thresholds; and dataset management, which loads and preprocesses standard clinical benchmarks. Together, these let researchers build healthcare prediction models efficiently.

Who Should Use This

This skill serves health informatics researchers building predictive models from electronic health records, clinical data scientists developing risk scoring systems, and ML engineers working on healthcare applications that require standardized clinical data processing.

Why Use It?

Problems It Solves

Clinical data from EHR systems comes in proprietary formats and coding systems that require extensive preprocessing before modeling. Defining clinical prediction tasks requires domain-specific label extraction logic that is error-prone to implement from scratch. Comparisons of model architectures on clinical tasks often lack standardized evaluation protocols and data splits. Healthcare datasets contain temporal patient visits that general ML frameworks do not handle natively.

Core Highlights

Data processor transforms raw EHR records into model-ready patient sequences. Task definer configures clinical prediction targets with label extraction logic. Model library provides RNN, transformer, and graph architectures for clinical data. Evaluator measures model performance with clinically relevant metrics.
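As a rough illustration of what "model-ready patient sequences" means, the sketch below groups flat EHR-style rows into per-patient, chronologically ordered visits. It uses only the standard library and is not PyHealth's implementation; the row layout, field names, and the build_patient_sequences helper are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical flat rows, as they might come out of an EHR table export:
# (patient_id, visit_id, admit_date, code_type, code)
ROWS = [
    ("p1", "v2", "2020-03-01", "diagnosis", "4280"),
    ("p1", "v1", "2019-11-12", "diagnosis", "25000"),
    ("p1", "v1", "2019-11-12", "medication", "metformin"),
    ("p2", "v3", "2021-06-30", "diagnosis", "486"),
    ("p1", "v2", "2020-03-01", "medication", "furosemide"),
]


def build_patient_sequences(rows):
    """Group flat rows into patient -> chronologically ordered visits."""
    visits = defaultdict(dict)  # patient_id -> visit_id -> visit record
    for patient_id, visit_id, admit_date, code_type, code in rows:
        visit = visits[patient_id].setdefault(
            visit_id,
            {
                "visit_id": visit_id,
                "admit_date": admit_date,
                "diagnosis": [],
                "medication": [],
            },
        )
        visit[code_type].append(code)

    # ISO-8601 date strings sort chronologically as plain strings.
    return {
        patient_id: sorted(by_visit.values(), key=lambda v: v["admit_date"])
        for patient_id, by_visit in visits.items()
    }


sequences = build_patient_sequences(ROWS)
for patient_id, patient_visits in sequences.items():
    print(patient_id, [v["visit_id"] for v in patient_visits])
```

Each patient ends up with a list of visits, and each visit carries its own lists of diagnosis and medication codes, which is the general shape that sequence models for clinical data consume.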

How to Use It?

Basic Usage

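# Assumes the PyHealth 1.1.x-style API; names may differ in other releases.
from pyhealth.datasets import (
    MIMIC3Dataset,
    get_dataloader,
    split_by_patient,
)
from pyhealth.models import Transformer
from pyhealth.tasks import mortality_prediction_mimic3_fn
from pyhealth.trainer import Trainer

# Load MIMIC-III tables and map ICD-9-CM diagnosis codes to CCS categories.
dataset = MIMIC3Dataset(
    root='data/mimic3',
    tables=['DIAGNOSES_ICD', 'PROCEDURES_ICD', 'PRESCRIPTIONS'],
    code_mapping={'ICD9CM': 'CCSCM'},
)

# Turn patient records into labeled samples for mortality prediction.
samples = dataset.set_task(mortality_prediction_mimic3_fn)

# Split by patient so that no patient appears in more than one split.
train, val, test = split_by_patient(samples, [0.8, 0.1, 0.1])

# The trainer consumes DataLoaders, not the raw sample datasets.
train_loader = get_dataloader(train, batch_size=32, shuffle=True)
val_loader = get_dataloader(val, batch_size=32, shuffle=False)
test_loader = get_dataloader(test, batch_size=32, shuffle=False)

# Feature and label keys match the samples produced by the mortality task.
model = Transformer(
    dataset=samples,
    feature_keys=['conditions', 'procedures', 'drugs'],
    label_key='label',
    mode='binary',
)
trainer = Trainer(model=model)
trainer.train(train_dataloader=train_loader, val_dataloader=val_loader)

# Report metrics on the held-out test patients.
print(trainer.evaluate(test_loader))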

Real-World Examples

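# Assumes the PyHealth 1.1.x-style API; names may differ in other releases.
from pyhealth.models import RETAIN, RNN, Transformer
from pyhealth.trainer import Trainer


class ModelBenchmark:
    """Train several architectures on the same splits and collect test metrics."""

    # Assumption: PyHealth 1.1.x exposes no standalone GRU class; its RNN
    # model uses a GRU cell by default, so it backs the 'gru' entry here.
    MODELS = {
        'transformer': Transformer,
        'retain': RETAIN,
        'gru': RNN,
    }

    def __init__(self, dataset, train, val, test):
        # dataset is the sample dataset returned by set_task();
        # train, val, and test are expected to be DataLoaders.
        self.dataset = dataset
        self.splits = (train, val, test)
        self.results = {}

    def run(self, model_name: str, epochs: int = 20) -> dict:
        cls = self.MODELS[model_name]
        # Keys and mode match the mortality task; adjust for other tasks.
        model = cls(
            dataset=self.dataset,
            feature_keys=['conditions', 'procedures', 'drugs'],
            label_key='label',
            mode='binary',
        )
        trainer = Trainer(model=model)
        train, val, test = self.splits
        # epochs is an argument of train(), not of the Trainer constructor.
        trainer.train(
            train_dataloader=train,
            val_dataloader=val,
            epochs=epochs,
        )
        metrics = trainer.evaluate(test)
        self.results[model_name] = metrics
        return metrics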

Advanced Tips

Use code mapping to convert granular diagnosis codes into broader clinical categories, which reduces vocabulary size and can improve model generalization. Split data by patient rather than by sample to prevent information leakage from having the same patient in both training and test sets. Combine multiple clinical data tables to give models richer patient representations that capture diagnoses, procedures, and medications together.
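To make the patient-level split concrete, here is a minimal standard-library sketch of the idea. It is not PyHealth's split_by_patient; the sample fields and the split_samples_by_patient helper are hypothetical. The point is that patient IDs, not individual samples, are shuffled and partitioned.

```python
import random


def split_samples_by_patient(samples, ratios, seed=0):
    """Split samples into train/val/test so no patient spans two splits."""
    # Partition the unique patient IDs, not the samples themselves.
    patient_ids = sorted({s["patient_id"] for s in samples})
    random.Random(seed).shuffle(patient_ids)

    n = len(patient_ids)
    train_end = int(n * ratios[0])
    val_end = train_end + int(n * ratios[1])
    groups = (
        set(patient_ids[:train_end]),
        set(patient_ids[train_end:val_end]),
        set(patient_ids[val_end:]),
    )
    # Every sample follows its patient into exactly one split.
    return tuple(
        [s for s in samples if s["patient_id"] in group] for group in groups
    )


def patients(split):
    """Return the set of patient IDs present in a split."""
    return {s["patient_id"] for s in split}


# Hypothetical samples: 10 patients with 3 visits each.
samples = [
    {"patient_id": f"p{i}", "visit_id": f"p{i}-v{j}", "label": (i + j) % 2}
    for i in range(10)
    for j in range(3)
]

train, val, test = split_samples_by_patient(samples, [0.8, 0.1, 0.1])
print(len(train), len(val), len(test))
print(patients(train) & patients(test))  # empty set: no shared patients
```

A sample-level shuffle of the same data would almost certainly place visits from one patient on both sides of the split, which tends to inflate test metrics.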

When to Use It?

Use Cases

Build a mortality prediction model using diagnosis and medication sequences from electronic health records. Compare transformer and RNN architectures on hospital readmission prediction with standardized evaluation protocols. Process MIMIC-III clinical data into patient sequences for drug recommendation tasks.

Related Topics

PyHealth, clinical prediction, electronic health records, healthcare ML, MIMIC, deep learning, and medical informatics.

Important Notes

Requirements

PyHealth Python package with PyTorch backend for model training. Access to clinical datasets such as MIMIC-III with proper data use agreements. Sufficient memory for loading patient record sequences during training.

Usage Recommendations

Do: use patient-level splits to prevent data leakage in clinical prediction evaluation. Map diagnosis codes to standardized clinical categories for better model generalization across coding variations. Report clinically meaningful metrics like AUROC and AUPRC alongside accuracy for healthcare prediction tasks.
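The sketch below shows why accuracy alone can mislead on imbalanced clinical labels. It implements AUROC and average precision (a common estimate of AUPRC) in plain Python for illustration; the labels and scores are made up, and in practice a tested metrics library would normally be used instead of these helpers.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean precision at each positive's rank; assumes no tied scores."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Made-up cohort: 18 survivors (0) and 2 deaths (1), i.e. 10% prevalence.
labels = [0] * 18 + [1] * 2

# Model A gives everyone the same low risk, so it cannot rank patients.
constant_scores = [0.1] * 20

# Model B ranks both positives fairly high, but never crosses 0.5.
ranking_scores = [round(0.05 + 0.01 * i, 2) for i in range(18)] + [0.45, 0.155]

for name, scores in [("constant", constant_scores), ("ranking", ranking_scores)]:
    print(name, accuracy(labels, scores), round(auroc(labels, scores), 3))
print("ranking AP", round(average_precision(labels, ranking_scores), 3))
```

Both models reach the same accuracy because neither ever predicts the positive class at the 0.5 threshold, yet only the second one separates high-risk from low-risk patients, which is what AUROC and AUPRC capture.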

Don't: train models on clinical data without proper institutional approval and data use agreements. Avoid time-unaware splits that allow future patient data to appear in training sets. Never interpret model predictions as clinical decisions without validation by healthcare professionals.
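One way to keep future data out of training is a time-aware split with fixed cutoff dates, sketched below with the standard library. The sample fields, the cutoff dates, and the temporal_split helper are assumptions for illustration, not part of PyHealth.

```python
from datetime import date

# Hypothetical visit-level samples; field names are illustrative.
samples = [
    {"patient_id": "p1", "admit_date": date(2018, 5, 2), "label": 0},
    {"patient_id": "p2", "admit_date": date(2019, 1, 15), "label": 1},
    {"patient_id": "p3", "admit_date": date(2019, 9, 30), "label": 0},
    {"patient_id": "p4", "admit_date": date(2020, 3, 11), "label": 0},
    {"patient_id": "p5", "admit_date": date(2020, 8, 24), "label": 1},
    {"patient_id": "p6", "admit_date": date(2021, 2, 7), "label": 0},
]


def temporal_split(samples, train_end, val_end):
    """Train before train_end, validate up to val_end, test from val_end on."""
    train = [s for s in samples if s["admit_date"] < train_end]
    val = [s for s in samples if train_end <= s["admit_date"] < val_end]
    test = [s for s in samples if s["admit_date"] >= val_end]
    return train, val, test


train, val, test = temporal_split(
    samples,
    train_end=date(2020, 1, 1),
    val_end=date(2021, 1, 1),
)
print(len(train), len(val), len(test))
```

If a patient has visits on both sides of a cutoff, this sketch would place them in two splits, so a real pipeline would also need a rule for assigning such patients to a single split.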

Limitations

Model performance depends heavily on clinical data quality, which varies across institutions and EHR systems. PyHealth currently supports a limited set of clinical datasets, such as MIMIC-III, MIMIC-IV, eICU, and OMOP-formatted data, with MIMIC as the primary benchmark. Clinical prediction models trained on one hospital may not generalize to different patient populations.