Medical Summary API

Clinical Assistant: Enter the detailed radiology findings below to generate an automated abstractive summary.

Impression summary will load below text box.

Impression:


Project Documentation

Model & Framework

  • Model: Falconsai/medical_summarization (T5-base)
  • Framework: Hugging Face Transformers / PyTorch
  • Backend: FastAPI

Data Cleaning

Data for the model was gathered from the XML data published at https://openi.nlm.nih.gov/faq?download=true. The reports were loaded into a dataframe and cleaned of extraneous whitespace (duplicate and trailing) and of censored text: the source masks personal information with "XXXX" placeholders, which were removed in keeping with responsible AI practices. Columns that contained any null values were dropped from the dataframe. The cleaned data was then split into train and test sets and transformed into a Hugging Face Dataset for the model.
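The text-cleaning step described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the helper name clean_text, the choice to delete "XXXX" placeholders rather than the surrounding text, and the tidying of stranded punctuation are all assumptions.

```python
import re

# Assumption: censored spans appear as runs of four or more "X" characters.
CENSORED = re.compile(r"\bX{4,}\b")
WHITESPACE = re.compile(r"\s+")
# Assumption: punctuation left stranded by a removed placeholder is tidied too.
SPACE_BEFORE_PUNCT = re.compile(r"\s+([.,;:])")


def clean_text(text: str) -> str:
    """Remove censor placeholders, then collapse duplicate and trailing whitespace."""
    text = CENSORED.sub(" ", text)
    text = WHITESPACE.sub(" ", text).strip()
    return SPACE_BEFORE_PUNCT.sub(r"\1", text)


if __name__ == "__main__":
    print(clean_text("  The heart is  normal in XXXX size .  "))
```

In a dataframe workflow, a function like this would typically be applied to each text column before the train/test split.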

Tuning & Hyperparameters

The model used for the project was T5 Large for Medical Text Summarization from Falcons AI. It was chosen after an initial test of the T5-small model. T5-small produced an average ROUGE score of 0.42, but on inspection it appeared to be simply repeating the first sentence or two of the findings as the impression. A couple of hyperparameter changes corrected this behavior but reduced the ROUGE score to 0.12. The switch to the Falcons AI model was made because it had been pretrained on large amounts of medical data, which should make it better suited to the summarization task. The final run used a learning rate of 3e-5 (after an initial run with 2e-5) and increased the number of epochs from 3 to 5.
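The copying behavior described above (an impression that just repeats the opening of the findings) can be checked automatically instead of by eye. The sketch below is one hypothetical way to do that; lead_copy_ratio is an illustrative helper, not part of the project, and it assumes simple lowercase whitespace tokenization.

```python
def lead_copy_ratio(findings: str, prediction: str) -> float:
    """Fraction of the prediction's tokens that match the opening of the findings, token by token."""
    source = findings.lower().split()
    output = prediction.lower().split()
    if not output:
        return 0.0
    matched = 0
    for src_tok, out_tok in zip(source, output):
        if src_tok != out_tok:
            break  # stop at the first token that diverges from the findings
        matched += 1
    return matched / len(output)


if __name__ == "__main__":
    findings = "Lungs are clear. Heart size is normal. No pneumonia."
    print(lead_copy_ratio(findings, "Lungs are clear. Heart size is normal."))
    print(lead_copy_ratio(findings, "No acute disease."))
```

A ratio near 1.0 across many test samples would suggest the model is echoing its input rather than summarizing it, even when the ROUGE score looks acceptable.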

API Reference

POST /get_summary

Submit clinical findings to receive an AI-generated impression.

// Request Body
{
    "findings": "Lungs are clear. Heart size is normal. No pneumonia."
}
// Success Response (200 OK)
{
    "impression": "No acute disease."
}

Error Handling

If the input is empty or invalid, the API returns a 400 Bad Request.

// Error Response (400)
{
    "detail": "Findings text cannot be empty."
}
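
As an illustration of how a client might call this endpoint and handle both response shapes, here is a minimal standard-library sketch. The base URL is hypothetical (the hosting location is not stated), and the helper names build_summary_request and parse_summary_response are illustrative only.

```python
import json
import urllib.request

# Hypothetical base URL; the document does not say where the service is hosted.
BASE_URL = "http://localhost:8000"


def build_summary_request(findings: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /get_summary request with a JSON body."""
    body = json.dumps({"findings": findings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/get_summary",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_summary_response(status: int, body: bytes) -> str:
    """Return the impression on 200; otherwise raise ValueError carrying the API's detail."""
    payload = json.loads(body)
    if status == 200:
        return payload["impression"]
    raise ValueError(f"HTTP {status}: {payload.get('detail', 'unknown error')}")
```

To send the request, pass it to urllib.request.urlopen. Note that urlopen raises urllib.error.HTTPError for a 400 response; that exception's code attribute and read() method supply the status and body expected by parse_summary_response.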

Metric    ROUGE-1    ROUGE-2    ROUGE-L
Score     0.5374     0.4496     0.5353
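
For readers unfamiliar with these metrics, the following sketch shows how a ROUGE-N F1 score is computed from n-gram overlap. It is a simplified illustration, assuming lowercase whitespace tokenization with no stemming, so it will not reproduce the scores above exactly, and it does not cover ROUGE-L (which uses the longest common subsequence). The function names are illustrative.

```python
from collections import Counter


def ngrams(text: str, n: int) -> Counter:
    """Count the n-grams in a lowercased, whitespace-tokenized string."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """F1 of clipped n-gram overlap between a reference and a candidate summary."""
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    overlap = sum((ref & cand).values())  # Counter intersection clips repeated n-grams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ref = "no acute cardiopulmonary disease"
    print(rouge_n_f1(ref, "no acute disease", n=1))
    print(rouge_n_f1(ref, "no acute disease", n=2))
```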

Project Limitations

  • Data Volume: Final dataset (~3,000 samples) limits the model's exposure to rare clinical pathologies.
  • Compute: CPU-only training required specific hyperparameter trade-offs to manage RAM and disk space, along with torch.no_grad() (which disables gradient tracking) during evaluation and inference.