Overview

Medical documents inherently contain Personally Identifiable Information (PII), which is crucial for associating records with individual profiles and building comprehensive longitudinal data. At the same time, securely redacting PII is vital for workflows such as data anonymization for AI model training and secure document sharing. EkaCare’s advanced PII extraction API empowers businesses to seamlessly identify and manage PII, enabling secure, efficient, and compliant handling of sensitive medical data.

EkaCare’s PII extraction solution uses a customised vision-LLM to accurately extract structured data such as name, age, gender and medical facility name. Designed specifically for the Indian healthcare ecosystem, the solution offers a high level of accuracy and does not require a human in the loop.

This service offers:

  • Extraction of:
    • Patient Name
    • Patient Age
    • Patient Gender
    • Doctor Name
    • Mobile Numbers
    • Facility Names
    • Dates
  • Ability to work with PDFs as well as scanned/clicked images of prescriptions

Use Cases

  • Improved medical profiling: Seamlessly associate medical documents with individual patient profiles along with other critical information such as age, gender and dates.
  • PII Redaction: Ensure data anonymization for secure document sharing and AI training workflows.
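The redaction use case can be sketched as follows. This is a minimal illustration and not EkaCare's API: the `extracted` dictionary, its keys and the `redact` helper are all hypothetical, and the sketch assumes the document text is already available as a plain string. Very short values such as an age are left out of the example because literal matching on them would over-redact.

```python
import re
from typing import Mapping, Optional


def redact(text: str, extracted: Mapping[str, Optional[str]]) -> str:
    """Replace each extracted PII value found in `text` with a [FIELD] placeholder."""
    # Skip fields with no value, then handle the longest values first so a
    # short value never rewrites part of a longer one.
    items = [(field, value) for field, value in extracted.items() if value]
    items.sort(key=lambda item: len(item[1]), reverse=True)
    for field, value in items:
        # re.escape makes the value match literally (e.g. the dot in "Dr.").
        pattern = re.compile(re.escape(value), flags=re.IGNORECASE)
        text = pattern.sub(f"[{field.upper()}]", text)
    return text


# Hypothetical extraction result; the keys are illustrative only.
extracted = {
    "name": "Asha Rao",
    "doctor": "Dr. Mehta",
    "facility": "City Clinic",
    "mobile": "9876543210",
}
sample = "Patient: Asha Rao, seen by Dr. Mehta at City Clinic. Ph: 9876543210"
print(redact(sample, extracted))
# prints: Patient: [NAME], seen by [DOCTOR] at [FACILITY]. Ph: [MOBILE]
```

Literal string replacement only works when the extracted value appears verbatim in the text; scanned images would instead need region-level masking, which is outside the scope of this sketch.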

Technology Deep-Dive

Our PII extraction service is powered by our custom Large Language Models (LLMs), trained on millions of medical documents. These documents span diverse formats and contexts, with a particular focus on the Indian healthcare ecosystem.

Our rigorous training and fine-tuning process is aimed at high accuracy while minimizing common pitfalls, such as hallucinations, that often affect other state-of-the-art (SOTA) LLMs. The resulting reliability is demonstrated in the benchmarks in the next section.

Evaluation and Benchmarks

Our benchmark experiments, run on an evaluation dataset of thousands of documents, show that Eka’s model achieves higher overall accuracy than other SOTA models. Note that this evaluation dataset contains both PDFs and clicked images.

| Task | Parrotlet-V (Eka Care’s LLM) | OpenAI GPT-4o | Claude Sonnet 3.5 | Qwen2-VL (7B) | Llama-3.2-Vision (11B) | Phi-3.5-vision (4.2B) |
|---|---|---|---|---|---|---|
| PII extraction | 0.915 | 0.884 | 0.824 | 0.719 | 0.541 | 0.585 |

A more detailed, per-field view of the results of these experiments is summarised below.

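| Field Name | Parrotlet-V | GPT-4o | Claude Sonnet 3.5 |
|---|---|---|---|
| name | 0.954 | 0.929 | 0.93 |
| age | 0.955 | 0.894 | 0.619 |
| gender | 0.973 | 0.978 | 0.95 |
| dob | 0.974 | 0.973 | 0.956 |
| facility | 0.899 | 0.796 | 0.729 |
| document_date | 0.947 | 0.955 | 0.951 |
| doctor | 0.704 | 0.665 | 0.634 |
| Average | 0.915 | 0.884 | 0.824 |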
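
The Average row appears to be the unweighted mean of the seven per-field scores. That it is unweighted is an inference from the numbers rather than something stated in the source. The short check below reproduces the row from the table values.

```python
from statistics import mean

# Per-field scores copied from the table above, in the order:
# name, age, gender, dob, facility, document_date, doctor.
scores = {
    "Parrotlet-V": [0.954, 0.955, 0.973, 0.974, 0.899, 0.947, 0.704],
    "GPT-4o": [0.929, 0.894, 0.978, 0.973, 0.796, 0.955, 0.665],
    "Claude Sonnet 3.5": [0.93, 0.619, 0.95, 0.956, 0.729, 0.951, 0.634],
}

# Unweighted mean per model, rounded to three decimals like the table.
averages = {model: round(mean(vals), 3) for model, vals in scores.items()}
for model, avg in averages.items():
    print(f"{model}: {avg:.3f}")
# prints:
# Parrotlet-V: 0.915
# GPT-4o: 0.884
# Claude Sonnet 3.5: 0.824
```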

Try Out

Experience the power of EkaCare’s PII extraction with our developer-friendly API.

  1. Visit our API Documentation to get started.
  2. Upload a lab report or prescription and see our technology in action.
  3. Contact us for a custom demo tailored to your use case.

Ready to unlock the full potential of healthcare data? Get in Touch today.