The Challenge
The US-based healthcare company faced a daily influx of hundreds of provider
appointments, generating a substantial volume of free-text encounter assessments.
Although most electronic health record (EHR) data is structured, essential information
such as doctors' prescriptions and encounter assessments often remains in unstructured,
free-text form. These assessments can include critical details such as medication
information, referrals, and follow-ups. Converting this unstructured data into a
structured, classified format automatically and at scale was a significant challenge.
Solving it was crucial for efficient provider workload planning, medication monitoring,
and faster care actions.
The Solution
To tackle this challenge, a comprehensive solution was devised, encompassing the
following key steps:
Data Cleanup and Tokenization: The unstructured free-text encounter assessments of
patients were retrieved from the Athena Health Encounter API. Because the Encounter API
returned HTML, the markup was stripped to extract the assessment text. Tokenization and
sentence detection were then performed using the spaCy Python NLP library.
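The cleanup step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the team's actual code: the names html_to_text and split_sentences are hypothetical, and the regex-based sentence splitter is a naive stand-in for spaCy's sentence detection, which copes far better with clinical abbreviations and dosing shorthand.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores anything inside <script> or <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse the runs of whitespace left behind once tags are removed.
        return re.sub(r"\s+", " ", " ".join(self._chunks)).strip()


def html_to_text(html):
    """Strip markup from an HTML fragment and return plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    Assumption: used here only as a placeholder for spaCy sentence detection.
    """
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
```

Each sentence returned by the splitter would then be passed on to the classifier as a separate unit of text.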
Machine Learning Classification: A machine learning (ML) model was built using the Keras
library. The model classified the cleaned text into multi-class, multi-label categories:
Medication, Diagnosis, ‘Notes for Pharmacy,’ ‘Internal note for Staff,’ Referral, and
‘Note to patient.’ Preparing the unstructured free text for training involved
tokenization, cleaning, vectorization, and label encoding.
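Because the task is multi-label, a single piece of text can belong to several categories at once. A common way to handle this, assumed here rather than confirmed by the project, is to encode the targets as multi-hot vectors and to read the model output as one independent score per category, keeping every category whose score clears a threshold. The function names and the 0.5 threshold below are illustrative assumptions.

```python
# Category names taken from the case study; their order is an assumption.
LABELS = [
    "Medication",
    "Diagnosis",
    "Notes for Pharmacy",
    "Internal note for Staff",
    "Referral",
    "Note to patient",
]


def encode_labels(labels):
    """Turn a collection of category names into a multi-hot vector."""
    unknown = set(labels) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in LABELS]


def decode_scores(scores, threshold=0.5):
    """Map per-category scores back to names, keeping those at or above threshold."""
    if len(scores) != len(LABELS):
        raise ValueError("expected one score per label")
    return [name for name, score in zip(LABELS, scores) if score >= threshold]
```

With this encoding, a sentence that both prescribes a drug and refers the patient to a specialist would carry the Medication and Referral labels together.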
ML Model Optimization: Extensive experimentation was conducted with various model
architectures, hyperparameters, and text preprocessing techniques to optimize the ML
model’s performance.
Service Endpoint: A web API service endpoint was deployed on Azure Kubernetes Service,
using a three-node cluster with one node dedicated to load balancing.
Security Measures: Authentication and authorization to run the ML model were implemented
through Azure Active Directory and App Registration. Only authorized applications with
appropriate secret credentials could access the ML model.
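One way a calling application could obtain a token under this setup is the OAuth 2.0 client credentials flow of the Microsoft identity platform. The case study does not say which flow was used, so the sketch below is an assumption; build_token_request is a hypothetical helper, and the tenant, client, and scope values are placeholders. It only constructs the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id, client_id, client_secret, scope):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # For app-to-app calls the scope is usually "<resource>/.default".
            "scope": scope,
        }
    ).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The caller would send this request, read the access token from the JSON response, and pass it as a bearer token when calling the classification API. In practice the client secret would come from a secret store rather than from source code.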
Integration with Data Workflow: The API was integrated into the data workflow using
Talend Data Integration. This integration facilitated the retrieval of Athena Health
Encounter data, data cleansing, and the transmission of assessment text to the API. The
resulting multi-class information was stored in Salesforce for action by the care team.
The Result
The implementation of this solution delivered substantial benefits:
The care team gained the ability to act swiftly and efficiently on various encounter
assessment categories, including Medication, Diagnosis, ‘Notes for Pharmacy,’ ‘Internal
note for Staff,’ Referral, and ‘Note to patient.’
Manual efforts were significantly reduced, leading to operational efficiency and time
savings.
Service quality was notably improved, ensuring that patients received timely and accurate
care.
This client success story showcases how the US-based healthcare company harnessed machine
learning and automation to transform the analysis of free-text encounter assessments,
ultimately enhancing the speed and quality of care services provided to patients while
optimizing operational efficiency.
Tools and Platforms used