Verification of an AI-based app enabling instant, secure communication across language barriers in health care
Reference number | |
Coordinator | Mabel AI AB - Sahlgrenska Science Park |
Funding from Vinnova | SEK 300 000 |
Project duration | November 2022 - September 2023 |
Status | Completed |
Venture | Innovative Startups |
Call | Innovative Impact Startups autumn 2022 |
Important results from the project
The goal of the project was to develop and validate an MVP of an AI-based translation app for medical conversations. The unique aspect of our solution is its focus on privacy, which is of paramount importance in medicine. All processing runs on the mobile device, without an internet connection, so the medical conversations stay private. During the project period, we built an app that can be used for validation and created significantly improved models for English and Ukrainian, enriched with medical data.
Expected long-term effects
We reduced the word error rate (WER) of Ukrainian speech-to-text from 13% to 3.2% by switching to a different network architecture, fine-tuning on the medical domain, and using our own implementation of beam search. We filtered low-quality pairs, Russian words, profanities, and mismatched translations out of the translation data. With the help of LLMs and doctors, we collected simulated medical conversations, which we used to evaluate the new translation models. We achieved an accuracy improvement of 64% over existing English-Ukrainian models. We will release our English-Ukrainian model as open source!
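For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function names `word_error_rate` and `relative_reduction` are illustrative and are not taken from the project's code, and the project's exact text normalization (casing, punctuation) is not described here, so none is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error rate that was removed."""
    return (before - after) / before


# One dropped word out of five reference words gives a WER of 0.2.
print(word_error_rate("the patient has a fever", "the patient has fever"))

# Going from 13% to 3.2% WER is a relative reduction of roughly 75%.
print(round(relative_reduction(0.13, 0.032), 3))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, which is why it is usually reported alongside the size of the evaluation set.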
Approach and implementation
An AI system is only as good as its training data. We devoted much time to gathering and cleaning data, using both manual and automated approaches. We identified a bias in the open-source data, where about 80% of the voices belonged to men aged 20-40, and hired a female Ukrainian refugee to evaluate how our models performed on voices outside this demographic. She identified a discrepancy between the reported accuracy of the existing models and their accuracy on her voice, and she detected Russian words in the translations. We then trained new models on cleaned and augmented data, which significantly improved our accuracy.
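The automated part of such cleaning could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual pipeline: it flags Ukrainian-side text containing the letters ё, ъ, ы, э (which exist in the Russian alphabet but not in the Ukrainian one), drops pairs whose word counts differ by more than a hypothetical ratio as a crude proxy for mismatched translations, and checks tokens against a caller-supplied blocklist. The names `filter_pairs`, `max_ratio`, and `blocklist` are invented for this example.

```python
import re

# Letters used in Russian but absent from the Ukrainian alphabet.
RUSSIAN_ONLY_LETTERS = set("ёъыэ")


def has_russian_only_letters(text: str) -> bool:
    """Heuristic: True if any character can only come from Russian spelling."""
    return any(ch in RUSSIAN_ONLY_LETTERS for ch in text.lower())


def length_ratio_ok(src: str, tgt: str, max_ratio: float = 3.0) -> bool:
    """Crude mismatch check: word counts must be within max_ratio of each other.

    max_ratio is a hypothetical threshold chosen for illustration only.
    """
    a, b = len(src.split()), len(tgt.split())
    if a == 0 or b == 0:
        return False
    return max(a, b) / min(a, b) <= max_ratio


def filter_pairs(pairs, blocklist=frozenset(), max_ratio=3.0):
    """Keep (english, ukrainian) pairs that pass all heuristic checks."""
    kept = []
    for en, uk in pairs:
        if has_russian_only_letters(uk):
            continue
        if not length_ratio_ok(en, uk, max_ratio):
            continue
        tokens = set(re.findall(r"\w+", uk.lower()))
        if tokens & blocklist:  # blocklist is supplied by the caller
            continue
        kept.append((en, uk))
    return kept


pairs = [
    ("Where does it hurt?", "Де болить?"),                    # kept
    ("Thank you", "Спасибо, это всё"),                        # Russian letters
    ("Yes", "Так, у мене болить голова вже три дні поспіль"),  # length mismatch
]
print(filter_pairs(pairs))
```

A letter-based check only catches Russian words that happen to contain one of those four letters, so manual review by a native speaker, as described above, remains necessary.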