Base project for development of next generation large scale multimodal language models
Reference number | |
Coordinator | Lindholmen Science Park AB - AI Sweden |
Funding from Vinnova | SEK 4 995 566 |
Project duration | January 2024 - December 2024 |
Status | Completed |
Important results from the project
** This text is machine-translated ** The project aimed to develop a general multimodal large-scale generative language model for the Swedish language. All sub-goals covered by funding phase 1 have been achieved: we have established frameworks and processes for data management, we have developed frameworks and an architecture for multimodal models, we have compiled the largest data set to date for training Swedish language models, and we have trained an audio-text model as a first example of how a multimodal Swedish model can work.
Expected long term effects
A development process and a dialogue with copyright owners about responsible ways to work with data for the development of language models. Our ambition has been to offer a national alternative to using foreign APIs, in the form of open models, and in connection with this to strengthen Swedish resilience by building competence, infrastructure and resources. The project has led to follow-up projects and has been valuable in strengthening Nordic cooperation around language models, which can lead to cooperation and value creation at the Nordic level.
Approach and implementation
The project was originally planned to be implemented in three phases, with control and decision points between each phase. The control points were based on the KPIs defined for each phase in the proposal. The project completed the first phase and achieved all of that phase's KPIs, but the funding for the two further phases did not materialize.