
Bridging physics research to real-life applications with data science at scientifyRESEARCH

Coordinator Scientify Research Solutions AB
Funding from Vinnova SEK 500 000
Project duration December 2022 - May 2023
Status Completed
Venture MSCA Employment
Call Attract, integrate and retain international excellence

Important results from the project

The project “Bridging physics research to real-life applications with data science at scientifyRESEARCH” is a collaboration between the company scientifyRESEARCH and the Marie Curie PhD scholar Smita Chakraborty. The project has two main goals. The first, for the company, is to scale the production of its research funding database with machine learning and natural language processing. The second, for the candidate, is to gain transferable skills and experience that are easily recognized by industry in Sweden and beyond.

Expected long term effects

The project “Bridging physics research to real-life applications with data science at scientifyRESEARCH” reached its goals: it laid the foundation for scientifyRESEARCH to scale its content production with machine learning and natural language processing techniques. For the candidate, the project gave Smita Chakraborty the opportunity to learn from a real-world expert in ML/NLP/AI. Smita also worked extensively in Microsoft Azure as the development platform, a tool widely used in industry, which strengthens her employability in industry.

Approach and implementation

The project used Python and ML/NLP to process a large dataset into a research funding database. The design and implementation of the project included:
1. Data collection: Automate the process of retrieving data from a large national funder’s database.
2. Data processing: XML file data are fed into a clean Python data frame.
3. NLP: We tested several large language models and settled on using NLP from the OpenAI repository.
4. Data visualization: Data are presented visually on the scientifyRESEARCH.org website as enriched funding information.
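The data processing step above, turning XML records into clean tabular rows, can be illustrated with a minimal sketch. This is not the project's actual code: the funder's XML schema is not given in the description, so the element names (`grants`, `grant`, `title`, `amount`, `deadline`), the sample data, and the helper functions are all hypothetical, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real funder schema is not described in the source.
SAMPLE_XML = """<grants>
  <grant id="G-001">
    <title>  Quantum sensing for medical imaging </title>
    <amount currency="SEK">500000</amount>
    <deadline>2023-05-31</deadline>
  </grant>
  <grant id="G-002">
    <title>Machine learning for materials discovery</title>
    <amount currency="SEK"></amount>
    <deadline>2023-09-15</deadline>
  </grant>
</grants>"""


def _text(element, tag):
    """Return the stripped text of a child element, or None if missing or empty."""
    child = element.find(tag)
    if child is None or child.text is None:
        return None
    value = child.text.strip()
    return value or None


def parse_grants(xml_string):
    """Flatten each <grant> element into one dictionary (one table row)."""
    root = ET.fromstring(xml_string)
    rows = []
    for grant in root.iter("grant"):
        amount = _text(grant, "amount")
        rows.append({
            "id": grant.get("id"),
            "title": _text(grant, "title"),
            # Missing amounts stay None instead of being guessed.
            "amount": int(amount) if amount is not None else None,
            "deadline": _text(grant, "deadline"),
        })
    return rows


rows = parse_grants(SAMPLE_XML)
for row in rows:
    print(row)
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to obtain a data frame, assuming pandas is the data frame library in use; the description does not say which one the project chose.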

The project description has been provided by the project members themselves and the text has not been looked at by our editors.

Last updated 15 July 2023

Reference number 2022-02955