Using text mining to aid in cancer risk assessment
Reference number | |
Coordinator | Karolinska Institutet - Institutet för miljömedicin |
Funding from Vinnova | SEK 2 524 000 |
Project duration | December 2011 - December 2015 |
Status | Completed |
Purpose and goal
The initial aims of this project were to develop a new method for assessing the cancer risks of chemicals and a new approach for identifying carcinogenic modes of action. The method is based on text mining. We have developed a freely available text-mining tool for cancer risk assessment. In addition to publications describing the development of the tool, we have published several scientific articles showing how the method can be used for risk assessment and to increase the mechanistic understanding of how chemicals cause cancer. Both project goals have thereby been met.
Results and expected effects
The tool can be used both in chemical risk assessment and in research. Because it draws on the published literature, it makes it possible to use research that has already been conducted to create new knowledge. By classifying large amounts of textual data, patterns in the data can be identified and important knowledge gaps discovered. Groups of chemicals can be compared to detect important similarities or differences. The method is also useful for generating hypotheses as a basis for experimental studies.
Approach and implementation
The tool has been developed in close collaboration between researchers at the Institute of Environmental Medicine (IMM) at Karolinska Institutet and at the Computer Laboratory, University of Cambridge, UK. The tool is built around a cancer risk assessment taxonomy, which structures the modes of action relevant to cancer development. The work has also involved manual annotation of scientific literature, machine learning and text-zoning techniques. We have also gradually improved usability, adding automatic statistics and a direct link to the search engine PubMed.
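
As a rough illustration of the general idea (classifying literature against a taxonomy of modes of action and comparing the resulting profiles between chemicals), a minimal Python sketch follows. It is not the project's tool: simple keyword matching stands in for the machine-learning classifier trained on manually annotated literature, and the taxonomy categories, keywords, abstracts and function names (`classify`, `profile`) are all invented for this example.

```python
from collections import Counter

# Hypothetical, heavily simplified taxonomy: category -> indicative keywords.
# The real taxonomy and classifier are not described in this summary.
TAXONOMY = {
    "genotoxic": ["dna damage", "mutation", "adduct"],
    "oxidative stress": ["oxidative stress", "reactive oxygen"],
    "cell proliferation": ["proliferation", "hyperplasia"],
}


def classify(abstract: str) -> set[str]:
    """Return every taxonomy category whose keywords occur in the abstract."""
    text = abstract.lower()
    return {
        category
        for category, keywords in TAXONOMY.items()
        if any(keyword in text for keyword in keywords)
    }


def profile(abstracts: list[str]) -> dict[str, float]:
    """Fraction of abstracts assigned to each taxonomy category."""
    counts = Counter(cat for abstract in abstracts for cat in classify(abstract))
    total = len(abstracts)
    return {cat: (counts[cat] / total if total else 0.0) for cat in TAXONOMY}


# Invented abstracts for two hypothetical chemicals.
chem_a = [
    "Exposure induced DNA damage and mutation in liver cells.",
    "Reactive oxygen species were elevated after treatment.",
]
chem_b = [
    "Increased cell proliferation and hyperplasia were observed.",
    "Hyperplasia was dose dependent.",
]

profile_a = profile(chem_a)
profile_b = profile(chem_b)

# Categories with no supporting abstracts hint at possible knowledge gaps.
gaps_a = [cat for cat, share in profile_a.items() if share == 0.0]
```

Comparing `profile_a` with `profile_b` shows which modes of action dominate the literature for each chemical, and `gaps_a` lists the categories for which no supporting abstracts were found.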