Using the SB Sam NLP tools for manual and automatic annotation of climate change texts

Reference number 2021-03973
Coordinator Institutet För Språk & Folkminnen
Funding from Vinnova SEK 182 331
Project duration November 2021 - December 2022
Status Completed
Venture AI - Competence, ability and application
Call Staff exchange for applied AI-research 2.0

Important results from the project

The natural language processing technique of topic modeling was applied to extract recurring topics from two climate change corpora collected by the Applied CompLing Discourse Research Lab. The two corpora were (i) a collection of recently published German tweets and (ii) a collection of editorials published in the journals Nature and Science between 1969 and 2016. Results were disseminated at the CLARIN conference held in Prague in 2022 and in an article published in the Journal of Computational Social Science.

Expected long-term effects

For the corpus of editorials from Nature and Science, we compared topic trends detected through manual annotation of the editorials with trends extracted automatically by topic modeling. Most of the major trends detected by the manual annotation were also found automatically by topic modeling. These results provide an example of how natural language processing methods can support tasks that would otherwise require a fully manual analysis of large amounts of text.
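The description above does not say how the manual and automatic trends were compared. As a rough illustration only, the sketch below assumes each editorial carries a publication year and a set of topic labels from each method, computes the yearly share of editorials per topic, and correlates the two series. The data, the topic name "policy", and the helper names `yearly_share` and `pearson` are hypothetical.

```python
from collections import Counter
from math import sqrt

def yearly_share(records, topic):
    """Share of texts per year labelled with `topic`.

    `records` is a list of (year, set_of_topic_labels) pairs.
    """
    total, hits = Counter(), Counter()
    for year, topics in records:
        total[year] += 1
        if topic in topics:
            hits[year] += 1
    return {year: hits[year] / total[year] for year in sorted(total)}

def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy labels, not data from the project.
manual = [
    (1990, {"policy"}), (1990, {"science"}),
    (2000, {"policy"}), (2000, {"policy"}),
    (2010, {"science"}), (2010, {"science"}),
]
automatic = [
    (1990, {"policy"}), (1990, {"science"}),
    (2000, {"policy"}), (2000, {"policy", "science"}),
    (2010, {"science"}), (2010, {"policy"}),
]

manual_trend = yearly_share(manual, "policy")
auto_trend = yearly_share(automatic, "policy")
years = sorted(manual_trend.keys() & auto_trend.keys())
r = pearson([manual_trend[y] for y in years], [auto_trend[y] for y in years])
print(f"Agreement between manual and automatic 'policy' trend: r = {r:.2f}")
```

A correlation near 1 would indicate that the automatically extracted trend rises and falls with the manually annotated one. With only a handful of years, such a number should be read as descriptive rather than as a statistical test.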

Approach and implementation

The text-mining tool Topics2Themes was used to explore both corpora. The tool is currently maintained and further developed at the Swedish national research infrastructure node "SB Sam", which is also a CLARIN node. Topics2Themes uses topic modeling to automatically extract frequently recurring topics from large text collections and displays the output in an interactive graphical user interface. Most of the collaboration was conducted on-site in Potsdam. All travel, i.e., visits to Potsdam and trips to conferences and seminars, was made by train.
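To give a sense of what topic modeling does, the sketch below runs a very small collapsed Gibbs sampler for latent Dirichlet allocation (LDA) on a toy corpus. It is not the Topics2Themes implementation, and this page does not state which algorithm or settings the project used. The function names, hyperparameters, and example texts are all assumptions made for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Tiny collapsed Gibbs sampler for LDA; `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                        # token counts per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

def top_words(topic_word, n=3):
    """Most frequent words per topic, skipping words with a zero count."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]

# Hypothetical toy corpus, not data from the project.
docs = [
    "emissions carbon policy emissions tax".split(),
    "carbon tax policy emissions".split(),
    "ice glacier melting sea ice".split(),
    "sea level glacier melting ice".split(),
]
doc_topic, topic_word = lda_gibbs(docs)
for k, words in enumerate(top_words(topic_word)):
    print(f"Topic {k}: {', '.join(words)}")
```

Real corpora would also need tokenization, stop-word removal, and a choice of the number of topics. Those steps are left out here.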

The project description has been provided by the project members themselves, and the text has not been reviewed by our editors.

Last updated 30 June 2023

Reference number 2021-03973