Feb 23, 2021 by Thibault Debatty
The ongoing COVID-19 crisis is being discussed extensively on social media platforms. Researchers and social media platforms alike use the online conversation to increase their situational awareness of the continuously evolving situation. At the same time, foreign powers and special interest groups have been observed piggybacking on the large-scale discussion to spread fake news and/or misinformation.
In the context of our ongoing Social Media Intelligence (SOCMINT) project, the current situation does, however, create several opportunities:
Large-scale datasets are available. Twitter Developer Labs even created a specific COVID-19 stream endpoint: https://developer.twitter.com/en/docs/labs/covid19-stream/overview.
Given the global importance of the subject, multiple teams of researchers have actively engaged in manually labeling misinformation. This has led to a multitude of annotated datasets in different languages (although most datasets are in English). Having multiple instances of a somewhat reliable ground truth is a luxury one rarely comes across in this domain.
These datasets will be used to further test and evaluate ongoing research into different methods for automated misinformation detection. Furthermore, we examine whether the conclusions that can be drawn at a global level can also be applied specifically to Belgium.
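To illustrate what testing a detection method against an annotated dataset can look like, here is a minimal sketch in Python. It is not the project's actual pipeline: the `evaluate` function, the `keyword_detector` baseline, and the `toy_samples` list are all hypothetical names and made-up data used only for illustration. It assumes each dataset entry is a pair of a text and a boolean label, where `True` means the text was annotated as misinformation.

```python
from typing import Callable, Dict, Iterable, Tuple


def evaluate(detector: Callable[[str], bool],
             samples: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compare detector predictions with ground-truth labels.

    Returns precision, recall and F1 for the 'misinformation' class.
    """
    tp = fp = fn = 0
    for text, is_misinfo in samples:
        predicted = detector(text)
        if predicted and is_misinfo:
            tp += 1          # correctly flagged
        elif predicted and not is_misinfo:
            fp += 1          # flagged, but labeled as legitimate
        elif not predicted and is_misinfo:
            fn += 1          # missed misinformation
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


def keyword_detector(text: str) -> bool:
    """Hypothetical baseline: flag a text if it contains a known keyword."""
    keywords = ("miracle cure", "5g")
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


# Made-up examples standing in for an annotated dataset (text, label).
toy_samples = [
    ("Miracle cure found for the virus", True),
    ("5G towers spread the virus", True),
    ("Wash your hands regularly", False),
    ("Vaccines alter your DNA", True),
    ("Officials promise a miracle cure for traffic jams", False),
]

if __name__ == "__main__":
    print(evaluate(keyword_detector, toy_samples))
```

Running the same `evaluate` call on a global dataset and on a Belgium-specific subset would give a simple way to compare how well a method transfers, assuming both subsets carry comparable labels.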