Monday, October 26, 2020

Machine learning detects online influence campaigns.

There may be light at the end of the tunnel in the struggle to detect when malign information or online influence campaigns are spreading on social media platforms. In a Washington Post article, Alizadeh and collaborators summarize their use of machine learning techniques, which they describe in more detail in a Science Advances article. They also point to promising techniques being developed by other scholars. From the Washington Post piece:
There’s no single solution, but there is a path forward
Unfortunately, this means there is no single model for finding foreign influence campaigns. Social media usage is dynamic. Normal users are always responding to current events and trolls are continually adapting and trying new tactics.
While we did not find a stable set of characteristics that allow us to detect all campaigns, we did find a method for detecting these campaigns based on the fact that troll content is almost always different in detectable ways. And machine learning allows us to find those differences at scale. Other scholars have developed promising techniques, as well.
The day when we can have a “daily report” of online influence campaigns to inform citizens may not be as far away as it would seem.
Here is the abstract from their Science Advances article:
We study how easy it is to distinguish influence operations from organic social media activity by assessing the performance of a platform-agnostic machine learning approach. Our method uses public activity to detect content that is part of coordinated influence operations based on human-interpretable features derived solely from content. We test this method on publicly available Twitter data on Chinese, Russian, and Venezuelan troll activity targeting the United States, as well as the Reddit dataset of Russian influence efforts. To assess how well content-based features distinguish these influence operations from random samples of general and political American users, we train and test classifiers on a monthly basis for each campaign across five prediction tasks. Content-based features perform well across period, country, platform, and prediction task. Industrialized production of influence campaign content leaves a distinctive signal in user-generated content that allows tracking of campaigns from month to month and across different accounts.
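To make the abstract's "train and test classifiers on a monthly basis" a little more concrete, here is a minimal Python sketch of that evaluation loop. It is not the authors' pipeline: the three features, the nearest-centroid classifier, the labels "troll" and "organic", and the toy posts are all assumptions chosen to keep the example short and dependency-free.

```python
from collections import defaultdict

def content_features(text):
    """Human-interpretable features computed from the post text alone (assumed set)."""
    tokens = text.split()
    n = max(len(tokens), 1)
    return [
        sum(t.startswith("#") for t in tokens) / n,     # share of hashtags
        sum(t.startswith("@") for t in tokens) / n,     # share of mentions
        sum(t.startswith("http") for t in tokens) / n,  # share of links
    ]

def fit_centroids(posts):
    """posts: list of (text, label). Returns the mean feature vector per label."""
    sums, counts = {}, defaultdict(int)
    for text, label in posts:
        vec = content_features(text)
        prev = sums.get(label, [0.0] * len(vec))
        sums[label] = [s + v for s, v in zip(prev, vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}

def predict(centroids, text):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    vec = content_features(text)
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[label]))
    return min(centroids, key=dist)

def monthly_accuracy(posts_by_month):
    """Train on each month, test on the following month; return {test_month: accuracy}."""
    months = sorted(posts_by_month)
    scores = {}
    for train_month, test_month in zip(months, months[1:]):
        centroids = fit_centroids(posts_by_month[train_month])
        test = posts_by_month[test_month]
        hits = sum(predict(centroids, text) == label for text, label in test)
        scores[test_month] = hits / len(test)
    return scores

# Toy, made-up posts purely for illustration (not real data).
toy = {
    "2020-01": [
        ("#MAGA #news http://a.example BREAKING", "troll"),
        ("had a great lunch with @sam today", "organic"),
    ],
    "2020-02": [
        ("#vote #now http://b.example", "troll"),
        ("watching the game with @alex tonight", "organic"),
    ],
}
print(monthly_accuracy(toy))
```

The point of the sketch is the shape of the test, not the model: a classifier fitted on one month's content is scored on the next month's, which is one way to check whether a campaign's signal persists over time.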
