Social audio data to develop effective vaccine persuasion narratives
Short solution summary:
Use data from conversations taking place on the emerging breed of social audio apps (Clubhouse, Twitter Spaces) to scale up development and validation of anti-misinformation narratives and interventions in public health campaigns, such as for COVID-19 vaccination. Detect state-sponsored activity clusters and common "wedges" driven into communities.
In what city, town, or region is your solution team based?
San Francisco, CA, USA
Who is the Team Lead for your solution?
Sean Young, PhD - UCIPT
Which Challenge Area does your solution most closely address?
Respond (Decrease transmission & spread), such as: Optimal preventive interventions & uptake maximization, Cutting through “infodemic” & enabling better response, Data-driven learnings for increased efficacy of interventions
What specific problem are you solving?
Misinformation about COVID-19, and now about COVID-19 vaccines, is a significant problem that affects the health of populations and the effectiveness of public health measures by spreading Fear, Uncertainty, and Doubt among those populations. See, for instance, previous work addressing this in the context of Twitter (Barton Rhodes is part of the team).
While text and images on social media constitute a significant problem that the tech giants and non-profits already seek and struggle to address, spoken word has been an effective means of persuasion for perhaps as long as we have had language. From the demagogue on the pulpit, to radio, to a quiet conversation with a loved one, some of the most important decisions we make as humans come through conversations and auditory experiences.
The ascent of social audio apps like Clubhouse presents a unique threat landscape, with threats that are significantly harder to evaluate, much less mitigate, at scale.
Therein lies a challenge and an opportunity:
- detect audio misinformation about COVID-19 and vaccines at scale while observing standards of ethics: what was said when, by whom, and with what intent?
- discover what we can say to someone reluctant to get vaccinated in order to sway their belief
Who does your solution serve, and what needs of theirs does it address?
The solution is designed to help those struggling to find the right words to persuade people trapped in ideation born out of conspiracy theories and active misinformation campaigns. It also helps the social platforms themselves detect and mitigate campaigns aimed at reducing the effectiveness of public health measures in susceptible populations. It is designed to scale up focus groups and smaller studies to cover all the relevant conversations happening at once, and to provide timely insight into which conversations are taking place, something that is currently next to impossible in the audio-only channel.
In addition to the primary objective, a secondary goal is to develop a set of ethical standards for analysis of social audio data in a manner that preserves the dignity and privacy of the speakers, regardless of their stated belief, as well as a broader set of guidelines for such analysis to meet benevolent ends.
Ultimately, this will help our target group of "interrupters" ensure information hygiene of communities worldwide (English speakers initially, with pathways to localization given additional resources).
What is your solution’s stage of development?
Proof of Concept: A venture or organisation building and testing its prototype, research, product, service, or business/policy model, and has built preliminary evidence or data
Please select all the technologies currently used in your solution:
What “public good” does your solution provide?
The solution's data pipelines will be accessible to researchers worldwide under strict ethical guidelines (e.g., existing models for access in a post-Cambridge Analytica world).
Additionally, the guidelines for having effective conversations that counter misinformation narratives will be regularly published in a simple, easy-to-digest format that people can take to their loved ones and use as a conversation aid when trying to convey the benefits of the vaccine.
The codebase and findings will be shared with the platforms themselves, helping to foster more resilient, equitable communities for users of audio-only apps.
How will your solution create tangible impact, and for whom?
The solution will enable more effective conversational interventions that counter the impact of conspiracy theory narratives and help bring the beliefs and actions of susceptible populations closer in line with scientific understanding and public health campaigns.
The data gleaned from the solution can be used to detect campaigns, and provides a tangible set of data points that can be used to enhance broader informational awareness.
On the conversation analysis front, the solution can be used to capture change-of-mind events or, more generally, to reason about conversation arcs. This allows for iterative, data-driven development of tooling for trained specialists who counter the effects of misinformation.
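As a minimal sketch of what capturing a change-of-mind event could look like, assume an upstream model has already assigned each utterance a stance score in [-1, 1], where positive means favourable to vaccination. The `Utterance` type, the `find_change_of_mind` helper, and the `window` and `margin` parameters are all hypothetical names introduced for illustration; they are not part of any existing pipeline described here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Utterance:
    speaker: str
    stance: float  # hypothetical score in [-1, 1]; > 0 is favourable to vaccination


def find_change_of_mind(
    utterances: List[Utterance],
    speaker: str,
    window: int = 3,
    margin: float = 0.2,
) -> Optional[int]:
    """Return the index (within this speaker's own utterances) where the
    average stance flips from clearly negative to clearly positive,
    or None if no such flip is found."""
    scores = [u.stance for u in utterances if u.speaker == speaker]
    for i in range(window, len(scores) - window + 1):
        before = mean(scores[i - window:i])   # average stance just before position i
        after = mean(scores[i:i + window])    # average stance from position i onward
        if before < -margin and after > margin:
            return i
    return None


# Made-up example conversation; the scores are illustrative, not real data.
conversation = [
    Utterance("A", -0.6), Utterance("B", 0.8),
    Utterance("A", -0.5), Utterance("B", 0.7),
    Utterance("A", -0.4), Utterance("B", 0.9),
    Utterance("A", 0.5), Utterance("A", 0.6), Utterance("A", 0.7),
]
shift_index = find_change_of_mind(conversation, "A")
```

Comparing two adjacent windows rather than single utterances is one simple way to avoid flagging a one-off remark as a change of mind; a real study would need to validate both the stance model and the thresholds.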
How will you scale your impact over the next one year and the next three years?
As more conversations start happening on the social audio apps and off of them, having a methodical approach and a data pipeline will enable unprecedented access to conversations outside of the typical university study setting.
By broadening the participant pool beyond students and study volunteers to all users of social audio apps, a more representative and diverse sample can be obtained.
Given the universality of the methods (e.g., the ability to rely on pretrained models in multiple languages), the solution can account for conversation dynamics across a variety of cultural perspectives and existing narratives.
Over the next year, the solution can be used to:
- target the English-speaking segment of the apps
- enhance the data sources of existing anti-misinfo communities with social audio input
- develop practical conversational guides and persuasion / intervention techniques
Over the next three years:
- the solution can be extended to more languages
- the scale of the data intake will grow with adoption of the apps in the space
- active national security collaboration can be established
How are you measuring success against your impact goals?
Most immediately, it is desirable to integrate with existing methodologies for measuring the success of public health campaigns.
In the case of COVID-19, this means measuring vaccination rates, as well as new infections / deaths by geography.
Directly in the conversation, it means measuring both the self-reported propensity to get the vaccine and the latent sentiment around topics addressed by the speaker, using sentiment analysis.
In the future, similar measurements could be applied to whichever behavior / attitude change is being pursued.
In addition to the public health effectiveness measurements, for known state-sponsored activity, success can be measured by the correlation between detections made using this solution and threat intelligence sources from existing anti-misinformation efforts. The measures can range from simply detecting the same events (e.g., the recent Turkey-led campaign that appears in both the Twitter analysis and on Clubhouse), to measuring account overlap, to measuring topic overlap between known campaigns.
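One simple way to quantify the account or topic overlap mentioned above is the Jaccard index: the size of the intersection of two sets divided by the size of their union. The sketch below is illustrative only; the `jaccard` helper and the account identifiers are made up, and the choice of Jaccard (rather than some other overlap measure) is an assumption, not a method stated in this application.

```python
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B|.
    Returns 0.0 when both collections are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Placeholder identifiers, not real accounts.
detected_accounts = {"acct_1", "acct_2", "acct_3"}     # flagged by this solution
known_campaign_accounts = {"acct_2", "acct_3", "acct_4"}  # from a threat intel source

account_overlap = jaccard(detected_accounts, known_campaign_accounts)  # 2 shared / 4 total
```

The same function applies unchanged to sets of topic labels, which keeps the account-level and topic-level measures directly comparable.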
In which countries do you currently operate?
In which countries do you plan to deploy your solution within the next 3 years?
What barriers currently exist for you to accomplish your goals in the next year and the next 3 years? How do you plan to overcome these barriers?
- access to Clubhouse / Twitter Spaces
certain countries have outright banned Clubhouse / Twitter (China, Myanmar); we plan to rely on NGOs and local volunteers to speak to these populations, and to draw input from similar linguistic groups in other countries and from rooms that are not banned
- ethical challenges
gathering data about conversations could lead to inadvertent capture of non-salient and personally identifiable facts about speakers; an ethical statement and consent process, combined with private ML techniques (such as MPC, tf-encrypted, OpenMined), can mitigate this and allow conformance with existing privacy frameworks
- scalability challenges
processing audio at scale is significantly more computationally intensive than processing text, and it is only going to be compounded by the type of modeling / processing enabled by the solution
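To illustrate the idea behind the MPC techniques named under ethical challenges, here is a toy sketch of additive secret sharing, the basic building block of many MPC protocols. It is not the API of tf-encrypted or OpenMined; the function names, the `MODULUS` value, and the per-room counts are assumptions made purely for illustration. Each private value is split into random shares so that no single party sees it, yet the parties can still compute an aggregate.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # arbitrary large modulus chosen for this sketch


def share(value: int, n_parties: int) -> List[int]:
    """Split a non-negative value (< MODULUS) into n_parties additive shares.
    Any n_parties - 1 shares on their own look uniformly random."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: List[int]) -> int:
    """Recombine shares into the original value."""
    return sum(shares) % MODULUS


def secure_sum(all_shares: List[List[int]]) -> int:
    """Each party j adds up the j-th share of every input locally;
    only these per-party partial sums are then combined."""
    per_party = [sum(column) % MODULUS for column in zip(*all_shares)]
    return reconstruct(per_party)


# Made-up per-room counts of flagged utterances, each held privately.
room_counts = [3, 5, 7]
shared_inputs = [share(count, n_parties=3) for count in room_counts]
total_flagged = secure_sum(shared_inputs)
```

In this sketch only the total is revealed, not any individual room's count. A production system would rely on an audited library rather than hand-rolled code, and would also need to address collusion between parties and malicious inputs.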
What type of organisation is your solution team?
Academic or Research Institution
Why are you applying to The Trinity Challenge?
The Trinity Challenge can help us with the initial infusion of funding necessary to cover the compute / storage costs of a full-scale study.
An additional benefit is help in clarifying the solution, along with exposure that can bring in even more volunteers (the Slack channel is at 17 members now).
Most importantly, this solution is designed to exist in the context of other work being done in the space and to fill the gap in audio / conversational misinformation work with a data source that allows for unique construction of interventions. Doing this under the auspices of The Trinity Challenge can open more conversations with Clubhouse / Twitter and speed up integration of our tooling with existing data pipelines at these companies, as well as foster evolving partnerships with member organizations.
What organisations would you like to partner with, why, and how would you like to partner with them?
- Palantir - building of national security partnerships / graph analysis
- Google - potential to obtain compute resources, existing anti-misinfo efforts
- The Behavioral Insights Team - to develop well-defined bridges between NLP tasks and behavioral / conversational interventions
- international members / universities - to adapt solution to new cultural / linguistic contexts