《Global evidence of expressed sentiment alterations during the COVID-19 pandemic》

Abstract: The COVID-19 pandemic has created unprecedented burdens on people’s physical health and subjective well-being. While countries worldwide have developed platforms to track the evolution of COVID-19 infections and deaths, frequent global measurements of affective states to gauge the emotional impacts of the pandemic and related policy interventions remain scarce. Using 654 million geotagged social media posts in over 100 countries, covering 74% of the world population, coupled with state-of-the-art natural language processing techniques, we develop a global dataset of expressed sentiment indices to track national- and subnational-level affective states on a daily basis. We present two motivating applications using data from the first wave of COVID-19 (from 1 January to 31 May 2020). First, using regression discontinuity design, we provide consistent evidence that COVID-19 outbreaks caused steep declines in expressed sentiment globally, followed by asymmetric, slower recoveries. Second, applying synthetic control methods, we find moderate to no effects of lockdown policies on expressed sentiment, with large heterogeneity across countries. This study shows how social media data, when coupled with machine learning techniques, can provide real-time measurements of affective states.

Introduction:

The December 2019 coronavirus disease (COVID-19) outbreak has threatened the stability of health-care systems and generated unparalleled social and economic disruptions in nations across the globe. The health crisis has brought emotional distress to citizens beyond those contracting the disease. Over the course of 2020, individuals faced novel health risks associated with daily activities, shortages of resources and increased uncertainty in their financial futures and social lives. In addition, numerous governments have imposed strict controls on movement, infringing on personal freedoms and increasing loneliness, depression, anxiety and other negative emotions (Sibley, C. G. et al. Effects of the COVID-19 pandemic and nationwide lockdown on trust, attitudes toward government, and well-being. Am. Psychol. 75, 618–630 (2020); Brooks, S. K. et al. The psychological impact of quarantine and how to reduce it: rapid review of the evidence. Lancet 395, 912–920 (2020)).

Strict government controls infringed on personal freedoms and left people lonely, depressed and anxious.

Many governments worldwide are incorporating measures of citizens’ subjective well-being into policy decision-making to complement economic indicators such as gross domestic product. Affective state (positive and negative emotions) is one of the central components of subjective well-being and is likely to be greatly impacted by the COVID-19 pandemic. As the COVID-19 crisis continues and the world faces the expectation of recurring virus outbreaks, governments worldwide are increasingly concerned about the emotional impacts of COVID-19 outbreaks and of the anti-contagion policies used to manage the pandemic (Lima, C. K. T. et al. The emotional impact of coronavirus 2019-nCoV (new coronavirus disease). Psychiatry Res. 287, 112915 (2020)). Nonetheless, while there have been numerous efforts to track COVID-19 infection and policy responses globally, there are no standardized high-frequency measures of the affective aspect of subjective well-being.

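As the COVID-19 crisis continues, we will face repeated outbreaks of the virus. Governments are paying more attention to the emotional impact of COVID-19 outbreaks and of the anti-contagion policies used to manage the disease.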

Tracking the affective states of citizens during disruptive events such as natural disasters and epidemics is especially challenging due to the unpredictability and volatility of these crises. There are a variety of laudable survey initiatives, such as the Weekly COVID-19 Snapshot Monitoring in Germany, to track the evolution of risk perception using cross-sectional national surveys. However, such traditional survey measurements are prohibitively expensive at the global scale and usually suffer from limited coverage, insufficient sampling frequency and substantial delays. Social media data have offered a valuable complement to track the affective aspect of subjective well-being. Sentiment analysis, which uses natural language processing (NLP) and computational linguistics, allows standardized quantification of emotional states from text. Expressed sentiment indices built on people’s posts on social media platforms have been validated to be meaningfully correlated with subjective well-being measured by conventional surveys (for example, the Gallup–Sharecare Well-Being Index survey, which is currently the most definitive measurement of subjective well-being and is widely used in well-being research). Recent literature has applied social media expressed sentiment indices to estimate the effects of temperatures, local air pollution, natural disasters and other environmental stressors on subjective well-being.

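The authors first note that the unpredictability and volatility of epidemics make tracking affective states very challenging; some survey initiatives (for example, Germany's Weekly COVID-19 Snapshot Monitoring) are laudable but have clear limitations. They also note that social media supplies data for sentiment research, and that sentiment analysis based on NLP and computational linguistics is a feasible way to study this text data (the resulting indices correlate with surveys such as the Gallup–Sharecare Well-Being Index survey). Finally, they mention that some studies have used sentiment indices to estimate the effects of temperature, air pollution, natural disasters and similar stressors on subjective well-being.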

Here we build a global dataset that tracks expressed sentiment at the national and subnational (state/province) levels with high temporal and spatial granularity using anonymized and aggregated data from the two largest social media microblogging platforms (Twitter and Weibo (the Chinese equivalent of Twitter)). The data contain more than 600 million geotagged social media posts on all topics published by 10.56 million individuals during the first wave of the COVID-19 pandemic (from 1 January to 31 May 2020) (Fig. 1). Since the sample of COVID-19-related discussions might not be a good representation of the affective state of the general population and could be polluted by political campaigns, we exclude tweets directly related to COVID-19 when building out the main sentiment indices. We then apply the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) NLP technique to compute daily sentiment measures in over 100 countries standardized across 65 languages (Methods). Unlike dictionary-based sentiment analysis such as Linguistic Inquiry and Word Count (LIWC), deep-learning-based BERT algorithms allow word representations to be enriched with contextual information and enable multilingual computations.

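Data: anonymized and aggregated data from the two major social platforms (Twitter and Weibo), January to May 2020; geotagged posts only

Cleaning: tweets directly related to COVID-19 are excluded to avoid political contamination

Technique: BERT NLP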
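
The cleaning and aggregation steps above can be sketched in a few lines of Python. This is only an illustration: the keyword list, the post fields (`country`, `date`, `text`, `score`) and the function names are assumptions made for the example, and the per-post `score` stands in for the output of a sentiment model such as BERT. The study's actual filter terms and pipeline are not given in these notes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical keyword list; the study's real filter terms are not listed here.
COVID_KEYWORDS = ("covid", "coronavirus", "pandemic", "lockdown")


def is_covid_related(text):
    """Return True if the post mentions any of the (assumed) COVID-19 keywords."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in COVID_KEYWORDS)


def daily_sentiment_index(posts):
    """Average the per-post sentiment score by (country, date).

    Posts that mention COVID-19 directly are dropped before averaging.
    Each post is a dict with the assumed keys: country, date, text, score.
    """
    buckets = defaultdict(list)
    for post in posts:
        if is_covid_related(post["text"]):
            continue  # exclude COVID-19-related posts from the main index
        buckets[(post["country"], post["date"])].append(post["score"])
    return {key: mean(scores) for key, scores in buckets.items()}


# Toy input; the scores are made up for illustration.
posts = [
    {"country": "USA", "date": "2020-03-01", "text": "Lovely walk today", "score": 0.75},
    {"country": "USA", "date": "2020-03-01", "text": "Traffic was awful", "score": 0.25},
    {"country": "USA", "date": "2020-03-01", "text": "COVID news again", "score": 0.10},
]
index = daily_sentiment_index(posts)
print(index)
```

The third post is filtered out, so the index for that day is the average of the two remaining scores.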

On the basis of our measures of expressed sentiment and under the assumption that the existing evidence on the correlation between sentiment and affective well-being is valid, we conduct two inter-related empirical exercises to evaluate the global affective impacts of the COVID-19 pandemic and policy responses. The first exercise estimates the overall expressed sentiment alterations associated with COVID-19. We employ reduced-form econometric methods to measure the sentiment drops related to the advent of COVID-19 human-to-human transmission and estimate the recovery time needed for sentiment to return to the baseline levels. Our second exercise applies synthetic control methods (SCM) to explore how social-media-based sentiment measures can be used by countries and international organizations to evaluate alterations in affective states after policy interventions or events, using lockdown policies as an example. To facilitate comparisons across countries, our estimates of sentiment alteration are all measured in the unit of a country’s own magnitude of sentiment variation (that is, the standard deviation of sentiment time series before COVID-19). We describe our approach in more detail in the Methods.

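Two exercises:

①The overall alterations in expressed sentiment associated with COVID-19

②How countries and international organizations can use social-media-based sentiment measures to evaluate changes in affective states after a policy intervention or event, using lockdown policies as the example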
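
The paper expresses every sentiment alteration in units of a country's own pre-COVID-19 standard deviation. A minimal sketch of that standardization follows, assuming the first `n_baseline` days of a series form the pre-COVID-19 period and using the sample standard deviation; the function name, the baseline length and the choice of sample versus population standard deviation are assumptions for illustration.

```python
from statistics import mean, stdev


def standardize(series, n_baseline):
    """Rescale a daily sentiment series by its pre-COVID-19 mean and s.d.

    The first n_baseline values are treated as the baseline period
    (an assumption; the study defines its own baseline window).
    """
    baseline = series[:n_baseline]
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline
    if sigma == 0:
        raise ValueError("baseline has no variation; cannot standardize")
    return [(value - mu) / sigma for value in series]


# Toy series: four baseline days followed by a drop and a partial rebound.
raw = [0.50, 0.52, 0.48, 0.50, 0.40, 0.44]
z = standardize(raw, n_baseline=4)
print([round(value, 2) for value in z])
```

After this step a value of −2 means "two baseline standard deviations below the country's usual level", which is what makes drops comparable across countries with different posting cultures.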

Results:

①Expressed sentiment alterations during the COVID-19 pandemic

Figure: The time series of the global standardized daily sentiment score (grey; seven-day smoothed line in black) and the Google mobility index at transit stations (light green; seven-day smoothed line in green). The sentiment index is obtained by applying NLP algorithms to tweets (Methods). The mobility index is from Google (https://www.google.com/covid19/mobility/), using the trend at transit stations.

Global expressed sentiment dropped markedly, especially after the WHO declared COVID-19 a global pandemic.

Figure: Standardized sentiment index by country. The vertical line shows the time the WHO declared COVID-19 a global pandemic, and the purple line represents the sentiment nadir (that is, the lowest point) of each country. The countries are labelled with ISO 3166-1 alpha-3 country codes.

Every country in the sample experienced sentiment changes during the COVID-19 pandemic, with varying magnitude and duration.

To measure the patterns of sentiment alterations created by COVID-19, we develop two global indices (Methods, ‘Modelling of sentiment dynamics’): sentiment drop and recovery half-life.

Sentiment drop: a country’s sentiment decline from the level before COVID-19 to its lowest value during the first wave of COVID-19.

Figure: Mapping of sentiment drop variations across countries. Sentiment drop is defined by the magnitude of sentiment decline from each country’s average sentiment before COVID-19 to its lowest sentiment value and measured by RDD (Methods, ‘Sentiment alterations during the COVID-19 pandemic’). The magnitude of drops is measured in standard deviations before COVID-19. Red represents large shock; green represents small shock. The box plot shows the median (interquartile range) of sentiment shock sizes in s.d.
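
To make the RDD idea concrete, here is a minimal sketch of a sharp regression discontinuity in time: fit one straight line to the days just before a cutoff and another to the days just after, then read off the gap between the two lines at the cutoff. The cutoff day, bandwidth and function names are assumptions for illustration; the study's actual specification is described in its Methods and is not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_jump(days, values, cutoff, bandwidth):
    """Estimate the discontinuity in `values` at `cutoff`.

    Days are re-centred on the cutoff, so each fitted intercept is the
    predicted value exactly at the cutoff from that side.
    """
    left = [(d - cutoff, v) for d, v in zip(days, values)
            if cutoff - bandwidth <= d < cutoff]
    right = [(d - cutoff, v) for d, v in zip(days, values)
             if cutoff <= d <= cutoff + bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Toy standardized sentiment: a gentle upward trend with a drop of 2 s.d. at day 10.
days = list(range(20))
values = [0.1 * d if d < 10 else 0.1 * d - 2.0 for d in days]
jump = rdd_jump(days, values, cutoff=10, bandwidth=10)
print(round(jump, 3))
```

Fitting separate trends on each side is what distinguishes this from a simple before/after difference in means: a pre-existing trend does not get mistaken for a shock.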

Sentiment recovery half-life: the number of days it took a country to recover from its lowest sentiment to half of its stationary state of recovered sentiment (that is, the convergence value in the calibrated sentiment recovery model).
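
One way to read this definition is "the first day after the nadir on which sentiment has climbed at least halfway from the nadir to its stationary value". The sketch below computes that directly from a series. The stationary value is passed in as a given number, and the exponential toy series is only an assumed shape used to generate test data; the study calibrates its own recovery model, which these notes do not reproduce.

```python
import math


def recovery_half_life(series, stationary):
    """Days from the nadir until sentiment is halfway back to `stationary`.

    Returns None if the halfway point is never reached within the series.
    """
    nadir_index = min(range(len(series)), key=series.__getitem__)
    nadir = series[nadir_index]
    target = nadir + 0.5 * (stationary - nadir)
    for offset, value in enumerate(series[nadir_index:]):
        if value >= target:
            return offset
    return None


# Toy series: two baseline days, then a drop to -4 s.d. that relaxes towards
# -1 s.d. following an assumed exponential curve with a true half-life of 4.5 days.
rate = math.log(2) / 4.5
recovery = [-1.0 - 3.0 * math.exp(-rate * t) for t in range(30)]
series = [0.0, 0.0] + recovery
half_life = recovery_half_life(series, stationary=-1.0)
print(half_life)
```

Because the series is daily, the result is rounded up to the first whole day on which the halfway point has been passed.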

The recovery time reflects more than emotional resilience towards the pandemic itself. This measure should be interpreted as a combined effect of pandemic severity and regulatory policies, and it may be influenced by other events happening around the first wave of the pandemic within each country.

Figure: Mapping of recovery half-life variations across countries. Light colours correspond to quick recovery, blue indicates long recovery and purple indicates that the country was still recovering as of 25 May 2020 (that is, recovery degree is below −1 s.d.). The circle sizes further display the recovery degrees by country. The box plot shows the median (interquartile range) of recovery half-life in days.

②Impacts of lockdowns on expressed sentiment

Figure: Average changes in sentiment scores associated with lockdown policies

Figure: Distribution of changes in sentiment associated with the enforcement of a national lockdown for the 52 countries in our sample that enforced a lockdown during our sample period (see Methods, ‘Impacts of lockdowns on expressed sentiment’, for the definition). Supplementary analyses at the subnational level for the United States are included in Supplementary Note 6. The effect is measured by comparing the standardized sentiment index with the pre-COVID-19 sentiment average for each country. The countries are labelled with ISO 3166-1 alpha-3 country codes. The flag SVG assets, used under the CC-BY 4.0 license, are taken from the Emojitwo set: https://emojitwo.github.io/.
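
The lockdown estimates rely on synthetic control methods: a "synthetic" version of the treated country is built as a weighted average of countries that did not lock down, with weights chosen so the synthetic series tracks the treated country before the lockdown. The toy sketch below finds non-negative weights that sum to one with a Frank-Wolfe iteration and exact line search. The solver, the function names and all numbers are assumptions for illustration and are not the study's implementation.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def combine(weights, donors):
    """Weighted sum of donor series, evaluated day by day."""
    return [sum(w * series[t] for w, series in zip(weights, donors))
            for t in range(len(donors[0]))]


def synthetic_weights(treated_pre, donors_pre, iterations=500):
    """Weights on the simplex minimizing pre-period squared error."""
    n = len(donors_pre)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        resid = [f - y for f, y in zip(combine(weights, donors_pre), treated_pre)]
        grads = [dot(series, resid) for series in donors_pre]
        best = min(range(n), key=grads.__getitem__)  # steepest-descent vertex
        direction = [(1.0 if j == best else 0.0) - weights[j] for j in range(n)]
        move = combine(direction, donors_pre)
        denom = dot(move, move)
        if denom == 0:
            break
        step = max(0.0, min(1.0, -dot(resid, move) / denom))  # exact line search
        if step < 1e-12:
            break
        weights = [w + step * d for w, d in zip(weights, direction)]
    return weights


def lockdown_effect(treated_post, donors_post, weights):
    """Average post-period gap between the treated and synthetic series."""
    synthetic = combine(weights, donors_post)
    gaps = [t - s for t, s in zip(treated_post, synthetic)]
    return sum(gaps) / len(gaps)


# Toy data: the treated country is exactly 0.3 * donor A + 0.7 * donor B before lockdown.
donors_pre = [[1.0, 2.0, 3.0, 2.0], [0.0, 1.0, 0.0, 1.0]]
treated_pre = [0.3 * a + 0.7 * b for a, b in zip(*donors_pre)]
weights = synthetic_weights(treated_pre, donors_pre)

donors_post = [[2.0, 2.0], [1.0, 1.0]]
treated_post = [0.8, 0.8]
effect = lockdown_effect(treated_post, donors_post, weights)
print([round(w, 3) for w in weights], round(effect, 3))
```

The estimated effect is the gap between what the treated country actually showed after the lockdown and what its synthetic twin suggests it would have shown without one.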

Discussion

The authors discuss the advantages and limitations of the approach.

Methods

The authors describe the data, the methods, the modelling procedures, the influencing factors and so on.

Data source

GitHub - Jianghao/Sentiment_COVID-19

posted @ 2022-03-24 16:36  我在吃大西瓜呢