Text Analysis of Evolving Emotions and Sentiments in COVID-19 Twitter Communication

In addition to the computer-based analysis, we manually analyzed the posts at a finer granularity, by adopting the work of Scherer [25], who identifies 36 ontological categories that deal with affect: admiration/awe, amusement, anger, anxiety, being touched, boredom, compassion, contempt, contentment, desperation, disappointment, disgust, dissatisfaction, envy, fear, feeling of affection/love, gratitude, guilt, happiness, hatred, hope, humility, interest/enthusiasm, irritation, jealousy, joy, longing, lust, pleasure/enjoyment, pride, relaxation/serenity, relief, sadness, shame, surprise, and tension/stress. These terms help to identify sentiment in natural language [26, 27].

We read each tweet and classified it as factual or emotional, using Scherer’s [25] categories. Approximately 55% of the posts were factual, simply stating a fact (without emotion or sentiment) intended to be true at the time of posting. Other tweets were factual with some type of emotion and included direct reporting of patient experiences (5%). The remaining tweets reflected only a few emotions (anger, anxiety, desperation, disgust, hope, and surprise).
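As an illustration only, the manual coding described above could be recorded and tallied along the lines of the following Python sketch. The category names come from Scherer's list as given in the text; the `factual` label, the function names, and the input layout (one collection of labels per tweet) are assumptions made for this example and are not the study's actual procedure.

```python
from collections import Counter

# Scherer's 36 affect categories, as listed in the text.
SCHERER_CATEGORIES = frozenset({
    "admiration/awe", "amusement", "anger", "anxiety", "being touched",
    "boredom", "compassion", "contempt", "contentment", "desperation",
    "disappointment", "disgust", "dissatisfaction", "envy", "fear",
    "feeling of affection/love", "gratitude", "guilt", "happiness",
    "hatred", "hope", "humility", "interest/enthusiasm", "irritation",
    "jealousy", "joy", "longing", "lust", "pleasure/enjoyment", "pride",
    "relaxation/serenity", "relief", "sadness", "shame", "surprise",
    "tension/stress",
})

# Hypothetical extra label for posts that state a fact.
FACTUAL = "factual"


def validate_labels(labels):
    """Return the labels as a set; raise ValueError for unknown labels."""
    label_set = set(labels)
    unknown = label_set - SCHERER_CATEGORIES - {FACTUAL}
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return label_set


def label_shares(coded_tweets):
    """Fraction of tweets carrying each label.

    coded_tweets: list of label collections, one per manually coded tweet.
    A tweet may carry several labels (e.g. factual plus an emotion), so the
    shares do not need to sum to 1.
    """
    counts = Counter()
    for labels in coded_tweets:
        counts.update(validate_labels(labels))
    total = len(coded_tweets)
    return {label: n / total for label, n in counts.items()} if total else {}
```

Allowing several labels per tweet mirrors the observation that some posts were factual and also carried an emotion.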

An example of a tweet classified as factual (which could be falsified later) was: “The spread of #COVID19 by an asymptomatic or someone who is not showing any symptoms appears to be less likely, said #WHO (@WHO) in the recently published summary of transmission of COVID-19 including symptomatic, pre-symptomatic and asymptomatic patients.” An example of the emotion fear was: “A patient with symptoms of a heart attack refused treatment after reading on Facebook that she would die if she went to hospital during the COVID-19 crisis.” Another tweet (again subject to later falsification), but intended to be factual, was: “@CDCgov issued some very useful current best estimates:—About 1/3 of COVID-19 infections are asymptomatic.—40% of transmission is occurring before people feel sick.—Time from exposure to symptom onset: ~ 6 days on average.”

Data

Using an online tool, https://birdiq.net/Twitter-search, we manually collected tweets based on keywords, such as COVID-19, CDC, and WHO. We reviewed these tweets (over 2000) to identify sentiments and insights that could not be extracted using an automated tool, again, to gain insight into user behaviors. We strove to show the value of manual mining, recognizing that this type of analysis is not feasible on a large scale. Table 20 provides examples of tweets and classifies them based on their actual, or assumed, intended (potential or real) significance.
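The keyword-based selection can be pictured with the short Python sketch below. It is not the search tool used in the study: the function name, the exact matching rule, and the choice of case-sensitive matching (so that the acronym WHO is not confused with the word "who") are assumptions made for illustration.

```python
import re

# Keywords named in the text; a real search would likely use more variants.
KEYWORDS = ("COVID-19", "CDC", "WHO")


def matched_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in the tweet text as standalone tokens.

    Matching is case-sensitive and requires that the keyword is not embedded
    in a longer alphanumeric token, so "@WHO" and "#CDC" match, while
    "who" and "@CDCgov" do not.
    """
    found = []
    for keyword in keywords:
        pattern = (
            r"(?<![A-Za-z0-9])" + re.escape(keyword) + r"(?![A-Za-z0-9])"
        )
        if re.search(pattern, text):
            found.append(keyword)
    return found
```

A rule this strict misses spelling variants such as the hashtag #COVID19, which is one reason the collected tweets were still reviewed by hand.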

Table 20 Data set of ad hoc tweets

From this sample, the most likely keyword, COVID-19, revealed a variety of expected tweets on: the spread of the virus, its seriousness based on experiences and testimonials, testing, innovative approaches to testing and treatments, and others. The tweets shown from the WHO and CDC relate to advice and awareness.

Searches for Nuggets

The notion of the wisdom of the crowds implies that sometimes the crowd is able to perform better than individuals [28]. We investigated whether the content, as provided by the user community of Twitter (the crowd), could provide insights that might be helpful to the general public, or perhaps even medical professionals. The types of insights we were looking for required a human to identify what might be useful content and extract ideas at the tweet level of analysis [29]. Therefore, we reviewed approximately 2000 tweets posted throughout the pandemic. We attempted to identify nuggets; that is, pieces of information with the potential to have real value or use, beyond just a post. Examples of potentially influential tweets are given below. The first is a best practices suggestion.

The following tweet passes on blood type information from a legitimate news source. Such information could be useful for someone assessing their own risk.

Tweet (blood types). This study finds COVID patients with type A blood are at much higher risk of developing life-endangering symptoms, patients with type O blood experience a “protective effect” https://www.nytimes.com/2020/06/03/health/coronavirus-blood-type-genetics.html

However, a later study by Harvard showed that people who were symptomatic and had blood types of B+ or AB+ were more likely to have a positive COVID-19 test than people who were symptomatic with blood type O (Footnote 8).

The following tweet is factual. Knowing how long the illness might last would be useful to anyone concerned with whether they are experiencing a typical duration.

The following tweet reports on a medical study and would be useful for anyone concerned with how seriously the virus might affect them.

Factual: Low levels of the prognostic biomarker suPAR are predictive of mild outcome in patients with symptoms of COVID-19 - a prospective cohort study. Authors: jesper eugen-olsen, Izzet Altintas, Jens Tingleff, Marius St... http://medrxiv.org/cgi/content/short/2020.05.27.20114678

The following two tweets show associations of patient characteristics and occurrence of the disease. These posts are interesting in the sense that the associations being made are non-intuitive. However, they serve as examples of the types of tweets that might trigger self-reporting of whether a person falls into one of these categories, which, in turn, could lead to further investigation to uphold or falsify the conclusions from the reports.

Factual (implications true or falsified later): In one report, dermatologists evaluated 88 COVID-19 patients in an Italian hospital and found 1 in 5 had some sort of skin symptom, mostly red rashes over the trunk. https://inq.news/COVID-toes

Factual (implications true or falsified later): Most #coronavirus patients had no hair https://www.hulldailymail.co.uk/news/uk-world-news/bald-men-could-risk-more-4194866

The tweets below could be important because they provide information on the virus itself, as well as on a potential treatment, although their claims are not scientifically proven.

COVID-19 maybe mutating but it’s for the good. Doctors in Italy have claimed that the symptoms of COVID-19 and their intensity is less than what they experienced with the first wave of patients. This suggests that COVID-19 gets weaker as it spreads. https://elemental.medium.com/could-the-coronavirus-be-weakening-as-it-spreads-928f2ad33f89

A new drug, #famotidine, available over-the-counter for relieving #heartburn, has shown promising results in treating the symptoms of #COVID19 https://www.firstpost.com/health/heartburn-drug-famotidine-may-reduce-symptoms-of-non-hospitalised-COVID-19-patients-suggests-case-series-8452421.html

These tweets illustrate that Twitter posts can provide useful, or potentially useful, information when so much is unknown about this global crisis. Human judgment is needed to assess the validity of the claims in the tweets, with scientific study clearly required for some of them. However, the potential value of the information contained within a tweet could not easily be obtained by software.

Twitter Use

People generally turned to Twitter as a platform to make sense of the pandemic. The tweets showed that people also wanted to provide useful information for others, sharing their opinions and knowledge. There were many compassion posts triggered by personal situations.

Example (desperation/disgust): My father, 62 yr suffering from high fever (103-104) from 9 days with no other apparent symptoms. He tested negative on COVID 19. He has history of CABG in 2006. Our family doctor advised to get him admitted. No hospital is accepting patient with fever. Pls help #caremongers

Example (factual/tension/stress): My friend is a nurse & finally broke her silence. She said she’s seeing COVID-19 patients leaving the hospital after COVID with kidney damage. Others will suffer with COPD like symptoms for the rest of their lives. It’s very scary.

Example (disgust) MILD. There’s a huge amount at stake in term mild – for gov actors, health service planners, clinicians, patients, carers...In the days when my own ‘mild’ #COVID19 symptoms have been manageable (Day 52 now), I reflected on mild COVID-19 for @somatosphere https://t.co/rQ9wFdcSQ7?amp=1

This mining revealed a great many posts with different perspectives. Many posts were intended to provide useful information. However, some of the posts that reported information considered to be factual (e.g., do not need to wear masks) had the potential to later be proven false.

Ideally, the mining for nuggets could produce insights for the management of the virus. For example, some tweets reported on successful convalescent plasma treatments, leading to requests for plasma donations from recovered patients. Other tweets reported some members of a family getting the virus while others who lived at the same location did not. Such reports might be of interest to researchers trying to find commonalities in these cases. The Appendix contains tweets mined from an additional data set. They reveal a combination of medical innovations (attempted or actual), health information, sentiment and personal reports, opinions, and creative comments. The tweets reveal a need to contribute to an ongoing crisis by providing medical information; contributing to the global conversation on COVID-19; or seeking help.

Themes—Summary

The use of Twitter as a critical social media tool in times of major communication needs was obvious, with Twitter text providing valuable insights into users’ opinions and attitudes. The same held true as other world events unfolded; for example, the Arab Spring and Japan’s earthquakes [6]. For COVID-19, the sentiment analysis revealed a change over time as the pandemic progressed. The most notable trend was that tweet content progressed from providing, and seeking, factual information to expressing emotions, including anger. Prior research found that Twitter, along with other social media, could be used as a predictor of COVID-19 cases and other threats to community health [30]. It is likely there will be continued use of these platforms. The development of large databases of tweets or other user-generated content should, thus, continue to provide substantial research opportunities to investigate COVID-19 or other issues related to global challenges of such magnitude [7].

Content themes emerged. The tweets emphasized information on testing, treating, reporting of well-known figures who tested positive, warnings about the severity of the disease, and other health-related information. Additional themes related to politics, reported scientific breakthroughs (some of which were later shown to be false), economics, reopening of schools and businesses, and others.

We attempted to understand the content of the tweets using sentiment analysis. Many tweets were factual; others showed predictable sentiments of anger, desperation, and hope. Of interest was how Twitter might be used to identify information nuggets, in the traditional sense of a valuable idea. This involved manual inspection and mining. One nugget was a relatively early suggestion that a hospital in India collect the blood of patients who had recovered from COVID-19. Later, the identification of blood type was scientifically investigated as an indicator of the risk of experiencing the disease. However, there does not appear to be a way for a computer program to connect these two, demonstrating the limitations of tools to extract inherent information in text data [12].

In the same way, there is much intentional or unintentional sharing of misinformation, often referred to as fake news [31, 32]. A computer program that can deal with sentiment well might be able to identify tweets with specific content and others with opposite or contradictory content. We did not, for example, investigate tweets that suggested the COVID-19 pandemic should not be taken seriously. Instead, we considered reasons why people elected to share content. Representative examples are provided in Table 21.

Table 21 Sharing of Twitter content

Many other investigations are possible. For example, could the impact of international protests be factored into the sentiment analysis? Is it possible to identify a “tipping” point where people realized the importance of being vigilant (wearing masks, etc.) based on posts reported by infected people relaying the seriousness of the disease to others?

Twitter has been used for social debates and expressions of public opinion (e.g., [33]). No doubt, it will continue to be used in this manner for topics of large, public impact. However, with millions of tweets being generated each day, our study has involved a limited number of texts, restricted to those written in English. It would be useful to expand the categories of sentiment we use as well as to determine whether there were any age group or gender differences in the negative tweets. Finally, finding the true nuggets will, no doubt, require a huge, semi-automated approach, but doing so might help to identify insights that could lead to the development of better sentiment analysis tools.

What we learn from Twitter as a platform is the potential to reach a large audience and provide much information, informative or otherwise. Of course, there are many issues, but it is not possible to verify them without scientific experimentation and reporting of actual numbers. For example, at one point, based on data from Italy and the UK, a website reported men as having approximately twice the number of deaths as women (Footnote 9). Finally, not all insights can be obtained using existing sentiment analysis tools, but there is a limited amount of insight that can be obtained from manual mining.
