TY - JOUR
T1 - Identifying Patients With Inflammatory Bowel Disease on Twitter and Learning From Their Personal Experience
T2 - Retrospective Cohort Study
AU - Stemmer, Maya
AU - Parmet, Yisrael
AU - Ravid, Gilad
N1 - Publisher Copyright: © 2022 Journal of Medical Internet Research. All rights reserved.
PY - 2022/8/1
Y1 - 2022/8/1
N2 - Background: Patients use social media as an alternative information source, where they share information and provide social support. Although large amounts of health-related data are posted on Twitter and other social networking platforms each day, research using social media data to understand chronic conditions and patients' lifestyles is limited. Objective: In this study, we contributed to closing this gap by providing a framework for identifying patients with inflammatory bowel disease (IBD) on Twitter and learning from their personal experiences. We enabled the analysis of patients' tweets by building a classifier of Twitter users that distinguishes patients from other entities. This study aimed to uncover the potential of using Twitter data to promote the well-being of patients with IBD by relying on the wisdom of the crowd to identify healthy lifestyles. We sought to leverage posts describing patients' daily activities and their influence on their well-being to characterize lifestyle-related treatments. Methods: In the first stage of the study, a machine learning method combining social network analysis and natural language processing was used to automatically classify users as patients or not. We considered 3 types of features: the user's behavior on Twitter, the content of the user's tweets, and the social structure of the user's network. We compared the performances of several classification algorithms within 2 classification approaches. One classified each tweet and deduced the user's class from their tweet-level classification. The other aggregated tweet-level features to user-level features and classified the users themselves. Different classification algorithms were examined and compared using 4 measures: precision, recall, F1 score, and the area under the receiver operating characteristic curve. In the second stage, a classifier from the first stage was used to collect patients' tweets describing the different lifestyles patients adopt to deal with their disease. Using IBM Watson Service for entity sentiment analysis, we calculated the average sentiment of 420 lifestyle-related words that patients with IBD use when describing their daily routine. Results: Both classification approaches showed promising results. Although the precision rates were slightly higher for the tweet-level approach, the recall and area under the receiver operating characteristic curve of the user-level approach were significantly better. Sentiment analysis of tweets written by patients with IBD identified frequently mentioned lifestyles and their influence on patients' well-being. The findings reinforced what is known about suitable nutrition for IBD as several foods known to cause inflammation were pointed out in negative sentiment, whereas relaxing activities and anti-inflammatory foods surfaced in a positive context. Conclusions: This study suggests a pipeline for identifying patients with IBD on Twitter and collecting their tweets to analyze the experimental knowledge they share. These methods can be adapted to other diseases and enhance medical research on chronic conditions.
KW - IBD
KW - NLP
KW - Twitter
KW - inflammatory bowel disease
KW - natural language processing
KW - patient identification
KW - sentiment analysis
KW - user classification
UR - http://www.scopus.com/inward/record.url?scp=85135512032&partnerID=8YFLogxK
U2 - 10.2196/29186
DO - 10.2196/29186
M3 - Article
C2 - 35917151
AN - SCOPUS:85135512032
SN - 1439-4456
VL - 24
JO - Journal of Medical Internet Research
JF - Journal of Medical Internet Research
IS - 8
M1 - e29186
ER -