Sentimental Analysis of COVID-19 Related Messages in Social Networks by Involving an N-Gram Stacked Autoencoder Integrated in an Ensemble Learning Scheme
Language: English
Country: Switzerland
Media: electronic
Document type: Journal Article
Grant support: RSP-2021/167, King Saud University
PubMed: 34833656
PubMed Central: PMC8623208
DOI: 10.3390/s21227582
PII: s21227582
- Keywords
- COVID-19, N-gram feature extraction, data prediction, ensemble machine learning, twitter data
- MeSH
- COVID-19 * MeSH
- Humans MeSH
- Pandemics MeSH
- SARS-CoV-2 MeSH
- Social Media * MeSH
- Social Networking MeSH
- Machine Learning MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
People worldwide use social media extensively to share thoughts, societal issues, and personal concerns. Social media can be viewed as an intelligent platform that can be augmented with a capability to analyze and predict various issues such as business needs, environmental needs, election trends (polls), and governmental needs. This has motivated us to initiate a comprehensive search of COVID-19 pandemic-related views and opinions amongst the population on Twitter. The basic training data have been collected from Twitter posts. On this basis, we apply ensemble deep learning techniques to predict the future evolution of views on Twitter more accurately than previous works with the same aim. First, feature extraction is performed through an N-gram stacked autoencoder supervised learning algorithm. The extracted features are then used for classification and prediction by an ensemble fusion scheme of selected machine learning techniques: decision tree (DT), support vector machine (SVM), random forest (RF), and K-nearest neighbour (KNN). All individual results are then fused, using both mean and mode techniques, for a better prediction. Our proposed scheme, an N-gram stacked autoencoder integrated in an ensemble machine learning scheme, outperforms all the other existing competing techniques, such as the unigram autoencoder and the bigram autoencoder. Our experimental results have been obtained from a comprehensive evaluation on a dataset extracted from open-source Twitter data that were filtered by using the keywords "covid", "covid19", "coronavirus", "covid-19", "sarscov2", and "covid_19".
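The pipeline steps named in the abstract (keyword filtering, N-gram extraction, and mean/mode fusion of classifier outputs) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stacked autoencoder and the DT, SVM, RF, and KNN classifiers are not reproduced, all function names are hypothetical, and the reading of "mode" as a majority vote over labels and "mean" as a thresholded average of positive-class scores is an assumption. Only the keyword list is taken from the abstract.

```python
from collections import Counter
from statistics import mean

# Keywords quoted in the abstract for filtering tweets.
KEYWORDS = ("covid", "covid19", "coronavirus", "covid-19", "sarscov2", "covid_19")


def is_covid_related(text):
    """Return True if the lowercased text contains any keyword (hypothetical helper)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def word_ngrams(text, n):
    """Return word n-grams as tuples, using simple whitespace tokenisation (assumption)."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fuse_by_mode(labels):
    """Mode fusion (assumed to mean majority vote): return the most common label."""
    return Counter(labels).most_common(1)[0][0]


def fuse_by_mean(scores, threshold=0.5):
    """Mean fusion (assumed): average positive-class scores, then apply a threshold."""
    return 1 if mean(scores) >= threshold else 0


if __name__ == "__main__":
    tweets = ["Stay safe during the coronavirus outbreak", "Lovely weather today"]
    related = [t for t in tweets if is_covid_related(t)]
    print(related)

    # Unigrams and bigrams of the first retained tweet.
    print(word_ngrams(related[0], 1))
    print(word_ngrams(related[0], 2))

    # Placeholder outputs standing in for four classifiers (DT, SVM, RF, KNN);
    # these numbers are invented for illustration, not results from the paper.
    predicted_labels = [1, 0, 1, 1]
    predicted_scores = [0.9, 0.4, 0.7, 0.6]
    print(fuse_by_mode(predicted_labels))
    print(fuse_by_mean(predicted_scores))
```

In this sketch the two fusion rules can disagree when one classifier is very confident and the others are narrowly on the other side, which is one reason a scheme might report both.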
Department of Mathematics Faculty of Science Mansoura University Mansoura 35516 Egypt
Faculty of Informatics and Computing Singidunum University Danijelova 32 11000 Belgrade Serbia
Faculty of Science and Technology Norwegian University for Life Science 1430 Ås Norway