A Classification Model to Analyze the Spread and Emerging Trends of the Zika Virus in Twitter

Tripathy, B. K.; Thakur, Saurabh; Chowdhury, Rahul

doi:10.1007/978-981-10-3874-7_61

Conference paper
First Online: 20 May 2017

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 556)

Abstract

The Zika virus epidemic of 2015–16 continues to be a global health issue. The recent trend of sharing critical information on social networks such as Twitter motivated us to propose a model that classifies tweets related to Zika and thereby lets us extract helpful insights into the community. In this paper, we describe the collection of data from Twitter, the preprocessing of that data, and the building of a model to fit it. We then compare the accuracy of support vector machines and the Naïve Bayes algorithm for text classification and state the reason for the superiority of the support vector machine over Naïve Bayes. We also present analytical tools such as word clouds, which provide a more sophisticated way to retrieve community support from social networks such as Twitter.
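As a rough illustration of the pipeline the abstract outlines (preprocessing tweets, fitting a text classifier, and counting terms as the basis for a word cloud), the following minimal sketch uses only the Python standard library. It is not the authors' implementation: the tweets, the labels `relevant` and `irrelevant`, and the helper names `preprocess` and `MultinomialNB` are all invented for illustration, and only the Naïve Bayes side of the comparison is shown, since a support vector machine would normally come from an external library.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(tweet):
    """Lowercase, drop URLs and @mentions, and keep alphabetic tokens only."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet)
    return re.findall(r"[a-z]+", tweet)


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in tokens:
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not from the paper's dataset).
train = [
    ("Zika virus cases rising in Brazil, health officials warn", "relevant"),
    ("Pregnant women urged to avoid mosquito bites amid Zika outbreak http://t.co/x", "relevant"),
    ("New Zika symptoms reported: fever and rash @WHO", "relevant"),
    ("Loving the new album, on repeat all day", "irrelevant"),
    ("Traffic is terrible this morning", "irrelevant"),
    ("Great football match last night", "irrelevant"),
]
docs = [preprocess(text) for text, _ in train]
labels = [label for _, label in train]
model = MultinomialNB().fit(docs, labels)

prediction = model.predict(preprocess("Zika outbreak spreads, mosquito control urged"))
# Term frequencies per class are the raw input a word cloud would visualize.
top_terms = model.word_counts["relevant"].most_common(3)
print(prediction, top_terms)
```

On real data, the same token lists could be fed to a library SVM to reproduce the kind of accuracy comparison the abstract describes. That step is omitted here to keep the sketch dependency-free.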



Author information

Authors and Affiliations

School of Computing Science and Engineering, VIT University, Vellore, India
B. K. Tripathy, Saurabh Thakur & Rahul Chowdhury


Corresponding author

Correspondence to B. K. Tripathy.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering & Information Technology, Veer Surendra Sai University of Technology, Sambalpur, Odisha, India
Himansu Sekhar Behera
Department of CSE, National Institute of Technology (NIT), Rourkela, Odisha, India
Durga Prasad Mohapatra


About this paper

Cite this paper

Tripathy, B.K., Thakur, S., Chowdhury, R. (2017). A Classification Model to Analyze the Spread and Emerging Trends of the Zika Virus in Twitter. In: Behera, H., Mohapatra, D. (eds) Computational Intelligence in Data Mining. Advances in Intelligent Systems and Computing, vol 556. Springer, Singapore. https://doi.org/10.1007/978-981-10-3874-7_61


DOI: https://doi.org/10.1007/978-981-10-3874-7_61
Published: 20 May 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3873-0
Online ISBN: 978-981-10-3874-7
eBook Packages: Engineering, Engineering (R0)
