HOAX NEWS DETECTION SYSTEM ON SOCIAL MEDIA USING SCIKIT-LEARN DATA MINING TECHNIQUES

Munawar Munawar, Yosua Riadi

Abstract


Social media, especially Twitter and Facebook, has become an alternative channel for news dissemination. Data show that complaints about hoax news reached 5070 in 2017 (Damar, 2017). There is also an increasing tendency to fabricate lies to cover up the truth, a practice described as hocus, meaning to trick (Prasetijo et al., 2017). A tool is therefore needed to detect whether a news item is a hoax. To the best of our knowledge, the only research on hoax detection in the Indonesian language uses text vector representations based on Term Frequency and Document Frequency together with Support Vector Machine and Stochastic Gradient Descent classification, and reaches 60% accuracy (Prasetijo et al., 2017). Research is still needed to develop an integrated application for detecting hoax news on social media. Scikit-learn is a Python module that integrates various machine learning algorithms for medium-scale supervised and unsupervised problems. The module is very efficient for data mining and data analysis (Jason, 2014). With Python and scikit-learn, machine learning models can be built to detect hoax news on social media. This research covers the development of an application for pre-processing data collected from Twitter and Facebook over 3 months, the creation of models with scikit-learn, and the testing of those models on actual news to check their accuracy in detecting hoax news. The results indicate that hoax news on social media can be detected by building a classification model with TF-IDF, CountVectorizer, the Passive Aggressive Classifier, and the Support Vector Classifier. The resulting model indicates whether a news item is fake or real based on the accuracy of the vector classification results. The higher the accuracy of a news item in the classification vector, the more easily it can be identified as fake or real.
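The classification setup named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy texts, the labels, the pairing of each vectorizer with each classifier, and the helper name `build_models` are all assumptions made for illustration. Only the scikit-learn class names come from the techniques the abstract lists.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus; the study itself used posts collected
# from Twitter and Facebook, which are not reproduced here.
texts = [
    "miracle cure removes all disease overnight share now",
    "government secretly replaces water with chemicals forward this",
    "free phone for everyone who forwards this message today",
    "city council approves budget for new public library",
    "central bank keeps interest rate unchanged this month",
    "local team wins regional football championship final",
]
labels = ["hoax", "hoax", "hoax", "real", "real", "real"]


def build_models():
    """Return two text classification pipelines (hypothetical pairing)."""
    return {
        # TF-IDF features fed to a Passive Aggressive classifier.
        "tfidf_pa": make_pipeline(
            TfidfVectorizer(),
            PassiveAggressiveClassifier(max_iter=1000, random_state=0),
        ),
        # Raw term counts fed to a linear Support Vector classifier.
        "count_svc": make_pipeline(
            CountVectorizer(),
            SVC(kernel="linear"),
        ),
    }


models = build_models()
for model in models.values():
    model.fit(texts, labels)

# Classify an unseen (again hypothetical) news snippet with each model.
sample = ["forward this message to receive a free miracle cure"]
predictions = {name: model.predict(sample)[0] for name, model in models.items()}
print(predictions)
```

Wrapping each vectorizer and classifier in a pipeline keeps the vocabulary fitted only on training text, so the same object can later be evaluated on held-out news without leaking test vocabulary into the features.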

Keywords: hoax detection, news, scikit-learn, social media


Full Text:

PDF

References


Cahyanti, O. D., Saksono, P. H., Suryayusra, & Negara, E. (2015). Pemanfaatan Data Media Sosial Untuk Penelitian. In Social Media Analytics. Palembang.

Damar, A. M. (2017). Jumlah Aduan Hoax dan SARA Lampaui Konten Pornografi.

Grafelly, D. (2015). Bagaimana perkembangan Twitter saat ini? Retrieved from http://www.techno.id/social/bagaimana-perkembangan-twitter-saat-ini-1509122.html

Hootsuite. (2019). Digital 2019: Global Internet Use Accelerates – We Are Social. Retrieved from https://wearesocial.com

Ishikawa, H. (2015). Social Big Data Mining. USA: Taylor & Francis Group.

Jason, B. (2014). A Gentle Introduction to Scikit-Learn: A Python Machine Learning Library. Retrieved from https://machinelearningmastery.com/a-gentle-introduction-to-scikit-learn-a-python-machine-learning-library/

Prasetijo, A. B., Isnanto, R. R., Eridani, D., Soetrisno, Y. A. A., Arfan, M., & Sofwan, A. (2017). Hoax detection system on Indonesian news sites based on text classification using SVM and SGD. Proceedings - 2017 4th International Conference on Information Technology, Computer, and Electrical Engineering, ICITACEE 2017, 2018-January, 45–49. https://doi.org/10.1109/ICITACEE.2017.8257673

Rahadi, D. R. (2017). Pelaku Pengguna Dan Informasi Hoax Di Media Sosial. Jurnal Manajemen Dan Kewirausahaan, 5(1).

Ramageri, B. M. (2010). Data Mining Techniques and Applications. Indian Journal of Computer Science and Engineering, 1(4), 301–305.




DOI: https://doi.org/10.47007/komp.v4i02.3140


