Improvement performance by using Machine learning algorithms for fake news detection

Authors

  • Eman Shekhan Hamsheen, College of Science, Computer Science and IT Department, Salahaddin University-Erbil, Kurdistan Region, Iraq.
  • Laith R. Flah, Computer Science Department, Cihan University-Erbil, Kurdistan Region, Iraq.

DOI:

https://doi.org/10.21271/ZJPAS.35.2.6

Keywords:

Fake News Detection, Kurdish Language, Machine Learning, Classifiers, Passive-Aggressive.

Abstract

   The prevalence of internet use and the volume of real-time data created and shared on social media sites and applications have raised the risk of spreading harmful or misleading content, engaging in unlawful activity, abusing others, and disseminating false information. To date, only a few studies have addressed fake news recognition in the Kurdish language. For highly resourced languages such as Arabic, English, and other international languages, fake news detection is a well-researched subject. Less-resourced languages, however, receive little attention because they lack a labeled fake news corpus, fact-checking websites, and access to NLP tools. This paper illustrates the process of identifying fake news using a dataset with two components: fake news and real news. Several classifiers were then applied to the data after a feature-selection step based on identifiers. The results show that the Passive-Aggressive Classifier (PAC) outperformed the other classifiers on both datasets, with an accuracy of 93.0 percent; the other classifiers scored somewhat lower but still achieved high accuracy, at about 90 percent.
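
The abstract names the Passive-Aggressive Classifier but does not give its features or hyperparameters, so the following is only a minimal sketch of how such a classifier learns, not the authors' implementation. It assumes bag-of-words counts as features, the PA-I update rule with an aggressiveness parameter C, and a handful of hypothetical English sentences standing in for the Kurdish dataset.

```python
from collections import Counter

# Hypothetical toy examples in English; NOT the authors' Kurdish dataset.
TRAIN_DOCS = [
    ("shocking miracle cure doctors hate", "fake"),
    ("council approves annual budget report", "real"),
    ("secret plot exposed shocking truth", "fake"),
    ("university publishes annual exam schedule", "real"),
]


def featurize(text):
    """Turn text into bag-of-words counts (an assumed feature choice)."""
    return Counter(text.lower().split())


def score(weights, features):
    """Dot product between the sparse weight vector and the features."""
    return sum(weights.get(tok, 0.0) * cnt for tok, cnt in features.items())


def train(docs, C=1.0, epochs=5):
    """Train a binary Passive-Aggressive (PA-I) classifier.

    Labels map to +1 ("fake") and -1 ("real"). C and epochs are
    illustrative values, not settings reported in the paper.
    """
    weights = {}
    for _ in range(epochs):
        for text, label in docs:
            x = featurize(text)
            if not x:
                continue  # skip empty documents
            y = 1.0 if label == "fake" else -1.0
            # Hinge loss: zero when the example is classified with margin >= 1.
            loss = max(0.0, 1.0 - y * score(weights, x))
            if loss == 0.0:
                continue  # "passive": leave the weights unchanged
            # "Aggressive": step just far enough to fix the margin, capped by C.
            sq_norm = sum(cnt * cnt for cnt in x.values())
            tau = min(C, loss / sq_norm)
            for tok, cnt in x.items():
                weights[tok] = weights.get(tok, 0.0) + tau * y * cnt
    return weights


def predict(weights, text):
    """Return "fake" for a positive score, otherwise "real"."""
    return "fake" if score(weights, featurize(text)) > 0 else "real"


weights = train(TRAIN_DOCS)
print(predict(weights, "shocking secret cure"))
print(predict(weights, "annual budget schedule"))
```

The update is online: each document is seen one at a time, and the weights change only when a document is misclassified or falls inside the margin, which is why this family of classifiers suits streams of news items. A real experiment would also need held-out test data to report an accuracy figure.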

Published

2023-04-20

How to Cite

Eman Shekhan Hamsheen, & Laith R. Flah. (2023). Improvement performance by using Machine learning algorithms for fake news detection. Zanco Journal of Pure and Applied Sciences, 35(2), 48–57. https://doi.org/10.21271/ZJPAS.35.2.6

Section

Mathematics, Physics and Geological Sciences