Abstract

Email use is now massive and widespread, with both positive and negative effects. One negative effect is spam email: unsolicited messages sent to many recipients that contain product promotions, pornographic content, viruses, and other unimportant content. Spam can result from data leaks, illegal data sales, and users' own sign-ups to groups and mailing lists that are then misused. Addressing this problem requires an email classification method that can automatically detect whether an email is harmful spam or not. This study aims to develop a model that classifies email as spam or non-spam using the Naïve Bayes algorithm under two approaches: the scikit-learn library and a purely mathematical implementation of the Naïve Bayes algorithm. Comparative studies rarely include a purely mathematical implementation of an algorithm; they usually compare several different algorithms against one another. The researchers therefore make that comparison in this study. The results show that Naïve Bayes with the scikit-learn library classifies very well, reaching an accuracy of 97%, while the purely mathematical Naïve Bayes implementation achieves a higher accuracy of 98%.
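The abstract does not show the hand-written ("pure mathematics") implementation, so the following is only a minimal sketch of what such an approach could look like, assuming a multinomial Naïve Bayes model with Laplace smoothing over word counts. The function names `train_naive_bayes` and `predict`, the `alpha` parameter, and the toy messages are all hypothetical and are not taken from the study.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenized documents.

    alpha is the Laplace smoothing constant (an assumption of this sketch).
    """
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # per-class word frequencies
    vocabulary = set()
    for tokens, label in zip(documents, labels):
        word_counts[label].update(tokens)
        vocabulary.update(tokens)

    total_docs = len(labels)
    # log P(class): share of training documents in each class
    log_priors = {c: math.log(n / total_docs) for c, n in class_doc_counts.items()}

    # log P(word | class) with Laplace smoothing
    log_likelihoods = {}
    for c in class_doc_counts:
        denom = sum(word_counts[c].values()) + alpha * len(vocabulary)
        log_likelihoods[c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocabulary
        }
    return log_priors, log_likelihoods, vocabulary


def predict(model, tokens):
    """Return the class with the highest log posterior for one document."""
    log_priors, log_likelihoods, vocabulary = model
    scores = {}
    for c, prior in log_priors.items():
        # Words never seen in training are skipped rather than penalized.
        scores[c] = prior + sum(
            log_likelihoods[c][w] for w in tokens if w in vocabulary
        )
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration.
train_docs = [
    ["win", "free", "prize", "now"],
    ["free", "offer", "click", "now"],
    ["meeting", "schedule", "tomorrow"],
    ["project", "report", "meeting"],
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, ["free", "prize"]))      # spam
print(predict(model, ["meeting", "report"]))  # ham
```

Working in log space avoids numeric underflow when many word probabilities are multiplied. The library-based counterpart would presumably rely on one of scikit-learn's Naïve Bayes estimators such as `MultinomialNB`, although the abstract does not say which variant was used.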

Keywords

Naïve Bayes algorithm, Impact of spam email, Spam detection, Email security, Data leaks
