Cost-Sensitive Tree and Naïve Bayes for Multiclass Classification

  • M. Aldiki Febriantono, Bina Nusantara
  • Ridho Herasmara, Universitas Islam Raden Rahmat
  • Gusti Pangestu, Universitas Bina Nusantara
Keywords: cost sensitive, decision tree, multiclass classification, naïve bayes.

Abstract

Data mining is the processing of data to support fast, precise, and accurate decision making. In healthcare and manufacturing it is especially important, because a single classification error (misclassification) can have serious consequences. The main difficulty arises when the data are both imbalanced and multiclass: the classifier struggles to separate the classes, which leads to misclassification. One way to minimize misclassification is to apply a cost-sensitive method to the C5.0 decision tree and naïve Bayes classifiers. This study uses the glass, lymphography, vehicle, thyroid, and wine datasets obtained from the UCI Repository. Attribute selection is first performed on all five datasets using particle swarm optimization. The datasets are then tested with the cost-sensitive C5.0 decision tree and cost-sensitive naïve Bayes methods. The cost-sensitive C5.0 decision tree achieved accuracies of 76.17%, 83.33%, 75.27%, 95.81%, and 95.83% on the glass, lymphography, vehicle, thyroid, and wine datasets, respectively. The cost-sensitive naïve Bayes method achieved accuracies of 32.24%, 82.61%, 25.53%, 97.67%, and 94.94% on the same datasets, respectively.
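The abstract does not say how misclassification costs enter the two classifiers, so the following is only a minimal sketch of one common cost-sensitive decision rule: predict the class with the lowest expected cost given the class posteriors. The function names, the 3-class cost matrix, and the posterior values are all hypothetical and are not taken from the paper.

```python
# Minimum-expected-cost decision rule (generic sketch, not the authors' implementation).
# cost_matrix[i][j] is the assumed cost of predicting class j when the true class is i.

def expected_costs(posterior, cost_matrix):
    """Return the expected cost of predicting each class, given class posteriors."""
    n_classes = len(cost_matrix)
    return [
        sum(posterior[i] * cost_matrix[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]

def predict_min_cost(posterior, cost_matrix):
    """Return the index of the class whose prediction has the lowest expected cost."""
    costs = expected_costs(posterior, cost_matrix)
    return min(range(len(costs)), key=costs.__getitem__)

def predict_max_posterior(posterior):
    """Plain (cost-insensitive) rule: return the most probable class."""
    return max(range(len(posterior)), key=posterior.__getitem__)

# Hypothetical costs: missing the rare class 2 is five times worse than any other error.
COST = [
    [0, 1, 1],
    [1, 0, 1],
    [5, 5, 0],
]

# Hypothetical posteriors, e.g. as produced by a naive Bayes or decision tree model.
sample = [0.5, 0.1, 0.4]

plain_choice = predict_max_posterior(sample)   # ignores costs
cost_choice = predict_min_cost(sample, COST)   # weighs the costly class-2 errors
```

With these assumed numbers the plain rule picks class 0, while the cost-sensitive rule picks class 2, because wrongly rejecting class 2 carries the highest penalty. This is the sense in which a cost matrix can shift decisions toward minority classes in imbalanced multiclass data.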

Published
2021-02-23
How to Cite
[1]
M. A. Febriantono, R. Herasmara, and G. Pangestu, “Cost Sensitive Tree dan Naïve Bayes pada Klasifikasi Multiclass”, JIP, vol. 7, no. 2, pp. 57-64, Feb. 2021.