Classification of Student Achievement Using Data Mining Techniques: A Comparative Study
Keywords:
Data mining, Student performance, Classification, Machine learning, Neural networks
Abstract
This study investigates the application of data mining techniques to classify secondary school students' academic performance. The Student Performance Dataset, obtained from the UCI Machine Learning Repository, was used for analysis. After excluding two of the exam results, the dataset comprised 31 attributes for 395 students. Classification was based on final exam grades: scores of 0 to 10 were labeled "unsuccessful" (0) and scores of 11 to 20 were labeled "successful" (1). The dataset was preprocessed to correct CSV format errors so that it could be analyzed in the WEKA software. Four classification algorithms (Iterative Classifier Optimizer, OneR, LogitBoost, and Artificial Neural Networks) were evaluated using 5-, 7-, and 10-fold cross-validation. OneR achieved the highest average accuracy (92.15%) and sensitivity (96%), while LogitBoost yielded the best specificity (88%). The findings suggest that, among the four algorithms compared, OneR is the most effective for classifying student success on this dataset.
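The labeling rule and the three reported metrics can be illustrated with a minimal Python sketch. It is not the WEKA workflow used in the study: the function names are hypothetical, the OneR variant handles categorical attributes only (WEKA's OneR also discretizes numeric ones), cross-validation is omitted, and treating "successful" (1) as the positive class for sensitivity is an assumption, since the abstract does not state which class is positive.

```python
from collections import Counter, defaultdict


def binarize(grade, threshold=10):
    """Map a 0-20 final grade to 0 (unsuccessful, 0-10) or 1 (successful, 11-20)."""
    return 1 if grade > threshold else 0


def fit_one_r(rows, labels):
    """Simplified OneR: keep the single attribute whose per-value
    majority-class rule makes the fewest errors on the training data."""
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen values
    best = None
    for attr in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[attr]][y] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    _, attr, rule = best
    return attr, rule, default


def predict_one_r(model, row):
    """Apply the one-attribute rule; unseen values get the majority class."""
    attr, rule, default = model
    return rule.get(row[attr], default)


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity, assuming class 1 is positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Toy, made-up records (not taken from the dataset): two categorical attributes.
toy_rows = [("yes", "a"), ("yes", "b"), ("no", "a"), ("no", "b")]
toy_labels = [binarize(g) for g in (15, 12, 8, 5)]
toy_model = fit_one_r(toy_rows, toy_labels)
toy_preds = [predict_one_r(toy_model, r) for r in toy_rows]
toy_scores = confusion_metrics(toy_labels, toy_preds)
```

In a k-fold setting, `fit_one_r` would be trained on k-1 folds and `confusion_metrics` computed on the held-out fold, with the scores averaged over the folds.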
License
Copyright (c) 2025 Black Sea Journal of Statistics

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.