Dissertation
Data mining approach for classification of exam questions with modified thematic term weighting based on Bloom’s taxonomy / Sucipto
Abstract
Classifying exam questions according to Bloom's Taxonomy (BT) is an important part of learning assessment, because it ensures that an evaluation matches the intended cognitive level. Various studies have developed term weighting schemes to improve the accuracy of BT-based classification, including TF-IDF, ETF-IDF, and ETFPOS-IDF. However, approaches that focus solely on analyzing verb types remain limited, especially in capturing the contextual meaning of a question. This study therefore proposes more adaptive thematic term weighting methods, namely TWTFPOS-IDF, FTFPOS-IDF, and HTF-IBS delta F, which consider not only verbs but also thematic words related to the cognitive levels in Bloom's Taxonomy. The methods modify Term Frequency-Inverse Document Frequency (TF-IDF) by incorporating fuzzy and spatial density approaches so that word weights are adjusted automatically according to cognitive level. The models are evaluated with a variety of Machine Learning (ML) and Deep Neural Network (DNN) algorithms, using accuracy and F1-score among the performance metrics. In addition, ANOVA statistical tests were conducted to assess the significance of the performance differences between the proposed weighting methods. The experimental results show that the proposed methods achieve a significant performance improvement over baseline methods, including TF-IDF and ETF-IDF. The HTF-IBS delta F method recorded the highest average accuracy (0.882), followed by TWTFPOS-IDF (0.866) and FTFPOS-IDF (0.849). HTF-IBS delta F also achieved the best F1-score, at 0.866. The ANOVA test showed that the performance differences between the methods were statistically significant (p = 0.002195 for F1-score and p = 0.010223 for accuracy), confirming the effectiveness of the proposed approach in improving classification quality.
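For orientation, the plain TF-IDF baseline that the proposed schemes modify can be sketched as follows. This is a minimal illustration of standard TF-IDF only; it does not implement TWTFPOS-IDF, FTFPOS-IDF, or HTF-IBS delta F, whose formulas are not given in this abstract. The function name `tf_idf` and the sample questions are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency and an unsmoothed logarithmic IDF,
    which is one common textbook variant (an assumption, not the
    dissertation's exact formulation).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical exam questions, tokenized by whitespace (illustrative only).
questions = [
    "define the term algorithm".split(),
    "explain the term recursion".split(),
    "design an algorithm for sorting".split(),
]
weights = tf_idf(questions)
```

In this toy corpus the verb "define" appears in only one question, so it receives a higher weight than "the", which appears in two. A verb-focused or thematic scheme would further adjust such weights by part of speech or cognitive level.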
This research contributes to more accurate identification of the cognitive level of exam questions through a richer thematic term weighting approach. Implementing the model can support more objective and adaptive artificial intelligence-based educational evaluation systems. Further research will focus on automating the identification of word weights and on exploring deep learning techniques to improve model performance and scalability.
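The ANOVA comparison mentioned in the abstract rests on the one-way F statistic, which can be sketched as below. This is a minimal sketch, assuming each weighting method contributes one group of scores (for example, one score per classifier). The function name `one_way_anova_f` is hypothetical, and the numbers are arbitrary placeholders, not results from the dissertation. Turning F into a p-value additionally requires the F-distribution, which is omitted here.

```python
def one_way_anova_f(groups):
    """Compute the one-way ANOVA F statistic for a list of score groups."""
    k = len(groups)                      # number of groups (methods)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Arbitrary placeholder scores for three hypothetical methods.
scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(scores)
```

A larger F value means the differences between method averages are large relative to the variation within each method, which is the condition under which the reported p-values become small.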