UPT Perpustakaan UM


Dissertation

Modification of the SMOTE method using noise reduction and clustering to address imbalanced health data / Hairani

Hairani - Personal Name;

Abstract
Data imbalance occurs when one class is underrepresented (the minority class) while another is overrepresented (the majority class). Imbalanced data can degrade the performance of classification methods and cause overfitting: a classifier can achieve high accuracy on the majority class but low accuracy on the minority class. Imbalanced data is typically addressed with the Synthetic Minority Oversampling Technique (SMOTE). Recently, the SMOTE-LOF method was developed, which treats only minority-class outliers as noise to be removed. However, SMOTE-LOF has several weaknesses. It considers only minority data in the outlier area as noise, whereas the more important type of noise to address is minority data adjacent to the majority class. Furthermore, it relies solely on LOF filtering to detect noise, without any clustering mechanism. It also does not address overlap in the synthetic minority class data it generates. Therefore, this study proposes a combined approach of filtering, clustering, and distance modification to reduce the noise and overlap produced by SMOTE. Filtering uses the k-NN method to remove minority class data (noise) located in majority class regions. Applying this Noise Reduction (NR) step, which removes data considered noise before SMOTE is applied, has a positive effect on handling data imbalance. Clustering establishes decision boundaries by partitioning the data into clusters, allowing SMOTE with a modified distance metric to generate minority class data within each cluster. Combining SMOTE with clustering and distance modification aims to minimize overlap in the synthetic minority data, which can introduce noise.
The proposed method, called NR-Clustering SMOTE, balances data in three stages: (1) filtering, which uses the k-NN method to remove minority class data close to the majority class (noise); (2) clustering with K-means, which establishes decision boundaries by partitioning the data into several clusters; and (3) SMOTE oversampling with Manhattan distance within each cluster. Test results indicate that NR-Clustering SMOTE achieves the best performance across all evaluation metrics for classification methods such as Random Forest, SVM, and Naive Bayes, compared with the original data and traditional SMOTE. The results imply that the proposed methods, namely NR-Modified SMOTE and NR-Clustering SMOTE, were shown to improve classification performance over traditional SMOTE and its recent variants, such as SMOTE-LOF, Radius-SMOTE, and RN-SMOTE, on imbalanced two-class health data. They therefore offer a better alternative for handling imbalanced data than other SMOTE variants. Practically, the findings have the potential to be applied in medical decision support systems to support early detection of diabetes. Improved accuracy on the minority class is expected to enhance the reliability of predictive models and strengthen the basis for healthcare decision-making. Thus, the study contributes not only theoretically to methods for handling imbalanced data but also practically to the application of intelligent technology in support of more accurate diagnosis and clinical decision-making.
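
The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not give the exact filtering rule, the number of clusters, or which data are clustered. The sketch assumes that a minority point is noise when all of its k nearest neighbours belong to the majority class, that K-means is run on the remaining minority points only, and that synthetic points are spread round-robin over clusters with at least two members. All function names (`knn_filter`, `kmeans`, `smote_in_cluster`, `nr_clustering_smote`) and parameter defaults are hypothetical.

```python
import random

def manhattan(a, b):
    """Manhattan (L1) distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_filter(X, y, minority, k=3):
    """Stage 1 (assumed rule): drop minority points whose k nearest
    neighbours all belong to another class."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi == minority:
            nearest = sorted((j for j in range(len(X)) if j != i),
                             key=lambda j: manhattan(xi, X[j]))[:k]
            if all(y[j] != minority for j in nearest):
                continue  # treated as noise
        keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

def kmeans(points, n_clusters, rng, iters=20):
    """Stage 2: plain K-means (squared Euclidean); returns point groups."""
    centres = rng.sample(points, min(n_clusters, len(points)))
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centres[c])))
            groups[nearest].append(p)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centres)]
    return groups

def smote_in_cluster(cluster, rng, k=3):
    """Stage 3: one synthetic point, interpolated between a random member
    and one of its k Manhattan-nearest neighbours in the same cluster."""
    i = rng.randrange(len(cluster))
    base = cluster[i]
    others = [p for j, p in enumerate(cluster) if j != i]
    mate = rng.choice(sorted(others, key=lambda p: manhattan(base, p))[:k])
    gap = rng.random()
    return tuple(b + gap * (m - b) for b, m in zip(base, mate))

def nr_clustering_smote(X, y, minority, n_clusters=2, k=3, seed=0):
    """Filter noise, cluster the minority class, oversample per cluster."""
    rng = random.Random(seed)
    X, y = knn_filter(X, y, minority, k)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    deficit = len(X) - 2 * len(minority_pts)  # majority count minus minority count
    clusters = [c for c in kmeans(minority_pts, n_clusters, rng) if len(c) >= 2]
    new_pts = []
    if clusters:
        for n in range(max(deficit, 0)):
            new_pts.append(smote_in_cluster(clusters[n % len(clusters)], rng, k))
    return X + new_pts, y + [minority] * len(new_pts)
```

Called as `nr_clustering_smote(X, y, minority=1)` with `X` a list of numeric tuples and `y` a list of labels, the function returns the filtered data plus enough synthetic minority points to match the majority count. Because each synthetic point is interpolated between two members of the same cluster, it stays inside that cluster's bounding box, which is one way to read the abstract's goal of limiting overlap.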


Detailed Information
DDC
Rd 006.312 HAI m
Study Program
Universitas Negeri Malang. Program Studi Teknik Elektro dan Informatika, 2025.
Physical Description
xvi, 99 leaves : ill. ; 30 cm.
Language
Registration No.
00284/RD/25
Edition
Dissertation (Postgraduate)--Universitas Negeri Malang, 2025
Subject
1. DATA MINING - SYNTHETIC MINORITY OVER-SAMPLING TECHNIQUE

Supervisors
1. Triyanna Widiyaningtyas, M.T.; 2. Dr.Eng. Didik Dwi Prasetya, S.T., M.T.

