UPT Perpustakaan UM

Dissertation

A multilingual translation model using named entity recognition and Indonesian as a pivot for local languages / Danang Arbian Sulistyo

Sulistyo, Danang Arbian - Personal Name;

Abstract
Translation of low-resource languages faces several challenges, including limited parallel datasets, loss of semantic meaning, and error propagation in machine translation models. Conventional neural machine translation (NMT) cannot handle these limitations optimally. The Pivot-Based Neural Machine Translation (Pivot-NMT) approach has proven effective in overcoming dataset limitations. The method uses an intermediate language (Indonesian) to facilitate translation between regional languages. However, error propagation across the two translation stages and the inability to accurately maintain named entities still need to be addressed. This dissertation explores the integration of Named Entity Recognition with Pivot-NMT (NER-Pivot-NMT) to address the limitations of previous translation models, particularly in preserving named entities and enhancing processing efficiency. The novel NER-Pivot-NMT combines Named Entity Recognition (NER), for the identification and preservation of named entities, with Pivot-NMT, which addresses the scarcity of parallel datasets in low-resource languages. NER is crucial in the translation process because it automatically identifies and categorizes named entities, including proper nouns, localities, and organizations, in the source language using supervised learning on annotated datasets. The effectiveness of NER directly influences overall translation quality, since these entities are preserved as placeholders throughout the translation process. Once identified, entities are protected from alteration during translation, which averts semantic distortions in the output. Integrating NER with Pivot-NMT ensures that named entities are preserved throughout the multi-step translation process.
This not only preserves the coherence of the translated text but also reduces error propagation, which is particularly crucial in low-resource environments where parallel corpora are scarce and the potential for semantic loss is significant. The effectiveness of NER substantially enhances the system's robustness in delivering high-quality translation. Experimental findings indicate that NER-Pivot-NMT outperforms traditional models such as NMT, Pivot-NMT, and NER-NMT in key evaluation metrics. It achieves a BLEU score of 33.7, which is 4.0 points higher than NER-NMT (29.7) and also exceeds Pivot-NMT (28.9), demonstrating superior translation accuracy. In terms of Entity Preservation Rate (EPR), NER-Pivot-NMT excels with a 92.1% rate, a significant improvement over NER-NMT (85.1%) and Pivot-NMT (78.4%). Furthermore, the model shows improvements in NER precision (91.3%), NER recall (89.7%), and NER F1-score (90.5%), outpacing NER-NMT in all three metrics. While NER-Pivot-NMT demonstrates better entity preservation and higher translation accuracy, it has higher inference latency (2.6 s/sentence) than NER-NMT (1.4 s/sentence), Pivot-NMT (2.1 s/sentence), and NMT (0.9 s/sentence). The model also has more parameters (200M) and a longer training time (22 h) than its counterparts, but the trade-off results in a more reliable and scalable solution for translating regional languages such as Madurese and Javanese. This research primarily contributes a more precise and efficient solution for NMT in low-resource language contexts. It also lays the groundwork for further work on developing pivot-free multilingual models, optimizing datasets, and integrating NMT with other NLP technologies, including speech-to-text. This approach may support language preservation and advancement in the digital era.
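The placeholder mechanism described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the entity list stands in for a trained NER model, the two "translators" are toy functions that merely tag each token, and the names `mask_entities`, `unmask_entities`, `pivot_translate`, and `entity_preservation_rate` are hypothetical. The EPR formula used here (share of source entities found verbatim in the output) is an assumption, since the abstract does not define the metric.

```python
import re
from typing import Callable, Dict, List, Tuple

# Placeholder tokens of the form __ENT0__, __ENT1__, ... (hypothetical format).
PLACEHOLDER = re.compile(r"__ENT\d+__")


def mask_entities(sentence: str, entities: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace each listed entity with a placeholder token.

    The entity list stands in for the output of a trained NER model.
    """
    mapping: Dict[str, str] = {}
    masked = sentence
    # Longest entities first, so a short entity does not split a longer one.
    for entity in sorted(entities, key=len, reverse=True):
        if entity in masked:
            token = f"__ENT{len(mapping)}__"
            mapping[token] = entity
            masked = masked.replace(entity, token)
    return masked, mapping


def unmask_entities(sentence: str, mapping: Dict[str, str]) -> str:
    """Restore the original entities in place of their placeholders."""
    for token, entity in mapping.items():
        sentence = sentence.replace(token, entity)
    return sentence


def pivot_translate(
    sentence: str,
    src_to_pivot: Callable[[str], str],
    pivot_to_tgt: Callable[[str], str],
    entities: List[str],
) -> str:
    """Mask entities, translate source -> pivot -> target, then unmask."""
    masked, mapping = mask_entities(sentence, entities)
    return unmask_entities(pivot_to_tgt(src_to_pivot(masked)), mapping)


def entity_preservation_rate(entities: List[str], output: str) -> float:
    """Assumed definition: fraction of entities appearing verbatim in the output."""
    if not entities:
        return 1.0
    return sum(entity in output for entity in entities) / len(entities)


def make_toy_translator(tag: str) -> Callable[[str], str]:
    """Toy stand-in for an NMT model: alters every token except placeholders."""
    def translate(sentence: str) -> str:
        return " ".join(
            tok if PLACEHOLDER.fullmatch(tok) else f"{tok.lower()}@{tag}"
            for tok in sentence.split()
        )
    return translate


to_pivot = make_toy_translator("id")   # source -> Indonesian (toy)
to_target = make_toy_translator("jv")  # Indonesian -> target (toy)

source = "Danang studies in Malang"
names = ["Danang", "Malang"]

with_masking = pivot_translate(source, to_pivot, to_target, names)
without_masking = to_target(to_pivot(source))

print(with_masking, entity_preservation_rate(names, with_masking))
print(without_masking, entity_preservation_rate(names, without_masking))
```

In this toy setting both translation stages alter every ordinary token, so an unmasked name is changed twice, while a masked name passes through both stages untouched and is restored at the end. A real system would additionally need the translation models to copy placeholder tokens reliably, which this sketch simply assumes.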


Detailed Information
DDC
Rd 621.3076 SUL m
Study Program
Universitas Negeri Malang. Electrical Engineering and Informatics Study Program, 2025.
Physical Description
xii, 173 leaves : ill. ; 30 cm.
Language
Registration No.
00227/RD/25
Edition
Dissertation (Postgraduate)--Universitas Negeri Malang, 2025
Subjects
1. TEKNIK ELEKTRO - MODEL PENERJEMAHAN MULTIBAHASA
2. USING NAMED ENTITY RECOGNITION
3. INDONESIAN AS A PIVOT FOR LOCAL LANGUAGE
4. ELECTRICAL ENGINEERING - MULTILINGUAL TRANSLATION MODEL

Supervisors
1. Aji Prasetya Wibawa, S.T., M.M.T., Ph.D.; 2. Dr.Eng. Didik Dwi Prasetya, S.T., M.T.