Abstract
Tuberculosis (TB) remains a major public health problem in Indonesia, where incidence is high and children account for approximately 16.68% of national TB cases. Reaching the 2030 TB elimination target requires, among other strategies, better early detection through the use of digital technologies in screening and diagnosis. This study aimed to develop a machine learning–based pediatric TB screening model with an automated scoring system to make case detection and notification more efficient. The study used a quantitative approach with a retrospective cohort design and was conducted from April to August 2025 at eight sub-district primary health centers (Puskesmas Kecamatan) in West Jakarta. Data were drawn from the health centers' electronic medical records (rekam medis elektronik, RME) and the tuberculosis information system (SITB) database for 2023–2024. Models were developed under four scenarios using five algorithms: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbor, and Support Vector Machine. Performance was evaluated with 5-fold cross-validation using accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC). The variables with the largest predictive contribution were lymph node enlargement, malaise lasting two weeks or longer, weight loss or stagnation over the preceding two months, nutritional status, and history of TB contact. Using these variables, Decision Tree performed best, with AUROC > 0.90. AUROC values this high (approaching 1) indicate excellent ability to distinguish TB-positive from TB-negative pediatric patients on the basis of the score, and the algorithm suits data that are non-linear and involve interactions among symptoms. The prototype web-based application produced rapid, interactive risk estimates and therefore has the potential to support pediatric TB screening in primary care facilities.
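For readers less familiar with the four evaluation metrics named above, the following is a minimal Python sketch of how accuracy, sensitivity, specificity, and AUROC can be computed for one validation fold and then averaged across folds. It is not the study's code: the function names (`screening_metrics`, `auroc`, `mean_over_folds`), the 0.5 decision threshold, and the toy labels and scores are all illustrative assumptions, and no study data are used.

```python
from __future__ import annotations

from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def screening_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> dict[str, float]:
    """Metrics for one fold; label 1 = TB-positive, 0 = TB-negative.
    The threshold of 0.5 is an assumption made for illustration."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of true TB cases flagged
        "specificity": tn / (tn + fp),  # share of non-TB cases cleared
        "auroc": auroc(y_true, scores),
    }


def mean_over_folds(fold_metrics: list[dict[str, float]]) -> dict[str, float]:
    """Average each metric over the k folds (k = 5 in the study)."""
    return {
        name: sum(m[name] for m in fold_metrics) / len(fold_metrics)
        for name in fold_metrics[0]
    }


# Toy example for a single fold (made-up labels and risk scores).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
print(screening_metrics(toy_labels, toy_scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes discrimination across all thresholds, which is why it is commonly reported alongside them for screening tools.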