A Systematic Survey and Empirical Comparison of Hybrid Methods for Imbalanced Fraud Detection: Combining Resampling and Machine Learning | ||
| AUT Journal of Mathematics and Computing | ||
| Articles in Press, Accepted Manuscript, Available Online from 12 Azar 1404 (Solar Hijri calendar) | ||
| Article Type: Review Article | ||
| Digital Object Identifier (DOI): 10.22060/ajmc.2025.24642.1446 | ||
| Authors | ||
| Behnam Yousefimehr 1; Mehdi Ghatee* 2 | ||
| 1 Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran | ||
| 2 Department of Mathematics and Computer Science, Amirkabir University of Technology (Tehran Polytechnic) | ||
| Abstract | ||
| Accurate identification of fraudulent activity has long been a focus of computational research, producing methods that range from traditional statistical tests to machine learning and deep learning models. A persistent challenge for all of these approaches is the class imbalance inherent in most real-world fraud datasets: genuine transactions vastly outnumber fraudulent ones, which often biases models toward the majority class. Hybrid frameworks that combine data resampling techniques with machine learning algorithms have emerged as a promising way to mitigate this problem, and they are particularly valuable for their potential to support accurate, real-time automated detection systems. This survey examines the efficacy and impact of such hybrid techniques in fraud detection. To evaluate their performance quantitatively, we conduct a numerical study using auto insurance fraud as a case study. Using the Car fraud datasets, we compare various detection algorithms, each coupled with different resampling methods. Our empirical results show that the performance of each fraud detection algorithm depends strongly on the resampling strategy employed, which highlights the need to select methods carefully according to the dataset's characteristics. The code for the analysis is available on GitHub (https://github.com/behnamy2010/Car-Claims-Compression). | ||
| Keywords | ||
| Imbalanced Learning; Oversampling; Undersampling; Ensemble Learning; Auto Insurance Fraud Detection | ||
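
As a rough illustration of the resampling step that the abstract pairs with a detection algorithm, the following minimal sketch implements plain random oversampling and random undersampling using only the Python standard library. It is not the authors' code (their analysis is in the repository linked in the abstract). The function names `random_oversample` and `random_undersample`, the toy feature rows, and the 95/5 class split are all hypothetical and chosen only for demonstration.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect feature rows under their class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    are as large as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of larger classes so all classes are as small
    as the smallest one (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy data (an assumption, not drawn from the Car fraud datasets):
# 95 genuine claims labelled 0 and 5 fraudulent claims labelled 1.
rows = [[i, i % 7] for i in range(100)]
labels = [0] * 95 + [1] * 5

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

print("original:    ", Counter(labels))
print("oversampled: ", Counter(over_labels))
print("undersampled:", Counter(under_labels))
```

In a hybrid pipeline of the kind surveyed here, a classifier would then be trained on the rebalanced rows. Resampling is normally applied to the training split only, so that evaluation still reflects the original class distribution.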