
Classifying AI-Generated Text in Low-Resource Languages like Arabic | ||
AUT Journal of Modeling and Simulation | ||
Articles in Press, Accepted, Available Online from 20 Mehr 1404 | ||
Article type: Research Article | ||
Digital Object Identifier (DOI): 10.22060/miscj.2025.24060.5408 | ||
Authors | ||
Abdolhossein Vahabi* ; Ohood Al Minshidawi | ||
Computer Engineering Department, College of Alborz, University of Tehran, Tehran, Iran | ||
Abstract | ||
AI-Generated Texts (AIGTs) are written content produced by artificial intelligence systems using technologies such as natural language processing and machine learning. The rise of AIGT has introduced new challenges in content authenticity, trustworthiness, and information integrity across digital platforms. In low-resource languages such as Arabic, AIGT detection is challenging because of their more complex structural features. Accurately distinguishing AI-generated from human-written text is essential to combat misinformation, preserve credibility in communication, and enhance content moderation systems. In this study, we propose a novel framework for AIGT detection on the AutoTweet Dataset, an annotated corpus of Arabic tweets. To the best of our knowledge, this is the first work to leverage Large Language Models (LLMs) for AIGT detection in Arabic, addressing a critical gap in low-resource natural language processing. We introduce a dynamic few-shot prompting technique, powered by a retrieval-based Judge Prompter module, which selects semantically and stylistically relevant support examples to enhance the contextual understanding of LLMs. We conduct a comprehensive evaluation across multiple LLMs, including Mistral-7B, LLaMA-3.1-8B, and ALLaM-7B-Instruct-preview, under zero-shot, few-shot, and fine-tuning scenarios. Our best results were achieved using Mistral-7B with QLoRA fine-tuning and dynamic few-shot prompting, reaching an accuracy of 88.69% and an F1-score of 88.35%. These findings demonstrate the feasibility of adapting LLMs for AIGT detection in Arabic and highlight the effectiveness of context-aware prompting in low-resource settings, paving the way for future progress in text classification. | ||
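The retrieval step behind dynamic few-shot prompting can be illustrated with a minimal sketch. It is not the authors' Judge Prompter: the abstract does not say how relevance is scored, so this example assumes a simple bag-of-words cosine similarity in place of a real semantic and stylistic retriever. The function names, the prompt wording, and the toy example pool are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; an assumed stand-in for a real sentence embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_support(query, pool, k=2):
    """Return the k labeled (text, label) examples most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, bag_of_words(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, support):
    """Assemble a few-shot classification prompt (hypothetical wording)."""
    lines = ["Classify each tweet as 'human' or 'ai'.", ""]
    for text, label in support:
        lines += [f"Tweet: {text}", f"Label: {label}", ""]
    lines += [f"Tweet: {query}", "Label:"]
    return "\n".join(lines)


# Toy labeled pool, invented for illustration only (not from the AutoTweet Dataset).
POOL = [
    ("the weather is lovely today in the city", "human"),
    ("as an assistant I can summarize the weather report", "ai"),
    ("football match tonight was amazing", "human"),
]

if __name__ == "__main__":
    query = "weather today in the city"
    print(build_prompt(query, select_support(query, POOL, k=2)))
```

Because the support examples are chosen per query rather than fixed in advance, each input is paired with the labeled examples closest to it, which is the idea the abstract describes as "dynamic" few-shot prompting. A real system would likely swap `bag_of_words` for a multilingual sentence encoder.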
Keywords | ||
Arabic text detection; AI-generated text; Zero-shot learning; Few-shot learning; Supervised fine-tuning | ||