
Double Deep Q Network with Adaptive Prioritized Experience Replay
AUT Journal of Modeling and Simulation
Articles in Press, Accepted Manuscript, available online from 12 Mordad 1404
Article type: Research Article
DOI: 10.22060/miscj.2025.23426.5373
Authors
Mohammad Mahdi Ebadzadeh*1; Majid Adibian2
1Amirkabir University of Technology
2Amirkabir University of Technology
Abstract
In deep reinforcement learning, experience replay buffers help break the correlation of sequential data and improve the efficiency of learning from past experiences. Prioritized Experience Replay (PER) enhances this process by selecting transitions based on their temporal difference (TD) error. However, PER does not account for how often a transition has been used or its overall importance. To address this, we introduce an adaptive prioritization method that incorporates three additional transition-level factors: reward, usage count (counter), and policy probability, collectively termed RCP values. Each RCP value is normalized and combined with the TD error to determine the selection probability of transitions from the replay buffer. We test our approach on several Atari game environments and find that using any of the RCP values individually improves performance over standard PER. To leverage all three RCP components, we evaluate three aggregation strategies: taking the minimum, maximum, or mean of the RCP values. Results show that while the best aggregation method varies by environment, the mean function consistently delivers stable performance improvements, likely because it balances the influence of all three factors and prevents over-reliance on any single one. Our findings indicate that incorporating RCP values provides a straightforward and effective improvement over conventional prioritization methods in experience replay.
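The abstract does not spell out the exact normalization or combination rule, but the described scheme (normalize each RCP factor, aggregate by min/max/mean, then combine with the TD error to form sampling priorities) can be sketched roughly as follows. All function and variable names here are illustrative, and the min-max normalization, the direction in which counters and policy probabilities are inverted, and the multiplicative combination with |TD error| are assumptions, not the paper's verified formulas:

```python
import numpy as np

def rcp_priorities(td_errors, rewards, counters, policy_probs,
                   aggregate="mean", eps=1e-6):
    """Illustrative sketch of RCP-style adaptive prioritization.

    Each factor is min-max normalized to [0, 1]. Heavily reused
    transitions and transitions the current policy already assigns
    high probability are assumed to deserve *lower* priority, so
    those two factors are inverted after normalization.
    """
    def normalize(x):
        x = np.asarray(x, dtype=float)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    r = normalize(rewards)                # higher reward -> higher priority
    c = 1.0 - normalize(counters)         # rarely used   -> higher priority
    p = 1.0 - normalize(policy_probs)     # unlikely act. -> higher priority

    rcp = np.stack([r, c, p])             # shape (3, n_transitions)
    agg = {"min": rcp.min(axis=0),
           "max": rcp.max(axis=0),
           "mean": rcp.mean(axis=0)}[aggregate]

    # Combine with |TD error| as in standard PER; eps keeps every
    # priority strictly positive so no transition is starved.
    priorities = (np.abs(np.asarray(td_errors)) + eps) * (agg + eps)
    return priorities / priorities.sum()  # sampling distribution
```

Under this sketch, `aggregate="mean"` corresponds to the strategy the abstract reports as the most consistently stable across environments.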
Keywords
Deep reinforcement learning; Prioritized Experience Replay; Deep Q-Network