- K. H. Ang, G. Chong, Y. Li, PID control system analysis, design, and technology, IEEE Transactions on Control Systems Technology, 13(4) (2005) 559–576.
- T. Samad, A. M. Annaswamy, The impact of control technology: Overview, success stories, and research challenges, IEEE Control Systems Magazine, 40(6) (2020) 36–70.
- K. Ogata, Modern Control Engineering, 5th ed., Prentice Hall, 2010.
- A. Kuhnle, J. P. Kaiser, F. Theiß, N. Stricker, G. Lanza, Designing an adaptive production control system using reinforcement learning, Journal of Intelligent Manufacturing, 32 (2021) 855–876.
- Panda, P. K. Patra, Nonlinear control and estimation of industrial pneumatic actuator systems: A survey, ISA Transactions, 109 (2021) 177–193.
- Su, H. Liu, Nonlinear control of robotic manipulators using adaptive neural network approaches: A review, IEEE/CAA Journal of Automatica Sinica, 8(4) (2021) 678–694.
- Liu, Y. Jiang, Q. Zhang, Intelligent process monitoring and control of machining systems using data-driven techniques: A review, Journal of Manufacturing Systems, 56 (2020) 188–206.
- E. F. Camacho, C. Bordons, Model Predictive Control, Springer, London, 2013.
- K. J. Åström, R. M. Murray, Feedback Systems: An Introduction for Scientists and Engineers, Princeton University Press, Princeton, New Jersey, 2010.
- D. Lee, S. Koo, I. Jang, J. Kim, Comparison of deep reinforcement learning and PID controllers for automatic cold shutdown operation, Energies, 15(8) (2022) 2834.
- Z. Wang, T. Hong, Reinforcement learning for building controls: The opportunities and challenges, Applied Energy, 269 (2020) 115036.
- X. Tao, D. Zhang, W. Ma, X. Liu, D. Xu, Automatic metallic surface defect detection and recognition with convolutional neural networks, Applied Sciences, 8(9) (2018) 1575.
- P. J. Antsaklis, A. Rahnama, Control and machine intelligence for system autonomy, Journal of Intelligent & Robotic Systems, 91 (2018) 23–34.
- J. F. Arinez, Q. Chang, R. X. Gao, C. Xu, J. Zhang, Artificial intelligence in advanced manufacturing: Current status and future outlook, Journal of Manufacturing Science and Engineering, 142(11) (2020) 110804.
- Hussain, H. A. Gabbar, M. R. Khan, Digital twin-based smart monitoring and control of petrochemical processes using AI and IoT, IEEE Access, 9 (2021) 141128–141145.
- F. M. Bianchi, L. Livi, C. Alippi, Predictive maintenance for the industrial Internet of Things: A deep learning approach with attention-based RNNs, IEEE Transactions on Industrial Informatics, 17(9) (2021) 6204–6212.
- G. A. Rummery, M. Niranjan, On-Line Q-Learning Using Connectionist Systems, Technical Report CUED/F-INFENG/TR 166, Department of Engineering, University of Cambridge, Cambridge, 1994.
- R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed., MIT Press, Cambridge, 2018.
- Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature, 521 (2015) 436–444.
- Lyu, Y. Tian, R. Zhao, S. Yin, Deep reinforcement learning for wind turbine control: Challenges and opportunities, Renewable and Sustainable Energy Reviews, 144 (2021) 110948.
- Yu, Z. Zhou, Y. Liu, C. Li, Y. Liu, Reinforcement learning in industrial applications: Recent advances and prospects, Engineering Applications of Artificial Intelligence, 105 (2021) 104398.
- J. R. Vázquez-Canteli, Z. Nagy, Reinforcement learning for demand response: A review of algorithms and modeling techniques, Applied Energy, 276 (2020) 115446.
- B. Kiumarsi, H. Modares, F. L. Lewis, A. Karimpour, A. Davoudi, Optimal and autonomous control using reinforcement learning: A survey, IEEE Transactions on Neural Networks and Learning Systems, 29(6) (2018) 2042–2062.
- C. Szepesvári, Algorithms for Reinforcement Learning, Morgan & Claypool Publishers, San Rafael, 2010.
- B. R. Kiran, I. Sobh, V. Talpaert, P. Mannion, A. A. Sallab, S. Yogamani, Deep reinforcement learning for autonomous driving: A survey, IEEE Transactions on Intelligent Transportation Systems, 23(6) (2021) 4909–4926.
- D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al., Mastering the game of Go with deep neural networks and tree search, Nature, 529 (2016) 484–489.
- M. Riedmiller, Neural fitted Q-iteration – First experiences with a data efficient neural reinforcement learning method, in: Proceedings of the 16th European Conference on Machine Learning (ECML), Porto, Portugal, 2005, pp. 317–328.
- Y. Lin, Y. Liu, F. Lin, L. Zou, P. Wu, W. Zeng, H. Chen, C. Miao, A survey on reinforcement learning for recommender systems, IEEE Transactions on Neural Networks and Learning Systems, 35(10) (2024) 13164–13184.
- Y. J. Park, S. K. S. Fan, C. Y. Hsu, A review on fault detection and process diagnostics in industrial processes, Processes, 8(9) (2020) 1123.
- R. A. Gupta, M. Y. Chow, Networked control system: Overview and research trends, IEEE Transactions on Industrial Electronics, 57(7) (2010) 2527–2535.
- J. García, F. Fernández, A comprehensive survey on safe reinforcement learning, Journal of Machine Learning Research, 16 (2015) 1437–1480.
- T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, D. Wierstra, Continuous control with deep reinforcement learning, in: International Conference on Learning Representations (ICLR 2016), San Juan, 2016.
- R. Bellman, A Markovian decision process, Journal of Mathematics and Mechanics, 6(5) (1957) 679–684.
- M. Morales, Grokking Deep Reinforcement Learning, Manning Publications, Shelter Island, 2020.
- R. S. Sutton, Integrated architectures for learning, planning, and reacting based on approximating dynamic programming, in: Proceedings of the Seventh International Conference on Machine Learning, Austin, Texas, 1990, pp. 216–224.
- R. S. Sutton, Dyna, an integrated architecture for learning, planning, and reacting, ACM SIGART Bulletin, 2(4) (1991) 160–163.
- D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. Lillicrap, K. Simonyan, D. Hassabis, A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, Science, 362(6419) (2018) 1140–1144.
- C. J. C. H. Watkins, Learning from Delayed Rewards, PhD thesis, University of Cambridge, England, 1989.
- H. van Hasselt, Double Q-learning, Advances in Neural Information Processing Systems, 23 (2010) 2613–2621.
- L. J. Lin, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine Learning, 8(3–4) (1992) 293–321.
- J. O’Neill, B. Pleydell-Bouverie, D. Dupret, J. Csicsvari, Play it again: Reactivation of waking experience and memory, Trends in Neurosciences, 33(5) (2010) 220–229.
- Zhang, R. Li, Q-value-based experience replay in reinforcement learning, Knowledge-Based Systems, 315 (2025) 113296.
- V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., Human-level control through deep reinforcement learning, Nature, 518(7540) (2015) 529–533.
- V. R. Konda, J. N. Tsitsiklis, On actor-critic algorithms, SIAM Journal on Control and Optimization, 42(4) (2003) 1143–1166.
- S. Bhatnagar, R. S. Sutton, M. Ghavamzadeh, M. Lee, Natural actor-critic algorithms, Automatica, 45(11) (2009) 2471–2482.
- D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, M. Riedmiller, Deterministic policy gradient algorithms, in: Proceedings of the 31st International Conference on Machine Learning, PMLR 32(1), 2014, pp. 387–395.
- P. Henderson, R. Islam, P. Bachman, J. Pineau, D. Precup, D. Meger, Deep reinforcement learning that matters, in: AAAI Conference on Artificial Intelligence, 2018, pp. 3207–3214.
- A. Zhang, N. Ballas, J. Pineau, A dissection of overfitting and generalization in continuous reinforcement learning, arXiv preprint, arXiv:1806.07937 (2018).
- S. Fujimoto, H. van Hoof, D. Meger, Addressing function approximation error in actor-critic methods, in: Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, PMLR 80, 2018, pp. 1587–1596.
- T. Haarnoja, A. Zhou, P. Abbeel, S. Levine, Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor, in: Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, PMLR 80, 2018, pp. 1861–1870.
- Y. Gu, Y. Cheng, C. L. P. Chen, X. Wang, Proximal policy optimization with policy feedback, IEEE Transactions on Systems, Man, and Cybernetics: Systems, 52(7) (2022) 4600–4610.
- M. Doostmohammadian, M. I. Qureshi, M. H. Khalesi, H. R. Rabiee, U. A. Khan, Log-scale quantization in distributed first-order methods: Gradient-based learning from distributed data, IEEE Transactions on Automation Science and Engineering, 22 (2025) 10948–10959.
- G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, W. Zaremba, OpenAI Gym, arXiv preprint, arXiv:1606.01540 (2016).
- Lei, Q. Zhu, R. Li, Cascaded robust fixed-time terminal sliding mode control for uncertain cartpole systems with incremental nonlinear dynamic inversion, International Journal of Non-Linear Mechanics, 167 (2024) 104900.
- R. V. Florian, Correct Equations for the Dynamics of the Cart-Pole System, Center for Cognitive and Neural Studies (Coneural), Romania, 2007.