
Dynamic feature-based deep reinforcement learning for flow control of circular cylinder with sparse surface pressure sensing

Published online by Cambridge University Press:  28 May 2024

Qiulei Wang
Affiliation:
School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China
Lei Yan*
Affiliation:
School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China
Gang Hu*
Affiliation:
School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China; Guangdong Provincial Key Laboratory of Intelligent and Resilient Structures for Civil Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China; Guangdong-Hong Kong-Macao Joint Laboratory for Data-Driven Fluid Mechanics and Engineering Applications, Harbin Institute of Technology, Shenzhen 518055, PR China
Wenli Chen
Affiliation:
Guangdong-Hong Kong-Macao Joint Laboratory for Data-Driven Fluid Mechanics and Engineering Applications, Harbin Institute of Technology, Shenzhen 518055, PR China; Key Laboratory of Smart Prevention and Mitigation of Civil Engineering Disasters, the Ministry of Industry and Information Technology, Harbin Institute of Technology, Harbin 150090, PR China
Jean Rabault
Affiliation:
Independent Researcher, Oslo 0376, Norway
Bernd R. Noack
Affiliation:
School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen 518055, PR China
* Email addresses for correspondence: 180410212@stu.hit.edu.cn, hugang@hit.edu.cn

Abstract

This study proposes a self-learning algorithm for closed-loop cylinder wake control targeting lower drag and lower lift fluctuations, with the additional challenge of sparse sensor information, taking deep reinforcement learning (DRL) as the starting point. The DRL performance is significantly improved by lifting the sensor signals to dynamic features (DFs), which predict future flow states. The resulting DF-based DRL (DF-DRL) automatically learns a feedback control in the plant without a dynamic model. Results show that the drag coefficient of the DF-DRL model is 25 % lower than that of the vanilla model based on direct sensor feedback. More importantly, using only one surface pressure sensor, DF-DRL achieves a state-of-the-art drag coefficient reduction of approximately 8 % at Reynolds number $Re = 100$ and significantly mitigates lift coefficient fluctuations. Hence, DF-DRL allows sparse sensing of the flow to be deployed without degrading the control performance. The method also exhibits strong robustness in more complex flow scenarios, reducing the drag coefficient by 32.2 % and 46.55 % at $Re = 500$ and 1000, respectively. Additionally, the drag coefficient decreases by 28.6 % in a three-dimensional turbulent flow at $Re = 10\,000$. Since surface pressure is more straightforward to measure in realistic scenarios than flow velocity, this study provides a valuable reference for experimentally designing active flow control of a circular cylinder based on wall pressure signals, an essential step toward further developing intelligent control in realistic multi-input multi-output systems.
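The core idea described above, lifting sparse surface pressure signals to dynamic features by stacking past measurements (a time-delay embedding), can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: the class name, the number of delays and the sensor count are illustrative assumptions, and the delay-embedded vector simply stands in for the DF state that would be fed to the DRL policy.

    import numpy as np
    from collections import deque

    class DynamicFeatureState:
        """Lift sparse pressure readings to a dynamic-feature state via
        time-delay embedding (stacking of past measurements)."""

        def __init__(self, n_sensors: int, n_delays: int):
            # Pre-fill the history with zeros so the state vector has a
            # fixed size from the very first observation onwards.
            self.history = deque([np.zeros(n_sensors)] * n_delays,
                                 maxlen=n_delays)

        def update(self, pressure: np.ndarray) -> np.ndarray:
            """Append the newest surface-pressure sample and return the
            concatenated (delay-embedded) state for the DRL agent."""
            self.history.append(np.asarray(pressure, dtype=float))
            return np.concatenate(self.history)

    # Usage: one surface pressure sensor, 30 past samples as dynamic features.
    state_builder = DynamicFeatureState(n_sensors=1, n_delays=30)
    p_now = np.array([0.12])            # placeholder pressure reading
    obs = state_builder.update(p_now)   # shape (30,), input to the policy network

In this sketch the augmented observation carries the recent pressure history rather than a single instantaneous reading, which is the sense in which the dynamic features encode information about future flow states.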

Type
JFM Papers
Copyright
© The Author(s), 2024. Published by Cambridge University Press
