Deep reinforcement learning meets graph neural networks: Exploring a routing optimization use case

Published: 01 December 2022
    Abstract

    Deep Reinforcement Learning (DRL) has shown a dramatic improvement in decision-making and automated control problems. Consequently, DRL represents a promising technique to efficiently solve many relevant optimization problems (e.g., routing) in self-driving networks. However, existing DRL-based solutions applied to networking fail to generalize, which means that they are not able to operate properly when applied to network topologies not observed during training. This lack of generalization capability significantly hinders the deployment of DRL technologies in production networks. This is because state-of-the-art DRL-based networking solutions use standard neural networks (e.g., fully connected, convolutional), which are not suited to learn from information structured as graphs.
    In this paper, we integrate Graph Neural Networks (GNN) into DRL agents and we design a problem-specific action space to enable generalization. GNNs are Deep Learning models inherently designed to generalize over graphs of different sizes and structures. This allows the proposed GNN-based DRL agent to learn and generalize over arbitrary network topologies. We test our DRL+GNN agent in a routing optimization use case in optical networks and evaluate it on 180 unseen synthetic and 232 unseen real-world network topologies. The results show that the DRL+GNN agent is able to outperform state-of-the-art solutions in topologies never seen during training.
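
    To make the abstract's approach concrete, below is a minimal, self-contained sketch (plain Python/NumPy, not the authors' code) of the core mechanism: a message-passing GNN that scores a candidate routing path with a Q-value, i.e., the state-action evaluation a DQN-style DRL agent performs for each routing action. The toy topology, hidden-state size, number of message-passing steps, mean aggregation, and parameter names (W_msg, W_upd, w_out) are all hypothetical illustration choices; the paper's actual architecture, input features, and training procedure may differ.

```python
# Illustrative sketch of a GNN-based Q-function for routing (hypothetical,
# not the paper's implementation). A candidate path is encoded in the link
# features; message passing over the topology yields a scalar Q-value.
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8  # per-link hidden-state size (hypothetical choice)
T = 4       # number of message-passing iterations (hypothetical choice)

# Toy 4-node ring topology described by its links; two links are neighbors
# when they share a node, so they exchange messages.
links = ["A-B", "B-C", "C-D", "A-D"]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

# Randomly initialized stand-ins for learned weights; in a real agent these
# would be trained with DQN updates against experience replay.
W_msg = 0.1 * rng.normal(size=(HIDDEN, HIDDEN))
W_upd = 0.1 * rng.normal(size=(2 * HIDDEN, HIDDEN))
w_out = 0.1 * rng.normal(size=HIDDEN)

def q_value(link_features):
    """Run T message-passing steps over the link states, then read out a
    scalar Q-value for the candidate path encoded in link_features."""
    h = link_features  # shape (num_links, HIDDEN)
    for _ in range(T):
        # Each link aggregates (here: averages) messages from its neighbors.
        m = np.stack([np.mean([h[j] @ W_msg for j in neighbors[i]], axis=0)
                      for i in range(len(links))])
        # Update each link's state from its own state plus the aggregate.
        h = np.tanh(np.concatenate([h, m], axis=1) @ W_upd)
    # Readout: pool the link states and project to a single scalar.
    return float(h.sum(axis=0) @ w_out)

# State-action input: link utilization plus a flag marking the links that
# the candidate routing action would use (toy numbers).
feats = np.zeros((len(links), HIDDEN))
feats[:, 0] = [0.2, 0.7, 0.1, 0.5]  # current utilization of each link
feats[[0, 1], 1] = 1.0              # candidate path A-B-C uses links 0 and 1
print(f"Q(state, path A-B-C) = {q_value(feats):.4f}")
```

    Because the message and update functions are shared across all links and depend only on which links are adjacent, the same trained parameters apply unchanged to topologies of any size or structure; this is the property that lets such an agent generalize to graphs not seen during training.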

        Published In

        Computer Communications, Volume 196, Issue C
        Dec 2022
        278 pages

        Publisher

        Elsevier Science Publishers B. V.

        Netherlands

        Author Tags

        1. Graph neural networks
        2. Deep reinforcement learning
        3. Routing
        4. Optimization

        Qualifiers

        • Research-article

        Cited By

        • (2024) Towards Generalizability of Multi-Agent Reinforcement Learning in Graphs with Recurrent Message Passing. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 1919-1927. DOI: 10.5555/3635637.3663055. Online publication date: 6-May-2024.
        • (2024) Deep Reinforcement Learning with Coalition Action Selection for Online Combinatorial Resource Allocation with Arbitrary Action Space. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 660-668. DOI: 10.5555/3635637.3662918. Online publication date: 6-May-2024.
        • (2024) Causal Question Answering with Reinforcement Learning. Proceedings of the ACM on Web Conference 2024, pp. 2204-2215. DOI: 10.1145/3589334.3645610. Online publication date: 13-May-2024.
        • (2024) A look into smart factory for Industrial IoT driven by SDN technology. Journal of King Saud University - Computer and Information Sciences 36(5). DOI: 10.1016/j.jksuci.2024.102069. Online publication date: 24-Jul-2024.
        • (2024) A Graph reinforcement learning based SDN routing path selection for optimizing long-term revenue. Future Generation Computer Systems 150(C), pp. 412-423. DOI: 10.1016/j.future.2023.09.017. Online publication date: 1-Jan-2024.
        • (2024) Distributed web hacking by adaptive consensus-based reinforcement learning. Artificial Intelligence 326(C). DOI: 10.1016/j.artint.2023.104032. Online publication date: 4-Mar-2024.
        • (2023) Networked restless bandits with positive externalities. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, pp. 11997-12004. DOI: 10.1609/aaai.v37i10.26415. Online publication date: 7-Feb-2023.
        • (2023) A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization. ACM Transactions on Knowledge Discovery from Data 17(9), pp. 1-50. DOI: 10.1145/3604559. Online publication date: 18-Jul-2023.
        • (2023) G-Routing: Graph Neural Networks-Based Flexible Online Routing. IEEE Network: The Magazine of Global Internetworking 37(4), pp. 90-96. DOI: 10.1109/MNET.012.2300052. Online publication date: 1-Jul-2023.
        • (2023) GAA-PPO. Neurocomputing 557(C). DOI: 10.1016/j.neucom.2023.126707. Online publication date: 7-Nov-2023.