Deep reinforcement learning meets graph neural networks: Exploring a routing optimization use case

Published: 01 December 2022
    Abstract

    Deep Reinforcement Learning (DRL) has shown a dramatic improvement in decision-making and automated control problems. Consequently, DRL represents a promising technique to efficiently solve many relevant optimization problems (e.g., routing) in self-driving networks. However, existing DRL-based solutions applied to networking fail to generalize, which means that they are not able to operate properly when applied to network topologies not observed during training. This lack of generalization capability significantly hinders the deployment of DRL technologies in production networks. This is because state-of-the-art DRL-based networking solutions use standard neural networks (e.g., fully connected, convolutional), which are not suited to learn from information structured as graphs.
    In this paper, we integrate Graph Neural Networks (GNN) into DRL agents and we design a problem-specific action space to enable generalization. GNNs are Deep Learning models inherently designed to generalize over graphs of different sizes and structures. This allows the proposed GNN-based DRL agent to learn and generalize over arbitrary network topologies. We test our DRL+GNN agent in a routing optimization use case in optical networks and evaluate it on 180 unseen synthetic and 232 unseen real-world network topologies. The results show that the DRL+GNN agent is able to outperform state-of-the-art solutions in topologies never seen during training.
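
    To make the abstract's approach concrete, below is a minimal, self-contained sketch (plain Python/NumPy, not the authors' code) of the core mechanism: a message-passing GNN that scores a candidate routing path with a Q-value, i.e., the state-action evaluation a DQN-style DRL agent performs for each routing action. The toy topology, hidden-state size, number of message-passing steps, mean aggregation, and parameter names (W_msg, W_upd, w_out) are all hypothetical illustration choices; the paper's actual architecture, input features, and training procedure may differ.

```python
# Illustrative sketch of a GNN-based Q-function for routing (hypothetical,
# not the paper's implementation). A candidate path is encoded in the link
# features; message passing over the topology yields a scalar Q-value.
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8  # per-link hidden-state size (hypothetical choice)
T = 4       # number of message-passing iterations (hypothetical choice)

# Toy 4-node ring topology described by its links; two links are neighbors
# when they share a node, so they exchange messages.
links = ["A-B", "B-C", "C-D", "A-D"]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

# Randomly initialized stand-ins for learned weights; in a real agent these
# would be trained with DQN updates against experience replay.
W_msg = 0.1 * rng.normal(size=(HIDDEN, HIDDEN))
W_upd = 0.1 * rng.normal(size=(2 * HIDDEN, HIDDEN))
w_out = 0.1 * rng.normal(size=HIDDEN)

def q_value(link_features):
    """Run T message-passing steps over the link states, then read out a
    scalar Q-value for the candidate path encoded in link_features."""
    h = link_features  # shape (num_links, HIDDEN)
    for _ in range(T):
        # Each link aggregates (here: averages) messages from its neighbors.
        m = np.stack([np.mean([h[j] @ W_msg for j in neighbors[i]], axis=0)
                      for i in range(len(links))])
        # Update each link's state from its own state plus the aggregate.
        h = np.tanh(np.concatenate([h, m], axis=1) @ W_upd)
    # Readout: pool the link states and project to a single scalar.
    return float(h.sum(axis=0) @ w_out)

# State-action input: link utilization plus a flag marking the links that
# the candidate routing action would use (toy numbers).
feats = np.zeros((len(links), HIDDEN))
feats[:, 0] = [0.2, 0.7, 0.1, 0.5]  # current utilization of each link
feats[[0, 1], 1] = 1.0              # candidate path A-B-C uses links 0 and 1
print(f"Q(state, path A-B-C) = {q_value(feats):.4f}")
```

    Because the message and update functions are shared across all links and depend only on which links are adjacent, the same trained parameters apply unchanged to topologies of any size or structure; this is the property that lets such an agent generalize to graphs not seen during training.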

        Published In

        Computer Communications, Volume 196, Issue C
        Dec 2022
        278 pages

        Publisher

        Elsevier Science Publishers B. V.

        Netherlands

        Author Tags

        1. Graph neural networks
        2. Deep reinforcement learning
        3. Routing
        4. Optimization

        Qualifiers

        • Research-article

        Cited By

        • (2024) Towards Generalizability of Multi-Agent Reinforcement Learning in Graphs with Recurrent Message Passing. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 1919-1927. DOI: 10.5555/3635637.3663055. Online publication date: 6-May-2024.
        • (2024) Deep Reinforcement Learning with Coalition Action Selection for Online Combinatorial Resource Allocation with Arbitrary Action Space. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 660-668. DOI: 10.5555/3635637.3662918. Online publication date: 6-May-2024.
        • (2024) Causal Question Answering with Reinforcement Learning. Proceedings of the ACM on Web Conference 2024, pp. 2204-2215. DOI: 10.1145/3589334.3645610. Online publication date: 13-May-2024.
        • (2024) A look into smart factory for Industrial IoT driven by SDN technology. Journal of King Saud University - Computer and Information Sciences 36(5). DOI: 10.1016/j.jksuci.2024.102069. Online publication date: 24-Jul-2024.
        • (2024) A Graph reinforcement learning based SDN routing path selection for optimizing long-term revenue. Future Generation Computer Systems 150(C), pp. 412-423. DOI: 10.1016/j.future.2023.09.017. Online publication date: 1-Jan-2024.
        • (2024) Distributed web hacking by adaptive consensus-based reinforcement learning. Artificial Intelligence 326(C). DOI: 10.1016/j.artint.2023.104032. Online publication date: 4-Mar-2024.
        • (2023) Networked restless bandits with positive externalities. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, pp. 11997-12004. DOI: 10.1609/aaai.v37i10.26415. Online publication date: 7-Feb-2023.
        • (2023) A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization. ACM Transactions on Knowledge Discovery from Data 17(9), pp. 1-50. DOI: 10.1145/3604559. Online publication date: 18-Jul-2023.
        • (2023) G-Routing: Graph Neural Networks-Based Flexible Online Routing. IEEE Network: The Magazine of Global Internetworking 37(4), pp. 90-96. DOI: 10.1109/MNET.012.2300052. Online publication date: 1-Jul-2023.
        • (2023) GAA-PPO. Neurocomputing 557(C). DOI: 10.1016/j.neucom.2023.126707. Online publication date: 7-Nov-2023.