research-article

Transparently Mixing Undo Logs and Software Reversibility for State Recovery in Optimistic PDES

Authors:

Davide Cingolani,

Alessandro Pellegrini,

Francesco QuagliaAuthors Info & Claims

ACM Transactions on Modeling and Computer Simulation (TOMACS), Volume 27, Issue 2

Article No.: 11, Pages 1 - 26

https://doi.org/10.1145/3077583

Published: 27 May 2017 Publication History

Abstract

The Time Warp synchronization protocol for Parallel Discrete Event Simulation (PDES) is universally considered a viable solution to exploit the intrinsic simulation model parallelism and to provide model execution speedup. Yet it leads the PDES system to execute events in an order that may generate causal inconsistencies that need to be recovered via rollback, which requires restoration of a previous (consistent) simulation state whenever a causality violation is detected. The rollback operation is so critical for the performance of a Time Warp system that it has been extensively studied in the literature for decades to find approaches suitable to optimize it. The proposed solutions can be roughly classified as based on either checkpointing or reverse computing. In this article, we explore the practical design and implementation of a fully new approach based on the runtime generation of so-called undo code blocks, which are blocks of instructions implementing the reverse memory side effects generated by the forward execution of the events. However, this is not done by recomputing the original values to be restored, as instead it occurs in reverse computing schemes. Hence, the philosophy undo code blocks rely on is similar in spirit to that of undo-logs (as a form of checkpointing). Nevertheless, they are not data logs (as instead checkpoints are); rather, they are logs of instructions. Our proposal is fully transparent, thanks to the reliance on static software instrumentation (targeting the x86 architecture and Linux systems). Also, as we show, it can be combined with classical checkpointing to further improve the runtime behavior of the state recoverability support as a function of the workload. We also present experimental results related to our implementation, which is released as free software and fully integrated into the open source ROOT-Sim package. Experimental data support the viability and effectiveness of our proposal.

Supplementary Material

a11-cingolani-apndx.pdf (cingolani.zip)

Supplemental movie, appendix, image and software files for, Transparently Mixing Undo Logs and Software Reversibility for State Recovery in Optimistic PDES

Download
358.14 KB

References

[1]

GDB: The GNU Project Debugger. 2017. Retrieved from http://www.gnu.org/software/gdb/.

[2]

V. Bala, E. Duesterwald, and S. Banerjia. 2000. Dynamo: A transparent dynamic optimization system. SIGPLAN Not. 35, 5 (2000), 1--12.

Digital Library

[3]

S. Bellenot. 1992. State skipping performance with the Time Warp operating system. In Proceedings of the 6th Workshop on Parallel and Distributed Simulation (PADS’92). 53--64.

[4]

B. Biswas and R. Mall. 1999. Reverse execution of programs. SIGPLAN Not. 34, 4 (1999), 61--69.

Digital Library

[5]

J. Bonwick. 1994. The slab allocator: An object-caching kernel memory allocator. In Proceedings of the USENIX Summer 1994 Technical Conference. 6.

Digital Library

[6]

C. D. Carothers, K. S. Perumalla, and R. M. Fujimoto. 1999. Efficient optimistic parallel simulations using reverse computation. ACM Trans. Model. Comput. Simul. 9, 3 (1999), 224--253.

Digital Library

[7]

V. Cortellessa and F. Quaglia. 2001. A checkpointing-recovery scheme for time warp parallel simulation. Parallel Comput. 27, 9 (2001), 1227--1252.

[8]

S. R. Das, R. M. Fujimoto, K. Panesar, D. Allison, and M. Hybinette. 1994. GTW: a Time Warp system for shared memory multiprocessors. In Proceedings of the 26th conference on Winter simulation. Society for Computer Simulation International, 1332--1339.

[9]

J. Fleischmann and P. A. Wilsey. 1995. Comparative Analysis of Periodic State Saving Techniques in Time Warp Simulators. In Proceedings of the 9th Workshop on Parallel and Distributed Simulation. IEEE Computer Society, 50--58.

[10]

M. P. Frank. 1999. Reversibility for Efficient Computing. Ph.D. thesis, Massachusetts Institute of Technology.

[11]

S. Franks, F. Gomes, B. Unger, and J. Cleary. 1997. State saving for interactive optimistic simulation. In Proceedings of the 11th workshop on Parallel and Distributed Simulation. IEEE Computer Society, 72--79.

[12]

R. M. Fujimoto. 1990. Performance of Time Warp Under Synthetic Workloads. In Proceedings of the Multiconf. on Distributed Simulation. Society for Computer Simulation, 23--28.

[13]

R. M. Fujimoto, J. J. Tsai, and G. Gopalakrishnan 1992. Design and evaluation of the rollback chip: Special purpose hardware for Time Warp. IEEE Trans. Comput. 41, 1 (1992), 68--82.

Digital Library

[14]

C. Hou, G. Vulov, D. Quinlan, D. Jefferson, R. Fujimoto, and R. Vuduc. 2012a. A New Method for Program Inversion. Springer, Berlin, 81--100.

[15]

C. Hou, G. Vulov, D. J. Quinlan, D. Jefferson, R. Fujimoto, and R. W. Vuduc. 2012b. A new method for program inversion. In Proceeding of the 21st International Conference on Compiler Construction. 81--100.

Digital Library

[16]

D. R. Jefferson. 1985. Virtual Time. ACM Trans. Program. Lang. Syst. 7, 3 (1985), 404--425.

Digital Library

[17]

S. Kandukuri and S. Boyd. 2002. Optimal Power Control in Interference-Limited Fading Wireless Channels with Outage-Probability Specifications. IEEE Trans. Wireless Commun. 1, 1 (2002), 46--55.

Digital Library

[18]

J. M. LaPre, E. J. Gonsiorowski, and C. D. Carothers. 2014. LORAIN: A Step Closer to the PDES ’Holy Grail’. In Proceedings of the 2nd ACM SIGSIM/PADS Conference on Principles of Advanced Discrete Simulation (PADS’14). ACM Press, 3--14.

Digital Library

[19]

D. Lea. 1996. A Memory Allocator. Retrieved from http://g.oswego.edu/dl/html/malloc.html.

[20]

Y.-B. Lin and E. D. Lazowska. 1990. Reducing the Saving Overhead for Time Warp Parallel Simulation. University of Washington Department of Computer Science and Engineering. Technical Report 90-02-03.

[21]

A. C. Palaniswamy and P. A. Wilsey. 1993. An analytical comparison of periodic checkpointing and incremental state saving. In Proceedings of the 7th Workshop on Parallel and distributed simulation. ACM, 127--134.

Digital Library

[22]

A. Pellegrini. 2013. Hijacker: Efficient static software instrumentation with applications in high performance computing. In Proceedings of the 2013 International Conference on High Performance Computing and Simulation (HPCS’13). 650--655.

[23]

A. Pellegrini. 2015. Parallelization of Discrete Event Simulation Models. Sapienza Università Editrice, Rome, Italy.

[24]

A. Pellegrini and F. Quaglia. 2013. The ROme OpTimistic Simulator: A Tutorial. In Proceedings of the 1st Workshop on Parallel and Distributed Agent-Based Simulations (PADABS’13). LNCS. Springer-Verlag.

[25]

A. Pellegrini, R. Vitali, and F. Quaglia. 2009. Di-DyMeLoR: Logging only dirty chunks for efficient management of dynamic memory based optimistic simulation objects. In Proceedings - Workshop on Principles of Advanced and Distributed Simulation (PADS’09). IEEE, 45--53.

[26]

A. Pellegrini, R. Vitali, F. Quaglia, A. Pellegrini, and F. Quaglia. 2015. Autonomic State Management for Optimistic Simulation Platforms. IEEE Trans. Parallel Distrib. Syst. 26, 6 (2015), 1560--1569.

[27]

J. L. Peterson and T. A. Norman. 1977. Buddy systems. Commun. ACM 20, 6 (1977), 421--431.

Digital Library

[28]

B. R. Preiss, W. M. Loucks, and D. MacIntyre. 1994. Effects of the Checkpoint Interval on Time and Space in Time Warp. ACM Trans. Model. Comput. Simul. 4, 3 (1994), 223--253.

Digital Library

[29]

F. Qin, C. Wang, Z. Li, H.-S. Kim, Y. Zhou, and Y. Wu. 2006. LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks. In 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’06). 135--148.

Digital Library

[30]

F. Quaglia. 2001. A Cost Model for Selecting Checkpoint Positions in Time Warp Parallel Simulation. IEEE Trans. Parallel Distrib. Syst. 12, 4 (2001), 346--362.

Digital Library

[31]

F. Quaglia and A. Santoro. 2003. Non-Blocking Checkpointing for Optimistic Parallel Simulation: Description and an Implementation. IEEE Trans. Parallel Distrib. Syst. 14, 6 (2003), 593--610.

Digital Library

[32]

S. Robinson. 2004. Simulation: The Practice of Model Development and Use. John Wiley 8 Sons.

[33]

R. Rönngren and R. Ayani. 1994. Adaptive Checkpointing in Time Warp. In Proceedings of the 8th Workshop on Parallel and Distributed Simulation. Society for Computer Simulation, 110--117.

Digital Library

[34]

R. Rönngren, M. Liljenstam, R. Ayani, and J. Montagnat. 1996. Transparent Incremental State Saving in Time Warp Parallel Discrete Event Simulation. In Proceedings of the 10th Workshop on Parallel and Distributed Simulation. IEEE Computer Society, 70--77.

[35]

M. Schordan, D. Jefferson, P. Barnes, T. Oppelstrup, and D. Quinlan. 2015. Reverse Code Generation for Parallel Discrete Event Simulation. Springer International Publishing, Cham, 95--110.

[36]

S. K. Seal and K. S. Perumalla. 2011. Reversible parallel discrete event formulation of a tlm-based radio signal propagation model. ACM Trans. Model. Comput. Simul. 22, 1 (2011), 4.

Digital Library

[37]

S. Skold and R. Rönngren. 1996. Event Sensitive State Saving in Time Warp Parallel Discrete Event Simulation. In Proceedings of the 1996 Winter Simulation Conference. Society for Computer Simulation, 653--660.

Digital Library

[38]

H. M. Soliman and A. S. Elmaghraby. 1998. An Analytical Model for Hybrid Checkpointing in Time Warp Distributed Simulation. IEEE Trans. Parallel Distrib. Syst. 9, 10 (1998), 947--951.

Digital Library

[39]

R. Sosič. 1994. History cache: Hardware support for reverse execution. SIGARCH Comput. Arch. News 22, 5 (1994), 11--18.

Digital Library

[40]

J. S. Steinman. 1992. SPEEDES: A Multiple-Synchronization Environment for Parallel Discrete Event Simulation. Int. J. Comput. 251--286.

[41]

J. S. Steinman. 1993. Incremental State Saving in SPEEDES Using C Plus Plus. In Proceedings of the Winter Simulation Conference. Society for Computer Simulation, 687--696.

Digital Library

[42]

R. Toccaceli and F. Quaglia. 2008. DyMeLoR: Dynamic Memory Logger and Restorer Library for Optimistic Simulation Objects with Generic Memory Layout. In Proceedings of the 22nd Workshop on Principles of Advanced and Distributed Simulation. IEEE Computer Society, 163--172.

Digital Library

[43]

R. Wahbe, S. Lucco, and S. L. Graham. 1993. Practical Data Breakpoints: Design and Implementation. In Proceedings of the 1993 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’93). 1--12.

Digital Library

[44]

D. West and K. Panesar. 1996. Automatic Incremental State Saving. In Proceedings of the 10th Workshop on Parallel and Distributed Simulation. IEEE Computer Society, 78--85.

[45]

Q. Zhao, R. Rabbah, S. Amarasinghe, L. Rudolph, and W. F. Wong. 2008. How to do a million watchpoints: Efficient Debugging using dynamic instrumentation. Lecture Notes in Computer Science, Vol. 4959. 147--162.

Cited By

Piccione AAndelfinger PPellegrini A(2023)Hybrid Speculative Synchronisation for Parallel Discrete Event SimulationProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591124(84-95)Online publication date: 21-Jun-2023
https://dl.acm.org/doi/10.1145/3573900.3591124
Marotta RMontesano FQuaglia F(2023)Effective Access to the Committed Global State in Speculative Parallel Discrete Event Simulation on Multi-core MachinesProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591117(107-117)Online publication date: 21-Jun-2023
https://dl.acm.org/doi/10.1145/3573900.3591117
Jefferson DBarnes P(2023)Virtual Time III, Part 1: Unified Virtual Time Synchronization for Parallel Discrete Event SimulationACM Transactions on Modeling and Computer Simulation10.1145/350524832:4(1-29)Online publication date: 11-Jan-2023
https://dl.acm.org/doi/10.1145/3505248
Show More Cited By

Index Terms

Transparently Mixing Undo Logs and Software Reversibility for State Recovery in Optimistic PDES
1. Computing methodologies
  1. Modeling and simulation
    1. Simulation types and techniques
      1. Discrete-event simulation
      2. Massively parallel and high-performance simulations
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Source code generation

Recommendations

Transparently Mixing Undo Logs and Software Reversibility for State Recovery in Optimistic PDES
SIGSIM PADS '15: Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation

The rollback operation is a fundamental building block to support the correct execution of a speculative Time Warp-based Parallel Discrete Event Simulation. In the literature, several solutions to reduce the execution cost of this operation have been ...
Lightweight Time Warp- A Novel Protocol for Parallel Optimistic Simulation of Large-Scale DEVS and Cell-DEVS Models
DS-RT '08: Proceedings of the 2008 12th IEEE/ACM International Symposium on Distributed Simulation and Real-Time Applications

This paper proposes a novel Lightweight Time Warp (LTW) protocol for high-performance parallel optimistic simulation of large-scale DEVS and Cell-DEVS models. By exploiting the characteristics of the simulation process, the protocol is able to set free ...
Time Warp on the GPU: Design and Assessment
SIGSIM-PADS '17: Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation

The parallel execution of discrete-event simulations on commodity GPUs has been shown to achieve high event rates. Most previous proposals have focused on conservative synchronization, which typically extracts only limited parallelism in cases of low ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Modeling and Computer Simulation

ACM Transactions on Modeling and Computer Simulation Volume 27, Issue 2

Special Issue on PADS 2015

April 2017

203 pages

ISSN:1049-3301

EISSN:1558-1195

DOI:10.1145/3015562

Editor:
Adelinde M. Uhrmacher
University of Rostock, Germany

Issue’s Table of Contents

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 May 2017

Accepted: 01 March 2017

Revised: 01 September 2016

Received: 01 November 2015

Published in TOMACS Volume 27, Issue 2

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

25
Total Citations
View Citations
148
Total Downloads

Downloads (Last 12 months)7
Downloads (Last 6 weeks)1

Reflects downloads up to 04 Sep 2024

Other Metrics

View Author Metrics

Citations

Cited By

Piccione AAndelfinger PPellegrini A(2023)Hybrid Speculative Synchronisation for Parallel Discrete Event SimulationProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591124(84-95)Online publication date: 21-Jun-2023
https://dl.acm.org/doi/10.1145/3573900.3591124
Marotta RMontesano FQuaglia F(2023)Effective Access to the Committed Global State in Speculative Parallel Discrete Event Simulation on Multi-core MachinesProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591117(107-117)Online publication date: 21-Jun-2023
https://dl.acm.org/doi/10.1145/3573900.3591117
Jefferson DBarnes P(2023)Virtual Time III, Part 1: Unified Virtual Time Synchronization for Parallel Discrete Event SimulationACM Transactions on Modeling and Computer Simulation10.1145/350524832:4(1-29)Online publication date: 11-Jan-2023
https://dl.acm.org/doi/10.1145/3505248
Marotta RMontesano FPellegrini AQuaglia F(2023)Incremental Checkpointing of Large State Simulation Models with Write-Intensive Events via Memory Update Correlation on Buddy Pages2023 IEEE/ACM 27th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)10.1109/DS-RT58998.2023.00014(40-47)Online publication date: 4-Oct-2023
https://doi.org/10.1109/DS-RT58998.2023.00014
Pimpini APiccione ACiciani BPellegrini A(2022)Speculative Distributed Simulation of Very Large Spiking Neural NetworksProceedings of the 2022 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3518997.3531027(93-104)Online publication date: 8-Jun-2022
https://dl.acm.org/doi/10.1145/3518997.3531027
Montesano FMarotta RQuaglia F(2022)Spatial/Temporal Locality-based Load-sharing in Speculative Discrete Event Simulation on Multi-core MachinesProceedings of the 2022 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3518997.3531026(81-92)Online publication date: 8-Jun-2022
https://dl.acm.org/doi/10.1145/3518997.3531026
Hoey JUlidowski I(2022)Reversing an imperative concurrent programming languageScience of Computer Programming10.1016/j.scico.2022.102873223(102873)Online publication date: Nov-2022
https://doi.org/10.1016/j.scico.2022.102873
Schordan MOppelstrup TJefferson DBarnes P(2022)Generation of Reversible C++ Code for Optimistic Parallel Discrete Event SimulationNew Generation Computing10.1007/s00354-018-0038-236:3(257-280)Online publication date: 11-Mar-2022
https://dl.acm.org/doi/10.1007/s00354-018-0038-2
Xie JZhou ZLuo TPang HMeng XZhou F(2021)Study on Design and Additive Manufacturing of Customized Bionic Sports Sole for the ElderlyIEEE Access10.1109/ACCESS.2021.30781629(69830-69838)Online publication date: 2021
https://doi.org/10.1109/ACCESS.2021.3078162
SAWA THE FKAWABATA AOKI E(2020)Algorithms for Distributed Server Allocation ProblemIEICE Transactions on Communications10.1587/transcom.2020EBP3006E103.B:11(1341-1352)Online publication date: 1-Nov-2020
https://doi.org/10.1587/transcom.2020EBP3006
Show More Cited By

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents