skip to main content
research-article

Transparently Mixing Undo Logs and Software Reversibility for State Recovery in Optimistic PDES

Published: 27 May 2017 Publication History

Abstract

The Time Warp synchronization protocol for Parallel Discrete Event Simulation (PDES) is universally considered a viable solution to exploit the intrinsic simulation model parallelism and to provide model execution speedup. Yet it leads the PDES system to execute events in an order that may generate causal inconsistencies that need to be recovered via rollback, which requires restoration of a previous (consistent) simulation state whenever a causality violation is detected. The rollback operation is so critical for the performance of a Time Warp system that it has been extensively studied in the literature for decades to find approaches suitable to optimize it. The proposed solutions can be roughly classified as based on either checkpointing or reverse computing. In this article, we explore the practical design and implementation of a fully new approach based on the runtime generation of so-called undo code blocks, which are blocks of instructions implementing the reverse memory side effects generated by the forward execution of the events. However, this is not done by recomputing the original values to be restored, as instead it occurs in reverse computing schemes. Hence, the philosophy undo code blocks rely on is similar in spirit to that of undo-logs (as a form of checkpointing). Nevertheless, they are not data logs (as instead checkpoints are); rather, they are logs of instructions. Our proposal is fully transparent, thanks to the reliance on static software instrumentation (targeting the x86 architecture and Linux systems). Also, as we show, it can be combined with classical checkpointing to further improve the runtime behavior of the state recoverability support as a function of the workload. We also present experimental results related to our implementation, which is released as free software and fully integrated into the open source ROOT-Sim package. Experimental data support the viability and effectiveness of our proposal.

Supplementary Material

a11-cingolani-apndx.pdf (cingolani.zip)
Supplemental movie, appendix, image and software files for, Transparently Mixing Undo Logs and Software Reversibility for State Recovery in Optimistic PDES

References

[1]
GDB: The GNU Project Debugger. 2017. Retrieved from http://www.gnu.org/software/gdb/.
[2]
V. Bala, E. Duesterwald, and S. Banerjia. 2000. Dynamo: A transparent dynamic optimization system. SIGPLAN Not. 35, 5 (2000), 1--12.
[3]
S. Bellenot. 1992. State skipping performance with the Time Warp operating system. In Proceedings of the 6th Workshop on Parallel and Distributed Simulation (PADS’92). 53--64.
[4]
B. Biswas and R. Mall. 1999. Reverse execution of programs. SIGPLAN Not. 34, 4 (1999), 61--69.
[5]
J. Bonwick. 1994. The slab allocator: An object-caching kernel memory allocator. In Proceedings of the USENIX Summer 1994 Technical Conference. 6.
[6]
C. D. Carothers, K. S. Perumalla, and R. M. Fujimoto. 1999. Efficient optimistic parallel simulations using reverse computation. ACM Trans. Model. Comput. Simul. 9, 3 (1999), 224--253.
[7]
V. Cortellessa and F. Quaglia. 2001. A checkpointing-recovery scheme for time warp parallel simulation. Parallel Comput. 27, 9 (2001), 1227--1252.
[8]
S. R. Das, R. M. Fujimoto, K. Panesar, D. Allison, and M. Hybinette. 1994. GTW: a Time Warp system for shared memory multiprocessors. In Proceedings of the 26th conference on Winter simulation. Society for Computer Simulation International, 1332--1339.
[9]
J. Fleischmann and P. A. Wilsey. 1995. Comparative Analysis of Periodic State Saving Techniques in Time Warp Simulators. In Proceedings of the 9th Workshop on Parallel and Distributed Simulation. IEEE Computer Society, 50--58.
[10]
M. P. Frank. 1999. Reversibility for Efficient Computing. Ph.D. thesis, Massachusetts Institute of Technology.
[11]
S. Franks, F. Gomes, B. Unger, and J. Cleary. 1997. State saving for interactive optimistic simulation. In Proceedings of the 11th workshop on Parallel and Distributed Simulation. IEEE Computer Society, 72--79.
[12]
R. M. Fujimoto. 1990. Performance of Time Warp Under Synthetic Workloads. In Proceedings of the Multiconf. on Distributed Simulation. Society for Computer Simulation, 23--28.
[13]
R. M. Fujimoto, J. J. Tsai, and G. Gopalakrishnan 1992. Design and evaluation of the rollback chip: Special purpose hardware for Time Warp. IEEE Trans. Comput. 41, 1 (1992), 68--82.
[14]
C. Hou, G. Vulov, D. Quinlan, D. Jefferson, R. Fujimoto, and R. Vuduc. 2012a. A New Method for Program Inversion. Springer, Berlin, 81--100.
[15]
C. Hou, G. Vulov, D. J. Quinlan, D. Jefferson, R. Fujimoto, and R. W. Vuduc. 2012b. A new method for program inversion. In Proceeding of the 21st International Conference on Compiler Construction. 81--100.
[16]
D. R. Jefferson. 1985. Virtual Time. ACM Trans. Program. Lang. Syst. 7, 3 (1985), 404--425.
[17]
S. Kandukuri and S. Boyd. 2002. Optimal Power Control in Interference-Limited Fading Wireless Channels with Outage-Probability Specifications. IEEE Trans. Wireless Commun. 1, 1 (2002), 46--55.
[18]
J. M. LaPre, E. J. Gonsiorowski, and C. D. Carothers. 2014. LORAIN: A Step Closer to the PDES ’Holy Grail’. In Proceedings of the 2nd ACM SIGSIM/PADS Conference on Principles of Advanced Discrete Simulation (PADS’14). ACM Press, 3--14.
[19]
D. Lea. 1996. A Memory Allocator. Retrieved from http://g.oswego.edu/dl/html/malloc.html.
[20]
Y.-B. Lin and E. D. Lazowska. 1990. Reducing the Saving Overhead for Time Warp Parallel Simulation. University of Washington Department of Computer Science and Engineering. Technical Report 90-02-03.
[21]
A. C. Palaniswamy and P. A. Wilsey. 1993. An analytical comparison of periodic checkpointing and incremental state saving. In Proceedings of the 7th Workshop on Parallel and distributed simulation. ACM, 127--134.
[22]
A. Pellegrini. 2013. Hijacker: Efficient static software instrumentation with applications in high performance computing. In Proceedings of the 2013 International Conference on High Performance Computing and Simulation (HPCS’13). 650--655.
[23]
A. Pellegrini. 2015. Parallelization of Discrete Event Simulation Models. Sapienza Università Editrice, Rome, Italy.
[24]
A. Pellegrini and F. Quaglia. 2013. The ROme OpTimistic Simulator: A Tutorial. In Proceedings of the 1st Workshop on Parallel and Distributed Agent-Based Simulations (PADABS’13). LNCS. Springer-Verlag.
[25]
A. Pellegrini, R. Vitali, and F. Quaglia. 2009. Di-DyMeLoR: Logging only dirty chunks for efficient management of dynamic memory based optimistic simulation objects. In Proceedings - Workshop on Principles of Advanced and Distributed Simulation (PADS’09). IEEE, 45--53.
[26]
A. Pellegrini, R. Vitali, F. Quaglia, A. Pellegrini, and F. Quaglia. 2015. Autonomic State Management for Optimistic Simulation Platforms. IEEE Trans. Parallel Distrib. Syst. 26, 6 (2015), 1560--1569.
[27]
J. L. Peterson and T. A. Norman. 1977. Buddy systems. Commun. ACM 20, 6 (1977), 421--431.
[28]
B. R. Preiss, W. M. Loucks, and D. MacIntyre. 1994. Effects of the Checkpoint Interval on Time and Space in Time Warp. ACM Trans. Model. Comput. Simul. 4, 3 (1994), 223--253.
[29]
F. Qin, C. Wang, Z. Li, H.-S. Kim, Y. Zhou, and Y. Wu. 2006. LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks. In 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’06). 135--148.
[30]
F. Quaglia. 2001. A Cost Model for Selecting Checkpoint Positions in Time Warp Parallel Simulation. IEEE Trans. Parallel Distrib. Syst. 12, 4 (2001), 346--362.
[31]
F. Quaglia and A. Santoro. 2003. Non-Blocking Checkpointing for Optimistic Parallel Simulation: Description and an Implementation. IEEE Trans. Parallel Distrib. Syst. 14, 6 (2003), 593--610.
[32]
S. Robinson. 2004. Simulation: The Practice of Model Development and Use. John Wiley 8 Sons.
[33]
R. Rönngren and R. Ayani. 1994. Adaptive Checkpointing in Time Warp. In Proceedings of the 8th Workshop on Parallel and Distributed Simulation. Society for Computer Simulation, 110--117.
[34]
R. Rönngren, M. Liljenstam, R. Ayani, and J. Montagnat. 1996. Transparent Incremental State Saving in Time Warp Parallel Discrete Event Simulation. In Proceedings of the 10th Workshop on Parallel and Distributed Simulation. IEEE Computer Society, 70--77.
[35]
M. Schordan, D. Jefferson, P. Barnes, T. Oppelstrup, and D. Quinlan. 2015. Reverse Code Generation for Parallel Discrete Event Simulation. Springer International Publishing, Cham, 95--110.
[36]
S. K. Seal and K. S. Perumalla. 2011. Reversible parallel discrete event formulation of a tlm-based radio signal propagation model. ACM Trans. Model. Comput. Simul. 22, 1 (2011), 4.
[37]
S. Skold and R. Rönngren. 1996. Event Sensitive State Saving in Time Warp Parallel Discrete Event Simulation. In Proceedings of the 1996 Winter Simulation Conference. Society for Computer Simulation, 653--660.
[38]
H. M. Soliman and A. S. Elmaghraby. 1998. An Analytical Model for Hybrid Checkpointing in Time Warp Distributed Simulation. IEEE Trans. Parallel Distrib. Syst. 9, 10 (1998), 947--951.
[39]
R. Sosič. 1994. History cache: Hardware support for reverse execution. SIGARCH Comput. Arch. News 22, 5 (1994), 11--18.
[40]
J. S. Steinman. 1992. SPEEDES: A Multiple-Synchronization Environment for Parallel Discrete Event Simulation. Int. J. Comput. 251--286.
[41]
J. S. Steinman. 1993. Incremental State Saving in SPEEDES Using C Plus Plus. In Proceedings of the Winter Simulation Conference. Society for Computer Simulation, 687--696.
[42]
R. Toccaceli and F. Quaglia. 2008. DyMeLoR: Dynamic Memory Logger and Restorer Library for Optimistic Simulation Objects with Generic Memory Layout. In Proceedings of the 22nd Workshop on Principles of Advanced and Distributed Simulation. IEEE Computer Society, 163--172.
[43]
R. Wahbe, S. Lucco, and S. L. Graham. 1993. Practical Data Breakpoints: Design and Implementation. In Proceedings of the 1993 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’93). 1--12.
[44]
D. West and K. Panesar. 1996. Automatic Incremental State Saving. In Proceedings of the 10th Workshop on Parallel and Distributed Simulation. IEEE Computer Society, 78--85.
[45]
Q. Zhao, R. Rabbah, S. Amarasinghe, L. Rudolph, and W. F. Wong. 2008. How to do a million watchpoints: Efficient Debugging using dynamic instrumentation. Lecture Notes in Computer Science, Vol. 4959. 147--162.

Cited By

View all
  • (2023)Hybrid Speculative Synchronisation for Parallel Discrete Event SimulationProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591124(84-95)Online publication date: 21-Jun-2023
  • (2023)Effective Access to the Committed Global State in Speculative Parallel Discrete Event Simulation on Multi-core MachinesProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591117(107-117)Online publication date: 21-Jun-2023
  • (2023)Virtual Time III, Part 1: Unified Virtual Time Synchronization for Parallel Discrete Event SimulationACM Transactions on Modeling and Computer Simulation10.1145/350524832:4(1-29)Online publication date: 11-Jan-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Modeling and Computer Simulation
ACM Transactions on Modeling and Computer Simulation  Volume 27, Issue 2
Special Issue on PADS 2015
April 2017
203 pages
ISSN:1049-3301
EISSN:1558-1195
DOI:10.1145/3015562
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 May 2017
Accepted: 01 March 2017
Revised: 01 September 2016
Received: 01 November 2015
Published in TOMACS Volume 27, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Optimistic synchronization
  2. software reversibility
  3. time warp

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)7
  • Downloads (Last 6 weeks)1
Reflects downloads up to 04 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Hybrid Speculative Synchronisation for Parallel Discrete Event SimulationProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591124(84-95)Online publication date: 21-Jun-2023
  • (2023)Effective Access to the Committed Global State in Speculative Parallel Discrete Event Simulation on Multi-core MachinesProceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3573900.3591117(107-117)Online publication date: 21-Jun-2023
  • (2023)Virtual Time III, Part 1: Unified Virtual Time Synchronization for Parallel Discrete Event SimulationACM Transactions on Modeling and Computer Simulation10.1145/350524832:4(1-29)Online publication date: 11-Jan-2023
  • (2023)Incremental Checkpointing of Large State Simulation Models with Write-Intensive Events via Memory Update Correlation on Buddy Pages2023 IEEE/ACM 27th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)10.1109/DS-RT58998.2023.00014(40-47)Online publication date: 4-Oct-2023
  • (2022)Speculative Distributed Simulation of Very Large Spiking Neural NetworksProceedings of the 2022 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3518997.3531027(93-104)Online publication date: 8-Jun-2022
  • (2022)Spatial/Temporal Locality-based Load-sharing in Speculative Discrete Event Simulation on Multi-core MachinesProceedings of the 2022 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/3518997.3531026(81-92)Online publication date: 8-Jun-2022
  • (2022)Reversing an imperative concurrent programming languageScience of Computer Programming10.1016/j.scico.2022.102873223(102873)Online publication date: Nov-2022
  • (2022)Generation of Reversible C++ Code for Optimistic Parallel Discrete Event SimulationNew Generation Computing10.1007/s00354-018-0038-236:3(257-280)Online publication date: 11-Mar-2022
  • (2021)Study on Design and Additive Manufacturing of Customized Bionic Sports Sole for the ElderlyIEEE Access10.1109/ACCESS.2021.30781629(69830-69838)Online publication date: 2021
  • (2020)Algorithms for Distributed Server Allocation ProblemIEICE Transactions on Communications10.1587/transcom.2020EBP3006E103.B:11(1341-1352)Online publication date: 1-Nov-2020
  • Show More Cited By

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media