Detecting Software Cache Coherence Violations in MPSoC Using Traces Captured on Virtual Platforms

Published: 02 January 2017

Abstract

Software cache coherence schemes tend to be the solution of choice in dedicated multi/many-core systems on chip, as they make the hardware much simpler and more predictable. However, despite developers’ efforts, it is hard to ensure that all the preventive measures needed to guarantee coherence are taken. In this work, we propose a method to identify potential cache coherence violations using traces obtained from virtual platforms. These traces contain causality relations among events, which both simplify the analysis and avoid the need to rely on timestamps. Our method identifies potential violations that may occur during a given execution for write-through and write-back cache policies; it is therefore independent of the software coherence protocol. We conducted experiments on parallel applications running on a lightweight SMP operating system and were able to detect coherence issues, which we then resolved.
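
To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes a hypothetical trace format in which each event names its direct causal predecessors, and it applies a deliberately conservative rule: a read of a cache line on one core, causally after a write to that line on another core, is flagged unless the reader invalidated the line after the new value reached memory (after the writer's flush under write-back, or after the write itself under write-through). The names `Event`, `happens_before`, and `potential_violations` are assumptions made for this example, and later overwrites of the same line are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    eid: int            # unique event id (hypothetical trace field)
    core: int           # core that issued the event
    kind: str           # "write", "read", "flush", or "invalidate"
    line: int           # cache line address
    after: tuple = ()   # ids of events that directly precede this one


def happens_before(events):
    """Return hb(a, b): True if event a causally precedes event b.

    Assumes the causality relation is acyclic.
    """
    by_id = {e.eid: e for e in events}
    cache = {}

    def ancestors(eid):
        if eid not in cache:
            seen = set()
            for pred in by_id[eid].after:
                seen.add(pred)
                seen |= ancestors(pred)
            cache[eid] = seen
        return cache[eid]

    return lambda a, b: a in ancestors(b)


def potential_violations(events, write_back):
    """List (write id, read id) pairs that may observe stale data."""
    hb = happens_before(events)
    found = []
    for w in (e for e in events if e.kind == "write"):
        for r in events:
            if (r.kind != "read" or r.core == w.core
                    or r.line != w.line or not hb(w.eid, r.eid)):
                continue
            if write_back:
                # Memory is up to date only after the writer flushes the line.
                fresh = [f.eid for f in events
                         if f.kind == "flush" and f.core == w.core
                         and f.line == w.line and hb(w.eid, f.eid)]
            else:
                # Write-through: memory is up to date once the write is done.
                fresh = [w.eid]
            # The reader must invalidate after memory is fresh, before reading.
            safe = any(i.kind == "invalidate" and i.core == r.core
                       and i.line == r.line and hb(i.eid, r.eid)
                       and any(hb(s, i.eid) for s in fresh)
                       for i in events)
            if not safe:
                found.append((w.eid, r.eid))
    return found


# Core 0 writes and flushes line 0x40; core 1 reads it once without
# invalidating (event 3) and once after invalidating (event 5).
trace = [
    Event(1, 0, "write", 0x40),
    Event(2, 0, "flush", 0x40, (1,)),
    Event(3, 1, "read", 0x40, (2,)),
    Event(4, 1, "invalidate", 0x40, (3,)),
    Event(5, 1, "read", 0x40, (4,)),
]
print(potential_violations(trace, write_back=True))  # [(1, 3)]
```

Because ordering comes only from the `after` edges, the check never compares timestamps, which mirrors the abstract's point about relying on causality relations. The rule is conservative by design: it can report a read that would in practice miss in the cache and fetch fresh data.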

Cited By

  • (2020) Deep learning parallel computing and evaluation for embedded system clustering architecture processor. Design Automation for Embedded Systems. DOI: 10.1007/s10617-020-09235-5. Online publication date: 7-Mar-2020
  • (2018) Directed Test Generation for Validation of Cache Coherence Protocols. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 38:1 (163-176). DOI: 10.1109/TCAD.2018.2801239. Online publication date: 18-Dec-2018

Published In

ACM Transactions on Embedded Computing Systems  Volume 16, Issue 2
Special Issue on LCETES 2015, Special Issue on ACSD 2015 and Special Issue on Embedded Device Forensics and Security
May 2017
705 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3025020
© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 January 2017
Accepted: 01 August 2016
Revised: 01 May 2016
Received: 01 August 2015
Published in TECS Volume 16, Issue 2

Author Tags

  1. MPSoC
  2. cache coherence
  3. trace analysis
  4. virtual platforms

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Ministry of Education of Brazil through the CAPES Foundation
  • French Ministry of Industry through the SoCTrace FUI project
