DOI: 10.5555/3314872.3314890

Generation of in-bounds inputs for arrays in memory-unsafe languages

Published: 16 February 2019

Abstract

This paper presents a technique to generate in-bounds inputs for arrays used in memory-unsafe programming languages, such as C and C++. We show that most memory indexing found in actual C programs follows patterns that are easy to analyze statically. Based on this observation, we show how symbolic range analysis can be used to establish contracts between the arguments of a function and the arrays used within that function. To demonstrate the effectiveness of our ideas, we use them to implement Griffin-TG, a tool to stress-test C programs whose source code might be only partially available. We show how Griffin-TG improves Aprof, a well-known algorithmic profiling tool, and how it lets us enrich Polybench with a large set of new inputs.
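As a minimal sketch of the kind of contract the abstract describes, consider the hypothetical C function below (our own illustration, not code from the paper). Every array access is indexed by a loop variable whose symbolic range is bounded by the parameter n, so a symbolic range analysis can infer the contract that x and y must each point to at least n valid elements; an input generator that respects this contract, like the driver in main, produces only in-bounds accesses.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical example: every access x[i] and y[i] is guarded by i < n,
     * so a symbolic range analysis can infer the contract
     *     len(x) >= n  and  len(y) >= n
     * relating the array arguments to the scalar argument n. */
    static void saxpy(int n, float a, const float *x, float *y) {
        for (int i = 0; i < n; i++)   /* index range of i: [0, n-1] */
            y[i] = a * x[i] + y[i];
    }

    /* A driver that honors the inferred contract: it allocates exactly n
     * elements per array, so every access inside saxpy stays in bounds. */
    int main(void) {
        int n = 1000;                     /* arbitrary test size */
        float *x = malloc(n * sizeof *x);
        float *y = malloc(n * sizeof *y);
        if (!x || !y) return 1;
        for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }
        saxpy(n, 3.0f, x, y);
        printf("y[0] = %f\n", y[0]);
        free(x);
        free(y);
        return 0;
    }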


Cited By

  • (2021) Memory-safe elimination of side channels. Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp. 200–210. DOI: 10.1109/CGO51591.2021.9370305. Online publication date: 27 February 2021.


Published In

CGO 2019: Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization
February 2019, 286 pages
ISBN: 9781728114361

Publisher

IEEE Press

Publication History

Published: 16 February 2019

Author Tags

  1. Arrays
  2. Range Analysis
  3. Static Analysis
  4. Test

Acceptance Rates

Overall Acceptance Rate: 312 of 1,061 submissions (29%)
