skip to main content
research-article

Programming by Example Made Easy

Published: 24 November 2023 Publication History
  • Get Citation Alerts
  • Abstract

    Programming by example (PBE) is an emerging programming paradigm that automatically synthesizes programs specified by user-provided input-output examples. Despite the convenience for end-users, implementing PBE tools often requires strong expertise in programming language and synthesis algorithms. Such a level of knowledge is uncommon among software developers. It greatly limits the broad adoption of PBE by the industry. To facilitate the adoption of PBE techniques, we propose a PBE framework called Bee, which leverages an “entity-action” model based on relational tables to ease PBE development for a wide but restrained range of domains. Implementing PBE tools with Bee only requires adapting domain-specific data entities and user actions to tables, with no need to design a domain-specific language or an efficient synthesis algorithm. The synthesis algorithm of Bee exploits bidirectional searching and constraint-solving techniques to address the challenge of value computation nested in table transformation. We evaluated Bee’s effectiveness on 64 PBE tasks from three different domains and usability with a human study of 12 participants. Evaluation results show that Bee is easier to learn and use than the state-of-the-art PBE framework, and the bidirectional algorithm achieves comparable performance to domain-specifically optimized synthesizers.

    References

    [1]
    2023. Excel Forum. https://www.excelforum.com/excel-general.Accessed 9 March 2023.
    [2]
    [3]
    2023. OxygenXML. https://www.oxygenxml.com/forum.Accessed 9 March 2023.
    [4]
    2023. Bee Homepage. https://github.com/Sissel-Wu/Bee.Accessed 9 March 2023.
    [5]
    Rajeev Alur, Rastislav Bodík, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. 2013. Syntax-guided synthesis. In Proceedings of the Formal Methods in Computer-Aided Design. IEEE, 1–8. Retrieved from https://ieeexplore.ieee.org/document/6679385/
    [6]
    Rajeev Alur, Arjun Radhakrishna, and Abhishek Udupa. 2017. Scaling enumerative program synthesis via divide and conquer. In Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems - 23rd International Conference, TACAS 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22–29, 2017, Proceedings, Part I.Axel Legay and Tiziana Margaria (Eds.), Lecture Notes in Computer Science, Vol. 10205, 319–336. DOI:
    [7]
    Rajeev Alur, Pavol Černý, and Arjun Radhakrishna. 2015. Synthesis through unification. In Proceedings of the Computer Aided Verification.Daniel Kroening and Corina S. Păsăreanu (Eds.), Springer International Publishing, Cham, 163–179. DOI:
    [8]
    Shaon Barman, Sarah Chasins, Rastislav Bodik, and Sumit Gulwani. 2016. Ringer: Web automation by demonstration. SIGPLAN Not. 51, 10 (2016), 748–764. DOI:
    [9]
    Daniel W. Barowy, Sumit Gulwani, Ted Hart, and Benjamin Zorn. 2015. FlashRelate: Extracting relational data from semi-structured spreadsheets using examples. SIGPLAN Not. 50, 6 (2015), 218–228. DOI:
    [10]
    Clark Barrett, A. Stump, and Cesare Tinelli. 2010. The SMT-LIB standard - version 2.0. In Proceedings of the 8th International Workshop on Satisfiability Modulo Theories, Edinburgh, Scotland.
    [11]
    Rohan Bavishi, Caroline Lemieux, Roy Fox, Koushik Sen, and Ion Stoica. 2019. AutoPandas: Neural-backed generators for program synthesis. Proc. ACM Program. Lang. 3, OOPSLA (2019), 27 pages. DOI:
    [12]
    Rohan Bavishi, Caroline Lemieux, Koushik Sen, and Ion Stoica. 2021. Gauss: Program synthesis by reasoning over graphs. Proc. ACM Program. Lang. 5, OOPSLA (2021), 1–29. DOI:
    [13]
    Pavol Bielik, Veselin Raychev, and Martin T. Vechev. 2016. PHOG: Probabilistic model for code. In Proceedings of the 33nd International Conference on Machine LearningMaria-Florina Balcan and Kilian Q. Weinberger (Eds.), JMLR.org, 2933–2942. Retrieved from http://proceedings.mlr.press/v48/bielik16.html
    [14]
    Matthew Bowers, Theo X. Olausson, Lionel Wong, Gabriel Grand, Joshua B. Tenenbaum, Kevin Ellis, and Armando Solar-Lezama. 2023. Top-down synthesis for library learning. Proceedings of the ACM on Programming Languages 7, POPL (2023), 41:1182–41:1213. DOI:
    [15]
    Salman Cheema, Sarah Buchanan, Sumit Gulwani, and Joseph J. LaViola Jr.2014. A practical framework for constructing structured drawings. In Proceedings of the 19th International Conference on Intelligent User Interfaces.Tsvi Kuflik, Oliviero Stock, Joyce Yue Chai, and Antonio Krüger (Eds.), ACM, 311–316. DOI:
    [16]
    Leonardo Mendonça de Moura and Nikolaj S. Bjørner. 2008. Z3: An efficient SMT solver. In Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29–April 6, 2008. Proceedings. (C. R. Ramakrishnan and Jakob Rehof (Eds.), Lecture Notes in Computer Science, Vol. 4963, Springer, 337–340. DOI:
    [17]
    Li Dong and Mirella Lapata. 2018. Coarse-to-fine decoding for neural semantic parsing. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.Iryna Gurevych and Yusuke Miyao (Eds.), Association for Computational Linguistics, 731–742. DOI:
    [18]
    Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sablé-Meyer, Lucas Morales, Luke Hewitt, Luc Cary, Armando Solar-Lezama, and Joshua B. Tenenbaum. 2021. DreamCoder: Bootstrapping inductive program synthesis with wake-sleep library learning. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation.Association for Computing Machinery, New York, NY, 835–850. DOI:
    [19]
    Yu Feng, Ruben Martins, Osbert Bastani, and Isil Dillig. 2018. Program synthesis using conflict-driven learning. SIGPLAN Not. 53, 4 (2018), 420–435. DOI:
    [20]
    Yu Feng, Ruben Martins, Jacob Van Geffen, Isil Dillig, and Swarat Chaudhuri. 2017. Component-based synthesis of table consolidation and transformation tasks from examples. SIGPLAN Not. 52, 6 (2017), 422–436. DOI:
    [21]
    Xiang Gao, Shraddha Barke, Arjun Radhakrishna, Gustavo Soares, Sumit Gulwani, Alan Leung, Nachiappan Nagappan, and Ashish Tiwari. 2020. Feedback-driven semi-supervised synthesis of program transformations. Proceedings of the ACM on Programming Languages 4, OOPSLA (Nov. 2020), 219:1–219:30. DOI:
    [22]
    Sumit Gulwani. 2011. Automating string processing in spreadsheets using input-output examples. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. Thomas Ball and Mooly Sagiv (Eds.), ACM, 317–330. DOI:
    [23]
    Sumit Gulwani, Oleksandr Polozov, and Rishabh Singh. 2017. Program synthesis. Foundations and Trends® in Programming Languages 4, 1–2 (2017), 1–119. DOI:
    [24]
    Daniel Conrad Halbert. 1984. Programming by Example. Ph. D. Dissertation. AAI8512843.
    [25]
    William R. Harris and Sumit Gulwani. 2011. Spreadsheet table transformations from examples. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’11, San Jose, CA, USA, June 4-8, 2011), Mary W. Hall and David A. Padua (Eds.). ACM, 317–328.
    [26]
    Steve Henty. 2015. UI Response Times. Retrieved from https://medium.com/@slhenty/ui-response-times-acec744f3157Accessed 9 March 2023.
    [27]
    Daniel Jackson. 2006. Software Abstractions - Logic, Language, and Analysis. MIT Press. Retrieved from http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10928
    [28]
    Ruyi Ji, Jingjing Liang, Yingfei Xiong, Lu Zhang, and Zhenjiang Hu. 2020. Question selection for interactive program synthesis. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation.Association for Computing Machinery, New York, NY, 1143–1158. DOI:
    [29]
    Ruyi Ji, Jingtao Xia, Yingfei Xiong, and Zhenjiang Hu. 2021. Generalizable synthesis through unification. Proceedings of the ACM on Programming Languages.DOI:Publisher: ACM PUB27 New York, NY.
    [30]
    Ashwin Kalyan, Abhishek Mohta, Alex Polozov, Dhruv Batra, Prateek Jain, and Sumit Gulwani. 2018. Neural-guided deductive search for real-time program synthesis from examples. In Proceedings of the 6th International Conference on Learning Representations. Retrieved from https://www.microsoft.com/en-us/research/publication/neural-guided-deductive-search-real-time-program-synthesis-examples/
    [31]
    Vu Le and Sumit Gulwani. 2014. FlashExtract: A framework for data extraction by examples. SIGPLAN Not. 49, 6 (2014), 542–553. DOI:
    [32]
    Woosuk Lee. 2021. Combining the top-down propagation and bottom-up enumeration for inductive program synthesis. Proceedings of the ACM on Programming Languages 5, POPL (2021), 54:1–54:28. DOI:
    [33]
    Alan Leung, John Sarracino, and Sorin Lerner. 2015. Interactive parser synthesis by example. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. David Grove and Stephen M. Blackburn (Eds.), ACM, 565–574. DOI:
    [34]
    Leonid Libkin. 2003. Expressive power of SQL. Theoretical Computer Science 296, 3 (2003), 379–404. DOI:
    [35]
    Ruben Martins, Jia Chen, Yanju Chen, Yu Feng, and Isil Dillig. 2019. Trinity: An extensible synthesis framework for data science. Proc. VLDB Endow. 12, 12 (2019), 1914–1917. DOI:
    [36]
    Anders Miltner, Sumit Gulwani, Alan Leung, Arjun Radhakrishna, Gustavo Soares, Ashish Tiwari, and Abhishek Udupa. 2019. On the fly synthesis of edit suggestions. Proceedings of the ACM on Programming Languages 3, OOPSLA (2019), 143:1–143:29. DOI:
    [37]
    Chandrakana Nandi, Max Willsey, Adam Anderson, James R. Wilcox, Eva Darulova, Dan Grossman, and Zachary Tatlock. 2020. Synthesizing structured CAD models with equality saturation and inverse transformations. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation.Association for Computing Machinery, New York, NY, 31–44. DOI:
    [38]
    Ansong Ni, Pengcheng Yin, and Graham Neubig. 2020. Merging weak and active supervision for semantic parsing. In Proceedings of the 34th AAAI Conference on Artificial Intelligence, AAAI 2020, The 32nd Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The 10th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY. AAAI Press, 8536–8543. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/6375
    [39]
    Jakob Nielsen. 2014. Response Times: The 3 Important Limits. Retrieved from https://www.nngroup.com/articles/response-times-3-important-limits/Accessed: 2023.
    [40]
    Elizabeth J. O’Neil. 2008. Object/relational mapping 2008: Hibernate and the entity data model (edm). In Proceedings of the ACM SIGMOD International Conference on Management of Data. Jason Tsong-Li Wang (Ed.), ACM, 1351–1356. DOI:
    [41]
    Rangeet Pan, Vu Le, Nachiappan Nagappan, Sumit Gulwani, Shuvendu Lahiri, and Mike Kaufman. 2021. Can program synthesis be used to learn merge conflict resolutions? An empirical analysis.2021 IEEE/ACM 43rd International Conference on Software Engineering. IEEE Computer Society, 785–796. DOI:ISSN: 1558-1225.
    [42]
    Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A framework for inductive program synthesis. OOPSLA 50, 10 (2015), 107–126. DOI:
    [43]
    Mukund Raghothaman, Jonathan Mendelson, David Zhao, Mayur Naik, and Bernhard Scholz. 2019. Provenance-guided synthesis of Datalog programs. Proceedings of the ACM on Programming Languages 4, POPL (2019), 62:1–62:27. DOI:
    [44]
    Mohammad Raza and Sumit Gulwani. 2017. Automated data extraction using predictive program synthesis. In Proceedings of the 31st AAAI Conference on Artificial Intelligence.Satinder Singh and Shaul Markovitch (Eds.), AAAI Press, 882–890. Retrieved from http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/15034
    [45]
    Reudismam Rolim, Gustavo Soares, Loris D’Antoni, Oleksandr Polozov, Sumit Gulwani, Rohit Gheyi, Ryo Suzuki, and Björn Hartmann. 2017. Learning syntactic program transformations from examples. In Proceedings of the 39th International Conference on Software Engineering.IEEE Press, Buenos Aires, Argentina, 404–415. DOI:
    [46]
    Jiasi Shen and Martin C. Rinard. 2019. Using active learning to synthesize models of applications that access databases. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation.Association for Computing Machinery, New York, NY, 269–285. DOI:
    [47]
    Jiasi Shen and Martin C. Rinard. 2021. Active learning for inference and regeneration of applications that access databases. ACM Transactions on Programming Languages and Systems 42, 4 (2021), 18:1–18:119. DOI:
    [48]
    Xujie Si, Woosuk Lee, Richard Zhang, Aws Albarghouthi, Paraschos Koutris, and Mayur Naik. 2018. Syntax-guided synthesis of Datalog programs. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering.Association for Computing Machinery, New York, NY, 515–527. DOI:
    [49]
    Rishabh Singh and Sumit Gulwani. 2016. Transforming spreadsheet data types using examples. SIGPLAN Not. 51, 1 (2016), 343–356. DOI:
    [50]
    Aalok Thakkar, Aaditya Naik, Nathaniel Sands, Rajeev Alur, Mayur Naik, and Mukund Raghothaman. 2021. Example-guided synthesis of relational queries. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation.Association for Computing Machinery, New York, NY, 1110–1125. DOI:
    [51]
    Emina Torlak and Rastislav Bodik. 2014. A lightweight symbolic virtual machine for solver-aided host languages. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation.Association for Computing Machinery, New York, NY, 530–541. DOI:
    [52]
    Quoc Trung Tran, Chee-Yong Chan, and Srinivasan Parthasarathy. 2009. Query by output. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data.Association for Computing Machinery, New York, NY, 535–548. DOI:
    [53]
    Chenglong Wang, Alvin Cheung, and Rastislav Bodík. 2017. Synthesizing highly expressive SQL queries from input-output examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation.Albert Cohen and Martin T. Vechev (Eds.), ACM, 452–466. DOI:
    [54]
    Chenglong Wang, Yu Feng, Rastislav Bodík, Alvin Cheung, and Isil Dillig. 2020. Visualization by example. Proc. ACM Program. Lang. 4, POPL (2020), 49:1–49:28. DOI:
    [55]
    Xinyu Wang, Greg Anderson, Isil Dillig, and K. L. McMillan. 2018. Learning abstractions for program synthesis. In Proceedings of the Computer Aided Verification.Hana Chockler and Georg Weissenbacher (Eds.), Springer International Publishing, Cham, 407–426. DOI:
    [56]
    Xinyu Wang, Isil Dillig, and Rishabh Singh. 2017. Synthesis of data completion scripts using finite tree automata. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017), 62:1–62:26. DOI:
    [57]
    Yuepeng Wang, James Dong, Rushi Shah, and Isil Dillig. 2019. Synthesizing database programs for schema refactoring. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation.Association for Computing Machinery, New York, NY, 286–300. DOI:
    [58]
    Frank Wilcoxon. 1945. Individual comparisons by ranking methods. Biometrics Bulletin 1, 6 (1945), 80–83. Retrieved from http://www.jstor.org/stable/3001968
    [59]
    Navid Yaghmazadeh, Christian Klinger, Isil Dillig, and Swarat Chaudhuri. 2016. Synthesizing transformations on hierarchically structured data. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation. Chandra Krintz and Emery D. Berger (Eds.), ACM, 508–521. DOI:
    [60]
    Sai Zhang and Yuyin Sun. 2013. Automatically synthesizing SQL queries from input-output examples. Proceedings of the 2013 28th IEEE/ACM International Conference on Automated Software Engineering. (2013), 224–234. DOI:
    [61]
    Xiangyu Zhou, Rastislav Bodik, Alvin Cheung, and Chenglong Wang. 2022. Synthesizing analytical SQL queries from computation demonstration. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation. Association for Computing Machinery, New York, NY, 168–182. DOI:

    Cited By

    View all

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Software Engineering and Methodology
    ACM Transactions on Software Engineering and Methodology  Volume 33, Issue 1
    January 2024
    933 pages
    ISSN:1049-331X
    EISSN:1557-7392
    DOI:10.1145/3613536
    • Editor:
    • Mauro Pezzè
    Issue’s Table of Contents

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 November 2023
    Online AM: 07 July 2023
    Accepted: 07 June 2023
    Revised: 10 March 2023
    Received: 21 September 2022
    Published in TOSEM Volume 33, Issue 1

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Program synthesis
    2. programming by example

    Qualifiers

    • Research-article

    Funding Sources

    • National Science Foundation of China
    • Leading-edge Technology Program of the Jiangsu Natural Science Foundation
    • Hong Kong Research Grant Council/General Research Fund
    • Hong Kong Research Grant Council/Research Impact Fund
    • Natural Sciences and Engineering Research Council of Canada Discovery Grant
    • Fundamental Research Funds for the Central Universities of China

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 598
      Total Downloads
    • Downloads (Last 12 months)483
    • Downloads (Last 6 weeks)28
    Reflects downloads up to 14 Aug 2024

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Full Text

    View this article in Full Text.

    Full Text

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media