Spreadsheet data manipulation using examples

Published: 01 August 2012

Abstract

Millions of computer end users need to perform tasks over large spreadsheet data, yet lack the programming knowledge to automate those tasks. We present a programming-by-example methodology that allows end users to automate such repetitive tasks. Our methodology involves designing a domain-specific language and developing a synthesis algorithm that can learn programs in that language from user-provided examples. We present instantiations of this methodology for three task domains: (a) syntactic transformations of strings using restricted forms of regular expressions, conditionals, and loops, (b) semantic transformations of strings involving lookup in relational tables, and (c) layout transformations on spreadsheet tables. We have implemented this technology as an add-in for the Microsoft Excel spreadsheet system and have evaluated it successfully over several benchmarks picked from various Excel help forums.
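
To make the idea of learning a program from input-output examples concrete, here is a minimal Python sketch. It is not the synthesis algorithm of the article: the toy language (split on a separator, pick one field, optionally change case), the names SEPARATORS, INDICES, CASE_OPS, run, and synthesize, and the brute-force enumeration are all assumptions chosen for illustration only.

```python
from itertools import product

# Hypothetical toy DSL: a program is a (separator, field index, case op) triple.
SEPARATORS = [" ", ",", "-", "@", "."]
INDICES = [0, 1, 2, -1, -2, -3]
CASE_OPS = {"same": lambda s: s, "upper": str.upper, "lower": str.lower}


def run(program, text):
    """Evaluate a program on text; return None if the field does not exist."""
    sep, index, case = program
    fields = text.split(sep)
    if not -len(fields) <= index < len(fields):
        return None
    return CASE_OPS[case](fields[index])


def synthesize(examples):
    """Return every program in the toy DSL that agrees with all examples."""
    candidates = product(SEPARATORS, INDICES, CASE_OPS)
    return [p for p in candidates
            if all(run(p, inp) == out for inp, out in examples)]


# Two user-provided examples: extract the last name in upper case.
examples = [("Alan Turing", "TURING"), ("Grace Hopper", "HOPPER")]
programs = synthesize(examples)

# Apply the first consistent program to a row the user did not fill in.
learned = programs[0]
print(learned, run(learned, "Ada Lovelace"))
```

With these two examples, two programs in the toy language remain consistent (field 1 and field -1 after splitting on a space, both upper-cased). They agree on two-word names but would differ on longer ones, which hints at why choosing among consistent programs matters in a practical system.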

Published In

Communications of the ACM, Volume 55, Issue 8
August 2012
105 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/2240236
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Popular
  • Refereed