research-article

Selection and presentation practices for code example summarization

Authors:

Annie T. T. Ying,

Martin P. Robillard

FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

Pages 460 - 471

https://doi.org/10.1145/2635868.2635877

Published: 11 November 2014

Abstract

Code examples are an important source for answering questions about software libraries and applications. Many usage contexts for code examples require them to be distilled to their essence, for example when serving as cues to longer documents or when reminding developers of a previously known idiom. We conducted a study to discover how code can be summarized and why. As part of the study, we collected 156 pairs of code examples and their summaries from 16 participants, along with over 26 hours of think-aloud verbalizations detailing the decisions of the participants during their summarization activities. Based on a qualitative analysis of this data, we elicited a list of practices followed by the participants to summarize code examples, and we propose empirically supported hypotheses justifying the use of specific practices. One main finding was that none of the participants exclusively extracted code verbatim for the summaries, motivating abstractive summarization. The results provide a grounded basis for the development of code example summarization and presentation technology.

Index Terms

Selection and presentation practices for code example summarization
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. Software management
        Software maintenance
2. Software and its engineering
  1. Software creation and management
    1. Software post-development issues

Published In

FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

November 2014

856 pages

ISBN:9781450330565

DOI:10.1145/2635868

General Chair: Shing-Chi Cheung (Hong Kong University of Science and Technology, China)

Program Chairs: Alessandro Orso (Georgia Institute of Technology, USA), Margaret-Anne Storey (University of Victoria, Canada)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGSOFT/FSE'14

Sponsor:

SIGSOFT

SIGSOFT/FSE'14: 22nd ACM SIGSOFT Symposium on the Foundations of Software Engineering

November 16 - 21, 2014

Hong Kong, China

Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%

Bibliometrics

Total Citations: 27

Total Downloads: 515

Downloads (Last 12 months): 8

Downloads (Last 6 weeks): 2

Reflects downloads up to 01 Nov 2024

Cited By

Y. Shi, Y. Yin, M. Yu, and L. Chu. CogCol: Code Graph-Based Contrastive Learning Model for Code Summarization. Electronics, 13(10):1816, 2024. Online publication date: 8 May 2024. https://doi.org/10.3390/electronics13101816

D. Arya, J. Guo, and M. Robillard. Properties and Styles of Software Technology Tutorials. IEEE Transactions on Software Engineering, 50(2):159–172, 2024. Online publication date: Feb 2024. https://doi.org/10.1109/TSE.2023.3332568

H. Zhong and X. Wang. An empirical study on API usages from code search engine and local library. Empirical Software Engineering, 28(3), 2023. Online publication date: 13 Apr 2023. https://dl.acm.org/doi/10.1007/s10664-023-10304-z

C. Werner, Z. Li, D. Lowlind, O. Elazhary, N. Ernst, and D. Damian. Continuously Managing NFRs: Opportunities and Challenges in Practice. IEEE Transactions on Software Engineering, 48(7):2629–2642, 2022. Online publication date: 1 Jul 2022. https://doi.org/10.1109/TSE.2021.3066330

K. Moran, A. Yachnes, G. Purnell, J. Mahmud, M. Tufano, C. Cardenas, D. Poshyvanyk, and Z. H'Doubler. An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 514–525, 2022. Online publication date: Mar 2022. https://doi.org/10.1109/SANER53432.2022.00069

G. Uddin, O. Baysal, L. Guerrouj, and F. Khomh. Understanding How and Why Developers Seek and Analyze API-Related Opinions. IEEE Transactions on Software Engineering, 47(4):694–735, 2021. Online publication date: 1 Apr 2021. https://doi.org/10.1109/TSE.2019.2903039

P. Rani, S. Panichella, M. Leuenberger, A. Di Sorbo, and O. Nierstrasz. How to identify class comment types? A multi-language approach for class comments classification. Journal of Systems and Software, article 111047, 2021. Online publication date: Jul 2021. https://doi.org/10.1016/j.jss.2021.111047

G. Uddin, F. Khomh, G. Rosu, M. Di Penta, and T. Nguyen. Opiner: an opinion search and summarization engine for APIs. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, pages 978–983, 2017. Online publication date: 30 Oct 2017. https://dl.acm.org/doi/10.5555/3155562.3155690

G. Uddin, F. Khomh, G. Rosu, M. Di Penta, and T. Nguyen. Automatic summarization of API reviews. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, pages 159–170, 2017. Online publication date: 30 Oct 2017. https://dl.acm.org/doi/10.5555/3155562.3155586

S. Azad, P. Rigby, and L. Guerrouj. Generating API Call Rules from Version History and Stack Overflow Posts. ACM Transactions on Software Engineering and Methodology, 25(4):1–22, 2017. Online publication date: 9 Jan 2017. https://dl.acm.org/doi/10.1145/2990497
