DOI: 10.1145/2931037.2931072
research-article

Documenting database usages and schema constraints in database-centric applications

Published: 18 July 2016

Abstract

Database-centric applications (DCAs) usually rely on database operations over a large number of tables and attributes. Understanding how database tables and attributes are used to implement features in DCAs, along with the constraints related to these usages, is an important component of any DCA’s maintenance. However, manually documenting database-related operations and their asynchronously evolving constraints in constantly changing source code is a hard and time-consuming problem. In this paper, we present a novel approach, namely DBScribe, aimed at automatically generating always up-to-date natural language descriptions of database operations and schema constraints in source code methods. DBScribe statically analyzes the code and database schema to detect database usages and then propagates these usages and schema constraints through the call-chains implementing database-related features. Finally, each method in these call-chains is automatically documented based on the underlying database usages and constraints.
We evaluated DBScribe in a study with 52 participants analyzing generated documentation for database-related methods in five open-source DCAs. Additionally, we evaluated the descriptions generated by DBScribe on two commercial DCAs involving original developers. The results for the studies involving open-source and commercial DCAs demonstrate that generated descriptions are accurate and useful for understanding database usages and constraints, in particular during maintenance tasks.
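
The propagation step described in the abstract can be pictured with a small sketch. It is not DBScribe's implementation: the call graph, the per-method SQL usages, the constraint text, and every function and method name below are hypothetical stand-ins for what a static analysis of the code and schema might produce. The sketch only illustrates the general idea of pushing locally detected database usages up through call chains and then describing each method from what it reaches.

```python
# Hypothetical inputs; a real tool would extract these by static analysis.
CALL_GRAPH = {
    "enrollStudent": ["saveEnrollment", "findCourse"],
    "saveEnrollment": [],
    "findCourse": [],
}
LOCAL_USAGES = {
    "saveEnrollment": {("INSERT", "enrollment")},
    "findCourse": {("SELECT", "course")},
}
SCHEMA_CONSTRAINTS = {
    "enrollment": ["student_id must not be NULL"],
}


def propagate_usages(call_graph, local_usages):
    """Return, per method, its own usages plus those of all reachable callees.

    Uses a simple fixed-point iteration so that recursive call chains
    (cycles in the call graph) still terminate.
    """
    methods = set(call_graph) | set(local_usages)
    for callees in call_graph.values():
        methods |= set(callees)
    usages = {m: set(local_usages.get(m, ())) for m in methods}

    changed = True
    while changed:
        changed = False
        for caller, callees in call_graph.items():
            for callee in callees:
                missing = usages[callee] - usages[caller]
                if missing:
                    usages[caller] |= missing
                    changed = True
    return usages


def describe(method, usages, constraints):
    """Build plain-text description lines for one method (assumed wording)."""
    lines = []
    for operation, table in sorted(usages[method]):
        lines.append(f"{method} performs {operation} on table {table}")
        for constraint in constraints.get(table, ()):
            lines.append(f"  constraint: {constraint}")
    return lines


if __name__ == "__main__":
    all_usages = propagate_usages(CALL_GRAPH, LOCAL_USAGES)
    for line in describe("enrollStudent", all_usages, SCHEMA_CONSTRAINTS):
        print(line)
```

In this toy setup, `enrollStudent` contains no SQL itself, yet its description lists the insert and select performed by its callees together with the constraint on the `enrollment` table, which is the kind of call-chain view the abstract refers to.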

Published In

ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and Analysis
July 2016
452 pages
ISBN: 9781450343909
DOI: 10.1145/2931037
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Database-centric applications
  2. Documentation
  3. SQL-data statements
  4. Schema constraints

Qualifiers

  • Research-article

Conference

ISSTA '16
Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Article Metrics

  • Downloads (last 12 months): 20
  • Downloads (last 6 weeks): 1

Reflects downloads up to 04 Sep 2024

Cited By

  • (2023) An Empirical Study on Low- and High-Level Explanations of Deep Learning Misbehaviours. 2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pages 1-11. DOI: 10.1109/ESEM56168.2023.10304866. Online publication date: 26-Oct-2023
  • (2022) An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation. 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 514-525. DOI: 10.1109/SANER53432.2022.00069. Online publication date: Mar-2022
  • (2021) Simplified Evaluation Framework for Query Extraction Techniques. 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO), pages 1648-1653. DOI: 10.23919/MIPRO52101.2021.9596923. Online publication date: 27-Sep-2021
  • (2021) SAND: a static analysis approach for detecting SQL antipatterns. Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 270-282. DOI: 10.1145/3460319.3464818. Online publication date: 11-Jul-2021
  • (2021) Exploring User Experience of Automatic Documentation Tools. Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, pages 1-6. DOI: 10.1145/3411763.3451606. Online publication date: 8-May-2021
  • (2021) Automated Documentation of Android Apps. IEEE Transactions on Software Engineering, 47:1, pages 204-220. DOI: 10.1109/TSE.2018.2890652. Online publication date: 1-Jan-2021
  • (2021) An RDBMS-only architecture for web applications. 2021 XLVII Latin American Computing Conference (CLEI), pages 1-9. DOI: 10.1109/CLEI53233.2021.9640017. Online publication date: 25-Oct-2021
  • (2021) An Empirical Study of (Multi-) Database Models in Open-Source Projects. Conceptual Modeling, pages 87-101. DOI: 10.1007/978-3-030-89022-3_8. Online publication date: 18-Oct-2021
  • (2020) Hybrid Methods for Reducing Database Schema Test Suites. Proceedings of the IEEE/ACM 1st International Conference on Automation of Software Test, pages 41-50. DOI: 10.1145/3387903.3389305. Online publication date: 7-Oct-2020
  • (2020) Software documentation. Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, pages 590-601. DOI: 10.1145/3377811.3380405. Online publication date: 27-Jun-2020
