On the usage of Pythonic idioms

CV Alexandru, JJ Merchante, S Panichella… - Proceedings of the …, 2018 - dl.acm.org
Developers discuss software architecture and concrete source code implementations on a
regular basis, be it on question-answering sites, online chats, mailing lists or face to face. In …

HyperAST: Enabling efficient analysis of software histories at scale

Q Le Dilavrec, DE Khelladi, A Blouin… - Proceedings of the 37th …, 2022 - dl.acm.org
Abstract Syntax Trees (ASTs) are widely used beyond compilers in many tools that measure
and improve code quality, such as code analysis, bug detection, mining code metrics …

Software provenance tracking at the scale of public source code

G Rousseau, R Di Cosmo, S Zacchiroli - Empirical Software Engineering, 2020 - Springer
We study the possibilities to track provenance of software source code artifacts within the
largest publicly accessible corpus of publicly available source code, the Software Heritage …

Flexeme: Untangling commits using lexical flows

PP Pârțachi, SK Dash, M Allamanis… - Proceedings of the 28th …, 2020 - dl.acm.org
Today, most developers bundle changes into commits that they submit to a shared code
repository. Tangled commits intermix distinct concerns, such as a bug fix and a new feature …

An NLP-based tool for software artifacts analysis

A Di Sorbo, CA Visaggio, M Di Penta… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
Software developers rely on various repositories and communication channels to exchange
relevant information about their ongoing tasks and the status of overall project progress. In …

Training data selection for imbalanced cross-project defect prediction

S Zheng, J Gai, H Yu, H Zou, S Gao - Computers & Electrical Engineering, 2021 - Elsevier
Machine learning methods have been applied in software engineering to effectively
predict software defects. Researchers proposed cross-project defect prediction (CPDP) for …

Ultra-large-scale repository analysis via graph compression

P Boldi, A Pietri, S Vigna… - 2020 IEEE 27th …, 2020 - ieeexplore.ieee.org
We consider the problem of mining the development history—as captured by modern
version control systems—of ultra-large-scale software archives (e.g., tens of millions software …

7 Dimensions of software change patterns

M Janke, P Mäder - Scientific Reports, 2024 - nature.com
Evolving software is a highly complex and creative problem in which a number of different
strategies are used to solve the tasks at hand. These strategies and reoccurring coding …

HyperDiff: Computing source code diffs at scale

Q Le Dilavrec, DE Khelladi, A Blouin… - Proceedings of the 31st …, 2023 - dl.acm.org
With the advent of fast software evolution and multistage releases, temporal code analysis is
becoming useful for various purposes, such as bug cause identification, bug prediction or …

Frankenstein: fast and lightweight call graph generation for software builds

M Keshani, G Gousios, S Proksch - Empirical Software Engineering, 2024 - Springer
Call Graphs are a rich data source and form the foundation for advanced static analyses that
can, for example, detect security vulnerabilities or dead code. This information is invaluable …