Translating Video Recordings of Complex Mobile App UI Gestures into Replayable Scenarios

Published: 01 April 2023

Abstract

    Screen recordings of mobile applications are easy to obtain and capture a wealth of information pertinent to software developers (e.g., bugs or feature requests), making them a popular mechanism for crowdsourced app feedback. Thus, these videos are becoming a common artifact that developers must manage. In light of unique mobile development constraints, including swift release cycles and rapidly evolving platforms, automated techniques for analyzing all types of rich software artifacts provide benefit to mobile developers. Unfortunately, automatically analyzing screen recordings presents serious challenges, due to their graphical nature, compared to other types of (textual) artifacts. To address these challenges, this paper introduces V2S+, an automated approach for translating video recordings of Android app usages into replayable scenarios. V2S+ is based primarily on computer vision techniques and adapts recent solutions for object detection and image classification to detect and classify user gestures captured in a video, and convert these into a replayable test scenario. Given that V2S+ takes a computer vision-based approach, it is applicable to both hybrid and native Android applications. We performed an extensive evaluation of V2S+ involving 243 videos depicting 4,028 GUI-based actions collected from users exercising features and reproducing bugs from a collection of over 90 popular native and hybrid Android apps.
    Our results illustrate that V2S+ can accurately replay scenarios from screen recordings, and is capable of reproducing ≈90.2% of sequential actions recorded in native application scenarios on physical devices, and ≈83% of sequential actions recorded in hybrid application scenarios on emulators, both with low overhead. A case study with three industrial partners illustrates the potential usefulness of V2S+ from the viewpoint of developers.
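
    The pipeline the abstract describes — detect on-screen touch indicators frame by frame, group the detections into high-level gestures, and render those gestures as replayable device commands — can be illustrated with a minimal sketch. This is a conceptual illustration only, not the authors' implementation: the grouping thresholds (`max_gap`, `long_tap_s`, `move_px`) and the use of `adb shell input` for replay are assumptions, and the per-frame detections are assumed to come from an upstream object detector.

    ```python
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Touch:
        frame: int  # video frame index where a touch indicator was detected
        x: int      # detected touch location (pixels)
        y: int

    def group_gestures(touches: List[Touch], fps: int = 30,
                       max_gap: int = 3, long_tap_s: float = 0.5,
                       move_px: int = 30):
        """Group per-frame touch detections into high-level gestures.

        Consecutive detections at most `max_gap` frames apart form one
        gesture: a swipe if the finger moved more than `move_px`, a long
        tap if it stayed put longer than `long_tap_s`, otherwise a tap.
        """
        gestures, run = [], []
        sentinel: List[Optional[Touch]] = list(touches) + [None]
        for t in sentinel:  # trailing None flushes the final run
            if run and (t is None or t.frame - run[-1].frame > max_gap):
                start, end = run[0], run[-1]
                duration = (end.frame - start.frame) / fps
                dist = abs(end.x - start.x) + abs(end.y - start.y)
                if dist > move_px:
                    gestures.append(("swipe", start, end, duration))
                elif duration >= long_tap_s:
                    gestures.append(("long_tap", start, end, duration))
                else:
                    gestures.append(("tap", start, end, duration))
                run = []
            if t is not None:
                run.append(t)
        return gestures

    def to_adb(gestures):
        """Render gestures as `adb shell input` commands for replay."""
        cmds = []
        for kind, s, e, duration in gestures:
            ms = max(1, int(duration * 1000))
            if kind == "tap":
                cmds.append(f"adb shell input tap {s.x} {s.y}")
            else:
                # `input swipe` with an explicit hold time also
                # emulates long taps (start == end, nonzero duration)
                cmds.append(
                    f"adb shell input swipe {s.x} {s.y} {e.x} {e.y} {ms}")
        return cmds
    ```

    For example, two detections at the same point in adjacent frames yield a tap, while detections that drift several hundred pixels over a few frames yield a swipe; the resulting command strings can then be executed against a device or emulator.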




    Published In

    IEEE Transactions on Software Engineering, Volume 49, Issue 4
    April 2023
    1635 pages

    Publisher

    IEEE Press


    Qualifiers

    • Research-article


    Cited By

    • Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study. Proc. 21st International Conference on Mining Software Repositories, 2024, pp. 571–583. DOI: 10.1145/3643991.3644918
    • MotorEase: Automated Detection of Motor Impairment Accessibility Issues in Mobile App UIs. Proc. IEEE/ACM 46th International Conference on Software Engineering, 2024, pp. 1–13. DOI: 10.1145/3597503.3639167
    • On Using GUI Interaction Data to Improve Text Retrieval-based Bug Localization. Proc. IEEE/ACM 46th International Conference on Software Engineering, 2024, pp. 1–13. DOI: 10.1145/3597503.3608139
    • SmartRecorder: An IMU-based Video Tutorial Creation by Demonstration System for Smartphone Interaction Tasks. Proc. 28th International Conference on Intelligent User Interfaces, 2023, pp. 278–293. DOI: 10.1145/3581641.3584069
    • WebUI: A Dataset for Enhancing Visual UI Understanding with Web Semantics. Proc. 2023 CHI Conference on Human Factors in Computing Systems, 2023, pp. 1–14. DOI: 10.1145/3544548.3581158
    • Toward Interactive Bug Reporting for (Android App) End-Users. Proc. 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2022, pp. 344–356. DOI: 10.1145/3540250.3549131
