This post has been republished via RSS; it originally appeared at: Microsoft Mobile Engineering - Medium.
Understanding & Evaluating Visual Regression Testing

Learn about visual regression testing (VRT) tools such as the Facebook SDK and Paparazzi, and study their compatibility for a migration to Compose.
Since Jetpack Compose was released in 2021, its declarative programming model has offered a modern and intuitive approach to building user interfaces.
Compared to the View system, Compose promotes cohesion and simplifies state management. The framework is lightweight because UI components are built from composable functions that act as separate building blocks. With this paradigm shift, boilerplate code is reduced, code reuse is encouraged, and the development process becomes simpler. Migrating to Compose seems quite appealing.
Will there be any regression if we migrate from one framework to another, though? Because Compose is still relatively new compared to the well-established View-based UI, challenges may come up during the migration period.
What about visual regressions? They can arise from differences between the UI frameworks, their view configuration, and their styling approach.
In the View system, styling and themes are usually defined in XML resource files, which are applied to the views and then rendered. In Compose, they are defined programmatically in Kotlin, and the UI takes advantage of efficient recomposition. These differences may result in changes to padding, margin, etc.
Thus, to keep the transition seamless and prevent visual bugs from going unreported, visual regression testing can be used. It ensures that the final UI stays consistent, which protects the user experience.
Understanding Visual Regression Testing (VRT):
Let us now break down the process into two design approaches: high level and low level.
High-Level Design Approach
It consists of two steps:
1) Record: generate a snapshot and save it as the Golden (the baseline/reference snapshot against which subsequent snapshots are compared).
2) Verify: capture a new snapshot and compare it against the Golden.

Low-Level Design Approach
The capture part of the record step depends on the type of testing: unit or instrumentation. The comparison part of the verify step, on the other hand, works the same way regardless of testing type.
The regression criterion is essentially the percentage representing the maximum allowable difference of pixels between the Golden and the current captured snapshot.
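To make the regression criterion concrete, here is a minimal sketch of the verify step on the JVM. It is not taken from either tool discussed below; the function names percentDifference and verify and the parameter maxPercentDifference are illustrative assumptions, and the comparison is a simple exact match per pixel.

```kotlin
import java.awt.image.BufferedImage

/**
 * Returns the percentage (0.0..100.0) of pixels that differ between two images.
 * Assumption for this sketch: images with different dimensions count as 100% different.
 */
fun percentDifference(golden: BufferedImage, actual: BufferedImage): Double {
    if (golden.width != actual.width || golden.height != actual.height) return 100.0
    var differing = 0L
    for (y in 0 until golden.height) {
        for (x in 0 until golden.width) {
            // Compare the packed ARGB value of each pixel.
            if (golden.getRGB(x, y) != actual.getRGB(x, y)) differing++
        }
    }
    val total = golden.width.toLong() * golden.height
    return if (total == 0L) 0.0 else differing * 100.0 / total
}

/** Verify step: passes when the difference stays within the allowed threshold. */
fun verify(golden: BufferedImage, actual: BufferedImage, maxPercentDifference: Double): Boolean =
    percentDifference(golden, actual) <= maxPercentDifference
```

With a 10 x 10 Golden and a single changed pixel, percentDifference reports 1 percent, so the check passes with a threshold of 1.0 and fails with a threshold of 0.1.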

Evaluating VRT Tools:
It’s time to assess the tools at our disposal now that we have a better understanding of the fundamental workings of a VRT.
The parameters we want to look out for are:
- UI framework: Does it support the View system, Compose, or both?
- Testing type: Is it based on unit or instrumentation testing?
- Mechanism: How does it capture the screenshots?
- File format: How are the screenshots saved?
- Granularity: Can it capture only the activity screen, or individual views as well?
- Speed: How fast can it execute the test cases?
- Requirements: What setup is needed before the tool can be used?
- Limitations: What are its constraints?
The main things we want to find out are whether the tool supports both UI frameworks, whether it has major limitations, and what scope it has for future enhancement.
Let’s now check out the tools and give them a shot :)
1. Facebook SDK (Screenshot-tests-for-android)
As an instrumentation testing tool, it requires an emulator or physical device to capture screenshots. It does so by inflating the component under test on the test thread to allow direct control of the view. (Note: It does not inflate the whole activity).
It provides two snapshot-capturing calls:
1) Screenshot.snapActivity(activity).record() → to capture activity
2) Screenshot.snap(view).record() → to capture view or viewGroup
How does this tool capture snapshots?
A bitmap has to be created with dimensions large enough to hold the component’s content. To do that, the SDK splits the component into tiles of 512 pixels (the default).
This tiling is used to capture large screens efficiently. For each tile, a bitmap and a canvas are set up; the tile is drawn onto the canvas, rendered, and then captured. The bitmap of each tile and its coordinates are stored in an album, which makes it possible to reassemble the full screenshot later for the report. The screenshots are saved in PNG format.
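The tiling idea can be illustrated with a short sketch. This is not the SDK's actual code: the Tile class and the computeTiles function are hypothetical names, and the handling of the smaller tiles on the right and bottom edges is an assumption made for the example.

```kotlin
/** One tile of the screenshot: its position and size in pixels. */
data class Tile(val left: Int, val top: Int, val width: Int, val height: Int)

/**
 * Splits a width x height area into tiles of at most tileSize pixels per side,
 * row by row. Tiles on the right and bottom edges may be smaller.
 */
fun computeTiles(width: Int, height: Int, tileSize: Int = 512): List<Tile> {
    require(width > 0 && height > 0 && tileSize > 0) { "dimensions must be positive" }
    val tiles = mutableListOf<Tile>()
    for (top in 0 until height step tileSize) {
        for (left in 0 until width step tileSize) {
            tiles += Tile(
                left = left,
                top = top,
                width = minOf(tileSize, width - left),
                height = minOf(tileSize, height - top)
            )
        }
    }
    return tiles
}
```

For a 1080 x 1920 screen and the default tile size, this yields 3 columns and 4 rows, so 12 tiles. Storing the coordinates next to each tile is what allows the full image to be stitched back together.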
Does this SDK support both UI frameworks?
The Facebook SDK fully supports the View-based system, as granularity is preserved (it can capture anything from the whole activity down to individual components). However, when tested on a Compose application, it was only able to capture the whole activity, not individual components. Hence, it only partially supports Jetpack Compose.

Why is it able to capture the whole activity of a Compose application but not individual units? This may be because the tool captures the entire screen content, which includes the Compose content within the activity’s view hierarchy. The library, however, does not seem to have the support needed to capture the visual representation of composable functions.
Possible workaround? Since the Facebook SDK is essentially meant for View applications, one workaround was to wrap the composable functions in a ComposeView (the interoperability API for hosting Compose inside a View). However, this workaround hit a dead end, since we need a window recomposer and other dependencies to use the Compose framework.
But that is not all. Since the Facebook SDK needs to access the emulator’s SD card to store the screenshots, it needs the WRITE_EXTERNAL_STORAGE permission. However, this permission no longer grants additional access for apps targeting Android 11 (API level 30) or higher. Hence, the tool does not work with the latest SDK. Oh no, a limitation!
2. CashApp’s Paparazzi
Paparazzi is a unit testing tool that runs entirely on the JVM, which means no emulator or physical device setup is required. It is worth noting that it only works at the module level and not at the application level. (Note: Test suites run on JDK 11+)
How does it capture snapshots without any emulator/physical device?
The tool relies on layoutlib, the library in the Android SDK that provides the rendering engine for displaying Android layouts. It lets developers preview and interact with UI layouts during development without having to deploy to an emulator.
Paparazzi requests a render session from layoutlib, which provides a buffered image. The screenshot is captured from this buffered image. The regression criterion set for this tool is 0.1.
It provides one screenshot-capturing function that can capture anything from a whole activity to an individual UI component.
Buffered Image == Bitmap? They are similar in concept and both represent images, but each is specific to its programming environment. BufferedImage is a class in Java’s AWT library, and Bitmap is a class in the Android framework.
Does Paparazzi support both UI frameworks?
Paparazzi supports both the View-based system and Compose.
1) paparazzi.snapshot(view) → to capture views
2) paparazzi.snapshot{ composableFunction() } → to capture compose components
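As a rough illustration of the Compose call above, a Paparazzi test might look like the sketch below. It assumes the Paparazzi Gradle plugin and the Compose Material dependency are set up in a library module; Greeting is a hypothetical composable standing in for a real component, and the choice of DeviceConfig.PIXEL_5 is arbitrary.

```kotlin
import androidx.compose.material.Text
import androidx.compose.runtime.Composable
import app.cash.paparazzi.DeviceConfig
import app.cash.paparazzi.Paparazzi
import org.junit.Rule
import org.junit.Test

// Hypothetical composable used only for this example.
@Composable
fun Greeting(name: String) {
    Text(text = "Hello, $name")
}

class GreetingSnapshotTest {
    // The rule renders through layoutlib on the JVM; no emulator or device is involved.
    @get:Rule
    val paparazzi = Paparazzi(
        deviceConfig = DeviceConfig.PIXEL_5,
        maxPercentDifference = 0.1 // regression criterion mentioned above
    )

    @Test
    fun greeting() {
        // Captures the composable; whether it records or verifies depends on the Gradle task.
        paparazzi.snapshot { Greeting("Compose") }
    }
}
```

With the plugin applied, the Golden is typically recorded with the recordPaparazziDebug Gradle task and later checked with verifyPaparazziDebug.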


Both the View and the Compose application were built with a similar structure: a top bar and an expandable list. Six test cases were written to capture the entire activity, the top bar, the list (closed), the list (3rd item open), a single item (closed), and a single item (open).
Something does not seem right… Granularity seems to be preserved; however, the sizing of the individual components is not exact. Thankfully, there is a workaround for it: Paparazzi’s sizing issue solution.
Because Paparazzi relies on layoutlib, it also depends on the Android Studio build system. This tight coupling may cause friction for developers, who face a chain of dependency updates to stay compatible with the Gradle plugin, Android SDK version, etc. Whenever Android Studio updates its system, layoutlib may be affected, which in turn forces changes in Paparazzi. To use the updated testing tool, the project needs to keep its dependencies up to date and compatible. Seems like additional work.
What about the speed of both the tools?

Because Paparazzi runs on the JVM, it executes faster than the Facebook SDK. Running tests on the JVM allows you to isolate dependencies and execute the test cases in a controlled environment. No additional setup is needed, which reduces the overhead and potential latency associated with running tests in a runtime environment.
Is it right to compare the speeds of tools that are of two different testing types? Well, the answer lies in what our learning objective is. We are here to find out and evaluate the tools against the parameters.
Final Thoughts
Now that the tools have been tested out against an XML application and Compose application, what is our takeaway?
The Facebook SDK’s scope for future compatibility seems low, due to its lack of full Compose support and its low maintenance, which shows up as permission issues, etc.
Paparazzi, on the other hand, shows great promise in terms of both support for the two UI frameworks and speed. It is good to note that it comes with its own set of limitations, such as its high dependence on the build system. Still, the fact that it runs on the JVM, with no runtime environment and therefore faster execution, is a great pull, and it looks like it can aid us during the migration.