fast deterministic screenshot tests for android


Fast Deterministic Screenshot Tests
Arnold Noronha

Why?

How do you test views?
Move out logic
Robolectric, Espresso


Most seasoned developers will tell you that in order to test a view, you move all the logic outside the view and unit test the logic. Excellent advice.

Augmenting that, you can use Robolectric or Espresso to make assertions on the view state, or even assertions on view interaction.

Both of these are very valuable, but...

What about rendering?

But what about rendering? Why do we dismiss the need for testing paddings and margins and colors? Perhaps you'd say, well, the view doesn't change that often so it doesn't break often, but you and I both know deep down that that's a lie.

Consider News Feed

Consider Facebook's News Feed: our stories and views are laid out in hundreds of different configurations depending on the story content. With hundreds of configurations, it is impossible to test the effects of a commit completely, especially considering that we at Facebook like code reusability, and our views and layouts are extensively reused. (For instance, refer to our DroidCon 2014 talk on multi-row.)

But we're really good at reusing code
Views and layouts affect rendering at multiple places in the app

See multi-row

But reusing views has a different set of problems compared to reusing infrastructure classes. Changing an infrastructure class while keeping the API unchanged will keep your tests passing, but changing a view almost definitely means all your dependent views are going to change!

This means tweaking a view can result in unintended regressions in other configurations.

We needed a way to catch these regressions automatically, and for developers to have more confidence in their changes.

How can we catch UI regressions?
What does test-driven development look like for UI?

TDD for rendering?
Fast feedback loop in development
Catch regressions in continuous integration

Fast feedback loop: product developers have to keep tweaking paddings and margins and colors, and between each tweak they have to rebuild the app and navigate to their view. This is much slower than what backend engineers can achieve with TDD, where the tests tell you whether your code is correct.

There is a lot of surface area for regressions, which can't all be covered with pure unit tests; trying to do so can lead to brittle, over-specified tests.

Determinism is hard

A monkeyrunner test runs outside of the process, which means less control over external factors affecting the rendering.

Even if you take the screenshots as part of an instrumentation test, the UI is rendered on a different thread from the test thread, which makes it hard to screenshot things like animations.

We needed determinism for this to be practical.

Our approach
Mimic measure(), layout() and draw()
All on the test thread
Fast and deterministic!

Open Source!

Let's talk code

Alice creates a layout: search_bar.xml

Alice wants to build a new search bar for her app. Alice and her teammates have already built a ton of amazing features and have pulled in many dependencies, and this makes building her app really slow. It's hard to iterate in such an environment. Alice wants to build a UI and then plug it into the app only when it's ready.

Alice writes a screenshot test

    public class SearchBarTest extends InstrumentationTestCase {
      public void testRendering() throws Throwable {
        LayoutInflater inflater = LayoutInflater.from(getInstrumentation().getTargetContext());
        View view = inflater.inflate(R.layout.search_bar, null, false);

        ViewHelpers.setupView(view)
          .setExactWidthDp(300)
          .layout();

        Screenshot.snap(view)
          .record();
      }
    }

Alice wants to know if this renders correctly, so she writes a screenshot test. This is just an instrumentation test which calls the Screenshot library.

Alice runs the test

    $ ./gradlew connectedAndroidTest
    $ pull_screenshots com.foo.bar.tests

I'm not going to show a slide for how to run the tests. It runs like a normal instrumentation test, which basically uses whatever test infrastructure you already have, be it Buck or Gradle. After the test runs, the screenshots are stored on the device, and we provide you a script to pull those screenshots and generate an output. This is what it looks like:

Iterate: Fix search_bar.xml


It works!

Test in multiple configurations!

For instance, different backgrounds, different text typed out
View all renderings with one single run
No harm committing all of these screenshot tests because they're fast

Notice that it's possible to render it in different languages too.

By the way, look at the second screenshot here. When writing this test, I expected EditText to be single line by default, and I wrote the test just to demonstrate this specific edge case. But then it turned out that you need singleLine=true explicitly. I fixed this in the sample code, but wanted to show you how I was able to iterate on these edge cases without having to plug this into a real app.

Tracking regressions

The Record/Verify model

This is a perfectly decent model, and in fact we currently use it for iOS snapshot tests inside Facebook.

This option requires the least tooling, but suffers from some developer efficiency problems. You have to force all your developers to use the exact same emulator configuration. But also:

Record/Verify
Not much tooling required
Used by iOS teams at Facebook

More work for the developer
At Facebook we have 3-4 changes a day
But only ~1 regression a week.
Optimize workflow for the intentional changes!
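The verify half of the Record/Verify model comes down to comparing a freshly rendered screenshot against the recorded one. Because the renderings are deterministic, an exact pixel-by-pixel comparison is enough. The following is only a minimal sketch in plain Java, using java.awt.image.BufferedImage as a stand-in for the stored screenshots; the class ScreenshotDiff and its methods are hypothetical names and are not part of the screenshot library.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not part of the screenshot library. */
final class ScreenshotDiff {
    private ScreenshotDiff() {}

    /**
     * Counts the pixels that differ between the recorded and the current image.
     * Returns -1 when the two images do not have the same dimensions.
     */
    static int countDifferingPixels(BufferedImage recorded, BufferedImage current) {
        if (recorded.getWidth() != current.getWidth()
                || recorded.getHeight() != current.getHeight()) {
            return -1;
        }
        int differing = 0;
        for (int y = 0; y < recorded.getHeight(); y++) {
            for (int x = 0; x < recorded.getWidth(); x++) {
                // Exact ARGB comparison; this is only safe because rendering is deterministic.
                if (recorded.getRGB(x, y) != current.getRGB(x, y)) {
                    differing++;
                }
            }
        }
        return differing;
    }

    /** Verification passes only when the dimensions match and no pixel differs. */
    static boolean matches(BufferedImage recorded, BufferedImage current) {
        return countDifferingPixels(recorded, current) == 0;
    }

    public static void main(String[] args) {
        BufferedImage recorded = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        BufferedImage current = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        current.setRGB(1, 2, 0xFFFF0000); // simulate a one-pixel rendering change
        System.out.println("matches: " + matches(recorded, current));
        System.out.println("differing pixels: " + countDifferingPixels(recorded, current));
    }
}
```

A comparison this strict is also why every developer needs the same emulator configuration: any difference in screen density or fonts changes pixels even when the view itself is unchanged.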

Continuous integration and bisect

This is what we do at Facebook for Android.

At Facebook, we run the tests hourly, and if we detect that any renderings changed, we bisect to find the blame commit and notify the author and subscribers.

This is what we do at Facebook
Run the tests hourly
Bisect changes to commit and notify author
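The bisect step can be pictured as a binary search over the commits that landed between two hourly runs. The sketch below is an illustration only, not Facebook's actual tooling: it assumes the commits are numbered in order, that the rendering at the index good is unchanged and at the index bad is changed, and that once a rendering changes it stays changed. RenderingBisect and the renderingChanged predicate are hypothetical names; in practice the predicate would check out a commit, run the screenshot test, and compare against the last known-good rendering.

```java
import java.util.function.IntPredicate;

/** Hypothetical sketch of bisecting a rendering change to a single commit. */
final class RenderingBisect {
    private RenderingBisect() {}

    /**
     * Returns the index of the first commit whose rendering differs from the baseline.
     * Assumes good < bad, commit `good` is unchanged, commit `bad` is changed, and
     * every commit after the first changed one is also changed.
     */
    static int firstChangedCommit(int good, int bad, IntPredicate renderingChanged) {
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (renderingChanged.test(mid)) {
                bad = mid;   // the change landed at mid or earlier
            } else {
                good = mid;  // the change landed after mid
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Simulated hour of commits 0..63 in which commit 37 changed a rendering.
        int[] testRuns = {0};
        int blamed = firstChangedCommit(0, 63, commit -> {
            testRuns[0]++;
            return commit >= 37;
        });
        System.out.println("blame commit: " + blamed + ", test runs: " + testRuns[0]);
    }
}
```

The number of test runs grows with the logarithm of the number of commits in the window, which is part of why fast tests make hourly runs plus bisection practical.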

Android News Feed
Stories in News Feed can be serialized to JSON
We can dump hundreds of JSONs and get coverage without much effort

Extremely lightweight: talk more about how we don't even require the author to explain a change; they can just close it. Developer efficiency is paramount.

By the way, notice that none of these renderings have images attached to them. Our screenshot tests are deterministic and don't hit the network. While not impossible, handling images in the framework takes more than just dumping a JSON, which is why we haven't done it.

Example: Progress spinners

Example: real regression

Example: subtle regression

Thank you!
http://github.com/facebook/screenshot-test-for-android
screenshot-test-for-android@googlegroups.com

Questions?
