Sloppy Screencasts: Test-Oriented UI Development with BackstopJS / Larva Show-n-Tell

I think I figured out a really useful workflow for test-oriented UI development using BackstopJS. Read more for a description of the concepts, and there is a good hour of sloppy screencast demos in a playlist you are welcome to watch!

The screencasts are towards the bottom of this post, or you can skip to watching them here.

Why Test Driven Development

Here is the benefit of test-driven development in my own words (which is a synthesis of others’ words, specifically Zero Bugs and Program Faster):

Any code you write will have bugs. Writing good tests early in the development process forces bugs to surface now, rather than after you have written a bunch of code on top of the buggy code and the whole thing has become (exponentially?) more complex and harder to debug.

As far as I can tell, this is a Very Good Thing, but…it takes a lot more time up front, and not everyone is able to justify that time 1) to themselves in the moment or 2) to the rest of the team in other moments. In my not-that-extensive experience with test-driven development, however, it is soooooo worth it, both for you or your team now (even though it is hard and requires a lot of discipline) and, perhaps more so, for future teams and developers, whether those folks are you and your current team or others.

I’ve been thinking a lot about how to do this with UI development – specifically web UI development with HTML and CSS, of course. I have a bit in CSS Algorithms v3.5.1 about unit testing CSS – watch it here – and I wrote two posts about ideas for a unit test suite for CSS algorithms (there should be a third post, maybe I will write it eventually…but not now). That is still potentially a good idea, but I haven’t had time to do more experiments yet to fully recommend an approach.

Surprise success with BackstopJS

First: What is screenshot / visual regression testing?

This is a form of UI testing where a “reference” screenshot of the rendered interface is compared to a screenshot of a “test” version of the same UI, likely taken from a different branch or behind a feature flag to test a new feature. A tool like BackstopJS navigates to different screens in a headless browser, takes the screenshots, and provides a UI highlighting the pixel differences between the “reference” and “test” screenshots.
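To make the "pixel differences" idea concrete, here is a minimal sketch of comparing two screenshots. It is a simplification for illustration, not how BackstopJS itself does the comparison; the function name `mismatchRatio`, the `tolerance` parameter, and the tiny hand-made buffers are all my own assumptions.

```typescript
// Fraction (0..1) of pixels that differ between two same-sized RGBA buffers.
// Hypothetical helper for illustration only.
function mismatchRatio(
  reference: Uint8Array,
  test: Uint8Array,
  tolerance: number = 0
): number {
  if (reference.length !== test.length || reference.length % 4 !== 0) {
    throw new Error("Screenshots must be RGBA buffers of identical size");
  }
  const pixelCount = reference.length / 4;
  if (pixelCount === 0) {
    return 0;
  }
  let differing = 0;
  for (let i = 0; i < reference.length; i += 4) {
    // A pixel counts as different if any channel is off by more than the tolerance.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(reference[i + c] - test[i + c]) > tolerance) {
        differing++;
        break;
      }
    }
  }
  return differing / pixelCount;
}

// Two 2x1 "screenshots": the second pixel changes from white to red.
const referenceShot = new Uint8Array([0, 0, 0, 255, 255, 255, 255, 255]);
const testShot = new Uint8Array([0, 0, 0, 255, 255, 0, 0, 255]);
const ratio = mismatchRatio(referenceShot, testShot);
console.log((ratio * 100).toFixed(1) + "% of pixels differ");
```

A real tool works on full-size images and usually lets you set a threshold below which small differences (anti-aliasing, for example) are ignored, which is roughly the role `tolerance` plays here.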

It can be very finicky and challenging to set up, depending on the application…especially if you are loading ads.

At PMC, I got BackstopJS in place to test URLs on our pattern library (no ads, whew!). My intention was to use this for regression testing; basically run the tests on specific modules before opening a PR to make sure nothing unexpected broke during development.

What I didn’t expect: this setup has turned out to be really useful during development, in a workflow I might dub “Test-Oriented UI Development”. It’s not exactly test-driven development since we essentially build the test along with the actual UI (vs. starting with a test as in TDD), but I think this has lots of potential, and as the sloppy screencasts show, I absolutely experienced the benefits of catching bugs early!

The Backstop JS UI showing the “test” screenshot, “reference” screenshot and the pixel difference between the two.

The Workflow

The trick is to run the test on one, small module, incrementally as you develop the UI.

Here is an example of a regression I caught (outside of the video recordings). I had been adding a search icon and had shallow-cloned the data object for the o-button-icon pattern, which resulted in the menu icon also changing to a search icon:

The “test”, or current version of the screenshot shows a regression, where the menu icon turned into a search icon.
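For anyone who has not been bitten by this before, here is a small sketch of how a shallow clone can cause that kind of regression. The `ButtonIconData` shape and the values are hypothetical stand-ins, not the real o-button-icon data; the point is only that a shallow copy still shares nested objects with the original.

```typescript
// Hypothetical stand-in for a pattern's data object.
interface ButtonIconData {
  label: string;
  icon: { name: string };
}

const menuButton: ButtonIconData = { label: "Menu", icon: { name: "menu" } };

// Shallow clone: top-level properties are copied, but `icon` is the same object.
const shallowSearch: ButtonIconData = { ...menuButton, label: "Search" };
shallowSearch.icon.name = "search";
const menuIconAfterShallow = menuButton.icon.name; // the menu button changed too

// Reset, then deep clone so the nested `icon` object is copied as well.
// JSON round-tripping is fine for plain data like this.
menuButton.icon.name = "menu";
const deepSearch: ButtonIconData = JSON.parse(JSON.stringify(menuButton));
deepSearch.label = "Search";
deepSearch.icon.name = "search";
const menuIconAfterDeep = menuButton.icon.name; // the menu button is untouched

console.log(menuIconAfterShallow, menuIconAfterDeep);
```

Nothing in the code for the new button looks wrong on its own, which is exactly why a screenshot of the neighboring module is such a handy way to catch it.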

The Backstop UI can show multiple viewport sizes at a single glance, which saves the cognitive load of remembering to resize the browser every time you make a change.
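As a rough sketch of what "one small module, several viewports" can look like in a BackstopJS config, here is a function that builds one. The property names follow the BackstopJS config format as I understand it, but the viewport sizes, the paths, the `configForModule` helper, and the localhost URL are all assumptions for illustration, not our actual Larva setup.

```typescript
interface Viewport {
  label: string;
  width: number;
  height: number;
}

interface Scenario {
  label: string;
  url: string;
  selectors: string[];
  misMatchThreshold: number;
}

interface BackstopConfig {
  id: string;
  viewports: Viewport[];
  scenarios: Scenario[];
  paths: { bitmaps_reference: string; bitmaps_test: string; html_report: string };
  engine: string;
  report: string[];
}

// Hypothetical helper: one scenario for one module, captured at three widths.
function configForModule(moduleName: string, baseUrl: string): BackstopConfig {
  return {
    id: moduleName,
    viewports: [
      { label: "mobile", width: 375, height: 667 },
      { label: "tablet", width: 768, height: 1024 },
      { label: "desktop", width: 1280, height: 800 },
    ],
    scenarios: [
      {
        label: moduleName,
        url: baseUrl + "/" + moduleName, // assumed URL scheme for the pattern library
        selectors: ["document"],
        misMatchThreshold: 0.1,
      },
    ],
    paths: {
      bitmaps_reference: "backstop_data/" + moduleName + "/reference",
      bitmaps_test: "backstop_data/" + moduleName + "/test",
      html_report: "backstop_data/" + moduleName + "/report",
    },
    engine: "puppeteer",
    report: ["browser"],
  };
}

const buttonConfig = configForModule("o-button-icon", "http://localhost:3000/patterns");
console.log(JSON.stringify(buttonConfig, null, 2));
```

Written out as JSON, something like this could be handed to `backstop test --config=<file>`. Keeping it to a single scenario is what keeps the run short enough to repeat after every small change.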

Once an update to the feature is correct, you can save the state of the test and transform it into the reference screenshot (or approve it, in Backstop terms). Then you write a bit more code, and run the test to compare the new version against the approved version, keeping an eye on the different viewports along the way.
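Here is a toy model of that test-then-approve loop, in case it helps to see the cycle spelled out. It is not BackstopJS code: the `ScreenshotSuite` class is hypothetical, screenshots are plain strings standing in for images, and a missing reference is simply treated as a failure. The two methods loosely mirror what `backstop test` and `backstop approve` do.

```typescript
type Screenshot = string; // stand-in for real image data
type Result = "pass" | "fail";

// Hypothetical, in-memory model of the reference/test cycle.
class ScreenshotSuite {
  private reference: Record<string, Screenshot> = {};
  private latestTest: Record<string, Screenshot> = {};

  // Like `backstop test`: capture the current UI and compare it to the reference.
  test(scenario: string, rendered: Screenshot): Result {
    this.latestTest[scenario] = rendered;
    return this.reference[scenario] === rendered ? "pass" : "fail";
  }

  // Like `backstop approve`: promote the latest test screenshots to references.
  approve(): void {
    Object.keys(this.latestTest).forEach((scenario) => {
      this.reference[scenario] = this.latestTest[scenario];
    });
  }
}

const suite = new ScreenshotSuite();
const firstRun = suite.test("o-button", "button v1"); // no reference yet
suite.approve(); // v1 looks right, so it becomes the reference
const secondRun = suite.test("o-button", "button v1"); // nothing changed
const thirdRun = suite.test("o-button", "button v2 with icon"); // new code, review the diff
suite.approve(); // the change was intended
const fourthRun = suite.test("o-button", "button v2 with icon");

console.log([firstRun, secondRun, thirdRun, fourthRun].join(", "));
```

The useful part is the third run: a "fail" is not necessarily a bug, it is a prompt to look at the diff and decide whether the change is what you meant to do.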

Eventually, I think these tests can be stored in the repo, and perhaps the presence of the tests will be a key part of the completion of a feature, similar to how we treat PHPUnit tests at PMC.

And…maybe, someday, our designer can provide the comparison PNG screenshot and this workflow can actually be test-driven development. Wouldn’t that be cool? This concept originated in my brain via this talk by Jakob Mattson from back in 2014.

The Sloppy Screencasts!!

I was very excited about this yesterday and decided to record some sloppy screencasts before I thought too much about it and it either turned into a bigger project or into a “won’t do”.

So, here are said sloppy (and probably snarky) screencasts that walk through the beginnings of this “test-oriented UI development” workflow with our design system at PMC, called Larva.

Important Notes

Note that this is from the trenches. It’s a real project, it was the end of the day on Friday, and I make lots of mistakes. I like the idea of sharing this kind of content though – typos and similar phenomena really are the great equalizer when it comes to programming.

An outline of the contents:

Part 1 is 16 minutes of me being a little bit nervous and building out a button, showing the basic testing workflow of Backstop JS, and debugging CSS. Unfortunately, you can’t really see the button as I build it in much of the video because Zoom placed the square with my face in it on top of the button :(. But still, it gets the point across and you can see an idea of what it’s like to develop UI with Larva (in its current, alpha version).
Part 2 is 37 minutes of me being less nervous, and coming back after a refactor of the aforementioned button. I provide lots of commentary about the current state of Larva / the workflow, and where things can be improved. I also had to pause the video twice to debug things that weren’t working.
Part 3 is another 16 minutes or so of showing off our cool tools for integrating patterns into the WordPress theme, namely the Twig to PHP parser and the write-to-JSON command. I’m very proud of how this part of the process has come together.

Another ironic fact, given my CSS evangelizing – I think there is literally zero CSS written in these videos. I consider that a major mark of success. I started writing about why that is a mark of success, and that turned into a long and informative post that will be published…soon, maybe, I hope? Or it might morph into actual documentation. TBD!

I’m curious…

Do other people use screenshot testing like this? Or do you use it for regression testing, not really as a development tool? Do your tests run this fast? Faster? What’s it like? Does what I am writing about make sense? Does it sound useful?

Feel free to respond to the thread from this post on Twitter, or here in the comments! Though I might close the comments soon because I get around a 25:1 ratio of spam to actual comments.