Designing Around User Interface ‘Flicker’

A few months ago I rolled out a total overhaul of my Shopping List web app using AngularJS, which was a lot of fun and has in general been a success from a programming/tinkering perspective. It was not, however, a success in user happiness, as we found out the hard way on the first trip to the grocery store using version 2.0.

The problem? As part of the overhaul I had patched a data integrity bug where two delete requests sent in quick succession leave the server unsure which line to delete (that’s an architecture problem for another day). I thought this was a nice change that prevented a rare scenario of accidentally deleting the wrong list item. It turned out, however, that the fix created a workflow where the server’s update message can come back a second or two after the user thinks the UI has finished updating, and subtly shift the list by restoring the entry that the server believed had been deleted in error.
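To make that ambiguity concrete, here is a minimal sketch of one way it can arise. It assumes, purely for illustration, that delete requests address rows by position; the post doesn’t say how the real server identifies rows, and every name here (`Item`, `deleteByIndex`, `deleteById`, the sample data) is hypothetical.

```typescript
// Hypothetical list item; the real app's data model is not described in the post.
interface Item {
  id: number;
  name: string;
}

// Assumed position-based delete: removes whatever currently sits at `index`.
function deleteByIndex(list: Item[], index: number): Item[] {
  return list.filter((_, i) => i !== index);
}

// Alternative keyed delete: removes the item with a stable identifier.
function deleteById(list: Item[], id: number): Item[] {
  return list.filter((item) => item.id !== id);
}

const groceries: Item[] = [
  { id: 1, name: "milk" },
  { id: 2, name: "eggs" },
  { id: 3, name: "bread" },
];

// The user taps the X on "milk" (index 0) and then on "eggs", which is still
// at index 1 in their not-yet-refreshed view. The server applies both in order.
const afterIndexDeletes = deleteByIndex(deleteByIndex(groceries, 0), 1);

// The same two taps, expressed as stable ids instead of positions.
const afterIdDeletes = deleteById(deleteById(groceries, 1), 2);

console.log(afterIndexDeletes.map((item) => item.name)); // "eggs" survives, "bread" is lost
console.log(afterIdDeletes.map((item) => item.name)); // only "bread" remains, as intended
```

In the position-based version the second request lands on a list that has already shifted, so the wrong row disappears. That is the kind of accidental loss the patch was meant to prevent.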

When the red X is clicked, that row will be deleted… usually. Sometimes it will go away, and then pop back unexpectedly. What a ‘nice’ feature!

This meant that just about the time the user goes to tap the next item to mark it done or deleted, the server rudely shoves a new item into the list and jumbles up the tap targets. This behavior and its unfortunate timing make it very, very easy to click on the wrong thing. In practice this is just as bad as the bug I had meant to fix, because it produces the same result: an accidentally lost entry.

Trying to solve this design issue got me thinking about this interesting class of bugs: cases where a user interface updates just before you try to click or tap an element, causing you to hit the wrong one. It comes up somewhat often in web browsing if you’re loading a page with lots of images and trying to click a link towards the bottom of the page. As each image above you loads, the link you’re trying to click can shift. This creates a frustrating race where you have to chase the click target down the page. You can also see this in iOS when you’re looking at a single image in Photos, and you get stuck in a loop between tapping the HUDless screen to bring up the “Back” button and trying to actually tap “Back”. For me personally, my brain->finger pathway seems to have exactly the same timing as the UI delay on hiding/showing the HUD, so I can never quite tap the button I want until I force myself to stop and break the loop.

I don’t really have a great solution to this problem. One technique I’ve seen applied in mission-critical professional software is the use of a ‘shield’ which blocks all user input when the application knows that a server refresh is pending. This seems to work OK, but it’s tricky to get the user experience right because it replaces a rare misclick/mistap scenario with frequent apparent unresponsiveness. As long as the UI both indicates the reason for the lack of response (maybe with a ‘Loading’ icon) and allows the user to somehow read data underneath if necessary, it seems like a viable path. This only stretches so far, though: after a couple of seconds of loading I would expect users to abandon the screen rather than continue to wait. It’s also potentially making the 95% case a worse user experience for the sake of the 5%, which is probably a mistake in most software.
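As a rough sketch of the ‘shield’ idea, the class below counts outstanding server refreshes and drops taps while any are pending. It is framework-agnostic and entirely hypothetical: `InputShield`, `begin`, `track`, `guard` and `isActive` are names made up for this example, not part of AngularJS or of the software mentioned above.

```typescript
// Hypothetical input shield: ignores taps while a server refresh is pending.
class InputShield {
  private pending = 0;

  // True while at least one refresh is still outstanding.
  isActive(): boolean {
    return this.pending > 0;
  }

  // Raise the shield; call the returned function once to lower it again.
  begin(): () => void {
    this.pending++;
    let released = false;
    return () => {
      if (!released) {
        released = true;
        this.pending--;
      }
    };
  }

  // Keep the shield up until the given request settles, whether it succeeds or fails.
  track<T>(request: Promise<T>): Promise<T> {
    const release = this.begin();
    return request.then(
      (value) => {
        release();
        return value;
      },
      (error) => {
        release();
        throw error;
      }
    );
  }

  // Wrap a tap handler so it is skipped while the shield is up.
  // The wrapper returns true if the handler ran, false if the tap was dropped.
  guard<A extends unknown[]>(handler: (...args: A) => void): (...args: A) => boolean {
    return (...args: A) => {
      if (this.isActive()) {
        return false;
      }
      handler(...args);
      return true;
    };
  }
}
```

A view could wrap its delete handler with `guard`, pass each server call through `track`, and show a ‘Loading’ icon whenever `isActive()` returns true, so that a dropped tap at least comes with an explanation.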

For this particular shopping list application, my solution was to consolidate all workflows that cause row insertion/deletion outside the list itself, so that the risk of a race condition between ambiguous server commands is mostly eliminated, and there aren’t destructive buttons shifting up and down anyway. It seems to work OK for now, but even here it feels like we lost functionality to deal with a bug that, while frustrating when it does manifest, only shows up some fraction of the time. Here the timing of the server requests made that fraction high enough for a change to be worthwhile, but I’m not sure that’s often the case in other applications.