Here's my personal submission for "UI problem that has existed for years on touch interfaces, plus a possible solution, but at this point I'm just shouting into the void":
In short, an interface should not be interactable until a few milliseconds after it has finished (re)rendering, or especially, while it is still in the midst of reflowing or repopulating itself in realtime, or still sliding into view, etc.
Most frustratingly, this happens when I accidentally fat-finger a notification that literally just slid down from the top as I went to tap a UI element in that vicinity, which then causes me to also lose the notification (since iOS doesn't have a "recently dismissed notifications" UI).
I've had this a few times, particularly on mobile, where you're doing something and some pop-up will steal focus, but of course you were tapping or swiping or something the exact instant it popped; it stayed just long enough for the after-image on your retinas to catch a single word, and you realise it might have been important, but it's gone now, with no sign.
This happened to me just the other day; I was purchasing something online with a slightly complicated process, from my mobile, I didn't want to f* up the process, and I was tapping occasionally to keep the screen awake while it was doing "stuff". Needless to say, something popped up, too fast for me to react; I have no idea which button I tapped, if any, or if I just dismissed it. To this day I have no idea what it wanted, but I know it was related to the payment process.
I've seen this solved in dialogs/modals with a delay on the dismiss button, but rarely; it would also make sense to delay a modal/dialog of some kind by a couple hundred milliseconds to give you time to react, particularly if tapping outside of it would dismiss it.
I find myself using Notification History on Android more and more often, but a lot of the time it's not even notifications, it's some in-app thing that's totally within the developer's control.
You can use this fact to see deleted messages on WhatsApp. Just enable the notification history, and now whenever someone sends you something regrettable and deletes it, you can still see it in the notification history. The reason you need the notification history is that WhatsApp actually live-wipes the notifications for deleted messages.
I believe it's an INTENTIONAL BEHAVIOR in Facebook, particularly the mobile web interface. Want to show someone a video, but you're on the can, and really, it's going to be a minute while you wash up (or a similar example lasting 40 seconds)?
You're not going to be able to do it. They're not on Facebook, you can't just link to the video; you're going to hold the phone carefully, but the bare fraction of their palm will register with the screen, or the page will refresh, or the screen (now 27 feet deep in the doomscroll) will scroll all the way back to the top of the page.
The one I don't quite know how to solve is when I'm tapping a device to connect to -- whether a WiFi router or an AirPlay speaker or whatever -- and I swear to god, half the time my intended device slides out from under me as a newly discovered device enters above and pushes it down. Or sometimes devices disappear and pull it up. Maybe it's because I live in an apartment building with lots of devices.
I've seen this solved in prototypes by always adding new devices at the bottom, and graying out when one disappears, with a floating "resort" button so you can find what you're looking for alphabetically. But it's so clunky -- nobody wants a resort button. And you can't use a UX delay on every update or you'd never be able to tap anything at all for the first five seconds.
Maybe ensure there's always 3 seconds of no changes, then gray out everything for 0.5 seconds while sliding in all new devices discovered in the past 3.5 seconds, then re-enable? I've never seen anything like that attempted.
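A rough sketch of that batching idea, in Python (class and method names are made up for illustration, and the timing values are just the ones floated above):

```python
class DeviceListBatcher:
    """Batches device-discovery events: the visible list only changes
    after a quiet period, and pending devices are appended in one go."""

    def __init__(self, quiet_seconds=3.0):
        self.quiet = quiet_seconds
        self.visible = []   # what the user currently sees and can tap
        self.pending = []   # discovered, but not yet shown
        self.last_event = None

    def device_found(self, name, now):
        # Don't touch the visible list; just queue it and restart the clock.
        self.pending.append(name)
        self.last_event = now

    def tick(self, now):
        """Call periodically. Returns True when the list just updated --
        the UI would gray out briefly while sliding the new items in."""
        if self.pending and now - self.last_event >= self.quiet:
            self.visible.extend(self.pending)
            self.pending.clear()
            return True
        return False
```

The 0.5-second gray-out would simply be the UI's reaction to `tick()` returning True, before re-enabling taps.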
To me the BIGGEST annoyance is the iOS “End call” button.
Just as I’m about to tap it, the other person ends the call and what I’m actually tapping is some other person on my call list that it then immediately calls.
Even if I end the call quickly they often call back confused “You called, what did you want?”
Apple: PLEASE add a delay to touch input after the call screen closes.
Confirmation before calling would be nice. I've accidentally made calls when trying to get more information about a missed call. (I've also had Siri pocket-dial, but I've got that disabled now.)
That too! 99% of the time I tap a missed call, it's because I want to see more info about the call or the contact or if there was a voice mail. My phone should never make a call unless I explicitly tap a big ol' "CALL" button. It should never be the default action for tapping on a contact or a missed call.
The solution needs to be global. Literally, if any part of the screen just changed (except for watching videos, which would make them impossible to interact with), add a small interaction delay where taps are no-op'd.
Video shouldn't count as the screen changing; it should just be an area of "media". Its appearance or disappearance could be counted as a change, but not the contents of the area itself. That could be a global rule to save a lot of exception-making.
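The global rule could be sketched as a tiny gate in front of the input system (names and the delay value are invented for illustration):

```python
import time

class TapGate:
    """Drops taps that arrive too soon after a layout change.

    Content changes inside a region marked as 'media' (e.g. a playing
    video) do not reset the timer; only that region's appearance or
    disappearance would, per the rule above."""

    def __init__(self, delay_ms=200):
        self.delay = delay_ms / 1000.0
        self.last_change = float("-inf")  # no change seen yet

    def notify_layout_change(self, is_media_content=False):
        # Restart the quiet period for any non-media change.
        if not is_media_content:
            self.last_change = time.monotonic()

    def should_accept_tap(self):
        # Taps inside the quiet window are no-op'd.
        return time.monotonic() - self.last_change >= self.delay
```

In a real system the OS compositor, which already knows which pixels changed, would be the natural place to feed `notify_layout_change`.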
It's gotten better, but: using navigation, it tells you which lane you need to----nope, calendar appointment.
And the notification doesn't disappear on its own, so stressed navigation also includes a ham-handed reach and swipe up to make the appointment disappear. Hope it wasn't important.
The screen is MASSIVE folks. SO MANY PIXELS. keep the GPS AND the calendar appointment.
I also wish you could blacklist/permanently hide individual devices so that I could prune the list of 400 smart TVs, Bluetooth speakers, electric toothbrushes, other people's phones, smart fridges, etc that come up every time I try to link to my earbuds in my apartment.
I have another submission for most annoying UI problem: Trying to read a thing, but it's on medium.com. The sheer amount of popups and overlays I need to click away before I can actually read your thing, geez.
This is one reason why I prefer keyboard-based UIs that do not automatically change anything that affects the UI while the UI is active (and being able to select things by entering a number or a name also helps, rather than having to use the arrows). (I also dislike touch-screen for many other reasons.)
Sometimes you can disable JavaScript and that works (for that document on Medium, this works for me). Sometimes that doesn't work, but if you disable both JavaScript and CSS, then it does. Sometimes that does not work either.
I had two popups visiting that medium link with uBlock Origin. Perhaps it is possible to get zero with UBO, but default settings don't seem to be enough.
OP might also be using a list like the "Anti-adblock Filter". You might also want to explore the settings of uBO, as there are more lists titled "Annoyances" that you can enable to shut out further garbage.
Some sites seem to be able to disable reader mode. I'm not sure why browsers allow this, but it happens often enough that reader mode just isn't an option on some sites. In Firefox you used to be able to do something like:
Oh man. This makes me want to throw my phone against the nearest brick wall sometimes. The UI is loading, I reach for the button I want to hit, but it moves and a different button takes its place because the app/page is still loading, or worse, an ad has taken the place of the button.
This also happens when the hotbar sometimes has three buttons and sometimes four, and the worst apps are the ones where buttons switch ordinal positions depending on whether there are three or four buttons in there.
It feels very strange to get so agitated by these small behaviors, but here we are.
> or worse, an ad has taken the place of the button.
This has happened to me, and I even clicked on the ad. It actually made me smile a little bit and reminded me of the "clever girl" scene in Jurassic Park.
If a user is interacting, DO NOT UPDATE the list / models / etc.
If an update is required, rather than just desired, freeze all input so the user knows it's about to update; this might be accompanied by a quick 'fade' or other color shift to indicate an update is about to be pushed and that they should release and re-plan their actions.
It's not just a touch issue. Desktop environments have toast notifications and dialogs that can pop up unexpectedly (neither of which are remotely new problems). You can be trying to click something at the corner of your screen and have it intercepted by a notification or you can be pressing enter and have it activate the default action on a dialog that just popped up. Especially in the dialog case you often just have to hope that it wasn't actually something you needed to see or select a different option on.
Or apps like Skype, which pop up a dialog while you are typing, and when you were in the middle of pressing space as that happens, you auto-answer a call you didn't know existed a second earlier.
> In short, an interface should not be interactable until a few milliseconds after it has finished (re)rendering
I was a console game developer working on UI for many years, so I am deeply familiar with the problem of when a UI should be responsive to input while the visuals are changing and when it should not.
You might be surprised, but it turns out that blocking input for a while until the UI settles down is not what you want.
Yes, in cases where the UI is transitioning to an unfamiliar state, the input has a good chance to be useless or incorrect and would be better dropped on the floor. It's annoying when you think you're going to click X but the UI changes to stick Y under your finger instead.
However, there are times where you're tapping on a familiar app whose flow you know intimately and you know exactly where Y is about to appear and you want to tap on it as fast as you can. In those cases, it is absolutely infuriating if the app simply ignores your input and forces you to tap again.
I've watched users use software that temporarily disables input like this, and what you see is that they either get trained to learn the input delay and time their tap as tightly as possible, or they just get annoyed and hammer inputs until one gets processed.
And, in practice, it turns out these times where a user is interacting with a familiar UI are 100x more common than the times they misclick on an unfamiliar UI. So while the misclick case is super annoying, it's a better experience in aggregate if the app is as responsive as it can be, as quickly as it can be.
Perhaps there is a third way where an app makes a distinction between flows to static content versus dynamically generated content and only puts an input block in for the latter, but I expect the line between "static" and "dynamic" is too fuzzy. People certainly learn to rely on familiar auto-complete suggestions.
UI is hard. A box of silicon to a great ape is not an easy connection to engineer.
These are great points. But I would debate the 100x point a little. And I think there are some cases where ignoring fast taps is clearly preferable.
I’m specifically thinking about phone notifications that slide in from the top – ie, from an app other than the one you’re using.
So we have two options: ignore taps on these notification banners for ~200ms after the slide-down (risking a ‘failed tap’) or don’t (risking a ‘mis-tap’).
I’d argue these are in different leagues of annoyingness, at least for notification banners, so their relative frequency difference is somewhat beside the point. A ‘failed tap’ is an annoying moment of friction - you have to wait and tap it again, which is jarring. Whereas a ‘mis-tap’ can sometimes force you to drop what you were doing and switch contexts - eg because you have now cleared the notification which would have served as a to-do, or because you’ve now marked someone’s message as read and risk appearing rude if you don’t reply immediately. Or sometimes even worse things than that.
So I would argue that even if it's 100x less common, a mis-tap can be a 1000x worse experience. (Take these numbers with a pinch of salt, obviously.)
Also, I’d argue a ‘failed tap’ in a power user workflow is not actually something that gets repeated that many times, as in those situations the user gets to learn (after a few jarring halts) to wait a beat before tapping.
All that said, this is all just theory, and if Apple actually implemented this for iOS notifications then it’s always possible I might change my view after trying it! In practice, I have added these post-rendering interactivity periods to UI elements myself a few times, and have found it always needs to be finely tuned to each case. UI is hard, as you say.
> So we have two options: ignore taps on these notification banners for ~200ms after the slide-down (risking a ‘failed tap’) or don’t (risking a ‘mis-tap’).
Yeah, notifications are an interesting corner case where by their nature you can probably assume a user isn't anticipating one and it might be worth ignoring input for a bit.
> Also, I’d argue a ‘failed tap’ in a power user workflow is not actually something that gets repeated that many times, as in those situations the user gets to learn (after a few jarring halts) to wait a beat before tapping.
You'd be surprised. Some users (and most software types are definitely in this camp) will learn the input delay and wait so they optimize their effort and minimize the number of taps.
But there are many other people on this planet who will just whale on the device until it does what they want. These are the same people who push every elevator and street crossing button twenty times.
I don't view notifications as a corner case. I think two factors are key:
1. Can the user predict the UI change? This is close to the static vs dynamic idea, but what matters isn't whether the UI changes: if the user can learn to predict how the UI changes, processing the tap makes more sense. This allows (power) users to be fast. You usually don't know that a notification is about to be displayed, so this doesn't apply.
2. Is the action reversible? If a checkbox appears, undoing the misclick is trivial. Dismissing a potentially important notification with no history, deleting a file etc. should maybe block interactions for a moment to force the user to reconsider.
Often even better is to offer undo (if possible). It lets you fast-track the happy path while still being able to recover from errors.
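The undo idea amounts to a small "recently dismissed" buffer: process the tap immediately, but keep the dismissed thing recoverable for a grace period. A toy Python sketch (all names and the grace period are invented):

```python
class UndoableDismiss:
    """Dismisses immediately, but keeps items recoverable for a
    grace period instead of blocking or delaying the tap."""

    def __init__(self, grace_seconds=5.0):
        self.grace = grace_seconds
        self.recently_dismissed = []  # list of (item, dismissed_at)

    def dismiss(self, item, now):
        # The item vanishes from the UI right away; no input delay.
        self.recently_dismissed.append((item, now))

    def undo_last(self, now):
        """Restore the most recent dismissal if still within grace."""
        if self.recently_dismissed:
            item, t = self.recently_dismissed[-1]
            if now - t <= self.grace:
                self.recently_dismissed.pop()
                return item
        return None
```

This is essentially the "snackbar with an Undo button" pattern, and it would also cover the dismissed-notification complaints upthread.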
For that reason, it's wonderful when games provide a log of actions and/or recent dialogue, so you can easily see what you missed. That kind of functionality seems less common outside games.
I am 99% sure the NY Times Games app on Android is blocking input until fully rendered on its 'home' screen where all the games are listed, and it drives me nuts. I tap on the element I want and nothing happens, so I have to tap again. Maybe some kind of overlay or spinner to signal that it's not accepting input would help? Arg.
I wonder if a good distinction is user-initiated actions versus system-initiated. If the user begins the action, the changes are immediate and input is buffered for the interface that appears next.
But when the system initiates it (eg. notifications, popups), then the prior interface remains active.
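That routing rule could be sketched like this (a toy model with invented names; "views" stand in for whatever screens or panels the platform has):

```python
from collections import deque

class InputRouter:
    """User-initiated transitions buffer input for the incoming view;
    system-initiated ones (notifications, popups) leave the prior
    view as the live tap target."""

    def __init__(self):
        self.active_view = "home"
        self.transitioning_to = None
        self.buffer = deque()

    def begin_transition(self, target, user_initiated):
        if user_initiated:
            # Taps from here on are meant for the incoming view.
            self.transitioning_to = target
        # System-initiated: do nothing; active_view stays the target.

    def tap(self, event):
        """Returns (view, event) for immediate delivery, or None if
        the event was buffered for the view still sliding in."""
        if self.transitioning_to is not None:
            self.buffer.append(event)
            return None
        return (self.active_view, event)

    def finish_transition(self):
        """The new view is ready: flush buffered input to it."""
        delivered = []
        if self.transitioning_to is not None:
            self.active_view = self.transitioning_to
            self.transitioning_to = None
            while self.buffer:
                delivered.append((self.active_view, self.buffer.popleft()))
        return delivered
```

So a tap that lands while a notification banner is sliding in goes to the app underneath, while a tap after you opened a menu yourself is queued for the menu.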
This is not the only distinction, but it is one of them, and I think that one is a good idea. Another distinction is the results of the user initiated action, of whether the result is expected or unexpected, and that distinction is not always so clear.
Great point, and I suspected the problem might not be as easy as it appears at first glance. (Because of course it isn't...)
I also considered the case where you're rapidly scrolling through a page: if a naive approach simply made things non-interactable when they've recently moved, that would neuter re-scrolling until the scrolling halted, which is NOT what people want.
While your use case is valid, it's nowhere near as annoying to have to wait 1 second for a button to enable as it is to call a random person from your contacts because their name appeared under your fat finger. Maybe there could be a distinction between an expected layout change and ad-hoc elements appearing, like notifications, list updates, etc. I would probably be going too far asking for a setting of "time to enable after layout change".
> However, there are times where you're tapping on a familiar app whose flow you know intimately and you know exactly where Y is about to appear and you want to tap on it as fast as you can. In those cases, it is absolutely infuriating if the app simply ignores your input and forces you to tap again.
This is very true, but the app has to be explicitly designed around this e.g. by not injecting random UI elements that can affect the layout.
Unfortunately this seems to be regressing in modern app UX, and not just on mobile. For example, for a very long time, the taskbar in Windows was predictable in this sense because e.g. the Start button is always in the corner, followed by the apps that you've pinned always being in the same locations. And then Win11 comes and changes taskbar layout to be centered by default instead of left-adjusted - which means that, as new apps get launched and their icons added to taskbar, the existing icons shift around to keep the whole thing centered. Who thought this was a good idea? What metric are they using to measure how good their UX is?
> Yes, in cases where the UI is transitioning to an unfamiliar state, the input has a good chance to be useless or incorrect and would be better dropped on the floor. It's annoying when you think you're going to click X but the UI changes to stick Y under your finger instead.
> However, there are times where you're tapping on a familiar app whose flow you know intimately and you know exactly where Y is about to appear and you want to tap on it as fast as you can. In those cases, it is absolutely infuriating if the app simply ignores your input and forces you to tap again.
I agree with both of these, but I think that such a thing would work better with keyboard-oriented interfaces. When using a mouse or touch-screen these are still good ideas anyways, although the situations where you know and should expect what comes next are rarer with the mouse. Still, it can be important, because unexpected pop-ups etc. from other programs matter just as much as, when using the keyboard, pop-ups that steal keyboard focus. Since this can sometimes involve multiple programs running on the same computer that do not necessarily know about each other, it cannot necessarily be solved from only the program itself. (I think that it will be another thing to consider in the UI of my operating system design.)
I've become obsessed with how Visual Studio Code or Helix editor gives a great big JSON settings/properties file for tweaking values. SO much so that I despise other apps for their lack of "set-ability".
To the original author's point, the consternation arises when you, as a programmer, just know there is an animation time or a delay time etc. hardcoded into the app and you can't adjust the value. The lack of any interface exposing that to the user is at least one major frustration; fixing it could help OP.
This has a name, it’s called CLS (cumulative layout shift) and Google actually penalizes SERP rankings for pages with bad CLS. More info on it on lighthouse.dev (if I recall the domain correctly).
Google is an annoying example! Especially on mobile, the search bar shifts around between the main page, search-field edit mode, and the results page, but only shifts a half second after loading a static version. You end up clicking into a blank new search page because the doodle jumps to right under your fingertip.
Rules for thee but not for me!
Google also abuses their power of controlling Gmail to do special tricks in their email campaigns which regular senders cannot do. Using these they force white backgrounds on users who have dark mode enabled (because their default auto dark mode CSS is absolutely awful and makes emails look like crap).
For me the most annoying one is the cookie consent banner. Very few sites have clearly defined buttons like "Allow all" and "Deny all"; the majority have an (intentionally) convoluted UI so that a lot of users just accept all.
We've had multiple standards, most recently https://www.w3.org/TR/gpc/ . They always fail, because they're never mandatory, so surveillance capitalists just decline to implement them – or, worse, quietly sabotage them.
The thing that kills me about touch uis is the chance that this problem can happen at all. For some unknown reason designers insist on putting most controls at the top of the screen. I guess because that’s where they usually are on desktop but when you think about it, it makes no sense. All the controls get put exactly in the place where you can’t reach them easily on most devices. Where you’re likely to misclick on notifications because they’re in the same spot. Where your hand will be covering most of the screen when you reach for them. It would make so much more sense to put controls at the bottom, content at the top. You could reach them without stretching, you wouldn’t cover the screen with your hand to reach them, and they wouldn’t conflict with pop down notification systems. Why on earth are all the controls still at the top?!?
I certainly agree in many cases like those you mention, particularly for unprompted popovers like notifications. For desktop professional software, on the other hand, I believe the value of swift muscle memory supersedes this. In this case, input should be buffered and applied to the state of the app after the previous input has been processed.
Open a tool window, and subsequent keystrokes should be sent to that tool window, even if it takes a second to show. The "new/modern" interface on my CNC is both slow and doesn't properly buffer input, and it's hugely painful.
EDIT: I realize you specified touch, which isn't "desktop", but my CNC control is touch based and the same applies.
> In this case, input should be buffered and applied to the state of the app after the previous input has been processed.
Yes, and this works best with keyboard-oriented interfaces (which I think is generally much better than touch screens anyways; a lot of software I write is designed for keyboard use because it has this and other benefits). However, it should only be done if the process of the UI is what is expected; if something unexpected occurs then it might be better to discard any pending input. (But, sometimes this "expected" and "unexpected" is not so clear.)
I dream of making my own OS that would eliminate this completely. It would use a framework of taking turns with the user. When it's the user's turn to provide input, nothing on the screen would change, ever. After the user completes their input, the computer can do whatever and then provide the output when it's done. Command lines pretty much do this all the time. I think there might be an old IBM terminal type that would send a form for the user to fill out and send back. If there were ever notifications (I would prefer not to have them at all), they would be incorporated into the next set of outputs.
Imagine this as a voice chat interface between two human beings. This is basically pretending that the interaction of thought and perception of what is on screen is gated by some "I have fully consciously absorbed everything on screen and decided my next action" model, where both the human ability to perceive and the computer's ability to represent information are perfect.
No. That’s not how humans interact with computers. It’s not how humans interact with each other either.
Turn based games can be fun. They are not how we want to interact for day to day life.
Sorry but your idea comes across as one that makes the job of making the computer good to interact with easier, but not as one that makes the computer better to interact with as a human.
Please stop oversimplifying a complex system. Humans are complex; the solution isn't to be less human. It is for computers to become better at human interaction, on human levels.
I think the reason this is hard is that your eye only thinks it is seeing the change occur after you touch; it's actually seeing the change occur after the decision to touch, which isn't the same thing at all. Maybe not all the time, but more often than you probably think.
Since you can't go back in time, what I suggest is to arrange for the event (if it occurs slightly after the redraw) to be applied using the old display model (instead of dropped). If the redraw occurs slightly after the event (and you're right), I'd prefer delaying the redraw instead of delaying the tap.
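In other words, hit-test late taps against the layout that was on screen when the user committed to the tap, not the one that just replaced it. A toy sketch (the class, the grace period, and the rectangle format are all invented for illustration):

```python
class HitTester:
    """Resolves a tap against the previous layout if the tap lands
    within a short grace window after a redraw."""

    def __init__(self, grace_ms=150):
        self.grace = grace_ms / 1000.0
        self.layout = {}        # target name -> (x, y, w, h)
        self.prev_layout = {}   # layout before the last redraw
        self.redraw_time = float("-inf")

    def set_layout(self, layout, now):
        self.prev_layout = self.layout
        self.layout = layout
        self.redraw_time = now

    def resolve(self, x, y, now):
        # Shortly after a redraw, the user was almost certainly aiming
        # at the old layout, so hit-test against that instead.
        use = (self.prev_layout
               if now - self.redraw_time < self.grace
               else self.layout)
        for name, (rx, ry, rw, rh) in use.items():
            if rx <= x < rx + rw and ry <= y < ry + rh:
                return name
        return None
```

This is exactly the "End call" scenario upthread: a tap 50 ms after the call screen closed resolves to the old End-call button, not to the contact that replaced it.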
Counterpoint, and I know it's not apples to apples, but have you ever used an old terminal app? Buffering the keystrokes and then applying them _once_ the menu was ready was awesome. You could move so fast through an app.
You could often do this in ye olde Macintosh System Software, too: fill the event queue with events and rely on the program to intelligently clear the queue just before an incompatible UI change.
If you're curious to learn more, this is often referred to as "layout shift". There are black-hat designers who deliberately use it as a dark pattern -- you might notice a suspicious number of cases where the most common click target is replaced by a click target (or late-arriving popup) for submitting an email address or entering the sales funnel. But typically it's just bad design.
In the latter case, you could quietly disable buttons after a layout shift, but this can cause problems when users attempt to interact with an onscreen element only to have it mysteriously ignore them. You could visually indicate the disabled state for a few hundred milliseconds, but this would appear as flicker. If you want to be very clever you could pass a tap/click to the click target that was at that location a few tens/hundreds of milliseconds prior, but now you've got to pick a cutoff based on average human response times, which may be too much for some and too little for others. That also wouldn't help with non-click interactions such as simply attempting to read content -- while not as severe, trying to read a label that suddenly moves can be almost as frustrating. Some products attempt to pause layout shifts that might impact an element the user is about to interact with, but while this is possible with a mouse cursor to indicate intent, it is harder to predict on mobile.
Some of these ideas are even used in cases where a layout shift is necessary, such as in a livestream with interactive elements. However, the general consensus is to use content placeholders for late-loading content and avoid rendering visible elements, especially interactive ones, until you have high confidence that they will not continue to move. That's why performance tooling and search ranking penalize websites with high "cumulative layout shift", e.g. see https://web.dev/articles/cls
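For reference, each shift's contribution to the CLS score is defined as the impact fraction (how much of the viewport the shifted content occupies) times the distance fraction (how far it moved, relative to the viewport). A simplified, vertical-only sketch of that arithmetic:

```python
def layout_shift_score(viewport_h, impacted_h, moved_px):
    """Per-shift score, simplified to the vertical axis:
    impact fraction * distance fraction, each capped at 1.0.
    (The real metric considers both axes and session windows.)"""
    impact = min(impacted_h / viewport_h, 1.0)
    distance = min(moved_px / viewport_h, 1.0)
    return impact * distance
```

So a block covering half of an 800px viewport that jumps 80px scores 0.5 * 0.1 = 0.05 for that one shift; the "cumulative" part sums these over the page's worst burst of shifts.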
> Why do we even show interactable elements when the final layout isn't completed yet?
Typically such a product either doesn't have sufficient UX attention, or it has black-hat UX folks.
Optimally, toolkits and browsers should have handled this, since they know the layout dependencies. If an element is still loading and doesn't have fixed dimensions, then no element whose position depends on it should be shown.
I am sure YouTube, at least, relies on this kind of issue to inflate ad clicks in its mobile app. Ads pop up at any point and override the on-screen controls.
https://medium.com/@pmarreck/the-most-annoying-ui-problem-r3...