Three ways of handling user input (2022) (dubroy.com)
66 points by stcucufa on April 29, 2024 | hide | past | favorite | 22 comments



One of the most annoying "dropped input" problems for me is the Linux (GNOME's?) lock screen. To unlock the machine you need to press e.g. Space to get the password entry textbox to appear and take focus, then type in the password and press Enter; that's all fine. However, after the screen locks, the monitor powers down, so the first keypress turns the monitor back on, and only after some delay does focus land on the password entry textbox. That means that if I walk up to my machine and rapidly type "Spacebar, hunter42, Enter", the first 2 or 3 characters of my password are ignored and I get a "wrong password!" notification, again after some delay, because preventing on-site brute-forcing is important.

Contrast that with pressing F11, two arrows down to pick the correct OS, then Enter, in quick succession during boot, before the monitor powers up and starts showing anything (for some reason, the monitor I use takes about as much time to boot up as the computer itself). This has never failed me yet, because apparently bootloaders don't just throw the keyboard events away.


Are you sure the delay before the input box is focused is not just the computer waking from sleep? I have the same thing on my desktop: I need to press a key to wake it up. On my laptop, which wakes when the lid opens, I can just type the password, without even a keypress to get past the lock screen.


I don't think it is? I normally turn all the power-saving stuff off, but it could have come back.

In any case, keyboards do have a buffer to hold several keypresses, so I believe what happens is that the event loop kicks in, reads those keypresses, and dispatches them into the void; the boot-time code, on the other hand, only polls the keyboard when it's about to act on it.
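A toy sketch of the difference described above (this is not GNOME's or any bootloader's actual code, just an illustration): an event loop that drains the keyboard buffer before anything has focus throws the type-ahead away, while code that only reads the buffer when it's ready to act keeps it.

```python
from collections import deque

def lockscreen_style(buffer, wake_delay_events=2):
    """Event loop that drains the buffer immediately: keypresses that
    arrive before the password box has focus go into the void."""
    typed = []
    focused = False
    events_seen = 0
    while buffer:
        key = buffer.popleft()
        events_seen += 1
        if not focused:
            # Simulate the delay before focus lands on the textbox:
            # the first few events are consumed but dispatched nowhere.
            if events_seen >= wake_delay_events:
                focused = True
            continue
        typed.append(key)
    return "".join(typed)

def bootloader_style(buffer):
    """Only reads the keyboard buffer when ready to act, so nothing
    typed ahead of time is lost."""
    return "".join(buffer)
```

With the same type-ahead in both buffers, `bootloader_style` sees every character while `lockscreen_style` silently eats the first few.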


The delay is important for increasing security, but then why not have it after the check?


> The delay is important for increasing security, but then why not have it after the check?

It really isn't on a local machine, when the password is entered manually at the keyboard. There should be zero delay. 0 ms would be acceptable. And some systems can be configured with no delay between successive password attempts.

If one really believes adding a delay locally helps, the only thing that makes sense is introducing a delay after several failed attempts (say, after five attempts: a 10-second delay).

But security theatre by having a 3 second delay between two login attempts is ridiculous.
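The suggested policy can be sketched in a few lines (this is a hypothetical illustration, not any real PAM module, though it is roughly the behavior Linux's pam_faillock provides): zero delay per attempt, with a lockout only after a burst of failures.

```python
import time

def make_checker(verify, max_free_attempts=5, lockout_seconds=10):
    """Allow a burst of attempts with zero delay, then impose a
    lockout after repeated failures."""
    state = {"failures": 0, "locked_until": 0.0}

    def attempt(password, now=None):
        now = time.monotonic() if now is None else now
        if now < state["locked_until"]:
            return "locked out"
        if verify(password):
            state["failures"] = 0
            return "ok"
        state["failures"] += 1
        if state["failures"] >= max_free_attempts:
            state["locked_until"] = now + lockout_seconds
            state["failures"] = 0
        return "wrong password"

    return attempt
```

The legitimate user who fat-fingers their password once or twice never sees a delay; only a sustained stream of failures triggers one.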


This series of articles is massively underrated. Even modern React setups with hooks and modern state handling are rife with data races that happen at the input level. This article and its follow-up present a very compelling approach to handling input in a structured way and avoiding those data races.


Do such issues exist in popular desktop GUI frameworks?


Yes. There's nothing particularly magical about any UI framework. They all have to deal with the fundamental fact that UIs are enormously stateful, the state is a function of approximately everything the user has ever done, various inputs arrive asynchronously and unpredictably, responsiveness SLOs are in the 10ms range, and the hardware is generally diverse and off-site.


This reminds me of Evan Czaplicki's PhD Thesis [1] on how concurrent functional reactive programming (FRP) can be used to solve most (if not all) of those challenges with little effort from the application developer.

[1] https://elm-lang.org/assets/papers/concurrent-frp.pdf


Judging by the fact that iOS apps produced by Apple itself have races with resizing menus when you rotate the device, the answer is a resounding "yes".


Thanks, but that's why I mentioned desktop; I was curious whether they still retain this type of fundamental failure despite "decades of improvements". It's much less of a surprise re. the web/mobile.


Yes on Android also.


The "fixed" version doesn't return on Escape, and the first bug doesn't appear after you've moved the box (and both sometimes get glued to the mouse pointer even on mouse up). Doesn't seem like a great solution.


When dealing with inputs of diverse latencies, another approach is a replayable event stream with app state that's checkpointed or otherwise retractable. My laptop had cameras tracking hands on the keyboard, creating fused input: key events with finger id and position on the keycap, keycap touches as modifiers, and keycap touch events without a press. But while you want the UI response to a keypress to be very fast, the tracking info wouldn't become available until the next video frame finally showed up and was processed. Integrating voice faces similar issues: a speech-to-text "launch the missiles!" might a moment later become "lunch is mussels, in butter!", with the launch needing retraction. Which required the app state, and modifications to it, to support retraction.
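The retraction idea can be sketched as a reducer-style fold over an event log (a minimal illustration of my own, assuming a non-mutating reducer; a real system would replay from the nearest checkpoint rather than from the beginning):

```python
class ReplayableStream:
    """App state as a fold over a replayable event log. A late-arriving
    correction replaces an earlier event and state is recomputed by
    replaying the log."""

    def __init__(self, reducer, initial):
        self.reducer = reducer
        self.initial = initial
        self.log = []
        self.state = initial

    def append(self, event):
        self.log.append(event)
        self.state = self.reducer(self.state, event)

    def revise(self, index, new_event):
        # Retract the old event by overwriting it, then replay.
        # (A checkpointing system would resume from the last snapshot
        # taken before `index` instead of starting from `initial`.)
        self.log[index] = new_event
        self.state = self.initial
        for e in self.log:
            self.state = self.reducer(self.state, e)
```

With a transcript reducer, appending "launch", "the", "missiles" and then revising event 0 to "lunch" yields the corrected transcript without the UI ever having blocked on the slow input.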


Replayable event streams (but be careful there: a complete drag interaction of multiple minutes may produce tens of thousands of events, so deduping/vector addition is needed). Also "interlocked" interactions (if one interaction is in progress, no other interaction may start), orchestrated global shortcut installation...
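The deduping/vector-addition point amounts to coalescing runs of consecutive move events by summing their deltas (the event shape here is a made-up illustration, not from the article):

```python
def coalesce_moves(events):
    """Collapse consecutive ('move', dx, dy) events into one by adding
    their deltas, so replaying a long drag doesn't mean replaying tens
    of thousands of events."""
    out = []
    for ev in events:
        if ev[0] == "move" and out and out[-1][0] == "move":
            _, dx, dy = out[-1]
            out[-1] = ("move", dx + ev[1], dy + ev[2])
        else:
            out.append(ev)
    return out
```

Non-move events (button presses, releases) act as barriers, so the replayed log still reproduces the same final state.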


> My laptop had cameras tracking hands on keyboard, creating fused input.

What laptop is that?


ThinkPad P53 "portable workstation" with DIY cameras on sticks mounted on back of screen. Here mostly a keyboard-cam that flopped up-and-over to look down on the keyboard, and a convex mirror bar along the base of screen for crappy touch detection. But also fisheyes hinged up to corners. Python/MediaPipe/libev under an Electron/browser stack.


Ah, so the laptop is normal and you've tacked on a DIY rig so it can see your keyboard. Is there a writeup/blogpost/repo somewhere? That's an interesting idea


Nod. Sigh, regrettably not. A photo[1] got snagged for a twitter conversation, but little else. A different part of the effort, diy shutter glasses, got a brief reddit post.

Failure mode: big scope, "these kludges only matter if they end up day-to-day usable", "just for me - user community potential is zero". It was motivated by a local in-person tech community that covid ended; I didn't even do an "ok, this looks dead, so let's scavenge a wrap-up".

Ah well. I've been wondering what to do differently with a potential new project. Maybe try an offline live-blogging-flavored research journal, with a story of "when things come up in conversation, I'll scavenge together a live-blogging-effort-level post in answer"?

Thanks for the interest. Happy to answer questions. Hand/head/gaze/stylus tracking, electron-based tall stack (low-level-input to full-screen, X11 apps via screen capture - eg emacs presenting adjacent slightly different buffers for stereo), shallow-3D ui, integration with shutter glasses / nreal / diy-passthrough-ar-in-vr/drone-headsets, laptop keyboard as multitouch surface. And none of it usable. Lots of extra "well, that didn't work out" and "interesting, but no" spikes. Lots of "this only becomes usable if/when this other thing really works" dependencies. Blech. Might be able to pull out something bite-sized for a student thesis project or such. :/

[1] https://imgur.com/a/Z1VipaL


What about the other 10 ways that appear when you program for your OS rather than a browser?


Fortunately, all input handling techniques can be reduced to relatively few concepts.


This problem is just begging for a FSM.
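For the drag interaction the article uses as its running example, a finite state machine might look like this minimal sketch (state names, events, and behavior are my own illustration, not the article's code):

```python
class DragFSM:
    """Two states, 'idle' and 'dragging'; every input event is handled
    explicitly per state, so there is no ambiguity about what a
    mousemove or Escape means at any moment."""

    def __init__(self):
        self.state = "idle"
        self.pos = (0, 0)
        self.start_pos = None

    def handle(self, event, *args):
        if self.state == "idle":
            if event == "mousedown":
                self.start_pos = self.pos
                self.state = "dragging"
            # mousemove/escape in 'idle' are explicitly ignored
        elif self.state == "dragging":
            if event == "mousemove":
                dx, dy = args
                self.pos = (self.pos[0] + dx, self.pos[1] + dy)
            elif event == "mouseup":
                self.state = "idle"        # never glued to the pointer
            elif event == "escape":
                self.pos = self.start_pos  # cancel: restore position
                self.state = "idle"
```

Because each (state, event) pair has one defined outcome, failure modes like a box staying glued to the pointer after mouseup, or Escape not cancelling a drag, become impossible by construction rather than bugs to chase.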



