I'm currently researching unique, cutting-edge and honestly kick-ass user experiences that just work with AI. This can be a feature or service, anywhere from open source to mainstream.
A while back, someone pulled out right in front of me from a side street while I was driving. The car slammed on the brakes, and before I had fully registered what was going on, I found myself stopped safely, only inches away from having t-boned that guy.
That is the level of experience AI needs to get to. Not buttons that basically say: "Use AI!!" but features so fully integrated and smooth that you don't even think about whether or not AI is behind it... it just does what it does when it needs to do it.
(And I know, my anecdote wasn't about LLMs, but that is kinda the point.)
Not in any 2+ dimensional situation. Radar-based AEB is awful. Reliable collision avoidance requires robust prediction of future trajectories for everything around you. (Technically it requires robust lower bounds on minimum time to impact for every object around you, which requires perceiving and identifying every object around you, plus reasonable predictions based on the identification and behaviour history of each object. If you're willing to allow "nah, that was BS and not your fault" style collisions, like the "stopped for a paper bag sitting on the road" / "hit a stack of bricks in a paper bag" dichotomy, then it's a little more forgiving, but it's still much harder than a rangefinder plus second-order equations of motion.)
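For reference, that "rangefinder + second-order equations of motion" baseline is literally just a quadratic. A toy sketch (illustrative only, made-up numbers, not anyone's production AEB code):

```typescript
// Toy time-to-impact estimate from a single rangefinder track.
// range (m) is the gap to the object, rangeRate (m/s) is its derivative
// (negative when closing), rangeAccel (m/s^2) is its second derivative.
// Returns the smallest positive root of r0 + v*t + 0.5*a*t^2 = 0,
// or Infinity if the gap never closes under this constant-acceleration model.
function timeToImpact(range: number, rangeRate: number, rangeAccel: number): number {
  const a = 0.5 * rangeAccel;
  const b = rangeRate;
  const c = range;

  if (Math.abs(a) < 1e-9) {
    // Constant closing speed: impact only if we're actually closing.
    return b < 0 ? -c / b : Infinity;
  }

  const disc = b * b - 4 * a * c;
  if (disc < 0) return Infinity; // the gap never reaches zero

  const sqrtDisc = Math.sqrt(disc);
  const roots = [(-b - sqrtDisc) / (2 * a), (-b + sqrtDisc) / (2 * a)];
  const positive = roots.filter((t) => t > 0);
  return positive.length ? Math.min(...positive) : Infinity;
}

// e.g. 20 m gap, closing at 15 m/s, closure decelerating at 2 m/s^2 (ego car braking):
console.log(timeToImpact(20, -15, 2).toFixed(2), "s until impact");
```

And that's exactly the point: this only covers the one thing directly in the beam. Everything beyond it needs perception, identification, and behaviour prediction.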
Personally I think the user experience is interesting because we show you very specific questions for low-confidence decisions. Some example screenshots are on that link above. Over time, the number of manual questions has gone down, as our models have gotten smarter about the (seemingly endless!) edge cases in music notation.
Once music is uploaded and scanned, you can use our bespoke notation editor to make any edits. The original image is tightly integrated into our editor, so you can cross-reference.
Gonna drop my own link here, because I really think the UX I'm working on is truly novel. Inspired by the commonplace book format, I take highlights from Kindle and embed them in a DB [1]. From there I build (multiple) downstream apps, but the central one, Commonplace Bot [2], is a bot that serves as a retrieval and transformation layer for said highlights. It has changed the way I read books. I now get to link ideas from books I read in 2018 to books I read last week. I don't always need a query either, as I added hypothetical questions as an entry point, so the UX of finding an idea can be as simple as typing "wander". Finally, since quotes are dense, short, and generally context-free, I enable a bunch of transformations: Anki quizzes, art from quotes, using the quote itself as a centroid to search its neighbors, etc.
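If you're curious what the "quote as centroid" trick looks like, here's a stripped-down sketch of the idea (not the production code, just the shape of it; the in-memory array stands in for whatever embedding model and DB you actually use):

```typescript
// Sketch: use one highlight's embedding as a centroid and pull its nearest neighbors.
type Quote = { text: string; source: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Given a quote you're currently looking at, surface the k closest highlights
// from the whole library -- possibly from a book read years earlier.
function neighborsOf(centroid: Quote, library: Quote[], k = 5): Quote[] {
  return library
    .filter((q) => q !== centroid)
    .map((q) => ({ q, score: cosineSimilarity(centroid.embedding, q.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ q }) => q);
}
```

In practice the vector database does the sorting for you, but the idea is the same: no query needed, the quote itself is the query.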
I love this. I have my commonplace book in Roam Research. Search in Roam is not perfect and I have wondered lately if there was a way to get all of the content into a graph DB and then query using LLMs. But I haven't had time to tinker with it - I am sure open source libraries exist that do exactly this.
Can your library take all highlights from Readwise or just Kindle? I use Readwise Reader quite a bit and would love something that takes everything I save + all highlights + other places (Roam Research, email, calendar) etc. and lets me just ask it questions.
You definitely could! Funnily enough, I have a function named "justBooks()" [1] that filters the Readwise export to just book-type tags, but you could use the entire export, or whatever upstream method you want. Much like journaling, everyone's use case will be catered to their own tasks/quotes/ideas, but allow me to share some general advice. You'll definitely need: 1) a database that supports vectors (I use Postgres), 2) a low-friction way to get your "new" highlights from your reading practice (I use Readwise), and 3) an LLM to "cache" transformations [2]. That transformation does an insane amount of work and takes the whole thing to the next level in terms of utility; I wouldn't skip it.
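Roughly, 1) and 3) wired together look like this. A sketch only, assuming Postgres with the pgvector extension and an OpenAI-style embedding/chat API; the table, column, and model names here are just examples, not my actual schema:

```typescript
import { Client } from "pg";
import OpenAI from "openai";

const db = new Client({ connectionString: process.env.DATABASE_URL });
const openai = new OpenAI();

// One-time setup: a highlights table with an embedding column (pgvector) and a
// cached "hypothetical question" produced once per highlight by the LLM.
// (Call db.connect() once at startup and run SCHEMA before ingesting.)
const SCHEMA = `
  CREATE EXTENSION IF NOT EXISTS vector;
  CREATE TABLE IF NOT EXISTS highlights (
    id serial PRIMARY KEY,
    text text NOT NULL,
    book text,
    hypothetical_question text,
    embedding vector(1536)
  );
`;

async function ingestHighlight(text: string, book: string) {
  // "Cache" the LLM transformation at ingest time so it's never recomputed.
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "Write one question this passage would answer." },
      { role: "user", content: text },
    ],
  });
  const question = completion.choices[0].message.content ?? "";

  // Embed the highlight together with its hypothetical question.
  const embedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: `${text}\n${question}`,
  });

  await db.query(
    "INSERT INTO highlights (text, book, hypothetical_question, embedding) VALUES ($1, $2, $3, $4)",
    [text, book, question, JSON.stringify(embedding.data[0].embedding)]
  );
}
```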
This is really cool! I could see it being excellent for anyone who writes or gives speeches very often, a great way to quickly access the knowledge one builds up over a lifetime of reading. Love it!
Well, as a blind user, I'd like to point at the OpenAI Vision integration of BeMyEyes! Being able to get fully detailed scene descriptions including OCR and translation all in one package was pretty much a game changer for me.
Not so much kick-ass, but still works nicely: https://github.com/mlang/tracktales -- My MPD track announcer with support for describing album art...
Can I ask if you think this might make alt text on images obsolete? Do you use the alt text where it’s available, or BeMyEyes (I presume you have a choice)?
Those are two different topics. BeMyEyes is a smartphone app which brings your camera and OpenAI vision models together. It is meant to be used to describe/OCR things in your real-world environment.
While alt texts could theoretically be replaced by a browser/screen reader functionality that asks a vision model to describe the image, it is a waste of time and energy to have each and every user do it over and over again.
Ah, sorry, got you. The aspect I think about with alt text is that AI is often better than a mediocre human effort, and it is improving all the time. Improving AI will improve the description of all images, even older ones, and therefore you might want to run the current AI on all images, even if they have existing alt text.
No one? In my opinion we are not there yet. Right now "chat" is the killer UX, and I think conversational inputs will take over for a time... right now my killer AI app is a home-brew Slack bot RAG pipeline for my own knowledge-base building and searching... Why share it? It's free now... I think this will also be a trend: those who can, and those who can't.
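For the curious, the smallest version of that kind of Slack RAG bot is roughly this shape. Heavily simplified sketch, not my actual pipeline: retrieval is stubbed out, and it assumes @slack/bolt plus an OpenAI-style chat API:

```typescript
import { App } from "@slack/bolt";
import OpenAI from "openai";

const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});
const openai = new OpenAI();

// Stand-in for whatever vector search you run over your knowledge base,
// e.g. top-k chunks from pgvector or a local index.
async function retrieveNotes(query: string): Promise<string[]> {
  return [];
}

app.message(async ({ message, say }) => {
  if (!("text" in message) || !message.text) return;

  const context = (await retrieveNotes(message.text)).join("\n---\n");
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: `Answer using only these notes:\n${context}` },
      { role: "user", content: message.text },
    ],
  });

  await say(completion.choices[0].message.content ?? "No answer found.");
});

(async () => {
  await app.start(Number(process.env.PORT) || 3000);
})();
```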
Chat sucks and will be replaced in almost all cases, I suspect. Speaking is much easier for many tasks, traditional UIs for others. Mediating everything through a keyboard is clunky and slow.
I very much doubt that speaking is slower than typing, given that basically only stenographers can transcribe things in real time, and that's by using a special technique.
Talking is still faster, and I can do it while my hands and eyes are occupied, so I definitely don't see typing ever being the preferred interface. You'd never type to interact with an assistant while driving.
The voice chat in the ChatGPT app. The present has never been more futuristic than that, I really feel like I'm talking to another person. It's not just what the responses are, but also how, the voice really brings the whole interaction to life.
A year ago I cancelled my policy with Churchill, and I found the entire process pretty painless: I called the phone line, was greeted by the robot, and said "I want to cancel my policy". From the phone number I was calling, it already knew who I was; I confirmed my identity, it outlined when my policy would be cancelled, and it confirmed with me that I wanted to proceed. The entire experience was self-serve, and I'd like to see that everywhere.
Compare this to yesterday's adventure with another service (my package got lost), where the bot couldn't decipher what a WRITTEN "my package got lost" or "where is my delivery" meant.
Talking with an LLM feels very different to me from text-based chat interactions.
I used the spoken interface with ChatGPT 4 a lot a few months ago after it was released on the iPhone app, and it was pretty immersive. The latency was a bit long, though, and even when prompted to reply briefly the bot tended to ramble on, often with numbered lists, which sound awkward in speech.
For the past couple of weeks, I’ve been experimenting with Inflection AI’s Pi. Its voices are very natural—the American female voice I use even has vocal fry [1]—and the latency is short. It will talk about serious topics (sometimes with numbered lists), but it seems prompted mainly for friendly conversation. It calls me by my name and remembers our previous conversations. I can easily see people becoming emotionally attached to bots like that.
A man named Chris Cappetta has created some open-source software for talking with Claude 3. His conversations with the bot about AI are pretty remarkable [2, 3].
The current spoken interfaces all seem to run what the user says through a speech-to-text converter, so the bot does not perceive pronunciation, intonation, hesitation, etc. After multimodal models that can hear and respond to the speaker’s tone become available, the experience will become even stickier.
If neural networks count as AI, then translation tools; DeepL is one of my favorites.
There is also AI in video editing apps: fast autofocus on faces, face detection, and modifiers that follow you. It's really incredible and intuitive enough that many people use it (too much, maybe).
I was recently working on a UI system that attempts to let AI build websites autonomously. When you start working on it, you quickly realize how many tradeoffs you have to think through. Most of the questions arise from the limits of the context window; for example, Claude 3 supports 200k tokens.
You also have to take into consideration that, since the chat history is sent with every new message, the price of the conversation grows roughly as n^2 in the number of messages (message i resends all i-1 earlier messages, so the total tokens sum to about n^2/2 times the average message size). So do you send the whole codebase? Or do you let the AI run commands like ls and cat to read the files it needs? Do you want a file in the directory with a quick history of what's already done and what still needs to be done?
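For the "let the AI run ls and cat" option, it boils down to exposing a couple of read-only tools. A rough sketch of what those could look like (the schema follows the usual function-calling pattern; the exact wire format depends on which model API you're using, and the names here are just examples):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Read-only "ls" and "cat" tools exposed to the model instead of
// stuffing the whole codebase into the context window.
const tools = [
  {
    name: "list_files",
    description: "List files in a directory of the project.",
    input_schema: {
      type: "object",
      properties: { dir: { type: "string", description: "Path relative to project root" } },
      required: ["dir"],
    },
  },
  {
    name: "read_file",
    description: "Read one file's contents.",
    input_schema: {
      type: "object",
      properties: { file: { type: "string" } },
      required: ["file"],
    },
  },
];

const PROJECT_ROOT = process.cwd();

// Executed locally whenever the model emits a tool call.
function runTool(name: string, input: Record<string, string>): string {
  const target = path.resolve(PROJECT_ROOT, input.dir ?? input.file ?? ".");
  if (!target.startsWith(PROJECT_ROOT)) return "Error: path escapes project root.";

  if (name === "list_files") return fs.readdirSync(target).join("\n");
  if (name === "read_file") return fs.readFileSync(target, "utf8");
  return `Unknown tool: ${name}`;
}
```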
Another thing I find interesting is how microservices become a natural choice vs. monolithic apps when building with AI, again due to the limits of the context window. So you focus on thinking through all the components and their APIs, and then let the AI build each component. If a component can be built in isolation without any knowledge of the other components, that's better.
Also, it quickly becomes obvious that a fully autonomous builder doesn't make practical sense. A real person still needs to look at the progress and give guidance. Not even because the AI can't do it; it probably can. But your own understanding of what you are building changes over time. So it should be semi-automatic, with real users able to change course at any moment.
How do you build the autonomous loop?
One thing I find useful is to let the AI write tests first, and then run those automatically on each new chat message. TypeScript types also help catch broken code early. In that case an automatic message is sent: "Hey, you broke the tests. Here are the error messages. Go ahead and fix those." The operator doesn't have to get involved until it's fixed.
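Wiring that up is simple. Roughly (a sketch only, assuming an npm test script and whatever sendMessageToModel function your loop already has):

```typescript
import { execSync } from "node:child_process";

// After each model turn, type-check and run the test suite. If anything breaks,
// feed the errors straight back as the next message -- no operator involved.
function runChecks(): { ok: boolean; output: string } {
  try {
    const output = execSync("npx tsc --noEmit && npm test", {
      encoding: "utf8",
      stdio: "pipe",
    });
    return { ok: true, output };
  } catch (err: any) {
    // execSync throws on a non-zero exit code; stdout/stderr carry the errors.
    return { ok: false, output: `${err.stdout ?? ""}\n${err.stderr ?? ""}` };
  }
}

async function afterModelTurn(sendMessageToModel: (msg: string) => Promise<void>) {
  const { ok, output } = runChecks();
  if (!ok) {
    await sendMessageToModel(
      `Hey, you broke the tests. Here are the error messages:\n${output}\nGo ahead and fix those.`
    );
  }
}
```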
Another loop can be built with the ability to send screenshots. At any moment the system can send a screenshot to the AI and ask whether it's good enough and whether it wants to make any changes. That also improves the quality.
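The screenshot loop, sketched with Puppeteer and a vision-capable chat endpoint (again just the shape of it, not a drop-in implementation; the model name is only an example):

```typescript
import puppeteer from "puppeteer";
import OpenAI from "openai";

const openai = new OpenAI();

// Grab a screenshot of the work-in-progress site and ask a vision model
// whether it looks good enough or needs changes.
async function reviewScreenshot(url: string): Promise<string> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });
  const screenshot = await page.screenshot({ encoding: "base64" });
  await browser.close();

  const completion = await openai.chat.completions.create({
    model: "gpt-4-vision-preview",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Is this page good enough, or what would you change?" },
          { type: "image_url", image_url: { url: `data:image/png;base64,${screenshot}` } },
        ],
      },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```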
Well, you get the idea. It's an interesting task to ponder.
Klaviyo. I know they have been around a long while but I just implemented it for the first time today on a new ecommerce project.
The signup and setup flow is quite lengthy because they need to implement email flows, abandoned-cart reminders, SMS flows, and push messaging, all of which of course need to be highly customized. All of this is needed just to unlock some of the basic features of the tool.
I was surprised and delighted to begin setting up an email series, only to discover it had already scanned my website and used AI to write the content of all the messages to match our tone and messaging.
Highly impressive and it makes getting it up and running super fast.
There is a segment in a recent Linus Tech Tips video [1] that showcased media asset management software [2] where you can search for portions of locally stored videos via natural language. e.g. Person X holding object Y, working on task Z. If this type of AI video tagging comes to mobile I think it will be a game changer.
I would say OpenAI's Whisper just works; a nice GUI wrapper that leverages Metal/GPU/co-processors is "Hello Transcribe".
Whisper transcribes conversations from audio files. Hello Transcribe is a GUI wrapper by someone else that uses Whisper under the hood to create subtitle files with timestamps.
Does not distinguish between speakers though
I wish other use cases were considered by that application, as I pretty much never want subtitle files and don't plan to start.
There are some techniques to distinguish between speakers, but I haven't seen anyone put the combination together in a nice GPU-leveraging app.
Which AI do you mean? It's so ubiquitous that it's hard to pick. Categorization and labelling existed before LLMs, but LLMs made them much better. Imagine not having to manually label support tickets when you get 1,000 new ones every day.
3D model generation is also a good example, you can save tons of time.
AI is everywhere, but most of the time you simply don't notice it since it's so well integrated, or it's driving backend logic, anti-fraud systems, etc.
"AI" is a moving target but a few months ago I was blown away to find out that phones are WAY better at semantically tagging photos now, to a point where it's actually useful. I can search my phone photos for 'cat' or 'car' (or more usefully, that one scanned pic of my drivers' license if I forgot my ID). Sure, resnet has been around for a while and this is still not great but it's there in production right now and has been legitimately useful to me.
Recently I got to drive a rental Corolla with lanekeep assist and smart-ish cruise control which could sorta keep itself on the road by itself. It was definitely a fun toy for a controls engineer, kind of like riding a narcoleptic horse with ADHD but still on the cusp of being a net positive.
The best UX at the moment is doing role play with ChatGPT's voice interaction.
My son (5y) loves to play "holiday". So I tell GPT to pretend to be a hotel receptionist. And he will then have a conversation about rooms, events, costs, etc. with the bot.
I have not used it, but every review of the "AI software engineer" Devin has pointed out that the marketing claim is absurd but the user interface is pretty cool.
It's difficult to spot, because - similar to your sysadmin - if you know it's AI, it's probably because it's doing a bad job. It's when it just blends into the overall experience so you don't even notice it's there that it's great. There are cases where AI is helping a company be more profitable, by allowing them to provide a substandard service with fewer humans. From a company perspective, it's doing great, but the end user experience sucks.
So, my list:
* Spotify's weekly picks used to be pretty good at recommending new music, although it's actually got worse in the last 2-3 years.
* AI filtering out things like fraudulent transactions and virus-laden web pages. They're a long way from 100%, but it's got better, even as the challenge has got bigger.
* Some games have started making good use of AI - Red Dead Redemption 2 is probably the best I've played. Makes the in-game world feel a bit more dynamic, rather than the same procedural world.
* Google Maps does a lot with AI, redirecting based on traffic. It doesn't go out of its way to tell you how clever it's being, so it's hard to spot. But 10 years ago, I used to get stuck in a lot more traffic than I do now.
* ChatGPT is awesome, even if the hype cycle is now turning and we're all eye-rolling at it. I've conversed with it to improve my understanding of all sorts of topics, and it is amazing.
> * Spotify's weekly picks used to be pretty good at recommending new music, although it's actually got worse in the last 2-3 years.
In a similar vein, I remember using Pandora for the first time in the late '00s and thinking "Is this software reading my mind??" as it picked songs I either already loved or new songs I liked immediately.
Amazon Music, YouTube, etc. all feel like some version of "this is what other people liked", but that just spirals me into some very specific genre.
I haven't had an experience similar to that first Pandora one in tech in a long time.
Yes! In 2010, it could say "Oh you like Leonard Cohen? Maybe you'll like Jonathan Richman.", which was a great jump. Now, it's more like "Oh, you like Simon and Garfunkel? Here's a bland cover of one of their songs."