I was a bit surprised by their choice of example app. With tour guide info you need to be sure the information you're giving is correct, and asking LLMs to write it for you is a recipe for hallucinations. I built Summer AI to do audio tour guides using old-fashioned web scraping, then piping the output through LLMs just to summarize, and even then I use a couple of extra steps to ensure factual accuracy.
During testing I found that any other approach would inevitably make stuff up.
Not trying to disparage this work. I think it's great that you can build such a thing so quickly. But I wouldn't rely on the info these LLMs provide.
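The comment above does not say what the extra accuracy steps are. As one hypothetical illustration only, a cheap post-check could flag numbers and capitalized names in an LLM summary that never occur in the scraped source text. The function name `ungrounded_terms` and the token rule are assumptions made for this sketch, not Summer AI's method.

```python
import re

# Tokens are either numbers (optionally with , or . separators) or plain words.
TOKEN = re.compile(r"\d+(?:[.,]\d+)*|[A-Za-z]+")

def ungrounded_terms(summary: str, source: str) -> list[str]:
    """Return fact-like terms in the summary that are absent from the source.

    "Fact-like" here means a number or a capitalized word. This is a crude
    heuristic: it catches invented names and figures, not subtle rewording.
    """
    source_terms = {t.lower() for t in TOKEN.findall(source)}
    flagged = []
    for term in TOKEN.findall(summary):
        fact_like = term[0].isdigit() or term[0].isupper()
        if fact_like and term.lower() not in source_terms and term not in flagged:
            flagged.append(term)
    return flagged
```

A summary that trips the check could be regenerated or sent for manual review; an empty result does not prove the summary is accurate, only that no unseen names or numbers were introduced.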
We are working on a similar application and we have the same observation: external data is required to avoid hallucinations, especially if you go to lesser-known places. It's absolutely the case with GPT-3.5 and often with GPT-4.
We will release our new content in the next few days. One last thing we are wondering about is whether to eat the cost of expensive TTS or go with a cheap option for okay results. Can I ask which option you used for TTS?
Hello! I know of your guys' work! I try to keep up with all the competitors :)
Feel free to reach out to me rob @ summer dot ai and I'd be happy to talk shop.
For anyone else who is interested in this question: I've tried a whole bunch of the TTS services and found that Microsoft and AWS are the best of the standard providers IMHO. These are also services that tend to have startup credits available, so I use a mix of the two; I try to never rely on just one provider. I've met with the Eleven Labs folks and some of their demos of the V2 stuff that's coming are really amazing, but latency and pricing might rule them out as an option for the time being.
Thanks for the answer Rob, we just reached out :) We arrived at the same conclusion; we mostly rely on AWS Polly so far. Hopefully the pricing of better alternatives goes down significantly in the next few months. We even tried running different open source solutions but we could not find anything SOTA.
I used it to plan a summer European trip to 17 cities and didn't notice any hallucinations for what must have been 20 hours of back and forth creating itineraries, day trips, etc.
Did not use it as my only source of info, still watched some videos by Rick Steves and Wolters World, but it was far and away the most efficient part of the planning process for me.
Yes, GPT-4. I was fact checking using places I knew pretty well and would find the occasional error when asking it to describe cities and what to do there.
Well for itineraries, given that its training data gets cut off at some point, I’d also be worried about it recommending places that are no longer open.
Note that I’m building an app for customers and attempting to build a brand that can be trusted, whereas in your case GPT-4 is probably ideal given you know its caveats and can check the important stuff.
I've also used it a few times to suggest things to do in the last few months. It came up with OK suggestions.
Also, they literally just demoed something similar to this project as part of their recent keynote on new OpenAI stuff. The demo focused on their new Assistants API.
I've actually grilled ChatGPT a bit on geographic information on a few occasions. It's not perfect but surprisingly good. It obviously extracts a lot of that from things like Wikipedia.
You can actually ask it to answer in GeoJSON format, paste the result into geojson.io and end up with a usable map. It seems to know about a lot of landmarks, including coordinates. For smaller venues, the coordinates are not super accurate.
You can also ask it to provide GeoJSON for the bounding box of a city or area. I even asked it to pretend to be an in-car navigation system and provide me with directions to an address. It half succeeded: it listed the correct streets but mixed up right and left. With the latest ChatGPT it just asks Bing.
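Since the coordinates can be off, it may be worth sanity-checking the GeoJSON before pasting it into a map. A minimal sketch, assuming the model returned a FeatureCollection of Point features; the function name `check_feature_collection` is made up for this example. GeoJSON stores positions as `[longitude, latitude]` and a bounding box as `[west, south, east, north]`.

```python
import json

def check_feature_collection(text: str):
    """Parse GeoJSON text; return (bbox, problems) for its Point features."""
    data = json.loads(text)
    if data.get("type") != "FeatureCollection":
        raise ValueError("expected a FeatureCollection")
    problems, lons, lats = [], [], []
    for i, feature in enumerate(data.get("features", [])):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            problems.append(f"feature {i}: not a Point")
            continue
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is lon, lat
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            problems.append(f"feature {i}: coordinates out of range")
            continue
        lons.append(lon)
        lats.append(lat)
    # Bounding box of the valid points as [west, south, east, north].
    bbox = [min(lons), min(lats), max(lons), max(lats)] if lons else None
    return bbox, problems
```

A range check like this only catches gross errors such as an impossible latitude. A point that is valid but a few streets away from the real venue still needs an external source to verify.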
The topic of this thread is intended to be a loose recreation of what was used in the OpenAI demo. I guess my assertion is that it could work quite well for travel agent type functionality, at least for the type of service an average travel agent provides. For something like a tour guide that needs to give depth it probably does fall flat quite a bit.
I can totally understand not being great for smaller landmarks... it could be the case too that it could trigger a call to an external service for things it doesn't have high confidence about.
We (the authors of the Solara web app framework) were inspired by the OpenAI keynote Wanderlust app they demoed and rebuilt it in Python using Solara.
We thought it would be a good way to showcase our framework's power and inspire others to build UIs with AI/LLM elements in them.
My dream is to have a real AI-powered travel agent that would do all the discovery stuff per my constraints and selections and then book reservations (where needed) for anything. The AI agent would fill out all the necessary online forms or make calls and collect confirmations. Then it would present me with my itinerary which I can ask it to change at any time.
These days I purposefully avoid as much travel research as possible, and intently choose to immerse in the place and moment instead. I'm happy to miss that 4.97 stars dinner place, and instead enjoy interacting with a less-than-Instagram-perfect dinner & perhaps some flesh-and-bones human company.
I'm not sure this dichotomy isn't false. I can enjoy the best restaurants a city has to offer without Instagram having anything to do with it, and research doesn't preclude me from having flesh-and-bones company. There may be a good argument for not doing research, but I don't think these are it.
Possibly. I'm alluding to the spiritual disposition that expects 4.97 stars or bust. In relationship terms this often maps to expecting the perfect dinner company. When the actual company disappointingly fails to meet that standard, they are summarily dismissed, and the whole experience turns into superficial theater of appearances.
Yeah, that's definitely a phenomenon that can be harmful, as you describe. Still, this seems to be going from one extreme to the other. I guess "everything in moderation" is good advice here too.
But how much are you willing to pay for this service?
After all, it's not like human travel agents don't exist today, just like human drivers exist today. Does having an AI book the tickets for you meaningfully change things for you, the customer?
>Does having an AI book the tickets for you meaningfully change things for you, the customer?
Of course it does. You don't think replacing humans in the full-service travel industry could drive down costs and improve capabilities and the overall experience?
For the industry, obviously. How does that affect the customer experience though? If I call up my travel agent and have them book a vacation for me, what's the difference between that and calling TravelGPT and having it book a vacation for me?
I’m curious what you think the advantages are of using an AI-only agent for this, for example compared to Anywhere.com, where human experts are the agents.
I just took a look at it. It seems to focus only on a handful of international spots, plus now it's asking me to book a 30 minute meeting with someone. I don't mind filling in preferences and whatnot, but once you introduce a human to the mix I feel like I'm at the whim and mercy of that one person with their skillset, communication ability, attitude, business motivations, etc.
You do realize though that in terms of the business model, having the AI partner with select resorts and tour guides is a much more lucrative business model for the AI tour planning company?
In the end, you'll end up with Booking.com with extra steps:
"Perfect, ugh123! I've got just the hotel for you! It's the Marriott Grand Paradise, our top partner hotel. It is only 45 miles away from the specific address you requested and just barely $150 over budget per night, but they have a 4.9 rating, and with their Top Partner status, you earn 10 BookiePoints by booking there. Hurry, 33 people have booked a room there in the last hour! Do you confirm? Whoops, one more booking just now, only one room left! Do you want to save yours now?"
Did you also use help from LLMs in coding this app instead of just coding it all by hand? Just curious.
I am finding myself reaching for GPT-4 more often to spin up POC apps.
You could fine-tune a coding assistant on your library and good quality example apps with documentation ... and that might take the ease of churning out new demos and apps to the next level?
It can be used for general purposes, but we focus on the data science landscape. However, our main website is built using Solara, and we know a few startups using Solara for their main product (public-facing, not internal).
* Solara will not continuously re-execute your script as Streamlit does.
* Solara will re-execute components instead, and only those that need it.
* State in Solara is separate from the UI components, unlike Streamlit, where they are strongly linked.
* State can be on the application level (global) for simplicity or on the component level (local) for creating reusable components.
* Solara should not block the render loop. Long-running functions should be executed in a thread using use_thread.
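To make the first few bullets concrete, here is a toy model in plain Python of state that lives outside the components and re-runs only the renderers that depend on it. This is not Solara's API: the `Reactive` class, `subscribe`, and the render functions are all invented for illustration, and real frameworks track dependencies automatically rather than through manual subscription.

```python
class Reactive:
    """A value held outside any component, with a list of dependent renderers."""

    def __init__(self, value):
        self.value = value
        self._subscribers = []

    def subscribe(self, render):
        # Register a render function that should re-run when the value changes.
        self._subscribers.append(render)

    def set(self, value):
        self.value = value
        for render in self._subscribers:
            render()  # re-execute only the components that use this state


render_log = []

# Application-level (global) state, separate from the UI code below.
count = Reactive(0)
city = Reactive("Groningen")

def render_counter():
    render_log.append(f"counter: {count.value}")

def render_city():
    render_log.append(f"city: {city.value}")

count.subscribe(render_counter)
city.subscribe(render_city)

# Initial render of both components, then a single state change.
render_counter()
render_city()
count.set(1)  # only render_counter runs again; render_city is untouched
```

In a whole-script re-execution model, the `count.set(1)` step would have re-run both renderers.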
Solara is a fantastic project! The user experience is excellent, and I can quickly get ideas up and running. Solara can potentially become the framework for shipping ML/AI apps.
It sounds cool, but I'm cautious. The Solara site, powered by Solara, is insanely slow to load, has text running off the page, and has interactivity that's either broken or slow enough that it may as well be.
Is it much more of a project for tools used by one or two end users? How have you found it?
Hm, the demo video is less enticing than I expected. What takes two full sentences, delayed responses, and two map renders in the demo takes a single keyword search in the Google Maps UI with autocomplete ("martini tower " and then selecting "martini tower groningen"). In addition, the Google Maps UI shows a lot of additional information instantly.
Not a criticism of this repo, but presentation is important in order to attract potential users.