Show HN: AI companions stack – create and host your own AI companions (github.com/a16z-infra)
112 points by ykhli on July 12, 2023 | 41 comments



It's not "host your own AI" at all! You need something like five SaaS API keys to operate this app. But sure, you can host the UI on your own.
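For a sense of scale, the .env you end up assembling looks something like this (the variable names here are illustrative placeholders, not copied from the repo; check its README for the real ones):

    # Illustrative .env -- names are placeholders, not the repo's exact keys
    OPENAI_API_KEY=...           # chat completions
    REPLICATE_API_TOKEN=...      # hosted open-source models
    PINECONE_API_KEY=...         # vector store for companion "memory"
    CLERK_SECRET_KEY=...         # auth
    UPSTASH_REDIS_REST_TOKEN=... # chat history / rate limiting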


I think we’ll be in platform-reliance mode for quite a while. I think of all of these SaaS companies as different icons on an abstract AWS console that doesn’t exist yet.

We wouldn’t bat an eye at using S3, EC2, and RDS as a "host your own" setup. The only difference here is that startups are moving faster than incumbents.

FWIW that’s one reason why Steamship (disclaimer: I’m the founder) aggregates all AI services under a single API key and interface. It’s to deal with the insane glue-code hassle of running this stuff on your own.


That’s fine, but the title is wrong.


And I would argue that using S3, RDS, and EC2 is hosting your own, since you're building on foundational components. Same if you run LLaMA on a rented A100.


Not the same at this point. Setting up properly secured AWS infrastructure is arguably more demanding than setting up your own box at home.


Well, it is a round-up of their venture investments in the space ;-)

If you're looking to self-host chat memory rather than go all in on Supabase, there's Zep: https://github.com/getzep/zep

Full disclosure: I'm a co-author.


Supabase is also self-hostable though.


Also worth noting that they aren’t investors in Supabase (but we’re very appreciative that they included us in this stack).


I stand corrected. Sometimes my cynicism gets the best of me ;-)


It doesn't seem to be a major barrier for most (it absolutely is for me). Is there enough of a market for tools that integrate with common OSS LLM APIs like oobabooga/Kobold/Novel/vLLM, etc.? One idea I've been toying with is a BYO-model tool with support for these APIs, e.g. for IDE integration, brainstorming, etc. It seems like a good approach to me, but how many people would actually bother standing up these LLM engines with APIs to use it?
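As a rough sketch of the integration layer, assuming the backend exposes an OpenAI-compatible HTTP endpoint (vLLM and oobabooga's API extension can both do this; the port and model name below are made up for illustration):

    import requests

    BASE_URL = "http://localhost:8000/v1"  # point at whatever engine you stood up

    def complete(prompt: str) -> str:
        resp = requests.post(
            f"{BASE_URL}/completions",
            json={"model": "local-model", "prompt": prompt, "max_tokens": 256},
            timeout=60,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["text"]

    print(complete("Refactor this function to be tail-recursive:"))

The nice part is the tool only needs a configurable base URL, so swapping engines is a config change rather than a code change.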


It makes me wonder how long it will be until open-source, locally-run AI chatbots reach the GPT-4 level. Five years? Ten?


The real limiter is having enough GPU RAM to train and run a usefully large model. Everything else is just fiddly details.


Not anymore. Llama.cpp and kobold.cpp run well with low VRAM and on non-Nvidia GPUs.

Hugging Face is full of finetunes now, and I believe a 33B model can be finetuned on a single 3090.

Llama.cpp is developing some kind of training support, but I have no idea what the requirements will be.
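If it helps anyone, partial GPU offload is basically a one-liner with the llama-cpp-python bindings (the model path and layer count below are illustrative; tune n_gpu_layers to your VRAM):

    # pip install llama-cpp-python (built with cuBLAS/ROCm for GPU offload)
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/llama-13b.ggmlv3.q4_0.bin",  # any quantized GGML file
        n_gpu_layers=20,  # offload whatever fits in VRAM; 0 = pure CPU
        n_ctx=2048,       # context window
    )
    out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
    print(out["choices"][0]["text"])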


To me, GPT's real "secret sauce" is how it primes the model for interactive prompting instead of just text completion.


There are "chat" and "instruct" variants of most models now.


It depends on the task; I’d argue some of the open-source alternatives are at the GPT-4 level already, particularly for code generation.

That said, I suspect summarization, translation, etc. will take more time. I’d guess under a year.


>I’d argue some of the open-source alternatives are at the GPT-4 level already, particularly for code generation.

Like which one?


Large-context Chronos 33B (and some mixes) for roleplaying-type chat.

And there are some very new 65B finetunes I haven't tried yet.


The open-source bot I played with last week (Wizard Vicuna Uncensored) was at 85% of GPT-3.5 on a VERY hard use case (fiction stories).

Maybe a year before we're at the level of Stable Diffusion?


When it happens, I think it’ll happen on our phones first.


Given the matrix operations the hardware fundamentally has to do, that seems unlikely, unless it's something very scaled down.


It's probably more a question of hardware progress than software. There's a minimum scale for these behaviors to emerge.


I hosted my own AI using SillyTavern plus a text-generation backend (e.g. oobabooga / KoboldAI); you just need enough horsepower to crank the LLM.


Koboldcpp (with ngrok if you need it) is another excellent self-hosting solution.

A 13B model will work on 16 GB of RAM, and a 33B on 32 GB, with pretty much any dGPU for a little acceleration and RAM offloading.

Doubly so if you host it as an AI Horde node (so you have priority access to many models through the web browser).
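Those numbers track the back-of-the-envelope math for 4-bit quantization, roughly half a byte per parameter plus overhead (a crude estimate; the KV cache also grows with context length):

    # Crude RAM estimate for a 4-bit quantized model.
    def est_ram_gb(params_billion: float, bytes_per_param: float = 0.5) -> float:
        return params_billion * bytes_per_param + 2.0  # ~2 GB overhead guess

    for size in (13, 33):
        print(f"{size}B ~ {est_ram_gb(size):.1f} GB")  # 13B ~ 8.5, 33B ~ 18.5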


I don't get why I wouldn't just keep an open ChatGPT tab for specific topics. They'll have big enough context windows very soon. Why build another UI, deployment pipelines, and all that jazz?

P.S. Nobody will *SMS* the companion.


For me, it would be cost and alignment. If I own the software, I can choose whatever alignment suits me, or none at all. And ChatGPT is $20/month (assuming you want GPT-4, and I do).

But there's still a good argument for a hybrid solution: buy GPT-4 access through the API and get a native UI to query it. Paying as you go is much cheaper, and someone else is still handling the heavy lifting. But if you want an uncensored model, you're out of luck.
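The pay-as-you-go version really is just a few lines against the REST API (model name as of mid-2023; you pay per token instead of the flat $20):

    import os
    import requests

    def ask_gpt4(prompt: str) -> str:
        resp = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={"model": "gpt-4",
                  "messages": [{"role": "user", "content": prompt}]},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    print(ask_gpt4("Explain pay-as-you-go pricing in one sentence."))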


ChatGPT is amazing but ultimately unwieldy for directed, long-running relationships more nuanced than generic themed chit-chat.

We’re in the nascent stages, but I think there will probably always be a community of folks who want to add more nuance to the communication, whether it’s reveries that enact a mood or goal, tie-ins to other services, etc.

E.g., imagine wanting your ChatGPT D&D master to also keep some kind of score. It may ultimately be easiest to put a wrapper around a themed GPT window that imposes a predictable way to do that, rather than requiring everyone to figure out how to prompt it correctly.
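A minimal sketch of that kind of wrapper (the STATE convention here is my own invention, not anything GPT enforces): tell the model to append machine-readable state after each reply, then strip and parse it before display.

    import json
    import re

    # System prompt you'd send along with each request.
    SYSTEM = ('You are a D&D dungeon master. End every reply with one line like '
              'STATE: {"score": <int>, "hp": <int>} and nothing after it.')

    def split_state(reply: str):
        """Separate the prose from the trailing STATE: JSON line."""
        m = re.search(r'STATE:\s*(\{.*\})\s*$', reply, re.DOTALL)
        if not m:
            return reply, None  # model didn't comply; fall back to raw text
        return reply[:m.start()].rstrip(), json.loads(m.group(1))

    text, state = split_state('You slay the goblin!\nSTATE: {"score": 120, "hp": 9}')
    print(text)   # You slay the goblin!
    print(state)  # {'score': 120, 'hp': 9}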


But here they created a wrapper web page for a web page.


It's mostly horny teenagers, tweens, and 4chan degenerates. They don't need a good product; they just need an uncensored one.


It's roughly comparable to how you can use a spreadsheet to do a lot of different things, but it improves the UX (with some trade-offs) to have more custom-designed UIs instead of using a spreadsheet directly.


As someone working in a large non-software org, I can say that statement does not hold. Spreadsheets are the underlying infrastructure for a mildly terrifying amount of modern civilization.


Privacy is a big one. ChatGPT (the website) gives OpenAI the right to use your conversations for model training (unless you turn conversation history off, but that feature is rather important to the UX).

Anything going through the API, on the other hand, comes with a commitment not to use it for training and to purge the history after a month.


Tangentially related: how much would you be willing to pay a company that could deliver a local model implementation running on high-tier consumer-grade hardware at reduced ability? I feel like even if there were some severe restrictions (model not open source, DRM, etc.), I'd still be willing to fork out for it.


I've seen at least one startup around this idea posted on HN: https://codecomplete.ai/

Probably also tiny corp for self hosting: https://geohot.github.io/blog/jekyll/update/2023/05/24/the-t...

I think there's tremendous concern among business execs about letting data exfiltrate to any AI SaaS offering. If you could offer an experience minus the cloud that's easy to use and has compliance/logging features, I think you'd find success in industries that are reticent to share customer data (e.g., banking, government) but would benefit greatly from the NLP workflows that LLMs enable.


MLC Chat lets you run RedPajama 3B on an iPhone. It's free. You'd need to specify what type of model you're thinking about, I guess?


Is this the promise of Amazon SageMaker?


I love how honest the README is:

> Shortcomings

> Oh, there are so many.


Good job Yoko!


Is this a16z associated with the VC a16z?


Yeah. https://twitter.com/stuffyokodraws/status/167879943947197235...

Smart. A VC firm that heavily invests in software platforms will make much better decisions if they have first-hand experience using the products.


You mean Kaggle, don't you?



