Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools

lysecret · on Dec 10, 2023

I see many of my friends building some kind of RAG system with chat interface.

I have been building some stuff on top of the OpenAI interface (to use their store) but find myself wanting to implement some simple UI elements (like a date selector or a simple dashboard).

So I feel like these types of apps have a few recurring elements:

1. A chat interface "frontend" (with threads, interfaces to popular APIs or local models): a nice UI, ideally extensible with custom UI elements, authentication, etc.

2. API calls (e.g. like OpenAI actions). The simplest case is just reading and writing to a DB (simple CRUD).

3. Local data + RAG, with custom retrieval/search logic that could be embeddings or simpler search methods.

Do you know open source software for all three elements? Of course you can piece it together and maybe this is the best approach. But maybe you could build something integrated.

d7y · on Dec 10, 2023

I love the idea of allowing anyone to build apps or experiences on top of some of these common elements.

We have briefly discussed an approach where we make some of these common elements available as abstractions and let people build "apps" on top of it. It would operate kind of similar to how Google's app store does in that the head use cases (email, photos, camera, etc) are first-party apps, but then anyone can build and publish a third-party app using the Android SDK.

dluc · on Dec 10, 2023

about #3 I’ll recommend https://github.com/microsoft/kernel-memory :)

caseyf7 · on Dec 10, 2023

How do you get value from chatting with documents? I can scan and read a pdf faster than I can chat with an AI about it. There must be more to it than I realize.

capableweb · on Dec 10, 2023

> There must be more to it than I realize.

PDF material comes with different information density. If you have a loose collection of 100 manuals, and you need to find a snippet of information that could be in 10 different ones, I'm guessing something like this can help you navigate and locate what you need.

gosub100 · on Dec 11, 2023

That would be a great litmus test for these programs. Dump gigabytes of manuals and ask "how many pins does the 74LS04 have?" "What size bolts hold the oil pan on a 73 Porsche?"

freedomben · on Dec 10, 2023

A one-page PDF, sure. But if it's a 500-page PDF of a law and/or regulation, then definitely not.

saberience · on Dec 10, 2023

How can you chat with a pdf which doesn’t fit in the context window? I mean with a 500 page pdf you might need 100 context windows to fully grok it.

Basically it makes no sense to “chat” with a 500 page pdf with today's LLMs.

htsh · on Dec 10, 2023

That is what the RAG system does. The PDF is chunked and thrown into a vector store. And then when prompted, only the relevant bits are retrieved and stuffed into the context and sent to the LLM.

So yeah it's kinda smoke and mirrors. In some cases, for some long PDFs, it works really well. If it's a 500 page PDF with many disparate topics, it may do fine.
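
The chunk, retrieve, and stuff flow described in this comment can be sketched roughly as follows. This is a minimal illustration, not how any particular tool implements it: the bag-of-words `embed` function is a toy stand-in for a real embedding model, a plain Python list stands in for the vector store, and all function names and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy 'embedding': word counts (assumption: a real system uses a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=2):
    """Stuff only the retrieved chunks into the prompt sent to the LLM."""
    context = "\n---\n".join(retrieve(question, chunks, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Only the `top_k` retrieved chunks reach the model, which is why a document far larger than the context window can still be queried, and also why answers that depend on many scattered passages can be missed.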

freedomben · on Dec 10, 2023

Indeed. Would only add, context windows are continually multiplying in size. Who knows how long Moore's Law will apply here, but it's a continually improving window.

saberience · on Dec 10, 2023

I've found that the longer context windows don't seem to be a linear improvement in responses though. It's like the longer the context window, the quality of the response is perhaps broader, but less sharp or accurate. I've been using GPT4-turbo with the longer context window for coding tasks but it doesn't seem to have improved the responses as much as you would think, it seems to be more "distracted" now, which perhaps makes some intuitive sense.

I can give gpt4-turbo many full code files to try and solve a complex coding task but despite the larger window it seems to fail more often or ignore parts of the context window or just doesn't really answer the question.

saberience · on Dec 10, 2023

That assumes that only one part of the PDF, which fits in the context window, is relevant to the prompt, which seems like a fairly big assumption.

bravura · on Dec 10, 2023

You can screenshot the first page and use gpt vision

arthurcolle · on Dec 10, 2023

You could just load up the doc, take the first 1024 tokens, and almost always get the right authors/title/year, etc., assuming it's there.

But going further, for large bills you might need (|n|..|m|) pages to capture full index

for research papers you also want to look at last (|n2|..|m2|) pages for bibliography, etc..

abraae · on Dec 10, 2023

Responding to RFPs springs to mind. You know you have already answered the question in some previous response, but it's a nightmare to lay your hands on it.

jrpt · on Dec 10, 2023

I run https://Docalysis.com/ and there are a few use cases. The first is getting information out of various reports and papers by chatting with them, which is faster than reading an entire document. Another is automating data extraction out of files, which is part of many business processes.

qingcharles · on Dec 10, 2023

I use Claude 2.1 to create summaries and TOCs of the magazines on my magazine encyclopedia. There is no way I could do that by hand for several million magazines averaging 100 pages each.

naiv · on Dec 10, 2023

Also if they are in Spanish?

katrinarodri · on Dec 10, 2023

You mean the ability to translate PDFs into English?

netcraft · on Dec 10, 2023

what if you want to extract certain information from 100 pdfs?

abnry · on Dec 10, 2023

Is there a good ML tool for renaming PDFs? There are some tools out there but they assume a journal format.

Reptur · on Dec 10, 2023

I tend to just select all the files and then paste their filenames into a Chat AI and tell it to write mv commands to rename them however I need. May be more complicated if you need it to open the PDF to get content for the rename though.

d7y · on Dec 10, 2023

> renaming PDFs

Sorry, I didn't understand. Why do you need an ML tool for renaming PDFs? Or did you mean rephrasing or rewriting in a different format?

mattsan · on Dec 10, 2023

Maybe they mean walking a file tree and cleaning up file names to match the titles on the covers of them more accurately?

abnry · on Dec 10, 2023

Exactly. You download something from arxiv and the filename has no meaningful content in it. Generally speaking, you want the filename to be descriptive in some way, extracting the title of the document is a good start.
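
A rough sketch of the renaming step, assuming the title has already been extracted elsewhere (for example from the first page of text or by an LLM, as suggested in this thread). The helper names `title_to_filename` and `plan_renames` are hypothetical, and the sketch only plans the renames rather than touching the filesystem.

```python
import re
import unicodedata
from pathlib import PurePosixPath


def title_to_filename(title, max_len=80, ext=".pdf"):
    """Turn a document title into a safe, descriptive filename."""
    # Strip accents so the name is plain ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", ascii_title).strip("_").lower()
    slug = slug[:max_len].rstrip("_")
    return (slug or "untitled") + ext


def plan_renames(titles_by_path):
    """Map each existing path to its new path in the same directory."""
    plan = []
    for path, title in titles_by_path.items():
        old = PurePosixPath(path)
        plan.append((str(old), str(old.with_name(title_to_filename(title)))))
    return plan
```

Reviewing the returned (old, new) pairs before applying them keeps a bad title extraction from silently clobbering files.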

bravura · on Dec 10, 2023

You can just import them all into Paperpile, which has good ways of inferring metadata like title, author, year, etc., and then connect it to your Google Drive, which will download them with nice filenames.

sunnysogra · on Dec 13, 2023

Nice tool. However, these days, people are looking for something that is more feature-rich and can perform various tasks. Building a RAG system with a chat interface is common. There are many ChatPDF alternatives available that work great. Here you can check out one popular ChatPDF alternative that is capable of doing much more than ChatPDF: https://www.thesamur.ai/chatpdf-alternative

Keep building!

teruakohatu · on Dec 10, 2023

Can a directory of PDFs be queried or does it only support a single document?

d7y · on Dec 16, 2023

Hi there!

We just added this in the latest release (v0.0.2). You can now create a document collection and upload as many PDFs into it as needed. The documents are processed in the background and once processing finishes, you can create as many chats with it as needed.

Quick demo: https://youtu.be/PwvfVx8VCoY

Installation instructions: https://github.com/SecureAI-Tools/SecureAI-Tools?tab=readme-...

Please try it out, and let me know how it goes. We're always looking to improve the tool so let us know if you have any feedback for us :)

d7y · on Dec 10, 2023

Right now it supports selecting & uploading a _few_ PDFs on chat-creation. Those PDFs get indexed online -- i.e. while the user waits. So it doesn't scale well with the number of PDFs selected in a chat because you'd have to wait that long before the chat responds with your initial question/prompt.

We plan to make this indexing process offline, where you can create a document collection based on either a directory upload or an integrated data source like Google Drive, Notion, Confluence, etc. Then the system would start indexing that collection in the background and notify you once indexing is complete. Once a collection is indexed, users can select it when creating a new chat and query against it.

Let us know if you have any thoughts on this proposed solution.
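
The proposed offline flow (create a collection, index it in the background, query once it is ready) could look roughly like the sketch below. This is an illustration under assumptions, not the project's implementation: the `Collection` class, the status strings, and the trivial word-splitting "index" are all hypothetical placeholders, and a `threading.Event` stands in for the notification.

```python
import queue
import threading


class Collection:
    """A named set of (doc_id, text) pairs plus its indexing state."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = list(documents)
        self.index = {}
        self.status = "pending"
        self.done = threading.Event()  # set once indexing finishes


def index_worker(jobs):
    """Pull collections off the queue and index them one at a time."""
    while True:
        collection = jobs.get()
        if collection is None:  # sentinel: shut the worker down
            jobs.task_done()
            break
        collection.status = "indexing"
        for doc_id, text in collection.documents:
            # Placeholder for real chunking and embedding.
            collection.index[doc_id] = text.lower().split()
        collection.status = "indexed"
        collection.done.set()  # the "notify once complete" step
        jobs.task_done()


def start_indexer():
    """Start the background worker and return its job queue and thread."""
    jobs = queue.Queue()
    worker = threading.Thread(target=index_worker, args=(jobs,), daemon=True)
    worker.start()
    return jobs, worker
```

A caller would put a `Collection` on the queue right after upload and only allow new chats against it once `status` is "indexed", so nobody waits on indexing inside a chat.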

cheema33 · on Dec 9, 2023

I am very new to this field, so forgive my ignorance when I ask super basic questions.

1. Does chat-with-pdfs function work with scanned PDFs?

2. In the video example for chat-with-pdfs you show uploading a document interactively. The part of processing is quite slow. Can the tool be fed these documents offline as well?

My use case: each user (there are many) has their own list of documents that they upload. They come back later, after the LLM has had a chance to process all the documents, and can ask questions about their own documents only.

Is something like this possible?

d7y · on Dec 9, 2023

Great questions.

> 1. Does chat-with-pdfs function work with scanned PDFs?

Not yet. We don't do OCR or anything to extract text from images yet. But that would be an awesome feature, so we would love to add it in the future.

> 2. In the video example for chat-with-pdfs you show uploading a document interactively. The part of processing is quite slow. Can the tool be fed these documents offline as well?

Not as of right now. But we do have plans to make that an offline/background job so that we can feed a larger corpus of documents into it and query against it later.

smeej · on Dec 10, 2023

If I've already run OCR on my PDFs and that's added now as an invisible layer, would it work then?

I've had a workflow digitizing my incoming paper documents, running OCR, and tagging them, all locally, and it would be great to have an easy front-end to talk to them.

d7y · on Dec 10, 2023

I haven't tried this myself, but I think it should work. It would be worth trying at least, so I highly encourage you to play with it, and file issues if you find any issues with it.

lhuser123 · on Dec 10, 2023

I haven't found an OCR tool reliable enough when it comes to scanned PDFs containing financial data where accuracy of amounts in the document is very important.

catlifeonmars · on Dec 10, 2023

People spend an inordinate amount of time and money solving this problem rather than spending the same amount of money in lobbying and standardization efforts for financial institutions. I’ll throw this out there: when all you know is a hammer, everything looks like a nail.

rolisz · on Dec 10, 2023

Have you tried Azure Document Intelligence? I've had very good results with it.

hifreq · on Dec 10, 2023

I couldn't find this info in the readme... does this tool anonymize ChatGPT requests? What does it mean that it's a private and secure tool in the context of using ChatGPT?

glerk · on Dec 10, 2023

My understanding is that this doesn't use ChatGPT at all for the "private and secure" case. It is running LLM models locally using ollama and just provides a ChatGPT-like interface.

d7y · on Dec 10, 2023

It is secure because it allows you to fully customize where to process the data (i.e. LLM inference), where to store it, and data-retention policies, etc. You can choose to use a locally running LLM (like it does in my second video) or use a secure third-party service provider like Azure OpenAI.

For example, if you want GDPR compliance, then you can choose Azure OpenAI running in the EU region. For HIPAA compliance, you should choose a service provider that provides the Business Associate Agreement (BAA). You can even run it in air-gapped facilities (like GitLab's offline mode [1]). In all of these cases, you can always run an Ollama-like inference service on your infra and point SecureAI Tools to it.

[1]: https://docs.gitlab.com/ee/topics/offline/
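
To illustrate what pointing a client at a self-hosted inference service might involve, here is a minimal sketch that targets Ollama's HTTP `/api/generate` endpoint on its default port. How SecureAI Tools itself talks to the inference service is not shown in this thread, so treat the helper names, the `llama2` model name, and the base URL as assumptions for illustration only.

```python
import json
import urllib.request


def build_generate_request(prompt, model="llama2", base_url="http://localhost:11434"):
    """Build a non-streaming generate request for an Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text.

    Requires a reachable Ollama server; nothing leaves the host named
    in base_url.
    """
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

Because the base URL is the only thing that changes, the same client code can target a laptop, an on-prem server, or a host inside an air-gapped network.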

willi59549879 · on Dec 10, 2023

it runs an llm in a docker container. doesn't send any requests to chatgpt

hifreq · on Dec 10, 2023

The demo video is using ChatGPT.

foxhop · on Dec 10, 2023

https://github.com/russellballestrini/flask-socketio-llm-com...

I'm building a similar app, but it uses Python/Socket.IO.

jart · on Dec 10, 2023

I'm not able to use this locally on Alpine Linux because ollama needs glibc.

rolisz · on Dec 10, 2023

Can you get it running using Cosmopolitan? :D

yreg · on Dec 10, 2023

I'm not able to use this on TempleOS, please fix.

jart · on Dec 10, 2023

Yes but they claim it runs on Linux when it runs on Systemd.

number6 · on Dec 10, 2023

What you guys are referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux plus Systemd. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities, and vital system components comprising a full OS as defined by POSIX.

Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called "Linux", and many of its users are not aware that it is basically the GNU system, developed by the GNU Project, and now increasingly integrated with Systemd, the init system that brings everything together - or apart, depending on who you ask.

There really is a Linux, and these people are using it, but it is just a part of the system they use. Linux is the kernel: the program in the system that allocates the machine's resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Linux is normally used in combination with the GNU operating system and now, almost inseparably, with Systemd: the whole system is basically GNU with Linux and Systemd added, or GNU/Linux/Systemd. All the so-called "Linux" distributions are really distributions of GNU/Linux/Systemd.

suslik · on Dec 10, 2023

Cutting off "I’d just like to interject for a moment" from this pasta is heresy.

number6 · on Dec 10, 2023

I botched it...

testernews · on Dec 10, 2023

this doesn’t work on Nix, pls fix

stuzenz · on Dec 10, 2023

I also use NixOS, but if the target was not NixOS, I don't think you should be asking the author to fix the setup. It just doesn't sound right - or maybe it is just me. NixOS isn't the de facto standard, and it breaks the Linux FHS to achieve all the good stuff it does do.

Either try to package it, use a Docker image, or raise an issue noting the blocker and request, as a feature, changes that would make it easier to build on NixOS.

Apart from that, as expected, the docker image that is produced following the instructions is working fine with NixOS as host. All it needed for the build was the openssl packaged on the host.

hruzgar · on Dec 10, 2023

You guys really didn't get the joke, right? <-)

testernews · on Dec 11, 2023

lol I thought we were all joking on this thread???

mikeytown2 · on Dec 10, 2023

What LLM is best for giving it your entire code base and database structure and then using it to help with your project?

oaththrowaway · on Dec 10, 2023

Maybe Refact? They have a self hosted version that can index your repos. They have their own LLM or you can use others. Nice tooling for VS Code as well

vunderba · on Dec 10, 2023

Great job. This is a relatively crowded area, particularly RAG style chat systems. It might be nice for SecureAI to call out what makes their product different from other open source players in the same space, specifically Khoj and Danswer, both of which allow you to chat with your documents, offer network authentication, and allow you to plug in your own LLM.

Danswer

https://github.com/danswer-ai/danswer

Khoj

https://github.com/khoj-ai/khoj

d7y · on Dec 10, 2023

A great question.

We are trying to build a single platform for all the AI tool needs. Chat-with-LLM and chat-with-documents are just a couple of apps or experiences that we have started with, but we have ambitious goals. In future, we would love to provide an SDK that exposes common abstractions and lets everyone build apps/experiences for the long tail of use cases.

SubiculumCode · on Dec 10, 2023

Do you happen to know of some that have been integrated into a slack chatbot?

vunderba · on Dec 10, 2023

I've definitely seen a few that can do this, they're mainly positioned as automatic assistance for technical support.

Danswer has a slack connector, so it might be what you're looking for.
