jarulraj's comments

It is based on feedback shared by an undergraduate student, so it's pretty subjective. For instance, it is possible to graduate with a CS degree without even taking an introductory course on systems.


It's also possible to work for a decade in the industry without even knowing what you're referring to.


It is currently not possible to get rigorous summaries of paper chunks using GPT-4.


Wouldn't more semantically related neighbors be retrieved by just increasing K?


Potentially, yes! The scenario I am imagining is that a query with context A yields results 1, 2, and 3. Sometimes, retrieving the neighbors of result 1 (i.e., items not necessarily in the top K w.r.t. A) might work better than simply extending the list to results 4, 5, and 6.
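
To make the distinction concrete, here is a minimal sketch using plain NumPy cosine similarity; the random embeddings and the choice to expand around only the single top hit are assumptions for illustration:

    import numpy as np

    def top_k(query_vec, corpus, k):
        # Return indices of the k most cosine-similar corpus vectors.
        corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
        q = query_vec / np.linalg.norm(query_vec)
        return np.argsort(-(corpus_n @ q))[:k]

    # Toy embedding corpus and query (in practice these come from an encoder).
    corpus = np.random.rand(1000, 384)
    query = np.random.rand(384)

    # Strategy 1: just widen the search to a larger K.
    wide_hits = top_k(query, corpus, k=6)

    # Strategy 2: take the top hit for the query, then expand around *it*.
    # This can surface items related to result 1 that never appear in the
    # query's own top-K list. (Index 0 below is result 1 itself.)
    first_hit = top_k(query, corpus, k=3)[0]
    neighbor_hits = top_k(corpus[first_hit], corpus, k=4)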


Cool project! Can you elaborate on "scalable" AI deployment? We are exploring the data + AI space in EvaDB and would love to exchange notes [1].

[1] https://github.com/georgia-tech-db/evadb


Great! The "scalable" AI deployment in SuperDuperDB is offered via something called `Listeners`, used in conjunction with SuperDuperDB's native Kubernetes (k8s) support.

Lots of cool features are in the pipeline.

Please join the free official webinar we are hosting with MongoDB about how SuperDuperDB can enable and manage vector search with Atlas:

https://www.eventbrite.com/e/enable-and-manage-vector-search...

Thanks


Very cool project, @MatthausK!

What are your thoughts on reducing LLM cost?

We are also exploring LLM-based data wrangling using EvaDB and cost is an important concern [1, 2, 3].

[1] https://github.com/georgia-tech-db/evadb

[2] https://medium.com/evadb-blog/stargazers-reloaded-llm-powere...

[3] https://github.com/pchunduri6/stargazers-reloaded


Nice :)

What were the interesting problems you faced in processing the survey data?

If possible, can you share the prompt?


Yes, there is definitely a human-in-the-loop element here.

It would be great if you could share an example of the inconsistent output problem -- we also faced it. GPT-4 was much better than GPT-3.5 in output quality.


Great question! We iterated on the prompt for several days and manually verified the results for ~100 users.

The results were pretty good: https://gist.github.com/gaurav274/506337fa51f4df192de78d1280...

Another interesting aspect was the money spent on LLMs. We could have directly used GPT-4 to generate the "golden" table; however, it's a bit expensive, costing $60 to process the information of 1,000 users. To maintain accuracy while reducing costs significantly, we set up an LLM model cascade in the EvaDB query, running GPT-3.5 before GPT-4, leading to an 11x cost reduction ($5.5). A rough sketch of the cascade pattern follows the query links below.

Query 1: https://github.com/pchunduri6/stargazers-reloaded/blob/228e8...

Query 2: https://github.com/pchunduri6/stargazers-reloaded/blob/228e8...
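
For readers who just want the shape of the idea, here is a minimal sketch of the cascade pattern; `call_llm` and the confidence-based escalation rule are hypothetical stand-ins, not the actual EvaDB implementation in the queries linked above:

    import random

    def call_llm(model: str, prompt: str) -> tuple[str, float]:
        # Hypothetical stand-in for an LLM call; returns (answer, confidence).
        # In the real pipeline this is the GPT-3.5 / GPT-4 call issued by the EvaDB query.
        return f"answer from {model}", random.random()

    def cascade(prompt: str, threshold: float = 0.8) -> str:
        # Run the cheaper model on every row first.
        answer, confidence = call_llm("gpt-3.5-turbo", prompt)
        if confidence >= threshold:
            return answer
        # Escalate to the more expensive model only for low-confidence rows;
        # skipping GPT-4 on most rows is where the ~11x cost saving comes from.
        answer, _ = call_llm("gpt-4", prompt)
        return answer

    print(cascade("Extract the user's country from this GitHub profile text: ..."))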


Did you compile accuracy, F1 numbers, or anything like that? Do you have quantitative comparisons of results you got w/ different models?


As we do not have ground truth, we only qualitatively checked for accuracy -- no quantitative metrics. We did find a significant drop in accuracy with GPT-3.5 as opposed to GPT-4.

Are you measuring accuracy with data wrangling prompts? Would love to learn more about that.


Everything I do now is classification, and AUC-ROC is my metric. For your problem, my first thought is a simple up/down accuracy metric, but the tricky problem you might hit is "do you accept both 'United States' and 'USA' as a correct answer?", and the trouble of dealing with that is one reason I stick to classification problems.

I'm skeptical of any claim that "A works better than B" without some numbers to back it up.
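
A minimal sketch of that kind of metric, with a hand-rolled alias table for the 'United States' vs. 'USA' issue; the alias map and the toy rows are assumptions for illustration:

    # Hypothetical alias map; in practice this needs curation per field.
    ALIASES = {
        "usa": "united states",
        "u.s.": "united states",
        "uk": "united kingdom",
    }

    def normalize(value: str) -> str:
        v = value.strip().lower()
        return ALIASES.get(v, v)

    def accuracy(predictions: list[str], golden: list[str]) -> float:
        matches = sum(normalize(p) == normalize(g) for p, g in zip(predictions, golden))
        return matches / len(golden)

    # Toy example: LLM-extracted countries vs. a hand-labeled sample.
    preds = ["USA", "India", "United Kingdom"]
    gold = ["United States", "India", "UK"]
    print(accuracy(preds, gold))  # 1.0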


Thanks for your kind words, @skeptrune! It was certainly a fun project.

We found some interesting insights. In the Langchain community, ~40% of the stargazers are from India. In the GPT4All community, we found that web developers love open-source LLMs -- even more so than machine learning folks :)

Curious if there is any reason why you would not use it?


Thanks for your kind words, @treebeard5440! :)

