itissid's comments | Hacker News

There was a time when, if you edited documentation in VS Code with Copilot on, it would complete internal user and project names whenever it encountered a path in some random LLM project we were building. I could find those people and their projects just by googling the username plus a few contextual keywords.

We all had a lot of laughs with tab autocomplete and waited in anticipation for whatever ridiculous stuff it would throw up next.


One interesting thing to think about: given a skill, which is just "pre-context", how can it be _evolved_ to create prompts given _my_ context? e.g. here is their web-artifacts-builder skill from the desktop app:

```
web-artifacts-builder

Suite of tools for creating elaborate, multi-component claude.ai HTML artifacts using modern frontend web technologies (React, Tailwind CSS, shadcn/ui). Use for complex artifacts requiring state management, routing, or shadcn/ui components - not for simple single-file HTML/JSX artifacts.
```

Say I want to build a landing page with some relatively static content — I don't know it yet, but it's just going to be Bootstrap CSS, no SPA/React(ish), and it'll be fine as a templated server-side thing. I just don't know how to express this in words. Could the skill _evolve_ based on what my preferences are and on what a relative novice can grok and construct?

This is a simple example, but it could extend to, say, using SQLite + Litestream instead of Postgres, or gradient-boosted trees instead of an expensive transformer-based classifier.


Isn't at least part of that GH issue something that this https://docs.boundaryml.com/guide/introduction/what-is-baml is also trying to solve? LLM inputs and outputs should be functions with defined types. That was their starting point.
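The idea can be sketched generically in plain Python (this is not BAML's actual syntax; `Resume` and `extract_resume` are made-up names for illustration): the caller only ever sees a declared input and output type, and the model's raw text reply is parsed into that type before anything else touches it.

```python
from dataclasses import dataclass

@dataclass
class Resume:
    """A declared output type; the LLM call must produce one of these."""
    name: str
    skills: list

def extract_resume(raw_text: str, llm=None) -> Resume:
    """Treat the LLM call as a typed function: prompt in, parsed Resume out.
    `llm` is any callable str -> str; a canned reply stands in when absent."""
    prompt = f"Extract 'name|skill1,skill2' from:\n{raw_text}"
    reply = llm(prompt) if llm else "Ada Lovelace|math,programming"
    name, _, skills = reply.partition("|")
    return Resume(name=name, skills=skills.split(","))
```

The point is that everything downstream depends only on `Resume`, so the prompt text becomes an implementation detail you are free to optimize.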

IIUC their most recent arc focuses on prompt optimization[0], where you can optimize — using DSPy and the GEPA optimization algorithm[1] — against relative weights on different objectives like errors, token usage, and complexity.

[0] https://docs.boundaryml.com/guide/baml-advanced/prompt-optim... [1] https://github.com/gepa-ai/gepa?tab=readme-ov-file


I gave Opus an "incorrect" research task (using this slash command[1]) in my REST server: research whether SQLite + Litestream VFS can be used to create read replicas for the REST service itself. This is obviously a dangerous use of the VFS[2], and of a system like SQLite in general (stale reads and isolation-wise speaking). Of course it happily went ahead and used Django's DB router feature, implementing `allow_relation` to return true if `obj._state.db` was the `replica` or the `default` master db.

Now, Claude had access to this[2] link, and it pulled the data into the research prompt using web-searcher. But that's not the point. Any junior worth their salt — distributed systems 101 — would know _what_ was obvious here; the failure was in not paying attention to the _right_ thing. While there are ideas on prompt optimization out there[3][4], how many tokens a model can burn thinking about these things, and coming up with an optimal prompt and corrections to it, is a very hard problem.

[1] https://github.com/humanlayer/humanlayer/blob/main/.claude/c... [2] https://litestream.io/guides/vfs/#when-to-use-the-vfs [3] https://docs.boundaryml.com/guide/baml-advanced/prompt-optim... [4] https://github.com/gepa-ai/gepa


I'm not sure a junior would immediately understand the risks of what you described. Even if they did well in dist sys 101 last year.

Really nice. We should have this as an add-on to https://app.codecrafters.io/courses/sqlite/overview It can probably teach one a lot about the value of good replication and data formats.

If you are not familiar with data systems, have a read of DDIA (Designing Data-Intensive Applications), Chapter 3 — especially the part on building a database from the ground up. It more or less starts with "What's the simplest key-value store?": `echo` (O(1) append to the end of a file, super fast) for writes and `grep` (O(n) scan, slow) for reads — and then builds up all the way to LSM-trees and B-trees. It will make a lot more sense why this preserves so many of those ideas.
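DDIA's two-line bash "database" translates almost directly; a sketch in Python (the log file location is arbitrary):

```python
import tempfile

# Append-only log, a la DDIA: db_set is `echo "$key,$value" >> database`,
# db_get is `grep "^$key," database | tail -n 1`.
DB = tempfile.NamedTemporaryFile(delete=False, suffix=".log").name

def db_set(key, value):
    with open(DB, "a") as f:        # O(1): append to the end of the log
        f.write(f"{key},{value}\n")

def db_get(key):
    value = None
    with open(DB) as f:             # O(n): scan everything, last write wins
        for line in f:
            k, _, v = line.rstrip("\n").partition(",")
            if k == key:
                value = v
    return value
```

Everything after this point in the chapter — compaction, sorted segments, indexes — exists to fix that O(n) read without giving up the fast append.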


Nicely done. I think from a product perspective it is interesting that:

- Humans really value authentic experiences, and even more so IRL experiences. People's words about a restaurant matter more to me than the star rating.

- There is only one reason to go somewhere: the 4.5-star reason. But there are ten different reasons not to go: too far, not my cuisine, too expensive for my taste. So context is what really matters.

- Small is better. Product-wise, scale is always a problem, because the needs of the product end up discriminating against a large minority. You need it to be decentralized and organic, with communities that are quirky.

All this is, somehow, anathema to Google Maps' or Yelp's algorithms. But I don't understand why it is _so_ bad — just try searching for 'salad' and be amazed at how it will recommend a white-tablecloth restaurant in the same breath as Chipotle.

There are many millions who would use the product _more_ if it were personalized. Yet somehow it's not.


> People's words about a restaurant matter more than the star rating to me.

I find that both offer an incredibly poor signal. I can usually get a much better idea of the quality of a place by looking at pictures of the food (especially the ones submitted by normal users right after their plate arrives at the table). It's more time-consuming to scroll through pictures manually than to look at the stars, but I'm convinced it's a much better way to find quality.

Maybe that could be a good angle for this kind of tool. At least until this process becomes more popular and the restaurants try to game that too by using dishonest photography.


This is also quantitatively correct: for two people coming in from afar, you might take two trains, or a bus and a train, and each ticket is at least $3.00 (bus/PATH from NJ) — that's $24 minimum both ways. With more than two, it makes even more sense to take the car.

Congestion pricing adds a toll on top of the $16 you pay through the tunnel. I think it's $18, so $34 total?

So once you have more than two people, you are incentivized to take the car. Less traffic.
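A quick sanity check of those numbers (the fares and tolls are the ones quoted above; the break-even point is my arithmetic):

```python
# Transit: per person, 2 legs each way at ~$3.00, round trip.
fare, legs = 3.00, 2
transit_per_person = fare * legs * 2          # $12.00 round trip per rider
# Car: tunnel toll plus the congestion charge, flat regardless of riders.
car_total = 16 + 18                           # $34

for people in (2, 3):
    transit_total = transit_per_person * people
    print(people, "riders:", transit_total, "transit vs", car_total, "car")
# 2 riders: $24 transit vs $34 car; 3 riders: $36 transit vs $34 car,
# so the car wins once three or more people share it.
```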


Could this record Morse-code(ish) clicks instead of speech? I can think of a number of use cases for it:

1. A distress/emergency situation that leaves you unable to speak.

2. During vipassana meditation, recording how strong the feeling attached to a thought was.

3. Repeat previous action.


Oh I love this. I am going to contextually switch the instructions from this to my home-trained, instruction-fine-tuned LLM for a multitude of things.


I have always wondered if I should be privately recording all my conversations — with consent — with family and friends, and then training an LLM to let anyone speak to someone who sounds "like me" when I am gone.

I suppose one could order all the data over time — decades — and then train a model incrementally every decade, imitating me better at each point in time.

I suppose one could also narrate thoughts and feelings associated with many transcripts, which would be very tedious but would make the LLM imitate not just style but some amount of internal monologue.

I suppose one level further could be an LLM learning about the variety or parts of the ego, the I, me, mine, ours. Then the Observer and the Observed parts of thought — if we can somehow tap internal thought without manually speaking — because thoughts are, metaphorically speaking, the speed of light.

Why would one do all this? I suppose a curt answer would be to "live" eternally, of course — with all the limitations of the current tech — but to still try.

It might make a fascinating psychoanalysis project — one with a better shot at explaining someone's _self_ not as we strangers might see it from the outside, just a series of highs and lows and nothing in between, but as how they lived through it.


You've created a text-based version of a Black Mirror episode: https://en.wikipedia.org/wiki/Be_Right_Back

