Whenever I try to tell people about the myth of the objective they look at me like I'm insane. It's not very popular to tell people that their best laid plans are actually part of the problem.
I would suspect that any next step comes with a novel implementation though, not just trying to scale the same shit to infinity.
I guess the bitter lesson is gospel now, which doesn't sit right with me now that we're past the stage where Moore's Law is relevant, but I'm not the one with a trillion dollars, so I don't matter.
This is just the new version of "works on my machine". Oh, I was able to contrive a correct answer from my prompt because the random number generator smiled upon me today.
I give myself 6-18 months before I think top-performing LLMs can do 80% of the day-to-day issues I'm assigned.
> Why doesn’t anyone acknowledge loops like this?
This is something you run into early on using LLMs and learn to sidestep. This looping is a sort of "context-rot" -- the agent has the problem statement as part of its input, and then a series of incorrect solutions.
Now what you've got is a junk-soup where the original problem is buried somewhere in the pile.
Best approach I've found is to start a fresh conversation with the original problem statement and any improvements/negative reinforcements you've gotten out of the LLM tacked on.
I typically have ChatGPT 5 Thinking, Claude 4.1 Opus, Grok 4, and Gemini 2.5 Pro all churning on the same question at once, then copy-paste relevant improvements across them.
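Concretely, the reset looks something like this -- a rough sketch assuming the OpenAI Python client; the model name, prompts, and note format are just placeholders:

    from openai import OpenAI

    client = OpenAI()

    problem = "Original problem statement, restated cleanly."
    lessons = [
        "Tried approach A; it fails because the table has no unique key.",
        "A regex won't work here; the input isn't line-oriented.",
    ]

    # Fresh conversation: original problem plus distilled notes, none of the junk-soup.
    messages = [
        {"role": "system", "content": "You are a careful coding assistant."},
        {
            "role": "user",
            "content": problem + "\n\nNotes from earlier attempts:\n- " + "\n- ".join(lessons),
        },
    ]

    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    print(resp.choices[0].message.content)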
I concur. Something to keep in mind is that it is often more robust to pull an LLM towards the right place than to push it away from the wrong place (or more specifically, the active parts of its latent space). Sidenote: also kind of true for humans.
That means that positively worded instructions ("do x") work better than negative ones ("don't do y"). The more the concepts you don't want it to use or consider show up in the context, the more they tend to pull the response towards them, even with explicit negation/'avoid' instructions.
I think this is why clearing all the crap from the context save for perhaps a summarizing negative instruction does help a lot.
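A made-up before/after of the same constraint, phrased negatively vs. positively (both prompt lines are invented for illustration):

    # Negative phrasing keeps the forbidden concepts front and center in the context.
    negative = "Don't use recursion, and don't modify the input list."

    # Positive phrasing points at the behavior you actually want.
    positive = "Use an iterative loop and return a new list; leave the input untouched."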
> positively worded instructions ("do x") work better than negative ones ("don't do y")
I've noticed this.
I saw someone on Twitter put it eloquently: something about how, just like little kids, the moment you say "DON'T DO XYZ" all they can think about is "XYZ..."
> That means that positively worded instructions ("do x") work better than negative ones ("don't do y").
In teacher school, we're told to always give kids affirmative instructions, i.e. "walk" instead of "don't run". The idea is that a negative instruction makes the child spend extra energy figuring out what they should do instead.
> This looping is a sort of "context-rot" -- the agent has the problem statement as part of it's input, and then a series of incorrect solutions.
While I agree, and also use your workaround, I think it stands to reason this shouldn't be a problem. The context has the original problem statement along with several examples of what not to do, and yet it keeps repeating those very things instead of coming up with a different solution. No human would keep retrying a solution that the context explicitly marks as not valid.
I'm sure somewhere in the current labs there are teams that are trying to figure out context pruning and compression.
In theory you should be able to get a multiplicative effect on context window size by consolidating context into its most distilled form.
30,000 tokens of wheel spinning to get the model back on track, consolidated into 500 tokens of "We tried A, and it didn't work because XYZ, so avoid A" and kept in recent context.
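Something like this hand-rolled consolidation step is what I have in mind (a sketch only, assuming the OpenAI Python client; the model name, prompt, and token budget are illustrative):

    from openai import OpenAI

    client = OpenAI()

    def distill_failed_attempts(transcript: str) -> str:
        """Compress a long run of failed attempts into short 'avoid this' notes."""
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {
                    "role": "system",
                    "content": (
                        "Summarize the failed approaches below as terse bullets of the form "
                        "'We tried A; it failed because XYZ; avoid A'. Stay under ~500 tokens."
                    ),
                },
                {"role": "user", "content": transcript},
            ],
        )
        return resp.choices[0].message.content

    # The distilled note, plus the original problem statement, replaces the
    # 30,000 tokens of wheel spinning in the next request's context.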
I agree it shouldn't be a problem, but if you don't regularly run into humans who insist on trying solutions clearly signposted as wrong or not valid, you're far luckier than I am.
> I give myself 6-18 months before I think top-performing LLMs can do 80% of the day-to-day issues I'm assigned.
This is going to age like "full self-driving cars in 5 years". Yeah, it'll gain capabilities, maybe it does do 80% of the work, but it still can't really drive itself, so it ultimately won't replace you like people are predicting. The money train assures that AGI/FSD will always be 6-18 months away, despite no clear path to solving glaring, perennial problems like the ones the article points out.
> The money train assures that AGI/FSD will always be 6-18 months away
I vividly remember when some folks from Microsoft came to my school to give a talk at some Computer Science event and proclaimed that yep, we have working AGI, the only limiting factor is hardware, but that should be resolved in about ten years.
Birth control uses ethinyl estradiol which, despite the name, is not actually estradiol and so does not undergo the same metabolic pathways or produce the same metabolites.
I know this because I recently had to source exogenous estradiol for my wife after making this same mistaken assumption and being surprised by the bloodwork and lack of improvement.