Hacker News | mccoyb's comments

> Julia's "secret sauce", the dynamic type system and method dispatch that endows it with its powers of composability, will never be a feature of languages such as Fortran. The tradeoff is a more complex compilation process and the necessity to have part of the Julia runtime available during execution.

> The main limitation is the prohibition of dynamic dispatch. This is a key feature of Julia, where methods can be selected at run time based on the types of function arguments encountered. The consequence is that most public packages don't work, as they may contain at least some instances of dynamic dispatch in contexts that are not performance-critical. Some of these packages can and will be rewritten so that they can be used in standalone binaries, but, in others, the dynamic dispatch is a necessary or desirable feature, so they will never be suitable for static compilation.

The problem (which the author didn't focus on, but which I believe to be the case) that Julia willingly foisted upon itself in the pursuit of maximum performance is _invoking the compiler at runtime_ to specialize methods when type information is finally known.

Method dispatch can be done statically. For instance, what if I can't figure out which method to call via abstract interpretation? Well, use a bunch of branches. Okay, you say, but that's garbage for performance ... well, raise a compiler error or warning like JET.jl does, so someone knows that it is garbage for performance.
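
Roughly what I have in mind, as a sketch (toy methods, nothing from a real package):

    # two method instances, both compiled ahead of time
    process(x::Int) = x + 1
    process(x::Float64) = x * 2.0

    # when inference can't pin down typeof(x): branch on the runtime tag
    # and call into the pre-compiled instances -- no runtime compiler needed
    function process_static(x)
        if x isa Int
            process(x::Int)
        elseif x isa Float64
            process(x::Float64)
        else
            error("unsupported type")  # or surface a JET.jl-style warning instead
        end
    end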

Now, my read on this work is the noble goal of prying a different, more static version of Julia free from this compiler design decision.

But I think at the heart of this is an infrastructural problem ... does one really need to invoke the compiler at runtime? What space of programs is that serving that cannot be served statically, or with a bit of upfront user refactoring?

Open to being shown wrong, but I believe this is the key compiler issue.


This is not how I understand the performance model. Allowing invocation of the compiler at runtime is definitely not something that is done for performance, but for dynamism, to allow some code to run that could not otherwise be run.

In performant Julia code, the compiler is not invoked, because types are statically inferred. In some cases you can have dynamic dispatch, but that doesn't necessarily mean that the compiler needs to run. Instead you can get runtime lookup of previously compiled methods. Dynamic dispatch does not necessitate running the compiler.


I don't believe it; otherwise, why not just compile a static but generic version of the method with branches based on the tags of values? ("Can't figure out the types, wait until runtime and then just branch to the specialized method instances which I do know the types for")

Perhaps there is something about subtyping which makes this answer ... not correct -- and if someone knows the real answer, I'd love to understand it.

I believe that the answer is performance -- if I can JIT at runtime, that's great -- I get dynamism and performance ... at the cost of a small blip at runtime.

And yes, "performant Julia code" -- that's the static subset of the language that I roughly equated with the subset which is trying to be pried free from the dynamic "invoking the compiler again" part.


I'm not exactly sure what you don't believe; your comment is hard to follow, or relies on premises I haven't detected. What you are describing in your first paragraph is somewhat reminiscent of dynamic dispatch, which Julia does use, but which generally hampers performance. It is something to avoid in most cases.

Anyway, performance in Julia relies heavily on statically inferring types and aggressive type specialization at compile time. Triggering the compiler later, during actual runtime, can happen, but is certainly not beneficial for performance, and it's quite unusual to claim that it's central to the performance model of Julia.

If you are asking why Julia allows recompiling code and has dynamic types, it's not for performance, but to allow an interactive workflow and user-friendly dynamism. It is the central tradeoff in Julia to enable this while retaining performance. If performance were the only concern, the language would be very different.


I used Julia for 4 years. I'm not a moron: I'm familiar with how it works, I've written several packages in it, including some speculative compiler ones.

You claimed:

> Allowing invocation of the compiler at runtime is definitely not something that is done for performance, but for dynamism, to allow some code to run that could not otherwise be run.

I asked:

> why not just compile a static but generic version of the method with branches based on the tags of values? ("Can't figure out the types, wait until runtime and then just branch to the specialized method instances which I do know the types for")

Which can be done completely ahead of time, before runtime, and doesn't rely on re-invoking the compiler, thereby making this whole "ahead of time compilation only works for a subset of Julia code" problem disappear.

Do you understand now?

My original comment:

> The problem (which the author didn't focus on, but which I believe to be the case) that Julia willingly foisted upon itself in the pursuit of maximum performance is _invoking the compiler at runtime_ to specialize methods when type information is finally known.

is NOT a claim about the overall architecture of Julia -- it's a point about this specific problem (Julia's static ahead-of-time compilation) which is currently highly limited.


I don’t think open source is going anywhere. It’s poised to get significantly stronger — as the devs who care about it learn how to leverage AI tools to make things that corporate greasemonkeys never had the inspiration to. Low quality code spammers are just marketing themselves for jobs where they can be themselves: soulless and devoid of creative impulse.

That’s the thing: open source is the only place where the true value (or lack of value) of these tools can be established — the only place where one can test mettle against metal in a completely unconstrained way.

Did you ever want to build a compiler (or an equally complex artifact) but got stuck on various details? Try now. It’s going to stand up something half-baked, and as you refine it, you will learn those details — but you’ll also learn that you can productively use AI to reach past the limits of your knowledge, to make what’s beyond a little more palatable.

All the things people say about AI are true to some degree: my take is that some people are rolling the slots to win a CRUD app, and others are trying to use it to do things that they could only imagine before — and open source tends to be the home of the latter group.


True innovation will come from open source for sure, as the developers don't have the same economic incentives to be "safe", "ethical", "profitable", or whatever. Large corporations know this and fear this development. That's why I expect significant lobbying to take hold in the USA that will try to make local AI systems illegal. And I think they will be very convincing to the government, because the government also fears the "peasants" and giving them any true semblance of real AGI-like systems. I bet very soon we will start seeing various classifications that will define what is legal and what is not for a citizen to possess or use.

> That's why I expect significant lobbying to take hold in the USA that will try to make local AI systems illegal.

I think they're going to be using porn and terrorism (as usual) to do that, but also child suicide. I also think they're going to leverage this rhetoric to lock down OSes in general, by making them uninstallable on legally-available hardware unless approved, because approved OSes will only be able to run approved LLMs.

Meaning that I think LLMs/generative AI will be the lever to eliminate general-purpose computing. As mobile went, so will desktop.

I think this is inevitable. The real question for me is whether China will partner with the west on this, or whether we will be trading Chinese CPUs with each other like contraband in order to run what we want.

> any true semblance of real AGI-like systems.

This is the only part I don't agree with. This isn't going to happen, but I'm not even sure it would be more useful than what we have. We have billions of full AGI machines walking around, and most of them aren't great. I'm talking about restrictions on something technically barely better than what we have now; maybe just significantly more compute-efficient. Training techniques will probably be where we get the most improvements.


> It’s poised to get significantly stronger

It's really not. Every project of any significance is now fending off AI submissions from people who have not the slightest fucking clue about what is involved in working on long-running, difficult projects or how offensive it is to just slather some slop on a bug report and demand it is given scrutiny.

Even at the 10,000-foot view it has wasted people's time, because they have to sit down and have a policy discussion about whether to accept AI submissions, which involves people reheating a lot of anecdotal claims about productivity.

Having learned a bit about how to write compilers, I know enough to guarantee you that an AI cannot help you solve the difficult problems that compiler-building tools and existing libraries cannot solve.

It's the same as it is with any topic: the tools exist and they could be improved, but instead we have people shoehorning AI bollocks into everything.


This isn't an AI issue. It is a care issue. People shouldn't submit PRs to projects where they don't care enough to understand the project they are submitting to or the code they are submitting. This has always been a problem; there is nothing new here. The thing that is new is that more people can get to a point where they can submit, regardless of their care or understanding. A lot of people are trying to gild their resumes by saying they contributed to a project. Blaming AI is blaming the wrong problem. AI is a tool, like a spreadsheet. Project owners should instead be working on ways to filter out careless code more efficiently.

That's why I'm not super optimistic. Even pre-AI and the tech slump, there was talk about how hard it may be to replace the old guard maintaining these open source initiatives. Now...

>Blaming AI is blaming the wrong problem. AI is a tool, like a spreadsheet. Project owners should instead be working on ways to filter out careless code more efficiently.

When care leaves, the entire commons starts to fall apart. New talent doesn't come in. Old talent won't put up with it and retires out of the scene. They already have so much work to do; needing to add in non-development work to make better spam filters may very well be the final straw.

Even when the careless leave, it won't bring back the talent lost. Directing the blame onto the careless sure won't do that.


This is an AI issue because people, including the developers of AI tools, don't care enough.

The Tragedy Of The Commons is always about this: people want what they want, and they do not care to prevent the tragedy, if they even recognise it.

> Project owners should instead be working ways to filter out careless code more efficiently.

Great. So the industry creates a burden and then forces people to deal with it — I guess it's an opportunity to sell some AI detection tools.


We don't need an AI detector; we need a "human vetted" detector.

Who's paying the human to vet it? Or will we have volunteers dedicated to being AI detectors instead of developers?

I don't have those answers. My point was that trying to outright ban any AI is futile and probably overall counterproductive, and that we need to find ways to ensure a human hasn't submitted slop. I don't have an answer as to the how.

> trying to outright ban any AI is futile and probably overall counterproductive

Okay, you can keep thinking that. I'll just reject anything that has a whiff of AI and lacks care. No point campaigning under this admin to regulate anything, so that's off the table for 1-3 years.


People arguing against my point here seem to be doing a good job of validating my point.

>Every project of any significance is now fending off AI submissions

Not anything with a cathedral model.

'open source' is too ambiguous to be useful.


> Every project of any significance is now fending off AI submissions from people who have not the slightest fucking clue

I'm kinda hoping that Github will provide an Anubis-equivalent for issue submissions by default.


Sounds like a lot of FUD to me — if major projects balk at the emergence of new classes of tools, perhaps the management strategy wasn’t resilient in the first place?

Further: sitting down to discuss how your project will adapt to change is never a waste of time; I’m surprised you stated it like that.

In such a setting, you’re working within a trusted circle — and for a major project, that likely means extremely competent maintainers and contributors.

I don’t think these people will have any difficulty adapting to the usage of these tools …


> if major projects balk at the emergence of new classes of tools, perhaps the management strategy wasn’t resilient in the first place?

It's not the tools, it's the quality. No FOSS dev would care where the code came from if it followed the contributor's guidelines and coding style.

This is why it's a spam issue. A bunch of low quality submissions just gum up such developers' time and slow the entire process down.

>that likely means extremely competent maintainers and contributors.

Your assumption falls apart here, sadly. Dunning-Kruger hits hard for new contributors powered by LLMs, and the maintainers suffer the brunt of the hit.


Why not just disallow PRs from non-vetted contributors?

Why not just disallow issues without a vetting process?

Many of these things could be explored -- you're right: it's a spam issue. But we have solutions to spam issues ... filters. LLMs have shown that "praying for the best" with permissive repository settings is not sufficient. We can and will improve our filters, no?


That's certainly going to be the eventual outcome at this rate, yes. Close off the FOSS and go underground. Contributing will now involve negotiating the politics and vetting oneself, rather than standing on the quality of the contributions. Restrictive, but it feels like it hurts the spirit of FOSS, in an industry that can already be considered a bit gatekeep-y.

You can definitely argue we hit that point a long time ago, but this will exacerbate it.


> Further: sitting down to discuss how your project will adapt to change is never a waste of time; I’m surprised you stated it like that.

It is a waste of time for large-scale volunteer-led projects who now have to deal with tons of shit — when the very topic is "how do we fend off this stuff that we do not want, because our project relies on much deeper knowledge than these submissions ever demonstrate?"


Yeah, we are getting lots of "I don't know how to do this and AI gave me this code that doesn't work, can you fix it" or "AI said it can do this" when the feature doesn't exist... some people will even argue and say "but AI said it doesn't take long, why won't you add it".

It weaponises incompetence, carelessness and arrogance at every turn.

AI, to me, is a character test: I'm regularly fascinated by finding out who fails it.

For example, in my personal life I have been treated to AI-generated comms from someone that I would never have expected it from. They don't know I know, and they don't know that I think less of them, and I always will.


>They don't know I know, and they don't know that I think less of them, and I always will.

lol, behavior like this is way more destructive to personal relationships than AI ever will be.


I will never judge someone for using AI, but I will absolutely judge anyone for lobbing slop at me. I define slop as low-effort, non-vetted, first-try output.

Are you `robject` on Stack Overflow -- or perhaps you might explain why you copied verbatim their response to https://stackoverflow.com/questions/3561145/what-is-a-smallt... without citing your source?

Sonnet 4.5 is way worse than Opus 4.1 -- it's incredible that they claim it's their best coding model.

It's obvious if you've used the two models for any sort of complicated work.

Codex with GPT-5 codex (high thinking) is better than both by a long shot, but takes longer to work. I've fully switched to Codex, and I used Claude Code for the past ~4 months as a daily driver for various things.

I only reach for Sonnet now if Codex gets cagey about writing code -- then I let Sonnet rush ahead, and have Codex align the code with my overall plan.


I'm working on such a thing, but I'm not interested in money, nor do I have money to offer - I'm interested in a system which I'm proud of.

What are your motivations?

Interested in your work: from your public GitHub repos, I'm perhaps most interested in `moor` -- as it shares many design inclinations that I've leaned towards in thinking about this problem.


Unfortunately... mooR is my passion project, but I also need to get paid, and nobody is paying me for that.

I'm off work right now, between jobs and have been working 10, 12 hours a day on it. That will shortly have to end. I applied for a grant and got turned down.

My motivations come down to making a living doing the things I love. That is increasingly hard.


These are exactly the feelings I left the community with in ~2021 (along with the AD story, which never really materialized _within_ Julia - Enzyme had to come from outside Julia to “save it” - or materialized in a way (Zygote) whose compilation times were absolutely unacceptable compared to competitors like JAX)

More and more over time, I’ve begun to think that the method JIT architecture is a mistake, that subtyping is a mistake.

Subtyping makes abundant sense when paired with multiple dispatch — so perhaps my qualms are not precise there … but it also seems like several designs for static interfaces have sort of bounced off the type system. Not sure, and can’t defend my claims very well.

Julia has much right, but a few things feel wrong in ways that spiral up to the limitations in features like this one.

Anyways, excited to check back next year to see myself proven wrong.


I basically agree about subtyping (but not multiple dispatch). More importantly, I think it's important to recognize that Julia has a niche that literally no one can compete with - interactive, dynamic and high performance.

Like, what exactly is the alternative? Python? Too slow. Static languages? Unusable for interactive exploration and data science.

That leaves you with hybrids, like Python/Cython, or Python/Rust, or Numba, but taken on their own terms, these are absolutely terrible languages. Python/Rust is not safe (due to FFI), certainly not pleasant to develop in, and no matter how you cut your code between the languages, you always lose. You always want your Python part to be in Rust, so you get static analysis, safety and speed. You always want your Rust part to be in Python, so you can experiment with it more easily and introspect.


To clarify my comment: I agree that multiple dispatch is a very good language feature. I enjoy it, and I’m well-versed in the expression problem, yada yada.

That’s not what I meant by “method JIT architecture” — I meant calling back into the compiler at runtime to specialize code when the types are known.


I think multiple dispatch (useful as it is) is a little overrated. There's a significant portion of the time where I know I have a closed set of cases to cover, and an enum type with a match-like syntax would have worked better for that. For interfaces, multiple dispatch is good but again I would have preferred a trait based approach with static type checking.
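
Something like this sketch is what I mean (toy names; Julia has @enum but no built-in match, so branches stand in for it):

    @enum Shape CIRCLE SQUARE

    # a closed set of cases: the explicit branches make coverage visible,
    # where open multiple dispatch leaves exhaustiveness implicit
    area(s::Shape, r) =
        s == CIRCLE ? pi * r^2 :
        s == SQUARE ? r^2 :
        error("unreachable")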

I largely think multiple dispatch works well in Julia, and it enables writing performant code in an elegant manner. I mostly have smaller gripes about subtyping and the patterns it encourages with multiple dispatch in Julia, and larger gripes about the lack of tooling in Julia.

But multiple dispatch is also a hammer where every problem in Julia looks like a nail. And there isn't enough discussion, official or community driven, that expands on this. In my experience the average newcomer to Julia tends to reach for multiple dispatch without understanding why, mostly because people keep saying it is the best thing since sliced bread.

wrt hybrid languages, honestly, I think Python/Cython is extremely underrated. Sure, you can design an entirely new language like Mojo or Julia, but imo it offers only incremental value over Python/Cython. I would love to peek into another universe where all that money, time and effort for Mojo and Julia went to Cython instead.

And I personally don't think Python/Rust is as bad. With a little discipline (and some tests), you can ensure your boundary is safe, for you and your team. Rust offers so much value that I would take on the pain of going through FFI. PyO3 simplifies this significantly. The development of `polars` is a good case study for how Rust empowers Python.

I think the Julia community could use some reflection on why it hasn't produced the next `polars`. My personal experience with Julia developers (both in-person and online) is that they often believe multiple dispatch is so compelling that any person that "saw the light" would obviously naturally flock to Julia. Instead, I think the real challenge is meeting users where they are and addressing their needs directly. The fastest way to grow Julia as a language is to tag along Python's success.

Would I prefer a single language that solves all my problems? Yes. But that single language is not Julia, yet, for me.

PS: I really enjoy your blog posts and comments.


Mojo has a different scope than Julia and Python: it targets inference workloads.

Polars is a dataframe library. Yes, it features vectorized operations, but it is focused on columnar data manipulation, not numerical algorithm development. I might say that this is narrow framing: people are looking at Julia through the lens of a data scientist and not of an engineer or computational scientist.


Most of my gripes are when trying to use Julia the way a software engineer would use a programming language.

Most "data scientist" code is exploratory in nature (it's a prototype or a script for a one-off exploration). And my main gripe is that making that code production-ready and maintainable over a long period of time is so difficult that I would switch to Rust instead. If I were going to switch to Rust anyway, I might as well start with Python.


It comes off as someone who lives their life according to quantity, not quality.

The real insight: have some fucking pride in what you make, be it a blog post, or a piece of software.


> The real insight: have some fucking pride in what you make, be it a blog post, or a piece of software.

The businessmen's job will be complete when they've totally eliminated all pride from work.


At the same time, if there is a business opportunity in having pride when no one else has it, it will become a businessmen's job to do so.


This same instinct is why a pencil costs almost nothing and is perfect, instead of being rubbish, really expensive, and created by someone who took pride in their work.


> This same instinct is why a pencil costs almost nothing and is perfect, instead of being rubbish, really expensive, and created by someone who took pride in their work.

No. Have you worked with businessmen? 90% of the time they're telling you to cut corners and leave things broken, to the point you have a janky mess that can be barely held together. And, right now, we're talking about a technology (LLMs) that is well known to introduce stupid but often hard to spot errors.

They don't want a pencil that's perfect. They want one that's just barely good enough to write with and that they can get maximum profit margin on.

And then, you know, there's the whole thing about life being more than output.


Life can be more than output, which is why you don't want buying pencils, or anything else, to take up any more of your wages than is absolutely necessary.


> Life can be more than output, which is why you don't want buying pencils, or anything else, to take up any more of your wages than is absolutely necessary.

You're not getting it. It'd probably help if you stopped focusing on *your* pencil story; it's frankly off-topic.

To try one more time: You probably spend half your waking hours at work. The quality of that time is important to your well-being. Even if the businessmen sell you cheap, perfect pencils (which I do not grant), swimming in them in your off hours won't help with the other half of your time.


> It'd probably help if you stopped focusing on *your* pencil story; it's frankly off-topic.

I've no idea what this italicisation is meant to do; nor why this is off-topic. Stating things isn't explaining them.

> Even if the businessmen sell you cheap, perfect pencils (which I do not grant), swimming in them in your off hours won't help with the other half of your time.

It helps in that I don't have to spend as much of my time working to buy pencils. It's the same with everything. There's no reason why a laptop doesn't cost $1m except that the incredible, detailed, cross-continent cooperative work is done by experts and coordinated by a market for that work driving costs down and quality up.


I hope you don't take pride in that sentence because I'm still not sure what it means.

Also, automation and pride can go hand in hand. Pride doesn't mean "make it by hand," that would be silly.


To put it another way: an apocryphal businessman took something that people took pride in and gradually optimised everything so much that all the logging, transportation, graphite work and combination resulted in a perfect pencil that costs basically nothing almost anywhere in the world.


Pencils here are a bit like grains. The market works for them because they fall into such a niche that economic "laws" work there.

But it's a fallacy to apply it elsewhere and there are millions of examples where the free market failed to optimize a product.


I don't agree. Loads of things are like this. Cars, microchips, hard drive storage, monitors, TVs, laptops. All either much better than they used to be, or much cheaper, or both.


Do you actually use pencils? The most popular US (cheapo) brands have atrocious quality because they compromised on materials and construction to get the lowest sticker price possible.

The brands that do have a claim to "perfection" necessarily had the pride to not participate in that race to the bottom.


Don't forget to turn your point into a playful rhetorical question [0].

"The real insight?"

0: https://en.wikipedia.org/wiki/Hypophora


Where's the pride in what you make when you're using AI agents? Seems like you're fantasizing about a by-gone era. The name of the activity, "vibe-coding", already makes it clear that this is a pride-free industry.


Taking pride in your work makes your labor more expensive than that of someone who does not do this, so over time as "efficiency" increases, you will eventually be removed and replaced by someone without these compunctions. Taking no pride in your work is economically rational and maximizes your long-term value to capital.


Economically rational, but bereft of identity or _soul_ -- which, paradoxically, becomes highly valued when economically rational agents all regress to a mean of mediocrity.


Valued by the worker, to give meaning and quality of life, not by the buyer - so it doesn't carry much weight.


I think this is a strictly worse name than "agentic harness", which is already a term used by open-source agentic IDEs (https://github.com/search?q=repo%3Aopenai%2Fcodex%20harness&... or https://github.com/openai/codex/discussions/1174)

Any reason why you want to rename it?

Edit: to say more about my opinions, "agentic loop" could mean a few things -- it could mean the thing you say, or it could mean calling multiple individual agents in a loop ... whereas "agentic harness" evokes a sort of interface between the LLM and the digital outside world which mediates how the LLM embodies itself in that world. That latter thing is exactly what you're describing, as far as I can tell.


I like "agentic harness" too, but that's not the name of a skill.

"Designing agentic loops" describes a skill people need to develop. "Designing agentic harnesses" sounds more to me like you're designing a tool like Claude Code from scratch.

Plus "designing agentic loops" includes a reference to my preferred definition of the term "agent" itself - a thing that runs tools in a loop to achieve a goal.
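
Stripped to the bone, the loop I mean is something like this (sketched in Julia; llm_call and run_tool are stand-ins, not any real API):

    # an agent: a model that runs tools in a loop to achieve a goal
    function agent_loop(goal)
        context = Any[goal]
        while true
            action = llm_call(context)        # model picks a tool call, or finishes
            action.done && return action.answer
            result = run_tool(action.tool, action.args)
            push!(context, result)            # tool output feeds the next turn
        end
    end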


Context engineering is another name people have given to the same skill?


As a reader of Simon's work, I can speculate an answer here.

All "designing agentic loops" is context engineering, but not all context engineering is designing agentic loops. He's specifically talking about instructing the model to run and iterate against an evaluation step. Sure, that instruction will end up in the context, but he's describing creating a context for a specific behavior that allows an agent to be more effective working on its own.
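
In loop form, the distinction he's drawing looks roughly like this (a sketch in Julia; every name here is hypothetical):

    # iterate against an evaluation step until it passes
    while true
        patch = llm_call(context)
        apply!(workdir, patch)
        report = run_tests(workdir)          # the evaluation step
        report.passed && break
        push!(context, report.failures)      # feed failures back for the next attempt
    end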

Of course, it'll be interesting to see if future models are taught to create their own agentic loops with evaluation steps/tests, much as models were taught to do their own chain of thought.


I think that's actually quite different.

Context engineering is about making sure you've stuffed the context with all of the necessary information - relevant library documentation and examples and suchlike.

Designing the agentic loop is about picking the right tools to provide to the model. The tool descriptions may go in the context, but you also need to provide the right implementations of them.


The reason I felt like they are closely connected is that, when designing tools for, let's say, coding agents, you have to be thoughtful about context engineering.

E.g., the Linear MCP is notorious for giving large JSONs which quickly fill up the context and are hard for the model to understand. So tools need to be designed slightly differently for agents, keeping context engineering in mind, compared to how you design them for humans.

Context engineering feels like the more central, first-principles approach to designing tools and agent loops.


They feel pretty closely connected. For instance: in an agent loop over a series of tool calls, which tool results should stay resident in the context, which should be summarized, which should be committed to a tool-searchable "memory", and which should be discarded? All context engineering questions and all kind of fundamental to the agent loop.
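
Concretely, the triage might look something like this (a Julia sketch; every name and threshold is a placeholder):

    # one possible policy for each tool result in the loop
    function triage!(context, memory, result)
        if small_and_relevant(result)
            push!(context, result)             # stays resident in context
        elseif worth_recalling(result)
            store!(memory, result)             # committed to searchable "memory"
        elseif has_useful_gist(result)
            push!(context, summarize(result))  # kept only as a summary
        end                                    # otherwise: discarded
    end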


Yeah, "connected" feels right to me.

Those decisions feel to me like problems for the agent harness to solve - Anthropic released a new cookbook about that yesterday: https://github.com/anthropics/claude-cookbooks/blob/main/too...


One thing I'm really fuzzy on is, if you're building a multi-model agent thingy (like, can drive with GPT5 or Sonnet), should you be thinking about context management tools like memory and autoediting as tools the agent provides, or should you be wrapping capabilities the underlying models offer? Memory is really easy to do in the agent code! But presumably Sonnet is better trained to use its own builtins.


It boils down to information loss in compaction driven by LLMs. Either you carefully design tools that only give compacted output with high information density, so models have to auto-compact or organize information only once in a while -- which is eventually going to be lossy anyway.

Or you just give loads of information without thinking much about it, assuming models will have to do frequent compaction and memory organization, and hope it's not super lossy.


Right, just so I'm clear here: assume you decide your design should be using a memory tool. Should you make your own with a tool call interface or should you rely on a model feature for it, and how much of a difference does it make?


Do you think this'll eventually be trained into the models the way that chain-of-thought has been?


To a certain extent it has already - models are already very good at picking tools to use: ask for a video transformation and it uses ffmpeg, ask it to edit an Excel sheet and it uses Python with openpyxl, etc.

My post is more about how sometimes you still need to make environment design decisions yourself. My favorite example is the Fly.io one, where I created a brand new Fly organization with a $5 spending limit and issued an API token that could create resources in that organization, purely so the coding agent could try experiments to optimize cold start times without messing with my production Fly environment.

An agent might be able to suggest that pattern itself, but it would need a root Fly credential in order to create itself the organization and restricted credentials and given how unsafe agents with root credentials are I'd rather keep that step to myself!


It's amusing to think that the endgame is that the humans in the loop are parents with credit cards.

I suppose you could never be sure that an agent would explicitly follow your instruction "Don't spend more than $5".

But maybe one could build a tool that provides payment credentials, and you get to move further up the chain. E.g., what if an MCP tool could spin up virtual credit cards with spending caps, and then the agent could create accounts and provide payment details that it received from the tool?


Congratulations: it's faster, but worse, with a larger context window.


Anecdotally: I tried, by hook or by crook, to get the best flagship model at the time (Opus) to help with technical writing for a submission.

First, these models are not good at technical writing at all. They have no sense of the weight of a single sentence, they just love to blather.

Second, they can't keep the core technical story consistent throughout their completions. In other words, they can't "keep the main thing the main thing".

I had an early draft with AI writing, but by the time we submitted our work -- there was not a single piece of AI writing in the paper. And not without trying, I really did some iterations on trying to carefully craft context, give them a sense of the world model in which they needed to evaluate their additions, yada yada.

For clear and concise technical communication, it's a waste of time right now.


I'm so happy I have pre-LLM publications and blog posts to prove that my blathering isn't because I'm lazy and used Claude; it's just how I write (i.e., badly).


poorly…


It would be v. funny if I got that wrong, but I do feel the need to point out that "badly" is indeed grammatically correct here because this is HN and pedantry is always on topic.

People over-correct and feel like they can't use "badly" because there is "feeling badly" discourse [0], but that pertains to "feeling" being a linking verb. "Write" is just your bog standard verb for which "badly", an adverb, is a totally valid modifier.

[0] https://www.merriam-webster.com/grammar/do-you-feel-bad-or-f...


This is just a by the by, but in British English "feeling poorly" mostly means that you are ill. Amusingly it's become slightly euphemistic, so if someone is "a bit poorly" they probably have sniffles or a minor fever. If they are "very poorly" then you probably heard it from a hospital and they're just about dead.

Thus "I feel badly" ... "ok, what did you do?" vs. "I feel poorly" ... "ok, I'll get a bucket."

