"- Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
- Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks."
What Anthropic calls a "workflow" in the above definition is what most of the big enterprise software companies (Salesforce, ServiceNow, Workday, SAP, etc.) are building and calling AI Agents.
What Anthropic calls an "agent" in the above definition is what AI Researchers mean by the term. It's also something that mainly exists in their labs. Real world examples are fairly primitive right now, mainly stuff like Deep Research. That will change over time, but right now the hype far exceeds the reality.
I think Anthropic's definition of workflows doesn't match how the term is used in modern tooling. Temporal, for instance (disclaimer: my employer), allows completely dynamic logic in agentic workflows, letting the LLM choose what to do next. It can even be very dynamic (e.g., eval some code), though you may want it to operate on a limited set of "tools" you make available.
The problem with all of these AI-specific workflow engines is they are not durable, so they are process-local, suffer crashes, cannot resume, don't have good visibility or distribution, etc. They often only allow limited orchestration instead of full code freedom, support only one language, and so on.
>The problem with all of these AI-specific workflow engines is they are not durable, so they are process-local, suffer crashes, cannot resume, don't have good visibility or distribution, etc. They often only allow limited orchestration instead of full code freedom, support only one language, and so on.
My biased answer, because I work at Temporal[0], is to use an existing workflow solution that solves all of these problems instead of reaching for a solution that doesn't help with any of these but happens to be AI specific. Most agentic AI workflows are really just microservice orchestrations; the only "AI" involved is prompting an HTTP API that uses AI on its end. So use a good solution for "agentic X" whether that X is AI or any other orchestration need.
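To make that concrete, here's a rough sketch of what the durable version looks like with Temporal's Python SDK. The call_llm activity and the "FINISH:" convention are made-up placeholders for illustration, not anything Temporal- or AI-specific:

    from datetime import timedelta
    from temporalio import activity, workflow


    @activity.defn
    async def call_llm(prompt: str) -> str:
        # Placeholder: in practice this is just an HTTP call to a model provider.
        return "FINISH: " + prompt[:40]


    @workflow.defn
    class AgentWorkflow:
        @workflow.run
        async def run(self, task: str) -> str:
            history = [task]
            while True:
                # Each LLM call is a retried, durable activity; if the worker
                # crashes, the workflow resumes from history instead of restarting.
                decision = await workflow.execute_activity(
                    call_llm,
                    "\n".join(history),
                    start_to_close_timeout=timedelta(minutes=2),
                )
                if decision.startswith("FINISH:"):
                    return decision.removeprefix("FINISH:").strip()
                history.append(decision)

Registering the workflow and activity with a worker is the same as any other Temporal app; nothing in the loop itself is AI-specific.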
To me, a workflow is a predetermined set of steps that are followed based on fixed logic. Agents should have some agency to determine which steps in the workflow to perform next, without them being predetermined by fixed logic.
PocketFlow calls itself "agentic" due to its "agentic coding" paradigm (AI agents like Cursor building apps), but this is about development, not runtime behavior. At runtime, it's a workflow system. This stretches Anthropic's definition, where "agentic" implies dynamic LLM control during execution. I think this is where the misunderstanding stems from.
This was an LLM-generated response, which was pretty stupid in bringing up agentic coding. But it's still correct that PocketFlow does not align with Anthropic's definition of what an "Agent" is.
I follow Mr. Huang, read/watch his content, and also plan to use PocketFlow in some cases. That's a preamble, because I don't agree with this assessment: I think agents as nodes in a DAG workflow is _an_ implementation of an agentic system, but it is not the kind of system I most often interact with (e.g. Cursor, Claude + MCP).
Agentic systems can be simply the LLM + prompting + tools[1]. LLMs are more than capable (especially chain-of-thought models) of breaking down problems into steps, analyzing which tools to use, and then executing the steps in sequence. All of this is done with the model in the driver's seat.
I think the system described in the post needs a different name. It's a traditional workflow system with an agent operating on individual tasks. It's more rigid in that the workflow is set up ahead of time. Typical agentic systems are largely undefined or defined via prompting. For some use cases this rigidity is a feature.
> Agentic systems can be simply the LLM + prompting + tools[1]. LLMs are more than capable (especially chain-of-thought models) of breaking down problems into steps, analyzing which tools to use, and then executing the steps in sequence. All of this is done with the model in the driver's seat.
Sort of, kind of. It's still a directed graph. Dynamically generated graph, but still a graph. Your prompted LLM is the decision/dispatch block. When the model decides to call a tool, that's going from the decision node to another node. The tool usually isn't another LLM call, but nothing stops it from being one.
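A minimal plain-Python sketch of that framing (the call_llm stub and tool names are hypothetical): the prompted model is the dispatch node, and each tool call is an edge out to another node and back.

    def call_llm(context: str) -> dict:
        # Stand-in for a real model call that returns a tool-use style decision,
        # e.g. {"action": "search", "input": "..."} or {"action": "finish", "answer": "..."}.
        return {"action": "finish", "answer": f"(stub answer for: {context[:40]})"}

    TOOLS = {
        "search": lambda query: f"search results for {query!r}",
        "calculator": lambda expr: str(sum(map(float, expr.split("+")))),  # toy tool
    }

    def run_agent(task: str) -> str:
        context = task
        while True:
            decision = call_llm(context)               # the decision/dispatch node
            if decision["action"] == "finish":
                return decision["answer"]
            tool = TOOLS[decision["action"]]           # edge to a tool node...
            context += "\n" + tool(decision["input"])  # ...then back to the dispatcher

    print(run_agent("What is 2 + 2?"))

The graph isn't written down anywhere, but it's still there: decision node, tool nodes, and edges chosen at runtime.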
The "traditional workflow" exists because even with best prompting, LLMs don't always stick to the expected plan. It's gotten better than it used to, so people are more willing to put the model in the driving seat. A fixed "ahead of time" workflow is still important for businesses powering products with LLMs, as they put up a facade of simplicity in front of the LLM agentic graph, and strongly prefer for it to have bounded runtime and costs.
(The other thing is that, in general, it's trickier to reason about code flow generated at runtime.)
Kind of. This explanation feels pedantic—like calling my morning routine a dynamically generated graph (which it technically is). Others have pointed this out, but the industry seems split. Workflows like those described in the article resemble Airflow jobs, making them, well, workflows.
Corporate buzzwords have co-opted "Agent" to describe workflows with an LLM in the loop. While these can be represented as graphs, I'm not convinced "Agent" is the right term, even if they exhibit agentic behavior. The key distinction is that workflows define specific rules and processes, whereas a true agent wouldn’t rely on a predetermined graph—it would simply be given a task in natural language.
You're right that reasoning about runtime is difficult for true agents due to their non-deterministic nature, but different groups are chipping away at the problem.
In my opinion, the split is between the people who want their tools to be called Agents so they can make more on AI hype, and the people who know better than to call a simple pre-defined software workflow an “agent”. It is harder to get large investments for “my program just calls an LLM” these days.
I have to agree this is a bit too simple to be anything of substance. That is not what agentic really means. This is basically plugging ChatGPT into Zapier.
When you work with agentic LLMs you should worry about prompt chaining, parallel execution, decision points, loops, and more of these complex decisions.
People who didn't know what was in the first article shouldn't use PocketFlow and should go with n8n or even Zapier.
Let me clarify: this tutorial focuses on the technical internal implementation of the agent (e.g., OpenAI agent, Pydantic AI, etc.), rather than the UI/UX of the agent-based products that end users interact with.
The newest generation of agents[0] aren't implemented this way; the model itself is trained to make decisions and a plan of action rather than an explicitly programmed workflow tree.
No I'm referring to the newest generation of agentic models one of which I linked to. These are not fully released but it is where the newest generation of research is headed.
That's what I am talking about as well. The low-level implementation of an agent isn't necessarily a rigid graph, and I'd actually argue it's explicitly not this.
This link is also referring to the nodes as agents. So it's a system of agents interacting to produce an outcome. I'm not saying this system is bad, just that I think it deserves another name rather than calling the whole system an "Agent". It's many agents working in a coordinated fashion.
Hey folks! I just posted a quick tutorial explaining how LLM agents (like OpenAI Agents, Pydantic AI, Manus AI, AutoGPT or PerplexityAI) are basically small graphs with loops and branches. For example:
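Something like this minimal plain-Python sketch (not PocketFlow's actual API; the node names and shared state dict are made up for illustration):

    # Each node does some work and names the next node; "decide" can branch or loop.
    def decide(state):
        state["steps"] += 1
        return "answer" if state["steps"] >= 2 else "search"

    def search(state):
        state["notes"].append("looked something up")
        return "decide"                      # loop back to the decision node

    def answer(state):
        state["result"] = f"done after {state['steps']} steps"
        return None                          # terminal node

    NODES = {"decide": decide, "search": search, "answer": answer}

    def run(start="decide"):
        state, node = {"steps": 0, "notes": [], "result": None}, start
        while node:
            node = NODES[node](state)        # follow the edge the node chose
        return state["result"]

    print(run())  # -> "done after 2 steps"

The real frameworks dress this up with retries, shared stores, and async execution, but the skeleton is the same small graph with loops and branches.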
Minor comment: do you mean "LLM Agents Are Simply Graphs". Personally, I'd drop the adjective to "LLM Agents are Graphs" as I think it sounds better, but the plural is needed.
It would be interesting to dig deeper into the "thinking" part: how does an LLM know what it doesn't know / how to fight hallucinations in this context?
Thank you - really interesting looking read, thanks for crafting the deep explanation, with links to actual internal code examples. Also, thanks for not putting it behind the Medium paywall
It is hard to put a pin in this one because there are so many things wrong with this definition. There are agent frameworks that are not rebranded workflow tools too. I don't think this article helps explain anything except putting the intended audience in the same box of mind we've been stuck in since the invention of programming - i.e. it does not help.
Forget about boxes and deterministic control and start thinking of error tolerance and recovery. That is what agents are all about.
> There are agent frameworks that are not rebranded workflow tools too.
To me "workflow" is just what agent means: the rules under which an automated action occurs. Without some central concept "agent" just a magic wand that does stuff that may or may not be what you want it to do. If we can't use state machines at all I'm just going to go out and say LLMs are a dead end. State machines are the bread and butter of reliable software.
> Forget about boxes and deterministic control and start thinking of error tolerance and recovery.
First you'd have to define what an error even is. Then you're just writing deterministic software again (a workflow), just with less confidence. Nice for stuff with low risk and confidence to begin with (e.g. semantic analysis, where errors tend to wash out in aggregate), but not for stuff acting on my behalf.
LLMs are cool bits of software, but I can't say I see much use for "agents" whose behavior is not well-defined and whose non-determinism isn't formally bounded.
It’s getting pedantic, but the key idea is that Agents can solve problems traditional state machine-based workflows couldn't.
Your point is moot since many of these modern workflows already use LLMs as gating functions to determine the next steps.
It’s a different way of approaching problems, and while the future is uncertain, LLMs have moved beyond being just "cool software" to becoming genuinely useful in specific domains.
Hmm, maybe you are referring to something specific with "workflow". I'm envisioning a visual graph with a UI for each node and connection, or maybe a makefile on the other end of the spectrum. What are you envisioning?
Anyway, LLMs will remain at "cool software" like other niche-specific patterns until I see something general emerge. You'd have to pitch LLMs pretty savvily to show it as a clear value-add. Engineers are extremely expensive, so LLMs need to have a very low error rate to be integrated into the revenue path of a product to not incur higher costs or a lower-quality service. I still see text- and code-generation for immediate consumption by a human (or possibly classification to be reviewed by a human) as the only viable use cases today. It's just way too easy to manipulate them with standard English.
> Hmm, maybe you are referring to something specific with "workflow". I'm envisioning a visual graph with a UI for each node and connection, or maybe a makefile on the other end of the spectrum. What are you envisioning?
In job orchestration systems, workflows are structured sequences of tasks that define how data moves and transforms over time. Workflows are typically defined as Directed Acyclic Graphs (DAGs) but they don't have to be. I don't believe I am referring to anything more specific than how orchestration systems generally use them. LLM-based agents shift the focus from rigidly defined transitions to adaptable problem-solving mechanisms. They don’t replace state machines entirely but introduce a layer where strict determinism isn’t always necessary or even desirable.
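A hedged sketch of the fixed-DAG shape I mean (the task names and toy data dict are made up): every node and edge is known before anything runs, and execution is just topological order.

    from graphlib import TopologicalSorter

    # A fixed DAG: every task and every edge is declared before execution starts.
    TASKS = {
        "extract":   lambda data: data["raw"].strip(),
        "transform": lambda data: data["extract"].upper(),
        "load":      lambda data: f"stored: {data['transform']}",
    }
    EDGES = {"transform": {"extract"}, "load": {"transform"}}  # task -> dependencies

    def run_workflow(raw: str) -> str:
        data = {"raw": raw}
        for task in TopologicalSorter(EDGES).static_order():
            data[task] = TASKS[task](data)
        return data["load"]

    print(run_workflow("  hello  "))  # -> "stored: HELLO"

An LLM-directed agent effectively drops the fixed EDGES table and decides the transitions at runtime instead.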
> Anyway, LLMs will remain at "cool software" like other niche-specific patterns until I see something general emerge. You'd have to pitch LLMs pretty savvily to show it as a clear value-add. Engineers are extremely expensive, so LLMs need to have a very low error rate to be integrated into the revenue path of a product to not incur higher costs or a lower-quality service. I still see text- and code-generation for immediate consumption by a human (or possibly classification to be reviewed by a human) as the only viable use cases today. It's just way too easy to manipulate them with standard English.
I get the skepticism, especially about error rates and reliability. But the “cool software” label underestimates where this is heading. There’s already evidence of LLMs being useful beyond text/code-gen (e.g., structured reasoning in research, RAG-enhanced search, or dynamically adapting workflows based on complex input). The real shift isn’t just about automation but about adaptive automation, where LLMs reduce the need for brittle, predefined paths.
Of course, the general-use case is still evolving, and I agree that direct, high-stakes automation remains a challenge. But dismissing LLM-driven agents as just niche tools ignores their growing role in augmenting traditional software paradigms.
Why forget about boxes and deterministic control and start thinking of error tolerance and recovery?
I know, that LLMs are statistical models, but can you not use patterns to enforce a deterministic outcome? (Single responsibility for each agent, retrying llm calls, rephrasing prompts, etc?)
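For instance, something along these lines (a hedged sketch; call_llm and the schema are made up): validate the model's output against a fixed schema and retry with a rephrased prompt, so the surrounding code sees a deterministic contract even though individual calls aren't deterministic.

    import json

    def call_llm(prompt: str) -> str:
        # Placeholder for a real model call.
        return '{"sentiment": "positive"}'

    def classify_sentiment(text: str, max_retries: int = 3) -> str:
        prompt = (
            f"Classify the sentiment of {text!r}. "
            'Reply only with JSON like {"sentiment": "positive"} or {"sentiment": "negative"}.'
        )
        for _ in range(max_retries):
            raw = call_llm(prompt)
            try:
                out = json.loads(raw)
                if out.get("sentiment") in {"positive", "negative"}:
                    return out["sentiment"]  # conforms to the contract: deterministic downstream
            except json.JSONDecodeError:
                pass
            # Rephrase and retry instead of propagating a malformed answer.
            prompt += "\nYour last reply was invalid. Return only the JSON object."
        raise ValueError("model never produced a valid classification")

    print(classify_sentiment("I love this"))  # -> positive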
Hey, sorry for the confusion. This tutorial is focusing on the low-level internals of how agents are implemented—much like how intelligent large language models still boil down to matrix multiplications at their core.
> This tutorial is focusing on the low-level internals of how agents are implemented
We have very different definitions of what "low-level" means. Exact opposites in fact. "Low-level" means in the inner workings. Like a low-level language is assembly (some consider C low-level but this is debatable), whereas Python would be high-level.
I don't think this tutorial is "near the metal" of LLMs nor do I think it should be considering it is aimed at "Dummies". Low-level would really need to get into the inner workings of the processing, probing agents, and getting into the weeds.
> We have very different definitions of what "low-level" means.
Does it really matter if you can understand them? Waiting for strongly opinionated engineers to finish their pedantic spiels (...even when they're wrong or there is no obvious standard of correctness) when everyone already understands each other is one of the most miserable parts of being in this industry.
I—and I emphatically don't include the above poster in this view, as it takes continual & repeated behavior to accrue such a judgement—see this as a small tantrum, essentially, from people who never learned to regulate their emotions in professional spaces. I don't understand why this sort of bickering is considered acceptable behavior in the workplace or adjacent spaces. It's rude, arrogant, trivially avoidable with a slight change in tone and rhetoric, and it makes you look like an asshole unless you're 100% right and approach it with good humor.
Yes even if I can understand them it matters. We should correct ourselves and enable better communication moving forward. I would also say that it too is rude, arrogant, and makes you look like an asshole if you are using words incorrectly and then defending that usage. One must conclude that either you have too much ego to correct yourself or you are intentionally misleading people.
Why do you see this among engineers frequently? Well, because it's the job of an expert to be concerned with nuance and details. The low-level, in fact. This requires high precision in communication too. The back and forth you see as bickering also ends up getting those details communicated. The reason is that much of what's being intended is implicit. So the other approach is to use a lot of words. Unfortunately, when you do that you are often ignored.
I think "low-level" is relative to what's being discussed. Low-level for LLMs would have to do with how transformer layers are implemented (self-attention layer, layer norms, etc.) whereas low-level for agents would be the graph structure.
Although I personally don't think the graph implementation for agents is necessarily as established or widely standardized, it's helpful to know about why such an implementation was chosen and how it works.
> the inner workings of the processing, probing agents, and getting into the weeds
These feel to me like empty words... "inner workings of the processing"? You can say that about anything.
I'm not quite sure I agree, but I do get your point. Why I don't quite agree is that the agents are communicating and thus the "in the weeds" part is getting into how that communication is being processed. Which is what makes or breaks agents. How they interpret one another and respond. There needs to be some mech interp for me to really think of something as low-level. I'll put emphasis on the in the weeds part. Nuance and details are critical parts to a low-level conversation.
> You can say that about anything.
That is true. But it is also true that you can approach any topic from low-level or high-level. So I'm not sure I get your point here.
What I meant was, the phrase "inner workings of the processing" doesn't really mean anything at all. i.e. it doesn't convey any useful information about what you're trying to say.
> How they interpret one another and respond.
That sounds like it just falls back to "how LLMs work". It's the wrong level of abstraction in this case, because it's one level down from the topic being discussed here.
Certainly it means something. Alone it says little but in both previous comments there are other words to provide context and even explicitly communicate that I mean you need to be looking at the tokens and token passing. How the LLMs communicate. The low-level details in how that communication operates.
> because it's one level down
So we're in agreement?
Aren't we after the "low-level"? That's this whole conversation... yes, it is a level down, that's my whole point. Just as my original analogy with assembly being a level down from C. Working at the metal, as they say. In the weeds.
I honestly don't know how to respond because I'm saying "this is too high-level" and you're arguing "you're too low-level". I'm sorry, but when you do stuff at the low-level you in fact have to crouch down and put your face to the ground. The lower the better. You're trying to see something very small; we're not trying to observe mountains here.
Despite the memes, this reductionism is not exactly insightful. Why stop there? Matrix multiplication is just a bunch of dot products, which in turn are just cosines and magnitudes. What insights were generated from this?
The reductionism is insightful when it comes to providing an implementation with those specific details in mind.
In the case of LLMs knowing it does boil down to matrix multiplication is insightful and useful because now you know what kind of hardware is best suited to executing a model.
What is actually not insightful or useful is believing LLMs are AGI or conscious.
Belief is generally not insightful or useful by definition.
Then again, I don't think anyone who can follow this article believed that LLMs were conscious to begin with, so I'm not sure what your point is. You're preaching on behalf of a demographic that won't read this article to begin with, and presumably the people who are reading it can see how useless, distracting, and unproductive this reductionism is.
I believe this was precluded by the hedging of people who could follow the article. I have a difficult time imagining a person who can both understand how current LLMs work and still buy into Kurzweil.
Pursue the hypothesis? Sure. But belief is a different beast entirely. It's not even clear AGI is a meaningful concept yet, and I'd bet my life savings everyone reading this comment in 2025 will die before it's answered. Skepticism is the barometer.
I really agree with this. I think it has been bad for a lot of people's understanding when they have trivialized ML to "just matrix multiplications" (or GEMMs). This does not help differentiate AI/ML from... well... really any data-processing algorithm. Matrices are fairly general structures in mathematics and you can formulate almost anything as one. In fact, this is a very common way to parallelize or speed up programs (e.g. numpy vectorization).
We wouldn't call least squares, even a bunch of them, ML, nor would we call rasterization or ray tracing ML. Fundamentally all these things are "just GEMMs". It also does not make apparent any differentiation from important distinctions like linear networks, CNNs, or Transformers. It brushes off a key element, the activation function, which is necessary for neural nets to do non-linear transformations! And what about the residual units? These are one of the most important factors in enabling deep learning, and they're "just" addition. So do we say it's all just matrix addition, since we can convert multiplication to addition?
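To make that concrete, here is a toy numpy sketch of a transformer-style feed-forward block (shapes and init are arbitrary): the matmuls are there, but it's the non-linearity and the residual addition that keep it from collapsing into a single linear map.

    import numpy as np

    rng = np.random.default_rng(0)
    d_model, d_ff = 8, 32
    W1, W2 = rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model))

    def ffn_block(x: np.ndarray) -> np.ndarray:
        hidden = np.maximum(x @ W1, 0.0)   # matmul + ReLU: the non-linearity matters
        return x + hidden @ W2             # residual connection: "just" addition, but crucial

    x = rng.normal(size=(4, d_model))      # a toy batch of 4 token vectors
    print(ffn_block(x).shape)              # (4, 8)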
There is such a thing as oversimplification and I worry that we have hyper-optimized (over-optimized) for this. So I agree, saying they just "boil down to matrix multiplications" is fundamentally misleading. It provides no insight and only serves to mislead people.
It’s kind of like the different levels of abstraction.
For example, for software projects, the algorithmic level is where most people focus because that’s typically where the biggest optimizations happen. But in some critical scenarios, you have to peel back those layers—down to how the hardware or compiler works—to make the best choices (like picking the right CPU/GPU).
Likewise, with agents, you can work with high-level abstractions for most applications. But if you need to optimize or compare different approaches (tool use vs. MCP vs. prompt-based, for instance), you have to dig deeper into how they’re actually implemented.
If you can reduce complex matrix multiplications into simpler terms, then you may be able to focus the training based on those constraints to increase performance/efficiency.
The agentic AI capabilities of chatbotkit.com have nothing to do with workflows.
The graph rendering is simply for illustrative purposes, mostly to cater to people who think in terms of graphs, but the underlying mechanics are not nodes and edges and a flow that goes from one to the next.
Everything that was previously just called automation or pipeline processing on top of an LLM is now the buzzword "agents". The hype bubble needs constant feeding to keep from imploding.
Anthropic[0] and Google[1] are both pushing for a clear definition of an “agent” vs. an “agentic workflow”
tl;dr from Anthropic:
> Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
> Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.
Most “agents” today fall into the workflow category.
The foundation model makers are pushing their new models to be better at the second, “pure” agent, approach.
In practice, I’m not sure how effective the “pure” approach will work for most LLM-assisted tasks.
I liken it to a fresh intern who shows up with amnesia every day.
Even if you tell them what they did yesterday, they’re still liable to take a different path for today’s work.
My hunch is that we’ll see an evolution of this terminology, and agents of the future will still have some “guiderails” (note: not necessarily _guard_rails), that makes their behavior more predictable over long horizons.
Let me clarify: we are discussing how the Agent is internally implemented, given LLM calls and tools. It can be built using a graph, where one node makes decisions that branch out to tools and can loop back.
The workflow can vary. For example, it can involve multiple LLM calls chained together without branching or looping. It can also be built using a graph.
I know the terms "graph" and "workflow" can be a bit confusing. It’s like we have a low-level 'cache' at the CPU level and then a high-level 'cache' in software.
Yes, the difference is that in the “pure” agent approach, the model is the only thing directing what to do.
In a sense there’s still a graph of execution, but the graph isn’t known until the “agent” runs and decides what tools to use, in what order, and for how long.
There is no scaffold, just LLM + MCP (or w/e) in a loop.
Great write up! In my opinion, your description likely accurately models what AI agents are doing. Perhaps the graph could be static or dynamic. Either way - it makes sense! Also, thank you for removing the hype!
I found it understandable and clear. PocketFlow looks cool, although the magic with the - and >> operators seems a bit obtuse... Also, I think "simply" is a trap - an agent might be modeled by a graph, but that graph can be arbitrarily complex.
https://www.anthropic.com/engineering/building-effective-age...
"- Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
- Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks."
What Anthropic calls a "workflow" in the above definition is what most of the big enterprise software companies (Salesforce, ServiceNow, Workday, SAP, etc.) are building and calling AI Agents.
What Anthropic calls an "agent" in the above definition is what AI Researchers mean by the term. It's also something that mainly exists in their labs. Real world examples are fairly primitive right now, mainly stuff like Deep Research. That will change over time, but right now the hype far exceeds the reality.