What I've started experimenting with, and will continue to explore, is project-specific MCP tools.
I add MCP tools to tighten the feedback loop. I want my Agent to be able to act autonomously but with a tight set of capabilities that don't often align with off-the-shelf tools. I don't want to YOLO but I also don't want to babysit it for non-value-added, risk-free prompts.
So, when I'm developing in Go, I create `cmd/mcp` and configure a `go run ./cmd/mcp` MCP server for the Agent.
It helps that I'm quite invested in MCP and built github.com/ggoodman/mcp-server-go, which is one of the few (only?) MCP SDKs that let you scale horizontally over HTTPS while still supporting advanced features like elicitation and sampling. But for local tools, I can use the familiar and ergonomic stdio driver and have my Agent pump out the tools for me.
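To give a flavor of what that `cmd/mcp` entrypoint looks like, here's a stripped-down sketch of the stdio transport itself, deliberately hand-rolled on the standard library rather than showing any particular SDK's API; the `run_tests` tool and its wiring are made up for illustration:

```go
// cmd/mcp/main.go - a minimal, hand-rolled stdio JSON-RPC loop.
// This sketches the transport only, not mcp-server-go's API; a real
// server would also implement initialize, tools/call, and friends.
package main

import (
	"bufio"
	"encoding/json"
	"os"
)

type request struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params"`
}

type response struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Result  any             `json:"result,omitempty"`
}

func main() {
	in := bufio.NewScanner(os.Stdin)
	out := json.NewEncoder(os.Stdout)
	for in.Scan() { // MCP's stdio transport is newline-delimited JSON-RPC
		var req request
		if err := json.Unmarshal(in.Bytes(), &req); err != nil || req.ID == nil {
			continue // ignore notifications and garbage in this sketch
		}
		switch req.Method {
		case "tools/list":
			out.Encode(response{JSONRPC: "2.0", ID: req.ID, Result: map[string]any{
				"tools": []map[string]any{{
					"name":        "run_tests", // hypothetical project-specific tool
					"description": "Run the project's test suite and return the output.",
					"inputSchema": map[string]any{"type": "object"},
				}},
			}})
		default:
			out.Encode(response{JSONRPC: "2.0", ID: req.ID, Result: map[string]any{}})
		}
	}
}
```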
Horizontal scaling of remote MCP servers is something the spec sadly offers no recognition of. If you've done work in this space, bravo. I've been using a message bus to decouple the HTTP servers from the MCP request handlers. I'm still evolving the solution, but it's been interesting so far.
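Roughly, the decoupling has the shape below. The `Bus` interface and in-memory implementation are illustrative stand-ins for whatever broker actually sits between the tiers; the point is that the HTTP servers stay stateless and any handler instance can pick up the work:

```go
// A sketch of decoupling HTTP frontends from MCP request handlers via a
// message bus. The Bus interface and memBus are stand-ins for a real
// broker (NATS, Redis streams, etc.).
package main

import (
	"fmt"
	"sync"
)

// Bus is the minimal surface each tier needs: publish a message,
// subscribe to a subject.
type Bus interface {
	Publish(subject string, msg []byte)
	Subscribe(subject string, fn func(msg []byte))
}

type memBus struct {
	mu   sync.Mutex
	subs map[string][]func([]byte)
}

func newMemBus() *memBus { return &memBus{subs: map[string][]func([]byte){}} }

func (b *memBus) Subscribe(subject string, fn func([]byte)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[subject] = append(b.subs[subject], fn)
}

func (b *memBus) Publish(subject string, msg []byte) {
	b.mu.Lock()
	fns := append([]func([]byte){}, b.subs[subject]...)
	b.mu.Unlock()
	for _, fn := range fns {
		go fn(msg)
	}
}

func main() {
	bus := newMemBus()

	// MCP handler tier: consumes requests, replies on a per-request subject.
	bus.Subscribe("mcp.requests", func(msg []byte) {
		bus.Publish("mcp.replies.req-1", []byte(fmt.Sprintf("handled: %s", msg)))
	})

	// HTTP tier: publishes the request, awaits the reply for this request ID.
	done := make(chan struct{})
	bus.Subscribe("mcp.replies.req-1", func(msg []byte) {
		fmt.Println(string(msg))
		close(done)
	})
	bus.Publish("mcp.requests", []byte(`{"method":"tools/call","id":"req-1"}`))
	<-done
}
```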
In a previous professional life, I did financial modelling for a Big 4 accounting firm. We had tooling that let us visualize contiguous ranges of identical formulas (if you convert formulas to R1C1 addressing, similar formulas have the same representation: `=A1*2` in A2 and `=B1*2` in B2 both become `=R[-1]C*2`). This made overrides stick out like a sore thumb.
I suspect similar tools could be built for Claude and other LLMs, except that an LLM wouldn't be plagued by the mind-numbing tedium of doing this sort of audit.
An idea might be to require a financially meaningful deposit to pursue an account recovery like this. The deposit would be forfeit if the identity verification failed.
Though now that I write this, it creates a perverse incentive for a company to collect deposits and deny account recovery.
Both trick a privileged actor into doing something the user didn't intend, using inputs the system trusts.
In this case, a malicious PDF uses prompt injection to get a Notion agent (which already has access to your workspace) to call an external web tool and exfiltrate page content. This is similar to CSRF's core idea - an attacker causes an authenticated principal to make a request - except here the "principal" is an autonomous agent with tool access rather than the browser carrying cookies.
Thus, it's the same abuse-of-privilege pattern, just with a different technical surface (prompt injection + tool chaining vs. forged browser HTTP requests).
I'm fairly convinced that, with the right training, an LLM's ability to be "skeptical" and resilient to these kinds of attacks will be pretty robust.
The current problem is that making the models resistant to "persona" injection is in opposition to much of how the models are also used conversationally. I think this is why you'll end up with hardened "agent" models and then more open conversational models.
I suppose it is also possible that the models can have an additional non-prompt context applied that sets expectations, but that requires new architecture for those inputs.
Yeah, ultimately the LLM is `guess_what_could_come_next(document)` in a loop, with some I/O either doing something with the latest guess or else appending more content to the document from elsewhere.
Any distinctions inside the document live in the land of statistical patterns and weights, rather than hard, auditable logic.
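Spelled out as a sketch (all names here are hypothetical stand-ins), the whole agent loop is something like:

```go
// A deliberately crude sketch of the loop described above: the "model"
// is a pure guess-the-next-chunk function, and everything agent-like
// lives in the plumbing around it.
package main

import (
	"fmt"
	"strings"
)

func guessWhatCouldComeNext(document string) string {
	// Stand-in for the model: statistical pattern-matching, not logic.
	switch {
	case strings.Contains(document, "ASSISTANT:"):
		return "DONE"
	case strings.Contains(document, "TOOL_RESULT:"):
		return "ASSISTANT: the repo contains main.go and go.mod"
	default:
		return "CALL_TOOL(list_files)"
	}
}

func runTool(call string) string {
	// Stand-in for I/O: the result is just more text for the document.
	return "TOOL_RESULT: main.go, go.mod"
}

func main() {
	document := "USER: what files are in this repo?"
	for {
		next := guessWhatCouldComeNext(document)
		if next == "DONE" {
			break
		}
		if strings.HasPrefix(next, "CALL_TOOL") {
			// "Do something with the latest guess": run the tool...
			document += "\n" + runTool(next)
		} else {
			// ...or just append more content to the document.
			document += "\n" + next
		}
	}
	fmt.Println(document)
}
```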
What does "pretty robust" mean, and how do you even assess it? How often are you okay with your most sensitive information getting stolen? Is everyone else going to be okay with their information being compromised once or twice a year, every time someone finds a reproducible jailbreak?
If you were willing to bring in additional zod tooling or move to something like TypeBox (https://github.com/sinclairzx81/typebox), the JSON schema would be a direct derivation of the tools' input schemas in code.
The `json-schema-to-ts` npm package has a `FromSchema` type operator that converts the type of a JSON schema directly into the type of the values it describes (e.g. `type Args = FromSchema<typeof inputSchema>` for a schema declared `as const`). Zod and TypeBox are good options for users, but for the reference implementation I think a pure type-level solution would be better.
Ultimately though, I don't believe that channels are an abstraction that makes sense in JavaScript's concurrency model. Go's contexts, on the other hand, would be a huge improvement over AbortController and AbortSignal.
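To make the comparison concrete, here's a minimal sketch of what contexts buy you: deadlines and cancellation that propagate through every call that accepts a `ctx`, with no manual signal-threading:

```go
// Deadlines and cancellation propagate down the call tree via ctx,
// compared to threading an AbortSignal through each call by hand.
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

func fetchSomething(ctx context.Context) error {
	select {
	case <-time.After(5 * time.Second): // simulated slow work
		return nil
	case <-ctx.Done(): // cancellation arrives without any extra wiring
		return ctx.Err()
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()

	if err := fetchSomething(ctx); errors.Is(err, context.DeadlineExceeded) {
		fmt.Println("timed out:", err)
	}
}
```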
Software engineers don't want to manage physical hardware, and they often need to run highly available services. When a team lacks the skill, geographic presence, or bandwidth to manage physical servers but needs to deliver a highly available service, I think the cloud offers legitimate operational improvements, with downsides such as increased cost and decreased performance per unit of cost.