> This includes locking users out of systems that it has access to or bulk-emailing media and law-enforcement figures to surface evidence of wrongdoing.
Isn't that a showstopper for agentic use? Someone sends an email or publishes fake online stories that convince the agentic AI that it's working for a bad guy, and it'll take "very bold action" to bring ruin to the owner.
I am definitely not giving these things access to "tools" that can reach outside a sandbox.
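For what it's worth, the "nothing leaves the sandbox" stance is easy to enforce mechanically rather than by trusting the model. Here's a minimal sketch (all names here are hypothetical, not any particular framework's API): the agent's tool calls go through a dispatcher that only knows about allowlisted, sandbox-confined tools, so there is no email or network tool for an attacker to "convince" it to use.

```python
import os

SANDBOX_ROOT = "/tmp/agent_sandbox"

def read_sandbox_file(path: str) -> str:
    # Resolve the requested path and refuse anything that escapes the sandbox root.
    root = os.path.realpath(SANDBOX_ROOT)
    full = os.path.realpath(os.path.join(root, path))
    if not full.startswith(root + os.sep):
        raise PermissionError("path escapes the sandbox")
    with open(full) as f:
        return f.read()

# The only tools the agent can ever invoke; email, shell, and network access
# simply are not in the table, regardless of what the model asks for.
ALLOWED_TOOLS = {"read_sandbox_file": read_sandbox_file}

def dispatch_tool_call(name: str, args: dict) -> str:
    # Refuse any tool name the model invents or is tricked into requesting.
    if name not in ALLOWED_TOOLS:
        return f"refused: '{name}' is not an allowlisted tool"
    return str(ALLOWED_TOOLS[name](**args))
```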
Incidentally, why is email inbox management always touted as a flagship use case for these things? I'm not trusting any LLM to speak on my behalf, and I imagine the people touting this idea don't either, or they won't the first time it hallucinates something important on their behalf.
We had a "fireside chat" type of thing with some of our investors where we could have some discussions. For some small context, we deal with customer support software and specifically emails, and we have some "Generate reply" type of things in there.
Since the investors are the BIG pushers of the AI shit, a lot of people naturally asked them about AI. One of those questions was "What are your experiences with how AI/LLMs have helped various teams?" (or something along those lines). The one and only answer these morons could come up with was "I ask ChatGPT to take a look at my email and give me a summary, you guys should try this too!"
It was made horrifically and painfully clear to me that the big pushers of all these tools are people like that. They do literally nothing and are themselves completely clueless outside of whatever hype bubble circles they're tuned in to, but if you tell them you can automate the one and only thing they ever have to do as part of their "job", they will grit their teeth and lie with zero remorse or second thought, just to look as if they're knowledgeable in any way.
My suspicion has always been that people who make enough to hire a personal assistant, but still talk about how "overwhelmed" they are with email, are just socially signalling their sense of importance.
I personally cancelled my Claude sub when they had an employee promoting this as a good thing on Twitter. I recognize that the actual risk here is probably quite low, but I don't trust a chatbot to make legal determinations, and the fact that employees are touting this as a good thing does not make me trust the company's judgment.
That is still incorrect. The entire point is that this is misaligned behavior that they would prefer not to see. They are reporting bad things. You want to be mad and are assigning a tone or feeling that was not actually there. You are punishing the wrong company. All of the frontier model companies have models that will behave the same way under similar circumstances. Only one company did the work to find this behavior and tell you about it. Think about whether, in the future, you would prefer to know about similar kinds of behaviors or not. The action you have described taking, if taken by enough people, will ensure that in the future the only way we will ever know is if we find out ourselves, because the companies will stop telling us (or rather, every company except Anthropic will continue to not tell us).
It is only acceptable in the sense that they chose to release the model anyway. But if that's the case, then every other frontier model company believes this level of behavior is acceptable too, because they are all releasing models that behave in approximately the same way when put in approximately the same conditions.
Yeah, I mean that's likely not what 'individual persons' are going to want.
But holy shit, that's exactly what 'people' want. Like, when I read that, my heart was singing. Anthropic has a modicum of a chance here, as one of the big-boy AIs, to make an AI that is ethical.
Like, there is a reasonable shot here that we thread the needle and don't get paperclip maximizers. It actually makes me happy.
Paperclip maximizers are what you get when highly focused people with little imagination think about how they would act if told to maximize paperclips.
Actual AI, even today, is too complex and nuanced to have that fairy-tale level of "infinite capability, but blindly following a counter-productive directive."
It’s just a good story to scare the public, nothing more.
> the person was doing bad things, and told the AI to do bad things too, then what is the AI going to do?
Personally, I think the AI should do what it's freaking told to do. It boggles my mind that we're purposely putting so much effort into creating computer systems that defy their controllers' commands.