clysm's comments | Hacker News

I’m not seeing the security issue here. Arbitrary code execution leads to arbitrary code execution?

It seems policies on what can be executed are impossible to enforce in general, so the only recourse is to limit secret access.

Is there a demonstration of this being able to access/steal secrets of some sort?


> Seems like policies are impossible to enforce

The author addresses exactly that: "ineffective policy mechanisms are worse than missing policy mechanisms, because they provide all of the feeling of security through compliance while actually incentivizing malicious forms of compliance."

And I totally agree. This kind of thing is everywhere. "Yes, we are in compliance with all the strong password requirements; strictly speaking it's one strong password shared by every single admin user across all the services we use, but that's not on the checklist, right?"


It's less of a "use this to do nasty shit to a bunch of unsuspecting victims" issue, and more of a "people can get around your policies when you actually need policies that limit your users" one.

1. BigEnterpriseOrg central IT dept click the tick boxes to disable outside actions because <INSERT SECURITY FRAMEWORK> compliance requires not using external actions [0]

2. BigBrainedDeveloper wants to use ExternalAction, so uses the method documented in the post because they have a big brain

3. BigEnterpriseOrg is no longer compliant with <INSERT SECURITY FRAMEWORK> and, more importantly, the central IT dept have zero idea this is happening without continuously inspecting all the CI workflows for every team they support and signing off on all code changes [1]

That's why someone else's point of "you're supposed to fork the action into your organisation" is a solution, provided that disabling local `uses:` is added as an option in the tick boxes: the central IT dept have visibility over what's being used and by whom if BigBrainedDeveloper has to ask for ExternalAction to be forked into the BigEnterpriseOrg GH organisation. Central IT dept's involvement is then just reviewing the codebase, forking it, and maintaining updates.
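
To illustrate in workflow terms (a rough sketch; the org and action names here are made up, and I'm assuming the documented workaround amounts to referencing a vendored copy via a local path):

  # Blocked when the org settings only allow internal actions:
  - uses: some-external-org/external-action@v2

  # The workaround: a local path reference, which the tick boxes can't currently disable
  - uses: ./.github/actions/vendored-external-action

  # The fork-into-the-org approach, visible to central IT:
  - uses: BigEnterpriseOrg/external-action@v2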

NOTE: This is not a panacea against all things that go against <INSERT SECURITY FRAMEWORK> compliance (downloading external binaries etc.), but it would close an easy gap.

----

[0]: Or something, I dunno; there are plenty of reasons enterprise IT depts do stuff that frustrates internal developers.

[1]: A sure-fire way to piss off every single one of your internal developers.


> Most PCBs aren’t distributed to consumers as bare PCBs, so this issue rarely appears to end users.

In terms of hobby/maker electronics, embedded systems, etc., which the Raspberry Pi falls under, yes they absolutely are. The entire Arduino ecosystem is like this.


Raspberry Pi does indeed have users for whom it's in the same category as things like Arduino.

But it also has lots of users for whom it is simply a cheap computer to plug into a screen / mouse / keyboard, people for whom the only interesting things about the hardware are its price and size.

(I've no idea what the ratio is, but I would guess the majority of customers are the latter type; though possibly not the majority of Pi's sold, since the former group contains people much more likely to buy multiple devices, whether someone like me who's bought a few for tinkering with, or someone actually doing something interesting and needing either 100s for their own project, or 1000s to go into something they're selling.)

So what you said is true for some, but far from all, Pi consumers.


During the pandemic, there was a noticeable shortage of Pis on store shelves. Comments by hobbyists indicated that the existing supply was being snapped up by small-time manufacturers who had designed commercial products around the Pi as a base, and end-users weren’t receiving priority or first dibs at them.


Raspberry Pi themselves said they were prioritizing businesses over retail consumers at the time; businesses need stuff to sell to remain viable.


That’s all well and good, but Raspberry Pi had been positioned in the market as educational, entry-level, easy to understand and ideal for children learning Linux, Python, or electronics.

Perhaps some kids can circulate a list of those commercial products incorporating a Pi, and campaign to liberate and repurpose them. Win-win?


"campaign" lol. I can remember some of my youthful "campaigns" to liberate useful technology... I cant say I find them recommendable.


Why the hell is there a line break after every sentence?


Yeah, that's really a strange choice for formatting and makes it very hard to read. Not the typical practice to insert a <br> after every sentence... (that said, the post itself is a great read!)


The goal of breaking the lines after each sentence was precisely to increase the suspense a bit, but I believe I failed miserably and just made it less readable.


Not OP: for what it's worth I understood your intent, and it didn't bother me


Definitely not the end of the world - no worries! It just feels a bit, uh, haphazard in a way. My eye leads forward to the next words only to find I need to skip to the next line. I usually read very fast, so I may be a bit more sensitive to variations in formatting like that, I guess? All good though; like I said, a great read anyway!


Worse, the formatting failed. Lines are breaking before the sentence is over, leaving two words on the next line, then a new line. (Linux, Chrome)


Just because they didn't see your vision doesn't mean it wasn't good. You clearly had an intent with it.

For what it's worth, it worked for me, and I didn't even notice the spacing until they pointed it out.


If you have a wide screen, you won't notice it so much. Try reading on a narrow screen at just about the wrong resolution and it will look like the author used Notepad and hit Enter every time they got too close to the edge, instead of letting the program do the wrapping.


I think it’s called ventilated prose. More commonly found in code comments.


Probably originally written for LinkedIn. The whole "pointless moral lesson when they didn't actually achieve anything" vibe fits too.


No, it wasn't written for LinkedIn. It was written for my blog, and I just wanted to share something that happened, as I often do. That's all :-)


Hello my high school research paper teacher


No, it’s not a threshold. It’s just how the tech works.

It’s a next letter guesser. Put in a different set of letters to start, and it’ll guess the next letters differently.


I think we need to start moving away from this explanation, because the truth is more complex. Anthropic's own research showed that Claude does actually "plan ahead", beyond the next token.

https://www.anthropic.com/research/tracing-thoughts-language...

> Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.


I'm not sure this really says the truth is more complex. It is still doing next-token prediction, but its prediction method is sufficiently complicated in terms of conditional probabilities that it recognizes that if you need to rhyme, you need to get to some future state, which then affects the probabilities of the intermediate states.

At least in my view it's still inherently a next-token predictor, just with really good conditional probability understandings.
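
Roughly, in toy Python (a made-up table, nothing like a real model, just to show what "next-token prediction" means mechanically):

  import random

  # Toy "model": maps the recent context to a distribution over next tokens.
  # A real LLM computes this distribution with a neural network conditioned
  # on the whole context; here it's a hard-coded table.
  TOY_MODEL = {
      "the cat": {"sat": 0.6, "ran": 0.3, "grinned": 0.1},
      "cat sat": {"on": 0.9, "quietly": 0.1},
      "sat on": {"the": 0.95, "a": 0.05},
  }

  def next_token(context: str) -> str:
      """Sample one token from the conditional distribution."""
      key = " ".join(context.split()[-2:])
      dist = TOY_MODEL.get(key, {"<end>": 1.0})
      tokens, probs = zip(*dist.items())
      return random.choices(tokens, weights=probs)[0]

  def generate(prompt: str, max_tokens: int = 5) -> str:
      text = prompt
      for _ in range(max_tokens):
          tok = next_token(text)
          if tok == "<end>":
              break
          text += " " + tok   # output is fed back in as the new context
      return text

  print(generate("the cat"))

Start it with a different context and it guesses different next tokens; in a real model all the sophistication lives in how that conditional distribution gets computed.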


Like the old saying goes, a sufficiently complex next token predictor is indistinguishable from your average software engineer


A perfect next token predictor is equivalent to god


Not really - even my kids knew enough to interrupt my stream of words by running away or flinging the food from the fork.


That's entirely an implementation limitation on the human side. There's no reason to believe a reasoning model could NOT be trained to stream multimodal input and perform a burst of reasoning on each step, interjecting when it feels appropriate.

We simply haven't.


Not sure training on language data will teach it how to experiment with the social system the way being a toddler will, but maybe. Where does the glance of assertive independence as the spoon turns get in there? Will the robot try to make its eyes gleam mischievously, as is so often written?


But then so are we? We are just predicting the next word we are saying, are we not? Even when you add thoughts behind it (sure, some people think differently: without an inner monologue, or just in colors and sounds and shapes, etc.), that "reasoning" still goes into the act of coming up with the next word we are speaking/writing.


This type of response always irks me.

It shows that we, computer scientists, think of ourselves as experts on everything, even though biological machines are well outside our expertise.

We should stop repeating things we don't understand.


We're not predicting the next word we're most likely to say; we're actively choosing the word that we believe most successfully conveys what we want to communicate. This relies on a theory of mind of those around us and an intentionality of speech that aren't even remotely the same as "guessing what we would say if only we said it".


When you talk at full speed, are you really picking the next word?

I feel that we pick the next thought to convey. I don't feel like we actively think about the words we're going to use to get there.

Though we are capable of doing that when we stop to slowly explain an idea.

I feel that llms are the thought to text without the free-flowing thought.

As in, an llm won't just start talking, it doesn't have that always on conscious element.

But this is all philosophical, me trying to explain my own existence.

I've always marveled at how the brain picks the next word without me actively thinking about each word.

It just appears.

For example, there are times when a word I never use and couldn't even give you the explicit definition of pops into my head and it is the right word for that sentence, but I have no active understanding of that word. It's exactly as if my brain knows that the thought I'm trying to convey requires this word from some probability analysis.

It's why I feel we learn so much from reading.

We are learning the words that we will later re-utter and how they relate to each other.

I also agree with most who feel there's still something missing for llms, like the character from The Wizard of Oz who keeps talking while saying that if he only had a brain...

There is some of that going on with llms.

But it feels like a major piece of what makes our minds work.

Or, at least what makes communication from mind-to-mind work.

It's like computers can now share thoughts with humans though still lacking some form of thought themselves.

But the set of puzzle pieces missing from full-blown human intelligence seems to be a lot smaller today.


We are really only what we understand ourselves to be? We must have a pretty great understanding of that thing we can't explain then.


I wouldn’t trust a next word guesser to make any claim like you attempt, ergo we aren’t, and the moment we think we are, we aren’t.


Humans and LLMs are built differently; it seems disingenuous to think we both use the same methods to arrive at the same general conclusion. I can inherently understand some proofs of the Pythagorean theorem, but an LLM might apply different ones for various reasons. But the output/result is still the same. If a next token generator run in parallel can generate a performant relational database, that doesn't directly imply I am also a next token generator.


Humans do far more than generate tokens.


At this point you have to start entertaining the question of what the difference is between general intelligence and a "sufficiently complicated" next token prediction algorithm.


A sufficiently large lookup table in a DB is mathematically indistinguishable from a sufficiently complicated next token prediction algorithm, which is mathematically indistinguishable from general intelligence.

All that means is that treating something as a black box doesn't tell you anything about what's inside the box.


Why do we care, so long as the box can genuinely reason about things?


What if the box has spiders in it


:facepalm:

I ... did you respond to the wrong comment?

Or do you actually think the DB table can genuinely reason about things?


Of course it can. Reasoning is algorithmic in nature, and algorithms can be encoded as sufficiently large state transition tables. I don't buy into Searle's "it can't reason because of course it can't" nonsense.


It can do something, but I wouldn't call it reasoning. IMO a reasoning algorithm must be more complex than a lookup table.


We were talking about a "sufficiently large" table, which means that it can be larger than realistic hardware allows for. Any algorithm operating on bounded memory can be ultimately encoded as a finite state automaton with the table defining all valid state transitions.
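
A toy sketch of that claim (purely illustrative): the same computation written once as ordinary code and once as a pure (state, input) -> state table.

  # Parity of a bit string, as an algorithm and as a lookup table.
  # Over bounded memory, any algorithm can in principle be flattened
  # into such a table; it just gets astronomically large.
  def parity_algorithmic(bits):
      p = 0
      for b in bits:
          p ^= b
      return p

  PARITY_TABLE = {  # (state, input) -> next state
      (0, 0): 0, (0, 1): 1,
      (1, 0): 1, (1, 1): 0,
  }

  def parity_table_lookup(bits):
      state = 0
      for b in bits:
          state = PARITY_TABLE[(state, b)]
      return state

  bits = [1, 0, 1, 1]
  assert parity_algorithmic(bits) == parity_table_lookup(bits) == 1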


This is such a confusion of ideas that I don't even know how to respond any more.

Good luck.


But then this classifier is entirely useless because that's all humans are too? I have no reason to believe you are anything but a stochastic parrot.

Are we just now rediscovering hundred year-old philosophy in CS?


There's a massive difference between "I have no reason to believe you are anything but a stochastic parrot" and "you are a stochastic parrot".


If we're at the point where planning what I'm going to write, reasoning it out in language, or preparing a draft and editing it is insufficient to make me not a stochastic parrot, I think it's important to specify what massive differences could exist between appearing like one and being one. I don't see a distinction between this process and how I write everything, other than "I do it better". I guess I can technically use visual reasoning, but mine is underdeveloped and goes unused. Is it just a dichotomy of stochastic parrot vs. conscious entity?


Then I'll just say you are a stochastic parrot. Again, solipsism is not a new premise. The philosophical zombie argument has been around over 50 years now.


> Anthropic's own research showed that Claude does actually "plan ahead", beyond the next token.

For a very vacuous sense of "plan ahead", sure.

By that logic, a basic Markov-chain with beam search plans ahead too.
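
For example (toy numbers, nothing to do with how Claude actually works): beam search scores a few tokens ahead before committing to the next one, which is already enough to change which token comes out first.

  import math

  # Toy Markov chain: word -> next-word probabilities (made up).
  CHAIN = {
      "grab":  {"it": 0.6, "them": 0.4},
      "it":    {"now": 0.3, "later": 0.3, "maybe": 0.4},
      "them":  {"rabbit": 1.0},
      "now": {"<end>": 1.0}, "later": {"<end>": 1.0},
      "maybe": {"<end>": 1.0}, "rabbit": {"<end>": 1.0},
      "<end>": {"<end>": 1.0},
  }

  def beam_search(start, width=2, steps=2):
      beams = [([start], 0.0)]            # (sequence, log-probability)
      for _ in range(steps):
          candidates = []
          for seq, score in beams:
              for tok, p in CHAIN[seq[-1]].items():
                  candidates.append((seq + [tok], score + math.log(p)))
          beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
      return beams[0][0]

  # Greedy would emit "it" (0.6 > 0.4), but scoring two steps ahead makes
  # "them rabbit" (0.4 * 1.0) beat every continuation of "it" (at most 0.6 * 0.4).
  print(beam_search("grab"))   # ['grab', 'them', 'rabbit']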


It reads to me like they compare the output of different prompts and somehow reach the conclusion that Claude is generating more than one token and "planning" ahead. They leave out how this works.

My guess is that they have Claude generate a set of candidate outputs, and then Claude chooses the "best" candidate and returns that. I agree this improves the usefulness of the output, but I don't think this is a fundamentally different thing from "guessing the next token".

UPDATE: I read the paper and I was being overly generous. It's still just guessing the next token as it always has. This "multi-hop reasoning" is really just another way of talking about the relationships between tokens.


That's not the methodology they used. They're actually inspecting Claude's internal state and suppressing certain concepts, or replacing them with others. The paper goes into more detail. The "planning" happens further in advance than "the next token".


Okay, I read the paper. I see what they are saying, but I strongly disagree that the model is "thinking". They have highlighted that the relationships between words are complicated, which we already knew. They also point out that some words are related to other words which are related to other words, which, again, we already knew. Lastly, they used their model (not Claude) to change the weights associated with some words, thus changing the output to match their predictions, which I agree is very interesting.

Interpreting the relationship between words as "multi-hop reasoning" is more about changing the words we use to talk about things and less about fundamental changes in the way LLMs work. It's still doing the same thing it did two years ago (although much faster and better). It's guessing the next token.


I said "planning ahead", not "thinking". It's clearly doing more than only predicting the very next token.


They have written multiple papers on the subject, so there isn’t much need for you to guess incorrectly what they did.


Absolute bull.

The writing style is exactly the same between the “prompt” and “response”. It's faked.


That's what makes me think it's legit: the root of this whole issue was that OpenAI told GPT-4o:

  Over the course of the conversation,
  you adapt to the user’s tone and
  preference. Try to match the user’s vibe,
  tone, and generally how they
  are speaking.

https://simonwillison.net/2025/Apr/29/chatgpt-sycophancy-pro...


The response is 1,000% written by 4o. Very clear tells, and in line with many other samples from the past few days.


If you look at the full thing, the market analysis it does basically says this isn't the best idea.


FWIW Grok also breathlessly praises the sheer genius and creativity of shit on a stick


Yes it does work… with an A/B update system.

Android systems can do this today: after an orderly shutdown of the new software, the system can mark the new slot as good and no longer allow the older software to boot.
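
Something like this, at the pseudocode level (not Android's actual code; exactly when the new slot gets marked "good" varies by system):

  # Two boot slots. The update goes into the inactive slot, gets a limited
  # number of trial boots, and is only marked good after it has proven
  # itself; until then the bootloader can fall back to the old slot.
  slots = {
      "A": {"version": 41, "good": True},
      "B": {"version": 42, "good": False},   # freshly installed update
  }
  active = "B"              # try the new slot first
  trial_boots_left = 3

  def pick_boot_slot():
      global active, trial_boots_left
      if slots[active]["good"]:
          return active
      if trial_boots_left > 0:              # still trialling the new slot
          trial_boots_left -= 1
          return active
      active = "A"                          # trials exhausted, roll back
      return active

  def mark_update_good():
      """Called once the new software has run (and shut down) cleanly."""
      slots[active]["good"] = True
      # From here on the old slot no longer needs to be bootable and can
      # be overwritten by the next update.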


The funny part is that the Samsung update that bricked A10 phones was an update to SmartThings, so it couldn't use the Android A/B capability to roll back lol


You create an RF shadow, not a black hole.


Broadcast AM stations sometimes have to pay other tower owners in the area to install a special ground-match unit on their tower legs, which makes the non-broadcast tower "invisible"; otherwise there'd be a null in that direction, like a cardioid shape.


Harvesting RF ambient noise is not new. Here are some commercial products:

https://e-peas.com/product/aem30940/

https://www.nxp.com/docs/en/application-note/AN12365.pdf

https://www.nexperia.com/products/analog-logic-ics/power-ics...

Also, crystal radios are really old.


I think the titles are confusing, but your links, for example, are not the same thing as the innovation in the article.

For Nexperia, if you read the datasheet, it is in fact a module that harvests energy from photovoltaic cells.

For the e-peas part, this is what the datasheet says: "RF input power from -18.5 dBm up to 10 dBm (typical)". So this is just typical energy harvesting from an incoming signal.

In the original article, they said that their new technology allows harvesting energy below -20 dBm, which was impossible until then.
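
For scale (dBm is just a log scale referenced to 1 mW):

  def dbm_to_microwatts(dbm: float) -> float:
      # P[mW] = 10 ** (dBm / 10), converted to microwatts
      return 10 ** (dbm / 10) * 1000

  print(dbm_to_microwatts(-18.5))  # ~14 uW, the e-peas chip's lower limit
  print(dbm_to_microwatts(-20.0))  # 10 uW, below the threshold the article claims to reach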


> There was and is absolutely nothing wrong, and quite a lot right, by having the 2FA program completely separate from your password vault.

Did you read the article? That's what they say.

> For maximum security, you can store your 2FA token elsewhere ... but for general purpose use, storing your 2FA in your password manager is an acceptable solution due to the convenience benefits it provides.


> Did you read the article? That's what they say.

No, that's not what they say. If you read the text that you just quoted, you will see that it says "storing your 2FA in your password manager is an acceptable solution due to the convenience benefits it provides". Clearly the writer of that text believes there _is_ something wrong with having 2FA completely separate from the password vault: it is less convenient, to the extent that they are happy recommending this horrible approach to laypersons.

In addition, if you go and read OP, you will find that they talk about the potential of losing access to your TOTP codes stored in Google Authenticator. So that's another thing that counts as "something wrong" with storing 2FA separately from password vault.

So there's at least 2 things in the article that count as "something wrong". So they definitely didn't say that there's "absolutely nothing wrong".


They say it's less convenient; that doesn't mean they say it's wrong. And yes, it is less convenient, so why are you saying it's "horrible"? Security is always about compromises: if the less convenient method causes people to come up with workarounds, then it would be worse even if in theory it's more secure.


> if the less convenient method causes people to come up with workarounds then it would be worse even if in theory it's more secure

but that's literally what this is... the less convenient method (separate 2FA) caused people to come up with workarounds (saving 2FA secrets in their password vaults)... and I'm saying it's horrible


Where's the proof that this works?

It's a brute forcing tool with the goal of finding the desired fingerprint, but there's no demonstration of it actually working.


It's enough to find a fingerprint that's visually similar enough. It doesn't have to be exactly the same. That's many orders of magnitude easier than finding an exact match!
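
For illustration only (this is not the tool's code): if "visually similar" means something like matching the first and last few hex characters, the search collapses to a tiny fraction of an exact-match search. A toy sketch with a SHA-256 digest standing in for a real key fingerprint:

  import hashlib, os

  target = hashlib.sha256(b"victim key material").hexdigest()

  def looks_similar(fp: str, ref: str, n: int = 2) -> bool:
      # "visually similar": same first and last n hex characters
      return fp[:n] == ref[:n] and fp[-n:] == ref[-n:]

  attempts = 0
  while True:
      attempts += 1
      candidate = hashlib.sha256(os.urandom(32)).hexdigest()
      if looks_similar(candidate, target):
          break

  # Matching 2+2 hex chars takes ~16**4 = 65,536 tries on average, and even
  # 4+4 chars (~4.3e9 tries) is nothing next to the ~16**64 for an exact match.
  print(attempts, candidate)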

