Maybe include in a prompt a threat of legal punishment? Surely somebody has already tried that and tabulated how much it improves scores on different benchmarks.
I suspect the big AI companies try to adversarially train that out, as it could be used to "jailbreak" their AI.
I wonder, though: what would count as a meaningful punishment or reward to an AI agent? More or less training compute? Web search rate limits? That assumes what the AI "wants" is to increase its own intelligence.
An LLM's response being its best prediction of the next token arguably isn't that far off from a human motivated to do their best. It's a fallible best effort either way.
And both are very far from the certainty the author seems to demand.
An LLM isn't providing its "best" prediction; it's providing "a" prediction. If it always emitted the single most probable token (greedy decoding), the output would be deterministic.
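A toy sketch of the distinction, using a made-up three-token distribution rather than real model logits: greedy decoding always picks the argmax and gives the same output every run, while temperature sampling draws from the distribution and varies.

```python
import numpy as np

# Hypothetical next-token probabilities, purely for illustration
tokens = ["cat", "dog", "fish"]
probs = np.array([0.5, 0.3, 0.2])

def greedy(probs):
    # Always pick the single most probable token: deterministic
    return int(np.argmax(probs))

def sample(probs, temperature=1.0, rng=None):
    # Rescale log-probabilities by temperature, renormalize, then draw: stochastic
    rng = rng or np.random.default_rng()
    logits = np.log(probs) / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(len(probs), p=p))

print([tokens[greedy(probs)] for _ in range(5)])  # same token every time
print([tokens[sample(probs)] for _ in range(5)])  # varies run to run
```

Real decoders do this over a vocabulary-sized logit vector, but the mechanism is the same: the non-determinism comes from sampling, not from the model changing its "best" answer.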
In my mind the issue is more accountability than concerns about quality. If a person acts in a bizarre way, they can be fired, or helped, in ways that an LLM never can. When Gemini tells a student to kill themselves, we have no recourse beyond trying to implement output filtering, or completely replacing the model with something that likely has the same unpredictable, unaccountable behavior.
Are you sure that always providing the best guess would make the output deterministic? Isn't the fundamental point of learning, whether done by machine or human, that our best gets better and is hence non-deterministic? Doesn't what counts as best depend on context?