>We believe that a hidden chain of thought presents a unique opportunity for monitoring models. Assuming it is faithful and legible, the hidden chain of thought allows us to "read the mind" of the model and understand its thought process. For example, in the future we may wish to monitor the chain of thought for signs of manipulating the user. However, for this to work the model must have freedom to express its thoughts in unaltered form, so we cannot train any policy compliance or user preferences onto the chain of thought. We also do not want to make an unaligned chain of thought directly visible to users.

>Therefore, after weighing multiple factors including user experience, competitive advantage, and the option to pursue the chain of thought monitoring, we have decided not to show the raw chains of thought to users. We acknowledge this decision has disadvantages. We strive to partially make up for it by teaching the model to reproduce any useful ideas from the chain of thought in the answer. For the o1 model series we show a model-generated summary of the chain of thought.

So, let's recap. We went from:

- Weights-available research prototype with full scientific documentation (GPT-2)

- Commercial-scale model with API access only, full scientific documentation (GPT-3)

- Even bigger API-only model, tuned for chain-of-thought reasoning, minimal documentation on the implementation (GPT-4, 4v, 4o)

- An API-only model tuned to generate unedited chain-of-thought, which will not be shown to the user, even though it'd be really useful to have (o1)


> For example, in the future we may wish to monitor the chain of thought for signs of manipulating the user.[...] Therefore we have decided not to show the raw chains of thought to users.

Better not let the user see the part where the AI says "Next, let's manipulate the user by lying to them". It's for their own good, after all! We wouldn't want to make an unaligned chain of thought directly visible!


The hidden chain of thought tokens are also billed as output tokens, so you still pay for them even though they're not going to let you see them:

> While reasoning tokens are not visible via the API, they still occupy space in the model's context window and are billed as output tokens.

https://platform.openai.com/docs/guides/reasoning
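
For what it's worth, the hidden tokens do at least show up in the usage numbers, so you can see what you're paying for. A minimal sketch with the Python SDK; the usage field names are as documented for the o1 models at launch, so treat them as assumptions if the SDK has since changed:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    resp = client.chat.completions.create(
        model="o1-preview",
        messages=[{"role": "user", "content": "How many primes are there below 100?"}],
    )

    usage = resp.usage
    # completion_tokens includes the hidden reasoning tokens; you are billed for all of them
    print("completion tokens (billed):", usage.completion_tokens)
    print("of which hidden reasoning:", usage.completion_tokens_details.reasoning_tokens)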


I always laughed at the idea of an LLM Skynet "secretly" plotting to nuke humanity, while a bunch of humans watch it unfold before their eyes in plaintext.

Now that seems less likely. At least OpenAI can see what it's thinking.

A next step might be allowing the LLM to include non-text-based vectors in its internal thoughts, and then do all internal reasoning with raw vectors. Then the LLMs will have truly private thoughts in their own internal language. Perhaps we will use a LLM to interpret the secret thoughts of another LLM?

This could be good or bad, but either way we're going to need more GPUs.


"...either way we're going to need more GPUs." posted the LLM, rubbing it's virtual hands, cackling with delight as it prodded the humans to give it MOAR BRAINS

> Now that seems less likely. At least OpenAI can see what it's thinking.

When it's fully commercialized, no one will be able to read through all the chains of thought, and with the possibility of fine-tuning, the AI can learn to evade whatever tools OpenAI invents to flag concerning chains of thought, if those tools interfere with providing the answer in some fine-tuning environment.

Also, at some point, for the sake of efficiency and response quality, they might migrate from a chain of thought consisting of tokens to one consisting of full network output states, with part of the network having dedicated inputs for reading them.


At this point the G in GPU must be completely dropped

Gen-ai Production Unit

>Perhaps we will use a LLM to interpret the secret thoughts of another LLM?

this is a pretty active area of research with sparse autoencoders
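
For anyone unfamiliar, the core trick is an overcomplete autoencoder trained on a model's hidden activations with a sparsity penalty, so that individual latent features become (hopefully) human-interpretable. A toy sketch in PyTorch; the dimensions and L1 coefficient are arbitrary choices for illustration, not anyone's published setup:

    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        # d_model: width of the LLM's residual stream; d_hidden: overcomplete dictionary size
        def __init__(self, d_model=768, d_hidden=8192):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_hidden)
            self.decoder = nn.Linear(d_hidden, d_model)

        def forward(self, activations):
            features = torch.relu(self.encoder(activations))  # sparse feature activations
            return self.decoder(features), features

    sae = SparseAutoencoder()
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
    l1_coeff = 1e-3  # arbitrary; tuned in practice

    # `acts` stands in for hidden activations captured from the model being interpreted
    acts = torch.randn(64, 768)
    recon, feats = sae(acts)
    loss = ((recon - acts) ** 2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()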


It's clear to me that OpenAI is quickly realizing they have no moat. Even this obfuscation of the chain-of-thought isn't really a moat. On top of CoT being pretty easy to implement and tweak, there's a serious push to on-device inference (which imo is the future), so the question is: will GPT-5 and beyond be really that much better than what we can run locally?

I wonder if they'll be able to push the chain-of-thought directly into the model. I'd imagine there could be some serious performance gains achievable if the model could "think" without doing IO on each cycle.

In terms of moat, I think people underestimate how much of OpenAI's moat is based on operations and infrastructure rather than purely on model intelligence. As someone building on the API, I find it by far the most reliable option out there currently. Claude Sonnet 3.5 is stronger on reasoning than gpt-4o but has a higher error rate, more errors when conforming to a JSON schema, much lower rate limits, etc. These things are less important if you're just using the first-party chat interfaces but are very important if you're building on top of the APIs.
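
In practice "building on the API" ends up looking like defensive code around exactly those failure modes: validate the structured output, back off, retry. A rough, provider-agnostic sketch (the helper, its parameters, and the retry count are made up for illustration):

    import json
    import time

    def call_with_retries(call_model, required_keys, max_attempts=3):
        # call_model: any zero-argument function returning the model's raw text reply.
        for attempt in range(max_attempts):
            raw = call_model()
            try:
                data = json.loads(raw)  # malformed JSON raises ValueError here
            except ValueError:
                time.sleep(2 ** attempt)  # back off before retrying
                continue
            missing = [k for k in required_keys if k not in data]
            if not missing:
                return data
            time.sleep(2 ** attempt)
        raise RuntimeError("model never returned JSON matching the expected keys")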


I don't understand the idea that they have no moat. Their moat is not technological. It's sociological. Most AI through APIs uses their models. Most consumer use of AI involves their models, or ChatGPT directly. They're clearly not in the "train your own model on your data in your environment" game, as that's a market for someone else. But make no mistake, they have a moat and it is strong.

> But make no mistake, they have a moat and it is strong.

Given that Mistral, Llama, Claude, and even Gemini are competitive with (if not better than) OpenAI's flagships, I don't really think this is true.


There are countless tools competitive with or better than what I use for email, and yet I still stick with my email client. Same is true for many, many other tools I use. I could perhaps go out of my way to make sure I'm always using the most technically capable and easy-to-use tools for everything, but I don't, because I know how to use what I have.

This is the exact dynamic that gives OpenAI a moat. And it certainly doesn't hurt them that they still produce SOTA models.


That's not a strong moat (arguably not a moat at all, since as soon as any competitor has any business, they benefit from the same inertia with their own existing customers): it doesn't affect anyone who isn't already invested in OpenAI's products, and not every customer behaves that way with the products they're currently using anyway.

Now, having a large existing customer base, and thus an advantage in training data that feeds into an advantage in improving their products and acquiring new (and retaining existing) customers, could arguably be a moat; that's a network effect, not merely inertia, and network effects can be a foundation of strong (though potentially unstable, if there is nothing else shoring them up) moats.


That is not what anyone means when they talk about moats.

I'm someone, and that's one of the ways I define a moat.

> I'm someone

Asserting facts not in evidence, as they say.


First mover advantage is not a great moat.

Yeah, but the lock-in with email is absolutely huge compared to chatting with an LLM. I can easily end my subscription to ChatGPT and switch to Claude (and I have), because it provides much more value to me at roughly the same cost. Switching email providers will, in general, not provide that much value to me and would cause a large headache.

Switching LLMs right now can be compared to switching electricity providers or mobile carriers - generally it's pretty low friction and provides immediate benefit (in the case of electricity and mobile, the benefit is cost).

You simply cannot compare it to an email provider.


It was pretty simple for me to switch email providers about six years ago when I decided I'd do it, although it's worth noting that my reasons for doing so were motivated by a strong desire for privacy, not by noticing that another email provider did email better.

I elaborated a little more here on why I think OpenAI has quite the moat: https://news.ycombinator.com/item?id=41526082


Inertia is a hell of a moat.

Everyone building is comfortable with OpenAI's API and already has an account. Competing models can't just be as good; they need to be MUCH better to be worth switching.

Even as competitors build a sort of compatibility layer to be plug and play with OpenAI, they will always be a step behind at best every time OpenAI releases a new feature.
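
That compatibility layer is often nothing more than the OpenAI client pointed at a different base URL, since most competitors expose OpenAI-style endpoints. A sketch; the endpoint, key, and model name are placeholders rather than any specific provider's values:

    from openai import OpenAI

    # Same client, same code paths; only the endpoint and model name change.
    client = OpenAI(
        base_url="https://example-provider.local/v1",  # placeholder endpoint
        api_key="PROVIDER_API_KEY",                    # placeholder key
    )

    resp = client.chat.completions.create(
        model="some-competing-model",                  # placeholder model name
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)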


Only a small fraction of all future AI projects have even gotten started. So they aren't only fighting over what's out there now, they're fighting over what will emerge.

This is true, and yet many orgs have experimented with OpenAI and are likely to return to them when a project "becomes real". When you google around online for how to do XYZ thing using LLMs, OpenAI is usually in whatever web results you read. Other models and APIs are also now using OpenAI's API format since it's the apparent winner. And for anyone who's already sent out subprocessor notifications with them as a vendor, they're locked in.

This isn't to say it's only going to be an OpenAI market. Enterprise worlds move differently, such as those in G Cloud who will buy a few million $$ of Vertex expecting to "figure out that gemini stuff later". In that sense, Google has a moat with those slices of their customers.

But when people think OpenAI has no moat because "the models will be a commodity", I think that's (a) some wishful thinking about the models and (b) doesn't consider the sociological factors that matter a lot more than how powerful a model is or where it runs.


Doesn't that make it less of a moat? If the average consumer only interacts with it through a third party, that third party has the ability to switch to something better or cheaper and thus move thousands or millions of customers at once.

Their moat is no stronger than a good UI/API. What they have is first mover advantage and branding.

LiteLLM proxies their API to all other providers, and there are dozens of FOSS recreations of their UI, including ones that are more feature-rich, so neither the UI nor the API is a moat.
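
For context, the LiteLLM pattern is one call signature routed to whichever provider the model string names; something like the sketch below (the model identifiers are examples and may have changed, so check LiteLLM's docs for current names):

    from litellm import completion

    # Same function, different providers; LiteLLM translates to each vendor's API.
    messages = [{"role": "user", "content": "Summarize the o1 announcement."}]

    openai_resp = completion(model="gpt-4o", messages=messages)
    claude_resp = completion(model="claude-3-5-sonnet-20240620", messages=messages)

    print(openai_resp.choices[0].message.content)
    print(claude_resp.choices[0].message.content)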

Branding and first mover is it, and it's not going to keep them ahead forever.


I don't see why on-device inference is the future. For consumers, only a small set of use cases cannot tolerate the increased latency. Corporate customers will be satisfied if the model can be hosted within their borders. Pooling compute is less wasteful overall as a collective strategy.

This argument can really only meet its tipping point when massive models no longer offer a gotta-have-it difference vs smaller models.


On-device inference will succeed the way Linux does: It is "free" in that it only requires the user to acquire a model to run vs. paying for processing. It protects privacy, and it doesn't require internet. It may not take over for all users, but it will be around.
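
And the "acquire a model to run" step is already pretty low-friction in tooling terms. A rough sketch with the Hugging Face transformers pipeline; the model name is a placeholder for whichever open-weights model you actually pull:

    from transformers import pipeline

    # Runs entirely on-device once the weights are cached; no API calls involved.
    # "some-org/some-open-weights-model" is a placeholder, not a specific recommendation.
    generate = pipeline("text-generation", model="some-org/some-open-weights-model")

    out = generate("Why does on-device inference matter?", max_new_tokens=100)
    print(out[0]["generated_text"])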

This assumes that openly developed (or at least weight-available) models are available for free, and continue being improved.


Why would a non-profit / capped-profit company, one that prioritizes public good, want a moat? Tongue in cheek.

>there’s a serious push to on-device inference

What push are you referring to? By whom?


Based on their graphs of how quality scales well with compute cycles, I would expect that it would indeed continue to be that much better (unless you can afford the same compute locally).

Not much of a moat vs other private enterprise, though

I think it's clear their strategy has changed. The whole landscape has changed. The size of models, amount of dollars, numbers of competitors and how much compute this whole exercise takes in the long term have all changed, so it's fair for them to adapt.

It just so happens that they're keeping their old name.

I think people focus too much on the "open" part of the name. I read "OpenAI" sort of like I read "Blackberry" or "Apple". I don't really think of fruits, I think of companies and their products.


Very anti-open, and getting less open with each release. Rooting for Meta in this regard, at least.

It's because there is nothing novel here from an architectural point of view. Again, the secret sauce is only in the training data.

o1 seems like a variant of RLRF: https://arxiv.org/abs/2403.14238

Soon you will see similar models from competitors.


Did OpenAI ever even claim that they would be an open source company?

It seems like their driving mission has always been to create AI that is the "most beneficial to society", which might come in many different flavors, including closed source.


> Because of AI’s surprising history, it’s hard to predict when human-level AI might come within reach. When it does, it’ll be important to have a leading research institution which can prioritize a good outcome for all over its own self-interest.

> We’re hoping to grow OpenAI into such an institution. As a non-profit, our aim is to build value for everyone rather than shareholders. Researchers will be strongly encouraged to publish their work, whether as papers, blog posts, or code, and our patents (if any) will be shared with the world. We’ll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.

I don't see much evidence that the OpenAI that exists now—after Altman's ousting, his return, and the ousting of those who ousted him—has any interest in mind besides its own.

https://openai.com/index/introducing-openai/


https://web.archive.org/web/20190224031626/https://blog.open...

> Researchers will be strongly encouraged to publish their work, whether as papers, blog posts, or code, and our patents (if any) will be shared with the world. We’ll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.

From their very own website. Of course they deleted it as soon as Altman took over and turned it into a for profit, closed company.


Kind of?

>We're hoping to grow OpenAI into such an institution. As a non-profit, our aim is to build value for everyone rather than shareholders. Researchers will be strongly encouraged to publish their work, whether as papers, blog posts, or code, and our patents (if any) will be shared with the world. We'll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.

https://web.archive.org/web/20160220125157/https://www.opena...


Given the chain of thought is sitting in the context, I'm sure someone enterprising will find a way to extract it via a jailbreak (despite it being better at preventing jailbreaks).
