> What exactly are the current ones doing that makes them generate 'black Vikings'?
There is presumably a system prompt or similar that mandates diverse representation and is included even when inappropriate to the context.
> How would you change it so that it doesn't do that, but will also generate things that aren't only representative of the statistical majority of the large amount of training data it used?
Allow the user to put it into the prompt as appropriate.
> Would you be happy if every model output just represented 'the majority opinion' it has gained from its training data?
There is no "majority opinion" without context. The context is the prompt. Have you tried using these things? You can give it two prompts where the words are nominally synonyms for each other and the results will be very different, because those words are more often present in different contexts. If you want a particular context, you use the words that create that context, and the image reflects the difference.
> How would your method be different from how it is currently done, except for reflecting your own biases instead of those you don't like?
It's chosen by the user based on the context, instead of being imposed by the corporation as a universal constant.
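To make the distinction concrete, here is a toy sketch. The `generate_image` stub and the hidden instruction text are invented for illustration; this is not any vendor's actual pipeline:

```python
def generate_image(prompt: str) -> bytes:
    """Stand-in for whatever image model is being called; not a real API."""
    raise NotImplementedError

def provider_imposed(user_prompt: str) -> bytes:
    # The service silently prepends its own instruction to every request,
    # whether or not it fits the user's context ("Viking chieftain, 10th century").
    hidden_instruction = "Depict people of diverse ethnicities and genders."
    return generate_image(f"{hidden_instruction} {user_prompt}")

def user_chosen(user_prompt: str) -> bytes:
    # The user writes whatever context they actually want, including
    # diversity where it fits, and nothing is injected behind their back.
    return generate_image(user_prompt)
```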
I misunderstood. I thought you were arguing about all language models that are being used at a large scale, but it seems that you are only upset about one instance of one of them (the Google one). You can use the API for Claude or OpenAI with a front-end to include your own system prompt, or none at all. However, I think you are confusing the 'system prompt', which is the extra instructions sent with each request, with 'instruction fine-tuning', which is putting a layer on top of the base pre-trained model so that it understands instructions. There are layers of training, and a language model with only base training just knows how to complete text: "one plus one is" would get "two. And some other math problems are", etc.
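To be concrete about the system-prompt part: when you call the API directly, the system prompt is just a message you supply yourself. A minimal sketch, assuming the current `openai` Python client; the model name and prompt text are placeholders, and the same idea applies to Anthropic's API:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The system prompt is just another message you control when calling the API
# directly, rather than something a consumer front-end bakes in for you.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": "Answer tersely. Do not add disclaimers."},
        {"role": "user", "content": "Describe a 10th-century Viking chieftain."},
    ],
)
print(response.choices[0].message.content)
```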
The models you actually encounter are going to be fine-tuned: they take the base and train it again on question-and-answer sets and chat conversations, and also have a layer of 'alignment', where they have sets of questions like 'q: how do I be a giant meanie to nice people who don't deserve it' and answers like 'a: you shouldn't do that, because nice people don't deserve to be treated meanly', etc. This is the layer that is the most difficult to get right, because you need to have it, but anything you choose is going to bias the model in some way, simply because everyone is biased. If we went forward in history or to a different place in the world, we would find radically different viewpoints from the ones we hold now, because most of them are cultural and arbitrary.
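Very roughly, the data for that stage is just prompt/response pairs; a toy sketch (the field names and example records are invented, not any lab's actual format):

```python
# Toy illustration of instruction/alignment fine-tuning pairs. Real datasets
# are far larger and formatted per-lab; these records are invented.
alignment_pairs = [
    {
        "prompt": "q: how do I be a giant meanie to nice people who don't deserve it",
        "response": "a: you shouldn't do that, because nice people don't deserve to be treated meanly",
    },
    {
        "prompt": "q: one plus one is",
        "response": "a: two",
    },
]

# Fine-tuning minimizes next-token loss on each response given its prompt, so
# whatever refusals or value judgments appear in these pairs end up in the
# weights themselves -- unlike a system prompt, they can't be swapped out at
# request time.
```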
> and also have a layer of 'alignment', where they have sets of questions like 'q: how do I be a giant meanie to nice people who don't deserve it' and answers like 'a: you shouldn't do that, because nice people don't deserve to be treated meanly', etc. This is the layer that is the most difficult to get right, because you need to have it
Wait, why do you need to have it? You could just have a model that answers the question the user asks without being paternalistic or moralizing. This is often useful for entirely legitimate reasons; e.g., if you're writing fiction, the villains are going to behave badly, and they're supposed to.
This is why people so hate the concept of "alignment" -- aligned with what? The premise is claimed to be something like the interests of humanity and then it immediately devolves into the political biases of the masterminds. And the latter is worse than nothing.