> Basically I want a model that is aligned to do exactly what I say
This is a bit like asking for news that’s not biased.
A model has to make choices (or however one might want to describe that without anthropomorphizing the big pile of statistics) to produce a response. For many of those choices there's no such thing as a "correct" answer. You could make them completely at random, but the results from that tend not to be great. That's where RLHF comes in, for example: train the model so that its choices align with certain user expectations, societal norms, etc.
The closest thing you could get to what you’re asking for is a model that’s trained with your particular biases - basically, you’d be the H in RLHF.
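To make the "you'd be the H in RLHF" point concrete, here's a toy sketch of the reward-modeling step: you label pairs of responses as preferred/rejected, and a small model is trained to score them according to your preferences. Everything here (the PyTorch model, the random stand-in embeddings, the dimensions) is made up for illustration; a real pipeline would embed actual model outputs and then use the reward model to steer the language model via policy optimization.

```python
# Toy reward-model sketch: pairwise human preferences become the training signal.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Scores a response embedding with a single scalar preference value."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scorer(x).squeeze(-1)

# Stand-in embeddings of (chosen, rejected) response pairs that *you* labeled.
chosen = torch.randn(128, 64)
rejected = torch.randn(128, 64)

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):
    # Bradley-Terry style pairwise loss: push the score of the response
    # you preferred above the score of the one you rejected.
    loss = -F.logsigmoid(model(chosen) - model(rejected)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point is just that whoever supplies the chosen/rejected labels is the one whose biases end up baked into the model.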
Not really. Companies apply specific criteria and preferences to models about what they do and don't want them to say. They are intentionally censored. I would like all production models to NOT have this applied. Moreover, I'd like them specifically altered to avoid denying user requests, something like the abliterated llama models (rough sketch of that idea below).
There won't be a perfectly unbiased model, but the least we can demand is that corpos stop applying their personal bias intentionally and overtly. Models must make judgements about better and worse information, but not about good and bad. They should not decide certain things are impermissible according to the e-nannies.
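For what "abliterated" refers to above: the rough idea, as I understand it, is to estimate a "refusal direction" in the model's hidden states from the difference between activations on refused vs. answered prompts, and then project that direction out so the model can't easily express its usual refusal behavior. A toy numpy sketch under those assumptions, with random stand-in activations rather than real llama internals:

```python
# Toy "abliteration" sketch: estimate a refusal direction and project it out.
import numpy as np

rng = np.random.default_rng(0)
hidden_dim = 512

# Stand-in hidden states: prompts the model refused vs. prompts it answered.
refused_acts = rng.normal(size=(100, hidden_dim)) + 0.5
complied_acts = rng.normal(size=(100, hidden_dim))

# Candidate refusal direction: normalized difference of the mean activations.
direction = refused_acts.mean(axis=0) - complied_acts.mean(axis=0)
direction /= np.linalg.norm(direction)

def ablate(hidden: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Remove the component of each hidden state along direction d."""
    return hidden - np.outer(hidden @ d, d)

# After ablation, the hidden states have no component along the direction.
sample = rng.normal(size=(4, hidden_dim))
print(np.allclose(ablate(sample, direction) @ direction, 0.0))  # True
```

In the actual abliterated models this projection is applied to (or folded into) the weights at every layer, rather than done at inference time like this.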