I think a lot of the novel uses of language models like ChatGPT will involve multiple instances interacting over interleaved data. For example, to improve factuality you might do the following (a rough code sketch follows the list):
1) One instance first parses the chat history and last message to generate a response. Currently this is where things end, but we can keep this response private and do additional work.
2) A second instance, properly primed, takes the last prompt and the response and "analyzes" them, generating scores for qualities like factuality and usefulness, possibly adding commentary.
3) Pass both into a third instance, which again has the chat history, to rewrite the response taking the feedback into account.
4) Optionally repeat #2 and #3 until the response passes some quality threshold.
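Here is a minimal sketch of that loop in Python, assuming the official openai client library. The model name, the reviewer's JSON score format, the 1-10 scale, and the quality threshold are all placeholder choices for illustration, not anything ChatGPT exposes today.

```python
import json
from openai import OpenAI  # official openai package, v1+ client style

client = OpenAI()          # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4o-mini"      # placeholder; any chat-capable model works

def call_instance(system_prompt, user_prompt):
    """One isolated call: a separate 'instance' with its own priming."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "system", "content": system_prompt},
                  {"role": "user", "content": user_prompt}],
    )
    return resp.choices[0].message.content

def respond_with_review(history, last_message, max_rounds=3, threshold=8):
    # Step 1: draft a private response from the chat history.
    draft = call_instance(
        "You are a helpful assistant.",
        f"Chat history:\n{history}\n\nLatest message:\n{last_message}")

    for _ in range(max_rounds):
        # Step 2: a separately primed reviewer scores the draft.
        # (Assumes the reviewer returns valid JSON; real code would validate.)
        review = call_instance(
            "You are a strict reviewer. Return JSON with integer fields "
            "'factuality' and 'usefulness' (1-10) and a 'comments' string.",
            f"Prompt:\n{last_message}\n\nResponse:\n{draft}")
        scores = json.loads(review)

        # Step 4: stop once the draft clears the quality threshold.
        if min(scores["factuality"], scores["usefulness"]) >= threshold:
            break

        # Step 3: a third instance rewrites the draft using the feedback.
        draft = call_instance(
            "Rewrite the response to address the reviewer's feedback.",
            f"Chat history:\n{history}\n\nLatest message:\n{last_message}\n\n"
            f"Draft response:\n{draft}\n\nReviewer feedback:\n{review}")

    return draft
```

Only the final draft ever reaches the user; the intermediate drafts and reviews stay private, and the extra calls trade latency and cost for (hopefully) better factuality.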