If the goal is to at least match human-level recall, then all we have to do is summarize information and store it in the context window for retrieval. Emphasis on summarize.
We pull out the key points and keep them verbatim so they are not lost to summarization; the rest (the less important parts) we summarize and save.
That would very likely fit in the context window, and it would probably yield recall and understanding equal to or better than a human's.
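
A rough sketch of the idea, under loud assumptions: `summarize` here is only a placeholder that truncates text (a real setup would presumably call a model), the `is_key` flag is assumed to come from some upstream step that decides what counts as a key point, and word count stands in for token count. All names are hypothetical.

```python
from __future__ import annotations


def summarize(text: str, max_words: int = 12) -> str:
    """Placeholder summarizer (assumption): keep the first max_words words.

    A real system would call a language model here instead of truncating.
    """
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " ..."


def build_memory(chunks: list[tuple[str, bool]], max_words: int = 12) -> list[str]:
    """Build the stored memory from (text, is_key) pairs.

    Key points are stored verbatim; everything else is summarized first.
    """
    memory = []
    for text, is_key in chunks:
        if is_key:
            memory.append("KEY: " + text)
        else:
            memory.append("SUMMARY: " + summarize(text, max_words))
    return memory


def fits(memory: list[str], budget_words: int) -> bool:
    """Check the memory against a context budget, counted in words (assumption)."""
    return sum(len(entry.split()) for entry in memory) <= budget_words
```

Usage would look something like `fits(build_memory(chunks), budget_words=8000)`, where the budget is whatever share of the context window is set aside for memory. Whether the result really matches human recall depends entirely on how good the key-point selection and the summarizer are, which this sketch does not address.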