Regarding point 3, unless your system is under massive memory pressure, no caught-up Kafka consumer should be serviced from disk. Old offsets that have been flushed out of memory (because you don't have enough of it) obviously are served from disk, with essentially linear reads of the requested file's blocks: consecutive logical addresses if the flush sizes are large enough (they can still land physically on disk in any number of ways, depending on how much the firmware lies, I know).
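For illustration only, a minimal Java sketch (not Kafka's actual broker code) of what serving a lagging consumer boils down to: streaming a contiguous byte range out of one segment file, which the kernel can satisfy with sequential reads. The class name, the segment path/offset parameters, and the use of FileChannel.transferTo here are my assumptions for the sketch.

    import java.io.IOException;
    import java.nio.channels.FileChannel;
    import java.nio.channels.WritableByteChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    // Sketch only: a lagging fetch is essentially a sequential copy of a
    // byte range out of one log segment file. Paths/offsets are hypothetical.
    public class SegmentServeSketch {
        static long serveRange(Path segment, long startByte, long length,
                               WritableByteChannel consumer) throws IOException {
            try (FileChannel ch = FileChannel.open(segment, StandardOpenOption.READ)) {
                long sent = 0;
                while (sent < length) {
                    // transferTo lets the kernel stream consecutive file blocks
                    // (sendfile-style), so cold data is read linearly rather than seek by seek.
                    long n = ch.transferTo(startByte + sent, length - sent, consumer);
                    if (n <= 0) break;
                    sent += n;
                }
                return sent;
            }
        }
    }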
I really cannot see how Redis is going to perform "much better" reading from disk once the entries are no longer in RAM. At that point both Kafka and Redis have to read from disk, and you either have the IOPS to serve all the lagging consumers or you don't. Maybe you have enough to service 1 or 2 concurrent reads, maybe 10-12. But for the same message counts, sizes, and concurrent consumers, your workload will become IOPS-bound rather fast.
Note: "much better" to me implies 10x+ better, not "my C library read() is 2.3% better than your Java".