Because the government doesn't just care about CSAM; it also cares about terrorism, drug and human trafficking, gangs, etc. Scanning for CSAM won't change the government's opposition to E2EE, and Apple knows this because, according to their transparency reports, they respond with customers' data to NSL and FISA data requests about 60,000 times a year.
Occam's razor also says they aren't playing 11th dimensional chess, and that they just fucked up.
The point is that with this technology, Apple can now please both the government and Apple users that want their data in the cloud to be fully encrypted.
First they enable this technology to prove to the government that they can still scan for bad stuff, then they enable true E2E. I don't know if this is really their motivation but it sounds plausible to me.
If they scan unencrypted data on the device and report what they find (even if it's only child porn now, we know that is not going to last), they have made E2E effectively useless. The point of end-to-end encryption is to make the data accessible to just one or two people, and that is not fulfilled here.
EDIT: though you are right - they can please governments by making encryption useless while lying to people that they still have it. Not sure that was your message though.
The difference between Apple being able to scan data on your phone vs on their servers comes down to who is able to access your data. In both cases, Apple has access. But by storing both your data and your keys somewhere in their cloud servers, that data becomes available to governments with the appropriate subpoena, rogue Apple employees and hackers. None of that would be possible with true E2E encryption.
It seems plausible to me that Apple might believe some of their privacy-conscious users would prefer this situation, and hence "privacy" can be considered a plausible explanation for their actions here.
I'm suggesting that the client-side CSAM scanning technology could allow Apple to turn on true E2E encryption and still satisfy the government's requirement to report on CSAM, which, as you point out, is not something that Apple currently implements.
The "client side" CSAM detection involves a literal person in the middle examining suspected matches. The combination of that and whatever you think true E2E means isn't E2E by definition.