This is too interesting a topic to get into right now – I’ll have to do another blog post or something. For now I’ll say: looking at whiteness alone gets you like 80% of the way there. It’s kind of amazing how effective it is by itself.
I suspect you could also factor in the variance between data samples. If a pixel is occasionally lighter in shade, you can assume that the lighter samples are due to cloud cover.
One naive approach would be to discard the lightest 90% of samples for each pixel and average the remainder. You could reduce noise by blurring each image before applying that per-pixel filtering.
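Here is a rough sketch of what that naive approach might look like, assuming the samples are already aligned grayscale images stacked into a NumPy array of shape `(samples, height, width)` where larger values are lighter. The function names, the box blur, and the default parameters are my own illustrative choices, not something I have tuned against real imagery.

```python
import numpy as np


def box_blur(image, radius=1):
    """Blur a 2-D image with a square box filter, repeating edge pixels."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    height, width = image.shape
    total = np.zeros((height, width), dtype=float)
    # Sum every shifted copy of the image that falls inside the window.
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)


def darkest_fraction_mean(samples, keep_fraction=0.1, blur_radius=0):
    """Per pixel, average only the darkest `keep_fraction` of samples.

    samples: array-like of shape (n_samples, height, width); assumed to be
    co-registered grayscale images (hypothetical input layout).
    """
    samples = np.asarray(samples, dtype=float)
    if blur_radius > 0:
        # Optional noise reduction before the per-pixel statistics.
        samples = np.stack([box_blur(s, blur_radius) for s in samples])
    # Always keep at least one sample per pixel.
    n_keep = max(1, int(round(samples.shape[0] * keep_fraction)))
    # Sort along the sample axis so the darkest values come first.
    ordered = np.sort(samples, axis=0)
    return ordered[:n_keep].mean(axis=0)
```

With ten samples and `keep_fraction=0.1`, this keeps only the single darkest value at each pixel, so one cloudy sample among nine clear ones gets thrown away. The obvious weakness is that it also favours shadows and dark sensor noise, which is part of why blurring first may help.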