I think the biggest difference between them is that BLAKE3 is still quite new. I agree with https://latacora.micro.blog/2018/04/03/cryptographic-right-a... that most cryptographic applications should make "boring" choices, and BLAKE3 is not yet boring. But assuming everything goes smoothly over the next 2-3 years, I'll start recommending BLAKE3 over BLAKE2 in essentially all cases.
If you're curious about the single-threaded performance difference between BLAKE3 and BLAKE2b, note that the red bar chart at the top of the BLAKE3 README is a single-threaded measurement. The numbers depend heavily on input size, though, and we have more detailed graphs in the paper.
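If you want to see the input-size effect on your own machine, here's a rough sketch of a single-threaded throughput measurement. It only uses `hashlib.blake2b` from the Python standard library; the helper names (`throughput_mib_per_s`, `measure_sizes`) and the chosen sizes are made up for illustration, and this is not the benchmark behind the README chart or the paper's graphs. Any hash constructor with the same `fn(data).digest()` shape could be passed in instead.

```python
import hashlib
import time


def throughput_mib_per_s(hash_fn, size, repeats=5):
    """Best-of-`repeats` single-threaded throughput (MiB/s) for one input size."""
    data = bytes(size)  # all-zero buffer; the content is arbitrary
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hash_fn(data).digest()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on timers with coarse resolution.
    return size / (1024 * 1024) / max(best, 1e-9)


def measure_sizes(hash_fn, sizes):
    """Map each input size in bytes to its measured throughput in MiB/s."""
    return {size: throughput_mib_per_s(hash_fn, size) for size in sizes}


if __name__ == "__main__":
    # Hypothetical size ladder: 64 B, 1 KiB, 64 KiB, 1 MiB.
    for size, rate in measure_sizes(hashlib.blake2b, [64, 1024, 65536, 1 << 20]).items():
        print(f"{size:>8} bytes: {rate:10.1f} MiB/s")
```

Very short inputs are dominated by per-call overhead (especially from Python), so treat the small-size numbers as a lower bound rather than a statement about the hash function itself.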