All streaming has delay built in, and even more so when multi-room is involved. Audio has to be intercepted, encoded or transcoded, sent over the network (where it may suffer packet loss), received, and decoded before it can be played through the speakers.
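
To make the chain of stages concrete, here is a minimal sketch of a latency budget. Every name and number in it is an illustrative assumption, not a measurement of any real system: each stage adds some delay, and the sketch assumes that synchronized multi-room playback has to wait for the slowest room.

```python
from dataclasses import dataclass


@dataclass
class StageDelays:
    """Hypothetical per-room delays in milliseconds (illustrative values only)."""
    capture_ms: float
    encode_ms: float
    network_ms: float
    jitter_buffer_ms: float  # extra buffering to ride out packet loss and jitter
    decode_ms: float

    def total_ms(self) -> float:
        # End-to-end delay is the sum of every stage the audio passes through.
        return (self.capture_ms + self.encode_ms + self.network_ms
                + self.jitter_buffer_ms + self.decode_ms)


def synced_delay_ms(rooms: dict[str, StageDelays]) -> float:
    # Assumption: to play in sync, every room delays to match the slowest one.
    return max(delays.total_ms() for delays in rooms.values())


# Made-up example rooms; a weaker link needs a larger jitter buffer.
rooms = {
    "kitchen": StageDelays(10, 20, 5, 100, 5),
    "living_room": StageDelays(10, 20, 30, 200, 5),
}

for name, delays in rooms.items():
    print(f"{name}: {delays.total_ms():.0f} ms")
print(f"synced playback delay: {synced_delay_ms(rooms):.0f} ms")
```

With these made-up figures the kitchen alone could play after 140 ms, but keeping both rooms in sync pushes the shared delay to 265 ms.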