Reading more comments, I think the problem is ultimately not background noise filtering or voice recognition, but the lack of context. Primary access to contextual information may explain why Siri gives better results.
E.g. for "play X on Spotify", Google Assistant (GA) first interprets X in isolation and only then asks Spotify to play the result. When you ask Siri to "play X" (presumably on Apple Music), Siri can take your listening profile into account while interpreting X.
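To make the gap concrete, here's a toy sketch of the two flows; every name and API below is hypothetical, purely for illustration:

    # Hypothetical sketch; none of these names are real Google/Apple/Spotify APIs.
    from dataclasses import dataclass

    @dataclass
    class Track:
        title: str
        artist: str

    class ListeningProfile:
        """Stand-in for the user's listening history (hypothetical)."""
        def __init__(self, plays: dict[str, int]):
            self.plays = plays  # artist -> play count

        def affinity(self, track: Track) -> int:
            return self.plays.get(track.artist, 0)

    def search(query: str) -> list[Track]:
        # Pretend catalog search returning ambiguous matches for "one".
        return [Track("One", "Metallica"), Track("One", "U2")]

    def assistant_play(query: str) -> Track:
        # GA-style: resolve the query with no user context, then hand it off.
        return search(query)[0]

    def siri_play(query: str, profile: ListeningProfile) -> Track:
        # Siri-style: break the tie using the listening profile.
        return max(search(query), key=profile.affinity)

    profile = ListeningProfile({"U2": 40, "Metallica": 2})
    print(assistant_play("one"))         # Track(title='One', artist='Metallica')
    print(siri_play("one", profile))     # Track(title='One', artist='U2')

Same utterance, different answer: the context-free path has to guess, while the profile-aware path can break the tie in the user's favor.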
This may in fact be at the root of Apple's attempts to restrict Spotify's integration with Siri: the experience will be worse because of the contextual gap, and that will make Siri look dumb.