
How well it works for identifying music depends on how well you can construct the feature vector, which isn't covered here. If you map every song into the same small subset of the space, LSH will struggle; map them to well-separated places and it will shine.
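A minimal sketch of the point, assuming random-hyperplane LSH (SimHash for cosine similarity); the vector counts and dimensions are illustrative, not from the comment:

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_bucket(vec, planes):
    """Hash a feature vector to a bucket via the sign of its projection
    onto each random hyperplane (random-hyperplane LSH)."""
    bits = (planes @ vec) > 0
    return tuple(bits.tolist())

dim, n_planes = 64, 16
planes = rng.standard_normal((n_planes, dim))

# Case 1: every "song" lands in the same small region of the space.
clustered = rng.standard_normal(dim) + 0.01 * rng.standard_normal((1000, dim))
# Case 2: songs are mapped to well-separated points.
separated = rng.standard_normal((1000, dim))

for name, vecs in [("clustered", clustered), ("separated", separated)]:
    buckets = {lsh_bucket(v, planes) for v in vecs}
    print(f"{name}: {len(buckets)} distinct buckets for 1000 songs")
```

With clustered vectors, nearly all songs hash to the same handful of buckets, so lookups degenerate toward a linear scan; with well-separated vectors, the songs spread across many buckets and LSH does its job.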



This is wrong, or misleading at best.

Creating a good audio representation (feature vector) is a known, difficult open problem. "Good", in the case of LSH, means that distance in the representation space correlates well with human perception.

For something like Shazam, you want this representation to be invariant to minor transformations of the audio. That's the interesting problem.
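A hedged sketch of why invariance matters: compare a naive magnitude-spectrum representation with a crude peak-based one under a small gain change. The peak-picking scheme here is only illustrative, not Shazam's actual fingerprinting method:

```python
import numpy as np

def spectrum(x, n_fft=1024):
    """Magnitude spectrum of one frame."""
    return np.abs(np.fft.rfft(x[:n_fft]))

def peak_fingerprint(mag, n_peaks=5):
    """Indices of the strongest bins -- roughly gain-invariant, since
    scaling the audio scales every bin equally and leaves the ranking alone."""
    return frozenset(np.argsort(mag)[-n_peaks:].tolist())

rng = np.random.default_rng(1)
audio = rng.standard_normal(1024)   # stand-in for a real audio frame
louder = 1.5 * audio                # same audio, minor transformation (gain)

a, b = spectrum(audio), spectrum(louder)
print("spectrum L2 distance:", np.linalg.norm(a - b))                     # large
print("peak sets identical:", peak_fingerprint(a) == peak_fingerprint(b)) # True
```

The raw spectrum distance blows up under a trivial volume change, while the peak-index fingerprint is unchanged; designing a representation that stays stable under noise, compression, and time shifts as well is the hard part.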


How is that not aligned with what I’m saying? You add some detail, but that doesn’t make me wrong.



