
This is known as Blind Source Separation [1], and it's been a field of study for decades. The specific problem here seems to be the "cocktail party problem", where you want to isolate a single speaker (or in this case 5?) in a room full of conversations.

When I was in grad school, I knew an EE research group in the building next to mine working on this problem using ICA (independent component analysis) -- this was ca. 2004, before the resurgence of deep learning. Even with ICA, useful results could be obtained.
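To give a rough feel for what ICA does, here is a toy sketch in plain Python. It is not that group's method or any library's implementation, just a minimal illustration under some loud assumptions: exactly two sources and two mixtures, instantaneous (non-echoing) mixing, and a brute-force search over a rotation angle after whitening, scored by squared excess kurtosis. The signals, the mixing weights, and every function name (`whiten`, `ica_2x2`, `demo`, etc.) are made up for the example.

```python
import math
import random


def whiten(x1, x2):
    """Center two signals, decorrelate them, and scale to unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    x1 = [v - m1 for v in x1]
    x2 = [v - m2 for v in x2]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(v * v for v in x1) / n
    c = sum(v * v for v in x2) / n
    b = sum(p * q for p, q in zip(x1, x2)) / n
    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    phi = 0.5 * math.atan2(2 * b, a - c)
    cs, sn = math.cos(phi), math.sin(phi)
    l1 = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    l2 = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    z1 = [(cs * p + sn * q) / math.sqrt(l1) for p, q in zip(x1, x2)]
    z2 = [(-sn * p + cs * q) / math.sqrt(l2) for p, q in zip(x1, x2)]
    return z1, z2


def rotate(z1, z2, theta):
    """Rotate the pair of signals by angle theta."""
    cs, sn = math.cos(theta), math.sin(theta)
    y1 = [cs * p + sn * q for p, q in zip(z1, z2)]
    y2 = [-sn * p + cs * q for p, q in zip(z1, z2)]
    return y1, y2


def excess_kurtosis(y):
    """Excess kurtosis, assuming y already has zero mean and unit variance."""
    return sum(v ** 4 for v in y) / len(y) - 3.0


def ica_2x2(x1, x2, steps=180):
    """Whiten, then pick the rotation that makes both outputs least Gaussian."""
    z1, z2 = whiten(x1, x2)
    # Angles in [0, pi/2) suffice: a 90-degree turn only swaps/flips outputs.
    angles = (i * (math.pi / 2) / steps for i in range(steps))
    best = max(
        angles,
        key=lambda t: sum(excess_kurtosis(y) ** 2 for y in rotate(z1, z2, t)),
    )
    return rotate(z1, z2, best)


def abs_corr(u, v):
    """Absolute Pearson correlation between two equal-length signals."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    den = math.sqrt(sum((p - mu) ** 2 for p in u) * sum((q - mv) ** 2 for q in v))
    return abs(num / den)


def demo(n=2000, seed=0):
    """Mix two made-up sources with made-up weights, then try to unmix them."""
    rng = random.Random(seed)
    s1 = [math.sin(0.05 * t) for t in range(n)]       # hypothetical "speaker" 1
    s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # hypothetical "speaker" 2
    x1 = [1.0 * p + 0.6 * q for p, q in zip(s1, s2)]  # "microphone" 1
    x2 = [0.4 * p + 1.0 * q for p, q in zip(s1, s2)]  # "microphone" 2
    return (s1, s2), ica_2x2(x1, x2)


if __name__ == "__main__":
    (s1, s2), (y1, y2) = demo()
    print("source 1 best match:", max(abs_corr(s1, y1), abs_corr(s1, y2)))
    print("source 2 best match:", max(abs_corr(s2, y1), abs_corr(s2, y2)))
```

Note that ICA only recovers sources up to order, sign, and scale, which is why the check uses absolute correlation against either output. Real rooms add echoes and delays, so practical systems need a lot more than this.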

The results of the FB work [2] with RNNs are pretty impressive (the page includes audio samples).

[1] https://en.wikipedia.org/wiki/Signal_separation

[2] https://enk100.github.io/speaker_separation/




With ICA, don't you have to have n independent mixes of the n speakers in order to demix them all?


Are the audio embeds on that second site working for you? Can't get most of them to play.


Yep, they're working for me.



