Thanks for the paper recommendations. I definitely came across the autoencoder paper while working on my deep learning music project last semester. We actually have a novel model that "walks" across a VAE encoding space to produce songs. By walking in circular motions, we can produce repeating themes and rhythms. You can find samples of it at www.deepsymphony.com
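To make the "circular walk" idea concrete, here is a rough sketch of how such a path could be generated. This is not the actual project code: the function name, the 16-dimensional latent size, and the choice of a random plane are all assumptions for illustration, and the decoder that would turn each latent point into music is left out entirely.

```python
import numpy as np

def circular_latent_walk(center, radius, n_steps, seed=0):
    """Return n_steps latent vectors lying on a circle around `center`.

    Hypothetical helper: the plane of the circle is picked at random,
    which is an assumption, not something the original model specifies.
    """
    rng = np.random.default_rng(seed)
    dim = center.shape[0]
    # Orthonormalise two random directions with QR; they span the circle's plane.
    basis, _ = np.linalg.qr(rng.standard_normal((dim, 2)))
    u, v = basis[:, 0], basis[:, 1]
    # endpoint=False so that stepping past the last point lands back on the first.
    angles = np.linspace(0.0, 2.0 * np.pi, n_steps, endpoint=False)
    return center + radius * (np.cos(angles)[:, None] * u
                              + np.sin(angles)[:, None] * v)

# Assumed 16-dimensional latent space, circle centred on the prior mean.
path = circular_latent_walk(np.zeros(16), radius=2.0, n_steps=32)
```

Decoding the points of `path` in order, then looping, would revisit the same latent codes every 32 steps, which is one plausible way a repeating theme could come out of a walk like this.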
I think you might see some interesting results using a non-parametric [one that doesn't require specifying the number of clusters a priori] clustering algorithm, like mean-shift. I've never seen an adaptation for discrete data like this, but it should be possible.
You could have the same tradeoff between note-distance and time-distance.
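For what it's worth, here is a minimal sketch of what that might look like. It simply treats each note as a continuous (time, pitch) point and runs a flat-kernel mean-shift, so it is not a real adaptation to discrete data. The function name, the `time_scale` and `pitch_scale` knobs, and the example notes are all made up for illustration; the two scales are where the note-distance vs. time-distance tradeoff lives.

```python
import numpy as np

def mean_shift_notes(notes, time_scale=1.0, pitch_scale=1.0,
                     bandwidth=1.0, n_iter=50):
    """Cluster (time, pitch) note events with flat-kernel mean-shift.

    Hypothetical helper. Dividing by the scales sets how many beats
    "cost" as much as how many semitones. Returns one label per note.
    """
    pts = np.asarray(notes, dtype=float) / np.array([time_scale, pitch_scale])
    modes = pts.copy()
    for _ in range(n_iter):
        # Distance from every current mode to every original point.
        dist = np.linalg.norm(modes[:, None, :] - pts[None, :, :], axis=2)
        within = (dist <= bandwidth).astype(float)
        # Move each mode to the mean of the points inside its window.
        modes = (within @ pts) / within.sum(axis=1, keepdims=True)
    # Merge modes that ended up (nearly) on top of each other.
    centers = []
    labels = np.empty(len(pts), dtype=int)
    for i, mode in enumerate(modes):
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(mode)
            labels[i] = len(centers) - 1
    return labels

# Made-up example: (onset in beats, MIDI pitch), two short phrases.
notes = [(0.0, 60), (0.5, 62), (1.0, 64), (8.0, 72), (8.5, 74), (9.0, 76)]
labels = mean_shift_notes(notes, time_scale=1.0, pitch_scale=4.0, bandwidth=1.5)
```

The number of clusters is never passed in; it falls out of the bandwidth and the two scales. Shrinking `pitch_scale` makes pitch differences count for more, so the same notes split into more clusters.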
Cool, I'll definitely look into these types of algorithms. Do you have exposure to these? I'm still a 3rd year undergrad and my maths isn't strong enough yet to grok some of the crazier algorithms.
Actually, I was curious enough about this novel use (and I happen to be interested in music myself; who isn't?) that when I saw your post I thought about replicating it and emailing you the results. I'd be more than happy to work with you on any non-commercial stuff. (An email address that works for me is on my HN profile; could you send me a note? If nothing else, it doesn't make sense for all of us to spider/scrape the same datasets over and over...)
Thanks! I was unsure whether the blog would be a good idea because I have never really tried writing about my projects. As a person who struggles to grok research papers, I find blog posts that analyze them immensely helpful. I also find it helpful to summarize my thoughts for my own learning.
[0] https://arxiv.org/abs/1612.03789
[1] https://arxiv.org/abs/1706.04486