Aubio, a C library for analyzing songs (github.com/aubio)
133 points by khoobid_shoma on Sept 21, 2021 | 29 comments



If anyone is interested, Sonic Visualiser with its Vamp plugins is pretty much state of the art when it comes to analyzing music. The Chordino NNLS Chroma plugin, for instance, can extract even jazz chords fairly accurately.


Interesting! I just signed up for Chordify [0], which does extraction of chords from YouTube videos, and then plays them while you play along to either a grid of chords, or an animation showing the current chord and the next ones coming up. Has chord diagrams for Guitar, Ukulele, and Piano.

Really nice, frequently accurate (it's not perfect :-).

[0] https://chordify.net/


Is there a market for "mathematically superior" tools similar to this? I've been developing this sort of math for more than half my life, and would love to "productize" it. I just never thought there'd be a market...

I wonder how many paying customers Chordify has.


Go for it. Let your users make your market, you should just make a great tool/service.

If you're better than the competition you should be profitable, and if your maths is better you're better than the competition.

I would pay for it. More importantly I would use it. Not everything has to have a price as long as everything has a value.

It's funny because I was trying to sing a melody into Musescore just the other day and had to apply a bit of Audacity and then remember theory from secondary school to make it work. My middle-age voice sounds different to a computer than to my ears. If you could make this easy and accurate you would do every amateur song-writer on this earth a huge favor, though maybe not every amateur listener :)


Chordify is great, but for some reason the chords are displayed late when running alongside the YouTube video. A simple timed delay would improve this frustrating quirk, but it hasn't been fixed in years.

It's certainly as accurate as most.


I have noticed it's late/early sometimes, but it appears to me to be mostly on time. Note: I'm not an expert by any means.


There is also the free Yamaha Chord Tracker app (just checked, it still works on iOS 15).


Is there any plugin that would give some interesting information to a newbie who has no idea what he is looking at?

I was trying various plugins on this song: https://www.youtube.com/watch?v=e__Z-UpU01U

But so far none of the vamp plugins seem to be showing anything interesting. (I have tried a random handful)

EDIT: The spectrum view is fascinating. I can see a little bit of additional information at the edges in the FLAC vs. the MP3. Is there a plugin that can separate the instruments?


I've also heard this about Sonic Visualiser + Vamp plugins.

Could anyone familiar with this area recommend more tools like the original post and these? Would really appreciate it.


All I can say is that I don't really look for sheet music anymore, or guitar tabs / chords. When I find interesting music on YouTube I just use youtube-dl and Sonic Visualiser / NNLS Chroma and jam along. It's upped my songwriting abilities and general understanding and feel for music tremendously.


I don't know much about signal processing or music theory. Can you give me an ELI5 version of your process and outcome?


Total gamechanger. Between this, Bitwig and PipeWire, I think Linux can really be my full-time studio choice.


> MFCC (mel-frequency cepstrum coefficients)

For anyone else who had never heard of "cepstrum" before, this is what I found on Wikipedia:

"The cepstrum is the result of computing the inverse Fourier transform of the logarithm of the estimated signal spectrum. The method is a tool for investigating periodic structures in frequency spectra. The power cepstrum has applications in the analysis of human speech.

The term cepstrum was derived by reversing the first four letters of spectrum. Operations on cepstra are labelled quefrency analysis (or quefrency alanysis), liftering, or cepstral analysis."
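A minimal sketch of that definition, using only the Python standard library: the real cepstrum, taken here as the inverse DFT of the log magnitude spectrum. The function names, the naive O(N^2) DFT, and the synthetic echo signal are illustrative assumptions, not aubio's implementation.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT; fine for a short illustration, too slow for real audio."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(x, eps=1e-12):
    """Inverse DFT of the log magnitude spectrum (the 'real' cepstrum)."""
    log_mag = [math.log(abs(v) + eps) for v in dft(x)]
    return [c.real for c in dft(log_mag, inverse=True)]


# Assumed test signal: a decaying pulse plus a half-amplitude echo 20 samples later.
# The echo puts a ripple in the spectrum, i.e. a periodic structure in frequency.
N, DELAY = 128, 20
pulse = [0.5 ** n for n in range(N)]
signal = [pulse[n] + 0.5 * pulse[(n - DELAY) % N] for n in range(N)]

cep = real_cepstrum(signal)
# Skip the lowest quefrencies, which describe the overall spectral envelope.
peak = max(range(5, N // 2), key=lambda q: cep[q])
print("strongest cepstral peak at quefrency", peak)
```

The ripple that the echo adds to the spectrum shows up as a single peak at the quefrency matching the echo delay, which is the "periodic structure" the quote refers to.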


wikipedia's sections on signal processing and data compression have jumped to near textbook quality in the past few years.

the cepstrum is awesome, it comes from the source-filter model of human speech. by looking at the periodicities in the frequencies, it attempts to capture the resonance of the filter that models the vocal tract.


>wikipedia's sections on signal processing and data compression have jumped to near textbook quality in the past few years.

I pretty much used Wikipedia as my main resource when I was learning DSP at college.

I get like 80% of my EE information from Wikipedia nowadays, for a ton of different areas. A couple of days ago I was reading Meindl's paper on boron implantation in MOSFETs, and it touched on a topic (I don't recall exactly what it was) so obscure that my colleagues had real trouble with it and ultimately could not find more resources on it.

There was literally a whole Wikipedia section dedicated to the concept and it saved my goddamn ass in the presentation I had the next day.

I absolutely love Wikipedia and I owe a lot of my education to the contributors.


Cepstral analysis has a pretty good statistical basis (no pun intended) away from the source/filter model.

There's a bit of magic in the MFCC computation where you apply the discrete cosine transform (DCT). That's all about reducing correlation between components in the cepstrum and makes no sense unless you "get" the way the change of basis has high energy compaction (more information in fewer values).
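To make the energy compaction point concrete, here is a small sketch that applies an orthonormal DCT-II to a smooth sequence and measures how much of the energy lands in the first few coefficients. The 26 input values are made-up stand-ins for log filterbank energies, and `dct_ii` is a hand-written helper for illustration, not aubio's or any library's MFCC code.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II, written out directly from its definition."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out


# Assumed stand-in for 26 log filterbank energies: smooth, so neighbours are correlated.
log_energies = [10.0 - 0.2 * i + 1.5 * math.sin(i / 4.0) for i in range(26)]

coeffs = dct_ii(log_energies)
total = sum(c * c for c in coeffs)
# Fraction of the total energy held by the first 4 of the 26 coefficients.
compaction = sum(c * c for c in coeffs[:4]) / total
print(f"first 4 coefficients hold {compaction:.1%} of the energy")
```

Because the transform is orthonormal, the total energy is unchanged; it is only redistributed toward the low-order coefficients, which is why keeping the first dozen or so values loses little.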

However this has nothing to do with a physiological understanding of human speech.


This is quite useful for sound engineers and those who want to have side projects that have to do with audio.


These are a nice set of tools. I used this long ago for extracting onset times from audio files: https://sighack.com/post/extract-onset-beat-times-from-audio...


Funny, I stumbled upon this just today when I was looking for good realtime beat detection code.

Does anybody have experience using some of this code for realtime detection?


You might try "librosa" for tempo/beat detection:

https://github.com/librosa/librosa/blob/main/librosa/beat.py

    # Track beats using time series input
    >>> y, sr = librosa.load(librosa.ex('choice'), duration=10)
    >>> tempo, beats = librosa.beat.beat_track(y=y, sr=sr)
    >>> tempo
    135.99917763157896

Also see Essentia:

https://essentia.upf.edu/


Not beat detection, but other real-time patterns. See my other reply here: https://news.ycombinator.com/item?id=28607374 .


OT: Does anyone know what the state of the art is in music visualization?


Yet another library that uses GPL 3 and not LGPL 3. Why?


No doubt because the author doesn't want it to be used in proprietary software that doesn't respect the user's freedoms. At least, not without taking a cut.


From the website:

> Note: aubio is not MIT or BSD licensed. Contact the author if you need it in your commercial product.


The LGPL was never meant for all libraries, only certain specific ones (which the FSF thought were better to license more permissively for practical reasons). That's one reason they renamed it from "Library GPL" to "Lesser GPL".


Anyone here used it? How well does it work?


I've used its Python bindings. I've been quite pleased with how well it works, honestly. I wrote some software for a client with multiple radio stations. It's a service that listens to one web stream per station and shoots the client an email/text if it detects problematic audio (e.g. static, silence, etc.). Some of the stations are talk and others are music, so it needed to be robust. Aubio made it really easy to test different detection algorithms, and its documentation was tremendously helpful. The IRC room was active as well when I had questions.
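For readers wondering what the silence part of such a monitor might look like, here is a rough standard-library sketch: frame-wise RMS level in dBFS, flagging runs of quiet frames. It is not the code described above and does not call aubio; the function names, the -60 dB threshold, the frame size, and the minimum run length are all assumed values for illustration.

```python
import math


def rms_dbfs(frame, floor_db=-120.0):
    """RMS level of one frame in dB relative to full scale (samples in [-1, 1])."""
    if not frame:
        return floor_db
    mean_sq = sum(s * s for s in frame) / len(frame)
    if mean_sq <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(mean_sq))


def find_silence(samples, frame_size=1024, threshold_db=-60.0, min_frames=3):
    """Return (start_frame, end_frame) pairs for runs of at least min_frames quiet frames."""
    runs = []
    run_start = None
    n_frames = len(samples) // frame_size
    for i in range(n_frames):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        quiet = rms_dbfs(frame) < threshold_db
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if i - run_start >= min_frames:
                runs.append((run_start, i))
            run_start = None
    # Close a run that is still open when the audio ends.
    if run_start is not None and n_frames - run_start >= min_frames:
        runs.append((run_start, n_frames))
    return runs


# Assumed test stream: 5 frames of a 440 Hz tone, 4 silent frames, 5 more tone frames.
SAMPLE_RATE = 44100
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(5 * 1024)]
stream = tone + [0.0] * (4 * 1024) + tone
print(find_silence(stream))
```

Requiring several consecutive quiet frames is one way to avoid alerting on the short pauses that are normal in talk radio; a real service would likely tune that duration per station.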


aubio is the one Python library I found that provided a ready-to-use pitch model, which I used for a simple script to listen for my doorbell. This worked much better than the more common approaches I saw elsewhere on the internet that did simple frequency detection.
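As a rough idea of what a pitch estimator does differently from picking the loudest frequency bin, here is a plain time-domain autocorrelation sketch in standard-library Python. This is a swapped-in textbook technique, not aubio's pitch model or its API; `estimate_pitch`, the search range, and the 0.5 periodicity threshold are assumptions made for the example.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Return the pitch in Hz of one frame via autocorrelation, or None if unpitched."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(len(frame) - 1, int(sample_rate / fmin))
    energy = sum(s * s for s in frame)
    if energy == 0.0 or min_lag > max_lag:
        return None  # silent frame, or frame too short for the requested range

    def autocorr(lag):
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

    # The lag at which the frame best matches a shifted copy of itself is the period.
    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    if autocorr(best_lag) < 0.5 * energy:
        return None  # weak periodicity (0.5 is an assumed threshold)
    return sample_rate / best_lag


# Assumed test tone: 400 Hz fundamental plus a weaker octave partial, sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 400 * n / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 800 * n / SAMPLE_RATE) for n in range(1024)]
print(estimate_pitch(tone, SAMPLE_RATE))
```

A doorbell script along these lines would compare the estimate against the chime's known notes over a few consecutive frames; the periodicity check is what lets it ignore broadband noise that a bare spectral peak test might trigger on.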



