You can also configure the FEC block size (it should be smaller than the target latency), the length of network packets, the length of internal audio frames in the pipeline, and the resampler window, as well as the I/O latency, e.g. the PulseAudio buffer size. So basically you can configure all (or almost all) parameters that affect the resulting latency.
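To make the relationship concrete, here is a toy sanity check of how those knobs add up; all numbers and parameter names below are illustrative, not Roc's defaults:

    /* Toy latency-budget check -- illustrative numbers, not Roc defaults. */
    #include <stdio.h>

    int main(void) {
        double packet_ms         = 7.0;   /* length of one network packet */
        int    fec_block_pkts    = 20;    /* source packets per FEC block */
        double frame_ms          = 10.0;  /* internal pipeline frame */
        double io_buffer_ms      = 40.0;  /* e.g. PulseAudio device buffer */
        double target_latency_ms = 200.0;

        double fec_block_ms = packet_ms * fec_block_pkts;

        /* The FEC block must fit inside the target latency, because a lost
         * packet can only be repaired once the whole block has arrived. */
        printf("FEC block: %.0f ms (%s target %.0f ms)\n", fec_block_ms,
               fec_block_ms < target_latency_ms ? "fits inside" : "EXCEEDS",
               target_latency_ms);

        /* Pipeline frames and device buffering add on top of the target. */
        printf("extra pipeline + device latency: ~%.0f ms\n",
               frame_ms + io_buffer_ms);
        return 0;
    }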
I'll document these parameters and their configuration a bit later. (Currently you can find all of them in the man page and in the API, but there is no overview page that explains exactly how they affect the total latency.)
So far I've mostly tested Roc on several 2.4 GHz Wi-Fi networks. You can usually expect 100-500 ms in this case (depending on the network). See "Typical configuration" in this article[1].
Most likely you will be able to achieve lower latencies on a better channel, but I haven't done any serious testing for lower latencies yet. This is on my to-do list.
> Few questions: how do you 'capture' PCM stream in case of ALSA? It is straight forward to create a PA sink and plug it into PA configuration, but I am wondering about pure ALSA.
Good question :) Roc does not implement any special capturing code for ALSA; it just reads from the given device (using SoX currently). The user is supposed to use something like snd-aloop.
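For illustration, the capture half of the snd-aloop setup looks roughly like this with plain alsa-lib (standard snd-aloop device naming; build with -lasound). This is just a sketch of the approach, not Roc's actual code:

    /* snd-aloop sketch: applications play to hw:Loopback,0,0 and the same
     * PCM can be read back from hw:Loopback,1,0 by a streaming process. */
    #include <alsa/asoundlib.h>
    #include <stdio.h>

    int main(void) {
        snd_pcm_t *pcm;
        if (snd_pcm_open(&pcm, "hw:Loopback,1,0",
                         SND_PCM_STREAM_CAPTURE, 0) < 0) {
            fprintf(stderr, "cannot open loopback capture device\n");
            return 1;
        }

        /* 44.1 kHz stereo S16, ~100 ms of device buffering. */
        snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED,
                           2, 44100, 1, 100000);

        short buf[2 * 441];  /* 10 ms of stereo frames */
        for (;;) {
            snd_pcm_sframes_t n = snd_pcm_readi(pcm, buf, 441);
            if (n < 0)
                n = snd_pcm_recover(pcm, (int)n, 0);  /* handle xruns */
            if (n < 0)
                break;
            /* ...hand the frames to the network sender here... */
        }

        snd_pcm_close(pcm);
        return 0;
    }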
It would be possible to create a custom ALSA plugin I guess, but we have no plans for that currently.
You're right about the transport vs. product part. I would prefer to work on the transport, and an ALSA plugin would be a product on top of it, so ideally it should be a separate project. Actually, the same is true for our PulseAudio modules. I hope we will later either submit them upstream or split them into a standalone project.
Currently, no. A Windows port is on our roadmap but not a priority right now. However, if someone wants to maintain it, I'm ready to accept PRs and help with porting.
Regarding realtime stuff, there's a project out there that implements the Ethernet AVB (Audio Video Bridging) standard on BeagleBone, using PTP (Precision Time Protocol) for synchronization.
Some of the older network synchronized transports like CobraNet and Dante might also be interesting for anyone wanting to learn more about this stuff.
I don't know yet. When it comes time to implement it, we'll look at whether we can reuse the code or the ideas, or maybe instead integrate Roc into GStreamer as a network transport (I was actually already thinking about that, and there is an item on the roadmap for it).
Thanks, I didn't know about this project and will definitely look at the implementation.
Their documentation says they use TCP, which usually means that it won't handle low latencies on Wi-Fi due to packet losses.
On the other hand, they have service discovery, remote control, and multi-room synchronization. All three features are planned but not yet supported in Roc. We'll add the first two in upcoming releases, but multi-room support requires serious research.
Their documentation also says the client can correct time deviations by playing faster or slower. We use resampling for that instead. I'm wondering how they can avoid glitches without using a resampler.
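To sketch the idea (this is not Roc's actual resampler): derive a tiny rate correction from the receiver buffer fill level and resample by that ratio, so the adjustment is continuous instead of dropping or inserting samples:

    /* Clock-drift compensation by adaptive resampling -- illustrative only. */
    #include <math.h>
    #include <stdio.h>

    #define TARGET_FILL 4410.0   /* desired buffered frames (100 ms @ 44.1 kHz) */

    /* Proportional controller: returns a resampling ratio close to 1.0. */
    static double drift_ratio(double buffered_frames) {
        double error = (buffered_frames - TARGET_FILL) / TARGET_FILL;
        double ratio = 1.0 + 0.01 * error;   /* consume faster when too full */
        if (ratio < 0.995) ratio = 0.995;    /* clamp to an inaudible range */
        if (ratio > 1.005) ratio = 1.005;
        return ratio;
    }

    /* Naive linear-interpolation resampler: reads `in` at `ratio` speed. */
    static size_t resample(const float *in, size_t in_len, float *out,
                           size_t out_len, double ratio) {
        double pos = 0.0;
        size_t n = 0;
        while (n < out_len && pos + 1 < (double)in_len) {
            size_t i = (size_t)pos;
            double frac = pos - (double)i;
            out[n++] = (float)((1.0 - frac) * in[i] + frac * in[i + 1]);
            pos += ratio;
        }
        return n;
    }

    int main(void) {
        float in[441], out[441];
        for (size_t i = 0; i < 441; i++)
            in[i] = sinf(2.0f * 3.14159265f * 440.0f * i / 44100.0f);

        /* Pretend the buffer has drifted 10 ms high: consume slightly faster. */
        double ratio = drift_ratio(TARGET_FILL + 441.0);
        size_t produced = resample(in, 441, out, 441, ratio);
        printf("ratio=%.4f produced=%zu of 441 frames\n", ratio, produced);
        return 0;
    }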
One more difference is that they use their own protocols (both for streaming and control) while Roc relies on standard RFCs.
I'm working on Roc, a toolkit for real-time audio streaming over the network. Among other things, it provides command-line tools and PulseAudio modules that can be used for home audio. It can be used with PA, with bare ALSA, and with CoreAudio on macOS.
The main difference from other transports, including PulseAudio TCP and RTP streaming, is the better quality of service when the latency is low (100 to 300 ms) and the network is unreliable (Wi-Fi). The post explains why and provides some comparison, usage instructions, and future plans.
There is still a long way to go, and now we're looking for people's thoughts and feedback. Do you find the project useful? How would you use it? What features would you like to see?
It's very disappointing that 300ms is considered "low" latency. The circumference of the Earth is 40075km, and the speed of light in optical fiber is slightly faster than 2/3rds of that in vacuum, so it's physically possible to send a signal to any place on Earth within 100ms, and get a reply within 200ms.
IMO "low latency" should mean low enough than it's very unlikely to be noticed, which most musicians seem to accept as 5ms. (Theoretically, even microsecond level delayed audio can be noticed if mixed with the original signal because of comb filtering.)
Many audio streaming apps require 1-2 seconds of latency (especially on Wi-Fi); that's why I called the 100-300 ms range "low". 100 ms is the minimum I've seriously tested on Wi-Fi so far. 300 ms is, roughly, the maximum UI delay that feels acceptable (you press "play" and hear the sound).
300ms is still noticeably laggy when the audio is part of a video. Some media players can delay their video to account for playback delay in the audio device, if the audio stack supports that. Does Roc support that, or if not, is it on your roadmap?
> 300ms is still noticeably laggy when the audio is part of a video.
Agree.
> Some media players can delay their video to account for playback delay in the audio device, if the audio stack supports that. Does Roc support that, or if not, is it on your roadmap?
We have an open issue for implementing correct latency reports in the PA modules. Once we fix it, players that support that feature should automatically start taking the latency into account.
Thanks for reminding me, I'll test this feature specifically.
Many years ago we did some unscientific testing of how latency affected a telephony application. Already at around 80 ms there was a measurable impact on the dialogue, with an increased frequency of the callers interrupting each other's sentences. Even modern VoIP applications can still be problematic in this regard, and additional latency from the software stack wouldn't help.
I see you've got Opus on your to-do list. I would really appreciate that! I find Opus (appropriately configured) to be audibly indistinguishable from CD audio, and it would really help with the bandwidth requirements.
I've always been really excited by the possibilities implied by PulseAudio's network capabilities, but disappointed by their latency and bandwidth requirements. Roc + Opus would be amazing.
Check out https://github.com/eugenehp/trx for Opus streaming inspiration; I've played around with their code and found it easy to work with. Opus would be great with ROC because in case of buffer over/under runs the codec provides features to mask dropouts based on previous content. This is critical when using Wi-Fi.
> Opus would be great with ROC because in case of buffer over/under runs the codec provides features to mask dropouts based on previous content. This is critical when using Wi-Fi.
Are you talking about its PLC or FEC? I haven't tested them yet, and I'm interested whether people are using both of them with music.
BTW, it would also be interesting to combine our FECFRAME support with Opus.
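For reference, this is roughly what the two options look like with libopus in a standalone sketch (not wired into Roc; the encoder settings and 20 ms frame size are just an example; build with -lopus):

    /* Opus loss concealment sketch: PLC vs. in-band FEC. A real receiver
     * would use one or the other for a given lost frame; both are shown
     * back-to-back here only for illustration. */
    #include <opus/opus.h>
    #include <stdio.h>

    #define RATE  48000
    #define FRAME 960          /* 20 ms @ 48 kHz */

    int main(void) {
        int err;
        OpusEncoder *enc = opus_encoder_create(RATE, 1, OPUS_APPLICATION_AUDIO, &err);
        OpusDecoder *dec = opus_decoder_create(RATE, 1, &err);

        /* Ask the encoder to embed low-bitrate redundancy (in-band FEC). */
        opus_encoder_ctl(enc, OPUS_SET_INBAND_FEC(1));
        opus_encoder_ctl(enc, OPUS_SET_PACKET_LOSS_PERC(10));

        opus_int16 pcm[FRAME] = {0};            /* silence, just to have input */
        unsigned char pkt1[4000], pkt2[4000];
        opus_int32 len1 = opus_encode(enc, pcm, FRAME, pkt1, sizeof(pkt1));
        opus_int32 len2 = opus_encode(enc, pcm, FRAME, pkt2, sizeof(pkt2));

        opus_int16 out[FRAME];

        /* Packet 1 is "lost". Option A: PLC -- extrapolate the missing frame
         * by passing NULL data. */
        int plc = opus_decode(dec, NULL, 0, out, FRAME, 0);

        /* Option B: in-band FEC -- recover an approximation of the lost frame
         * from redundancy carried in the *next* packet (decode_fec = 1),
         * then decode that packet normally. */
        int fec = opus_decode(dec, pkt2, len2, out, FRAME, 1);
        int nrm = opus_decode(dec, pkt2, len2, out, FRAME, 0);

        printf("encoded %d + %d bytes; PLC=%d FEC=%d normal=%d decoded samples\n",
               (int)len1, (int)len2, plc, fec, nrm);

        opus_encoder_destroy(enc);
        opus_decoder_destroy(dec);
        return 0;
    }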
Excellent; a few years ago I even started hacking on my own transport, very roughly as a PA module, but life got in the way and it never got very far. So I'm very pleased to see this great project. Thanks, and good luck!
Fun fact: end-to-end latency in midrange digital wireless microphone systems is 2.7ms [0]. People sometimes wonder why we bother with 5-6 figure specialty audio and RF gear when we could "just" use general purpose computers and WiFi. This is one of the reasons.
I've got a home HifiBerry streaming setup over Ethernet. I am using TCP streaming and the latencies are low enough not to be noticeable at all while watching YouTube or playing games and streaming the audio output to my speaker setup on the RPi.
1) Would this make any difference?
2) Does it currently support online plug-unplug the way RTP works without restarting pulseaudio?
If you have no issues with 1) latency, 2) packet losses, and 3) clock differences, then there would be no difference, at least until Roc can offer some new encodings.
(If you're using PA, it handles the clock difference for you. Its RTP transport sometimes behaved strangely for me, but its "native" tunnels handled it well.)
> Does it currently support online plug-unplug the way RTP works without restarting pulseaudio?
Roc sinks and sink inputs may be loaded and unloaded at any time without restarting PA. But there is no service discovery yet, which means that 1) when a remote sink input appears, a sink is not automatically added, and 2) when a remote sink input disappears, the sink is not automatically removed. (We will add this in upcoming releases.) Currently, the remote sink input can appear and disappear at any time, and the local sink will just continue streaming packets to the specified address.
This is exactly what I did - creating an ALSA plugin and leveraging snd-loopback to pass PCM to a streaming process. I would be interested in incorporating your protocol into SlimStreamer. Currently it uses SlimProto, which is TCP-based (so the sync part is a nightmare to get working at a reasonable level). How far are you with supporting multiple sampling rates and multiple receivers?
> How far are you with supporting multiple sampling rates
Roc currently supports arbitrary input/output rates but only a single network rate (44100). If the network rate differs from the input/output rate, Roc performs resampling.
We're now finishing the 0.1 release, and I was planning to add support for more network encodings, including more rates, in 0.2. Feel free to file an issue or mail us with a list of encodings/rates you need.
> and multiple receivers?
No support yet. If you use a multicast address, it would probably just work though.
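(The mechanics behind "just work" are plain UDP multicast: each receiver binds the session port and joins the group, and the sender transmits one stream to the group address. A minimal receiver-side sketch, with an illustrative group address and port rather than anything Roc-specific:)

    /* Join a multicast group and wait for the first packet -- illustrative. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        int reuse = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family      = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port        = htons(10001);       /* illustrative port */
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        /* Every receiver that joins the group sees the same packets the
         * sender addresses to 224.0.0.123:10001. */
        struct ip_mreq mreq;
        mreq.imr_multiaddr.s_addr = inet_addr("224.0.0.123");
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);
        setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

        char buf[2048];
        ssize_t n = recv(fd, buf, sizeof(buf), 0);  /* blocks for first packet */
        printf("got %zd bytes from the group\n", n);

        close(fd);
        return 0;
    }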
Again, feel free to file an issue and describe what you would expect from such support. I'll be happy to implement it if someone needs it.
Another question is how Roc will interact with your sync part. How do you perform synchronization?
How did you measure latency? I'm thinking of contributing to a similar project that has 1-2 seconds of delay, but I don't know what tools I need to use to benchmark latency.
That small experiment in my post does not include a proper latency measurement; I just configured all three transports with the desired latency. Actually, I'm thinking about writing a tool for such benchmarks that would measure the overall latency (PA + network + PA).
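One crude approach I have in mind: start a raw capture on the receiving end at the same moment a single loud click is played into the sending sink, then find where the click lands in the recording; its offset approximates the total PA + network + PA latency. A minimal sketch of the analysis step (assumes raw S16LE mono at 44.1 kHz on stdin):

    /* Find the loudest sample in a raw capture and print its position. */
    #include <stdio.h>
    #include <stdlib.h>

    #define RATE 44100

    int main(void) {
        short s;
        long  idx = 0, peak_idx = -1;
        int   peak = 0;

        while (fread(&s, sizeof(s), 1, stdin) == 1) {
            int v = abs((int)s);
            if (v > peak) { peak = v; peak_idx = idx; }
            idx++;
        }

        if (peak_idx >= 0)
            printf("click at sample %ld = %.1f ms into the recording\n",
                   peak_idx, peak_idx * 1000.0 / RATE);
        return 0;
    }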
I wish you luck with your project! Sorry if my comment was snarky, but a lot of modern software seems to essentially not care about latency or responsiveness, and it's something that bothers me more and more these days.