New scalable, fault-tolerant, and efficient open-source MQTT broker (github.com/thingsboard)
116 points by ashvayka 10 months ago | 77 comments



Apparently it is a Spring Boot app. Why are Kafka and Redis dependencies? Also, I see Netty is used, but I'm not sure how the MQTT protocol was implemented: which library was used, or was it implemented in Java from scratch?

Also why wasn't emqx used? Why is this solution better for their use case?


You are correct. TBMQ is based on Spring Boot. We use Netty (https://netty.io/) as the basis for MQTT communication, and we build the features an MQTT broker should support ourselves on top of that. Kafka is used to implement a scalable, fault-tolerant solution with reliable message persistence. Additionally, it helped us achieve high-throughput processing and low-latency message delivery. We store published MQTT messages, client sessions, subscriptions, etc., in Kafka. It has proved its quality in practice and in performance tests. Redis is currently used for caching in cluster mode. Our roadmap also has a ticket to add the ability to store MQTT messages there, so it will be a so-called Redis integration.

Please see the architecture doc for more details: https://thingsboard.io/docs/mqtt-broker/architecture/

We have implemented TBMQ to satisfy our enterprise customers' needs. Some of them used HiveMQ before but decided to ask us to build a new solution (TBMQ) as they were not fully satisfied with HiveMQ. One of the most important advantages we provide compared to HiveMQ is cluster support in our open-source version.


emqx presumably wasn't used because they are a Java shop and emqx is written in Erlang. Apparently they have a need for a massive scaling MQTT broker. At that point you need to be seriously comfortable with the tools you are using. Whether that's Java, Rust, Erlang, C#, C++ or whatever. Using an existing application in a language/ecosystem your team is not experts in is probably going to be a problem unless you seriously invest in the new language/ecosystem first. And it's going to take at least something like a thousand hours to really become good at something. The other option is writing an alternative in the language you are already familiar with. So in that regard I can understand why they did it.

I guess they could have used the OSS HiveMQ version as a base though. No idea why they didn't do that.


Exactly, you described it well. I have answered above for the reasons why HiveMQ was not the final choice.


I'm yet to see a project where Kafka was used for anything other than resume building. My guess would've been that this one is no different.

Also, sometimes it could be used because in the previous company where the project author worked they've used Kafka. And the cycle continues.


We do not agree, but everyone has an opinion we should respect :)


Nah, you don't need to respect cynicism.


> TBMQ is an industry-ready MQTT broker developed and distributed under the ThingsBoard umbrella that facilitates MQTT client connectivity, message publishing, and distribution among subscribers.

TBMQ is a scalable, fault-tolerant broker with the capacity to handle 4M+ concurrent client connections, supporting a minimum of 3M messages per second throughput per single cluster node with low latency delivery. In the cluster mode, its capabilities are further enhanced, enabling it to support more than 100M concurrently connected clients.

You can refer to the TBMQ documentation to set up the broker and understand its primary features, including the MQTT protocol.

These are strong numbers without benchmarks.


They document their testing methodology here: https://thingsboard.io/docs/mqtt-broker/reference/100m-conne...


Neat, thanks.


Right, we are not stating anything without proving it in the first place.

You can find another test here: https://thingsboard.io/docs/mqtt-broker/reference/3m-through...


Wow, I had no idea there was even demand for this kinda thing. I guess I assumed all setups were like my MQTT server that takes events from a few Zigbee devices every few minutes.


SCADA networks historically have very low available bandwidth (often due to being on the other end of a 10 mile radio link over 915 MHz with the radio & RTU powered by solar panels) and rely on a poll/response methodology. Even if nothing has changed, the poll happens, and the response sends the same exact data every poll. MQTT can be a far more efficient protocol if you only send updates. Multiply this situation by 1000x RTUs (now imagine you have a slightly better radio link from office to a remote tower, but that remote tower is communicating with 10x RTUs). You can see the need for efficient communication.

Now - realize this is how natural gas pipelines are monitored/controlled in the mountains. You want that kind of communication to be reliable and scalable right?
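
To put rough numbers on the poll/response versus update-only point above, here is a minimal sketch. The readings, the 32-byte frame size, and the deadband are all made-up assumptions for illustration, and the functions only count bytes; they do not model any particular SCADA protocol or MQTT client.

```python
def polled_bytes(readings, frame_size):
    """Poll/response: every poll returns a full frame, changed or not."""
    return len(readings) * frame_size


def report_by_exception_bytes(readings, frame_size, deadband=0.0):
    """Update-only: send a frame only when the value has moved more than
    `deadband` away from the last value that was actually sent."""
    sent = 0
    last_sent = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) > deadband:
            sent += frame_size
            last_sent = value
    return sent


# Hypothetical pressure readings from one RTU, one per poll interval.
readings = [50.0, 50.0, 50.0, 50.1, 50.0, 50.0, 52.0, 52.0, 52.0, 52.0]
FRAME_SIZE = 32  # assumed bytes per report, chosen only for illustration

polled = polled_bytes(readings, FRAME_SIZE)
updates = report_by_exception_bytes(readings, FRAME_SIZE, deadband=0.5)
print(f"polling: {polled} bytes, update-only: {updates} bytes")
```

With these assumed readings only the first value and the jump to 52.0 are sent, so the saving grows with how rarely the value actually changes.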


Does MQTT offer anything novel to the call/response problem if it's not just updates (e.g. battery controlled nodes that need to know when to wake up and act)?

I started developing a system ~a decade ago and ended up writing my own messaging protocols/queue systems. It's been an uphill battle, and if there's something a little more modern in the wings I'm all ears (I'm porting over to more modern uC's so it makes sense to re-evaluate the stack I suppose...)


MQTT is heavily used in the IoT space because it's lightweight and doesn't require good connectivity.
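
To make "lightweight" a bit more concrete, here is a sketch of how a QoS 0 PUBLISH packet is laid out in MQTT 3.1.1, as I understand the spec: one header byte, a variable-length "remaining length", a length-prefixed topic, then the raw payload. The topic and payload are hypothetical, and this is not how any particular broker or client library implements it.

```python
def encode_remaining_length(n):
    """Encode the fixed header's Remaining Length field: 7 data bits per
    byte, with the high bit set when more bytes follow."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit
        out.append(digit)
        if n == 0:
            return bytes(out)


def encode_publish_qos0(topic, payload):
    """Build an MQTT 3.1.1 PUBLISH packet with QoS 0, no DUP, no RETAIN.
    QoS 0 carries no packet identifier, so the variable header is just
    the length-prefixed topic name."""
    topic_bytes = topic.encode("utf-8")
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    body = variable_header + payload
    return bytes([0x30]) + encode_remaining_length(len(body)) + body


# Hypothetical smart-plug reading.
packet = encode_publish_qos0("plug/1/power", b"42.5")
overhead = len(packet) - len("plug/1/power") - len(b"42.5")
print(len(packet), "bytes on the wire,", overhead, "bytes of protocol overhead")
```

For short messages like this one the framing cost is a handful of bytes on top of the topic and payload, which is a large part of why the protocol suits constrained links.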


I've taken to using MQTT as an interprocess communication bus for embedded systems.

When you run the broker on the same device as your clients you can do some really nice things, like take messages out of the device and into your development machine. Or inject them.


I've been wanting to use mqtt as a mq, and been getting resistance from the other devs because it's alien technology...which is funny, since it's more scalable than a lot of webby things.

Domain expertise can be a liability as well as an advantage.


Is there a programmable MQTT broker out there?

Something that lets you integrate authentication with your auth provider and has built in multi-tenancy?


EMQX has several auth modules. We use the Mnesia auth module with HTTP API client ID provisioning. They also have NanoMQ, an MQTT broker for the IoT edge (on resource-constrained devices). The documentation is also quite decent compared to this.

https://www.emqx.io/docs/en/latest/access-control/authn/auth...

They have a community edition available as a docker image or packages for popular Linux distros.

https://www.emqx.com/en/try?product=broker


We aim to be no worse than EMQX in this regard :) We are planning to add support for auth modules in the near future.



You should check out Solace's multi-protocol broker (free version: https://solace.com/downloads/) with support for MQTT, AMQP, SMF, REST, Websocket

It supports integration with OAuth, LDAP, RADIUS, ...


Shameless plug since I'm a contributor, but VerneMQ [1] is a pretty programmable one. You have options from using webhooks to writing your plugins in Lua or Erlang/Elixir.

[1] https://github.com/vernemq/vernemq


Not sure about multi-tenancy, but I’ve been using VerneMQ with the PostgreSQL auth backend.


Have a look at Apache Artemis


How does this compare to using RabbitMQ’s MQTT plugin?

https://www.rabbitmq.com/mqtt.html


I would assume it's because RabbitMQ is not a great choice at very large scale, and because it's another language stack, not Java.


Neither assumption holds. RabbitMQ is built on top of OTP. MQTT is a protocol which means client lang stack makes no difference.

To be clear, I am not asking why not Rabbit instead of Kafka. No. Why not just RabbitMQ with MQTT plugin instead of this broker.


RabbitMQ with the MQTT plugin is actually a good choice for many scenarios, especially when you need a messaging system that supports the MQTT protocol alongside other messaging protocols like AMQP. However, as they themselves say, there are other good MQTT brokers out there, and some will be able to handle even more MQTT client connections than RabbitMQ because other brokers are specialised in MQTT only. TBMQ is exactly such an MQTT broker, designed to be scalable, fault-tolerant, and efficient. The recent performance tests showed TBMQ's quality. Additionally, TBMQ can be easily launched as a single server, two servers, three servers, and so on. Compare that with RabbitMQ: https://www.rabbitmq.com/mqtt.html#requirements


Thanks (and good luck with the project!)


thank you, my friend :)


If I am a Java shop, why would I want to use this instead of e.g. ActiveMQ Artemis?


As I see it, the reasons to use TBMQ are the same as the reasons to use ActiveMQ Artemis. ActiveMQ Artemis seems to be a great platform that claims high performance, clustering support, and data persistence. We believe TBMQ offers the same advantages by relying on Kafka to provide high-throughput and low-latency delivery. Check out our performance results: https://thingsboard.io/docs/mqtt-broker/reference/100m-conne... We prioritize data durability by leveraging Kafka replication guarantees. The scalability of TBMQ is very easy to achieve without the need to configure anything. Just add nodes as you go to achieve better throughput/capacity/performance.


How does this compare to FlashMQ? (https://www.flashmq.org/) This seems to have many more dependencies - any advantages?


As far as I know, it is a lightweight MQTT broker designed to take good advantage of multi-CPU environments. We provide cluster support out of the box and are designed to support and work great for various setups, from small-scale to enterprise-level implementations.


How does this compare to BlazingMQ?


I would recommend reviewing the following article: https://bloomberg.github.io/blazingmq/docs/introduction/comp...

Key points can be found there, and everyone will choose their own priorities. For us these are:
* https://bloomberg.github.io/blazingmq/docs/introduction/comp...
* https://bloomberg.github.io/blazingmq/docs/introduction/comp... (Kafka is Java, as is TBMQ, and Kafka no longer depends on ZooKeeper)
* https://bloomberg.github.io/blazingmq/docs/introduction/comp...
* https://bloomberg.github.io/blazingmq/docs/introduction/comp... (for this we recommend reviewing our test - https://thingsboard.io/docs/mqtt-broker/reference/3m-through...)


Off-topic but maybe I'm speaking to the right audience here: are there any MQTT brokers for very low loads that (ideally) run on a microcontroller? All I want to do is to collect and log power consumption reports from smart plugs and environmental data and I was thinking of something consuming orders of magnitude less power than a Raspberry Pi.

Edit: thanks a lot!


Several different implementations exist; MQTT is a very lightweight protocol, so this is very possible.

https://github.com/hsaturn/TinyMqtt https://mongoose.ws/



I don't know much about it yet, but look into Zenoh. I think it can run on microcontrollers.


Why wasn't NATS[1] used ?

Written in Go, single-binary deployment... there's a lot to love about NATS !

[1]https://nats.io/


https://nats.io/about/ says a lot of things except what NATS is … is it like a service mesh? What is it, really? What does it do and what does it allow me to do?


NATS is equivalent to MQTT, just a different protocol. It has a persistence layer called JetStream which is equivalent to Kafka, i.e. you can rewind through the message stream. The nice thing about NATS really is that it supports websockets unlike Kafka, it supports wildcard subtopics like MQTT, and it bridges very easily to both Kafka and MQTT (bidirectional).

Other fancy stuff: yes, it's one binary to run, and through gossiping it can also support a distributed KV / object store like etcd / S3, except you can register watchers on it (https://docs.nats.io/nats-concepts/jetstream/key-value-store...).
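
For anyone unfamiliar with the MQTT-style wildcards mentioned above, here is a minimal sketch of topic-filter matching as I understand the MQTT rules: `+` matches exactly one level, `#` matches the parent and everything below it and must be the last level. The function name and the example topics are my own, not taken from any broker. NATS uses the same idea with different syntax (`.` separators, `*` and `>` wildcards).

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # A filter that starts with a wildcard must not match '$'-prefixed
    # topics such as $SYS/...
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level; it matches
            # the parent level itself and any number of child levels.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    # No '#' seen: the filter must cover every topic level exactly.
    return len(filter_levels) == len(topic_levels)


print(topic_matches("sensors/+/temp", "sensors/kitchen/temp"))
print(topic_matches("sensors/#", "sensors/kitchen/temp"))
print(topic_matches("sensors/+", "sensors/kitchen/temp"))
```

A real broker would typically store subscriptions in a trie rather than test every filter against every message; the linear check here is only meant to show the matching rules.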


I've used NATS before, and I might not have explored all the options, but what I used it for was a pub/sub model of messaging.

I cannot speak to performance of NATS as we used it in low-traffic / low-demand setting. In terms of convenience, when used from Go, iirc, it offered some kind of built-in serialization (unlike eg. ZMQ, where you'd have to add your own).

Debugging lost messages was painful, but I've never used a tool where it wasn't...


Because NATS isn't an MQTT broker?


We are not very familiar with NATS. Does it support the whole functionality an MQTT broker should support? In addition, it is not written in Java, which is what we work with, so it could not be our solution.


[flagged]


Java is one of the most widely supported ecosystems available. I'm genuinely curious why throwing in on one of the most successful languages currently in existence warrants "in 2023/2024?".


[flagged]


Actually, it is up-to-date (new features twice a year, some of them, e.g. virtual threads, are truly SotA) and not legacy (things like graalvm and helidon 4 are relatively new and are awesome). Do not avoid Java if your project would benefit from it.


Almost no new project would benefit from Java. Android apps should use Kotlin. Avoid JVM for new projects unless there is no other way.


All depends on your definition of successful.


It has huge mindshare, a huge amount of libraries and existing solutions, a huge amount of tools, a huge amount of investment in state of the art VMs, and a huge amount of usage in just about every application of computing (save maybe embedded, a particularly weak application for the language, and even then it probably gets used more than it should). There is no metric by which it isn't one of the most prolific and successful programming languages ever made. I don't even like the language or its environment. There's just no point in denying that it is massively successful and very, very likely to fulfill any need you could possibly have, even if the ergonomics aren't great.


Sounds like McDonalds - successful, yes, but disgusting.


Dude, try McDonald's in Ukraine. It is so delicious that I can eat it every week. In fact, our company has made Friday McDonald's Day.


Java will be relevant for a very long time. Fortune 500 infra mostly built on Java at this point. Google. Apple.

It’s a very easy language to learn but at the same time can produce some of the worst codebases to maintain.

Early versions of Java were kind of janky. Java 7 was notoriously verbose. But 8 and the current LTS version have vastly improved the performance and usability of the JVM.


[flagged]


Is one of the most widely used languages still relevant?


You mean the 25% of orgs still bleeding from log4shell?

We all know it is statistically popular, but is it still relevant...

I wonder... =)


What you need not wonder about is the relevance of starting and perpetuating dumb language wars on HN. It's zero.


If it works for your use-case, it's great. =)


It doesn't work for the forum which is what counts.


[flagged]


There is probably 100,000x the actual computer runtime of actual mission critical software running java than Rust. Finance, Medical, Insurance. A huge amount of the actual critical software out there is Java.


Long-running mission-critical stuff has been written in Erlang and running on BEAM for decades now.


Indeed, the correct solution here is emqx


They are a Java shop; learning/supporting a new type of infra adds significant overhead.


Yeah, but Erlang is not the current trendy thing, sadly.

(I’m being ironic of course)


Irony aside, Elixir definitely feels like a current trendy thing, or at the very least more trendy than Java.


And java is???


Who is going to write it?


And who is going to maintain it after the original developer moves on to another project?


the rust cargo cult is real, pun not intended


FWIW I'm a developer working full-time in Rust, advocate for the language, and am on a team of a half-dozen developers working on an (embedded) project in Rust...

And none of us talk this way, nor do most people who work in and like the language that I encounter.

There's always going to be a minority of people who talk this way about tech, doesn't make them representative.


fair enough, fwiw the original cargo cults were not made up of USAF pilots either


https://github.com/bytebeamio/rumqtt

Disclaimer: have not tried it myself. I was, however, considering using it to replace Mosquitto as a broker.


Did you have a specific reason to replace mosquitto?

I do not like mosquitto for several reasons, but I never really looked for a replacement (though I would be happy to find one, and I see that rumqtt is bundled with docker so this could be it)


No specific reasons. Nothing beyond "it was an annoying process to get it set up the way I wanted it"* and "C-language presumed bad; memory-safe-language presumed good".

I did manage to get mosquitto to run according to my liking in a docker container; I can share the steps if anyone is interested -- it was just a long writeup.

Now that mosquitto works and is load bearing for my Home Assistant, I haven't had enough interest in attempting to replace it.

* that is: with a password and without persistence

EDIT: The other one I tried was NATS-with-mqtt-compatibility, but I never got that working the way I expected either.


Or Ada/SPARK.


Your computer must be woefully inadequate for use if you adhere to that principle.



