Launch HN: Synnax (YC S24) – Unified hardware control and sensor data streaming

candiddevmike · 2024-08-12T19:44:59 1723491899

This is really neat, I know a lot of folks in manufacturing QA that would love something like this. The telemetry aspect of industrial equipment is terrible IMO, so many folks are hand rolling sensors and triggers and then trying to duct tape an extremely fragile monitoring and dashboard system using something like graphite. Neat space to be in!

How are you going to interface with the big boys like rockwell? I see you have drivers, what about partnerships? I know a lot of companies tend to only work with toolsets their provider "blesses", so having them on "your team" can help. You may have to pick favorites to win early deals/"synergy" (and may help with acquisition?)

I've worked with industrial automation in the past and have always enjoyed the technical constraints within it. I would be interested in helping you with pre or post-sales support/training/implementation for your customers if you need it. Email is in my profile.

embonilla · 2024-08-12T20:33:03 1723494783

It's been a really interesting problem to tackle - we've seen so many different ways that companies try to tackle this problem, and while theres one or two companies that have made really fantastic internal tools, most are ... lacking, to say the least.

Our plan so far has been to try to interface with the bigger companies through the drivers we make for their hardware. We haven't reached out about partnerships yet, but that is a really good idea.

Thank you for the offer - will definitely reach out if and when we need more help on the implementation side.

mmckelvy · 2024-08-13T12:48:46 1723553326

I work for a YC company in the industrial maintenance space and we've been looking at integrating machine data drive and automate certain maintenance workflows. The most difficult aspects of integrating machine data have been (i) networking and (ii) translating whatever protocol (Ehternet/IP, Profinet, Modbus, etc.) the machine's control system uses into familiar formats. In many cases, PLCs aren't connected to a network, the customer doesn't know which tags point to the data they need, IT has concerns about security, and the list goes on. How are you guys thinking about these networking and protocol translation issues?

Second question. The main platform in this space is Ignition. Do you consider yourself a competitor to Ignition or are you aiming for a different use case?

elham · 2024-08-13T17:01:39 1723568499

1. Since we have several customers in Aerospace who adhere to ITAR regulations, all our software is hosted on-prem so that we have no connection to the data they're using. The data in the DB can always also be taken out if a system switch is desired. Agree about the large assortment of issues that come up when integrating these kinds of systems. It would be naive of us to try to address all these all at once. We tried to make it easy to integrate our systems so that we can run pilots that start small (1 test stand, 1 bench, etc.) to demonstrate initial & immediate value and then expand from there. We want our product to grow with our users - incorporating support for new protocols as they need that become broadly applicable.

2. We see a lot of value in providing essentially a universal adapter to these protocols and hardware interfaces. Decoupling the data communication/device infrastructure from the control and acquisition workflows is big for us and this seems essential to that. A big endeavor on its own, but our existing integrations have been really helpful to our users and as it matures, we intend to continue expanding these integrations!

Hopefully 1 & 2 address your first question!

3. Addressing the second question: We've mostly been focusing on the test & operations use cases (e.g. running real-time control and data acquisition for engine tests). We see a lot of ways we can eventually service industrial controls/automation space - similar to Ignition. However, we are also aware of many reasons people in this space will want to stick to tried and true tools with a larger community and ecosystem.

We're still figuring out how we fit into that space + communicate our ability to provide the breadth of functionality and support them. Posts like this and the users who already see the value and are willing to try something more novel and developmental like us have been huge in progressing towards this.

Some questions I have!

  1. What parts of the networking have been most challenging work you wish could have already been done for you?

  2. For interfacing with protocols - similar question as above but also which ones were pretty nice to work with and what about them made it so? Closely related which direct integration would you immediately want if you were considering something like Synnax?

  3. Related to the customer not knowing the mapping of tags to data - are there similar issues that you've experienced that make it hard to use these systems?

Ended up being a long message but I appreciate your insights on any of what I just said!

mmckelvy · 2024-08-13T18:23:09 1723573389

Oh, ok, the test and operations case makes sense. As for your questions:

1. It's basic networking tasks such as running a network drop, assigning IPs, making sure the PLCs are on the right subnet, etc. In many cases the PLCs aren't on a network at all and the IT team doesn't really know how to work with the PLCs and the OT team doesn't really know how to work with networks. Sometimes it's been easier to just add external sensors and go over a cellular network and skip the PLC altogether.

2. We use one of Ignition's modules to interface with the control systems directly. They have drivers for Allen-Bradley, Siemens S7, Omron, Modbus, and a few others. The downside is Ignition doesn't have an API, so we have to configure things using a GUI. Beyond Ignition, the other big provider of drivers is Kepware - they probably have a driver for everything, but again, they aren't really set up for use by developers trying to deploy to a Linux box. If the customer has an OPC-UA server set up, we can connect to that using an open source library.

3. What we've learned is that many customers rely on third parties (e.g. the machine manufacturer or a system integrator) to configure their system, so when it comes to extracting the data they want, you're kind of on your own. We're not industrial system experts, so this creates a unique challenge. Larger and more sophisticated customers will have a much deeper understanding of their systems, but these folks are usually going to be using something like Ignition and will already have the dashboards and reports so it's more a matter of integrating with Ignition.

elham · 2024-08-13T22:47:36 1723589256

This all makes sense and is extremely illuminating. Thank you!

svnt · 2024-08-12T22:13:44 1723500824

Context: started a company that did essentially this about a decade ago. Haven’t looked back much. My data may be stale or just biased.

> We used old control software that spit out data in massive 10 GB CSV or TDMS files. After a long day and night of testing, no one wanted to go through all the work to review the data.

> We think Synnax is unique in that it provides a bridge between <lab/automation DAQ systems>

On the surface it seems like anomaly detection is still the hard problem, but you’re not setting out to solve it?

Time series databases are state of the art generally in finance, not in industrial/InfluxDB, so I don’t think saying you’re 5x influxSB on writes is going to persuade too many people, especially given the cost now for a terabyte of RAM. I’ll just move all of it to an in-memory database before I’ll take on the switching costs.

The thing I wanted was one solution for something that was always two: a properties/metadata database, and a separate time series database.

It seems to me like you are maybe building a level too low and could get a lot more value working on the problem that you say motivated you in the first place. It is hard because of all the context required to automatically detect anomalies, but I think that is why it is valuable to solve.

The value we had was we rolled in the data/cellular connection all the way down to the endpoint, so they could avoid IT integration, which was a big hurdle at the time. I don’t know if IT integration is still a hang up for your customers.

We found that visualization layers tended to reach down just far enough into the data intake world that it was really hard to sell just another tsdb.

embonilla · 2024-08-12T22:39:36 1723502376

> I don’t think saying you’re 5x InfluxDB on writes is going to persuade too many people.

I definitely agree with this. Our early prototype of Synnax actually sat on top of a combined Redis/S3/SQL stack and focused on those high level features. We found that it was challenging to deploy, manage, and synchronize data across these services, especially when you're running everything on prem.

We've come to believe that a re-architecture of the underlying infrastructure can actually unlock the high level workflows. For example, to compare a real-time telemetry stream with a historical data set you'd need to query across tools like Kafka and Influx at the same time. For an experienced software engineer this isn't too hard of a task, but they don't tend to be the people who understand the physics/mechanics of the hardware. We want it to be possible for say, a Turbo machinery expert, to translate a Python script they wrote for post-processing a CSV into something Synnax compatible without a huge amount of work.

In short, we're working on finding a way for subject matter experts in hardware to implement the anomaly detection mechanisms they already have in their head, but don't have the software expertise to implement.

> The thing I wanted was one solution for something that was always two: a properties/metadata database, and a separate time series database.

What do you think about TimeScale for this sort of use case? Haven't run it in production myself, but having your time-series data in the same place as a SQL table seems pretty nice.

> We found that visualization layers tended to reach down just far enough into the data intake world that it was really hard to sell just another tsdb.

This is a good point. We think that focusing exclusively on the DB is probably not the right approach. Most of our focus nowadays is on building out higher level workflows on top of the new database.

svnt · 2024-08-13T16:49:20 1723567760

These answers are a much better presentation of your position in the market than how you described the company in the post above.

I found TimescaleDB after I wrote this — it does look like the answer to my problems from a decade ago. I don’t do that anymore but I’m glad someone brought it to market.

If you can describe with clarity how a scientist/hardware engineer using your tool is going to implement their anomaly detection, or whether your software will somehow shadow and assist/learn from what they try to do, I think that would be a much more compelling pitch.

kpmcc · 2024-08-12T20:42:33 1723495353

Hey! I work at a startup that does industrial automation related work and this looks super helpful. Going to take a deeper look later, but off the bat I wanted to ask why you felt a custom time series database was warranted when there are options like timescale or regular old postgres out there?

elham · 2024-08-12T20:57:14 1723496234

Hey! Great question we get a lot. We've come from/talked to a lot of companies that do what you described with stuff like timescale and influxdb. They're useful tools and support a breadth of applications. We thought by building one to specifically leverage the read/write patterns you'd expect with sensor-heavy systems, we could achieve better data throughput and thus better enable real-time applications. For example, we've been able to get 5x write performance for sensor data on our DB compared to influxDB.

In general, having built out the core DB, it has been valuable in allowing us to expand to the other useful features such as being able to write commands out to hardware at sufficient control loop frequencies or create smooth real-time visualizations.

The other thing we think is really powerful is having a more integrated tool for acquiring, storing, and processing sensor data & actuating hardware. One common issue we experienced was trying to cobble together several tools that weren't fully compatible - creating a lot of friction in the overall control and acquisition workflow. We want to provide a platform to create a more cohesive but extensible system and the data storage aspect was a good base to build that off of.

kpmcc · 2024-08-12T21:18:40 1723497520

Thanks for the reply! That all makes sense, and I can totally relate to the "cobbling together several tools that weren't fully compatible" experience. There's enough complexity with having to support or integrate sensors/actuators with a variety of industrial networking protocols. Anything to simplify the software portion of the system would go a long way. Excited to dig into this a bit more, best of luck with ongoing development!

elham · 2024-08-12T21:35:24 1723498524

Thank you! Happy to get any feedback after a deeper look

btown · 2024-08-12T21:14:00 1723497240

Did you build on any low-level libraries like RocksDB for data persistence etc.? Or did you fully hand-roll the database? Curious about the tradeoffs there nowadays.

embonilla · 2024-08-12T21:28:04 1723498084

The core time series engine, called cesium (https://github.com/synnaxlabs/synnax/tree/main/cesium), is written completely from scratch. It's kind of designed as a "time-series S3", where we store blobs of samples in a columnar fashion and index by time. We can actually get 5x the write performance of something like InfluxDB in certain cases.

For other meta-data, such as channels, users, permissions, etc. we rely on CockroachDB's Pebble, which is a RocksDB compatible KV store implemented in pure go.

kpmcc · 2024-08-12T21:41:22 1723498882

One thing that might keep me from using this is how well it integrates with other tools I might want to use for data analysis or historical lookback. For example, I currently use grafana as a simple, easy way to review sensor data from our r&d tests. Grafana has solid support for postgres, timescale, influxdb and a number of other data sources. With a custom database, I'd imagine the availability of tools outside of the synnax ecosystem would be rather limited.

elham · 2024-08-12T22:00:13 1723500013

That's a valid concern! As we're currently a team of 3 building this out - we still are working on building out our integrations with other tools, hardware, etc. We have been prioritizing building direct integrations to systems that our current users are interested in.

We also value enabling developers to build off of Synnax or integrate the parts they are most interested in into their existing systems. We've tried to service that end by building out Python, Typescript, & C++ SDKs and device integrations. We're continuing to look into how we can better support developers build/expand their systems with Synnax, so if there are any integrations you think are important, I would appreciate your take.

jkid · 2024-08-13T13:43:35 1723556615

How will this work with legacy systems? When I would go out to do site installations at concrete plants, the control systems were these antiquated systems cobbled together by CommandAlkon (the major player for the last 20 years). Would you replace that software? Often times these servos and sensors were deeply integrated into the vendor solution, so how could Synnax work in this environment, if at all?

elham · 2024-08-13T17:44:53 1723571093

Not super familiar with CommandAlkon/Construction space so would love to hear more on that end.

To address hardware deeply integrated with proprietary systems - our current mode of dealing with this is building out the middleware to allow these systems to integrate with Synnax - or a protocol that can be used to communicate with these devices. But we also try to make it a developer-friendly tool so that if people already familiar with these systems want a quick way to connect Synnax to their existing systems - they can do so using our Python, Typescript, or C++ libraries.

Our goal would be to substitute these cobbled-together systems for a more uniform and airtight system. We hope to do that by starting with sub-scale preliminary integrations on already-in-development setups/expansions that can be a base for expanding out to the rest of the system.

c_o_n_v_e_x · 2024-08-16T03:58:28 1723780708

Using vernacular from the industrial automation industry, how is this different from a historian?

A sensor is physically wired to an input on a PLC which collects data, the historian software communicates with the PLC/DCS and saves instrument/sensor data for further review.

elham · 2024-08-16T16:55:12 1723827312

Based on my understanding of historians, we combine a historian, real-time data visualizer/processor, and hardware controller into one. We tried to create a more unified and developer-friendly ecosystem - reducing the need for multiple disparate components and adapters and abstracting out the low-level work necessary to interface with hardware like PLCs.

notepad0x90 · 2024-08-12T23:55:50 1723506950

Just curious, were you inspired by Asimov when naming the company?

https://asimov.fandom.com/wiki/Synnax

embonilla · 2024-08-13T00:18:08 1723508288

Yes! I'm a fan of Asimov and the foundation series. Synnax is the home planet of Gaal Dornik, who studies Psychohistory, which is (loosely) related to time. I mostly thought the name sounded cool, and then came up with this explanation later :)

ReadEvalPost · 2024-08-12T21:24:17 1723497857

FYI your CSS looks to be broken on Firefox.

embonilla · 2024-08-12T21:30:15 1723498215

Hey, we're aware and a little embarrassed! We use some nested CSS selectors that aren't compatible with some browsers and haven't gotten around to fixing it yet, sorry!

weaverwasher · 2024-08-13T00:20:44 1723508444

I believe my corporate firewall is blocking your site due to this. On chrome.

embonilla · 2024-08-13T01:00:47 1723510847

It might also have something to do with our analytics tool (via Vercel). I'll look into this.

accurrent · 2024-08-13T12:15:29 1723551329

How does this compare with Zenoh?

elham · 2024-08-13T17:36:46 1723570606

Not super familiar with Zenoh! At first glance, it seems to focus more on the robotics use case and support higher-level data manipulation.

Synnax also follows a pub-sub model, which enables functionality like having multiple clients/consoles be able to view and monitor their physical systems.

I'd say we try to reach closer to the edge to help directly facilitate sensor abstractions. In this vein, Another way Synnax seems to differ is to try to cater more to the hardware control aspect. So for example, we have a panel for users to create schematics of propulsion/electrical systems which they can they link with automated scripts to command that actual system and override with manual control when necessary.

Using multiple nodes of Synnax in a distributed fashion is also still a WIP but a goal of ours as well!

If you use Zenoh would love to hear how you use it and if my impressions of it are correct.

rapjr9 · 2024-08-12T23:15:11 1723504511

Some related work that might be interesting:

https://www.cs.dartmouth.edu/~kotz/research/project/solar/

https://www.cs.dartmouth.edu/~kotz/cmc/projects/solar.html

elham · 2024-08-13T17:46:46 1723571206

Thank you! Will look into these.