Congrats on shipping, Christos, Damien, and Nodar! I really like this idea. I have this problem at my company.

Two questions:

First, we’re using Postgres and some of our tables use JSON. Would Synth be able to generate realistic JSON? Sometimes this is configuration (which would need to be straight copied) and other times it would be data (which would need to keep the same keys but have generated values). Is this use case supported?

Second, I’m concerned about giving Synth access to my data, as much of it is sensitive. I understand that you need access to production data to offer the service. What can you tell me about your data security to help me feel more comfortable? (e.g. What kind of data would you have stored on your end? How does the CLI work? etc.)

Congrats again and good luck!




Thanks and great questions!

> First, we’re using Postgres and some of our tables use JSON...

We've seen this before, when we were talking to a company we were considering for a pre-launch pilot. It's on our roadmap. Currently the JSON text would be treated as a string, i.e. it would be classified as a categorical type or as text.

What we would want is for the classifier to traverse the JSON object instead of treating it like text. This feature is going to be implemented when we extend to NoSQL databases.
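As a rough illustration of what "traversing the JSON object" could mean, here is a minimal Python sketch. It is not Synth's classifier or API: the `synthesize` function and the naive per-type generators are hypothetical and only show the "same keys, generated values" idea from the question.

```python
import json
import random
import string


def synthesize(value, rng):
    """Return a value with the same structure as `value` but generated leaves.

    Hypothetical helper for illustration only; the generators are naive.
    """
    if isinstance(value, dict):
        # Keep every key, recurse into the values.
        return {key: synthesize(child, rng) for key, child in value.items()}
    if isinstance(value, list):
        # Keep the list length, recurse into each element.
        return [synthesize(item, rng) for item in value]
    if isinstance(value, bool):
        # Check bool before int, because bool is a subclass of int in Python.
        return rng.choice([True, False])
    if isinstance(value, int):
        return rng.randint(0, 1000)
    if isinstance(value, float):
        return rng.uniform(0.0, 1000.0)
    if isinstance(value, str):
        # Random lowercase string of the same length as the original.
        return "".join(rng.choice(string.ascii_lowercase) for _ in value)
    # JSON null (None) passes through unchanged.
    return value


if __name__ == "__main__":
    row = json.loads('{"user": {"name": "alice", "age": 30}, "tags": ["a", "bb"]}')
    print(json.dumps(synthesize(row, random.Random(0))))
```

In this sketch, a column holding configuration JSON would simply skip the call and be copied verbatim, while a data column would go through `synthesize`.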

> Second, I’m concerned about giving Synth access to my data as much of it is sensitive.

Absolutely. This has been one of the guiding principles in building Synth. We've built it so that our servers never have to see any sensitive information. (That's why you use Synth via a CLI tool instead of an API.)

Also:

1) The CLI is soon to be OSS, giving full visibility into exactly what's happening when you use it. (Really it's OSS now, since you can just take a look at the source code running in the container; we just haven't had the time to make our repo public.)

2) The models are designed to be transparent. You can inspect them by running `synth model inspect <model-id>`. This gives you visibility into exactly what the model looks like. (Looking at the data which has been sampled is still a WIP.)

3) If something goes wrong and sensitive information is uploaded to the Synth platform, you can easily purge all traces of it using `synth model rm <model-id>`.


> We've built it so that our servers never have to see any sensitive information.

If true, this is a key selling point and should probably be somewhere near the top of the homepage. I didn't get that point from reading any of the copy.


Thanks for the feedback. I'll make sure this is clear.

Why is this important for you?


(not OP, but) from a European perspective, it means one less GDPR headache. At the company I work for I know having PII going through a 3rd party server for this kind of purpose would be a no-go.



