Congrats on shipping, Christos, Damien, and Nodar! I really like this idea. I hav...

openquery · on Aug 18, 2020

Thanks and great questions!

> First, we’re using Postgres and some of our tables use JSON...

We've seen this before, when we were talking to a company we were considering for a pre-launch pilot, and it's on our roadmap. Currently the JSON text is treated as a string, i.e. it is classified as a categorical type or text.

What we would want is for the classifier to traverse the JSON object instead of treating it like text. This feature is going to be implemented when we extend to NoSQL databases.
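To make "traverse the JSON object" concrete, here is a minimal Python sketch. It is only an illustration of the idea, not Synth's classifier: `leaf_type`, `infer_leaf_types`, the path notation and the type labels are all hypothetical names made up for this example.

```python
import json
from typing import Any, Dict


def leaf_type(value: Any) -> str:
    """Map a JSON scalar to a coarse, made-up type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def infer_leaf_types(value: Any, path: str = "$") -> Dict[str, str]:
    """Walk a parsed JSON value and map each leaf path to a type label."""
    if isinstance(value, dict):
        children = [(f"{path}.{key}", child) for key, child in value.items()]
    elif isinstance(value, list):
        # All array elements share one path, so their types get merged.
        children = [(f"{path}[]", child) for child in value]
    else:
        return {path: leaf_type(value)}

    found: Dict[str, str] = {}
    for child_path, child in children:
        for leaf_path, name in infer_leaf_types(child, child_path).items():
            if found.get(leaf_path, name) != name:
                name = "mixed"  # conflicting types seen at the same path
            found[leaf_path] = name
    return found


# One cell from a hypothetical Postgres JSON column.
cell = '{"user": {"name": "Ada", "age": 36}, "tags": ["a", "b"], "active": true}'
for leaf_path, name in infer_leaf_types(json.loads(cell)).items():
    print(leaf_path, name)
```

Each leaf path (e.g. `$.user.age`) could then be classified as its own field, instead of the whole cell being classified as one categorical or text value.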

> Second, I’m concerned about giving Synth access to my data as much of it is sensitive.

Absolutely. This has been one of the guiding principles in building Synth. We've built it so that our servers never have to see any sensitive information. (That's why you can use Synth via a CLI tool instead of an API.)

Also:

1) The CLI is soon to be OSS, giving full visibility into exactly what's happening when you use it. (Really it's OSS now, since you can just take a look at the source code running in the container; we just haven't had the time to make our repo public.)

2) The models are designed to be transparent. You can inspect them by running `synth model inspect <model-id>`. This gives you visibility into exactly what the model looks like. (Looking at the data which has been sampled is still a WIP.)

3) If something goes wrong and sensitive information is uploaded to the Synth platform, you can easily purge all traces of it using `synth model rm <model-id>`.

sbecker · on Aug 18, 2020

> We've built it so that our servers never have to see any sensitive information.

If true, this is a key selling point and should probably be somewhere near the top of the homepage. I didn't get that point from reading any of the copy.

openquery · on Aug 18, 2020

Thanks for the feedback. I'll make sure this is clear.

Why is this important for you?

imInGoodCompany · on Aug 19, 2020

(Not OP, but) from a European perspective, it means one less GDPR headache. At the company I work for, I know that having PII go through a third-party server for this kind of purpose would be a no-go.