> Under the hood we use a combination of copulas and deep-learning models to mod...

openquery · on Aug 18, 2020

Hey - good question.

Not right now, but it shouldn't be hard to implement. Is there a specific use case this would address?

graerg · on Aug 18, 2020

> it shouldn't be hard to implement

Yeah, it seems like it's just a flat/uninformed probability distribution, and I'd guess your models are general enough to accommodate that.

A couple use cases come to mind:

1. If I have no data but want to test out various/arbitrary schemas with just a bunch of dummy data. Of course, I could generate it myself (either with ad hoc scripts or building a more general CLI that does this for me), but if Synth just makes it a one-liner in the command line, that's appealing.

2. If it's too burdensome to convince others in my org that you've "built it so that our servers never have to see any sensitive information". Even if I trust you, I then have to make arguments for others to also trust you, when really if all I need is some random data for an empty schema, then that's a whole can of worms I don't need to open.
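
To make (1) concrete, the ad hoc script route could look something like the rough sketch below. The schema format and the names (`SCHEMA`, `random_value`, `generate_rows`) are made up for illustration and have nothing to do with Synth's actual schema format or CLI; every column is just drawn from a flat distribution:

```python
import random
import string

# Hypothetical schema format (column name -> type tag); NOT Synth's format.
SCHEMA = {"id": "int", "name": "str", "score": "float", "active": "bool"}


def random_value(type_tag, rng):
    """Draw one value from a flat (uniform) distribution for the given type."""
    if type_tag == "int":
        return rng.randint(0, 1_000_000)
    if type_tag == "float":
        return rng.uniform(0.0, 1.0)
    if type_tag == "bool":
        return rng.random() < 0.5
    if type_tag == "str":
        # Eight lowercase letters, each chosen uniformly at random.
        return "".join(rng.choices(string.ascii_lowercase, k=8))
    raise ValueError(f"unsupported type: {type_tag}")


def generate_rows(schema, n, seed=None):
    """Return n dummy rows (as dicts) matching the given schema."""
    rng = random.Random(seed)
    return [
        {col: random_value(type_tag, rng) for col, type_tag in schema.items()}
        for _ in range(n)
    ]


rows = generate_rows(SCHEMA, 3, seed=0)
for row in rows:
    print(row)
```

It works, but it's exactly the kind of thing I'd rather not rewrite for every new schema, hence the appeal of a one-liner.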