
> Under the hood we use a combination of copulas and deep-learning models to model the distributions and correlations in your dataset (the intuition here is that it's much more useful for developers to have realistic data than just sample from a random number generator)

This is neat, but do users have the option of just doing vanilla RNG if they want?




Hey - good question.

Not right now, but it shouldn't be hard to implement. Is there a specific use case this would address?


> it shouldn't be hard to implement

Yeah, it seems like it's just a flat/uninformed probability distribution, and I'd guess your models are general enough to accommodate that.

A couple use cases come to mind:

1. If I have no data but want to test out various/arbitrary schemas with just a bunch of dummy data. Of course, I could generate it myself (either with ad hoc scripts or building a more general CLI that does this for me), but if Synth just makes it a one-liner in the command line, that's appealing.

2. If it's too burdensome to convince others in my org that you've "built it so that our servers never have to see any sensitive information". Even if I trust you, I then have to make arguments for others to also trust you, when really if all I need is some random data for an empty schema, then that's a whole can of worms I don't need to open.
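For illustration, the kind of ad hoc script mentioned in the first use case might look roughly like the sketch below. It is not Synth's API or schema format: the `SCHEMA` dict, the type tags, and the `dummy_rows` helper are all made up for this example, and every column is sampled independently from a flat distribution using only Python's standard library.

```python
import random
import string

# Hypothetical schema format (an assumption, not Synth's): column name -> type tag.
SCHEMA = {"id": "int", "name": "str", "balance": "float", "active": "bool"}


def random_value(kind, rng):
    """Draw one value for a column from a flat, uninformed distribution."""
    if kind == "int":
        return rng.randint(0, 1_000_000)
    if kind == "float":
        return rng.uniform(0.0, 1000.0)
    if kind == "bool":
        return rng.random() < 0.5
    if kind == "str":
        return "".join(rng.choices(string.ascii_lowercase, k=8))
    raise ValueError(f"unsupported type: {kind}")


def dummy_rows(schema, n, seed=None):
    """Generate n rows of independent random values matching the schema."""
    rng = random.Random(seed)  # a seed makes the output reproducible
    return [
        {col: random_value(kind, rng) for col, kind in schema.items()}
        for _ in range(n)
    ]


rows = dummy_rows(SCHEMA, 5, seed=0)
for row in rows:
    print(row)
```

Since no real data is read at any point, nothing sensitive ever leaves the machine, which is the appeal in the second use case. The trade-off is that columns are uncorrelated and the values are not realistic.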



