Specifically, I'm interested in testing a web dashboard/app. So if I use Synth to populate my db, how would I know whether the backend's endpoints are giving me good data? Is there a way to guarantee a specific set of test data each time (so I can precompute what the values should be), or will I need to start each test run by querying the database to see what's in it and work out what the expected results should be?
Also, is there a way to prepare data for import into an existing db? Right now, for some of our testing, we have a single staging instance, and we deconflict concurrent tests by including a randomized 8-character string in all the relevant IDs of the precomputed data we insert during test initialization. For this testing it's not as important that the data is repeatable, but the testers have a few different scenarios they want to test, so I'd need a way to make low-data, medium-data, and high-data test sets where the backing data fit within certain ranges.
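(For concreteness, the deconfliction step looks roughly like the sketch below; the names are illustrative, not our actual schema.)

```bash
# Illustrative only: mint an 8-character run ID and prefix it onto the IDs of the
# records a test run inserts, so concurrent runs on the shared staging instance
# don't collide with each other.
RUN_ID=$(openssl rand -hex 4)       # 4 random bytes -> 8 hex characters
ORDER_ID="${RUN_ID}-order-0001"     # e.g. "3fa9c21b-order-0001"
```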
> Is there a way to guarantee a specific set of test data each time
Absolutely. You can seed the model so that the data you get on every run is completely reproducible.
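As a rough sketch of what a reproducible run could look like, assuming the sample command accepts a seed option (the exact flag is my assumption, so check `synth model sample --help` for the real name):

```bash
# Sketch only: pin the generator's seed so every run produces identical rows.
# The --seed flag is an assumption; consult the CLI help for the actual option.
synth model sample <model-id> --seed 42 --sample-size 1000 --output ./fixtures

# Re-running the same command with the same seed should yield the same CSVs,
# so expected dashboard values can be precomputed once and asserted against.
```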
> For this testing it's not as important that the data is repeatable, but the testers have a few different scenarios they want to test, so I'd need a way to make low-data, medium-data, and high-data test sets where the backing data fit within certain ranges.
This is a great use case for Synth. With the upcoming Firehose API, you can point it at an existing database and specify how much synthetic data you want to generate and pump into your db.
For now you can either create a database and write the ETL yourself, or run `synth model sample <model-id> --output <some-directory> --sample-size <number-of-rows>` to sample directly from the model into a directory of CSV files and use those to load your database.
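For the low/medium/high scenarios, one possible flow along those lines is sketched below. The Postgres `\copy` step and the CSV layout (one file per table, with a header row) are assumptions on my end, so adjust them to your schema and engine:

```bash
# Sample three differently sized data sets from the same model (sizes are illustrative).
synth model sample <model-id> --output ./low    --sample-size 100
synth model sample <model-id> --output ./medium --sample-size 10000
synth model sample <model-id> --output ./high   --sample-size 1000000

# Load one set into an existing Postgres database. Assumes one CSV per table,
# named after the table, with a header row.
for f in ./medium/*.csv; do
  table=$(basename "$f" .csv)
  psql "$STAGING_DATABASE_URL" -c "\copy $table FROM '$f' WITH (FORMAT csv, HEADER true)"
done
```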
Feel free to get in touch if you would like to learn more :)