I’m building our company’s first data platform right now (fivetran, dbt, snowfla...

gurubavan · on Nov 15, 2021

1) An integration with Metabase Cloud is on our roadmap for Q1! We'd love to integrate with Lightdash, but they don't have a public API just yet[1].

2) Several of our customers use us to alert on schema changes in Postgres, specifically so they can get ahead of application database changes that will end up in the warehouse, so you're definitely not alone! Here's a link on how to connect Postgres: https://docs.metaplane.dev/docs/postgres

That's an excellent stack and one we kept front and center when building out Metaplane, so definitely let us know if you have any feedback or suggestions here!

[1]: https://github.com/lightdash/lightdash/issues/632

tomhallett · on Nov 15, 2021

All sounds great! I'll share it with my team.

My plan was to monitor the postgres database in the staging environment, so we can be alerted to schema changes before they are released into production (and hopefully stop the production deploy).

I have a goal of moving this even further upstream into the CI build for the source application itself (Ruby on Rails in this case), so that the application's test suite will fail when a developer introduces a breaking schema change. Note: this is a pretty tricky problem to solve without a) the tests being way too brittle OR b) super slow end-to-end tests. My goal is to introduce something which is a mashup of Spectacles [1], Pact [2], and dbt models [3].

[1] https://www.spectacles.dev [2] https://pact.io [3] https://docs.getdbt.com/docs/building-a-dbt-project/using-so...
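To make the CI idea above concrete, here is a minimal sketch of a consumer-driven schema check, loosely in the spirit of Pact. It assumes the warehouse side publishes a "contract" listing the tables and columns it reads, and that the application can dump its current schema in the same shape. The function name, the dictionary layout, and the example tables are all hypothetical, not part of any of the tools mentioned.

```python
from typing import Dict, List

# Hypothetical shape for both inputs: {table_name: {column_name: column_type}}
Schema = Dict[str, Dict[str, str]]


def find_contract_violations(app_schema: Schema, contract: Schema) -> List[str]:
    """Return one message per table or column the contract needs but the app lacks."""
    violations: List[str] = []
    for table, columns in sorted(contract.items()):
        if table not in app_schema:
            violations.append(f"missing table: {table}")
            continue
        for column, expected_type in sorted(columns.items()):
            actual_type = app_schema[table].get(column)
            if actual_type is None:
                violations.append(f"missing column: {table}.{column}")
            elif actual_type != expected_type:
                violations.append(
                    f"type changed: {table}.{column} {expected_type} -> {actual_type}"
                )
    return violations


# What the downstream models read (assumed example data).
contract = {"users": {"id": "bigint", "email": "text"}}

# The application's schema after a developer renamed users.email.
app_schema = {
    "users": {"id": "bigint", "email_address": "text", "created_at": "timestamp"}
}

print(find_contract_violations(app_schema, contract))
# ['missing column: users.email']
```

Because the check only looks at columns the contract names, adding `created_at` does not fail the build, which is one way to keep such a test from being too brittle. It also needs no running warehouse, so it stays fast.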

gurubavan · on Nov 15, 2021

That sounds like a great plan. We're planning to build our public API and CI/CD integrations early next year, so that developers can know what the downstream impact of their changes might be, and whether it could introduce unexpected results. We may be able to slot right in there with Pact.

Mitigating the impact with monitoring is where we're at right now, but we're with you that preventing errors can be even more important.

If it's interesting to you, we're happy to open up a shared slack channel to dig into the nuance as well! Just email me (guru@metaplane.dev) with the email you'd like to be added.

tomhallett · on Nov 15, 2021

Very cool. I'll reach out.

When Nick Schrock created dagster, he argued that many "data cleaning" tasks which people attribute to "data engineering" aren't actually "cleaning", but are architecture problems. I believe schema changes also fall into this category. I'm extremely new to data engineering, but when I ask "What are the things which will break this system?", an application engineer thinking "I'm going to rename this column and my tests pass, so this should be fine" will break things all the time. (The same goes for dropping a column or changing a one-to-many into a many-to-many.)
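As a rough illustration of why a rename is risky, the sketch below diffs two snapshots of one table's columns. The snapshot format (column name mapped to type) and the function names are assumptions made for the example. From the outside, a rename is indistinguishable from a drop plus an add, so anything downstream that selects the old name breaks even though the application's own tests pass.

```python
from typing import Dict, List

# Hypothetical snapshot of a single table: {column_name: column_type}
Columns = Dict[str, str]


def diff_columns(before: Columns, after: Columns) -> Dict[str, List]:
    """Compare two snapshots of a table and report what changed."""
    dropped = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    retyped = sorted(
        (name, before[name], after[name])
        for name in set(before) & set(after)
        if before[name] != after[name]
    )
    return {"dropped": dropped, "added": added, "retyped": retyped}


def is_breaking(diff: Dict[str, List]) -> bool:
    """Treat dropped or retyped columns as breaking; purely added columns are not."""
    return bool(diff["dropped"] or diff["retyped"])


before = {"id": "bigint", "name": "text"}
after = {"id": "bigint", "full_name": "text"}  # developer renamed name -> full_name

diff = diff_columns(before, after)
print(diff)
# {'dropped': ['name'], 'added': ['full_name'], 'retyped': []}
print(is_breaking(diff))
# True
```

A column-level diff like this would not catch a one-to-many becoming a many-to-many on its own, since that usually shows up as a new join table plus a dropped foreign key and needs table-level comparison as well.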