Unsure if it's even possible, but if there were a way to write the streams out (i.e., simulate them) as WAL files, you might be able to accomplish the same thing without having to consolidate the duckdb file every time.
A couple of other ideas that may or may not diverge from your project's purpose or simplicity...
It's also too bad that duckdb isn't included in the out-of-the-box Snowpark packages; that would potentially let you run all of this in a Snowflake procedure/DAG and persist the duckdb file to object storage. You could of course achieve this with Snowflake containers. (But this would probably ruin any plans to support Redshift/BigQuery/etc. as sources.)
If the source tables were Iceberg, then you could have an even simpler duckdb file to produce/maintain that would just wrap views around the Iceberg tables using the duckdb iceberg extension.
install iceberg;
load iceberg;

create view x as
select *
from iceberg_scan('metadata_file');
If you're going to enable change tracking on tables then perhaps another solution is to use the CHANGES clause to select the data from the tables instead of reading streams. That way you could use the same timestamp across all the selects and get better consistency between your materializations to duckdb.
set ts = {last_materialization_ts};
select * from X CHANGES(INFORMATION => DEFAULT) AT(TIMESTAMP => $ts);
select * from Y CHANGES(INFORMATION => DEFAULT) AT(TIMESTAMP => $ts);
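Note that the CHANGES clause only returns rows for periods where change tracking was already on, so each source table would need it enabled up front. A minimal sketch, with placeholder table names:

alter table X set change_tracking = true;
alter table Y set change_tracking = true;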
-- triggered task to kick off materialization to duckdb
create task M
when
  SYSTEM$STREAM_HAS_DATA('X_STREAM') AND
  SYSTEM$STREAM_HAS_DATA('Y_STREAM') AND ...
AS
  CALL MY_MATERIALIZE(); -- send SNS, invoke ext-func, etc
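For completeness, the streams the task polls would just be standard table streams created ahead of time; a minimal sketch, with names matching the hypothetical task above:

create stream X_STREAM on table X;
create stream Y_STREAM on table Y;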
Not the original parent, so unsure of their use case. But I've seen the approach where some basic development is done on duckdb before making its way to dev/qa/prd.
Something like your project might enable grabbing data (subsets of data) from a dev environment (seed) for offline, cheap (no SF warehouse cost) development/unit-testing, etc.
Adding a Snowflake connection configuration option that follows the standard Snowflake connection conventions might be a good addition. That way you could connect to Snowflake with your existing configurations (.snowsql/, .snowflake/), or explicitly specify one by adding matching args/params to your project's config.
I just pushed the external_browser branch to GitHub. It should offer 1/ externalbrowser authentication and 2/ the ability to use TOML files with profiles, with instructions on how in the readme. If you run the following you should be able to test it out.
git clone https://github.com/ryanwith/melchi.git
cd melchi
git checkout external_browser
# remaining steps are the same
Would you mind testing it out and letting me know if it works for you? Would really appreciate it!
https://github.com/tobymao/sqlglot