
It’s not. If it holds up well when you operate it and extend it as things change, then it’s not “too simple”. You’ve just built on four extremely resilient and scalable technologies, which make the kind of “Big Data” hoop-jumping of the past 20+ years rather moot.

You know the drill: SQLite and DuckDB mean you’ve got a transactional database and a data warehouse living on a single machine, and unless you’re into terabytes, analytics performance is comparable to BigQuery and Snowflake.
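
Roughly what that split looks like on one machine, as a sketch (this assumes the duckdb Python package and its SQLite extension; the file and table names are just placeholders):

    import sqlite3
    import duckdb

    # Transactional side: the app writes to plain SQLite.
    app = sqlite3.connect("app.db")
    app.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL, created_at TEXT)")
    app.execute("INSERT INTO orders (amount, created_at) VALUES (?, ?)", (42.0, "2024-01-15"))
    app.commit()
    app.close()

    # Analytical side: DuckDB scans the same SQLite file in place.
    duck = duckdb.connect()
    duck.execute("INSTALL sqlite; LOAD sqlite;")
    duck.execute("ATTACH 'app.db' AS app (TYPE sqlite)")
    rows = duck.execute("""
        SELECT substr(created_at, 1, 7) AS month,
               count(*)                 AS orders,
               sum(amount)              AS revenue
        FROM app.orders
        GROUP BY 1
        ORDER BY 1
    """).fetchall()
    print(rows)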

No need to write an ETL pipeline that runs on a Hadoop cluster, provision that Hadoop cluster, or spin up a Hive/Pig instance for analysis. Nah, it fits on *your* computer and is scripted in the same language for the pipelines as for the analysis, without a performance cost.
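
Here’s what the “no Hadoop” version of an ETL + analysis script tends to look like, again just a sketch (the file paths and columns are made up):

    import duckdb

    con = duckdb.connect("warehouse.duckdb")  # a single-file "warehouse"

    # Extract + transform: read the raw CSV exports straight off disk.
    con.execute("""
        CREATE OR REPLACE TABLE sales AS
        SELECT *
        FROM read_csv_auto('exports/sales_*.csv')
        WHERE amount IS NOT NULL
    """)

    # Load: keep a compact analytical copy as Parquet.
    con.execute("COPY sales TO 'sales.parquet' (FORMAT PARQUET)")

    # Analysis in the same process, same language, no cluster.
    top = con.execute("""
        SELECT region, sum(amount) AS revenue
        FROM sales
        GROUP BY region
        ORDER BY revenue DESC
        LIMIT 10
    """).fetchall()
    print(top)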

If you need to scale it, it’s not technical scaling, it’s team knowledge scaling (still hard, but not a fundamental stumbling block). So bring in dbt/Dagster (or Airflow), and now it’s built on supported frameworks that others already know and use.
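
The team-scaling step is mostly wrapping those same queries in something others can pick up, e.g. Dagster assets, so you get scheduling, lineage and a UI people already know. A sketch (the asset and file names are illustrative, not from any real project):

    import duckdb
    from dagster import asset, Definitions

    # Sketch: the same DuckDB steps as above, wired up as Dagster assets.
    @asset
    def raw_sales() -> None:
        con = duckdb.connect("warehouse.duckdb")
        con.execute("""
            CREATE OR REPLACE TABLE sales AS
            SELECT * FROM read_csv_auto('exports/sales_*.csv')
        """)

    @asset(deps=[raw_sales])
    def revenue_by_region() -> None:
        con = duckdb.connect("warehouse.duckdb")
        con.execute("""
            CREATE OR REPLACE TABLE revenue_by_region AS
            SELECT region, sum(amount) AS revenue
            FROM sales
            GROUP BY region
        """)

    defs = Definitions(assets=[raw_sales, revenue_by_region])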



Thank you for your valuable feedback - much appreciated!

Team knowledge scaling is hard - I totally agree, and I’ve learned a lot of lessons there.

Top management usually works “email only”. No matter what cool dashboard you’re building: they don’t use it, because they work almost exclusively on their phones.

And that’s one tough challenge in my opinion: making data easy to understand on small screens.

Then there is this group of CFOs … they love to connect their Excel to a live data stream. Once. Because at some point they return to static sheets, just to prove that a 35 MB Excel file shows their latest forecast.



