
I'm curious what the average sentiment from the Stitch data team is on this. I see the small marginal utility, but this would be a massive pain to implement. Imagine going back through thousands of lines of transforms and adding the framework. Say you make a couple of small mistakes somewhere because the framework is new to you. Things seem fine at first, but weeks go by and something seems off. How do you find those mistakes?

Newly hired data scientists would have a "wtf is this thing?" response. You'd really need to "sell" people on this and it doesn't seem worth it.




Speaking as an author (so some bias), you certainly make an interesting point about the cost of adoption. It's a high barrier to entry for migration (it requires a rewrite), but in our experience it actually is worth it.

So yeah, not for every problem (no framework is), but the problems it solved, it solved quite adeptly, and it was worth the cost of migration (both initial and ongoing). It particularly proved its worth where pipelines had grown in scale and complexity over time and gotten out of control. When there are thousands of lines of transforms, the ongoing maintenance burden gets so high that it's completely worth the cost of migration. "Weeks going by and something seems off" is, we found, often the day-to-day of teams that manage codebases like that. They end up spending tons of time on the software engineering aspects of testing/visibility rather than developing new features to help the business.

Re: new data scientists, you'd be surprised at how adept they are at picking things up. There are a million frameworks/paradigms to learn when you join a company, so this isn't really an additional burden. And once they've learned that "one simple trick" (that parameter names map to the producing function), it ends up being pretty easy to use.
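
For anyone who hasn't seen the pattern, here is a rough sketch of what "parameter names map to the producing function" can look like in plain Python. This is not the framework's actual API: the `resolve` helper and the transform names (`spend_per_signup`, `spend_in_thousands`, `summary`) are hypothetical, made up for illustration only.

```python
import inspect


def resolve(name, functions, inputs, cache=None):
    """Compute the value called `name`.

    Each parameter of a function is looked up by name: first among the
    raw inputs, then among the other functions (resolved recursively).
    Hypothetical helper, not part of any real library.
    """
    if cache is None:
        cache = {}
    if name in inputs:
        return inputs[name]
    if name in cache:
        return cache[name]
    if name not in functions:
        raise KeyError(f"no input or function named {name!r}")
    fn = functions[name]
    # The parameter names of `fn` tell us which upstream values it needs.
    kwargs = {
        param: resolve(param, functions, inputs, cache)
        for param in inspect.signature(fn).parameters
    }
    cache[name] = fn(**kwargs)
    return cache[name]


# Hypothetical transforms: each function name is the name of its output.
def spend_per_signup(spend, signups):
    return spend / signups


def spend_in_thousands(spend):
    return spend / 1000


def summary(spend_per_signup, spend_in_thousands):
    # These two parameters are produced by the functions of the same name.
    return {"per_signup": spend_per_signup, "thousands": spend_in_thousands}


FUNCTIONS = {f.__name__: f for f in (spend_per_signup, spend_in_thousands, summary)}

result = resolve("summary", FUNCTIONS, {"spend": 2000.0, "signups": 40})
print(result)
```

Because every transform is an ordinary function whose dependencies are spelled out in its signature, each one can be unit tested on its own by calling it with literal arguments, which is one way to go hunting for the "something seems off" kind of mistake.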



