Hey I meant to track one of y'all down at the MLOps conference, but didn't get the chance. I've built a very shitty version of a cached-execution DAG thing internally, and one of the design decisions I made was to have it so that parent nodes in the DAG don't need to know anything about child nodes. This allows for larger DAG builders to be more easily subclassed.
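To make that concrete, here's a rough sketch of the wiring I mean. This is a made-up, minimal version of my internal thing (names like `DAGBuilder`, `add_node`, `add_edge` are all hypothetical, not from any real library): since edges are declared by the builder rather than inside each node, a subclass can splice a new node between two existing ones without touching the parent node at all.

```python
def load_movies():          # placeholder node bodies
    ...

def compute_statistics():
    ...

def preprocess_inputs():
    ...

class DAGBuilder:
    def __init__(self):
        self.nodes = {}   # name -> callable
        self.edges = []   # (parent_name, child_name)

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def add_edge(self, parent, child):
        self.edges.append((parent, child))

class MovieStatsBuilder(DAGBuilder):
    def __init__(self):
        super().__init__()
        self.add_node("load", load_movies)
        self.add_node("stats", compute_statistics)
        self.add_edge("load", "stats")

class PreprocessedBuilder(MovieStatsBuilder):
    def __init__(self):
        super().__init__()
        # Splice pre-processing between load and stats: only the edges
        # change, the "load" node itself is never touched or re-stated.
        self.add_node("preprocess", preprocess_inputs)
        self.edges.remove(("load", "stats"))
        self.add_edge("load", "preprocess")
        self.add_edge("preprocess", "stats")
```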
MetaFlow doesn't do that -- instead each 'step' has to know what to call next, which means that if I wanted to subclass e.g. the MovieStatsFlow in [here](https://github.com/Netflix/metaflow/blob/master/metaflow/tut...) and, say, add some input pre-processing before the compute_statistics call, I'd essentially end up having to either override compute_statistics to do something that no longer matches its name, _or_ copy-paste that first step just to replace its last line.
I'm sure this design decision was deliberate, and/or that this use case doesn't come up much at Netflix (though I've run into it a lot), or maybe I'm missing something very obvious, but I'd love to hear your thoughts on that.
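Here's a stripped-down sketch of what I mean (not the actual tutorial code; the step bodies are placeholders, and I'm not claiming Metaflow officially supports subclassing flows like this). The `FlowSpec` / `@step` / `self.next(...)` pattern is the real API; the rest is illustrative:

```python
from metaflow import FlowSpec, step

class MovieStatsFlow(FlowSpec):
    @step
    def start(self):
        # ... load the movie data ...
        self.next(self.compute_statistics)   # child step hard-coded here

    @step
    def compute_statistics(self):
        # ... compute per-genre stats ...
        self.next(self.end)

    @step
    def end(self):
        pass

class PreprocessedMovieStatsFlow(MovieStatsFlow):
    @step
    def start(self):
        # Has to restate the parent's entire start() body just so the
        # final self.next() can point at the new step instead.
        self.next(self.preprocess)

    @step
    def preprocess(self):
        # hypothetical input pre-processing
        self.next(self.compute_statistics)
```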