1. Metaflow helps most when there is an element of collaboration - so a small to medium team of data scientists. Collaborating with yourself is another scenario where Metaflow can be useful, since it takes care of versioning and archiving various artifacts.
2. Keeping the language Pythonic, without any need to learn an additional DSL, has definitely been key to Metaflow's adoption internally. That said, this is something we are open to hearing feedback on, esp. with this OSS launch.
3. Yes - definitely think so. Personally, my favorite part is the local prototyping experience, when everything fits in memory and is blazing fast.
There is also an open issue for fast-data access, which you can upvote if you are interested in seeing it open-sourced.
4. We don't think there is an exact equivalent either. :)
re: Kubeflow - imho it is quite coupled to Kubernetes. We don’t intend to be tied to a specific compute substrate even though the first launch is with AWS. We do follow a plugin architecture - so I’m hoping Kube happens sometime.
re: Flyte - I’m less informed on this but happy to educate myself and get back.
A good overview of Flyte can be found here: https://www.youtube.com/watch?v=KdUJGSP1h9U
It does appear to be quite similar, though it has native k8s integration and a central web-based UI for monitoring jobs. Flyte asks the user to turn on caching. I like that Metaflow does that for you by default.
That's true of Kubeflow. I'm also not sure that project will be as keen on being "compute substrate" agnostic as Metaflow, given its connection with Google.
If you feel inclined, jump in the Flyte Slack and share your thoughts :). At my company we're on Kubeflow/Argo now, but things are developing quite a lot in this space, so I'm keen not to be myopic.