
1. Metaflow helps most when there is an element of collaboration, such as a small to medium team of data scientists. Collaborating with yourself is another scenario where Metaflow can be useful, since it takes care of versioning and archiving various artifacts.

2. Keeping the language Pythonic, with no need to learn a DSL, has definitely been key to Metaflow's adoption internally. That said, this is something we are open to hearing feedback on, especially with this OSS launch.

3. Yes, definitely think so. Personally, my favorite part is the local prototyping experience, when everything fits in memory and is blazing fast. There is also an open issue for fast data access, which you can upvote if you are interested in seeing it open-sourced.

4. We don't think there is an exact equivalent either. :)
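
On point 1, here is a rough way to picture "versioning and archiving various artifacts". This is only a toy sketch in plain Python, not Metaflow's actual implementation or API: the `ArtifactStore` class, its methods, and the run IDs are all made up for illustration. It assumes artifacts are pickled and stored under a hash of their content, so each run keeps its own named versions while identical values are archived only once.

```python
import hashlib
import pickle


class ArtifactStore:
    """Hypothetical toy store; NOT Metaflow's datastore or API."""

    def __init__(self):
        self.blobs = {}  # content hash -> pickled bytes
        self.runs = {}   # run_id -> {artifact name -> content hash}

    def save(self, run_id, name, value):
        # Serialize the value and key it by a hash of its bytes.
        data = pickle.dumps(value)
        digest = hashlib.sha1(data).hexdigest()
        # Identical content is stored once, however many runs use it.
        self.blobs.setdefault(digest, data)
        self.runs.setdefault(run_id, {})[name] = digest
        return digest

    def load(self, run_id, name):
        # Look up which version this run recorded, then deserialize it.
        return pickle.loads(self.blobs[self.runs[run_id][name]])


store = ArtifactStore()
store.save(1, "params", {"lr": 0.1})
store.save(1, "labels", ["a", "b"])
store.save(2, "params", {"lr": 0.01})  # new version for run 2
store.save(2, "labels", ["a", "b"])    # same content, no new blob
```

With something like this, going back to an old run is just `store.load(1, "params")`, which is roughly the "collaborating with yourself" case: earlier results stay retrievable without anyone having to name or copy files by hand.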




Re 4, aren't Kubeflow and Lyft's recently open-sourced "Flyte" pretty similar?

If you don't consider them basically equivalent, what would you say are the key differences?


Thanks for pinging on this.

re: Kubeflow - imho it is quite coupled to Kubernetes. We don’t intend to be tied to a specific compute substrate even though the first launch is with AWS. We do follow a plugin architecture - so I’m hoping Kube happens sometime.

re: Flyte - I’m less informed on this but happy to educate myself and get back.


A good overview of Flyte can be found here: https://www.youtube.com/watch?v=KdUJGSP1h9U It does appear to be quite similar, though it has native k8s integration and a central web-based UI for monitoring jobs. Flyte asks the user to turn on caching; I like that Metaflow does that for you by default.


That's true of Kubeflow. I'm not sure that project will be as keen on being "compute substrate" agnostic as Metaflow, given its connection with Google.

If you feel inclined, jump into the Flyte Slack and share your thoughts :). At my company we're on Kubeflow/Argo now, but things are developing quite a lot in this space, so I'm keen not to be myopic.


Thanks for sharing the context. Hopefully we can have a (fast) follow up with Kube integration depending on demand.


Can you specifically compare Metaflow to DVC and Databricks MLFlow? Those seem to be popular tools in this space right now.



re: 3. We have an optimized S3 client as part of this release - https://docs.metaflow.org/metaflow/data#data-in-s-3-metaflow...



