I've poked around at Graph a little bit and it looks nice, but I have a few nagging questions:
fnks use their argument names to define how the edges of the graph connect. This means that fnks are not modular -- they are inherently coupled to the particular graph they are used in by their argument names.
You couldn't make a utility library of commonly used fnks because their argument names would need to match up with the graph you are using.
How does this work in practice at Prismatic? If there is a common computation you use across graphs, do you just pull the logic into a regular function in a library and then make a wrapper fnk in each graph that just calls the function? Making a wrapper fnk seems awkward / boilerplate-y to me.
Another thing I wanted to see from Graph but didn't notice any mention of was modular / composable graphs. If I had a simple univariate stats graph (like the example) that takes an input stream and produces aggregate metrics (count, sum, average, median, variance, etc.) I would want to re-use that all over the place as a sub-graph of other graphs. Again, you run into problems with the implicit-glue of using function arguments. How do you know what output names the sub-graph would use? Add a namespace/prefix? It will get messy quickly.
EDIT: I found one mention in a slide from your Strange Loop talk about nesting graphs. Is this transparent to the compiler (i.e. you write a fnk that runs a compiled graph inside of it) or can the compiler optimize and reason about the sub-graph? By composing graphs I am talking about the graph compiler being aware of and able to optimize the computation of the sub-graph. For example, let's say I have a sub-graph that calculates univariate stats and also generates 1TB of random numbers. If no graph node is hooked up to the 1TB of random number output, the Graph compiler should optimize it out and never run it. Is that possible with sub-graphs?
So, I think the implicit-glue of using argument names is a trade-off that may make it hard to improve Graph in the future. It would be very interesting to hear what you Prismatic folks think because you surely discussed the trade-offs while building it.
This allows you to provide arguments to a subgraph or node fnk via arbitrary computations on input parameters or other node values, including the trivial case of renaming.
With this in place, you can always name your fnk arguments and Graph nodes whatever makes sense in this particular context, and then adapt the graph to a new circumstance using instance.
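To make the renaming idea concrete, here is a rough Python model of it. This is not Graph's actual API: `needs`, `instance`, and `run` below are hypothetical helpers written only for this sketch, and the node and input names are made up. It assumes nodes are wired by parameter name and that an adapter may either rename an argument or compute it from other values.

```python
import inspect

def needs(fn):
    """Names a node requires: an explicit `needs` attribute, else its parameter names."""
    return getattr(fn, "needs", tuple(inspect.signature(fn).parameters))

def instance(fn, **bindings):
    """Hypothetical adapter. Each binding maps one of fn's parameter names to
    a string (plain rename) or to a function of other graph values."""
    own = tuple(inspect.signature(fn).parameters)
    outer = []
    for p in own:
        b = bindings.get(p, p)
        outer.extend(needs(b) if callable(b) else [b])

    def adapted(**values):
        args = {}
        for p in own:
            b = bindings.get(p, p)
            if callable(b):  # computed argument
                args[p] = b(**{n: values[n] for n in needs(b)})
            else:            # renamed (or unchanged) argument
                args[p] = values[b]
        return fn(**args)

    adapted.needs = tuple(dict.fromkeys(outer))  # de-duplicate, keep order
    return adapted

def run(graph, inputs):
    """Eager evaluation; assumes nodes are listed in dependency order."""
    values = dict(inputs)
    for name, node in graph.items():
        values[name] = node(**{n: values[n] for n in needs(node)})
    return values

# A reusable "library" function with a generic argument name.
def mean(xs):
    return sum(xs) / len(xs)

graph = {
    "avg_latency": instance(mean, xs="latencies"),  # trivial rename
    "avg_sq": instance(mean, xs=lambda latencies: [x * x for x in latencies]),
}
result = run(graph, {"latencies": [1.0, 2.0, 3.0]})
```

Here `mean` keeps its generic argument name `xs`, and each graph that uses it decides what `xs` means in its own context, so no hand-written wrapper per graph is needed.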
We use this strategy extensively across our codebase, and will provide lots more examples as we release more of our infrastructure. Please let me know if this makes sense and seems reasonable to you, or if you have questions.
> If no graph node is hooked up to the 1TB of random number
> output, the Graph compiler should optimize it out and never
> run it. Is that possible with sub-graphs?
Yes, one of the design goals of Graph is to keep everything transparent until the last second, when you compile a Graph. Our current compilation strategies are pretty simple (and it's very simple to build your own), but right now you can lazily compile a hierarchical graph and any results that are unused (including in subgraphs) will not be executed.
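As an illustration of that laziness, here is a small Python sketch. It is only a model under my own assumptions, not Graph's compiler: `lazy_compile` is a hypothetical helper, a nested dict stands in for a subgraph, and `noise` stands in for the expensive random-number node from the question.

```python
import inspect

calls = []  # records which nodes actually ran

def lazy_compile(graph, inputs, outer=None):
    """Return a lookup function; a node runs only when its value is first requested."""
    cache = dict(inputs)

    def get(name):
        if name in cache:
            return cache[name]
        if name not in graph:
            if outer is None:
                raise KeyError(name)
            return outer(name)  # fall back to the enclosing graph
        node = graph[name]
        if isinstance(node, dict):  # subgraph: compiled lazily as well
            cache[name] = lazy_compile(node, {}, outer=get)
        else:
            params = inspect.signature(node).parameters
            cache[name] = node(**{p: get(p) for p in params})
        return cache[name]

    return get

def count(xs):
    calls.append("count")
    return len(xs)

def total(xs):
    calls.append("total")
    return sum(xs)

def mean(count, total):
    calls.append("mean")
    return total / count

def noise(xs):
    calls.append("noise")  # stand-in for the expensive, unused output
    return [0.0] * 10 ** 6

stats = {"count": count, "total": total, "mean": mean, "noise": noise}

def report(stats):
    calls.append("report")
    return "mean=%.1f" % stats("mean")  # only asks the subgraph for its mean

graph = {"stats": stats, "report": report}
get = lazy_compile(graph, {"xs": [1.0, 2.0, 3.0]})
summary = get("report")
```

Because nothing ever asks the subgraph for `noise`, that node never runs, even though it is defined inside the subgraph that `report` depends on.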