i'm gonna do you a solid since i'm actually a researcher in this area, and spell out something that's actually described/discussed in the paper: co-iteration is about simultaneously iterating over sparse data structures that have different sparsity ratios and possibly different storage formats.
Would you mind briefly explaining what co-iteration is?
Usually "co-things" are "things backwards". But I can't imagine any "inversion" of iteration.
Besides maybe something akin to "generators"—where you put some computation in and get a, possibly infinite, stream of things out; in contrast to iteration, where you put a sequence of things in and get the result of applying some computation to individual things out.
But the description above doesn't sound like anything related to generators.
Also, does the "different representations" part imply that it's possible to generalize iteration over, for example, arrays-of-structures and structures-of-arrays? Is sparse data an inherent part of co-iteration?
There is such a thing as coiteration in the category theory sense. It's related to the data/codata and recursion/corecursion distinction. To take a couple examples from the `free` package [1] for Haskell:
iter :: Functor f => (f a -> a) -> Free f a -> a
coiter :: Functor f => (a -> f a) -> a -> Cofree f a
Like (formal) recursion, `iter` depends on having a base case to work upward from. Like corecursion, `coiter` generates steps from the top down (towards a base that may never come).
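A rough C++ rendering of that direction of travel (illustrative only; the names `iterDown`/`coiterUp` are made up here and this is not the `free` package's API):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Fold: consumes a whole (finite) structure, working upward from a
// base case -- analogous to `iter`.
int iterDown(const std::vector<int>& xs, int base,
             const std::function<int(int, int)>& step) {
    int acc = base;
    for (int x : xs) acc = step(acc, x);
    return acc;
}

// Unfold: generates successive states from a seed, top down; in
// principle it could run forever, so only the observer's cutoff (n)
// makes it finite -- analogous to `coiter`.
std::vector<int> coiterUp(int seed, std::size_t n,
                          const std::function<int(int)>& step) {
    std::vector<int> out;
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(seed);
        seed = step(seed);
    }
    return out;
}
```

The asymmetry is the point: the fold needs all its input up front, while the unfold hands you as many steps as you care to observe.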
As is often the case, terminology is overloaded and multiple overloads are attested in the real world.
Co- comes from Latin and is short for "together, with, joint, common". Think co-operation: working together. If you really want to remember the difference, think of co-itus. Itus comes from ire (to go), and the expression doesn't mean "going backwards".
> Would you mind briefly explaining what co-iteration is?
it's funny how allergic hn is to reading. it's literally completely/succinctly/adequately illustrated in Figure 1: you have a sparse array that's represented using csr format (compressed sparse row) and another that's represented using what they call "sparse band" format (basically just the start and stop of a contiguous run of nonzero values), and you want to take the dot product between them. so you iterate through both of them at the same time using one loop, and you want to do this efficiently. that's the "co-ness" of it - stepping through both using one loop efficiently. that's what distinguishes it as a paradigm from plain iteration - doing both at the same time better than if you just iterated over the first using conventional techniques and then over the second using conventional techniques.
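concretely, the kind of single loop this is talking about, in plain c++ (a hand-written sketch with made-up names, not the paper's generated code):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of a sparse vector stored as sorted (index, value)
// pairs -- think one row of CSR -- with a "band" vector that is
// nonzero only on [lo, hi).
double sparse_band_dot(const std::vector<std::size_t>& idx,
                       const std::vector<double>& val,
                       std::size_t lo, std::size_t hi,
                       const std::vector<double>& band) {
    double acc = 0.0;
    // Co-iterate: one loop walks the sparse nonzeros, but only the
    // ones landing inside the band contribute -- the two structures
    // are intersected on the fly instead of densifying either operand.
    for (std::size_t k = 0; k < idx.size(); ++k) {
        if (idx[k] < lo) continue;  // before the band: skip
        if (idx[k] >= hi) break;    // past the band (idx sorted): done
        acc += val[k] * band[idx[k] - lo];
    }
    return acc;
}
```

the paper's contribution is generating loops like this automatically for any pair of formats, not asking you to hand-write one per combination.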
> Usually "co-things" are "things backwards". But I can't imagine any "inversion" of iteration.
this has nothing to do with categories or cocategories or monads or whatever - this is about expressive representations and codegen of efficient code.
all this and more available for the low low price of "just reading the paper" instead of guessing.
But frankly one does not always have time for a paper.
A quick online search didn't yield any useful results so I've asked here.
Given the answers, the topic seems confusing anyway. There are at least two quite different interpretations of "co" in play here. So I think I didn't ask anything obvious, and the answers are likely also of interest to other people who didn't have time to read the paper.
Thanks for the summary. Now I don't need to read the paper, as your summary is clear and concise, whereas jumping to Figure 1 and trying to understand it without reading the 2.5 pages before it was not possible for me.
So just lose the snark and keep working for free for us!
I associate co-things with things that happen in parallel (e.g. collaboration) and it feels like that's closer to the intended meaning based on a cursory read of the abstract (i.e. it looks like the focus is on iterating on two data structures at the same time when they have different structural properties?).
You didn't explain why iterating through something is somehow a different concept than iterating through anything else. In C++ you could define an iterator and it would decide what the next element is for you and when the iteration is done. Whatever is going on underneath doesn't matter, and coming up with a different term is completely unnecessary.
>Whatever is going on underneath doesn't matter and coming up with a different term is completely unnecessary.
my dismissive friend:
1. who do you think implements those stl iterators whose elegance you so greatly admire? we do.
2. there are no iterators in the stl that iterate over a pair of ragged/jagged collections in parallel. you can approximate with std::transform but this will only work if the collections are of the same size. again: reading is fundamental (as i learned from mr burton way back) and you can very quickly understand what the "co" is conveying if you read the first example in the paper (computing a dot product).
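for concreteness, here's the kind of merge loop the stl doesn't hand you (a sketch with my own names; assumes both index lists are sorted):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of two sparse vectors, each stored as sorted
// (index, value) arrays of *different* lengths. std::transform can't
// express this: the two cursors advance at different rates, and a
// product is emitted only when the indices coincide.
double sparse_dot(const std::vector<std::size_t>& ia, const std::vector<double>& va,
                  const std::vector<std::size_t>& ib, const std::vector<double>& vb) {
    double acc = 0.0;
    std::size_t a = 0, b = 0;
    while (a < ia.size() && b < ib.size()) {
        if (ia[a] == ib[b]) { acc += va[a] * vb[b]; ++a; ++b; }
        else if (ia[a] < ib[b]) ++a;  // advance whichever cursor lags
        else ++b;
    }
    return acc;
}
```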
None of this answers why a new language is needed and you can't just do whatever is in this language in a C++ iterator.
The paper is mostly about data structures, this whole thing could be a class in C++, then you could use the entire C++ ecosystem like optimizing compilers, IDEs, other libraries, profiling, static analysis, etc.
When someone invents another language, the entire programming ecosystem is wiped out and everyone has to start over.
You're still having trouble reading so let's try socratic method: if I have a vector with 10 elements and a vector with 100 elements, which iterator from stl should I use to take their dot product?
Lol now I know you're just googling for keyword matches - a parfor has absolutely nothing to do with this.
Overall you seem to be very upset about some perceived cognitive burden/imposition at having to learn a few new ideas. Let me free you from that burden: you don't have to understand co-iteration. But it is possible that one day your code will be slower than someone else's that does understand co-iteration.
> Overall you seem to be very upset about some perceived cognitive burden/imposition at having to learn a few new ideas.
The ridiculous part is that you think there are new ideas here.
> you don't have to understand co-iteration
I understand it because it is already called iteration.
> But it is possible that one day your code will be slower than someone else's that does understand co-iteration.
Feel free to explain in detail the part that is new or faster. You haven't done that at all; your entire post is just saying 'nuh uh' with no actual evidence to back it up. Give each thread its own range that doesn't overlap with the others. If you want to make up terms and sell pretend results, don't be surprised when people aren't convinced.
Repeating RTFM (especially in such a tone!) won't convince anybody to look into this. Quite the contrary…
You, my friend, are dismissive of the fact that people are only willing to put their valuable time into something if it seems at least at first glance interesting enough.
You're doing an extremely poor job of showing people why that would be the case here.
Math may be fun, but in the end your success in life will almost exclusively depend on your social skills. Skills which you're obviously still lacking.
>You, my friend, are dismissive of the fact that people are only willing to put their valuable time into something if it seems at least at first glance interesting enough.
simple solution: don't pontificate/comment if you don't have any valuable time - no one asked this guy his opinion on co-iteration.
also the comment you directly responded to doesn't say RTFM - it literally poses a puzzle that illustrates the point.
So on one hand you use the specific term "iterator" for a concept that's slightly different than the general idea of looping through a datastructure, but then "coiteration" is a step too far? Where do you draw the line, at the limits of the C++ standard library?
> Whatever is going on underneath doesn't matter
Performance matters and is often based on what is going on underneath.
> So on one hand you use the specific term "iterator" for a concept that's slightly different than the general idea of looping through a data structure
True, but an iterator is a specific value that can be incremented for the next value. It helps loop through a data structure when that data structure can't be indexed by sequential values.
> but then "coiteration" is a step too far? Where do you draw the line, at the limits of the C++ standard library?
An iterator helps loop through a data structure. Coiteration is iteration, there isn't a separate concept here.
> Performance matters and is often based on what is going on underneath.
You need to understand the context here, this is about the interface.
Basically there is no need to use a different language or hallucinate words to act like they invented something. The entire paper could just be a C++ class, and someone using their data structure wouldn't have to give up the C++ ecosystem. As it stands, no one is going to use an entirely different language to use one data structure.
> Coiteration is iteration, there isn't a separate concept here.
It is a subset that the paper seems to focus on, so using precise language is useful in this case. Just like a study about humans will use the word "human" instead of "mammal".
> no one is going to use an entirely different language to use one data structure.
It seems to me that the goal is more about approaches for specific compiler optimizations/code generation, not to replace C++.
Iterators, though, are already used when looping through something is non-trivial. An iterator isn't necessary for an array or vector because those have random-access, sequential integer indices.
Iterators are made for situations like this, where it is more complex under the hood. What no one can say is why this can't be covered by giving the user an iterator: you call next, it gives you the next item, pair, etc.
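A minimal sketch of that idea (hypothetical class and names, just to make the claim concrete): wrap the two-cursor walk over two sorted sparse index lists in an object with a next() method.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// "Call next(), get the next index present in both lists." The lists
// are copied in to keep the sketch self-contained.
class CoIterator {
    std::vector<std::size_t> a_, b_;
    std::size_t i_ = 0, j_ = 0;
public:
    CoIterator(std::vector<std::size_t> a, std::vector<std::size_t> b)
        : a_(std::move(a)), b_(std::move(b)) {}

    // Returns the next matching index, or nothing when either list
    // is exhausted.
    std::optional<std::size_t> next() {
        while (i_ < a_.size() && j_ < b_.size()) {
            if (a_[i_] == b_[j_]) { std::size_t k = a_[i_]; ++i_; ++j_; return k; }
            if (a_[i_] < b_[j_]) ++i_; else ++j_;
        }
        return std::nullopt;
    }
};
```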
> It seems to me that the goal is more about approaches for specific compiler optimizations/code generation
Based on what? The title is "A Language for Structured Coiteration" and most of the paper is about the internals of the data structures.
> not to replace C++.
No one said that, I asked why this couldn't be done in C++ as a class. Then you wouldn't even have to use a different language.
> > the goal is more about [...] optimizations/code generation
> based on what?
From the abstract:
Specializing for structure yields significant speedups. But automatically generating efficient code for structured data is challenging, especially when arrays with different structure interact. We show how to abstract over array structures so that the compiler can generate code to coiterate over any combination of them.
i.e. yes, you could write your own iterator to efficiently co-iterate over any given pair of structured arrays, but this is a way to express the structure to the compiler so you won't have to figure out the right optimizations yourself every time you want to try out a new way of structuring your data.
Note that this is a research article. The target audience is other researchers in the field, who it would seem would already be familiar with this term. That target audience apparently doesn't include you, which is something that happens fairly often on Hacker News and seems like something not to get upset about. It doesn't include me either, but after I actually read the text of the first page of the article it was quite clear from context what co-iteration means.

Given the level of effort you've put in so far in this thread my guess is you don't really care what it means, but in case you do: it is about iterating through more than one array simultaneously, generally to do some computation that needs related elements of the multiple arrays at any given step. The example they give is a dot product, where you want to multiply the elements at corresponding locations in two arrays (and then add those products up). They then present clever ways to tell their compiler about different ways of representing sparse arrays so that efficient algorithms for this kind of coiteration can be generated automatically, rather than having to be handwritten for every pair of representations.
It's an interesting read. Some nice figures too (I like Figure 3's ontology of sparse matrices). I hadn't given much thought to the problems of optimizing operations over non-dense arrays. They do a good job of clearly explaining the problem and approaches.
> In this paper, we show how to abstract over these iteration strategies using the concept of iterator protocols
It's about as basic as you get. It seems hardly worth mentioning.