The implications are unclear to me. We already know how to prune models for inference. For example, https://arxiv.org/abs/1710.01878, along with earlier and more recent work. There's also work showing that you can take advantage of the sparsity to achieve practical speed gains: https://arxiv.org/abs/1911.09723.
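For context, the first paper's approach is gradual magnitude pruning; the core operation is just zeroing the smallest-magnitude weights. A one-shot sketch in NumPy (illustrative only; the paper prunes gradually on a schedule during training, and the function name here is my own):

    import numpy as np

    def magnitude_prune(weights, sparsity):
        """Zero out the smallest-magnitude entries so that `sparsity` fraction is zero."""
        k = int(sparsity * weights.size)               # number of weights to remove
        if k == 0:
            return weights, np.ones_like(weights, dtype=bool)
        flat = np.abs(weights).ravel()
        threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
        mask = np.abs(weights) > threshold
        return weights * mask, mask

    # Example: prune a random 256x256 layer to 90% sparsity
    w = np.random.randn(256, 256)
    pruned, mask = magnitude_prune(w, 0.9)
    print(1.0 - mask.mean())   # roughly 0.9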
We can also train networks that are sparse from the beginning of training (without requiring any special knowledge of the solution): https://arxiv.org/abs/1911.11134. It remains to be shown that this can be done with a speed advantage.
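Roughly, that paper (RigL) keeps the number of nonzero weights fixed and periodically swaps connections: it drops the active weights with the smallest magnitude and regrows inactive ones where the dense gradient is largest. A toy NumPy sketch of a single drop-and-grow step, with a made-up function name and none of the paper's schedules or per-layer budgets:

    import numpy as np

    def drop_and_grow(weights, mask, grads, swap_fraction=0.1):
        """One connection-swap step that keeps the number of nonzeros constant."""
        w, m, g = weights.ravel().copy(), mask.ravel().copy(), grads.ravel()
        n_swap = int(swap_fraction * m.sum())
        if n_swap == 0:
            return weights, mask

        # Drop: among active weights, deactivate the smallest in magnitude.
        active = np.flatnonzero(m)
        drop = active[np.argsort(np.abs(w[active]))[:n_swap]]

        # Grow: among inactive weights, activate where the gradient is largest.
        inactive = np.flatnonzero(~m)
        grow = inactive[np.argsort(-np.abs(g[inactive]))[:n_swap]]

        m[drop], m[grow] = False, True
        w[drop], w[grow] = 0.0, 0.0        # regrown connections start at zero
        return w.reshape(weights.shape), m.reshape(mask.shape)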
In most cases, there is limited support for sparse operations.
"Sparse Networks from Scratch: Faster Training without Losing Performance" https://arxiv.org/abs/1907.04840 openly says "Currently, no GPU accelerated libraries that utilize sparse tensors exist, and as such we use masked weights to simulate
sparse neural networks.".
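In other words, the "sparse" network is stored and computed as a dense tensor multiplied by a binary mask, so there is no memory or speed benefit; only the behavior of sparsity is reproduced. A minimal PyTorch-style sketch of what such a masked layer typically looks like (my own illustrative class, not that paper's code):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedLinear(nn.Module):
        """Dense linear layer whose weight is multiplied by a fixed binary mask.
        The matmul is still dense, so sparsity is simulated, not exploited."""
        def __init__(self, in_features, out_features, sparsity=0.9):
            super().__init__()
            self.linear = nn.Linear(in_features, out_features)
            # Random mask for illustration; real methods pick the mask carefully.
            mask = (torch.rand(out_features, in_features) > sparsity).float()
            self.register_buffer("mask", mask)

        def forward(self, x):
            return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

    layer = MaskedLinear(512, 256)
    y = layer(torch.randn(8, 512))   # full dense compute under the hood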
However, the situation seems to be very dynamic. See:
Without reading the paper: I think the importance is that before, we only had methods that could do this in practice. Now we know there is an algorithm that provably can. They proved it is always possible, not just for some subset of networks.
On the other hand, it will trigger research on reducing the size of networks. That is important, as most researchers don't have access to the computing power of Google and the like.
It's unclear whether this algorithm would be useful in practice. Training the weights will lead to a more accurate network for the same amount of work at inference time.