Interesting paper. Any more details on the architecture of the feedback connections? Also, I can't tell from the paper where and how weights are being updated, e.g. what does "train" mean in this context?
I believe that instead of multiplying the delta by W^T to backpropagate the error from layer l to l-1, you multiply it by a fixed random projection B. It's hard to dig deeper because there doesn't appear to be any other information on it except here: http://isis-innovation.com/licence-details/accelerating-mach...
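Something like this, if I'm reading it right. A rough numpy sketch of my own (not the authors' code; the toy problem, shapes, and hyperparameters are all made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression data
    X = rng.standard_normal((256, 10))
    Y = rng.standard_normal((256, 3))

    # Forward weights (trained) and a fixed random feedback matrix B
    W1 = rng.standard_normal((10, 20)) * 0.1
    W2 = rng.standard_normal((20, 3)) * 0.1
    B = rng.standard_normal((3, 20))  # fixed; stands in for W2.T in the backward pass

    lr = 0.01
    for step in range(1000):
        # Forward pass
        h = np.tanh(X @ W1)
        y_hat = h @ W2

        # Output error for a squared-error loss
        e = y_hat - Y

        # Ordinary backprop would use e @ W2.T here;
        # the random-feedback variant uses the fixed B instead
        delta_h = (e @ B) * (1 - h**2)  # tanh derivative

        # Weight updates are the usual outer products
        W2 -= lr * h.T @ e / len(X)
        W1 -= lr * X.T @ delta_h / len(X)

        if step % 200 == 0:
            print(step, float(np.mean(e**2)))

The surprising part, as I understand it, is that the forward weights gradually come into alignment with B during training, so even a random feedback matrix ends up pushing the updates in a useful direction.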
As an aside, are they really trying to patent a slight twist on backpropagation? That seems pretty counter-productive to me.