We will not really know until memory bandwidth and compute numbers are published. However, Project Digits looks like a successor to the NVIDIA Jetson AGX Orin 64GB Developer Kit, which was based on the Ampere architecture and has 204.8GB/sec of memory bandwidth (see the link below).
The 3090 Ti had about 5 times the memory bandwidth and 5 times the compute capability of the Jetson AGX Orin. If that ratio holds for Blackwell, the 5090 will run circles around Digits whenever the model fits in its VRAM (or you have enough 5090 cards to fit everything into VRAM).
Inference will presumably run faster on a 5090. Token generation is mostly memory-bandwidth bound, so if the 5x memory bandwidth figure holds, token generation would run about 5 times faster. That said, people in the Digits discussion predict that its memory bandwidth will be closer to 546GB/sec, which is roughly 1/3 of the 5090's memory bandwidth, so a bunch of 5090 cards would only be about 3 times faster at token generation.
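
To make the ratios above concrete, here is a rough back-of-the-envelope sketch. It assumes token generation scales linearly with memory bandwidth, which is only an approximation. The 204.8 and 546 GB/sec figures come from this comment (the latter is a prediction, not a published spec); the 1008 GB/sec (3090 Ti) and 1792 GB/sec (5090) figures are my assumptions based on NVIDIA's listed specs, and the 40 GB model size is purely hypothetical.

```python
# Memory bandwidth in GB/sec.
JETSON_AGX_ORIN = 204.8   # figure quoted in the comment above
DIGITS_PREDICTED = 546.0  # prediction from the discussion, not a published spec
RTX_3090_TI = 1008.0      # assumption: NVIDIA's listed spec
RTX_5090 = 1792.0         # assumption: NVIDIA's announced spec


def speedup(fast_gbps: float, slow_gbps: float) -> float:
    """Token-generation speedup if throughput scales linearly with bandwidth."""
    return fast_gbps / slow_gbps


def tokens_per_second(bandwidth_gbps: float, model_gb: float) -> float:
    """Rough upper bound: every weight is read once per generated token."""
    return bandwidth_gbps / model_gb


if __name__ == "__main__":
    print(f"3090 Ti vs Jetson AGX Orin: {speedup(RTX_3090_TI, JETSON_AGX_ORIN):.2f}x")
    print(f"5090 vs predicted Digits:   {speedup(RTX_5090, DIGITS_PREDICTED):.2f}x")
    # Hypothetical 40 GB set of model weights.
    print(f"Digits upper bound: {tokens_per_second(DIGITS_PREDICTED, 40.0):.1f} tokens/sec")
```

The upper bound ignores compute limits, KV cache reads, and multi-GPU overhead, so real numbers would come in lower on both sides.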
https://www.okdo.com/wp-content/uploads/2023/03/jetson-agx-o...