samwillis on July 25, 2023 | on: Attention Is Off By One
If I'm following correctly, does this mean that with this change, along with the model being quantized, we could see models that are 5% of the size (on disk and in memory) but almost identical in output?
zamalek on July 25, 2023
The values selected were arbitrary. The size reduction comes from going from 32-bit to 8-bit values (32 bits / 8 bits), so the model will be 4 times smaller (25% of the original size, not 5%).
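
For concreteness, here is a minimal sketch of what that reduction looks like, assuming a generic symmetric per-tensor int8 quantizer (nothing specific to the linked article; the matrix shape and function names are made up for illustration):

    # Sketch: symmetric per-tensor int8 quantization of a float32
    # weight matrix, illustrating the 4x size ratio (32 bits -> 8 bits).
    import numpy as np

    def quantize_int8(w: np.ndarray):
        """Map float32 weights to int8 with a single scale factor."""
        scale = np.abs(w).max() / 127.0  # largest magnitude maps to +/-127
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        return q.astype(np.float32) * scale

    rng = np.random.default_rng(0)
    w = rng.standard_normal((4096, 4096)).astype(np.float32)
    q, scale = quantize_int8(w)

    print(f"float32: {w.nbytes / 2**20:.1f} MiB")  # 64.0 MiB
    print(f"int8:    {q.nbytes / 2**20:.1f} MiB")  # 16.0 MiB -> 4x smaller
    print(f"max abs error: {np.abs(w - dequantize(q, scale)).max():.4f}")

Real schemes refine this with per-block or per-channel scales to cut the rounding error, but the headline size ratio is still just bits-in over bits-out.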