
That is very cool, and surprising to me given I was also initially underwhelmed.

One nitpick though: in the second question, although the explanation itself is clean, there appear to be seven b’s in its representation of the number six.




I've heard it's because GPT's input is whole words/tokens, not characters. So it has little insight into their spelling or letter positions unless that information shows up explicitly in the training set (e.g. rhymes are easy, counting the letters of rare words is hard).

This is an especially bad example, though; surely it had "bbbbbb" somewhere in its training set associated with "6".
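
For the curious, it's easy to see this in practice. A quick sketch, assuming OpenAI's tiktoken library (not something anyone in this thread mentioned, and exact token counts depend on the vocabulary):

  # A common word is one opaque token ID, while a rare string like
  # "bbbbbbb" gets split into several pieces; the model only ever sees
  # the IDs, never the individual letters.
  import tiktoken

  enc = tiktoken.get_encoding("gpt2")  # GPT-2/GPT-3-era BPE vocabulary

  for text in ["hello", "bbbbbbb"]:
      ids = enc.encode(text)
      pieces = [enc.decode([i]) for i in ids]
      print(f"{text!r} -> {len(ids)} token(s): {pieces}")

  # Exact splits depend on the learned merge table, but "hello" comes
  # out as a single token while "bbbbbbb" is covered by several chunks.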


It's even more interesting, I think. The tokens are a byte pair encoding [1] of the input string. So a short, frequent word might be represented as a single token, while an infrequent word (such as "bbbbbbb") can be split into several tokens, each of which may or may not correspond to a single letter.

This might also explain the weird "off-by-one" errors with the ROT13 task.

[1] https://en.m.wikipedia.org/wiki/Byte_pair_encoding
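
To make the mechanism concrete, here's a toy BPE pass (my own sketch, not GPT's actual byte-level tokenizer or its real merge table): repeatedly merge the most frequent adjacent pair, so a rare string like "bbbbbbb" ends up covered by whatever multi-character chunks happen to exist.

  # Toy byte-pair encoding: learn merges greedily on a single string.
  # Real BPE applies a fixed merge table learned from a large corpus,
  # but the effect is the same: token boundaries need not align with
  # letters, which makes letter counting hard for the model.
  from collections import Counter

  def bpe_tokenize(text: str, num_merges: int = 2) -> list[str]:
      tokens = list(text)  # start from individual characters
      for _ in range(num_merges):
          pairs = Counter(zip(tokens, tokens[1:]))
          if not pairs:
              break
          (a, b), _ = pairs.most_common(1)[0]  # most frequent adjacent pair
          merged, i = [], 0
          while i < len(tokens):
              if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                  merged.append(a + b)
                  i += 2
              else:
                  merged.append(tokens[i])
                  i += 1
          tokens = merged
      return tokens

  print(bpe_tokenize("bbbbbbb"))  # ['bbbb', 'bb', 'b'] -- 3 chunks for 7 b's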


> This is just one possible way to represent six in this numbering system, and there may be other ways to do it as well.

Gotta hand it to ChatGPT. It's not wrong.



