
That is very cool, and surprising to me given I was also initially underwhelmed.

One nitpick though: in the second question, although the explanation itself is clean, there appear to be seven b’s in its representation of the number six.




I've heard it's because GPT's input is whole words/tokens, not characters. So it has little insight into their spelling or letter positions unless that information shows up explicitly in the training set (e.g. rhymes are easy, counting the letters of rare words is hard).

This is an especially bad example, though; surely it had "bbbbbb" somewhere in its training set associated with "6".
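
For the curious, it's easy to see this in practice. A quick sketch, assuming OpenAI's tiktoken library (not something anyone in this thread mentioned, and exact token counts depend on the vocabulary):

  # A common word is one opaque token ID, while a rare string like
  # "bbbbbbb" gets split into several pieces; the model only ever sees
  # the IDs, never the individual letters.
  import tiktoken

  enc = tiktoken.get_encoding("gpt2")  # GPT-2/GPT-3-era BPE vocabulary

  for text in ["hello", "bbbbbbb"]:
      ids = enc.encode(text)
      pieces = [enc.decode([i]) for i in ids]
      print(f"{text!r} -> {len(ids)} token(s): {pieces}")

  # Exact splits depend on the learned merge table, but "hello" comes
  # out as a single token while "bbbbbbb" is covered by several chunks.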


It's even more interesting, I think. The tokens are a byte pair encoding [1] of the input string. So a short, frequent word might be represented as a single token, while an infrequent word (such as "bbbbbbb") can be split into several tokens, each of which may or may not correspond to a single letter.

This might also explain the weird "off-by-one" errors with the ROT13 task.

[1] https://en.m.wikipedia.org/wiki/Byte_pair_encoding
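
To make the mechanism concrete, here's a toy BPE pass (my own sketch, not GPT's actual byte-level tokenizer or its real merge table): repeatedly merge the most frequent adjacent pair, so a rare string like "bbbbbbb" ends up covered by whatever multi-character chunks happen to exist.

  # Toy byte-pair encoding: learn merges greedily on a single string.
  # Real BPE applies a fixed merge table learned from a large corpus,
  # but the effect is the same: token boundaries need not align with
  # letters, which makes letter counting hard for the model.
  from collections import Counter

  def bpe_tokenize(text: str, num_merges: int = 2) -> list[str]:
      tokens = list(text)  # start from individual characters
      for _ in range(num_merges):
          pairs = Counter(zip(tokens, tokens[1:]))
          if not pairs:
              break
          (a, b), _ = pairs.most_common(1)[0]  # most frequent adjacent pair
          merged, i = [], 0
          while i < len(tokens):
              if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                  merged.append(a + b)
                  i += 2
              else:
                  merged.append(tokens[i])
                  i += 1
          tokens = merged
      return tokens

  print(bpe_tokenize("bbbbbbb"))  # ['bbbb', 'bb', 'b'] -- 3 chunks for 7 b's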


> This is just one possible way to represent six in this numbering system, and there may be other ways to do it as well.

Gotta hand it to ChatGPT. It's not wrong.



