
It depends on whether they are using a “vanilla” instruction-tuned model or applying additional task-specific fine-tuning. Fine-tuning on data that doesn’t contain misspellings can make the model “forget” how to handle them.

In general, fine-tuned models often fail to generalize well on inputs that aren’t very close to examples in the fine-tuning data set.



Yes, but you can control that.

You can use SetFit, fewer examples, an SVM, etc., depending on how much separation, recall, and other aspects matter to you for the task at hand.
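For illustration, the SVM route can be as light as a linear classifier trained on top of frozen sentence embeddings, so the encoder itself is never fine-tuned and keeps whatever robustness to typos it already has. Rough sketch only; the model name, labels, and example texts are placeholders:

    from sentence_transformers import SentenceTransformer
    from sklearn.svm import LinearSVC

    # Frozen encoder: no fine-tuning, so it keeps its pretrained robustness.
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

    # Hypothetical few-shot training examples.
    texts = [
        "login page shows error 500",
        "cannot reset my password",
        "app crashes when uploading a photo",
        "pasword reset link never arives",   # deliberate typos
    ]
    labels = ["outage", "account", "bug", "account"]

    # Encode once, then fit a lightweight classifier on the embeddings.
    X = encoder.encode(texts)
    clf = LinearSVC().fit(X, labels)

    # A typo-laden query still lands near its clean neighbours in embedding space.
    print(clf.predict(encoder.encode(["cant resett my passwrd"])))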

How sensitive the model is to biasing toward the fine-tuning dataset is a choice of training method, not a fixed attribute of the model.

These days it's just not really a major issue unless you fine-tune on an entirely new or unseen language.


This is very helpful, thank you!

We are doing a fair bit of task-specific fine-tuning for an asymmetric embeddings model (connecting user-entered descriptions of symptoms with the service solutions that resolved their issues).
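For concreteness, something along these lines is what I mean by asymmetric fine-tuning, here written as a generic sentence-transformers loop with MultipleNegativesRankingLoss and in-batch negatives; the base model name and the pairs are made-up placeholders, not our actual data or exact setup:

    from sentence_transformers import SentenceTransformer, InputExample, losses
    from torch.utils.data import DataLoader

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder base model

    # (symptom description, solution text) pairs; in-batch negatives come for free.
    train_examples = [
        InputExample(texts=["wifi keeps dropping every few minutes",
                            "Update the router firmware and change the channel"]),
        InputExample(texts=["phone battery drains overnight",
                            "Disable background app refresh for unused apps"]),
    ]

    loader = DataLoader(train_examples, shuffle=True, batch_size=16)
    loss = losses.MultipleNegativesRankingLoss(model)

    model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)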

I would like to run more experiments with this and see if introducing typos into the user-entered descriptions will help it not forget as much.
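Concretely, the augmentation I have in mind is something like this hypothetical add_typos helper (the keyboard-neighbour map and rates are arbitrary):

    import random

    # Tiny, arbitrary keyboard-neighbour map; a real one would cover the full layout.
    NEIGHBORS = {"a": "qs", "e": "wr", "i": "uo", "o": "ip", "s": "ad", "t": "ry"}

    def add_typos(text, rate=0.08, seed=None):
        """Randomly drop, swap, or substitute characters to mimic user typos."""
        rng = random.Random(seed)
        chars, out, i = list(text), [], 0
        while i < len(chars):
            c = chars[i]
            if c.isalpha() and rng.random() < rate:
                op = rng.choice(["drop", "swap", "sub"])
                if op == "swap" and i + 1 < len(chars):
                    out.extend([chars[i + 1], c])
                    i += 1                      # consume both swapped characters
                elif op == "sub":
                    out.append(rng.choice(NEIGHBORS.get(c.lower(), c)))
                # "drop": append nothing
            else:
                out.append(c)
            i += 1
        return "".join(out)

    print(add_typos("my wifi keeps dropping every few minutes", seed=3))

The idea would be to apply this only to the user-entered symptom side of each training pair, and only to a fraction of the pairs, so the model still sees plenty of clean text during fine-tuning.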

Thank you again!




