Hacker News

Are there any papers benchmarking a transformer architecture against something like a pointer-generator network? I'm doing a bit of work in this area (i.e., reimplementing papers), and I'm curious whether GPT-2-like models can derive greater semantic meaning.



Both GPT-2 and the pointer-generator network are open source, and pretrained models are available for each, so a direct comparison should be straightforward.
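For a comparison like this, abstractive summarization models (the pointer-generator's original task) are typically scored with ROUGE. As a sketch of what that evaluation looks like, here is a minimal pure-Python ROUGE-1 F1 implementation; the function name and whitespace tokenization are illustrative simplifications, not the official ROUGE scorer (which also handles stemming and multiple references):

```python
from collections import Counter

def rouge_1_f(candidate: str, reference: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between a candidate and a reference summary.

    Simplified sketch: lowercased whitespace tokenization, single reference.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter & Counter keeps the minimum count per shared token (clipped overlap).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Example: two of six reference tokens are matched with perfect precision,
# giving F1 = 2 * (1.0 * 1/3) / (1.0 + 1/3) = 0.5
print(rouge_1_f("the cat", "the cat sat on the mat"))
```

Running both models' generated summaries through a scorer like this over the same test set (e.g. CNN/DailyMail, which the pointer-generator paper used) gives the head-to-head numbers the question is asking about.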




