
From a quick speed test on my laptop, Pattern is 48x slower at POS tagging and 8x slower at parsing. I last benchmarked its accuracy in 2013, when it got 93.5% on the WSJ corpus vs. 97% for the state-of-the-art taggers, so roughly twice as many errors (6.5% vs. 3%). It was also more domain-dependent. Its parser doesn't produce exactly the same representations as mine, so I can't easily evaluate its accuracy, but I doubt it's very high.
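
To spell out the "twice as many errors" arithmetic: what matters is the ratio of error rates, not the gap in accuracy. A minimal sketch, using only the two accuracy figures quoted above (the `error_ratio` helper is just an illustrative name, not part of any library):

```python
def error_ratio(acc_a: float, acc_b: float) -> float:
    """Return how many times more errors system A makes than system B.

    Both arguments are accuracies in [0, 1); the error rate is 1 - accuracy.
    """
    return (1.0 - acc_a) / (1.0 - acc_b)


# Accuracies quoted in the comment: 93.5% for Pattern, 97% for state of the art.
ratio = error_ratio(0.935, 0.97)
print(f"{ratio:.2f}x as many errors")
```

So a 3.5-point accuracy gap near the top of the scale works out to a bit more than double the mistakes per token.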

Pattern doesn't really use machine learning, just some pre-computed statistics from the annotated data, and some hand-crafted rules. Machine learning is good. It's really the right way to build these systems.



