Hacker News new | past | comments | ask | show | jobs | submit login

The date on the top says February 2007 to August 2016.

Does anyone know which parts are new in August 2016? I've read this before and it isn't sticking out to me.




The code has been updated stylistically, and I'm not sure what else has changed. Here's the old (Python 2.5) version from http://web.archive.org/web/20160110135619/http://norvig.com/...:

    import re, collections

    def words(text): return re.findall('[a-z]+', text.lower()) 

    def train(features):
        model = collections.defaultdict(lambda: 1)
        for f in features:
            model[f] += 1
        return model

    NWORDS = train(words(file('big.txt').read()))

    alphabet = 'abcdefghijklmnopqrstuvwxyz'

    def edits1(word):
       splits     = [(word[:i], word[i:]) for i in range(len(word) + 1)]
       deletes    = [a + b[1:] for a, b in splits if b]
       transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b)>1]
       replaces   = [a + c + b[1:] for a, b in splits for c in alphabet if b]
       inserts    = [a + c + b     for a, b in splits for c in alphabet]
       return set(deletes + transposes + replaces + inserts)

    def known_edits2(word):
        return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

    def known(words): return set(w for w in words if w in NWORDS)

    def correct(word):
        candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
        return max(candidates, key=NWORDS.get)


Here's the history, sorry there is no easy way to see diffs.

https://web.archive.org/web/*/http://norvig.com/spell-correc...


I read this in July. I am sure that the Future work section has been better explained. The main code is the same.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: