I did the same thing. I was merging our billing db with that of a business we had just bought. Due to the nature of our business, customers could exist in both.
So I had a routine that normalized the address as best I could, which wasn't helped by it all being one varchar field. Then I implemented Levenshtein distance, messed with the weighting a bit to fit our particular data, and away it went. Saved a bunch of headaches. It wasn't perfect, but it was better than hand-matching a couple thousand accounts.
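For anyone wanting to try the same approach, here is a rough Python sketch of the idea: normalize, then compare with a weighted edit distance. It is not the code I actually ran. The abbreviation table, the function names (`normalize_address`, `weighted_levenshtein`, `is_probable_match`), the cost parameters, and the 0.15 threshold are all made up for illustration; you would tune them to your own data.

```python
import re

# Hypothetical abbreviation table; a real one would be tuned to the data.
ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "road": "rd", "drive": "dr",
    "suite": "ste", "apartment": "apt",
    "north": "n", "south": "s", "east": "e", "west": "w",
}

def normalize_address(raw: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace, shorten common words."""
    text = raw.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # punctuation becomes a space
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)

def weighted_levenshtein(a: str, b: str,
                         insert_cost: float = 1.0,
                         delete_cost: float = 1.0,
                         substitute_cost: float = 1.0) -> float:
    """Edit distance from a to b with adjustable per-operation costs."""
    # previous[j] = cost of turning the first i-1 chars of a into the first j of b
    previous = [j * insert_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        current = [i * delete_cost]
        for j, cb in enumerate(b, start=1):
            cost = 0.0 if ca == cb else substitute_cost
            current.append(min(
                previous[j] + delete_cost,      # drop ca
                current[j - 1] + insert_cost,   # add cb
                previous[j - 1] + cost,         # keep or swap
            ))
        previous = current
    return previous[-1]

def is_probable_match(addr_a: str, addr_b: str, threshold: float = 0.15) -> bool:
    """Flag two addresses as likely the same if the scaled distance is small."""
    a, b = normalize_address(addr_a), normalize_address(addr_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return False  # nothing to compare
    return weighted_levenshtein(a, b) / longest <= threshold
```

Dividing by the longer string's length keeps one threshold usable for both short and long addresses. Anything near the threshold is still worth eyeballing by hand.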