python - How to return the most similar word from a list of words?
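How can I create a function that returns the most similar word from a list of words, even if the word is not exactly the same?
The function should have two inputs: one for the word and the other for the list. The function should return the word in the list that is most similar to the given word.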
lst = ['apple', 'app', 'banana store', 'pear', 'beer']
func('apple inc.', lst)
>> 'apple'
func('banana', lst)
>> 'banana store'
From doing research, it seems I have to use concepts from fuzzy string matching, NLTK, and Levenshtein distance, which I'm having a hard time trying to implement in a function like this.
I should point out that by similar, I mean only the characters; I'm not concerned with the meaning of the word at all.
A slow solution that is convenient for debugging:
def func(word, lst):
    # Pair each candidate with its distance to word; the closest sorts first.
    items = sorted((dist(word, w), w) for w in lst)
    # Print items here for debugging.
    if not items:
        raise ValueError('List of words is empty.')
    return items[0][1]
Or, a version that is faster and uses less memory:
def func(word, lst):
    # min() raises ValueError by itself if lst is empty.
    return min((dist(word, w), w) for w in lst)[1]
See https://stackoverflow.com/questions/682367/good-python-modules-for-fuzzy-string-comparison for implementing dist. One of the answers there has a link to a Levenshtein distance implementation.
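For reference, here is a minimal sketch of what dist could look like, assuming plain Levenshtein distance (insertions, deletions, and substitutions each cost 1). It is one possible implementation, not necessarily the one behind the link above. The block also includes func_ratio, a hypothetical name for a variant that swaps in the standard library's difflib similarity ratio instead of an edit distance.

```python
import difflib


def dist(a, b):
    """Levenshtein distance between strings a and b (assumed metric)."""
    # Keep the shorter string in b so each row stays as small as possible.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def func(word, lst):
    """Return the entry of lst with the smallest edit distance to word."""
    if not lst:
        raise ValueError('List of words is empty.')
    return min((dist(word, w), w) for w in lst)[1]


def func_ratio(word, lst):
    """Hypothetical variant: rank candidates by difflib's similarity ratio."""
    # cutoff=0.0 disables the default 0.6 threshold, so the best candidate
    # is returned even when it is only weakly similar.
    matches = difflib.get_close_matches(word, lst, n=1, cutoff=0.0)
    if not matches:
        raise ValueError('List of words is empty.')
    return matches[0]


lst = ['apple', 'app', 'banana store', 'pear', 'beer']
print(func('apple inc.', lst))
print(func_ratio('banana', lst))
```

One thing to watch for: raw edit distance penalizes length differences, so it will not necessarily reproduce both examples from the question. 'banana' is 6 edits away from 'banana store' but only 5 away from 'app', so func picks a short word there. A length-normalized score such as the difflib ratio in func_ratio favors 'banana store' instead.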