In this tutorial, we are going to learn about the FuzzyWuzzy Python library. The FuzzyWuzzy library was developed to compare two strings. Other modules, such as regex and difflib, can also compare strings, but FuzzyWuzzy takes a different approach: instead of returning true, false, or a string, its methods return a score out of 100 that indicates how closely the strings match.
To work with the FuzzyWuzzy library, we have to install fuzzywuzzy. The python-Levenshtein package is optional, but it makes the comparisons faster. Run the following commands to install them.
pip install fuzzywuzzy
If you run the above command, you will see the following success message.
Collecting fuzzywuzzy
  Downloading https://files.pythonhosted.org/packages/d8/f1/5a267addb30ab7eaa1beab2b9323073815da4551076554ecc890a3595ec9/fuzzywuzzy-0.17.0-py2.py3-none-any.whl
Installing collected packages: fuzzywuzzy
Successfully installed fuzzywuzzy-0.17.0
Run the following command in Linux to install python-Levenshtein.
pip install python-Levenshtein
Run the following command in Windows.
easy_install python-Levenshtein
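Once the commands finish, a quick check can confirm that both packages are importable. The snippet below is a minimal sketch that uses only the standard library. The helper name is_installed is made up for illustration, and the sketch assumes that python-Levenshtein is imported under the module name Levenshtein.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if the top-level module can be located (hypothetical helper)."""
    return find_spec(module_name) is not None


# "Levenshtein" is the import name assumed for the python-Levenshtein package
for name in ("fuzzywuzzy", "Levenshtein"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports "missing", run the matching install command again in the same Python environment.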
fuzz
Now, we will learn about the fuzz module. fuzz is used to compare two strings at a time. It has different methods that return a score out of 100. Let's see some methods of the fuzz module.
fuzz.ratio()
The first method of the fuzz module is ratio. It compares two strings and returns a similarity score out of 100. See the examples below to get a clear idea.
Example
## importing the module from the fuzzywuzzy library
from fuzzywuzzy import fuzz

## 100 for identical strings
print(f"Equal Strings:- {fuzz.ratio('tutorialspoint', 'tutorialspoint')}")

## lower score for slight changes in the strings (ratio is case-sensitive)
print(f"Slight Changed Strings:- {fuzz.ratio('tutorialspoint', 'TutorialsPoint')}")
print(f"Slight Changed Strings:- {fuzz.ratio('tutorialspoint', 'Tutorials Point')}")

## completely different strings
print(f"Different Strings:- {fuzz.ratio('abcd', 'efgh')}")
Output
Equal Strings:- 100
Slight Changed Strings:- 86
Slight Changed Strings:- 83
Different Strings:- 0
Experiment with fuzz.ratio() on your own strings as much as possible for better understanding.
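To see where a score such as 86 comes from, the calculation can be approximated with the standard library. The sketch below is not FuzzyWuzzy's own code. It assumes the score is the difflib.SequenceMatcher similarity (twice the number of matching characters divided by the combined length of both strings), scaled to 100 and rounded. The helper name simple_ratio is hypothetical.

```python
from difflib import SequenceMatcher


def simple_ratio(first, second):
    """Approximate a 0-100 similarity score with difflib (hypothetical helper)."""
    matcher = SequenceMatcher(None, first, second)
    # ratio() is 2 * M / T, where M is the number of matching characters
    # and T is the combined length of both strings
    return int(round(100 * matcher.ratio()))


print(simple_ratio('tutorialspoint', 'tutorialspoint'))  # identical strings
print(simple_ratio('tutorialspoint', 'TutorialsPoint'))  # two letters differ in case
print(simple_ratio('abcd', 'efgh'))                      # nothing in common
```

For 'tutorialspoint' and 'TutorialsPoint', 12 characters match out of 28 in total, so the score is 2 * 12 / 28, which rounds to 86.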
fuzz.WRatio()
fuzz.WRatio() (weighted ratio) ignores differences in upper and lower case as well as punctuation, and it weighs several matching strategies to produce its score. Let's see some examples.
Example
## importing the module from the fuzzywuzzy library
from fuzzywuzzy import fuzz

## 100 score even if one string has extra punctuation characters
print(f"Max Score:- {fuzz.WRatio('tutorialspoint', 'tutorialspoint!!!')}")

## 100 score even if the strings differ only in case
print(f"Slight Changed Strings:- {fuzz.WRatio('tutorialspoint', 'TutorialsPoint')}")

## completely different strings
print(f"Different Strings:- {fuzz.WRatio('abcd', 'efgh')}")
Output
Max Score:- 100
Slight Changed Strings:- 100
Different Strings:- 0
As the output shows, WRatio ignores case and extra characters such as the trailing exclamation marks. Using WRatio instead of the simple ratio therefore gives closer matches when strings differ only in formatting.
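Part of this behaviour can be approximated by normalizing both strings before comparing them. The following is a rough sketch under that assumption: it lowercases the text, replaces non-alphanumeric characters with spaces, and scores the result with difflib. It does not reproduce the weighting that WRatio applies on top, and the names normalize and normalized_ratio are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and turn punctuation into spaces (illustrative only)."""
    return re.sub(r"\W", " ", text).lower().strip()


def normalized_ratio(first, second):
    """Score two strings out of 100 after normalizing both of them."""
    matcher = SequenceMatcher(None, normalize(first), normalize(second))
    return int(round(100 * matcher.ratio()))


print(normalized_ratio('tutorialspoint', 'tutorialspoint!!!'))  # punctuation removed
print(normalized_ratio('tutorialspoint', 'TutorialsPoint'))     # case removed
print(normalized_ratio('abcd', 'efgh'))                         # still no overlap
```

Normalizing first means that formatting differences no longer lower the score, while strings with no characters in common still score 0.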
Conclusion
If you have any doubts regarding the tutorial, mention them in the comment section.