unicode to ascii
December 17, 2010
Problem
I had a Unicode string, “Kellemes Ünnepeket!”, that I wanted to simplify to “Kellemes Unnepeket!”, that is, strip the accent so that “Ü” becomes “U”. Furthermore, most of my strings were plain ASCII; only some of them were Unicode.
Solution
import unicodedata

title = ... # get the string somehow
try:
    # if the title is a unicode string, normalize it
    title = unicodedata.normalize('NFKD', title).encode('ascii','ignore')
except TypeError:
    # if it was not a unicode string => OK, do nothing
    pass
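Why this works: NFKD normalization decomposes “Ü” into a plain “U” followed by a combining diaeresis (U+0308), and encoding to ASCII with 'ignore' then drops the combining mark. The snippet above relies on Python 2, where passing a byte string to normalize() raises a TypeError. As a rough sketch of the same idea under Python 3 (an assumption on my part, since every str is Unicode there), something like the following should do; the function name to_ascii is just a hypothetical helper for illustration.

```python
import unicodedata


def to_ascii(text):
    """Hypothetical helper: strip accents from a str, keeping only ASCII."""
    # NFKD splits accented letters into a base letter plus combining marks
    decomposed = unicodedata.normalize('NFKD', text)
    # encoding with 'ignore' drops every non-ASCII code point,
    # including the combining marks; decode to get a str back
    return decomposed.encode('ascii', 'ignore').decode('ascii')


print(to_ascii("Kellemes Ünnepeket!"))  # prints: Kellemes Unnepeket!
```

Note that characters without a decomposition (for example “ø”) have no ASCII base letter, so they are dropped entirely rather than simplified.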
Credits
I used the following resources:
