How to remove accents from Unicode characters using Python

Sometimes Unicode characters with accents cause you trouble (actually, most of the time). You can either get rid of those characters or replace them with their base characters (without accents). Here is how to do the second in Python, using the Unidecode package:

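from unidecode import unidecode  # third-party package: pip install Unidecode

your_text = "Nguyễn Trọng Đăng Trình"
# unidecode() already returns a plain ASCII string, so no encode step is needed
your_non_accent_text = unidecode(your_text)
print(your_non_accent_text)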

Nguyen Trong Dang Trinh
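
If installing a third-party package is not an option, the two approaches mentioned above can be roughly sketched with the standard library alone. This is not how Unidecode works internally; it is a minimal sketch, and the function names drop_non_ascii and strip_accents are made up for illustration. Note one difference: letters such as Đ have no Unicode decomposition, so strip_accents leaves them unchanged, whereas Unidecode transliterates them.

```python
import unicodedata


def drop_non_ascii(text):
    """First approach: discard every character that is not plain ASCII."""
    return text.encode("ascii", "ignore").decode("ascii")


def strip_accents(text):
    """Second approach (approximation): keep base letters, drop accent marks.

    NFKD normalization splits an accented letter into its base letter
    followed by combining marks; the combining marks are then filtered out.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(drop_non_ascii("café"))
print(strip_accents("Nguyễn Trọng Đăng Trình"))
```

The first function loses the accented letters completely, which is why replacing them with base characters is usually the better choice for names.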



Reference: https://pypi.python.org/pypi/Unidecode

