encoding - How can I encode and decode percent-encoded (URL encoded) strings in Python?
I wrote a simple application that downloads articles from wiki pages. When I search for, say, the first name lech, my code returns strings such as lech_kaczy%c5%84ski or lech_pozna%c5%84 instead of lech_kaczyński and lech_poznań.
How can I decode these characters into ordinary Polish letters? I tried urllib.unquote(text), but I got lech_kaczy\xc5\x84ski and lech_pozna\xc5\x84 instead of lech_kaczyński and lech_poznań.
I have this in my code:
    # -*- coding: utf-8 -*-
    import sys
    reload(sys)
    sys.setdefaultencoding("utf-8")
but the result is the same (it does not work).
Try this:
    import urllib
    urllib.unquote("lech_kaczy%c5%84ski").decode('utf8')
This returns a unicode string:
    u'lech_kaczy\u0144ski'
which you can then print and process as usual. For example:
    print(urllib.unquote("lech_kaczy%c5%84ski").decode('utf8'))
will result in
    lech_kaczyński
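The snippet above is Python 2 (urllib.unquote returns a byte string, which is why the explicit .decode('utf8') is needed). For Python 3, a minimal sketch of the same decoding, plus the reverse encoding step, might look like the following. It assumes the percent-escapes are UTF-8, as they are in the question; the variable names are only illustrative.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

encoded = "lech_kaczy%c5%84ski"

# Step 1 (for illustration): the percent-escapes are just raw bytes.
# %c5%84 is the two-byte UTF-8 sequence for the letter "ń".
raw_bytes = unquote_to_bytes(encoded)
print(raw_bytes)   # b'lech_kaczy\xc5\x84ski'

# Step 2: unquote() does both steps at once, decoding as UTF-8 by default.
decoded = unquote(encoded)
print(decoded)     # lech_kaczyński

# Going the other way: quote() percent-encodes non-ASCII text as UTF-8.
# It emits uppercase hex digits, which are equivalent to the lowercase ones.
reencoded = quote(decoded)
print(reencoded)   # lech_kaczy%C5%84ski
```

If the input came from a source that uses a different encoding, unquote accepts an encoding argument; UTF-8 is assumed here only because the bytes in the question are UTF-8.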