encoding - How can I encode and decode percent-encoded (URL encoded) strings in Python?
I wrote a simple application that downloads articles from wiki pages. When I search, for example, for the first name Lech, the code returns strings such as lech_kaczy%c5%84ski or lech_pozna%c5%84 instead of lech_kaczyński and lech_poznań.
How can I decode these characters into ordinary Polish letters? I tried to use urllib.unquote(text), but I got lech_kaczy\xc5\x84ski and lech_pozna\xc5\x84 instead of lech_kaczyński and lech_poznań.
I have this in my code:
# -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
but the result is the same (it does not work).
Try this:
import urllib
urllib.unquote("lech_kaczy%c5%84ski").decode('utf8')
This returns a unicode string:
u'lech_kaczy\u0144ski'
which you can print and process as usual. For example:
print(urllib.unquote("lech_kaczy%c5%84ski").decode('utf8'))
will result in
lech_kaczyński
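The code above is Python 2 (it relies on urllib.unquote returning a byte string). As a minimal sketch for Python 3, assuming the input is percent-encoded UTF-8 as in the question, urllib.parse.unquote decodes straight to a str and urllib.parse.quote goes the other way. The variable names here are only for illustration:

```python
from urllib.parse import quote, unquote

encoded = "lech_kaczy%c5%84ski"

# unquote() interprets the percent-escapes as UTF-8 by default,
# so no separate .decode('utf8') step is needed on Python 3.
decoded = unquote(encoded)
print(decoded)      # lech_kaczyński

# quote() performs the reverse step; it emits uppercase hex digits.
re_encoded = quote(decoded)
print(re_encoded)   # lech_kaczy%C5%84ski

# The two escaped bytes 0xC5 0x84 are the UTF-8 encoding of U+0144 (ń).
assert bytes.fromhex("c584").decode("utf-8") == "\u0144"
```

If the text was escaped with a different encoding, both functions accept an encoding argument, for example unquote(text, encoding="latin-1").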