Python 3 encoding/decoding problems between FreeBSD/Linux with BeautifulSoup


I have an application that modifies the contents of an XML file (via Beautiful Soup) and writes it back to disk. Easy enough; on my development machine (Linux) I have working code.

First off, let's load the file into the soup:

from bs4 import BeautifulSoup
# load the document
document = open(contentxml, encoding="utf-8")
# load the soup
soup = BeautifulSoup(document, "lxml")
# do soupy stuff here
with open(document.name, "w") as f:
    # soup is the beautiful soup data
    f.write(soup.decode("utf-8"))

Now this works fine and dandy, but when I run the exact same code on the FreeBSD production system, I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character '\xa3' in position 8253: ordinal not in range(128)
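The traceback is consistent with text-mode open() falling back to the locale's preferred encoding when no encoding argument is given, which would be ASCII on the FreeBSD machine. That is an assumption, since the post does not show the locale settings. A minimal sketch that reproduces the failure without touching any file:

```python
import locale

# Encoding that text-mode open() falls back to when encoding= is omitted.
# The value depends on the machine's locale settings.
default_encoding = locale.getpreferredencoding(False)
print("default text encoding:", default_encoding)

pound = "\xa3"  # the pound sign, the character named in the traceback

# UTF-8 can represent the character; it takes two bytes.
utf8_bytes = pound.encode("utf-8")

# ASCII cannot, which is the same failure as in the traceback.
try:
    pound.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print("ascii failed:", ascii_failed)
```

If the printed default differs between the two machines, that difference would explain why identical code behaves differently.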

So in this case, I thought I would try encoding the file and then writing it to disk:

with open(document.name, "w") as f:
    # soup is the beautiful soup data
    # string output, as you cannot write bytes
    soup_enc = str(soup.encode('utf8'))
    f.write(soup_enc)

Now this works without error, but it writes incorrect XML to the output file; it outputs

b'<myxmlcontent>' 

which in turn makes the end file useless. What is the best way around this, a clean solution that will work on both platforms?
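The b'...' wrapper appears because calling str() on a bytes object returns its printable representation rather than decoding it. A small sketch of the difference, using a made-up XML fragment in place of the real soup output:

```python
# Hypothetical fragment standing in for the bytes that soup.encode('utf8') returns.
data = "<price>\xa3</price>".encode("utf-8")

# str() on bytes gives the repr: b'...' wrapper and escaped bytes included.
as_repr = str(data)

# decode() turns the bytes back into the actual text.
as_text = data.decode("utf-8")

print(as_repr)
print(as_text == "<price>\xa3</price>")
```

Writing as_repr to a file therefore stores the wrapper and the escape sequences literally, which is why the output is no longer valid XML.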

Note:

Some reading online suggests not opening the original document with a specified encoding, e.g. do:

# load the document
document = open(contentxml)
# load the soup
soup = BeautifulSoup(document, "lxml")
# do soupy stuff here
with open(document.name, "w") as f:
    # soup is the beautiful soup data
    f.write(str(soup))

This works fine on Linux, but on FreeBSD it throws this error when performing the initial open(..):

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 7551: ordinal not in range(128)

In order to write directly to a binary file, I needed to open it in the correct mode and then write the encoded byte string:

with open(document.name, 'wb') as f:
    f.write(soup.encode('utf8'))
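As a rough, platform-independent check of this approach, the sketch below writes the same text in two ways: as pre-encoded bytes in binary mode, and in text mode with the encoding stated explicitly (an alternative not taken from the post). The markup string and the file names are hypothetical stand-ins, with markup playing the role of str(soup):

```python
import os
import tempfile

# Stand-in for str(soup); any text containing a non-ASCII character will do.
markup = "<myxmlcontent>\xa3</myxmlcontent>"

with tempfile.TemporaryDirectory() as tmp:
    binary_path = os.path.join(tmp, "binary.xml")
    text_path = os.path.join(tmp, "text.xml")

    # Option 1: encode first, then write the bytes in binary mode.
    with open(binary_path, "wb") as f:
        f.write(markup.encode("utf-8"))

    # Option 2: text mode, but with the encoding given explicitly
    # so the locale default is never consulted.
    with open(text_path, "w", encoding="utf-8") as f:
        f.write(markup)

    # Read both files back as raw bytes for comparison.
    with open(binary_path, "rb") as f:
        binary_bytes = f.read()
    with open(text_path, "rb") as f:
        text_bytes = f.read()

print(binary_bytes == text_bytes)
```

Neither option depends on the locale, so both should behave the same on Linux and FreeBSD.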
