I have a non-literal string, obtained programmatically, which is the title of a document printed from an online page.
When I try to commit it to MongoDB, I get:
bson.errors.InvalidStringData: strings in documents must be valid UTF-8: 'wxpython: windows styles , events hunter \xab mouse vs. python'
The string retrieval code:
    for printstats in printers:
        handle = win32print.OpenPrinter(printstats[2])
        queued = win32print.EnumJobs(handle, 0, -1, 1)
        for printjob in queued:
            username = printjob['pUserName']
            computer = printjob['pMachineName']
            document = printjob['pDocument']
            identity = printjob['JobId']
            jobstate = printjob['Status']
            print document
    > "wxpython: windows styles , events hunter « mouse vs. python"
From the comments on the other answers, I can see that the error is:
bson.errors.InvalidStringData: strings in documents must be valid UTF-8: 'wxpython: windows styles , events hunter \xab mouse vs. python'
As « is encoded as the single byte \xab, the string is encoded in ISO-8859-1 (Latin-1), ISO-8859-15 or Windows-1252, not in UTF-8. Which one depends on the locale of the machine.
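To make the diagnosis concrete, here is a minimal sketch that checks the offending byte against the candidate encodings. It is written with a bytes literal so it also runs on Python 3. The sample string is a shortened copy of the title from the error message, and the names raw, valid_utf8 and decoded are only illustrative.

```python
# Shortened copy of the title from the error message, as raw bytes.
raw = b"events hunter \xab mouse vs. python"

# A lone 0xAB byte is a UTF-8 continuation byte with no lead byte,
# so decoding the title as UTF-8 fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# All three single-byte candidates map 0xAB to the same character.
decoded = dict(
    (name, raw.decode(name))
    for name in ("latin-1", "iso-8859-15", "cp1252")
)
```

Because the three encodings agree on this byte, the error message alone cannot tell them apart. They differ for other bytes, so the machine's locale is still the better guide.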
You need to decode it before passing it to MongoDB, which supports Unicode strings (it is not limited to ASCII, as you assert):
    document = printjob['pDocument'].decode("latin-1")
    >>> print type(document)
    <type 'unicode'>
You can then pass document to the Python MongoDB driver.
To make the code portable across Windows machines, you can use the codec alias mbcs (in place of 'latin-1'). mbcs is automatically translated to the encoding of the configured Windows locale (thanks @roeland).
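As a rough sketch of that idea, the helper below prefers mbcs and falls back to another codec where mbcs is not registered, since the codec only exists on Windows. The function name decode_title, the fallback default and the choice of errors="replace" are assumptions made for illustration. They are not part of win32print or the MongoDB driver.

```python
import codecs


def decode_title(raw, fallback="latin-1"):
    """Return a text title, decoding byte strings with the Windows ANSI
    code page when the mbcs codec is available (hypothetical helper)."""
    if not isinstance(raw, bytes):
        return raw  # already a text string, nothing to decode
    try:
        codecs.lookup("mbcs")  # registered on Windows only
        encoding = "mbcs"
    except LookupError:
        encoding = fallback  # assumption: Latin-1 is a reasonable default
    # errors="replace" substitutes undecodable bytes instead of raising.
    return raw.decode(encoding, errors="replace")
```

In the loop above this would be used as document = decode_title(printjob['pDocument']), so the value is already a Unicode string by the time it reaches the driver.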