I'm extracting files from MIME messages in a Python milter and am running into issues with files named like this:
=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?=
I can't seem to decode this name into UTF-8. To solve an earlier ISO-8859-1 issue, I started passing all filenames to this function:
    def unicodeConvert(self, fname):
        normalized = False
        while normalized == False:
            try:
                fname = unicodedata.normalize('NFKD', unicode(fname, 'utf-8')).encode('ascii', 'ignore')
                normalized = True
            except UnicodeDecodeError:
                fname = fname.decode('iso-8859-1')  # .encode('utf-8')
                normalized = True
            except UnicodeError:
                fname = unicode(fname.content.strip(codecs.BOM_UTF8), 'utf-8')
                normalized = True
            except TypeError:
                fname = fname.encode('utf-8')
        return fname
which was working until I got to this filename.
Ideas are appreciated as always.
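
For context, the `=?charset?Q?...?=` form of the name above looks like an RFC 2047 encoded-word rather than raw ISO-8859-1 bytes, which would explain why byte-level decoding leaves it unchanged. Below is a minimal sketch of decoding such a name with the standard library's `email.header.decode_header`. It assumes Python 3 (the function above is Python 2), and `decode_filename` is a hypothetical helper name, not part of the milter.

```python
from email.header import decode_header


def decode_filename(raw):
    """Decode an RFC 2047 encoded-word filename into text (hypothetical helper)."""
    parts = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            # Assumption of this sketch: fall back to ASCII when no charset
            # is declared, replacing any bytes that cannot be decoded.
            parts.append(value.decode(charset or 'ascii', errors='replace'))
        else:
            # Names without encoded-words come back as str already.
            parts.append(value)
    return ''.join(parts)


name = decode_filename('=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?=')
print(name)  # Certificado_Zonificación_2010.pdf
```

If an ASCII-only name is still needed afterwards, the decoded text could then be passed through `unicodedata.normalize('NFKD', ...)` as in the function above.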