Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
707 views
in Technique[技术] by (71.8m points)

python - Reading excel files with xlrd

I'm having problems reading .xls files written by a Perl script which I have no control over. The files contain some formatting and line breaks within cells.

filename = '/home/shared/testfile.xls'
book = xlrd.open_workbook(filename)
sheet = book.sheet_by_index(0)
for rowIndex in xrange(1, sheet.nrows):
    row = sheet.row(rowIndex)

This is throwing the following error:

_locate_stream(Workbook): seen
    0  5 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
   20  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
172480= 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
172500  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 2
172520  2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
173840= 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
173860  2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1
173880  1 1 1 1 1 1 1 1
Traceback (most recent call last):
  File "/home/shared/xlrdtest.py", line 5, in <module>
    book = xlrd.open_workbook(filename)
  File "/usr/local/lib/python2.7/site-packages/xlrd/__init__.py", line 443, in open_workbook
    ragged_rows=ragged_rows,
  File "/usr/local/lib/python2.7/site-packages/xlrd/book.py", line 84, in open_workbook_xls
    ragged_rows=ragged_rows,
  File "/usr/local/lib/python2.7/site-packages/xlrd/book.py", line 616, in biff2_8_load
    self.mem, self.base, self.stream_len = cd.locate_named_stream(qname)
  File "/usr/local/lib/python2.7/site-packages/xlrd/compdoc.py", line 393, in locate_named_stream
    d.tot_size, qname, d.DID+6)
  File "/usr/local/lib/python2.7/site-packages/xlrd/compdoc.py", line 421, in _locate_stream
    raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))
xlrd.compdoc.CompDocError: Workbook corruption: seen[2] == 4

I'm not able to find any info about CompDocError or Workbook corruption, even less the seen[2] == 4 part.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

+1 to Ramiel. Just comment out these lines in compdoc.py (lines 425-27 in xlrd 1.2.0):

if self.seen[s]:
    print("_locate_stream(%s): seen" % qname, file=self.logfile);dump_list(self.seen, 20, self.logfile)
    raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...