I have a CSV file with a few hundred rows and 26 columns, but the last few columns only have a value in a few rows, and those rows are towards the middle or end of the file. When I try to read it in using read_csv(), I get the following error:
"ValueError: Expecting 23 columns, got 26 in row 64"
I can't see where to explicitly state the number of columns in the file, or how read_csv() determines how many columns the file should have.
The full traceback is below.
In [3]:
infile =open(easygui.fileopenbox(),"r")
pledge = read_csv(infile,parse_dates='true')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-b35e7a16b389> in <module>()
1 infile =open(easygui.fileopenbox(),"r")
2
----> 3 pledge = read_csv(infile,parse_dates='true')
C:\Python27\lib\site-packages\pandas-0.8.1-py2.7-win32.egg\pandas\io\parsers.pyc in read_csv(filepath_or_buffer, sep, dialect, header, index_col, names, skiprows, na_values, thousands, comment, parse_dates, keep_date_col, dayfirst, date_parser, nrows, iterator, chunksize, skip_footer, converters, verbose, delimiter, encoding, squeeze)
234 kwds['delimiter'] = sep
235
--> 236 return _read(TextParser, filepath_or_buffer, kwds)
237
238 @Appender(_read_table_doc)
C:\Python27\lib\site-packages\pandas-0.8.1-py2.7-win32.egg\pandas\io\parsers.pyc in _read(cls, filepath_or_buffer, kwds)
189 return parser
190
--> 191 return parser.get_chunk()
192
193 @Appender(_read_csv_doc)
C:\Python27\lib\site-packages\pandas-0.8.1-py2.7-win32.egg\pandas\io\parsers.pyc in get_chunk(self, rows)
779 msg = ('Expecting %d columns, got %d in row %d' %
780 (col_len, zip_len, row_num))
--> 781 raise ValueError(msg)
782
783 data = dict((k, v) for k, v in izip(self.columns, zipped_content))
ValueError: Expecting 23 columns, got 26 in row 64
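To make the column-count mismatch concrete, here is a minimal sketch that uses only the standard library to find the widest row in some CSV text and pad a shorter header to that width. The helper names (`max_field_count`, `padded_column_names`), the `extra_N` placeholder names, and the tiny sample data are all hypothetical and stand in for the real file, which is assumed to have a header row narrower than its widest data row.

```python
import csv
import io


def max_field_count(text, delimiter=","):
    """Return the largest number of fields found on any row of the CSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return max((len(row) for row in reader), default=0)


def padded_column_names(header, width):
    """Extend `header` with placeholder names until it has `width` entries."""
    extra = ["extra_%d" % i for i in range(len(header), width)]
    return list(header) + extra


# Hypothetical sample: a 3-column header, with one later row that has 5 fields.
sample = "a,b,c\n1,2,3\n4,5,6,7,8\n"

width = max_field_count(sample)
names = padded_column_names(["a", "b", "c"], width)
```

A list like `names` could then be tried with the `names` argument that appears in the read_csv signature in the traceback above, so that the expected column count is stated explicitly instead of being taken from the first rows. Whether that is enough to avoid the error on pandas 0.8.1 is an assumption that has not been verified here.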