To reproduce the issue, I used the following CSV file (dummy.csv):
F1,F2,F3
11,A,10.54
18,B,0.12,low
24,A,19.00
10,C,7.01,low
22,D,39.11,high
49,E,12.12
Note that three of the data lines have an extra fourth field.
Since we are using error_bad_lines=False, no errors or exceptions are raised for those lines, so try-except is not the way forward. The skipped lines are reported on stderr instead, so we need to redirect stderr:
from contextlib import redirect_stderr
import pandas as pd
# import io

with open('error_messages.log', 'w') as h:
    # f = io.StringIO()
    # with redirect_stderr(f):
    with redirect_stderr(h):  # anything written to stderr inside this block goes to the log file
        df = pd.read_csv(filepath_or_buffer='dummy.csv',
                         sep=',',  # change it for your data
                         encoding='latin-1',
                         skip_blank_lines=True,
                         error_bad_lines=False,  # skip malformed lines instead of raising
                         # dtype=data_type_dict,
                         engine='python',
                         # quoting=csv.QUOTE_NONE
                         )
    # h.write(f.getvalue())  # Write the error messages to log file
print(df)
The code above writes the messages to the log file (error_messages.log).
Here is a sample output from the log file:
Skipping line 3: Expected 3 fields in line 3, saw 4
Skipping line 5: Expected 3 fields in line 5, saw 4
Skipping line 6: Expected 3 fields in line 6, saw 4
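If the messages are needed in memory rather than in a file, the same redirection can target an io.StringIO buffer. The following is only a minimal sketch of that idea: noisy_parse is a hypothetical stand-in written with the csv module (it is not part of pandas) that, like the read_csv call above, reports skipped lines on stderr, and the message wording is illustrative.

```python
import csv
import io
import sys
from contextlib import redirect_stderr

CSV_TEXT = """F1,F2,F3
11,A,10.54
18,B,0.12,low
24,A,19.00
10,C,7.01,low
22,D,39.11,high
49,E,12.12
"""


def noisy_parse(text):
    """Hypothetical stand-in parser: skips malformed lines and reports them on stderr."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    # The header is line 1, so the first data line is line 2.
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            print(f"Skipping line {line_no}: expected {len(header)} fields, "
                  f"saw {len(row)}", file=sys.stderr)
            continue
        rows.append(row)
    return header, rows


buffer = io.StringIO()
with redirect_stderr(buffer):  # stderr output is collected in the buffer
    header, rows = noisy_parse(CSV_TEXT)

messages = buffer.getvalue().splitlines()
print(rows)
print(messages)
```

The captured messages can then be filtered, counted, or written to a log file later with an ordinary write call.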
Update
Modified the code based on a suggestion (in comments below)
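A note for newer pandas versions: error_bad_lines was deprecated in pandas 1.3 and removed in 2.0. On pandas 1.4 or later, a similar result can probably be had without touching stderr, because on_bad_lines accepts a callable when engine='python'. The sketch below assumes such a version; the handler name log_bad_line and the in-memory copy of dummy.csv are illustrative choices, not part of the original code.

```python
import io

import pandas as pd

CSV_TEXT = """F1,F2,F3
11,A,10.54
18,B,0.12,low
24,A,19.00
10,C,7.01,low
22,D,39.11,high
49,E,12.12
"""

bad_lines = []


def log_bad_line(fields):
    """Record the already-split fields of a malformed line, then drop it."""
    bad_lines.append(fields)
    return None  # returning None tells pandas to skip the line


df = pd.read_csv(io.StringIO(CSV_TEXT),
                 sep=',',
                 engine='python',  # a callable on_bad_lines requires the python engine
                 on_bad_lines=log_bad_line)

print(df)
print(bad_lines)
```

Collecting the offending rows in a list keeps the raw field values available, so they can be written to a log file or repaired and appended afterwards.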