Short answer:
Just take line[74:79]
and such as Roelant suggested. Since the lines in your input are always 230 chars long though, there'll never be an IndexError
, so you rather need to check if the result is all whitespace with isspace()
:
field=line[74:79]
<...>
if isspace(field): continue
A more robust approach that would also validate input (check if you're required to do so) is to parse the entire line and use a specific element from the result.
One way is a regex as per Parse a text file and extract a specific column, Tips for reading in a complex file - Python and an example at get the path in a file inside {} by python .
But for your specific format that appears to be an archaic, punchcard-derived one, with column number defining the datum's meaning, the format can probably be more conveniently expressed as a sequence of column numbers associated with field names (you never told us what they mean so I'm using generic names):
fields=[
("id1",(0,39)),
("cname_text":(40,73)),
("num2":(74:79)),
("num3":(96,105)),
#whether to introduce a separate field at [122:125]
# or parse "id4" further after getting it is up to you.
# I'd suggest you follow the official format spec.
("id4":(106,130)),
("num5":(134,168))
]
line_end=230
And parsed like this:
def parse_line(line,fields,end):
result={}
#for whitespace validation
# prev_ecol=0
for fname,(scol,ecol) in format.iteritems():
#optionally validate delimiting whitespace
# assert prev_ecol==scol or isspace(line[prev_ecol,scol])
#lines in the input are always `end' symbols wide, so IndexError will never happen for a valid input
field=line[scol:ecol]
#optionally do conversion and such, this is completely up to you
field=field.rstrip(' ')
if not field: field=None
result[fname]=field
#for whitespace validation
# prev_ecol=ecol
#optionally validate line end
# assert ecol==end or isspace(line[ecol:end])
All that leaves is skip lines where the field is empty:
for line in lines:
data = parse_line(line,fields,line_end)
if any(data[fname] is None for fname in ('num2','id4')): continue
#handle the data