0 votes
428 views
in Technique by (71.8m points)

python - How can I partially read a huge CSV file?

I have a very big CSV file, so I cannot read it all into memory. I only want to read and process a few lines of it. I am looking for a function in pandas that can handle this task, which basic Python handles well:

with open('abc.csv') as f:
    for _ in range(1000):
        f.readline()  # pass until it reaches a particular line number
    line = f.readline()
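For reference, I can write the same skip-then-read pattern with itertools.islice, which discards lines lazily (a minimal sketch, assuming the file is named abc.csv):

from itertools import islice

with open('abc.csv') as f:
    # islice lazily skips lines 0-999 and yields lines 1000-1999
    for line in islice(f, 1000, 2000):
        print(line, end='')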

However, when I do this in pandas, I always get the first row:

datainput1 = pd.read_csv('matrix.txt', sep=',', header=None, nrows=1)
datainput2 = pd.read_csv('matrix.txt', sep=',', header=None, nrows=1)

I am looking for an easier way to handle this in pandas. For example, what if I want to read rows 1000 to 2000? How can I do this quickly?

I want to use pandas because I want to read the data into a DataFrame.


1 Reply

0 votes
by (71.8m points)

Use chunksize:

import pandas as pd

for df in pd.read_csv('matrix.txt', sep=',', header=None, chunksize=1):
    ...  # process each one-row DataFrame here
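For example, a common pattern is to filter each chunk and stitch the kept rows together at the end (a minimal sketch, assuming matrix.txt holds numeric data and that column 0 is the filter key):

import pandas as pd

pieces = []
for chunk in pd.read_csv('matrix.txt', sep=',', header=None, chunksize=1000):
    # keep only rows whose first column is positive; drop the rest
    pieces.append(chunk[chunk[0] > 0])

df = pd.concat(pieces, ignore_index=True)

This way only one 1000-row chunk is in memory at a time, however large the file is.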

To answer the second part of your question, do this:

reader = pd.read_csv('matrix.txt', sep=',', header=None, skiprows=1000, chunksize=1000)
df = next(reader)  # the first chunk: rows 1000-1999

This skips the first 1000 rows and then reads only the next 1000, giving you rows 1000-1999. Note that with chunksize, read_csv returns an iterator rather than a DataFrame, which is why next() is needed to pull the first chunk. It is unclear whether you need the endpoints included, but you can fiddle with the numbers to get what you want.
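If you only need this one slice rather than an iterator, nrows reads it in a single call (a sketch of the same slice, assuming the same matrix.txt):

import pandas as pd

# Skip the first 1000 data rows, then stop after the next 1000:
# with header=None this yields rows 1000-1999 as a DataFrame.
df = pd.read_csv('matrix.txt', sep=',', header=None, skiprows=1000, nrows=1000)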

