Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
171 views
in Technique[技术] by (71.8m points)

python - How can I open multiple files (number of files unknown beforehand) using "with open" statement?

I specifically need to use with open statement for opening the files, because I need to open a few hundred files together and merge them using K-way merge. I understand, ideally I should have kept K low, but I did not foresee this problem.

Starting from scratch is not an option now as I have a deadline to meet. So at this point, I need very fast I/O that does not store the whole/huge portion of file in memory (because there are hundreds of files, each of ~10MB). I just need to read one line at a time for K-way merge. Reducing memory usage is my primary focus right now.

I learned that with open is the most efficient technique, but I cannot understand how to open all the files together in a single with open statement. Excuse my beginner ignorance!

Update: This problem was solved. It turns out the issue was not about how I was opening the files at all. I found out that the excessive memory usage was due to inefficient garbage collection. I did not use with open at all. I used the regular f=open() and f.close(). Garbage collection saved the day.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It's fairly easy to write your own context manager to handle this by using the built-in contextmanger function decorator to define "a factory function for with statement context managers" as the documentation states. For example:

from contextlib import contextmanager

@contextmanager
def multi_file_manager(files, mode='rt'):
    """ Open multiple files and make sure they all get closed. """
    files = [open(file, mode) for file in files]
    yield files
    for file in files:
        file.close()

filenames = 'file1', 'file2', 'file3'

with multi_file_manager(filenames) as files:
    a = files[0].readline()
    b = files[2].readline()
        ...

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...