There are a lot of redundant operations in your code. For example, the call to fromtimestamp() that calculates total_days_last inside the loop can be made once, outside the loop. In fact, the datetime functions and the mucking about with the epoch seem unnecessary, because you can simply compare the file ctime values directly.
os.path.getctime() is called twice on every file: once for the sort and a second time to calculate total_days_file. Each call performs a stat() on the file, so over a large number of files these repeated lookups would be part of the performance problem.
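To make the double lookup concrete, here is a minimal sketch that counts the calls. counted_getctime(), naive() and cached() are hypothetical names invented for this illustration, and the temporary files merely stand in for your real ones.

```python
import os
import tempfile

calls = 0

def counted_getctime(path):
    """Hypothetical wrapper that counts how often a ctime is looked up."""
    global calls
    calls += 1
    return os.path.getctime(path)

def naive(paths, last_ctime):
    # One lookup per file for the sort, then a second one for the filter.
    ordered = sorted(paths, key=counted_getctime)
    return [p for p in ordered if counted_getctime(p) > last_ctime]

def cached(paths, last_ctime):
    # Look each ctime up once and keep it next to the file name.
    stamped = [(counted_getctime(p), p) for p in paths]
    return [p for ctime, p in sorted(stamped) if ctime > last_ctime]

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(5):
        path = os.path.join(tmp, f"file{i}.txt")
        with open(path, "w") as fh:
            fh.write("x")
        paths.append(path)

    calls = 0
    naive_result = naive(paths, last_ctime=0.0)
    naive_calls = calls

    calls = 0
    cached_result = cached(paths, last_ctime=0.0)
    cached_calls = calls

print(naive_calls, cached_calls)  # prints: 10 5
```

With five files the naive version performs ten lookups and the cached version five; the gap grows linearly with the number of files.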
Another issue is that, if there are a large number of files, the files list could become very large and require a lot of memory.
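As a rough illustration of the memory point, the sketch below contrasts glob.glob(), which returns a complete list, with glob.iglob(), which returns an iterator that yields matching paths on demand. The file names and the "*.eml" pattern are made up for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Three throwaway files standing in for the real directory contents.
    for i in range(3):
        with open(os.path.join(tmp, f"msg{i}.eml"), "w") as fh:
            fh.write("x")
    pattern = os.path.join(tmp, "*.eml")

    eager = glob.glob(pattern)   # a list holding every matching path
    lazy = glob.iglob(pattern)   # an iterator, consumed one path at a time

    eager_is_list = isinstance(eager, list)
    lazy_is_list = isinstance(lazy, list)
    # Consume the iterator while the directory still exists.
    lazy_names = sorted(os.path.basename(p) for p in lazy)

print(eager_is_list, lazy_is_list, lazy_names)
```

Both calls find the same files; the difference is only whether all the names have to be held in memory at the same time.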
The test

    if check_empty != "" and check_empty is not None:

can simply be written as

    if check_empty:

provided check_empty is only ever a string or None.
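A quick way to convince yourself that the two tests agree is to run both over some sample values. This is only a sketch: old_check() and new_check() are hypothetical helpers, and it assumes check_empty is always a string or None.

```python
def old_check(check_empty):
    # The original, explicit form of the test.
    return check_empty != "" and check_empty is not None

def new_check(check_empty):
    # The shorter form, relying on truthiness.
    return bool(check_empty)

samples = ["", None, "user@example.com", " "]
results = [(value, old_check(value), new_check(value)) for value in samples]

for value, old, new in results:
    print(repr(value), old, new)
```

The two forms differ for other falsy values: old_check(0) is True while new_check(0) is False, so the shortcut is only safe if such values can never occur.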
Here is a simplified version:

    import glob
    import os

    def get_ordered_files():
        # ctime of the last successfully processed file; only newer files matter.
        last_ctime = os.path.getctime(check_last_successful_file())
        files = glob.glob(files_location + file_extension)
        files.sort(key=os.path.getctime)
        return [f for f in files
                if os.path.getctime(f) > last_ctime and get_email_account(f)]

This eliminates most of the redundant code but still calls os.path.getctime() twice for each file. To avoid that, we can store the ctime for each file the first time it is obtained.

    import glob
    import os
    from operator import itemgetter

    pattern = os.path.join(files_location, file_extension)

    def get_ordered_files():
        last_ctime = os.path.getctime(check_last_successful_file())
        # Look each ctime up once and keep it with the file name.
        files = ((filename, ctime) for filename in glob.iglob(pattern)
                 if (ctime := os.path.getctime(filename)) > last_ctime and
                 get_email_account(filename))
        return (filename for filename, _ in sorted(files, key=itemgetter(1)))

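Here a generator expression is assigned to files. It uses glob.iglob(), which is an iterator version of glob.glob() that does not store all the file names at once. Each file name is stored together with its ctime as a tuple, so the ctime is looked up only once per file. The generator expression filters out files that are too old and files that don't have an associated email account. Finally, sorted() orders the remaining tuples by ctime, and a second generator expression strips the ctime so that only file names are returned. Note that sorted() has to build a list of every file that passes the filter, so the memory saving applies to the files that are filtered out. The assignment expression (:=) requires Python 3.8 or later. The calling code can then iterate over the generator, or call list() on it to realise it as a list.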
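For a version you can run on its own, here is a sketch of the same technique with the external pieces passed in as parameters. ordered_files(), keep and the fake ctimes are all made up for this example: keep stands in for get_email_account(), and the injected getctime replaces os.path.getctime() only so that the result does not depend on the file system clock.

```python
import glob
import os
import tempfile
from operator import itemgetter

def ordered_files(pattern, last_ctime, keep, getctime=os.path.getctime):
    """Return a generator of matching paths newer than last_ctime, oldest first."""
    # Filter lazily, storing each ctime the one time it is looked up.
    files = ((filename, ctime) for filename in glob.iglob(pattern)
             if (ctime := getctime(filename)) > last_ctime and keep(filename))
    # Sort on the stored ctime, then drop it again.
    return (filename for filename, _ in sorted(files, key=itemgetter(1)))

# Made-up ctimes, keyed by file name, purely for the demonstration.
fake_ctimes = {"a.eml": 300.0, "b.eml": 100.0, "c.eml": 200.0, "skip.eml": 400.0}

with tempfile.TemporaryDirectory() as tmp:
    for name in fake_ctimes:
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("x")

    result = ordered_files(
        os.path.join(tmp, "*.eml"),
        last_ctime=150.0,
        keep=lambda path: os.path.basename(path) != "skip.eml",
        getctime=lambda path: fake_ctimes[os.path.basename(path)],
    )
    names = [os.path.basename(path) for path in result]

print(names)  # ['c.eml', 'a.eml']
```

Here b.eml is dropped because it is older than the cut-off, skip.eml is rejected by keep, and the two remaining files come back oldest first.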