Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Rolling up values in a list of dictionaries with Python

I have a file system that I'm syncing to a WebDAV server. For each directory, the WebDAV server returns a modified date (epoch seconds) that is the most recent modified date of any file in that directory or in any directory below it. Essentially, it's "rolling up" the newest modification date. So, in the example below, the modified dates on /tmp/somedir/, /tmp/somedir/some_dir/, and /tmp/somedir/some_dir/some_sub_dir/ are all 1609435951.0, because that is the most recent modification to any file in that path: /tmp/somedir/some_dir/some_sub_dir/some_other_file_3.

If I could produce the same rolled-up datetimes for the folders on my local filesystem, comparing the two would be far faster.

WebDAV results:

[{'path': '/tmp/somedir/', 'isdir': True, 'modified': 1609436111.0},
 {'path': '/tmp/somedir/some_file_1', 'isdir': False, 'modified': 1568076538.0},
 {'path': '/tmp/somedir/some_file_2', 'isdir': False, 'modified': 1568077354.0},
 {'path': '/tmp/somedir/some_file_3', 'isdir': False, 'modified': 1568077410.0},
 {'path': '/tmp/somedir/some_dir/', 'isdir': True, 'modified': 1609435951.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/', 'isdir': True, 'modified': 1609435951.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_1', 'isdir': False, 'modified': 1568261178.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_2', 'isdir': False, 'modified': 1568261162.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_3', 'isdir': False, 'modified': 1609435951.0}]

I can almost replicate this for my local file system by getting the most recently modified date of any file in a directory and using it as the modified date on the directory. What I can't figure out is how to roll the modified date of /tmp/somedir/some_dir/some_sub_dir/some_other_file_3 to each directory above it.

In the example below, the modified value of /tmp/somedir/ and /tmp/somedir/some_dir/ should be 1609435951.0. I'm using Python 3.9. Any help is appreciated.

import os
from pathlib import Path

_top = "/tmp/somedir"  # root of the local tree (example value)

all_dirs = []
for path, directories, _ in os.walk(_top):
    path = Path(path)
    for d in directories:
        _path = str(path.joinpath(d))
        # Newest mtime among this directory and every directory below it
        max_modified = max(os.path.getmtime(root) for root, _, _ in os.walk(_path))
        dir_details = {"path": _path,
                       "isdir": True,
                       "modified": max_modified}

        all_dirs.append(dir_details)

# Results:
[{'path': '/tmp/somedir/', 'isdir': True, 'modified': 1568077410.0},
 {'path': '/tmp/somedir/some_file_1', 'isdir': False, 'modified': 1568076538.0},
 {'path': '/tmp/somedir/some_file_2', 'isdir': False, 'modified': 1568077354.0},
 {'path': '/tmp/somedir/some_file_3', 'isdir': False, 'modified': 1568077410.0},
 {'path': '/tmp/somedir/some_dir/', 'isdir': True, 'modified': 1568022222.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/', 'isdir': True, 'modified': 1609435951.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_1', 'isdir': False, 'modified': 1568261178.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_2', 'isdir': False, 'modified': 1568261162.0},
 {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_3', 'isdir': False, 'modified': 1609435951.0}]
Question from: https://stackoverflow.com/questions/65911038/rolling-up-values-in-a-list-of-dictionaries-with-python


1 Reply

You have to go through all the entries and, for each directory, find the maximum modification date of everything below it. For each directory, the following code selects every entry whose path starts with that directory's path and takes the maximum modification date of that selection. Each entry is then added to a new list with its updated modification date:

walked = [
  {'path': '/tmp/somedir/', 'isdir': True, 'modified': 1568077410.0},
  {'path': '/tmp/somedir/some_file_1', 'isdir': False, 'modified': 1568076538.0},
  {'path': '/tmp/somedir/some_file_2', 'isdir': False, 'modified': 1568077354.0},
  {'path': '/tmp/somedir/some_file_3', 'isdir': False, 'modified': 1568077410.0},
  {'path': '/tmp/somedir/some_dir/', 'isdir': True, 'modified': 1568022222.0},
  {'path': '/tmp/somedir/some_dir/some_sub_dir/', 'isdir': True, 'modified': 1609435951.0},
  {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_1', 'isdir': False, 'modified': 1568261178.0},
  {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_2', 'isdir': False, 'modified': 1568261162.0},
  {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_3', 'isdir': False, 'modified': 1609435951.0},
]

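print(walked)

updated = []
for p in walked:
  if p['isdir']:
    # Newest modification time of any entry at or below this directory
    # (directory paths end with '/', so the prefix test is unambiguous)
    newest = max(t['modified'] for t in walked if t['path'].startswith(p['path']))
  else:
    newest = p['modified']
  # Copy the entry so the original list is left unchanged
  updated.append({**p, 'modified': newest})

print(updated)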
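
The loop above compares every entry with every other entry, which can get slow for large trees. As a possible alternative, here is a minimal sketch that pushes each entry's timestamp up to its ancestor directories in a single pass. It assumes POSIX-style paths and the same 'path' / 'isdir' / 'modified' keys as above; roll_up, key and sample are names made up for this illustration and are not part of the question or of any library.

```python
from pathlib import PurePosixPath


def roll_up(entries):
    """Return copies of `entries` in which every directory carries the newest
    'modified' value found at or below it."""

    def key(path):
        # Normalize so '/a/b/' and '/a/b' refer to the same directory.
        return str(PurePosixPath(path))

    # Each directory starts with its own timestamp.
    newest = {key(e['path']): e['modified'] for e in entries if e['isdir']}

    # Push every entry's timestamp up to each ancestor directory in the listing.
    for e in entries:
        for parent in PurePosixPath(e['path']).parents:
            parent_key = str(parent)
            if parent_key in newest and e['modified'] > newest[parent_key]:
                newest[parent_key] = e['modified']

    # Directories get the rolled-up value; files are copied unchanged.
    return [
        {**e, 'modified': newest[key(e['path'])]} if e['isdir'] else dict(e)
        for e in entries
    ]


# Shortened version of the listing from the question (illustration only).
sample = [
    {'path': '/tmp/somedir/', 'isdir': True, 'modified': 1568077410.0},
    {'path': '/tmp/somedir/some_file_3', 'isdir': False, 'modified': 1568077410.0},
    {'path': '/tmp/somedir/some_dir/', 'isdir': True, 'modified': 1568022222.0},
    {'path': '/tmp/somedir/some_dir/some_sub_dir/', 'isdir': True, 'modified': 1609435951.0},
    {'path': '/tmp/somedir/some_dir/some_sub_dir/some_other_file_3', 'isdir': False, 'modified': 1609435951.0},
]

for entry in roll_up(sample):
    print(entry['path'], entry['modified'])
```

The work here grows with the number of entries times the directory depth rather than with the square of the number of entries. Ancestors that do not appear in the listing (such as /tmp) are simply skipped.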
