Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - Adding hashes and files names to lists and dictionaries

Hey guys, I was able to get the hashes to work individually for the files I input. I was wondering how to add each file name to the list I have set up and each hash to the dictionary. I feel like it is a pretty simple fix; I am just stuck. We have to use a file path on our machines that has already been established. The folder I am using has 3 or 4 files in it. I am just trying to figure out how to add each of the hashes and file names to these collections. Thanks!

from __future__ import print_function

'''
Week Two Assignment 2 - File Hashing
'''

'''
Complete the script below to do the following:
1) Add your name, date, assignment number to the top of this script
2) Using the os library and the os.walk() method 
   a) Create a list of all files
   b) Create an empty dictionary named fileHashes 
   c) Iterate through the list of files and
      - calculate the md5 hash of each file
      - create a dictionary entry where:
        key   = md5 hash
        value = filepath
    d) Iterate through the dictionary
       - print out each key, value pair
    
3) Submit
   NamingConvention: lastNameFirstInitial_Assignment_.ext
   for example:  hosmerC_WK1_script.py
                 hosmerC_WK1_screenshot.jpg
   A) Screenshot of the results in WingIDE
   B) Your Script
'''

import os
import hashlib
import sys
    
directory = "."

fileList   = []
fileHashes = {}

# Pseudo Constants

SCRIPT_NAME    = "Script: ASSIGNMENT NAME"
SCRIPT_AUTHOR  = "Author: NAME"
SCRIPT_DATE = "Date: 25 January 2021"



print(SCRIPT_NAME)
print(SCRIPT_AUTHOR)
print(SCRIPT_DATE)


for root, dirs, files in os.walk(directory):

    # Walk the path from top to bottom.
    # For each file obtain the filename 
    
    for fileName in files:
        path = os.path.join(root, fileName)
        fullPath = os.path.abspath(path)
        
    print(files)
    
''' Determine which version of Python '''
if sys.version_info[0] < 3:
    PYTHON_2 = True
else:
    PYTHON_2 = False
    
def HashFile(filePath):
    ''' 
        function takes one input a valid filePath
        returns the hexdigest of the file
        or error 
    '''
    try:
        with open(filePath, 'rb') as fileToHash:
            fileContents = fileToHash.read()
            hashObj = hashlib.md5()
            hashObj.update(fileContents)
            digest = hashObj.hexdigest()
            return digest
    except Exception as err:
        return str(err)
        
print()

if PYTHON_2:
    fileName = raw_input("Enter file to hash: ")
else:
    fileName = input("Enter file to hash: ")

hexDigest = HashFile(fileName)
print(hexDigest)
Question from: https://stackoverflow.com/questions/65892763/adding-hashes-and-files-names-to-lists-and-dictionaries


1 Reply


Well, you've done most of the work in the assignment, so kudos to you for that. You just need to refine a few things and use your functions together.

  1. For item "a) Create a list of all files": inside the "for fileName in files:" loop, add the line fileList.append(fullPath). (See list.append() for more info.)

    • Indent it so it's part of that inner for loop.
    • By the way, the print(files) line you have is outside the inner loop, so it runs once per folder that os.walk() visits and prints only that folder's file names, never the full list of paths.
    • Change that to print(fileList), and dedent it so it runs once after the os.walk() loop has finished.
  2. For "c) Iterate through the list of files and...":

    • Iterate through the fileList and call the HashFile() function for each file. The return value is the key for your dictionary and the filepath is the value:

      for filepath in fileList:
          filehash = HashFile(filepath)
          fileHashes[filehash] = filepath
      
    • The one-line version of that, using a dictionary comprehension:

      fileHashes = {HashFile(filepath): filepath for filepath in fileList}
      
  3. For "d) Iterate through the dictionary": I think you'll be able to manage that on your own. (See dict.items() for more info.)
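Steps 1 and 2 above can be sketched together as follows. This is only a minimal sketch assuming Python 3: listFiles() and buildFileHashes() are hypothetical helper names that are not part of the assignment, HashFile() is repeated in shortened form (without the try/except) so the block runs on its own, and the demo uses a throwaway temporary directory instead of your real folder. Step d) is left out on purpose.

```python
import hashlib
import os
import tempfile


def HashFile(filePath):
    """Return the md5 hexdigest of the file at filePath."""
    with open(filePath, 'rb') as fileToHash:
        return hashlib.md5(fileToHash.read()).hexdigest()


def listFiles(directory):
    """Step a): collect the absolute path of every file under directory."""
    fileList = []
    for root, dirs, files in os.walk(directory):
        for fileName in files:
            path = os.path.join(root, fileName)
            fileList.append(os.path.abspath(path))
    return fileList


def buildFileHashes(fileList):
    """Steps b) and c): map each file's md5 hash (key) to its path (value)."""
    fileHashes = {}
    for filePath in fileList:
        fileHashes[HashFile(filePath)] = filePath
    return fileHashes


# Demo on a temporary directory holding two small files (illustrative data).
with tempfile.TemporaryDirectory() as demoDir:
    for name, contents in (("a.txt", b"alpha"), ("b.txt", b"beta")):
        with open(os.path.join(demoDir, name), "wb") as demoFile:
            demoFile.write(contents)

    fileList = listFiles(demoDir)
    fileHashes = buildFileHashes(fileList)

print(len(fileList), "files,", len(fileHashes), "hashes")
```

In your own script you would call listFiles(directory) with the directory variable you already have, rather than a temporary folder.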

Other notes:

A.) In the except block of HashFile(), the error is returned as a string. Was that pre-written for you in the assignment, or did you write it? If you wrote it, consider removing it, because the returned error message then gets stored as the hash of the file. You could just print the error instead, unless you were instructed otherwise.

B.) The "Enter file to hash" input part is not needed: you'll be calculating the hash of all the files under the "current" directory where your script is executed (as shown above).

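C.) By the way, a dictionary keeps only one value per key, so if two files produce the same md5 hash (identical contents, or a collision, which is a known md5 weakness), the later filepath overwrites the earlier one. You may want to store a list of filepaths as the value for each hash. Maybe it's part of the test, or maybe it can be safely ignored.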
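Notes A and C could be handled roughly like this. It is a sketch under assumptions, not what the assignment asks for: hashFileOrNone() and groupByHash() are hypothetical names, returning None on failure is just one way to signal an error, and the file names in the demo are made up.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def hashFileOrNone(filePath):
    """Return the md5 hexdigest of filePath, or None if it cannot be read."""
    try:
        with open(filePath, 'rb') as fileToHash:
            return hashlib.md5(fileToHash.read()).hexdigest()
    except OSError as err:
        # Report the problem instead of returning it as if it were a hash.
        print("Could not hash {}: {}".format(filePath, err))
        return None


def groupByHash(fileList):
    """Map each md5 hash to the list of every path that produced it."""
    fileHashes = defaultdict(list)
    for filePath in fileList:
        digest = hashFileOrNone(filePath)
        if digest is not None:
            fileHashes[digest].append(filePath)
    return dict(fileHashes)


# Demo: two files with identical contents, one different, one missing.
with tempfile.TemporaryDirectory() as demoDir:
    paths = []
    for name, contents in (("a.txt", b"same"),
                           ("copy.txt", b"same"),
                           ("b.txt", b"other")):
        path = os.path.join(demoDir, name)
        with open(path, "wb") as demoFile:
            demoFile.write(contents)
        paths.append(path)
    # This path does not exist, so it exercises the error branch.
    paths.append(os.path.join(demoDir, "missing.txt"))

    grouped = groupByHash(paths)

for digest, filePaths in grouped.items():
    print(digest, [os.path.basename(p) for p in filePaths])
```

With a list as the value, files that share a hash stay visible side by side instead of silently replacing each other, and unreadable files never end up in the dictionary as fake hashes.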

