Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
153 views
in Technique[技术] by (71.8m points)

python - Pathname too long to open?

This is a screenshot of the execution:

Enter image description here

As you see, the error says that the directory "JSONFiles/Apartment/Rent/dubizzleabudhabiproperty" is not there.

But look at my files, please:

Enter image description here

The folder is definitely there.

Update 2

The code

self.file = open("JSONFiles/"+ item["category"]+"/" + item["action"]+"/"+ item['source']+"/"+fileName + '.json', 'wb') # Create a new JSON file with the name = fileName parameter
        line = json.dumps(dict(item)) # Change the item to a JSON format in one line
        self.file.write(line) # Write the item to the file

UPDATE

When I change the file name to a smaller one, it works, so the problem is because of the length of the path. what is the solution please?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Regular DOS paths are limited to MAX_PATH (260) characters, including the string's terminating NUL character. You can exceed this limit by using an extended-length path that starts with the \? prefix. This path must be a Unicode string, fully qualified, and only use backslash as the path separator. Per Microsoft's file system functionality comparison, the maximum extended path length is 32760 characters. A individual file or directory name can be up to 255 characters (127 for the UDF filesystem). Extended UNC paths are also supported as \?UNCservershare.

For example:

import os

def winapi_path(dos_path, encoding=None):
    if (not isinstance(dos_path, unicode) and 
        encoding is not None):
        dos_path = dos_path.decode(encoding)
    path = os.path.abspath(dos_path)
    if path.startswith(u""):
        return u"\\?\UNC" + path[2:]
    return u"\\?" + path

path = winapi_path(os.path.join(u"JSONFiles", 
                                item["category"],
                                item["action"], 
                                item["source"], 
                                fileName + ".json"))
>>> path = winapi_path("C:\Temp\test.txt")
>>> print path
\?C:Tempest.txt

See the following pages on MSDN:


Background

Windows calls the NT runtime library function RtlDosPathNameToRelativeNtPathName_U_WithStatus to convert a DOS path to a native NT path. If we open (i.e. CreateFile) the above path with a breakpoint set on the latter function, we can see how it handles a path that starts with the \? prefix.

Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58        sub     rsp,58h
0:000> du @rcx
000000b4`52fc0f60  "\?C:Tempest.txt"
0:000> r rdx
rdx=000000b450f9ec18
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3              ret

The result replaces \? with the NT DOS devices prefix ??, and copies the string into a native UNICODE_STRING:

0:000> dS b450f9ec18
000000b4`536b7de0  "??C:Tempest.txt"

If you use //?/ instead of \?, then the path is still limited to MAX_PATH characters. If it's too long, then RtlDosPathNameToRelativeNtPathName returns the status code STATUS_NAME_TOO_LONG (0xC0000106).

If you use \? for the prefix but use slash in the rest of the path, Windows will not translate the slash to backslash for you:

Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58        sub     rsp,58h
0:000> du @rcx
0000005b`c2ffbf30  "\?C:/Temp/test.txt"
0:000> r rdx
rdx=0000005bc0b3f068
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3              ret
0:000> dS 5bc0b3f068
0000005b`c3066d30  "??C:/Temp/test.txt"

Forward slash is a valid object name character in the NT namespace. It's reserved by Microsoft filesystems, but you can use a forward slash in other named kernel objects, which get stored in BaseNamedObjects or Sessions[session number]BaseNamedObjects. Also, I don't think the I/O manager enforces the policy on reserved characters in device and filenames. It's up to the device. Maybe someone out there has a Windows device that implements a namespace that allows forward slash in names. At the very least you can create DOS device names that contain a forward slash. For example:

>>> kernel32 = ctypes.WinDLL('kernel32')
>>> kernel32.DefineDosDeviceW(0, u'My/Device', u'C:\Temp')
>>> os.path.exists(u'\\?\My/Device\test.txt')
True

You may be wondering what ?? signifies. This used to be an actual directory for DOS device links in the object namespace, but starting with NT 5 (or NT 4 w/ Terminal Services) this became a virtual prefix. The object manager handles this prefix by first checking the logon session's DOS device links in the directory SessionsDosDevices[LOGON_SESSION_ID] and then checking the system-wide DOS device links in the Global?? directory.

Note that the former is a logon session, not a Windows session. The logon session directories are all under the DosDevices directory of Windows session 0 (i.e. the services session in Vista+). Thus if you have a mapped drive for a non-elevated logon, you'll discover that it's not available in an elevated command prompt, because your elevated token is actually for a different logon session.

An example of a DOS device link is Global??C: => DeviceHarddiskVolume2. In this case the DOS C: drive is actually a symbolic link to the HarddiskVolume2 device.

Here's a brief overview of how the system handles parsing a path to open a file. Given we're calling WinAPI CreateFile, it stores the translated NT UNICODE_STRING in an OBJECT_ATTRIBUTES structure and calls the system function NtCreateFile.

0:000> g
Breakpoint 1 hit
ntdll!NtCreateFile:
00007ff9`d2023d70 4c8bd1          mov     r10,rcx
0:000> !obja @r8
Obja +000000b450f9ec58 at 000000b450f9ec58:
        Name is ??C:Tempest.txt
        OBJ_CASE_INSENSITIVE

NtCreateFile calls the I/O manager function IoCreateFile, which in turn calls the undocumented object manager API ObOpenObjectByName. This does the work of parsing the path. The object manager starts with ??C:Tempest.txt. Then it replaces that with Global??C:Tempest.txt. Next it parses up to the C: symbolic link and has to start over (reparse) the final path DeviceHarddiskVolume2Tempest.txt.

Once the object manager gets to the HarddiskVolume2 device object, parsing is handed off to the I/O manager, which implements the Device object type. The ParseProcedure of an I/O Device creates the File object and an I/O Request Packet (IRP) with the major function code IRP_MJ_CREATE (an open/create operation) to be processed by the device stack. This is sent to the device driver via IoCallDriver. If the device implements reparse points (e.g. junction mountpoints, symbolic links, etc) and the path contains a reparse point, then the resolved path has to be resubmitted to the object manager to be parsed from the start.

The device driver will use the SeChangeNotifyPrivilege (almost always present and enabled) of the process token (or thread if impersonating) to bypass access checks while traversing directories. However, ultimately access to the device and target file has to be allowed by a security descriptor, which is verified via SeAccessCheck. Except simple filesystems such as FAT32 don't support file security.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...