Long Paths in .NET, Part 1 of 3 [Kim Hamilton]

Let’s start by looking at one of the more interesting exception messages in the BCL, the PathTooLongException:

[PathTooLongException]: The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.

"260 characters? That’s ridiculous; increase the limit", say our customer bug reports. To everyone who got their bug resolved as "won’t fix", we’d like to explain this problem more completely and describe current efforts to address it.

To avoid confusion, let’s get some terminology out of the way:

Path: a fully-qualified file name or directory name. In other words, if you have a file C:\temp\fileA.txt, the file name is often called fileA.txt, but the path or fully-qualified file name is C:\temp\fileA.txt.

MAX_PATH: This is the maximum length of a path according to the Windows API, defined as 260 characters.

Long path: a path that can be longer than MAX_PATH characters.

Long file name: not the same as a long path. This term is used in contrast to short file names – those 8.3 names you may remember using long ago.
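
To make the two limits in the exception message concrete, here is a minimal Python sketch that reads the message literally. The function name fits_legacy_limits and the constant MAX_DIRECTORY are made up for illustration, and this is not how System.IO actually performs the check; it only applies "less than 260" to the whole path and "less than 248" to the directory part.

```python
import ntpath  # Windows path rules; importable on any platform

MAX_PATH = 260        # limit on the fully qualified file name
MAX_DIRECTORY = 248   # limit on the directory name (hypothetical constant name)


def fits_legacy_limits(path: str) -> bool:
    """Return True if `path` satisfies both limits quoted in the message."""
    directory = ntpath.dirname(path)
    return len(path) < MAX_PATH and len(directory) < MAX_DIRECTORY


short_path = r"C:\temp\fileA.txt"
# Six 50-character directories push the path well past 260 characters.
long_path = "C:\\" + "\\".join(["a" * 50] * 6) + "\\fileA.txt"

print(fits_legacy_limits(short_path))  # True
print(fits_legacy_limits(long_path))   # False
```

Note that a path can be shorter than 260 characters and still fail because its directory part alone reaches 248.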

.NET APIs depend heavily on the Windows APIs, so based on these definitions, it may sound like long paths are immediately out of the question. However, the Windows file APIs provide a way to get around this limitation. If you prefix the file name with "\\?\" and call the Unicode versions of the Windows APIs, then you can use file names up to 32K characters in length. In other words, the \\?\ prefix is a way to enable long paths while working with the Windows file APIs.
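
As a rough illustration of the prefixing step, the Python sketch below only builds the string; it does not call any Windows API. The helper name to_extended_path is hypothetical, and the function assumes its input is already a fully qualified path. The \\?\UNC\ form for network paths is part of the same Windows syntax.

```python
LONG_PREFIX = "\\\\?\\"            # the four characters \\?\
UNC_LONG_PREFIX = "\\\\?\\UNC\\"   # \\?\UNC\ for \\server\share paths


def to_extended_path(full_path: str) -> str:
    """Add the long-path prefix to an already fully qualified Windows path."""
    if full_path.startswith(LONG_PREFIX):
        return full_path  # already prefixed, leave untouched
    if full_path.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return UNC_LONG_PREFIX + full_path[2:]
    return LONG_PREFIX + full_path


print(to_extended_path(r"C:\temp\fileA.txt"))         # \\?\C:\temp\fileA.txt
print(to_extended_path(r"\\server\share\fileA.txt"))  # \\?\UNC\server\share\fileA.txt
```

The resulting string is only useful when handed to the Unicode versions of the Windows file APIs.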

Very few people complain about a 32K limit, so, problem solved? Not quite. There are several reasons we were reluctant to add long paths in the past, and why we’re still careful about it, related to security, inconsistent support in the Windows APIs of the \\?\ syntax, and app compatibility.

First, security. The \\?\ prefix not only enables long paths; it causes the path to be passed to the file system with minimal modification by the Windows APIs. A consequence is that \\?\ turns off file name normalization performed by Windows APIs, including removing trailing spaces, expanding ‘.’ and ‘..’, converting relative paths into full paths, and so on. The existence of FileIOPermission in .NET means that we absolutely have to work with normalized paths, or risk exposing a security threat. So we knew that if we wanted to use the \\?\ prefix as part of the long path solution, we’d need the ability to normalize these paths as expected.
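
To show the kind of work that moves to the caller once \\?\ is involved, here is a simplified Python sketch that normalizes a path before adding the prefix. The function name normalize_then_prefix and the explicit current_dir parameter are assumptions for illustration; it covers only the three steps named above, ignores UNC paths, and does not reproduce the full Windows normalization rules.

```python
import ntpath  # Windows path rules; importable on any platform


def normalize_then_prefix(path: str, current_dir: str) -> str:
    """Normalize `path` by hand, then add the long-path prefix."""
    # Convert a relative path into a full path against an explicit directory.
    full = ntpath.join(current_dir, path)
    # Expand "." and ".." segments and unify the separators.
    full = ntpath.normpath(full)
    # Remove trailing spaces (one of the steps the prefix switches off).
    full = full.rstrip(" ")
    return "\\\\?\\" + full


result = normalize_then_prefix(r"..\logs\.\app.log ", r"C:\data\work")
print(result)  # \\?\C:\data\logs\app.log
```

Any permission check would then run against the normalized form, not against the string the caller supplied.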

Another concern is the inconsistent behavior that would result from exposing long path support. Long paths with the \\?\ prefix can be used in most of the file-related Windows APIs, but not all Windows APIs. For example, LoadLibrary, which maps a module into the address space of the calling process, fails if the file name is longer than MAX_PATH. So this means MoveFile will let you move a DLL to a location such that its path is longer than 260 characters, but when you try to load the DLL, it will fail. There are similar examples throughout the Windows APIs; some workarounds exist, but they are on a case-by-case basis.

Another factor, which is considered more of a pain factor, is compatibility with other Windows-based applications and the Windows shell itself, which only work with paths shorter than MAX_PATH (note that the Vista shell attempts to soften this limit, briefly described below). This means that if .NET supports long paths, then we’d let you create files that you couldn’t access via Explorer or the command prompt.

That said, we realize that enforcing the 260 character limit isn’t a reasonable long-term solution. Our customers don’t run into this problem very often, but when they do, they’re extremely inconvenienced. A possible workaround is P/Invoking to the Windows APIs and using the \\?\ prefix, but it would be a huge amount of code to duplicate the functionality in System.IO. So to work around the problem, customers often resort to an overhaul of directory organization, artificially shortening directory names (and updating every referencing location).

Because this problem is becoming increasingly common, also outside the .NET Framework, there are efforts throughout Microsoft to address it. In fact, as a timely Vista plug, you’ll notice a couple of changes that reduce the chance of hitting the MAX_PATH limit: many of the special folder names have been shortened and, more interestingly, the shell is using an auto-path shrinking feature, which involves using short name aliases for paths (behind the scenes) to attempt to squeeze them into 260 characters.

This is the first of a series of blog posts on long paths. Our follow-ups will describe plans to support long paths in System.IO in a future version of the Common Language Runtime and some workarounds you can use in the meantime.

We do need a solution in the underlying Windows API, but this would most likely emerge as new APIs rather than changes to the existing ones. We’ve discussed this at length on the longpath alias at Microsoft (yes, we have a whole alias devoted to the issue!) and there are no plans to change the existing APIs, since it would break third-party code that depends on MAX_PATH buffers on the stack.

We’ve discussed migration plans to enable those existing Win32 APIs not to enforce MAX_PATH but there are no clear solutions; this would need to be done very carefully to avoid breaking existing Windows-based apps.

The main reason I think we need new Win32 APIs is that \\?\ lumps together long paths and non-canonical paths, so you have to do tricks to get canonicalization on \\?\ paths. That sort of work shouldn’t be pushed onto the callers just to achieve long paths.

I think everybody’s C/C++ code somewhere has at least one TCHAR[MAX_PATH] in it (or MAX_PATH + 1, FWIW).

I guess an "under-the-hood" workaround of descending a long path in pieces shorter than MAX_PATH wouldn’t work either, because somebody will be making assumptions about the current directory or trying to copy the current path into a MAX_PATH-sized buffer.

What about abandoning the path altogether and using item identifier lists like the shell does?

Great point about using item identifier lists (or a similar concept) like the shell does. I’m going to back up and provide some context for folks unfamiliar with PIDLs in the shell before addressing your question.

Most of the shell doesn’t have to worry about path representation because it uses pointers to item identifier lists (PIDLs). File names can be retrieved if needed, but in general callers interact with the PIDLs and not the path string.

The great thing about this is path representation changes end up being very isolated. For example, this allows the shell to introduce auto path shrinking but only have to modify isolated areas (instead of touching tons of code throughout the shell).

It stands out that having such an object isolates path representation issues. Focusing on that alone (i.e. not PIDLs in particular), the long path issue could be cleaner if our IO APIs dealt with a Path object. The user could create a path object from "C:\verylongpath…", while the long path version "\\?\C:\verylongpath…" or the auto-path-shrunk version "C:\verylo~1" would be an internal detail. For now, the gain of this would be far outweighed by the fact that our APIs are heavily committed to string representations of paths.
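
Purely as a thought experiment, the Path object idea might look something like the Python sketch below. Everything here is hypothetical (the class name PathRef, the method for_file_api, and the rule for choosing a representation); no such type exists in System.IO. The point is only that callers hold the object while the representation choice stays in one place.

```python
from dataclasses import dataclass

MAX_PATH = 260


@dataclass(frozen=True)
class PathRef:
    """Hypothetical path object that callers hold instead of a raw string."""
    user_path: str  # what the caller supplied; assumed fully qualified

    def for_file_api(self) -> str:
        """Pick the internal representation handed to the file APIs."""
        if len(self.user_path) < MAX_PATH:
            return self.user_path          # short enough, pass through
        return "\\\\?\\" + self.user_path  # long path: add the prefix internally

    def __str__(self) -> str:
        # Callers always see the path they created, never the internal form.
        return self.user_path


short_ref = PathRef(r"C:\temp\fileA.txt")
long_ref = PathRef("C:\\" + "x" * 300)

print(short_ref.for_file_api())      # C:\temp\fileA.txt
print(long_ref.for_file_api()[:7])   # \\?\C:\
```

Swapping in a different internal form, such as a short-name alias, would only touch for_file_api.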

Let me know if I left something out. I went on a tangent and may have hijacked your original question. 🙂

We’d definitely consider it if such a change were up to us and there were no risk of breaking apps as a result. The problem is the underlying Win32 APIs that enforce the MAX_PATH limit of 260 (except when using the nonstandard \\?\ prefix described above). And the Win32 folks can’t do this without a nearly flawless transition story for app compat reasons.

I ran into this problem about a year ago, and I found I couldn’t get around it, even when passing in the shortened path (using filena~1 style filenames). Even after I had P/Invoked GetShortPathName() and shortened every part of the path, the exception thrown from the framework mentioned the full, expanded path.

I had absolutely no desire to rewrite System.IO, so I ended up changing my directory structure. To a user of the framework, it just seems like a lot of work that I shouldn’t have to go through.