How do I find out which process has a file open?

Classically, there was no way to find out which process has a file open.
A file object has a reference count, and when the reference count drops
to zero, the file is closed.
But there's nobody keeping track of which processes own how many references.
(And that's ignoring the case that the reference is not coming from a
process in the first place; maybe it's coming from a kernel driver,
or maybe it came from a process that no longer exists but whose reference
is being kept alive by a kernel driver that
captured the object reference.)

You do the same thing with your COM object reference counts.
All you care about is whether your reference count has reached zero
(at which point it's time to destroy the object).
If you later discover an object leak in your process,
you don't have a magic query
"Show me all the people who called
AddRef on my object"
because you never kept track of all the people who called
AddRef on your object.
Or even, "Here's an object I want to destroy.
Show me all the people who called AddRef on it
so I can destroy them
and get them to call Release."
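As a minimal sketch of that point, here is a reference-counted object whose only bookkeeping is a number. The class name `Widget` and the `Destroyed` flag are invented for illustration; a real COM object would typically delete itself when the count reaches zero.

```cpp
#include <cassert>

// Hypothetical COM-style object. Note what is missing: there is no list
// of who called AddRef, only a running total.
class Widget {
public:
    unsigned long AddRef() { return ++refs_; }

    unsigned long Release() {
        assert(refs_ > 0);              // releasing more than was acquired is a bug
        unsigned long remaining = --refs_;
        if (remaining == 0) {
            destroyed_ = true;          // stand-in for "delete this"
        }
        return remaining;
    }

    bool Destroyed() const { return destroyed_; }

private:
    unsigned long refs_ = 1;            // the creator holds the first reference
    bool destroyed_ = false;
};
```

If a reference leaks, all the object can tell you is that the count is still nonzero. It cannot tell you which caller forgot to call Release.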

The official goal of the Restart Manager is to help make it possible to
shut down and restart applications which are using a file you want
to update.
In order to do that, it needs to keep track of which processes are
holding references to which files.
And it's that database that is of use here.
(Why is the kernel keeping track of which processes have a file open?
Because it's the converse of the principle of not keeping track
of information you don't need:
Now it needs the information!)

Here's a simple program which takes a file name on the command line
and shows which processes have the file open.

The first thing we do is call, no wait, even before we call
the RmStartSession function, we have the line

WCHAR szSessionKey[CCH_RM_SESSION_KEY+1] = { 0 };

That one line of code addresses two bugs!

First is a documentation bug.
The documentation for the
RmStartSession function doesn't specify
how large a buffer you need to pass for the session key.
The answer is CCH_RM_SESSION_KEY+1.

Second is a code bug.
The
RmStartSession function doesn't properly
null-terminate the session key, even though the function
is documented as returning a null-terminated string.
To work around this bug, we pre-fill the buffer with null characters
so that whatever ends up getting written will have a null terminator
(namely, one of the null characters we placed ahead of time).
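The workaround can be seen in isolation with a portable sketch. Everything here is a stand-in: `WriteKeyWithoutTerminator` is a made-up function that misbehaves the way the text describes, and `kKeyLength` is an assumed length playing the role of CCH_RM_SESSION_KEY.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Assumed length for illustration; stands in for CCH_RM_SESSION_KEY.
constexpr std::size_t kKeyLength = 32;

// Hypothetical stand-in for an API that writes exactly kKeyLength
// characters into the caller's buffer and never adds a terminator.
void WriteKeyWithoutTerminator(wchar_t* key) {
    assert(key != nullptr);
    for (std::size_t i = 0; i < kKeyLength; ++i) {
        key[i] = static_cast<wchar_t>(L'a' + i % 26);
    }
}

// Returns the length of the key as measured by wcslen.
std::size_t MeasureKey() {
    wchar_t key[kKeyLength + 1] = { 0 };  // extra slot, every element L'\0'
    WriteKeyWithoutTerminator(key);
    return std::wcslen(key);  // stops at key[kKeyLength], which is still L'\0'
}
```

Without the `+ 1` and the zero-initialization, the call to `wcslen` would read past the end of the array.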

Okay, so that's out of the way.
The basic algorithm is simple:

Create a Restart Manager session.

Add a file resource to the session.

Ask for a list of all processes affected by that resource.

Print some information about each process.

Close the session.

We already mentioned that you create the session by calling
RmStartSession.
Next, we add a single file resource to the session by
calling RmRegisterResources.
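The five steps might be sketched roughly as follows. This is not the original listing, just an illustration under some assumptions: the helper name `ProcessIdsUsingFile` is invented, the result array is fixed at ten entries for simplicity, and the whole body is guarded by `_WIN32` so the sketch still compiles (and simply returns nothing) on other platforms. It collects process IDs where a full program would print details.

```cpp
#include <cassert>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <RestartManager.h>
#pragma comment(lib, "Rstrtmgr.lib")
#endif

// Hypothetical helper: returns the IDs of the processes that Restart
// Manager reports for `path`. Returns an empty list on any error, or
// when more entries are needed than the fixed array can hold.
std::vector<unsigned long> ProcessIdsUsingFile(const wchar_t* path) {
    assert(path != nullptr);
    std::vector<unsigned long> pids;
#ifdef _WIN32
    DWORD session = 0;
    WCHAR sessionKey[CCH_RM_SESSION_KEY + 1] = { 0 };

    // Step 1: create the session.
    if (RmStartSession(&session, 0, sessionKey) != ERROR_SUCCESS) {
        return pids;
    }

    // Step 2: register a single file resource.
    PCWSTR files[] = { path };
    if (RmRegisterResources(session, 1, files, 0, nullptr, 0, nullptr)
            == ERROR_SUCCESS) {
        // Step 3: ask which processes are affected.
        UINT needed = 0;
        UINT count = 10;
        RM_PROCESS_INFO info[10];
        DWORD rebootReasons = 0;
        if (RmGetList(session, &needed, &count, info, &rebootReasons)
                == ERROR_SUCCESS) {
            // Step 4: collect (rather than print) each process ID.
            for (UINT i = 0; i < count; ++i) {
                pids.push_back(info[i].Process.dwProcessId);
            }
        }
    }

    // Step 5: close the session.
    RmEndSession(session);
#else
    (void)path;  // Restart Manager exists only on Windows
#endif
    return pids;
}
```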

Now the fun begins.
Getting the list of affected processes is normally a two-step
affair.
First, you ask for the number of affected processes
(by passing 0 as the nProcInfo),
then allocate some memory and call a second time to get the data.
But this is just a sample program, so I've hard-coded a limit
of ten processes.
If more than ten processes are affected, I just give up.
(You can see this if you ask for all the processes that
have open handles to kernel32.dll.)
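For reference, the two-step pattern that the sample skips looks something like this in the abstract. The sketch is portable and deliberately does not call Restart Manager: `FakeSource` is a made-up stand-in for any "tell me how many, then fill my buffer" API, and the retry limit is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a size-then-fill API. Query always reports
// the total in `needed` and returns false if `capacity` is too small
// (the moral equivalent of ERROR_MORE_DATA).
struct FakeSource {
    std::vector<int> items;

    bool Query(int* out, std::size_t capacity, std::size_t& needed) const {
        needed = items.size();
        if (capacity < needed) {
            return false;
        }
        std::copy(items.begin(), items.end(), out);
        return true;
    }
};

// Ask for the size, allocate, ask again. The loop covers the case where
// the list grows between the two calls.
std::vector<int> QueryAll(const FakeSource& source, int maxAttempts = 4) {
    assert(maxAttempts > 0);
    std::vector<int> buffer;
    std::size_t needed = 0;
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (source.Query(buffer.data(), buffer.size(), needed)) {
            buffer.resize(needed);  // trim if the list shrank
            return buffer;
        }
        buffer.resize(needed);      // grow and try again
    }
    return {};                      // gave up: the list kept changing
}
```

The first pass through the loop plays the role of passing 0 as the buffer size: it fails, but it reports how much room is needed.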

The other tricky part is mapping the RM_PROCESS_INFO
to an actual process.
Since process IDs can be recycled,
the
RM_PROCESS_INFO structure identifies a process
by the combination of the process ID and the process creation time.
That combination is unique because two processes cannot have the same
ID at the same time.
We open the handle to the process via its ID, then confirm that the
start times match.
(If not, then
the original process exited at some point between the
time we obtained the list and the time we actually looked at it,
and its ID has since been reused by a different process.)
Assuming it all matches, we get the image name and print it.
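The identity check can be modeled without any Windows calls. In this sketch the process table is simulated with a map, and all the names (`ProcessIdentity`, `Launch`, `Resolve`) are invented for illustration; on Windows the lookup would presumably go through OpenProcess and GetProcessTimes instead.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical model: a process is identified by ID plus creation time.
struct ProcessIdentity {
    std::uint32_t pid;
    std::uint64_t startTime;
};

// Simulated table of running processes: pid -> creation time.
using ProcessTable = std::map<std::uint32_t, std::uint64_t>;

// Two processes cannot have the same ID at the same time.
void Launch(ProcessTable& running, std::uint32_t pid, std::uint64_t startTime) {
    assert(running.count(pid) == 0);
    running[pid] = startTime;
}

enum class Lookup { SameProcess, Exited, PidRecycled };

// Decide whether `wanted` still refers to a live process.
Lookup Resolve(const ProcessTable& running, const ProcessIdentity& wanted) {
    auto it = running.find(wanted.pid);
    if (it == running.end()) {
        return Lookup::Exited;          // nobody has that ID right now
    }
    return it->second == wanted.startTime
        ? Lookup::SameProcess           // same ID, same creation time
        : Lookup::PidRecycled;          // same ID, different process
}
```

Only the `SameProcess` result makes it safe to go on and report the image name.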

And that's all there is to enumerating all the processes that have
a particular file open.
Of course, a more expressive interface for managing files in use
is
IFileIsInUse,
which I mentioned some time ago.
That interface not only tells you the application that has the file open
(in a friendlier format than just an executable path),
you can also use it to switch to the application and even ask it to
close the file.
(Windows 7 first tries IFileIsInUse,
and if that fails, then it goes to the Restart Manager.)

In regards to the two bugs, please excuse me while I scream in frustration. The first bug I can chalk up to a simple documentation oversight; MSDN documentation doesn't exactly have a stellar history (though it is better than it used to be). But the second bug, failing to NULL-terminate an out string? My eyes, the goggles do nothing!

So, if this is possible, then why doesn't Windows tell me which process has a file open when I can't safely remove my external hard drive or USB key?

Frequently, nothing short of logging off and logging on again allows me to remove my device.

[Please read the article again. You have to know the name of the file first. There is no wildcard query like "all files on this drive, even the ones that I don't have permission to access or even know the existence of." -Raymond]

Neat, I didn't know that API existed. I wonder when Process Explorer will be updated to use it; it still searches through all handles in all processes to find open handles to a given file (or at least that's what I assume it does, given that it takes a couple of seconds to do the search).

Incidentally, the classical model still allows for this by walking the open handle tables of all processes and comparing device & file identity. Even though Windows has a poor idea of device identity in user mode, there must be a good identity in kernel mode.

I thought that the system would need to keep track of what processes had which files open because not all processes clean up nicely. If a process is killed how does it decrement the reference count to the file?

[The system knows, for each process, what files (more specifically, handles) it has open. But it classically did not maintain a reverse mapping (for each file, what processes have it open). -Raymond]


Are you saying I have no permission to yank out my USB stick 20cm in front of me?

w9x didn't have this kind of locking, and it worked fine. File handles which lock up files & folders in NT don't guarantee anything for apps anyway. It's only an inconvenience for users. Please bring back the old behaviour.

[No, I'm saying that Restart Manager doesn't help you solve this problem. Because today's topic is Restart Manager, right? -Raymond]

The alternative would be to have the API allocate some memory for you and return the data. (Remember that the OS uses a C-based API, so it cannot return an object except via a handle.) Now you have either a handle to clean up or some memory that you have to remember to free with HeapFree (or LocalFree or GlobalFree or something else).

And even if the OS did all that, it STILL doesn't solve the problem because while it was busy allocating memory and filling your buffer some other thread or process could have jumped in and changed things. Unless, of course, you want the OS to synchronize access to the underlying resource collection and now you are in Denial of Service attack mode — all you have to do is write a process that continuously runs the query and writes will be delayed or even prevented.

Does the Restart Manager talk to network shares too? I think it's still common practice in corporations to open applications from a central shared folder. If the system sees that the updated folder is a shared folder, will it by some sort of magic send a notification through the Server/Workstation service to the list of openers, and then let the remote's Restart Manager handle it accordingly?

Any API that lets you retrieve information has a "race condition" in the sense that the information may be out of date by the time the caller does anything with the information. The kernel is entitled to swap you out (for several hours if it is feeling particularly perverse) after *any* instruction, even RET.

@SimonRev: I don't see why the alternative is worse, and plenty of Win32 APIs *do* use the alternative.

It's only better for the application to have to allocate the buffer itself in fairly rare situations. e.g. When it's likely that the app will already have a buffer that's probably big enough, and will be calling the API repeatedly, in a tight loop where re-allocating the buffer on each iteration would incur significant costs.

In this case, and with most other APIs, the caller is almost certainly going to allocate a one-off buffer and then free it after a one-off API call. So the operating system might as well do that itself, and provide a way to free the buffer (or more likely just tell people to use an existing API like CoTaskMemFree).

And the OS does not have to handle synchronizing the resources or any of that stuff. It just has to handle the "check size, allocate, request data; loop if the data is too big now" logic. Because that is the kind of logic that only needs to be written in one place — the OS — rather than in every single application, and it's also the kind of logic that programmers often get wrong or don't even realise they need to write in the first place.

Having that kind of loop (or wrapper function) around so many API calls in Win32 programs really gets in the way for, most of the time, zero benefit. Having to think about that stuff, instead of calling a simple API and moving on, takes you out of writing (or later reading) the main flow/logic of the code.

That said, you have to start a "session" with the restart manager, so maybe it takes a snapshot of the system state at the start of that and there isn't a race condition here?

RE the null-termination bug in the API, I'm glad my paranoia about passing Win32 APIs buffers one null longer than I tell them is vindicated. :) The docs are rarely explicit about null termination in those cases.

The ability to see which process has a file open was necessary long before this Restart Manager came along. Many of us have been seeing Explorer's irritating dialogue boxes when you try to delete / move / &c. a file that's in use. Fortunately Process Hacker can figure out which process uses it.

@Simon Rev &c: Having the OS allocate a block of memory isn't a bad thing; in the case of text strings many APIs already return BSTRs (which you must then free later) and calls to those are much less likely to be buggy.

@John: You can rarely know for sure how large is large enough, especially in functions that allow you to call them once to get the buffer size, then call them again to get the actual data.

If there was a max size known in advance then the API would just tell you to use that size and not bother with the first call at all. And that kind of thinking is why we're still limited to 260 character paths in a lot of places.

Allocating the buffer on the stack in this case saves you nothing in reality. You are not going to call the Restart Manager APIs in a tight loop, and the performance/resource differences between a stack and heap allocation will never matter in this situation.

In code that gets called in a tight loop, it might make sense to try a stack-based buffer first and then fall back on the heap to cover the rare (but far from impossible) cases where something larger is needed. That's a lot of complexity (i.e. potential bugs) if it isn't needed, though.

Don't forget that you need to try the stack buffer, then if it isn't big enough try an allocated buffer, then, if that isn't big enough (because the data changed between calls), loop and try a bigger allocated buffer, and keep looping until it works or you hit a sanity-check loop/size limit or until the OS refuses to allocate something that large…

If it really matters for an API, the OS could always provide two versions of the API, one a wrapper around the other which handles the allocation logic and saves everyone else having to write and test that code in their apps.

Some APIs are generalized, though; something like GetTokenInformation where some parameters have fixed or maximum sizes while others do not. Having the caller manage the buffer also allows you to use constructs like vectors or smart pointers without making extra copies or having to deal with raw pointers. The OS generally doesn't provide wrapper functions that are trivial to implement yourself. Personally I prefer flexibility over simplicity in this particular case.