Edge Cases to Keep in Mind. Part 2 – Files

Karol Wrótniak, Android Developer

Did you know that there may be a File which exists and doesn’t exist at the same time? Are you aware that you can delete a file and still use it? Discover these and other file edge cases in software development.

In my previous article about edge cases in software development, I wrote about text traps and gave you some suggestions on how to avoid them. In this blog post, I would like to focus on files and file I/O operations.

A File which is not a file

One may think that an object pointed to by an existing path is either a file or a directory – like in this question on Stack Overflow. However, this is not always true.

It is not stated explicitly in the File#isFile() Javadoc, but “file” there really means a regular file. Thus, special Unix files such as devices, sockets and pipes may exist, but they are not files under that definition.

Look at the following snippet:
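A minimal sketch of such a check, assuming a Unix-like system where the character device /dev/null is present (the class and method names are made up for illustration):

```java
import java.io.File;

public class SpecialFileDemo {
    // Summarizes what java.io.File reports for the given path.
    static String describe(File file) {
        return "exists=" + file.exists()
                + ", isFile=" + file.isFile()
                + ", isDirectory=" + file.isDirectory();
    }

    public static void main(String[] args) {
        // Assumption: /dev/null exists and is a character device, not a regular file.
        System.out.println(describe(new File("/dev/null")));
    }
}
```

On a typical Linux machine this should report that the path exists while both isFile and isDirectory are false.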

So a File which is neither a file nor a directory may exist.

To exist, or not to exist?

Symbolic links are also special files, but they are treated transparently almost everywhere in the (old) java.io API. The only exception is the #getCanonicalPath()/#getCanonicalFile() family of methods. Transparency here means that all operations are forwarded to the target, as if they were performed directly on it. Such transparency is usually useful, e.g. you can just read from, or write to, some file without caring about link resolution. However, it may also lead to some strange cases. For example, there may be a File which exists and doesn’t exist at the same time.

Let’s consider a dangling symbolic link. Its target does not exist, so exists(), isFile() and isDirectory() will all return false. Nonetheless, the path of the link itself is still occupied, e.g. you cannot create a new file at that path. Here is the code demonstrating this case:
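A minimal sketch of the dangling-link case. Since java.io cannot create symbolic links, it borrows java.nio.file.Files for the setup. It assumes a file system that supports symbolic links and a process allowed to create them; the class, method and path names are illustrative only.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DanglingLinkDemo {
    // Creates a dangling symbolic link in a temporary directory and returns
    // {exists(), createNewFile()} as observed through java.io.File.
    static boolean[] probe() {
        try {
            Path dir = Files.createTempDirectory("dangling");
            Path link = dir.resolve("link");
            // The target is never created, so the link dangles.
            Files.createSymbolicLink(link, dir.resolve("missing-target"));
            try {
                File file = link.toFile();
                boolean exists = file.exists();          // forwarded to the missing target
                boolean created = file.createNewFile();  // the link's own path is occupied
                return new boolean[] {exists, created};
            } finally {
                // deleteIfExists() removes the link itself, not its target.
                Files.deleteIfExists(link);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = probe();
        System.out.println("exists(): " + result[0]);
        System.out.println("createNewFile(): " + result[1]);
    }
}
```

On Linux, both values are expected to be false: the File does not exist, yet a new file cannot be created in its place.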

The order matters

In the java.io API, to create a possibly non-existent directory and ensure that it exists afterwards, you can use File#mkdir() (or File#mkdirs() if you want to create non-existent parent directories as well) and then File#isDirectory(). It is important to call these methods in that order. Let’s see what may happen if the order is reversed. Two (or more) threads performing the same operations are needed to demonstrate this case. Here, we’ll use blue and red threads.

Blue thread: isDirectory()? – no, need to create

Red thread: isDirectory()? – no, need to create

Red thread: mkdir()? – success

Blue thread: mkdir()? – fail

As you can see, the blue thread failed to create the directory. However, the directory was in fact created, so the result should be positive. If isDirectory() had been called at the end, the result would always have been correct.
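The safe order can be wrapped in a small helper. This is only a sketch; the class name, the ensureDirectory() helper and the directory name are hypothetical.

```java
import java.io.File;

public class Directories {
    // Tries to create the directory (and any missing parents) first and checks
    // isDirectory() last, so a thread that loses the creation race to another
    // thread still gets true.
    static boolean ensureDirectory(File dir) {
        return dir.mkdirs() || dir.isDirectory();
    }

    public static void main(String[] args) throws InterruptedException {
        File dir = new File(System.getProperty("java.io.tmpdir"), "ensure-dir-demo");
        Runnable task = () -> System.out.println(
                Thread.currentThread().getName() + ": " + ensureDirectory(dir));
        Thread blue = new Thread(task, "blue");
        Thread red = new Thread(task, "red");
        blue.start();
        red.start();
        blue.join();
        red.join();
        dir.delete();
    }
}
```

Whichever thread wins the race, both should report true, because the loser falls back to the isDirectory() check after its mkdirs() call returns false.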

The hidden limitation

The number of files that a given UNIX process can have open at the same time is limited by the value of RLIMIT_NOFILE. On Android, this is usually 1024, but effectively (after excluding file descriptors used by the framework) you can use even fewer: in tests with an empty Activity on Android 8.0.0, approximately 970 file descriptors were available. What happens if you try to open more? Well, the file won’t be opened. Depending on the context, you may encounter an exception with an explicit reason (Too many open files), a somewhat enigmatic message (e.g. This file can not be opened as a file descriptor; it is probably compressed) or just false as a return value where you would normally expect true. See the code demonstrating these issues:
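A plain-Java sketch of the first symptom only; it does not reproduce the Android-specific cases. It keeps opening the same file without closing it until an attempt fails or a cap is reached. The class name, the openUntilFailure() helper and the cap are assumptions made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FileDescriptorLimitDemo {
    // Opens the file repeatedly without closing it, at most maxAttempts times.
    // Returns how many streams were open when the first failure occurred
    // (or maxAttempts if nothing failed).
    static int openUntilFailure(File file, int maxAttempts) {
        List<FileInputStream> open = new ArrayList<>();
        IOException failure = null;
        try {
            for (int i = 0; i < maxAttempts; i++) {
                open.add(new FileInputStream(file));
            }
        } catch (IOException e) {
            failure = e;
        }
        int opened = open.size();
        // Release the descriptors before doing anything else.
        for (FileInputStream in : open) {
            try {
                in.close();
            } catch (IOException ignored) {
                // Nothing sensible to do in a demo.
            }
        }
        if (failure != null) {
            System.err.println("Failed after " + opened + " streams: " + failure.getMessage());
        }
        return opened;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("fd-limit", ".tmp");
        try {
            System.out.println("Opened: " + openUntilFailure(file, 100_000));
        } finally {
            file.delete();
        }
    }
}
```

If the process limit is lower than the cap, the failure message will typically mention Too many open files, and the reported count will be somewhat below the limit because the runtime holds descriptors of its own.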

Note that if you save shared preferences with SharedPreferences.Editor#apply() instead of #commit(), the value will just not be saved persistently, and you won’t get any exception. However, it will be accessible until the app process holding that SharedPreferences instance is killed. That’s because shared preferences are also kept in memory.

The undead really exist

One may think that zombies, ghouls and other similar creatures exist in fantasy and horror fiction only. But… they are real in computer science! Such terms commonly refer to zombie processes. In fact, undead files can also be easily created.

In Unix-like operating systems, file deletion is usually implemented by unlinking. The unlinked file name is removed from the file system (and, if it was the last hard link, the file is no longer reachable by any path), but any already open file descriptors remain valid and usable. You can still read from and write to such a file. Here is the snippet:
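A minimal sketch, assuming a Unix-like system (on Windows, deleting an open file usually fails). The class and method names are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class UndeadFileDemo {
    // Writes content to a temporary file, opens it, deletes it while it is
    // still open, and then reads the content back through the open stream.
    static String readAfterDelete(String content) {
        try {
            File file = File.createTempFile("undead", ".txt");
            try (FileOutputStream out = new FileOutputStream(file)) {
                out.write(content.getBytes(StandardCharsets.UTF_8));
            }
            try (FileInputStream in = new FileInputStream(file)) {
                // Unlink the name while the descriptor is still open.
                if (!file.delete() || file.exists()) {
                    throw new IllegalStateException("File could not be deleted while open");
                }
                ByteArrayOutputStream result = new ByteArrayOutputStream();
                byte[] buffer = new byte[1024];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    result.write(buffer, 0, read);
                }
                return new String(result.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterDelete("still readable"));
    }
}
```

The data stays available until the stream is closed; after that, the storage can be reclaimed by the file system.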

Wrap up

First of all, remember to call the methods in the proper order when creating possibly non-existent directories. Furthermore, keep in mind that the number of files open at the same time is limited, and that not only the files explicitly opened by you are counted. Last but not least, deleting a file before its last usage is a trick that can give you a little more flexibility.