The Pythonic Wheel Reinvention

Remy escaped the enterprise world and now works as a consultant. Editor-in-Chief for TDWTF.

Starting with Java, a robust built-in class library is practically a default feature of modern programming languages. Why struggle with OS-specific behaviors, write your own code, or manage a third-party library to handle problems like accessing files or network resources?

One common class of WTF is the developer who steadfastly refuses to use it. They inevitably reinvent the wheel as a triangle with no axle. Another is the developer who is simply ignorant of what the language offers, and is too lazy to Google it. They don’t know what a wheel is, so they invent a coffee-table instead.

My personal favorite, though, is the rare person who knows about the class library, who uses the class library… to reinvent methods which exist in the class library. They’ve seen a wheel, they know what a wheel is for, and they still insist on inventing a coffee-table.

Anneke sends us one such method.

The method in question is called thus:

if output_exists("/some/path.dat"):
    do_something()

I want to stress, this is the only use of this method. The purpose is to check if a file containing output from a different process exists. If you’re familiar with Python, you might be thinking, “Wait, isn’t that just os.path.exists?”

Now, in general, most of your directory-tree manipulating functions live in the os.path module, and you can see os.path.dirname used. That splits off the directory-only part. Then they throw a glob on it. I could, at this point, bring up the importance of os.path.join for that sort of operation, but why bother?

They knew enough to use os.path.dirname to get the directory portion of the path, but not os.path.split, which can pick off the file portion of the path. The “Pythonic” way of writing that line would be (path, filename) = os.path.split(full_path). Wait, I misspoke: the “Pythonic” way would be to not write any part of this method.
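For anyone who hasn't used it, here is a minimal sketch of what os.path.split gives you. The path is a made-up example, not something from Anneke's codebase.

```python
import os.path

# Hypothetical path, chosen only for illustration.
full_path = "/some/path.dat"

# os.path.split returns a (head, tail) pair:
# everything before the last separator, and the final component.
directory, filename = os.path.split(full_path)

# The two halves are also available individually.
same_directory = os.path.dirname(full_path) == directory
same_filename = os.path.basename(full_path) == filename

print(directory, filename)
```

One call, both pieces, no string slicing required.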

'%s' % filename2 is Python’s version of printf, and I cannot for the life of me guess why it’s being done here. A misguided attempt at doing an strcpy-type operation?
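To see why that line accomplishes nothing, here is a small sketch. It assumes filename2 is already a string (the value below is invented for illustration).

```python
# Hypothetical value standing in for whatever the method was passed.
filename2 = "path.dat"

# printf-style formatting of a string with '%s' yields an equal string.
filename = '%s' % filename2

# Nothing has changed: same type, same contents.
same_text = (filename == filename2)
print(same_text)
```

Python strings are immutable, so there is nothing to protect by "copying" one in the first place.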

glob.glob isn’t just the best method name in anything, it also does a filesystem search using globs, so files contains a list of all files in that directory.
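The directory-listing step described above can be sketched roughly like this. The temporary directory and file names are assumptions made so the example runs anywhere, and os.path.join builds the pattern instead of string concatenation.

```python
import glob
import os.path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create two hypothetical output files to search for.
    for name in ("a.dat", "b.dat"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("output")

    full_path = os.path.join(tmp, "a.dat")

    # Directory portion of the path, plus a wildcard for its contents.
    pattern = os.path.join(os.path.dirname(full_path), "*")

    # glob.glob returns matching paths in arbitrary order, so sort them.
    files = sorted(glob.glob(pattern))
    names = [os.path.basename(f) for f in files]

print(names)
```

Note that a bare "*" pattern skips names starting with a dot, which is one more way a glob-based existence check can surprise you.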

" ".join(files) is the Python idiom for joining a list into a string, so we turn the list of files into one space-separated string and search it using re.findall… which uses a regex for searching. Note that they’re using the filename for the regex, and they haven’t placed any guards around it, so if the input file is “foo.c”, and the directory contains “foo.cpp”, this will think that’s fine.

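A quick sketch of that false positive, using an invented directory listing. The file names are hypothetical, and the exact-match comparison at the end is just one possible guard, not anything from the original code.

```python
import os.path
import re

# Hypothetical result of glob.glob on some directory.
files = ["/some/dir/foo.cpp", "/some/dir/fooxc"]
filename = "foo.c"

# The approach described above: join everything, then regex-search it.
haystack = " ".join(files)
unguarded = re.findall(filename, haystack)

# 'foo.c' is found inside 'foo.cpp', and because '.' matches any
# character in a regex, 'fooxc' counts as a hit too.
print(unguarded)  # ['foo.c', 'fooxc']

# Comparing base names exactly finds nothing, which is the right answer.
exact = [f for f in files if os.path.basename(f) == filename]
print(exact)  # []
```

Neither file is actually named foo.c, but the regex version reports two matches.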
And then last but not least, it returns the array of matches, relying on the fact that an empty array in Python is false.
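The truthiness trick that makes this work as a boolean can be shown in a few lines. The helper name below is hypothetical and exists only for this illustration.

```python
def describe(matches):
    # An empty list is falsy; any non-empty list is truthy.
    return "found" if matches else "missing"

empty_result = describe([])
hit_result = describe(["path.dat"])

print(empty_result, hit_result)
```

It works, but a function named like a yes-or-no question that hands back a list of regex matches is its own small surprise.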

To write this code required at least some familiarity with three different major modules in the class library: os.path, glob, and re. But just one more ounce of familiarity with os.path would have replaced the entire thing with a simple call to os.path.exists. Which is what Anneke did.
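For completeness, a minimal sketch of what the replacement might look like. The temporary directory and the do_something placeholder are assumptions made so the example is self-contained; the real call site simply used a fixed path.

```python
import os.path
import tempfile

def do_something():
    # Hypothetical stand-in for whatever the real code did next.
    return "processed"

with tempfile.TemporaryDirectory() as tmp:
    output_path = os.path.join(tmp, "path.dat")

    # Before the other process has written its output.
    missing_before = os.path.exists(output_path)

    # Simulate the other process producing the file.
    with open(output_path, "w") as fh:
        fh.write("output")

    # After: one library call, no glob, no regex.
    present_after = os.path.exists(output_path)
    result = do_something() if present_after else None

print(missing_before, present_after, result)
```

If the check should succeed only for regular files and not directories, os.path.isfile is the stricter option.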