Deploying code to production can be filled with uncertainty. Reduce the risks, and deploy earlier and more often. Download this free guide to learn more. Brought to you in partnership with Rollbar.

Every so often you will find yourself needing to write code that traverses a directory. They tend to be one-off scripts or clean up scripts that run in cron in my experience. Anyway, Python provides a very useful method of walking a directory structure that is aptly called os.walk. I usually use this functionality to go through a set of folders and sub-folders where I need to remove old files or move them into an archive directory. Let’s spend some time learning about how to traverse directories in Python!

Using os.walk

Using os.walk takes a bit of practice to get it right. Here’s an example that just prints out all your file names in the path that you passed to it:

By joining the root and the file_ elements, you end up with the full path to the file. If you want to check the date of when the file was made, then you would use os.stat. I’ve used this in the past to create a cleanup script, for example.

If all you want to do is check out a listing of the folders and files in the specified path, then you’re looking for os.listdir. Most of the time, I usually need drill down to the lowest sub-folder, so listdir isn’t good enough and I need to use os.walk instead.

Using os.scandir() Directly

Python 3.5 recently added os.scandir(), which is a new directory iteration function. You can read about it in PEP 471. In Python 3.5, os.walk is implemented using os.scandir “which makes it 3 to 5 times faster on POSIX systems and 7 to 20 times faster on Windows systems” according to the Python 3.5 announcement.

Scandir returns an iterator of DirEntry objects which are lightweight and have handy methods that can tell you about the paths you’re iterating over. In the example above, we check to see if the entry is a file or a directory and append the item to the appropriate list. You can also a stat object via the DirEntry’s stat method, which is pretty neat!

Wrapping Up

Now you know how to traverse a directory structure in Python. If you’d like the speed improvements that os.scandir provides in a version of Python older than 3.5, you can get the scandir package on PyPI.

Deploying code to production can be filled with uncertainty. Reduce the risks, and deploy earlier and more often. Download this free guide to learn more. Brought to you in partnership with Rollbar.