Describr

Describr, the file inspector

So, tell me about your file…

Our CMS product, Amaxus, allows file uploads as Media File objects. We found that a common requirement was tagging of media files, so we looked into automating it. The result was Describr, a PHP library that can describe any file!

What does it do?

For example, Describr can:

Tell you the size of a file, both in bytes and more informally (e.g. “large”)
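As a rough illustration of the size example, the sketch below reports a size in bytes alongside an informal label. The function name `describeSize`, the labels and the thresholds are all assumptions made for this example; Describr's own cut-off points are not stated here.

```php
<?php
// Hypothetical helper: name, labels and thresholds are illustrative
// assumptions, not Describr's actual behaviour.
function describeSize(int $bytes): array
{
    if ($bytes < 0) {
        throw new InvalidArgumentException('Size cannot be negative');
    }

    if ($bytes < 100 * 1024) {              // under 100 KB
        $label = 'small';
    } elseif ($bytes < 5 * 1024 * 1024) {   // under 5 MB
        $label = 'medium';
    } else {
        $label = 'large';
    }

    return ['bytes' => $bytes, 'label' => $label];
}

// Usage: measure a throwaway file on disk.
$path = tempnam(sys_get_temp_dir(), 'size');
file_put_contents($path, str_repeat('x', 2048));
clearstatcache(true, $path);                // make sure filesize() is fresh
$info = describeSize(filesize($path));
unlink($path);
```

Keeping the raw byte count next to the label means a caller can apply its own thresholds if these do not suit.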

Where can I get it?

If you want the unit tests, test data and command line scripts, you can grab a copy from our GitHub. If you would like to contribute, that’s great: you can fork Describr on GitHub!

How does it work?

When given a file, Describr asks each of its plugins in turn: “can you tell me anything about this file?” If a plugin can help, it runs and returns its results.

Figure: How Describr works

We have written a number of plugins for Describr, which come bundled, but you can also make your own. If you make a plugin, it would be great if you could fork the project on GitHub and send us a pull request. We’re keen to work with the community!

Each plugin declares two lists: the MIME types it can handle and the file extensions it can handle. When it is asked about a file, it therefore has two ways of knowing whether it can tell Describr anything about that file. We found that providing both approaches was more robust. Sometimes a file’s extension misreports its type, so matching by MIME type is more accurate; sometimes MIME type detection fails, so matching by extension is more reliable.
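The two-way matching described above can be sketched as follows. This is a minimal sketch, not Describr's real plugin API: the `FilePlugin` interface, the toy `LineCountPlugin`, and the functions `pluginMatches` and `describeFile` are all hypothetical names invented for illustration. MIME detection uses PHP's `mime_content_type()`, which needs the fileinfo extension, so the sketch falls back to an empty MIME type when it is unavailable and lets the extension match carry the decision.

```php
<?php
// Hypothetical plugin contract; Describr's real interface may differ.
interface FilePlugin
{
    /** @return string[] MIME types this plugin claims to handle */
    public function mimeTypes(): array;

    /** @return string[] lower-case file extensions, without the dot */
    public function extensions(): array;

    /** @return array attributes discovered about the file */
    public function describe(string $path): array;
}

// Toy plugin used only for illustration.
final class LineCountPlugin implements FilePlugin
{
    public function mimeTypes(): array { return ['text/plain']; }
    public function extensions(): array { return ['txt']; }

    public function describe(string $path): array
    {
        $lines = file($path, FILE_IGNORE_NEW_LINES);
        return ['lines' => $lines === false ? 0 : count($lines)];
    }
}

// A plugin is a candidate if EITHER the MIME type or the extension matches.
function pluginMatches(FilePlugin $plugin, string $mime, string $ext): bool
{
    return in_array($mime, $plugin->mimeTypes(), true)
        || in_array($ext, $plugin->extensions(), true);
}

// Ask every plugin in turn and collect the answers, keyed by class name.
function describeFile(string $path, array $plugins): array
{
    // MIME detection needs the fileinfo extension; fall back to '' without it.
    $mime = function_exists('mime_content_type')
        ? (string) @mime_content_type($path)
        : '';
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    $results = [];
    foreach ($plugins as $plugin) {
        if (pluginMatches($plugin, $mime, $ext)) {
            $results[get_class($plugin)] = $plugin->describe($path);
        }
    }
    return $results;
}
```

Because the match is an OR, a `.txt` file whose MIME type is detected as something unexpected still reaches the plugin, and so does a correctly detected text file with an unusual extension.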

The audio/video plugins are essentially wrappers around the php-reader project: each plugin wraps one or more php-reader classes and reports the file types and file extensions it can deal with. Similarly, the image plugin wraps the PHP GD library.

The results of running Describr are presented in the MediaFileAttributes class. This class can be interrogated to say “what did the Text plugin say about this file?” or “what did the MPEG plugin say about this file?”. You can also simply call “toArray” to get all the results as an associative array.
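To make the idea of an interrogable results object concrete, here is a small stand-in. It is not the real MediaFileAttributes class: the name `AttributeBag` and the `add` and `forPlugin` methods are assumptions for this sketch, and only the `toArray` name comes from the description above. It needs PHP 7.4 or later for the typed property.

```php
<?php
// Hypothetical stand-in for a results container; the real
// MediaFileAttributes class may expose a different API.
final class AttributeBag
{
    /** @var array<string, array> results keyed by plugin name */
    private array $results = [];

    // Record what one plugin reported.
    public function add(string $plugin, array $attributes): void
    {
        $this->results[$plugin] = $attributes;
    }

    // "What did this plugin say about the file?" Null if it said nothing.
    public function forPlugin(string $plugin): ?array
    {
        return $this->results[$plugin] ?? null;
    }

    // All results as one associative array.
    public function toArray(): array
    {
        return $this->results;
    }
}

// Usage with made-up plugin names and values.
$bag = new AttributeBag();
$bag->add('Text', ['lines' => 12]);
$bag->add('Image', ['width' => 800, 'height' => 600]);
$textResult = $bag->forPlugin('Text');
$everything = $bag->toArray();
```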

Describr in action

Let’s take a look at an example implementation of Describr.

In Amaxus, files are uploaded either individually or through our batch uploader, and they are tagged automatically as they arrive. For example, the dimensions of graphics such as JPEG files are determined, and the file is tagged as “portrait”, “landscape” or “square”. This is done by running Describr on the file and turning the results into tags:

Figure: How Describr is used in Amaxus
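The orientation tagging can be illustrated with a short sketch. The function names `orientationTag` and `orientationTagsFor` are hypothetical, and the sketch reads dimensions with PHP's built-in `getimagesize()` rather than going through Describr, so it shows the idea only and not the Amaxus code.

```php
<?php
// Hypothetical helper: maps pixel dimensions to one of the three tags.
function orientationTag(int $width, int $height): string
{
    if ($width <= 0 || $height <= 0) {
        throw new InvalidArgumentException('Dimensions must be positive');
    }
    if ($width === $height) {
        return 'square';
    }
    return $width > $height ? 'landscape' : 'portrait';
}

// Reads the dimensions with getimagesize(); returns an empty list
// when the file is not a recognised image.
function orientationTagsFor(string $path): array
{
    $size = @getimagesize($path);
    if ($size === false) {
        return [];
    }
    // Index 0 is the width and index 1 is the height, in pixels.
    return [orientationTag($size[0], $size[1])];
}
```

Calling `orientationTagsFor()` on an uploaded file would give a list that can be merged with tags from other sources; a non-image simply contributes nothing.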

We also use Describr to extract the file size, predominant colour and type of each image.

Conclusion

We hope you find Describr useful, or at least interesting! There is a lot of scope to add plugins and integrate it into projects, so feel free to fork Describr on GitHub and hack away!
