AppleScript for Making Clipboard Contents Filename Ready

Submitted by nathanieltroutman on Thu, 09/09/2010 - 15:46

Today I wanted to go through my downloads folder and clean things up a bit. I have a lot of random PDFs in that folder with names like "fulltext(1).pdf" and "fulltext(9).pdf" (thanks IEEE Xplore, or was that ACM?) and others with equally informative names like "S1350482701003061a.pdf". Clearly these simply won't do. What should the filename be? Well, the paper's or document's title, of course! But if you open the PDF and copy the title of the paper, you can't paste it as the filename, because it might have undesirable characters in it, especially newlines. I'm rather picky about my file names: I don't want any characters other than a-z, 0-9, . (period), _ (underscore), and - (dash) in them. One could simply retype the title, removing all the offending characters, but that is really tedious, especially with massive titles. So what is one to do? Use AppleScript to sanitize the contents of the clipboard so that you can paste it as the file name.

I tried several different techniques. Initially I wanted to use sed or tr or some other straightforward Unix tool, but pbpaste wasn't cooperating: it simply wouldn't show anything. After reading up on pbpaste, it turns out it's rather limited in the types of content it can paste from the clipboard, and apparently the content I was testing with wasn't one of them.

Hence, I opted for a pure AppleScript approach . . . why doesn't AppleScript have string manipulation functions? I mean, really?! So after googling around for a bit I ended up with Satimage's AppleScript library for string manipulation. A few regexes later and I had an acceptable script. Since I use Quicksilver, it's simple for me to copy text, run the AppleScript, and paste the result into the filename.

#
# Cleans up the contents of the clipboard such that they
# can be used as a file name.
#
# Note: This requires the Satimage AppleScript library from:
# http://www.satimage.fr/software/en/downloads/downloads_companion_osaxen.html
#
# @author: Nathaniel Troutman
# @date: 2010-09-09
#
set sometxt to «class ktxt» of ((the clipboard as text) as record)
set sometxt to lowercase sometxt
# we want spaces, new lines, and carriage returns
# turned into underscores
set sometxt to change "[ \\r\\n]+" into "_" in sometxt with regexp
# remove leading and trailing underscores
set sometxt to change "^[_]+|[_]+$" into "" in sometxt with regexp
# get rid of all the other junk that might be in the string
set sometxt to change "[^a-z0-9_\\-.]" into "" in sometxt with regexp
set the clipboard to sometxt
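
For anyone without the Satimage library, the three regex steps in the script above can be tried out on their own. What follows is only a rough Python sketch of the same sequence, not part of the original script: the function name filename_ready and the sample title are made up for illustration, and it works on a plain string rather than reading or writing the clipboard.

```python
import re


def filename_ready(text: str) -> str:
    """Hypothetical helper mirroring the AppleScript steps above."""
    text = text.lower()
    # spaces, new lines, and carriage returns become underscores
    text = re.sub(r"[ \r\n]+", "_", text)
    # remove leading and trailing underscores
    text = re.sub(r"^_+|_+$", "", text)
    # drop everything except a-z, 0-9, underscore, dash, and period
    text = re.sub(r"[^a-z0-9_\-.]", "", text)
    return text


# Made-up title with a line break and punctuation in it
print(filename_ready("Deep Learning:\nA Survey (2010)"))
# prints deep_learning_a_survey_2010
```

Because the steps run in the same order as in the script, an underscore that sits next to a removed character stays put, so a title like "Methods & Tools" would come out with a doubled underscore.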

Comments

Alerted to it by Jason of REU last summer, I started using Mendeley. It might not dodge your offending characters, but it does great PDF renaming for me. I've finally tamed my paper downloads. Something you might want to look at quickly, at least.