The Obfuscated Code section contains some interesting code. The problem is that without working your way through the code it is difficult to decide whether it is any good. This is probably why obfuscated code which is formatted into a recognisable shape often gets more votes than visually mundane but superior code.*

However, it is unlikely that anyone has the time to work through all of the code posted in Obfuscated Code in order to determine the quality. Therefore, it would be nice if people took it upon themselves to take an occasional submission and deobfuscate it.

Apart from the benefits to the community at large, deobfuscation is a good way to learn some of the more idiomatic or obscure features of Perl.

Perltidy or the indent-region function in emacs can be helpful in unearthing hidden code structures. And since perl is blind to obfuscation B::Deparse can be useful for clarifying code segments.
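
As a rough illustration of the B::Deparse suggestion, here is a minimal sketch that asks perl to print a small sub back the way it parsed it. The sub and its variable names are made up for the example; they are not taken from any posted obfu.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical obfuscated sub: q;hello; is ordinary single-quoting
# with ';' as the delimiter, which is easy to misread.
my $obfu = sub { my $greeting = q;hello;; return $greeting };

# Ask perl to print the code back the way it parsed it.
my $deparsed = B::Deparse->new->coderef2text($obfu);
print $deparsed, "\n";
```

The exact layout of the deparsed text varies between perl versions, but the odd quoting should come back as a plain single-quoted string. From the command line, perl -MO=Deparse file.pl does the same for a whole script.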

However, if the obfuscation is good these tools will be of limited use. I once used Perltidy and emacs to massage Dominus's obfu into a readable shape. It didn't help. While the overall structure was visible the mechanism was still deeply hidden.

You are right, coming up with spoilers for most obfuscations is a tough job. Unfortunately, even with ~ 500 XP left to my sainthood, I still am not able to de-obfuscate much of the code submitted to the Obfuscated Code section. I guess this requires special skill and much practice, not to mention countless hours of meditation.

I once read an excellent post by japhy titled Japhy's Obfuscation Review. I wish we had more similar posts by serious obfuscation freaks (aka gurus ;)! Although I learnt quite a few neat tricks from japhy, I'm still far, far away from being able to tackle or come up with my own obfuscations that would be hard to break. In fact, the closest I came to writing an obfuscation is the code you see in my signature. And, of course, it still pales in comparison to some of the obfus you've mentioned ;/. I also didn't get many ++ votes when I submitted this code to the Obfuscated Code section, titled System admin the obfuscated way. (*hint* *hint* hope you drop a few extra ++ in the bucket ;-))

What I think would be useful, in addition to your idea, is to have more authors of original obfuscations to submit a link somewhere in their post to a spoiler page. Then, if anyone is interested in getting to the bottom of a complex obfu, he/she may simply hit the link and follow to the spoiler page.

Three reasons to deobfuscate vladb's signature: fun and profit, as
pointed out by jmcnamara; vladb monkself said "What I think
would be useful, in addition to your idea, is to have more authors
of original obfuscations to submit a link somewhere in their post
to a spoiler page" but the only spoiler I could find relating to
the sig was in the original post; finally, vladb said
that he is still "...not able to de-obfuscate much of the code..."
Me neither, and I need to start somewhere! :)

At first glance, line 1 seems to set $", the quoted-array-separator,
to the letter 'q'. The tinkering with $" makes me think that vladb
is going to use an array somewhere later on; and thinking that
$"=q; meant $"='q';, one might start thinking
about the upcoming array...

I'm kind of slow with coding, so I just ran a quick oneliner to see whether
the first line really does what I thought:

This is surprising; obviously two things are happening here:
1. $"=q; is actually using the q operator to quote, umm,
something...
2. For some reason, when $" is set to that, err, something, the second
print statement fails.
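
A minimal sketch of what q does with a semicolon as its delimiter, which may be what is going on here. The strings and variable names below are made up for illustration; they are not vladb's.

```perl
use strict;
use warnings;

# q followed by ';' starts a single-quoted string that runs to the next ';'.
my $sep = q;-;;                        # same as: my $sep = '-';

# Anything up to the next ';' is swallowed as string data, code included.
my $swallowed = q; print "hidden\n";;  # a string, not a print statement

my @a = (1, 2, 3);
my $joined = do { local $" = $sep; "@a" };

print "sep=[$sep] joined=[$joined]\n";
print "swallowed=[$swallowed]\n";
```

So a statement that appears to follow q; on the same line can end up inside the string instead of being executed.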

Line 1 confused me! B::Deparse gets a lot of mention in the
monastery; maybe it can help me here.

Cut down a tree with a herring? Sure, I'll try, but only if it's red...

====

If I had been running the modified signature as I, um, modified it,
I would've caught my mistake sooner. As it is, vladb's misdirection waylaid me for an hour (actually, I gave up, but while eating lunch figured out my mistake). But this time gave me a chance to read up on $" and $, in perlvar.

I'm a big fan of $", actually; I do this in oneliners a lot:
$"=$/; print "@a\n"; which prints the elements of
@a on their own line. ($/, incidentally, is the "input record
separator"; the default character is \n.)

But I don't use $, very much at all. This turns
out to be useful as well: if one has code such as
$, = '|'; print $foo, $bar, $baz, "\n" then one can
generate nicely formatted (in this case, pipe-delimited) lines
without having to muck around with the equivalent printf statement.
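
Here is a small sketch of the two separators side by side. The data is invented, and the in-memory filehandle is only there so the printed result can be inspected.

```perl
use strict;
use warnings;

my @a = qw(alpha beta gamma);

# $" goes between array elements when the array is interpolated in a string.
my $one_per_line = do { local $" = $/; "@a\n" };

# $, goes between the arguments of print; $\ is appended after the last one.
# Printing to an in-memory filehandle so the result can be inspected.
my $buffer = '';
open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
{
    local $, = '|';
    local $\ = "\n";
    print {$fh} 'foo', 'bar', 'baz';
}
close $fh;

print $one_per_line, $buffer;
```

One thing to watch: if "\n" is passed as the last argument to print while $, is '|', the separator also lands in front of the newline, so the line ends in a trailing pipe. Setting $\ instead, as above, avoids that.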

Whew! At this point, we've only looked at the first two lines of
code! Fortunately, lines 4-7 are fairly straightforward.

Line 4: Setting $" and $, to 'grep' is a clue that vladb's signature
is a Unix utility of some sort; the for ( `find ...`) clinches
it. (Uhh, not to mention the original description!)

Line 4 runs a shell command (the Unix command "find") and, for each line
that is returned, processes it according to lines 5-9.

This particular find command is going to search the current directory (and,
for some implementations of find, subdirectories) for files that match a
particular naming convention. The regex for these filenames would be
something like /^\.saves.*?~$/, if that helps you. Otherwise, here are a few
examples:

foo.saves_blah~ # no match
.saves_foo # no match
.saves_foo~ # match!
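
The three examples can be checked against that regex directly. This is only a sketch: it tests bare file names, whereas find would normally report them with a leading path such as "./".

```perl
use strict;
use warnings;

# The pattern suggested in the text for the file names find should report.
my $saves_re = qr/^\.saves.*?~$/;

my %result;
for my $name ('foo.saves_blah~', '.saves_foo', '.saves_foo~') {
    $result{$name} = $name =~ $saves_re ? 'match!' : 'no match';
    printf "%-16s # %s\n", $name, $result{$name};
}
```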

On Unix and Linux, a filename that starts with a dot (.) is a "hidden"
file, which can only be seen if you use an extra flag on 'ls' (same
function as DOS 'dir' command). So the find command is going to find a
bunch of "hidden" files that start '.saves', continue with whatever
text describes what file is saved, and end with a tilde (~). An example
might be .saves_Big_Project_backup_27~

Chances are you don't have any of these in your directory on your
machine, so the find command would return nothing. And with no data
to apply the for block to, perl just skips the block in toto.

====

Well, that's pretty boring stuff. I wonder what happens when vladb
uses this tool on his machine? Presumably the find command returns some
data, so lines 5-9 get to kick in.

Line 5: I didn't bother reformatting this; we can do so now.
s;$/;;;
In a substitution (s///), one can choose an alternate delimiting character.
This is useful if you have a lot of '/' that you are processing, and find
yourself escaping them all the time: '\/'. Consider if you wanted to
remove all '//' from a line:

s/\/\///g;

versus

s,//,,g;

Notice how much cleaner the second form is.

vladb is doing the same thing: using an alternate delimiter on his
s///. He's using ';', though, because he figures that he might be able
to catch overzealous deobfuscators out a second time (remember the "a lamo"!)
But we're on to his semicolon madness, and know immediately that line 5
is removing $/ from $_. There is no /g flag, so only the first occurrence
goes; and since $/ defaults to \n, and since vladb hasn't changed it, we
know we're really removing the first newline character from $_. find is
only going to return one newline per line of output - this makes sense -
so really line 5 is the same as
chomp;
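
A quick sketch comparing the semicolon-delimited substitution with chomp. The sample strings are invented, apart from the file name, which appears in the output quoted later in this thread.

```perl
use strict;
use warnings;

# One line as find would hand it over, trailing newline included.
my $line = "./.saves-19896-~\n";

(my $via_subst = $line) =~ s;$/;;;   # s/// with ';' as delimiter, then ';'
chomp(my $via_chomp = $line);

# With more than one newline the two differ: no /g, so only the first goes,
# while chomp only ever removes a trailing $/.
(my $two_subst = "a\nb\n") =~ s;$/;;;
chomp(my $two_chomp = "a\nb\n");

print "same result for a single line\n" if $via_subst eq $via_chomp;
```

For the one-newline-per-line input that find produces, the two are interchangeable; chomp just says so more plainly.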

====

Line 6 is a simple pattern match: perl actually lets you comment your
regexes if you want, so let's try that out.

/(.*-(\d+)-.*)$/;
becomes

/ # start of pattern match
( # begin storing into $1
.* # store any number of any character...
- # ...followed by a hyphen...
( # begin storing into $2
\d # ...any digit...
+ # ...as many as we can grab...
) # stop storing into $2
- # ...followed by another hyphen
.* # ...followed by any number of any character...
) # stop storing into $1
$ # end of the line, bub
/x; # / to terminate regex, x to allow comments
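
To see what the two capture groups end up holding, here is a sketch that runs the same pattern over one of the file names from the output quoted later in this thread. The comments inside the regex avoid $1 and $2 on purpose, because variables are still interpolated inside /x comments.

```perl
use strict;
use warnings;

my $path = './.saves-19896-~';   # name taken from the sample output in this thread

my ($whole, $pid) = $path =~ /
    (          # begin capture group 1
      .* -     # anything, then a hyphen
      (\d+)    # capture group 2: one or more digits
      - .*     # another hyphen, then anything
    )          # end capture group 1
    $          # end of the line
/x;

print "whole=$whole pid=$pid\n";
```

In list context the match returns the captures, so $whole gets what would be $1 and $pid what would be $2.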

Right away this tells me that I'd misguessed the naming convention that
vladb is using: my previous example, .saves_Big_Project_backup_27~,
wouldn't have succeeded at all: the regex says there must be a hyphen,
some digits, and a hyphen; the example actually doesn't have any hyphens surrounding the digits. (Oh well, the example served its purpose: to get me thinking about the data.)

The naming convention is probably .saves-$$-~ where "$$" is the process
id number of the program that created the save file. Putting the
process id, or pid, into a temporary file's name is useful for two
reasons: first, generally your OS doesn't cycle pids very quickly, so it's a lazy
way of making sure your temp file names are unique; second, you can
identify the owner of the temp file, and if the owner isn't running
anymore, you can remove the old file.

Line 7 made my eyes water. It looks like a shell command is being built,
but to do what? Remember that line 6 stuffed a pid into $2. Line 7 is
going to use that stored data and build a ps command that checks whether
that pid is still around.

$_=["ps -e -o pid | "," $2 | "," -v "," "];

First off, we've got what I call "the anonymous array square brackets".
(It ain't catchy but it sure helps me remember what they do.)

Where did those 'grep's come from? Remember the earlier assignment that set both $" and $, to 'grep'.

So where you see a comma in line 7, you can mentally think "grep"
instead.

But what does the @command do? Let's look.

ps -e -o pid # use the 'ps' command to look at the process stack;
# the -e flag says to look at all running processes;
# the '-o pid' flag specifies to return their process ids.
| # take the output from the previous command and use it
# as input for this next command
grep $2 # look for the pid that we found in line 6; this pid,
# remember, comes from the tempfile name, and tells us
# who the owner of $_ is.
| # take the output from the previous command and use it
# as input for this next command
grep -v grep # Right now there might be two lines in the process stack
# that have $2 in them: first is our grep line from earlier
# in this pipeline; second is the process whose pid really
# is $2. We want to ignore the grep lines; this way we avoid
# a situation where we see $2 in the process stack and think
# it's the process we're looking for when really it's just us!

In line 8 vladb will actually run this command; for now if you only take
one thing away from this, it should be this: the output of the command
will be either 0 lines of data, in which case the process isn't running,
or it will be 1 line of data, in which case the process still is running.
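
For comparison only, here is a different way to ask the same question from inside Perl. This is not what the signature does: it swaps the ps and grep pipeline for kill with signal 0, and pid_is_running is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of vladb's signature: true if a process
# with this pid exists and we are allowed to signal it.
sub pid_is_running {
    my ($pid) = @_;
    # Signal 0 sends nothing; kill just reports whether it could have.
    return kill(0, $pid) ? 1 : 0;
}

print "our own pid $$ is ", (pid_is_running($$) ? "running" : "gone"), "\n";
```

This checks the exact pid, so it avoids the substring problem of the pipeline (grep 22 also matches 12234), but it reports false for a running process owned by someone else that we may not signal.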

Remember, though, that vladb didn't want to give away the whole bag at
once, so instead of writing:

$_ = "ps -e -o pid | grep $2 | grep -v grep";

he instead wrote

$_=["ps -e -o pid | "," $2 | "," -v "," "];

And one of the consequences of this is that $_ isn't actually the full
command that we want; it's a reference to an anonymous array, and the anonymous array is what contains the real command!
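
A sketch of how the pieces come back together, assuming the array is later interpolated inside a quoted string or backticks (the pid is a stand-in, and $ref and $command are names made up for the example). In that kind of interpolation it is $" that goes between the elements.

```perl
use strict;
use warnings;

my $pid = 19896;   # stand-in for the pid captured from the file name

# Same shape as line 7: a reference to an anonymous array of command pieces.
my $ref = ["ps -e -o pid | ", " $pid | ", " -v ", " "];

# Interpolating the dereferenced array puts $" between the elements.
my $command = do { local $" = 'grep'; "@$ref" };

print "$command\n";
```

Printing the string rather than running it is a handy habit when picking an obfu apart: you get to see the shell command before anything is executed.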
====

So in line 8, when vladb actually wants to check the process stack for
those running processes, he must first dereference the array.

As TheDamian wrote in an oldish article archived at
perl.com, "...A reference is like the traditional Zen idea of the 'finger pointing at the moon'. It's something that identifies a variable, and allows us to locate it. And that's the stumbling block most people need to get over: the finger (reference) isn't the moon (variable); it's merely a means of working out where the moon is."

(n/b if you haven't searched perl.com for your favorite authors and
personalities that hang out on perlmonks: why haven't you? Many have written
articles that will improve your understanding and use of perl almost within
seconds of reading!)

The dereferencing is done in line 8 by simply tossing an at-sign, @, in
front of $_.

Like line 4, line 8 uses backticks `` to run an external command and
feed its output back into the program. We know from the discussion of
line 7 what the command is - a search of running process IDs - and what
the expected output is (either nothing or a process ID).

Line 8 also uses a ternary conditional: this is a fancy way of writing
an if-else statement in just one line.
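
A minimal sketch of the two forms side by side. The data and the status strings are invented for illustration and are not what the signature prints.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the command output: one line if the
# owning process is still running, nothing otherwise.
my @ps_output = ("19896\n");

# if-else form
my $status_long;
if (@ps_output) { $status_long = 'still running' }
else            { $status_long = 'stale' }

# ternary form: CONDITION ? VALUE_IF_TRUE : VALUE_IF_FALSE
my $status_short = @ps_output ? 'still running' : 'stale';

print "$status_long / $status_short\n";
```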

Line 9 prints $\, which, umm, defaults to nothing; here it looks
like it's being treated as a newline, though, doesn't it? I've
gotta admit: I'm not sure where $\ gets set to \n...

====

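Another thing I'm not sure of is why both $" and $, were set to 'grep'; one
of them looks like a bit of misdirection on vladb's behalf. He only builds
one array - in line 7 - and backticks interpolate the way double quotes do,
so if line 8 expands that array inside its backticks, $" is the separator
that supplies the 'grep's. In that case it is $, that never gets used for
the command.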

Hopefully this will be useful to some other monks as an example of how to start de-obfuscating. This is my first turn at writing a spoiler, and I gotta admit: it was pretty fun to figure this stuff out. Although (because?) I made a few wrong turns in my assumptions about the code, this exercise also helped me learn a little bit more about Perl. Thanks jmcnamara for the thread and vladb for the spoiler opportunity.

Line 9 prints $\, which, umm, defaults to nothing; here it looks like it's being treated as a newline, though, doesn't it? I've gotta admit: I'm not sure where $\ gets set to \n...

It looks like it's being treated as a newline... but $\ isn't being set to \n in this code. So when we execute this code it prints out the empty string. Perhaps vladb has typoed this, because the resulting print out is a little confusing without the newlines.

You'd notice the uninitialised values if you ran the code with warnings turned on:

[me]$ perl -w tmp
Useless use of single ref constructor in void context at tmp line 2.
Odd number of elements in hash assignment at tmp line 2.
Use of uninitialized value in print at tmp line 3.
+ ./.saves-22-~Odd number of elements in hash assignment at tmp line 2.
- ./.saves-19896-~Use of uninitialized value in print at tmp line 3.
Odd number of elements in hash assignment at tmp line 2.
Use of uninitialized value in print at tmp line 3.
+ ./.saves-19639-~Odd number of elements in hash assignment at tmp line 2.
- ./.saves-19896333-~Use of uninitialized value in print at tmp line 3.

Perhaps Perl doesn't like blocks here? Or doesn't in my version of Perl (v5.6.1 built for i386-linux). We can't just remove them though, because if we do then we won't print out the file names that we delete.

I can't explain the error messages but I can give an alternate line that doesn't create them:

although of course your replacement is much neater.
When we run our changed versions with warnings we get a slightly cleaner result:

[me]$ perl -w tmp
Use of uninitialized value in print at tmp line 3.
+ ./.saves-22-~- ./.saves-19896333-~Use of uninitialized value in print at tmp line 3.
+ ./.saves-19639-~- ./.saves-19896-~Use of uninitialized value in print at tmp line 3.