Wednesday, June 26, 2013

After winning the Advanced section of the 2013 Scripting Games, Mike had the good idea of re-gifting one of his prizes and started a competition in which his readers had to write the shortest possible one-liner that returns a list of PowerShell cmdlet names with no repeated characters in them.

I published a 24-character answer, which Mike has just chosen as the shortest one. So I think it is a good idea to explain here how I got there.

As you know, the Get-Command cmdlet gets all commands that are installed on your computer, including cmdlets, aliases, functions, workflows, filters, scripts, and applications. So I was going to use this cmdlet to retrieve all the others. Fine. Now what I wanted was the shortest possible alias for this cmdlet. To find it I went this way:
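A minimal sketch of one way to do this lookup, assuming Get-Alias and its -Definition parameter (the exact command used at the time may have differed):

```powershell
# List every alias that resolves to Get-Command, shortest name first.
$aliases = Get-Alias -Definition Get-Command | Sort-Object { $_.Name.Length }
$aliases | Select-Object Name, Definition
```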

Just one alias was returned, and it is only three characters long! I couldn't ask for anything better than this!

Now, as I said, Get-Command returns every type of command by default, but Mike asked for cmdlets only, so I explored the parameters of Get-Command and printed their aliases at the same time:
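One possible way to list the parameters together with their aliases is to read the Parameters dictionary that Get-Command exposes for a command. This is a sketch and not necessarily the command used originally; the variable name $params is mine:

```powershell
# Each value in the Parameters dictionary carries the parameter's Name and Aliases.
$params = (Get-Command Get-Command).Parameters.Values |
    Select-Object Name, Aliases
$params | Format-Table -AutoSize
```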

So I had to find a way to shorten CommandType or Type. As you should know, there is an incredible piece of the PowerShell interpreter named the Parameter Binder that is in charge of analyzing cmdlet parameters. This piece of code is damn smart: it does not require that you specify the full name of a parameter, as long as you specify enough of it to uniquely identify the parameter you want.

I easily found out that I could shorten CommandType to just 'C' or Type to just 'Ty' (not just 'T' because there is another 'TotalCount' parameter that starts with the same letter). Great. I had to use 'C'.
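The same prefix matching can be tried on a small function. Show-Binding below is a hypothetical helper written only for this illustration; it mimics the CommandType parameter, its Type alias, and TotalCount:

```powershell
# Hypothetical function, used only to illustrate prefix matching of parameter names.
function Show-Binding {
    [CmdletBinding()]
    param(
        [Alias('Type')]
        [string]$CommandType,
        [int]$TotalCount
    )
    "CommandType=$CommandType TotalCount=$TotalCount"
}

# -C is a unique prefix of CommandType, -To is a unique prefix of TotalCount.
$byName  = Show-Binding -C Cmdlet -To 3
# -Ty is a unique prefix of the Type alias.
$byAlias = Show-Binding -Ty Cmdlet -To 3

# -T alone could mean Type or TotalCount, so the binder refuses it.
$ambiguous = $false
try { Show-Binding -T 3 | Out-Null } catch { $ambiguous = $true }

$byName
$byAlias
$ambiguous
```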

The same goes for the parameter value: PowerShell guesses the value of the parameter as long as you give enough letters to make clear what you are looking for. As the possible values of Get-Command -CommandType are "Alias, Function, Filter, Cmdlet, ExternalScript, Application, Script, Workflow, All", we see that only Cmdlet starts with the letter 'C'. Great!

So I was able to list all available PowerShell cmdlets with an 8-character one-liner!
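The value matching can be checked on its own by casting a string to the enumeration behind -CommandType. In this sketch I use the longer prefix 'Cmd' so that the result does not depend on which members a given PowerShell version defines:

```powershell
# Show the names the enumeration defines on this machine.
[Enum]::GetNames([System.Management.Automation.CommandTypes])

# A string that is an unambiguous prefix of one name converts to that member.
$type = [System.Management.Automation.CommandTypes]'Cmd'
$type
```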

gcm -c c

Since Mike asked to return just the names of cmdlets I modified the code this way:

(gcm -c c).name

Now this was the easy part.

I spent some time analyzing the ins and outs of Get-Command, and especially the type of output it produces:
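A quick way to inspect that output, sketched here with full command and parameter names for readability ($names is just an illustrative variable):

```powershell
# Properties exposed by the objects Get-Command emits for cmdlets.
Get-Command -CommandType Cmdlet | Get-Member -MemberType Property

# Type of the Name collection and of one of its elements.
$names = (Get-Command -CommandType Cmdlet).Name
$names.GetType().FullName
$names[0].GetType().FullName
```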

So, what I had for the moment was a command which returned an array of strings.

After some reasoning, I decided to go for Select-String, which was probably the best bet for me.

The Select-String cmdlet works on streams of strings. When you pass it objects, the PowerShell engine converts those objects to strings before passing them to Select-String.

Also, Mike asked for case-insensitive matching and this is what Select-String does by default. Good for me.
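The default can be seen on a single sample string. This is only an illustration; 'Get-Job' and the pattern 'JOB' are arbitrary choices of mine:

```powershell
# Default behaviour: the upper-case pattern still matches the mixed-case input.
$default = 'Get-Job' | Select-String -Pattern 'JOB'

# With -CaseSensitive the same pattern no longer matches, so nothing is returned.
$strict = 'Get-Job' | Select-String -Pattern 'JOB' -CaseSensitive

[bool]$default
[bool]$strict
```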

At this point of the story I started to write my regular expression.

I spent a lot of time checking whether a lookahead would help. In the end I understood that my best option was to write a regex that matches every string containing a duplicated character and then let Select-String invert the result.

The regex that I wrote looks like this:

(.).*\1

which means:

(.) a numbered group which matches any character (parentheses capture the expression they contain and implicitly number it)

.* any character, repeated any number of times (including zero)

\1 a backreference to the numbered group above (the number after the backslash is the ordinal position of the capturing group in the regular expression)

Basically, for each character, I check whether the same character appears again anywhere later in the string, and if it does, a positive match is returned.
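Here is a small sketch that tries the pattern on a few names picked by hand, first with the -match operator and then with Select-String and its -NotMatch switch to keep only the names without repeats. The sample list is mine and is only there for illustration:

```powershell
$pattern = '(.).*\1'
$samples = 'Get-Job', 'Out-File', 'Get-Item', 'Set-Service'

# -match is case-insensitive by default, like Select-String.
foreach ($s in $samples) {
    '{0,-12} repeats: {1}' -f $s, ($s -match $pattern)
}

# -NotMatch inverts the selection: only strings without a repeated character remain.
$unique = $samples |
    Select-String -Pattern $pattern -NotMatch |
    ForEach-Object { $_.Line }
$unique
```

Get-Item is rejected because of its two t's and two e's, and Set-Service because 'S' and 's' count as the same character when case is ignored.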

I didn't have to handle the dash character since it always appears only once (cmdlet names are always in the form verb-noun, no exceptions).

Now, there is another interesting feature in PowerShell, which is its ability to understand where a string starts and where it ends. PowerShell definitely knows that

(gcm -c c).name|sls '(.).*\1' -N

and

(gcm -c c).name|sls '(.).*\1'-N

are exactly the same, because the regular expression is enclosed in single quotes, so whatever comes after the closing quote is another parameter. The additional space after the quote is not required, and since the shorter the better, I dropped it!

At this moment I felt I was on a lucky day and changed my solution to:

gcm -c c|sls '(.).*\1'-n

...and, believe it or not, it kept working! I was in a good mood!

Here are the returned cmdlets:

You ask what dark magic makes Select-String fetch the Name property directly? Well, I suppose it is because there are three fields returned by the default view of Get-Command: CommandType, Name and ModuleName.

If you remember, cmdlet parameters can accept pipeline input in one of two different ways: ByValue and ByPropertyName.

ByValue means that a parameter can accept piped objects that have the same .NET type as the parameter value, or objects that can be converted to that type. Since Select-String is looking for a string to bind to, the first of the three fields returned by Get-Command which is a string is Name:

That's all for my solution. Feel free to leave a comment or share if you've enjoyed my explanation! Thanks to Mike for this fun puzzle and to all the other people who participated (especially Bartek, Lee, Shay and NoHandle) for making it an interesting contest.

3 comments:

The explanation of why the connection of gcm and sls works seamlessly is IMHO a bit off. The Get-Command cmdlet emits System.Management.Automation.CmdletInfo objects. These objects are implicitly converted to strings so that the Select-String cmdlet can accept them as input. Try (gcm | select -First 1).ToString(). The type internally defines how it is converted to a string.

What do you mean by 'implicitly'? As far as I can tell, there is no conversion, since CmdletInfo is already an array of strings passed over to sls. But you are the expert and I have no idea of what happens behind the scenes.

Also, I noticed that my one-liner does not work on PowerShell V4, just on V3. Any idea? I checked the sls and gcm manuals and found no changes...