Pages

Sunday, February 19, 2012

Subscene is one of the places to go when you need subtitles. Sadly, their search interface is nigh on goatse. Double sadly, they don't provide a scriptable interface... heck, they don't even use vanilla links - they hide their downloads behind javascript callbacks, preventing fully automated downloads with simple wget/curl solutions. Yeah, I could try watir or johnson or selenium or... Blech. So... rather than writing an obscene letter to them to complain about this, I wrote obscene instead.

Usage:

obscene title date [filter] [language]

It saves the resulting filtered list of subtitle matches in a local HTML file based on the name of the given search parameters. The links within that file point to the final download pages on subscene.

I might add some more niceties to it later, like auto-loading in your default browser and better filtering options, but 80/20, this is gold.

Saturday, February 18, 2012

Hmm... no climate dogs here in China. They don't pay much attention toforecasts either. They dress to the calendar, not the weather forecast (or even the actual weather). So, it might be an unusually warm winter morning prompting Bee & I to take a few layers off before heading to the market; with everyone else sweltering beneath their padded jackets.

It's not unusual to have kids throw up in class because they're forced to sit there in their jackets (it *is* February (winter), after all).

The locals ask us with incredulity, "aren't you cold?!" when they see us in less than Arctic apparel. We feel like asking them back, "Can't you *feel* the temperature? Can't you *see* the sun? Can't you read the weather forecast? Can't you smell your armpits?" But we smile and say, "no." Of course... if we ever get a cold, boy are we in for it then... "See?! You don't wear enough clothing!" come the admonishments.

The edges of the seasons provide the most entertainment if you're into spotting Dress Deficiency Disorder - people in shorts and a T-shirt in the face of an early cold front in Autumn (affectionately called here the Autumn Tiger - he can bite both ways); or passed out in a pool of their own vomit beneath a mountain of padded clothes on an un-calendarily warm winter's day.

But no climate dogs. I shouldn't think that importing them would be a good idea... Dogs here perform far more culinary than climatic roles.

Thursday, February 16, 2012

Ignoring the prosaic question of "Why would you ever want to?!",
consider how, in your editor of choice, you would jump to the longest
line in the file.

Vim provides a lot of jump commands, possibly more so than any
other editor. We can jump to a matching item; jump between words,
sentences and paragraphs; jump between classes or #defined regions in
a source file; jump between files and back and more.

Jump-to-longest-line, however, is missing.

This forces us to craft a
solution using either existing builtin Vim commands, the ex commands,
VimL (Vim's scripting language), or (more typically) a mixture of all
three.

Three different approaches to solving this problem are compared here, each
one arguably a little further along in the "evolution" of a Vim
user.

This style of solution would be considered by capable Vim users who
have not yet ventured very far (if at all) into VimL -- limiting their
choices to just Normal Mode and Ex commands.

Although this solution is presented here in a function (for
comparative purposes with the other solutions presented later), these
sorts of solutions are typically hand-typed when needed or, at best,
briefly stored in macros.

Walkthrough

The %substitute/// prepends the length of the line and the line number as two space-delimited fields to the beginning of every line of the file.

% is a special range in Vim to mean the whole file

The '.' parameter of the getline() and line() functions means
current line in the file.

The Ex command sort n does a numeric sort of the whole
file -- putting the longest line at the bottom of the file.

The series of Normal mode commands on line 4 (Gwyiwu) look like a
Gaelic curse, but they really mean:

G = Go to the end of the file
Note: the cursor will already be in column 1 due to the preceding commands.

w = Move forward one word (jumping over the line length)

yiw = Yank the next inner word (the line number before sort)

u = undo (the sort and line modifications. If this were in a
macro or the commands were issued manually, you would need two
undo commands here.)

Note: the ! at the end of normal! prevents Vim from using user-provided
normal mode maps when executing the command. This should be the default when
you write your own VimL - only use the bare normal when you need to.

The silent keyword before the substitute and normal commands is used to suppress informational feedback from those commands that are unnecessary to our purposes here.

The end result is that the longest line number is now in the Unnamed
register @" and ready to be used in a jump command like:

:<c-r>"<CR>

If you're wondering if this is the messiest solution possible, then
know that it isn't. It's close though.

Approach #2: Procedural VimL

If you're a programmer of C (or java/perl/python/ruby) descent then this
will feel like more familiar ground to you. How would a C programmer
think about the problem at hand?

Input: <nothing (additional)>Output: line number of the longest line in the fileProcessing:

That has to be better, right? For one, we're NOT modifying the
buffer - always a good thing when you don't need to. Secondly, we're
not messing with the registers - we don't lose whatever was in the
unnamed register like we did with the LongestLine_Ex version.

Okay, so it's a tad longer in SLOC than the LongestLine_Ex
version, and could still benefit from a drop or two of optimisation
yet (removing the extra call to len(), for instance)... but it's
certainly no worse than our previous attempt.

Walkthrough

There isn't too much to explain here, except:

len() returns the length of a string, as the name suggests.

getline(1, '$') returns all the lines in the file as a list (the
'$' parameter means last line in the file.)

Note:getline(1) differs from getline(1,2) -- in the first case, a
string is returned containing the requested line, but in the second case, a
list of strings is returned. Earlier we used the form getline('.')
(where '.' means the current line) which is the single argument form
and therefore returns a string.

Is this as good as it gets? While we're feeling all sort of warm and
comfortable, gloating over our achievements, the emacsians are laughing
at us from their REPLs. Avast! We need not cringe at their taunts
any longer. We now have powers in VimL equal to the task and packed
with the sort of expressiveness that will raise an eyebrow of even the
most ardent Functional Programmer.

Approach #3: Functional(ish) VimL

Sometimes as (procedurally indoctrinated) programmers we think too much
in the How of things rather than seeing the What. We're too close
to the trees to see the forest, or in coderthink, too close to the code
to see the abstractions. When we start thinking about a problem like
this our problem solving hammer starts banging away at a solution,
cranking out gobs of loops and conditions and assignments. Before
we've even begun to think of the bigger picture, our minds are fussing
over the minutia -- "should I use a while loop here or a for loop?"

The cure for this begins with the mantra: Don't Use Loops!

Ok, so that might be a bit strong, but it might be just what's
necessary to break the habit you're in of reaching for How pieces
before you've fully digested the What.

This philosophy is succinctly reduced to the pithy aphorism that, "if
you have a dog you shouldn't do your own barking". If you have lists
and functions that process those lists... don't write your own loop
code.

Lispers and other Functional thinkers approach problems in a different
way. They don't worry about how to iterate the elements of a list, or
oftentimes that they're even iterating it. They think about the
deeper abstractions of manipulating and shaping the data from source
to target. They do this by thinking of functions to apply to the
elements of lists. Languages that support this type of programming
(thinking) provide numerous functions that can operate on whole lists
and the elements of lists. Functions to apply another function to each
element, collecting and returning the resulting elements in a new
list; functions to remove elements from a list that match (or don't)
an expression; functions to sort lists (using definable comparators if
necessary); functions to reverse lists; functions to split strings
into lists and join lists into strings, and more...

Clearly there are times when you need to write loops -- the
point is, be on the lookout for times when you shouldn't. Look for
patterns that walk and talk like a list. Start asking yourself, "Can
I solve this through a series of operations on a list?" and "If this
data were in a list, could I join(sort(map(filter(split(...))))) the
sucker to get what I want?"

Eventually, you might even start thinking in S-Expressions all the
time...

The petulant proceduralist within you might be grumbling now that all
we've done is hide the looping behind a layer of functions. "We could
do that in [our procedural] language too! In a library!" It's not
that we're merely hiding the toys under our bed here. The point is that
someone would still have to write that (procedural) library and all
the messy looping therein. In Functionally Friendly languages, this
goodness is already baked right in, ready for your lists from the
get-go.

Don't be deceived by the small SLOC count. This version packs some
conceptual punch. Remembering that we're taking a lisp-y list approach
in this version, let's first talk about map().

Anyone familiar with lisp or any of the modern hip languages
(perl, python and ruby just to name a few) will know that map()
applies a function to every element of a list and then returns each
so-modified element in a new list.

Using map in Python: Collecting line lengths for a file:

file = open("somefile.txt")
lines = map(len, file.readlines())

The lines list will now contain not the actual text lines of
somefile.txt, but the corresponding line lengths for each line in
that file.

Note: Most languages, like python, perl and even lisp use the following
signature for the map() function:

map(function, list)

But in VimL it is the reverse:

map({expr}, {string})

Where:

{expr} is a list (or a dictionary - but we won't worry about that
here), and

{string} is 'evaluated' for each element of {expr}

Note:v:val is Vim's way of referring to the current element of
the list within the evaluated string.

In LongestLine_newVimL, the function being applied is len().
So, all said, line 1 creates a list of line lengths for each line in
the file. The resulting lines list will have as many entries as
there are lines in the file, each entry corresponding to that line's
length (exactly as the python example earlier did).

Which brings us to the second and final line:

return index(lines, max(lines))+1

This might take some mind bending to see what's happening, so we'll
break it down:

the max(lines) function will return the maximum (longest) line
length from the lines list. Great. That's a number, but not
the number we want. We don't want to know how long the longest
line is... we want to know on which line that longest line is. That
is, we want the line number.

the index(lines, <number>) function returns the position within the
lines list that contains <number> (remembering here that
<number> is the length of the longest line as returned by
max().)

Recall that the lines list contains an ordered (in the same
order as the original file) list of line lengths. Line 1's length is
in position 0 (VimL uses zero-based lists, as you would expect), and
line 2's length is in position 1, etc. The longest line is in position
index(lines, max(lines)) +1 (the +1 being necessary to allow
for zero-based indexing.)

Note: Of course, if there is more than one maximally long line in the
file then this function (and the LongestLine_VimL version) will
return the (original line-order-ly) first one. Due to the use of
:sort, the LongestLine_Ex version will return the last such
line.

So, that's newVimL. You might be wondering why I call it newVimL. It's
inspired by my newfound interest in the newLisp language
(http://www.newlisp.org[]). There is a (sickening to the initiated)
adage that "Learning lisp will forever change your thinking as a
programmer, even if you never *use* lisp in anger." (Some might argue
that they *only* use lisp in anger.) I'm here to say: yep. Worked for
me. After glimpsing the mind-bending (there is no spoon, after all)
Ways of Lisp... My approach to VimL changed dramatically. "What?! I
have to loop?! No! Where's my map()?! Oh! VimL has map()?! And
filter()! OMFG!" </enlightenment>

Credit where due...

Just over a year ago, while hanging out on #vim, as all the cool
kids do, someone asked for something and I sprang into action. "I can
do this!" I said hubrisly. I crafted a solution in my awkward
procedural VimL and pastie'd it to the channel. It worked. I was a god
among vimmers. The angelic choir hymned. Life was good... Until
godlygeek crushed me with a one-liner of near-indecipherable newVimL.

It had join and split and map and filter all wrapped mischievously,
one around the other, performing some arcane dance I, at that time,
couldn't fathom. It worked though. Whatever he'd conjured, it actually
worked. It looked to me like... fucking magic.

This was all the motivation I needed. A good and thorough pantsing
often does that, happily. And that's the story of how I started to
drag my consciousness up to the level of Lisp. I'm not there yet, by a
long shot. But I'm far enough along now to start to know how much I
really don't know, and to be able to help others find the path too.

Brave newVimL World

So... welcome to the brave newVimL world. If you find a new and exotic
flower along the path, share it with your fellow travellers. If you
find a thorn, share that too - who are we, after all, to decide which
one holds more value?