I haven't studied the algorithm or the implementation, but just using it suggests it is O(n^2) or worse.

Is this just the nature of the LCSS problem, or is there a better implementation around?

Are there any fast approximation algorithms for LCCS? That is, if I can settle for a pretty-long-but-not-necessarily-the-mathematically-longest common substring ("PLBNNTMLCSS" vs "LCCS"),
are there good heuristics around?

I'm not looking to write a generalized heurisitic around this (tabu or simulated annealing or whatever) at this point.

Maybe Algotithm::Diff will perform better. It isnt C but always seems to perform its job well and fast. If you are trying to do this on gene sequences you probably want to do it in C - once you have found an efficient algoritm. It does LCS native and can be easily made to do LCSS. Yes LCS and LCSS are different. The lcss method returns all the common substrings, unsorted. Just iterate through the list and remeber the index of the longest (much more efficient than a sort). You would want a Schwartzian transform on the sort by length if you want to use the multiple results (but if you want more than one LCSS its kinda a contradiction in terms).

NB: Edge case where two (L)CSS are same length second ignored in my simple loop.

A quick look at the source shows nested for loops three deep, with the outer two iterating the length of each string. My eyeball guess is that the time complexity is (N*M**2)/2. The coding uses C style loops, and lexical $a, $b are used as temporary variables. A prime example of C written in perl.