BUU has asked for the
wisdom of the Perl Monks concerning the following question:

The title is the question. How would you write some code to find the longest palindrome from a string? Note that the entire string *may* or *may not* be a palindrome, you must find a substring from said string that is a palindrome. Also note that there are no "special characters" that you should ignore. So a palindrome is just the reverse of the string. So "foo bazzaboof" is not a palindrome because of the space.

Bonus points for speed, readability and general style. More bonus points for writing it in the form of a subroutine so I can benchmark them all. Name the sub after your self.

Update to respond to the criticism:
Yes, I did get this problem from the job posting on jobs.perl.org. Yes I had every intention of eventually applying for that job. I had no intention of ripping off people's code to get said job. If I had any intention of using replies to this node for that job, I would have mentioned it. My intentions were simply to share a fun puzzle I found with this community I have grown to love.

Looking at just the facts, in the worst light, I can see how limbic~region, merlyn and others might misinterpret the facts to reach a damning conclusion. I can only say that when I wrote this node, the job had nothing to do with it, so I didn't bother to mention it. That is how my mind works. I can understand why people would have problems with me not mentioning the source, and in retrospect I probably should have credited the source for the problem. Also considering where I got the problem, perhaps I should have waited longer or asked before I "published" the problem. I did not consider the affect this would have on etcshadow and his boss when I wrote this node, and I have already apologized to etcshadow, later on in this thread. I apologize for any impact this may have had on his business.

Having expressed my apologies, I want to reiterate that I have never, nor shall I ever, do something unethical to gain an advantage when applying for a job. I hope my future contributions will make this clear.

Update2, minor formatting changes

Update3, more formatting changes, slightly less inflammatory. Original commented.

SHAME ON YOU!

When you mentioned in #perl last night that you intended to make a SoPW post about this, I assumed you would have also mentioned that it was for a job application. I haven't had a chance to refine my non-regex solution on my scratch pad, but here it is for the record.

Update: While I do not know if you intended to use any of the responses to your advantage in applying for the job, I do know you got the problem from the ad as you gave me the link in IRC. Additionally, you indicated you intended to apply for the job. Shame on me for not wording my admonishment better.

Update 2: Corrected another logic flaw pointed out by ccn, which makes it even slower :-(

Methinks this might be a slight over-reaction (Update: or might have been, based on your original text), given that BUU did post his original solution to the problem. He may well have intended to go to the interview, describe his code, and then say "here are some alternatives that I solicited on Perl Monks", which is a valid thing to do when trying to solve a job-related Perl problem. One could say that he is showing initiative. Sure, it would have been polite to point this out in the OP. Maybe I'm too forgiving, but I always try to see the positive side.

But no more. I voted "keep" in your consideration for deletion. Deleting BUU's node alone would be ineffective as it leaves all the replies with solutions lingering, but these replies are valid contributions in their own right and deletion is not justified merely because they're in reply to a deceptive node. Further, any possible damage is already done at this point and cannot be averted by deleting the parent node, which would rob the replies of their context. An editorial amendment of the node might be called for, but deleting it would be counterproductive.

I HAD NO INTENTIONS OF EVER STEALING PEOPLES CODE TO GET A JOB

. Do you get it now? I DO NOT WANT TO CHEAT TO GET THE JOB. I just *assumed* other people would enjoy solving this problem, since I enjoyed writing the code.

Shame on you merlyn, for the same reasons as limbic~region. You have *no* evidence that I have ever cheated to get a job (because I never have or will). All you have is some unfounded slander from limbic~region saying that "I want to cheat" or some such, and you launch off to critically castigate me, not only here, but posted to use.perl.org, a much more public site, WITHOUT EVEN CONSIDERING THAT THERE BE MORE TO THE ISSUE. Now I have large amounts of slander over something I HAVE NEVER DONE, WILL NEVER DO, AND HAD NO INTENTION OF DOING, and some people will probably never believe it, all because of some baseless slander.

SHAME ON YOU LIMBIC~REGION

Why the shame? Because you have attacked me for I haven't done, wouldn't do and HAD NO INTENTION OF DOING. This has absolutely *no* relationship to aforesaid job search other than the problem is mostly the same ( I don't think the problem is even exactly the same ). I had NO FUCKING INTENTION of posting this "just so I could steal peoples work to get a job by cheating". None. If I had that intention, it would have been 10 times easier to steal it by googling for solutions, instead of risking my stealing being found out by someone, as the boss for said job could easily read perlmonks and see where I got my solution. This would be much more difficult if I got it off google.

But I HAD NO INTENTION OF DOING THIS. But instead of even verifying that I was going to, or had done it, you just blatantly attacked me over the *possibility* of me doing it.

For your information, the reason I posted this question, with any mention of where I got it from, is I just assumed that it was a challenge lots of other people would enjoy also. And, judging by the sheer number of responses, lots of other people enjoyed writing a solution to it. Generally enabling this many people to write some cool perl for a solution would make me very happy, but in this case I find all my happiness is ruined by your slander.

BUU,
My better judgement says I shouldn't reply to this at all but... you appear to be quoting me:

"just so I could steal peoples work to get a job by cheating"

I never said that. In fact, I am the one in IRC that gave you the benefit of the doubt when others thought you intended to be unscrupulous. What I was addressing is:

You did indeed get the problem from the job ad

You did indicate you intended to apply for that position saying that you would use the fact you were a Saint at PerlMonks to your advantage

You posted the question here without reference to the first two points

This all transpired on Freenode's #perl of which I do not log. Should you want to dispute any of this I am sure someone has these logs and it will be easy to determine. I certainly would not have said shame on you for anything if you had just acknowledged this when you posted.

_____________________________________________________
Jeff japhy Pinyan,
P.L., P.M., P.O.D, X.S.:
Perl,
regex,
and perlhackerHow can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart

UPDATE: There may be a minor bug in my use of POSIX, however I may be able to get round this as substr() automagically rounds down fractional values which are passed to it.

UPDATE^2: Fencepost error in ceil() function call fixed.

UPDATE^3: The reason why it fails is that the ordering can sometimes be wrong if there is more than one palindrome in the same string, due to the randomisation as the hash keys are fetched. I'll fix this at some point in the near future...

UPDATE^4: Issue fixed; Was in fact due to boneheadedness on my part. Now should work nicely, although not as pretty as it was.

Update: since I was asked how this works, I'm adding an explanation here.

It is pretty simple: a XOR b = 0 when a = b. If you XOR two strings with each other, you will get a NULL at all locations with identical characters. Now obviously, if you XOR a string with a mirror copy of itself and get all NULLs, then it's a palindrome, because all characters in the rotated copy were identical with all characters of the original.

That's the gist of it. The particular problem given for this thread is complicated by the fact that we have to look for embedded palindromes, and rotating the string unfortunately displaces the rotated copy of an embedded palindrome. To find all embedded palindromes, the mirror copy must be XORed against the original string at each offset. The code does this by rotating the copy n times for a string of length n.

There is one nasty trap left. If the string consists of two adjacent palindromes, such as abbabbafef. Mirroring that yields fefabbabba. If you rotate this three times to the left, the mirror copy becomes abbabbafef and XORing them yields a string of all NULLs, which would indicate that the palindrome is abbabbafef. Oops. The problem is that we forgot to keep track of where inside the mirror copy its original start and end used to be. Palindromes obviously cannot run across that location. That is what the substr $mask, $rotate_count, 0, "\1"; is about: a non-NULL is added to break a string of NULLs running across that location. Of course, now we have to account for that extra character in offsets in the mask.

And that's it. The bulk of the work happens in a single XOR and a pattern match, and other auxiliary tasks are done using very few builtins. That's where it gets its speed. The bulk of the code is merely simple math.

Ouch. Thanks for catching that. Only doing length / 2 iterations was a remnant from a slightly different approach I abandoned half-way in. I even tested the final code on a bunch of different inputs but somehow managed not to run into this issue… ugh.

Sometimes I wonder if I should add Boneheaded Mistakes R Us to my signature.

Of course I wanted to see how it stacked up against all the other solutions, so I created a program to generate strings that contained palindromes as well as added in the examples used elsewhere in the thread.

I took some advice and rebuilt the benchmark. I disqualified anyone who didn't get accurate results, so be sure you didn't get disqualified due to my error in your code. Here is the code that I used to generate the new palindromes:

Also, Don't Repeat Yourself. That's far too much copy-paste code there. Look at the benchmark script I wrote: much less code, and it tells you exactly which input strings each solution failed on and what it produced instead.

It's also unfortunately an O(n^2) algorithm, but my initial O(n) idea turned out to be badly flawed. (Actually, I guess it's O(n*m), where "n" is the length of the input and "m" is the length of the longest palindrome - in the worst case, a string of all the same letter, it'd be O(n^2))

Note that it'll also work on unicode strings, assuming that perl knows that its argument is a unicode string.

The full benchmark I ran follows. Note that some entries required slight modifications in order to compile with strictures and run without warnings. I paid careful attention to keep the semantics intact, but if I disqualified your entry, please check my copy of your code for potential breakage.

Oops, thank you. I missed that one because I downloaded the solutions by crawling the displaycode links on the thread page and he didn't put his entry in CODE tags.

It works correctly and therefor qualifies, but I only get about 340 iterations/s for it. That puts it next to Limbic~Region's entry in my chart. I'm not surprised, as his code is quite complex and involves a true for(;;) loop (relatively slow in Perl). Fast code in Perl means as few opcodes as possible and letting builtins do work implicitly as much as possible. (See GRT, f.ex.)

That is how I arrived at my second version. I copied the first version, ripped everything except the XOR out of the loop, and started benchmarking them against each other as I tried to accelerate extraction of null runs from the bitmask and rotation of the reverse string. Every single operation I added to the (non-functional, skeletal) second version had a dramatic impact on speed. Whatever I did, I found nothing with which to improve upon while( /\0{3,}/g ) { } and substr+chop. I only managed to get a speedup when I constrained the match further so that the regex engine isn't exited to drop into the loop body for null runs that are too short to be candidates. Apparently, avoiding that penalty by skipping matches implicitly more than makes up for the additional cost of having to compile the regex multiple times, which skipping them explicitly didn't require.

The version and compile-time flags of the Perl in use probably matter to some degree, as well.