If you've discovered something amazing about Perl that you just need to share with everyone,
this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

Your employer/interviewer/professor/teacher has given you a task with the following specification:

Given an XHTML file, find all the <div> tags with the class attribute "data" [1] and extract their id attribute as well as their text content, or an empty string if they have no content. The text content is to be stripped of all non-word characters (\W) and tags, text from nested tags is to be included in the output. There may be other divs and other tags present anywhere, but tags with the class data are guaranteed to have an id attribute and not be nested inside each other. The output of your script is to be a single comma-separated list of the form id=text, id=text, .... You are to write your code first, and then you will be given a test file, guaranteed to be valid and standards-conforming, for which the expected output of your program is "Zero=, One=Monday, Two=Tuesday, Three=Wednesday, Four=Thursday, Five=Friday, Six=Saturday, Seven=Sunday".

[1] Update: Clarification: The class attribute should be exactly the string data (that is, ignoring the special treatment given to CSS classes). Examples below updated accordingly.

Ok, you think, I know Perl is a powerful text processing language and regexes are great! And you write your code and it works well for the test cases you came up with. ... But did you think of everything? Here's the test file you end up getting:
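The test file itself is not reproduced here. As a rough sketch of how the specification could be approached with an XML parser instead of regexes, the following uses the CPAN module XML::LibXML. The sample input and the function name extract_data_divs are made up for illustration and are not taken from the original node.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample input (NOT the test file from the task).
my $xhtml = <<'XHTML';
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sample</title></head>
<body>
<div class="data" id="Zero"/>
<div class="other" id="Skip">ignored</div>
<div id="One" class="data">Mon<b>d</b>ay!</div>
<!-- <div class="data" id="Fake">commented out</div> -->
</body>
</html>
XHTML

# Hypothetical helper: returns the "id=text, id=text" list for one document.
sub extract_data_divs {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);

    # XHTML elements live in a namespace, so XPath needs a prefix for it.
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(x => 'http://www.w3.org/1999/xhtml');

    my @pairs;
    for my $div ($xpc->findnodes('//x:div[@class="data"]')) {
        # textContent includes text from nested elements, without the tags.
        (my $text = $div->textContent) =~ s/\W+//g;
        push @pairs, $div->getAttribute('id') . '=' . $text;
    }
    return join ', ', @pairs;
}

print extract_data_divs($xhtml), "\n";
```

Because the parser handles attribute order, self-closing tags, comments and nested markup, none of those cases needs special treatment in the code.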

If you are of a certain age and have been developing for the web for long enough, you will recognize a lot of the details in CmdrTaco's meditation on the beginnings of Slashdot on its 20th birthday, from watching the output of tailing your Apache referrers log in amazement, to the Kai's Power Tools drop shadow on the logos. Good times! (Too bad he doesn't mention that the whole thing was built in Perl.)

Hello Monks,
I came here for your criticism, feedback and suggestions for improvement. I have developed a simple script for grepping paragraphs (blocks of text lines delimited by a specific separator; blank lines, by default).

A common use case is parsing Java log entries that can extend over multiple lines:

paragrep -Pp '^\d+/\d+/\d+ \d+:\d+:\d+' PATTERN FILENAME

Another use case is filtering sections from ini files matching particular strings:

paragrep -Pp '^\[' PATTERN FILENAME
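The script itself is not reproduced here. As a minimal sketch of the idea, assuming that the pattern given in the examples above marks the first line of each paragraph (an assumption about the tool, not a description of its options), paragraph grepping could look like this. The helper name grep_paragraphs and the sample log lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not the paragrep script itself: split the lines into
# paragraphs that begin at each line matching $start_re, then keep the
# paragraphs in which at least one line matches $pattern.
sub grep_paragraphs {
    my ($start_re, $pattern, @lines) = @_;
    my @paragraphs;
    for my $line (@lines) {
        # A matching line opens a new paragraph; so does the very first line.
        push @paragraphs, [] if !@paragraphs || $line =~ $start_re;
        push @{ $paragraphs[-1] }, $line;
    }
    return grep {
        my $para = $_;
        scalar grep { $_ =~ $pattern } @$para;
    } @paragraphs;
}

# Made-up log lines: the second entry spans three lines.
my @log = (
    "01/02/2017 10:00:00 INFO started\n",
    "01/02/2017 10:00:01 ERROR request failed\n",
    "java.lang.IllegalStateException: boom\n",
    "    at Example.run(Example.java:42)\n",
    "01/02/2017 10:00:02 INFO stopped\n",
);

my @hits = grep_paragraphs(qr{^\d+/\d+/\d+ \d+:\d+:\d+}, qr/IllegalState/, @log);
print @$_ for @hits;
```

Here the whole three-line ERROR entry is printed even though only its second line matches, which is the point of grepping by paragraph rather than by line.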

Next, I am going to improve the search patterns and add support for -a/--and and -o/--or options to control matching. With this message I ask you to test the script and point out possible weaknesses in performance and efficiency.

The original, current code is hosted on GitHub (it's not permitted to post external links, but you can search for ildar-shaimordanov/perl-utils).
Here is the latest version of the script (as of the time of writing this message):

Given the pretty CPAN Testers Christmas Tree, especially in the *BSD columns, I learned that not all 32-bit perls are created equal. After some more debugging, I was able to show that it was those 32-bit perls with $Config{ivsize}==4 (32-bit integers; aka perl -V:ivsize) that were failing, but those with $Config{ivsize}==8 (64-bit integers in a 32-bit perl) would pass.

I had assumed (without looking) that the default ivsize on a 32-bit perl was 4 bytes (32 bits), so I thought that I had already tested and verified that my IVs weren't overflowing (or, if they were, that they were being promoted to NV). After seeing the problem, I discovered that I had been expecting left-shift to promote from IV to NV... but it didn't. However, *=2 did promote the way I expected:
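The original snippet is not reproduced here. The following is a small sketch of the difference, written so that it behaves the same way on any ivsize by starting from the highest bit of the native integer. The variable names are only illustrative.

```perl
use strict;
use warnings;
use Config;

my $bits = 8 * $Config{ivsize};   # 32 or 64, depending on how perl was built
my $top  = 1 << ($bits - 1);      # only the highest bit set; still an integer

my $shifted = $top << 1;          # integer shift: the top bit simply falls off
my $doubled = $top * 2;           # arithmetic: the result is promoted to an NV

printf "ivsize=%d\n",   $Config{ivsize};
printf "top     = %s\n", $top;
printf "shifted = %s\n", $shifted;
printf "doubled = %s\n", $doubled;
```

The shift stays within the native integer width and yields 0, while the multiplication overflows into a floating-point value equal to 2**$bits.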

Conclusion: when running a test suite across versions, it's not always enough to just have a "32-bit perl"; especially if you depend on integer sizes, also check that your test environments include multiple ivsize values.

Many of you are probably aware of the pattern of opening a temporary file, reading from the original file and writing the modified contents to the temporary file, and then renaming the temporary file over the original file, which is often an atomic operation (depending on OS & FS). I recently wrote a module to encapsulate this behavior, and here is one of three interfaces that are available in File::Replace. There are several options to configure the behavior, including the ability to specify PerlIO layers, what happens if the file doesn't exist yet, etc.
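The module's own example is not reproduced here. For readers who have not seen the underlying pattern, this is a hand-rolled sketch of it using only core modules. It is not File::Replace's interface, and the helper name rewrite_file is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper illustrating the pattern; NOT File::Replace's API.
sub rewrite_file {
    my ($file, $transform) = @_;
    open my $in, '<', $file or die "Cannot read $file: $!";

    # Create the temporary file in the same directory as the original so
    # that rename() stays on one filesystem.
    my ($out, $tmp) = tempfile('.rewrite_XXXXXX', DIR => dirname($file));

    my $ok = eval {
        while (my $line = <$in>) {
            print {$out} $transform->($line) or die "Cannot write $tmp: $!";
        }
        close $out or die "Cannot close $tmp: $!";
        close $in  or die "Cannot close $file: $!";
        rename $tmp, $file or die "Cannot rename $tmp to $file: $!";
        1;
    };
    if (!$ok) {
        my $err = $@;
        unlink $tmp;    # do not leave the temporary file behind
        die $err;
    }
    return;
}

# Usage: upper-case every line of a scratch file.
my ($fh, $name) = tempfile();
print {$fh} "hello\nworld\n";
close $fh or die "Cannot close $name: $!";
rewrite_file($name, sub { uc $_[0] });
```

Note that this sketch does not preserve the original file's permissions or ownership, since the temporary file is created with its own; handling such details is part of what makes wrapping the pattern in a module attractive.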

Since I hope this is something that you might find useful, I would be happy about any feedback you might have!

To give a practical example, here is an update of my code from this node. As you can see I was able to get rid of eight lines of fairly complicated code, while keeping the main loop entirely unchanged. The module also adds some more robustness, as it incorporates a few more checks on whether operations were successful or not.

I am looking for your advice on improving a module I implemented for encoding and decoding multiple formats. I wrote the module and tried to include as many formats as I could. I know that there are other formats that I have not added, but in my case the data also has to be converted to hex and vice versa during encoding and decoding, and that is where I ran into problems with several formats, which I have not included in my sample code.

The idea behind the module: I work for a telecommunications company, and part of my daily job is to correct problems. The languages vary globally, since it is a live network with live customers, and the data is in hex in a variety of encodings. In some cases I had to write small scripts to process the packages before and after the nodes so that I could check for encoding corruption. Previous questions of mine that are related to the module include Chinese to Hex and Hex to Chinese, and Arabic to Hex and Hex to Arabic. After seeing that I needed more and more encodings for more and more languages, I decided to write a simple module to do this for me instead of writing more or less the same code again and again.

Having said that, here is sample code showing how a user would use the module with the encodings it can handle:
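The module's sample is not reproduced here. To illustrate the kind of round trip being described (text to hex in a given encoding, and back), here is a minimal sketch using the core Encode module. The function names text_to_hex and hex_to_text are hypothetical and are not the interface of the module under discussion.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helpers, NOT the interface of the module under discussion.

# Encode a character string in the named encoding and return upper-case hex.
sub text_to_hex {
    my ($encoding, $text) = @_;
    # FB_CROAK: die instead of silently substituting unmappable characters.
    my $bytes = encode($encoding, $text, Encode::FB_CROAK);
    return uc unpack('H*', $bytes);
}

# Turn a hex string back into bytes and decode them from the named encoding.
sub hex_to_text {
    my ($encoding, $hex) = @_;
    my $bytes = pack('H*', $hex);
    return decode($encoding, $bytes, Encode::FB_CROAK);
}

my $chinese = "\x{4e2d}\x{6587}";    # the two characters of "Chinese (language)"
print text_to_hex('UTF-8',    $chinese), "\n";   # E4B8ADE69687
print text_to_hex('UTF-16BE', $chinese), "\n";   # 4E2D6587
```

Croaking on bad input makes corruption visible at the point where it happens, which seems to fit the stated goal of observing encoding problems before and after a node.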

Early last week, I wrote in CB about a tremendously disturbing event that took place with my family.

In response, several Monks reached out to offer condolences and offers of help.

In my near absence from here since then, a bunch of Monks got together, and 1nickt reached out a few times to say that a group of Monks wanted to do something. Initially, I was advised that the offer could be in the form of finance for travel etc. After I carefully deliberated this kind gesture and discussed with my wife, I decided that I wouldn't feel comfortable taking any funds directly, so I let Nick know that it would be preferred to send flowers or donate to a charity instead.

I was advised by Nick that a beautiful arrangement had been sent on behalf of the Monks, and that any leftover funds, plus any more that might trickle in, would be donated to a violence-prevention charity. I told Nick that I was too busy to deal with it, so I asked if he'd spearhead the decision of which one.

I want to express my (and my wife's) deepest gratitude for such an overwhelmingly kind gesture by everyone involved; those who provided funding, as well as those who reached out to offer emotional support. I'd like to thank Nick directly as well for taking the time to organize everything he did.

This goes to show that this is a great place of caring, not just another forum to get help with questions.

Perlmonks is the only group I let in on what had happened, as it's the only online forum where I feel so comfortable, and people here came through with flying colours... the manner was absolutely unexpected; stunning actually.

Thank you very much everyone, it's kind of hard to put into words, so instead, I'll just try to get back into the groove and give back the best way I can; by continuing to help those who need it here.

Without good design, good algorithms, and complete understanding of the
program's operation, your carefully optimized code will amount to one of
mankind's least fruitful creations - a fast slow program.

In High Performance Game of Life, I chose a very simple design, storing all live cells
in a single set.
Though pleasing for its simplicity and unboundedness,
its drawback is that counting live neighbours becomes a hash lookup,
a chronic performance bottleneck. What to do?
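For readers who have not seen the earlier node, here is a rough sketch of what a "single set of live cells" design can look like. It is not the Organism.pm from that node; the "x,y" key format and the function name next_generation are assumptions made for illustration.

```perl
use strict;
use warnings;

# Live cells are stored as hash keys of the form "x,y"; the universe is
# unbounded because only live cells take up space.
sub next_generation {
    my ($live) = @_;

    # Every live cell adds one to the neighbour count of its 8 neighbours.
    my %count;
    for my $cell (keys %$live) {
        my ($x, $y) = split /,/, $cell;
        for my $dx (-1, 0, 1) {
            for my $dy (-1, 0, 1) {
                next if $dx == 0 && $dy == 0;
                $count{ ($x + $dx) . ',' . ($y + $dy) }++;
            }
        }
    }

    # Birth on exactly 3 neighbours; survival on 2 or 3.
    my %next;
    for my $cell (keys %count) {
        my $n = $count{$cell};
        $next{$cell} = 1 if $n == 3 || ($n == 2 && $live->{$cell});
    }
    return \%next;
}

# A horizontal blinker becomes a vertical one after one tick.
my $blinker = { '0,1' => 1, '1,1' => 1, '2,1' => 1 };
my $after   = next_generation($blinker);
print join(' ', sort keys %$after), "\n";   # 1,0 1,1 1,2
```

Every neighbour visit costs a hash operation plus a string key construction, which is where the time goes in a design like this.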

Rather than spending more time optimizing my original design --
thus creating a "fast slow program" --
I researched the domain, learning of many different ways to do it.
From the many possible approaches, I chose the simplest one I could find
that looked interesting and enjoyable, and implemented it in pure Perl.

To try to keep my initial attempt short and understandable, I started with a
simplified version based on the brilliant works of Adam P. Goucher (apg),
tiling the universe with 64 x 64 tiles
in a conventional way, each tile having eight neighbours.
Note that this was chosen for simplicity; more efficient schemes
are available, such as the "brick wall" tiling used by Goucher
in later versions.
For background on the concept of breaking the game of life universe
into overlapping tiles, see this description of Life128 and vlife.
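As a very simplified illustration of conventional tiling (and not the code from Organism.pm), the sketch below maps a cell coordinate to its 64 x 64 tile and lists that tile's eight neighbours. It ignores the overlap between tiles mentioned above, and the names tile_of and tile_neighbours are hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(floor);

use constant TILE_SIZE => 64;

# Map a cell coordinate to (tile x, tile y, offset x, offset y).
# floor() keeps negative coordinates in the correct tile.
sub tile_of {
    my ($x, $y) = @_;
    my $tx = floor($x / TILE_SIZE);
    my $ty = floor($y / TILE_SIZE);
    return ($tx, $ty, $x - $tx * TILE_SIZE, $y - $ty * TILE_SIZE);
}

# Return the coordinates of the eight tiles surrounding tile ($tx, $ty).
sub tile_neighbours {
    my ($tx, $ty) = @_;
    my @neighbours;
    for my $dx (-1, 0, 1) {
        for my $dy (-1, 0, 1) {
            next if $dx == 0 && $dy == 0;
            push @neighbours, [ $tx + $dx, $ty + $dy ];
        }
    }
    return @neighbours;
}

my @where = tile_of(-1, 64);
print "@where\n";                                  # -1 1 63 0
print scalar(my @n = tile_neighbours(0, 0)), "\n"; # 8
```

With cells grouped this way, most neighbour lookups stay inside one tile, and only cells on a tile edge need to consult a neighbouring tile.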

My code is loosely based on apgnano (version 2) but advances
one tick at a time (rather than two at a time, as apg did)
and does not attempt to use universe history.
Fair warning though. Despite striving to keep the code simple and short,
it's way more complex than my original, Organism.pm swelling
from 66 lines of code to 414.

There is certainly plenty of scope for improving my initial attempt.
After all, I have not attempted any optimizations at all, just tried to implement ideas from
apg's C++/assembler programs in a pure Perl form in a simple and clear way.
While all feedback is welcome, I'm especially eager to see:

Refactorings that make the Perl code shorter, clearer, more idiomatic.

Bug fixes. I was shocked when my code worked the second time I ran it - just one coding blunder was corrected before my new Organism.pm passed tgol.t, tgol2.t, tgol3.t and the 30,000 lidka test! So I suspect there may be more bugs lurking in this brand new implementation.

As a minimum, any code refactorings should be tested by running
tgol.t and tgol3.t from my original node.
Note that this new version of Organism.pm is (or should be) 100% interface compatible with my original.

I made profession and registered to this monastery just five years ago today. And since then, I connected to this forum almost every single day. And it has been a pleasure every time.

Although I asked only a few questions over these five years, and struggled much more often to answer others' questions, I really learned a lot from reading the posts of many fellow monks in this community. Over this five-year period, this site has taught me more about my favorite programming language than anything else I have done in the meantime.

Many thanks to you all, dear sisters and brothers. And long live Perl.

When I went to eclipse.org to see if the latest major release was ready (it is, I'm late, as usual), I discovered something called "Language Server Protocol". Eclipse Oxygen supports it. Several other editors/IDEs supposedly already support it.

The idea is that instead of each editor/IDE needing a plug-in for each additional language someone wants to support, each editor/IDE gets an LSP client plug-in and each language gets an LSP server.
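To give a feel for what client and server exchange: LSP messages are JSON-RPC 2.0 payloads, each preceded by a Content-Length header. The sketch below frames an initialize request using the core JSON::PP module. The helper name frame_message is hypothetical, and this is only the wire format of a single message, not a working server.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: wrap a JSON-RPC payload in the LSP base-protocol
# header (Content-Length in bytes, then a blank line, then the body).
sub frame_message {
    my ($payload) = @_;
    # utf8 produces bytes, so length() below counts bytes; canonical sorts
    # the keys so the output is reproducible.
    my $body = JSON::PP->new->utf8->canonical->encode($payload);
    return 'Content-Length: ' . length($body) . "\r\n\r\n" . $body;
}

# The first request a client sends to a language server.
my $request = frame_message({
    jsonrpc => '2.0',
    id      => 1,
    method  => 'initialize',
    params  => { processId => $$, rootUri => undef, capabilities => {} },
});
print $request, "\n";
```

A Perl language server would read such frames from its input, dispatch on the method name, and write framed responses back in the same format.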

I did some searching, but didn't find a project creating an LSP server for Perl. Still, I think it would be a way to help get better support for Perl into the editors and IDEs that people want to use.

So, sisters and brothers, what do you think? Would you be willing to contribute (and how)?