Friday, 19 May 2017

Puzzled by a post on Planet KDE about GCompris and Roman numerals,
and needing an easy way to explain Roman numerals to my six-year-old son,
I thought it would be an easy task to write a simple Perl program
that converts an Arabic number into its Roman notation.
Not so easy, pal!
Well, it's not a problem with Perl, of course; rather, I found it
took quite some brainpower to write down the rules for converting
numbers, and I did not search the web for a copy-and-paste algorithm. Please note: if you need a rock-solid way to handle conversions, have a look at CPAN, which is full of modules for this particular purpose.
Here I'm going to discuss the solution I found and how I implemented
it. It is not supposed to be the best one, or the fastest one; it's just my solution from scratch.

The program

I split the problem of converting an Arabic number into a Roman one
into three steps, with one dedicated subroutine for each step, so that
the main loop reduces to something like the following:

The steps must be read from the innermost subroutine to the outermost, of course, and therefore we have:

disassemble translates an Arabic number into its Roman
basis, that is, it computes how many units, tens, hundreds and thousands
are required. In this phase no Roman rules are applied, so
numbers are decomposed into a linear string of letters. As an example, the number 4 is translated into IIII, which is of course not a valid Roman number.

reassemble applies the Roman rules, in particular promoting numbers so that groups are translated, when needed, into higher-order letters. For instance, IIII is promoted into two groups: I and V.

roman_string composes the promoted groups into the final
string. The main difficulty of this part is to understand when a letter
has to be placed on the right (addition) or on the left (subtraction)
of another letter. For instance, given the groups I and V, the function must understand whether the output has to be VI (6) or IV (4).

To speed up the writing of the code, I placed the main Roman letters and their correspondence with Arabic numbers into a global hash:
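The original listing is not reproduced here, so the following is only a minimal sketch of what such a hash could look like. The layout (Arabic value as key, Roman letter as value) is an assumption drawn from the text, which sorts the keys numerically and looks letters up by number.

```perl
use strict;
use warnings;

# Assumed layout: Arabic value => Roman letter.
my $roman = {
    1    => 'I',
    5    => 'V',
    10   => 'X',
    50   => 'L',
    100  => 'C',
    500  => 'D',
    1000 => 'M',
};

# Walk the values from the smallest to the greatest.
print "$_ => $roman->{$_}\n" for sort { $a <=> $b } keys %$roman;
```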

Each method references $roman when it needs to convert
from an Arabic number to its Roman letter. In order to allow the methods to
cooperate, they accept and return a hash keyed by Roman
letter, where each value is the number of occurrences of that letter in the final
string. The following is an example of the hash for a few numbers:

The disassemble method

The first thing the method does is create the hash $items, which is what it returns for the other methods to consume. The keys of the $roman hash are walked from the greatest to the smallest (please note that sort has $b first!). In this way we can safely decompose the number into thousands, hundreds, tens, and units, in this exact order. The $how_many variable contains the integer part of the division of the remaining number by the value of each letter. For example, the number 29 is processed as follows:

29 / 10, which sets $how_many to 2 and leaves a remainder of 9;

9 / 5, which sets $how_many to 1 and leaves a remainder of 4;

4 / 1, which sets $how_many to 4, and there's nothing more to do.

At each step the Roman letter and its $how_many value are inserted into the $items hash, which in the above example becomes:

# 29 (XXIX)
{ 'X' => 2,
'V' => 1,
'I' => 4
}
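The decomposition described above can be sketched as follows. This is not the original listing: the $roman layout (value => letter) and the choice to skip letters with a zero count are assumptions.

```perl
use strict;
use warnings;

# Assumed layout: Arabic value => Roman letter.
my $roman = {
    1   => 'I', 5   => 'V', 10   => 'X', 50 => 'L',
    100 => 'C', 500 => 'D', 1000 => 'M',
};

# Decompose a number into letter => quantity, without any Roman rule.
sub disassemble {
    my ($number) = @_;
    my $items = {};

    # greatest value first: note that sort has $b first
    for my $value ( sort { $b <=> $a } keys %$roman ) {
        my $how_many = int( $number / $value );
        next unless $how_many;                 # letter not needed
        $items->{ $roman->{$value} } = $how_many;
        $number -= $how_many * $value;         # what is left to decompose
    }
    return $items;
}

my $items = disassemble(29);
print join( ', ', map { "$_ => $items->{$_}" } sort keys %$items ), "\n";
# I => 4, V => 1, X => 2
```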

The reassemble method

The reassemble method takes as input the hash produced by disassemble and checks whether any letter requires a promotion. Here is the code:

The promotion must be done from the smallest letter to the greatest one, so this time the letters are walked in ascending order (i.e., sort has $a first!). Since promoting a letter requires access to the following one, I need a C-style for loop.
A letter must be promoted if its quantity is 4, or if its quantity is 2 and the next greater value is exactly double the current one, that is while ( $greater_value / $current_value == $how_many ). This makes, for instance, IIII be promoted (the quantity is 4), and VV be promoted into X (because the quantity is 2 and X is exactly double V).
The promotion manipulates the hash, increasing by one the next greater
letter and leaving a single current letter. In order to flag the
promoted letter, I decided to use a negative quantity (where the
absolute value is the exact one).
So, for instance, the 29 hash of the previous paragraph is processed as follows:
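The following is only a sketch of the promotion rule as described in the text, not the original code. It makes several assumptions: the $roman layout (value => letter), removing a promoted pair such as VV from the hash, and testing the "quantity is 4" rule against the count coming from disassemble. With a single quantity per letter, a number such as 49 (XLIX) needs X both on the left of L and after I, so the sketch stops with an error in that case instead of guessing.

```perl
use strict;
use warnings;

# Assumed layout: Arabic value => Roman letter.
my $roman = {
    1   => 'I', 5   => 'V', 10   => 'X', 50 => 'L',
    100 => 'C', 500 => 'D', 1000 => 'M',
};

sub reassemble {
    my ($items) = @_;
    my %out    = %$items;                        # work on a copy
    my @values = sort { $a <=> $b } keys %$roman;

    # C-style loop: each letter needs to see the next greater one
    for ( my $i = 0 ; $i < $#values ; $i++ ) {
        my ( $current, $greater ) = @values[ $i, $i + 1 ];
        my ( $letter,  $next )    = @{$roman}{ $current, $greater };

        if ( ( $items->{$letter} // 0 ) == 4 ) {
            # e.g. IIII: one more V and a single, flagged (negative) I
            die "a single quantity per letter cannot express this number\n"
                if $out{$letter} != 4;
            $out{$next}++;
            $out{$letter} = -1;
        }
        elsif ( ( $out{$letter} // 0 ) == 2 && $greater == 2 * $current ) {
            # e.g. VV: one more X, no V left
            $out{$next}++;
            delete $out{$letter};
        }
    }
    return \%out;
}

my $promoted = reassemble( { X => 2, V => 1, I => 4 } );    # 29
print join( ', ', map { "$_ => $promoted->{$_}" } sort keys %$promoted ), "\n";
# I => -1, X => 3
```

For 29 the walk is: IIII is promoted (V becomes 2, I becomes -1), then VV is promoted (X becomes 3, V disappears), and X is left alone.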

At the end of the method we know the final string will be made of three X and one I; the point now is to understand how to render them in the correct order. This is the aim of the roman_string method.

The roman_string method

The method accepts the normalized hash (i.e., groups are already
formed) and composes the final string, placing letters on the left or the
right of each other depending on their quantity. The following is the
code of the method:

In order to be able to manipulate the final string easily, moving
letters from left to right and vice versa, I decided to place each
single letter into the @chars array, which is then join-ed into a single string.
Let's suppose we only need to add letters: in this case we need to
write the letters from the greatest to the smallest, from left to right, and
this is the order in which I traverse the letters of $roman (again, note that sort has $b first!). If the quantity of the letter is positive, the letter has not been promoted and therefore will not be placed to the left of another letter, so just insert $letter into @chars $how_many times. On the other hand, if $how_many
is negative, the letter has been promoted and therefore has to be
printed on the left of the last printed letter. This is as easy as
doing:
push @chars, ( $letter, pop @chars );
which inserts into @chars the $letter followed by the previous last character, which has been removed via pop.
With regard to the previous example of 29 we have that:
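A minimal sketch of the composition step, assuming the $roman layout (value => letter) and a normalized hash in which a negative quantity flags a promoted letter. It is not the original listing, only an illustration of the push/pop trick on the 29 hash.

```perl
use strict;
use warnings;

# Assumed layout: Arabic value => Roman letter.
my $roman = {
    1   => 'I', 5   => 'V', 10   => 'X', 50 => 'L',
    100 => 'C', 500 => 'D', 1000 => 'M',
};

sub roman_string {
    my ($items) = @_;
    my @chars;

    # greatest letter first: note that sort has $b first
    for my $value ( sort { $b <=> $a } keys %$roman ) {
        my $letter   = $roman->{$value};
        my $how_many = $items->{$letter} or next;    # letter not used

        if ( $how_many > 0 ) {
            # plain addition: append the letter as many times as needed
            push @chars, ($letter) x $how_many;
        }
        else {
            # promoted letter: it goes to the left of the last letter
            push @chars, ( $letter, pop @chars );
        }
    }
    return join '', @chars;
}

print roman_string( { X => 3, I => -1 } ), "\n";    # XXIX
```

For 29 the array first becomes X, X, X; then the flagged I takes the place of the last X, which is pushed back right after it, giving X, X, I, X.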

Conclusions

Well, it has been much more code than I expected to write. Using an object
notation, instead of plain hashes, could surely make the program more
robust. I'm pretty sure there's a way to shrink the code down and to
avoid that ugly C-style for loop, and the promotion
part could be simplified keeping in mind that it often reduces to -1 for
the current letter and +1 for the greater one. Anyway, it does what I
need and seems correct!

Friday, 12 May 2017

Starting from Perl 5.20 it is possible to use a postfix dereference notation, first as an explicit feature, and since Perl 5.24 enabled by default.

The postfix dereference notation allows you to use the arrow operator -> on a reference (as is often done with references anyway), specifying the type of dereferencing you want. Types are indicated by their well-known sigils, so for instance $ means a scalar, @ an array, & a subroutine, % a hash and * a typeglob. The sigil on the right of the arrow operator is followed by a star, so it really becomes type and *, as in $*. In the special case of arrays and hashes the star can be dropped in favor of a slicing operator, e.g., the square brackets.

The following is a sample program that prints out the values of an array using full access (->@*) and slicing (->@[]):
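The original sample is not shown here; the following is a small sketch of the same idea. The variable names and data are made up for illustration, and it assumes Perl 5.24 or later so that no feature needs to be enabled explicitly.

```perl
use v5.24;    # postfix dereference is enabled by default from 5.24
use warnings;

my $ref = [ 'a' .. 'f' ];

# full access: same as @{ $ref }
say join ' ', $ref->@*;              # a b c d e f

# slicing: same as @{ $ref }[ 0 .. 2 ]
say join ' ', $ref->@[ 0 .. 2 ];     # a b c

my $href = { one => 1, two => 2, three => 3 };

# the same applies to hashes: %{ $href } and @{ $href }{ ... }
say join ' ', sort keys $href->%*;            # one three two
say join ' ', $href->@{ qw( one two ) };      # 1 2
```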

As you can see, $ref->@* is the equivalent of @{ $ref }, while $ref->@[] is the same as @{ $ref }[]. The same applies to hash references.

Code and subroutine references are a little more controversial, at least to my understanding of references. First of all there is only the star operator, so ->&*, and the behavior is such that $ref->&* is the same as &{ $ref }. This makes me think there is no short way of passing arguments through a postfix dereference, as the following code demonstrates:
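A hedged sketch of the behavior (the greet and forward subroutines are hypothetical names, not the original code): since $ref->&* behaves like &{ $ref } without parentheses, the called subroutine sees the caller's current @_, while explicit arguments go through the classic $ref->( ... ) call.

```perl
use v5.24;
use warnings;

sub greet { return "Hello sir $_[0] !" }

my $sub_ref = \&greet;

# classic arrow call: arguments are passed explicitly
say $sub_ref->('Luca');     # Hello sir Luca !

# postfix dereference: like &{ $sub_ref }, the callee sees the current @_
sub forward { return $sub_ref->&*; }

say forward('Luca');        # Hello sir Luca !
```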

The only catch-all way of passing arguments, as suggested on irc.perl.org, is to use @_, so I suggest localizing it:

{
    local @_ = qw( Luca );
    $sub_ref->&*;    # Hello sir Luca !
}

I must admit this is a little too much for my poor brain to use references efficiently, so if anyone has something to comment, it is truly welcome!

Now, back to the main topic: it seems to me that the postfix dereference notation is a way of handling references more like objects; at least it reminds me of a kind of as_xx method (yes, Ruby-ers, I'm looking at you). If we read the arrow operator as "as" and the sigil on the right as the type name, we have that, for instance:

$ref->$* is $ref as scalar;

$ref->@* is $ref as array, and $ref->@[] becomes $ref as array slice

and so on. While I'm pretty sure this is not the real meaning of this notation, it is surely more readable than the "usual" one, where we have the type on the left and the reference on the right (or better, in the middle), as in @$ref or @{ $ref }.

Wednesday, 10 May 2017

An ASCII diamond is a geometric structure represented as, ehm, a diamond. For example, in the case of letters it becomes something like:

     a
    b  b
   c    c
  d      d
 e        e
f          f
 e        e
  d      d
   c    c
    b  b
     a

How hard can it be to build a Perl program to draw a diamond like the one above?
Well, not that hard, but we have to observe some geometric properties:

the diamond is symmetric (in the sense that it begins and ends with the same letters), but the
central row is reproduced only once (that is, the f appears only on one line, not two!);

each letter or pair of letters is centred on the vertical axis of the diamond, whose position depends on the number of letters in the whole diamond, that is
the letter a (on the vertical centre) lands in column 6, after 5 spaces (the total number of letters is a..f = 6);

each pair of letters has a left and a right position, and both are equally distant from the vertical
centre of the diamond.

First of all, @letters contains the letters to be printed in the right order, and this of course could come from user input, a sequence, an array slice, or whatever; it does not matter here. Since I have to place letters depending on where they are in the @letters array, I need a handy way to get the index of each letter within the array, so I build a hash where the keys are the letters themselves and the values are the positions of those letters. In other words, $index{a} = 0, $index{f} = 5 and so on.
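One possible way to build such a lookup, sketched with a hash slice (the original code may well build it differently):

```perl
use strict;
use warnings;

my @letters = ( 'a' .. 'f' );

# letter => position of that letter within @letters
my %index;
@index{@letters} = 0 .. $#letters;

print "$_ is at index $index{$_}\n" for @letters;
```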

Finally, I print a line with say every time I need one. Let's dissect the say statement:

" " x ( $#letters - $index{ $_ } ) shifts to the right by the number of spaces required to reach the vertical centre or the right distance from it; in other words, it is the left position. For example, for the letter a it comes
down to 5 - 0 = 5, while for b it comes to 5 - 1 = 4 and so on.

then I print $_, that is the current char;

then I move to the right again by " " x ( $index{ $_ } * 2 ), that is, by nothing in the case of a, by 2 in the case of b, and so on;

then, if required, I print the char again. Here "required" means the char is not the first one (i.e., not the one at index 0), since that is the only character printed exactly once per line.

The say is repeated over the whole @letters array, so this produces the first part of the diamond:

     a
    b  b
   c    c
  d      d
 e        e
f          f

Then I need the bottom, so I have to iterate over the reversed @letters with the exception of the last element, that is, over a reversed slice of @letters without the f: reverse @letters[ 0 .. $#letters - 1 ]. This gives me the whole bottom of the diamond.
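Putting the pieces together, the whole diamond can be sketched as below. The diamond_lines helper is a hypothetical name introduced here so that the lines can be inspected before printing; the spacing expressions are the ones dissected above.

```perl
use strict;
use warnings;

# Return the lines of the diamond for the given letters.
sub diamond_lines {
    my @letters = @_;

    my %index;
    @index{@letters} = 0 .. $#letters;    # letter => position

    return map {
        my $i = $index{$_};
        ( ' ' x ( $#letters - $i ) )      # left position
          . $_                            # the current char
          . ( ' ' x ( $i * 2 ) )          # gap up to the right position
          . ( $i ? $_ : '' )              # the first letter appears once
    } @letters, reverse @letters[ 0 .. $#letters - 1 ];
}

print "$_\n" for diamond_lines( 'a' .. 'f' );
```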

Sunday, 7 May 2017

HANDLE->format_name(EXPR)
$FORMAT_NAME
$~ The name of the current report format for the currently selected
output channel. The default format name is the same as the
filehandle name. For example, the default format name for the
"STDOUT" filehandle is just "STDOUT".

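As a small illustration of the excerpt above, the following sketch reads the default format name and then switches to another format for a single write. The ITEM_LINE format and the $name and $qty variables are made-up names for the example.

```perl
use strict;
use warnings;

our ( $name, $qty );

# A hypothetical report format with a left-aligned and a right-aligned field.
format ITEM_LINE =
@<<<<<<<<< @>>>
$name, $qty
.

# With STDOUT selected, the default format name is the handle name.
my $default_name = $~;
print 'default format: ', $default_name, "\n";

( $name, $qty ) = ( 'apples', 3 );

$~ = 'ITEM_LINE';       # use ITEM_LINE for the next write on STDOUT
write;
$~ = $default_name;     # put the default back
```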
Friday, 5 May 2017

I stumbled upon, while not searching for it, an interesting policy for Perl::Critic
named Perl::Critic::Policy::References::ProhibitComplexDoubleSigils. The idea is quite simple and I find myself agreeing with it:
when dereferencing a reference you can omit the braces; while this is often not a good habit (because a complex
reference can become quite difficult to read), it can work in simpler cases:
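The original example is not shown here; the sketch below only contrasts a simple dereference, where dropping the braces stays readable, with a nested one, where the braces are needed. The variable names are made up, and the sketch does not claim to show exactly which forms the policy reports.

```perl
use strict;
use warnings;

my $list = [ 1, 2, 3 ];
my $data = { list => $list };

# simple case: the braces can be omitted and the result is still readable
my @simple = @$list;                 # same as @{ $list }

# complex case: the braces make clear what is being dereferenced
my @nested = @{ $data->{list} };

print "@simple\n";    # 1 2 3
print "@nested\n";    # 1 2 3
```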