Tricky stuff in fasmg, part 2: Namespace separation

This is the second part of a series on advanced trickery that can be done with fasmg. The other installments: part 1, part 3, part 4.
_______

The NAMESPACE directive allows entire sections of source to be processed in separate contexts, avoiding any name clashes. If the sub-modules are assembled in separate namespaces, then not only can they use the same names for various labels, but even their macroinstructions are confined to their local scope. For example, one module may include macros for the 8086 instruction set, while another could use instructions of a different processor, and they would not get in each other's way (the PE formatter macros that come with the examples in the fasmg package do something like this when they define 8086 instruction set macros within a local namespace only to assemble the MZ stub).
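
A setup of this kind might look roughly like the sketch below. The file names module1.asm and module2.asm and the label "start" are hypothetical, and CALL is assumed to come from instruction set macros included earlier in the source (fasmg itself defines no processor instructions).

Code:

Module1:                        ; defines the parent symbol of the namespace
namespace Module1
        include 'module1.asm'   ; hypothetical file, assumed to define "start"
end namespace

Module2:
namespace Module2
        include 'module2.asm'   ; may reuse the same names as module1.asm
end namespace

        call    Module1.start   ; assumes CALL is provided by included macros
        call    Module2.start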

Giving a definition to the parent symbol of a namespace is a recommended practice if this namespace is going to be accessed by name in other places (like in the CALL instructions in the above sample). But if all that is needed is to separate the namespaces of two sources - maybe because all they have to do is generate some data into the output - a minimal variant would work just as well:
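
A minimal sketch of such a variant, again assuming the hypothetical files module1.asm and module2.asm, simply drops the definitions of the parent symbols:

Code:

namespace Module1
        include 'module1.asm'   ; hypothetical file that only generates data
end namespace

namespace Module2
        include 'module2.asm'
end namespace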

There is, however, a dangerous trap hidden there, and it is related to the forward-referencing of symbols.

Let's consider the following framework: we have a global COUNTER variable, initialized at the beginning of the source:

Code:

COUNTER = 0

Now every module may for some reason need to sometimes increase this counter, with an instruction like:

Code:

COUNTER = COUNTER + 1

If we now try to put these modules into their own namespaces, they are suddenly going to start defining COUNTER inside their local contexts. If one such module contains only one command like the above, it is not only going to define COUNTER as a symbol local to its namespace, but this symbol will also be allowed to be forward-referenced (because it has only one definition in the entire source), and the construction becomes a self-referencing definition. It is impossible to satisfy such a definition, as there is no value of COUNTER that solves the equation, so the assembly is going to fail.
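
Put together, the failing situation can be sketched as follows (the namespace name Module is hypothetical):

Code:

COUNTER = 0

namespace Module
        ; This is the only definition of Module.COUNTER in the source,
        ; so the symbol can be forward-referenced and the line becomes
        ; a definition of Module.COUNTER in terms of itself.
        COUNTER = COUNTER + 1
end namespace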

The same problem can also apply to symbolic variables: if we had a global LIST variable, perhaps initialized like this:

Code:

LIST equ initial

and then expanded in modules with commands like:

Code:

LIST equ LIST, element

then putting the module inside its own namespace would cause the above definition to become circular - this would in theory create ever-growing text, but fasmg catches such circular references early (also the ones of form "a equ b"/"b equ a") and signals the problem.

A possible solution to these problems is very simple: the modules should re-define the global variable with constructions like:

Code:

COUNTER. = COUNTER + 1
LIST. equ LIST, element

The dot after the name of a symbol tells the assembler to look for an already defined symbol with that name, including in parent namespaces, so this way we modify the global symbol instead of creating a local one.
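
As a sketch of how the pieces fit together (Module is again a hypothetical namespace name), a module that updates both global variables from inside its own namespace could look like this:

Code:

COUNTER = 0
LIST equ initial

namespace Module
        ; The trailing dot makes the assembler search for an existing
        ; definition, so the global symbols are re-defined here and no
        ; Module.COUNTER or Module.LIST is created.
        COUNTER. = COUNTER + 1
        LIST. equ LIST, element
end namespace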

The same principle would apply if we created a special globally-accessible namespace where we would keep these variables:
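
A possible shape of such a setup is sketched below. The names Global and Module are hypothetical, and defining Global as a label is only one assumed way of giving the parent symbol a definition so that it can be found from within other namespaces.

Code:

Global:                         ; assumed: a label gives the parent symbol a definition
namespace Global
        COUNTER = 0
        LIST equ initial
end namespace

namespace Module
        ; "Global" is found in the parent namespace, and the names
        ; after the dot select symbols inside its namespace.
        Global.COUNTER = Global.COUNTER + 1
        Global.LIST equ Global.LIST, element
end namespace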

The principle is the same because again it is the dot in the identifier that makes the assembler look for the defined symbol in parent namespaces, only this time the dot is followed by the name of a descendant symbol, so it is not the global symbol that gets re-defined, but the symbol inside its namespace.

Interestingly, the same problem can also occur in case of macroinstructions. Let's consider that we have some simple global macroinstruction:

Code:

macro INT value
        dd value
end macro

and that sub-modules may seek to re-define this macroinstruction to meet their requirements:

Code:

macro INT values&
        iterate value, values
                INT value
        end iterate
end macro

Normally when such a re-defined macro calls its own name, it refers to the previous macro with that name. But if we put the second definition inside a local namespace, we get the same result as with numeric or symbolic variables: the local macro now has just one definition and it can be forward-referenced, and this results in it calling itself recursively. This is very similar to what happens with a circularly-defined symbolic value, but this time fasmg is not easily able to detect it and will only signal an error when it reaches the built-in recursion limit (this limit can be altered with the -r command line option; setting it to some small number like 100 allows such errors to be caught early).

This time adding a dot after the name of a macro is not a valid solution, because a dot causes the assembler to look for a symbol of the expression class, not the instruction class - so it would only find the globally defined INT if it was also defined there as a numeric or symbolic constant or variable. Using a special namespace would work, but this would require the macro to also be invoked in this way.

However, there is a different possible solution that may help in this case. If we somehow force the local symbol to be considered variable even when it has just one definition, the infinite recursion is going to disappear. When the first definition of a variable symbol references the same name, the assembler also looks for a defined value for that name outside the local namespace, so it is going to use the global value. And we can force a local macroinstruction to become variable by creating a dummy definition and immediately removing it with PURGE:
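
A sketch of this trick, using the INT macro from the earlier snippets and a hypothetical namespace name Module, could look as follows:

Code:

macro INT value
        dd value
end macro

namespace Module
        ; Dummy definition, removed at once. Its only purpose is to
        ; make the local INT a variable symbol.
        macro INT
        end macro
        purge INT

        ; The local INT is now variable, so the inner INT is expected
        ; to resolve to the global macro instead of recursing.
        macro INT values&
                iterate value, values
                        INT value
                end iterate
        end macro

        INT 1,2,3
end namespace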

There is one case where this problem is going to show up frequently when putting a module into its own separate namespace: it is when the module tries to re-define some of the internal instructions of the assembler. All the instructions of fasmg are built-in global symbols, and when a module tries to re-define such an instruction in a way that calls the original one, but does it inside a local namespace, the infinite recursion is going to kick in.

We can see this effect immediately if we try to encapsulate in such way any complete program that uses the PE formatter, for instance the win32.asm example from the fasmg package:

For a given name, the "var" macro forces such a symbol to be variable in all the classes (expression, instruction and labeled instruction). Since DD is defined both as an instruction and as a labeled instruction, it is not much of an overkill here:

Code:

namespace Win32_Sample
        var dd?,dq?
        include 'win32.asm'
end namespace

The PE formatter also re-defines the SECTION instruction, but it does it multiple times on its own, so this one is a variable anyway.

Now, this helps with the recursion, but the above sample would still not assemble - this time because of the POSTPONE used by the PE formatter, since the postponed code gets executed outside of the namespace where we tried to encapsulate the whole program. But in the previous part we had already prepared a macro that allows postponed blocks to be executed locally:

In the above example the combined set of macros allows to assemble both win32.asm and win64.asm programs within a single source. All the generated bytes are placed in virtual blocks and not written into actual output, so the additional DISPLAY instruction is added there to prove that the programs really got assembled.

For general use, we could hide "var" inside the "encapsulate" macro and - just in case - declare every single one of fasmg's instructions as variable. But even then these encapsulation macros are still not perfect. For example, if an encapsulated module placed a POSTPONE block inside another nested namespace, our macro would define "postponed" in the wrong namespace and this block would then never get executed. There is a simple method to deal with this risk, but this is a topic for another time.

Last edited by Tomasz Grysztar on 11 May 2017, 13:27; edited 1 time in total

Trying to use the encapsulate macro with fasm g.hld82, it fails running out of memory.

The above sample assembles fine with hld82. Perhaps you have some additional recursion that you need to correct with "var"? Please try setting some small value for "-r" switch in command line (like -r100) to detect any infinite recursion before you run out of memory.

As you see, the recursion is caused by the "element" re-definition, so you need to add "element" to the "var" line. The samples I used for testing did use 80386.inc instead of p5.inc, that's why I did not notice this.

The point of this entire article was to explain these potential problems and ways to handle them. If you skip most of the content and just try to copy the macros from the examples, you may easily become confused by the problems you encounter.

The other problem you uncovered is actually a bug in fasmg. I'm going to upload a new, corrected version, and I'm modifying p5.inc so that it no longer redefines the case-insensitive "element", so the examples from this thread are again going to work without changes with the win32.asm from the fasmg package.

More things changed in the packaged examples in the meantime, so the above sample requires some further changes. On the other hand fasmg also has more features, like the ability to store external files. I have further refined the encapsulation example so that it not only assembles two separate programs from within one source, but also stores them in two output files:

The "postpone ?", which is another feature of fasmg added after this tutorial was written, is not handled well by these macros, though for the purpose of this example it works OK. In general it may require better handling. In fact, the IRPV that copies areas into output files could itself be moved into a "postpone ?" block.
