Question: is optimization done before or after the insertion of inline
assembler? That is, is inline assembler "what you see is what you get", or does
the optimizer munge it? I should mention that I don't actually mind what the
answer is.
If the answer turns out to be that the optimizer MAY modify even my inline
assembler then I do have a workaround, so it doesn't matter. I just want to
know.
If the answer turns out to be that the optimizer WILL NOT modify inline
assembler, then I must ask a follow-up question: Do we have any kind of
guarantee that this will always be the case in the future? That is, does there
exist a stability policy in this regard which future incarnations of the
compiler must always respect?
Arcane Jill

It seems to me that inline assembler must always be as-you-type-it. It's
reasonable in (almost?) all such cases to trust the programmer.
"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:ca6rqj$2300$1 digitaldaemon.com...

Question: is optimization done before or after the insertion of inline
assembler? That is, is inline assembler "what you see is what you get", or does
the optimizer munge it? I should mention that I don't actually mind what the
answer is.
If the answer turns out to be that the optimizer MAY modify even my inline
assembler then I do have a workaround, so it doesn't matter. I just want to
know.
If the answer turns out to be that the optimizer WILL NOT modify inline
assembler, then I must ask a follow-up question: Do we have any kind of
guarantee that this will always be the case in the future? That is, does there
exist a stability policy in this regard which future incarnations of the
compiler must always respect?
Arcane Jill

Question: is optimization done before or after the insertion of inline
assembler? That is, is inline assembler "what you see is what you get", or
does the optimizer munge it? I should mention that I don't actually mind
what the answer is.
If the answer turns out to be that the optimizer MAY modify even my inline
assembler then I do have a workaround, so it doesn't matter. I just want
to know.
If the answer turns out to be that the optimizer WILL NOT modify inline
assembler, then I must ask a follow-up question: Do we have any kind of
guarantee that this will always be the case in the future? That is, does
there exist a stability policy in this regard which future incarnations of
the compiler must always respect?

I don't understand the purpose of this question: The optimizer is guaranteed
not to change the behaviour of the code. If the compiler were intelligent
enough to perfectly understand some inline-assembler lines including all
side-effects, it might optimize them, otherwise you can be sure that it
will not touch them.
I assume that no compiler is intelligent enough to do optimization of
inline-assembler code, and I assume there is little reason to take the pain
of doing something like that. But still: if there were such a compiler
mangling your inline assembler, you could still be sure that the resulting
code behaves identically to the original in every respect.
Therefore, I don't know why you are afraid of the optimizer touching your
inline assembler code.

The optimizer is guaranteed
not to change the behaviour of the code. If the compiler were intelligent
enough to perfectly understand some inline-assembler lines including all
side-effects, it might optimize them, otherwise you can be sure that it
will not touch them.

Not if previous experience is anything to go by. In the Borland, Microsoft and
GNU compilers, a buffer which is memset() to zero to securely wipe its
sensitive content immediately before destruction is considered "dead" by the
optimizer - i.e. it will never be read again, so the compiler marks the zeroing
code as redundant and removes it. When this problem was revealed, it was found
that a great deal of cryptographic software, including a variety of
cryptographic libraries written by experienced programmers, had failed to take
adequate measures to address it.*
Arcane Jill
* from the paper "Understanding Data Lifetime via Whole System Simulation" by
Chow, Pfaff, Garfinkel, Christopher and Rosenblum.

declaring the reference that is memset to be 'volatile' should take care of
it.

I don't completely understand the D meaning of "volatile". It seems to be
different from that to which I am accustomed in C/C++. In D, "volatile" is part
of a STATEMENT. It is not a storage class or an attribute.
In C++, of course, volatile is a type qualifier. It means "do not cache the value
of this variable in a register". It means that the compiler has to actually read
it, every time, in case some other thread (or piece of hardware, etc.) has
modified it.
But in D, if I read this correctly, you can do stuff like:

volatile *p++;

(I just checked, and that does compile). It seems to me that a statement like:

volatile uint n;

won't actually make a volatile variable in the C sense, it will just guarantee
that all writes are complete before the variable is initialized, and that the
initialization of the variable is complete before the next statement begins.
I may have completely misunderstood this. If I've got it right, then I don't
entirely see why this would be useful.

Question: is optimization done before or after the insertion of inline
assembler? That is, is inline assembler "what you see is what you get", or
does the optimizer munge it? I should mention that I don't actually mind
what the answer is.
If the answer turns out to be that the optimizer MAY modify even my inline
assembler then I do have a workaround, so it doesn't matter. I just want
to know.
If the answer turns out to be that the optimizer WILL NOT modify inline
assembler, then I must ask a follow-up question: Do we have any kind of
guarantee that this will always be the case in the future? That is, does
there exist a stability policy in this regard which future incarnations of
the compiler must always respect?

I don't understand the purpose of this question: The optimizer is guaranteed
not to change the behaviour of the code. If the compiler were intelligent
enough to perfectly understand some inline-assembler lines including all
side-effects, it might optimize them, otherwise you can be sure that it
will not touch them.
I assume that no compiler is intelligent enough to do optimization of
inline-assembler code, and I assume there is little reason to take the pain
of doing something like that. But still: if there were such a compiler
mangling your inline assembler, you could still be sure that the resulting
code behaves identically to the original in every respect.
Therefore, I don't know why you are afraid of the optimizer touching your
inline assembler code.

One reason is that one may deliberately require 'under'-optimised machine
code to exist. The compiler can never really know the intentions of a coder;
it just assumes some things.
--
Derek
Melbourne, Australia

Question: is optimization done before or after the insertion of inline
assembler? That is, is inline assembler "what you see is what you get", or does
the optimizer munge it? I should mention that I don't actually mind what the
answer is.

As far as I remember from Walter answering some other question, DMD
guards the inline assembly code to prevent the optimizer from messing with it.

If the answer turns out to be that the optimizer WILL NOT modify inline
assembler, then I must ask a follow-up question: Do we have any kind of
guarantee that this will always be the case in the future? That is, does there
exist a stability policy in this regard which future incarnations of the
compiler must always respect?

Other incarnations of the compiler are not guaranteed to have an inline
assembler at all. :) In particular, GDC doesn't.
As to DMD, it looks like a deliberate decision, so knowing Walter it's
unlikely to change. I haven't seen such a guarantee in the documentation
though, so you can never actually know what other compiler writers will do
until it is written down.
-eye

Question: is optimization done before or after the insertion of inline
assembler?

After, although the optimizer does not touch the inline assembler.

That is, is inline assembler "what you see is what you get",

Yes.

or does
the optimizer munge it?

No. But it will do a few single-instruction things like:
  - replacing jmps to the next instruction with NOPs
  - sign extension of modregrm displacement
  - sign extension of immediate data (can't do it for OR, AND, XOR,
    as the opcodes are not defined)
  - short versions for AX EA
  - short versions for reg EA
  - TEST reg,-1 => TEST reg,reg
  - AND reg,0 => XOR reg,reg
It won't do scheduling or reorganizing of it, nor any changes that would
affect the flags.

I should mention that I don't actually mind what the answer is.
If the answer turns out to be that the optimizer MAY modify even my inline
assembler then I do have a workaround, so it doesn't matter. I just want to
know.
If the answer turns out to be that the optimizer WILL NOT modify inline
assembler, then I must ask a follow-up question: Do we have any kind of
guarantee that this will always be the case in the future? That is, does
there exist a stability policy in this regard which future incarnations of
the compiler must always respect?

Since the asm language does not exactly specify the opcode to be generated,
this would be a difficult rule to enforce in a few cases.

Question: is optimization done before or after the insertion of inline
assembler? That is, is inline assembler "what you see is what you get", or does
the optimizer munge it? I should mention that I don't actually mind what the
answer is.
If the answer turns out to be that the optimizer MAY modify even my inline
assembler then I do have a workaround, so it doesn't matter. I just want to
know.
If the answer turns out to be that the optimizer WILL NOT modify inline
assembler, then I must ask a follow-up question: Do we have any kind of
guarantee that this will always be the case in the future? That is, does there
exist a stability policy in this regard which future incarnations of the
compiler must always respect?
Arcane Jill

As a tangential comment:
I wonder if it would make sense to allocate all security-sensitive data from a
special pool, perhaps portions of a large malloc'd chunk, or a linked list of
malloc'd chunks. A call at the end of main() could clear this memory. It could
XOR the return value (of main) with several random array elements after it is
cleared to try to prevent optimization.
As the code runs, each object would try to clear its own pieces, to further
minimize lifetimes. The security pool could also verify that the pool was all
nulls, perhaps spitting out messages if the pool was not cleared (even in
release mode). This would also inhibit optimization.
A side benefit of this is that you could do all your mlock or don't-page-out
precautions (however that is done) in one place.
Kevin

As a tangential comment:
I wonder if it would make sense to allocate all security-sensitive data from a
special pool, perhaps portions of a large malloc'd chunk, or a linked list of
malloc'd chunks. A call at the end of main() could clear this memory.

It makes perfect sense, except for the fact that, in a server, main() never
returns. The program just keeps running forever.

It could
XOR the return value (of main) with several random array elements after it is
cleared to try to prevent optimization.

It's easy enough just to fill it with zeroes using inline assembler, since
Walter says this will never be optimized away.

As the code runs, each object would try to clear its own pieces, to further
minimize lifetimes.

Yup. I just spent the last couple of weeks implementing exactly that. Now the
only problem is, it doesn't work - BECAUSE - I have no way of knowing when an
object is no longer visible (and hence eligible for wiping). Now, this wouldn't
be a problem if operators new() and delete() were globally overloadable, but
they're not. Unless I've got that wrong.
For example, I could recode Int to use just such a custom allocator. Then they
could be used to do RSA calculations, etc. BUT - realistically, no-one is ever
going to call delete() on an Int. That would seriously complicate using them.
And the GC won't touch it, because it has a custom allocator.
Wait - just had an idea! <ping> I'll make that a separate post and see what
Walter thinks.

The security pool could also verify that the pool was all
nulls, perhaps spitting out messages if the pool was not cleared (even in
release mode). This would also inhibit optimization.

I might do that in a Debug build - that's DbC - but not in a Release build. I
mean, if assembler doesn't get optimized away, there's just no problem. It WILL
happen.

A side benefit of this is that you could do all your mlock or don't-page-out
precautions (however that is done) in one place.

Yup. Just need that global new() and delete() now.
I like your thinking.
Jill

Yup. I just spent the last couple of weeks implementing exactly that. Now the
only problem is, it doesn't work - BECAUSE - I have no way of knowing when an
object is no longer visible (and hence eligible for wiping). Now, this wouldn't
be a problem if operators new() and delete() were globally overloadable, but
they're not. Unless I've got that wrong.
For example, I could recode Int to use just such a custom allocator. Then they
could be used to do RSA calculations, etc. BUT - realistically, no-one is ever
going to call delete() on an Int. That would seriously complicate using them.
And the GC won't touch it, because it has a custom allocator.

You could ask that we be allowed to overload the dot operator to make
implementing smart pointers a bit simpler, though aside from that one use the
idea kind of horrifies me :)
Sean