Hi,
I got this thing working and I think it's about time I get some comments
on it.
I've been wanting to extend Value Range Propagation (VRP) for some time
now, mostly because of the fix to the troublesome "signed-unsigned
comparisons" issue. Enabling VRP for if-else and "&&" will fix many of
the false-positive warnings introduced by the fix to bug 259, by allowing
code like this: if (signed > 0 && signed < unsigned) { .. }
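For readers unfamiliar with the narrowing involved, here is an illustrative model in Python rather than D (the helper name and the {min, max} representation are mine, not from the patch): the left comparison of the `&&` narrows the signed operand's range so its minimum is positive, which is exactly what makes the signed-unsigned comparison safe.

```python
# Illustrative model in Python, not D: narrow a {min, max} pair through
# the left-hand side of `signed > 0 && signed < unsigned`.
# The helper name is mine, not from the patch.

INT_MIN, INT_MAX = -2**31, 2**31 - 1

def narrow_greater(rng, bound):
    """Range of a value known to satisfy `value > bound`."""
    lo, hi = rng
    return (max(lo, bound + 1), hi)

signed = (INT_MIN, INT_MAX)           # a plain int
narrowed = narrow_greater(signed, 0)  # after `signed > 0` holds
assert narrowed == (1, INT_MAX)

# With min >= 1, converting the signed operand to unsigned cannot
# change its value, so `signed < unsigned` needs no warning.
assert narrowed[0] >= 0
```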
Have a look at the branch:
https://github.com/lionello/dmd/compare/if-else-range
There, I've also added a __traits(intrange, <expression>) which returns
a tuple with the min and max for the given expression. It's used in the
test case as follows:
const i = foo ? -1 : 33;
if (i)
    static assert(__traits(intrange, i) == Tuple!(-1, 33));
else
{
    //static assert(i == 0); TODO
    static assert(__traits(intrange, i) == Tuple!(0, 0));
}
if (i == 33)
{
    //static assert(i == 33); TODO
    static assert(__traits(intrange, i) == Tuple!(33, 33));
}
else
    static assert(__traits(intrange, i) == Tuple!(-1, 32));
if (10 <= i)
    static assert(__traits(intrange, i) == Tuple!(10, 33));
else
    static assert(__traits(intrange, i) == Tuple!(-1, 9));
It would be nice if this could be used by CTFE as well.
Destroy?
L.

I've also added a __traits(intrange, <expression>) which
returns a tuple with the min and max for the given expression.

I'd like a name like integral_range, and perhaps it's better for
it to return a T[2] (fixed-size array) instead of a tuple.
Such a __traits(integral_range, exp) could work on core.checkedint
values too, or it could be used by them to remove some overflow
tests and improve their performance.
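As a hedged sketch of that idea (Python for illustration; `checked_add` and the explicit range arguments are hypothetical stand-ins for what the compiler would track): when the operands' known ranges already prove the sum fits, the runtime overflow test can be skipped.

```python
# Hedged sketch (Python): a checked add that elides its runtime overflow
# test when the statically known operand ranges already prove safety.

INT_MIN, INT_MAX = -2**31, 2**31 - 1

def checked_add(a, b, a_range, b_range):
    lo = a_range[0] + b_range[0]
    hi = a_range[1] + b_range[1]
    if INT_MIN <= lo and hi <= INT_MAX:
        return a + b                  # ranges prove no overflow: no test
    result = a + b                    # otherwise keep the runtime test
    if not (INT_MIN <= result <= INT_MAX):
        raise OverflowError("int overflow")
    return result

assert checked_add(100, 23, (0, 1000), (0, 1000)) == 123
```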

This could be a bad thing. It makes it pretty enticing to use
contracts as input verification instead of logic verification.

Until you compile with -release, and then suddenly invalid input crashes
your program. :-P (Then you'll go and fire the guy who wrote it.)
T
--
"The whole problem with the world is that fools and fanatics are always
so certain of themselves, but wiser people so full of doubts." --
Bertrand Russell.
"How come he didn't put 'I think' at the end of it?" -- Anonymous

Until you compile with -release, and then suddenly invalid input crashes
your program. :-P (Then you'll go and fire the guy who wrote it.)
T

My point exactly. If contracts allow things like what Bearophile
wants to work, then people might use them to do input/output
validation instead of validating basic assumptions, like they're
supposed to do. Then we have problems like you described. Maybe
it's a good idea to keep contracts as-is and not allow them to
change a function's semantics like this, since they are removed
in release.

My point exactly. If contracts allow things like what
Bearophile wants to work, then people might use them to do
input/output validation instead of validating basic
assumptions, like they're supposed to do. Then we have problems
like you described. Maybe it's a good idea to keep contracts
as-is and not allow them to change a function's semantics like
this, since they are removed in release.

This is an interesting topic.
D contracts are a potentially important language feature that is
currently underused, perhaps because it's currently a little more
than a nicer place to locate asserts (and it still lacks the
pre-state ("old") feature).
Function contracts don't change the program's semantics; they
define part of it, at a higher level than the function body itself.
When your function has a post-condition specifying that it
returns a value between 0 and 9 inclusive, your function is
defined to return that range; that's its semantics. If the
function returns something outside that range, then your function
has a bug. Relying on the function's semantics to allow casts is
one basic use of contracts, but this is just a starting point (if
you take a look at the Whiley language you'll see what I mean:
whiley.org).
If you compile your code with -release you are assuming your code
doesn't have bugs, and your contracts are always satisfied.
Perhaps D programmers need to stop compiling with -release.
Perhaps even the name of this compiler switch should be changed
to something more descriptive of its unsafety and meaning.
The presence of -release compiler switch can't hold back the use
of contract programming for all it's worth. Currently D is not
taking its contract programming seriously :-)
Bye,
bearophile

So you mean something like
if (x < byte.min || x > byte.max)
    throw new InvalidArgumentException("...
else {}
?
That seems like a strange restriction. Why is that?
That seems like a strange restriction. Why is that?

I meant: else return x;
Because it's easy to see what the value of x is within the if and
else bodies. It's not trivial to find out that the if body never
exits (because of the throw) and that therefore the code after the
if is in essence the else body.
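A small illustration of that point (Python, names are mine): because the `if` body always exits via the throw, everything after the `if` can assume the negated condition, just as if it were written in an explicit `else`.

```python
# Illustration (Python): the `if` body below never falls through, so the
# code after it may assume the negated condition, exactly as if it were
# the `else` body.

BYTE_MIN, BYTE_MAX = -128, 127

def to_byte(x):
    if x < BYTE_MIN or x > BYTE_MAX:
        raise ValueError("out of byte range")
    # Reaching here implies BYTE_MIN <= x <= BYTE_MAX, so a compiler
    # doing flow-aware VRP could allow the narrowing conversion
    # without a warning.
    return x

assert to_byte(100) == 100
```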
L.

Note that my discussion so far is about inhibiting run-time
checking of the value range of
struct Bound(T,
             B = T,                   // bounds type
             B lower = B.min,
             B upper = B.max,
             bool optional = false,
             bool exceptional = true)
defined at
https://github.com/nordlow/justd/blob/master/bound.d
where T is an integer type; that is, an Ada-style integer range
type.
It would of course be cool if DMD could support this for floating
point as well.
D range indexing/slicing should probably be built into the
compiler.
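For readers who don't want to open bound.d, here is a deliberately minimal Python model of such an Ada-style bounded integer (the real struct has more parameters and policies):

```python
# Minimal Python model of an Ada-style bounded integer, in the spirit of
# the Bound struct linked above (bound.d has more parameters).

class Bounded:
    def __init__(self, value, lower, upper):
        if not (lower <= value <= upper):
            raise ValueError(f"{value} not in [{lower}, {upper}]")
        self.value, self.lower, self.upper = value, lower, upper

    def __add__(self, other):
        # The result's static bounds are the sums of the operands'
        # bounds, so the range check composes through arithmetic.
        return Bounded(self.value + other.value,
                       self.lower + other.lower,
                       self.upper + other.upper)

a = Bounded(3, 0, 10)
b = Bounded(4, 0, 10)
c = a + b
assert (c.value, c.lower, c.upper) == (7, 0, 20)
```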

I think that unfortunately this currently can't work; you can't
tell the range of the input value like that. I have explained why
in one of my posts in this thread. Please try to explain to me why
I'm wrong.
Bye,
bearophile

I think that unfortunately this currently can't work; you can't
tell the range of the input value like that. I have explained
why in one of my posts in this thread. Please try to explain to
me why I'm wrong.

Don't know what you're referring to, but I guess you mean that,
because of separate compilation, the code generated for `pow`
needs to be generic? This is not really a problem, because the
compiler can generate _additional_ specialized functions for it.
But then a normal `if` would also do, because dead-code
elimination can remove the unused parts in these specializations.

I think that unfortunately this currently can't work; you can't
tell the range of the input value like that. I have explained
why in one of my posts in this thread. Please try to explain to
me why I'm wrong.

I'm currently merely talking about possibilities in this case, so
I cannot currently prove you wrong ;) To me it seems like an
unnecessary limitation that value ranges aren't propagated to
function call arguments, provided that the function source is
known at compile time. And it doesn't sound to me like enabling
this in DMD would be such a great task either.

""Nordlöw"" wrote in message news:jctlkqtiztnbnctldtdg forum.dlang.org...

I'm currently merely talking about possibilities in this case, so I cannot
currently prove you wrong ;) To me it seems like an unnecessary
limitation that value ranges aren't propagated to function call arguments,
provided that the function source is known at compile time. And it doesn't
sound to me like enabling this in DMD would be such a great task either.

What happens when a function is called from different places with values
with different ranges? What about when it's called from another compilation
unit? Generally the argument ranges can only be known when the function is
inlined, and by then it's much too late to expose them via __traits.

What happens when a function is called from different places
with values with different ranges? What about when it's called
from another compilation unit?

One solution is to ignore such cases, so that the feature gives
useful results only when the source is compiled in the same
compilation unit.
An alternative solution is to handle the functions that use those
features like templates, and keep the source available across
different compilation units. This is perhaps acceptable because I
think that kind of feature is going to be used mostly for library
code and not for most user functions.
Bye,
bearophile

What happens when a function is called from different places with
values with different ranges? What about when it's called from another
compilation unit?

One solution is to ignore such cases, so that feature gives useful
results only when the source is compiled in the same compilation unit.
An alternative solution is to handle the functions that use those
features like templates, and keep the source available across different
compilation units. This is perhaps acceptable because I think that kind
of features is going to be used mostly for library code and not for
most user functions.
Bye,
bearophile

Wouldn't that cause compiler errors that only happen depending on what
order you compile things?
-Shammah

Wouldn't that cause compiler errors that only happen depending
on what order you compile things?

If you refer to the first "solution", then the answer is yes. The
ability to catch bugs at compile time is not fully deterministic;
it's a help for the programmer, but it's not always possible.
This is why when you use an enum precondition you also have to add
a regular pre-condition.
If you refer to the second "solution", then I think the answer is
no, because this problem doesn't currently happen with templates.
Bye,
bearophile

What happens when a function is called from different places
with values with different ranges? What about when it's called
from another compilation unit? Generally the argument ranges
can only be known when the function is inlined, and by then
it's much too late to expose them via __traits.

The problem is, I think this is currently false (even if you call
your constructor with just a number literal). I'd like this
feature in D, but I don't know how much work is needed to
implement it.
The D language is designed to allow you to create data structures
in library code that act a lot like built-in features (so there's
refined operator overloading, opCall, static opCall, and even
more unusual things like opDispatch), but there are built-in
features that can't yet be reproduced in library code. Observe:
void main() {
    int[5] arr;
    auto x = arr[7];
}
dmd 2.066alpha gives a compile error:
test.d(3,14): Error: array index 7 is out of bounds arr[0 .. 5]
In D there is opIndex to overload [ ], but its arguments are
run-time values, so I think there is currently no way to give a
compile-time error when you use an index that is statically known
to be outside the bounds. So currently you can't reproduce that
behaviour with library code. Propagating the value range
information to the constructor, plus the new __traits(valueRange,
exp), would solve this problem. And indeed this allows
implementing nice ranged values like in Ada, and doing what Ada's
Static_Predicate does.
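A sketch of the decision such an opIndex (or constructor) could make if it saw the index's value range; Python for illustration, and the three-way outcome is my own framing, not part of the proposal:

```python
# Sketch (Python) of the check an opIndex could run if it could see the
# index's {min, max} value range.

def static_index_check(index_range, length):
    lo, hi = index_range
    if hi < 0 or lo >= length:
        return "compile-time error"   # whole range is out of bounds
    if lo >= 0 and hi < length:
        return "in-bounds"            # runtime bounds test can be elided
    return "needs-runtime-check"      # range straddles the bounds

assert static_index_check((7, 7), 5) == "compile-time error"  # arr[7] on int[5]
assert static_index_check((0, 4), 5) == "in-bounds"
assert static_index_check((0, 9), 5) == "needs-runtime-check"
```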
Another common example of what you currently can't do with
library code:
void main() {
    import std.bigint;
    BigInt[] arr = [10, 20];

    import std.numeric;
    alias I10 = Bound!(int, 0, 10);
    I10[] arr2 = [8, 6, 20]; // renamed from `arr` to avoid a redeclaration error
}
(In theory the compiler should also catch that bug at compile
time, because 20 is outside the valid bounds of I10.)
The "enum precondition" I've suggested elsewhere is a
generalization of that feature (but it's not a strict subset
because it only works with literals, so it's good to have both
features), because it manages not just ranges of a single value,
but also other literals, like an array:
void foo(int[] arr)
enum in {
    assert(arr.all!(x => x < 10));
} in {
    assert(arr.all!(x => x < 10));
} body {
    // ...
}
void main() {
    foo([10, 20]); // gives a compile-time error.
}
This is possible only if the source code of foo is available in
the compilation unit. In the presence of separate compilation the
enum precondition is ignored, so enum preconditions can't replace
regular pre-conditions; they are useful for statically catching
only a subset of bugs. The same happens with the idea of
propagating range information to constructors. One way to avoid
or mitigate this problem is to keep the source code of functions
with an enum pre-condition available, just like with templates.
Perhaps this is not a big problem, because enum preconditions and
constructor value range propagation are meant to be used mostly
in library code, like in Bound integers, etc.
Bye,
bearophile

The D language is designed to allow you to create data structures in
library code that act a lot like built-in features (so there's
refined operator overloading, opCall, static opCall, and even more
unusual things like opDispatch),

opDispatch is extremely powerful; I think we've only barely begun to tap
into what it can do. Like my recent safe-dereference template function.
;-)

but there are built-in features that can't yet be reproduced in
library code, observe:
void main() {
    int[5] arr;
    auto x = arr[7];
}
dmd 2.066alpha gives a compile error:
test.d(3,14): Error: array index 7 is out of bounds arr[0 .. 5]
In D there is opIndex to overload [ ], but its arguments are
run-time values, so I think there is currently no way to give a
compile-time error when you use an index that is statically known
to be outside the bounds. So currently you can't reproduce that
behaviour with library code. Propagating the value range
information to the constructor, plus the new __traits(valueRange,
exp), would solve this problem. And indeed this allows
implementing nice ranged values like in Ada, and doing what Ada's
Static_Predicate does.

It seems like ultimately, we want to have some kind of uniformity
between built-in compiler intrinsics and user-defined types, such as
allowing user-defined types to access internal compiler knowledge like
value range and many other things.
One very useful thing to have would be a way to tell if a particular
symbol has a compile-time known value. You could then do things like
custom constant-folding on user-defined types by automatically
simplifying expressions at compile-time.
Hmm. Maybe this is already possible to some extent?
template hasCompileTimeValue(alias Sym) {
    enum hasCompileTimeValue = __traits(compiles, () {
        enum val = Sym;
    });
}
int x = 10, y;
// Will this work? (I don't know)
assert(hasCompileTimeValue!x);
assert(!hasCompileTimeValue!y);

Another common example of what you currently can't do with library
code:
void main() {
    import std.bigint;
    BigInt[] arr = [10, 20];

    import std.numeric;
    alias I10 = Bound!(int, 0, 10);
    I10[] arr2 = [8, 6, 20];
}
(In theory the compiler should also catch at compile time that bug,
because 20 is outside the valid bounds of I10.)

[...]
This is a different instance of the same problem as above, isn't it?
If Bound has access to compiler knowledge about value ranges, then it
would be able to statically reject out-of-range values.
T
--
"I'm not childish; I'm just in touch with the child within!" - RL

I got this thing working and I think it's about time I get some
comments on it.

This is really cool. Good job!
One thing we need to be careful with is how this is specified.
Because of all the compile-time introspection (e.g. __traits
compiles and now __traits intrange), this VRP needs to be
precisely specified; otherwise you end up with incompatible
differences between different compiler front ends. I don't want
to see code compiling in one compiler but not another, like we do
in C++ (and if we do, I'd like it to be minimized).
If this goes in, the mechanisms by which it works need to be
added to the spec.

I got this thing working and I think it's about time I get
some comments on it.

This is really cool. Good job!
One thing we need to be careful with is how this is specified.
Because of all the compile-time introspection (e.g. __traits
compiles and now __traits intrange), this VRP needs to be
precisely specified; otherwise you end up with incompatible
differences between different compiler front ends. I don't want
to see code compiling in one compiler but not another, like we
do in C++ (and if we do, I'd like it to be minimized).
If this goes in, the mechanisms by which it works need to be
added to the spec.

I agree with what is stated here.
And will repeat that this sounds awesome.

The compiler uses value range propagation in this {min, max}
form, but I think that's an implementation detail. It's well
suited for arithmetic operations, but less suitable for logical
operations. For example, this code can't overflow, but {min, max}
range propagation thinks it can.
ubyte foo(uint a) {
    return (a & 0x8081) & 0x0FFF;
}
For these types of expressions, {known_one_bits, known_zero_bits}
works better.
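A minimal model of that {known_one_bits, known_zero_bits} idea, tracking only the possibly-one bits through `&` (Python for illustration; a real lattice would track both masks and more operators):

```python
# Illustrative model (Python): track the mask of bits that *may* be 1.
# known_zero_bits is just the complement of this mask.

def and_bits(maybe_ones_a, maybe_ones_b):
    # After `x & y`, a bit can only be 1 if it may be 1 in both operands.
    return maybe_ones_a & maybe_ones_b

UINT_ONES = 0xFFFFFFFF               # a plain uint: every bit may be 1
step1 = and_bits(UINT_ONES, 0x8081)  # (a & 0x8081)
step2 = and_bits(step1, 0x0FFF)      # (a & 0x8081) & 0x0FFF
assert step2 == 0x0081               # only bits 0 and 7 can be set
assert step2 <= 0xFF                 # so the result provably fits a ubyte

# Plain {min, max} tracking only concludes max = 0x0FFF here, which
# does not fit in a ubyte, hence the spurious overflow complaint.
```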
Now, you can track both types of range propagation
simultaneously, and I think we probably should improve our
implementation in that way. It would improve the accuracy in many
cases.
Question: If we had implemented that already, would you still
want the interface you're proposing here?

The compiler uses value range propagation in this {min, max} form, but I
think that's an implementation detail. It's well suited for arithmetic
operations, but less suitable for logical operations. For example, this
code can't overflow, but {min, max} range propagation thinks it can.
ubyte foo(uint a) {
    return (a & 0x8081) & 0x0FFF;
}
For these types of expressions, {known_one_bits, known_zero_bits} works
better.
Now, you can track both types of range propagation simultaneously, and I
think we probably should improve our implementation in that way. It
would improve the accuracy in many cases.
Question: If we had implemented that already, would you still want the
interface you're proposing here?

You could have different __traits in that case:
__traits(valueRange, ...) // for min/max
__traits(bitRange, ...)   // mask
Your example seems rather artificial though. IRL you'd get a
compiler warning/error and could fix it by changing the code to
"& 0xFF". I personally have not yet had the need for these
bit-masks.
L.

The need for this D improvement is sufficiently common; a
question just appeared:
http://forum.dlang.org/thread/jwfvuaohvlvwzjlmztsj forum.dlang.org
Lionello needs some cheering & support to create the patch that
implements value range propagation for if-else. Exposing the
range and implementing value range propagation for if-else are
sufficiently distinct needs.
Bye,
bearophile

Fantastic work, although I would prefer Bearophile's
'value_range'. Is there any reason why the same trait could not
be used for float, etc.? (I don't need it for float; it's just an
example.)
I guess an interval_set would be too complicated, slowing down
the compiler? I.e. multiple ranges, or multiple values... just
thinking out loud. Anyway, what you did already is kick-ass! :)
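For what it's worth, the interval-set idea can be sketched cheaply (Python, illustrative only); the cost to worry about is doing this merge on every expression:

```python
# Illustrative Python sketch of an interval set: merging the two arms of
# `foo ? -1 : 33` stays exact instead of widening to a single [-1, 33].

def merge(a, b):
    """Union of two lists of (lo, hi) intervals, coalescing overlaps
    and adjacent intervals."""
    out = []
    for lo, hi in sorted(a + b):
        if out and lo <= out[-1][1] + 1:
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        else:
            out.append((lo, hi))
    return out

assert merge([(-1, -1)], [(33, 33)]) == [(-1, -1), (33, 33)]
assert merge([(0, 5)], [(3, 9)]) == [(0, 9)]
```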

Hi,
I got this thing working and I think it's about time I get some comments on
it. [...]
It would be nice if this can be used by CTFE as well.
Destroy?

I'm looking forward to this so hard!
The one time I've attempted to hack on DMD, it was to investigate the
idea of doing this.
As others said, I think a key use case is for contracts/preconditions.
Also eliminating annoying warnings when down-casting.
I suspect there are many great optimisation opportunities available
too when this is pervasive. switch() may gain a lot of great
information to work with! :)