Currently, D supports the special symbol $ to mean the end of a
container/range.
However, there is no analogous symbol to mean "beginning of a
container/range". For arrays, none is necessary: 0 is always the
first element. But not all containers are arrays.
I'm running into a dilemma for dcollections: I have found a way to make
all containers support fast slicing (basically by imposing some
limitations), and I would like to support *both* beginning and end symbols.
Currently, you can slice something in dcollections via:
coll[coll.begin..coll.end];
I could replace that end with $, but what can I replace coll.begin with?
0 doesn't make sense for things like linked lists, maps, sets, basically
anything that's not an array.
One thing that's nice about opDollar is I can make it return coll.end, so
I control the type. With 0, I have no choice, I must take a uint, which
means I have to check to make sure it's always zero, and throw an
exception otherwise.
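The opDollar point above can be sketched in Python (used here purely as an
analogy to the D feature being discussed; the sentinel names Begin and End
and the LinkedListLike class are invented for illustration): a container
can accept dedicated begin/end markers whose type it controls, instead of
a plain integer 0 that must be validated at runtime.

```python
class Begin:
    """Sentinel meaning 'start of the container'."""

class End:
    """Sentinel meaning 'end of the container'."""

class LinkedListLike:
    def __init__(self, items):
        self._items = list(items)

    def slice(self, start, stop):
        # Translate the sentinels into concrete positions. Because the
        # container recognizes its own marker types, no "is it zero?"
        # runtime check on an arbitrary integer is needed.
        i = 0 if start is Begin else start
        j = len(self._items) if stop is End else stop
        return self._items[i:j]

coll = LinkedListLike("abcd")
print(coll.slice(Begin, End))  # the whole container
```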
Would it make sense to have an equivalent symbol for the beginning of a
container/range?
In regex, ^ matches beginning of the line, $ matches end of the line --
would there be any parsing ambiguity there? I know ^ is a binary op, and
$ means nothing anywhere else, so the two are not exactly equivalent. I'm
not very experienced with parsing ambiguities, but things like ~ can be
unambiguous as binary and unary ops, so maybe it is possible.
So how does this look: coll[^..$];
Thoughts? other ideas?
-Steve

coll[µ..$];
The funny thing is that you can probably make it work today if you want
since 'µ' is a valid identifier. Unfortunately you can't use €. :-)
Other characters you could use: ø, Ø, ß...
--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

coll[µ..$];
The funny thing is that you can probably make it work today if you want
since 'µ' is a valid identifier. Unfortunately you can't use €. :-)

Not exactly: µ would have to be a global with the same type/meaning
everywhere. I want to control the type per container, so the compiler
would still have to treat it specially, or I would have to use
coll[coll.µ..$].
If I didn't want to control the type, I could of course use 0 to that same
effect.
Besides, I can't type that character or any of those others (had to
copy-paste), so I don't see it being a viable alternative :)
-Steve

Do you have specific objections, or does it just look horrendous to you
:) Would another symbol be acceptable?

The problem is D already has a lot of syntax. More syntax just makes the
language more burdensome after a certain point, even if in isolation it's
a good idea.
One particular problem regex has is that few can remember its syntax
unless they use it every day.

Thoughts? other ideas?

I'd just go with accepting the literal 0. Let's see how far that goes
first.

I thought of a counter case:
auto tm = new TreeMap!(int, uint);
tm[-1] = 5;
tm[1] = 6;
What does tm[0..$] mean? What about tm[0]? If it is analogous to
"beginning of collection" then it doesn't make any sense for a container
with a key of numeric type.
Actually, any map type where the indexes don't *always* start at zero is
a problem.

I'd question the design of a map type that has the start at something other
than 0.

D's slicing and indexing syntax can't represent everything you can in
Python, where you can store the last index in a variable:
last_index = -1
a = ['a', 'b', 'c', 'd']
assert a[last_index] == 'd'
In D you represent the last index as $-1, but you can't store that in a
variable.
If you introduce a symbol like ^ to represent the start, you can't store it in
a variable.
Another example is range bounds: you can omit them, or, equivalently, they
can be None:
a = ['a', 'b', 'c', 'd']
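A minimal runnable version of the Python behaviour described above: the
end-relative index and the omitted/None slice bounds are ordinary values
that can be stored and passed around.

```python
last_index = -1
a = ['a', 'b', 'c', 'd']
assert a[last_index] == 'd'        # last element, via a stored index

# Omitted bounds and None bounds are interchangeable:
assert a[:2] == a[None:2] == ['a', 'b']

# A whole pair of bounds can even be stored as a slice object:
s = slice(None, 2)
assert a[s] == ['a', 'b']
```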

Probably my post was nearly useless, because it doesn't help the
development of D much, so you can ignore most of it.
The only meaningful part of it is that Python's designers seem to have
thought that having a way to express the _generic_ idea of the start of a
slice can be useful (with a syntax like a[:end] or a[None:end]), so in
theory an equivalent syntax (like a[^..end]) could be added to D, but I
have no idea whether this is commonly useful in D programs. If you notice,
in this thread I have not said whether I like or dislike the 'complement
to $' feature.
Regarding your specific answer here, storing a.length-1 in the index is
not able to represent the idea of "last item". Another example to show
better what I meant:
last_index = -1
a = ['a', 'b', 'c']
b = [1.5, 2.5, 3.5, 4.5]
assert a[last_index] == 'c'
assert b[last_index] == 4.5
I am not asking for this in D; I don't think there is a simple way to add
it, and I don't need it often in Python either. I am just saying that the
semantics of negative indexes in Python is a superset of D's.
Bye,
bearophile

The problem is D already has a lot of syntax. More syntax just makes the
language more burdensome after a certain point, even if in isolation it's
a good idea.<

This is an interesting topic in general (the following are general notes, not
specific of the complement to $). I agree that what can be good in isolation
can be less good for the whole syntax ecology of a language.
Sometimes adding a syntax reduces the language's complexity for the
programmer. When struct constructors were added to D2, one special case
was removed, making classes and structs more uniform and removing one
thing the programmer/manual has to remember.
Some other times a syntax adds roughly net-zero complexity, because it
increases complexity in one place but reduces it in another.
For example named function arguments add a little complexity to the language,
but they are very easy to learn and if used wisely they can make the code (at
the calling point of functions) more readable, and reduce mistakes caused by
wrong arguments or wrong argument order. So I see them as good.
There are languages like the Scheme family that have a very small amount
of syntax, and other languages such as C# and C++ that have tons of
syntax. There was a time when Lisp was common, but today syntax-rich
languages seem in practice to have "won". D too is syntax-rich.
Many things show me that programmers are able to learn and use good
amounts of syntax (especially when they already know a language with a
similar syntax), so syntax is not an evil thing. Yet today C++ usage is
decreasing, maybe even quickly. I think the cause is that not all syntax
is created equal.
This is the C++ syntax for an abstract function:
virtual int foo() = 0;
The same in D:
abstract int foo();
Both have a syntax to represent this, but the D syntax is better, because
it's specific, it's not used for other purposes, and it uses a readable
word to denote it instead of arbitrary generic symbols.
So I think just saying "a lot of syntax", as you have done, is not
meaningful enough. In my opinion programmers are able to learn and use a
lot of syntax (maybe even more than the amount currently present in D) if
such syntax is:
- Readable. So for example it uses an English word (example: abstract), or
it's common in ordinary mathematics, or it's another instance of a syntax
already present elsewhere in the language (example: tuple slicing syntax
is the same as array slicing syntax). This makes it easy to remember, easy
and quick to read, and unambiguous. If a syntax is a bit wordy, that is
often not a problem in a modern language on a modern monitor (so I don't
care that 'abstract' is several chars long when a symbol like ^? could
serve the same purpose saving a few chars; it's not a true gain for the
programmer).
- Specific. Using the same syntax for several different or subtly different
purposes in different parts of the language is bad. A specific syntax is
something that can't be confused with anything else in the language, it's used
for just one purpose and does it well.
- Safe. There is a natural enough way to use it, and the compiler catches
all improper usages. There are no subtly wrong usages that do something
very bad in the program. (This is why I have asked for the compiler to
enforce that only the correct symbol strings appear in code under the new
D2 operator overloading regime. It's too easy to write something wrong,
in my opinion.)
Bye,
bearophile

I think 0 makes perfect sense for any ordered container, or, really,
anything for which $ makes sense (plus some things for which $ doesn't
make sense, like a right-infinite range). However, the rest of your
argument convinced me.

What if you order by > instead of <? Then 0 should surely be the end.
Also, what if you have an array that you can access with a negative
index? Then 0 is at some indeterminate point in the range.
Ok, it's pretty bloody rare to have arrays like that, but I've done it a
couple of times before now.
--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk

Speaking of regex, [^ sequence starts a set of excluded characters. :)
$ has always bugged me anyway, so how about no character at all:
coll[..n]; // beginning to n
coll[n..]; // n to end
coll[..]; // all of it
I like it! :)
Ali

The $ is not elegant, but it's a good solution to a design problem: how
to represent items from the end of the array. In Python you write:
a[-5]
In D you write:
a[$-5]
This small one-char difference has an important effect for a system language:
the presence of $ allows you to avoid a conditional each time you want to
access an array item :-)
So I think of it as one of the smartest details of D's design :-)
Bye,
bearophile
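A small Python check of the correspondence bearophile describes: D's
a[$-5] is the explicit length-relative form, and Python's a[-5] is
shorthand for the same element (the six-element list is an invented
example).

```python
a = [10, 20, 30, 40, 50, 60]

# The explicit form, corresponding to D's a[$-5]:
assert a[len(a) - 5] == 20

# Python's negative-index shorthand for the same element:
assert a[-5] == 20
```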

The $ is not elegant, but it's a good solution to a design problem: how
to represent items from the end of the array. In Python you write:
a[-5]
In D you write:
a[$-5]
This small one-char difference has an important effect for a system
language: the presence of $ allows you to avoid a conditional each time
you want to access an array item :-)
So I think of it as one of the smartest details of D's design :-)

Once upon a time, there was a book called "Writing Solid Code". It seemed
that anyone who was an established, respectable programmer swore by it and
proclaimed it should be required reading by all programmers. These days, I
sometimes feel like I'm the only one who's ever heard of it (let alone read
it).
So much of the book has made such an impact on me as a programmer, that from
the very first time I ever heard of a language (probably Python) using
"someArray[-5]" to denote an index from the end, I swear, the very first
thought that popped into my head was "Candy-Machine Interface". I instantly
disliked it, and still consider it a misguided design.
For anyone who doesn't see the problem with Python's negative indices
(or anyone who wants to delve into one of the forerunners to great books
like "Code Craft" or "The Pragmatic Programmer"), I *highly* recommend
tracking down a copy of "Writing Solid Code" and reading "The One-Function
Memory Manager" and "Wishy-Washy Inputs", both in the "Candy-Machine
Interfaces" chapter.
(Although, the book did have such an impact on the programming world at the
time, that many of the cautions in it sound like no-brainers today, like not
using return values to indicate error codes. But even for those, it goes
into much more detail on the "why" than what you usually hear.)

Once upon a time, there was a book called "Writing Solid Code". It seemed
that anyone who was an established, respectable programmer swore by it and
proclaimed it should be required reading by all programmers. These days, I
sometimes feel like I'm the only one who's ever heard of it (let alone read
it).
So much of the book has made such an impact on me as a programmer, that from
the very first time I ever heard of a language (probably Python) using
"someArray[-5]" to denote an index from the end, I swear, the very first
thought that popped into my head was "Candy-Machine Interface". I instantly
disliked it, and still consider it a misguided design.
For anyone who doesn't see the problem with Python's negative indices
(or anyone who wants to delve into one of the forerunners to great books
like "Code Craft" or "The Pragmatic Programmer"), I *highly* recommend
tracking down a copy of "Writing Solid Code" and reading "The One-Function
Memory Manager" and "Wishy-Washy Inputs", both in the "Candy-Machine
Interfaces" chapter.

I have not read this book. Thank you for the suggestion; I will read it.
There are tons of books about programming and computer science, but good
books are very uncommon; I could probably list fewer than a dozen titles.
So given a few years it's easy to read them all.

It seemed that anyone who was an established, respectable programmer swore by
it and proclaimed it should be required reading by all programmers.<

"Writing Solid Code" is a book about programming, but its examples are in
C and it's focused on C programming. Today some people write code all day
and don't even know how to write ten lines of C code. Time goes on, and
what was once regarded as indispensable is less important today (in my
university the C language is taught only around the third year, often in
just one course, and the teacher is not good at all; the code written on
the blackboard can sometimes serve me as an example of how not to write C
code). This happens in all fields of human knowledge. In practice, given
enough time, almost no book remains indispensable. Books that stay
interesting for centuries are uncommon, and they are usually about human
nature (like novels), which changes very little as the years pass :-)

and reading "The One-Function Memory Manager"<

C99 has changed things a bit:

In both C89 and C99, realloc with length 0 is a special case. The C89 standard
explicitly states that the pointer given is freed, and that the return is
either a null pointer or a pointer to the newly allocated space. The C99
standard says that realloc deallocates its pointer argument (regardless of the
size value) and allocates a new one of the specified size.<

I agree that C realloc is a function that tries to do too many things. C libs
are not perfect.

"Wishy-Washy Inputs",<

Python supports named function arguments (as does C# 4, and I hope to see
them in D3); this can reduce the bug count because you can state what an
argument is at the calling point too.
The code of CopySubStr is bad:
- Abbreviated names for functions (and often variables too) are bad.
- Unless very useful, it's better to avoid pointers and to use normal array
syntax [].
- There is no need to put () around the return value in C.

that many of the cautions in it sound like no-brainers today, like not using
return values to indicate error codes.<

Generally in Python, when some function argument is not acceptable, an
exception is raised. Exceptions are used in D for similar purposes. But in
D you also have contracts, which I am starting to use more and more.

So much of the book has made such an impact on me as a programmer, that from
the very first time I ever heard of a language (probably Python) using
"someArray[-5]" to denote an index from the end, I swear, the very first
thought that popped into my head was "Candy-Machine Interface". I instantly
disliked it, and still consider it a misguided design.<

A negative value to index items from the end of an array is a bad idea in
C (and D): it slows down the code and it's unsafe.
But you must understand that what's unsafe in C is not equally unsafe in
Python, and the other way around is true too. A well-enough-designed
computer language is not a collection of features; it's a coherent thing,
so its features are adapted to each other. So even if a[5] in Python looks
like the same syntax as a[5] in C, in practice they are very different
things. Python arrays are not pointers, and out-of-bounds exceptions are
always present. And often in Python you don't actually use a[i]; you use
something like:
for x in a:
    do_something(x)
As you see, there are no indexes visible here (as in D's foreach).
What I am trying to say is that while I agree that negative indexes can be
tricky, and they are probably too unsafe in C programs (given how
everything else works in C programs), they are not too unsafe in Python
programs (given how everything else works in Python programs). In Python
you have to be a little careful when you use them, but they usually don't
cause disasters in my code.

But even for those, it goes into much more detail on the "why" than what you
usually hear.)<

It seemed that anyone who was an established, respectable programmer swore
by it and proclaimed it should be required reading by all programmers.<

"Writing Solid Code" is a book about programming, but its examples are in
C and it's focused on C programming. Today some people write code all day
and don't even know how to write ten lines of C code. Time goes on, and
what was once regarded as indispensable is less important today (in my
university the C language is taught only around the third year, often in
just one course, and the teacher is not good at all; the code written on
the blackboard can sometimes serve me as an example of how not to write C
code). This happens in all fields of human knowledge. In practice, given
enough time, almost no book remains indispensable. Books that stay
interesting for centuries are uncommon, and they are usually about human
nature (like novels), which changes very little as the years pass :-)

Yea. The book is heavily C, of course, because C was heavily used at the
time. But I think another reason for all the focus on C is that the
typical C style (at least at the time) and the standard C lib are filled
with great examples of "what not to do". ;)

"Wishy-Washy Inputs",<

Python supports named function arguments (as does C# 4, and I hope to see
them in D3); this can reduce the bug count because you can state what an
argument is at the calling point too.

Yea, non-named-arguments-only has been feeling more and more antiquated to
me lately.

The code of CopySubStr is bad:
- Abbreviated names for functions (and often variables too) are bad.

There are two major schools of thought on that. One side says full names
are clearer and less prone to misinterpretation. The other side feels that
a few obvious and consistent abbreviations make code much easier to read
at a glance and don't cause misinterpretation unless misused. Personally,
I lean toward the latter group. (Some people also say abbreviations are
bad because the number of bytes saved is insignificant on modern hardware.
But I find that a bit of a strawman, since everybody on *both* sides
agrees with that, and the people who still use abbreviations generally
don't do so for that particular reason anymore.)

- Unless very useful, it's better to avoid pointers and to use normal
array syntax [].

Heh, yea. Well, that's old-school C for you ;)

that many of the cautions in it sound like no-brainers today, like not
using return values to indicate error codes.<

Generally in Python, when some function argument is not acceptable, an
exception is raised. Exceptions are used in D for similar purposes. But in
D you also have contracts, which I am starting to use more and more.

Yea, since the book was written, exceptions have pretty much become the de
facto standard way of handling errors. There are times when exceptions
aren't used, or can't be used, but those cases are rare (dare I say, they're
"exceptions"? ;) ), and the most compelling arguments against exceptions are
only applicable to languages that don't have a "finally" clause.

So much of the book has made such an impact on me as a programmer, that
from the very first time I ever heard of a language (probably Python)
using "someArray[-5]" to denote an index from the end, I swear, the very
first thought that popped into my head was "Candy-Machine Interface". I
instantly disliked it, and still consider it a misguided design.<

A negative value to index items from the end of an array is a bad idea in
C (and D): it slows down the code and it's unsafe.
But you must understand that what's unsafe in C is not equally unsafe in
Python, and the other way around is true too. A well-enough-designed
computer language is not a collection of features; it's a coherent thing.
What I am trying to say is that while I agree that negative indexes can be
tricky, and they are probably too unsafe in C programs (given how
everything else works in C programs), they are not too unsafe in Python
programs (given how everything else works in Python programs). In Python
you have to be a little careful when you use them, but they usually don't
cause disasters in my code.

Python certainly makes the consequences of getting the index wrong less
severe than in C, and less likely. But it still strikes me as a bit of a
"dual-purpose" input, and therefore potentially error-prone.
For instance, suppose it's your intent to get the fifth element before the
one that matches "target" (and you already have the index of "target"):
leeloo = collection[targetIndex-5]
Then suppose your collection, unexpectedly, has "target" in the third
position (either because of a bug elsewhere, or because you just forgot to
take into account the possibility that "target" might be one of the first
five). With bounds-checking that ensures no negatives, you find out
instantly. With Python-style indexing, you're happily given the
second-to-last element and a silent bug.
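The silent-bug scenario above can be reproduced directly (the names
target/leeloo follow the post; the ten-element list is an invented
example):

```python
collection = list('abcdefghij')

# 'target' unexpectedly lands near the front (index 3 here):
targetIndex = collection.index('d')

# Intended: the fifth element before 'd'. With bounds checking that
# rejects negatives this would fail fast; in Python, 3 - 5 == -2 just
# wraps around to the second-to-last element without any error.
leeloo = collection[targetIndex - 5]
assert leeloo == 'i'   # a silently wrong answer, not an exception
```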

Do you have specific objections, or does it just look horrendous to you
:) Would another symbol be acceptable?

Thoughts? other ideas?

I'd just go with accepting the literal 0. Let's see how far that goes
first.

I thought of a counter case:
auto tm = new TreeMap!(int, uint);
tm[-1] = 5;
tm[1] = 6;
What does tm[0..$] mean? What about tm[0]? If it is analogous to
"beginning of collection" then it doesn't make any sense for a container
with a key of numeric type.
Actually, any map type where the indexes don't *always* start at zero is
a problem.
I can make 0 work for LinkList and ArrayList, but not any of the others.
Even with TreeSet, I allow using element values as slice arguments.
I guess I should have pointed this out in my first post... sorry.
-Steve
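Steve's TreeMap counter-case can be mimicked with a plain Python dict
standing in for an ordered map (an analogy only, not dcollections code):
0 is just another key, not "the beginning".

```python
tm = {}
tm[-1] = 5
tm[1] = 6

# 0 is not "the first element"; it isn't even a key here:
assert 0 not in tm               # tm[0] would raise KeyError

# The real beginning of the key order is the smallest key:
first_key = min(tm)
assert first_key == -1
assert tm[first_key] == 5
```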

Steven Schveighoffer wrote:
> In regex, ^ matches beginning of the line, $ matches end of the line
So far so good... :)
> So how does this look: coll[^..$];
Speaking of regex, [^ sequence starts a set of excluded characters. :)

Yeah, that is a good counter-argument :)

$ has always bugged me anyway, so how about no character at all:
coll[..n]; // beginning to n
coll[n..]; // n to end
coll[..]; // all of it
I like it! :)

Well, for true contiguous ranges such as arrays, you need ways of adding
or subtracting values. For example:
a[0..$-1];
How does that look with your version?
a[0..-1];
Not good. I think we need something to denote "end", and I would also
like something to denote "beginning", and I think it can't be empty space.
-Steve
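Steve's arithmetic point, checked in Python: an end-relative bound still
supports offsets, and the explicit length form is what a literal "0..-1"
would have to stand for (the four-element list is an invented example).

```python
a = ['a', 'b', 'c', 'd']

# The explicit form of D's a[0..$-1]:
assert a[0:len(a) - 1] == ['a', 'b', 'c']

# Python's shorthand for the same slice; written as a literal "0..-1"
# it is easy to misread, which is the objection above:
assert a[:-1] == ['a', 'b', 'c']
```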

If we were to have something like this (and I'm quite unconvinced that it
is desirable), I'd suggest something beginning with $, e.g. $begin.
But, it seems to me that the slicing syntax assumes that the slicing
index can be mapped to the natural numbers. I think in cases where
that's not true, slicing syntax just shouldn't be used.

I am not sure that it is necessary to have a symbol for the beginning ($
represents the length, not the end, right?).
Anyway, instead of $begin and $end, I would rather have $$ and $ (or vice
versa).
Thoughts?

The problem is D already has a lot of syntax. More syntax just makes the
language more burdensome after a certain point, even if in isolation
it's a good idea.

In a lot of cases, this is somewhat true. On the other hand, shortcut
syntaxes like this are not as bad. What I mean by shortcut is that 1) it's
a shortcut for an existing syntax (e.g. $ is short for coll.length), and
2) it doesn't affect, or improves, readability.
A good example of shortcut syntax is the recent inout changes. At first
the objection was "we already have too much const", but when you look at
the result, it is *less* const because you don't have to worry about the
three cases, only one.
The burden for such shortcuts is usually on readers of such code, not
writers. But a small lesson from the docs is all that is needed. Any new
developer will already be looking up $ when they encounter it; if you put
^ right there with it, it's not so bad. Once you understand the meanings,
it reads just as smoothly (I'd say even smoother) as the alternative
syntax.
I'll also say that I'm not in love with ^; it's just a suggestion. I'd
not be upset if something else were used. But 0 cannot be it.

One particular problem regex has is that few can remember its syntax
unless they use it every day.

I don't use it every day; in fact, I almost always have to look up the
syntax if I want to get fancy.
But I always remember several things:
1. [^abc] means none of these characters
2. . means any character
3. * means 0 or more of the previous element and + means 1 or more of the
previous element
4. ^ and $ mean beginning and end of line. I usually have to look up which
one means which :)
Point 4 may suggest a special error message if someone writes coll[^-1] or
coll[$..^]
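The four regex rules in the list above can be checked directly with
Python's re module:

```python
import re

# 1. [^abc] matches any single character that is none of a, b, c:
assert re.match(r'[^abc]', 'x')
assert not re.match(r'[^abc]', 'a')

# 2. . matches any character:
assert re.match(r'.', 'z')

# 3. * means zero or more of the previous element, + means one or more:
assert re.match(r'ab*c', 'ac')       # zero b's is fine with *
assert re.match(r'ab+c', 'abbc')     # + needs at least one b...
assert not re.match(r'ab+c', 'ac')   # ...so this fails

# 4. ^ anchors the beginning of the line, $ anchors the end:
assert re.search(r'^foo', 'foobar')
assert re.search(r'bar$', 'foobar')
assert not re.search(r'^bar', 'foobar')
```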

Thoughts? other ideas?

I'd just go with accepting the literal 0. Let's see how far that goes
first.

auto tm = new TreeMap!(int, uint);
tm[-1] = 5;
tm[1] = 6;
What does tm[0..$] mean? What about tm[0]? If it is analogous to
"beginning of collection" then it doesn't make any sense for a
container with a key of numeric type.
Actually any map type where the indexes don't *always* start at zero
are a problem.

I'd question the design of a map type that has the start at something
other than 0.

Then I guess you question the AA design? Or STL's std::map? Or Java's
TreeMap and HashMap? Or dcollections' map types?
I don't think you meant this. The whole *point* of a map is to have
arbitrary indexes, requiring them to start at 0 would defeat the whole
purpose.
-Steve

Currently, D supports the special symbol $ to mean the end of a
container/range.
However, there is no analogous symbol to mean "beginning of a
container/range". For arrays, there is none necessary, 0 is always the
first element. But not all containers are arrays.
I'm running into a dilemma for dcollections, I have found a way to
make all containers support fast slicing (basically by imposing some
limitations), and I would like to support *both* beginning and end
symbols.
Currently, you can slice something in dcollections via:
coll[coll.begin..coll.end];
I could replace that end with $, but what can I replace coll.begin
with? 0 doesn't make sense for things like linked lists, maps, sets,
basically anything that's not an array.
One thing that's nice about opDollar is I can make it return coll.end,
so I control the type. With 0, I have no choice, I must take a uint,
which means I have to check to make sure it's always zero, and throw an
exception otherwise.
Would it make sense to have an equivalent symbol for the beginning of
a container/range?
In regex, ^ matches beginning of the line, $ matches end of the line
-- would there be any parsing ambiguity there? I know ^ is a binary
op, and $ means nothing anywhere else, so the two are not exactly
equivalent. I'm not very experienced on parsing ambiguities, but
things like ~ can be unambiguous as binary and unary ops, so maybe it
is possible.
So how does this look: coll[^..$];
Thoughts? other ideas?
-Steve

If we were to have something like this (and I'm quite unconvinced that
it is desirable), I'd suggest something beginning with $, e.g. $begin.

This would be better than nothing.

But, it seems to me that the slicing syntax assumes that the slicing
index can be mapped to the natural numbers. I think in cases where
that's not true, slicing syntax just shouldn't be used.

Slicing implies order, that is for sure. But mapping to the natural
numbers may be too strict. I look at slicing in a different way;
hopefully you can follow my train of thought.
dcollections, as a D2 lib, should support ranges, I think that makes the
most sense. All containers in dcollections are classes, so they can't
also be ranges (my belief is that a reference-type based range is too
awkward to be useful). The basic operation to get a range from a
container is to get all the elements as a range (a struct with the range
interface).
So what if I want a subrange? Well, I can pick off the ends of the range
until I get the right elements as the end points. But if it's possible,
why not allow slicing as a better means of doing this? However, slicing
should be a fast operation. Slicing quickly isn't always feasible, for
example, LinkList must walk through the list until you find the right
element, so that's an O(n) operation. So my thought was to allow slicing,
but with the index being a cursor (i.e. pointer) to the elements you want
to be the end points.
Well, if we are to follow array convention, and want to avoid violating
memory safety, we should verify those end points make sense; we don't want
to return an invalid slice. In some cases, verifying that the end points
are in the correct order is slow, O(n) again. But you always have reasonably
quick access to the first and last elements of a container, and you *know*
their order relative to any other element in the container.
So in dcollections, I support slicing on all collections based on two
cursors, and in all collections, if you make the first cursor the
beginning cursor, or the second cursor the end cursor, it will work. In
some cases, I support slicing on arbitrary cursors, where I can quickly
determine validity of the cursors. The only two cases which allow this
are the ArrayList, which is array based, and the Tree classes (TreeMap,
TreeSet, TreeMultiset), where determining validity is at most a O(lgN)
operation.
Essentially, I see slicing as a way to create a subrange of a container,
where the order of the two end points can be quickly verified.
auto dict = new TreeMap!(string, string); // TreeMap is sorted
...
auto firstHalf = dict["A".."M"];
(You say that slicing using anything besides natural numbers shouldn't be
used. You don't see any value in the above?)
But "A" may not be the first element, there could be strings that are less
than it (for example, strings that start with _), such is the way with
arbitrary maps. So a better way to get the first half may be:
auto firstHalf = dict[dict.begin.."M"];
What does the second half look like?
auto secondHalf = dict["M"..dict.end];
Well, if we are to follow array convention, the second half can be
shortened like this:
auto secondHalf = dict["M"..$];
Which looks and reads rather nicely. But there is no equivalent "begin"
shortcut because $ was invented for arrays, which always have a way to
access the first element -- 0. Arbitrary maps have no such index. So
although it's not necessary, a shortcut for begin would also be nice.
Anyways, that's what led me to propose we have some kind of shortcut. If
nothing else, at least I hope you now see where I'm coming from, and
hopefully you can see that slicing is useful in cases other than natural
number indexes.
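To put the asymmetry in one place, here is a sketch using the dcollections-style API under discussion (TreeMap, begin, and the proposed opDollar support for non-integer slicing are assumptions here, not working code):

```d
auto dict = new TreeMap!(string, string); // sorted by key
dict["Apple"] = "1";
dict["Mango"] = "2";
dict["Zebra"] = "3";

// the end point has a shortcut, $ (via opDollar):
auto secondHalf = dict["M" .. $];         // "Mango", "Zebra"

// but the begin point has none -- the container must be restated:
auto firstHalf = dict[dict.begin .. "M"]; // "Apple"

// with a begin symbol, this could shrink to:
//   auto firstHalf = dict[^ .. "M"];
```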
-Steve

Currently, D supports the special symbol $ to mean the end of a
container/range.
However, there is no analogous symbol to mean "beginning of a
container/range". For arrays, there is none necessary, 0 is always the
first element. But not all containers are arrays.
I'm running into a dilemma for dcollections, I have found a way to make
all containers support fast slicing (basically by imposing some
limitations), and I would like to support *both* beginning and end symbols.
Currently, you can slice something in dcollections via:
coll[coll.begin..coll.end];
-Steve

No. begin and end return cursors, which are essentially non-movable
pointers.
The only collection where adding an integer to a cursor would be
feasible is ArrayList, which does support slicing via indexes (and
indexes can be added/subtracted as needed).

Emphasis on the semantics (a slice starting at the second element), not
the arithmetic, sorry.

Currently, D supports the special symbol $ to mean the end of a
container/range.
However, there is no analogous symbol to mean "beginning of a
container/range". For arrays, there is none necessary, 0 is always the
first element. But not all containers are arrays.
I'm running into a dilemma for dcollections, I have found a way to make
all containers support fast slicing (basically by imposing some
limitations), and I would like to support *both* beginning and end
symbols.
Currently, you can slice something in dcollections via:
coll[coll.begin..coll.end];
-Steve

No. begin and end return cursors, which are essentially non-movable
pointers.
The only collection where adding an integer to a cursor would be feasible
is ArrayList, which does support slicing via indexes (and indexes can be
added/subtracted as needed).
Note, I'm not trying to make slicing dcollections as comprehensive as
slicing with arrays, I'm just looking to avoid the verbosity of re-stating
the container's symbol when specifying the beginning or end. In all
dcollections containers, one can always slice where one of the endpoints
is begin or end.
I probably should at least add opDollar to ArrayList...
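For the record, a minimal opDollar for an array-backed container might look something like this (a sketch, not the actual dcollections source; names and details are illustrative):

```d
class ArrayList(V)
{
    private V[] _data;

    V opIndex(size_t i) { return _data[i]; }

    // $ inside an index or slice expression lowers to a.opDollar()
    size_t opDollar() { return _data.length; }

    // a[lo .. hi] lowers to a.opSlice(lo, hi)
    ArrayList opSlice(size_t lo, size_t hi)
    {
        auto result = new ArrayList;
        result._data = _data[lo .. hi]; // shares the underlying storage
        return result;
    }
}
// usage: list[1 .. $] -- from the second element to the end
```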
-Steve