Chapter 2 Types, Operators, and Expressions

D provides the ability to access and manipulate a variety of data objects: variables and data structures can be created and modified, data objects defined in the operating system kernel and user processes can be accessed, and integer, floating-point, and string constants can be declared. D provides a superset of the ANSI-C operators that are used to manipulate objects and create complex expressions. This chapter describes the detailed set of rules for types, operators, and expressions.

Identifier Names and Keywords

D identifier names are composed of upper case and lower case letters, digits, and underscores where the first character must be a letter or underscore. All identifier names beginning with an underscore (_) are reserved for use by the D system libraries. You should avoid using such names in your D programs. By convention, D programmers typically use mixed-case names for variables and all upper case names for constants.

D language keywords are special identifiers reserved for use in the programming language syntax itself. These names are always specified in lower case and may not be used for the names of D variables.

Table 2–1 D Keywords

auto*        goto*        sizeof
break*       if*          static*
case*        import*+     string+
char         inline       stringof+
const        int          struct
continue*    long         switch*
counter*+    offsetof+    this+
default*     probe*+      translator+
do*          provider*+   typedef
double       register*    union
else*        restrict*    unsigned
enum         return*      void
extern       self+        volatile
float        short        while*
for*         signed       xlate+

D reserves for use as keywords a superset of the ANSI-C keywords. The keywords reserved for future use by the D language are marked with “*”. The D compiler will produce a syntax error if you attempt to use a keyword that is reserved for future use. The keywords defined by D but not defined by ANSI-C are marked with “+”. D provides the complete set of types and operators found in ANSI-C. The major difference in D programming is the absence of control-flow constructs. Keywords associated with control-flow in ANSI-C are reserved for future use in D.

Data Types and Sizes

D provides fundamental data types for integers and floating-point constants. Arithmetic may only be performed on integers in D programs. Floating-point constants may be used to initialize data structures, but floating-point arithmetic is not permitted in D. D provides a 32-bit and 64-bit data model for use in writing programs. The data model used when executing your program is the native data model associated with the active operating system kernel. You can determine the native data model for your system using isainfo -b.

The names of the integer types and their sizes in each of the two data models are shown in the following table. Integers are always represented in two's-complement form in the native byte-encoding order of your system.

Table 2–2 D Integer Data Types

Type Name    32-bit Size    64-bit Size
char         1 byte         1 byte
short        2 bytes        2 bytes
int          4 bytes        4 bytes
long         4 bytes        8 bytes
long long    8 bytes        8 bytes

Integer types may be prefixed with the signed or unsigned qualifier. If no sign qualifier is present, the type is assumed to be signed. The D compiler also provides the type aliases listed in the following table:

Table 2–3 D Integer Type Aliases

Type Name    Description
int8_t       1 byte signed integer
int16_t      2 byte signed integer
int32_t      4 byte signed integer
int64_t      8 byte signed integer
intptr_t     Signed integer of size equal to a pointer
uint8_t      1 byte unsigned integer
uint16_t     2 byte unsigned integer
uint32_t     4 byte unsigned integer
uint64_t     8 byte unsigned integer
uintptr_t    Unsigned integer of size equal to a pointer

These type aliases are equivalent to using the name of the corresponding base type in the previous table and are appropriately defined for each data model. For example, the type name uint8_t is an alias for the type unsigned char. See Chapter 8, Type and Constant Definitions for information on how to define your own type aliases for use in your D programs.
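As a quick cross-check of the alias table, the following C sketch tests the alias sizes on the machine where it is compiled. C is used here only because its <stdint.h> header defines the same alias names; the function names are illustrative and are not part of D. The pointer-size check assumes a typical platform where intptr_t is exactly as wide as a pointer.

```c
#include <stdint.h>

/* Return 1 if every fixed-width alias has the size listed in the table. */
int alias_sizes_match_table(void)
{
    return sizeof(int8_t) == 1 && sizeof(uint8_t) == 1 &&
           sizeof(int16_t) == 2 && sizeof(uint16_t) == 2 &&
           sizeof(int32_t) == 4 && sizeof(uint32_t) == 4 &&
           sizeof(int64_t) == 8 && sizeof(uint64_t) == 8;
}

/* Return 1 if the pointer-sized aliases are as wide as a pointer
 * (an assumption that holds on common platforms). */
int pointer_aliases_match(void)
{
    return sizeof(intptr_t) == sizeof(void *) &&
           sizeof(uintptr_t) == sizeof(void *);
}
```

Both functions return 1 on common 32-bit and 64-bit systems; only the pointer-sized aliases change width between the two data models.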

D provides floating-point types for compatibility with ANSI-C declarations and types. Floating-point operators are not supported in D, but floating-point data objects can be traced and formatted using the printf() function. The floating-point types listed in the following table may be used:

Table 2–4 D Floating-Point Data Types

Type Name      32-bit Size    64-bit Size
float          4 bytes        4 bytes
double         8 bytes        8 bytes
long double    16 bytes       16 bytes

D also provides the special type string to represent ASCII strings. Strings are discussed in more detail in Chapter 6, Strings.

Constants

Integer constants can be written in decimal (12345), octal (012345), or hexadecimal (0x12345). Octal (base 8) constants must be prefixed with a leading zero. Hexadecimal (base 16) constants must be prefixed with either 0x or 0X. Integer constants are assigned the smallest type among int, long, and long long that can represent their value. If the value is negative, the signed version of the type is used. If the value is positive and too large to fit in the signed type representation, the unsigned type representation is used. You can apply one of the following suffixes to any integer constant to explicitly specify its D type:

u or U        unsigned version of the type selected by the compiler
l or L        long
ul or UL      unsigned long
ll or LL      long long
ull or ULL    unsigned long long
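The three notations and the type suffixes are written the same way in C, so a small C sketch can confirm what the example constants denote. The function names below are made up for illustration.

```c
/* Each function returns the numeric value of a literal written in a
 * different base. */
long decimal_constant(void) { return 12345; }
long octal_constant(void)   { return 012345; }   /* leading 0: base 8 */
long hex_constant(void)     { return 0x12345; }  /* leading 0x: base 16 */

/* A suffix selects the type explicitly; sizeof reports the resulting width. */
int suffix_l_is_long(void)
{
    return sizeof(1L) == sizeof(long);
}

int suffix_ull_is_unsigned_long_long(void)
{
    return sizeof(1ULL) == sizeof(unsigned long long);
}
```

Although the three literals share the same digits, they are different numbers: 012345 is 5349 in decimal and 0x12345 is 74565 in decimal.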

Floating-point constants are always written in decimal and must contain either a decimal point (12.345) or an exponent (123e45) or both (123.34e-5). Floating-point constants are assigned the type double by default. You can apply one of the following suffixes to any floating-point constant to explicitly specify its D type:

f or F    float
l or L    long double

Character constants are written as a single character or escape sequence enclosed in a pair of single quotes ('a'). Character constants are assigned the type int and are equivalent to an integer constant whose value is determined by that character's value in the ASCII character set. You can refer to ascii(5) for a list of characters and their values. You can also use any of the special escape sequences shown in the following table in your character constants. D supports the same escape sequences found in ANSI-C.

Table 2–5 D Character Escape Sequences

\a    alert              \\      backslash
\b    backspace          \?      question mark
\f    formfeed           \'      single quote
\n    newline            \"      double quote
\r    carriage return    \0oo    octal value 0oo
\t    horizontal tab     \xhh    hexadecimal value 0xhh
\v    vertical tab       \0      null character
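Because character constants are integers, they can be compared with plain numbers. The C sketch below illustrates this with a few of the escape sequences, assuming an ASCII character set; the function names are illustrative only.

```c
/* Character constants have type int; each function returns the integer
 * value of one constant. */
int letter_a(void)     { return 'a'; }
int newline_char(void) { return '\n'; }
int octal_escape(void) { return '\012'; }  /* octal 012 */
int hex_escape(void)   { return '\x41'; }  /* hexadecimal 0x41 */
int null_char(void)    { return '\0'; }
```

Under ASCII, 'a' is 97, '\n' and '\012' are both 10, '\x41' is 65 (the letter A), and '\0' is 0.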

You can include more than one character specifier inside single quotes to create integers whose individual bytes are initialized according to the corresponding character specifiers. The bytes are read left-to-right from your character constant and assigned to the resulting integer in the order corresponding to the native endian-ness of your operating environment. Up to eight character specifiers can be included in a single character constant.

String constants of any length can be composed by enclosing them in a pair of double quotes ("hello"). A string constant may not contain a literal newline character. To create strings containing newlines, use the \n escape sequence instead of a literal newline. String constants may contain any of the special character escape sequences shown for character constants above. Similar to ANSI-C, strings are represented as arrays of characters terminated by a null character (\0) that is implicitly added to each string constant that you declare. String constants are assigned the special D type string. The D compiler provides a set of special features for comparing and tracing character arrays that are declared as strings, as described in Chapter 6, Strings.

Arithmetic Operators

D provides the binary arithmetic operators shown in the following table for use in your programs. These operators all have the same meaning for integers as they do in ANSI-C.

Table 2–6 D Binary Arithmetic Operators

+    integer addition
-    integer subtraction
*    integer multiplication
/    integer division
%    integer modulus

Arithmetic in D may only be performed on integer operands, or on pointers, as discussed in Chapter 5, Pointers and Arrays. Arithmetic may not be performed on floating-point operands in D programs. The DTrace execution environment does not take any action on integer overflow or underflow. You must check for these conditions yourself in situations where overflow and underflow can occur.

The DTrace execution environment does automatically check for and report division by zero errors resulting from improper use of the / and % operators. If a D program executes an invalid division operation, DTrace will automatically disable the affected instrumentation and report the error. Errors detected by DTrace have no effect on other DTrace users or on the operating system kernel, so you don't need to worry about causing any damage if your D program inadvertently contains one of these errors.
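Since overflow and underflow are not reported, a program has to test for them before performing the operation. The following C sketch shows one common way to do that for 64-bit signed addition; add_overflows is a hypothetical helper name, and the same comparisons could be written as a D expression because they use only integer relational and arithmetic operators.

```c
#include <stdint.h>

/* Return 1 if a + b would overflow or underflow int64_t, else 0.
 * The test is done without performing the addition itself. */
int add_overflows(int64_t a, int64_t b)
{
    if (b > 0 && a > INT64_MAX - b)
        return 1;   /* result would exceed the largest value */
    if (b < 0 && a < INT64_MIN - b)
        return 1;   /* result would fall below the smallest value */
    return 0;
}
```

The check rearranges a + b > INT64_MAX into a > INT64_MAX - b, which cannot itself overflow when b is positive, and does the mirror-image rearrangement when b is negative.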

In addition to these binary operators, the + and - operators may also be used as unary operators; the unary forms have higher precedence than any of the binary arithmetic operators. The order of precedence and associativity properties for all the D operators is presented in Table 2–11. You can control precedence by grouping expressions in parentheses ( ).
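The effect of unary minus, precedence, and parentheses can be seen in a short C sketch, which applies here because these operators have the same meaning for integers as in ANSI-C. The function names are illustrative.

```c
/* Unary minus binds first, then *, then +: parsed as (-a) + (b * c). */
int ungrouped(int a, int b, int c) { return -a + b * c; }

/* Parentheses force the addition to happen before the multiplication. */
int grouped(int a, int b, int c)   { return (-a + b) * c; }

/* Integer division discards the fractional part; % yields the remainder. */
int quotient(int a, int b)  { return a / b; }
int remainder_of(int a, int b) { return a % b; }
```

With a = 2, b = 3, and c = 4, ungrouped yields 10 and grouped yields 4. Dividing 7 by 2 yields a quotient of 3 and a remainder of 1.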

Relational Operators

D provides the binary relational operators shown in the following table for use in your programs. These operators all have the same meaning as they do in ANSI-C.

Table 2–7 D Relational Operators

<     left-hand operand is less than right-hand operand
<=    left-hand operand is less than or equal to right-hand operand
>     left-hand operand is greater than right-hand operand
>=    left-hand operand is greater than or equal to right-hand operand
==    left-hand operand is equal to right-hand operand
!=    left-hand operand is not equal to right-hand operand

Relational operators are most frequently used to write D predicates. Each operator evaluates to a value of type int which is equal to one if the condition is true, or zero if it is false.

Relational operators may be applied to pairs of integers, pointers, or strings. If pointers are compared, the result is equivalent to an integer comparison of the two pointers interpreted as unsigned integers. If strings are compared, the result is determined as if by performing a strcmp(3C) on the two operands. Here are some example D string comparisons and their results:

"coffee" < "espresso"     ... returns 1 (true)
"coffee" == "coffee"      ... returns 1 (true)
"coffee" >= "mocha"       ... returns 0 (false)
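C has no relational operators for strings, but the strcmp() rule described above can be spelled out directly. The helpers below are a sketch of that rule with made-up names (str_lt, str_eq, str_ge); they are not D functions.

```c
#include <string.h>

/* Emulate D's string "<": 1 if a sorts before b under strcmp(), else 0. */
int str_lt(const char *a, const char *b) { return strcmp(a, b) < 0; }

/* Emulate D's string "==": 1 if the two strings are identical, else 0. */
int str_eq(const char *a, const char *b) { return strcmp(a, b) == 0; }

/* Emulate D's string ">=": 1 if a sorts at or after b, else 0. */
int str_ge(const char *a, const char *b) { return strcmp(a, b) >= 0; }
```

Applied to the three comparisons listed above, these helpers give 1, 1, and 0 respectively, because "coffee" sorts before both "espresso" and "mocha".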

Relational operators may also be used to compare a data object associated with an enumeration type with any of the enumerator tags defined by the enumeration. Enumerations are a facility for creating named integer constants and are described in more detail in Chapter 8, Type and Constant Definitions.

Logical Operators

D provides the following binary logical operators for use in your programs. The first two operators are equivalent to the corresponding ANSI-C operators.

Table 2–8 D Logical Operators

&&    logical AND: true if both operands are true
||    logical OR: true if one or both operands are true
^^    logical XOR: true if exactly one operand is true

Logical operators are most frequently used in writing D predicates. The logical AND operator performs short-circuit evaluation: if the left-hand operand is false, the right-hand expression is not evaluated. The logical OR operator also performs short-circuit evaluation: if the left-hand operand is true, the right-hand expression is not evaluated. The logical XOR operator does not short-circuit: both expression operands are always evaluated.

In addition to the binary logical operators, the unary ! operator may be used to perform a logical negation of a single operand: it converts a zero operand into a one, and a non-zero operand into a zero. By convention, D programmers use ! when working with integers that are meant to represent boolean values, and == 0 when working with non-boolean integers, although both expressions are equivalent in meaning.

The logical operators may be applied to operands of integer or pointer types. The logical operators interpret pointer operands as unsigned integer values. As with all logical and relational operators in D, operands are true if they have a non-zero integer value and false if they have a zero integer value.
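ANSI-C has no ^^ operator, so the C sketch below emulates logical XOR by first reducing each operand to 0 or 1, and uses a call counter to make short-circuit evaluation visible. All names here (logical_xor, bump, and_calls, or_calls) are hypothetical and exist only for this illustration.

```c
/* Logical XOR: true if exactly one operand is non-zero. Each operand is
 * normalized with ! so that any non-zero value counts as true. */
int logical_xor(long a, long b) { return !a != !b; }

/* Number of times bump() has run since the counter was last reset. */
static int calls;

/* Right-hand operand with a visible side effect. */
int bump(void) { calls++; return 1; }

/* Return how many times the right-hand side of && was evaluated. */
int and_calls(int left)
{
    calls = 0;
    (void)(left && bump());
    return calls;
}

/* Return how many times the right-hand side of || was evaluated. */
int or_calls(int left)
{
    calls = 0;
    (void)(left || bump());
    return calls;
}
```

and_calls(0) and or_calls(1) both return 0 because the left-hand operand already decides the result, while and_calls(1) and or_calls(0) return 1.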

Bitwise Operators

D provides the following binary operators for manipulating individual bits inside of integer operands. These operators all have the same meaning as in ANSI-C.

Table 2–9 D Bitwise Operators

&     bitwise AND
|     bitwise OR
^     bitwise XOR
<<    shift the left-hand operand left by the number of bits specified by the right-hand operand
>>    shift the left-hand operand right by the number of bits specified by the right-hand operand

The binary & operator is used to clear bits from an integer operand. The binary | operator is used to set bits in an integer operand. The binary ^ operator returns one in each bit position where exactly one of the corresponding operand bits is set.

The shift operators are used to move bits left or right in a given integer operand. Shifting left fills empty bit positions on the right-hand side of the result with zeroes. Shifting right using an unsigned integer operand fills empty bit positions on the left-hand side of the result with zeroes. Shifting right using a signed integer operand fills empty bit positions on the left-hand side with the value of the sign bit, also known as an arithmetic shift operation.

Shifting an integer value by a negative number of bits or by a number of bits larger than the number of bits in the left-hand operand itself produces an undefined result. The D compiler will produce an error message if the compiler can detect this condition when you compile your D program.

In addition to the binary bitwise operators, the unary ~ operator may be used to perform a bitwise negation of a single operand: it converts each zero bit in the operand into a one bit, and each one bit in the operand into a zero bit.
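The usual idioms for clearing, setting, and toggling bits combine these operators with a mask. The following C sketch uses unsigned 32-bit values; the helper names are illustrative, and returning 0 for an out-of-range shift count is a choice made for this example rather than anything D prescribes.

```c
#include <stdint.h>

/* Clear the bits selected by mask: AND with the inverted mask. */
uint32_t clear_bits(uint32_t v, uint32_t mask)  { return v & ~mask; }

/* Set the bits selected by mask. */
uint32_t set_bits(uint32_t v, uint32_t mask)    { return v | mask; }

/* Flip the bits selected by mask. */
uint32_t toggle_bits(uint32_t v, uint32_t mask) { return v ^ mask; }

/* Shift left only when the count is in range; otherwise return 0 instead
 * of relying on an undefined result. In C, a count equal to the operand
 * width is undefined as well, so it is rejected here too. */
uint32_t shl_checked(uint32_t v, int count)
{
    return (count >= 0 && count < 32) ? v << count : 0;
}
```

For example, clearing the low four bits of 0xFF gives 0xF0, and setting them again on 0xF0 gives 0xFF.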

Assignment Operators

D provides the following binary assignment operators for modifying D variables. You can only modify D variables and arrays. Kernel data objects and constants may not be modified using the D assignment operators. The assignment operators have the same meaning as they do in ANSI-C.

Table 2–10 D Assignment Operators

=      set the left-hand operand equal to the right-hand expression value
+=     increment the left-hand operand by the right-hand expression value
-=     decrement the left-hand operand by the right-hand expression value
*=     multiply the left-hand operand by the right-hand expression value
/=     divide the left-hand operand by the right-hand expression value
%=     modulo the left-hand operand by the right-hand expression value
|=     bitwise OR the left-hand operand with the right-hand expression value
&=     bitwise AND the left-hand operand with the right-hand expression value
^=     bitwise XOR the left-hand operand with the right-hand expression value
<<=    shift the left-hand operand left by the number of bits specified by the right-hand expression value
>>=    shift the left-hand operand right by the number of bits specified by the right-hand expression value

Aside from the assignment operator =, the other assignment operators are provided as shorthand for using the = operator with one of the other operators described earlier. For example, the expression x = x + 1 is equivalent to the expression x += 1, except that the expression x is evaluated once. These assignment operators obey the same rules for operand types as the binary forms described earlier.

The result of any assignment operator is an expression equal to the new value of the left-hand expression. You can use the assignment operators or any of the operators described so far in combination to form expressions of arbitrary complexity. You can use parentheses ( ) to group terms in complex expressions.
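A short C sketch can make both points concrete: the compound forms are shorthand for an operator followed by assignment, and an assignment is itself an expression with a value. The function names are illustrative.

```c
/* Apply a sequence of compound assignments and return the final value. */
int compound_demo(int x)
{
    x += 3;    /* same as x = x + 3  */
    x *= 2;    /* same as x = x * 2  */
    x <<= 1;   /* same as x = x << 1 */
    x %= 7;    /* same as x = x % 7  */
    return x;
}

/* The value of (b = 4) is the new value of b, so it can be used inside
 * a larger expression. Returns a * 10 + b to expose both variables. */
int chained(void)
{
    int a, b;

    a = (b = 4) + 1;
    return a * 10 + b;
}
```

Starting from 1, compound_demo passes through 4, 8, and 16 and returns 2. chained leaves b equal to 4 and a equal to 5, so it returns 54.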

Increment and Decrement Operators

D provides the special unary ++ and -- operators for incrementing and decrementing pointers and integers. These operators have the same meaning as in ANSI-C. These operators can only be applied to variables, and may be applied either before or after the variable name. If the operator appears before the variable name, the variable is first modified and then the resulting expression is equal to the new value of the variable. For example, the following two expressions produce identical results:

x += 1;        y = ++x;
y = x;

If the operator appears after the variable name, then the variable is modified after its current value is returned for use in the expression. For example, the following two expressions produce identical results:

y = x;         y = x--;
x -= 1;

You can use the increment and decrement operators to create new variables without declaring them. If a variable declaration is omitted and the increment or decrement operator is applied to a variable, the variable is implicitly declared to be of type int64_t.

The increment and decrement operators can be applied to integer or pointer variables. When applied to integer variables, the operators increment or decrement the corresponding value by one. When applied to pointer variables, the operators increment or decrement the pointer address by the size of the data type referenced by the pointer. Pointers and pointer arithmetic in D are discussed in Chapter 5, Pointers and Arrays.
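The difference between the prefix and postfix forms, and the scaling applied to pointers, can be observed in C, where these operators have the same meaning. In this sketch the first two functions pack y and x into a single number (y * 100 + x) purely so that both values are visible in one result; all names are illustrative.

```c
/* Prefix: the variable changes first, so the expression yields the new
 * value. Returns y * 100 + x. */
int prefix_value(int x)
{
    int y = ++x;

    return y * 100 + x;
}

/* Postfix: the expression yields the old value, then the variable
 * changes. Returns y * 100 + x. */
int postfix_value(int x)
{
    int y = x--;

    return y * 100 + x;
}

/* Incrementing a pointer advances its address by the size of the
 * referenced type; this returns that distance in bytes. */
long pointer_step(void)
{
    int pair[2] = { 0, 0 };
    int *p = pair;
    char *start = (char *)p;

    p++;
    return (long)((char *)p - start);
}
```

With x starting at 5, prefix_value returns 606 (y and x are both 6) and postfix_value returns 504 (y kept the old value 5, x became 4). pointer_step returns sizeof(int).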

Conditional Expressions

Although D does not provide support for if-then-else constructs, it does provide support for simple conditional expressions using the ? and : operators. These operators enable a triplet of expressions to be associated where the first expression is used to conditionally evaluate one of the other two. For example, the following D statement could be used to set a variable x to one of two strings depending on the value of i:

x = i == 0 ? "zero" : "non-zero";

In this example, the expression i == 0 is first evaluated to determine whether it is true or false. If the first expression is true, the second expression is evaluated and the ?: expression returns its value. If the first expression is false, the third expression is evaluated and the ?: expression returns its value.

As with any D operator, you can use multiple ?: operators in a single expression to create more complex expressions. For example, the following expression would take a char variable c containing one of the characters 0-9, a-z, or A-Z and return the value of this character when interpreted as a digit in a hexadecimal (base 16) integer:
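One way to write such an expression is sketched below as a C function, which works because ?: and the relational operators behave the same way in ANSI-C. The name hexval is chosen for illustration, and the function assumes, as stated above, that c is one of 0-9, a-z, or A-Z.

```c
/* Value of c when interpreted as a hexadecimal digit. Two nested ?:
 * operators select among the digit, lower case, and upper case ranges. */
int hexval(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' :
           (c >= 'a' && c <= 'z') ? c + 10 - 'a' : c + 10 - 'A';
}
```

Because ?: associates right to left, the second conditional is the "false" branch of the first, so no extra parentheses are needed around it.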

The first expression used with ?: must be a pointer or integer in order to be evaluated for its truth value. The second and third expressions may be of any compatible types. You may not construct a conditional expression where, for example, one path returns a string and another path returns an integer. The second and third expressions also may not invoke a tracing function such as trace() or printf(). If you want to conditionally trace data, use a predicate instead, as discussed in Chapter 1, Introduction.

Type Conversions

When expressions are constructed using operands of different but compatible types, type conversions are performed in order to determine the type of the resulting expression. The D rules for type conversions are the same as the arithmetic conversion rules for integers in ANSI-C. These rules are sometimes referred to as the usual arithmetic conversions.

A simple way to describe the conversion rules is as follows: each integer type is ranked in the order char, short, int, long, long long, with each unsigned type ranked above its signed equivalent but below the next integer type. When you construct an expression using two integer operands such as x + y and the operands are of different integer types, the operand type with the highest rank is used as the result type.

If a conversion is required, the operand of lower rank is first promoted to the type of higher rank. Promotion does not actually change the value of the operand: it simply extends the value to a larger container according to its sign. If an unsigned operand is promoted, the unused high-order bits of the resulting integer are filled with zeroes. If a signed operand is promoted, the unused high-order bits are filled by performing sign extension. If a signed type is converted to an unsigned type, the signed type is first sign-extended and then assigned the new unsigned type determined by the conversion.
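The three promotion cases can be checked with a C sketch, since D follows the ANSI-C integer conversion rules. The function names are illustrative; each one simply lets the compiler widen its argument to a 64-bit result.

```c
#include <stdint.h>

/* Signed promotion: the high-order bits are filled by sign extension,
 * so the numeric value is unchanged. */
int64_t widen_signed(int8_t v) { return v; }

/* Unsigned promotion: the high-order bits are filled with zeroes. */
uint64_t widen_unsigned(uint8_t v) { return v; }

/* Signed to unsigned: the value is sign-extended first and the resulting
 * bit pattern is then treated as unsigned. */
uint64_t signed_to_unsigned(int8_t v) { return (uint64_t)v; }
```

A signed -1 stays -1 when widened, an unsigned 255 stays 255, and a signed -1 converted to a 64-bit unsigned type becomes the all-ones value UINT64_MAX.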

Integers and other types can also be explicitly cast from one type to another. In D, pointers and integers can be cast to any integer or pointer types, but not to other types. Rules for casting and promoting strings and character arrays are discussed in Chapter 6, Strings. An integer or pointer cast is formed using an expression such as:

y = (int)x;

where the destination type is enclosed in parentheses and used to prefix the source expression. Integers are cast to types of higher rank by performing promotion. Integers are cast to types of lower rank by zeroing the excess high-order bits of the integer.
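Casting to a type of lower rank keeps only the low-order bits, which a small C sketch can confirm for unsigned values. The helper names are illustrative.

```c
#include <stdint.h>

/* Keep only the low 8 bits of a 32-bit value. */
uint8_t narrow_to_8(uint32_t v)  { return (uint8_t)v; }

/* Keep only the low 16 bits of a 32-bit value. */
uint16_t narrow_to_16(uint32_t v) { return (uint16_t)v; }
```

For example, 0x1234 narrowed to 8 bits is 0x34, and 0x12345678 narrowed to 16 bits is 0x5678.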

Because D does not permit floating-point arithmetic, no floating-point operand conversion or casting is permitted and no rules for implicit floating-point conversion are defined.

Precedence

The D rules for operator precedence and associativity are described in the following table. These rules are somewhat complex, but are necessary to provide precise compatibility with the ANSI-C operator precedence rules. The table entries are in order from highest precedence to lowest precedence.

Table 2–11 D Operator Precedence and Associativity

Operators                                                  Associativity
() [] -> .                                                 left to right
! ~ ++ -- + - * & (type) sizeof stringof offsetof xlate    right to left
* / %                                                      left to right
+ -                                                        left to right
<< >>                                                      left to right
< <= > >=                                                  left to right
== !=                                                      left to right
&                                                          left to right
^                                                          left to right
|                                                          left to right
&&                                                         left to right
^^                                                         left to right
||                                                         left to right
?:                                                         right to left
= += -= *= /= %= &= ^= |= <<= >>=                          right to left
,                                                          left to right
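Associativity decides how operators of equal precedence group. The C sketch below shows one left-to-right case and two right-to-left cases using operators that D shares with ANSI-C; the function names are illustrative.

```c
/* Subtraction groups left to right: (a - b) - c. */
int left_assoc(int a, int b, int c) { return a - b - c; }

/* ?: groups right to left: a ? 1 : (b ? 2 : 3). */
int nested_conditional(int a, int b) { return a ? 1 : b ? 2 : 3; }

/* Assignment groups right to left: a = (b = v). Returns a + b. */
int chained_assign(int v)
{
    int a, b;

    a = b = v;
    return a + b;
}
```

left_assoc(10, 4, 3) is 3 rather than 9, nested_conditional(0, 1) is 2, and chained_assign(5) is 10 because both variables receive the value 5.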

There are several operators in the table that we have not yet discussed; these will be covered in subsequent chapters.

The comma (,) operator listed in the table is for compatibility with the ANSI-C comma operator, which can be used to evaluate a set of expressions in left-to-right order and return the value of the rightmost expression. This operator is provided strictly for compatibility with C and should generally not be used.
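For completeness, the behavior of the comma operator can be seen in this small C sketch; comma_demo is an illustrative name, and the example is not a recommendation to use the operator.

```c
/* The left expression runs first (x becomes x + 1), then the right
 * expression is evaluated and its value becomes the result. */
int comma_demo(int x)
{
    int y;

    y = (x += 1, x * 2);
    return y;
}
```

Called with 3, the function increments x to 4 and returns 8. The parentheses are required because the comma operator has lower precedence than assignment.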

The () entry in the table of operator precedence represents a function call; examples of calls to functions such as printf() and trace() are presented in Chapter 1, Introduction. A comma is also used in D to list arguments to functions and to form lists of associative array keys. This comma is not the same as the comma operator and does not guarantee left-to-right evaluation. The D compiler provides no guarantee as to the order of evaluation of arguments to a function or keys to an associative array. You should be careful of using expressions with interacting side-effects, such as the pair of expressions i and i++, in these contexts.

The [] entry in the table of operator precedence represents an array or associative array reference. Examples of associative arrays are presented in Chapter 1, Introduction. A special kind of associative array called an aggregation is described in Chapter 9, Aggregations. The [] operator can also be used to index into fixed-size C arrays, as described in Chapter 5, Pointers and Arrays.