Digital Mars C++ is a comprehensive development system for the
Intel 8086 family of processors. This chapter explains how to choose
an appropriate memory model, so that you can create everything
from small command line utilities to the largest and most complex
applications.

Overview of Memory Models

Choosing a memory model means balancing three goals:
meeting minimum system requirements, maximizing code
efficiency, and gaining access to every available memory location. If
you don't specify any particular memory model, the Digital Mars
compilers use the Win32 model. To compile for DOS or Win16, a memory
model must be selected.

To use the Small memory model, you don't need to know anything
about compiler switches or configuring the IDDE. And when you
use a debugger, small model addresses are easy to interpret.
However, if your program requires more than 64KB of memory to
store its code or data, you must choose a different memory model.

If your program's total size is under 640KB, you should choose one
of the memory models in Table 7-1 below. These are the real mode
memory models. Since all the processors used in IBM PCs and
compatibles can run in real mode, programs compiled with these
models can run on all PCs.

For faster and more efficient code, use the memory model that gives
you the best fit for your program. For example, using the Large
memory model when another model would suffice makes your
program slower than it has to be because more data is referenced
using both a segment and an offset. For information on how data is
stored in the various memory models, see the section "How Data Is
Stored" later in this chapter. For information on making your
program as efficient as possible, see the section "Fine-Tuning with
Mixed Model Programming" later in this chapter.

If your program is over 640KB, that restricts what machines
will be able to run it. Ultimately, for large DOS programs, you must
choose between performance and portability to older systems.

Running on 8086/8088 machines and later

Under DOS on a machine with an 8086 or 8088 CPU, memory beyond
640KB is not accessible. If your program has a large amount of data,
consider using handle pointers, described in "Using handle pointers"
later in this chapter.

Running under DOS on 80386 machines and later

If your DOS program will run only on 80386 and later processors, it can
operate in 32-bit protected mode, which lets you access up to 4GB
of RAM.

Note:
4GB is a theoretical maximum. Under DOS or XMS,
the maximum extended memory is limited by BIOS
function calls to just under 64MB. The DOSX 32-bit
DOS extender can handle up to 3GB, because it
allocates 3/4 of the available extended memory.
There is no system that supports 4GB of memory at
this time.

To run in 32-bit protected mode, you need a 32-bit
DOS extender. The DOSX memory model (-mx) is compatible with
the DOSX 386 DOS extender.
The Phar Lap memory model (-mp) is compatible with the Phar Lap
32-bit DOS extender, available from Phar Lap.

Memory Models for Windows 3.x Programs

Since Windows is itself a form of DOS extender, the DOSX or Phar
Lap memory models cannot be used for Windows programs.
You can compile a Windows 3 application with the Small, Medium,
Compact, or Large memory models. Digital Mars recommends
compiling Windows applications with the Large model because it
minimizes the problems associated with mixed model programming.
Windows 3.0 and later eliminate any advantage to using the Medium
memory model.

Compile all the files in a Windows application with the same
(preferably Large) memory model if possible, or explicitly declare a
type for each pointer in a function prototype. If you are mixing near
and far data references, make sure that all declarations match their
corresponding definitions, or hard-to-find bugs can result. For more
information see "Fine-Tuning with Mixed Model Programming"
below.

Note:
Since Digital Mars's Large memory model does not
place data into far data segments by default, Large
model programs compiled with Digital Mars C++ can
be multiple-instance applications.

Memory Models for Windows 95 and NT

To create a program for a 32-bit operating system like Windows NT,
you need a memory model that can reference a flat 32-bit address
space (where CS, DS, SS, and ES all map onto a single memory area).
Digital Mars C++ supports a 32-bit flat address space with the NT (-mn)
memory model. For more information, see
Win32 Programming Guidelines.

Note:
The compiler ignores the keywords __far,
__huge, __interrupt, __loadds, and
__handle when compiling with the -mn memory
model. You can tell the compiler to ignore these
keywords for any compilation with the -NF
compiler option.

How Data Is Stored

How your program stores data depends on whether it is a 16- or 32-bit
program.

16-bit programs

Real mode programs can run on the 8086 and 8088 processors that
were in the original IBM PCs and compatibles. In real mode DOS
programs, code and data are stored in 64KB segments. DOS limits
programs to 640KB of memory, including both code and data.

For the most part, programs written for the 8086 architecture use two
types of references: near and far.

Near code and data references

A near reference refers to a function or data object (or a pointer to a
function or data object) that is within the current segment. It is 16
bits long and contains an offset into the current data segment (or the
stack segment) if it's a data pointer, or into the current code segment
if it's a function pointer.

Far code and data references

A far reference refers to a function or data object (or a pointer to a
function or data object) that is in a different segment than the current
one. It is 32 bits long and contains a 16-bit quantity called the
segment, which identifies the memory segment where the code or
data is stored, and a 16-bit quantity called the offset, which contains
the location of the code or data in that segment. (The 8088 and 8086
have a 20-bit address bus. Therefore, they actually use a 20-bit
segment address, which is obtained by shifting the 16-bit segment
value four bits to the left. This 20-bit value is added to the
16-bit offset to form the actual memory address.)
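
The address arithmetic described above can be sketched in portable C. The function name linear_address is illustrative and is not part of any Digital Mars header; the sketch assumes real mode, where the segment value is shifted left by four bits and added to the offset.

```c
#include <stdint.h>

/* Real-mode address translation: (segment << 4) + offset.
   linear_address is a hypothetical helper name used only for illustration. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, segment 0xB800 with offset 0x0000 resolves to 0xB8000. Different segment and offset pairs can name the same byte: 0x1234:0x0010 and 0x1235:0x0000 both resolve to 0x12350.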

Choosing a memory model changes how the compiler stores
addresses of functions and data. If the model allows no more than
one segment (64KB) of code or data, the compiler uses near pointers
to reference it. If the model allows more than one segment of code
or data, the compiler uses far pointers to reference it.
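
As a rough illustration of this rule for the five standard 16-bit models, the sketch below maps each model to its default pointer widths. The enum and function names are hypothetical. The mapping follows the conventional definitions that this chapter also relies on: Medium and Large use far function addresses, while Compact and Large are the large data models.

```c
/* Hypothetical names; default pointer widths for the standard 16-bit models. */
enum memory_model {
    MODEL_TINY, MODEL_SMALL, MODEL_MEDIUM, MODEL_COMPACT, MODEL_LARGE
};

/* Code pointers are far (32 bits) when the model allows more than
   one code segment, near (16 bits) otherwise. */
int default_code_pointer_bits(enum memory_model m)
{
    return (m == MODEL_MEDIUM || m == MODEL_LARGE) ? 32 : 16;
}

/* Data pointers are far (32 bits) when the model allows more than
   one data segment, near (16 bits) otherwise. */
int default_data_pointer_bits(enum memory_model m)
{
    return (m == MODEL_COMPACT || m == MODEL_LARGE) ? 32 : 16;
}
```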

Accessing code or data with a near reference is much quicker than
accessing it with a far reference. When you use a far reference, your
program must first find the segment and then find the code or data
within that segment. When you use a near reference, your program
only needs to find the code or data. For a faster program, choose the
memory model that lets you make as many near references as
possible.

Memory models and segmentation

Choosing a memory model does not change how the compiler
segments your code. You choose the segment in which to store code
and data with the __far and __huge keywords, as described in
"Fine-Tuning with Mixed Model Programming" later in this chapter.
The compiler and linker automatically segment your code. You can
fine-tune how the compiler and linker segment your code with the
techniques described in Compiling Code.

32-bit programs

In 32-bit protected mode programs (those compiled with the DOSX,
Phar Lap, or NT memory model), near pointers are 32 bits long and
far pointers are 48 bits long. With these models, your programs can
access up to 4GB of RAM, all through near references.

In 32-bit applications, far pointers are used only for special purposes
like accessing video memory. Therefore, you should not typically
use pointer modifiers in 32-bit programs.

Sizes of data types and pointer types

The table below lists the sizes of the base data types and pointer
types in all Digital Mars C++ memory models.

Table 7-2 Data and pointer types and sizes

Data/Pointer Type   Size in 16-bit compilations         Size in 32-bit compilations
                    (T, S, M, C, L, and V models)       (X, P, F, and N models)
------------------  ----------------------------------  ---------------------------------
char                signed 8 bits                       signed 8 bits
signed char         signed 8 bits                       signed 8 bits
unsigned char       unsigned 8 bits                     unsigned 8 bits
short               signed 16 bits                      signed 16 bits
unsigned short      unsigned 16 bits                    unsigned 16 bits
int                 signed 16 bits                      signed 32 bits
unsigned            unsigned 16 bits                    unsigned 32 bits
long                signed 32 bits                      signed 32 bits
unsigned long       unsigned 32 bits                    unsigned 32 bits
float               32 bits floating                    32 bits floating
double              64 bits floating                    64 bits floating
long double         64 bits floating                    64 bits floating, 80 bits N model
__near pointer      16-bit segment offset               32-bit segment offset
__far pointer       16-bit segment and 16-bit offset    16-bit segment and 32-bit offset
__huge pointer      16-bit segment and 16-bit offset    16-bit segment and 32-bit offset
__ss pointer        16-bit segment offset               32-bit segment offset
__cs pointer        16-bit segment offset               32-bit segment offset
__handle pointer    16-bit segment and 16-bit offset    16-bit segment and 32-bit offset
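
Because int changes size between the two columns while long does not, code that must behave the same in 16-bit and 32-bit compilations can check the width of int or use long wherever 32 bits are required. A minimal sketch using only the standard header limits.h follows; the function names are illustrative.

```c
#include <limits.h>

/* Width of int in bits for the current compilation:
   16 in the 16-bit models, 32 in the 32-bit models listed above. */
int int_width_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Adds two counters using long, which the table lists as 32 bits in
   every memory model, so sums above 32767 do not overflow a 16-bit int. */
long add_counts(long a, long b)
{
    return a + b;
}
```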

Fine-Tuning with Mixed Model Programming

Digital Mars C++ lets you mix memory models within a program by
using the __near, __far, __cs, __ss, and __huge keywords.
These keywords permit you to fine-tune how your program uses
memory.

Note:
The __near, __far, and __huge keywords are
not part of the ANSI C language and are used only in
operating systems with segmented memory. Code
that uses them is not portable. In addition, they are
of limited usefulness when creating 32-bit
applications.

Creating large data structures with far data in 16-bit programs

In all the 16-bit memory models, the compiler puts all static and
global variables into a single data segment (called DGROUP) that can
only contain 64KB. With far data, you can put a particular data
structure into a data segment of its own. However, that data structure
cannot be larger than 64KB.

To declare a data structure to be far, put the __far keyword
immediately before the identifier, like this:

int __far array[10000];
struct ABC __far table[600] = { .... };

Access far data with array syntax:

array[301] = 32;
table[258] = an_abc_struct;

The compiler creates a segment name for the data structure from the
source file name and the variable name.

By default, the compiler does not use far data in the Compact and
Large memory models. When you use the __far keyword with a data
declaration, the compiler starts a new data segment and puts the rest
of the data in the file into that segment.
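
Whether a structure fits within the 64KB limit on a single far data segment is a matter of simple arithmetic. The sketch below is a hypothetical helper, not a Digital Mars function, and it treats 64KB as 65,536 bytes.

```c
/* Returns 1 if count elements of elem_size bytes each fit in one
   64KB segment, 0 otherwise. Hypothetical helper for illustration. */
int fits_in_far_segment(unsigned long count, unsigned long elem_size)
{
    const unsigned long segment_bytes = 65536UL;

    if (elem_size == 0)
        return 1;
    /* Divide instead of multiplying so the check cannot overflow. */
    return count <= segment_bytes / elem_size;
}
```

With a 2-byte int, as in the 16-bit models, int __far array[10000] occupies 20,000 bytes and fits, while a 40,000-element array would need 80,000 bytes and would not.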

Portably declaring large arrays in 16-bit compilations

It is frequently necessary to declare arrays larger than 64KB in size.
For instance:

To portably declare arrays greater than 64KB in 16-bit compilations,
you can construct an array of pointers to arrays, where each unit is
less than 64KB in size. Using this technique, the above arrays would
be declared as:

Code that declares large arrays using pointers must be compiled in
one of the large data models (Compact or Large). Storage for the
arrays that the pointers refer to cannot be allocated statically; you
need to call calloc() to allocate each one, initialized to all zeros:
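
A minimal sketch of the whole technique follows, under stated assumptions. The element counts, the chunk length of 1000, and the names CHUNK, NCHUNKS, ROWS, COLS, and alloc_arrays are chosen for illustration; only the names array and values and the array(i) access form come from the text.

```c
#include <stdlib.h>

#define CHUNK   1000    /* elements per chunk (assumed), well under 64KB  */
#define NCHUNKS 50      /* 50 chunks of 1000 ints = 50000 ints (assumed)  */
#define ROWS    100     /* values holds ROWS x COLS doubles (assumed)     */
#define COLS    100

int    *array[NCHUNKS]; /* array of pointers to int chunks */
double *values[ROWS];   /* one pointer per row of doubles  */

/* array(i) selects the chunk, then the element inside it. The macro does
   not clash with the object named array, because a function-like macro
   is expanded only when its name is followed by '('. */
#define array(i) (array[(i) / CHUNK][(i) % CHUNK])

/* Allocates and zero-fills every chunk and row.
   Returns 0 on success, -1 if any allocation fails. */
int alloc_arrays(void)
{
    int k;

    for (k = 0; k < NCHUNKS; k++) {
        array[k] = (int *)calloc(CHUNK, sizeof(int));
        if (array[k] == NULL)
            return -1;
    }
    for (k = 0; k < ROWS; k++) {
        values[k] = (double *)calloc(COLS, sizeof(double));
        if (values[k] == NULL)
            return -1;
    }
    return 0;
}
```

Each chunk stays far below 64KB even with 16-bit pointers, while the total storage (50,000 ints and 10,000 doubles) exceeds what a single segment could hold in a 16-bit compilation.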

To access an element of array[], use this syntax instead of
array[i]:

long i;
array(i) = array(i + 10);

Note that the macro can be used either as an lvalue or as an rvalue.
Similarly, for values:

int i, j;
values[i][j] = values[i][j] + 6.7;

Most of the time you won't need to deallocate the memory used for
the arrays, if they are used for the duration of program execution;
the operating system will deallocate the storage when the program
terminates.

The methods described above are not only portable to ANSI C and to
32 bits, they can also be faster than using __huge.

Declaring class objects as far data

In the Small, Tiny, and Medium memory models, you cannot declare
as far class objects that you create with new. In this example,
the first declaration causes an error, but the second does not:

AClassA __far *a1 = new(AClassA); // ERROR
AClassA __far a2; // OK

In the other 16-bit memory models, you can declare any class object
as far data. In the 32-bit models, you cannot declare class objects as
far data.

Using __near and __far functions

When you compile a program with the Medium or Large memory
models, by default the compiler uses far pointers for function
addresses. However, if you know that a function is used only by
other functions that are in the same code segment, you can declare it
__near, so that the compiler will access it with near pointers.

The __near keyword is especially useful with static functions, that
is, functions that are used only within the file where they're defined.
Since the compiler by default puts all of a file's functions into the
same code segment, you can declare any static function as near.
However, you should not declare global functions as __near.

In the example below, walktree() is a recursive static near
function. The program saves a significant amount of time by using a
near instead of a far address: a near call pushes only a 16-bit return
address on the stack for each call, while a far call pushes a 32-bit
segment and offset.
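
A minimal sketch of such a function follows. The node layout, the sum_tree() wrapper, and the NEARFN macro are assumptions made for illustration. NEARFN expands to __near only when Digital Mars compiles with a 16-bit int, so the same source also builds with compilers that lack the keyword.

```c
#include <limits.h>
#include <stddef.h>

/* NEARFN is a hypothetical macro: __near in 16-bit Digital Mars
   compilations, nothing everywhere else. */
#if defined(__DMC__) && (UINT_MAX == 0xFFFFu)
#define NEARFN __near
#else
#define NEARFN
#endif

/* Assumed binary tree node layout. */
struct node {
    struct node *left;
    struct node *right;
    int value;
};

/* Recursive, file-local, and near: every recursive call is a near call. */
static int NEARFN walktree(const struct node *n)
{
    if (n == NULL)
        return 0;
    return walktree(n->left) + n->value + walktree(n->right);
}

/* Global entry point; it keeps the memory model's default call type. */
int sum_tree(const struct node *root)
{
    return walktree(root);
}
```

Only the static function is marked near; the global wrapper is left alone, in line with the advice above about global functions.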

Note:
You cannot declare a static function whose address
you take as near and then attempt to call it as a far
function.

You rarely need to declare functions with the __far keyword.
Programs that use the Medium or Large memory models use far
pointers by default and programs that use the Small and Compact
memory models don't contain multiple code segments. The only
exception is a Small, Tiny, or Compact program that runs under
Windows and uses a dynamic link library (or DLL). The functions in
the DLL are in a separate code segment and must be declared far.

Using huge pointers

A huge pointer is similar to a far pointer. It is 32 bits long and can
point to any location in memory. You declare data to be huge by
substituting the __huge keyword for __far.

Huge pointers offer three advantages over far pointers:

A huge pointer's segment value can change as the offset
value "wraps around," unlike a far pointer. A huge pointer
therefore can point to a data object greater than 64KB in
size.

When you perform logical comparisons on huge pointers,
both the segment and offset are compared. For far
pointers, only the offsets are compared.

You can change a huge pointer's segment value with
pointer arithmetic or array indexing.
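
The differences listed above can be modeled in portable C by treating a pointer as a plain segment and offset pair. This is a sketch of the behavior as described in this section, not the compiler's implementation. The type and function names are hypothetical, and the segment step of 0x1000 per 64KB assumes real mode.

```c
#include <stdint.h>

/* A segment:offset pair modeled as plain integers (hypothetical type). */
typedef struct { uint16_t seg; uint16_t off; } segoff;

/* Real-mode linear address of a segment:offset pair. */
uint32_t linear(segoff p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* Far-style addition: only the offset changes, so it wraps within 64KB. */
segoff far_add(segoff p, uint16_t n)
{
    p.off = (uint16_t)(p.off + n);
    return p;
}

/* Huge-style addition: a carry out of the offset advances the segment.
   n is assumed to be smaller than 1MB. */
segoff huge_add(segoff p, uint32_t n)
{
    uint32_t sum = (uint32_t)p.off + n;
    p.seg = (uint16_t)(p.seg + (sum >> 16) * 0x1000u);
    p.off = (uint16_t)sum;
    return p;
}

/* Huge-style comparison uses both parts; far-style uses offsets only. */
int huge_equal(segoff a, segoff b) { return a.seg == b.seg && a.off == b.off; }
int far_equal(segoff a, segoff b)  { return a.off == b.off; }
```

Starting from 0x2000:0xFFF0 and adding 0x20, the far-style result is 0x2000:0x0010, which has wrapped back to the start of the same segment. The huge-style result is 0x3000:0x0010, which lies 0x20 bytes past the starting address.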

Because of the extra overhead associated with huge pointer
arithmetic, you should use huge pointers only for data objects larger
than 64KB. Do not use huge pointers in 32-bit code. The keyword
__huge is ignored in compilations using the NT (-mn) memory
model.

Note:
Digital Mars C++ does not support the Huge memory
model. That is, a pointer whose type is unspecified
cannot be made huge by default.

Using handle pointers

Handle pointers are a Digital Mars C++ extension to the far pointer
type that supports virtual memory management. You use handle
pointers to access expanded memory (EMS, also known as LIM EMS)
in 16-bit programs.

Like far pointers, handle pointers are 32 bits long in 16-bit
applications. They let a data structure use as much as 16KB of
memory, and let your program use as much as 16MB.

Note:
The keyword __handle is ignored in compilations
using the NT (-mn) memory model.

Using __ss pointers

You use __ss pointers to point to objects on the stack. In the Tiny,
Small, Compact, Medium, and Large memory models, __ss is a 16-bit
offset. In the DOSX and Phar Lap memory models, it is a 32-bit
offset.

__ss pointers work like near pointers; the difference is that their
segment address is set to the stack segment instead of the data
segment. Thus __ss pointers are relative to the SS segment register,
while near pointers are relative to the DS segment register.

If SS == DS (which is true in the Tiny, Small, and Medium memory
models), there is no difference between __ss pointers and near
pointers. In the Compact and Large models, or whenever you set
SS != DS with the w qualifier to the -m compiler option (as for DLLs or
ROM-based code), __ss can only be used to point to parameters
and automatic variables, while near pointers can only point to static
and global data.

Storing data in the code segment

Digital Mars C++ lets you store data in the code segment with the
keyword __cs. Use __cs as you do __far. For example:
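
A minimal sketch, assuming that __cs is placed immediately before the identifier in the same way as __far. The table contents, the month_length() function, and the CSDATA macro are hypothetical. CSDATA expands to __cs only in 16-bit Digital Mars compilations, so the source also builds elsewhere.

```c
#include <limits.h>

/* CSDATA is a hypothetical macro: __cs in 16-bit Digital Mars
   compilations, nothing everywhere else. */
#if defined(__DMC__) && (UINT_MAX == 0xFFFFu)
#define CSDATA __cs
#else
#define CSDATA
#endif

/* Read-only lookup table; in a 16-bit DMC build it is stored in the
   code segment. It is declared static const so nothing writes to it. */
static const unsigned char CSDATA days_in_month[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

/* month is 1..12; returns 0 for out-of-range input. Leap years ignored. */
int month_length(int month)
{
    if (month < 1 || month > 12)
        return 0;
    return days_in_month[month - 1];
}
```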

Potential problems

When using the __cs keyword, keep the following potential
problems in mind:

For any program, declare all data stored in the code
segment as const (or as static const if possible) to
explicitly tell the compiler not to write to it. If you modify
data in the code segment, the modified value is not saved
if the segment is swapped out.

Data in a code segment cannot be accessed if the CS
register does not contain the value for that code segment.
Therefore, in programs with multiple code segments, the
code and the associated data must exist in the same
segment.

In real mode Windows 3.0 programs, make sure that the
code segment is not relocated while a far pointer is
pointing to data in that code segment. If the code
segment is relocated, its contents will be corrupted.

When storing data in the code segment, you must disable
the /FARCALLTRANSLATION option to OPTLINK. This
linker optimization assumes that code segments contain
only code.

The OBJ2ASM utility, as well as many debuggers, has
problems handling arbitrary data stored in the code
segment.