Primitives and Object Wrappers

This series, The Object-Oriented Thought Process, is intended for someone just learning an object-oriented language and who wants to understand the basic concepts before jumping into the code, or someone who wants to understand the infrastructure behind an object-oriented language he or she is already using. These concepts are part of the foundation that any programmer will need to make the paradigm shift from procedural programming to object-oriented programming.

In keeping with the code examples used in the previous articles, Java will be the language used to implement the concepts in code. One reason I like to use Java is that you can download the Java compiler for personal use from the Sun Microsystems Web site at http://java.sun.com/. You can download the standard edition, J2SE 5.0, at http://java.sun.com/j2se/1.5.0/download.jsp to compile and execute these applications. I often reference the Java J2SE 5.0 API documentation, and I recommend that you explore the Java API further. Code listings are provided for all examples in this article, along with figures and output where appropriate. See the first article in this series for detailed instructions on compiling and running all the code examples.

In previous columns, we covered the concept of object wrappers. Wrapping is one of the most important and interesting object-oriented design techniques. Object-oriented design is perhaps at its most elegant when it integrates with other technologies, specifically older, more established ones. In fact, one of the hottest job opportunities in information technology is integrating legacy applications with object-oriented technologies.

For example, because much of today's business data is kept on mainframes, and will most likely stay there, the ability to use a Web front end to interface with this legacy data is very important. Wrapping mainframe technologies in objects is a powerful mechanism for combining the two methodologies. Likewise, as you saw with the client-server systems of earlier articles in this series, using object wrappers to hide the hardware implementation provides a great advantage.

Interestingly, one of the best examples of object wrappers is also one of the simplest. Because the main purpose of this series is to explore the underlying concepts of object-oriented technologies, it is quite helpful to explore the relationship between primitives and objects.

Primitives

When experienced programmers begin to learn object-oriented techniques, one of the first stumbling blocks they encounter is the concept that everything is an object (well, almost everything). In some languages, such as Java, the universe is really divided into two camps: objects and primitives. In other object-oriented language architectures, the primitives are still there, but the programmer must access them via object wrappers.
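As a minimal sketch of the two camps, the following example assumes the standard java.lang.Integer wrapper class. The class name WrapperSketch and the helper methods wrap and unwrap are illustrative names only and do not come from this series' listings.

```java
public class WrapperSketch {

    // Wraps a primitive int in an Integer object.
    static Integer wrap(int value) {
        return Integer.valueOf(value);
    }

    // Extracts the primitive int held by an Integer object.
    static int unwrap(Integer wrapped) {
        return wrapped.intValue();
    }

    public static void main(String[] args) {
        int primitive = 42;                // a primitive: just the value
        Integer object = wrap(primitive);  // an object that holds the same value

        // The object offers behavior that the bare primitive does not.
        System.out.println(object.toString());        // prints 42
        System.out.println(object.equals(wrap(42)));  // prints true
        System.out.println(unwrap(object) + 1);       // prints 43
    }
}
```

The primitive stores only its value, whereas the wrapper object carries methods such as toString() and equals() along with it.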

Sun Microsystems' Java tutorial defines a primitive as follows: "A variable of primitive type contains a single value of the appropriate size and format for its type: a number, a character, or a boolean value. For example, an integer value is 32 bits of data in a format known as two's complement, the value of a char is 16 bits of data formatted as a Unicode character, and so on."
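To make the two's complement remark concrete, here is a small sketch that prints the bit pattern of an int. The class name TwosComplementSketch and the helper toBits are hypothetical; the only library calls assumed are Integer.toBinaryString and the Integer.MAX_VALUE and Integer.MIN_VALUE constants.

```java
public class TwosComplementSketch {

    // Returns the 32-character binary form of an int, padded with leading zeros.
    static String toBits(int value) {
        String bits = Integer.toBinaryString(value);
        StringBuilder padded = new StringBuilder();
        for (int i = bits.length(); i < 32; i++) {
            padded.append('0');
        }
        return padded.append(bits).toString();
    }

    public static void main(String[] args) {
        System.out.println(toBits(1));   // only the lowest bit is set
        System.out.println(toBits(-1));  // all 32 bits are set in two's complement

        // Adding 1 to the largest int wraps around to the smallest int.
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE);  // prints true
    }
}
```

Because an int is always exactly 32 bits in Java, the wraparound from the largest to the smallest value happens at the same point on every platform.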

The primitives are called built-in types, and there are eight of them:

boolean

int

long

byte

short

float

double

char

One of the problems that C programmers had was that the byte size of the primitive types differed among the various platforms. This led to a great amount of grief, or at least a lot of work, for a programmer wanting to port a program from one machine to another.

For example, there were times when I was porting C/C++ programs to different machines that had integer sizes of 16, 32, and 64 bits. This led to a lot of conditional compiling and made the code more vulnerable to bugs. In short, it was the programmer's responsibility to adjust to the variety of the platforms. This was not the optimal solution. The Java architecture allows the programmer to treat all primitives the same on all platforms. Table 1 shows the storage requirements for the various primitives.

Type       Storage
boolean    undefined
byte       signed 8-bit integer
short      signed 16-bit integer
int        signed 32-bit integer
long       signed 64-bit integer
float      signed 32-bit floating-point
double     signed 64-bit floating-point
char       16-bit Unicode 2.0 character

Table 1
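The sizes in Table 1 can be checked from code. The sketch below assumes the SIZE constants that the wrapper classes provide as of J2SE 5.0; the class name PrimitiveSizes and the helper row are illustrative names, not part of this series' listings.

```java
public class PrimitiveSizes {

    // Formats one line of the report, for example "int: 32 bits".
    static String row(String typeName, int bits) {
        return typeName + ": " + bits + " bits";
    }

    public static void main(String[] args) {
        // Each wrapper class records the bit width of the primitive it wraps.
        System.out.println(row("byte", Byte.SIZE));
        System.out.println(row("short", Short.SIZE));
        System.out.println(row("int", Integer.SIZE));
        System.out.println(row("long", Long.SIZE));
        System.out.println(row("float", Float.SIZE));
        System.out.println(row("double", Double.SIZE));
        System.out.println(row("char", Character.SIZE));

        // The Boolean wrapper has no SIZE constant, which is consistent
        // with the "undefined" entry for boolean in Table 1.
    }
}
```

Running this on any platform should report the same widths, which is exactly the portability guarantee described above.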

These primitives fall into three basic categories: the boolean type, the common numeric types (byte, short, int, long, float, and double), and the char type. I like to group them this way because the boolean and char types require a bit more explanation.

boolean

The boolean type needs explanation because it is not really what it appears to be. Because a boolean has only two possible values, true and false, it may seem obvious that a single bit is all that is required to implement it. However, as most C programmers know, there was never an actual boolean type in the original C specification.

Even so, the concept of a boolean type was quite common in C programs. When a C programmer needed a boolean type, the programmer simply used an integer to do the job. To mimic the functionality of the boolean values true and false, a simple coding trick was performed:

#define TRUE 1
#define FALSE 0

In effect, this code defines TRUE and FALSE as the values 1 and 0, respectively. With this in place, the programmer can now use traditional Boolean logic.

if (flag == TRUE) {
    /* ...do something */
}

Although this workaround provides the functionality that we need, it has one drawback: it wastes space. This may well be a trivial problem, but the fact remains that you are using at least 8 bits where only a single bit is required (and that assumes the flag is stored in a single byte; a full integer takes more).

In fact, Java uses the same approach, although behind the scenes. There are efficiency reasons why a single bit is not used to implement a boolean: compilers, virtual machines, and operating systems are designed to address memory in bytes or words, not in individual bits. Thus, while you might think you are seeing bits, you are actually getting bytes simulating bits. Even so, as the code in Listing 1 shows, in Java you define an actual boolean type.
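Separately from Listing 1, the following minimal sketch shows how a Java boolean differs from the C idiom above. The class name BooleanSketch and the method describe are hypothetical names chosen for illustration.

```java
public class BooleanSketch {

    // Returns a label for the flag; the condition is the boolean itself,
    // so no comparison against a TRUE constant is needed.
    static String describe(boolean flag) {
        if (flag) {
            return "flag is set";
        }
        return "flag is clear";
    }

    public static void main(String[] args) {
        boolean flag = true;  // a real boolean type, not an integer in disguise
        System.out.println(describe(flag));   // prints flag is set
        System.out.println(describe(!flag));  // prints flag is clear

        // Unlike the C idiom, an int cannot stand in for a boolean:
        // int number = 1;
        // if (number) { }    // does not compile in Java
    }
}
```

However the virtual machine chooses to store the value, the language itself accepts only true and false for a boolean, so the storage detail never leaks into the code.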

char

The char type is the primitive used for character data, and it is stored as a 16-bit unsigned value. Although a char represents a single character, each char is stored as a numeric code that identifies a specific character. Java represents characters in 16-bit Unicode, whereas earlier languages used 8-bit codes such as ASCII. Unicode allows programming languages to handle a wider character set and to support many languages that use various alphabets. You can get a good idea of the various alphabet choices, as well as their definitions, by visiting: http://www.unicode.org/charts/.
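The numeric nature of a char can be seen in a short sketch. The class name CharSketch and the helpers codeOf and next are illustrative names only; the example assumes nothing beyond ordinary casts and the Character.MAX_VALUE constant.

```java
public class CharSketch {

    // Returns the numeric Unicode code stored in a char.
    static int codeOf(char c) {
        return (int) c;
    }

    // Returns the character whose code is one greater than that of c.
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        char letter = 'A';
        System.out.println(codeOf(letter));  // prints 65
        System.out.println(next(letter));    // prints B

        char omega = '\u03A9';               // Greek capital letter omega
        System.out.println(codeOf(omega));   // prints 937

        // The largest value a 16-bit unsigned char can hold.
        System.out.println((int) Character.MAX_VALUE);  // prints 65535
    }
}
```

The omega example uses a code above 255, which an 8-bit character code could not represent.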