Computer Programming and Precise Terminology

By Jack Purdum, July 10, 2008

Teaching a new programming language is difficult enough without confusing the very concepts we are trying to teach.

The Bucket Analogy

Because students often have some initial difficulty with lvalues and rvalues, we developed the "Bucket Analogy" to make the two concepts easier to grasp. We use the Bucket Analogy immediately after the discussion of lvalues and rvalues presented above. Simply stated, an lvalue is the memory address where you can find a variable's bucket, in much the same way that a street address tells you where to find a specific house. The rvalue is what you see when you look inside the bucket. Finally, the variable's data type (see column 2 in the symbol table) determines the size of the bucket. (While most buckets have their size expressed in gallons, our buckets' sizes are expressed in bytes.)
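The idea that the data type fixes the bucket size can be checked directly. The following is a minimal Java sketch, assuming Java 8 or later; the class name BucketSizes and its methods are hypothetical and not part of the original article. Java does not expose a variable's memory address, so only the bucket size (not the lvalue) can be shown.

```java
// Hypothetical demo class; the names here are illustrative only.
final class BucketSizes {
    // Size of an int bucket in bytes, as defined by the Java language.
    static int intBucketBytes() {
        return Integer.BYTES;
    }

    // Size of a double bucket in bytes, as defined by the Java language.
    static int doubleBucketBytes() {
        return Double.BYTES;
    }

    public static void main(String[] args) {
        System.out.println("int bucket:    " + intBucketBytes() + " bytes");
        System.out.println("double bucket: " + doubleBucketBytes() + " bytes");
    }
}
```

In Java these sizes are the same on every platform, which is why the sketch can report them as constants.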

Using the Bucket Analogy to Explain Casts

All kinds of teaching concepts can benefit from the Bucket Analogy. For example, in C# and Java, consider the statements:

int val;
double x;
// some code...
val = x; // Statement 20

Technically, we could tell the students: "The compiler does not like the assignment of x into val in Statement 20 because data narrowing reflects an impedance mismatch between the two variables' data types resulting in a possible loss of information." Or, we can use the Bucket Analogy and the symbol table information and simply say: "The compiler's complaining because you are trying to pour 8 bytes of information into a 4-byte bucket." That is, the 8 bytes of double data stored in x's bucket won't fit into val's 4-byte int bucket and information might be spilled and lost in the process.

We then ask them how to solve the bucket overflow problem. Perhaps they come up with:

val = (int) x;

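In terms of the Bucket Analogy, you can explain the cast that data narrowing requires as the compiler's attempt to resize the contents of the larger bucket to match the destination bucket, so that nothing "spills" unexpectedly during the assignment. The cast does not preserve every bit of the original value; it tells the compiler that you accept whatever must be trimmed to make the value fit.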
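To make the narrowing cast concrete, here is a small Java sketch. The class name NarrowingDemo and the method pour are hypothetical names chosen for illustration. It shows what actually reaches the 4-byte bucket when a double is cast to int in Java: the fractional part is discarded, and a value too large for an int is clamped to the largest int.

```java
// Hypothetical demo class; "pour" is an illustrative name, not from the article.
final class NarrowingDemo {
    // Explicit narrowing: pour an 8-byte double into a 4-byte int bucket.
    static int pour(double x) {
        return (int) x; // the cast is required; without it the compiler rejects the assignment
    }

    public static void main(String[] args) {
        double x = 3.99;
        int val = pour(x);
        System.out.println(val);          // prints 3: the fraction is discarded

        System.out.println(pour(1.0e10)); // prints 2147483647: too large, clamped to Integer.MAX_VALUE
    }
}
```

This is a useful follow-up question for students: the cast satisfies the compiler, but is the value left in the bucket the one you wanted?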

When you attempt to explain data widening using the statement:

x = val;

ask the students why the compiler does not complain even though the two variables are not of the same data type. The answer is simple: "Data widening is not a problem because you are pouring 4 bytes of information into an 8-byte bucket...no information is spilled or lost in the process." (We also point out, however, that they should still use a cast to document the silent conversion being performed by the compiler.)
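The widening case can be sketched the same way in Java. The class name WideningDemo and its method names are hypothetical. The first method uses the explicit cast recommended above as documentation; the second relies on the compiler's silent conversion and checks that the original int can be recovered unchanged.

```java
// Hypothetical demo class; the names are illustrative only.
final class WideningDemo {
    // Explicit cast documents the conversion the compiler would otherwise do silently.
    static double pour(int val) {
        return (double) val;
    }

    // Implicit widening, then a cast back, to confirm nothing was lost.
    static boolean survivesRoundTrip(int val) {
        double x = val;        // compiles without a cast: 4 bytes into an 8-byte bucket
        return (int) x == val; // every int value is exactly representable as a double
    }

    public static void main(String[] args) {
        System.out.println(pour(42));                            // prints 42.0
        System.out.println(survivesRoundTrip(Integer.MAX_VALUE)); // prints true
    }
}
```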

Once the students have grasped the basic concepts, you can go back and fill in the explanation using more technical terms if you think it is necessary.

Explaining Value Types versus Reference Types

The concepts of lvalues and rvalues, in conjunction with the Bucket Analogy, also make it easier to explain the difference between value types and reference types in languages that support objects. Consider the following statements (for C# or Java; in C++, the second statement would create the object itself rather than a reference to it):

int i;
clsPerson myFriend;

We might reflect these two statements in a symbol table like that in Table 6.

Table 6: A Hypothetical Symbol Table, value and reference types.

Using the symbol table information from Table 6, we can draw the associated lvalue-rvalue diagrams as in Figure 4.

Figure 4: Lvalue and rvalue Diagrams for i and myFriend

In this example, we assume that the two variables are instance variables being defined for use in a program. Most OOP languages initialize such variables automatically: value types are initialized to 0 and reference types are initialized to null, as in Figure 4.
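The default initialization of instance variables can be observed with a short Java sketch. DefaultsDemo is a hypothetical class name, and the nested clsPerson class is an empty stand-in, since the article does not show the real class.

```java
// Hypothetical demo class; clsPerson here is an empty stand-in for the article's class.
final class DefaultsDemo {
    static final class clsPerson {
    }

    int i;              // value type: the rvalue starts as 0
    clsPerson myFriend; // reference type: the rvalue starts as null

    public static void main(String[] args) {
        DefaultsDemo demo = new DefaultsDemo();
        System.out.println(demo.i);        // prints 0
        System.out.println(demo.myFriend); // prints null
    }
}
```

Note that this applies to instance variables. In Java, a local variable gets no default value and must be assigned before it is read.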

The stumbling block for many students is the distinction between a reference variable and an instance object of a class. The students probably understand the definition of variable i using the narrative associated with Figure 2. Explaining the statement:

clsPerson myFriend;

however, often takes a little more effort. From the symbol table in Table 6, we can see that we have defined a reference variable named myFriend. At this point, you would give the students the following rule:

A reference variable can only have an rvalue with one of two possible values: 1) null, or 2) a memory address.

If we look at Figure 4, we can see that myFriend does have an lvalue of 750,000, but it has an rvalue of null. This means that we have defined a reference variable named myFriend, but we have only declared a clsPerson object. The interpretation is that myFriend does exist (i.e., it is defined), but no object yet exists because the rvalue of myFriend is null (i.e., the object is declared, but not defined). At this point, we simply have information that describes an object (i.e., it can "become" a clsPerson object), but that object does not yet exist in memory. Again, thus far, we have defined a reference variable named myFriend, which serves as a declaration for a clsPerson object. (This is the point where programmers who treat the terms definition and declaration as synonyms get into trouble when trying to explain object instantiation.)
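Students can see the consequence of a null rvalue for themselves. The following Java sketch is an illustration only: NullReferenceDemo, readingNameFails, and the name field on clsPerson are all hypothetical. It tries to use the object through the reference before any object exists and reports whether the runtime refused.

```java
// Hypothetical demo class; the "name" field on clsPerson is assumed for illustration.
final class NullReferenceDemo {
    static final class clsPerson {
        String name = "unnamed";
    }

    clsPerson myFriend; // reference variable is defined, but its rvalue is null

    // Returns true if reading through the reference fails because no object exists yet.
    boolean readingNameFails() {
        try {
            String name = myFriend.name; // follows the rvalue; fails when it is null
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new NullReferenceDemo().readingNameFails()); // prints true
    }
}
```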

To define a clsPerson object that we can actually use in our code, we need to "finish" the data definition for an instance of a clsPerson object. We do this with the statement:

myFriend = new clsPerson();

After the compiler checks the syntax and finds it acceptable, it generates code that, when the program runs, issues a request to the memory manager for enough memory to hold a clsPerson object. An object might take only a few bytes of memory or it might require several kilobytes, depending upon the object's complexity. Whatever the size of the request, the memory manager returns the memory address where the bytes for that object are located. With that memory request fulfilled, the generated code calls the class constructor, which initializes the object according to the constructor's code. The returned memory address is then stored as the rvalue of myFriend. Because the rvalue of myFriend now contains a valid memory address, variable myFriend references an object of clsPerson that we can use in our program.

Just to make things more concrete, assume a clsPerson object takes 2,500 bytes of storage and the memory manager found that many free bytes of memory at memory address 780,000. Figure 4 now becomes Figure 5.

Figure 5: Lvalue and rvalue Diagrams for i and myFriend

Note how the rvalue of myFriend has changed from null to the memory address where the 2,500 bytes of memory associated with the clsPerson object are located. In other words, we have now defined a clsPerson object that we can access through the myFriend reference variable. Also notice that when a reference variable has an rvalue that is null, it does not reference a "usable" object. That is, a null rvalue for a reference variable means we have declared an object (i.e., we know something about it), but the object is not yet defined (i.e., we cannot do anything with it because the object is not yet instantiated with a known memory address). Once the reference variable's null rvalue is replaced with a valid memory address, we know we have defined a class object that we can use via the reference variable named myFriend.
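The transition from Figure 4 to Figure 5 can be sketched in Java as follows. ReferenceDemo, hasObject, and instantiate are hypothetical names, and clsPerson is again an empty stand-in. Java does not let a program print the actual memory address, so the sketch only reports whether the rvalue is still null.

```java
// Hypothetical demo class; clsPerson is an empty stand-in for the article's class.
final class ReferenceDemo {
    static final class clsPerson {
    }

    clsPerson myFriend; // declared object: the rvalue is null until "new" runs

    // True once the reference variable's rvalue holds the address of a real object.
    boolean hasObject() {
        return myFriend != null;
    }

    // "Finishes" the definition: allocates the object and stores its address in myFriend.
    void instantiate() {
        myFriend = new clsPerson();
    }

    public static void main(String[] args) {
        ReferenceDemo demo = new ReferenceDemo();
        System.out.println(demo.hasObject()); // prints false (Figure 4)
        demo.instantiate();
        System.out.println(demo.hasObject()); // prints true (Figure 5)
    }
}
```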
