The evolution of version resources – 16-bit version resources

I return to the extremely sporadic series on resources
with a description of the version resource.
You don't need to know how version resources are
formatted internally;
you should just use the version resource manipulation functions
GetFileVersionInfo,
VerQueryValue, and their friends.
I'm providing this information merely for its historical significance.

Version resources can be viewed as a serialized tree structure.
Each node of the tree has a name and associated data
(either binary or text),
and each node can have zero or more child nodes.
The root node is always named VS_VERSION_INFO and
is a binary node consisting of a VS_FIXEDFILEINFO
structure.
Beyond that, you can call your nodes anything you want and
give them any kind of data you want.
But if you want other people to understand your version information,
you'd be best off following the conventions I describe below.
Actually, since people seem to prefer diagrams to words,
I'll give you a diagram:

VS_VERSION_INFO: VS_FIXEDFILEINFO structure (binary)
    StringFileInfo: (no data)
        xxxxyyyy: (no data)
            CompanyName: string for xxxxyyyy
            FileDescription: string for xxxxyyyy
            FileVersion: string for xxxxyyyy
            ...
        zzzzwwww: (no data)
            CompanyName: string for zzzzwwww
            FileDescription: string for zzzzwwww
            FileVersion: string for zzzzwwww
            ...
    VarFileInfo: (no data)
        Translation: array of locale/codepage pairs (binary, variable-size)

The child nodes can appear in any order, and the strings
like CompanyName are all optional.
VarFileInfo\Translation, however, is mandatory
(by convention).

If you've used VerQueryValue, you know that the
binary data stored under
VarFileInfo\Translation consists of a variable-length
array of locale/codepage pairs, each of which in turn corresponds
to a child of the StringFileInfo node.
I'm not going to go into what each of the strings means and how
the locale/codepage pairs turn into child nodes of StringFileInfo;
I'll leave you to research that on your own (assuming you don't
already know).

How does this tree get stored into a resource?
It's actually quite simple.
Each node is stored in a structure which takes the following
form (in pseudo-C):

In words, each version node begins with a 16-bit value
describing the size of the node in bytes (including its children),
followed by a 16-bit value that specifies how many bytes
of data (either binary or text) are associated with the node.
(If the node contains text data, the count includes the null terminator.)
Next comes the null-terminated name of the node
and padding bytes to bring us back into DWORD
alignment.
After the key name (and optional padding) comes the data,
again followed by padding bytes to bring us back
into DWORD alignment.
Finally, after all the node information come its children.
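The layout just described can be sketched in C as a small decoder for a single node. This is only an illustration under stated assumptions: the names NodeView, parse_node, align_dword, read_u16le and kSample are hypothetical (only cbNode and cbData appear in the text), the sample bytes are synthetic rather than taken from a real file, and alignment is assumed to be measured from the start of the resource.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of one 16-bit version node, as described in the text:
 *   WORD cbNode;   size of the node, children included
 *   WORD cbData;   size of the data, text terminator included
 *   char name[];   null-terminated key name, then padding to a DWORD boundary
 *   BYTE data[cbData], then padding to a DWORD boundary
 *   child nodes, if any
 */
typedef struct NodeView {          /* hypothetical helper type */
    uint16_t cbNode;
    uint16_t cbData;
    const char *name;              /* points into the resource buffer */
    size_t dataOffset;             /* where the data begins */
    size_t childOffset;            /* where the first child would begin */
    size_t endOffset;              /* start + cbNode */
} NodeView;

/* Rounds an offset up to the next multiple of four. */
static size_t align_dword(size_t off) { return (off + 3) & ~(size_t)3; }

/* Reads a little-endian 16-bit value. */
static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Synthetic sample, not taken from any real file: a root "R" with two
 * bytes of binary data and two text children, "C1" = "hi" and "C2" = "x". */
static const uint8_t kSample[] = {
    0x22, 0x00, 0x02, 0x00, 'R', 0, 0, 0, 0xAA, 0xBB, 0, 0,
    0x0B, 0x00, 0x03, 0x00, 'C', '1', 0, 0, 'h', 'i', 0, 0,
    0x0A, 0x00, 0x02, 0x00, 'C', '2', 0, 0, 'x', 0
};

/* Decodes the node at offset `start`. Returns 1 on success, 0 if the
 * sizes do not fit inside the buffer. Offsets are assumed to be aligned
 * relative to the start of the resource. */
static int parse_node(const uint8_t *res, size_t resSize, size_t start,
                      NodeView *out)
{
    assert(res != NULL && out != NULL);
    if (start > resSize || resSize - start < 4) return 0;

    out->cbNode = read_u16le(res + start);
    out->cbData = read_u16le(res + start + 2);
    if (out->cbNode < 4 || out->cbNode > resSize - start) return 0;
    out->endOffset = start + out->cbNode;

    /* The key name must be terminated inside the node. */
    const uint8_t *nameStart = res + start + 4;
    const uint8_t *nul = memchr(nameStart, 0, out->endOffset - (start + 4));
    if (nul == NULL) return 0;
    out->name = (const char *)nameStart;

    /* Data follows the name and its padding. */
    out->dataOffset = align_dword((size_t)(nul - res) + 1);
    if (out->cbData != 0 &&
        (out->dataOffset > out->endOffset ||
         out->cbData > out->endOffset - out->dataOffset)) return 0;

    /* Children, if any, follow the data and its padding. */
    out->childOffset = align_dword(out->dataOffset + out->cbData);
    return 1;
}
```

Calling parse_node(kSample, sizeof kSample, 0, &node) fills in the root node; a node has children only if childOffset is still less than endOffset.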

Since each of the children might themselves have children,
you can see how the tree structure "flattens" into this
serialized format.
To move from one node to its next sibling, you skip ahead
by cbNode bytes.
To move from a node to its first child, you skip over
the key name and associated data.
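Here is a rough sketch of those two navigation steps in C. The helper names (first_child, next_sibling, print_tree, kSample) are hypothetical, the sample bytes are synthetic, and the code trusts its input: real code would have to validate every offset against the buffer size. One assumption worth calling out is that the sibling step rounds up to the next DWORD boundary after skipping cbNode bytes, on the premise that siblings start on DWORD boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NO_NODE ((size_t)-1)       /* hypothetical "no such node" marker */

static size_t align_dword(size_t off) { return (off + 3) & ~(size_t)3; }

static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Synthetic sample, not taken from any real file: a root "R" with two
 * bytes of binary data and two text children, "C1" = "hi" and "C2" = "x". */
static const uint8_t kSample[] = {
    0x22, 0x00, 0x02, 0x00, 'R', 0, 0, 0, 0xAA, 0xBB, 0, 0,
    0x0B, 0x00, 0x03, 0x00, 'C', '1', 0, 0, 'h', 'i', 0, 0,
    0x0A, 0x00, 0x02, 0x00, 'C', '2', 0, 0, 'x', 0
};

/* Offset just past the node: node + cbNode. */
static size_t node_end(const uint8_t *res, size_t node)
{
    uint16_t cbNode = read_u16le(res + node);
    assert(cbNode >= 4);           /* a node is at least its two WORDs */
    return node + cbNode;
}

/* First child: skip the two WORDs, the key name and the data,
 * realigning after the name and after the data. */
static size_t first_child(const uint8_t *res, size_t node)
{
    size_t end = node_end(res, node);
    uint16_t cbData = read_u16le(res + node + 2);
    size_t off = node + 4;
    off += strlen((const char *)res + off) + 1;  /* name and terminator */
    off = align_dword(off);
    off = align_dword(off + cbData);
    return off + 4 <= end ? off : NO_NODE;
}

/* Next sibling: skip cbNode bytes, realign, and stop at the parent's end. */
static size_t next_sibling(const uint8_t *res, size_t node, size_t parentEnd)
{
    size_t off = align_dword(node_end(res, node));
    return off + 4 <= parentEnd ? off : NO_NODE;
}

/* Prints the key names as an indented tree; returns the number of nodes. */
static int print_tree(const uint8_t *res, size_t node, int depth)
{
    int count = 1;
    size_t end = node_end(res, node);
    printf("%*s%s\n", depth * 2, "", (const char *)res + node + 4);
    for (size_t c = first_child(res, node); c != NO_NODE;
         c = next_sibling(res, c, end)) {
        count += print_tree(res, c, depth + 1);
    }
    return count;
}
```

Calling print_tree(kSample, 0, 0) walks the whole sample, printing the root name followed by its two children indented beneath it.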

Let's take a look at the resources for the 16-bit
shell.dll to see how this all fits together.

Notice that the size of the root node equals the size of
the entire version resource.
This is to be expected, of course, because the version resource
is merely a serialization of the resource tree diagram.

Since the string name (plus null terminator) happens to come out
to an exact multiple of four bytes, there is no need for padding
between the name and the binary data, which takes the form of a
VS_FIXEDFILEINFO:

Notice that the padding bytes
are not counted in the cbData.
In fact, the padding bytes at the end of the data don't
even count towards the cbNode.
This is a leaf node since we already reach the end of the node
once we store the data.
Therefore, the next node in the version resource is a sibling,
not a child.
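To make the size accounting concrete, here is a minimal sketch of writing a leaf node in C. The function write_leaf and its helpers are hypothetical names, the caller is assumed to supply a large enough buffer, and the node is assumed to start on a DWORD boundary. The point is only to show which bytes are counted: cbData covers the data alone, and cbNode stops at the end of the data, before any trailing padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static size_t align_dword(size_t off) { return (off + 3) & ~(size_t)3; }

static void write_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Writes a leaf node at offset `start` and returns the offset at which
 * the next sibling would begin. The caller must ensure that `buf` is
 * large enough; this sketch does no bounds checking. */
static size_t write_leaf(uint8_t *buf, size_t start, const char *name,
                         const void *data, uint16_t cbData)
{
    assert(start % 4 == 0);        /* nodes begin on DWORD boundaries */

    /* Key name, including its null terminator, then padding. */
    size_t off = start + 4;        /* room for cbNode and cbData */
    size_t nameLen = strlen(name) + 1;
    memcpy(buf + off, name, nameLen);
    off += nameLen;
    memset(buf + off, 0, align_dword(off) - off);
    off = align_dword(off);

    /* Data. The node ends here: trailing padding is not counted. */
    memcpy(buf + off, data, cbData);
    off += cbData;
    write_u16le(buf + start, (uint16_t)(off - start));  /* cbNode */
    write_u16le(buf + start + 2, cbData);               /* cbData */

    /* Padding so that the next sibling starts on a DWORD boundary. */
    memset(buf + off, 0, align_dword(off) - off);
    return align_dword(off);
}
```

For example, write_leaf(buf, 0, "C1", "hi", 3) stores a cbNode of 11 and a cbData of 3, yet returns 12, because the single padding byte after the data belongs to neither count.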

Wait a second, what's that "another null terminator"?
If you count the bytes, you'll see that the cbData
for the LegalCopyright node counts not only the
terminating null, but another bonus null after it.
I suspect that somebody put an extra null terminator in the
resource file by mistake:

I’m not really that surprised at the “another NUL terminator”.
There have been times where the resource compiler (or maybe
Visual Studio, I haven’t really looked) does not properly terminate the
strings in the version resource. In fact, I had to use the
“extra” NULs on my last project because the product version string had
weird characters on the end due to the File Properties dialog printing
out the size of the subsequent version information record as a string.

By the way, I do hope that the use of “null” rather than NUL
was a typo. It would be damaging to my idealized version of
Raymond Chen to see that he didn’t know his ASCII character codes.
That puts a serious hurt on your geek cred. :)

[NUL is the name for the null character. In the
same way that BEL is the name for the bell character, but if I want to
talk about a bell, I’ll just call it bell and not BEL. -Raymond]

<<Beyond that, you can call your nodes anything you want and give them any kind of data you want. But if you want other people to understand your version information, you’d be best off following the conventions I describe below.>>

In fact, even the resource editor in VS has problems with that, and will "merge" resources from other blocks into the StringFileInfo block.

Actually there are distinct entities here. NUL is the name for ASCII 0. NULL is a name for a constant which converts trivially to a null pointer in C. The word "null" has some definition which I won’t bother quoting; I’m assuming we’re happy with its usage in context.

Null pointers are synonymous with NULL but that doesn’t "make" them NULL; NULL is either a builtin compiler concept or a macro which expands to a compiler concept (usually just "0" but that’s a whole additional legacy).

"Strings" in C/C++ are required to terminate with a null character which is by definition 0. (Even with wchar_t being a distinct type, it’s still an integral type and must store and preserve the value "0". Whereas pointers are not integral types and assignment of "0" to them may involve a nontrivial conversion.)

I personally try to avoid using the term NUL at all, since it assumes too much about the character set encoding. But I’m a weirdo like that.

If you’re talking about Win16 then the base note here makes me think that it did not have an i18n system. Strings were encoded in ANSI and there was no way to designate code pages other than the code page in use in the end user’s Windows system.

If you’re talking about Win32 then the resource system does appear to be designed to support i18n but it doesn’t work. If you try to load resources for a language other than the "first" language in the .exe’s resources, it won’t work. You have to put each other language in its own .dll.

Why are all the different locales packed into one resource block for version resources?

Maybe so that the list of languages can be retrieved? There’s one known place to retrieve it from.

(1) If you try to load "ordinary" resources (strings, dialog boxes, etc.) with a LocaleID that doesn’t match the "first" set of resources that were compiled into the .exe then it doesn’t work. This is why separate resource-only .dll files are needed.

(2) For version resources, which are the subject of this thread, I didn’t try it. No that’s not right, because I did try it, I just didn’t know I was trying it. No that’s not quite right either.

For a product that is designed primarily to be exported, I set "ordinary" resources to English (UK) in the .exe and various other languages in .dll’s. I defined version resources in the .exe but didn’t bother (yet) in the .dll’s. Both Visual Studio 2005 and eMbedded Visual C++4 put version resources in a resource section for Japanese instead of English (UK). The .exe produced by eVC++4 has version resources visible despite the mismatch. The .exe produced by VS2005 ordinarily doesn’t have version resources visible, and still doesn’t even if I hand-edit the .rc file to put them in the English (UK) section — but Microsoft suggested a workaround in editing the .rc file, so that now the version resources are visible even though they’re in the default Japanese section.

So I think that I18N might really be working (mostly), for version resources though not for other resources.