Generating String Resource Accessors for .NET

This article presents a utility written in Visual Basic .NET that generates a C# or VB.NET source file from a .resx or .resources file. The resulting class enables compile-time checking of resource string identifier names and of the number of format items each string expects.

Introduction

Information technology applications increasingly have a global reach. Thus, it is essential for software developers to design applications with an eye towards localization into multiple languages. Ideally, releasing a well-designed application in a new language is simply a matter of creating a new set of localized resources and perhaps changing a setting at compile-time or runtime. The .NET Framework provides excellent support for this scenario via the ability to swap in XML-based .resx files or binary .resources files and deployment and loading of satellite assemblies. For a good introduction to localization in the .NET Framework, see MSDN.

A key part of localization is the management of string resources.

The .NET Framework contains a ResourceManager class with a GetString method to which you pass the literal name of the string resource you want to retrieve, and in return, you get the appropriate localized string per the rules described in the article above.

Of course, should you misspell the resource name, GetString will return null at runtime rather than fail at compile-time. Depending on the application, this bug may be difficult to isolate. Since it is always preferable to detect errors at compile-time rather than runtime, in the past, I have manually added symbols for each string defined in a resource, like the following:
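A minimal sketch of this manual approach, assuming a hypothetical class name (StringNames) and hypothetical resource names (WelcomeMessage, FileNotFound) that do not come from the article:

```csharp
using System;

// Hypothetical hand-maintained constants, one per string resource.
// The class and member names are illustrative assumptions.
static class StringNames
{
    public const string WelcomeMessage = "WelcomeMessage";
    public const string FileNotFound = "FileNotFound";
}

class Program
{
    static void Main()
    {
        // A typo such as StringNames.WelcomMessage is now a compile error
        // instead of a failed lookup at runtime.
        Console.WriteLine(StringNames.WelcomeMessage);
        Console.WriteLine(StringNames.FileNotFound);
    }
}
```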

This is a safer approach. However, once projects get into the hundreds or thousands of strings, the process of defining string constants becomes tedious and cries out for automation. Before building my own tool, I searched around and found two other attempts to address this problem: a command-line utility [1] and a Visual Studio .NET custom tool that makes use of the CodeDOM [2]. Both of these tools incorporate a number of good ideas, and I would encourage you to check them out. Rather than spending a great deal of time explaining them and why they did not meet my needs, I will move directly into discussing my tool, StringClassGen.

StringClassGen Features

StringClassGen.exe is a command-line tool written in Visual Basic .NET. The command syntax is explained below, in the section "StringClassGen Usage". As a command-line tool, depending on your build process, you can integrate it into your build batch files, NAnt scripts, or Visual Studio .NET pre-build events, to turn .resx and/or .resources files into source files containing properties and methods, to help you more easily and safely access your resource strings. StringClassGen has the following key features:

Like the approach described in [2], StringClassGen uses the CodeDOM to generate either C# or Visual Basic .NET code on demand. Because code that uses the CodeDOM is not particularly interesting (just tedious), I will not show any of it here. Suffice it to say that the generated code consists of a set of static properties and methods discussed in more detail below. When this code is integrated into a C# or Visual Basic .NET project, it abstracts the process of loading strings from assembly resources in a type-safe manner.

The top-level class generated by StringClassGen encapsulates a singleton instance of a ResourceManager that loads strings from the executing assembly. (A possible extension of StringClassGen would be to allow the ResourceManager to load from an assembly other than the one that is executing.) Note that the ResourceManager.GetString method is thread-safe according to the .NET Framework documentation.
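The shape of such a top-level class might look roughly like the sketch below. The class name (MyStrings), the property name (Resources), and the base name "MyNamespace.MyStrings" are assumptions for illustration; the code StringClassGen actually emits may differ.

```csharp
using System;
using System.Reflection;
using System.Resources;

// Hypothetical shape of a generated top-level class (names are assumptions).
public sealed class MyStrings
{
    // Assumed base name: "<namespace>.<resource file name without extension>".
    private const string BaseName = "MyNamespace.MyStrings";

    // A static readonly field is initialized once by the runtime,
    // which gives a single shared ResourceManager instance.
    private static readonly ResourceManager resources =
        new ResourceManager(BaseName, Assembly.GetExecutingAssembly());

    private MyStrings() { }

    public static ResourceManager Resources
    {
        get { return resources; }
    }
}

class Program
{
    static void Main()
    {
        // Constructing a ResourceManager does not load any resources yet,
        // so this runs even without an embedded resource file.
        Console.WriteLine(MyStrings.Resources.BaseName);
    }
}
```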

When creating identifiers for string resources in code, I have often found it useful to organize them into groups, such as by the web page or form in which they are used. Inner classes provide a nice encapsulation for this concept. This leads to the problem of how to identify the desired groups when doing string class generation.

If you have edited .resx files in Visual Studio, you may have noticed that, associated with each string entry, there is a column entitled "comment" in which you can place any text you want. If you adopt the convention that the contents of the "comment" column represent the desired group (inner class) for a string identifier, then when working with .resx files, StringClassGen will create the appropriate inner classes and populate them with the correct string identifiers. (Clearly, that implies that in this scheme, the contents of the "comment" column must be a valid identifier name in the target programming language.)

How this is accomplished and why it only works with .resx files is worth some discussion. The .NET Framework provides an interface, IResourceReader, and two implementations, ResourceReader and ResXResourceReader, that allow you to pull string resource name-value pairs out of .resources and .resx files, respectively. If you peek underneath the hood at the .resx file format, you'll see that it is just XML, with string elements that look like:

<data name="String1"><value>This is the first string.</value><comment>Class1</comment></data>

Unfortunately, the IResourceReader interface provides no mechanism to access the <comment>. Indeed, for .resources files (the compiled binary format of .resx files), the comment is no longer a part of the file's data. StringClassGen processes .resources files using ResourceReader, as shown in the following code snippet:
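As a hedged illustration (written in C#, not the tool's own Visual Basic .NET source), enumerating name-value pairs with System.Resources.ResourceReader can be sketched as follows. The in-memory .resources image built with ResourceWriter is demo setup only, so that the sketch runs without a file on disk.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Resources;

class Program
{
    static void Main()
    {
        // Demo setup only: build a small binary .resources image in memory.
        byte[] image;
        using (var buffer = new MemoryStream())
        {
            using (var writer = new ResourceWriter(buffer))
            {
                writer.AddResource("String1", "This is the first string.");
                writer.AddResource("String2", "This is the second string.");
            }
            // ToArray still works after the writer has closed the stream.
            image = buffer.ToArray();
        }

        // Enumerate the entries; only names and values are available here,
        // the .resx comment is not part of the binary format.
        using (IResourceReader reader = new ResourceReader(new MemoryStream(image)))
        {
            foreach (DictionaryEntry entry in reader)
            {
                string value = entry.Value as string;
                if (value != null) // skip non-string resources
                {
                    Console.WriteLine("{0} = {1}", entry.Key, value);
                }
            }
        }
    }
}
```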

But for .resx files, we have another option. Since a .resx file is just XML, we can forgo the use of the IResourceReader interface entirely and process it as an ordinary XML file. For StringClassGen, I chose to use XPathNavigator and XPathDocument:
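The following is a sketch of that idea rather than the tool's actual routine. It assumes a simplified .resx document held in a string (a real file also contains schema and resheader elements, and data elements for non-string resources that would need filtering) and sorts the data elements by their comment.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

class Program
{
    // Simplified, hypothetical .resx content for illustration.
    const string Resx = @"<root>
  <data name=""String1""><value>This is the first string.</value><comment>Class1</comment></data>
  <data name=""String2""><value>A top-level string.</value></data>
  <data name=""String3""><value>Another string.</value><comment>Class1</comment></data>
</root>";

    static void Main()
    {
        var document = new XPathDocument(new StringReader(Resx));
        XPathNavigator navigator = document.CreateNavigator();

        // Select every data element and sort the results by comment text.
        XPathExpression expression = navigator.Compile("/root/data");
        expression.AddSort("comment", XmlSortOrder.Ascending,
                           XmlCaseOrder.None, "", XmlDataType.Text);

        XPathNodeIterator iterator = navigator.Select(expression);
        while (iterator.MoveNext())
        {
            XPathNavigator data = iterator.Current;
            string name = data.GetAttribute("name", "");
            // string(...) yields an empty string when the child element is absent.
            string value = (string)data.Evaluate("string(value)");
            string comment = (string)data.Evaluate("string(comment)");
            Console.WriteLine("[{0}] {1} = {2}", comment, name, value);
        }
    }
}
```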

The most interesting part of this routine is the sort on "comment". This allows us to process all strings belonging to a given inner class sequentially, so that only one inner class is "open for business" at a time as we use the CodeDOM to generate identifiers. Strings that have an empty comment defined in the .resx file will not be placed in an inner class but rather belong to the outermost generated class.
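To show why the sort helps, here is a simplified sketch of the grouping loop. It prints a class skeleton as text instead of building a CodeDOM graph, which is a deliberate simplification, and the entry names are hypothetical. Because the entries arrive sorted by comment, a change in comment is the only signal needed to close one inner class and open the next.

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // (comment, name) pairs, assumed to be already sorted by comment.
        // An empty comment means the string belongs to the outermost class.
        var entries = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("", "String2"),
            new KeyValuePair<string, string>("Class1", "String1"),
            new KeyValuePair<string, string>("Class1", "String3"),
            new KeyValuePair<string, string>("Class2", "String4"),
        };

        Console.WriteLine("public class MyStrings {");
        string open = ""; // name of the inner class currently open, if any
        foreach (KeyValuePair<string, string> entry in entries)
        {
            if (entry.Key != open)
            {
                if (open != "") Console.WriteLine("    }"); // close previous group
                Console.WriteLine("    public class " + entry.Key + " {");
                open = entry.Key;
            }
            string indent = open == "" ? "    " : "        ";
            Console.WriteLine(indent + "// member for " + entry.Value);
        }
        if (open != "") Console.WriteLine("    }"); // close the last group
        Console.WriteLine("}");
    }
}
```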

One of the more useful string manipulation features in the .NET Framework is the ability to embed "format item" tokens into strings and replace them at runtime using the String.Format method. Ordinarily, to take advantage of this feature while using string resources, one would write code like the following (where a resource manager has already been obtained):

where the value associated with StringWithParams might be: "This is the first param: {0}. And this is the second param: {1}".
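A minimal, self-contained sketch of this manual pattern follows. The temporary directory, the base name "Demo", and the file-based ResourceManager are demo scaffolding chosen so the sketch runs without an embedded resource; in a real project the manager would normally read from the assembly's resources.

```csharp
using System;
using System.IO;
using System.Resources;

class Program
{
    static void Main()
    {
        // Demo setup only: write a throwaway .resources file.
        string dir = Path.Combine(Path.GetTempPath(),
                                  "StringDemo_" + Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(dir);
        using (var writer = new ResourceWriter(Path.Combine(dir, "Demo.resources")))
        {
            writer.AddResource("StringWithParams",
                "This is the first param: {0}. And this is the second param: {1}");
        }

        ResourceManager rm =
            ResourceManager.CreateFileBasedResourceManager("Demo", dir, null);

        // The pattern under discussion: look up by literal name, then format.
        // Neither the name nor the argument count is checked by the compiler.
        string format = rm.GetString("StringWithParams");
        Console.WriteLine(String.Format(format, "alpha", 42));

        // Release the file handle before cleaning up the demo directory.
        rm.ReleaseAllResources();
        Directory.Delete(dir, true);
    }
}
```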

To streamline this process, StringClassGen uses regular expression matching to attempt to detect format item tokens in the strings it processes and, if it finds any, generates wrapper methods similar to those in [1] that take the correct number of parameters. The wrapper methods handle both the retrieval of the string from the resource and the formatting. Thus, for a string such as StringWithParams above, StringClassGen generates a method that takes two parameters.
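The article does not show the regular expression the tool uses, so the following is only one plausible way to count the parameters a string needs. The pattern and the helper name (CountParameters) are assumptions. It strips escaped braces first, then takes the highest format item index plus one.

```csharp
using System;
using System.Text.RegularExpressions;

static class FormatItems
{
    // Matches {index}, {index,alignment}, and {index:format} items.
    // This pattern is an assumption, not the one StringClassGen uses.
    private static readonly Regex Item =
        new Regex(@"\{(\d+)(,[^}:]*)?(:[^}]*)?\}");

    public static int CountParameters(string text)
    {
        // "{{" and "}}" are escaped literal braces, not format items.
        string unescaped = text.Replace("{{", "").Replace("}}", "");
        int highest = -1;
        foreach (Match match in Item.Matches(unescaped))
        {
            highest = Math.Max(highest, int.Parse(match.Groups[1].Value));
        }
        return highest + 1; // 0 means the string has no format items
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(FormatItems.CountParameters(
            "This is the first param: {0}. And this is the second param: {1}"));
        Console.WriteLine(FormatItems.CountParameters("Total: {0:N2} ({0})"));
        Console.WriteLine(FormatItems.CountParameters("Literal {{0}} braces"));
    }
}
```

Using the highest index rather than the number of matches means a string that repeats {0} still yields a one-parameter method.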

Furthermore, as FxCop is fond of pointing out, when you use String.Format, you ought to use the overload that accepts an IFormatProvider to provide culture-specific formatting information, especially in global-friendly applications. Thus, StringClassGen actually creates two overloads for each string method: one that accepts a client-supplied IFormatProvider, and another that uses a default format provider that you specify via a command-line option, which may be InvariantCulture, CurrentUICulture, or CurrentCulture. If no format provider is specified, the default is to use the InvariantCulture.
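A sketch of what such an overload pair could look like for a two-parameter string is shown below. The class name, the parameter names, and the Lookup helper are assumptions; Lookup is a stand-in for the ResourceManager call so that the sketch runs without a resource file.

```csharp
using System;
using System.Globalization;

public static class MyStrings
{
    // Stand-in for ResourceManager.GetString (assumption, for the demo only).
    private static string Lookup(string name)
    {
        return "This is the first param: {0}. And this is the second param: {1}";
    }

    // Overload using the default provider; InvariantCulture is the
    // default the article describes when no option is given.
    public static string StringWithParams(object arg0, object arg1)
    {
        return StringWithParams(CultureInfo.InvariantCulture, arg0, arg1);
    }

    // Overload taking a caller-supplied format provider.
    public static string StringWithParams(IFormatProvider provider,
                                          object arg0, object arg1)
    {
        return String.Format(provider, Lookup("StringWithParams"), arg0, arg1);
    }
}

class Program
{
    static void Main()
    {
        // Passing the wrong number of arguments is now a compile error.
        Console.WriteLine(MyStrings.StringWithParams(1234.5, "two"));
        Console.WriteLine(MyStrings.StringWithParams(
            CultureInfo.CurrentCulture, 1234.5, "two"));
    }
}
```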

As we saw in the last section, strings with format item tokens cause the generation of methods taking one or more parameters. If a string contains no format item tokens, there is no need to generate a method: a property is more appropriate. Thus, a string with no format tokens will trigger the generation of code like the following:
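A hedged sketch of such a property follows, with assumed names (MyStrings, String1, and the base name "MyNamespace.MyStrings"). Since no matching resource is embedded when the sketch is compiled on its own, Main catches the exception that the first lookup raises.

```csharp
using System;
using System.Resources;

public static class MyStrings
{
    // Assumed base name; it must match the embedded resource's actual name.
    private static readonly ResourceManager resources =
        new ResourceManager("MyNamespace.MyStrings", typeof(MyStrings).Assembly);

    // A string with no format items becomes a simple read-only property.
    public static string String1
    {
        get { return resources.GetString("String1"); }
    }
}

class Program
{
    static void Main()
    {
        try
        {
            Console.WriteLine(MyStrings.String1);
        }
        catch (MissingManifestResourceException)
        {
            // Raised when the assembly has no resource with the assumed base name.
            Console.WriteLine("No resource named MyNamespace.MyStrings was found.");
        }
    }
}
```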

When you edit a .resx file in Visual Studio .NET or your text editor of choice, there is nothing to prevent you from using the same string name for more than one string. However, this is almost certainly a bug, since there is no way to retrieve more than one string with a given name using a ResourceManager. Because of this, StringClassGen.exe tracks the list of used names and returns an error if the same name is used more than once in the resource file.
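The check itself can be as simple as the sketch below. The helper name (FindDuplicates) and the case-sensitive comparison are assumptions; a generator targeting Visual Basic .NET might prefer a case-insensitive comparison, because VB identifiers are not case-sensitive.

```csharp
using System;
using System.Collections.Generic;

static class NameCheck
{
    // Returns every name that occurs more than once, in order of detection.
    public static List<string> FindDuplicates(IEnumerable<string> names)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var duplicates = new List<string>();
        foreach (string name in names)
        {
            // HashSet.Add returns false when the name is already present.
            if (!seen.Add(name)) duplicates.Add(name);
        }
        return duplicates;
    }
}

class Program
{
    static int Main()
    {
        string[] names = { "String1", "String2", "String1" };
        List<string> duplicates = NameCheck.FindDuplicates(names);
        foreach (string name in duplicates)
        {
            Console.Error.WriteLine("Duplicate resource name: " + name);
        }
        // A nonzero exit code lets a build script stop on the error.
        return duplicates.Count == 0 ? 0 : 1;
    }
}
```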

StringClassGen Usage

The behavior of StringClassGen.exe may be controlled by several command-line arguments:

(-vb|-cs) - The language of the generated code, either VB.NET or C#. The default is C#.

(-c) - Emits .resx file comments as <summary> comments for the generated properties and methods, instead of using them to group strings into inner classes. If you choose this option, all of your strings will be defined in the top-level class. This option is off by default.

(-ns namespacename) - The namespace that the generated top-level string class lives in. The default is the name of the resource file (minus extension) suffixed with the string "Namespace".

(-class classname) - The name of the generated string class, which defaults to the name of the resource file minus extension.

(-out outfilename) - The name of the generated file. If this option is omitted, output is sent to standard out.

You should add the StringClassGen-generated source file(s) to your assembly, compiling them along with the rest of your source code. Depending on the build tools you are employing, you may want to establish a dependency so that StringClassGen only runs if the .resx file is newer than the source file.

If you are having difficulties using the generated source code, the most likely problem is that you are providing the wrong namespace (-ns) to StringClassGen. This has the effect of causing an exception the first time you access the resource manager. Visual Studio .NET, by default, places resources from project-included .resx files into the default namespace for the project, which is configurable in the project properties. Chances are, you want to pass this as the -ns parameter to StringClassGen. (You can verify the name of your specific resource by examining your assembly manifest in ildasm.exe. An exact explanation of this process is beyond the scope of this article, however.)

Demo Project

Included in the download are sample applications in Visual Basic .NET and C# that load a few strings from the resource using StringClassGen-generated classes. In each application, you can switch between using inner classes or not by setting the conditional compilation constant "INNERCLASSES" to true and including the appropriate version of the generated TestStringResource.cs/TestStringResource.vb. Here are the sample command lines for generating the files (relative to the build-output directory for StringClassGen):

History

Initial release: 01/02/2005.

Allows selection of default format provider: 01/25/2005.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

I have over 10 years of full-lifecycle software development experience on a variety of large projects. I have a B.S. in Math from the University of Nebraska-Lincoln, a Masters in Software Engineering from Seattle University, and a Masters in Computer Science from the University of Maryland. I specialize in building and installing tools, frameworks, and processes that improve developer productivity and product quality. I am currently based in the Seattle area.

Comments and Discussions

I think (sometimes) that the logic and code in this article have one drawback (maybe a big one): they deal with localization on a page-by-page basis. For example, you might have a "First Name" on a number of pages, and as per this article you need to enter the First Name on every page for every language, instead of entering it once to cover all pages.

I use a "properties" file and a GlobalizationModule to handle these; there was an article about it here some time ago.

Thanks for your interest in the subject matter. I'm not entirely sure I understand your comment, but I'll take a shot at a reply:

The tool does not require that you organize all of your strings on a page-by-page basis. You could organize by functional area. For example, for "First Name", you could use a comment called "UserInformation" and stick that on the "FirstName" string along with "LastName", "PhoneNumber", etc. Then you could use the UserInformation inner string class on any number of pages or forms or whatever you're working with.

Indeed, you can use the inner class mechanism wherever you feel it is appropriate, and leave other strings in the top-level class by omitting their "comment". The sample application demonstrates this.