Hungarian notation

Hungarian notation is an identifier naming convention in computer programming, in which the name of a variable or function indicates its type or intended use. There are two types of Hungarian notation: Systems Hungarian notation and Apps Hungarian notation.

Hungarian notation was designed to be language-independent, and found its first major use with the BCPL programming language. Because BCPL has no data types other than the machine word, nothing in the language itself helps a programmer remember variables' types. Hungarian notation aims to remedy this by providing the programmer with explicit knowledge of each variable's data type.

In Hungarian notation, a variable name starts with a group of lower-case letters which are mnemonics for the type or purpose of that variable, followed by whatever name the programmer has chosen; this last part is sometimes distinguished as the given name. The first character of the given name can be capitalized to separate it from the type indicators (see also CamelCase). Otherwise the case of this character denotes scope.

History

The original Hungarian notation, which would now be called Apps Hungarian, was invented by Charles Simonyi, a programmer who worked at Xerox PARC circa 1972–1981, and who later became Chief Architect at Microsoft. It may have been derived from the earlier principle of using the first letter of a variable name to set its type; for example, variables whose names started with letters I through N in FORTRAN are integers by default.

The notation is a reference to Simonyi's nation of origin; Hungarian people's names are reversed compared to most other European names; the family name precedes the given name. For example, the anglicized name Charles Simonyi in Hungarian was originally Simonyi Charles (Simonyi Károly in Hungarian). In the same way the type name precedes the given name in Hungarian notation rather than the more natural, to most Europeans, Smalltalk "type last" naming style, e.g. aPoint and lastPoint. This latter naming style was most common at Xerox PARC during Simonyi's tenure there.

The name Apps Hungarian was coined since the convention was used in the applications division of Microsoft. Systems Hungarian developed later in the Microsoft Windows development team. Simonyi's paper referred to prefixes used to indicate the type of information being stored. His proposal was largely concerned with decorating identifier names based upon the semantic information of what they store (in other words, the variable's purpose), consistent with Apps Hungarian. However, his suggestions were not entirely distinct from what became known as Systems Hungarian, as some of his suggested prefixes contain little or no semantic information (see below for examples).

The term Hungarian notation is memorable for many people because the strings of unpronounceable consonants vaguely resemble the consonant-rich orthography of some Eastern European languages, despite the fact that Hungarian is a Uralic language and, unlike Slavic languages, is rather rich in vowels. The zero-terminated string prefix sz is also a letter in the Hungarian alphabet.

Systems vs. Apps Hungarian

Where Systems notation and Apps notation differ is in the purpose of the prefixes.

In Systems Hungarian notation, the prefix encodes the actual data type of the variable. For example:

- lAccountNum : variable is a long integer (l);
- arru8NumberList : variable is an array of unsigned 8-bit integers (arru8);
- szName : variable is a zero-terminated string (sz); this was one of Simonyi's original suggested prefixes.
- bReadLine(bPort, &arru8NumberList) : function with a byte-value return code.

Apps Hungarian notation strives to encode the logical data type rather than the physical data type; in this way, it gives a hint as to what the variable's purpose is, or what it represents.

- rwPosition : variable represents a row (rw);
- usName : variable represents an unsafe string (us), which needs to be sanitized before it is used (e.g. see code injection and cross-site scripting for examples of attacks that can be caused by using raw user input).
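The unsafe-string convention described above can be sketched in a few lines. The following Python fragment is only an illustration: the us prefix comes from the example list, while the s (safe string) prefix and the sFromUs helper are hypothetical names chosen for this sketch, and "safe" here means nothing more than HTML-escaped. Both values have the same physical type (str), so a Systems Hungarian prefix could not distinguish them.

```python
import html


def sFromUs(usText):
    """Return a safe (s) string built from an unsafe (us) string.

    Hypothetical helper: "safe" here only means HTML-escaped.
    """
    return html.escape(usText)


# us: raw user input that has not been sanitized yet.
usName = '<script>alert("x")</script>'

# s: the same logical value after sanitizing; still a plain str.
sName = sFromUs(usName)

# The prefixes agree on both sides, so this line reads as correct.
sGreeting = "<p>Hello, " + sName + "</p>"

# A line such as the following mixes s and us and so "looks wrong":
# sGreeting = "<p>Hello, " + usName + "</p>"
```

The point of the sketch is that the mistake in the commented-out line is visible from the names alone, without looking up how usName was produced.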

A further Apps Hungarian example is strName: the variable represents a string (str) containing the name, but the prefix does not specify how that string is implemented.

Most, but not all, of the prefixes Simonyi suggested are semantic in nature. To modern eyes, some prefixes seem to represent physical data types, such as sz for strings. However, such prefixes were still semantic, as Simonyi intended Hungarian notation for languages whose type systems could not distinguish some data types that modern languages take for granted.

The following are examples from the original paper:[1]

- pX is a pointer to another type X; this contains very little semantic information.
- d is a prefix meaning difference between two values; for instance, dY might represent a distance along the Y-axis of a graph, while a variable just called y might be an absolute position. This is entirely semantic in nature.
- sz is a null- or zero-terminated string. In C, this contains some semantic information because it is not clear whether a variable of type char* is a pointer to a single character, an array of characters or a zero-terminated string.
- w marks a variable that is a word. This contains essentially no semantic information at all, and would probably be considered Systems Hungarian.
- b marks a byte, which in contrast to w might have semantic information, because in C the only byte-sized data type is the char, so these are sometimes used to hold numeric values. This prefix might clear ambiguity between whether the variable is holding a value that should be treated as a character or a number.

While the notation always uses initial lower-case letters as mnemonics, it does not prescribe the mnemonics themselves. There are several widely used conventions (see examples below), but any set of letters can be used, as long as they are consistent within a given body of code.

It is possible for code using Apps Hungarian notation to sometimes contain Systems Hungarian when describing variables that are defined solely in terms of their type.

Relation to sigils

In some programming languages, a similar notation now called sigils is built into the language and enforced by the compiler. For example, in some forms of BASIC, name$ names a string and count% names an integer. The major difference between Hungarian notation and sigils is that sigils declare the type of the variable to the language interpreter (which may be a compiler), whereas Hungarian notation is purely a naming scheme, with no effect on the machine interpretation of the program text.

Examples

- u32Identifier : unsigned 32-bit integer (Systems)
- stTime : clock time structure
- fnFunction : function name

The mnemonics for pointers and arrays, which are not actual data types, are usually followed by the type of the data element itself:

- pszOwner : pointer to zero-terminated string
- rgfpBalances : array of floating-point values
- aulColors : array of unsigned long (Systems)

While Hungarian notation can be applied to any programming language and environment, it was widely adopted by Microsoft for use with the C language, in particular for Microsoft Windows, and its use remains largely confined to that area. In particular, use of Hungarian notation was widely evangelized by Charles Petzold's Programming Windows, the original (and for many readers, the definitive) book on Windows API programming. Thus, many commonly seen constructs of Hungarian notation are specific to Windows:

- For programmers who learned Windows programming in C, probably the most memorable examples are the wParam (word-size parameter) and lParam (long-integer parameter) for the WindowProc() function.
- hwndFoo : handle to a window
- lpszBar : long pointer to a zero-terminated string

The notation is sometimes extended in C++ to include the scope of a variable, optionally separated by an underscore.[2][3] This extension is often also used without the Hungarian type-specification:

- g_nWheels : member of a global namespace, integer
- m_nWheels : member of a structure/class, integer
- m_wheels, _wheels : member of a structure/class
- s_wheels : static member of a class
- c_wheels : static member of a function

In JavaScript code using jQuery, a $ prefix is often used to indicate that a variable holds a jQuery object (versus a plain DOM object or some other value).[4]
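The way compound prefixes such as psz or rgfp stack a pointer or array mnemonic in front of an element type, optionally preceded by a scope marker such as g_, can be made concrete with a small decoder. The Python sketch below is an illustration only: the function names describe and expand_prefix, the two lookup tables, and the longest-match-first rule are all assumptions made for this example, and the tables cover just a handful of the mnemonics listed above.

```python
import re

# Hypothetical lookup tables assembled from the examples in this section;
# real code bases define their own mnemonics.
SCOPES = {
    "g": "global",
    "m": "member of a structure/class",
    "s": "static member of a class",
    "c": "static member of a function",
}
TYPES = {
    "p": "pointer to",
    "lp": "long pointer to",
    "rg": "array of",
    "a": "array of",
    "sz": "zero-terminated string",
    "fp": "floating-point",
    "ul": "unsigned long",
    "n": "integer",
    "hwnd": "handle to a window",
}

# Optional scope letter plus "_", lower-case type prefix, capitalized given name.
NAME_RE = re.compile(r"^(?:([gmsc])_)?([a-z][a-z0-9]*)([A-Z][A-Za-z0-9]*)$")


def expand_prefix(prefix):
    """Expand a compound prefix such as 'psz', trying the longest mnemonic first."""
    parts = []
    i = 0
    while i < len(prefix):
        for length in range(len(prefix) - i, 0, -1):
            token = prefix[i:i + length]
            if token in TYPES:
                parts.append(TYPES[token])
                i += length
                break
        else:
            return None  # no known mnemonic starts at position i
    return " ".join(parts)


def describe(name):
    """Split a name into (scope, type description, given name), or None."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    scope, prefix, given = match.groups()
    return SCOPES.get(scope), expand_prefix(prefix), given
```

Under these assumptions, describe("pszOwner") yields (None, "pointer to zero-terminated string", "Owner") and describe("g_nWheels") yields ("global", "integer", "Wheels"). Names that carry only a scope marker, such as m_wheels, have no type prefix to decode and return None in this sketch.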

Advantages

(Some of these apply to Systems Hungarian only.)

Supporters argue that the benefits of Hungarian notation include:[1]

- The symbol type can be seen from its name. This is useful when looking at the code outside an integrated development environment, such as in a code review or on a printout, or when the symbol declaration is in another file from the point of use, such as a function.
- In a language that uses dynamic typing or that is untyped, the decorations that refer to types cease to be redundant. In such languages variables are typically not declared as holding a particular type of data, so the only clues as to what operations can be done on them are hints given by the programmer, such as a variable naming scheme, documentation and comments. As mentioned above, Hungarian notation expanded in such a language (BCPL).
- The formatting of variable names may simplify some aspects of code refactoring (while making other aspects more error-prone).
- Multiple variables with similar semantics can be used in a block of code: dwWidth, iWidth, fWidth, dWidth.
- Variable names can be easy to remember from knowing just their types.
- It leads to more consistent variable names.
- Inappropriate type casting and operations using incompatible types can be detected easily while reading code.
- In complex programs with lots of global objects (VB/Delphi Forms), having a basic prefix notation can ease the work of finding the component inside of the editor. For example, searching for the string btn might find all the Button objects.
- Applying Hungarian notation in a narrower way, such as applying it only to member variables, helps avoid naming collisions.
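The points about dynamically typed languages and about spotting incompatible operations while reading can be illustrated with a short Python sketch. The rw (row) and d (difference) prefixes are taken from earlier in the article; the col prefix for a column and all of the variable names are assumptions made for this example.

```python
# All values are plain ints; only the prefixes record what they mean.
rwFirst = 2    # rw: a row index
rwLast = 9
colLeft = 4    # col: a column index (assumed prefix)

dRw = rwLast - rwFirst           # d: difference between two rows
rwMiddle = rwFirst + dRw // 2    # row plus row difference gives a row

# The interpreter would accept the next line without complaint, but the
# mismatched prefixes (rw = rw + col) mark it as suspect to a reader:
# rwMiddle = rwFirst + colLeft
```

Nothing in the language distinguishes a row from a column here, so the prefixes are the only place where that distinction is recorded.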

Disadvantages

Most arguments against Hungarian notation are against Systems Hungarian notation, not Apps Hungarian notation. Some potential issues are:

- The Hungarian notation is redundant when type-checking is done by the compiler. Compilers for languages providing type-checking ensure the usage of a variable is consistent with its type automatically; checks by eye are redundant and subject to human error.
- Most modern integrated development environments display variable types on demand, and automatically flag operations which use incompatible types, making the notation largely obsolete.
- Hungarian notation becomes confusing when it is used to represent several properties, as in a_crszkvc30LastNameCol: a constant reference argument, holding the contents of a database column LastName of type varchar(30) which is part of the table's primary key.
- It may lead to inconsistency when code is modified or ported. If a variable's type is changed, either the decoration on the name of the variable will be inconsistent with the new type, or the variable's name must be changed. A particularly well-known example is the standard WPARAM type, and the accompanying wParam formal parameter in many Windows system function declarations. The 'w' stands for 'word', where 'word' is the native word size of the platform's hardware architecture. It was originally a 16-bit type on 16-bit word architectures, but was changed to a 32-bit type on 32-bit word architectures, or a 64-bit type on 64-bit word architectures, in later versions of the operating system while retaining its original name (its true underlying type is UINT_PTR, that is, an unsigned integer large enough to hold a pointer). The semantic impedance, and hence programmer confusion and inconsistency from platform to platform, is on the assumption that 'w' stands for 16-bit in those different environments.
- Most of the time, knowing the use of a variable implies knowing its type. Furthermore, if the usage of a variable is not known, it cannot be deduced from its type.
- Hungarian notation reduces the benefits of using code editors that support completion on variable names, for the programmer has to input the type specifier first, which is more likely to collide with other variables than when using other naming schemes.
- It makes code less readable, by obfuscating the purpose of the variable with needless type and scoping prefixes.[5] The readability problem can be circumvented with the Rudder Notation enhancement, which recommends a camel case variable name to the left and the type information to the right, while having them clearly separated with an underscore. E.g.: LightYears_dw.[6]
- The additional type information can insufficiently replace more descriptive names. E.g. sDatabase does not tell the reader what it is. databaseName might be a more descriptive name.
- When names are sufficiently descriptive, the additional type information can be redundant. E.g. firstName is most likely a string, so naming it sFirstName only adds clutter to the code.
- It's harder to remember the names.
- Multiple variables with different semantics can be used in a block of code with similar names: dwTmp, iTmp, fTmp, dTmp.
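The inconsistency risk described above, where a variable's type changes but its decorated name does not, can be shown with a toy checker. This Python sketch is illustrative only: the PREFIX_TYPES mapping and the prefix_matches function are hypothetical, and real Systems Hungarian conventions do not prescribe these particular pairings.

```python
# Assumed mapping from Systems-style prefixes to Python types.
PREFIX_TYPES = {"n": int, "fp": float, "s": str}


def prefix_matches(name, value):
    """Return True if the leading lower-case prefix of `name` agrees with
    the runtime type of `value` under PREFIX_TYPES."""
    prefix = ""
    for ch in name:
        if not ch.islower():
            break
        prefix += ch
    expected = PREFIX_TYPES.get(prefix)
    return expected is not None and type(value) is expected


nWheels = 4
before = prefix_matches("nWheels", nWheels)   # prefix n and type int agree

# Later the value becomes fractional but the name is kept.
nWheels = 4.5
after = prefix_matches("nWheels", nWheels)    # prefix n now misstates the type
```

After the change, every use of nWheels tells the reader it is an integer when it no longer is; the alternative is renaming the variable everywhere it appears.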

Notable opinions

- Robert Cecil Martin (against Hungarian notation and all other forms of encoding):
  "... nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, member or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader."[7]
- Linus Torvalds (against Systems Hungarian):
  "Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged - the compiler knows the types anyway and can check those, and it only confuses the programmer."[8]
- Steve McConnell (for Hungarian):
  "Although the Hungarian naming convention is no longer in widespread use, the basic idea of standardizing on terse, precise abbreviations continues to have value. Standardized prefixes allow you to check types accurately when you're using abstract data types that your compiler can't necessarily check."[9]
- Bjarne Stroustrup (against Systems Hungarian for C++):
  "No I don't recommend 'Hungarian'. I regard 'Hungarian' (embedding an abbreviated version of a type in a variable name) as a technique that can be useful in untyped languages, but is completely unsuitable for a language that supports generic programming and object-oriented programming - both of which emphasize selection of operations based on the type and arguments (known to the language or to the run-time support). In this case, 'building the type of an object into names' simply complicates and minimizes abstraction."[10]
- Joel Spolsky (for Apps Hungarian):
  "If you read Simonyi's paper closely, what he was getting at was the same kind of naming convention as I used in my example above where we decided that us meant unsafe string and s meant safe string. They're both of type string. The compiler won't help you if you assign one to the other and Intellisense [an Intelligent code completion system] won't tell you bupkis. But they are semantically different. They need to be interpreted differently and treated differently and some kind of conversion function will need to be called if you assign one to the other or you will have a runtime bug. If you're lucky. There's still a tremendous amount of value to Apps Hungarian, in that it increases collocation in code, which makes the code easier to read, write, debug and maintain, and, most importantly, it makes wrong code look wrong.... (Systems Hungarian) was a subtle but complete misunderstanding of Simonyi's intention and practice."[11]
- Microsoft's Design Guidelines[12] discourage developers from using Hungarian notation when they choose names for the elements in .NET Class Libraries, although it was common on prior Microsoft development platforms like Visual Basic 6 and earlier. These Design Guidelines are silent on the naming conventions for local variables inside functions.