One of the things I struggle with is not using Hungarian notation. I don't want to have to go to the variable definition just to see what type it is. When a project gets extensive, it's nice to be able to look at a variable prefixed by 'bool' and know that it's looking for true/false instead of a 0/1 value.

I also do a lot of work in SQL Server. I prefix my stored procedures with 'sp' and my tables with 'tbl', as well as all of my variables in the database.

I see everywhere that nobody really wants to use Hungarian notation, to the point where they avoid it. My question is, what is the benefit of not using Hungarian notation, and why does the majority of developers avoid it like the plague?

What language/IDE do you use? In Visual Studio, you don't have to go to the definition to know the type of a variable, since the IDE gives it to you. In languages where types are not enforced, like PHP, you don't need to know the type most of the time (since you can assign 0 or 1 to a boolean).
–
MainMa Aug 21 '11 at 17:53

10

@MainMa I'm not sure that's a good reason not to know the type of a value in PHP.
–
Rei Miyasaka Aug 22 '11 at 2:23

24

"When projects get extensive"... should not matter. There should only be a small number of variables in scope at any point: a handful of class variables, and another handful of method arguments. If you can't keep them straight, then the class is too big or the method is too long. Hungarian notation was invented for Windows C programming, basically as a work-around for the horrible Windows API. Nothing similar was ever suggested or wanted for Unix development.
–
kevin cline Aug 22 '11 at 4:11

16

"what is the benefit of not using Hungarian notation": "Not being hated by your co-workers" comes to mind...
–
Sean Patrick Floyd Aug 22 '11 at 14:22

16 Answers
16

Because its original intention (see http://www.joelonsoftware.com/articles/Wrong.html and http://fplanque.net/Blog/devblog/2005/05/11/hungarian_notation_on_steroids) has been misunderstood, and it has been (ab)used to help people remember what type a variable is when the language they use is not statically typed. In any statically typed language you do not need the added ballast of prefixes to tell you what type a variable is. In many untyped script languages it can help, but it has often been abused to the point of becoming totally unwieldy. Unfortunately, instead of going back to the original intent of Hungarian notation, people have just made it into one of those "evil" things you should avoid.

Hungarian notation in short was intended to prefix variables with some semantics. For example if you have screen coordinates (left, top, right, bottom), you would prefix variables with absolute screen positions with "abs" and variables with positions relative to a window with "rel". That way it would be obvious to any reader when you passed a relative coordinate to a method requiring absolute positions.
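A minimal sketch of that original idea (the names here are illustrative, not from the answer):

```typescript
// Apps Hungarian: the prefix encodes meaning (absolute vs. window-relative
// coordinate), not the machine type -- both are plain numbers.
function moveWindowTo(absX: number, absY: number): string {
  return `window moved to absolute (${absX}, ${absY})`;
}

const absWindowLeft = 200; // window's absolute left edge on screen
const absWindowTop = 120;  // window's absolute top edge on screen
const relCursorX = 15;     // cursor position relative to the window
const relCursorY = 40;

// moveWindowTo(relCursorX, relCursorY) would compile just fine, but the
// rel/abs mismatch is visible at a glance at the call site -- which is
// exactly what the prefixes were invented to expose. Converting first:
const result = moveWindowTo(absWindowLeft + relCursorX, absWindowTop + relCursorY);
```

Note that the compiler sees only `number` everywhere; the semantic check is done by the reader's eye, which is both the strength and the weakness of the scheme.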

update (in response to comment by delnan)

IMHO the abused version is avoided like the plague because

it complicates naming. When (ab)using Hungarian notation there will always be discussions on how specific the prefixes need to be. For example: listboxXYZ or MyParticularFlavourListBoxXYZ.

it makes variable names longer without aiding the understanding of what the variable is for.

it sort of defeats the object of the exercise when, in order to avoid long prefixes, these get shortened to abbreviations and you need a dictionary to know what each abbreviation means. Is ui an unsigned integer? An unreference-counted interface? Something to do with user interfaces? And those things can get long. I have seen prefixes of more than 15 seemingly random characters that are supposed to convey the exact type of the var but really only mystify.

it gets out of date fast. When you change the type of a variable, people invariably (lol) forget to update the prefix to reflect the change, or deliberately don't update it because that would trigger code changes everywhere the var is used...

it complicates talking about code because as "@g ." said: Variable names with Hungarian notation are typically difficult-to-pronounce alphabet soup. This inhibits readability and discussing code, because you can't 'say' any of the names.

... plenty more that I can't recall at the moment. Maybe because I have had the pleasure of not having to deal with the abused Hungarian notation for a long while...

@Surfer513: Actually, I prefer suffixes when naming controls. To me it is far more interesting to find all controls that deal with a specific subject/aspect than finding all edit controls... When I want to find the control where the user can type in a client name, I'll start hunting for client rather than txt, because it might not be a txt (edit) but a memo or a richedit or a ... Could even be a combo box to allow for finding it in previously entered client names...
–
Marjan Venema Aug 21 '11 at 19:06

7

@Surfer513: And nowadays I tend to use suffixes only when distinguishing between two controls that deal with the same thing. For example the label and the edit for client name. And quite often the suffixes are not related to the type of control, but to its purpose: ClientCaption and ClientInput for example.
–
Marjan Venema Aug 21 '11 at 19:11

3

It's also worth noting that intellisense in VS 2010 allows you to search the entire name, not just the start. If you name your control "firstNameTextBox" and type in "textb", it'll find your control and list it.
–
Adam Robinson Aug 21 '11 at 19:58

4

'...the abused version is avoided like the plaque...' which is to say, avoided slightly less than the tartar and gingivitis? ;-)
–
Ben Mosher Aug 21 '11 at 23:21

3

@Marjan: Of course a compiler could have picked up on this. If each unit is represented by a type, then you cannot accidentally pass one for another. Here, if you have an AbsoluteXType and a RelativeYType, then you could not mistakenly pass a relative Y coordinate for an absolute X one. I prefer variables that represent incompatible entities to be of incompatible types, rather than having incompatible prefixes. The compiler cares not for prefixes (or suffixes).
–
Matthieu M. Aug 22 '11 at 8:44

The reason it is avoided is Systems Hungarian, which violates DRY (the prefix is exactly the type, which the compiler and a good IDE can derive).

Apps Hungarian, on the other hand, prefixes with the use of the variable (i.e. scrxMouse is an x coordinate on the screen; this can be an int, short, long, or even a custom type, and typedefs will even let you change it easily).

The misunderstanding of Systems Hungarian is what destroyed Hungarian as a best practice.

I have to disagree -- what destroyed Hungarian as a "best practice" is that it was never close to best (or even good) practice.
–
Jerry Coffin Aug 21 '11 at 18:40

2

@haylem: I've seen the articles, and Simonyi's original paper, and read the code. Every way you look at it, it's a bad idea that should never have seen the light of day.
–
Jerry Coffin Aug 21 '11 at 20:01

IMO what "destroyed" Hungarian is the fact that even the "intent" isn't useful compared to using descriptive names, although to be fair when Hungarian was created there wasn't a movement for having readable code...
–
Wayne M Aug 22 '11 at 13:20

1

@Wayne M: "descriptive names" is easy to say when compilers will essentially allow you to use an essay for a variable name. When the length of identifier names was actually limited to a low, fairly arbitrary value (I think a common limit was eight or ten characters not that extremely long ago; I do seem to recall that Borland Turbo C 2 even had a configuration option for the maximum length of an identifier name!), encoding useful information in a name was just a tiny bit trickier...
–
Michael Kjörling Aug 22 '11 at 14:20

Wikipedia has a list of advantages and disadvantages of Hungarian notation and can thus probably provide the most comprehensive answer to this question. The notable opinions are also quite an interesting read.

The benefit of not using Hungarian notation is basically only the avoidance of its disadvantages:

The Hungarian notation is redundant when type-checking is done by the compiler. Compilers for languages providing type-checking ensure the usage of a variable is consistent with its type automatically; checks by eye are redundant and subject to human error.

All modern Integrated development environments display variable types on demand, and automatically flag operations which use incompatible types, making the notation largely obsolete.

Hungarian Notation becomes confusing when it is used to represent several properties, as in a_crszkvc30LastNameCol: a constant reference argument, holding the contents of a database column LastName of type varchar(30) which is part of the table's primary key.

It may lead to inconsistency when code is modified or ported. If a variable's type is changed, either the decoration on the name of the variable will be inconsistent with the new type, or the variable's name must be changed. A particularly well-known example is the standard WPARAM type, and the accompanying wParam formal parameter in many Windows system function declarations. The 'w' stands for 'word', where 'word' is the native word size of the platform's hardware architecture. It was originally a 16-bit type on 16-bit word architectures, but was changed to a 32-bit type on 32-bit word architectures, or a 64-bit type on 64-bit word architectures, in later versions of the operating system while retaining its original name (its true underlying type is UINT_PTR, that is, an unsigned integer large enough to hold a pointer). The semantic impedance, and hence programmer confusion and inconsistency from platform to platform, stems from the assumption that 'w' stands for 16-bit in those different environments.

Most of the time, knowing the use of a variable implies knowing its type. Furthermore, if the usage of a variable is not known, it can't be deduced from its type.

Hungarian notation strongly reduces the benefits of using feature-rich code editors that support completion on variable names, for the programmer has to input the whole type specifier first.

It makes code less readable, by obfuscating the purpose of the variable with needless type and scoping prefixes.

The additional type information can insufficiently replace more descriptive names. E.g. sDatabase doesn't tell the reader what it is. databaseName might be a more descriptive name.

When names are sufficiently descriptive, the additional type information can be redundant. E.g. firstName is most likely a string. So naming it sFirstName only adds clutter to the code.
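The last two points fit in a two-line sketch (names invented for illustration):

```typescript
// sFirstName: the "s" adds clutter -- firstName is obviously a string.
// sDatabase: the "s" tells you the type but not what the value means.
const firstName = "Ada";             // type is clear from the name alone
const databaseName = "crm_reports";  // descriptive, unlike sDatabase
console.log(`${firstName} uses ${databaseName}`);
```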

I myself don't use this notation, because I dislike the unnecessary technical noise. I nearly always know which type I am dealing with and I want a clean language in my domain model, but I write mostly in statically and strongly typed languages.

This should be the correct answer. It's so nice. +1
–
Saeed Neamati Aug 21 '11 at 20:51

9

Upvoted for a single sentence: "Most of the time, knowing the use of a variable implies knowing its type. Furthermore, if the usage of a variable is not known, it can't be deduced from its type." -- IMHO, this is the #1 reason to avoid Hungarian notation.
–
Daniel Pryden Aug 22 '11 at 18:49

Hungarian notation is a naming anti-pattern in modern day programming environments and form of Tautology.

It uselessly repeats information with no benefit and additional maintenance overhead. What happens when you change your int to a different type like long? Now you have to search your entire code base and rename all the variables, or they are now semantically wrong, which is worse than if you hadn't duplicated the type in the name.

It violates the DRY principle. If you have to prefix your database tables with an abbreviation to remind you that it is a table, then you are definitely not naming your tables semantically descriptively enough. Same goes for every other thing you are doing this with. It is just extra typing and work for no gain or benefit with a modern development environment.
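A sketch of how the type prefix goes stale (all names here are hypothetical):

```typescript
// Hypothetical: a timeout that started life as a plain int of seconds:
//   let intTimeout = 30;
// A later refactor turns it into a structure; now either every use of
// the name gets renamed across the code base, or the "int" prefix lies:
interface Timeout { seconds: number; retries: number }
const intTimeout: Timeout = { seconds: 30, retries: 3 }; // prefix is now wrong

// A semantic name survives the same change with no rename at all:
const connectTimeout: Timeout = { seconds: 30, retries: 3 };
```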

"What happens when you change your int to a different type like long ..." Simple: <sarcasm> Don't change the name because there's no telling how many places the change will ripple.</sarcasm> So now you have a variable whose Hungarian name conflicts with its implementation. There is absolutely no way to tell how widespread the effects of changing the name will be if the variable/function has public visibility.
–
David Hammen Aug 22 '11 at 0:41

@David just look at the Win32 API, which is riddled with variables, parameters, and even method names which, using Hungarian notation (a requirement at MS), indicate an 8-bit or 16-bit value when in fact they've all been 32-bit values since the introduction of Windows 95 back in 1995 (so some 16 years ago).
–
jwenting Aug 22 '11 at 5:43

1

@Secure: I would argue that's what automated tests should be doing. Not programmers.
–
Joel Aug 22 '11 at 17:14

I'll just focus on SQL Server since you mentioned it. I see no reason to put 'tbl' in front of a table. You can just look at any T-SQL code and distinguish a table by how it is used. You would never Select from a stored procedure, Select from a table with parameters like you would a UDF, or Execute tblTableOrViewName like a stored procedure.

Tables could be confused with views, but when it comes to how they are used there is no difference, so what is the point? Hungarian notation may save you the time of looking it up in SSMS (under tables or views?), but that's about it.

Variables can present a problem, but they need to be declared and really, how far from your declare statement do you plan on using a variable? Scrolling a few lines shouldn't be that big of a deal unless you're writing very long procedures. It may be a good idea to break up the lengthy code a little.

What you describe is a pain, but the Hungarian notation solution doesn't really solve the problem. You can look at someone else's code and find that the variable type got changed, which now requires a change to the variable name. Just one more thing to forget. And if I use a VarChar, you're going to need to look at the declare statement anyway to know the size. Descriptive names will probably get you further. @PayPeriodStartDate pretty much explains itself.

@Surfer: A primary key is different because "PK" isn't the type; "PK_TableName" says "this is the primary key for TableName", not "this is a TableName of type PK". As for the rest... it doesn't sound like you're really listening to the arguments here. Please stop using this detestable practice that to this day persists in bringing down the collective quality of all code.
–
Aaronaught Aug 21 '11 at 20:12

1

@Aaronaught, you are specifying one aspect of Hungarian Notation (en.wikipedia.org/wiki/Hungarian_notation). It's not just type, but also intended use. So prefixing with 'pk' for a primary key is in fact Hungarian Notation. I am listening to arguments here, but there's exceptions (like the primary key situation) where it seems HN is beneficial. Maybe, maybe not. I'm still trying to wrap my head around the do's and dont's for all scenarios. I have learned quite a bit today about it and has sparked some great thought.
–
user29981 Aug 21 '11 at 20:34

2

@Surfer: That's simply not correct. Hungarian notation describes either the type (bad version) or a particular usage of the type (less bad version). PK is neither, it's not simply describing something about the type, it's describing a behaviour. When I'm talking about, say, an age, it's thoroughly uninteresting that it happens to be an integer, but when talking about a database constraint, "primary key for table X" is exactly what's important.
–
Aaronaught Aug 21 '11 at 20:45

3

I like your second to last paragraph. My firm's rationale for using Hungarian notation is "It lets us quickly see what variables are global and which ones aren't." And then you look at the variable declarations and there's 300 variables per file, some m* for modular (global), some v* for local EVEN THOUGH THEY'RE DECLARED AT THE MODULAR LEVEL. "Oh, that's because they're just intended to be used as locals, not modulars." facepalm
–
corsiKa Aug 21 '11 at 21:26

3

@Aaronaught: In my current project, we are using table names prefixed with tbl_. While I am not a big fan of this practice, I fail to see how this is "bringing down the collective quality of all code". Can you perhaps give an example?
–
Treb Aug 22 '11 at 13:19

+1 for some very cool information there. I use 'sp' as my stored proc prefix, not 'sp_'. But all the same that is definitely a very very interesting fact and a concrete reason to not use 'sp_' as a stored proc prefix.
–
user29981 Aug 21 '11 at 17:56

2

+1 I never knew this, and when I started working on SQL Server all the SPs at my place of employment were prefixed sp_ so I just stuck with the convention.
–
Michael Aug 21 '11 at 19:42

I think the reasons not to use Hungarian notation have been well covered by other posters. I agree with their comments.

With databases I use Hungarian notation for DDL objects that are rarely used in code, but would otherwise collide in namespaces. Mainly this comes down to prefixing indexes and named constraints with their type (PK, UK, FK, and IN). Use a consistent method to name these objects and you should be able to run some validations by querying the metadata.

Hungarian notation is almost completely useless in a statically typed language. It's a basic IDE feature to show the type of a variable by putting the mouse over it, or by other means; moreover you can see what the type is by looking a few lines up where it was declared, if there's no type inference. The whole point of type inference is to not have the noise of the type repeated everywhere, so hungarian notation is usually seen as a bad thing in languages with type inference.

In dynamically typed languages, it can help sometimes, but to me it feels unidiomatic. You have already given up having your functions restricted to exact domains/codomains; if all your variables are named with Hungarian notation, then you are just reproducing what a type system would have given you. How do you express a polymorphic variable that can be an integer or a string in Hungarian notation? "IntStringX"? "IntOrStringX"? The only place I've ever used Hungarian notation was in assembly code, because I was trying to get back what I'd have had with a type system, and it was the first thing I ever coded.
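For what it's worth, a language with union types can state the "integer or string" case directly, which no prefix scheme does cleanly (a hypothetical sketch):

```typescript
// A value that may be a number or a string: the type says so directly.
// A Hungarian name (intOrStrId? iosId?) only gets murkier as the union grows.
function normalizeId(id: number | string): string {
  if (typeof id === "number") {
    return id.toString(10);
  }
  return id.trim().toLowerCase();
}

console.log(normalizeId(42));        // "42"
console.log(normalizeId("  AB-7 ")); // "ab-7"
```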

Anyway, I couldn't care less what people name their variables; the code will probably still be just as incomprehensible. Developers waste way too much time on things like style and variable names, and at the end of the day you still get a ton of libraries with completely different conventions in your language. I'm developing a symbolic (i.e. non-text-based) language where there are no variable names, only unique identifiers and suggested names for variables (but most variables still have no suggested name, because there simply does not exist a reasonable name for them); when auditing untrusted code, you can't depend on variable names.

for IDEs it's even worse than for manual coding as it means code completion is slowed down dramatically. If you have 100 integer variables, typing int<alt-space> (for example) brings up 100 items to scroll through. If the one you need is intVeryLongVariableName your code completion gets problematic. Without hungarian, you'd just type very<alt-space> and have only 1 or 2 options.
–
jwenting Aug 22 '11 at 5:46

Another thing to add is that, what abbreviations would you use for an entire framework like .NET? Yeah, it's so simple to remember that btn represents a button and txt represents a text box. However, what do you have in mind for something like StringBuilder? strbld? What about CompositeWebControl? Do you use something like this:

CompositeWebControl comWCMyControl = new CompositeWebControl();

One of the inefficiencies of Hungarian notation was that, as frameworks grew larger and larger, it proved not only to add no extra benefit, but also to add more complexity for developers, because they now had to learn more and more nonstandard prefixes.

As usual in such a case, I will post an answer before I read answers from other participants.

I see three "bugs" in your vision:

1) If you want to know the type of a variable/parameter/attribute/column, you can hover your mouse over it or click it and it will be displayed, in most modern IDEs. I don't know what tools you're using, but the last time I was forced to work in an environment that didn't provide this feature was in the 20th century, the language was COBOL, oops no it was Fortran, and my boss didn't understand why I left.

2) Types may change during the development cycle. A 32-bit integer may become a 64-bit integer at some point, for good reasons that had not been detected at the start of the project. So renaming intX into longX, or leaving it with a name that points to the wrong type, is bad karma.

3) What you're asking for is in fact redundancy. Redundancy is not a very good design pattern or habit. Even humans are reluctant to too much redundancy. Even humans are reluctant to too much redundancy.

That's... pretty much the reason why many workplaces dash it out, I suppose.

It originated in languages that needed it,
in the times of the global-variable bonanza (for lack of alternatives).
It served us well.

The only real use we have for it today is the Joel Spolsky one:
to track some particular attributes of a variable, like its safety.

(e.g. "Does variable safeFoobar have a green light to be injected into a SQL query? As it is called safe, yes.")

Some other answers talked about editor functionality that helps you see the type of a variable as you hover over it. In my view, those too are kind of problematic for code sanity. I believe they were only meant for refactoring, like many other features (function folding, for example), and should not be relied on for new code.

IMO the biggest benefit of not using Hungarian is the fact it forces you to use meaningful names. If you are naming variables properly, you should immediately know what type a variable is, or be able to deduce it fairly quickly, in any well-designed system. If you need to rely on str or bln or, worst of all, obj prefixes to know what type a variable is, I would argue it indicates a naming issue: either poor variable names in general, or names way too generic to convey meaning.

Ironically, from personal experience the main scenario I have seen Hungarian used is either "cargo-cult" programming (i.e. other code uses it, so let's continue to use it just because) or in VB.NET to work around the fact the language is case-insensitive (e.g. Person oPerson = new Person because you can't use Person person = new Person and Person p = new Person is too vague); I've also seen prefixing "the" or "my" instead (as in Person thePerson = new Person or the uglier Person myPerson = new Person), in that particular case.

I will add the only time I use Hungarian tends to be for ASP.NET controls and that's really a matter of choice. I find it very ugly to type TextBoxCustomerName or CustomerNameTextBox versus the simpler txtCustomerName, but even that feels "dirty". I feel some kind of naming convention should be used for controls as there can be multiple controls that display the same data.

This is fine as long as you know what c stands for. But you'd have to have a standard table of prefixes, and everyone would have to know them, and any new people would have to learn them in order to understand your code. Whereas customerCount or countOfCustomers is pretty obvious at first glance.

Hungarian had some purpose in VB before Option Strict On existed, because in VB6 and prior (and in VB .NET with Option Strict Off) VB would coerce types, so you could do this:

Dim someText As String = "5"
customerCount = customerCount + someText

This is bad, but the compiler wouldn't tell you so. So if you used Hungarian (say, strSomeText and intCustomerCount), at least you'd have some indicator of what was happening.

To my way of looking at things, Hungarian Notation is a kludge to get around an insufficiently powerful type system. In languages that allow you to define your own types it's relatively trivial to create a new type that encodes the behavior you're expecting. In Joel Spolsky's rant on Hungarian Notation he gives an example of using it to detect possible XSS attacks by indicating that a variable or function is either unsafe (us) or safe (s), but that still relies on the programmer to visually check. If you instead have an extensible type system you can just create two new types, UnsafeString and SafeString, and then use them as appropriate. As a bonus, the type of encode becomes:

SafeString encode(UnsafeString)

and, short of accessing the internals of UnsafeString or using some other conversion function, encode becomes the only way to get from an UnsafeString to a SafeString. If all your output functions then only take instances of SafeString, it becomes impossible to output an un-escaped string [barring shenanigans with conversions such as StringToSafeString(someUnsafeString.ToString())].
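A sketch of that two-type approach; the escaping rules and names are illustrative, not from the answer:

```typescript
// Encode the safe/unsafe distinction in the type system instead of in
// Hungarian prefixes (usFoo / sFoo). The "kind" tags keep the two types
// structurally incompatible.
type UnsafeString = { readonly kind: "unsafe"; value: string };
type SafeString   = { readonly kind: "safe"; value: string };

function unsafe(value: string): UnsafeString {
  return { kind: "unsafe", value };
}

// The only sanctioned way to turn an UnsafeString into a SafeString.
function encode(input: UnsafeString): SafeString {
  const escaped = input.value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return { kind: "safe", value: escaped };
}

// Output functions accept only SafeString; passing an UnsafeString is a
// compile-time error, not something a reviewer must spot by eye.
function write(s: SafeString): string {
  return s.value;
}

const request = unsafe("<script>alert(1)</script>");
// write(request);  // rejected by the compiler
console.log(write(encode(request)));
```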

It should be obvious why allowing the type system to sanity check your code is superior to trying to do it by hand, or maybe eye in this case.

In a language such as C of course, you're screwed in that an int is an int is an int, and there's not much you can do about that. You could always play games with structs but it's debatable whether that's an improvement or not.

As for the other interpretation of Hungarian notation, i.e. prefixing with the type of the variable, that's just plain stupid and encourages lazy practices like naming variables uivxwFoo instead of something meaningful like countOfPeople.

I found a lot of good arguments against, but one I did not see: ergonomics.

In former times, when all you had was string, int, bool and float, the characters s, i, b and f would have been sufficient. But with string plus short, the problems begin. Do you use the whole name for the prefix, or str_name for string? (Though names are almost always strings, aren't they?) What about a Street class? Names get longer and longer, and even if you use CamelCase, it is hard to tell where the type prefix ends and where the variable name begins.

Either way, you will end up using useless prefixes for trivial cases, like loop variables or counts. When did you last use a short or a long for a counter? And if you make exceptions, you will often lose time thinking about whether a prefix is needed or not.

If you have a lot of variables, they are normally grouped in an object browser, which is part of your IDE. Now if 40% start with i_ for int, and 40% with s_ for string, and they are alphabetically sorted, it is hard to find the significant part of the name.

The one place where I still regularly use either Hungarian or analogous suffixes is in contexts where the same semantic data is present in two different forms, like data conversion. This may be in cases where there are multiple units of measurement, or where there are multiple forms (e.g., String "123" and integer 123 ).
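A small sketch of that convention (names invented for illustration):

```typescript
// Same datum in two forms: a suffix (or Hungarian-style tag) earns its
// keep here, because "pageCount" alone would be ambiguous.
const pageCountStr = "123";                    // as parsed from a header
const pageCount = parseInt(pageCountStr, 10);  // as used for arithmetic

const priceCents = 1999;                // integer form, for exact math
const priceDollars = priceCents / 100;  // display form
```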

I find the reasons given here for not using it compelling for not imposing Hungarian on others, but only mildly suggestive, for deciding on your own practice.

The source code of a program is a user interface in its own right - displaying algorithms and metadata to the maintainer - and redundancy in user interfaces is a virtue, not a sin. See, e.g., the pictures in "The Design of Everyday Things", and look at the doors labelled "Push" that look like you pull them, and at the beer taps the operators hacked onto the important nuclear reactor controls because somehow "hovering over them in the IDE" wasn't quite good enough.

The "just hover in an IDE" is not a reason not to use Hungarian - only a reason some people might not find it useful. Your mileage may differ.

The idea that Hungarian imposes a significant maintenance burden when variable types change is silly - how often do you change variable types? Besides, renaming variables is easy:

Just use the IDE to rename all the occurrences. -Gander, replying to Goose

If Hungarian really helps you quickly grok a piece of code and reliably maintain it, use it. If not, don't. If other people tell you that you're wrong about your own experience, I'd suggest that they're probably the ones in error.