What does all that mess do? It returns the value of a hash by its key.

Everything else is just error trapping crap. I say crap because in Ruby or Python, most of that error trapping is done for you. The null value? In Ruby the Hash class can be tweaked to initialize its members to a safe value.

No offense, Chalain, but if I'm reading this correctly, it's the (non-standard) error logging and database libraries that your company is using which you should be bitching about, not the language itself. While I won't claim that Java is the soul of elegance (I reserve that title for Scheme, which just goes to show that you can take even elegance to excess), a lot of the problems people have with it come from how it is being (mis)used by others - just like with any language.

Admittedly, Java tends to encourage this sort of thing - indeed, the standard libraries are in some ways worse in this regard than the ones you seem to be using, with the logging API in particular being spectacularly overdesigned - and the one-size-fits-all hype surrounding the language doesn't help any. But there are reasons why the language works the way it does, even if those reasons aren't really applicable to the situation you're in.

Perhaps the real problem is that Java is too heavyweight for what you're doing - using a nuke to swat flies. Java scales up quite well, but scales down rather less well; it's optimized for massive, distributed systems written by dozens of coders working independently (which is ironic when you consider it was originally meant for embedded software). In this regard, it really is in many ways more like Ada than like C++ or Smalltalk (both of which are over-engineered in their own ways). Of course, if this is from a really large project - and it sounds like it is - then I hate to say it, but a certain increase in complexity is the price of scale, regardless of the language.

Perhaps the real problem is that Java is too heavyweight for what you're doing

:: Chokes on his sandwich and sprays coffee on the kb ::

I'm serious, though I may not mean it in the sense you're thinking of; my perspective may simply be warped.

What I mean is that Java has a fair amount of syntactic overhead (though not so much as, say, Ada or COBOL), mostly due to its support for large-scale programming. Furthermore, most of the API libraries are designed to cover a large number of cases, at the expense of added complexity in use. Conversely, the overall syntax and semantics are rather less abstract than those of a language like Ruby or Python or Scheme; it doesn't support many powerful but complex-to-compile idioms such as list comprehensions or continuations, and much of what would be part of the core of some languages, such as regexes, is implemented in the libraries instead. The language focuses less on expressiveness and programmer convenience and more on enforcing 'best practices' in a 'software engineering' sense. Again, this is relative: compared to C++, Java isn't very heavyweight at all, though frankly it probably scales better than C++ does.

As for how appropriate the statement is, I have no idea what Chalain is working on these days, though I gather he works in many different areas and languages. The fact that he mentioned Python and Ruby made me think that he has mostly been working on small- to medium-scale scripting lately; while I agree that those languages have several advantages over Java, the relevant differences in this case all came out of the (over-)engineering of the Java libraries.

but if I'm reading this correctly, it's the (non-standard) error logging and database libraries that your company is using which you should be bitching about, not the language itself.

You are reading it wrong. The logging system is log4j, a very fast and quite standard logging system. The database layer is JDBC, with a custom query layer written for it. However, nothing in that function relies on that code. It just happens that the HashSet I was working with there was keyed by a string contained in resultSet.

All that method does is get a value from a hash, or a default value if it is missing. And it needs 10 lines of code to do it.

Schol-R-LEA wrote:

But there are reasons why the language works the way it does, even if those reasons aren't really applicable to the situation you're in.

I think most of those reasons are well-considered, sensible, and outmoded. Java's greatest accomplishment was getting people to realize that a garbage-collected, interpreted language really could be used to write Real Programs. Lighter-weight, scalable languages like Python and Ruby have followed in its sheltering wake, and are now emerging to challenge many of the concepts we converted to when Java challenged C++ a decade ago.

That's a topic for a whole 'nother thread.

Schol-R-LEA wrote:

Perhaps the real problem is that Java is too heavyweight for what you're doing - using a nuke to swat flies.

Not really. Our project is of sufficient complexity that it requires a full-blown application. Also, as mentioned in the OP, the fly I'm trying to swat is getting a value, or a default value, out of a hash by key without having to write a novel.

That's so succinct it almost doesn't need a wrapper function to enclose it. The Hash.new method in Ruby takes an optional default value that the hash will return if a missing key is read. In Python, you have to hand the default value in with every call, but that is also still extremely readable:
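The snippet the post is describing would be a single dict.get call with a default handed in; a minimal sketch (the expansions name is mine, standing in for the hash discussed in the thread):

```python
# Hypothetical cache, playing the role of the hash from the original post.
expansions = {"foo": "foobar"}

# One call: the value for the key, or the supplied default if it's missing.
value = expansions.get("baz", "<missing>")
```

The Ruby equivalent needs even less at the call site, since Hash.new takes the default once at construction time.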

I concede that similar code can be achieved with Java. But only by writing my own HashSetWithDefault class, and adapting everything that selectExpansionCache touches to use the new class.

Schol-R-LEA wrote:

Java scales up quite well, but scales down rather less well; it's optimized for massive, distributed systems written by dozens of coders working independently (which is ironic when you consider it was originally meant for embedded software). In this regard, it really is in many ways more like Ada than like C++ or Smalltalk (both of which are over-engineered in their own ways). Of course, if this is from a really large project - and it sounds like it is - then I hate to say it, but a certain increase in complexity is the price of scale, regardless of the language.

The project I am on is large enough to require a non-toy language, it is true. I occasionally talk Ruby or Python or smaller languages here because I dabble all across the board... this is why I take no offense at your earlier comment. Yes, I really am working on real software.

Java's problem isn't that it doesn't scale down, it's that it cannot maneuver well in tight spaces. This is a fatal flaw in my mind because, at the end of the day, every program eventually must be broken down into small pieces.

A few days ago I was at the bookstore. They had Java In A Nutshell there. It was THREE INCHES THICK. 1,224 pages. Does no one besides me see the inherent lie in the language?

I've said it before, and here it is again: "If Java is the answer, it must have been a really verbose question."

(DISCLAIMER: I am not a professional programmer. 99% of the code I have written has been solely for class assignments or to amuse myself. Therefore my qualifications for pointing out idiocy in code are suspect. Nevertheless, I present the following.)

I'm sure you've all encountered MediaWiki, even if you're not aware of it. It's the software that runs Wikipedia, and what seems like every other wiki site on the planet. Part of the core distribution (but not activated by default, thankfully) is a plugin for displaying math equations in wiki pages. This plugin contains some fairly bad design decisions.

I mean, I can understand using (La)TeX for rendering the equations. Sure, it's bloated and monolithic, but its mathematical typesetting is without rival. (And even if it does have a rival, most of the textbooks and papers I've read have had at least their equations formatted by TeX, so as far as programs I've heard of go, it's unrivaled. No, Microsoft Equation Editor doesn't count.) It also has the notable advantage of being free. And most webservers have the megabytes to spare to host a LaTeX installation, even if the sysadmins aren't happy about it.

I can understand using ImageMagick to convert TeX's DVI output into something viewable in a web browser, as it's quite good at format conversion and most Linux distributions have it by default. But then, DVI is pretty much the only format that ImageMagick doesn't understand, so instead it gets converted to PostScript (using the standard LaTeX dvips tool) and then to PNG (using ImageMagick, which hooks into Ghostscript to do the conversion). Wait, Ghostscript? Of course your distribution will have Ghostscript, it's necessary for printing. Never mind that it may not be on a stripped-down webserver install that will have no need for printing anything, or that non-Linux OSes won't necessarily include it...who hosts websites on FreeBSD, anyway?

(Keep in mind, MediaWiki's only external dependencies in the base install are PHP and MySQL.)

No, the thing about the math plugin I can't understand is...why the fuck was it written in Objective Caml?

Maybe I didn't quite make it clear. This TeX interface thing is part of the base distribution of MediaWiki, which is entirely PHP except for the TeX bit in OCaml. I'm not saying it's never appropriate to use OCaml, just not willy-nilly in the middle of a program in a different language. (Especially one intended for webservers, which may be on shared hosting services that don't have every random scripting language installed.)

I suppose it's possible, though, that the TeX plugin was written independently of the rest of the MediaWiki codebase, and then brought into the core when the Wikipedia folks decided they wanted math formatting there. If this is the case, well, feel free to chuck me at the nearest monster and watch me explode, dood.

/* Well, it's finally here. My CS teacher said this day would come. I didn't
 * believe him, but it turns out, he was right.
 *
 * See, these records need to be in alpha order. They probably already are,
 * since they were written that way the last time this program ran. However, the
 * file might have been hand-edited. Thus, I need a sort routine that won't
 * take much programmer time, and is fastest on already-sorted data.
 *
 * Thus it is that I, <name elided>, use a Bubble Sort in a commercial product.
 */

Hahaha. You know, what's really silly is that a bubble sort in its pure form isn't fastest on already-sorted data. What is fastest on already-sorted data is an insertion sort, the naive implementation of which can be arrived at by one or two simple modifications to the traditional bubble sort, so that's probably what he actually used.

I would think that bubble sort and insertion sort would be the same over already sorted data, since they both have to iterate over that data once with no re-arrangement.

Anyway, the code that the comment referred to was, in fact, a pure bubble sort.

The bubble sort does not, in fact, iterate over the data only once. It is an O(N^2) operation that repeatedly goes over the data, swapping adjacent out-of-order elements on each pass. There are certain advanced implementations that stop when the data is sorted, but the traditional simplest implementation consists of two for loops and a swap-if-out-of-order statement. That's why it is taught first: it can be implemented in 4 lines in its simplest configuration. Any time you want to do more than that, it's best to swap it out for an insertion sort.
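The "two for loops and a swap" version described above can be sketched in Python (a sketch for illustration, not the commercial code the comment came from):

```python
def bubble_sort(a):
    """Naive bubble sort: two nested loops, swapping adjacent
    out-of-order elements. Note it always makes n-1 full passes,
    even when the input is already sorted."""
    n = len(a)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a
```

This is why the pure form is not fastest on sorted data: the pass count is fixed regardless of the input's order.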

To make it stop after a single pass if there are no swaps adds 3-4 lines, making it a more complex sort. Making it into an insertion sort instead only adds two lines and is more efficient anyway.

(Doing a bit more research, the insertion sort that results from a modified bubble sort is more accurately called a gnome sort. There are other ways of implementing it, but I find that one to be the best bubble sort replacement.)
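For comparison, the gnome sort mentioned in the parenthetical might look like this in Python (again a sketch, not code from the thread):

```python
def gnome_sort(a):
    """Gnome sort: walk forward; on an inversion, swap and step back.
    On already-sorted input it degenerates to a single forward pass
    of n-1 comparisons, which is the property the thread is after."""
    i = 0
    while i < len(a):
        if i == 0 or a[i - 1] <= a[i]:
            i += 1
        else:
            a[i - 1], a[i] = a[i], a[i - 1]
            i -= 1
    return a
```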

I'm doing a toy robotics project. Among the parts in the kit I'm using are an Atmel ATmega8535L AVR microcontroller and a 1x102-pixel CCD array. The CCD array communicates via a serial protocol called USART, transmitting bytes one at a time. The ATmega8535L, being a multipurpose microcontroller, supports dozens of configurations of USART as well as one or two other serial communication protocols, assuming you plug things into the right pins. Its documentation even has code examples in both C and assembly showing how to use this functionality.

So, the people who put the kit together also provided a driver library for the CCD. These are amateur student coders, so perhaps they should be cut some slack, but they put the kit together, too, and I'd expect them to have read the datasheets on their parts. While trying to figure out how the library works (I wanted to build a different interface to shave about 20% off the communications time), I took a look at their code and found this gem:

This is the clock control... notice they didn't even bother trying to use the on-board hardware, which would have clocked it at the same speed as the CPU itself, rather than one pulse every three-plus clock cycles plus read/write time.

When I saw that, I decided that I wasn't going to use their library, even if I didn't put together my different peripheral. My new driver, written in C, is shorter, faster, and more readable.

:: sniff sniff :: It smells of generated code. Are you sure this was hand-written?

Looks more like it was written by foot.

_________________
"A sufficiently-advanced technology is indistinguishable from magic." (Arthur C. Clarke)
"Sufficiently advanced magic is indistinguishable from technology." (Jack L. Chalker)
"Magic is just another way of saying 'I don't know how it works.'" (Larry Niven)
"Any technology, no matter how primitive, is magic to those who don't understand it." (Florence Ambrose)

Point of order: USART is not a protocol. It stands for Universal Synchronous/Asynchronous Receiver-Transmitter, and it's a hardware device that can support all kinds of serial protocols. Most modern USARTs are very smart, so they require very little in the way of code to make them work. Many are programmable to help offload processing load. Being a student coder is no excuse -- I wrote better code than that 35 years ago, and I was self-taught.
