Data Classes - Structure and Type

Hello there. I want to pose the following question, as it seems somewhat difficult to answer myself, as I am predisposed to certain solutions: I have several Beans / EJBs / Servlets / JSP that reach out to an Oracle database (although the db is insignificant) and return datasets based on particular query parameters. I have "holder" or "element" classes that maintain the individual recordset information (attributes w/ get-set methods). This is for future maintenance and re-useability in many items across our particular application paradigm. Our base class describes a "person" in a sense - ID, Name, Address, City, State, Zip, etc. and then is extended to hold "banking transaction" data of sorts - (Issue at hand These custom classes were then put in a Vector object. The display page (JSP / Servlet) then outted the vectors to html. I had problems (as expected) when the vectors grew to > ~2000 elements in size, so I'm re-factoring the code. Since the result sets are generally going to be fairly large, and "list like" I figured why not just put the classes into a "Linked List" but am looking for more information before I implement a new container to hold the class data. As I am sure, there are many opinions on what data structure to use for returning large quantities of read-only data, and I am interested in the arguements for all of them... I have seen some arguement for Hash maps, and collections, etc. but I'd like to hear more direct opinions. My solution will end up as a JSP containing custom tags (for output), and the new TBD "class container" and some other enhancements to efficiency. Thanks for your input, I am looking forward to the answers, and possibly a new can of worms... -dane

Part of the solution as to which data structure you will use will depend on how often, in what order, and so on, you will be accessing this information. So, for example, if you are just going to iterate through the data for table generation, then things like arrays, vectors, stacks, queues or linked lists will work (though in the case of stacks or queues that is not, obviously, their primary advantage.) But if you need to access individuals pieces of data at random times then obviously more will be needed. Hashtables, for example, search in constant time, that is, the time to access the data is (relatively) the same regardles of the size of the table. (I say relatively because if hashing collisions- where two items map to the same place- occur, one has to be "rehashed" to another place. So the time to access will be different, but not significantly so.) There are priority queues, where the information is stored according to some criterion of value (for example, bills, according to date due then are stacked for payment), black-red trees, ordered arrays, 2-3-4 trees, heaps, and graphs. Based on you basic description the following are the ones you will probably want to look at.Arrays have quick insertion and constant time access if the index is known. But they have slow search capabilities, deletion, and are of fixed size. You can get around this with Vectors, but of course you tried this.Ordered arrays have quicker searches than unsorted and the same disadvantages.Stacks are Last In First Out but are slow for access of middle items by repeated pops(You can get around this if you create one from a linked list- just add a push() method that keeps linking to the head and then becoming the head, and the pop() keeps returning the head. In this case you can iterate through it like a linked list- and it is a linked list, so you might as well use that, unless you need both traits.)Queues are First In First Out, same advantages and disadvantages (and you can do the linked list thing again, with modifications).Linked Lists are fast on iteration but slow on searches (since, like arrays, you search by starting at the beginning and looking until you find what you want) and might be more memory intensive (not sure, though).Binary Trees, obviously more complex, are fast at everthing. Deletion is complex (but, as for all of these, you can find implementions to use so it's not that much of a problem). I believe its search time (in # times through the main workhorse algorithm) is no more than log2 (n) where n is the number of items in the tree- that is worst case, which is really good. The only really better time (without some really strange, comlicated algorithm) is O(c), where the time is the same no matter what (almost- see above).Hashtables as I mentioned are really good at speedy accessing, but only as long as you know the key. Otherwise you have to get all the keys (or objects) and check them to see if they're what you want- and then it is just like an array, etc. It's deletion is slower, and it uses memory inefficiently. So, if iteration is your game, use arrays or linked lists (limited only by memory, really), or trees. If random searches for specific items only, then hashtables or trees. If you will be adding to or deleting from alot, then trees (in all their complex forms) and, if not alot of deleting, hashtables. For specific implementations, check the java.util.Collections for the specialized types. You can find implementations of others (like red-black trees) on the web. hope that helps, [ July 20, 2002: Message edited by: Ian Ohlander ]

Peace favor your sword.<br />May the last embrace of the mother welcome you home.<p>Ian

Dane Frazier
Greenhorn

Joined: Jul 15, 2002
Posts: 18

posted Jul 31, 2002 14:12:00

0

Took a few days hiatus from work life. Thanks for the information. I'll chew on it a bit. -d