LessThanDot

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary.


Every now and then when I am answering questions, I get the reply that I should just make everything bigint and not worry about using smallint, since the difference is only 6 bytes and I should not waste time on premature optimization.

To me this is not premature optimization but a best practice: keep your keys and tables as small as possible, because

A) You don't want to redesign this later (or use the sexy word refactor if you will)
B) Someone looking at your code might wonder why you are spending 6 extra bytes per row when the foreign key table has billions of rows

See, I always design and code expecting that my table will have millions or even billions of rows; once a table has a billion-plus rows, it is not easy to make changes without incurring downtime.

And of course the 6 bytes per row is not only storage: if the column is part of the primary key, then your foreign keys will also be wider. It will take more bandwidth to move that data around, updates and inserts will be slower, and backups will take longer because they will be bigger, and so on.
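The arithmetic behind this is easy to sketch. Assuming SQL Server's documented storage sizes (smallint is 2 bytes, bigint is 8), here is a rough back-of-the-envelope in Python; the row count and number of copies of the key are invented for illustration:

```python
# Rough back-of-the-envelope: extra storage from using bigint (8 bytes)
# instead of smallint (2 bytes) for a key column in SQL Server.
BIGINT_BYTES = 8
SMALLINT_BYTES = 2

def extra_bytes(rows, copies):
    """Extra bytes consumed by the wider key.

    copies = how many times the key value is stored: once in the table
    itself, once per foreign key column referencing it, and once per
    index that includes the key.
    """
    return rows * (BIGINT_BYTES - SMALLINT_BYTES) * copies

# Illustrative scenario: 1 billion rows, key stored in the table,
# two referencing FK tables, and three indexes -> 6 copies total.
waste = extra_bytes(1_000_000_000, 6)
print(waste / 1024**3, "GiB")  # roughly 33.5 GiB
```

That extra 30-odd GiB is read, written, backed up, and shipped over the wire on every operation that touches those rows, which is the point of the paragraph above.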

I also see some (some, as in fewer than 3) people saying that you should make everything VARCHAR(MAX) so that you never have to worry about exceeding the column storage... this one, to me, is really laughable.

What do you think: is the premature optimization meme overused/misused?

I think the warning against premature optimization is good advice for beginners. An experienced programmer intuitively knows where the bottlenecks will be and will preemptively correct the problem before it becomes one. This is one of the reasons why I prefer to specialize in just a couple of technologies: it allows you to become an expert in your field. When I write code, I don't need to think about the syntax; I can focus on the algorithm.

Blanket statements like "always use varchar(MAX)" or "always use bigint" are wrong, not just from a best-practices perspective but also from a premature-optimization perspective. Experience tells us that!

There is a difference between premature optimization and best practices, as you point out. In my case, as a developer, I would say that using threads is a premature optimization, since they add serious complexity and you want to keep your code simple to avoid extra bugs. Threads are difficult to debug.

Using datetime columns to store date values is not premature optimization; it's common sense.

Using indexes on the most-used columns is common sense, not premature optimization. But perhaps you don't know at design time which columns are going to be used, and you will need to tweak later for best performance.

It is also a case of "can I change it easily when needed?" You see, sometimes you start off writing a program for a couple of users that then becomes popular and gets millions of users. You would not write the same program for a few users as you would for millions.

It's all a matter of time and money, and throwing more hardware at it could be cheaper than rewriting it all, or serve as an in-between option.

I almost forgot about the other one: I cringe when I see nvarchar(255) used for US zip codes or other values that can only ever be ASCII. I always get the suspicion that someone used DTS to create the table from Excel data; you know, all the text columns then become nvarchar(255).
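The waste here is easy to quantify too. In SQL Server, varchar stores 1 byte per character while nvarchar stores 2 (UCS-2), and both variable-length types carry a 2-byte length prefix. A small sketch, with the 100-million-row table being an invented example:

```python
# Per-value storage for SQL Server variable-length string types:
# varchar = 1 byte per character, nvarchar = 2 bytes per character,
# plus a 2-byte length prefix for both (which therefore cancels out
# when comparing them).

def varchar_bytes(chars):
    return chars + 2        # data + 2-byte length prefix

def nvarchar_bytes(chars):
    return chars * 2 + 2    # 2 bytes per character + length prefix

# A 5-character US zip code wastes 5 bytes per row in nvarchar.
per_row_waste = nvarchar_bytes(5) - varchar_bytes(5)
print(per_row_waste * 100_000_000 / 1024**2, "MiB")  # ~476.8 MiB over 100M rows
```

Half a gigabyte just to store zip codes that can never contain a non-ASCII character.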

The warning against premature optimization is generally a warning against premature performance optimization. Premature performance optimization is bad because we are guessing at where the bottlenecks might be once real load is placed on a system, we are spending time optimizing something that isn't finalized yet (it may be changed or replaced prior to release), and we are spending time on optimization when we could be getting the next priority item done (which could very well be optimization of something that actually matters).

Optimizing the definition of our data, on the other hand, is a different matter. If we know what the definition of the data is, we should be defining the structure to match that definition. Optimizing the definition, however, includes weighing the accuracy of our knowledge so far, and may mean we define types more loosely until our information firms up and the level of risk goes down. There have been a number of times when I have started databases off with mostly integer, datetime, and large varchar fields because most of the data was third party and I simply couldn't trust the published standards until I got my hands on actual data. Where that risk threshold lies will be different for different people, but once we have a high enough comfort level about the data (or have good, solid business or transport rules in place that define upper limits), we should refine the data definition to best fit the data. That might be day one; it might be month 3.
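One way to refine the definition once the data firms up is simply to profile the actual values. A toy sketch along those lines: the type ranges are SQL Server's documented integer ranges, and the sample values are invented:

```python
# Toy profiler: given observed values, suggest the narrowest SQL Server
# integer type that holds all of them. Ranges are SQL Server's
# documented ones (tinyint is unsigned; the rest are signed).
SQL_INT_TYPES = [
    ("tinyint",  0,      255),
    ("smallint", -2**15, 2**15 - 1),
    ("int",      -2**31, 2**31 - 1),
    ("bigint",   -2**63, 2**63 - 1),
]

def narrowest_int_type(values):
    lo, hi = min(values), max(values)
    for name, tmin, tmax in SQL_INT_TYPES:
        if tmin <= lo and hi <= tmax:
            return name
    raise ValueError("values exceed bigint range")

# Invented sample: third-party IDs that turned out to fit in int.
print(narrowest_int_type([1, 50_000, 1_900_000_000]))  # -> int
```

In practice you would also pad for expected growth before committing to the narrower type, which is exactly the risk-weighing the paragraph above describes.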

To answer the original question: maybe. I think the premature-optimization saying is misused quite frequently. I also think that anyone quoting it and immediately making everything varchar(MAX) is not likely to understand it, as they are over-optimizing in another direction (optimizing for the fewest interruptions, fewest changes, or least thinking needed).

Um, yeah...I'm going to need you to come in on Saturday -- Bill Lumbergh, Office Space

SQLDenis wrote: What do you think: is the premature optimization meme overused/misused?

One company I worked for had a strong "get it done, optimize later" policy. They kept the code base at a neanderthal level (really) so that any developer could take it over at any moment, and they used the "optimize later" part to squeeze extra $$$ outta customers on a regular basis. Needless to say, their threshold for "premature optimization" was very low.

How many SEO experts do you need to change a lightbulb lightbulbs buy light bulbs neon lights sex p0rn

SQLDenis wrote: Every now and then when I am answering questions I get an answer that I should just make everything bigint [...] What do you think: is the premature optimization meme overused/misused?

I have found, unfortunately through personal experience, that premature optimization is the better route, provided you have the knowledge to know what to do in the first place. When I first started on some things, I just did what worked at the time without considering the future implications. More than a few times I got bitten on the @$$ as a result. But those experiences also taught me to look at the bigger picture, to see what could happen and to plan accordingly, and those lessons have allowed me to put some of those premature optimizations/best practices to good use.

Unfortunately, I have discovered that my experience is not universal: there is often a strong push (typically by management, marketing, and sales) to ship a product built without those practices, which inevitably becomes a problem precisely because they were not implemented up front, and more than a few developers are willing to cave in to such demands. Sadly, it makes the jobs of those who follow, either maintaining the code or adding new functionality, significantly more difficult in many cases... and still the lessons are not learned.

Reason and free inquiry are the only effectual agents against error; they are the natural enemies of error and of error only.