Often I write small methods (maybe 10 to 15 lines of code) that need to be reused across two projects that can't reference each other. The method might be something to do with networking / strings / MVVM etc. and is a generally useful method not specific to the project it originally sits in.

So how should you track shared snippets across projects, so that you know where the canonical code resides and where it is running in production when a bug needs to be fixed?

All told, there are about a dozen or so libraries. You can distribute the code however you see fit, so you don't have to end up with hundreds of assemblies or dump everything into one giant one. I find this approach fits because only some of our projects will need Framework.Data and only a few will ever need Framework.Strings, so consumers can select only those parts of the framework that are relevant to their particular project.

If they're really just snippets and not actual methods / classes that can be easily reused, you could try just distributing them as code snippets into the IDE (e.g. Visual Studio Code Snippets). Teams I've worked with in the past had a common snippet library that made it easier for everyone to follow our standard coding practices with internal code as well.

Break the rules from time to time

For small bits of code—say a single class with no dependencies—we tend to just copy and paste the code into projects. This sounds like a violation of DRY, and I'll admit it can be at times. But over the long term it has been much better than having some sort of massive, multi-headed commons project for a few reasons.

First, it is just simpler to have the code handy, especially when building and debugging stuff.

Second, invariably you'll want to make some little tweak to the common code for that project. If you've got a local copy of the source, then you can just make the tweak and call it a day. If there is a shared library, then tweaking it means making sure you don't break all the other apps that use it, or else creating a versioning nightmare.

So if it isn't beefy enough for its own namespace, we tend to just push it into the appropriate part of the project and call it a day.

29 Reader Comments

This just reminded me of my days working on a large embedded system. I found a bug in a low-level routine that was only detectable when the system was under severe load. When I found the problem (a decade ago), it had existed since the "beginning of time" for that product and was probably already twenty years old. I was working on fixing a customer-affecting issue, and this problem was part of a larger problem for that customer. I figured out what I thought was the whole fix, and called a code review... with as many senior guys as I could get, as I was working in unfamiliar territory (at the time).

Well, long story short: during the code review, one of the most senior guys asked me a question: I've never worked in this module before, but this code looks awfully familiar... did you copy it from someplace else?

I said: no, I just found the reported bug and fixed it.

He replied: You better search the code base for other copies of this code... I'm pretty sure there will be other copies.

He was right, and on a pretty massive scale. The few lines of code were pretty complex stuff (thus the reason for the latent bug) that most people didn't really understand, so when they needed to solve a similar problem, they first looked for code they could copy instead of writing it fresh themselves. Over the two decades, it turned out to have been copied from the original 27 times, for a total of 28 copies that I found.

What I thought was a minor fix turned out to be months of effort, because I had to make and test 28 patches, and I had to acquire all sorts of hardware and knowledge to test some of the more esoteric use cases.

So what did I learn from that experience? It would have been really nice if the person who had started to make the second copy (way back in time) had instead thought: Hmmm, this code seems like something I could modularize and find a way to share so it doesn't have to be copied in the future. But because it was only a few lines, it became "standard practice" to copy the working code wherever it was needed, rather than pay the "price" of refactoring it. (The price being that the second copier would have had to justify why he was making a change to an unrelated and probably stable module. The benefit, even then, might have been that he would have found the bug if he'd tested the [theoretically] refactored code.)

So my point in all this is simple... if you DRY, then when you find and fix a bug, you fix it for everyone...

Every time you reinvent the wheel, the chances of producing one with a bent rim or a broken spoke go up. Snippets and libraries help mitigate that risk, if they are consistently drawn from a well-maintained and controlled repository of some sort. Good and clean code is good and clean code; a good wheel is a good wheel. Use it when you can.

There are rules in programming to which there are exceptions. DRY is the one rule to which there is absolutely no exception.

I have to disagree. The conflict between DRY and KISS can be quite subtle and lead to code bases that are no easier to maintain than had those dependencies just been copied in. Worse, it makes really knowing the scope of a project somewhat difficult to see, if the majority of it is external to the code.

Repetition can itself be a tool. Like most tools, learning to use it well can be tough.

Of course, more realistically, odds are quite high that somebody has already done the thing you are trying to do right now. If you truly believed in DRY, then you would almost always be retrofitting an existing solution into your flow. Now you have two problems to solve: getting that solution into your flow, and not breaking its previous usages.

Having multiple servers with different systems, we ended up with similar code libraries all over the place. For example, let's say a PHP HTML form selector. As long as the code is good, no problems. Another issue in distributed systems is ongoing development on a codebase that is then distributed into various production systems on different servers. All systems have to be updated individually. Messy at best.

In the old days, this topic was covered by the concept of "Abstract Data Structures", and we put useful code in what we generally referred to as "libraries". These "libraries" were directories on our filesystem, or on the filesystem of a server. We had no other fancy name for it. Just ADTs and libraries. We did copy-paste, or used the fine art of #include...

My recommendation is to create a repository specifically for snippets and full-blown shared libraries. Depending on your version control system, this can be pulled into any project. If it becomes more than a snippet, move it into a "library" repository.

Because the version of a subrepo is dependent on the version of the umbrella repository, you can ensure that the snippets and libraries your project depends on don't change underneath you, whilst allowing other projects to expand and change them.

Does anyone actually do editing here? In this day of spelling and grammar checking 'repete' should never happen. This is not the only spelling error I've seen on Ars as of late.

If you read a little further down, you would see that it was spelled "Repete" deliberately. And does it matter if the spelling is occasionally incorrect on here if you can understand the meaning of what someone is saying?

Sorry, people misspell things on the Internet all the time. Complain if you want, but you're not going to change anything.

Quote:

There are rules in programming to which there are exceptions. DRY is the one rule to which there is absolutely no exception.

While I generally agree that you should try to adhere to DRY no matter what, speaking in absolutes is almost always a bad idea.

The one example I can think of off the top of my head where you can arguably ignore DRY is when you have a system that is about to be retired. If you need to add code to one part of the system that is very similar to another part, you can get away with copying and pasting the code if you know that taking on this technical debt (repeated code) will save more time and money in the long run (meaning you won't need to support it past a couple more months). Now, this could still bite you in the rear, since the system could get a stay of execution (so to speak) and you'd have to deal with the ramifications, but my point is that there are some situations where not following DRY is to your advantage.

Common libraries, with a repository managing them (so everything pulls from a canonical source when built), are the way to go here.

Sometimes you have common code that wraps actions, which isn't as amenable to this... like reading in files in Java:

```java
BufferedReader in = new BufferedReader(new FileReader(filename));
String line = in.readLine();
while (line != null) {
    // (stuff to process the file)
    line = in.readLine();
}
```

That's not so amenable to being refactored into a common library, but it's basically four lines of code.

I suppose you could write a FileLineIterator or something, which would allow you to reduce this to a foreach loop.

Quote:

Common libraries, with a repository managing them (so everything pulls from a canonical source when built), are the way to go here.

Sometimes you have common code that wraps actions, which isn't as amenable to this... like reading in files in Java:

```java
BufferedReader in = new BufferedReader(new FileReader(filename));
String line = in.readLine();
while (line != null) {
    // (stuff to process the file)
    line = in.readLine();
}
```

That's not so amenable to being refactored into a common library, but it's basically four lines of code.

I suppose you could write a FileLineIterator or something, which would allow you to reduce this to a foreach loop.

Hmm... in the last team that I worked with in Java... they were very focused on DRY and testability, to the exclusion of worries about performance. The general answer to a design choice supposedly made for performance reasons was to get 100% coverage first, and then refactor and optimize IF a performance-related problem surfaced....

So with that thought in mind... the recommendation from my old group for your example above would have been to use the Apache Commons code to read the entire file into a List&lt;String&gt; in one function call, so as to not repeat yourself writing a file-reading loop. When I first joined the group, I was initially uncomfortable with their frame of mind on such matters... but over time I found they had a point... Computers are massively more powerful these days than when I was first learning programming... and optimization probably should lose out to readability and immutability.
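For reference, the JDK itself has offered this one-call style since Java 7: java.nio.file.Files.readAllLines does what the commenter describes without the Commons dependency (Commons IO's FileUtils.readLines is the equivalent). A minimal, self-contained sketch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ReadAllLinesDemo {
    // Reads every line of a file in one call, replacing the
    // hand-written BufferedReader loop from the comment above.
    static List<String> readLines(Path file) throws IOException {
        return Files.readAllLines(file);
    }

    public static void main(String[] args) throws IOException {
        // Write a small temp file so the example is self-contained.
        Path file = Files.createTempFile("demo", ".txt");
        Files.write(file, List.of("first", "second", "third"));

        System.out.println(readLines(file)); // [first, second, third]
        Files.delete(file);
    }
}
```

The trade-off, of course, is that the whole file sits in memory at once, which is exactly the readability-over-optimization bargain the commenter's old group was making.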

The best route, when possible, is to use a third-party commons library. It wins on a number of fronts: it's naturally going to be more reliable (fifteen lines of code is *plenty* of room for bugs) and better documented, and a new dev might actually be familiar with it.

Also if you're building up a strings library, you're almost certainly wrong. Most complex string matching and splitting code can be replaced with regular expressions. Most complex regular expression code can be replaced with a proper parser. And code that assembles complex strings or text files can be replaced with a templating library.
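To illustrate that point (a hypothetical example, not from the thread): hand-rolled indexOf/substring parsing of a string like "host=example.com;port=8080" collapses to a few lines with java.util.regex:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyValueParser {
    // Matches "key=value" pairs; values run up to the next semicolon.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=([^;]*)");

    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("host=example.com;port=8080"));
        // {host=example.com, port=8080}
    }
}
```

Once the input grammar grows nesting or escaping, this is the point at which the comment's advice kicks in again: reach for a real parser rather than a longer regex.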

Quote:

I suppose you could write a FileLineIterator or something, which would allow you to reduce this to a foreach loop.

I was about to say, "don't do it" as Iterator.next() and Iterator.hasNext() can't throw a checked exception, so you'd have to wrap IOException as RuntimeException or otherwise hide IO failures.
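For what it's worth, here is a sketch of that FileLineIterator idea that works around the checked-exception restriction by wrapping IOException in the JDK's UncheckedIOException (Java 8+), rather than hiding it. The class name and shape are illustrative, not from the thread:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Lets callers foreach over lines, at the cost of surfacing
// IO failures as unchecked exceptions instead of checked ones.
public class LineIterable implements Iterable<String> {
    private final BufferedReader reader;

    public LineIterable(Reader source) {
        this.reader = new BufferedReader(source);
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private String next = readAhead();

            private String readAhead() {
                try {
                    return reader.readLine();
                } catch (IOException e) {
                    // Iterator's signature forbids checked exceptions,
                    // so we wrap rather than swallow.
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public String next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                String current = next;
                next = readAhead();
                return current;
            }
        };
    }
}
```

Usage would be something like: for (String line : new LineIterable(new FileReader(filename))) { ... }. Note that on Java 8+, java.nio.file.Files.lines gives essentially this behavior out of the box, throwing UncheckedIOException from its stream.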

I follow the principle of: write it once, write it twice, and if you need to write it a third time, look at how you could generalise it, either into a base class or a library. That said, I still have some code that has to be written into every class that uses it (I may be able to fix that now that I know about generics), because there were enough differences between the methods that I couldn't write a generic version.
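The "third time, generalise" step often amounts to pulling the part that varies out as a type parameter or a function argument. A hypothetical sketch (not from the thread) of two near-duplicate loops collapsing into one generic helper:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class Generalise {
    // The generic version: the part that differed between the
    // copies is now a parameter instead of duplicated code.
    static <T, R> List<R> mapEach(List<T> input, Function<T, R> transform) {
        List<R> result = new ArrayList<>();
        for (T item : input) {
            result.add(transform.apply(item));
        }
        return result;
    }

    public static void main(String[] args) {
        // The two former copies become one-line calls.
        System.out.println(mapEach(List.of("a", "bb"), String::length)); // [1, 2]
        System.out.println(mapEach(List.of(1, 2), n -> n * 10));         // [10, 20]
    }
}
```

When the differences between the copies are in control flow rather than in a single expression, generics alone won't get you there, which matches the commenter's experience.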

I've also worked on a project where the original team went DRY-crazy and, as such, made the whole product harder to maintain and debug, because you had to traverse a couple of classes, a few WPF data bindings, and finally dive into the code-behind of a couple of datasets to get at the origin of a bit of data. So DRY is very much a guideline rather than a hard and fast rule.

Most people have a "utilities" library where they keep stuff like this. I do.

The C/C++ Code Snippets collections from the late 80s and early 90s are an example. The routines were all "public domain", since this was well before copyleft became popular. Heck, I've been using the same functions to remove leading and trailing whitespace from text strings for 20+ years - snippets that do that in my code for 5 different languages. Initially, I wanted to use someone else's code, but after having to fix commercial Rogue Wave classes for Y2K issues, I became much more confident in my own. My code isn't usually better than commercial libraries, but sometimes it is **much** better.

I don't know if this http://www.daniweb.com/software-develop ... code/_/118 or this http://www.cprogramming.com/snippets/ are the same libraries that I've referenced over the years, but you get the idea. I probably have an original ZIP file somewhere on optical media. These were cross platform, standard C/C++ code that worked across 16, 32 and 64-bit systems. Back then, a named ZIP file was how code was distributed. There might be a github repository these days that you can clone.

A couple of years ago I worked as a contractor for an org I used to work for as a regular employee. I had written about 80% of their client-side codebase, so I was very familiar with the code. It did not surprise me at all to find that one of the contractors they had hired for a short period had taken a class I had written for a UI screen, copy-pasted the whole class for a new UI screen, changed ten lines of code, and called it good. I don't think I am cynical in believing that this developer probably spent an hour at most doing this (including testing) and probably charged the org for a day or two's worth of work.

IMO, cut & paste code of more than just a few lines is a bad practice. Even just a few lines can be a bad practice if it is repeated enough times in a single class, or even in multiple classes.

As everybody knows, and at least one person pointed out, C&P code causes problems (often huge problems) with bug fixes, modifications of the logic, and refactoring.

Even worse is when the same code is copy/pasted and then modified, anywhere from slightly to heavily - this makes it harder to find, harder to fix, harder to modify.

Currently I have the unfortunate job of maintaining, modifying and extending a legacy app for a large corporation. One of the major problems with this codebase is the extensive use of copy/paste code. Moreover, not only does much of the code consist of copy/pasted code, but there are also major sections of code that were copy/pasted then modified in such a way that they have diverged significantly in how they do the same task as the original source (or vice versa - it is often hard to tell what is the source and what is the pasted code without extensive research into the history of the code). So I am presented not just with copy/pasted code, but code that does the same thing in a significantly different way.

IMO there are a number of different ways to reduce copy/pasting and increase reuse. One is a Command framework (encapsulate some task inside a command) - this works less well for general-purpose code (like formatting a string) and better for business logic or persistence logic. An advantage of a Command beyond simple reuse is that it can often be re-used in different contexts, such as on a server vs. a client, or in different execution contexts (a background thread vs. the foreground UI thread, or in a thread pool, etc.).
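A minimal sketch of that Command idea (the names here are illustrative, not from the post): the task hides behind one interface, so the same command object can run directly on the current thread or be handed to a thread pool unchanged:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CommandDemo {
    // The unit of work: callers only see execute().
    interface Command {
        void execute();
    }

    // A concrete command encapsulating one task.
    static class SaveUserCommand implements Command {
        private final String username;

        SaveUserCommand(String username) {
            this.username = username;
        }

        @Override
        public void execute() {
            // Persistence logic would go here.
            System.out.println("saving " + username);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Command cmd = new SaveUserCommand("alice");

        // Same command, different execution contexts:
        cmd.execute();                 // directly, on this thread
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(cmd::execute);     // in a thread pool
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

This is also what makes the "Unit of Work" framing in the next paragraph natural: anything implementing the interface can be queued, distributed, or retried by code that knows nothing about the task itself.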

A slightly different way of looking at it would be as a "Unit of Work", especially if the task is something that can be divided up between separate threads or processes or hardware.

Beyond that I also use abstract/base classes to contain reusable code. If the code needs to be separate from the class (as is often the case with value or data transfer objects), then I put it in a "helper" or some kind of executor class that does the task on the value object. I have found that it is often a mistake to put too much logic into value/DTO classes, because how they are used depends on the context. Having that logic separate also helps with IOC.

However, there are also cases where an entire application can end up flowing through certain "reused" classes or functions. In this case, refactoring becomes very difficult, as the outcome of refactoring the reused classes or functions may be hard to determine. This is where technical debt builds up, as developers are afraid to refactor.

Where I work, we are moving to the model of breaking large, complex web apps into smaller, more manageable, vertically siloed mini-apps that have a smaller responsibility and are functionally independent. For example, the login/authentication part of the app may be its own app that is deployable/runnable entirely separate from the "main" app.

In this case, both the authentication app and the main app may each have their own User class (for example) in which there is some overlap of properties and functionality (so not DRY). However, the upside is that you can have differences in the User classes and not have to worry that by changing one you are breaking things in the rest of the app.

Obviously, if you have to add a property to the database that backs the User class, you would need to change that property on any/all User classes for your separate mini-apps. However, this addition of properties happens fairly rarely, so it is worth the annoyance for the flexibility that mini-apps give you.
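As a hypothetical illustration of that trade-off (the class and field names are invented): each mini-app keeps its own small User class, and they overlap only where the underlying record overlaps, so either can grow without touching the other:

```java
public class MiniAppUsers {
    // The authentication mini-app's view of a user:
    // only what login needs.
    static class AuthUser {
        String username;
        String passwordHash;
    }

    // The main app's view of the same underlying record:
    // it can grow fields without touching the auth app.
    static class ProfileUser {
        String username;     // the shared property
        String displayName;
        String avatarUrl;
    }

    public static void main(String[] args) {
        AuthUser a = new AuthUser();
        a.username = "alice";

        ProfileUser p = new ProfileUser();
        p.username = "alice"; // duplicated on purpose (not DRY)
        p.displayName = "Alice";

        System.out.println(a.username.equals(p.username)); // true
    }
}
```

The duplicated username field is exactly the deliberate DRY violation described above: adding a shared column means touching both classes, but changing displayName never risks breaking login.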