Many times throughout writing larger programs I have questioned after how many copy and pastes does it make sense to put the code into a function or method and what is a good rule of thumb?
I have been using a rule of thumb of four lines or greater and appearing more then twice, then I make a simple function / method containing that code.
Can you think of a better practice or offer any pointers?
This is more of a general design pattern question rather then a language specific question.

8 Answers
8

I use functions partly as a way to document the code. Calling a function with a meaningful name makes it easier to understand the code. In some cases even a function with a single line makes sense.

For example, in "Clean Code", Robert C. Martin gives the following example:
Which one would you rather see? This:

// Check to see if the employee is eligible for full benefits
if ((employee.flags & HOURLY_FLAG) &&
(employee.age > 65))

Or this?

if (employee.isEligibleForFullBenefits())

I don't always agree with him, but in this case I do. Code should be readable, not only when you write it and you know every detail, but also at 9pm when you have to fix bugs in someone else's code. Staring at a long condition and trying to figure out all the double negatives is not recommended. If you can just put a name on it (not only conditions, but every piece of code you write), it becomes a lot simpler.

I have never regretted putting something in a function, and if you're worried about performance, then profile first.

Following this simple practice will eventually allow you to write the application code at a relatively high level. The little functions get collected into little classes and soon you are converting functional specs into code almost word for word.
–
kevin clineAug 9 '11 at 23:39

11

I loved this example. Suddenly you don't need that comment anymore. This is a rule of thumb: if your comment can be converted to a variable or function name, do it!
–
Karoly HorvathAug 9 '11 at 23:42

I agree with omrib here, it's often about cleaning up the code - what makes it more readable than any other rule of thumb. If I do end up reusing something, I'll extract it to a method. However, my IDE and tooling often helps with that, so it's easy and quick to do.
–
TravisAug 10 '11 at 16:36

+1 This sort of code could even be faster, at least if it needs to be JIT-ed - you pay only for what you use.
–
JobAug 10 '11 at 17:51

There is a widespread misunderstanding that function calls should only be made to avoid repetitive code segments. My rule of thumb is that any logical unit of work should be made into a function, even when it's used only in a single place. This usually leads to a better readability, and allow you to write self-documenting code, where function names replaces comments and you don't need to write additional comments explaining what you're doing.

then make it a function or method. Long pieces of repeated code, in my experience, will naturally fall into one of these categories (usually the first one, but then the categories overlap a lot ;). Of course, anything that has to be in the interface is also a function/method in its own right.

The question is: why wouldn't you write a function for a commonly repeated piece of code even if it wasn't tricky to get right or likely to change? (My rule of thumb: if its repeated and longer then calling a function, make it a function)
–
Winston EwertAug 9 '11 at 23:34

I would go even further. As a programmer you have to eliminate all kinds of repetition. Whether it's in the database (normalize), some manual testing (replace it with unit tests) or deployment (automate it).
–
Karoly HorvathAug 9 '11 at 23:46

@Winston: that depends on the language being used. Not every construct can be naturally captured as a function, the function might take more space than the original code (think C and return by pointer), function calls may incur overhead.
–
larsmansAug 10 '11 at 8:52

@larsman, I'm curious what you are referring to by "(think C and return by pointer)". But what you are saying is what I was trying to get at with my rule of thumb. Calling the function has to be easier (i.e. naturally capturing and taking less space) then implementing the contents of the function.
–
Winston EwertAug 10 '11 at 12:49

If a piece of code computes multiple values, say float x, int y and double density, then setting up those computations as a C function can be trickier than just repeating the code, since you have to devise a way to get all three values out. If the repeated computations themselves are trivial, it's sometimes better to just leave them inline.
–
larsmansAug 10 '11 at 13:28

Beware of duck typing, if for now the implementation is similar, but the functionality is effectively different, then merging the two makes it annoying as hell to split back. Especially in languages with poor IDE support (hey, I work in C++...)
–
Matthieu M.Sep 5 '11 at 18:00

On the other hand by keeping them apart you have two functions that do the same thing to test, half the chance of your code being exercised, two places in which the same bug may creep up and you have to remember to fix the one in which the bug hasn't been detected yet. I'd like to be still working in C++, poor IDE support notwithstanding ;-)
–
Nicola MusattiSep 6 '11 at 9:26

A search for "refactoring" will lead you to many resources for industry "best practices" for this very common process. The somewhat famous article, Once and Only Once is a great historical reference explaining what some view as "best practices" for the concerns raised by your question. Also, the even more general concept is known as Don't Repeat Yourself (DRY). For a really in-depth set of answers to your question, read Martin Fowler's great classic, Refactoring: Improving the Design of Existing Code, which covers some of the best known advice for refactoring, which is what you are intuitively trying to accomplish!

That depends on the nature of the cohesion of the repeated code. If the repeated section of code is performing a specific function, then it is an excellent candidate for being made into a method, partly because of the DRY principle, partly because if the function needs to be optimized or corrected, then there is only one section of code to deal with.

If the association is coincidental, it is better to repeat the code rather than make it into a method. If you need to add something to the middle of one of the code sequences to satisfy one of the uses of that snippet, if it is in a method, the change you make may affect other uses of that method.

if the association looks coincidental, it is likely that the two processes do share a common idea, and you probably should explore whether the two processes are indeed two facets of the same thing. More often than not, it is.
–
Lie RyanSep 6 '11 at 21:09