I started out with a script of a few hundred lines. Later, I realized I wanted another script that would need much of the same code, so I decided to wrap the parts of the original script that would be shared into function definitions. While deciding exactly what should go in a function, I came across several things to consider:

What should I set as its input parameters? If anything was needed in the function, I should probably require it as a parameter, right? Or should I declare it within the function itself?

If function x always requires the output of function y, but function y is sometimes needed alone, should function x include the code of function y or simply call function y?

Imagine a case where I have 5 functions that all call a function 'sub' (sub is essential for these 5 functions to complete their work). If 'sub' always is supposed to return the same result, then wouldn't multiple calls from these parent functions duplicate the same work? If I move the call to 'sub' outside of the 5 functions, how can I be sure that 'sub' is called before the first call to any of the 5 functions?

If I have a segment of code that always produces the same result, and isn't required more than once in the same application, I normally wouldn't put it in a function. However, if it is not a function, but is later required in another application, then should it become a function?

Sorry if these questions are too vague, but I feel there should be some general guidelines. I haven't programmed for very long, and I have bounced around between OOP and functional programming, but I don't remember ever reading anything that explained this. Could it simply be a matter of personal preference?

Consider studying Test Driven Development. One of its purposes is to guide you towards those methods, input parameters and logic that your program actually requires.
–
Robert Harvey, May 31 '13 at 19:37

@imagineerThis For #3 on your list, you may want to create a separate question with a more concrete example, as it probably expands the scope of discussion on this single topic too greatly.
–
JustinC, May 31 '13 at 21:09

You are currently thinking of functions only as containers for code that is reused somewhere. That's a start when learning how to build functions, but it is not the best way.

A different point of view is to use functions to build abstractions. You wrote that you have a script of a few hundred lines - fine. That script contains blocks that each do a certain subtask. Each of these blocks belongs in a function of its own. The name of the function should be self-explanatory, telling you what that subtask is. When you avoid global variables and side effects, it becomes self-evident what parameters your functions need and what they must return.

This way, your functions become the building blocks of your application. This is mostly independent of how often they are reused, and independent of how often they are called.
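As a minimal sketch of the idea above (the task and every function name here are made up for illustration, not taken from the original script), a script whose blocks each became a named function might look like this in Python:

```python
# Hypothetical example: a script that reports the average word length of a text.
# Each former "block" of the script is now a function named after its subtask.

def split_into_words(text):
    """Return the whitespace-separated words in text."""
    return text.split()


def average_length(words):
    """Return the mean length of the given words, or 0.0 for an empty list."""
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)


def format_report(average):
    """Return a one-line, human-readable summary."""
    return f"Average word length: {average:.2f}"


def main(text):
    # The top level now reads like a table of contents for the script.
    words = split_into_words(text)
    average = average_length(words)
    return format_report(average)


print(main("functions are building blocks"))
```

Because no function touches global state, the parameters and return values fall out of what each subtask needs, even though each function is called only once.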

If function x always requires the output of function y, but function y is sometimes needed alone, should function x include the code of function y or simply call function y?

y is a single subtask within x, something you can give a separate name, so it clearly belongs in its own function, regardless of whether it is ever needed alone!

Imagine a case where I have 5 functions that all call a function 'sub'
(sub is essential for these 5 functions to complete their work). If
'sub' always is supposed to return the same result, then wouldn't
multiple calls from these parent functions duplicate the same work? If
I move the call to 'sub' outside of the 5 functions, how can I be sure
that 'sub' is called before the first call to any of the 5 functions?

First, if the same thing is calculated twice, does that really matter in your case? There is no need to optimize that away as long as you have no proven performance bottleneck. Second, if you really do have a performance bottleneck in such a case, there is the technique of memoization, which solves exactly this problem.
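A minimal sketch of memoization in Python, assuming 'sub' takes no arguments and is expensive; the bodies of `sub`, `parent_a`, and `parent_b` are placeholders, and `call_count` exists only to show how often the real work runs:

```python
from functools import lru_cache

call_count = 0  # illustration only: counts how often the body of sub() executes


@lru_cache(maxsize=None)
def sub():
    """Stand-in for an expensive computation that always gives the same result."""
    global call_count
    call_count += 1
    return sum(range(1_000_000))


def parent_a():
    # Each parent simply calls sub(); the cache makes repeated calls cheap.
    return sub() + 1


def parent_b():
    return sub() * 2


parent_a()
parent_b()
print(call_count)  # 1
```

Since every parent calls `sub()` itself, nobody has to remember to call it first; whichever parent runs first pays for the computation and the rest reuse the cached result.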

If I have a segment of code that always produces the same result, and isn't required more than once in the same application, I normally wouldn't put it in a function.

That is exactly what you should do - put it into a function, not for the purpose of reuse, but for the purpose of creating an abstraction.

Clearly you're an advocate of abstraction. May I ask why? I can see that the code may be easier for a reader to understand. When reading a book, glancing at the chapters provides information on the content and direction. Similarly, glancing at the definitions in code provides information on logic and work flow.
–
imagineerThis, May 31 '13 at 21:47

@imagineerThis: take what you wrote together with the fact that code is 10x more often read than written. Does this answer your question?
–
Doc Brown, May 31 '13 at 22:27

@imagineerThis: Yes, code definitions have the logic, but it's faster to glance at a call to a well-named function than it is to pore over every detail to figure out what is going on. Then, when you are concerned about the details of a particular method, you can simply go to that definition (and ignore the rest).
–
jhewlett, May 31 '13 at 23:46

Thanks. I guess it's especially important if the code will be read by others, or myself in the future.
–
imagineerThis, Jun 1 '13 at 1:02

@imagineerThis: as it always will be ;-)
–
Doc Brown, Jun 3 '13 at 18:45

In Clean Code, Robert Martin argues that fewer parameters are better. Consider whether something should be an instance variable instead of a parameter. Also, see Preserve Whole Object in Refactoring by Martin Fowler.
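A rough sketch of both suggestions in Python; `DateRange`, `SalesReport`, and their fields are hypothetical names invented for this illustration, not examples from either book:

```python
from dataclasses import dataclass


@dataclass
class DateRange:
    start_day: int
    end_day: int


# Loose parameters: the caller has to unpack the object before every call.
def contains_loose(start_day, end_day, day):
    return start_day <= day <= end_day


# Preserve Whole Object: pass the object itself and let the function read it.
def contains(date_range, day):
    return date_range.start_day <= day <= date_range.end_day


class SalesReport:
    def __init__(self, tax_rate):
        # Held as an instance variable instead of being passed to every method.
        self.tax_rate = tax_rate

    def gross(self, net):
        return net * (1 + self.tax_rate)


june = DateRange(start_day=1, end_day=30)
print(contains(june, 15))
print(SalesReport(tax_rate=0.5).gross(100))
```

Whether an instance variable is the better choice depends on whether the value really belongs to the object's state for its whole lifetime; that is a judgment call rather than a rule.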

I'm not sure what you mean by "always returns the same result". Do you mean that the same result is returned for the same input parameters (deterministic)? Are you asking about lazy loading?

Before I read Clean Code, I was reluctant to create small functions that were used only once. Now I do so frequently. Consider extracting a function if you have a long function with several blocks that represent sub-steps in the function. The tell-tale sign is a comment that says something like "Calculates Monthly Sales". You should probably extract the code block into a function called calculateMonthlySales.
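To illustrate the extraction described above, here is a small before-and-after sketch in Python. The report function and its data are invented for the example, and the name is spelled `calculate_monthly_sales` only to follow Python naming conventions:

```python
# Before: a longer function with a commented block inside it.
def build_report_before(daily_sales):
    # Calculates monthly sales
    total = 0
    for amount in daily_sales:
        total += amount
    return f"Monthly sales: {total}"


# After: the commented block becomes a function whose name replaces the comment.
def calculate_monthly_sales(daily_sales):
    return sum(daily_sales)


def build_report(daily_sales):
    return f"Monthly sales: {calculate_monthly_sales(daily_sales)}"


print(build_report([10, 20, 30]))
```

The behavior is unchanged; the only difference is that the intent formerly stated in a comment is now stated in a function name.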

There are no recipes for logic, but I will answer your specific questions:

Q: What should I set as its input parameters?

A: the things needed to do the calculation

Q: If anything was needed in the function, I should probably require it as a parameter, right?

A: yes, or you can get it from a file or database

Q: If function x always requires the output of function y, but function y is sometimes needed alone, should function x include the code of function y or simply call function y?

A: simply call y

Q: Imagine a case where I have 5 functions that all call a function 'sub' (sub is essential for these 5 functions to complete their work). If 'sub' always is supposed to return the same result, then wouldn't multiple calls from these parent functions duplicate the same work?

A: if sub always returns the same result, then no work is being done.

Q: If I move 'sub' outside the 5 functions, how can I be sure that 'sub' is called before the first call to any of the 5 functions?

A: Wasn't sub already outside the 5 functions? Do not call sub from within any of those functions. Call it in the parent program before calling the 5 functions.

Q: If I have a segment of code that always produces the same result, and isn't required more than once in the same application, I normally wouldn't put it in a function. However, if it is not a function, but is later required in another application, then should it become a function?

A: If it is complex enough, put it into a function. If another app needs it, share the library.

"A: wasn't sub already outside the 5 functions ?... Do not call sub from within any of those functions. Call it in the parent program before calling the 5 functions." - But if someone is importing my definitions only, they will need to know that sub needs to be called first.
–
imagineerThis, May 31 '13 at 23:59