...and then I wonder whether or not I should gather up the various combinations of foo, bar, and baz that lead to thing1, thing2... before calling them.

4: I can try to get clever using fall-thrus and such to reduce the duplicated code, but this is fragile.

5: something else.

It doesn't matter what the A, B, C, thing1, thing2... is. I've run into this general pattern before.

Further, sometimes thing-n comes in variations. For example, if the data associated with A is a particular (kind of) value, then thing3 uses that value, otherwise thing3 uses some default. As a concrete example, there are message boards (Science-1, Science-2, Math-1...) which are displayed one way, but TeacherLounge-# boards (of the same structure) are displayed differently. And the -0 boards (i.e. Science-0) are not displayed to the public, whereas the -n (n>0) are.

Is there any good way of thinking about it that would help choose between #1, #2, and #3? #3 seems the best to me, but none of them are elegant. Any better ways to reduce code duplication? Am I "doing it all wrong" (without getting into haskell-like languages)?

Jose

Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

I think it's mostly a matter of taste or highly dependent upon use case and environment (consistency with other libraries/projects improves clarity).When the thing<k> operations mutate the data and require a particular order, my preference would be the conditional dataflow of #2. Well, as long as the things can't alter the result of the conditions, otherwise the clarity is easily lost.

ucim wrote:(without getting into haskell-like languages)

Speaking of, maybe I'm doing it all wrong, but why is there no function Bool->(a->a)->a->a or f a->(a->Bool)->(a->f a)->f a or something similar in the standard libraries? Function or case notation is verbose in such simple cases.

Whether the array contains function pointers/function objects that can be called in a loop, or entries of an ENUM (or simply strings) that are later handled with a switch statement, I think this variant does multiple things well:

a) the logic remains as clear as in example 1. Your example 2 needs to reorder all the logic, which is difficult to read and maintainb) it is immediately obvious what each case does, because the list of actions is immediately below the corresponding if. Your examples 1 and 3 suffer, because "case=bar" doesn't tell the reader anything unless the reader scrolls down.c) it separates the selection logic from the actions.

About your variations, it's difficult to say without knowing what kind of variations you have, but here are two inspirations:

(Table-tags would still be nice, ah well. But what you have is an array covering all sixteen possible combos, populated by function numbers we know we want according to not-variable and is-variable positions.)

Obviously you'd have etcs, including more than four tests, and you may even want to change things later, but by looking for the largest groupings (together or wrapped or, with more vars to worry about, spread-spanned unless you switch variables around (in your head, or actually) I can see, forthis particular example, that I can quickly summarise as:

If (B & C & D) // defines two positions, in the simplest + largest + exclusivest way things123;elseif (A & C) // defines four positions, but the A&B&C&D one has already been iffed, // and this is an else to that... thing2;elseif (A & !B & !D) // defines two positions, but A&C has already been dealt with // which doesn't mean you *can't* add !C to that things134;else // room for catchalls etcs???endif

Note that instead of overriding the "foo" situation with the "bar" one, I've just pre-emptively used the barish call, if necessary, by shortcutting.

And, based on the assumption of logical groupings of thingNs, I've called an actual envelope "thingsXYZ" calle, here, although you don't have to. My thinking, though, is that it first of all forces you to anticipate a combination in code (things234 could be a typo, as could calling "thing1: thing1; thing3;" in longhand due to a copypasta error) and also enables you to validate and extend such specific sequences. (You're probably not using thingN functions but something like "undock(RadarWindow); resize(RadarWindow,quarter_screen); recentre(RadarWindow.data);" and a function "zoomFocus(RadarWindow)" could be a good envelope function to have, so that adding "scale(RadarWindow.data)" i to zoomFocus also deals with other zoomFocussing (e.g. with MapWindow as the param,, another side-docked element in this hypothetical).)

But, it depends on your style (and expectations as to who might be maintaining/porting/reverse-engineering the code in the future, including your future self!) and how malleable your decision-making process is possibly going to be both in development and in foreseeable future tweaks.

And I'd probably do it different in Perl. Define @doFuncs=() and then push (and, in extreme circumstances, splice) $subRefs into that possibly in the form of an Iterated function, then while ( $doFunc = shift @doFuncs) {&$doFunc()}, or somesuch guff. [Which it appears Tub has just ninjaed me about, whilst I was perfecting my pseudo-table formatting, except in OO form.] But the logic of the pushing would probably be much the same, with or without a truth-array (depending on whether I think I could keep the workings all safely in my head.

My reason for trying to avoid this is that the things being done and the order in which they are done are pretty much mostly the same in all the cases. So whether I put the thing strings in an array first, or just do them straightaway, I'm still duplicating the logic of "thing1 comes first, thing2 is next..."

I often get to this point by starting with (the logic I don't want to be duplicating):

thing1;thing2;thing3;thing4;thing5;

where each "thing" could be a function or could be a simple statement (like foo=bar;)

but then, if B is populated, we don't need thing2, and if C is populated and has the value 0, we need to do thing3a before thing3. Now it's

thing1;if (B) then thing2;if (C and C=0) then thing3a;thing3;thing4;thing5;

Thing1, thing2, and thing3 set up a form, but if the only incoming data is A, then it's a virgin form.

if !(A and !B and !C and !D)// if not virgin{thing1;if (B) then thing2;if (C and C=0) then thing3a;thing3;}thing4;thing5;

At this point, the logic flow of thing12345 is getting lost, and there are a lot of if statements, but at least no code duplication (except perhaps having to evaluate B and C twice so far)

So (sez I)... evaluate A, B, C... once, creating foo, bar, and baz cases, and switch on them... and you can see where that's going.

An example of thing3a would be that usually the table to use is in A, but if it's a certain kind of data coming in (we have A, B, and D populated), the table to use is in D. So usually I'll have the statementtable=A['tablename'];but sometimes I'll have the statementtable=D['name'];but in all cases followed by:do_stuff_with_(table);

@ThemePark - also brilliant. I'll have to remember that for the future. It's especially elegant for pure cases.

@Soupspoon - that solution is the (4) in my OP; you using "already if'd" and me using case fall-thrus. But though most efficient, it looks fragile. Is the matrix itself something you're using in the code, or just as a guide? (I actually did use a matrix in plpgsql code to do that very thing... it was a nightmare (plpgsql ain't good at that!) but the best way to accomplish the task at hand.)

Jose

Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

ucim wrote:1: I can FIRST figure out which subset of fields I have, and THEN use a switch to do the proper subset of things....2: I can do the things in order, and for each thing, figure out whether the data combination means I should do that thing in the first place....3: I can do both, but it seems to be the long way around:...4: I can try to get clever using fall-thrus and such to reduce the duplicated code, but this is fragile....5: something else....

#1 is bad; it's just a less-readable variant of #2. If the stuff you need to do is long enough to make #2 less readable, it should be in functions so the conditional structure remains paramount.

I've done this kind of code before; the right thing to do is to make a few boolean variables for "do_thing_1", "do_thing_2", etc. Then, first, go through all your logical conditions, setting the correct variables to true. Then just run through the six things based on whether the corresponding variable is set.

Further, sometimes thing-n comes in variations. For example, if the data associated with A is a particular (kind of) value, then thing3 uses that value, otherwise thing3 uses some default. As a concrete example, there are message boards (Science-1, Science-2, Math-1...) which are displayed one way, but TeacherLounge-# boards (of the same structure) are displayed differently. And the -0 boards (i.e. Science-0) are not displayed to the public, whereas the -n (n>0) are.

Same deal - abstract the data you're actually going to work on into variables, and in the conditionals, set them to the correct value. Then in the six things, work on those variables, rather than on your incoming data directly.

Because I assumed that "thing1" is a placeholder for something more complicated, along the lines of results.thing1 = thing1factory.getInstance().thing1(data.A, data.B, params.thing1_parameter);and the point of the exercise was to avoid duplicating that code multiple times.

Now that you've given a more detailed example, I see why my suggestion might not be as useful for your specific problem.

I also see that your problems are way too diverse to abstract them away in a simplified model. There's no one-size-fits-all solution that's more concise than what you're already doing.

The thing is, your logic is inherently messy, and any possible way to translate that logic into code is going to end up messy. It is going to be difficult to read precisely because you're trying to communicate something non-trivial.That being said, you can always improve readability. For one, I'm not sure if the single "case" variable is a good idea. You have multiple independent conditions and you should probably track them separately. So why not use multiple bools with clear names?

if !(A and !B and !C and !D)// if not virgin{thing1;if (B) then thing2;if (C and C=0) then thing3a;thing3;}thing4;thing5;

There. Ok, most of my booleans have stupid names because I don't know what those checks do, but I hope you get the idea of proper, readable naming anyway. Also the idea of piecewise assembly. For example

if (A and C) or (B and C and D) thing2;if (A or (B and C and D)) thing3;

bool BCD = B and C and D;bool needs_thing2 = (A and C) or BCD;bool needs_thing3 = A or BCD;

if your bools have better names than mine, that should increase readability and reduce duplication.

There are plenty of variants with a framework with inversion of control and dependency injection that would yield a code structure with very clean separation of tasks, but in the end you end up with 1000+ LoC for essentially the same result. That's overkill and won't increase readability.

ucim wrote:@Soupspoon - that solution is the (4) in my OP; you using "already if'd" and me using case fall-thrus.

For fall-throughs, as I use the term, it'd be a different optimal combo, but could still be worked out from the same diagram. Although, for that limited example, I can't see a more efficient fall-through.

That looks messy, and I've pseudocoded the cases, probably... You might even pre-if to set "docase1=true" and "case (docase1)" instead, etc, but then it might as well be serially (rather than elseif 'next test if not'ting) or nestingly if-controlled in its entirety, and not even Cased at all in the first place.

(Depends on your real-life scenario. The logic of your actual requirement(s) could give a more logical fall-through routing than this example ever could.)

But though most efficient, it looks fragile. Is the matrix itself something you're using in the code, or just as a guide?

Here just a guide (and easier jotted down on paper, for planning purposes) but stored in such a text form as a comment just above the code lets you buffer the 'fragility' by referring to it and altering it (via anothe scrap of paper, if necessary!) whenever you need to revise the logic you're taking. And inform future readers of the approach currently taken.

(It's not unreasonable to consider storing the table in human-editable and machine-parsable form (just the one entry for both, that is!) for either 'precompiled' or on-the-fly evaluation, but would writing the 'solver' code be worthwhile? Easier than a natural-language parser, but probably a diversion left for when you have time on your hands, if at all... Unless something like it is already Public Domain or otherwise rippable-off with impunity... )

If a future revision changes the diagram to work better in a different and more efficient form then a trained eye could discover this, as well, rather than have to work with pasta-code1 to derive the current workings, then decide whether to re-engineer it from scratch, in-built (but rarely encountered and never noticed) logical errors and all...

Not that I'm exactly the saintly supplier of super-readable script, allmostmuch enough of the time, but I can still pontificate upon preferable programming paradigms, right?

1 Not necassarily spaghetti, but could look sufficiently like it at first glance!

Thanks for all your ideas. Yes, there are several different kinds of ways to group things, and I've been taking a case-by-case approach. Sometimes this leads to one form (examine input, decide on the kind of thing to do, and then switch (or if..elseif) to choose.) The advantage is each logical grouping is clear and well named:if (new_event) { bunch_of_stuff;}elseif (edit_event) {similar_bunch_of_stuff;}...

Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

Tub wrote:Shouldn't that be part[part_number].name or ->name or even ['name']? What language are you using, and why aren't you using proper classes with named attributes?

It's pseudocode, dimly based on (PHP hidden to protect the sensitive), which is what my latest project is in. I needed a scripting language, and chose it because it looked like C and I wanted to get going fast. I'd been out of coding since structural programming (C, pascal) was a thing, so missed out on the joys of OOP. And now, here I am. Sometimes I think I'm slowly inventing parts of OOP, but let's not go there! And converting the whole thing to something else (Python?) would be a huge project, and I need to know I have something marketable before I try that.

Yeah, I've added spaces too.

Jose

Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

Honestly, duplication isn't so bad, usually. Worth worrying about if the lack of it results in massive bloat, or performance issues or whatever, but most of the time, the clarity is handy down the line, and the optimization *really* isn't important.

It's a tradeoff, and as such, there isn't a single perfect answer you should always do, you decide based on the needs of your current app. That said, I find that *most* coders are prone to premature optimization in search of the mythical perfect code.