DoTheSimplestThingThatCouldPossiblyWork:
Having the code in a class depend on the class' own implementation strategy is not that bad of a CodeSmell.
Just go ahead and use the member variables until such a time as you find that you need to redesign the implementation, then change all member access to accessor functions.

The SelfLanguage does not expose member variables (slots) to the programmer. All access to slots is done through accessors. Declaring a slot defines the accessors in some implementation-dependent manner. E.g. declaring a slot called "slot" would define a getter called "slot" and a setter called "slot: new_value".

RubyLanguage has something similar with its accessors, but also allows direct member access from within the object.

Using your own variables (outside of accessors) is one thing. Using your parent class' variables is truly evil.

I spent four days once (intermittently, in between doing real work) tracking down a bug that was driving me nuts. A member variable kept getting changed in the production system, and I couldn't see how. Putting a breakpoint at the setter didn't show me anything unexpected. A line-by-line inspection of the code didn't help me either. Then, all of a sudden, I noticed that the variable wasn't declared as private, but as protected. When I changed this, my jaw dropped as the approximately 500 compile errors it caused crept out of the woodwork (about 120 of these were value-changing!)
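A minimal Java sketch of the kind of bug described above (all names are invented for illustration): a subclass writes directly to a protected field, so a breakpoint in the setter never fires. Declaring the field private would turn every such direct write into a compile error.

```java
// Hypothetical reconstruction of the bug described above; the names are
// invented for illustration, not taken from the original system.
class Base {
    protected int status;           // protected: subclasses can bypass the setter
    private boolean setterUsed = false;

    public void setStatus(int s) {  // a breakpoint here never fires for direct writes
        setterUsed = true;
        status = s;
    }
    public int getStatus()         { return status; }
    public boolean setterWasUsed() { return setterUsed; }
}

class Sub extends Base {
    public void corrupt() {
        status = 42;                // direct write: invisible to the setter.
                                    // If 'status' were private, this line
                                    // would be a compile error instead.
    }
}

public class ProtectedLeakDemo {
    public static void main(String[] args) {
        Sub s = new Sub();
        s.corrupt();
        System.out.println(s.getStatus());      // 42
        System.out.println(s.setterWasUsed());  // false: the setter never ran
    }
}
```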

Sometimes, good coding habits (like making your non-final variables private) can really leave you blind to the evil practices that others use.

You can discover all sorts of interesting things about a program by simply commenting out a variable...

One could argue that Java is a flawed language: ideally, one should be able to hide the distinction between changing variables and using an accessor. This simplifies syntax and allows one to swap one for the other. If an accessor is simply a formal wrapper around a variable, then by itself it provides no value and is just repetitious bloat that hogs eye real-estate. It only provides value if we LATER want to add more processing or control. Thus, it's more logical to start out with direct variables and change them to accessors later, if and when needed. But with Java you'd have to change the interface to do this. In a "proper" language, you wouldn't: you could switch from variables to accessors without changing existing calling code. A side-effect is that you wouldn't be able to tell whether it's a direct reference or an accessor by looking at the call interface alone, unless perhaps you make it "read-only" or add parameters. (Somewhere there is a related existing topic on this.)
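A small Java sketch of the point above (the Point classes are invented for illustration): converting a public field to accessors changes every caller's code, which is exactly the interface change a language with transparent properties would avoid.

```java
// Hypothetical 'Point' classes, invented for illustration. With a public
// field, callers write p.x; after wrapping, every caller must be edited
// to use p.getX() -- in Java, the change is not transparent.
class PointField {
    public int x;                   // direct variable: caller syntax is pf.x
}

class PointAccessor {
    private int x;                  // wrapped: caller syntax becomes pa.getX()
    public int getX()       { return x; }
    public void setX(int v) { x = v; }
}

public class InterfaceChangeDemo {
    public static void main(String[] args) {
        PointField pf = new PointField();
        pf.x = 3;                   // direct access
        PointAccessor pa = new PointAccessor();
        pa.setX(3);                 // accessor access: different call syntax
        System.out.println(pf.x == pa.getX());  // true: same value, different interface
    }
}
```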

But if one is stuck with a language that forces an external distinction, like Java, I would suggest leaving variables "naked" if there will be relatively few users (other classes) of the class and these associations will be relatively stable. If, however, there will be a lot of using classes and/or the associations change fairly often, then go ahead and wrap them in accessors up front. -t


Take a look at C#'s Properties, and please reflect on whether your advice is based on good practice or your personal preference. There are good reasons -- that you've not mentioned -- not to do what you suggest.

Let's stick with Java for now; I'll come back to C# later. Per Java, it's to avoid bloat. One should only create such bloat if the cost of change is great enough to justify it. Bloat creates risk by confusing and/or distracting "the eye", but visiting many callers to convert direct attributes to accessors when later needed also creates risk (and work). Thus, there is a balancing point of risk. If you are likely to have only a few, stable callers, then the bloat doesn't prevent enough risk and rework to justify its own risk (by being bloat). Granted, I have no formal studies on confusion/distraction caused by bloat; these are based on my personal observations of my own work and others'.

General illustration of trade-offs:

Scenario A: 2 callers
    Scenario A.1: Direct attributes
        Visual complexity: Low
        Cost & risk to change callers: c * 2
    Scenario A.2: Wrapped attributes (setX/getX)
        Visual complexity: High (bloat cost)
        Cost & risk to change callers: 0

Scenario B: 40 callers
    Scenario B.1: Direct attributes
        Visual complexity: Low
        Cost & risk to change callers: c * 40
    Scenario B.2: Wrapped attributes (setX/getX)
        Visual complexity: High (bloat cost)
        Cost & risk to change callers: 0

Different people may assign different values to 1) the cost of bloat (see FastEyes), 2) the probability that we'll later need accessors, and 3) the cost of changing the interface (and changing the callers). But do notice the "c * 40" (c times 40) in B.1. It's probably a high total under most readers' estimated costs, so I believe most would agree that under scenario B it's probably a good idea to wrap. If we don't wrap and have few callers (A.1), then the cost of changing the interface is relatively low, and arguably lower than the cost of bloat (A.2). The "best" choice in scenario A is probably subject to heavy debate, based on estimations of one's own WetWare and/or that of typical developers in one's shop. It's roughly a wash by my estimate, leaning toward A.1 in the name of YagNi. Related: DecisionMathAndYagni, SimulationOfTheFuture.
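The trade-off above can be sketched as toy arithmetic. The cost and probability numbers below are invented placeholders, not measurements; the point is only the shape of the comparison.

```java
// Toy model of the trade-off sketched above. 'bloat' and the per-caller
// change cost 'c' are invented placeholder numbers; only the shape of
// the comparison matters, not the specific values.
public class WrapTradeoff {
    // Expected cost of leaving attributes naked: if a change is later
    // needed (probability pChange), every caller must be edited.
    static double nakedCost(double perCallerChangeCost, int callers, double pChange) {
        return pChange * perCallerChangeCost * callers;
    }
    // Expected cost of wrapping up front: the "bloat" is paid immediately.
    static double wrappedCost(double bloatCost) {
        return bloatCost;
    }
    public static void main(String[] args) {
        double c = 1.0, bloat = 10.0, p = 0.5;
        System.out.println(nakedCost(c, 2, p));   // scenario A.1: 1.0
        System.out.println(nakedCost(c, 40, p));  // scenario B.1: 20.0
        System.out.println(wrappedCost(bloat));   // A.2 / B.2: 10.0
    }
}
```

Under these made-up numbers, not wrapping wins in scenario A (1.0 < 10.0) and loses in scenario B (20.0 > 10.0); readers plugging in their own estimates may reach different break-even points.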

Do you feel that's a balanced assessment? The only reason you mention to avoid direct access of member variables is that the interface might change.

I am not following. Please elaborate, perhaps with a specific example.

It's not an issue of a specific example, but of the fact that your scenarios are focused almost entirely on "bloat" and "visual complexity". There are good reasons to avoid direct access of member variables, such as to increase encapsulation and reduce coupling. Why do you not mention these?

"Coupling" is an ill-defined concept, and encapsulation for encapsulation's sake is a waste of code. I am not suggesting that one never wrap/hide variables, but rather that one only do it WHEN there is a known and existing reason, or there is likely to be one in the future. Thus, I am not arguing against encapsulation in general. It's a matter of if and when to encapsulate. I'm not a YagNi purist in that I use "likely to need" as my criterion versus pure YagNi's "until actually needed".

Coupling is a very well-defined ComputerScience/SoftwareEngineering concept. Coupling exists wherever there is a dependency such that altering A affects B. In that case, we say that A and B are coupled. Where coupling is necessary, we try to group it together whenever possible. That's cohesion. To reduce accidental or intentional (but unnecessary) coupling between defined units of cohesive code, we use encapsulation. Your argument appears to be based on the presumption that code will be static and neither re-used nor modified, and is simple enough that accidental coupling is unlikely.

"Affects" appears to be open-ended, but I don't want to do the LaynesLaw dance over "affects" in this topic and bloat it up with off-topic bickering over English. Note that I don't dispute the general principles related to coupling and encapsulation, but they must be weighed against other principles, such as YagNi. I'm NOT prepared to say they ALWAYS trump YagNi et al. ItDepends. I consider them ALL rules of thumb, not absolute Ten Commandments.

As far as encapsulation, we'd probably need to explore specific scenarios, since talking in generalities appears to be failing to improve communication. IF we are doing a basic set/get wrapper around a variable, it buys us nothing in "protection" over a "naked" variable. At that stage there is nothing to protect; they are pretty much interchangeable. If we wrap it, it's typically in preparation for some future change where an accessor is no longer a basic set/get. Thus, there is no up-front protection provided JUST by making a set/get versus a variable. Putting a wrapper solves nothing, improves nothing, and protects nothing up front. But it does cost in terms of code bloat up front. Thus, wrapping gives a net negative benefit, at least for the short term. If we LATER need non-basic wrappers, we can add them THEN. The decision about whether to wrap up front should depend on the probability that we will later need complex wrappers, and the amount of code impacted by the change. In other words, if we spend bloat now, we hope we get a sufficient and/or likely payoff in the future. Otherwise we are being billed for bloat without getting enough benefits from the bloat to justify all those bills. It's a lot like deciding whether insurance is worth the cost.

Wrapping a variable with get/set makes it possible to replace the variable behind get/set with some other mechanism, and/or validate what is set, and/or transform what is retrieved, and/or guarantee the variable is not being inadvertently mutated by code external to the class, all without changing any other code. In general terms, wrapping a variable with get/set encapsulates the internal mechanisms of the class such that they are decoupled from the class's users. This promotes code reuse and simplifies maintenance.
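A minimal Java sketch of those mechanisms, with invented names: the setter validates, the getter transforms, and the private backing variable could later be replaced without touching callers.

```java
// Sketch of the mechanisms described above; all names are invented for
// illustration. The setter validates, the getter transforms, and the
// backing variable stays replaceable behind the accessors.
class Temperature {
    private double celsius;                  // backing variable, replaceable later

    public void setCelsius(double c) {       // validation on set
        if (c < -273.15)
            throw new IllegalArgumentException("below absolute zero");
        celsius = c;
    }
    public double getFahrenheit() {          // transformation on get
        return celsius * 9.0 / 5.0 + 32.0;
    }
}

public class AccessorDemo {
    public static void main(String[] args) {
        Temperature t = new Temperature();
        t.setCelsius(100.0);
        System.out.println(t.getFahrenheit());  // 212.0
    }
}
```

Note that adding the validation does change the behavioural contract (callers can now see an exception), which is exactly the caveat raised later in this discussion.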

Not for basic accessors. You seem to be slipping into BrochureTalk. We need something clear and explicit. And you appear to have completely ignored my points. I do agree there are SOME potential future benefits to wrapping up front; thus, I am NOT disputing that there exists SOME value in doing so. But you ignored the real issue: do the up-front costs justify the potential future benefits? I didn't see any contrasting and comparing from you, just re-statements of the upsides, which I never disputed to begin with. Let me make it doubly clear: I do not dispute that up-front wrapping has SOME value (up-sides).

I didn't ignore your points, I dismiss them. Arguing that private class members should be publicly accessible violates one of the basic tenets of good OO programming. There is no excuse for it, certainly not that you lack FastEyes.

That's an ArgumentFromAuthority, not a real argument.

The real argument is above.

I fail to see where it is demonstrated/proved that it overrides the cost of bloat 100% of the time.

I fail to see where you make the case that getters/setters are "bloat", or that they represent a cost. Your argument appears to be a roundabout way of claiming that in source code, fewer tokens are always superior to more tokens.

All else being equal, YES. Code volume/size is a criterion we try to optimize, along with other factors. Software design is one big tradeoff balancing act (SoftwareEngineeringIsArtOfCompromise).

Indeed, we try to optimise all relevant factors, and that was the basis for my original criticism: As I wrote above, "your scenarios are focused almost entirely on 'bloat' and 'visual complexity'" and don't appear to be offering a balanced consideration of "other factors".

I disagree with your "entirely focused" characterization. My statement is basically a form of, "In certain circumstances, the cost of wrapping exceeds the benefits". That's a comparison of A to B, not a focus on just A. -t

In the absence of evidence -- such as metrics -- to demonstrate that the cost of wrapping exceeds the benefits, why should we violate what is generally considered OO programming best practice?

Where does YagNi recommend violating recognised OO programming best practices?

I interpret YagNi in a general sense, as a form of parsimony: keep it short. I don't know if anybody has laid out specific and/or canonical rules for which other principles override it. If they have, I won't apply them without clear justification.

[I'm honestly curious as to how get/set actually enables those changes. If you add validation to a setter, for example, you're changing the external interface of the class - it's either got to throw an exception or silently not set the value, which violates the behavioural interface established by the non-validating version of that setter. Starting out with getters and setters (or, in languages that support them, properties) guarantees the syntax doesn't change, but wouldn't you need to at least check the callsites to ensure they'll operate correctly with the new behaviour? -DavidMcLean?]

I never said pre-wrapping prevents all interface changes. But, thank you for helping to clarify that.

What is "pre-wrapping"?

Creating generic set/get's for all "public" attributes (variables) up front.

No one is arguing in favour of creating setters and getters up front. A setter should be created only when it is determined that one or more external classes require write access to the state of a class instance. A getter should be created only when it is determined that one or more external classes require read access to the state of a class instance. Of course, in case it's not obvious, there should be no public member variables.

Also, lack of encapsulation can create "accidents", but so can bloated code. If the accidents caused by bloat exceed those caused by lack of encapsulation, then encapsulation is not giving us a net benefit. Bloat can also slow general productivity by making more code to read and change.

What is "encapsulation for encapsulation's sake"? Isn't encapsulation for reducing coupling and increasing cohesion?

It depends. See above.

Encapsulation always reduces coupling and increases cohesion. What you appear to believe "depends" is whether it's worth the additional code of get/set/etc. or not.

Without a clear definition/metric of "coupling" and "cohesion", I cannot confirm nor deny that claim. But this is NOT the topic to define/debate coupling and cohesion, as a reminder.

I've given a clear definition of CouplingAndCohesion. They are abstract and often qualitative (though specific quantitative metrics may be defined for specific cases, but not in general), but that doesn't mean they're vague.

If there are no consensus numeric or Boolean metrics for it, or a definition clear enough to lead to that, then it's "vague" in my book. The existing proposals have too much dependency on damned English, and we know how that turns out.

Many things have no established numeric or boolean metric, and yet they're clear enough to make decisions. For example, "programming language" has no established metric, and yet millions of people use them and create them every day.

YagNi is about only implementing requirements that you need to implement. It isn't advice to write what would generally be considered bad code that violates encapsulation.

That's your opinion. Again, encapsulating before encapsulation is actually needed can indeed be interpreted as a violation of YagNi. But I don't want to make this into a "principle war" but rather explore the ACTUAL costs versus benefits with something more concrete.

That sounds like a highly nuanced and personal interpretation of YagNi. I'd be curious to see if ExtremeProgramming, or any other Agile methodology that endorses YagNi, advocates it.

As long as they don't rely on ArgumentFromAuthority, I would indeed like to see wider opinions also.

Map Perspective

I consider objects to be "glorified maps" (dictionary structures). It's acceptable to have maps without wrapping each element of a map. If we say "always wrap" all "public" elements of an object, then as soon as we add a single method to the map, we would then be obligated to wrap every element, creating an all-or-nothing DiscontinuitySpike. It's like that one method is a poison pill that suddenly triggers some grand encapsulation rule. The boundary between "map" and "object" can and should be fuzzy: I see no reason to hard-classify them (forced dichotomy). See also MergingMapsAndObjects. -t

That's also a nuanced and personal interpretation of ObjectOriented programming. OO is defined by encapsulation, inheritance, and polymorphism, not by objects being like maps.

Encapsulation is NOT a hard requirement of OOP. And NobodyAgreesOnWhatOoIs. Thus, my preferred viewpoint of OO is not less valid than others. And RobertMartin's "jump table" definition/description of OOP can be viewed the same or similar to a map view. A "jump table" is simply a map from name/key to a function (implementation and/or reference to).

I don't recognize that conversation. Anyhow, I don't see anything that contradicts "jump tables" being interpreted as maps of function/behavioral references.

Describing OO as "jump tables" appears to equate OO with the vtables used to implement virtual function calls in C++. Are you sure "jump tables" wasn't intended to be a joke?

RM appears to have an extensive C/C++ background and writes from that perspective. I myself interpret his jump-table statement to be a C-family-centric way to describe a form of maps of behavioral references. You are welcome to have a different personal interpretation of his words, but I am not obligated to subscribe to it. -t

Bob Martin has sufficient experience in OO not to write from such a peculiarly biased perspective, and normally he doesn't. I can't find any reference to "jump tables" except the above exchange. It looks more like a sarcastic quip or a joke than a definition. Not that it would matter, anyway.

The debates over OO definitions fall into established categories. Your "preferred viewpoint" appears to be unique to you, and not in any of the established categories.

Established by who? I only see differing opinions. I'm sticking by my working definition until a consensus is reached. In practice I do see maps morph into objects as more requirements are added (at least for languages that help blur the distinction).

Established by multiple acts of general consensus, which has resulted in groups of agreement. Your working definition doesn't fall into any of them. It appears to be unique to you. Do you therefore consider any program which uses maps to be object oriented?

No. Again RM's "jump table" is essentially a map of behavioral pointers. (It's an odd way to describe it, but it just may reflect too much time spent working with one language.) Thus it is NOT unique to me. (See [1] below for my working definition.)

What's a "behavioral pointer"? The "jump table" definition appears to be unique to Bob Martin, and unique to that conversation. Are you sure he wasn't making fun of something you'd written before?

One advantage of the map-based definition over many of the others is that it considers objects merely as a coding convenience (code packaging) and does NOT assume objects are intended to be used for a specific purpose, such as fitting the "real world" or wrapping "data structures" behind behavioral interfaces. It's a wider view of OO; and if one dispenses with such "purpose" views, then map-ism naturally follows. --top

I don't know of any established definition of OO that assumes objects are intended to be used for a specific purpose. Wrapping data structures behind behavioural interfaces is an obvious purpose, however -- do you really want to manipulate a graph or BTree with something other than a behavioural interface?

ItDepends on whether one wants to provide an IS-A viewpoint/interface or a HAS-A (or ACTS-AS) style of interface. The second is often more flexible and better future-proofing in that it doesn't lock one into a given viewpoint, and is one of the reasons for the popularity of RDBMS. -t

How does your response relate to my question?

I reviewed your question and confirmed I answered it properly as I understood it. If you meant something different from how I read it, I cannot tell, because I cannot read your mind, only your text. I will agree that we may indeed want to fully wrap SOME things; but that doesn't mean we should fully wrap everything behind behavioral interfaces.

How, then, would you interact with a graph or BTree with something other than a behavioural interface?

One can use behavioural interfaces on data without making the entire thing "be" that thing. (I suppose if you define a "thing" by implementation instead of interface, that may be a different matter.)

What do you mean?

One can have a tree or stack view of something without that thing being a "tree" or "stack". It's poor future-proofing of a design to lock your data behind a narrow interface anyhow. (There may be security-related reasons to do so, but for domain data, you typically don't want to box yourself in.)

The way you make the actual structure (e.g., tree or stack) look like some other structure (e.g., stack or tree) is by hiding the actual structure behind an interface. To make this work, it is typical to make the actual structure private to the class that defines the interface. For that reason, it is best to AvoidDirectAccessOfMembers, because direct access of members would expose the actual structure (e.g., tree or stack) that you're trying to make look like something else. Of course, the usual "something else" is an abstraction, so that you don't see it as (say) a tree or stack, but as a collection.

Then you are artificially forcing a "root" structure, which can create both implementation problems, and interface conversion complexity problems. If one is going to have to select a root structure, at least make it a fairly flexible one. If for example you choose a "stack" as a root structure, then you will have to do tons of pushing and popping to emulate a more random-access kind of structure. Not pleasant.

The "root" structure is normally a Collection, which defines methods to add and retrieve items without reference to any underlying structure. Indeed, it would be not pleasant to choose "stack" as a "root" structure (with push and pop methods) instead of a collection's usual add and get methods, which is why it's unlikely that anyone would do that.
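A short Java sketch of this point, using the standard Collection interface: the calling code knows nothing about the concrete structure, so ArrayList and LinkedList are interchangeable without touching the caller.

```java
import java.util.*;

// Sketch of the point above: the caller is written against the abstract
// Collection interface, so the concrete structure can be swapped
// (ArrayList vs LinkedList) without changing any calling code.
public class CollectionViewDemo {
    static int sum(Collection<Integer> items) {   // knows nothing of the structure
        int total = 0;
        for (int i : items) total += i;
        return total;
    }
    public static void main(String[] args) {
        Collection<Integer> a = new ArrayList<>(List.of(1, 2, 3));
        Collection<Integer> b = new LinkedList<>(List.of(1, 2, 3));
        System.out.println(sum(a));  // 6
        System.out.println(sum(b));  // 6: same interface, different structure
    }
}
```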

A generic "collection" is more or less a "table", which pretty much backs my original point. For flexibility, we generally want usage-specific views on top of general "data structures" rather than general views on top of usage-specific (narrow) structures.

No, we generally want generic or abstract views on top of specific or concrete data structures. That way, we can program to a generic interface, and trivially replace the concrete data structure that underpins it, as needed to meet performance and/or resource usage requirements.

That doesn't contradict my statement. I wasn't even addressing under-the-hood implementation. I was comparing interface to interface, not interface to raw guts.

Actually, that does contradict your statement that we want "usage-specific views on top of general 'data structures' rather than general views on top of usage-specific (narrow) structures". Your statement sounds like the diametric opposite of my statement.

Sorry, I am not following.

My statement is that abstract interfaces should hide concrete implementations. Your statement is that usage-specific interfaces should hide abstract constructs. Thus, they are the opposite.

"Hide" is probably not the right word. Adapt info to usage-specific needs is what I meant by "on top". I will agree that "on top" was probably a poor choice of words on my part.

A collection interface is the opposite of "adapt info to usage-specific needs". The choice of implementation is what adapts to "usage-specific needs".

No, I didn't say a general collection was for the specific-adaption purpose. There is a huge communication gap somewhere here.

Then what did you mean, if "on top" was a poor choice of words?

I already gave a re-phrasing of it.

Let me try to restate it yet again. We generally want to wrap raw data in some kind of "structure" or systematic abstract interface rather than direct access. But because that data may be used for multiple purposes, we don't want the "first layer" to be something stiff and limiting, like a stack or a queue. Something like "generic tables" is a better first-layer wrapper around the raw data. Stacks and queues tend to be usage/application-specific viewpoints we want on data, but we don't want to hard-wire our general or future access to a given data item to such narrow structures. Thus, if you want flexibility, you build the stack and queue on top of ("through") the more general structure, such as tables.
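A rough Java sketch of this layering (names invented for illustration): the general structure is a plain List, and the stack is merely one narrow view over it; other views can still reach the same data.

```java
import java.util.*;

// Sketch of a "stack view" layered over a general structure: the List is
// the flexible first layer, and the stack is just one narrow viewpoint
// onto it. All names are invented for illustration.
class StackView {
    private final List<Integer> data;            // general structure underneath
    StackView(List<Integer> data) { this.data = data; }
    void push(int v) { data.add(v); }
    int pop()        { return data.remove(data.size() - 1); }
}

public class StackViewDemo {
    public static void main(String[] args) {
        List<Integer> general = new ArrayList<>(List.of(10, 20));
        StackView stack = new StackView(general);
        stack.push(30);
        System.out.println(stack.pop());   // 30
        System.out.println(general);       // [10, 20]: other views still see the data
    }
}
```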

That's exactly what I wrote. My only objection is the notion of a 'table' being a general abstraction. It isn't as general as a collection. A collection is a structure that lets you obtain iterators over the objects in the collection, and often (but not always) lets you add objects to the collection. A table is a collection that is a set of tuples with a specific heading.

Okay, we generally agree on the concept, then. (Note an RDBMS may be the "general abstraction" in many cases, not a same-language class.) My point stands that the stack doesn't "own" the data, but merely provides a view/wrapper around it or some aspect of it.

It's perfectly normal to describe a "view object" that uses data and info from another object(s) and/or database(s). Few if any will say, "no, that's not an object because objects must entirely wrap ALL their data/state by definition". Therefore, in colloquial-land, "objects" do not require 100% wrappage. -t

I don't think anyone would say it's not an object. What they might say is that it should AvoidDirectAccessOfMembers. My database view objects, for instance, do not make member variables public and do not use private member variables to represent columns in the ResultSet. Instead, the DatabaseView class provides a getRow() method that returns an associative array from column names to column values.
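A hedged Java sketch of the DatabaseView idea as described above; the class and method names follow the description, but the implementation details (backing the rows with a list of maps) are invented for illustration.

```java
import java.util.*;

// Hypothetical reconstruction of the described DatabaseView: member
// variables stay private, and getRow() exposes a column-name ->
// column-value mapping rather than raw fields.
class DatabaseView {
    private final List<Map<String, Object>> rows = new ArrayList<>(); // private state

    void addRow(Map<String, Object> row) {
        rows.add(new LinkedHashMap<>(row));
    }
    Map<String, Object> getRow(int i) {   // associative array: column name -> value
        return Collections.unmodifiableMap(rows.get(i));
    }
}

public class DatabaseViewDemo {
    public static void main(String[] args) {
        DatabaseView v = new DatabaseView();
        v.addRow(Map.of("id", 1, "name", "Alice"));
        System.out.println(v.getRow(0).get("name"));  // Alice
    }
}
```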

I was addressing vocabulary in that statement and not intending to suggest anything about design practices there.

Again, no one would say an "object" without "100% wrappage" isn't an object.

So then we agree that objects don't require wrapping of members to be "objects"?

Sure. No one said they did. The title of this page is AvoidDirectAccessOfMembers, not DirectlyAccessingMembersTurnsObjectsIntoNonObjects.

Could you provide an example of what you mean by "maps morph into objects as more requirements are added"?

One starts with a typical record "structure" (AKA map), and various actions seem to group naturally with that "structure". The map may still be used in map-ish ways, but we now have methods that are specific to that structure. For example, it might be app configuration info that is typically only set by field technicians. We later want a "list_to_screen" method for it to display it for easier trouble-shooting by field technicians (using a "hidden" back-room UI).
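A small Java sketch of this "map morphing into an object" (all names invented, loosely following the configuration example above): the data is still accessed map-ishly, but one structure-specific method has been added.

```java
import java.util.*;

// Sketch of a config map gaining one structure-specific method; all
// names are invented for illustration. The map is deliberately left
// directly accessible, per the "glorified map" view above.
class AppConfig {
    final Map<String, String> settings = new LinkedHashMap<>(); // still used map-ishly

    String listToScreen() {               // the one added, structure-specific method
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : settings.entrySet())
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append("\n");
        return sb.toString();
    }
}

public class ConfigDemo {
    public static void main(String[] args) {
        AppConfig cfg = new AppConfig();
        cfg.settings.put("timeout", "30");   // direct map-style access
        cfg.settings.put("mode", "debug");
        System.out.print(cfg.listToScreen());
    }
}
```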

By map, do you mean a collection of named elements, aka a dictionary? I don't know what your field technician example is intended to illustrate, but it's trivially obvious that operations may be defined to manipulate a given structure. That doesn't make it object oriented, only that you've defined a structure and a set of associated operations.

That gets back to how one defines "object oriented". I don't define it by "wrap-ness" or "encapsulation" level. I'd argue that full wrapping is really creation of an AbstractDataType and not (just) object orientation. OO is wider than ADT. You appear to be conflating the two. If they are one and the same, then we should dispense with the term "object oriented".

An AbstractDataType is a mathematical model for a category of data structures. It is isomorphic to certain applications of object oriented programming, but not equivalent. Particularly, they are not the same because AbstractDataType is not defined in terms of polymorphism, inheritance, or encapsulation. Other OO definitions are either too amorphous or too individuated to consider.

Perhaps a better way to describe it is that you are talking about user-defined types, not OOP.

What do you consider to be the distinction between user-defined types and OOP?

Encapsulation is required for UDTs, but not for "objects". I'm not sure I'll agree it's the only difference at this point, but it is a key difference.

Encapsulation (as in DataHiding) is not required for objects or UDTs. For objects and UDTs, however, DataHiding is a good idea (in conjunction with encapsulation), in order to reduce coupling. That's why we AvoidDirectAccessOfMembers.

I suppose that comes down to how one defines UDT. I don't want to get into that vocab dance today. Maybe another.

That may depend on how one defines polymorphism, inheritance, or encapsulation (the "big 3" for reference). Anyhow, it's reasonable that one may wish to use one or two of those three without having to subscribe to them all. There's no reason I can see to force an artificial DiscontinuitySpike in order to match somebody's category system.

Sure, you can use one or two of polymorphism, inheritance or encapsulation, but then it's not OO.

I have to disagree with your view of what "OOP" is. It doesn't matter anyhow here, for one should design software based on the best design choices, NOT based on vocabulary. You can't make something more efficient or more parsimonious or more economical by redefining it. (Caveat: changing the definition of the goals/metrics or "economics" may affect such.)

I see no logic of the universe that forces a hard distinction between maps and OOP as far as how to use them, even IF I buy your definition. It's not in for a penny, in for a pound. Even if I accept your definition, there is a continuum between a map and a "true" object, and no clear reason to ignore that continuum, to pretend it doesn't exist, or to insist that if we are 70% fitting "true OO" we must go 100% because 70% is "bad" or 30% is "bad" and 100% is "good".

If maps are sufficient for OO, then do the static maps in C (i.e., the 'struct' construct) mean C is object oriented?

That depends on how one defines "OO"[1]. The definition is not really what matters and I don't want to get caught up in another term fight. The point is that useful code constructs can exist that cover the full gamut between and including a pure map ("is" or "used as") and a fully encapsulated object (no public "variables", only methods). What we call these things is irrelevant and shouldn't dictate how we lay out our code. It's silly to say that as soon as one introduces a single method into a map (or object used like a map), then one is suddenly obligated to wrap every key of the map or map-like thing. If I understand your argument correctly, then this all-or-nothing rule would apply under it. I find it a ludicrous and highly artificial "boundary". -t

By the way, if full encapsulation is always the "proper" way to do OO, then an "OOP language" technically shouldn't allow public variables in classes at all: only methods would be able to read and change class variables.

Because unless the language is carefully designed to avoid it, full encapsulation creates bloat, and bloat slows down reading and creates errors through bloat-induced reading mistakes. Language designers probably don't forbid public variables because they don't want the bloat-related problems that would come with doing so.

By the same argument, structured programming is bloat compared to the simplicity of GOTOs, and slows down reading and creates errors due to bloat-induced reading mistakes. Typical OO languages don't force member variables to be private purely for historical reasons. Modern OO practice does not make member variables public.

How the heck is that the same argument? For one, goto programs are not shorter.

They're simpler, by your metric. For example, they don't risk the reading mistakes that are possible from putting the initialisation, test, and increment sections of a "for" loop close together.

Sorry, I'm not following this at all. Example P-2 under PayrollExampleTwoDiscussion is CLEARLY less code and CLEARLY easier to read than P-1 (at least for normal developers, which I don't necessarily group you with. You seem to have outlier style preferences). I don't see anything equivalent in your GOTO analogy attempt. The "reading mistakes" weighing of GOTO's versus blocks is much more nuanced, as is code volume differences. -t

Example P-2 under PayrollExampleTwoDiscussion is rather unusual. In production, the Employee class is the base for several concrete implementations which obtain the relevant values through a variety of mechanisms that depend on whether the code is deployed in an MS-Access/VBA payroll application, a C++ application, or a test harness, none of which merely store the various values in simple variables. Hence, getters are not only appropriate, they're the only appropriate mechanism. Example P-2 looks unusually verbose because the base class is being inherited with the various getters overridden to return a literal, purely to support the stripped-down illustration of real business code that is PayrollExampleTwo. Employee instances aren't defined like that in production, only in PayrollExampleTwo.

You seem to be bringing up multiple issues, none of which are clear to me as written. Anyhow, I'm not critiquing the general application design above, but merely illustrating parsimony and grokkability differences between the two "styles".

Fair enough, but note that the "style" you're criticising in Example P-2 is, or should be, very rare.

Incidentally, "behavior-oriented programming" or "verb-oriented programming" or "interface-oriented programming" may be better way to describe what you have in mind. Your ADT-like view of OOP came after OOP.

None of those are established terms. "ObjectOriented" is the recognised term.

Also recognized to be a mess as far as terminology goes. Anyhow, you still haven't addressed the question of whether something can be in an in-between state of a map and an object. You still seem to be encouraging a forced and/or artificial dichotomy. -t

I didn't know there was an open question about "whether something can be in an in-between state of a map and an object". I'm not sure why it would matter. Whilst "object" is frequently used to refer to any identifiable language construct, particularly one that defines something to hold data (like a struct, class, table, variable, whatever) as opposed to (say) a control structure like a 'for' loop (which is not normally called an "object"), the loose and general use of "object" is quite distinct from the usual meaning of ObjectOriented. I see no evidence that the industry or academia generally considers Map (as in a kind of container) and Object (as in ObjectOriented) to be equivalent in any defining sense.

I don't want to get caught up in classification slots here; it's likely a wasteful LaynesLaw dance. My point is that there can be a wide range of "structures" between those that are treated/used like a typical map, and those treated/used like a typical "object". I give an example above (config info) of something that starts out like a map, but a method or two is later added on. Whether it's called/labelled/classified as a "map", "object", or a "frippokof" doesn't matter. The point is that "in between" things exist with behaviors/conventions/designs/usage-patterns that straddle both the "map" and "object" world. Your "rule" seems to reject this in-between state, and/or its rule(s) for when "object-ness" kicks in are ill-defined. I'm looking for something clear like, "If it has more than 3 methods, then The Rule kicks in: all public attributes should now be wrapped", or the like (along with the rationale of the rule and its trigger point of 3, of course). -t

Classes and prototypes -- i.e., constructs which serve as a template for instances -- should not publicly expose member variables. In C++ and C#, 'struct' is effectively an alias for 'class', so the same "rule" applies. Other non-class constructs that may be evocative of classes -- like Python's tuples, or various Map or Map-like collections (apparently) -- are not classes or prototypes, nor are they a template for instances, so the "rule" does not apply.

Okay, that's clear enough for my satisfaction. Thank you. However, I won't deviate from my recommendation above that wrapping only be done if there are likely to be a sufficient quantity of instances/clones. -t

Why would the quantity of instances make any difference? Does it make a difference whether 'new Blah()' gets called once or a thousand times? Do you perhaps mean the quantity of references to an instance?

Whether they are subclasses, clones, or instances probably depends on the language used and/or programming style, since dynamic languages may blur the distinction between instances and sub-classing. I'm not sure of a compact way to word it that makes sense in all languages and coding styles. The cost-of-change to go back and wrap dependent usages (when the need arises) is generally higher the more "coded" references there are. I generally wouldn't count quantities in "automated" references, such as a loop that allocates 500 references/clones/instances. I'd only count that once (unless something really unusual is being done). Thanks for bringing up that wording point, though. The main factor that matters here is the cost-of-change, which we are weighing against the cost of bloat. Again, I approach it similar to "investment math" where we are weighing trade-offs based on our best prediction of future events. Without having a working time-machine, that's the best we can do.

Would it be correct to say that it's ok to allow direct access of members if the number of dependencies on a given member is low, and not ok if the number of dependencies on a given member is high?

"Dependencies" is too open-ended. I look at probability and cost first, not "dependencies". If a given "dependency" is unlikely to cause a problem, then it should be given less attention/weight than a factor that is likely and/or costly.

I would have described it the other way around. Probability is inherently unknown, and cost is often unpredictable, but dependency is straightforward. You have a dependency between A and B if changing A affects B. In most imperative programming languages, dependencies are defined by relationships between identifiers. Given some definition or declaration z assigned an identifier 'p', every reference to 'p' represents a dependency upon z. Improving coupling means reducing references to 'p'. Improving cohesion means grouping references to 'p' together. This means that if z changes, the impact is minimised. The question, if any, is: assuming 'p' is a member variable, how many references to 'p' do there have to be, and/or how ungrouped do they have to be, before you hide 'p' behind a wrapper?

I have to disagree. Probabilities, cost-of-change, and cost of reading bloat can be roughly estimated. Focusing only on easy-to-measure factors is the SovietShoeFactoryPrinciple. I'll stick with SimulationOfTheFuture as the most rational way to make design decisions, which generally follows investment theories. Focusing on the existing code alone is too narrow a viewpoint. -t

I don't know enough about that domain to make a confident estimate of change patterns. The kind of changes to the formulas may play a role in the calculations also. I've never worked directly on a payroll app. I can tentatively agree that without sufficient estimates of change patterns, wrapping may be the better default. But if you don't have enough knowledge of the domain, you should probably talk to somebody who does before making that coding decision, and/or study past formulas & changes. AND past coding mistakes. It's quite possible they were caused by BloatInducedReadingConfusion.

Changes occur every six months and can occur anywhere, but the numeric literals change the most frequently, the switch statements change next, then the structure of the formulae (including adding or removing factors), then provinces/territories are added. The last one happened once in the ten years that I maintained the real code upon which PayrollExampleTwo was based. Because changes can occur anywhere to anything, what made the most sense was to design with a focus on CouplingAndCohesion. Thus, on average, any change had the least impact, rather than trying to optimise for specific changes as implied by SimulationOfTheFuture.

Re: "the switch statements change next [2nd in frequency]" is a bit open-ended. How they change can matter a lot.

They can change in all the ways that switch statements can change: more cases or fewer, the case literals can change, and the code in them can change.

Let's say we classify the patterns of all possible changes into 20 different patterns. If, say, change-pattern 7 is 10 times more likely than pattern 15, that could very well affect the final decision of which code design technique is ranked as the most change-friendly. I don't have those specific probability values here to analyze and process.

Such specific probabilistic "change patterns" don't exist in Canadian payroll. Numeric literals are roughly twice as likely to change as switch statements, switch statements are slightly more likely to change than formulae, and provinces/territories will be added about once every 100 years unless they repartition the country for payroll purposes into income tax regions.

They do exist, per actual historic change events. You just don't know them because you forgot or nobody bothered to track them. But anyhow, how does your coding suggestion improve the changing of "Numeric literals", the most common change pattern, according to you?

Why do you assume I "forgot" or "nobody bothered to track them"? I developed Canadian payroll software for a decade, so I know how the payroll specification changes, and it's not something for which you can (a) identify a collection of "change-patterns" beyond those that I've given; or (b) assign numbers like "10 times more likely" other than the numbers I've given such as "roughly twice as likely".

Coding with a focus on CouplingAndCohesion has no impact on changing numeric literals. It has a big effect on changing the switch/case statements (they're cohesive), the formulae within a province/territory (province/territory classes are cohesive but not coupled to each other), changing provincial/territorial factors independent of federal factors (each is cohesive, coupling is via inheritance through a minimal set of functions), etc.

[1] In my book, it would only be OO if it facilitates defining and putting behavior (functions or references to functions) in the 'struct' nodes (per the C example). And OO-ness can perhaps be considered on a continuum rather than as discrete (is or isn't OO). If C makes it possible but difficult to put and use behavior in struct nodes, then it may be considered "weak OO". "Orientation" generally means "leaning toward". Thus, "object oriented" generally means "leaning toward objects". This can be read as "facilitates object-ish things and usage" or "makes it easier to do object-ish stuff". -t

So you define OO by "has dotted dispatch?"

I don't recognize the phrase "dotted dispatch".

E.g.:

instance.method()

Note the dot between the instance and the method name. That's dotted dispatch.

That type of syntax is one way to simplify adding and using behavior in maps, but I'm not prepared to say it's the only way, at this point. I'm not smart enough to list out all possible syntactic ways to facilitate the above concepts. -t

Note also that 'class c { void x(); };' is merely a syntactic shorthand for 'class c {}; void x(c self);'. In terms of TypeSafety, etc., they are equivalent, but the latter requires that 'class c' have publicly-accessible members, whilst in the former, method 'x' can access private members of 'class c'. In short, the reason for "putting behavior ... in the 'struct' nodes" instead of outside the 'struct' nodes is specifically to AvoidDirectAccessOfMembers.