This gives me a tremendous amount of flexibility, because the container object can essentially store anything. So if new requirements crop up, I'll just add it as another attribute to the DataObj object (which I pass around in my code).

However, recently it has been impressed upon me (by FP programmers) that this is an awful practice, because it makes it very hard to read the code. One has to go through all the code to figure out what attributes DataObj actually has.

Question: How can I rewrite this for greater maintainability without sacrificing flexibility?

Are there any ideas from functional programming that I can adopt?

I'm looking for best-practices out there.

Note: one idea is to pre-initialize the class with all the attributes that one expects to encounter, e.g.

Your data structures are so mutable that you seem to be concerned with their maintainability. In your copious free time™, try reading this article about immutable data models. It might completely change the way you reason about data.
– 9000, Aug 14 '12 at 2:59

@9000 An article like that re-convinces the already convinced. To me it seemed more like a how-to than a why (The list of whys isn't really convincing unless you feel like you have those specific needs). To me it doesn't convince someone updating invoices in VB that having to constantly make new copies of their invoice object makes sense (add payment, new invoice object; add part, new invoice object).
– Paul, Aug 14 '12 at 13:13

3 Answers

How can I rewrite this for greater maintainability without sacrificing flexibility?

You don't. The flexibility is precisely what causes the problem. If any code anywhere may change what attributes an object has, maintainability is already in pieces. Ideally, every class has a set of attributes that's set in stone after __init__ and the same for every instance. Not always possible or sensible, but it should be the case whenever you don't have really good reasons for avoiding it.
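One way to get a set of attributes that is fixed after __init__ is `__slots__`. This is only a sketch: the class name `DataRecord` and the attributes `raw` and `result` are made up for illustration and do not come from the question's code.

```python
class DataRecord:
    """Hypothetical container; the attribute names are illustrative only."""

    # Only these attributes may ever exist on an instance.
    __slots__ = ("raw", "result")

    def __init__(self, raw, result=None):
        self.raw = raw
        self.result = result


record = DataRecord(raw=[1, 2, 3])
record.result = sum(record.raw)  # fine: `result` is declared

try:
    record.reslut = 0  # typo: `reslut` is not declared
except AttributeError as exc:
    typo_error = exc  # the mistake surfaces immediately
```

With a plain class the misspelled assignment would silently create a new attribute; here it fails at the line that contains the typo.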

one idea is to pre-initialize the class with all the attributes that one expects to encounter

That's not a good idea. Sure, then the attribute is there, but may have a bogus value, or even a valid one that covers up for code not assigning the value (or a misspelled one). AttributeError is scary, but getting wrong results is worse. Default values in general are fine, but to choose a sensible default (and decide what is required) you need to know what the object is used for.
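To illustrate how a pre-set default can cover up a misspelled assignment, here is a small sketch. The names `PreInitialized`, `Bare`, `buggy_fill` and `total` are invented for the example.

```python
class PreInitialized:
    """Hypothetical container with a placeholder default."""

    def __init__(self):
        self.total = 0


class Bare:
    """Hypothetical container with no pre-set attributes."""


def buggy_fill(obj, values):
    # Bug: misspelled attribute name, so `total` is never assigned.
    obj.totl = sum(values)


pre = PreInitialized()
buggy_fill(pre, [1, 2, 3])
silent_result = pre.total  # still the placeholder, and no error is raised

bare = Bare()
buggy_fill(bare, [1, 2, 3])
try:
    bare.total
except AttributeError:
    failed_loudly = True  # the bug is visible at the point of use
else:
    failed_loudly = False
```

In the first case the program carries on with a wrong value; in the second it stops where the missing attribute is read.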

What if I don't know what my attributes are a priori?

Then you're screwed in any case and should use a dict or list instead of hardcoding attribute names. But I take it you meant "... at the time I write the container class". Then the answer is: "You can edit files in lockstep, duh." Need a new attribute? Add a frigging attribute to the container class. There's more code using that class and it doesn't need that attribute? Consider splitting things up in two separate classes (use mixins to stay DRY), or make it optional if it makes sense.
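A rough sketch of the "two separate classes plus a mixin" option, assuming hypothetical names (`DescribeMixin`, `RawData`, `ScaledData`, `scale`) that are not from the question:

```python
class DescribeMixin:
    """Shared behaviour, written once and reused by both containers."""

    def describe(self):
        # List the instance's attributes in a stable (sorted) order.
        return ", ".join(f"{k}={v!r}" for k, v in sorted(vars(self).items()))


class RawData(DescribeMixin):
    """Used by the code that does not need `scale`."""

    def __init__(self, values):
        self.values = values


class ScaledData(DescribeMixin):
    """Used by the code that does need the extra attribute."""

    def __init__(self, values, scale):
        self.values = values
        self.scale = scale


raw_description = RawData([1, 2]).describe()
scaled_description = ScaledData([1, 2], 10).describe()
```

Each class still declares everything it has in its own __init__, so a reader only needs to look in one place.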

If you're afraid of writing repetitive container classes: Apply metaprogramming judiciously, or use collections.namedtuple if you don't need to mutate the members after creation (your FP buddies would be pleased).
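A minimal namedtuple sketch; the type name `Measurement` and its fields are placeholders chosen for the example:

```python
from collections import namedtuple

# Hypothetical field names; the full attribute list lives in one place.
Measurement = namedtuple("Measurement", ["sensor", "value"])

m = Measurement(sensor="a1", value=3.5)

# "Changing" a field means building a new tuple; `m` itself is untouched.
updated = m._replace(value=4.0)

try:
    m.value = 9.9  # fields cannot be reassigned
except AttributeError:
    is_immutable = True
else:
    is_immutable = False
```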

I would likely use the second approach, possibly using None to indicate invalid data. It is true that it is difficult to read/maintain if you add attributes later. However, more information on the purpose of this class/object would give insight as to why the first idea is a bad design: where would you ever have a completely empty class with no methods or default data? Why wouldn't you know what attributes the class has?

It's possible that processData might be better as a method (process_data to follow Python naming conventions), since it acts upon the class. Given the example, it looks like it might be better as a data structure (where a dict may suffice).
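As a sketch of the method variant: the question's actual DataObj and processData are not reproduced here, so the attributes `values` and `total` and the summing logic below are assumptions made purely for illustration.

```python
class DataObj:
    """Assumed shape; the real class from the question may differ."""

    def __init__(self, values):
        self.values = list(values)
        self.total = None  # None marks "not computed yet"

    def process_data(self):
        # Placeholder processing step: the real logic is unknown.
        self.total = sum(self.values)
        return self.total


obj = DataObj([1, 2, 3])
before = obj.total
after = obj.process_data()
```

All attributes are declared in __init__, and the code that fills them in lives next to that declaration.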

Given a real example, you might consider taking the question to CodeReview, where they could help to refactor the code.