Florian octo Forster's Homepage

Object-oriented programming in C

Introduction

"Object-oriented programming" (OOP) is a
programming paradigm in which data is bundled with functionality, such that
the structure of the data and the implementation of the functionality on that
data are "encapsulated", i.e. hidden from other parts of the program
which don't need to know how something is done. Other aspects
of OOP are "polymorphism" and "inheritance", both of which
are used excessively among OO-programmers, often without much justification.
But I take it you're not reading this to learn what OOP is,
but how this idea can be used in C and when it makes sense to do so.

Storage data structures such as AVL-trees are a common and good example of
where OOP is great: They are complex enough that you don't want
to rewrite them from scratch every time you need to store some data, yet their
functionality is strictly limited. To implement AVL-trees you need some data
structures, primarily a tree, of course. Each "node" has a pointer to
its "parent", there's possibly some kind of comparator and there's a
pointer to user data in each node or only in the leaf nodes. Also, there are
functions to do a left rotation, a right rotation and combinations of the two.
All of this is absolutely uninteresting for the code using the tree. When using
a tree you want to put stuff in and you want to get it back again, and that's
all there is to it.

Method calls

It's important to note at this point that OOP is a programming
paradigm, not a language feature. C++ has some special syntax
which is supposed to help with writing OO-code, but that doesn't mean that you
need these features to write OO-code. The most important
OOP-feature is probably that you can call "methods" (functions that
are associated with the data) so that these methods can access the data they
are supposed to work on. This is often done through a so-called "self
pointer". C doesn't have this feature, so functions cannot
automatically access the data. As a consequence you cannot write something
like:

my_object->my_method (args);

Instead you will have to use:

my_method (my_object, args);

Sure, the first syntax has some appeal, but the second one does
basically the same thing, just without an automatic self pointer.
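
As a minimal sketch of that second form, here is a tiny "class" whose methods take the object as an explicit first argument. The counter type and all function names are made up for illustration; they are not part of any library.

```c
/* Hypothetical example type, used only to show the calling convention. */
struct counter_s
{
  int value;
};

/* "Method": the object is passed explicitly as the first argument. */
void counter_add (struct counter_s *self, int amount)
{
  self->value += amount;
}

/* Read-only "method": the self pointer is const. */
int counter_get (const struct counter_s *self)
{
  return self->value;
}

/* Usage: what C++ would spell c.add (5) becomes counter_add (&c, 5). */
int counter_demo (void)
{
  struct counter_s c = { 0 };

  counter_add (&c, 5);
  counter_add (&c, 37);
  return counter_get (&c);
}
```

Note that this struct is still fully visible to the caller, so nothing stops anyone from touching c.value directly.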

Opaque data types

If you give out a struct which is then used as your object (i.e.
handed to every "method") you won't have to wait long until someone
messes with the data instead of calling your methods. It's astonishing how
creative people will become when telling you why they did this, but they will
do it, trust me! The only way to prevent this is to not tell them how an object
is organized. Fortunately, C has the concept of opaque data
types, which accomplishes exactly that. It is possible to
declare a structure, but not define it. Other
code can then use pointers to such a data type, but it can neither dereference
them nor instantiate such a struct. To come back to the AVL-tree example, the
data type could be declared (but not defined) as follows:

struct my_avl_tree_s;
typedef struct my_avl_tree_s my_avl_tree_t;
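
To make the difference between declaring and defining a bit more concrete, the following sketch shows what calling code can and cannot do with such a type. The helper tree_pointer_is_set and the member name root are assumptions made up for this illustration.

```c
#include <stddef.h>

/* Declaration only: the compiler learns that the type exists, not its layout. */
struct my_avl_tree_s;
typedef struct my_avl_tree_s my_avl_tree_t;

/* Allowed: storing, comparing and passing around pointers to the type. */
int tree_pointer_is_set (const my_avl_tree_t *tree)
{
  return tree != NULL;
}

/* Not allowed while the type is incomplete; each of these lines would be
 * rejected by the compiler:
 *
 *   my_avl_tree_t tree;                   (size unknown, cannot instantiate)
 *   size_t s = sizeof (my_avl_tree_t);    (size unknown)
 *   tree_ptr->root = NULL;                (members unknown, cannot dereference)
 */
```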

Declaring and defining methods

To use an AVL-tree you'd need to get a pointer to such a structure from
somewhere and pass it to all methods that operate on the tree. Since you can
neither allocate nor instantiate such a structure yourself, this must be done
by the code implementing the object, too. In OOP-talk this is called a
constructor:

my_avl_tree_t *my_avl_create (void);
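
Behind that declaration, the implementation file would define the structure and allocate it. The following is only a sketch under assumptions: the node type, the field names and the use of calloc are made up from the description above and are not a real AVL implementation.

```c
#include <stdlib.h>

/* Public part (header): declarations only. */
struct my_avl_tree_s;
typedef struct my_avl_tree_s my_avl_tree_t;

my_avl_tree_t *my_avl_create (void);

/* Private part (implementation file): the actual layout.
 * All field names below are assumptions for illustration. */
struct my_avl_node_s
{
  struct my_avl_node_s *parent;
  struct my_avl_node_s *left;
  struct my_avl_node_s *right;
  void *user_data;
};

struct my_avl_tree_s
{
  struct my_avl_node_s *root;
  size_t size;
};

/* Constructor: allocates and initializes an empty tree.
 * Returns NULL if the allocation fails. */
my_avl_tree_t *my_avl_create (void)
{
  my_avl_tree_t *tree = calloc (1, sizeof (*tree));
  if (tree == NULL)
    return NULL;

  /* Set the fields explicitly instead of relying on zeroed memory. */
  tree->root = NULL;
  tree->size = 0;
  return tree;
}
```

Only the implementation file knows sizeof (*tree), which is exactly why the allocation has to happen there.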

The internal structure is unknown and may need some cleaning up
(e.g. closing file handles, freeing more memory) when the object is no
longer needed. Thus, there needs to be a function to do that for you, called
the destructor:
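
A destructor declaration in the style of the constructor above, together with one possible implementation, might look like the following sketch. The name my_avl_destroy, the node layout and the decision not to free user data are all assumptions for illustration.

```c
#include <stdlib.h>

/* Hypothetical private layout; field names are assumptions. */
struct my_avl_node_s
{
  struct my_avl_node_s *left;
  struct my_avl_node_s *right;
  void *user_data;
};

struct my_avl_tree_s
{
  struct my_avl_node_s *root;
};
typedef struct my_avl_tree_s my_avl_tree_t;

/* Destructor declaration, as it would appear in the header. */
void my_avl_destroy (my_avl_tree_t *tree);

/* Frees a subtree, children first. The user data is not freed here,
 * on the assumption that it still belongs to the caller. */
static void my_avl_free_nodes (struct my_avl_node_s *node)
{
  if (node == NULL)
    return;
  my_avl_free_nodes (node->left);
  my_avl_free_nodes (node->right);
  free (node);
}

/* Releases all nodes and the tree itself. Passing NULL is a no-op. */
void my_avl_destroy (my_avl_tree_t *tree)
{
  if (tree == NULL)
    return;
  my_avl_free_nodes (tree->root);
  free (tree);
}
```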

Closing thoughts

Here are the advantages of this method as I see them:

The "user" (in this case: The author of other parts of the
program or the programmer of a program using your library) is
forced to use your interface. If you make internal changes
(and you do them right ;), other parts of the program are not affected.

A modular layout of functionality is enforced, improving the overall
structure of a program.

Last but not least: Splitting up the declaration and definition of a
structure is not much work. Neither is passing the object as the first
argument.

And some final thoughts on OOP in general and OOP in C in particular:

OOP is a nice concept when used in the right dose. I've seen cases where
there was an abstract class, a class which inherited from it, another class
that inherited from that one and another one (using multiple
inheritance) and then there was a synchronized version of that last class,
too. And then that very last class was instantiated exactly once. WTF?

I've seen many cases where inheritance was used and in almost no case did it
improve the program (from a code-wise point of view). In fact, right now I
cannot recall a positive example of the usage of inheritance. There's a
reason why OOP tutorials talk about dogs and cars and stuff, but not about
actual real-world code examples, if you ask me.

In many cases it's best to have a new object which uses another object
internally. For example you might want to write a cache which stores data and
purges old data automatically. This cache will need some sort of storage
internally. The die-hard OOP way would be to inherit from such a storage
class, i.e. some sort of tree, list, hash table or similar.

If you reorganize the inner workings of your cache (e.g. you just
implemented the AVL-trees and you want to use them now instead of your simple
linked list implementation) you need to inherit from another class, breaking
a lot of code (as a consequence of Murphy's Law). Thus in OOP libraries
you'll probably find AVLCache, LLCache (linked list cache), HTCache (hash
table cache) and so on. So much for "abstraction".

Operator overloading is made by the devil.

Polymorphism makes writing bad code easy and reading good code hard.
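
The cache example from the thoughts above can be sketched in C using the opaque-type approach. This is only an illustration under assumptions: all names (my_cache_t and friends), the integer keys and values, and the "purge the oldest entry once a maximum is reached" policy are made up. The point is that the linked list is a private detail, so swapping it for a tree later would not change the four public functions.

```c
#include <stdlib.h>

/* Public interface: this is all a caller gets to see. */
struct my_cache_s;
typedef struct my_cache_s my_cache_t;

my_cache_t *my_cache_create (size_t max_entries);
void my_cache_destroy (my_cache_t *cache);
int my_cache_put (my_cache_t *cache, int key, int value);
int my_cache_get (const my_cache_t *cache, int key, int *ret_value);

/* Private implementation: a singly linked list, newest entry first. */
struct my_cache_entry_s
{
  int key;
  int value;
  struct my_cache_entry_s *next;
};

struct my_cache_s
{
  struct my_cache_entry_s *head;
  size_t num_entries;
  size_t max_entries;
};

my_cache_t *my_cache_create (size_t max_entries)
{
  my_cache_t *cache = malloc (sizeof (*cache));
  if (cache == NULL)
    return NULL;
  cache->head = NULL;
  cache->num_entries = 0;
  cache->max_entries = max_entries;
  return cache;
}

void my_cache_destroy (my_cache_t *cache)
{
  if (cache == NULL)
    return;
  while (cache->head != NULL)
  {
    struct my_cache_entry_s *next = cache->head->next;
    free (cache->head);
    cache->head = next;
  }
  free (cache);
}

/* Removes the oldest entry, i.e. the tail of the list. */
static void my_cache_purge_oldest (my_cache_t *cache)
{
  struct my_cache_entry_s **link = &cache->head;
  if (*link == NULL)
    return;
  while ((*link)->next != NULL)
    link = &(*link)->next;
  free (*link);
  *link = NULL;
  cache->num_entries--;
}

/* Stores a value, purging the oldest entry if the cache is full.
 * Returns 0 on success and -1 on failure. */
int my_cache_put (my_cache_t *cache, int key, int value)
{
  struct my_cache_entry_s *entry;

  if (cache->max_entries == 0)
    return -1;
  entry = malloc (sizeof (*entry));
  if (entry == NULL)
    return -1;
  if (cache->num_entries >= cache->max_entries)
    my_cache_purge_oldest (cache);
  entry->key = key;
  entry->value = value;
  entry->next = cache->head;
  cache->head = entry;
  cache->num_entries++;
  return 0;
}

/* Looks up the newest value for a key. Returns 0 if found, -1 otherwise. */
int my_cache_get (const my_cache_t *cache, int key, int *ret_value)
{
  const struct my_cache_entry_s *entry;

  for (entry = cache->head; entry != NULL; entry = entry->next)
    if (entry->key == key)
    {
      *ret_value = entry->value;
      return 0;
    }
  return -1;
}
```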

Thanks for your attention. If you have comments, questions or have found a
typo, please let me know.