OpenOffice.org Modularization

The question often arises why OpenOffice.org is not more modular. Modularization has several aspects:

Views on Modularization

User View

From the user's point of view there are several modules:

Word Processor

Spreadsheet Module

Impress Module

Database Module

Core Modules

Desktop Integration

Filter Modules

and many more.

Architectural View

The Architectural Overview shows a heavily simplified picture. In reality there are up to 20 layers below an application module. All code that can be shared among the application modules has been moved to core modules. Examples of such layers are

System Abstraction Layer

Infrastructure Layer

Framework Layer

These layers can themselves contain further layers, such as helper and wrapper layers.

Source Code

The source code itself is grouped into more than 150 modules, each of which can be built in one pass.

Each compilation unit and each C++ class can also be viewed as a module.

Libraries

Libraries are often built on a per-module basis (one library per source code/CVS module); sometimes several libraries are built in one CVS module. Attention: in many places the runtime dependencies of the library modules differ from the build sequence of the CVS modules.

Goals of Modularization

In the list of problems below, some terms recur often. Turned around, these terms point to the goals of modularization (the items are not necessarily distinct):

General Problems

The number of modules and their dependencies is impossible for developers to keep track of. This leads to

maintainability problems

code complexity

long build times

Architectural/source code modules are not clearly separated from each other by interfaces. Instead, clients depend on implementation details of the module. Often it is not clearly defined which parts are the interface and which were meant as implementation only. Consequences:

complexity - as it is unclear which functions of a module are for public use and which are not

avoidable build time - as dependent source code needs to be rebuilt when implementation details of a module are changed

maintainability problems - because when changing implementation details of a module, it is unclear which client code will be influenced and how.
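
The separation described above can be sketched in C++ as follows. This is a minimal illustration, not code from OpenOffice.org: the names ITextFilter, UpperCaseFilter and createUpperCaseFilter are hypothetical. Clients see only the abstract interface and a factory function, so the implementation class can change without its clients having to be rebuilt or reviewed.

```cpp
#include <cctype>
#include <memory>
#include <string>

// Public interface: the only part of the (hypothetical) module that
// clients are allowed to depend on. It would live in an exported header.
class ITextFilter {
public:
    virtual ~ITextFilter() = default;
    virtual std::string apply(const std::string& input) const = 0;
};

// Factory function: clients obtain an implementation without ever
// naming the implementing class.
std::unique_ptr<ITextFilter> createUpperCaseFilter();

// Everything below would live in the module's private source file.
// The anonymous namespace keeps the class invisible to other units.
namespace {

class UpperCaseFilter final : public ITextFilter {
public:
    std::string apply(const std::string& input) const override {
        std::string result = input;
        for (char& c : result)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return result;
    }
};

} // namespace

std::unique_ptr<ITextFilter> createUpperCaseFilter() {
    return std::unique_ptr<ITextFilter>(new UpperCaseFilter);
}
```

A client would call createUpperCaseFilter() and use the result only through ITextFilter. A change inside UpperCaseFilter then affects a single compilation unit, which addresses the complexity, build time and maintainability points listed above.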

There is a lot of duplicated code that performs the same tasks in parallel use cases. This leads to problems with

performance - because of increased code size in memory

maintainability - because changing one feature often requires changes in many different locations in the code; sometimes nobody even knows where all those locations are

Large circular dependencies among compilation units. There are several cases where one hundred or even a few hundred compilation units are circularly dependent on each other. This means:

testing problems - none of them can be tested in isolation, unit tests are nearly impossible

maintainability problems - many regressions, because a change in one of those files may cause side effects in a few hundred others
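
Groups of circularly dependent compilation units such as those described above can be located mechanically. The following sketch assumes that the dependencies have already been extracted into a map from each unit to the units it depends on; this input format and the names DependencyGraph and findCircularGroups are hypothetical. It uses Tarjan's algorithm for strongly connected components and reports every group of two or more units that depend on each other.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical input: for each compilation unit, the units it depends on.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Returns every group of two or more mutually dependent units
// (the strongly connected components, found with Tarjan's algorithm).
// A unit that depends only on itself is not reported.
std::vector<std::vector<std::string>> findCircularGroups(const DependencyGraph& graph) {
    struct State { int index = -1; int lowlink = -1; bool onStack = false; };
    std::map<std::string, State> state;   // std::map keeps references stable
    std::vector<std::string> stack;
    std::vector<std::vector<std::string>> groups;
    int nextIndex = 0;

    std::function<void(const std::string&)> visit = [&](const std::string& unit) {
        State& s = state[unit];
        s.index = s.lowlink = nextIndex++;
        s.onStack = true;
        stack.push_back(unit);

        auto it = graph.find(unit);
        if (it != graph.end()) {
            for (const std::string& dep : it->second) {
                State& d = state[dep];
                if (d.index == -1) {            // not visited yet
                    visit(dep);
                    s.lowlink = std::min(s.lowlink, d.lowlink);
                } else if (d.onStack) {         // back edge: part of a cycle
                    s.lowlink = std::min(s.lowlink, d.index);
                }
            }
        }

        if (s.lowlink == s.index) {             // unit is the root of a component
            std::vector<std::string> group;
            std::string member;
            do {
                member = stack.back();
                stack.pop_back();
                state[member].onStack = false;
                group.push_back(member);
            } while (member != unit);
            if (group.size() > 1) {
                std::sort(group.begin(), group.end());
                groups.push_back(group);
            }
        }
    };

    for (const auto& entry : graph)
        if (state[entry.first].index == -1)
            visit(entry.first);
    return groups;
}
```

The size of each reported group shows how many units have to be built and tested together. Because the traversal is recursive, very deep dependency chains might call for an iterative variant.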

Specific Suggestions for Improvement

List suggestions here if they already relate to concrete locations in the code or to specific architectural concepts.