Re: Code Review requested: Postscript Interpreter

On Monday, December 20, 2010 9:43:13 PM UTC-5, Ian Collins wrote:
> On 12/21/10 02:47 PM, luser- -droog wrote:
> > On Dec 20, 6:08 pm, (Ben Pfaff) wrote:
> > So for my own purposes, I'm quite pleased with the small file
> > sizes. To me it suggests that the code is concise. Perhaps Strunk
> > and White isn't the best style guide for coding.
>
> Small file sizes are good - it's easier to read multiple files side by
> side than to be at several places in one file, and if you ever use a
> parallel or distributed build system, things go faster.

Okay, sure. So now I will count angels on pinheads, but there isn't much difference between working in multiple files and working with multiple frames viewing the same file at different points. I find that being able to make interfaces (read: header files) as small and simple as possible is an advantage of bigger files, esp. since C doesn't allow anything like a module hierarchy a la Ada. For example, as others have said, putting most of an interpreter (parser, simulator, etc.; take your pick) in one file requires exposing only the main interface. Interfaces among subsystems remain invisible. This avoids the complexity of headers and their dependencies, yada, yada. For example, a year or so ago I broke a single 10,000-line module down into pieces of ~600 lines. It took 2 or 3 hours to get the headers right, generate make dependencies, etc. and get everything through the regression tests. The build-time advantage was tiny, and negative when clean-building. On the whole, I wish I'd stayed with the original setup.
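
A minimal sketch of the "expose only the main interface" point, assuming a made-up module called calc.c (all names here are hypothetical, not taken from the interpreter under review). The helpers are declared static, so they have internal linkage and never need to appear in a header; only calc_sum() would be declared in calc.h.

```c
/* calc.c: hypothetical single-file module (illustration only). */
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Internal helper: skip leading whitespace. `static` keeps it file-local. */
static const char *skip_spaces(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Internal helper: read one integer and advance *pp past it.
   Returns 1 if a number was read, 0 otherwise. */
static int parse_int(const char **pp, long *out)
{
    char *end;
    const char *p = skip_spaces(*pp);
    long v = strtol(p, &end, 10);

    if (end == p)
        return 0;           /* no digits at this position */
    *out = v;
    *pp = end;
    return 1;
}

/* The only function a hypothetical calc.h would declare:
   sums whitespace-separated integers, stopping at the first non-number. */
long calc_sum(const char *src)
{
    long total = 0, v;

    assert(src != NULL);
    while (parse_int(&src, &v))
        total += v;
    return total;
}
```

If this were later split into scanner.c and parser.c, skip_spaces() and parse_int() would have to lose their static qualifier and gain declarations in some shared header, which is the extra header work described above.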

On Dec 20, 11:06 pm, Gene <> wrote:
> On Monday, December 20, 2010 9:43:13 PM UTC-5, Ian Collins wrote:
> > On 12/21/10 02:47 PM, luser- -droog wrote:
> > > On Dec 20, 6:08 pm, (Ben Pfaff) wrote:
> > > So for my own purposes, I'm quite pleased with the small file
> > > sizes. To me it suggests that the code is concise. Perhaps Strunk
> > > and White isn't the best style guide for coding.
>
> > Small file sizes are good - it's easier to read multiple files side by
> > side than to be at several places in one file, and if you ever use a
> > parallel or distributed build system, things go faster.
>
> Okay, sure. So now I will count angels on pinheads, but there isn't much difference between working in multiple files and working with multiple frames viewing the same file at different points. I find that being able to make interfaces (read: header files) as small and simple as possible is an advantage of bigger files, esp. since C doesn't allow anything like a module hierarchy a la Ada. For example, as others have said, putting most of an interpreter (parser, simulator, etc.; take your pick) in one file requires exposing only the main interface. Interfaces among subsystems remain invisible. This avoids the complexity of headers and their dependencies, yada, yada. For example, a year or so ago I broke a single 10,000-line module down into pieces of ~600 lines. It took 2 or 3 hours to get the headers right, generate make dependencies, etc. and get everything through the regression tests. The build-time advantage was tiny, and negative when clean-building.

The main advantage that you gain with smaller files is better
compile-time granularity. This is primarily useful for enabling
trace, debug, or constraint-checking functionality for a subset of an
interface. This may or may not be useful depending on the code
infrastructure in place.

For example, consider implementing a list interface with a pair of
files: a header (list.h) and a source file (list.c). Suppose the
source file can be compiled with a compile-time flag (say
-DENABLE_TRACE) that enables trace messages for calls to functions
within the list library. Typically, when the interface is all in a
single file, it's quite tedious to enable trace messages for only a
subset of functions, which can be quite useful in debugging or logging
to limit the overhead or information overload. In my limited
experience, trying to control this at run-time (at least the way I was
trying to do it) was a huge pain.
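
A rough sketch of the single-file case, assuming a hypothetical TRACE macro and a toy singly linked list (none of these names come from a real library). Compiling with something like `cc -DENABLE_TRACE -c list.c` turns tracing on for every function in the file at once; there is no way to select just one of them from the command line.

```c
/* list.c: hypothetical single-file list module (illustration only). */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* With -DENABLE_TRACE every TRACE() call prints; otherwise it compiles away. */
#ifdef ENABLE_TRACE
#define TRACE(name) fprintf(stderr, "trace: %s\n", (name))
#else
#define TRACE(name) ((void)0)
#endif

struct node {
    int value;
    struct node *next;
};

/* Insert value at the front. Returns 0 on success, -1 if malloc fails. */
int list_insert_front(struct node **head, int value)
{
    struct node *n;

    assert(head != NULL);
    TRACE("list_insert_front");
    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Count the nodes in the list. */
size_t list_length(const struct node *head)
{
    size_t count = 0;

    TRACE("list_length");
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Release every node. */
void list_free(struct node *head)
{
    TRACE("list_free");
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```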

Contrast that with an implementation that separates the large module
into a set of smaller modules. One can split up list.c into
list_insert_front.c, list_insert_back.c, list_free.c, list_sort.c,
etc., where each file corresponds to a single function (with
associated helper functions if needed). You can in essence compile
each individual "function" with a specific set of compiler flags. For
instance, one can compile list_sort.c with -DENABLE_TRACE to just
trace through sorting function calls without adding the tracing
overhead to other list functions that may be extraneous to the
problem. If a bug were found in list_insert_back, one could try
compiling list_insert_back.c with -DENABLE_CONSTRAINTS to verify
function arguments. But, in addition to the dependency complexity
described above, it also requires making the build system (Makefiles)
more complicated to support compiling each object file with its own
compile-time flags.
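
For comparison, a sketch of what one of the split-out files might look like. The TRACE and REQUIRE macros and the node layout are assumptions made for illustration. Built as, say, `cc -DENABLE_CONSTRAINTS -c list_insert_back.c`, only this function gets argument checking, while the other list_*.c objects are compiled without it.

```c
/* list_insert_back.c: hypothetical one-function translation unit. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* In a real split this struct would live in a shared header. */
struct node {
    int value;
    struct node *next;
};

#ifdef ENABLE_TRACE
#define TRACE(...) fprintf(stderr, __VA_ARGS__)
#else
#define TRACE(...) ((void)0)
#endif

/* Argument checks exist only when this file is built with the flag. */
#ifdef ENABLE_CONSTRAINTS
#define REQUIRE(cond) assert(cond)
#else
#define REQUIRE(cond) ((void)0)
#endif

/* Append value to the list whose head pointer is *head.
   Returns 0 on success, -1 if malloc fails. */
int list_insert_back(struct node **head, int value)
{
    struct node *n;

    REQUIRE(head != NULL);
    TRACE("list_insert_back(%p, %d)\n", (void *)head, value);

    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = NULL;

    /* Walk to the terminating NULL link and hang the new node there. */
    while (*head != NULL)
        head = &(*head)->next;
    *head = n;
    return 0;
}
```

The cost shows up in the build files rather than in the C: each object needs its own flag set instead of one rule covering list.c.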

I definitely recommend using the single file approach for major
components until the majority of the interface and design work is
complete, to simplify development. Function names, arguments,
structures, and return values get changed, and splitting up a module
too early is, in my opinion, more hassle than it's worth. Large header
files don't particularly bother me since I use a web browser to look
up functionality rather than grepping headers. When the interface is
pretty stable, one can consider whether the kind of granularity
presented above is useful enough to warrant splitting up the interface
into more source files. I typically wouldn't bother subdividing an
interface just because of build times, but I'm also not in an
environment where build times are terribly long, so that's outside my
personal experience.

Gene <> writes:
[...]
> Okay, sure. So now I will count angels on pinheads, but there isn't much difference between working in multiple files and working with multiple frames viewing the same file at different points. I find that being able to make interfaces (read: header files) as small and simple as possible is an advantage of bigger files, esp. since C doesn't allow anything like a module hierarchy a la Ada. For example, as others have said, putting most of an interpreter (parser, simulator, etc.; take your pick) in one file requires exposing only the main interface. Interfaces among subsystems remain invisible. This avoids the complexity of headers and their dependencies, yada, yada. For example, a year or so ago I broke a single 10,000-line module down into pieces of ~600 lines. It took 2 or 3 hours to get the headers right, generate make dependencies, etc. and get everything through the regression tests. The build-time advantage was tiny, and negative when clean-building. On the whole, I wish I'd stayed with the original setup.

Gene, not all news reader software copes well with very long lines.
Keeping your text down to 72 columns or so is helpful.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

On Dec 20, 11:06 pm, Gene <> wrote:
> On Monday, December 20, 2010 9:43:13 PM UTC-5, Ian Collins wrote:
> > On 12/21/10 02:47 PM, luser- -droog wrote:
> > > On Dec 20, 6:08 pm, (Ben Pfaff) wrote:
> > > So for my own purposes, I'm quite pleased with the small file
> > > sizes. To me it suggests that the code is concise. Perhaps Strunk
> > > and White isn't the best style guide for coding.
>
> > Small file sizes are good - it's easier to read multiple files side by
> > side than to be at several places in one file, and if you ever use a
> > parallel or distributed build system, things go faster.
>
> Okay, sure. So now I will count angels on pinheads, but there isn't much difference between working in multiple files and working with multiple frames viewing the same file at different points. I find that being able to make interfaces (read: header files) as small and simple as possible is an advantage of bigger files, esp. since C doesn't allow anything like a module hierarchy a la Ada. For example, as others have said, putting most of an interpreter (parser, simulator, etc.; take your pick) in one file requires exposing only the main interface. Interfaces among subsystems remain invisible. This avoids the complexity of headers and their dependencies, yada, yada. For example, a year or so ago I broke a single 10,000-line module down into pieces of ~600 lines. It took 2 or 3 hours to get the headers right, generate make dependencies, etc. and get everything through the regression tests. The build-time advantage was tiny, and negative when clean-building. On the whole, I wish I'd stayed with the original setup.

There are advantages to smaller files though, which have nothing to do
with editing. These include:
1) Parallel make of big projects works better (I usually kick off
parallel makes with the number of jobs equal to my number of cores,
which works well for me and is rather faster than single-threaded
make).
2) Small files make merging different developers' work easier.
You are less likely to have silly merge conflicts if you aren't
touching the same files. Of course, if you are changing the exact
same code in different branches there will always be stuff to
resolve. This doesn't apply to personal projects, of course.
3) Smaller, more specialized headers will have fewer includers, and
hence will not trigger the global rebuild that a monolithic header
might.

There are of course tradeoffs. Having everything in a single header
can make a simpler interface for clients. And huge numbers of files
mean more metadata when using version control systems that do
labelling etc. And for a small project with a few thousand lines,
it really doesn't matter much anyway.

On Dec 22, 3:47 pm, David Resnick <> wrote:
> On Dec 20, 11:06 pm, Gene <> wrote:
>
> > On Monday, December 20, 2010 9:43:13 PM UTC-5, Ian Collins wrote:
> > > On 12/21/10 02:47 PM, luser- -droog wrote:
> > > > On Dec 20, 6:08 pm, (Ben Pfaff) wrote:
> > > > So for my own purposes, I'm quite pleased with the small file
> > > > sizes. To me it suggests that the code is concise. Perhaps Strunk
> > > > and White isn't the best style guide for coding.
>
> > > Small file sizes are good - it's easier to read multiple files side by
> > > side than to be at several places in one file, and if you ever use a
> > > parallel or distributed build system, things go faster.
>
> > Okay, sure. So now I will count angels on pinheads, but there isn't much difference between working in multiple files and working with multiple frames viewing the same file at different points. I find that being able to make interfaces (read: header files) as small and simple as possible is an advantage of bigger files, esp. since C doesn't allow anything like a module hierarchy a la Ada. For example, as others have said, putting most of an interpreter (parser, simulator, etc.; take your pick) in one file requires exposing only the main interface. Interfaces among subsystems remain invisible. This avoids the complexity of headers and their dependencies, yada, yada. For example, a year or so ago I broke a single 10,000-line module down into pieces of ~600 lines. It took 2 or 3 hours to get the headers right, generate make dependencies, etc. and get everything through the regression tests. The build-time advantage was tiny, and negative when clean-building. On the whole, I wish I'd stayed with the original setup.
>
> There are advantages to smaller files though, which have nothing to do
> with editing. These include:
> 1) Parallel make of big projects works better (I usually kick off
> parallel makes with the number of jobs equal to my number of cores,
> which works well for me and is rather faster than single-threaded
> make).
> 2) Small files make merging different developers' work easier.
> You are less likely to have silly merge conflicts if you aren't
> touching the same files. Of course, if you are changing the exact
> same code in different branches there will always be stuff to
> resolve. This doesn't apply to personal projects, of course.
> 3) Smaller, more specialized headers will have fewer includers, and
> hence will not trigger the global rebuild that a monolithic header
> might.
>
> There are of course tradeoffs. Having everything in a single header
> can make a simpler interface for clients. And huge numbers of files
> mean more metadata when using version control systems that do
> labelling etc. And for a small project with a few thousand lines,
> it really doesn't matter much anyway.

Everyone seems to be treating good modularisation as purely a solution
to long compilation times (not a problem, I'd have thought, with a
couple of kloc...). How about it's simply good design! Modularisation,
information hiding, etc. You should be able to understand what a
module does just by reading its header file. The messy details of how
it does it shouldn't concern you (until it breaks).
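
One common way to get that kind of information hiding in C is an opaque type. The sketch below assumes a hypothetical integer stack module; the header part and the source part are shown in one block only so that it compiles as a single file. A client reading the header part sees four functions and an incomplete struct, and nothing about how the storage grows.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a hypothetical stack.h would contain ---- */
struct stack;                               /* opaque: layout is hidden */
struct stack *stack_new(void);              /* NULL on allocation failure */
int stack_push(struct stack *s, int v);     /* 0 on success, -1 on failure */
int stack_pop(struct stack *s, int *out);   /* 0 on success, -1 if empty */
void stack_free(struct stack *s);

/* ---- what a hypothetical stack.c would contain ---- */
struct stack {
    int *data;
    size_t len;
    size_t cap;
};

struct stack *stack_new(void)
{
    struct stack *s = malloc(sizeof *s);

    if (s != NULL) {
        s->data = NULL;
        s->len = 0;
        s->cap = 0;
    }
    return s;
}

int stack_push(struct stack *s, int v)
{
    assert(s != NULL);
    if (s->len == s->cap) {
        /* Grow geometrically; realloc(NULL, n) behaves like malloc(n). */
        size_t cap = s->cap ? s->cap * 2 : 8;
        int *p = realloc(s->data, cap * sizeof *p);

        if (p == NULL)
            return -1;
        s->data = p;
        s->cap = cap;
    }
    s->data[s->len++] = v;
    return 0;
}

int stack_pop(struct stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->len == 0)
        return -1;
    *out = s->data[--s->len];
    return 0;
}

void stack_free(struct stack *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Because clients only ever hold a struct stack pointer, the array could be swapped for a linked list without touching the header or recompiling its includers.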

Take a look at the Single Responsibility Principle. In fact start
reading up on software design in general.
