In trying to port postgres to 64-bit Windows, I've encountered a
number of issues which may (or may not) affect other compilers. If
you followed the other thread, this is mostly a summary with a bit
more details so feel free to ignore it. Some of these may have some
minor effects on other platforms, so they may be of interest (but I
doubt it, since no one has noticed/complained about them yet). This
post contains a small taxonomy of the problems, as well as some
discussion about the work that needs to be done in order to make
postgres portable to LLP64 data model compilers (in case someone else
is interested). I use the standard terms for discussing different
compiler data models, which are explained here:
http://www.unix.org/version2/whatsnew/lp64_wp.html
Throughout this post I will assume sizeof(size_t) == sizeof(void*),
because I doubt you want to support a system where this is not the
case.
When I try to compile postgres, I get 396 warnings. These come from
several different places:
1.) Most of the code involving strings requires a ILP32 or ILP64 to
not generate a warning. This means sizeof(int) == sizeof(size_t) ==
32 or 64, respectively. Something as simple as:
int len;
len = strlen(str);
violates this on LP32, LP64, and LLP64. AFAIK, there really are no
LP32 compilers around anymore, but LP64 is common (LLP64 is MSVC).
None of these warnings are actually problems, since they involve
strings and realistically the problems which "could happen" never
will. Unfortunately, these are actually portability problems, since
you never want to disable narrow cast warnings when supporting
different architectures because some of the warnings could be
important. If these aren't disabled, they will be very annoying and
make it hard to spot real problems (and tempt people to turn off all
such warnings). If they are changed, almost 300 lines will need to be
committed, all of which have the not very exciting form:
int len;
len = (int)strlen(str);
the alternative is changing int to size_t everywhere, which several
have objected to because of bloat. This bloat will only affect LP64
and LLP64, which do not seem to have been the target machines in the
first place. I'd be willing to make the changes to either form, but I
don't know if anyone would be willing to commit them :)
2.) There is a lot of other code involving memory index and offset
calculations being "int". On ILP64, these will be able to work with
buffers > 2 GB. On LP64 or LLP64, they will not. On ILP64,
sizeof(int) == sizeof(size_t), but on the other two sizeof(int) <
sizeof(size_t). Either c.h or postgres.h (I forgot which) defines an
Offset and Index typedef to aid in portability, but they are only
rarely used. Most of the unchecked conversions from size_t to int are
of the string variety (1), but there are a fair amount of these as
well.
None of these warnings are actually problems even on LP64 or LLP64,
unless the buffers involved are > 2 GB. Buffers > 2 GB will work with
no changes on ILP64 only. Whether the problem domain specifies that
they _can't_ (or probably never should) be > 2 GB either way must be
examined on a case by case basis, and I haven't examined that yet.
Thoughts on 1 & 2
==============
I was surprised to see this in the code. The reason is that both of
these issues affect LP64. Problems with LLP64 are expected, because
LLP64 basically means "Microsoft" and therefore support is not usually
a concern of the OSS community. LP64 on the other hand is any x64
machine using gcc, or at least it was several years ago. Has that
changed? Can gcc now be configured to use ILP64 instead?
3.) Some of the code assigns size_t to uint16. These should elicit
warnings on all compilers, but are almost certainly guaranteed to
never be errors because the use of uint16 implies that the developer
clearly knows that this is the maximum size needed in all situations
and the precision loss never matters (here, the size_t variables being
cast were probably not really supposed to be size_t in the first
place, but Size was used in the RHS with no cast, carelessly).
4.) Some of the code assigns size_t to uint32. It's unclear if these
cases are like (2) or like (3), and would need to be examined on a
case by case basis.
Problems for LLP64 compilers
======================
Almost everywhere the keyword "long" appears, is a problem for LLP64
whether its a warning or not. Unfortunately this happens in a very
large number of places. Size is supposed to be used for memory
resident objects, but rarely is. Often (signed) long is used, and its
not immediately clear without fully reading through the program
whether it actually needs to be signed or not (i.e., whether it can
safely be changed to Size or not). Sometimes it does need to be
signed, as allocations can apparently be negative, and this is
required for correct program operation i.e., there are a few:
while !(something_which_should_semantically_never_be_less_than_0 < 0)
The types of problems that are easy to spot raise warnings because
they freely mix Size (which is typedef size_t Size;) and long. This
way, you are alerted to problem. This will not find any problems
where long was used throughout. Note that none of this is
particularly hard to do, it just takes time. Does that mean that
almost every part of postgres that interacts with memory or the Datum
type must be read very carefully to rediscover all the assumptions
behind the code? Unfortunately I would guess yes. This is going to
take much longer than I thought. But it's that or use MySQL, which
crashes whenever I load the data in.
-Ken