I'm looking for a free software solution to index
a big (CD-sized) collection of HTML documents (articles from
a monthly news publication over a few years). The plan is to have
pregenerated static indexes and all documents in plain HTML
(should be usable everywhere)
and then to offer some additional software for word / date / etc
boolean queries and query result management.
The software must work on Linux, Win9x and above,
and MacOS X. Maybe it's possible to develop / reuse / adapt a plugin
for IE and Netscape.
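
To make the idea concrete, here is a minimal sketch in Python of a
pregenerated inverted index with boolean word queries. It is only an
illustration under assumptions, not an existing tool: the function
names (build_index, query_and, query_or, query_not) are hypothetical,
and date queries and result management are left out.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags.

    Limitation of this sketch: text inside <script> or <style> is kept too.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_words(html):
    """Return the set of lowercase alphanumeric words found in the text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return set(re.findall(r"[a-z0-9]+", text))


def build_index(docs):
    """docs maps a document name to its HTML; returns word -> set of names."""
    index = defaultdict(set)
    for name, html in docs.items():
        for word in html_to_words(html):
            index[word].add(name)
    return dict(index)


def query_and(index, *words):
    """Documents containing every one of the given words."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()


def query_or(index, *words):
    """Documents containing at least one of the given words."""
    result = set()
    for w in words:
        result |= index.get(w.lower(), set())
    return result


def query_not(index, all_docs, word):
    """Documents (out of all_docs) that do not contain the given word."""
    return set(all_docs) - index.get(word.lower(), set())
```

The index is built once, so it could be written to a static file
shipped on the CD (for example as JSON, after converting the sets to
lists) and only loaded by the query tool.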

If you have any links or ideas on how to achieve this,
or know of better places to ask, please let me know either
in your diary or by email at guerby@acm.org

One could also ask, why don't free software and Ada mix? We
have a first class Ada compiler, it's a clean, readable
language,
far more elegant than C++ IMO. But, like C++, it
has a (largely undeserved) reputation for bloat.

The GNU Ada compiler is not yet first class: it is not in
the GCC CVS tree and most distros don't ship it. This should
change real soon now. And I guess there is a whole bunch of
existing and new Ada free software that will gain visibility
as a result, not the least being the GNU Visual
Debugger. Also, Ada Core
Technologies is one of the few 100% free software companies
making money.

Once a programmer has been crowned "hacker" by his
peers (which is quite often what he was aspiring to in the
first place), that makes it even harder for him to
ever reconsider his technical choices (especially if that
involves significant effort, as learning C++ does). So if
he's ever had bad
experiences with C++, he'll stay firm in his position
that the language just isn't good.

There's also the fact that many of the "masters"
(Miguel, Linus, RMS) have publicly stated that they don't
like the language. What's more, no hacker as high-profile as
Linus or RMS is using C++. And as egnor says, almost all
successful Free Software projects use C, and thus
somehow prove that C is "good enough".

Maybe those "masters" have good reasons not to like the
C++ language, I mean other than just being lazy and
unwilling to learn, don't you think?

I've attended two RMS talks. At the first one, in reply to
a direct question about what he thought of C++, he called the
language an "abomination"; at the second one he made a joke
about C++ being ugly that drew applause from the crowd.

As for RMS's ability to quickly understand a "complex"
language, I would remind you that he just looked at the Ada
Reference Manual and invented a new compilation model for it
that is now used by GNAT and commercial compilers
(he just didn't like the traditional model ;-).

I think that saying that "high-profile hackers"
don't like C++ because they did not learn or understand it
or are unwilling to evolve their technical opinions is
questionable.

As for C++ in the workplace, I work for a bank, and while
some of my coworkers were quick to adopt C++ a few years
ago, it looks like they're dropping it even faster in favor
of Java (they say C++ is legacy...).

I certainly think people working on free software (most
of them in their free time) choose their programming
language according to their own taste or curiosity, so the
mass dynamics found in the workplace don't apply and
freedom of choice works 100%.

This, together with the slow
progress of g++, looks like a good explanation of the
dynamics observed by egnor.

The meeting runs from 6:30 pm to 9:00 pm. After the meeting,
full and precise instructions on how to get to our traditional
place of refreshment will be given in clear.

Thanks to the support of the IBM Corporation, the meeting is at
their building at 590 Madison Avenue at East 57th Street on the
Island of Manhattan. Enter the building at the corner of Madison
and 57th and ask at the desk for the floor and room number.

Robert B. K. Dewar, Head of Ada Core Technologies, a
multi-million-dollar multi-national free software company, will
talk about how to make money with free software.

Today one of the many rhetorical attacks against free software
goes "But is free software ready for the Enterprise?". Ada Core
Technologies sells services and software built around GNAT, the
Official GNU Compiler for Ada, the Official Computer Language of
the United States Military Industrial Complex. Another rhetorical
attack goes "But how can you make money selling free software?".
Ada Core Technologies makes money.

Robert B. K. Dewar is

Professor of Computer Science at the Courant Institute of
Mathematical Sciences at NYU,

an expert on programming languages,

a serious programmer who has written compilers, language libraries
and real-time OSes, on a wide range of platforms from embedded
systems to microcomputers to mainframes,

a mainstay of the robust froup comp.lang.ada,

SPITBOLer extraordinaire,

an effective advocate and amicus at large for free software and
the freedom to program,

To add to the story by apm, and for
other readers interested in how to
improve their coding skills, I strongly recommend "Code
Complete" by Steve
McConnell, Microsoft Press 1993, ISBN 1-55615-484-4.

It is nearly 900 pages on the topic of how to define
and write a routine. It is very well written, with funny
"coding horror" sections, and covers multiple languages
(thankfully not in the "how to do xxx with language yyy,
repeat for all yyy" publishing craziness).

My experience is that it can boost the coding style
and efficiency of beginners dramatically (the size is a
little intimidating, but once they start it they finish it).
Experienced coders won't learn much, but it covers nearly
the whole topic in one book, which is nice.

Warning: not to be confused with "Writing Solid Code:
Microsoft's Techniques for Developing Bug-Free C Programs"
by Steve Maguire, which is nowhere near the same quality and
interest.

It might be of interest to some advogato readers (let's fill
these diary entries a little bit ;-). As I said in a previous
message in the thread, I would love to see what the current
employment ratio of shrink wrap vs custom software + support is,
along with the money-making ratio, and David answered that he
would love to see it too. Any information?

Laurent Guerby <guerby@acm.org> writes:
It looks to me like you're reducing the "software development
industry" to the "shrink wrap proprietary software industry".

David Masterson <dmasters@rational.com> writes:
Is there any other? :-)

Well, none of the people I graduated with (software engineering)
work in the shrink wrap industry. They all work on writing or
maintaining custom software for big projects, or as contractors
inside various companies, also developing and maintaining custom
software for one client. All these companies and people are making
big money, and there's no shortage of work to do!

It might be a Europe vs US way of doing software.

I would say you're missing the whole industry of software
customized for one customer/problem [...]

I mentioned this later in my message as "contract" programming
(i.e. first user pays development costs). It's not an unreasonable
model, but it does have flaws...

But you seem to imply again that all software is "sellable" to the
masses (shrink wrap), and that's just not true. I see a large
market where the "first user pays development costs" and no other
user ever exists.

But, of course, you're thinking like a bank employee with no
competition for your job.

Just to clarify what I do: I work on financial equity and other
derivative software. If you think there's no competition, for jobs
or otherwise, in this market, I guess you don't know it at all!

Now think about it from the point of view of the bank. What
would the bank think of making *ALL* of the software that you
develop "open source"? Can other banks now become more competitive
with your bank because they pick up and use your software? Remember
these other banks are not obligated to contribute to the development
of the software, so they would be making use of your software for
"free" (monetarily speaking).

I think you're completely misunderstanding the "open source"
licensing scheme. You're NEVER obliged to release your
customizations to the community as long as you don't distribute
binaries to people outside your organization (*). There are custom
ports of GCC made and used by only one company, and they don't have
to release them at all, and this is perfectly fine with the FSF.
The software I work on is not intended at all for use outside our
company, and we even have some software protection and authorization
scheme to make sure it stays this way.

(*) There is some debate on how to do multi-company collaborative
work on "unreleased" GPL software, but that's a bit special.

On the other hand, if your software is *not* released as "open
source", does that mean that all those other banks have to go about
duplicating your efforts? Isn't that wasteful?

Yes, of course all other banks in this market are duplicating our
efforts, this is where the competition is! The software embodies
nearly all of the bank's derivative product know-how, which is kept
as secret as possible. The model is that if you want to know what
other banks are doing, you have to pay to get their employees to
work for you and bring you their knowledge. There are companies
selling this kind of software (the proprietary way, and maybe the
open source way), but with much less advanced functionality (often
with a plugin interface so that banks can put their know-how in).

Perhaps it would be a win-win if your bank sold the software to the
other banks (oh, but then you're into a "proprietary" model... ;-)

See RISK magazine if you're interested in learning what's going on
in our industry.

My employer decided a few years ago to use an open source tool and
pay for technical support (not cheap!) for this critical piece of
software, and this was an informed move. They hired me in part
because I know the technology very well, since I worked on it, and
the sources are available. In case of emergency I'm even able to fix
things myself (we have very hard requirements on getting things to
work quickly). They can't get that level of assurance if they choose
proprietary software, since the selling company is the only one with
source access (monopoly): if you piss them off and have a problem,
you're dead, with no alternative, and they have no incentive to fix
the things we need (except maybe a weak one, on the order of 1/N
with a big N, since they're selling to the masses). To put it
otherwise, vendor lock-in and racket.

For other custom software developed in my bank, where multiple
proprietary tools are used, my coworkers have to make sure things
work together, and sometimes their vendors have conflicting
interests. I can tell you that this ends up being very costly and
unsatisfying.

I guess this experience applies to other industries: a lot of
businesses today depend heavily on combining and customizing
software for their own unique purpose. If your business scales up,
you'll end up being very interested in buying skills to do even
more customization, and in this way open source is very attractive
because it is ... the only option. Sometimes you might even end up
competing in a slight way with what your original vendor's software
does, and you just can't live with a proprietary vendor.

I firmly believe that the software market needs this kind of
competition, and I do not see it at all as putting every software
engineer out of business, quite the contrary! But I can easily
imagine that people with different experience think otherwise.

PS: I didn't notice your email at first, but your company is
offering a competing product to the free software we use (GNAT, an
Ada compiler). I haven't tried yet to compile our software using the
Rational Ada compiler, but I'll do it one of these days, as QA for
code portability at least ;-).

Just a note for those who find the "The Future of
Programming" advogato post interesting:
you might have a look at the book "High Integrity Ada" by
John Barnes (Addison-Wesley). It describes a safe "provable"
subset of Ada 95 known as "SPARK 95". Also look at the message
from Peter Amey in the thread "High integrity software"
on the comp.lang.ada newsgroup for some interesting
claims.

Formal proofs are in use today with existing languages
for non-trivial software, just not with C/C++ (as the cvsup
author John D. Polstra wrote, it's "notC" ;-) nor in what can
be called a typical hacker environment and mindset. Needless to
say I regret this a lot; in particular the Ada entry of the
Hacker's dictionary is highly misinformed ;-).

As for proving things about C, it's needlessly hard because the
pointer model is pervasive. And as for the C++ template model,
AFAIK it does not even enforce the separation of interface and
implementation, which is quite essential for scaling up (you need
to know all the implementation code to see if your
use of the template will work).
