From: Tyler Adams <coppero1237@gmail.com>
Subject: [TUHS] History of popularity of C
Date: 2020-05-21T15:28:07Z

From: Toby Thain <toby@telegraphics.com.au>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T16:17:15Z

On 2020-05-21 11:27 AM, Tyler Adams wrote:
> Does anybody have any good resources on the history of the popularity of
> C? I'm looking for data to resolve a claim that C is so prolific and
> influential because it's so easy to write a C compiler.
>
> Tyler
Based on recollections of C from the mid-1980s until today, this claim
doesn't make sense, for several reasons. Sorry, this is all anecdata or
recollection, not cited data:
- inexpensive compiler availability was not very good until ~1990 or
later, but C had been taking off like wildfire for 10 years before that
- developing good compilers is certainly not "easy" - and there were a
lot of mediocre vendor compilers despite (duplicated) investment
- by the time gcc was mature (by some definition, but probably before
1990) - something that happened largely as a reaction to the vendor
compiler situation - it was a large and complicated codebase even by
standards of the time
- hobby/novelty/small/educational compilers are a relatively new thing
and arrived long after the C adoption curve was complete. The earliest
well-known example I can think of is lcc (1994), but most are much newer.
...and probably quite a few other points.
--T

From: Jim Capp <jcapp@anteil.com>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T16:28:37Z

Again, based on recollections: what got me immediately interested was that I regarded C as a "portable assembler". It was one of the earliest implementations of "write once, run anywhere".
From: "Tyler Adams" <coppero1237@gmail.com>
To: "The Eunuchs Hysterical Society" <tuhs@tuhs.org>
Sent: Thursday, May 21, 2020 11:27:26 AM
Subject: [TUHS] History of popularity of C
Does anybody have any good resources on the history of the popularity of C? I'm looking for data to resolve a claim that C is so prolific and influential because it's so easy to write a C compiler.
Tyler

From: Larry McVoy <lm@mcvoy.com>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T16:31:46Z

On Thu, May 21, 2020 at 12:10:35PM -0400, Toby Thain wrote:
> On 2020-05-21 11:27 AM, Tyler Adams wrote:
> > Does anybody have any good resources on the history of the popularity of
> > C? I'm looking for data to resolve a claim that C is so prolific and
> > influential because it's so easy to write a C compiler.
> >
> > Tyler
>
> Based on recollections of C from the mid-1980s until today, this claim
> doesn't make sense, for several reasons. Sorry, this is all anecdata or
> recollection, not cited data:
>
> - inexpensive compiler availability was not very good until ~1990 or
> later, but C had been taking off like wildfire for 10 years before that
>
> - developing good compilers is certainly not "easy" - and there were a
> lot of mediocre vendor compilers despite (duplicated) investment
>
> - by the time gcc was mature (by some definition, but probably before
> 1990) - something that happened largely as a reaction to the vendor
> compiler situation - it was a large and complicated codebase even by
> standards of the time
>
> - hobby/novelty/small/educational compilers are a relatively new thing
> and arrived long after the C adoption curve was complete. The earliest
> well known example I can think of is lcc (1994) but most are much newer.
>
> ...and probably quite a few other points.
This matches my memory as well. I think I learned C in 1983 or '84;
it just worked. To me it felt like PDP-11 assembler, only nicer.
The thing I liked about C is that you always felt like you were right
on the metal; it didn't hide the fact that there was a computer under
it. Very different feel from, say, Pascal. I think the fact that you
could feel the machine under the language had a lot to do with it taking
off.
And what Toby said about compilers, oh man, so true. Once you got out
in the real world, gcc was buggy and slow, and companies wanted to charge
you at every step of the way for compilers that were marginally better
than gcc at best. When gcc finally got good enough, I agree, around
1990 or so, it was a relief. You just used it and ignored the
platform-specific ones. G++ took a long time to be good enough.

From: Tony Finch <dot@dotat.at>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T16:43:43Z

Toby Thain <toby@telegraphics.com.au> wrote:
>
> - inexpensive compiler availability was not very good until ~1990 or
> later, but C had been taking off like wildfire for 10 years before that
I get the impression that an important part of its popularity was how C
(and C++) became the language of choice on the PC, and displaced Pascal in
the process.
Tony.
--
f.anthony.n.finch <dot@dotat.at> http://dotat.at/
Trafalgar: Northerly or northwesterly, backing southwesterly in northwest, 3
to 5. Moderate, occasionally slight in southeast. Fair. Good.

From: John Foust <jfoust@threedee.com>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T18:32:16Z

At 11:30 AM 5/21/2020, Larry McVoy wrote:
>This matches my memory as well. I think I learned C in 1983 or 84,
>it just worked. To me it felt like it was PDP-11 assembler only nicer.
One thing that stuck with me about our experience at UW-Madison
at that time was that there wasn't a course that taught C
yet some courses were taught in C. "Here's K&R, there's
the Unix manuals, get to it."
>When gcc finally got good enough, I agree, around
>1990 or so, it was a relief. You just used it and ignored the platform
>specific ones. G++ took a long time to be good enough.
There's the broader history of the languages that were popular
in the IBM PC market in the 80s and 90s, too. In that at least
numerically larger market, there were times when C was not on top
for many small-time developers. Let's not forget Turbo Pascal
(shipped 1983 to 1995) and Turbo C and C++ (1987-1995).
In 1986 or so on the PC, I was using the Gimpel C-terp interpreted C
and their fine PC-lint (which Clem Cole has mentioned here before, and
which is still sold!) to speed development, in conjunction with shipping
code built with the Lattice and Microsoft C compilers of that time.
In the mid- to late 80s, there's the rise of the flat address space
68000 machines like Amiga and Atari which could enjoy the
cross-pollination of code ported from Unix C environments.
On the Mac, Apple's MacApp environment was their Object Pascal
and not C++ until 1991. Think C came out in 1986.
In the late 1980s, 32-bit DOS extenders arose that let you write
DOS programs in C that had true 32-bit pointers and didn't need
to worry about 64K segments as much, followed by Microsoft's Win32s
in late 1992 that allowed that freedom under Windows 3.1.
- John

From: arnold@skeeve.com
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T17:36:36Z

> Toby Thain <toby@telegraphics.com.au> wrote:
> >
> > - inexpensive compiler availability was not very good until ~1990 or
> > later, but C had been taking off like wildfire for 10 years before that
PCC contributed to this. Everybody and their brother was porting Unix
to their fancy new CPU architecture / hardware. All you had to do was
bootstrap a cross-compiling version of PCC on a PDP-11 (or, more likely,
a VAX), then get Unix to boot, and voila.
(I remember reading a paper about how Motorola did just that for
the MC 680x0 family.)
C and Unix were established in Academia and Industry well before 1990.
> I get the impression that an important part of its popularity was how C
> (and C++) became the language of choice on the PC, and displaced Pascal in
> the process.
C++ became the language of choice on the PC when MSFT started pushing
its compiler and Visual Studio IDE.
At least, this is my two cents.
Arnold

From: Noel Chiappa <jnc@mercury.lcs.mit.edu>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T18:28:40Z

> From: Tyler Adams
> C is so prolific and influential because it's so easy to write a C
> compiler.
I'm not sure the implied corollary ('it's _not_ easy to write compilers for
other languages') is correct.
As a datapoint, I pulled "Algol 60 Implementation" (Randell and Russell) off
the shelf, and it reveals that the Algol 60 compiler discussed there (for the
KDF9), using lessons from the Algol compiler for the Electrologica X1, was
3600 words (roughly 3 instructions/word). So it was small.
Now, small is not necessarily equivalent to easy, but it was clearly not a
mountainous job. I imagine early BCPL, etc compilers were roughly similar.
The only language from that era which I can think of which was a slog,
compiler-wise, was PL/I.
I suspect the real reason for C's success was the nature of the language.
When I first saw it (ca. 1976), it struck me as a quantum improvement over
its contemporaries.
Noel

From: Thomas Paulsen <thomas.paulsen@firemail.de>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T18:45:11Z

>I suspect the real reason for C's success was the nature of the language.
It has most of the elements of structured programming as known in the '70s/'80s, and, most importantly, it produces small, fast binaries like no other high-level language. Furthermore its syntax is relatively close to the system, and system calls are easily accessible. Thus for me it still is, and ever will be, the first choice.

From: A. P. Garcia <a.phillip.garcia@gmail.com>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T18:58:37Z

From memory, there is a History of Programming Languages book from an ACM
conference that contains some papers that were presented there, along with
some notes from the Q&A sessions that followed. I'm paraphrasing, but Dennis
Ritchie said something flattering about Pascal, that it was essentially the
same language as C. Given this, Niklaus Wirth asked, why do you suppose
that C is so much more popular than Pascal? Ritchie answered, "I don't
know".
My personal opinion is that Ken Thompson is not given enough credit for the
beauty and expressiveness of C, as much of this comes from its
predecessor, B, which is essentially Thompson's "remix" of BCPL.
On Thu, May 21, 2020, 11:28 AM Tyler Adams <coppero1237@gmail.com> wrote:
> Does anybody have any good resources on the history of the popularity of
> C? I'm looking for data to resolve a claim that C is so prolific and
> influential because it's so easy to write a C compiler.
>
> Tyler
>

From: Clem Cole <clemc@ccc.com>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T19:03:52Z

On Thu, May 21, 2020 at 11:28 AM Tyler Adams <coppero1237@gmail.com> wrote:
> Does anybody have any good resources on the history of the popularity of
> C? I'm looking for data to resolve a claim that C is so prolific and
> influential because it's so easy to write a C compiler.
>
Hmmmm, I don't know what's been written, but old Dr. Dobb's and Byte
magazine are where I would start for 1975-85 (which I have in my attic, but
lack a good index).
Let me give you my experience and recollections, although Larry may scream
that my catalog of memories is colored by what he likes to call the UNIX
club.
Academics in the mid-to-late '70s all got UNIX with full sources, originally
to the Ritchie C compiler and later the Johnson compiler, plus access to
yacc/lex and the first edition of the dragon book.
Before C (or B) showed up there already were 'systems programming languages'
such as BCPL, BLISS, PL/360, and ESPOL (not to mention full languages like
Fortran, the Algol family, and PL/1), which people were also trying to use
for systems work.
In '73, C had been retargeted for the PDP-10 by Alan Snyder @ MIT
(https://github.com/PDP-10/Snyder-C-compiler), but I believe that was a
rewrite, not a port of the Ritchie compiler. I believe this was the first
retarget, at least outside of Murray Hill. Before that C had been retargeted
to the Honeywell, the Interdata, and I believe the S/360 -- Steve and Doug
can probably say more.
The 8-bit microprocessors arrived on the scene in '75 and the 16-bit ones
four years later. Many of us at different universities wrote assemblers and
linkers for them, often in C under UNIX. CMU had a SAIL-based 6502
assembler in the CS department that was used to burn ROMs, but it ran on the
PDP-10. There must have been an 8080 assembler over there too, but I don't
remember it. The 10s were more difficult to use in the EE building and
tough to use with the KIM-1s. I wrote one for the 6502, 8085, and the Z80
for the EE department on our 11/34 UNIX V6 system. Ted Kowalski wrote the
predecessor to the eventual UNIX cu(1) program, which we called connect(1),
which allowed us to download code to the KIMs (and other micros) from the
UNIX systems we had in the EE lab (he took it back to USG, where it was
rewritten and went into both PWB and eventually TS and V7).
The Purdue 8-bit micro suite would eventually become popular because it
supported full relocation and a linker, which microprocessor support tools
like the one I wrote did not. There was a group at Purdue EE that started
to retarget the Ritchie C compiler, but I've lost track of what happened.
Mike Zuhl or Ward Cunningham might remember what became of that (more in a
minute); I'm pretty sure Ward was mixed up in it -- check his web site, you
might find stuff there, or we can ask either of them (Steinhart might know
some of it too, as we were all working with the original microprocessor
team at Tek Walker Road in the late 1970s).
The first microprocessor-targeted C compiler I personally used was the one
from Teletype Corporation, which had retargeted the Ritchie C compiler to
the Z80 in 1977/78 IIRC (Phil Karn brought it to CMU). He and I hacked
it to use my assembler and got it to spit out 8085 code for our semester
project in Steely Dan's real-time course. This was the original C
compiler he used for the KA9Q TCP/IP, although at some point he switched to
Leor Zolman's Brain Damage Software (BDS) compiler
(https://en.wikipedia.org/wiki/BDS_C) after we both left CMU.
In the late 1970s ('78 I think), Dennis Allison was teaching a course at
Stanford. The assignment was to develop a Tiny BASIC (for the 6502 IIRC).
Some of these got presented at an early AMW (talk to Bob Berg if you want
to try to find the date). The idea spread to a lot of places, and the idea
of 'TinyX' or 'SmallX' was started. By late 1979/early 1980, Ron Cain (one
of his students, I believe) used an SRI-based UNIX system to develop his
'Small C', which he would publish the sources to in Byte and eventually in
a book that was used for teaching (which I still have):
https://en.wikipedia.org/wiki/Small-C. The Small C compiler would get
retargeted to the other 8-bit micros, and you can find most of them with a
search engine.
The best I can tell, Leor and Ron worked independently of each other.
Leor's compiler was a tad more complete, and he actually wrote a UNIX clone
for the Z80 with it (I don't remember if Leor had FP support; Ron did
not). Leor had access to the Ritchie compiler, but he seems to have
written his himself (you can search for and download the sources and decide
for yourself). Leor showed many of us his system running on three 8"
floppies at the Boston USENIX in the early 1980s [I remember dmr playing
with it and remarking how much it reminded him of early UNIX on the PDP-11].
Also, after I left CMU in 1979, I took the Ritchie compiler and retargeted
it to what would become the 68000 (it was not yet released or numbered when
I started). Paul Blanter of Tek Labs wrote the assembler, and Steve Glaser
and I hacked V7's ld a little. This was the original tool suite for the
Magnolia system. The folks in the MIT RTS group had started to retarget
the Johnson compilers to the 8086, the 68000, and eventually the Z8000 as
part of the NU project and Trix (I know Jack Test, who had previously been
at Stanford, had his hand in this -- tjt wrote the MIT 68000 assembler that
used an MIT-hacked version of V7's ld; I think John Siber did the
C8086). Around the same time, CMU started the Mach project and created
the Mach-O format. Robert Baron and Mike Accetta were heavily involved,
but I think they took the MIT compilers as the basis for some of that work.
At some point (Steve can fill us in) I think someone in USG started to
retarget his compilers for USG. This is the source of the AT&T assembler
and is what ISC started with when they did the 386 ports a few years later
for AT&T, which Heinz talked about a few weeks ago.
Meanwhile, Gordon Letwin, who had been at Purdue EE, brought the Purdue
assemblers and forked from the C compiler work at some point. He and Bob
Greenberg did the start of the compilers for the original Xenix work for
the 8086; we would have to ask Bob or Gordon for more details [Gordon is
believed to be the source of the terrible curse called the 'far' pointer].
By the early 1980s, a number of UNIX ports started and many C compilers
showed up. I think John Bass did the Onyx Z8000 C compiler independent of
the MIT code base, but the MIT NU C compilers and the NU UNIX port would
become used by a lot of the 'JAWS' work that would start to ramp up in the
early 1980s.
Anyway -- the point is we all had access to the UNIX sources (sorry, Larry)
and we started to hack on them. Plus different universities were doing
compiler work: Andy Tanenbaum released compilers (ACK) independent of the
AT&T code origin but built/bootstrapped from UNIX and the UNIX toolkit;
Waterloo, Edinburgh, and others also all put something out. Plus you
started to see commercial C implementations like Intermetrics, Tartan Labs,
and Green Hills (in fact, IIRC the Apple Mac C compiler was developed under
contract by Green Hills).
What I am leaving out is the BASIC and Pascal wars that were going on at
the same time. The 8-bit micros, in particular, went BASIC crazy. The
'CS types' at many universities (like mine at CMU) considered BASIC, C,
and Fortran 'ugly' and were using Algol or a more Algol-like language as
the future (Pascal was the premier teaching language at the time). For
reasons we can talk about in COFF, Pascal diverged (in 1980, at one of the
Hatfield and McCoy parties at Steve Glaser's, a couple of us counted 14
incompatible 'HP BASICs' and 8 different 'Tek Pascals' in use).
Here comes the final thing that happened...
By the early-to-mid '80s, all us UNIX folks were happy using UNIX-derived C
compilers, like the NU suite. But as Larry points out, there was a whole
group of people that could not get UNIX sources or tools. Stallman set
out to build his Gnu system, and he needed a language and compiler. I've
always been amazed he did not use LISP, except that the first tool he wanted
was EMACS, and he got the CMU (Gosling's) codebase to start from. CMU EMACS
was in C (plus Gosling's 'mock lisp' creation), and rms started hacking
mock lisp to be more to his taste. But to make it widespread he needed a
C compiler and microprocessor tools. So he started to write his famous
compiler -- which to me is the key thing he did. The Gnu project would
release tools that ran pretty much anywhere, targeted the popular micros,
and generated 'good enough' code.
Cole's Law -- 'Simple economics always beats sophisticated architecture.'
It turns out Paul Winalski and I were just talking about this last week. I
very much believe C 'won' the war for economic reasons.
UNIX being 'open source' in the '70s to the university types did allow us
to hack on and >>share<< the compilers, either Ritchie- or Johnson-based.
Moore's Law caused the 16-bit micros to flourish, and they ended up in
systems. UNIX taught a number of programmers the language and the tool
suite; then we went to the real world and wanted it there. Stallman's
tools were there.
It did not matter that there were 'better' languages (Pascal had forked; we
also had new languages from occam to Modula, and eventually C++ et al.).
The Gnu C compiler was cheap (free), and that was the final stroke.
Clem

From: Paul Winalski <paul.winalski@gmail.com>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T19:07:05Z

On 5/21/20, Thomas Paulsen <thomas.paulsen@firemail.de> wrote:
>>I suspect the real reason for C's success was the nature of the language.
> it has most of the elements of structured programming as known in the
> 70the/80ths, and - most important - it produces small and fast performing
> binaries like no other high level language.
Sorry, but I can't agree with that statement ("like no other high-level
language"). C is a decent language for systems programming, but so are
other languages such as BLISS. C is a terrible language if you have
to throw arrays around (which is why Fortran still rules the roost in
HPTC).
C, Pascal, and other modern Algol-ish languages have well-behaved
grammars and were designed to be easy to lex and parse. Fortran and
COBOL were designed before Chomsky's work on formal grammars became
well known, and as a consequence they are bears to parse. Fortran has
context-sensitive lexical analysis, for example. But nobody knew any
better back then.
-Paul W.

From: Charles Kester <corky1951@comcast.net>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T19:24:58Z

> On May 21, 2020 at 10:35 AM arnold@skeeve.com wrote:
>
> C++ became the language of choice on the PC when MSFT started pushing
> its compiler and Visual Studio IDE.
Microsoft C 7.0 already had a C++ compiler and an early version of MFC in 1992.
But you're right: it was when Visual C++ 1.0 came out in 1993 that C++ became
really popular among developers targeting Windows. VC1.0 introduced "wizards"
for MFC that produced a skeleton application to which many people had to make only
a few additions in order to come up with a shippable product. The market was
soon flooded with apps that had what I called a "wizard smell". (The more
charitable phrase was "look and feel".)
Of course, as with all framework-based code, wizard-generated apps couldn't
distinguish themselves in the market for very long and the bar was raised.
But by then C++ was well-established as the language of choice.
None of which has anything to do with Unix, I admit.

From: Toby Thain <toby@telegraphics.com.au>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T20:07:53Z

On 2020-05-21 12:43 PM, Tony Finch wrote:
> Toby Thain <toby@telegraphics.com.au> wrote:
>>
>> - inexpensive compiler availability was not very good until ~1990 or
>> later, but C had been taking off like wildfire for 10 years before that
>
> I get the impression that an important part of its popularity was how C
> (and C++) became the language of choice on the PC, and displaced Pascal in
> the process.
Yes, that's basically true, but I wasn't trying to cover the contemporary
"appeal" of C, stylistic or otherwise; only the compiler point
(which I think is mostly false).
--Toby
>
> Tony.
>

From: Toby Thain <toby@telegraphics.com.au>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T20:10:37Z

On 2020-05-21 1:35 PM, arnold@skeeve.com wrote:
>> Toby Thain <toby@telegraphics.com.au> wrote:
>>>
>>> - inexpensive compiler availability was not very good until ~1990 or
>>> later, but C had been taking off like wildfire for 10 years before that
>
> PCC contributed to this. Everybody and their brother was porting Unix
> to their fancy new CPU architecture / hardware. All you had to do was
> bootstrap a cross-compiler version of PCC on a PDP-11 (or more likely
> Vax), then get Unix to boot and Voila.
>
> (I remember reading a paper about how Motorola did just that for
> the MC 680x0 family.)
>
Yes, but Johnson had already done the work. IMHO compilers were still
considered pretty complex magic, and you wouldn't lightly write one from
scratch.
And yeah, all the vendors wanted to get a compiler out with minimal
effort, which is why they often weren't very good.
> C and Unix were established in Academia and Industry well before 1990.
>
>> I get the impression that an important part of its popularity was how C
>> (and C++) became the language of choice on the PC, and displaced Pascal in
>> the process.
>
> C++ became the language of choice on the PC when MSFT started pushing
> its compiler and Visual Studio IDE.
That was much later.
--Toby
>
> At least, this is my two cents.
>
> Arnold
>

From: Tony Finch <dot@dotat.at>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T20:12:47Z

arnold@skeeve.com <arnold@skeeve.com> wrote:
>
> > I get the impression that an important part of its popularity was how C
> > (and C++) became the language of choice on the PC, and displaced Pascal in
> > the process.
>
> C++ became the language of choice on the PC when MSFT started pushing
> its compiler and Visual Studio IDE.
C was winning years before that. I saw a comment on a certain orange
website that referred to Dr. Dobb's Journal, August 1986, which I found
online at
https://archive.org/details/dr_dobbs_journal_vol_11/page/n541/mode/1up
On that page there are a few choice quotes from the archives (1983) about
C from a PC perspective. The letters pages are 1/3 C. There are 8/10 pages
of articles about C. Then there is a 23-page comparative review of 17 C
compilers.
It's remarkable :-)
Tony.
--
f.anthony.n.finch <dot@dotat.at> http://dotat.at/
Irish Sea: Southeast 3 or 4, increasing 5 to 7, veering southwest 6 to gale 8
later. Smooth or slight, becoming moderate or rough, occasionally very rough
later in south. Fair then rain or squally showers. Good occasionally poor.

From: Toby Thain <toby@telegraphics.com.au>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T20:27:10Z

On 2020-05-21 1:22 PM, John Foust wrote:
> ...
> In the mid- to late 80s, there's the rise of the flat address space
> 68000 machines like Amiga and Atari which could enjoy the
> cross-pollination of code ported from Unix C environments.
>
> On the Mac, Apple's MacApp environment was their Object Pascal
> and not C++ until 1991. Think C came out in 1986.
Few developers used MacApp, AFAICR. Most used plain Pascal -- initially
Lisa Pascal, though that was before my time on the Mac -- and those who
didn't like Pascal could write C on the Mac before THINK, using tools like
Aztec C (or even Whitesmiths, one of the earliest).
Even though MPW was an excellent industrial-strength environment with
good Pascal and C compilers, the big vendors like Adobe adopted
Lightspeed-then-THINK-then-Symantec C quickly and rewrote Pascal apps
(like Photoshop) in C early. Then CodeWarrior came along and ate THINK's
lunch.
--Toby
>
> ...
>
> - John
>

From: Thomas Paulsen <thomas.paulsen@firemail.de>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T20:28:15Z

>Sorry, but I can't agree with that statement (like no other high-level
>language). C is a decent language for systems programming but so are
>other languages such as BLISS. C is a terrible language if you have
>to throw arrays around (which is why Fortran still rules the roost in
>HPTC).
BLISS is known to a very small number of people, and thus irrelevant. With regard to arrays: first, they are rarely used in advanced programming, which prefers lists, maps, trees, etc.; second, I never had problems with pointers.

From: Thomas Paulsen <thomas.paulsen@firemail.de>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T20:33:57Z

>Microsoft C 7.0 already had a C++ compiler and an early version of MFC in
>1992. But you're right: it was when Visual C++ 1.0 came out in 1993 that C++ became
>really popular among developers targeting Windows. VC1.0 introduced "wizards"
MSC was really good in those days. As a systems guy I used to study its generated assembly code, which was extremely good. Today's gcc uses advanced instructions too, and is thus also very good, whereas all the UNIX ccs of the '90s known to me were rather naive, simple lex-and-yacc-derived affairs.
The "wizards" also were very good, making GUI programming much easier.

From: Clem Cole <clemc@ccc.com>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T20:56:54Z

On Thu, May 21, 2020 at 12:17 PM Toby Thain <toby@telegraphics.com.au>
wrote:
> - inexpensive compiler availability was not very good until ~1990 or
> later,
Harrumph. The Gnu C compiler was starting to be available by the mid-1980s
in alpha/beta form. rms was looking for places to start. He approached a
number of folks, from Tanenbaum to some of the vendors (he knew Masscomp
had written a compiler from scratch, whose binaries we gave away to our
customers, and he called me asking if we would donate it; we had donated
development hardware, and I was still his contact to the Gnu project at
that point).
As far as I know, he ended up writing his own because he could not find one
to start with. The big kickstart for rms was that Sun had just started
to charge for its compilers, so a lot of people were looking for a free
alternative (and frankly, in those days the Sun compiler was still a bit of
a toy -- the 20% we got over them at Masscomp was because we had a number of
the folks from the DEC compiler team).
It is true that the targets, and the original systems it ran on, were more
limited. The 1.0 release was before the summer of '87 (in May, maybe???).
The biggest issue was that it did not run on DOS until the 386 and the
DOS extenders showed up. But it covered the many 68000 workstations and was
often as good as or better than the supplied compiler [which was mostly
based on/derived from the MIT Jack Test port of the Johnson compiler for the
NU system].
> but C had been taking off like wildfire for 10 years before that
>
At least 15 years before. By 1975, it was a solid fixture at most
Universities.
> - by the time gcc was mature (by some definition, but probably before
> 1990)
Mature is the key word here. gcc does not really start to mature
until Cygnus
takes it over. But it was quite usable for the systems it targeted.

From: Toby Thain <toby@telegraphics.com.au>
Subject: Re: [TUHS] History of popularity of C
Date: 2020-05-21T23:46:14Z

On 2020-05-21 4:56 PM, Clem Cole wrote:
>
>
> On Thu, May 21, 2020 at 12:17 PM Toby Thain <toby@telegraphics.com.au
> <mailto:toby@telegraphics.com.au>> wrote:
>
> - inexpensive compiler availability was not very good until ~1990 or
> later,
>
> Harrumph. The Gnu C compiler was starting to be available by the
> mid-1980s in alpha/beta form. rms was looking for places to start. He
Right, things were changing, but costly C compilers were a reality well
into the 90s, unless your use case happened to coincide with a gcc port.
And the reason this matters is that it contradicts the "C is popular
because compilers were easy" assertion. Not "easy", and not necessarily
cheap or free either.
> approached a number of folks, from Tanenbaum to some of the vendors (he
> knew Masscomp had written a compiler from scratch, whose binaries we
> gave away to our customers, and he called me asking if we would
> donate it. We had donated development hardware and I was still his
> contact to the Gnu project at that point).
>
> As far as I know, he ended up writing his own because he could not find
> one to start with. ...
>
>
>
>
>
> but C had been taking off like wildfire for 10 years before that
>
> At least 15 years before. By 1975, it was a solid fixture at most
> Universities.
Yes. I should have said "more than 10" :-)
--Toby
>
>
>
> - by the time gcc was mature (by some definition, but probably
> before1990)
>
> Mature is the key word here. gcc does not really start to mature until
> Cygnus takes it over. But it was quite usable for the systems that
> targeted it.

Richard Salz <rich.salz@gmail.com>, Re: [TUHS] History of popularity of C, 2020-05-21T23:57:48Z

Toby Thain <toby@telegraphics.com.au>, Re: [TUHS] History of popularity of C, 2020-05-22T00:18:20Z

On 2020-05-21 7:57 PM, Richard Salz wrote:
> Was the fact that gcc had the "portable" RTL as an intermediate
> representation important? That it was designed to be ported.
>
> And what about John Gilmore making all of BSD use it? And the multiple
> usenix tutorials?
Regardless of one's opinions on the ubiquity of gcc it wasn't mature and
accessible until at least 10-15 years after C was already popular
(depending how you count).
And gcc is hardly an "easy" compiler project ... to the OP's question.
--Toby

John Gilmore <gnu@toad.com>, Re: [TUHS] History of popularity of C, 2020-05-22T04:24:51Z

Richard Salz <rich.salz@gmail.com> wrote:
> And what about John Gilmore making all of BSD use it? And the multiple usenix
> tutorials?
I think Rich is referring to the time in 1987-8 when I spent some time
compiling the entire BSD distribution sources with the Vax version of
gcc. This was a volunteer effort on my part so that Berkeley could
adopt GCC to replace PCC. They got an ANSI C compiler, and avoided AT&T
copyright restrictions on Yet Another critical piece of Berkeley Unix.
GNU got an extensive test of GCC which moved it out of "beta" status.
I ended up taking extensive notes, and wrote a 1988 paper about the
experience, which I submitted to USENIX. But it was rejected, on the
theory that porting code (even ancient crufty Unix code) through new
compilers wasn't research. Indeed, I recall Kirk McKusick remarking to
me around that time that even Unix kernel ports to new architectures
were so routine as to not be research in his opinion.
Oddly, I was easily able to find that paper (thanks to Kryder's Law), so
I have appended it verbatim below (in troff with -ms macros). In short,
I found about a dozen bugs in GCC, which RMS fixed; and many hundreds of
bugs in the 4.3BSD Unix sources, which I fixed and Keith merged upstream.
Note the quaint footnoted homage to distributed collaboration, which was
still novel back then in the pre-Covid, pre-public-Internet, 2400 baud
modem era.
John
.TL
Porting Berkeley
.UX
through the GNU C Compiler
.AU
John Gilmore
.AI
Grasshopper Group
San Francisco, CA, USA 94117
gnu@toad.com
.AB
We have ported UC Berkeley's latest
.UX
sources through the GNU C Compiler,
a free draft-ANSI compatible compiler written by Richard Stallman and available from the Free
Software Foundation. In the process, we made Berkeley
.UX
more compatible
with the draft ANSI C standard, and tested the GNU C Compiler
for its full production release.
We describe the impact of various ANSI C changes on the Berkeley
.UX
sources, the kinds of non-portable code that the conversion uncovered,
and how we fixed them. We also briefly explore some limitations in the tools
used to build a
.UX
system.
.AE
.SH
Introduction
.PP
The GNU C Compiler (GCC) is a complete C compiler, compatible with the draft
ANSI standard, and
available in source from the Free Software Foundation (FSF). It was written by
Richard Stallman
in 1986 and 1987, and is (at this writing) in its
18th release. It is a major component of the GNU (``GNU's Not
.UX '')
project, whose aim
is to build a complete
.UX -like
software system,
available in source to anyone who wants it.
The compiler produces good code \(em better than most commercial
compilers \(em and has been ported to the Vax, MC680X0,
and NS32XXX.
.PP
Berkeley
.UX ,
from the Computer Systems Research Group (CSRG) at the University
of California at Berkeley,
had its start in the 1970's with a prerelease
.UX
Version 7, and
has been improving ever since. The current sources derive from the
1978 AT&T ``32V'' release, a V7 variant for the Vax. CSRG has produced
four major releases for the Vax
\(em 3, 4.1, 4.2, and 4.3BSD. These releases have set the
standard for high powered
.UX
systems for many years, and continue
to offer an improved alternative to the flat-tasting AT&T
.UX
releases.
.PP
However, Berkeley's C compiler is based on an old version of PCC,
the Portable C Compiler from AT&T. There was little chance that anyone
would provide ANSI C language extensions in this compiler, or do significant
work on optimizing the generated code. By merging the GNU C compiler
into the Berkeley release, we provided these new features to Berkeley
Unix users at a low cost,
while offering the GNU project an important test case for GNU C.
.SH
Goals
.PP
The major goal for the project is to move GCC out of ``beta test'' and
into ``production'' status,
by demonstrating that a successful
.UX
port can be based on it.
.PP
We are also providing a better maintained
compiler for Berkeley
.UX .
GCC already produces better
object code than the previous compiler,
has a more modern internal structure, and supports useful features
such as function prototype declarations.
It is also maintained by a large collection of people around the world,
who contribute their fixes and enhancements to the master sources.
Regular releases by the
Free Software Foundation encourage distribution of the improvements.
In contrast, PCC
is proprietary to AT&T, and few fixes are widely distributed, except as
part of infrequent and expensive AT&T releases.
.PP
We are producing a
.UX
source tree which can be compiled
by
.I both
the old and the new compilers. This is partly for convenience during the port,
partly in case the project suffers long delays,
and partly because Berkeley
.UX
also runs on the Tahoe, a fast Vax-like machine
built by Computer Consoles, which
GCC does not yet support.
We are avoiding the introduction of new
.B #ifdef 's,
instead rewriting the code so that it does not depend
on the features of either compiler.
.PP
We have to constantly remind ourselves to minimize the changes required.
It's too easy to get lost in a maze of twisty
.UX
code, all desperately
needing improvement.
.PP
Whenever we have to make a change, we have moved in the direction of
ANSI
C and POSIX compatibility.
.SH
People
.PP
The project was conceived by John Gilmore, and endorsed
by Keith Bostic and Mike Karels of CSRG, and Richard Stallman of FSF.
John did the major grunt work and provided fixes to the
.UX
code.
Keith and Mike provided machine
resources, collaborated
on major decisions, and arbitrated the style and content of the changes
to
.UX .
Richard provided quick turnaround on compiler bug fixes and problem
solving.
This setup worked extremely well.
.PP
We started work on 17 December 1987, and are not yet done at the
time of writing (19 February 1988). About 9 days of my time, 2 of Keith's,
half a day of Mike's, and XXX days of Richard's have gone into the
project so far.
.SH
Working Style
.PP
Most of the work was done over networks, in a loosely coordinated
style which was hard to conceive of only a few years ago.\(dg
.FS \(dg
Much of the free software work that is happening these days occurs in this
manner, and I would like to publicly thank the original DARPA pioneers who gave
birth to this vision of wide area, computer mediated collaborative work.
.FE
John worked in San Francisco,
Keith in Berkeley, and Richard in Cambridge. Keith set up an account and
a copy of the source tree on
.I vangogh ,
a Vax 8600 at Berkeley.
John spent a few
days in front of a Sun at Berkeley getting things straight, but did
most of the work by dialing in at 2400 baud from his office in San Francisco.
When we modified
.UX
source files, Keith
checked the changes and merged them back into the master
.UX
sources on another machine at Berkeley. When we found an apparent
bug in GCC, we isolated a small
excerpt or test program to demonstrate the bug, and forwarded it to Richard by Internet electronic
mail.
Bug fixes came back as new GCC releases, which were FTP'd over the Internet
from MIT. Ongoing status reports, discussions, and scheduling were done
by \fIuucp\fP and Internet electronic mail.
.PP
At this writing, we have used four GCC releases (1.15 through 1.18).
For each
GCC release, we did a ``pass'' over the
.UX
source tree;
one such pass included an updated source tree as well.
Each GCC
release was built, tested, and installed on
.I vangogh
without trouble.
Then we ran
.I "make clean; make"
on the source tree, and examined 500K to 800K of resulting
output. Keith Bostic's Makefiles did an excellent job of
automating this process, though we ran into some problems with the
.UX
compilation model in general, and limitations in
.I make
in particular.
.SH
ANSI Language Changes
.PP
The problems encountered during the port fell into two general categories.
Some of the code was not written portably and failed in the new environment.
Other code was written portably for its time, but failed because ANSI C
has redefined parts of the language. In some cases it was hard to tell
the difference; the consensus on what is ``portable code'' changes over
time, and on some points there is no agreement.
.PP
The major ANSI C problem was the generation of
.B "character constants in cpp" .
The traditional
.UX
C preprocessor (\fIcpp\fP), written by John F. Reiser, would
substitute a macro's parameters into like-named substrings even inside
single or double quotes in the macro definition. For example:
.DS
#define CTRL(c) ('c'&037)
#define CEOF CTRL(d)
.DE
In an attempt to make things easier for tokenizing preprocessors,
ANSI C has changed the
rules here, and there is in fact
.I no
way to generate a character constant containing a macro argument.
(There is a way to generate a character
.I string ,
e.g. double-quoted string, but not a single-quoted character.
We consider this a bug in ANSI C.)
Fixing this required altering both the macro definition and each reference
to the macro:
.DS
#define CTRL(c) (c&037)
#define CEOF CTRL('d')
.DE
This required changes in about 10 system include files and in about 45
source modules. Many user programs turned out to depend on the undocumented
.B CTRL
macro, defined in
.B <sys/ttychars.h> ,
and since all its callers had to change, all those programs did too.
.PP
Another \fIcpp\fP problem involved
.B "token concatenation" .
No formal facilities were provided for this in the old \fIcpp\fP, but many
users discovered that with code like this, from the /etc/passwd scanning code:
.DS
#define EXPAND(e) passwd.pw_/**/e = tp; while (*tp++ = *cp++);
EXPAND(name);
EXPAND(passwd);
.DE
they could cause a macro argument to be concatenated with another argument,
or with preexisting text, to make a single name. In one case
(\fIphantasia\fP),
the Makefile provided half of a quoted string as a command line
.B #define ,
and the source text provided the other half!
ANSI C
does not allow a preprocessor to concatenate tokens in these ways, instead
providing a newly invented
.B ##
operator, and new rules requiring the compiler to concatenate adjacent
character strings. Again,
it was impossible to write
a macro that works with both old and new compilers, and we didn't want
to uglify our code with
.B "#ifdef __STDC__" ;
our solution was to
rewrite both the macros and all their callers, to avoid ever having to
concatenate tokens:
.DS
#define EXPAND(e) passwd.e = tp; while (*tp++ = *cp++);
EXPAND(pw_name);
EXPAND(pw_passwd);
.DE
Mostly the token concatenation was used as a typing convenience, so this
was not a problem. It involved changes to five modules.
We found no clean solution for
.I phantasia ;
a fix will probably involve rewriting it to do explicit
string concatenations at runtime.
.PP
Changes to the
.B "scope of externals"
provided another set of widely scattered changes. If an external
identifier is declared from inside a function, PCC causes that declaration
to be visible to the entire remaining text of the source file.
This also applies to functions which are implicitly declared
when they first appear in an expression.
This
behaviour was not explicitly sanctioned by K&R,
but it was condoned (pg. 206, 2nd paragraph), and many programs depended on it.
ANSI C changed the scope rules to be more consistent; if you declare an
external identifier in a local block, the declaration has no effect outside
the block. We moved extern declarations to global scope, or added global
function declarations, in 38 files to handle this.
.PP
A number of programs used
.B "new keywords"
such as \fIsigned\fP or \fIconst\fP as identifiers. We renamed the identifiers
in 9 modules.
.PP
The Fortran libraries used a \fBtypedef name as a formal parameter\fP
to a set of functions. ANSI C has disallowed this, since it complicates
the parsing of the new prototype-style function declarations. We renamed
the parameter in 8 modules.
.PP
Three modules used a \fBtypedef with modifiers\fP, e.g.:
.DS
typedef int CONSZ;
x = (unsigned CONSZ) y;
.DE
This has been repudiated by ANSI C. We fixed it by making the original
typedef \fBunsigned\fP where possible, or by
creating a second typedef for ``U_CONSZ''.
.SH
Non-Portable Constructs
.PP
The worst non-portable construct we found in the
.UX
sources was the use of
.B "pointers to non-members" .
There was plenty of code as bad as:
.DS
int *foo;
foo->memb = 5;
if (foo->humbug >= -1) bah();
.DE
and, in many cases, \fImemb\fP and \fIhumbug\fP are not even members of
the same struct!
Such code seems to have been written with a ``BCPL'' mentality, assuming
that all pointers are really the same thing and it doesn't matter what their
type is. Early C implementations lacked the
.B union
declarator,
and did not distinguish between the members of different structures.
Exploiting this has been considered
bad practice for years, and lint checks for it,
though many
.UX
compilers do not. We found a lot of it in old code, though newer
code did not lack for examples either.
Fixing this problem caused the most work,
because we had to figure out what each untyped or mistyped pointer was
.I really
being used for, then fix its type, and whatever references to it were
inconsistent with that type. We changed 5 modules due to this.
One program, \fIefl\fP, would have required so much work
that we abandoned it, since we could
not find anyone using it.
.PP
Another problem was caused by existing uses of
.B "cpp on non-C sources" .
Various assembler language modules were being preprocessed by \fIcpp\fP,
probably
because there is no standard macro assembler for
.UX .
These modules are
carefully arranged to avoid confusing the old \fIcpp\fP; for example,
assembler language comments are introduced by
.B # ,
but indented so that \fIcpp\fP will not treat them as control lines.
ANSI \fIcpp\fP's handle white space on both sides of the ``#'', so
indentation no longer hides these comments. Also, the ANSI rules
require the preprocessor to keep track of which
material is inside single and double quotes and which is outside;
the old \fIcpp\fP terminated a character string or constant at the next
unescaped newline. Vax assembler language uses unmatched quotes
when specifying single ASCII characters, such as in immediate operands.
This causes an ANSI \fIcpp\fP to stop processing # directives at that point,
until it finds another
unmatched quote. We chose to alter the assembler modules to avoid
stumbling over these features in ANSI C preprocessors, without fixing the
larger problem of using a C-specific preprocessor on non-C text.
.PP
In addition to embedded C preprocessor statements in assembler
sources, we had to deal with
.B "asm() constructs"
in C source. Some system-dependent routines were written in C
with intermixed assembler code, producing a mess when compiled with
anything but the original compiler. Other routines, such as
.I compress ,
drop in an
.B asm()
here or there as an optimization. Still more modules, including the kernel,
run a
.I sed
script over the assembler code generated by the C compiler, before
assembling and linking it. There is no general solution to these
problems. GCC has added an asm() facility that is independent of
the compiler's register allocation strategy, but programs using this are
incompatible with the old C compiler.
We are investigating
a possible fix involving
changing all these places to use e.g.
.B "#include <machine/inline.h>"
which, in GCC, would define inline code containing asm()s, while
in PCC, declarations of (slower) external functions would be generated.
.PP
.I Troff
used
.B "multi-character constants"
in its font tables; we fixed it with a macro for building an int out of two
characters. A Fortran library module used the character constant
.B 'EOF' ,
presumably a typo for
.B EOF ;
and \fIrogue\fP defined the character '\300' as a possible command letter.
While ANSI C permits multiple character constants, they are implementation
defined, and GCC wisely defines them to be invalid (as the standard should
have done).
.PP
Some programs tried to declare functions or variables,
.B "omitting both type and storage class" .
This usage is not even valid in K&R, though PCC accepts it. We fixed this in
about 15
modules, by adding ``int'' to the declarations. There were two other modules
where this check uncovered inadvertent use of ``;'' in a declaration list
where ``,'' was intended.
.PP
GCC provides better error checking in a few ways, and caught a number
of bugs caused by misunderstood
.B "sign extension" .
It warns ``comparison is always 0 due to limited range of data type''
for constructs like:
.DS
char c;
if (c == 0x80) foo();
.DE
If a signed character contains the bit pattern 0x80, using it in an
expression causes it to be
sign-extended to 0xFFFFFF80, which does not equal 0x00000080.
Bugs of this sort were fixed, typically by casting the 0x80 to (char),
in 5 modules.
.PP
Changes to the rules for \fBparsing declarations\fP made us fix two modules
where the last declaration in a struct was immediately followed by a
closing brace, without a semicolon. Three more modules needed changes
because the rules for where braces are required in struct or array
initializers have changed. Four programs defined a \fBstruct foo\fP
and then referenced it as a \fBunion foo\fP, or vice versa. Two programs
declared \fBregister struct foo bar;\fP and then took bar's address, which
is not allowed for register variables!
.PP
Thirteen programs had miscellaneous \fBpointer usage bugs\fP
fixed. Two more were
comparing pointers to \fB-1\fP; these were changed to use zero as a
flag value instead.
.PP
In ANSI C, local variables in use at a
.B setjmp()
are no longer guaranteed to be preserved when a
.B longjmp()
occurs, unless they are declared \fBvolatile\fP. This
is not a problem for the Vax port, since the Vax longjmp()
will continue to restore the registers, but gcc warns about this
situation, since code that assumes restoration is not portable.
We have not yet worked on fixes for this.
.PP
Five or ten other miscellaneous bugs were caught and fixed.
.SH
Least portable
.UX
code
.PP
The process of porting software inevitably uncovers
a few files that cause a disproportionate share of problems.
For our port,
the clear winner is
.I efl ,
the Extended Fortran Language, by Stu Feldman.
It defines ``\fBtypedef int * ptr;\fP'' in a header file,
and then uses a ``ptr'' to point to anything.
GCC produced
1600 lines of error messages on this program alone, and three modules
of it caused compiler core dumps. We ended
up deciding to abandon support for it rather than attempt to clean
it up.
.PP
A runner-up is
.I pcc ,
the Portable C Compiler itself, by Steven C. Johnson.
It caused GCC to coredump twice, tickled another GCC parsing bug,
and contained the modified typedef and sign extension problems mentioned above.
.PP
Third place goes to
.I monop ,
the Monopoly\(dg
.FS \(dg
Trademark of Parker Brothers
.FE
game, by Ken Arnold. This
program used a variety of typed pointers, but the main pointer to
a set of structs was declared as a \fBchar *\fP. Another part of
the code initialized an array of struct pointers with integer values,
then a small loop at the beginning of the game would read out these
integers and replace them with corresponding ``real'' struct pointers.
It took about two days to face up to the job and about a day to clean
it up.
.PP
Honorable mention for silly mistakes goes to the
.I indent
program, by someone at the University of Illinois.
It contained the only instance of
.B "a + = b"
(with a space between + and =), and was the only module
to terminate its
.B #include
directives with a semicolon.
It also contained a comparison between a character and the value 0200,
a value that a signed 8-bit char can never hold.
.SH
Results
.PP
We are pleased with the results so far. Most of the
.UX
code compiled
without problems, and the parts which we have executed are free from
code generation bugs.
The worst of the ANSI C changes only required roughly fifty modules
to be changed, and there were only two problems of this magnitude.
A total of
twenty bugs in gcc were located so far, and most of them are now fixed.
We expected several times this many bugs; the compiler is in better
shape than any of us expected.
.PP
Many minor type problems and ``nit'' incompatibilities with ANSI C have
been removed from the
.UX
sources.
.SH
Future Results
.PP
\fI(This section will move to \fBResults\fP for the final paper.)\fP
.PP
We expect that the size of the
.UX
binaries will be significantly less than
with the previous compiler, but at the current stage of the project
we can't easily confirm the expectation.
.PP
When the system compiled with GCC is in everyday use at Berkeley, GCC
will be relabeled as a full production-quality compiler, which will
encourage its wider use.
.SH
Non-Results
.PP
We have not attempted to make Berkeley
.UX
fully ANSI C compliant.
In particular, we have retained preprocessor comments (#endif FOO)
as well as machine-specific \fB#define\fP's (#ifdef vax). GCC supports
these features without trouble, even though ANSI C does not.
.PP
The
.UX
kernel has not yet been ported to gcc. Other people are working on
this, compiling one module at a time and running it for a while before
moving on to the next. We will merge their work with
ours once we have the rest of the system in a stable state.
.PP
Pieces of the Portable C Compiler are still being used inside
.I "lint, f77" ,
and
.I pc .
Eventually someone will write Fortran and Pascal front-ends for gcc;
this has already been done for C++. So far nobody has created a GNU
\fIlint\fP, but it is an obvious project.
.PP
CSRG has ported Berkeley
.UX
to the Tahoe, a fast Vax-like machine
built by Computer Consoles and resold by Harris and others. We are looking
for someone to do a Tahoe port of gcc, to replace the PCC supplied by CCI.
.SH
Problems in Building
.UX
.PP
.UX
compilers traditionally look in certain global places in the
file system for their libraries, include files, etc. This is a problem
when cross-compiling, or when building a new
.UX
release (which almost
amounts to the same thing). While it is possible to provide a new
default directory for
.B #include
files, if a source program
.B #include s
a file that is not in the cross-compilation include files,
the C compiler will erroneously use the one from /usr/include.
There should be a switch that turns off \fIall\fP the built-in include
file and library pathnames, and only uses those specified on the
compiler's command line.
.PP
However, there is still the problem of getting those switches to the
compiler's command line.
.I Make
is a great tool for dealing with one directory's worth of files,
but as
.UX
has evolved, \fImake\fP has not kept up. Indeed, it has fallen behind;
Makefiles that worked perfectly well five years ago will no longer
work because each manufacturer (AT&T especially) has hacked up their
.I make
to include harmful, gratuitous, and mutually incompatible changes.
The result is that a Makefile that works on your system is unlikely
to work on your neighbor's system, unless they are from the same manufacturer,
and you happen to use the same login shell.
.PP
.I Make
works poorly on nested directory structures, too.
As an example, we could find no way to change ``cc'' to ``gcc'' in all the
Makefiles used to build Berkeley
.UX
(short of text-editing them all).
In a single directory, you can say
.I "make CC=gcc" ,
but this change is not propagated to subdirectories. You can manually
propagate that change one level by saying
.I "make CC=gcc MFLAGS='CC=gcc'"
but that only goes one level (at least in Berkeley's version of
.I make ).
We ended up putting a copy of gcc in a private
.I bin
directory, named
.I cc ,
and putting that directory on the front of the search path.
(When we later wanted to override CFLAGS as well, \fI~/bin/cc\fP
became a shell script that invokes
.I "gcc -W" ).
.PP
Another problem with
.I make
is that even when instructed to ignore errors (with -i or -k), it exits
if it can't locate a file that something else depends upon. This has the
effect of ``pruning'' a potentially large section
of the source hierarchy, and the
only warning is an unobtrusive
message buried among 500K of other output.
.PP
Of course, if someone were to fix these bugs in \fImake\fP, they would
be creating yet another incompatible version.
I have been watching the papers on the ``new makes'' and so far there
doesn't seem to be one that handles deeply nested
source trees in a clean and consistent fashion, or is otherwise
so much better than \fImake\fP that it's worth the effort to switch.
I think it is time to look for a completely new paradigm for
software compilation control. I don't have any major insights on where
to go from here, but it is clear to me that
.I make
and its derivatives have reached their useful limits.
.SH
Availability
.PP
These changes will be available to recipients of Berkeley's next software
distribution, whenever that is. We will also make diffs available
to others involved in porting
.UX
to ANSI C. We suspect that most of the
problems we solved have already been handled in one or another
.UX
port, but the work had to be duplicated because either it was not
sent back to Berkeley or AT&T, or the changes were not accepted. (AT&T
has a history of pretending that
.UX
bugs do not exist, and
Berkeley has limited manpower).
.SH
Future Work
.PP
Future projects include building a complete set of ANSI C and POSIX
compatible include files and libraries (including function prototypes),
and converting the existing sources to use them. An eventual goal
is to produce a fully standard-conforming
.UX
system \(em not only in
the interface provided to users, but with sources which will compile
and run on any standard-conforming compiler and libraries.
.PP
The success of this collaboration between GNU and CSRG has encouraged further
cooperation. Both parties feel that AT&T licensing
is a problem; most recipients of CSRG releases have old
.UX
licenses,
and are unwilling to upgrade to more expensive and more onerous AT&T
licenses. However, new AT&T releases include some features which would
be useful in Berkeley
.UX .
The GNU project is working to provide
early reimplementations of these features, such as improved shells and
``make'' commands. In return, CSRG is working to release software to
the public which has previously been held to be ``
.UX
licensed'' even though
it was not derived from AT&T code, such as the implementation
of TCP/IP, and many of the Berkeley utility programs.
.SH
References
.LP
\fIDraft Proposed American National Standard \(em Programming Language C\fP,
ANSI X3.J11, draft of October 1, 1986 (update for new draft when out).
CBEMA, 311 First Street NW #1500, Washington DC 20001.
.LP
\fI4.3BSD Manual Set\fP,
Computer Systems Research Group, University of California
at Berkeley.
.LP
Fowler, Glenn S., ``The Fourth Generation Make'', Usenix conference
proceedings, Summer 1985, page 159. (More references on ``make''
are provided in this paper.)
.LP
Hume, Andrew, ``Mk: a successor to make'', Usenix conference
proceedings, Summer 1987, page 445.
.LP
Kernighan, Brian W. and Ritchie, Dennis M., ``\fIThe
C Programming Language\fP'', Prentice-Hall, 1978.

arnold <arnold@skeeve.com>, Re: [TUHS] History of popularity of C, 2020-05-22T07:42:42Z

Richard Salz <rich.salz@gmail.com> wrote:
> Was the fact that gcc had the "portable" RTL as an intermediate
> representation important? That it was designed to be ported.
I think it was. GCC had *two* intermediate forms, one representing
the source program (trees), and the other representing instructions
(RTL). It was really designed to make it easy to write both new
front ends and new back ends.
In that it seems to have succeeded fairly well, too. :-)
Arnold

David Arnold <davida@pobox.com>, Re: [TUHS] History of popularity of C, 2020-05-22T08:34:40Z

On 22 May 2020, at 03:37, arnold@skeeve.com wrote:
<...>
> C++ became the language of choice on the PC when MSFT started pushing
> its compiler and Visual Studio IDE.
On the PC side, Turbo Pascal started to get displaced by Borland C++, I think in the early 90’s. I don’t have a good feel for why, but perhaps it was the parallel evolution of Microsoft’s C & C++, which were doing pretty well even before 1997 when Visual Studio began its rise.
Watcom C++ was also around, iirc it was available for OS/2 as well?
On the Unix side, the egcs fork of gcc pushed it forward a lot and the subsequent reverse takeover of gcc saved it from needing replacement far earlier.
Of course the commercial Unix vendors charging for their compilers helped gcc too, and by then Pascal, Modula/2/3, Ada ... everything else had become a niche market.
I don’t recall any hard data from back then though, sorry ...
d

Noel Chiappa <jnc@mercury.lcs.mit.edu> writes:
> I suspect the real reason for C's success was the nature of the language.
> When I first saw it (ca. 1976), it struck me as a quantum improvement over
> its contemporaries.
Paul Graham expressed it like this:
"It seems to me that there have been two really clean, consistent
models of programming so far: the C model and the Lisp model. These
two seem points of high ground, with swampy lowlands between them."
-tih
--
Most people who graduate with CS degrees don't understand the significance
of Lisp. Lisp is the most important idea in computer science. --Alan Kay

Tyler Adams <coppero1237@gmail.com>, Re: [TUHS] History of popularity of C, 2020-05-22T09:52:32Z

Awesome, looks like my theory was completely wrong. Here's what it looks
like to me; please correct me as needed.
C's popularity has 2 distinct phases.
1972-1987 Unix drove C. Writing a functional PCC for a particular
architecture was easy, but not unusually so compared to other languages at
the time.
1987- gcc made C uniquely free to compile, so people chose to write C
because it was free and already popular.
Perl also came out in 1987, and afaik that was always free, but C still
took off because there was so much room for multiple languages.
So, now I'm curious about embedded systems. In my limited experience, every
"embedded system" I programmed for from 2002-2011 had C as its primary
language. After 2011, I stopped programming embedded systems, so I don't
know after that. Why was C so dominant in this space? Is it because adding
a backend to gcc was free, C was already well known, and C was sufficiently
performant?
Tyler
On Fri, May 22, 2020, 11:53 Tom Ivar Helbekkmo <tih@hamartun.priv.no> wrote:
> Noel Chiappa <jnc@mercury.lcs.mit.edu> writes:
>
> > I suspect the real reason for C's success was the nature of the language.
> > When I first saw it (ca. 1976), it struck me as a quantum improvement
> over
> > its contemporaries.
>
> Paul Graham expressed it like this:
>
> "It seems to me that there have been two really clean, consistent
> models of programming so far: the C model and the Lisp model. These
> two seem points of high ground, with swampy lowlands between them."
>
> -tih
> --
> Most people who graduate with CS degrees don't understand the significance
> of Lisp. Lisp is the most important idea in computer science. --Alan Kay
>

arnold <arnold@skeeve.com> - Re: [TUHS] History of popularity of C - 2020-05-22T11:10:02Z

Tyler Adams <coppero1237@gmail.com> wrote:
> So, now I'm curious about embedded systems. In my limited experience, every
> "embedded system" I programmed for from 2002-2011 had C as its primary
> language. After 2011, I stopped programming embedded systems, so I don't
> know after that. Why was C so dominant in this space?
First of all, because C is the (almost) perfect language for embedded
systems - tight code generated, language close to the metal, etc. etc.
> Is it because adding
> a backend to gcc was free, C was already well known, and C was sufficiently
> performant?
Cygnus Solutions (Hi John!) had a lot to do with this. They specialized
in porting GCC to different processors used in embedded systems and
provided support.
Arnold

Tyler Adams <coppero1237@gmail.com> - Re: [TUHS] History of popularity of C - 2020-05-22T11:15:55Z

Doesn't C++ also generate tight code while staying fairly close to the metal?
Today C++ is the high-performance language of choice for game developers and HFT shops.
But I never found it on any of these embedded systems; it was straight C.
Tyler
On Fri, May 22, 2020, 14:09 <arnold@skeeve.com> wrote:
> Tyler Adams <coppero1237@gmail.com> wrote:
>
> > So, now I'm curious about embedded systems. In my limited experience,
> every
> > "embedded system" I programmed for from 2002-2011 had C as its primary
> > language. After 2011, I stopped programming embedded systems, so I don't
> > know after that. Why was C so dominant in this space?
>
> First of all, because C is the (almost) perfect language for embedded
> systems - tight code generated, language close to the metal, etc. etc.
>
> > Is it because adding
> > a backend to gcc was free, C was already well known, and C was
> sufficiently
> > performant?
>
> Cygnus Solutions (Hi John!) had a lot to do with this. They specialized
> in porting GCC to different processors used in embedded systems and
> provided support.
>
> Arnold
>

A. P. Garcia <a.phillip.garcia@gmail.com> - Re: [TUHS] History of popularity of C - 2020-05-22T11:59:08Z

On Fri, May 22, 2020, 5:52 AM Tyler Adams <coppero1237@gmail.com> wrote:
<snip>
> So, now I'm curious about embedded systems. In my limited experience, every
> "embedded system" I programmed for from 2002-2011 had C as its primary
> language. After 2011, I stopped programming embedded systems, so I don't
> know after that. Why was C so dominant in this space? Is it because adding
> a backend to gcc was free, C was already well known, and C was sufficiently
> performant?
>
I don't know how much gcc contributed to the success of C in the embedded
space. Microcontrollers are often programmed in assembly. They have memory
and speed constraints, much like the PDPs where C began. I think it goes
back to what Larry said about C being so close to the metal.

Larry McVoy <lm@mcvoy.com> - Re: [TUHS] History of popularity of C - 2020-05-22T14:12:04Z

On Thu, May 21, 2020 at 09:10:06PM -0700, John Gilmore wrote:
> Note the quaint footnoted homage to distributed collaboration, which was
> still novel back then in the pre-Covid, pre-public-Internet, 2400 baud
> modem era.
>
> John
http://mcvoy.com/lm/papers/porting-berkeley.pdf
for those who don't want to run it through groff. As an aside, this
didn't work (firefox couldn't display it):
groff -ms -Tpdf porting-berkeley.ms > porting-berkeley.pdf
but this did:
groff -ms porting-berkeley.ms > PS
ps2pdf PS porting-berkeley.pdf
I'll ask the groff people if they know what is up.

Larry McVoy <lm@mcvoy.com> - Re: [TUHS] History of popularity of C - 2020-05-22T14:18:41Z

Clem, you should read that paper, link again:
http://mcvoy.com/lm/papers/porting-berkeley.pdf
because it validates a lot of what I have said about not having access to
the AT&T code. The BSD code was slightly easier to get but even that,
around 1985 at UW-Madison, was locked up on an 11/750 named slovax.
I had to beg and beg to get a login on that machine. You had to be
somebody to get access to the source and I was still nobody.
I did get a login eventually, I think I had to sign some papers,
don't remember. I went on to spend so many happy hours reading the
sources that my primary machine, be it 68k, SPARC, MIPS, x86, whatever,
has been called slovax ever since.
On Thu, May 21, 2020 at 09:10:06PM -0700, John Gilmore wrote:
> Richard Salz <rich.salz@gmail.com> wrote:
> > And what about John Gilmore making all bsd use it? And the multiple usenix
> > tutorials?
>
> I think Rich is referring to the time in 1987-8 when I spent some time
> compiling the entire BSD distribution sources with the Vax version of
> gcc. This was a volunteer effort on my part so that Berkeley could
> adopt GCC to replace PCC. They got an ANSI C compiler, and avoided AT&T
> copyright restrictions on Yet Another critical piece of Berkeley Unix.
> GNU got an extensive test of GCC which moved it out of "beta" status.
>
> I ended up taking extensive notes, and wrote a 1988 paper about the
> experience, which I submitted to USENIX. But it was rejected, on the
> theory that porting code (even ancient crufty Unix code) through new
> compilers wasn't research. Indeed, I recall Kirk McKusick remarking to
> me around that time that even Unix kernel ports to new architectures
> were so routine as to not be research in his opinion.
>
> Oddly, I was easily able to find that paper (thanks to Kryder's Law), so
> I have appended it verbatim below (in troff with -ms macros). In short,
> I found about a dozen bugs in GCC, which RMS fixed; and many hundreds of
> bugs in the 4.3BSD Unix sources, which I fixed and Keith merged upstream.
>
> Note the quaint footnoted homage to distributed collaboration, which was
> still novel back then in the pre-Covid, pre-public-Internet, 2400 baud
> modem era.
>
> John
>
> .TL
> Porting Berkeley
> .UX
> through the GNU C Compiler
> .AU
> John Gilmore
> .AI
> Grasshopper Group
> San Francisco, CA, USA 94117
> gnu@toad.com
> .AB
> We have ported UC Berkeley's latest
> .UX
> sources through the GNU C Compiler,
> a free draft-ANSI compatible compiler written by Richard Stallman and available from the Free
> Software Foundation. In the process, we made Berkeley
> .UX
> more compatible
> with the draft ANSI C standard, and tested the GNU C Compiler
> for its full production release.
> We describe the impact of various ANSI C changes on the Berkeley
> .UX
> sources, the kinds of non-portable code that the conversion uncovered,
> and how we fixed them. We also briefly explore some limitations in the tools
> used to build a
> .UX
> system.
> .AE
> .SH
> Introduction
> .PP
> The GNU C Compiler (GCC) is a complete C compiler, compatible with the draft
> ANSI standard, and
> available in source from the Free Software Foundation (FSF). It was written by
> Richard Stallman
> in 1986 and 1987, and is (at this writing) in its
> 18th release. It is a major component of the GNU (``GNU's Not
> .UX '')
> project, whose aim
> is to build a complete
> .UX -like
> software system,
> available in source to anyone who wants it.
> The compiler produces good code \(em better than most commercial
> compilers \(em and has been ported to the Vax, MC680X0,
> and NS32XXX.
> .PP
> Berkeley
> .UX ,
> from the Computer Systems Research Group (CSRG) at the University
> of California at Berkeley,
> had its start in the 1970's with a prerelease
> .UX
> Version 7, and
> has been improving ever since. The current sources derive from the
> 1978 AT&T ``32V'' release, a V7 variant for the Vax. CSRG has produced
> four major releases for the Vax
> \(em 3, 4.1, 4.2, and 4.3BSD. These releases have set the
> standard for high powered
> .UX
> systems for many years, and continue
> to offer an improved alternative to the flat-tasting AT&T
> .UX
> releases.
> .PP
> However, Berkeley's C compiler is based on an old version of PCC,
> the Portable C Compiler from AT&T. There was little chance that anyone
> would provide ANSI C language extensions in this compiler, or do significant
> work on optimizing the generated code. By merging the GNU C compiler
> into the Berkeley release, we provided these new features to Berkeley
> Unix users at a low cost,
> while offering the GNU project an important test case for GNU C.
> .SH
> Goals
> .PP
> The major goal for the project is to move GCC out of ``beta test'' and
> into ``production'' status,
> by demonstrating that a successful
> .UX
> port can be based on it.
> .PP
> We are also providing a better maintained
> compiler for Berkeley
> .UX .
> GCC already produces better
> object code than the previous compiler,
> has a more modern internal structure, and supports useful features
> such as function prototype declarations.
> It is also maintained by a large collection of people around the world,
> who contribute their fixes and enhancements to the master sources.
> Regular releases by the
> Free Software Foundation encourage distribution of the improvements.
> In contrast, PCC
> is proprietary to AT&T, and few fixes are widely distributed, except as
> part of infrequent and expensive AT&T releases.
> .PP
> We are producing a
> .UX
> source tree which can be compiled
> by
> .I both
> the old and the new compilers. This is partly for convenience during the port,
> partly in case the project suffers long delays,
> and partly because Berkeley
> .UX
> also runs on the Tahoe, a fast Vax-like machine
> built by Computer Consoles, which
> GCC does not yet support.
> We are avoiding the introduction of new
> .B #ifdef 's,
> instead rewriting the code so that it does not depend
> on the features of either compiler.
> .PP
> We have to constantly remind ourselves to minimize the changes required.
> It's too easy to get lost in a maze of twisty
> .UX
> code, all desperately
> needing improvement.
> .PP
> Whenever we have to make a change, we have moved in the direction of
> ANSI
> C and POSIX compatibility.
> .SH
> People
> .PP
> The project was conceived by John Gilmore, and endorsed
> by Keith Bostic and Mike Karels of CSRG, and Richard Stallman of FSF.
> John did the major grunt work and provided fixes to the
> .UX
> code.
> Keith and Mike provided machine
> resources, collaborated
> on major decisions, and arbitrated the style and content of the changes
> to
> .UX .
> Richard provided quick turnaround on compiler bug fixes and problem
> solving.
> This setup worked extremely well.
> .PP
> We started work on 17 December 1987, and are not yet done at the
> time of writing (19 February 1988). About 9 days of my time, 2 of Keith's,
> half a day of Mike's, and XXX days of Richard's have gone into the
> project so far.
> .SH
> Working Style
> .PP
> Most of the work was done over networks, in a loosely coordinated
> style which was hard to conceive of only a few years ago.\(dg
> .FS \(dg
> Much of the free software work that is happening these days occurs in this
> manner, and I would like to publicly thank the original DARPA pioneers who gave
> birth to this vision of wide area, computer mediated collaborative work.
> .FE
> John worked in San Francisco,
> Keith in Berkeley, and Richard in Cambridge. Keith set up an account and
> a copy of the source tree on
> .I vangogh ,
> a Vax 8600 at Berkeley.
> John spent a few
> days in front of a Sun at Berkeley getting things straight, but did
> most of the work by dialing in at 2400 baud from his office in San Francisco.
> When we modified
> .UX
> source files, Keith
> checked the changes and merged them back into the master
> .UX
> sources on another machine at Berkeley. When we found an apparent
> bug in GCC, we isolated a small
> excerpt or test program to demonstrate the bug, and forwarded it to Richard by Internet electronic
> mail.
> Bug fixes came back as new GCC releases, which were FTP'd over the Internet
> from MIT. Ongoing status reports, discussions, and scheduling were done
> by \fIuucp\fP and Internet electronic mail.
> .PP
> At this writing, we have used four GCC releases (1.15 through 1.18).
> For each
> GCC release, we did a ``pass'' over the
> .UX
> source tree;
> one such pass included an updated source tree as well.
> Each GCC
> release was built, tested, and installed on
> .I vangogh
> without trouble.
> Then we ran
> .I "make clean; make"
> on the source tree, and examined 500K to 800K of resulting
> output. Keith Bostic's Makefiles did an excellent job of
> automating this process, though we ran into some problems with the
> .UX
> compilation model in general, and limitations in
> .I make
> in particular.
> .SH
> ANSI Language Changes
> .PP
> The problems encountered during the port fell into two general categories.
> Some of the code was not written portably and failed in the new environment.
> Other code was written portably for its time, but failed because ANSI C
> has redefined parts of the language. In some cases it was hard to tell
> the difference; the consensus on what is ``portable code'' changes over
> time, and on some points there is no agreement.
> .PP
> The major ANSI C problem was the generation of
> .B "character constants in cpp" .
> The traditional
> .UX
> C preprocessor (\fIcpp\fP), written by John F. Reiser, would
> substitute a macro's parameters into like-named substrings even inside
> single or double quotes in the macro definition. For example:
> .DS
> #define CTRL(c) ('c'&037)
> #define CEOF CTRL(d)
> .DE
> In an attempt to make things easier for tokenizing preprocessors,
> ANSI C has changed the
> rules here, and there is in fact
> .I no
> way to generate a character constant containing a macro argument.
> (There is a way to generate a character
> .I string ,
> e.g. double-quoted string, but not a single-quoted character.
> We consider this a bug in ANSI C.)
> Fixing this required altering both the macro definition and each reference
> to the macro:
> .DS
> #define CTRL(c) (c&037)
> #define CEOF CTRL('d')
> .DE
> This required changes in about 10 system include files and in about 45
> source modules. Many user programs turned out to depend on the undocumented
> .B CTRL
> macro, defined in
> .B <sys/ttychars.h> ,
> and since all its callers had to change, all those programs did too.
> .PP
> Another \fIcpp\fP problem involved
> .B "token concatenation" .
> No formal facilities were provided for this in the old \fIcpp\fP, but many
> users discovered that with code like this, from the /etc/passwd scanning code:
> .DS
> #define EXPAND(e) passwd.pw_/**/e = tp; while (*tp++ = *cp++);
> EXPAND(name);
> EXPAND(passwd);
> .DE
> they could cause a macro argument to be concatenated with another argument,
> or with preexisting text, to make a single name. In one case
> (\fIphantasia\fP),
> the Makefile provided half of a quoted string as a command line
> .B #define ,
> and the source text provided the other half!
> ANSI C
> does not allow a preprocessor to concatenate tokens in these ways, instead
> providing a newly invented
> .B ##
> operator, and new rules requiring the compiler to concatenate adjacent
> character strings. Again,
> it was impossible to write
> a macro that works with both old and new compilers, and we didn't want
> to uglify our code with
> .B "#ifdef __STDC__" ;
> our solution was to
> rewrite both the macros and all their callers, to avoid ever having to
> concatenate tokens:
> .DS
> #define EXPAND(e) passwd.e = tp; while (*tp++ = *cp++);
> EXPAND(pw_name);
> EXPAND(pw_passwd);
> .DE
> Mostly the token concatenation was used as a typing convenience, so this
> was not a problem. It involved changes to five modules.
> We found no clean solution for
> .I phantasia ;
> a fix will probably involve rewriting it to do explicit
> string concatenations at runtime.
> .PP
> Changes to the
> .B "scope of externals"
> provided another set of widely scattered changes. If an external
> identifier is declared from inside a function, PCC causes that declaration
> to be visible to the entire remaining text of the source file.
> This also applies to functions which are implicitly declared
> when they first appear in an expression.
> This
> behaviour was not explicitly sanctioned by K&R,
> but it was condoned (pg. 206, 2nd paragraph), and many programs depended on it.
> ANSI C changed the scope rules to be more consistent; if you declare an
> external identifier in a local block, the declaration has no effect outside
> the block. We moved extern declarations to global scope, or added global
> function declarations, in 38 files to handle this.
> .PP
> A number of programs used
> .B "new keywords"
> such as \fIsigned\fP or \fIconst\fP as identifiers. We renamed the identifiers
> in 9 modules.
> .PP
> The Fortran libraries used a \fBtypedef name as a formal parameter\fP
> to a set of functions. ANSI C has disallowed this, since it complicates
> the parsing of the new prototype-style function declarations. We renamed
> the parameter in 8 modules.
> .PP
> Three modules used a \fBtypedef with modifiers\fP, e.g.:
> .DS
> typedef int CONSZ;
> x = (unsigned CONSZ) y;
> .DE
> This has been repudiated by ANSI C. We fixed it by making the original
> typedef \fBunsigned\fP where possible, or by
> creating a second typedef for ``U_CONSZ''.
> .SH
> Non-Portable Constructs
> .PP
> The worst non-portable construct we found in the
> .UX
> sources was the use of
> .B "pointers to non-members" .
> There was plenty of code as bad as:
> .DS
> int *foo;
> foo->memb = 5
> if (foo->humbug >= -1) bah();
> .DE
> and, in many cases, \fImemb\fP and \fIhumbug\fP are not even members of
> the same struct!
> Such code seems to have been written with a ``BCPL'' mentality, assuming
> that all pointers are really the same thing and it doesn't matter what their
> type is. Early C implementations lacked the
> .B union
> declarator,
> and did not distinguish between the members of different structures.
> Exploiting this has been considered
> bad practice for years, and lint checks for it,
> though many
> .UX
> compilers do not. We found a lot of it in old code, though newer
> code did not lack for examples either.
> Fixing this problem caused the most work,
> because we had to figure out what each untyped or mistyped pointer was
> .I really
> being used for, then fix its type, and whatever references to it were
> inconsistent with that type. We changed 5 modules due to this.
> One program, \fIefl\fP, would have required so much work
> that we abandoned it, since we could
> not find anyone using it.
> .PP
> Another problem was caused by existing uses of
> .B "cpp on non-C sources" .
> Various assembler language modules were being preprocessed by \fIcpp\fP,
> probably
> because there is no standard macro assembler for
> .UX .
> These modules are
> carefully arranged to avoid confusing the old \fIcpp\fP; for example,
> assembler language comments are introduced by
> .B # ,
> but indented so that \fIcpp\fP will not treat them as control lines.
> ANSI \fIcpp\fP's handle white space on both sides of the ``#'', so
> indentation no longer hides these comments. Also, the ANSI rules
> require the preprocessor to keep track of which
> material is inside single and double quotes and which is outside;
> the old \fIcpp\fP terminated a character string or constant at the next
> unescaped newline. Vax assembler language uses unmatched quotes
> when specifying single ASCII characters, such as in immediate operands.
> This causes an ANSI \fIcpp\fP to stop processing # directives at that point,
> until it finds another
> unmatched quote. We chose to alter the assembler modules to avoid
> stumbling over these features in ANSI C preprocessors, without fixing the
> larger problem of using a C-specific preprocessor on non-C text.
> .PP
> In addition to embedded C preprocessor statements in assembler
> sources, we had to deal with
> .B "asm() constructs"
> in C source. Some system-dependent routines were written in C
> with intermixed assembler code, producing a mess when compiled with
> anything but the original compiler. Other routines, such as
> .I compress ,
> drop in an
> .B asm()
> here or there as an optimization. Still more modules, including the kernel,
> run a
> .I sed
> script over the assembler code generated by the C compiler, before
> assembling and linking it. There is no general solution to these
> problems. GCC has added an asm() facility that is independent of
> the compiler's register allocation strategy, but programs using this are
> incompatible with the old C compiler.
> We are investigating
> a possible fix involving
> changing all these places to use e.g.
> .B "#include <machine/inline.h>"
> which, in GCC, would define inline code containing asm()s, while
> in PCC, declarations of (slower) external functions would be generated.
> .PP
> .I Troff
> used
> .B "multi-character constants"
> in its font tables; we fixed it with a macro for building an int out of two
> characters. A Fortran library module used the character constant
> .B 'EOF' ,
> presumably a typo for
> .B EOF ;
> and \fIrogue\fP defined the character '\300' as a possible command letter.
> While ANSI C permits multiple character constants, they are implementation
> defined, and GCC wisely defines them to be invalid (as the standard should
> have done).
> .PP
> Some programs tried to declare functions or variables,
> .B "omitting both type and storage class" .
> This usage is not even valid in K&R, though PCC accepts it. We fixed this in
> about 15
> modules, by adding ``int'' to the declarations. There were two other modules
> where this check uncovered inadvertent use of ``;'' in a declaration list
> where ``,'' was intended.
> .PP
> GCC provides better error checking in a few ways, and caught a number
> of bugs caused by misunderstood
> .B "sign extension" .
> It warns ``comparison is always 0 due to limited range of data type''
> for constructs like:
> .DS
> char c;
> if (c == 0x80) foo();
> .DE
> If a signed character contains the bit pattern 0x80, using it in an
> expression causes it to be
> sign-extended to 0xFFFFFF80, which does not equal 0x00000080.
> Bugs of this sort were fixed, typically by casting the 0x80 to (char),
> in 5 modules.
> .PP
> Changes to the rules for \fBparsing declarations\fP made us fix two modules
> where the last declaration in a struct was immediately followed by a
> closing brace, without a semicolon. Three more modules needed changes
> because the rules for where braces are required in struct or array
> initializers have changed. Four programs defined a \fBstruct foo\fP
> and then referenced it as a \fBunion foo\fP, or vice versa. Two programs
> declared \fBregister struct foo bar;\fP and then took bar's address, which
> is not allowed for register variables!
> .PP
> Thirteen programs had miscellaneous \fBpointer usage bugs\fP
> fixed. Two more were
> comparing pointers to \fB-1\fP; these were changed to use zero as a
> flag value instead.
> .PP
> In ANSI C, local variables in use at a
> .B setjmp()
> are no longer guaranteed to be preserved when a
> .B longjmp()
> occurs, unless they are declared \fBvolatile\fP. This
> is not a problem for the Vax port, since the Vax longjmp()
> will continue to restore the registers, but gcc warns about this
> situation, since code that assumes restoration is not portable.
> We have not yet worked on fixes for this.
> .PP
> Five or ten other miscellaneous bugs were caught and fixed.
> .SH
> Least portable
> .UX
> code
> .PP
> The process of porting software inevitably uncovers
> a few files that cause a disproportionate share of problems.
> For our port,
> the clear winner is
> .I efl ,
> the Extended Fortran Language, by Stu Feldman.
> It defines ``\fBtypedef int * ptr;\fP'' in a header file,
> and then uses a ``ptr'' to point to anything.
> GCC produced
> 1600 lines of error messages on this program alone, and three modules
> of it caused compiler core dumps. We ended
> up deciding to abandon support for it rather than attempt to clean
> it up.
> .PP
> A runner-up is
> .I pcc ,
> the Portable C Compiler itself, by Steven C. Johnson.
> It caused GCC to coredump twice, tickled another GCC parsing bug,
> and contained the modified typedef and sign extension problems mentioned above.
> .PP
> Third place goes to
> .I monop ,
> the Monopoly\(dg
> .FS \(dg
> Trademark of Parker Brothers
> .FE
> game, by Ken Arnold. This
> program used a variety of typed pointers, but the main pointer to
> a set of structs was declared as a \fBchar *\fP. Another part of
> the code initialized an array of struct pointers with integer values,
> then a small loop at the beginning of the game would read out these
> integers and replace them with corresponding ``real'' struct pointers.
> It took about two days to face up to the job and about a day to clean
> it up.
> .PP
> Honorable mention for silly mistakes goes to the
> .I indent
> program, by someone at the University of Illinois.
> It contains the only instance of
> .B "a + = b"
> (with a space between + and =), and was the only module
> to terminate its
> .B #include
> directives with a semicolon.
> It also contained a comparison between a character and the value 0200,
> a value that a signed 8-bit char can never hold.
> .SH
> Results
> .PP
> We are pleased with the results so far. Most of the
> .UX
> code compiled
> without problems, and the parts which we have executed are free from
> code generation bugs.
> The worst of the ANSI C changes only required roughly fifty modules
> to be changed, and there were only two problems of this magnitude.
> A total of
> twenty bugs in gcc were located so far, and most of them are now fixed.
> We expected several times this many bugs; the compiler is in better
> shape than any of us expected.
> .PP
> Many minor type problems and ``nit'' incompatibilities with ANSI C have
> been removed from the
> .UX
> sources.
> .SH
> Future Results
> .PP
> \fI(This section will move to \fBResults\fP for the final paper.)\fP
> .PP
> We expect that the size of the
> .UX
> binaries will be significantly less than
> with the previous compiler, but at the current stage of the project
> we can't easily confirm the expectation.
> .PP
> When the system compiled with GCC is in everyday use at Berkeley, GCC
> will be relabeled as a full production-quality compiler, which will
> encourage its wider use.
> .SH
> Non-Results
> .PP
> We have not attempted to make Berkeley
> .UX
> fully ANSI C compliant.
> In particular, we have retained preprocessor comments (#endif FOO)
> as well as machine-specific \fB#define\fP's (#ifdef vax). GCC supports
> these features without trouble, even though ANSI C does not.
> .PP
> The
> .UX
> kernel has not yet been ported to gcc. Other people are working on
> this, compiling one module at a time and running it for a while before
> moving on to the next. We will merge their work with
> ours once we have the rest of the system in a stable state.
> .PP
> Pieces of the Portable C Compiler are still being used inside
> .I "lint, f77" ,
> and
> .I pc .
> Eventually someone will write Fortran and Pascal front-ends for gcc;
> this has already been done for C++. So far nobody has created a GNU
> \fIlint\fP, but it is an obvious project.
> .PP
> CSRG has ported Berkeley
> .UX
> to the Tahoe, a fast Vax-like machine
> built by Computer Consoles and resold by Harris and others. We are looking
> for someone to do a Tahoe port of gcc, to replace the PCC supplied by CCI.
> .SH
> Problems in Building
> .UX
> .PP
> .UX
> compilers traditionally look in certain global places in the
> file system for their libraries, include files, etc. This is a problem
> when cross-compiling, or when building a new
> .UX
> release (which almost
> amounts to the same thing). While it is possible to provide a new
> default directory for
> .B #include
> files, if a source program
> .B #include s
> a file that is not in the cross-compilation include files,
> the C compiler will erroneously use the one from /usr/include.
> There should be a switch that turns off \fIall\fP the built-in include
> file and library pathnames, and only uses those specified on the
> compiler's command line.
> .PP
> However, there is still the problem of getting those switches to the
> compiler's command line.
> .I Make
> is a great tool for dealing with one directory's worth of files,
> but as
> .UX
> has evolved, \fImake\fP has not kept up. Indeed, it has fallen behind;
> Makefiles that worked perfectly well five years ago will no longer
> work because each manufacturer (AT&T especially) has hacked up their
> .I make
> to include harmful, gratuitous, and mutually incompatible changes.
> The result is that a Makefile that works on your system is unlikely
> to work on your neighbor's system, unless they are from the same manufacturer,
> and you happen to use the same login shell.
> .PP
> .I Make
> works poorly on nested directory structures, too.
> As an example, we could find no way to change ``cc'' to ``gcc'' in all the
> Makefiles used to build Berkeley
> .UX
> (short of text-editing them all).
> In a single directory, you can say
> .I "make CC=gcc" ,
> but this change is not propagated to subdirectories. You can manually
> propagate that change one level by saying
> .I "make CC=gcc MFLAGS='CC=gcc'"
> but that only goes one level (at least in Berkeley's version of
> .I make ).
> We ended up putting a copy of gcc in a private
> .I bin
> directory, named
> .I cc ,
> and putting that directory on the front of the search path.
> (When we later wanted to override CFLAGS as well, \fI~/bin/cc\fP
> became a shell script that invokes
> .I "gcc -W" ).
> .PP
> Another problem with
> .I make
> is that even if it was instructed to ignore errors (with -i or -k), it exits
> if it can't locate a file that something else depends upon. This has the
> effect of ``pruning'' a potentially large section
> of the source hierarchy, and the
> only warning is an unobtrusive
> message buried among 500K of other output.
> .PP
> Of course, if someone were to fix these bugs in \fImake\fP, they would
> be creating yet another incompatible version.
> I have been watching the papers on the ``new makes'' and so far there
> doesn't seem to be one that handles deeply nested
> source trees in a clean and consistent fashion, or is otherwise
> so much better than \fImake\fP that it's worth the effort to switch.
> I think it is time to look for a completely new paradigm for
> software compilation control. I don't have any major insights on where
> to go from here, but it is clear to me that
> .I make
> and its derivatives have reached their useful limits.
> .SH
> Availability
> .PP
> These changes will be available to recipients of Berkeley's next software
> distribution, whenever that is. We will also make diffs available
> to others involved in porting
> .UX
> to ANSI C. We suspect that most of the
> problems we solved have already been handled in one or another
> .UX
> port, but the work had to be duplicated because either it was not
> sent back to Berkeley or AT&T, or the changes were not accepted. (AT&T
> has a history of pretending that
> .UX
> bugs do not exist, and
> Berkeley has limited manpower).
> .SH
> Future Work
> .PP
> Future projects include building a complete set of ANSI C and POSIX
> compatible include files and libraries (including function prototypes),
> and converting the existing sources to use them. An eventual goal
> is to produce a fully standard-conforming
> .UX
> system \(em not only in
> the interface provided to users, but with sources which will compile
> and run on any standard-conforming compiler and libraries.
> .PP
> The success of this collaboration between GNU and CSRG has encouraged further
> cooperation. Both parties feel that AT&T licensing
> is a problem; most recipients of CSRG releases have old
> .UX
> licenses,
> and are unwilling to upgrade to more expensive and more onerous AT&T
> licenses. However, new AT&T releases include some features which would
> be useful in Berkeley
> .UX .
> The GNU project is working to provide
> early reimplementations of these features, such as improved shells and
> ``make'' commands. In return, CSRG is working to release software to
> the public which has previously been held to be ``
> .UX
> licensed'' even though
> it was not derived from AT&T code, such as the implementation
> of TCP/IP, and many of the Berkeley utility programs.
> .SH
> References
> .LP
> \fIDraft Proposed American National Standard \(em Programming Language C\fP,
> ANSI X3.J11, draft of October 1, 1986 (update for new draft when out).
> CBEMA, 311 First Street NW #1500, Washington DC 20001.
> .LP
> \fI4.3BSD Manual Set\fP,
> Computer Systems Research Group, University of California
> at Berkeley.
> .LP
> Fowler, Glenn S., ``The Fourth Generation Make'', Usenix conference
> proceedings, Summer 1985, page 159. (More references on ``make''
> are provided in this paper.)
> .LP
> Hume, Andrew, ``Mk: a successor to make'', Usenix conference
> proceedings, Summer 1987, page 445.
> .LP
> Kernighan, Brian W. and Ritchie, Dennis M., ``\fIThe
> C Programming Language\fP'', Prentice-Hall, 1978.
--
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm

Richard Salzrich.salz@gmail.comRe: [TUHS] History of popularity of C2020-05-22T14:35:02Zurn:uuid:6dceace5-f283-0266-eaac-c0120665b8c7

Toby Thaintoby@telegraphics.com.auRe: [TUHS] History of popularity of C2020-05-22T14:59:52Zurn:uuid:fc958c7c-ebf1-ef62-280c-f46f7040b80e

On 2020-05-22 7:09 AM, arnold@skeeve.com wrote:
> Tyler Adams <coppero1237@gmail.com> wrote:
>
>> So, now I'm curious about embedded systems. In my limited experience, every
>> "embedded system" I programmed for from 2002-2011 had C as its primary
>> language. After 2011, I stopped programming embedded systems, so I don't
>> know after that. Why was C so dominant in this space?
>
> First of all, because C is the (almost) perfect language for embedded
> systems - tight code generated, language close to the metal, etc. etc.
To my recollection, in 1985 C wasn't primarily considered an embedded
language; it was considered an applications language (so was assembly,
though we could say its use was tapering off).
I believe the explosion in popularity was due to that lesson from Unix,
that you could have a single portable language for both "system" code
and applications code, with a modern looking syntax, that could be self
hosted and compiled to reasonably efficient machine code.
All those tradeoffs and definitions are very different 40 years later,
of course. (And C was far from the first or only language that met those
criteria before 1975. It just happened to take off.)
>
>> Is it because adding
>> a backend to gcc was free, C was already well known, and C was sufficiently
>> performant?
>
> Cygnus Solutions (Hi John!) had a lot to do with this. They specialized
> in porting GCC to different processors used in embedded systems and
> provided support.
Having to get a paid consultant doesn't exactly argue for the idea that
C compilers were "easy" - plus it's almost a decade after the period of
high growth. So this doesn't seem strong support for the thesis quoted
by OP.
--Toby
>
> Arnold
>

John Gilmoregnu@toad.comRe: [TUHS] History of popularity of C2020-05-22T18:41:07Zurn:uuid:a76701e4-1ef8-8ca5-6203-feff23ad96b2

Tyler Adams <coppero1237@gmail.com> wrote:
> Doesn't C++ also generate tight code and is fairly close to the metal?
> Today C++ is the high performant language for game developers and HFT shops.
>
> But, I never found it on any of these embedded systems, it was straight C.
My take on this is that programmers who understand the underlying
hardware architecture can easily intuit the code that would result from
what they write in C. There are only a few late features (e.g. struct
parameters, longjmp) that require complex code to be generated, or
function calls to occur where no function call was written by the
programmer.
Whereas in C++, Pascal, Python, APL, etc, a few characters can cause the
generated code to do immense amounts of unexpected work. Think of
string compares, hash table types, object initializers, or arbitrary
amounts of jumping through tables of pointers to different kinds of
objects. Automated memory allocation. Garbage collection.
This is both a blessing and a curse. In C it was quite predictable how
well or badly typical sections of your code would perform. If the
performance was bad, it was YOUR fault! But at least YOU could fix it,
without learning to hack a compiler instead of your own application.
(I once found Berkeley SPICE code doing string compares in a triply
nested loop, just to look up the names of the signals. In C. Making
changes to a large state machine going into a custom chip was taking the
Sun hardware engineers multiple hours per change. I spent weeks finding
the source code (Sun's tools group was dysfunctional; I got it from
UCB). In half a day of profiling it and fixing it to cache the
result of the first string lookup on each signal name, four hour
rebuilds went down to under a minute. A second day of profiling
and cacheing, just for fun, took it down to 10 seconds.)
John

Toby Thaintoby@telegraphics.com.auRe: [TUHS] History of popularity of C2020-05-22T19:02:13Zurn:uuid:5fee7ccd-4c08-64ee-1561-a70631198610

On 2020-05-22 2:40 PM, John Gilmore wrote:
> Tyler Adams <coppero1237@gmail.com> wrote:
>> Doesn't C++ also generate tight code and is fairly close to the metal?
>> Today C++ is the high performant language for game developers and HFT shops.
>>
>> But, I never found it on any of these embedded systems, it was straight C.
>
> My take on this is that programmers who understand the underlying
> hardware architecture can easily intuit the code that would result from
> what they write in C. There are only a few late features (e.g. struct
A short time playing with Godbolt should challenge that view :)
https://godbolt.org/
> parameters, longjmp) that require complex code to be generated, or
> function calls to occur where no function call was written by the
> programmer.
>
> Whereas ...
>
> John
>

Larry McVoylm@mcvoy.comRe: [TUHS] History of popularity of C2020-05-22T19:31:59Zurn:uuid:de896df0-cd14-71d5-4208-e37ef1374f88

On Fri, May 22, 2020 at 11:40:11AM -0700, John Gilmore wrote:
> Tyler Adams <coppero1237@gmail.com> wrote:
> > Doesn't C++ also generate tight code and is fairly close to the metal?
> > Today C++ is the high performant language for game developers and HFT shops.
> >
> > But, I never found it on any of these embedded systems, it was straight C.
>
> My take on this is that programmers who understand the underlying
> hardware architecture can easily intuit the code that would result from
> what they write in C. There are only a few late features (e.g. struct
> parameters, longjmp) that require complex code to be generated, or
> function calls to occur where no function call was written by the
> programmer.
Amen.
> Whereas in C++, Pascal, Python, APL, etc, a few characters can cause the
> generated code to do immense amounts of unexpected work. Think of
> string compares, hash table types, object initializers, or arbitrary
> amounts of jumping through tables of pointers to different kinds of
> objects. Automated memory allocation. Garbage collection.
Double amen.
> This is both a blessing and a curse. In C it was quite predictable how
> well or badly typical sections of your code would perform. If the
> performance was bad, it was YOUR fault! But at least YOU could fix it,
> without learning to hack a compiler instead of your own application.
Triple amen.
> (I once found Berkeley SPICE code doing string compares in a triply
> nested loop, just to look up the names of the signals. In C. Making
> changes to a large state machine going into a custom chip was taking the
> Sun hardware engineers multiple hours per change. I spent weeks finding
> the source code (Sun's tools group was dysfunctional; I got it from
> UCB). In half a day of profiling it and fixing it to cache the
> result of the first string lookup on each signal name, four hour
> rebuilds went down to under a minute. A second day of profiling
> and cacheing, just for fun, took it down to 10 seconds.)
Gazillion amens. (I especially loved the jab at Sun's tools group; I
wrote the SCM that Sun used for Solaris initially. They tried to get
me to join the tools group to make my stuff "official" - it worked just
fine being "unofficial". I took a look at the people in the tools group,
no offense, but it was a big step down from working with people like srk
and gingell and shannon, not to mention that all of my peers were smart.
Tools group, just say no.)

Larry McVoylm@mcvoy.comRe: [TUHS] History of popularity of C2020-05-22T19:36:16Zurn:uuid:940f872e-fa39-fdd8-6eeb-9fb6ebe8eacd

On Fri, May 22, 2020 at 03:01:40PM -0400, Toby Thain wrote:
> On 2020-05-22 2:40 PM, John Gilmore wrote:
> > Tyler Adams <coppero1237@gmail.com> wrote:
> >> Doesn't C++ also generate tight code and is fairly close to the metal?
> >> Today C++ is the high performant language for game developers and HFT shops.
> >>
> >> But, I never found it on any of these embedded systems, it was straight C.
> >
> > My take on this is that programmers who understand the underlying
> > hardware architecture can easily intuit the code that would result from
> > what they write in C. There are only a few late features (e.g. struct
>
> A short time playing with Godbolt should challenge that view :)
>
> https://godbolt.org/
>
>
> > parameters, longjmp) that require complex code to be generated, or
> > function calls to occur where no function call was written by the
> > programmer.
What John didn't mention (he just assumes people know it, and that
everyone is the same) is that he is an excellent C programmer; I could
fix bugs in his code.
You can always find someone who will make a mess of any language.
That's not the point.
If you have decent programmers, you will be able to understand
and fix their C code. If you have really good C programmers, like
my company did, you can start to predict what the bottom half of the
function looks like by reading the top half. We wrote very stylized C,
and were not afraid of gotos when used wisely.

Michael Kjörlingmichael@kjorling.seRe: [TUHS] History of popularity of C2020-05-22T20:20:16Zurn:uuid:4afbb608-67e3-3031-4563-dd92acafaa43

On 22 May 2020 11:40 -0700, from gnu@toad.com (John Gilmore):
> Whereas in C++, Pascal, Python, APL, etc, a few characters can cause the
> generated code to do immense amounts of unexpected work. Think of
> string compares, hash table types, object initializers, or arbitrary
> amounts of jumping through tables of pointers to different kinds of
> objects. Automated memory allocation. Garbage collection.
What you wrote is pretty much my take on the subject as well.
However, part of me wants to say "let's not compare apples to
airplanes just because both start with 'a' and one can typically be
placed within the other".
C++ adds a ton of features on top of C, never mind early C, though for
the features that at least earlier C has (I'm honestly not sure about
the newer additions), C++ has very similar or downright identical
syntax compared to C.
As long as you stay with the basic C feature set, I strongly suspect
that most programmers who can follow along in the C to assembler to
machine code compilation process, can do much the same thing with C++.
It's when you start piling all the extras on top of it that things get
hairy from a code generation perspective.
Vectors? Function overloading? Exceptions? RAII? Try predicting the
execution order of destructors during exception handling for classes
with multiple inheritance where multiple inherited-from classes define
destructors. Anything else? :-)
--
Michael Kjörling • https://michael.kjorling.se • michael@kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”

John Gilmoregnu@toad.comRe: [TUHS] History of popularity of C (GCC/Cygnus)2020-05-22T20:40:17Zurn:uuid:59b293c1-272f-c9f8-258b-c810f0cd9a98

Tyler Adams <coppero1237@gmail.com> wrote:
> > Is it because adding
> > a backend to gcc was free, C was already well known, and C was sufficiently
> > performant?
arnold@skeeve.com wrote:
> Cygnus Solutions (Hi John!) had a lot to do with this. They specialized
> in porting GCC to different processors used in embedded systems and
> provided support.
First things first. When figuring out what happened and what became
popular, it's important to look at where money was flowing. Economics
will tell you things about systems that you can't learn any other way.
Second, until the embedded market included 32-bit processors, gcc was
unknown in it. 32-bit was way less than 1% of the embedded market; only
in multi-thousand dollar things like laser printers (Adobe/Apple
LaserWriter) and network switches (3Com/Cisco/etc). Cygnus ended up
with lots of those companies as support customers, because they were
sick of porting their code through a different compiler for each new
generation of hardware platforms. But we had zero visibility into the
vast majority of the embedded market. We went there because even our
tiny niche of it was huge, many times the size of the market for native
compilers, and with much more diversity of customers.
Early on, GCC had the slight advantage that because it was free (as in
both beer and speech) and had an email community of maintainers, many
people had started ports to different architectures. Only a few of
those were production-quality, but they each offered at least a starting
point, and attracted interested users who might pay us to make them
real.
Cygnus was able to deliver production compilers for each new
architecture for significantly less than the other companies building
compilers for embedded systems. I think that had more to do with our
pricing strategy than the actual cost of modifying the compiler. Our
main competitors were half a dozen small, fat, lazy companies who
charged $10,000 PER SEAT for cross-compilers and charged the chipmaker
$1,000,000 and frequently more, to do a port to their latest chip.
Cygnus charged chipmakers $500K for a brand new architecture, and $0 per
seat, which caused us to eat our competitors' lunch over our first 3 to
5 years. Then we hired someone who knew more about pricing, and raised
our own port price to a larger fraction of what the market would bear,
to get better margins while still winning deals.
We built the first 64-bit ports of GCC and the tools, for the SPARC when
SPARC-64 was still secret, and later for other architectures. (Sun's
hardware diagnostics group funded our work. They needed to be able to
compile their chip and system tests, a full year before Sun's compiler
group could deliver 64-bit compilers for customers.)
A lot of what got done to make GCC a standard, production worthy
compiler had little to do with the code generation. For example, many
customers really wanted cross-compilers hosted on DOS and Windows, as
well as on various UNIX machines, so we ended up hiring the genius who
created djgpp, DJ Delorie, and making THAT into a commercial quality
supported product. (We also hired the guy who made GCC run in the old
MacOS development environment (Stan Shebs) and one of the senior
developers and distributors for free Amiga applications (Fred Fish).)
We had to vastly improve the testing infrastructure for the compiler and
tools. We designed and built DejaGnu (Rob Savoye's work), and with each
bug report, we added to an increasingly serious free C and C++ compiler test
suite. We automated enough of our build process that we could compare
the code produced by different host systems for the same target
platform. (The "reproducible builds" teams are now trying to do that
for the whole Linux distribution code base.) DejaGnu actually ran our
test suite on multiple target platforms with different target
architectures, downloading the binaries over serial ports and jumping to
them, and compared the tests' output so we could fix any discrepancies
that were target-dependent. We hired full-time writers (initially
Roland Pesch, who had been a serious programmer earlier in life) to
write real manuals and other documentation. We wrote an email-based bug
tracking system (PRMS), and the first working over-the-Internet version
control system (remote cvs). Our customers all used different object
file formats, so we wrote new code for format independence (the BFD
library) in the assembler, linker, size, nm and ar and other tools
(e.g. we created objdump and objcopy). Ultimately Steve Chamberlain
wrote us and GNU a brand-new linker which had the flexibility needed for
building embedded system binaries and putting code and data wherever the
hardware needed it to go.
We learned how to hire and manage remote employees, which meant we were
able to hire talented gcc and tools hackers from around the country and
the world, who jumped at the chance to turn their beloved hobby into a
full-time paying gig. We started our own ISP in order to get ourselves
good, cheap commercial quality Internet access, and so we could teach
our remote employees how to buy and install solid 56kbit/sec Frame Relay
connections rather than flaky dialup access.
And because we didn't control the master source code for gcc, one of our
senior compiler hackers, Jim Wilson, spent a huge fraction of his time
merging our changes upstream into FSF GCC, and merging their changes
downstream into our product, keeping the ecosystem in sync. We handled
that overhead for significant other tools by taking up the whole work of
maintenance and release engineering -- for example, I became FSF's
maintainer for gdb. I would make an FSF GDB release two weeks before
Cygnus would make its own integrated toolchain releases. If bug reports
didn't start streaming in within days from the free software community,
we knew we had made a solid release; and we had time to patch anything
that turned up, before our customers got it from us on cartridge tapes.
It wasn't just a compiler, it was a whole ecosystem that had to be built
or improved. About half of our employees were software engineers, so by
the time our revenues grew from <$1M/year to $25M a year, we were
spending about $12M every year improving the free software ecosystem.
And because we avoided venture capital for six years, and shared the
stock ownership widely among the employees, when we got lucky after 10
years and were acquired by Red Hat, the first free software company
to go public (VA Linux followed shortly after), all those hackers became
millionaires. A case of doing well by doing good.
John

Greg A. Woodswoods@robohack.caRe: [TUHS] History of popularity of C2020-05-22T23:51:00Zurn:uuid:59c38f27-fbea-7ae8-6389-ad2544972aa4

I always assumed C became popular because there was a very large cohort
of programmers who started with it as their first language, usually on
early Unix, at university, in the very late 1970s and early 1980s.
After all, if I was exposed to it at a small Canadian university in the
early 1980s, then surely it was almost everywhere!
At least that's how it happened for me. I was already fluent in BASIC
and reasonably good at Pascal before I went to university, and though we
had a very wide variety of languages to work with since we had accounts
on both Unix and Multics systems right from the start of first year, C
was the strong favourite amongst both juniors and phds, i.e. all but the
most die-hard Multics lovers (who of course used and loved PL/1, though
by 1985 there was even talk of C on Multics).
Some of this popularity of C was no doubt due to the fact that those a
year or two ahead of me had started with FORTRAN on an IBM 370 and had
absolutely hated it and were very vocal to those of us coming up behind
that we were very lucky to jump right onto the Unix (and Multics)
machines right from the start.
My first job programming in 1983/84 was back to BASIC and assembler, but
a year later and I was writing C again (though sadly mostly on MS-DOS,
briefly on Xenix, then back to very early MS-Windows until about 1988 --
not long in hindsight, but it was painful).
At Thu, 21 May 2020 12:10:35 -0400, Toby Thain <toby@telegraphics.com.au> wrote:
Subject: Re: [TUHS] History of popularity of C
>
> - inexpensive compiler availability was not very good until ~1990 or
> later, but C had been taking off like wildfire for 10 years before that
Well, there were a plethora of both full C and "tiny"/"small" C
compilers widely available in the very early 1980s.
Indeed I would say inexpensive C compilers were widely available and
very popular well before 1985, and a few "toy/tiny" compilers were
freely available by then too. By 1985 I was doing C development,
primarily on MS-DOS systems, using commercial compilers, for a wide
variety of projects, mostly in big national companies (in Canada, such
as CP Rail). I would say C was the first commercially successful
systems-level language available across many platforms, and that this
was evidently so by 1985.
Early Atari (6502) computers were partly programmed with a cross-
compiler, though I've no idea what it was (possibly a re-targeted PCC).
I think VisiCalc had similar origins.
The most ground-breaking C compiler might arguably have been
P.J.Plauger's Whitesmiths C compiler, around about 1978. I don't think
it was what you'd call "inexpensive" necessarily, but it was popular.
The BD Software company's C compiler for CP/M (8080/z80) was released in
1979.
The first version of Mark Williams C came out very early, possibly
before 1980. I owned a copy for MS-DOS 386 by 1985/86. This was the
most Unix-like compiler and library, by far, and quite inexpensive (else
I wouldn't have been able to afford my own personal copy).
Small-C appeared in Dr. Dobb's in May 1980 (and it spawned a plethora of
derivatives of its own). C was everywhere in personal computing
literature by 1980.
I believe Aztec C was first released in 1980.
Two books about C were published by McGraw-Hill in 1982: "The C
Primer", Les Hancock and Morris Krieger; and "The C Puzzle Book", Alan
R. Feuer. There were likely more.
Then there was Lattice C, out and about by 1982 and VERY popular and
widely used by 1984. (I was using the second version in 1985/1986 on
PCs. It's probably the buggiest compiler I've ever used for real work
projects.)
"Learning to Program in C" by Thomas Plum was published 1983.
And of course there was Tanenbaum and Jacobs' ACK, with a C parser
front-end in the early 1980s (even by 1980?).
Brad Templeton wrote a C (or maybe Tiny-C) compiler for C64/6502 around
about 1984 (though he only commercialized the "PAL" assembler I think).
In my estimation GCC really only served to cement C's early success and
popularity. It gave people certainty that a good C compiler would be
available for most any platform no matter what happened.
I would also argue that non-Unix C compilers actually drove the adoption
curve of C. Pascal tried to play catch-up, but just as with what
happened to me in university where it was one of the teaching languages,
C was just far more popular and though Pascal had a tiny head-start (in
terms of first-published books/manuals), C overtook it and had far more
staying power too (though indeed in the late 1980s there was a fair
battle going on in the pc/mac/amiga/etc world for Pascal).
--
Greg A. Woods <gwoods@acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>

Thomas Paulsenthomas.paulsen@firemail.deRe: [TUHS] History of popularity of C (GCC/Cygnus)2020-05-23T04:33:41Zurn:uuid:d85df654-0747-5628-afae-fd3a804e5a87

>Early on, GCC had the slight advantage that because it was free (as in
>both beer and speech) and had an email community of maintainers
I remember that we started moving to gcc, gmake, etc., because these tools
performed, simply put, better than the native SNI ones.

Andy Koselaakosela@andykosela.comRe: [TUHS] History of popularity of C2020-05-23T07:29:00Zurn:uuid:9738ded7-0222-4c89-2a5c-2e7fc0d757d2

On 5/23/20, Greg A. Woods <gwoods@acm.org> wrote:
>
> I would also argue that non-Unix C compilers actually drove the adoption
> curve of C. Pascal tried to play catch-up, but just as with what
> happened to me in university where it was one of the teaching languages,
> C was just far more popular and though Pascal had a tiny head-start (in
> terms of first-published books/manuals), C overtook it and had far more
> staying power too (though indeed in the late 1980s there was a fair
> battle going on in the pc/mac/amiga/etc world for Pascal).
This is my recollection as well. In the late 80s with the
introduction of really nice compilers for MS-DOS like Turbo C from
Borland (1987), Watcom C 6.0 (1988) and mature versions of Microsoft C
(which originally was based on Lattice C), the C future was
solidified.
The documentation coming with those compilers was also excellent. I
still have tons of reference books from that period. It was a time
when almost everybody was using pure C. I think C++ needed another
5-7 years to displace C in the application market.
--A

Clem Coleclemc@ccc.comRe: [TUHS] History of popularity of C2020-05-23T17:09:43Zurn:uuid:6422d267-1484-055c-dcca-3db4f495b34d

On Fri, May 22, 2020 at 7:51 PM Greg A. Woods <woods@robohack.ca> wrote:
> I always assumed C became popular because there was a very large cohort
> of programmers who started with it as their first language, usually on
> early Unix, at university, in the late very 1970s and early 1980s.
>
Exactly - giving away UNIX cemented the language and the technology
in a group of young engineers (like me) who then 'spread the gospel' when
we went to real jobs.
> Well, there were a plethora of both full C and "tiny"/"small" C
> compilers widely available in the very early 1980s.
>
Yep -- I listed a little of the pre-history.
>
> Indeed I would say inexpensive C compilers were widely available and
> very popular well before 1985, and a few "toy/tiny" compilers were
> freely available by then too.
Yup, although until the 386 and the DOS extenders, it could be tough to use,
given Gordon's awful 'far pointer' infection.
> Early Atari (6502) computers were partly programmed with a cross-
> compiler, though I've no idea what it was (possibly a re-targeted PCC).
>
Most 6502 shops were assembler, although you are correct cc65 shows up
reasonably early. It was not PCC based.
> I think VisiCalc had similar origins.
>
Dan Bricklin wrote it in assembler. He had access to the same Harvard PDP-10
that Gates and Allen had used to write MITS Basic a few years earlier. I
should ask him to be sure, but I was under the impression he used the SAIL
based 6502 assembler I mentioned previously.[1]
>
> The most ground-breaking C compiler might arguably have been
> P.J.Plauger's Whitesmiths C compiler, around about 1978. I don't think
> it was what you'd call "inexpensive" necessarily, but it was popular.
>
Other than his wretched 'anat' - a natural assembler, which was far from
natural. But you are correct, particularly for non-UNIX boxes, he had the
first 'widely used' compiler.
> In my estimation GCC really only served to cement C's early success and
> popularity. It gave people certainty that a good C compiler would be
> available for most any platform no matter what happened.
>
I would agree. C had already been 'winning' by the time of gcc, and
offering a compiler that was so portable and generated 'reasonable' code
(sometimes even better than some of the commercial ones) I think was the
winning score.
>
> I would also argue that non-Unix C compilers actually drove the adoption
> curve of C.
>
I would put a small accent on that. I think the C compilers that targeted
non-UNIX systems, and in particular the microprocessors were the driver.
The micro's started with assembler in most cases. Basic shows up and is
small, but it's not good enough for real products like VisiCalc or later
Lotus. Pascal tries to be the answer, but I think it suffered from the
fact that it makes Pascal a production quality language, you had a extend
it and everybody's extensions were different.
So, C came along and was 'better than assembler' and allowed 'production
quality code' to be written, but with the exception of the far pointer
stuff, pretty much worked as dmr had defined it for the PDP-11. So code
could be written to work between compilers and systems. When the 386 DOS
extenders show up, getting rid of far, and making it a 32-bit based
language like the Vax and 68000, C had won.
Clem
1.] FWIW: Bricklin I know socially. He was one of my brother's quad-mates
at HBS in 1978-79 when he wrote VisiCalc to do his homework [the story is
on the Wikipedia page]. In fact, there is now a plaque in the shared lounge
over the nook where his study carrel was when he wrote it. The four of
them all did pretty well. You know Dan's story, his roommate went on to
found Staples, my brother's roommate became the CEO of Pepsi, and my
brother ran Milcron, then founded a materials handling firm that did the
automation for Amazon (and he sold the firm a few years ago to Honeywell).
Also, their section-mate was Clay Christensen of the 'Innovators Dilemma'
fame and of course classmate Meg Whitman would do eBay. Pretty impressive
class from HBS.

Richard Salzrich.salz@gmail.comRe: [TUHS] History of popularity of C2020-05-23T17:22:51Zurn:uuid:47a6ae7b-3463-b78a-b66e-35e0e54b0969

Derek Fawcusdfawcus+lists-tuhs@employees.orgRe: [TUHS] History of popularity of C2020-05-23T18:43:09Zurn:uuid:9da94f8d-ca30-53d3-083a-d0df5f038f70

On Sat, May 23, 2020 at 01:08:28PM -0400, Clem Cole wrote:
> So, C came along and was 'better than assembler' and allowed 'production
> quality code' to be written, but with the exception of the far pointer
> stuff, pretty much worked as dmr had defined it for the PDP-11. So code
> could be written to work between compilers and systems. When the 386 DOS
> extenders show up, getting rid of far, and making it a 32-bit based
> language like the Vax and 68000, C had won.
Certainly having a flat 32 bit compiler was eventually useful, but even
prior to that the impact of 'far' pointers wasn't always an issue.
For simple tasks, one simply ignored it (wrote w/o 'far'), and then compiled
as either the small or large memory model. It was only if one wanted to
optimise the code that 'far' became an issue, and a lot of code was never
shipped, so didn't need to be so optimised.
Even a lot of the shipped code I worked on with those DOS based compilers
simply used large memory model, and ignored 'far'.
More of an issue was the segmented memory, and that structures couldn't
be larger than 64k. For targeting DOS, compilers eventually offered 'huge'
pointers, and possibly a 'huge' memory model which hid the problem; but
were of no use in protected 16 bit mode - which the embedded RT-OS I was
developing for at the time used.
DF

Michael Kjörling <michael@kjorling.se> | Re: [TUHS] History of popularity of C | 2020-05-23T19:29:16Z

On 23 May 2020 13:08 -0400, from clemc@ccc.com (Clem Cole):
>> I would also argue that non-Unix C compilers actually drove the adoption
>> curve of C.
>
> I would put a small accent on that. I think the C compilers that targeted
> non-UNIX systems, and in particular the microprocessors were the driver.
> The micro's started with assembler in most cases. Basic shows up and is
> small, but it's not good enough for real products like VisiCalc or later
> Lotus. Pascal tries to be the answer, but I think it suffered from the
fact that to make Pascal a production-quality language you had to extend
it, and everybody's extensions were different.
There's also the issue that, even once you get into compiled BASIC
territory, those wretched vendor-unique extensions show up again. Try
porting, say, a non-trivial program written for QuickBASIC to Turbo
BASIC even on the same PC. Both Pascal and BASIC are hard to extend by
the programmer who's actually using them to try to write useful
end-user software, _particularly_ in ways that fit into the rest of
the code, so you're essentially stuck with what the compiler vendor
thought you would need, or what they thought you would be willing to
pay for, in memory or money. On the flip side, much of C's magic
really isn't in the language (which is quite, pardon me, basic), but
rather in the standard library. Yes, C('s standard library) ended up
with its share of vendor-specific extensions as well, but the language
itself actually gave the programmer the building blocks needed to, if
necessary, even implement those extensions for a different compiler;
most often without resorting to more than minimal amounts of
assembler, and often outright none. So you weren't stuck with what the
compiler vendor gave you; it was actually possible to effectively
_extend_ the language vocabulary yourself, if you felt a need to do
that.
I didn't do serious enough programming back during those days for that
to matter to me, but now that I get paid to write software, I
definitely come across situations at times where the ability to extend
the language in such a manner (and have the code using those
extensions read idiomatically for the language) is awfully nice.
--
Michael Kjörling • https://michael.kjorling.se • michael@kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”

Dave Horsfall <dave@horsfall.org> | Re: [TUHS] History of popularity of C | 2020-05-26T04:22:04Z

[-- Attachment #1: Type: text/plain, Size: 795 bytes --]
On Sat, 23 May 2020, Clem Cole wrote:
> [...] Pascal tries to be the answer, but I think it suffered from the
> fact that to make Pascal a production-quality language you had to
> extend it, and everybody's extensions were different.
Perhaps I'm the only one here, but when I was taught Pascal (possibly by
Dr. Lions himself) it was emphasised to us that it was not a production
language but a *teaching* language; you designed your algorithm, debugged
it with the Pascal compiler, then hand-translated it into your favourite
language (and debugged it again :-/).
That damned "pre-fill read buffer" was always a swine with interactive
sessions, though; I recall Andrew Hume threatening to insert a keyboard
into the terminal's CRT if he saw that "?" prompt on the Cyber...
-- Dave

Ed Carp <erc@pobox.com> | Re: [TUHS] History of popularity of C | 2020-05-26T04:32:30Z

Rob Pike <robpike@gmail.com> | Re: [TUHS] History of popularity of C | 2020-05-26T08:22:30Z

[-- Attachment #1: Type: text/plain, Size: 545 bytes --]
The peculiar input semantics of Pascal are a consequence of a locally
hacked-up version of NOS (I think that's the name) that ran on the big CDC
machines at ETH in Zurich. It was entirely a card-based system then, and
the way Pascal required read-ahead worked perfectly on that system, but not
really on any other, including other card-based, even NOS systems. I was
told this when I worked on that same machine as an exchange student working
at EIR outside Zurich, but not by Wirth himself. I couldn't bring myself to
ask him personally.
-rob
[-- Attachment #2: Type: text/html, Size: 614 bytes --]

Clem Cole <clemc@ccc.com> | Re: [TUHS] History of popularity of C | 2020-05-26T14:33:47Z

[-- Attachment #1: Type: text/plain, Size: 1463 bytes --]
On Tue, May 26, 2020 at 12:22 AM Dave Horsfall <dave@horsfall.org> wrote:
> On Sat, 23 May 2020, Clem Cole wrote:
>
> > [...] Pascal tries to be the answer, but I think it suffered from the
> > fact that to make Pascal a production-quality language you had to
> > extend it, and everybody's extensions were different.
>
> Perhaps I'm the only one here, but when I was taught Pascal (possibly by
> Dr. Lions himself) it was emphasised to us that it was not a production
> language but a *teaching* language; you designed your algorithm, debugged
> it with the Pascal compiler, then hand-translated it into your favourite
> language (and debugged it again :-/).
>
Dave, that was exactly my point. Pascal was designed as a teaching
language so Wirth did not put things into the language that made it helpful
as a production language. So everyone else tried and the language became
a mess. Everybody peed on it. Dennis' quote: “When I read commentary
about suggestions for where C should go, I often think back and give thanks
that it wasn't developed under the advice of a worldwide crowd.”
<https://www.inspiringquotes.us/quotes/eDQR_hqwtHAC9>
It's not that you could not turn Pascal into a production language, but
every attempt to do so was done in a different manner. And within
firms it was always different. Eight different 'Tek Pascal'
implementations -- all close, but different -- he says, shaking his head.
[-- Attachment #2: Type: text/html, Size: 2624 bytes --]

Clem Cole <clemc@ccc.com> | Re: [TUHS] History of popularity of C | 2020-05-26T14:45:14Z

[-- Attachment #1: Type: text/plain, Size: 1636 bytes --]
On Tue, May 26, 2020 at 4:23 AM Rob Pike <robpike@gmail.com> wrote:
> The peculiar input semantics of Pascal are a consequence of a locally
> hacked-up version of NOS (I think that's the name) that ran on the big CDC
> machines at ETH in Zurich. It was entirely a card-based system then, and
> the way Pascal required read-ahead worked perfectly on that system, but not
> really on any other, including other card-based, even NOS systems.
>
Yep, NOS was always a real mess. The ASCII vs 6-bit Display code got
mixed up in this too, IIRC. But again, think of Pascal as a
teaching language under a batch system, where the student tosses in her/his
program and some data to run against it. The batch queue eventually picks
up your 'job', tries to compile the code, and if successful will run the
executable once on your input deck - a small light comes on. Yeah it
does that just fine and it is a pretty simple model.
BTW: a number of those local NOS hacks were to make the system easier to
use with student batch files. I think it was Ward Cunningham who told me
that in the late 1970s ETH got some of those NOS hacks from Purdue - Ward had
been working in the Purdue computer center and he sent the CDC tape to them
(remember Purdue was late to the Arpanet and I do not think ETH was one of the
few places in Europe that had connections). Sending mag tapes via mail or
maybe FedEx/DHL was pretty standard in those days. Particularly within
Universities, shops with the same hardware and/or OS tended to share a lot
of tricks and solutions to issues.
FWIW: that particular 6500 from Purdue is now at the LCM+L in Seattle.
[-- Attachment #2: Type: text/html, Size: 2523 bytes --]

Toby Thain <toby@telegraphics.com.au> | Re: [TUHS] History of popularity of C | 2020-05-26T15:19:44Z

On 2020-05-26 12:21 AM, Dave Horsfall wrote:
> On Sat, 23 May 2020, Clem Cole wrote:
>
>> [...] Pascal tries to be the answer, but I think it suffered from the
>> fact that to make Pascal a production-quality language you had to
>> extend it, and everybody's extensions were different.
>
> Perhaps I'm the only one here, but when I was taught Pascal (possibly by
> Dr. Lions himself) it was emphasised to us that it was not a production
> language but a *teaching* language; you designed your algorithm,
> debugged it with the Pascal compiler, then hand-translated it into your
> favourite language (and debugged it again :-/).
Prof. Knuth came up with an interesting solution to that -- in the
process, inventing (or maturing) the concept of "literate programming".
Perhaps it's not well known that his most widely used programs (e.g.
TeX) were written in something VERY close to standard Pascal
(preprocessing aside). The translation to C (as required by certain
platforms) was mechanical.
--Toby
>
> That damned "pre-fill read buffer" was always a swine with interactive
> sessions, though; I recall Andrew Hume threatening to insert a keyboard
> into the terminal's CRT if he saw that "?" prompt on the Cyber...
>
> -- Dave

Thomas Paulsen <thomas.paulsen@firemail.de> | Re: [TUHS] History of popularity of C | 2020-05-26T16:01:20Z

>Dr. Lions himself) it was emphasised to us that it was not a production
>language but a *teaching* language;
In the early '90s I wrote some larger programs in Turbo Pascal after years of intensively working with my favored C and C++, and was surprised how well designed the Borland language was. Thus, recently I installed Free Pascal with its comfortable IDE, and since then I have been wondering why they are always inventing new languages, as these 'old' C and Pascal languages are so well designed and implemented that I can't imagine that anything else is really needed.

Christopher Browne <cbbrowne@gmail.com> | Re: [TUHS] History of popularity of C | 2020-05-26T16:22:11Z

[-- Attachment #1: Type: text/plain, Size: 2393 bytes --]
On Tue, 26 May 2020 at 12:01, Thomas Paulsen <thomas.paulsen@firemail.de>
wrote:
> >Dr. Lions himself) it was emphasised to us that it was not a production
> >language but a *teaching* language;
>
> In the early '90s I wrote some larger programs in Turbo Pascal after
> years of intensively working with my favored C&C++ language, and was
> surprised how well designed the Borland language was. Thus, recently I
> installed Free-Pascal with its comfortable IDE and since then I'm wondering
> why they are always inventing new languages as these 'old' C&Pascal languages
> are so well designed and implemented, that I can't imagine that anything
> else is really needed.
>
I remember the fighting going on at that time.
I did some Pascal in about 1986, with one of the Waterloo compilers, and
found it mildly a pain in the neck; it was a reasonably-nearly-strict
version of the academic language, and was painful for non-academic
programming for the reasons normally thrown about.
In grad school, I TA'ed a course that was using TurboPascal, and it was
definitely a reasonable extension towards usability for larger programs
that needed more sophisticated environmental interactions. The compiler
was decently fast (unlike Ada, anyone??? ;-) ), and the makers were
selective and adequately opinionated as to their extensions.
And I fully recall the split ongoing, as academic folk would regard
TurboPascal as "non-conformant" with the standard, whilst bwk's missive on
"Why Pascal Is Not My Favorite Language" provides a good explanation...
And bwk nicely observed, "Because the language is so impotent, it must be
extended. But each group extends Pascal in its own direction, to make it
look like whatever language they really want."
The Modula family seemed like the better direction; those were still
Pascal-ish, but had nice intentional extensions so that they were not
nearly so "impotent." I recall it being quite popular, once upon a time,
to write code in Modula-2, and run it through a translator to mechanically
transform it into a compatible subset of Ada for those that needed DOD
compatibility. The Modula-2 compilers were wildly smaller and faster for
getting the code working, you'd only run the M2A part once in a while
(probably overnight!)
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
[-- Attachment #2: Type: text/html, Size: 2980 bytes --]

Thomas Paulsen <thomas.paulsen@firemail.de> | Re: [TUHS] History of popularity of C | 2020-05-26T19:30:31Z

Greg A. Woods <woods@robohack.ca> | Re: [TUHS] History of popularity of C | 2020-05-26T19:51:16Z

[-- Attachment #1: Type: text/plain, Size: 1597 bytes --]
At Tue, 26 May 2020 10:32:43 -0400, Clem Cole <clemc@ccc.com> wrote:
Subject: Re: [TUHS] History of popularity of C
>
> Dave that was exactly my point. Pascal was designed as a teaching
> language so Wirth did not put things into the language that made it helpful
> as a production language. So everyone else tried and the language became
> a mess. Everybody peed on it. Dennis' quote: “When I read commentary
> about suggestions for where C should go, I often think back and give thanks
> that it wasn't developed under the advice of a worldwide crowd.”
> <https://www.inspiringquotes.us/quotes/eDQR_hqwtHAC9>
And that's exactly what's wrong with C now -- except it's probably even
a bit worse for C as the majority of people who have been sitting on the
C standards committees for the past decades are primarily either those
with deeply funded agendas about how they think they can make more money
with the language if only it behaves a certain way (e.g. more like C++);
and/or a few academic compiler and optimizer experts who have strong
ideas about how they can eke out the tiniest gains from their compilers if
only the spec says certain things. UB (undefined behaviour), for
example, should be stricken from the standard completely and forever.
Every behaviour MUST be defined, either by the implementation (with NO
recourse for or fallback to UB), or, strictly defined, by the standard.
--
Greg A. Woods <gwoods@acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>
[-- Attachment #2: OpenPGP Digital Signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

Dan Cross <crossd@gmail.com> | Re: [TUHS] History of popularity of C | 2020-05-26T19:56:32Z

[-- Attachment #1: Type: text/plain, Size: 1653 bytes --]
Cc: to COFF, as this isn't so Unix-y anymore.
On Tue, May 26, 2020 at 12:22 PM Christopher Browne <cbbrowne@gmail.com>
wrote:
> [snip]
> The Modula family seemed like the better direction; those were still
> Pascal-ish, but had nice intentional extensions so that they were not
> nearly so "impotent." I recall it being quite popular, once upon a time,
> to write code in Modula-2, and run it through a translator to mechanically
> transform it into a compatible subset of Ada for those that needed DOD
> compatibility. The Modula-2 compilers were wildly smaller and faster for
> getting the code working, you'd only run the M2A part once in a while
> (probably overnight!)
>
Wirth's languages (and books!!) are quite nice, and it always surprised and
kind of saddened me that Oberon didn't catch on more.
Of course Pascal was designed specifically for teaching. I learned it in
high school (at the time, it was the language used for the US "AP Computer
Science" course), but I was coming from C (with a little FORTRAN sprinkled
in) and found it generally annoying; I missed Modula-2, but I thought
Oberon was really slick. The default interface (which inspired Plan 9's
'acme') had this neat graphical sorting simulation: one could select
different algorithms and vertical bars of varying height were sorted into
ascending order to form a rough triangle; one could clearly see the
inefficiency of e.g. Bubble sort vs Heapsort. I seem to recall there was a
way to set up the (ordinarily randomized) initial conditions to trigger
worst-case behavior for quicksort.
I have a vague memory of showing it off in my high school CS class.
- Dan C.
[-- Attachment #2: Type: text/html, Size: 2175 bytes --]

Jon Steinhart <jon@fourwinds.com> | Re: [TUHS] History of popularity of C | 2020-05-26T20:00:49Z

Dan Cross writes:
>
> Of course Pascal was designed specifically for teaching. I learned it in
> high school ...
I had a different experience; I learned C in high school at BTL and then took
my first programming class in college which was Pascal and I kept finding it
extremely difficult to use because it was so much less flexible than C.
Until I took that class it had never even occurred to me that people would
write books about the topic as I had learned from technical memoranda. There
were two books in this class, Wirth's and Fundamental Algorithms. Got Don
to sign my copy a few years ago which he said he wouldn't do unless it looked
really used.
Jon

Thomas Paulsen <thomas.paulsen@firemail.de> | Re: [TUHS] History of popularity of C | 2020-05-26T21:49:18Z

>And that's exactly what's wrong with C now -- except it's probably even
>a bit worse for C as the majority of people who have been sitting on the
>C standards committees for the past decades are primarily either those
>with deeply funded agendas about how they think they can make more money
>with the language if only it behaves a certain way (e.g. more like C++);
they don't play any role, as the C language was defined decades ago. I learned it
before the ANSI committee finished its work, first with Turbo C and soon after MS C,
and then various *NIX compilers. Recently I wrote a couple of Linux programs using
gcc with exactly the same syntax I studied 30 years ago, and it works just fine. All
these programs are error-free, performing very fast while having a small memory
footprint. For me there is nothing better than C, and I know a lot of languages.

Greg A. Woods <woods@robohack.ca> | Re: [TUHS] History of popularity of C | 2020-05-26T22:37:11Z

[-- Attachment #1: Type: text/plain, Size: 1920 bytes --]
At Tue, 26 May 2020 23:48:43 +0200, "Thomas Paulsen" <thomas.paulsen@firemail.de> wrote:
Subject: Re: [TUHS] History of popularity of C
>
> they don't play any role, as the C language was defined decades ago. I learned it
> before the ANSI committee finished its work, with Turbo C and soon after MS C, and then
> various *NIX compilers. Recently I wrote a couple of Linux programs using gcc
> with exactly the same syntax I studied 30 years ago, and it works pretty cool. All
> these programs are error free performing very fast while having a small memory
> footprint. For me there is nothing better than C, and I know a lot of languages.
You might be surprised by just how much C has been changed since, say,
C89, or even C90, and how niggly the corner cases can get (i.e. where UB
sticks its ugly head). Lots of legacy code is now completely broken, at
least with the very latest compilers (especially LLVM, but also GCC).
Some far more recently written code has even had important security
problems, e.g. one in the Linux kernel. NetBSD has to turn off specific
"features" in the newest compilers when building the kernel lest they
create a broken and/or insecure system. Some code no longer does what
it seems to do unless you're the most careful language lawyer at reading
it, Standard in hand, and with years of experience. Some compilers can
help, e.g. by inserting illegal instructions anywhere where UB would
have otherwise allowed the optimizer to go wild and possibly change
things completely, but without such tools, and others such as Valgrind,
one can get into a heap-o-trouble with the slightest misstep; and of
course these tools only work for user-land code, not bare-metal code
such as embedded systems and kernels.
--
Greg A. Woods <gwoods@acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>
[-- Attachment #2: OpenPGP Digital Signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

Ronald Natalie <ron@ronnatalie.com> | Re: [TUHS] History of popularity of C | 2020-05-27T14:38:39Z

The large areas of undefined and unspecified behavior have always been an issue in C. It was somewhat acceptable when you were using it as a direct replacement for assembler,
but Java and many of the other follow-ons endeavored to be more portable/rigorous. Of course, you can write crap code in any language.
It didn’t take modern C to do this. On the PDP-11 (at least when not in split I/D mode), location zero for example contained a few assembler instructions (p&P6) which you could print out.
Split I/D and VAX implementations made this even worse by putting a 0 at location 0. When we moved from the VAX to other processors we had location zero unmapped. For the
first time, accessing a null pointer ended up trapping rather than either resulting in a null (or some random data). Eventually, we added a feature to the kernel called “Braindamaged
Vax compatibility Mode” that restored the zero to location zero. This was enabled by a field we could poke into the a.out header because this was needed on things we didn’t have
source code to (things we did we just fixed).
Similar nonsense we found where the order that function args are evaluated was relied upon. The PDP-11, etc. evaluated them right-to-left because that’s how they had to push them
on the stack for the call linkage. We had one machine that did that in the opposite order (I considered flipping the compiler behavior anyhow) and when we got to the RISC architectures,
things were passed in registers so the evaluation order was less predictable.
I already detailed the unportability problem I found where the BSD kernel “converted by union”.
The most amusing thing I’d have to say was that one day I got a knock on my office door. One of the sales guys from our sister company wanted to know if I could write some Novell
drivers for an encrypting ethernet card they were selling. The documentation for writing the driver was quite detailed but all describing i386 assembler interfaces (and the examples
were in assembler). About a week into the project I came to the realization that the linkages were all the C subroutine calls for that platform. The caller was C and there was no particular
reason why the driver wasn’t also written in C.

Clem Cole <clemc@ccc.com> | Re: [TUHS] History of popularity of C | 2020-05-27T15:10:47Z

[-- Attachment #1: Type: text/plain, Size: 2548 bytes --]
Henry Spencer's 10 Commandments for C Programmers
<https://www.seebs.net/c/10com.html>
On Wed, May 27, 2020 at 10:38 AM Ronald Natalie <ron@ronnatalie.com> wrote:
> The large areas of undefined and unspecified behavior have always been an
> issue in C. It was somewhat acceptable when you were using it as a direct
> replacement for assembler,
> but Java and many of the other follow-ons endeavored to be more
> portable/rigorous. Of course, you can write crap code in any language.
>
> It didn’t take modern C to do this. On the PDP-11 (at least when not in split
> I/D mode), location zero for example contained a few assembler instructions
> (p&P6) which you could print out.
> Split I/D and VAX implementations made this even worse by putting a 0 at
> location 0. When we moved from the VAX to other processors we had
> location zero unmapped. For the
> first time, accessing a null pointer ended up trapping rather than either
> resulting in a null (or some random data). Eventually, we added a
> feature to the kernel called “Braindamaged
> Vax compatibility Mode” that restored the zero to location zero. This
> was enabled by a field we could poke into the a.out header because this was
> needed on things we didn’t have
> source code to (things we did we just fixed).
>
> Similar nonsense we found where the order that function args are evaluated
> was relied upon. The PDP-11, etc… evaluated them right-to-left because
> that’s how they had to push them
> on the stack for the call linkage. We had one machine that did that in
> the opposite order (I considered flipping the compiler behavior anyhow) and
> when we got to the RISC architectures,
> things were passed in registers so the evaluation order was less predictable.
>
> I already detailed the unportability problem I found where the BSD kernel
> “converted by union”.
>
> The most amusing thing I’d have to say was that one day I got a knock on
> my office door. One of the sales guys from our sister company wanted to
> know if I could write some Novell
> drivers for an encrypting ethernet card they were selling. The
> documentation for writing the driver was quite detailed but all describing
> i386 assembler interfaces (and the examples
> were in assembler). About a week into the project I came to the realization
> that the linkages were all the C subroutine calls for that platform. The
> caller was C and there was no particular
> reason why the driver wasn’t also written in C.
>
>
[-- Attachment #2: Type: text/html, Size: 3005 bytes --]

Thomas Paulsen <thomas.paulsen@firemail.de> | Re: [TUHS] History of popularity of C | 2020-05-27T16:12:11Z

>The large areas of undefined and unspecified behavior has always been an
>issue in C. It was somewhat acceptable when you were using it as a direct
>replacement for assembler, but Java and many of the other follow-ons endeavored
>to be more portable/rigorous.
One cannot compare system and business-related stuff!
When I'm doing C I always have the CPU and its instructions in mind. Like Linus, I
see the assembly code in my mind's eye. C was created for such minds, to do with C
what earlier was done with assembly, whereas for writing business applications
COBOL and its modern relative Java are the first choices.

Greg A. Woods <woods@robohack.ca> | Re: [TUHS] History of popularity of C | 2020-05-27T19:50:01Z

[-- Attachment #1: Type: text/plain, Size: 3056 bytes --]
At Wed, 27 May 2020 18:11:33 +0200, "Thomas Paulsen" <thomas.paulsen@firemail.de> wrote:
Subject: Re: [TUHS] History of popularity of C
>
> When I'm doing C I always have the CPU and its instructions in mind.
And that's exactly what might trip you up unless you _exactly_
understand how the language standard defines the operations of the
abstract virtual machine (right down to the implications of every
sequence point in the code); how compilers and optimizers do and (more
importantly) do not work when mapping the abstract virtual machine
operations into real-world machine instructions; and how _all_
instances of "undefined behaviour" can arise, and exactly what the
optimizer is allowed to do when and if it spots UB conditions in the
code.
A big part of the problem is that the C Standard mandates compilation
will and must succeed (and allows this success to be totally silent too)
even if the code contains instances of undefined behaviour. This means
that the successful execution of the generated code may depend on what
optimization level was chosen. Code that does security tests on input
values might be entirely and silently eliminated by the optimizer
because of some innocuous-seeming UB instance, and this is exactly what
has happened in the Linux kernel, for example (probably more than once).
UB can be introduced quite innocently just by moving sequence points in
variable references in ways that are not necessarily obvious even to
seasoned programmers (and indeed "seasoned" programmers are often the
ones whose old-fashioned coding habits might lead to introduction of
serious problems in such a way).
I've found dozens of instances of UB in mature and well tested code, and
sometimes only by luck of having chosen the "right" compiler and enabled
its feature of introducing illegal instructions in places where UB might
occur, _and_ having had the luck to test in such a way as to encounter
the specific code path where this UB occurred.
I would claim it's truly safer now to write C without understanding the
underlying mechanics of the CPU and memory, but rather by just paying
very close attention to the detailed semantics of the language,
understanding only the abstract virtual C machine, and hoping your
compiler will at least warn if anything even remotely suspicious is done
in your code; and lastly (but perhaps most importantly) avoiding like
the plague any coding constructs which might make UB harder to spot
(e.g. never ever initialize local variables with their definition when
pointers are involved).
Unfortunately the new "most advanced" C compilers also make it quite a
bit more difficult for those of us writing C code that must have
specific actions on the bare metal hardware, e.g. in embedded systems,
kernels, hardware drivers, etc.; including especially where UB detection
tools are far more difficult to use.
--
Greg A. Woods <gwoods@acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>
[-- Attachment #2: OpenPGP Digital Signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

Larry McVoy <lm@mcvoy.com> | Re: [TUHS] History of popularity of C | 2020-05-27T20:14:17Z

So I may have just gotten lucky in my 30+ years of writing C code but
I have yet to hit a single instance of this doom and gloom.
On Wed, May 27, 2020 at 12:49:25PM -0700, Greg A. Woods wrote:
> At Wed, 27 May 2020 18:11:33 +0200, "Thomas Paulsen" <thomas.paulsen@firemail.de> wrote:
> Subject: Re: [TUHS] History of popularity of C
> >
> > When I'm doing C I always have the CPU and its instructions in mind.
>
> And that's exactly what might trip you up unless you _exactly_
> understand how the language standard defines the operations of the
> abstract virtual machine (right down to the implications of every
> sequence point in the code); how compilers and optimizers do and (more
> importantly) do not work when mapping the abstract virtual machine
> operations into real-world machine instructions; and how _all_
> instances of "undefined behaviour" can arise, and exactly what the
> optimizer is allowed to do when and if it spots UB conditions in the
> code.
>
> A big part of the problem is that the C Standard mandates compilation
> will and must succeed (and allows this success to be totally silent too)
> even if the code contains instances of undefined behaviour. This means
> that the successful execution of the generated code may depend on what
> optimization level was chosen. Code that does security tests on input
> values might be entirely and silently eliminated by the optimizer
> because of some innocuous-seeming UB instance, and this is exactly what
> has happened in the Linux kernel, for example (probably more than once).
>
> UB can be introduced quite innocently just by moving sequence points in
> variable references in ways that are not necessarily obvious even to
> seasoned programmers (and indeed "seasoned" programmers are often the
> ones whose old-fashioned coding habits might lead to introduction of
> serious problems in such a way).
>
> I've found dozens of instances of UB in mature and well tested code, and
> sometimes only by luck of having chosen the "right" compiler and enabled
> its feature of introducing illegal instructions in places where UB might
> occur, _and_ having had the luck to test in such a way as to encounter
> the specific code path where this UB occurred.
>
> I would claim it's truly safer now to write C without understanding the
> underlying mechanics of the CPU and memory, but rather by just paying
> very close attention to the detailed semantics of the language,
> understanding only the abstract virtual C machine, and hoping your
> compiler will at least warn if anything even remotely suspicious is done
> in your code; and lastly (but perhaps most importantly) avoiding like
> the plague any coding constructs which might make UB harder to spot
> (e.g. never ever initialize local variables with their definition when
> pointers are involved).
>
> Unfortunately the new "most advanced" C compilers also make it quite a
> bit more difficult for those of us writing C code that must have
> specific actions on the bare metal hardware, e.g. in embedded systems,
> kernels, hardware drivers, etc.; including especially where UB detection
> tools are far more difficult to use.
>
> --
> Greg A. Woods <gwoods@acm.org>
>
> Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
> Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>
--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm

Richard Salz <rich.salz@gmail.com> Re: [TUHS] History of popularity of C 2020-05-27T20:23:39Z

Nevin Liber <nliber@gmail.com> Re: [TUHS] History of popularity of C 2020-05-27T21:02:00Z

On Wed, May 27, 2020 at 2:50 PM Greg A. Woods <woods@robohack.ca> wrote:
> A big part of the problem is that the C Standard mandates compilation
> will and must succeed (and allows this success to be totally silent too)
> even if the code contains instances of undefined behaviour.
No it does not.
To quote C11:
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of
erroneous data, for which this International Standard imposes no
requirements
NOTE Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation or
program execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating a
translation or execution (with the issuance of a diagnostic message).
Much UB cannot be detected at compile time. Much UB is too expensive to
detect at run time.
Take strlen(const char* s) for example. s must be a valid pointer that
points to a '\0'-terminated string. How would you detect that at compile
time? How would you set up your run time to detect that and error out?
How would you design your codegen and runtime to detect and error out when
UB is invoked in this code:
#include <stdio.h>
#include <string.h>

void A(const char* a, const char* b) {
    printf("%zu %zu\n", strlen(a), strlen(b));
}

// Separate compilation unit
int main() {
    const char a[] = {'A'};    /* not '\0'-terminated: strlen(a) is UB */
    const char b[] = {'\0'};
    A(a, b);
}
--
Nevin ":-)" Liber <mailto:nl <nevin@eviloverlord.com>iber@gmail.com>
+1-847-691-1404

Greg A. Woods <woods@robohack.ca> Re: [TUHS] History of popularity of C 2020-05-27T23:17:47Z

At Wed, 27 May 2020 16:00:57 -0500, Nevin Liber <nliber@gmail.com> wrote:
Subject: Re: [TUHS] History of popularity of C
>
> On Wed, May 27, 2020 at 2:50 PM Greg A. Woods <woods@robohack.ca> wrote:
> >
> > A big part of the problem is that the C Standard mandates compilation
> > will and must succeed (and allows this success to be totally silent too)
> > even if the code contains instances of undefined behaviour.
>
> No it does not.
>
> To quote C11:
>
> undefined behavior
> behavior, upon use of a nonportable or erroneous program construct or of
> erroneous data, for which this International Standard imposes no
> requirements
Sorry, I concede. Yes, "no requirements". In C99 at least.
Sadly most compilers, including GCC and Clang/LLVM will, at best, warn
(and warnings are only treated as errors by the most macho|wise); and
compilers only do that now because they've been getting flack from
developers whenever the optimizer does something unexpected.
> Much UB cannot be detected at compile time. Much UB is too expensive to
> detect at run time.
Indeed. At best you can get a warning, or optional runtime code to
abort the program.
Now this isn't a problem when "undefined behaviour" becomes
"implementation defined behaviour" for a given implementation.
However that's not portable obviously, except for the trivial cases
where the common compilers for a given type of platform all do the same
things.
The real problems though arise when the optimizer takes advantage of
these rules regardless of what the un-optimized code will do on any
given platform and architecture.
The Linux kernel example I've referred to involved dereferencing a
pointer to do an assignment in a local variable definition, then a few
lines later testing if the pointer was NULL before using the local
variable. Unoptimised the code will dereference a NULL pointer and load
junk from location zero into the variable (because it's kernel code),
then the NULL test will trigger and all will be good. The optimizer
rips out the NULL check because "obviously" the programmer has assumed
the pointer is always a valid non-NULL pointer since they've explicitly
dereferenced it before checking it and they wouldn't want to waste even
a single jump-on-zero instruction checking it again. (It's also quite
possible the code was written "correctly" at first, then someone mushed
all the variable initialisations up onto their definitions.)
In any case there's now a GCC option: -fno-delete-null-pointer-checks
(to go along with -fno-strict-aliasing and -fno-strict-overflow, and
-fno-strict-enums, all of which MUST be used, and sometimes
-fno-strict-volatile-bitfields too, on all legacy code that you don't
want to break)
It's even worse when you have to write bare-metal code that must
explicitly dereference a NULL pointer (a not-so-real example: you want
to use location zero in the CPU zero-page (e.g. on a 6502 or 6800, or
PDP-8, etc.) as a pointer) -- it is now impossible to do that in strict
Standard C even though trivially it "should just work" despite the silly
rules. As far as I can tell it always did just work in "plain old" C.
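One common workaround (a sketch with hypothetical names, not from the thread) is to launder the fixed address through a volatile object, so the compiler can no longer prove the resulting pointer is NULL:

```c
#include <stdint.h>

/* On the bare-metal target this would be set to 0x0 (the zero page).
 * Because the address lives in a volatile object, the optimizer cannot
 * prove the derived pointer is NULL, and so cannot exploit that UB. */
volatile uintptr_t zero_page_base;

uint8_t zp_read(uintptr_t offset) {
    volatile uint8_t *p = (volatile uint8_t *)(zero_page_base + offset);
    return *p;
}

void zp_write(uintptr_t offset, uint8_t v) {
    volatile uint8_t *p = (volatile uint8_t *)(zero_page_base + offset);
    *p = v;
}
```

On a hosted system the same functions can be exercised by pointing zero_page_base at an ordinary buffer instead of address zero.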
The crazy thing about modern optimizers is that they're way more
persistent and often somewhat more clever than your average programmer.
They follow all the paths. They apply all the rules at every turn.
> Take strlen(const char* s) for example. s must be a valid pointer that
> points to a '\0'-terminated string. How would you detect that at compile
> time? How would you set up your run time to detect that and error out?
My premise is that you shouldn't try to detect this problem, AND in any
case where the optimizer might be able to prove the pointed at object
isn't a valid string it should not, and must not, abuse that knowledge
to rip out code or cause other even worse mis-behaviour.
I.e. this should not be "undefined", but rather "implementation defined
and without any recourse to allowing optimizer abuses".
--
Greg A. Woods <gwoods@acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>

Dave Horsfall <dave@horsfall.org> Re: [TUHS] History of popularity of C 2020-06-05T20:57:57Z

On Wed, 27 May 2020, Greg A. Woods wrote:
> Sadly most compilers, including GCC and Clang/LLVM will, at best, warn
> (and warnings are only treated as errors by the most macho|wise); and
> compilers only do that now because they've been getting flack from
> developers whenever the optimizer does something unexpected.
Don't talk to me about optimisers... That's not the code that I wrote!
I've seen code simply disappear, because the "optimiser" thought that it
was cleverer than I was.
> The Linux kernel example I've referred to involved dereferencing a
> pointer to do an assignment in a local variable definition, then a few
> lines later testing if the pointer was NULL before using the local
> variable. Unoptimised the code will dereference a NULL pointer and load
> junk from location zero into the variable (because it's kernel code),
> then the NULL test will trigger and all will be good. The optimizer
> rips out the NULL check because "obviously" the programmer has assumed
> the pointer is always a valid non-NULL pointer since they've explicitly
> dereferenced it before checking it and they wouldn't want to waste even
> a single jump-on-zero instruction checking it again. (It's also quite
> possible the code was written "correctly" at first, then someone mushed
> all the variable initialisations up onto their definitions.)
Typical Penguin/OS behaviour...
> In any case there's now a GCC option: -fno-delete-null-pointer-checks
> (to go along with -fno-strict-aliasing and -fno-strict-overflow, and
> -fno-strict-enums, all of which MUST be used, and sometimes
> -fno-strict-volatile-bitfields too, on all legacy code that you don't
> want to break)
I'm sure that there's a competition somewhere, to see who can come up with
GCC's -fmost-longest-and-most-obscure-option flags...
> It's even worse when you have to write bare-metal code that must
> explicitly dereference a NULL pointer (a not-so-real example: you want
> to use location zero in the CPU zero-page (e.g. on a 6502 or 6800, or
> PDP-8, etc.) as a pointer) -- it is now impossible to do that in strict
> Standard C even though trivially it "should just work" despite the silly
> rules. As far as I can tell it always did just work in "plain old" C.
I've programmed a PDP-8! 'Twas way back in high school, and I found a bug
in my mentor's program; it controlled traffic lights...
> The crazy thing about modern optimizers is that they're way more
> persistent and often somewhat more clever than your average programmer.
> They follow all the paths. They apply all the rules at every turn.
Optimisers... Grrr...
-- Dave

Nemo Nusquam <cym224@gmail.com> Re: [TUHS] History of popularity of C 2020-06-05T21:41:29Z

Bakul Shah <bakul@iitbombay.org> Re: [TUHS] History of popularity of C 2020-06-05T22:02:01Z

On Jun 5, 2020, at 2:47 PM, Richard Salz <rich.salz@gmail.com> wrote:
>
>
> | I'm sure that there's a competition somewhere, to see who can come up with
> | GCC's -fmost-longest-and-most-obscure-option flags...
>
> At least one of the GCC maintainers is German, so possibly. Can clang keep up? :)
Clang has more than kept up!
clang:
-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang<value>
gcc-9:
-print-sysroot-headers-suffix
Not counting gcc's
--help={common|optimizers|params|target|warnings|[^]{joined|separate|undocumented}}[,...].

Ed Carp <erc@pobox.com> Re: [TUHS] History of popularity of C 2020-06-06T20:50:17Z

On 5/27/20, Ronald Natalie <ron@ronnatalie.com> wrote:
> The large areas of undefined and unspecified behavior have always been an
> issue in C. It was somewhat acceptable when you were using it as a direct
> replacement for assembler, but Java and many other follow-ons endeavored to be more
> portable/rigorous. Of course, you can write crap code in any language.
"It's not a bug, it's a feature"
C was written when the programmer had to be more rigorous instead of
just letting things slide and having the language do their thinking
for them. I remember being laughed at for using static arrays instead
of malloc() and friends, until people found out that safety-critical
systems were written the same way.
I have C code that was written 35 years ago that's still in
production. Back then, you had to be careful, and you actually had to
think about what you were writing.
We've gotten soft and lazy, and now we're paying for it.

Thomas Paulsen <thomas.paulsen@firemail.de> Re: [TUHS] History of popularity of C 2020-06-06T21:09:12Z

'C was written when the programmer had to be more rigorous instead of
just letting things slide and having the language do their thinking
for them. '
I fully subscribe to that.
Today the company owners have to pay a lot for programmers having the language do their thinking
for them. The memory hunger of the SOA Java business services of the company I worked at prior to
retirement is simply endless. Arnold once told that there is more demand for C developers
in Israel. I envy you

Larry McVoy <lm@mcvoy.com> Re: [TUHS] History of popularity of C 2020-06-06T21:14:08Z

I did one stint at a Java shop, Charles Schwab's web group. No talent,
no architecture, no vision. Lots of politics and back stabbing and
claiming credit for other people's work. Totally toxic, hands down the
worst job I've ever had. I lasted less than 6 months and am surprised
I made it that far.
On Sat, Jun 06, 2020 at 11:08:43PM +0200, Thomas Paulsen wrote:
> 'C was written when the programmer had to be more rigorous instead of
> just letting things slide and having the language do their thinking
> for them. '
> I fully subscribe to that.
> Today the company owners have to pay a lot for programmers having the language do their thinking
> for them. The memory hunger of the SOA Java business services of the company I worked at prior to
> retirement is simply endless. Arnold once told that there is more demand for C developers
> in Israel. I envy you
>
--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm

Doug McIlroy <doug@cs.dartmouth.edu> Re: [TUHS] History of popularity of C 2020-06-06T21:50:02Z

> Steve Johnson's position paper on optimising compilers may amuse you:
> https://dl.acm.org/doi/abs/10.1145/567532.567542
Indeed. This passage struck a particular chord:
"I contend that the class of applications that depend on, for example, loop
optimization and dead code elimination for their efficient solution is of
modest size, growing smaller, and often very susceptible to expression in
applicative languages where the optimization is built into the individual
applicative operators."
I don't know whether I saw that note at the time, but since then I've
come to believe, particularly in regard to C, that one case of dead-code
elimination should be guaranteed. That case is if(0), where 0 is the
value of a constant expression.
This guarantee would take the place of many--possibly even
most--ifdefs. Every ifdef is an ugly intrusion and a pain to read.
Syntactically it occurs at top level completely out of sync with the
indentation and flow of text. Conversion to if would be a big win.
Doug

Warner Losh <imp@bsdimp.com> Re: [TUHS] History of popularity of C 2020-06-06T21:55:45Z

On Sat, Jun 6, 2020 at 3:50 PM Doug McIlroy <doug@cs.dartmouth.edu> wrote:
> > Steve Johnson's position paper on optimising compilers may amuse you:
> > https://dl.acm.org/doi/abs/10.1145/567532.567542
>
> Indeed. This passage struck a particular chord:
>
> "I contend that the class of applications that depend on, for example, loop
> optimization and dead code elimination for their efficient solution is of
> modest size, growing smaller, and often very susceptible to expression in
> applicative languages where the optimization is built into the individual
> applicative operators."
>
> I don't know whether I saw that note at the time, but since then I've
> come to believe, particularly in regard to C, that one case of dead-code
> elimination should be guaranteed. That case is if(0), where 0 is the
> value of a constant expression.
>
> This guarantee would take the place of many--possibly even
> most--ifdefs. Every ifdef is an ugly intrusion and a pain to read.
> Syntactically it occurs at top level completely out of sync with the
> indentation and flow of text. Conversion to if would be a big win.
>
I'd love something like this to work, but the semantic interpretation would
need to also somehow be omitted, otherwise how do you replace
#ifdef AIX
    ioctl(fd, AIX_SPECIFIC_IOCTL, ...)
#endif
on an HP-UX system that doesn't define AIX_SPECIFIC_IOCTL...
I remember hearing that BLISS could cope because it deferred the semantic
interpretation of the identifiers until after a round of dead code
elimination so it didn't need a pre-processor...
Warner

Ed Carp <erc@pobox.com> Re: [TUHS] History of popularity of C 2020-06-06T22:27:49Z

Bakul Shah <bakul@iitbombay.org> Re: [TUHS] History of popularity of C 2020-06-06T23:32:19Z

On Jun 6, 2020, at 1:49 PM, Ed Carp <erc@pobox.com> wrote:
>
> On 5/27/20, Ronald Natalie <ron@ronnatalie.com> wrote:
>
>> The large areas of undefined and unspecified behavior have always been an
>> issue in C. It was somewhat acceptable when you were using it as a direct
>> replacement for assembler, but Java and many other follow-ons endeavored to be more
>> portable/rigorous. Of course, you can write crap code in any language.
>
> "It's not a bug, it's a feature"
A snippet of a recent comp.arch post by someone (the subject was C and safety):
What you call "misfeatures", some other people call "features". If you
expect people to take you and your opinions seriously, you'll get on
better if you stop mocking other opinions. I've written several times
why undefined behaviour lets me write better and safer code, as well as
more efficient code. If you remain determinedly unconvinced, at least
agree to disagree without sounding childish about it.

Greg A. Woods <woods@robohack.ca> Re: [TUHS] History of popularity of C 2020-06-07T00:13:17Z

At Sat, 6 Jun 2020 16:31:42 -0700, Bakul Shah <bakul@iitbombay.org> wrote:
Subject: Re: [TUHS] History of popularity of C
>
> On Jun 6, 2020, at 1:49 PM, Ed Carp <erc@pobox.com> wrote:
> >
> > On 5/27/20, Ronald Natalie <ron@ronnatalie.com> wrote:
> >
> >> The large areas of undefined and unspecified behavior have always
> >> been an issue in C. It was somewhat acceptable when you were using
> >> it as a direct replacement for assembler, but Java and many
> >> other follow-ons endeavored to be more portable/rigorous. Of
> >> course, you can write crap code in any language.
> >
> > "It's not a bug, it's a feature"
>
> A snippet of a recent comp.arch post by someone (the subject was C and
> safety):
>
> What you call "misfeatures", some other people call "features".
> If you expect people to take you and your opinions seriously,
> you'll get on better if you stop mocking other opinions. I've
> written several times why undefined behaviour lets me write
> better and safer code, as well as more efficient code. If you
> remain determinedly unconvinced, at least agree to disagree
> without sounding childish about it.
Heh.
W.r.t. efficiency, well undefined behaviour does allow the compiler to
turn their code, or anyone else's code, into more "efficient" code if
they happen to (accidentally or otherwise) trip over undefined
behaviour.
However I don't think it can be argued in any valid way that "undefined
behaviour" can ever lead to "better and safer" code, in any way, or from
any viewing angle, whatsoever.
"Undefined behaviour" just means that the language definition is somehow
adversely compromised in such a way that it is impossible to prevent the
programmer from writing compilable and executable code that will always
produce some well defined behaviour in all standards-compliant
implementations. I.e. the language allows that there are ways to write
syntactically correct code that cannot be guaranteed to do anything
particular whatsoever in _all_ standards-compatible implementations.
We can argue until the cows come home whether "undefined behaviour" is a
"necessary" part of the language definition (e.g. to keep the language
implementable, or backward-compatible, or whatever), but I don't see how
any valid argument can ever be made for it being a "good" and "useful"
thing from the perspective of a programmer using the language.
Undefined behaviours are black holes for which the language standard
offers no real guidance nor maps for safe passage other than the stern
warning to avoid them as best as possible. Perhaps it is such
scare-mongering that the author above justifies as their influence to
write "better and safer" code, but that's no good argument for having
such pits of despair in the language definition in the first place. If
we were arguing theology then I would say the bible we call the "C
Standard" is actually actively trying to trap its followers into
committing sins.
Luckily the real world of C is made of actual implementations, and they
are free to either offer definitions for how various (ab)uses of the
language will work, or to maintain the black holes of mystery that we
must try to avoid, or even sometimes to give us the choice in how they
will treat our code. As programmers we should try to choose which
implementation(s) we use, and how we control _their_ behaviour, while at
the same time still doing our best to avoid giving them the rope to hang
us with.
--
Greg A. Woods <gwoods@acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>

arnold <arnold@skeeve.com> Re: [TUHS] History of popularity of C 2020-06-07T05:58:29Z

Ed Carp <erc@pobox.com> wrote:
> "Arnold once told that there is more demand for C developers
> in Israel. I envy you"
The market in Israel for software developers is VERY hot.
Based entirely on the emails I get from Linked-In about jobs that may
interest me, there's some C, but a lot more C++, both Windows and Linux.
Also a lot of Python.
> Maybe I ought to move to Israel.
Moving here isn't a trivial decision, especially if you don't speak
any Hebrew. Off-topic. Sorry.
> Sounds like they have more common sense there.
We do, but the strong influence of western (US) culture is eroding it,
which is saddening and frustrating. This is definitely off-topic.
Arnold