Why Are Only Some Sources Generated

I thought from reading old Squeak books that the VM started life as all-Slang, and then that was turned into C to be compiled into the VM. But after building a Pharo VM per the instructions on GitHub, it occurred to me that there seems to be a lot of C source code already present before the generation step. Is that correct? If so, was there ever a time when most or all of the C code was generated? And why did it change?

Re: Why Are Only Some Sources Generated

> I thought from reading old Squeak books that the VM started life as all-Slang, and then that was turned into C to be compiled into the VM. But after building a Pharo VM per the instructions on GitHub, it occurred to me that there seems to be a lot of C source code already present before the generation step. Is that correct? If so, was there ever a time when most or all of the C code was generated? And why did it change?

Only the actual interpreter/jit core, and some cross-platform plugins are generated from Slang. The platform-dependent "glue code" has always been coded in C.

Initially, that C code was Mac-only, and shipped as string literals in the image itself. See class InterpreterSupportCode in Squeak 1.x.

When Ian Piumarta ported Squeak to Unix, and Andreas Raab to Windows, those additional C files were not added to the image, but kept separately. And at some point we removed even the Mac support code, not exactly sure when. Possibly when VMMaker was introduced in 3.x, which made working with various platforms easier. It's still there in 2.8 but gone in 3.8 (http://try.squeak.org/).

Re: Why Are Only Some Sources Generated

On 25.03.2015, at 23:06, Sean P. DeNigris <[hidden email]> wrote:
>
> Bert Freudenberg wrote
>> Only the actual interpreter/jit core, and some cross-platform plugins are
>> generated from Slang. The platform-dependent "glue code" has always been
>> coded in C.
>
> Why? Could some or all of the platform code be written in Slang?

It could, but the main reason to prefer Slang over C is that you can simulate it in Smalltalk. And you can't simulate the OS-dependent code. So you would always have to translate to C, build a new VM, run it ... which is more hassle than using C directly.

Re: Why Are Only Some Sources Generated

I had intended to reply to the list on this, but apparently I did not "reply all":

> On Wed, Mar 25, 2015 at 03:06:39PM -0700, Sean P. DeNigris wrote:
> Bert Freudenberg wrote
> > Only the actual interpreter/jit core, and some cross-platform plugins are
> > generated from Slang. The platform-dependent "glue code" has always been
> > coded in C.
>
> Why? Could some or all of the platform code be written in Slang?
>

To some extent, this is a matter of what is most convenient and comfortable
for the person who is writing the platform code. If you are dealing with
data structures, or with system calls and complex arguments that need to be
properly declared, then it may be easier to write this in a language such
as C.

On the other hand, I would point to the FilePlugin (which is a critical VM
component) as an example of platform support code written in C that could
just as easily have been written in Smalltalk (slang).

To some extent, it is just a judgement call. If you were writing the glue
code for FilePlugin and you wanted it to be readable in the image, then
you might prefer to write in Smalltalk (slang). If you were really more
interested in writing code that is readable for a person familiar with
the operating system functions, then you might prefer to write the glue
code in C.

As another example, the very lowest level of glue code in the VM would
be the functions that map object addresses (oop values for the object
memory) to actual memory addresses in the platform address space. These
are very performance critical, and they are usually implemented as a set
of C macros in the platform glue code (look for sqMemoryAccess.h in the
platform code). But it turns out that they can be implemented equally
well in Smalltalk (see package MemoryAccess in the VMMaker repository).
The Smalltalk slang implementations perform just as well as the C macros,
because the slang code inliner is extremely effective. They also can
be browsed and debugged in Smalltalk, which might be an advantage if
the C preprocessor is something that makes your head spin.
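
For readers who think in C rather than Slang, here is a rough sketch of the same trade-off on the C side: a macro and a static inline function doing the same oop-to-address mapping. All names here (demo_oop, demo_heap, DEMO_POINTER_FOR_OOP and so on) are made up for illustration and are not copied from sqMemoryAccess.h or from the MemoryAccess package; the "oop is an offset from a heap base" model is also just an assumption to keep the example small.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; not taken from the real VM sources. */
typedef intptr_t demo_oop;              /* stand-in for an oop-sized integer */

static char demo_heap[64];              /* toy object memory */
static char *demo_memory_base = demo_heap;

/* Macro flavour: textual substitution, no type checking. */
#define DEMO_POINTER_FOR_OOP(oop) (demo_memory_base + (oop))

/* Inline-function flavour: same mapping, but typed and steppable
   in a debugger. An optimizing compiler will normally inline it. */
static inline char *demo_pointer_for_oop(demo_oop oop) {
    return demo_memory_base + oop;
}

/* Read a 32-bit value at the address an oop maps to. */
static inline int32_t demo_long_at(demo_oop oop) {
    int32_t value;
    memcpy(&value, demo_pointer_for_oop(oop), sizeof value);
    return value;
}

/* Write a 32-bit value at the address an oop maps to. */
static inline void demo_long_at_put(demo_oop oop, int32_t value) {
    memcpy(demo_pointer_for_oop(oop), &value, sizeof value);
}

/* Store a value, read it back, and check that both flavours agree
   on the address. Returns 1 on success. */
int demo_round_trip_ok(void) {
    demo_long_at_put(8, 1234);
    assert(DEMO_POINTER_FOR_OOP(8) == demo_pointer_for_oop(8));
    return demo_long_at(8) == 1234;
}
```

The point is only that the function form gives up nothing in principle once the compiler (or, in the Slang case, the code inliner) inlines it, while staying easier to read and debug than the macro form.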

I think that it really is a matter of preference. Do you prefer to read
and write the glue code in Smalltalk, or do you prefer to use a language
that more closely matches the runtime platform that you are trying to
support? There are good arguments either way.

Re: Why Are Only Some Sources Generated

David T. Lewis wrote

this is a matter of what is most convenient and comfortable
for the person who is writing the platform code...

Thanks for the detailed explanation!

I started thinking about this because I was patching the VM's mouse wheel event simulation and could only find the code for Mac, where the file name was "sq...events.m". For Windows and Linux, I later found out via the mailing list that the proper files were "sqWin32Window.c" and "vm-display-X11/sqUnixX11.c", neither of which stands out to me as particularly event-related (although maybe I was biased by finding a file called "events" first).

Then, I was extracting some magic constants as #defines and couldn't figure out exactly where to put them in the huge (from a Smalltalk perspective) files. This was equally true for a few functions I extracted.

Anyway, it made me wonder whether we couldn't benefit from Smalltalk. It seems like parts could be simulated to verify the logic. And the tools might give the code some structure and browsability, especially since C source files probably scare much of the community away from attempting to contribute.