/* minor.h --- C interface to Minor Scheme */
#ifndef MN_MINOR_H
#define MN_MINOR_H
#include
#include
#include
#include
/* This file defines the Minor C Interface, a C interface to the Minor
Scheme system. Using this interface, you can:
- create, access, and change Minor objects,
- call Minor functions from C,
- define C functions to be called from Minor, and
- define new Minor types.
This interface is thread-safe. Minor Scheme supports multiple
concurrent threads of control, based on the underlying system's
native thread support.
This interface is modeled after the Java Native Interface
(http://java.sun.com/j2se/1.3/docs/guide/jni/, or search at
http://java.sun.com for the latest docs), since that nicely solves
many of the same problems we need to solve. It would be easy to
implement this interface in terms of the JNI, for a Java-based
Scheme. */
/* Naming conventions. */
/* This interface defines a lot of functions, many of which fall into
categories within which there's some regularity. To help make the
names easier to remember, we try to follow some conventions.
Type conversions:
Many of the functions in the API convert C values to Scheme values
and vice versa, or test whether such a conversion is possible
without loss of information. A function that converts instances of
one type A to another type B we call 'mn_A_to_B'. A function that
checks whether a given value of type A can be converted to type B
we call 'mn_A_is_B'. For example:
- mn_str_to_string - Convert a null-terminated C string to a
Scheme string.
- mn_number_to_int - Convert a Scheme number to a C int.
- mn_number_is_uint - Return true if a given Scheme number can be
converted to a C unsigned int without
underflow or overflow; otherwise, return false.
We use the following lexemes in these functions' names to designate
the types involved (and we pass and return the values with the
following C types):
- Text:
- 'character', for Scheme characters
- 'string', for Scheme strings
- 'char', for C 'char'
- 'wchar', for C 'wchar_t'
- 'str', for C null-terminated byte strings (char *)
- 'mem', for C arrays of arbitrary bytes (a char * and a size_t)
- 'wcs', for C null-terminated wide character strings (wchar_t *)
- 'wmem', for arrays of arbitrary wchar_t values (a wchar_t * and a
size_t)
And in :
- 'utf8' for an array of well-formed UTF-8 characters
(an mn_utf8_t * and a size_t)
- 'unicode' for a single Unicode code point (mn_unicode_t)
- Numbers:
- 'number' for Scheme numbers
- 'int' and 'uint' for the C types 'int' and 'unsigned int'
- 'long' and 'ulong' for the C types 'long' and 'unsigned long'
- 'llong' and 'ullong' for the C types 'long long', 'unsigned long long'
(The name lexemes for C numeric types are the same as those used by
the FOO_MIN and FOO_MAX macros defined by <limits.h>, except that
they are in lower case.)
C wrappers of Scheme functions:
Some functions are C wrappers of standard Scheme functions. We
give these the same name they have in Scheme, with 'mn_' added
to the front, and with hyphens changed to underscores. If the name
ends with a question mark, we remove the question mark, and add
'is' in the natural place. Exclamation points are dropped. For
example:
Scheme function      C function
===================  =================
car                  mn_car
vector-length        mn_vector_length
boolean?             mn_is_boolean
set-car!             mn_set_car
*/
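/* Illustration only, not part of the Minor C API: a minimal sketch of
the renaming rules above as a C function. The name toy_c_name is
hypothetical, and placing 'is_' directly after 'mn_' is an
assumption; the convention only says 'is' goes in the natural place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Write the C name for SCHEME_NAME into OUT, which holds CAP bytes.
// Assumption: for predicates, "is_" goes right after "mn_".
// Return 0 on success, or -1 if OUT might be too small.
static int toy_c_name(const char *scheme_name, char *out, size_t cap)
{
    size_t len = strlen(scheme_name);
    int predicate = len > 0 && scheme_name[len - 1] == '?';
    const char *prefix = predicate ? "mn_is_" : "mn_";
    size_t n = strlen(prefix);

    assert(out != NULL);
    if (n + len + 1 > cap)      // conservative upper bound on the result
        return -1;
    memcpy(out, prefix, n);
    for (size_t i = 0; i < len; i++) {
        char ch = scheme_name[i];
        if (ch == '-')
            out[n++] = '_';     // hyphens become underscores
        else if (ch != '?' && ch != '!')
            out[n++] = ch;      // '?' and '!' are dropped
    }
    out[n] = '\0';
    return 0;
}
```

Under that assumption, "boolean?" maps to "mn_is_boolean" and
"set-car!" maps to "mn_set_car", matching the table above. */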
/* mn_refs: References to Minor objects. */
/* To support garbage collection, the Minor run-time needs to be able
to find all the Minor objects C code is examining at any given
time: if C code has access to an object, then the garbage collector
will make sure not to free it.
The functions in this interface never return, or accept, direct
pointers to Minor objects. Instead, we introduce one level of
indirection: they return and accept 'mn_ref_t' objects, which refer
to Minor objects.
C code:
mn_ref_t *x;
  +-----+
  |  .  |          the mn_ref_t
  +--|--+           +-----+
     `------------->|  .  |
                    +--|--+        ,--------------.
                       `---------->| minor object |
                                   `--------------'
(The Minor objects themselves, of course, point directly to each
other; mn_refs are strictly an interface to non-GC'd languages.)
For example, the functions to construct and access pairs are
declared like this:
mn_ref_t *mn_cons (mn_call_t *c, mn_ref_t *car, mn_ref_t *cdr);
mn_ref_t *mn_car (mn_call_t *c, mn_ref_t *pair);
mn_ref_t *mn_cdr (mn_call_t *c, mn_ref_t *pair);
(Ignore the 'mn_call_t' arguments for the moment.) The Minor
run-time keeps a list of all the mn_refs that have been given to C
code, and protects all the objects those mn_refs refer to from
being garbage collected. We try to make mn_refs as lightweight as
possible.
C code using this interface is responsible for making sure every
mn_ref_t it is given gets freed. Since this can be a rather complex
task to manage, we introduce some wrinkles to make the common cases
easy. Mn_refs come in two kinds: local, and global.
- A local mn_ref_t is owned by a particular call of a C function by
Minor code; when that function call returns, all the local refs
it owns are automatically freed.
Specifically, when Minor calls a C function, it passes an extra
argument: a 'mn_call_t *' pointing to a call object created just
for that Minor->C call. The functions in this interface that
construct local references all take a 'mn_call_t *' as their first
argument, and return local mn_refs owned by that call (with some
exceptions, all marked as such). The call object also owns any
mn_refs the C function may have received as arguments from Minor
code. When the C function returns, all the local refs its call
object owns are automatically freed, along with the call object
itself.
This means that, in the common case of Minor calling a C function
that does some work on its arguments and then returns without
stashing any Minor objects in global variables or data
structures, the C code doesn't need to do any extra bookkeeping;
once it returns, all the mn_refs it accumulated --- all local ---
will be freed.
However, the convenience of local mn_refs comes at a price: local
mn_refs should not be stored in global variables, nor should they
be shared with other threads. To get around this problem, you
can promote a local mn_ref_t to a global mn_ref_t:
- A global mn_ref_t lives until you explicitly free it. Global refs
can be shared amongst other threads, and stored in global
variables. However, since global refs are never automatically
reclaimed, C code must take care of freeing them at the right
time itself; global refs are more work to manage.
Like any other kind of object shared between multiple threads,
it's up to the user of this interface to ensure that one thread
isn't using a global reference while another thread is freeing
it.
This interface provides functions to convert local mn_refs to
global mn_refs and vice versa, and a function to explicitly free
refs when necessary.
There are also some cases where C code will need to explicitly free
local mn_refs, before the call that owns them returns. For
example, if a loop traverses a list using mn_car and mn_cdr, each
element is returned as a local mn_ref_t. To avoid consuming storage
proportional to the length of the list for all those local mn_refs,
the loop must free them as it goes. To accommodate cases like this,
there is also a function that explicitly frees a local mn_ref_t,
mn_free_local_ref.
Every function in this interface but one takes a 'mn_call_t *' as its
first argument; unless stated otherwise, any mn_ref_t values it
returns are owned by that call. When we use the call in the
obvious way, we don't bother to give it a name in the prototype,
for (a tiny bit of) legibility.
The sole function that doesn't take a call argument is ---
obviously --- the function that gives you your very first call:
mn_first_call takes no arguments, and returns a call object. Since
calls can't be shared amongst threads, every thread that wants to
use this API must call this function. The first time it is called,
we initialize the Minor library.
Some subtleties:
- Except where we state otherwise, all the functions in this
interface promise to free any local refs they allocate before
they return (other than the local ref(s) they return). This
allows these functions to be used within long-running loops
without accumulating local refs the caller has no way to free.
- You may notice that even functions that don't need to allocate or
return local references still expect a call argument --- if
there's no need to indicate who should own any new local refs,
why does the function need to know the current call? The
collector also uses calls internally, as a cheap way to keep
track of which threads might be accessing heap objects. If
you've ever touched a heap object, you must have a call. So we
can take care of adding a thread to our list when we allocate its
first call --- instead of having every function in this interface
check to make sure the calling thread is registered.
- You may notice that this interface doesn't provide any functions
that change the heap object a reference refers to. This is an
important property, because it allows this interface to be
perfectly thread-safe without doing any memory or execution
synchronization while accessing references. Where the user's
code shares references between threads (global references only,
please), it's already the user's responsibility to do the right
sorts of mutual exclusion to make that sharing kosher --- and
that takes care of us, as well. The users manage the same
synchronization burden they've always had; we don't add to it, in
complexity or run-time overhead.
(The functions mn_ad_car and mn_ad_cdr look a little like
side-effecting functions, but the specification actually says
they destroy the original reference and return you a fresh one.
And it's always up to the client code to ensure that nobody
destroys an object while someone else is using it. So the
calling thread must have the only live pointer to the
reference.)
Functions that free references automatically: 'ad' functions
Often a reference is meant to be passed to a function, and never
used again. For example, to produce a reference to the pair (1 . 2),
one would need to write:
mn_ref_t *one = mn_int_to_number (c, 1);
mn_ref_t *two = mn_int_to_number (c, 2);
mn_ref_t *onetwo = mn_cons (c, one, two);
mn_free_local_ref (c, one);
mn_free_local_ref (c, two);
Since this is a common pattern, the Minor C API offers variants of
functions that, in addition to whatever else the functions do, also
free all the references they are passed as arguments. These
functions have names containing the lexeme 'ad'. For example, the
'mn_cons' function takes two references to objects, and returns a
reference to a pair whose car and cdr are the objects passed. The
'mn_ad_cons' function takes two references to objects and returns
the same, but frees the two references it was passed.
The above example can be written in a more functional style using
'mn_ad_cons':
mn_ref_t *onetwo = mn_ad_cons (c, mn_int_to_number (c, 1),
mn_int_to_number (c, 2));
The 'ad' functions can clean up code for traversing a Scheme list:
for (l = mn_make_local_ref (c, list);
     mn_is_pair (c, l);
     l = mn_ad_cdr (c, l))
  {
    ...;
  }
The 'ad' functions are only a convenience; every 'ad' function
could be implemented simply by calling the non-'ad' function,
saving the result, and then freeing the argument references.
(However, in some cases the 'ad' functions allow Minor to use an
internal optimization, so they may be a bit faster.) */
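/* Illustration only, not Minor's implementation: a toy model of how a
call object can own local references. It shows why a loop that frees
each reference as it goes keeps only a constant number alive, and how
everything left over is released when the call ends. Every name
beginning with 'toy_' is hypothetical, and an int stands in for a
heap object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Toy stand-in for a local reference.
typedef struct toy_ref {
    int object;               // stands in for a pointer to a heap object
    struct toy_ref *next;     // next local ref owned by the same call
} toy_ref_t;

// Toy stand-in for a call object.
typedef struct toy_call {
    toy_ref_t *refs;          // every local ref this call still owns
    size_t live;              // how many refs are on that list
} toy_call_t;

// Allocate a local ref owned by CALL.
static toy_ref_t *toy_make_local_ref(toy_call_t *call, int object)
{
    toy_ref_t *ref = malloc(sizeof *ref);
    assert(ref != NULL);
    ref->object = object;
    ref->next = call->refs;
    call->refs = ref;
    call->live++;
    return ref;
}

// Free one local ref before the call ends.
static void toy_free_local_ref(toy_call_t *call, toy_ref_t *ref)
{
    toy_ref_t **link = &call->refs;
    while (*link != NULL && *link != ref)
        link = &(*link)->next;
    assert(*link == ref);     // REF must be owned by CALL
    *link = ref->next;
    call->live--;
    free(ref);
}

// What happens when the C function returns: free whatever is left.
static void toy_end_call(toy_call_t *call)
{
    while (call->refs != NULL)
        toy_free_local_ref(call, call->refs);
}

// Visit N elements, freeing each ref as we go.
// Return the largest number of refs that were alive at once.
static size_t toy_walk(toy_call_t *call, int n)
{
    size_t peak = 0;
    for (int i = 0; i < n; i++) {
        toy_ref_t *elt = toy_make_local_ref(call, i);
        if (call->live > peak)
            peak = call->live;
        toy_free_local_ref(call, elt);
    }
    return peak;
}
```

In this model toy_walk never holds more than one reference at a time,
however long the traversal is. Dropping the toy_free_local_ref call
from the loop would make the peak grow with N, which is the storage
cost the text above warns about. */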
/* Return a new global mn_ref_t referring to the same object as REF.
REF itself may be a local or global mn_ref_t. */
mn_ref_t *mn_make_global_ref (mn_call_t *, mn_ref_t *ref);
/* Like mn_make_global_ref, but then free REF as with
mn_free_local_ref. */
mn_ref_t *mn_ad_global_ref (mn_call_t *, mn_ref_t *ref);
/* Free the global reference REF. */
void mn_free_global_ref (mn_call_t *, mn_ref_t *ref);
/* Return a new local reference, owned by CALL, that refers to the
same object as REF. REF itself may be a local or global mn_ref_t.
(I'm not sure what this function is good for. Maybe it would be
useful for meeting some sort of allocation invariant.) */
mn_ref_t *mn_make_local_ref (mn_call_t *call, mn_ref_t *ref);
/* Free the local mn_ref_t REF.
If REF is actually a global mn_ref_t, do nothing. (This behavior
allows code to use a global mn_ref_t where a local mn_ref_t was
expected: it's fine either to call mn_free_local_ref explicitly
or simply to return to Scheme and let the C
interface clean up the local mn_refs.) */
void mn_free_local_ref (mn_call_t *, mn_ref_t *ref);
/* Free the array of local references REFS, containing LEN elements,
using mn_free_local_ref. (This does not free the array REFS itself.) */
void mn_free_local_ref_array (mn_call_t *c, mn_ref_t * const *refs, size_t len);
/* Return the first Minor call object for the calling thread.
A call object corresponds to a particular Minor->C call, but since
the main program (and new threads) may begin execution in C code,
and you need to have a call object before C can call Minor code,
where do you get that first call?
You call this function. For each thread we create a special
"first" call object, that a new thread can use to call Minor
functions.
If there are, in fact, other active Minor->C calls in this thread,
you should use the most recent of those calls instead. If you call
this function while other calls are active, it will abort.
(Would it be better to simply abort whenever this function is
called more than once from any given thread? By not aborting
unless there are other active calls, we ensure we don't hand out
misleading information; isn't that good enough?)
The first thread to call this function initializes the Minor
runtime. */
mn_call_t *mn_first_call (void);
/* Subcalls:
A subcall is a call object you can create and free yourself. You
can pass it to functions in this API like any other call object,
and the subcall will own whatever local references they return.
When you're done, you can free the subcall, which frees all the
local references it owns.
Every subcall is a child of some other call; a subcall can have
children of its own. Freeing a parent call frees all its children
as well (and their children, recursively). Since you need to have
a parent call object on hand to produce a subcall, every tree of
subcalls is rooted at an ordinary call object.
A subcall is simply a bookkeeping aid: it's equivalent to
maintaining a list of the local references yourself, except that
freeing them is somewhat faster. */
/* Produce a subcall that is a child of CALL. */
mn_call_t *mn_subcall (mn_call_t *call);
/* Free SUBCALL, and all its children. This frees all local
references owned by SUBCALL.
SUBCALL must be a subcall, produced by a call to mn_subcall, not a
true call object; if it is a true call object, abort. */
void mn_free_subcall (mn_call_t *subcall);
/* Free SUBCALL, and all its children. Return a new local reference,
owned by CALL, referring to the same object as REF. If REF is
NULL, return NULL.
This is equivalent to a call to mn_make_local_ref to duplicate REF,
and then a call to mn_free_subcall, to free SUBCALL. But it's a
common idiom --- cleaning up after a computation that has produced
a single result --- so we provide a function for it.
SUBCALL must be a subcall, produced by a call to mn_subcall, not a
true call object; if it is a true call object, abort. */
mn_ref_t *mn_finish_subcall (mn_call_t *call,
mn_call_t *subcall,
mn_ref_t *ref);
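/* Illustration only, not Minor's implementation: a toy model of the
subcall tree described above. It counts references instead of storing
them, which is enough to show that freeing a subcall releases its
whole subtree, and that finishing a subcall keeps exactly one result
alive in the parent. Every name beginning with 'toy_' is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Toy stand-in for a call object.
typedef struct toy_call {
    struct toy_call *parent;       // NULL for a true (non-sub) call
    struct toy_call *children;     // first child subcall
    struct toy_call *sibling;      // next child of the same parent
    int local_refs;                // local refs this call owns
} toy_call_t;

static int toy_live_refs = 0;      // refs alive across all calls

// Record that CALL now owns one more local ref.
static void toy_new_ref(toy_call_t *call)
{
    call->local_refs++;
    toy_live_refs++;
}

// Produce a subcall that is a child of PARENT.
static toy_call_t *toy_subcall(toy_call_t *parent)
{
    toy_call_t *sub = calloc(1, sizeof *sub);
    assert(sub != NULL);
    sub->parent = parent;
    sub->sibling = parent->children;
    parent->children = sub;
    return sub;
}

// Drop CALL's refs and free its descendants, but not CALL itself.
static void toy_release(toy_call_t *call)
{
    while (call->children != NULL) {
        toy_call_t *child = call->children;
        call->children = child->sibling;
        toy_release(child);
        free(child);
    }
    toy_live_refs -= call->local_refs;
    call->local_refs = 0;
}

// Free SUB and its whole subtree; abort if SUB is a true call.
static void toy_free_subcall(toy_call_t *sub)
{
    assert(sub->parent != NULL);
    toy_call_t **link = &sub->parent->children;
    while (*link != sub)
        link = &(*link)->sibling;
    *link = sub->sibling;          // unlink from the parent's child list
    toy_release(sub);
    free(sub);
}

// Free SUB, but keep one result alive as a ref owned by CALL.
static void toy_finish_subcall(toy_call_t *call, toy_call_t *sub)
{
    toy_new_ref(call);             // duplicate the result first
    toy_free_subcall(sub);
}
```

The order inside toy_finish_subcall matters: the result is duplicated
into the parent before the subcall is freed, so the object it refers
to is never left without a reference. */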
/* Exceptions. */
/* The functions in this interface return exceptions in a way
resembling the usual C 'errno' style, except that Minor's
exceptions are Scheme exception objects, rather than integers, and
the interface is reentrant, without resorting to magical
definitions for an 'errno'-like variable.
For each function in this interface that can return an exception,
we document a distinguished 'exception return value' --- a null
pointer, for example --- that indicates to the caller that an
exception has occurred.
Each thread has a 'pending exception' object, accessed via the
'mn_get_exception' and 'mn_set_exception' functions. When a
function returns its exception return value, the caller can use
'mn_get_exception' to find the exception object describing the
error.
To throw an exception, C code can make an exception object, call
'mn_set_exception' to make that object the thread's pending
exception, and return its own exception return value.
Some of these functions take or produce strings; see the comments
at the top of the "Characters" section for Minor's conventions for
dealing with text.
When Minor Functions Abort Instead Of Returning Exceptions,
and Why:
By convention, Minor C API functions handle type errors, index
range errors, and numeric range errors by aborting, instead of
returning an exception. These sorts of errors typically indicate
bugs in the user's code itself: correct programs usually never
encounter them. When this is not the case, the API provides
functions to check for the conditions that would cause an abort.
An interface which handles these sorts of errors by returning
exceptions doesn't work well:
- Users will often not check for exception return values in these
cases, since they "know" the errors cannot occur. If the API
reports them as exceptions which the user's code ignores, then
the program behaves unpredictably, instead of failing in a
controlled way.
- For many functions in this API, these classes of errors are the
only ones they can ever encounter, so these functions either
return successfully, or not at all. This lets the user write
terser, more legible code, without leaving errors unchecked.
Each function's description details when it will abort. */
/* Return a new local reference to the calling thread's pending
exception object. If there is no pending exception, return NULL.
Note that calling this function does not clear the pending
exception! You must call mn_set_exception yourself, passing NULL, if
you want future callers to see that there is no pending exception. */
mn_ref_t *mn_get_exception (mn_call_t *);
/* Set the calling thread's pending exception object to EX. If you
want to clear the pending exception, pass NULL as EX. */
void mn_set_exception (mn_call_t *, mn_ref_t *ex);
/* Return a null-terminated string describing the exception EX. (In
other words, get its error message.) The string is allocated using
malloc; the caller is responsible for freeing it. If EX is not an
exception, abort. */
char *mn_exception_string (mn_call_t *, mn_ref_t *ex);
/* Make a generic exception with the error message MSG. MSG is a
null-terminated string.
If MSG cannot be converted to a Minor string, return NULL and set
the pending exception. */
mn_ref_t *mn_make_generic_exception (mn_call_t *, const char *msg);
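/* Illustration only, not Minor's implementation: a toy model of the
errno-style convention described above. A function signals failure
with a distinguished return value (-1 here) and leaves the details in
a pending-exception slot; the caller checks the return value, fetches
the exception, and clears it explicitly. Every name beginning with
'toy_' is hypothetical; the exception is just a message string, and
the slot lives in the toy call object rather than in per-thread state.

```c
#include <assert.h>
#include <stddef.h>

// Toy call object carrying the pending exception.
typedef struct toy_call {
    const char *pending;           // pending exception, or NULL if none
} toy_call_t;

// Toy pair: only the cdr matters for measuring length.
typedef struct toy_pair {
    struct toy_pair *cdr;
} toy_pair_t;

static const char *toy_get_exception(toy_call_t *c)
{
    return c->pending;             // note: does not clear the slot
}

static void toy_set_exception(toy_call_t *c, const char *ex)
{
    c->pending = ex;               // pass NULL to clear
}

// Return the length of LIST. If LIST is cyclic, return -1 and set
// the pending exception. Uses two cursors, one moving twice as fast.
static int toy_length(toy_call_t *c, const toy_pair_t *list)
{
    const toy_pair_t *slow = list;
    const toy_pair_t *fast = list;
    int n = 0;

    while (fast != NULL) {
        fast = fast->cdr;
        n++;
        if (fast == NULL)
            break;
        fast = fast->cdr;
        n++;
        slow = slow->cdr;
        if (fast != NULL && fast == slow) {
            toy_set_exception(c, "not a proper list: cycle detected");
            return -1;
        }
    }
    return n;
}

// Caller side: check the exception return value, then fetch and
// clear the pending exception.
static const char *toy_describe(toy_call_t *c, const toy_pair_t *list)
{
    if (toy_length(c, list) != -1)
        return "ok";
    const char *ex = toy_get_exception(c);
    toy_set_exception(c, NULL);
    return ex;
}
```

The explicit toy_set_exception (c, NULL) in toy_describe mirrors the
rule that fetching the pending exception does not clear it. */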
/* Equality. */
/* Return true if A and B refer to the same object; otherwise,
return false. */
_Bool mn_eq (mn_call_t *, mn_ref_t *a, mn_ref_t *b);
/* Return true if A and B are equal in the sense of the Scheme
'equal?' predicate.
If both A and B are cyclic, this function may not return. (And,
until we have annotated code running, this could block all other
threads from performing a GC. This is a lot of trouble to fix in
the C implementation, but will be a non-issue in annotated code,
which will run faster anyway, so we're going to leave it as is for
now.) */
_Bool mn_equal (mn_call_t *, mn_ref_t *a, mn_ref_t *b);
/* Booleans. */
/* Return true if REF refers to a boolean value; otherwise, return
false. */
_Bool mn_is_boolean (mn_call_t *, mn_ref_t *ref);
/* Return a reference to #t / #f. */
mn_ref_t *mn_true (mn_call_t *);
mn_ref_t *mn_false (mn_call_t *);
/* Return true if REF is a true value (i.e., anything but #f);
otherwise, return false. */
_Bool mn_is_true (mn_call_t *, mn_ref_t *ref);
/* Return true if REF is a true value (i.e., anything but #f);
otherwise, return false. Either way, free REF. */
_Bool mn_ad_is_true (mn_call_t *c, mn_ref_t *ref);
/* Pairs. */
/* Return true if REF refers to a pair; otherwise, return false. */
_Bool mn_is_pair (mn_call_t *, mn_ref_t *ref);
/* Return a new pair whose car is CAR and whose cdr is CDR. */
mn_ref_t *mn_cons (mn_call_t *, mn_ref_t *car, mn_ref_t *cdr);
/* Like mn_cons, but free the references to CAR and CDR. The effect
is identical to that of calling mn_cons, and then mn_free_local_ref
twice. */
mn_ref_t *mn_ad_cons (mn_call_t *, mn_ref_t *car, mn_ref_t *cdr);
/* Return the car/cdr of PAIR. If PAIR is not a pair, abort. */
mn_ref_t *mn_car (mn_call_t *, mn_ref_t *pair);
mn_ref_t *mn_cdr (mn_call_t *, mn_ref_t *pair);
/* Set the car/cdr of PAIR to VALUE.
If PAIR is not a pair, abort. */
void mn_set_car (mn_call_t *, mn_ref_t *pair, mn_ref_t *value);
void mn_set_cdr (mn_call_t *, mn_ref_t *pair, mn_ref_t *value);
/* If REF refers to a pair P, free REF and return a new reference to
(car P) / (cdr P). If REF is not a pair, abort.
This is no different from calling mn_car / mn_cdr and then calling
mn_free_local_ref on the ref you passed to it, except that it's a little
more readable, and the implementation can optimize the process.
It's just helpful for traversing list structures. */
mn_ref_t *mn_ad_car (mn_call_t *, mn_ref_t *ref);
mn_ref_t *mn_ad_cdr (mn_call_t *, mn_ref_t *ref);
/* Lists. */
/* Return true if REF refers to a proper list; otherwise, return
false. A proper list is either the empty list object (), or a pair
whose cdr is a proper list.
A "cyclic list", or a series of pairs chained through their cdrs
whose last cdr points to an earlier pair in the series, is not a
proper list; this function returns false given such a list. */
_Bool mn_is_list (mn_call_t *, mn_ref_t *obj);
/* Return the length of the proper list LIST. If LIST isn't a proper
list, or if the length doesn't fit in an int, return -1, and set
the pending exception. This detects cyclic lists. */
int mn_length (mn_call_t *, mn_ref_t *list);
/* Return a copy of LIST: the spine of the list is made from new
pairs, but the elements are shared with the original. If LIST is
not a proper list, return NULL, and set the pending exception.
This detects cyclic lists. */
mn_ref_t *mn_copy_list (mn_call_t *, mn_ref_t *list);
/* Return a reference to the empty list. */
mn_ref_t *mn_null (mn_call_t *);
/* Return true if REF refers to the empty list, false otherwise. */
_Bool mn_is_null (mn_call_t *, mn_ref_t *ref);
/* Allocate a new pair whose car is ELT and whose cdr is LIST. Free
LIST, and return a reference to the new pair.
This is no different from calling mn_cons and then mn_free_local_ref,
except that it's a little more readable, and the implementation can
optimize the process. */
mn_ref_t *mn_push (mn_call_t *, mn_ref_t *list, mn_ref_t *elt);
/* Thread safety note:
The functions in Minor that traverse lists are designed to behave
reasonably on cyclic data structures, but it is still possible to
confuse them by having other threads mutate the list structure
while it is being traversed, to the point that they never return.
Minor should (at least) promise that such infinite loops will not
take place while holding any important internal locks, so other
threads will be able to make forward progress. Unfortunately,
that's very difficult to achieve in a run-time library implemented
using incoherent sections, and we don't. It's our hope that
re-implementing the run-time in Scheme will remove this problem, by
allowing us to use annotated machine code instead of incoherent
sections. */
/* Numbers. */
/* At the moment, Minor supports only exact integers --- and only
fixnums at that. But it's obvious how the new functions should be
added, and the existing functions shouldn't need to change their
behaviors. */
/* Return true if REF refers to a number; otherwise, return false. */
_Bool mn_is_number (mn_call_t *, mn_ref_t *ref);
/* Return true if REF refers to an exact number; return false
otherwise. */
_Bool mn_is_exact (mn_call_t *, mn_ref_t *ref);
/* Return true if REF refers to an integer; otherwise, return false. */
_Bool mn_is_integer (mn_call_t *, mn_ref_t *ref);
/* Return true if REF refers to an exact integer; return false
otherwise. This is equivalent to:
mn_is_integer (call, ref) && mn_is_exact (call, ref) */
_Bool mn_is_exact_integer (mn_call_t *, mn_ref_t *ref);
/* Return true iff N can be represented as an exact Minor integer.
(Eventually, Minor will support bignums, and these functions will
always return true. But:
- the bignum support isn't done yet, and our own rules say we
provide functions that check for each condition that could cause
an abort, and
- if this interface is to serve as a model for other Scheme
implementations, it needs to support those where exact numbers
have a limited range.) */
_Bool mn_int_is_number (mn_call_t *, int n);
_Bool mn_uint_is_number (mn_call_t *, unsigned int n);
_Bool mn_long_is_number (mn_call_t *, long n);
_Bool mn_ulong_is_number (mn_call_t *, unsigned long n);
_Bool mn_llong_is_number (mn_call_t *, long long n);
_Bool mn_ullong_is_number (mn_call_t *, unsigned long long n);
/* Return true iff N is an exact integer that can fit in the given
type, false otherwise. */
_Bool mn_number_is_int (mn_call_t *, mn_ref_t *n);
_Bool mn_number_is_uint (mn_call_t *, mn_ref_t *n);
_Bool mn_number_is_long (mn_call_t *, mn_ref_t *n);
_Bool mn_number_is_ulong (mn_call_t *, mn_ref_t *n);
_Bool mn_number_is_llong (mn_call_t *, mn_ref_t *n);
_Bool mn_number_is_ullong (mn_call_t *, mn_ref_t *n);
/* Return a mn_ref_t for an exact integer equal to N. If N is beyond
the range of integers Minor can represent, abort. */
mn_ref_t *mn_int_to_number (mn_call_t *, int n);
mn_ref_t *mn_uint_to_number (mn_call_t *, unsigned int n);
mn_ref_t *mn_long_to_number (mn_call_t *, long n);
mn_ref_t *mn_ulong_to_number (mn_call_t *, unsigned long n);
mn_ref_t *mn_llong_to_number (mn_call_t *, long long n);
mn_ref_t *mn_ullong_to_number (mn_call_t *, unsigned long long n);
/* If N refers to an exact integer, return its value.
If N does not refer to an exact integer, or if its value does not
fit in the given return type, abort. */
int mn_number_to_int (mn_call_t *, mn_ref_t *n);
unsigned int mn_number_to_uint (mn_call_t *, mn_ref_t *n);
long mn_number_to_long (mn_call_t *, mn_ref_t *n);
unsigned long mn_number_to_ulong (mn_call_t *, mn_ref_t *n);
long long mn_number_to_llong (mn_call_t *, mn_ref_t *n);
unsigned long long mn_number_to_ullong (mn_call_t *, mn_ref_t *n);
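/* Illustration only, not part of the Minor C API: a minimal sketch of
the check-then-convert pattern that the mn_number_is_* and
mn_number_to_* pairs above support. A C 'long long' stands in for a
Scheme exact integer, and every name beginning with 'toy_' is
hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

// Toy analogue of mn_number_is_int: does V fit in a C int?
static bool toy_fits_int(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

// Toy analogue of mn_number_is_uint: does V fit in a C unsigned int?
static bool toy_fits_uint(long long v)
{
    return v >= 0 && (unsigned long long) v <= UINT_MAX;
}

// Toy analogue of mn_number_to_int: abort if V is out of range.
static int toy_to_int(long long v)
{
    assert(toy_fits_int(v));
    return (int) v;
}

// For values the caller cannot vouch for: check first, and fall back
// instead of aborting.
static int toy_to_int_or(long long v, int fallback)
{
    return toy_fits_int(v) ? toy_to_int(v) : fallback;
}
```

Code that already knows its values are in range can call the
converting function directly and treat an abort as a bug; only code
handling untrusted values needs the explicit check. */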
/* Why not intmax_t? ptrdiff_t? intptr_t? */
/* Arithmetic and comparison functions could go here. */
/* Return true if N is numerically equal to M, false otherwise. (This
returns true exactly when (= N M) is true in Scheme.) If N or M is
not a number, abort. */
_Bool mn_numbers_equal (mn_call_t *, mn_ref_t *n, mn_ref_t *m);
/* Characters. */
/* General Conventions For Handling Text and Character Sets:
Minor uses Unicode to represent characters and strings; C uses a
representation that varies from one locale to another. Where the
functions in this API accept or return 'char' or 'wchar_t' values,
or strings made from them, those values use the current C execution
character set; the API converts to and from Minor's internal
representation as needed. This means that you can use such values
with the standard C library functions for working with text
(getchar, printf, atoi, and so on) in the normal way, without
worrying about what representation Minor is using.
Since the encoding of characters in the current C execution
character set is determined by the current locale, the behavior of
these functions may depend on the current locale --- specifically,
that established for the LC_CTYPE category.
Errors can occur during conversion: byte strings may not be
well-formed encodings of code points; code points may be
unassigned; and characters may not exist in the destination
character set.
Minor reports errors that would result in the loss of information.
However, if a conversion can be performed without doing so, Minor
may carry it through; for example, if the C execution character set
is also Unicode, then Minor can convert arbitrary code points to characters
or store them in strings, even if those code points have no
character assigned to them.
ISO C divides the C execution character set into the "basic
character set" --- the upper- and lower-case letters, the digits,
the graphic symbols used in C syntax (all the ASCII symbols but
'$', '@', or '`'), the whitespace characters, and the null
character --- and "extended characters". Characters in the basic
character set, and strings containing only such characters, may
always be converted to and from Minor values without error. */
/* Return true if REF refers to a character; otherwise, return false. */
_Bool mn_is_character (mn_call_t *, mn_ref_t *ref);
/* Convert between Minor characters and the C execution character set.
(Hah! "Minor characters"??? Get it? Pretty funny, huh!) */
/* Return true if CHARACTER is a character, and that character can be
represented as a C char / wchar_t, false otherwise. */
_Bool mn_is_char (mn_call_t *, mn_ref_t *character);
_Bool mn_is_wchar (mn_call_t *, mn_ref_t *character);
/* Return CHARACTER as a C char / wchar_t. If CHARACTER cannot be
represented in the given type, return EOF / WEOF, and set the
pending exception. If CHARACTER is not a character, abort. */
int mn_character_to_char (mn_call_t *, mn_ref_t *character);
wint_t mn_character_to_wchar (mn_call_t *, mn_ref_t *character);
/* Return the Minor character corresponding to the 'char' or 'wchar_t'
value C. If C cannot be converted to a Minor character, return NULL
and set the pending exception. */
mn_ref_t *mn_char_to_character (mn_call_t *, int c);
mn_ref_t *mn_wchar_to_character (mn_call_t *, wchar_t c);
/* Strings. */
/* A string is an array of characters.
Minor strings are immutable. (This is a deviation from Scheme.)
See the comments in the "Characters" section describing the general
conventions for handling text and dealing with conversion errors.
The functions here that provide the contents of a string all
produce copies of the text for the user's use. If it's important
to avoid this, then we could introduce a lease-based interface
here. Leases are described in the file doc/leases. */
/* Return true if REF refers to a string; otherwise, return false. */
_Bool mn_is_string (mn_call_t *, mn_ref_t *ref);
/* Return the length of STRING, in characters. If STRING is not a
string object, abort. */
size_t mn_string_length (mn_call_t *, mn_ref_t *string);
/* Return the i'th character of STRING. If STRING is not a string, or
doesn't have that many characters, abort. */
mn_ref_t *mn_string_ref (mn_call_t *, mn_ref_t *string, int i);
/* Return the contents of STRING as a null-terminated string. The
memory for the string returned is allocated using malloc; the
caller is responsible for freeing it.
If STRING contains null characters, truncate it just before the
first one. (Would it be more useful to just return the entire
string, embedded nulls and all, with an extra null on the end?)
If STRING cannot be fully and accurately converted to the C
execution character set, return NULL and set the pending exception.
If STRING is not a string, abort. */
char *mn_string_to_str (mn_call_t *, mn_ref_t *string);
/* Return the contents of STRING as a block of characters, and set
*LENGTH to its length in bytes. The memory returned is allocated
using malloc; the caller is responsible for freeing it.
If STRING cannot be fully and accurately converted to the C
execution character set, return NULL and set the pending exception.
If STRING is not a string, abort. */
char *mn_string_to_mem (mn_call_t *, mn_ref_t *string, size_t *length);
/* Return a Minor string object whose contents are the same as the
null-terminated string STR. This is a copy of STR; the returned
string does not refer to STR's memory.
If STR cannot be fully and accurately converted to a Minor string,
return NULL and set the pending exception.
(For storing arbitrary sequences of bytes, use byte vectors; they
are described in bytevec.h.) */
mn_ref_t *mn_string_from_str (mn_call_t *, const char *str);
/* Return a Minor string object whose contents are the same as the
LENGTH bytes at MEM. This makes a copy of MEM; the returned string
does not refer to MEM's memory. MEM need not be null-terminated,
and may contain embedded null characters.
If MEM cannot be fully and accurately converted to a Minor string,
return NULL and set the pending exception.
(For storing arbitrary sequences of bytes, use byte vectors; they
are described in bytevec.h.) */
mn_ref_t *mn_string_from_mem (mn_call_t *, const char *mem, size_t length);
/* Symbols. */
/* See the comments in the "Characters" section describing the general
conventions for handling text and dealing with conversion
errors. */
/* Return true if REF refers to a symbol; otherwise, return false. */
_Bool mn_is_symbol (mn_call_t *, mn_ref_t *ref);
/* Return the symbol whose name is NAME. If NAME is not a string,
abort. */
mn_ref_t *mn_string_to_symbol (mn_call_t *, mn_ref_t *name);
/* Return the name of the symbol SYMBOL as a Minor string. If SYMBOL
is not a symbol, abort. */
mn_ref_t *mn_symbol_name (mn_call_t *, mn_ref_t *symbol);
/* Return the symbol whose name is the null-terminated C string NAME.
Every symbol's name is a valid string. If NAME cannot be fully and
accurately converted to a string, return NULL and set the pending
exception. */
mn_ref_t *mn_symbol_from_str (mn_call_t *, const char *name);
/* Return the name of the symbol SYMBOL as a block of characters, and
set *LENGTH to its length in bytes. The memory returned is
allocated using malloc; the caller is responsible for freeing it.
If SYMBOL's name cannot be fully and accurately converted to the C
execution character set, return NULL and set the pending exception.
If SYMBOL is not a symbol, abort. */
char *mn_symbol_to_mem (mn_call_t *, mn_ref_t *symbol, size_t *length);
/* Procedures. */
/* The Minor run-time keeps track of each time Scheme code calls a C
function, and each time C code calls a Scheme function. Every
local mn_ref_t is 'owned' by a particular Scheme->C call; when that
call returns, all the local mn_refs it owns are freed.
All the arguments passed in a given Scheme->C call are owned by
that call. Furthermore, all local mn_refs allocated by functions
in this API get attributed to the most recent Scheme->C call on the
calling thread's stack. When that Scheme->C call returns, the
run-time frees all the local mn_ref_t's it owned. */
/* Return true if REF refers to a procedure; otherwise, return false. */
_Bool mn_is_procedure (mn_call_t *, mn_ref_t *ref);
/* Apply PROC to the Scheme list of arguments, ARGS. PROC must return
exactly one value, to which we return a reference.
If the application of PROC throws an exception, save that as the
pending exception and return zero.
If any of the following constraints are not met, set the pending
exception and return zero:
- PROC must be a procedure.
- PROC must take as many arguments as there are elements in ARGS.
- PROC must return exactly one value. */
mn_ref_t *mn_apply (mn_call_t *, mn_ref_t *proc, mn_ref_t *args);
/* mn_callN applies FUNC to ARG1, ARG2, ... and returns the single
result value. */
mn_ref_t *mn_call1 (mn_call_t *, mn_ref_t *func, mn_ref_t *arg);
/* Apply PROC to the Scheme list of arguments, ARGS. Return the
values PROC returns as a Scheme list. If the application of PROC
throws an exception, save that as the pending exception and return
zero. If PROC is not a procedure at all, return zero and set the
pending exception. */
mn_ref_t *mn_apply_multi_valued (mn_call_t *, mn_ref_t *proc, mn_ref_t *args);
/* Create a new Scheme procedure PROC based on the C function FUNC.
Calling PROC calls FUNC with:
- a fresh mn_call_t object NEW_CALL as its first argument,
- a fresh reference to CLOSURE as its second argument, and
- local references to the first N arguments to PROC as subsequent
arguments.
If PROC was created by one of the mn_make_procedure_N_rest
functions, then calling PROC also passes FUNC one more argument at
the end, which is a fresh reference to a fresh list of any
remaining arguments passed to PROC, beyond the first N.
The local references passed to FUNC are all owned by NEW_CALL.
FUNC should return a reference to a Scheme object to be the single
return value of the Scheme function. If FUNC returns zero, then
the pending exception (see mn_get_exception) is thrown. When FUNC
returns, all the local refs owned by NEW_CALL are freed.
So, you might define a function 'foo' for a Scheme procedure that
takes two arguments like this:
     mn_ref_t *
     foo (mn_call_t *call, mn_ref_t *closure, mn_ref_t *arg1, mn_ref_t *arg2)
     {
       ...
     }
     ...
     scheme_foo = mn_make_procedure_2 (call, foo, closure, "foo");
If NAME is non-zero, it is taken as a string to use as the
procedure's name when the procedure value is printed. The
procedure holds its own copy of NAME; it does not refer to the
string it is passed.
At present, we only provide functions to create procedures taking
up to four fixed arguments, but it's easy to add support for more
fixed arguments. If the limitations cause you trouble, feel free
to extend the interface.
CLOSURE may be NULL; in this case, FUNC will be passed NULL as its
own CLOSURE argument.
It's kind of gross to write out all these declarations like this,
but it's worth it to get the static type checking from the C
compiler. */
mn_ref_t *mn_make_procedure_0 (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_1 (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_2 (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1,
mn_ref_t *arg2),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_3 (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1,
mn_ref_t *arg2,
mn_ref_t *arg3),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_4 (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1,
mn_ref_t *arg2,
mn_ref_t *arg3,
mn_ref_t *arg4),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_0_rest (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *rest),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_1_rest (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1,
mn_ref_t *rest),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_2_rest (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1,
mn_ref_t *arg2,
mn_ref_t *rest),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_3_rest (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1,
mn_ref_t *arg2,
mn_ref_t *arg3,
mn_ref_t *rest),
mn_ref_t *closure,
const char *name);
mn_ref_t *mn_make_procedure_4_rest (mn_call_t *,
mn_ref_t *(*func) (mn_call_t *new_call,
mn_ref_t *closure,
mn_ref_t *arg1,
mn_ref_t *arg2,
mn_ref_t *arg3,
mn_ref_t *arg4,
mn_ref_t *rest),
mn_ref_t *closure,
const char *name);
/* Make a procedure but specify its arity dynamically. The returned
procedure expects NARGS fixed arguments, and if REST is true, it
expects a rest argument as well.
NARGS must be between zero and four; this function doesn't let you
get beyond the limitations on the number of fixed arguments.
mn_func_t is never the right type for the FUNC argument: vararg
function types are not compatible with function types that actually
spell out their arguments' types. So you'll always need to cast
that argument, and the C compiler won't check that you're passing a
function that actually accepts the number of arguments it'll get.
Use the mn_make_procedure_N[_rest] functions if you can. */
typedef mn_ref_t *mn_func_t (mn_call_t *, void *, ...);
mn_ref_t *mn_make_procedure (mn_call_t *,
mn_func_t *func,
int nargs, _Bool rest,
mn_ref_t *closure,
const char *name);
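/* Illustration of the cast described above. This is a minimal,
self-contained sketch, not Minor's implementation: the names ref_t,
generic_func_t, func1_t, func2_t, struct proc, proc_call, second,
and dispatch_demo are hypothetical stand-ins. It assumes a run-time
that stores the function behind a vararg-typed pointer together
with its arity, and casts the pointer back to the function's real
type before calling it, since C only defines a call made through a
compatible function type.

```c
#include <stddef.h>

// Hypothetical stand-ins; only the shapes of the types matter here.
typedef struct ref ref_t;
typedef ref_t *generic_func_t (void *closure, ...);
typedef ref_t *func1_t (void *closure, ref_t *arg1);
typedef ref_t *func2_t (void *closure, ref_t *arg1, ref_t *arg2);

struct proc
{
  generic_func_t *func;   // stored under the vararg type
  int nargs;              // how many fixed arguments FUNC really takes
};

// Call P on ARGS, casting the stored pointer back to the type the
// function was actually defined with.  Return NULL for other arities.
static ref_t *
proc_call (const struct proc *p, void *closure, ref_t **args)
{
  switch (p->nargs)
    {
    case 1:
      return ((func1_t *) p->func) (closure, args[0]);
    case 2:
      return ((func2_t *) p->func) (closure, args[0], args[1]);
    default:
      return NULL;
    }
}

// A two-argument function that returns its second argument.
static ref_t *
second (void *closure, ref_t *arg1, ref_t *arg2)
{
  (void) closure;
  (void) arg1;
  return arg2;
}

// Return 1 if dispatching through the cast pointer reaches SECOND
// with both arguments intact, 0 otherwise.
int
dispatch_demo (void)
{
  int a, b;
  ref_t *args[2] = { (ref_t *) &a, (ref_t *) &b };
  // The cast is needed: SECOND's type is not compatible with
  // generic_func_t, so the compiler would diagnose the initializer.
  struct proc p = { (generic_func_t *) second, 2 };
  return proc_call (&p, NULL, args) == args[1];
}
```

Because the cast hides the real signature, a mismatch between the
stored arity and the function's actual parameter list goes
undiagnosed; this is the checking that the
mn_make_procedure_N[_rest] functions preserve. */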
/* The maximum number of fixed arguments a procedure created by
mn_make_procedure can expect.
The only thing this is really good for, as far as I can tell, is
for checking consistency between Minor and the test suite. */
#define MN_C_PROC_MAX_FIXED_ARITY (4)
/* Like mn_make_procedure, except that FUNC's return value is a Scheme
list of values, to be returned as the values of the function.
Exception returns are handled the same way. */
mn_ref_t *mn_make_multi_valued_procedure (mn_call_t *,
mn_func_t *func,
int nargs, _Bool rest,
mn_ref_t *closure,
const char *name);
/* Return the name of the procedure PROC, or NULL if it has none. The
return value is allocated using malloc; it is the caller's
responsibility to free it. If PROC is not a procedure, abort. */
char *mn_procedure_name (mn_call_t *, mn_ref_t *proc);
/* It might be nice to add alternate entry points here that let you
say more about the types of arguments functions expect, so more
conversions can happen under the hood, and fewer local mn_refs will
need to be allocated. */
/* Vectors. */
/* Return true if REF refers to a vector; otherwise, return false. */
_Bool mn_is_vector (mn_call_t *, mn_ref_t *ref);
/* Return the length of VECTOR. If VECTOR is not a vector, abort. */
size_t mn_vector_length (mn_call_t *, mn_ref_t *vector);
/* Return a new LEN-element vector whose elements are all ELT. */
mn_ref_t *mn_make_vector (mn_call_t *, size_t len, mn_ref_t *elt);
/* Return a new LEN-element vector whose i'th element is ELTS[i]. */
mn_ref_t *mn_vector_from_array (mn_call_t *,
mn_ref_t * const *elts,
size_t len);
/* Return an array whose i'th element is a reference to the i'th
element of VECTOR. If LEN is non-zero, set *LEN to the length of
VECTOR.
The memory returned is allocated with malloc; it's the caller's
responsibility to free it --- as well as every reference it
contains; mn_free_ref_array may be helpful here.
If VECTOR is not a vector, abort. */
mn_ref_t **mn_vector_to_array (mn_call_t *, mn_ref_t *vector, size_t *len);
/* Return the I'th element of VECTOR. If VECTOR isn't a vector, or
hasn't that many elements, abort. */
mn_ref_t *mn_vector_ref (mn_call_t *, mn_ref_t *vector, int i);
/* Set the I'th element of VECTOR to OBJ. If VECTOR isn't a vector or
hasn't that many elements, abort. */
void mn_vector_set (mn_call_t *, mn_ref_t *vector, int i, mn_ref_t *obj);
/* Return a vector of the same length as LIST, whose i'th element is
LIST's i'th element. If LIST is not a proper list, abort. */
mn_ref_t *mn_list_to_vector (mn_call_t *, mn_ref_t *list);
/* Return a list of the same length as VECTOR, whose i'th element is
VECTOR's i'th element. If VECTOR is not a vector, abort. */
mn_ref_t *mn_vector_to_list (mn_call_t *, mn_ref_t *vector);
/* Input/Output ports. */
/* Naming conventions:
The C <stdio.h> or <wchar.h> headers declare input and output
functions whose names follow the convention that, if the function
takes an explicit stream argument, the name starts with 'f':
fprintf, fputs, and so on.
The Minor C API function corresponding to a <stdio.h> or <wchar.h>
input / output function named fFOO is named mn_FOO. All Minor C
API input / output functions take an explicit port argument. The
arguments are the same as those to the C function, except that a
'call' argument is added at the front, and the port always comes
immediately after the 'call' argument. For example:
     C function                    Minor function
     ============================  =================================================
     fputs (char *, FILE *)        mn_puts (mn_call_t *, mn_ref_t *port, char *)
     fputws (wchar_t *, FILE *)    mn_putws (mn_call_t *, mn_ref_t *port, wchar_t *)
     wint_t ungetwc                wint_t mn_ungetwc
       (wint_t, FILE *)              (mn_call_t *, mn_ref_t *port, wchar_t ch)
Returning Exceptions:
Like the C <stdio.h> functions, the functions in this section use
EOF or WEOF as their exception return value, and the input
functions also return EOF or WEOF to indicate that they have
reached end-of-file. So EOF / WEOF is both a normal return value
and an exception return value.
To distinguish these two cases, when an input function reaches
end-of-file, it will always set its end-of-file indicator. If it
encounters an error, it will always set the pending exception
(obviously). Thus, when a function returns EOF or WEOF, the caller
should always check the end-of-file indicator, and if it is clear,
check the pending exception. For example:
     if ((ch = mn_getc (c, port)) == EOF)
       {
         if (mn_port_at_eof (c, port))
           ... handle EOF ...
         else
           ... handle exception mn_get_exception (c) ...
       }
The End-of-File Indicator:
On POSIX systems, it is possible for an input stream to reach end-
of-file more than once. For example, when a user types control-D
at a terminal, a process reading from that terminal receives an
end-of-file indication --- the 'read' system call returns zero,
indicating that no bytes were read. However, the user may then
continue to enter characters at the terminal, which can include
more EOF characters. So from the reading process's point of view,
even after receiving an end-of-file, there may be more data left to
read.
For consistency with the C <stdio.h> input functions, Minor input
ports include an end-of-file indicator, which records whether
end-of-file has been reached. See the individual functions'
descriptions for the details of how the end-of-file indicator
works.
(Note that only the C functions defined in this interface respect a
port's end-of-file indicator. Scheme input functions operating on
the same port may consume new input even if the end-of-file
indicator is set. The end-of-file indicator is just a convenience
for C programmers, making Minor ports behave more like C <stdio.h>
streams.)
See the comments at the start of the section on "Characters" for
Minor's general conventions for processing text. */
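/* For comparison, the EOF-versus-error check described under
"Returning Exceptions" has the same shape as the standard C idiom.
The sketch below uses only <stdio.h>, not the Minor API; the names
read_status, read_one, and eof_demo are made up for this
illustration. It shows the indicator being set by a read that hits
end-of-file, staying set, and being cleared explicitly, which is the
behavior the Minor end-of-file indicator is meant to mirror.

```c
#include <stdio.h>

enum read_status { READ_CHAR, READ_EOF, READ_ERROR };

// Read one byte from F into *CH and say what happened.  EOF alone is
// ambiguous, so consult the end-of-file indicator to tell the two
// cases apart.
static enum read_status
read_one (FILE *f, int *ch)
{
  *ch = getc (f);
  if (*ch != EOF)
    return READ_CHAR;
  return feof (f) ? READ_EOF : READ_ERROR;
}

// Exercise read_one on a one-byte temporary file.  Return 1 if every
// step behaves as described, 0 if not, and -1 if no temporary file
// could be set up.
int
eof_demo (void)
{
  FILE *f = tmpfile ();
  int ch;
  int ok;

  if (f == NULL)
    return -1;
  if (fputc ('a', f) == EOF)
    {
      fclose (f);
      return -1;
    }
  rewind (f);

  ok = read_one (f, &ch) == READ_CHAR && ch == 'a';
  ok = ok && read_one (f, &ch) == READ_EOF;   // sets the indicator
  ok = ok && feof (f);                        // ... which stays set
  clearerr (f);                               // compare mn_port_clear_eof
  ok = ok && !feof (f);

  fclose (f);
  return ok;
}
```

With Minor ports, the role of feof is played by mn_port_at_eof, and
the error branch would consult the pending exception. */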
/* Return true if REF refers to a port, input port, or output port;
otherwise, return false. */
_Bool mn_is_port (mn_call_t *, mn_ref_t *ref);
_Bool mn_is_input_port (mn_call_t *, mn_ref_t *ref);
_Bool mn_is_output_port (mn_call_t *, mn_ref_t *ref);
/* Return true if REF refers to an open port; otherwise, return
false. */
_Bool mn_is_open_port (mn_call_t *, mn_ref_t *ref);
/* Create a new input or output port based on the Standard I/O file
object FILE. The returned port's end-of-file indicator is always
equal to that of FILE.
Each port returned by these functions may be used either with the
byte input/output routines (mn_putc; mn_getc; etc.) or the wide
character input/output routines (mn_putwc; mn_getwc; etc.), but not
both: you may not mix byte and wide character operations on a given
port. The first operation performed determines the port's
orientation (byte or wide character), and from that point on only
operations of the same orientation may be performed.
There is currently no way to arrange for FILE to be freed if the
port object becomes garbage. We will eventually fix this, once we
implement guardians. */
mn_ref_t *mn_make_stdio_input_port (mn_call_t *, FILE *file);
mn_ref_t *mn_make_stdio_output_port (mn_call_t *, FILE *file);
/* If PORT was created using mn_make_stdio_input_port or
mn_make_stdio_output_port, return the underlying FILE object.
If PORT is not a port at all, abort. Otherwise, return NULL. */
FILE *mn_stdio_port_file (mn_call_t *, mn_ref_t *port);
/* Close PORT. If PORT is an output port with buffered unwritten
data, write it out. If all goes well, return true; if an error
occurs while doing the output, set the pending exception and return
false.
If PORT is already closed, this function has no effect.
If PORT is not a port, abort. */
_Bool mn_close_port (mn_call_t *, mn_ref_t *port);
/* Print the byte or wide character CH on PORT, and return CH. (For
mn_putc, CH is first converted to an unsigned character.)
If an error occurs while doing the output, return EOF / WEOF and
set the pending exception.
If PORT is not an open output port, abort. */
int mn_putc (mn_call_t *, mn_ref_t *port, int ch);
wint_t mn_putwc (mn_call_t *, mn_ref_t *port, wchar_t ch);
/* Print OBJECT on PORT the way 'write' does, and return true.
This is a 'wide character' operation.
If an error occurs while doing the output, return false and set the
pending exception.
If PORT is not an open output port, abort. */
_Bool mn_write (mn_call_t *, mn_ref_t *object, mn_ref_t *port);
/* Print OBJECT on PORT the way 'display' does, and return true.
This is a 'wide character' operation.
If an error occurs while doing the output, return false and set the
pending exception.
If PORT is not an open output port, abort. */
_Bool mn_display (mn_call_t *, mn_ref_t *object, mn_ref_t *port);
/* Print the null-terminated string / wide string STRING on PORT, and
return a positive value.
If an error occurs while doing the output, return EOF and set the
pending exception.
If PORT is not an open output port, abort. */
int mn_puts (mn_call_t *, mn_ref_t *port, const char *string);
int mn_putws (mn_call_t *, mn_ref_t *port, const wchar_t *string);
/* Print STRING, containing LENGTH bytes / wchar_t's, on PORT, and
return true.
If an error occurs while doing the output, return false and set the
pending exception.
If PORT is not an open output port, abort. */
int mn_put_mem (mn_call_t *, mn_ref_t *port,
const char *string, size_t length);
int mn_put_wmem (mn_call_t *, mn_ref_t *port,
const wchar_t *string, size_t length);
/* If PORT's end-of-file indicator is set, return EOF / WEOF.
Otherwise, read a single byte / multibyte character from PORT, and
return it. If end-of-file is reached, set PORT's end-of-file
indicator and return EOF / WEOF.
If an error occurs while doing the input, return EOF / WEOF and set
the pending exception.
If PORT is not an open input port, abort. */
int mn_getc (mn_call_t *, mn_ref_t *port);
wint_t mn_getwc (mn_call_t *, mn_ref_t *port);
/* Push back the character / wide character CH onto PORT. If the call
is successful, clear PORT's end-of-file indicator and return CH.
(For mn_ungetc, CH is first converted to an unsigned character.)
Only one character of pushback is guaranteed. If too many calls to
mn_ungetc / mn_ungetwc are made without intervening calls to
consume the pushed-back characters, they return EOF / WEOF and set
the pending exception.
If PORT is not an open input port, abort. */
int mn_ungetc (mn_call_t *, mn_ref_t *port, int ch);
wint_t mn_ungetwc (mn_call_t *, mn_ref_t *port, wchar_t ch);
/* If PORT's end-of-file indicator is set, return the end-of-file
object. Otherwise, read a datum from PORT the way 'read' does, and
return it. If no datum, complete or partial, is found on PORT
before end-of-file is reached, set the end-of-file indicator, and
return the end-of-file object.
If there is an incomplete or ill-formed datum on PORT, or if an
error occurs while doing the input, return NULL and set the pending
exception.
This is a 'wide character' operation.
If PORT is not an open input port, abort. */
mn_ref_t *mn_read (mn_call_t *, mn_ref_t *port);
/* Return true if PORT's end-of-file indicator is set, or false
otherwise.
NOTE: this just tests the end-of-file indicator; it doesn't go and
check whether PORT is actually at end-of-file. That is, you must
get an EOF / WEOF from some input function before this will return
true.
If PORT is not an open input port, abort. */
_Bool mn_port_at_eof (mn_call_t *, mn_ref_t *port);
/* Clear PORT's end-of-file indicator.
If PORT is not an open input port, abort. */
void mn_port_clear_eof (mn_call_t *, mn_ref_t *port);
/* Return true if OBJ is the Scheme end-of-file object. */
_Bool mn_is_eof_object (mn_call_t *, mn_ref_t *obj);
/* Return a reference to the end-of-file object. */
mn_ref_t *mn_eof_object (mn_call_t *);
/* Eventually, I'd like to provide something that lets you do
arbitrary computation to provide the stream's contents. It
shouldn't require a function call for every character passed, and
it should support some sort of buffering. We'll use it for
character set conversion, compression, and so on.
The C++ arrangement is probably good.
MzScheme's custom ports meet all those criteria, but they're
awfully complicated; is all that really necessary? */
/* Top-Level Environments. */
/* A top-level environment provides bindings for identifiers, either
as variables with locations or syntactic keywords with expanders,
that you can examine and modify incrementally, and evaluate code
in. Identifiers are represented by symbols for now; maybe we'll
change that later, for hygiene.
Minor's rules for interpreting references to identifiers are as
follows:
- Code presented to 'eval' (or one of the related functions in this
interface) is expanded immediately to core Scheme forms before
any evaluation of the program those forms denote takes place.
- Whether a particular use of an identifier is interpreted as a
variable reference or a syntactic form depends on what sort of
binding is in scope for that identifier when the use is expanded.
If no binding is in scope, the use is assumed to refer to a
variable which has not yet been defined.
- When a variable reference is evaluated, whatever binding is
present in the environment at that point is the one used. If the
identifier is not currently bound to a variable, Minor throws an
exception.
- If you re-define an identifier, the new definition is visible to
all prior variable references enclosed in that environment.
For example, suppose we have an environment E which has no binding
for the identifier x, and that we evaluate the expression
(lambda () x) in E.
- Evaluating the lambda expression will not raise an exception,
even though x is unbound. Call the resulting procedure P.
- Applying P now will throw an exception, complaining that x is
unbound.
- If we define x in E as a variable with value V1 and apply P
again, P will return V1.
- If we re-define x in E as a variable with value V2 and apply P
again, P will return V2.
- If we re-define x in E as a syntactic keyword, then applying P
again will throw an exception, complaining that x is not a
variable.
- If we re-define x in E as a variable with value V3 and apply P
again, P will return V3.
The goal here is to allow you to think of all uses of a particular
identifier in a particular environment as references to the same
thing, even as identifiers are re-defined.
(We don't achieve that, however: we expand syntax once, completely,
before evaluation, and don't keep any record of the syntactic
keywords we used, so re-defining a syntactic keyword as a variable
doesn't cause code that referred to the syntactic keyword to break.
This means you can be running code that uses two conflicting
definitions of the same variable at once.)
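The variable-reference part of the example can be modeled in a few
lines of plain C. This is only a toy sketch, not Minor code: the
names kind, binding, apply_p, and rebinding_demo are hypothetical,
values are modeled as ints, and an exception is modeled as a -1
return. It covers only P's reference to x, not the syntax-expansion
caveat in the preceding paragraph.

```c
// Hypothetical model of the binding of one identifier, x, in E.
enum kind { UNBOUND, VARIABLE, SYNTAX };
struct binding { enum kind kind; int value; };

// Model of applying P = (lambda () x).  P consults x's binding each
// time it is applied.  Store the value in *RESULT and return 0, or
// return -1 where Minor would throw an exception.
static int
apply_p (const struct binding *x, int *result)
{
  if (x->kind != VARIABLE)
    return -1;
  *result = x->value;
  return 0;
}

// Walk through the steps of the example; return 1 if each
// application of P behaves as described, 0 otherwise.
int
rebinding_demo (void)
{
  struct binding x = { UNBOUND, 0 };
  int v = 0;
  int ok;

  ok = apply_p (&x, &v) == -1;                 // x unbound: exception
  x.kind = VARIABLE; x.value = 1;              // define x as V1
  ok = ok && apply_p (&x, &v) == 0 && v == 1;
  x.value = 2;                                 // re-define x as V2
  ok = ok && apply_p (&x, &v) == 0 && v == 2;
  x.kind = SYNTAX;                             // re-define x as syntax
  ok = ok && apply_p (&x, &v) == -1;
  x.kind = VARIABLE; x.value = 3;              // re-define x as V3
  ok = ok && apply_p (&x, &v) == 0 && v == 3;
  return ok;
}
```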
Apology:
I'm not at all sure how environments should behave.
One useful characteristic that some module systems have is the
ability to export variables in a way that does not allow importing
modules to assign to them. In such systems, the compiler can
generate code for the exporting module under the assumption that
the assignments it sees to those variables as it compiles the
module are all that there will ever be in the whole program. Which
is pretty nice. If the exporting module never mutates those
variables, the compiler can treat the variable as a constant in
importing modules. Which is also pretty nice.
You could accomplish this with declarations. You could have a
separate form of definition, 'define-constant', that tells the
compiler it can assume the variable's value won't be changed. But
it seems more graceful to have a way for the compiler to simply see
that it is so from the way it is used; the fact of a variable's
constantness should just be evident from the code, even in the
presence of separate compilation.
Clearly some kind of extra information is needed (because otherwise
new assignments to the variable could always show up), but it
should be something much more general than 'define-constant' ---
for example, if the variable's value is always a list, that would
be nice for the compiler to be able to figure out, too; but does
that mean we should add 'define-typed'? This can go on and on.
Another useful characteristic: the Flatt module system and the
proposed R6RS 'library' system impose a separation of phases that
seems really valuable. I think something like this is exactly the
right way to give some well-defined meaning to separate
compilation. But it seems like Scheme should provide primitives
that *allow* the construction of facilities like that. Building
them directly into the language is an admission, in my eyes, that
Scheme isn't really as good a language for playing with these kinds
of ideas as it aspires to be.
I came across some advice on solving hard problems from someone who
was successful at that:
Think of a simpler problem that seems similar. If you can't
solve that, think of a simpler one. If you can solve it, then
SOLVE IT. Don't just say you know how to solve it. After that,
you might think of a way to attack the harder problem. You
might realize something.
--- Richard Feynman
According to Don P. Mitchell
http://www.mentallandscape.com/Writings_Feynman.htm
I have no real clue how to address the sorts of issues I raised
above. But then, neither have I actually written a hygienic macro
expander for Scheme.
So I'm just going to knock out a straightforward macro expander and
interpreter based on a one-phase environment model, because I'm
sure I can do that. When that's done, maybe I'll have a better
perspective.
So go ahead and use these functions. They will change, but if I'm
doing it right, whatever they change to will probably permit these
interfaces to be implemented --- perhaps not optimally, but
adequately. */
/* Create a new, empty top-level environment. */
mn_ref_t *mn_make_environment (mn_call_t *);
/* Return true if OBJ is an environment, false otherwise. */
_Bool mn_is_environment (mn_call_t *, mn_ref_t *obj);
/* Return true if (the identifier denoted by) SYMBOL is bound at all
in ENV, either as a variable or as syntax. */
_Bool mn_is_bound (mn_call_t *, mn_ref_t *env, mn_ref_t *symbol);
/* Return true if (the identifier denoted by) SYMBOL is a variable in
ENV. Otherwise, return false. If ENV is not an environment, or if
SYMBOL is not a symbol, abort. */
_Bool mn_is_variable (mn_call_t *, mn_ref_t *env, mn_ref_t *symbol);
/* Bind (the identifier denoted by) SYMBOL to a location whose value
is VALUE in ENV. "Minor's rules for interpreting references to
identifiers", above, describe this function's behavior if SYMBOL is
already bound.
If ENV is not an environment, or SYMBOL is not a symbol, abort. */
void mn_define (mn_call_t *, mn_ref_t *env, mn_ref_t *symbol, mn_ref_t *value);
/* Return the value of the variable denoted by SYMBOL in ENV. If
SYMBOL does not denote a variable in ENV, return NULL and set the
pending exception. (This could occur if SYMBOL is unbound in ENV,
or if SYMBOL denotes a syntactic keyword.)
If ENV is not an environment or SYMBOL is not a symbol, abort. */
mn_ref_t *mn_variable_value (mn_call_t *, mn_ref_t *env, mn_ref_t *symbol);
/* Set the value of the variable denoted by SYMBOL in ENV to VALUE,
and return true. If SYMBOL does not denote a variable in ENV,
return false and set the pending exception. (This could occur if
SYMBOL is unbound in ENV, or if SYMBOL is a syntactic keyword.)
If ENV is not an environment or SYMBOL is not a symbol, abort. */
_Bool mn_set_variable (mn_call_t *,
mn_ref_t *env,
mn_ref_t *symbol,
mn_ref_t *value);
/* Return true if (the identifier denoted by) SYMBOL is a syntactic
keyword in ENV. Otherwise, return false.
If ENV is not an environment, or if SYMBOL is not a symbol, abort. */
_Bool mn_is_syntax (mn_call_t *, mn_ref_t *env, mn_ref_t *symbol);
/* Bind (the identifier denoted by) SYMBOL to the syntactic
transformer TRANSFORMER in ENV. TRANSFORMER must be a procedure of
one argument, just as in the 'define-syntax' form. "Minor's rules
for interpreting references to identifiers", above, describe this
function's behavior if SYMBOL is already bound.
If ENV is not an environment or SYMBOL is not a symbol, abort. */
void mn_define_syntax (mn_call_t *,
mn_ref_t *env,
mn_ref_t *symbol,
mn_ref_t *transformer);
/* Return the transformer for the syntactic keyword SYMBOL in ENV. If
SYMBOL is not bound to a syntactic keyword in ENV, return NULL and
set the pending exception. (This could occur if SYMBOL is a
variable in ENV, or if it is unbound altogether.)
If ENV is not an environment or SYMBOL is not a symbol, abort. */
mn_ref_t *mn_syntax_transformer (mn_call_t *, mn_ref_t *env, mn_ref_t *symbol);
/* Remove any binding for SYMBOL in ENV. If SYMBOL is not bound in
ENV, do nothing.
If ENV is not an environment or SYMBOL is not a symbol, abort. */
void mn_undefine (mn_call_t *, mn_ref_t *env, mn_ref_t *symbol);
/* Return the environment containing all the usual bindings for Minor
Scheme.
NOTE: Don't modify this environment, unless you really mean to
affect what other code sees as the default Minor environment. For
user interaction, running scripts, etc. you should make a fresh
environment, and copy this environment into it. */
mn_ref_t *mn_default_environment (mn_call_t *);
/* Copy all the bindings in SRC to DEST. If SRC and DEST bind the
same identifier, the binding from SRC replaces the binding from
DEST.
If SRC and DEST are not both environments, abort. */
void mn_environment_merge (mn_call_t *, mn_ref_t *dest, mn_ref_t *src);
/* For each bound symbol in ENV, call FUNC, passing it a fresh call
object, CLOSURE, and SYMBOL. FUNC should return true for the
iteration to proceed; if it wants to throw an exception, it should
call mn_set_exception and return false. If FUNC returns true for
every binding in ENV, return true. SYMBOL is owned by the new call.
If FREE_CLOSURE is non-zero, and a continuation is applied that
causes this iteration to be abandoned, apply FREE_CLOSURE to a
fresh call and CLOSURE. If the iteration returns normally, it will
not call FREE_CLOSURE; Minor will not refer to CLOSURE again.
If ENV is not an environment, abort. */
_Bool mn_environment_for_each (mn_call_t *,
mn_ref_t *env,
void *closure,
_Bool (*func) (mn_call_t *,
void *closure,
mn_ref_t *symbol),
void (*free_closure) (mn_call_t *, void *));
/* Eval. */
/* This may need to be re-thought in light of Matthew Flatt's phase
distinction; compile-time and run-time environments are separate,
no? */
/* Expand and then evaluate the expression EXPR in the environment
ENV, and return its value. (EXPR must return exactly one value.)
EXPR must be a valid Minor Scheme expression represented as data.
ENV must be an environment. If the evaluation raises an exception,
save that as the pending exception, and return zero. If ENV is not
an environment, or EXPR is not a valid Minor Scheme expression,
return zero and set the pending exception.
Expansion is done entirely before any evaluation takes place, when
mn_eval is called. Evaluation interprets only fully-expanded
expressions. */
mn_ref_t *mn_eval (mn_call_t *, mn_ref_t *expr, mn_ref_t *env);
/* Expand and evaluate the expression EXPR in the environment ENV, and
return its values as a Scheme list. The arguments, return value,
and exception handling are as above. */
mn_ref_t *mn_eval_for_value_list (mn_call_t *, mn_ref_t *expr, mn_ref_t *env);
/* Appendix: Async-safety and fork safety. */
/* A function is "async-safe" if it can be used reliably in a signal
handler. POSIX specifies a limited set of system functions that
are async-safe.
None of the Minor C API functions are async-safe --- not even the
trivial ones like mn_eq. (Doing any sort of operation on Minor
heap objects involves coordinating with the garbage collector, and
the primitives POSIX offers to do that --- mutexes, semaphores, and
so on --- are not async-safe.)
Similarly, the functions of this interface can't be used in a child
process created by 'fork', when the parent is multi-threaded. (See
"WHY NON-ASYNC-SAFE FUNCTIONS CAN'T BE USED IN FORKED CHILDREN",
below.)
However, when the parent process is single-threaded, and the POSIX
system interface functions --- both async-safe and non-async-safe
--- are safe to use in forked children, then the Minor C API
functions are safe to use, too. On most systems that implement
POSIX, that precondition does hold: while multi-threaded programs
and 'fork' are just inherently a bad mix, there are too many
existing single-threaded programs that assume they have free rein
in forked children for it to be acceptable for system libraries to
choke on them. The Minor C library takes the appropriate steps to
ensure that it, at least, won't be the source of the problem.
WHY NON-ASYNC-SAFE FUNCTIONS CAN'T BE USED IN FORKED CHILDREN
In general, it's not safe to use non-async-safe functions in a
child process created by 'fork'. POSIX specifies that, when a
thread calls 'fork', the child process inherits a copy of that
thread, and a copy of all the parent's memory, including mutexes
--- but none of the parent's other threads. This means that, if
one of those other threads was holding a mutex when the fork took
place, the child process will inherit locked mutexes, with no
threads running to unlock them. If the child thread tries to acquire
any of those mutexes, it will block forever.
But mutexes are just one instance of the problem. In general, if a
multi-threaded program calls 'fork', the child process will inherit
whatever temporary inconsistent state other threads may have
created, and any sort of cleanup those threads would normally be
counted on to do isn't going to happen in the child.
So, according to POSIX, only async-safe functions may be used in a
child created by 'fork'. Async-safe functions are defined to be
safe to use in signal handlers, and signal handlers can run at any
time, in the midst of whatever inconsistent state and locked
mutexes happen to be present. So they will be undisturbed by
similar disorder in a child process.
To be clear: none of these restrictions apply to a child process
created by 'fork' *after it has called exec*. The exec wipes out
the process's memory, replacing it with the contents of the
executable file, and starts it running with a single thread. So
any inconsistent state there might have been is gone after the
exec.
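As an illustration of the fork-then-exec pattern just described, here
is a minimal sketch that uses only POSIX calls and nothing from the
Minor C API. The name 'run_program' is hypothetical and is not part of
this interface. Between 'fork' and 'execv' the child calls only
functions that POSIX lists as async-safe (async-signal-safe), so it
never touches state that other threads of the parent might have left
locked or inconsistent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork, exec PATH in the child at once, and wait.
// Returns the child's exit status, or -1 if fork or waitpid fails or
// the child did not exit normally.
static int
run_program (const char *path, char *const argv[])
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      // Child: call only async-safe functions until the exec.
      execv (path, argv);
      // Reached only if the exec failed. Use _exit, not exit: exit
      // runs atexit handlers and flushes stdio, neither of which is
      // async-safe.
      _exit (127);
    }

  // Parent: collect the child's status.
  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A caller might pass "/bin/sh" with the argument vector
{ "sh", "-c", "exit 0", 0 } and expect a return value of 0. Once the
exec has succeeded, the new program starts with fresh memory and a
single thread, so it may use any function it likes.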
For details, see the rationale text in the POSIX spec for 'fork'
and 'pthread_atfork'. It's the latter that acknowledges that,
really, everything ought to continue to work after a fork if the
parent is single-threaded. */
#endif /* MN_MINOR_H */