On Wednesday 12 October 2005 15:58, Tim Daly Jr. wrote:
> I guess I don't quite understand the difference here. Currently,
> each thread has its own binding for every special variable, right?
If the parent thread has a non-global binding then the child also gets a
binding that's a copy of the parent's (it's a different binding with
the same value).
(defparameter *x* 0)
(let ((*x* 1))
  (sb-thread:make-thread (lambda () (assert (= *x* 1)) (setq *x* 2)))
  (assert (= *x* 1)))
> And you are proposing that they share all bindings except for the
> ones listed in INITIAL-BINDINGS?
I realized after sending it that 'a more conventional design' is not the
most precise description.
I mean that they share no binding except the global binding:
(defparameter *x* 0)
(let ((*x* 1))
  (sb-thread:make-thread (lambda () (assert (= *x* 0)) (setq *x* 2)))
  (assert (= *x* 1)))
Not even the initial bindings are shared; the initial-bindings argument
makes a thread perform bindings on startup, as documented in:
http://www.franz.com/support/documentation/7.0/doc/multiprocessing.htm#dynamic-environments-1
A way to establish a binding that's not global but shared between a
number of threads (a session?) might be useful, too.
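A minimal sketch of the startup behaviour described above, assuming a
wrapper around the existing sb-thread:make-thread. The wrapper name
MAKE-THREAD-WITH-BINDINGS, the variable *DEMO-X* and the plain
(symbol . value) alist format are all hypothetical; the linked
documentation may use a richer convention, and this is not the proposed
implementation itself.

```lisp
;; Hypothetical special variable, used only for the demonstration.
(defvar *demo-x* 0)

(defun make-thread-with-bindings (function &key name initial-bindings)
  "Start a thread that binds each (SYMBOL . VALUE) pair of INITIAL-BINDINGS
(a hypothetical alist format) before calling FUNCTION."
  (sb-thread:make-thread
   (lambda ()
     ;; PROGV establishes fresh dynamic bindings inside the new thread,
     ;; so assignments made there are not visible to any other thread.
     (progv (mapcar #'car initial-bindings)
            (mapcar #'cdr initial-bindings)
       (funcall function)))
   :name name))

;; Usage: the child starts with its own binding of *DEMO-X* and may
;; assign to it without touching the parent's binding.
(defun demo-initial-bindings ()
  (let ((*demo-x* 1))
    (let ((child-value
            (sb-thread:join-thread
             (make-thread-with-bindings
              (lambda () (incf *demo-x*) *demo-x*)
              :initial-bindings '((*demo-x* . 5))))))
      ;; Return what the child saw and what the parent still sees.
      (list child-value *demo-x*))))
```

The child works on the binding it made at startup, so the parent's
binding of *DEMO-X* is left alone.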
>
> -Tim
Hopefully, I didn't butcher the terminology too badly.
Gábor

On Wed, 2005-10-12 at 15:07 +0200, Gábor Melis wrote:
> Currently threads act like dynamic closures that capture _all_ dynamic
> variables. This has a number of issues:
...
> AFAIK SBCL's approach is unique. In my current state of mind I can only
> see the drawbacks and moving to a more conventional design is alluring:
>
> make-thread (function &key name initial-bindings)
>
> or maybe even
>
> make-thread (function &key name (initial-bindings *default-bindings*))
>
> What do you think?
Do you have a minute to explain it to a lowly user? :)
I guess I don't quite understand the difference here. Currently, each
thread has its own binding for every special variable, right? And you
are proposing that they share all bindings except for the ones listed in
INITIAL-BINDINGS?
-Tim

On Wed, Oct 12, 2005 at 08:22:29AM +0300, Nikodemus Siivola wrote:
> >So, do people feel that this is a reasonable thing to include in the
> >code base? This seems like a lot of code to add to each of the backends,
> >although the impact would be mitigated somewhat if improvement 3) above
> >was realized. Comments and questions welcome.
>
> OTOH, "anything for a faster AREF". ...but simultaneously it seems to me
> that the ADDI ADDI LWZ -> ADDI LWZ could be effected by a peephole
> optimizer as well -- if we had one.
This thought had occurred to me as well, and would arguably be the nicer
solution. But I don't have a spare summer for doing this SoC project. :)
(OTOH, I speculate that doing this sort of thing on IR2 *might* be
possible and be far easier to write than each of (symbol-value 'n)
machine-dependent peephole optimizers.)
--
Nathan | From Man's effeminate slackness it begins. --Paradise Lost
The last good thing written in C was Franz Schubert's Symphony Number 9.
--Erwin Dieterich

Currently threads act like dynamic closures that capture _all_ dynamic
variables. This has a number of issues:
1) dynamic-extent
Suppose your package has a non-exported special, binds it, promises it's
going to be dynamic extent and proceeds to call user code. The user
code spawns a thread and the promise is broken.
2) gc
It's hard to control giving out references to objects. Yeah, it's
similar to 1), but the colour of the smoke is different.
3) scaling
When starting up, a thread is given a snapshot of the parent thread's
current values for dynamic variables. This means that the minimum
memory required by a thread is proportional to the number of specials.
AFAIK SBCL's approach is unique. In my current state of mind I can only
see the drawbacks and moving to a more conventional design is alluring:
make-thread (function &key name initial-bindings)
or maybe even
make-thread (function &key name (initial-bindings *default-bindings*))
What do you think?
Gábor

On Tue, 11 Oct 2005, Nathan Froyd wrote:
> So, do people feel that this is a reasonable thing to include in the
> code base? This seems like a lot of code to add to each of the backends,
> although the impact would be mitigated somewhat if improvement 3) above
> was realized. Comments and questions welcome.
OTOH, "anything for a faster AREF". ...but simultaneously it seems to me
that the ADDI ADDI LWZ -> ADDI LWZ could be effected by a peephole
optimizer as well -- if we had one.
Cheers,
-- Nikodemus Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs."

While writing and optimizing a program a year ago, I noticed a
suboptimality in the code SBCL generates. Consider the following
function:
(defun gene-w (gene age)
  (declare (type (simple-array fixnum (50)) gene))
  (declare (type (integer 0 3) age))
  (aref gene (+ age 20)))
This function compiles to essentially the following PPC assembly:
; 40A9AD8C:  39590050  ADDI $FDEFN,$A1,80
;       90:  386A0001  ADDI $NL0,$FDEFN,1
;       94:  7F18182E  LWZX $A0,$A0,$NL0
Which is all well and good. Except that the two ADDIs could really be
folded into a single ADDI, thus eliminating an instruction. This
requires recognizing the pattern (AREF <thing> (+ <index> <constant>))
and ensuring that <constant> will fit into the immediate field of the
ADDI after the adjustment:
(- (* VECTOR-DATA-OFFSET N-WORD-BYTES) OTHER-POINTER-LOWTAG)
is made.
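As a rough illustration of that check (not the routine from the patch),
the sketch below computes the folded displacement for a word-sized
element vector and tests it against the signed 16-bit immediate field
of ADDI. The constant values 4, 2 and 7 are assumptions inferred from
the listing above, where the second ADDI adds 1 = 2*4 - 7; the function
and variable names are hypothetical.

```lisp
;; Assumed values for a 32-bit PPC build, inferred from the listing
;; rather than read out of SB-VM.
(defparameter *n-word-bytes* 4)
(defparameter *vector-data-offset* 2)
(defparameter *other-pointer-lowtag* 7)

(defun folded-displacement (constant-index)
  "Byte displacement obtained by folding CONSTANT-INDEX (an element count
for word-sized elements) together with the vector header adjustment."
  (+ (* constant-index *n-word-bytes*)
     (- (* *vector-data-offset* *n-word-bytes*)
        *other-pointer-lowtag*)))

(defun fits-in-addi-immediate-p (constant-index)
  "True if the folded displacement fits the signed 16-bit ADDI immediate."
  (typep (folded-displacement constant-index) '(signed-byte 16)))
```

For the GENE-W example, (folded-displacement 20) is 80 + 1 = 81, which
is the sum of the two immediates in the listing above.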
The attached patch attempts to implement just such an optimization for
the PPC; doing the same for other architectures should be relatively
straightforward. At a high level, the patch adds a new support routine
to SB-VM that implements the check for fitting into the immediate field
of an ADDI. (For the x86, this would be the displacement field of an
effective address.) A transform is added for DATA-VECTOR-REF which
checks to see if the index is of the form (+ <foo> <constant>) and
changes the call into (DATA-VECTOR-REF-WITH-OFFSET ...). New VOPs are
then introduced to implement this new function.
The attached patch is incomplete: VOPs for DATA-VECTOR-REF-WITH-OFFSET
have not been implemented for float or complex arrays, nor for arrays with
signed elements smaller than a word. The corresponding transformation
for DATA-VECTOR-SET has not yet been implemented, although the low-level
VOP support is present. But the basic idea works well enough; the above
example compiles to:
; 100131AC:  38790051  ADDI $NL0,$A1,81
;       B0:  7F18182E  LWZX $A0,$A0,$NL0
Which is what we really wanted from our compiler anyway.
The patch could probably be enhanced in several ways:
1) Complete support for *all* specialized arrays, natch;
2) Proper use of :INFO in the D-V-R-W-O VOPs to avoid lying to the VOP
machinery about how the EXTRA-OFFSET argument is passed in. This
change would also enable the VOPs to more precisely declare the
types of EXTRA-OFFSET, which is a win;
3) (ambitious) Totally replace DATA-VECTOR-REF with D-V-R-W-O, since the
former is just the latter with an offset of 0. I thought about doing
this before I started, but then balked because of possible
difficulties with constant folding (thinking that the real function
D-V-R-W-O would have to be defined somewhere, which would mean that
the 'offset' parameter would not be a constant...). But after
implementing D-V-R-W-O, I see that the "real" function for D-V-R does
not exist. So maybe this would be feasible after all;
4) Repeated application of the transformation: e.g. recognizing that
(AREF A (+ (+ X 5) 6)) should not be (D-V-R-W-O A (+ X 5) 6) but
instead (D-V-R-W-O A X 11). And so on in other ways, too.
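Improvement 4) amounts to accumulating constants through nested
additions. The sketch below does this on plain source forms, purely as
an illustration; a real transform would operate on the compiler's
internal representation, and the function name SPLIT-INDEX is
hypothetical.

```lisp
(defun split-index (form)
  "Return two values: the non-constant part of FORM and the sum of the
integer constants peeled off nested two-argument (+ ...) forms."
  (if (and (consp form)
           (eq (first form) '+)
           (= (length form) 3))
      (let ((a (second form))
            (b (third form)))
        (cond ((integerp b)
               ;; (+ <expr> <const>): keep peeling constants off <expr>.
               (multiple-value-bind (base offset) (split-index a)
                 (values base (+ offset b))))
              ((integerp a)
               ;; (+ <const> <expr>): same, with the arguments swapped.
               (multiple-value-bind (base offset) (split-index b)
                 (values base (+ offset a))))
              (t (values form 0))))
      ;; Anything else is left alone, with a zero offset.
      (values form 0)))
```

Applied to the index in the example, (split-index '(+ (+ x 5) 6))
returns X and 11, the two arguments wanted for D-V-R-W-O.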
So, do people feel that this is a reasonable thing to include in the
code base? This seems like a lot of code to add to each of the backends,
although the impact would be mitigated somewhat if improvement 3) above
was realized. Comments and questions welcome.
--
Nathan | From Man's effeminate slackness it begins. --Paradise Lost
The last good thing written in C was Franz Schubert's Symphony Number 9.
--Erwin Dieterich