Ever since, I've been doing parameter sweeps over these as we make other refinements to the system.

* nn - do not float a binding that applies one of its free variables.

* yn - do not float a binding that applies one of its free variables saturated or oversaturated.

* ny - do not float a binding that applies one of its free variables undersaturated.

* yy - do not restrict application of the binding's free variables
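
The four variants above can be read as two independent switches. Here is a minimal sketch of that reading, assuming the first letter governs undersaturated applications and the second governs saturated or oversaturated ones (which is what the list implies). All of the names (`Variant`, `classify`, `mayFloat`) are hypothetical and are not taken from GHC's implementation.

```haskell
-- Hypothetical model of the four variants; not GHC's actual code.

-- How a binding's right-hand side applies one of its free variables,
-- relative to that variable's arity.
data Saturation = Undersaturated | Saturated | Oversaturated
  deriving (Eq, Show)

-- One switch per letter of the variant name.
data Variant = Variant
  { allowUnder :: Bool  -- first letter: 'y' permits undersaturated applications
  , allowSat   :: Bool  -- second letter: 'y' permits saturated/oversaturated ones
  } deriving (Eq, Show)

nn, yn, ny, yy :: Variant
nn = Variant False False
yn = Variant True  False
ny = Variant False True
yy = Variant True  True

-- Classify an application of a function of the given arity to nArgs
-- arguments (nArgs >= 1; a bare occurrence is not an application).
classify :: Int -> Int -> Saturation
classify arity nArgs = case compare nArgs arity of
  LT -> Undersaturated
  EQ -> Saturated
  GT -> Oversaturated

-- May the binding float, given how its RHS applies its free variables?
-- An empty list means it applies none of them, so every variant floats it.
mayFloat :: Variant -> [Saturation] -> Bool
mayFloat v = all permitted
  where
    permitted Undersaturated = allowUnder v
    permitted _              = allowSat v
```

Under this reading, `mayFloat yn [classify 2 1]` is `True` (an undersaturated application is tolerated) while `mayFloat nn [classify 2 1]` is `False`.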

I have yet to see clear results; no variant has bested the others on most programs' runtime.

This is even after I developed a bash script (I'm so sorry) to transpose the NoFib loops: instead of running the entire NoFib suite for one set of switches, then running it again for the next set, and so on, I build all the variants and then run each variant's version of each program back to back. I intend for this to reduce noise by improving the temporal locality of the measurements of the same test. Even so, the noise was bad.
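
The loop transposition amounts to swapping which loop is outermost. The sketch below only models the order in which (variant, program) pairs get run; it is not the bash script itself, and the names `suiteMajor` and `programMajor` are hypothetical.

```haskell
-- Hypothetical sketch of the two run orders; names are illustrative only.

type VariantName = String
type Program     = String

-- Original order: the whole suite for one variant, then the next variant.
suiteMajor :: [VariantName] -> [Program] -> [(VariantName, Program)]
suiteMajor variants programs = [ (v, p) | v <- variants, p <- programs ]

-- Transposed order: every variant of one program back to back, so that
-- measurements of the same test are adjacent in time.
programMajor :: [VariantName] -> [Program] -> [(VariantName, Program)]
programMajor variants programs = [ (v, p) | p <- programs, v <- variants ]
```

For variants `nn` and `yn` and programs `fish` and `hpg`, `programMajor` runs nn/fish, yn/fish, nn/hpg, yn/hpg, whereas `suiteMajor` runs nn/fish, nn/hpg, yn/fish, yn/hpg.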

So I turned my attention to allocation instead, for now. Roughly, we expect that more floating means (barely) less allocation but worse runtime (by how much?) because some known calls become unknown calls. But, eg, going from nn -> yn --- ie floating functions that undersaturate free variables instead of not floating them --- caused worse allocation! This investigation led to [#MitigatingLNEAbstraction].

Based on that example, it occurred to me that we should only restrict the binding's saturation of its *known* free variables. For example, I was not floating a binding because its RHS applied a free variable, even though that free variable was lambda bound. That decision has no benefit, and indeed was causing knock-on effects that increase allocation (eg [#MitigatingLNEAbstraction]).
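
A rough sketch of that refinement, using the nn rule as the example: applications of lambda-bound free variables are ignored, presumably because a call through a lambda-bound variable is already an unknown call and so there is no fast entry to preserve. The types and function names here are hypothetical and are not GHC's.

```haskell
-- Hypothetical sketch; not GHC's actual code.

-- How a free variable is bound, from the floated binding's point of view.
data Binding
  = KnownFun Int   -- let-bound function whose arity is visible (a known call target)
  | LambdaBound    -- function argument: calls through it are already unknown calls
  deriving (Eq, Show)

-- One application in the RHS: which free variable, and how many arguments.
data App = App { target :: Binding, nArgs :: Int }
  deriving (Eq, Show)

-- Keep only the applications whose saturation is worth restricting.
relevantApps :: [App] -> [App]
relevantApps apps = [ a | a@(App (KnownFun _) _) <- apps ]

-- The nn rule after the refinement: float unless a *known* free variable
-- is applied.
mayFloatNN :: [App] -> Bool
mayFloatNN = null . relevantApps
```

With this filter, a binding whose RHS only applies a lambda-bound variable floats under every variant, which is the case described above.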

I have yet to determine that the preservation of fast entries is worth the trouble --- I certainly hope so... the parameter sweeps have taken a lot of time!

To enable further measurements, I have identified the semantics of some ticky counters, cf [#TickyCounters].

NB I think this will be mitigated "for free", since I'm predicting that we will never abstract variables that occur exactly saturated and an LNE binder can only be exactly saturated. If we do end up abstracting over saturated functions, we may want to consider mitigating this separately.

Using -flate-float-in-thunk-limit=10, -fprotect-last-arg, and -O1, I tested the libraries+NoFib for the four variants from [#PreservingFastEntries]. In fish (1.6%), hpg (~4.5%), and sphere (10.4%), allocation gets worse for ny and yy compared to nn and yn. The nn and ny do not change the allocation compared to the baseline library (ie no LLF).

We discovered that the worker-wrapper was removing the void argument from join points (eg knights and mandel2). This ultimately resulted in LLF *increasing* allocation. A thunk was let-no-escape before LLF but not after, since it occurred free in the right-hand side of a floated binding and hence now occurred (escapingly) as an argument.

SPJ was expecting no such non-lambda join points to exist. We identified where it was happening (`WwLib.mkWorkerArgs`) and switched it off. Here are the programs with affected allocation.