Change Details

This is a work in progress for #15999, a follow-up on #5793 and #15357.
As this patch will change some benchmarks (e.g. `wheel-sieve1`, `awards`)
rather drastically, I wanted to get some early feedback rather than
quietly investing hours of work into a patch that never had a chance
of being accepted to begin with.
The general plan is outlined in #15999: identify wibbly (GC-sensitive) benchmarks by looking at how productivity rates change over different nursery sizes, then iterate `main` of these benchmarks often enough (almost always 100 times) for the wibbles to go away.
I made sure that the benchmarked logic is actually run $n times.
When I found benchmarks with insignificant runtime (#15357), I adjusted parameters/input files so that the runtime of each mode falls within the ranges described in https://ghc.haskell.org/trac/ghc/ticket/15357#comment:4.
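To illustrate the iteration idea, here is a minimal sketch rather than the code from this patch. The names `workload` and `runIterations` are hypothetical, and the default of 100 iterations only mirrors the "almost always 100 times" mentioned above. The point it tries to show is that each iteration receives a different input and has its result forced, so the work is really performed $n times instead of being computed once and shared.

```haskell
import Control.Monad (forM_)
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.Environment (getArgs)

-- Hypothetical stand-in for the real benchmark logic.
-- It depends on the iteration index, so every iteration computes a
-- different value and the result cannot be shared between iterations.
workload :: Int -> Int
workload i = sum [ (x * x) `mod` 7 | x <- [i .. i + 10000] ]

-- Run the workload n times and accumulate the results.
-- modifyIORef' is strict, so each iteration's result is forced before
-- the next iteration starts.
runIterations :: Int -> IO Int
runIterations n = do
  acc <- newIORef 0
  forM_ [1 .. n] $ \i ->
    modifyIORef' acc (+ workload i)
  readIORef acc

main :: IO ()
main = do
  args <- getArgs
  -- The iteration count comes from the first command-line argument;
  -- 100 is an assumed default for this sketch.
  let n = case args of
        (a : _) -> read a
        []      -> 100
  total <- runIterations n
  print total
```

Printing the accumulated total at the end keeps the computation observable, so the work cannot be discarded as dead code.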
This is what I did so far:
- Stabilise gen_regexp
- Stabilise primes
- Stabilise wheel-sieve1
- Stabilise wheel-sieve2
- Adjust running time of x2n1
- Adjust running time of ansi
- Adjust running time of atom
- Make awards benchmark something other than IO
- Adjust running time of banner
- Stabilise boyer
- Adjust running time of boyer2
- Adjust running time of queens
- Adjust running time of calendar
- Adjust runtime of cichelli
- Stabilise circsim
- Stabilise clausify
- Stabilise constraints with moderate success
- Adjust running time of cryptarithm1
- Adjust running time of cryptarithm2
- Adjust running time of cse
- Adjust running time of eliza
- Adjust running time of exact-reals
- Adjust running time of expert
- Stabilise fft2
- Stabilise fibheaps
- Stabilise fish
- Adjust running time for gcd
- Stabilise comp_lab_zift
- Stabilise event
- Stabilise fft
- Stabilise genfft
- Stabilise ida
- Adjust running time for listcompr
- Adjust running time for listcopy
- Adjust running time of nucleic2
- Attempt to stabilise parstof
- Stabilise sched
- Stabilise solid
- Adjust running time of transform
- Adjust running time of typecheck
- Stabilise wang
- Stabilise wave4main
- Adjust running time of integer
- Adjust running time of knights
- Stabilise lambda
- Stabilise lcss
- Stabilise life
- Stabilise mandel
- Stabilise mandel2
- Adjust running time of mate
- Stabilise minimax
- Adjust running time of multiplier
- Adjust running time of para
- Stabilise power
- Adjust running time of primetest
- Stabilise puzzle with mild success
- Adjust running time for rewrite
- Stabilise simple with mild success
- Stabilise sorting
- Stabilise sphere
- Stabilise treejoin

Problematic benchmarks:
- `last-piece`: Unclear how to stabilise. Runs for 300ms and I can't make up smaller inputs because I don't understand what it does.
- `pretty`: It's just much too small to be relevant at all. Maybe we want to get rid of this one?
- `scc`: Same as `pretty`. The input graph for which SCC analysis is done is much too small and I can't find good directed example graphs on the internet.
- `secretary`: Apparently this needs `-package random` and consequently hasn't been run for a long time.
- `simple`: Same as `last-piece`. Decent runtime (70ms), but it's unstable and I see no way to iterate it ~100 times in fast mode.
