<div dir="ltr"><div><div><div><div>The SML semantics for when to raise exceptions is a little weird:<br><br>"If <var>di</var> < 0 or if |<var>dst</var>| < <var>di</var>+|<var>src</var>|, then the <code><a href="http://sml-family.org/Basis/general.html#SIG:GENERAL.Subscript:EXN:SPEC">Subscript</a></code> exception is raised."<br><br></div>This means that if I want to copy an empty array into (non-existent) index 3 of a length-3 array, there should be no exception.<br></div>This is hard to accomplish with our BoundsCheckByte semantics, which requires the index to be strictly less than the length.<br><br></div>How bad is it to use an explicit LessEq with LengthByte, rather than BoundsCheckByte?<br></div>Or alternatively, can we add a flag to BoundsCheckByte to change its comparison to less-or-equal?<br><div><div><div><div><br></div><div>Or do we want to not follow the SML semantics?<br></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 21 March 2017 at 11:01, Ramana Kumar <span dir="ltr"><<a href="mailto:Ramana.Kumar@cl.cam.ac.uk" target="_blank">Ramana.Kumar@cl.cam.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>We don't have any higher-order primitives currently... I imagine they'd be quite a bit harder to verify and implement efficiently... You would want clos_call/known type optimisations to specialise a primitive like unfold.<br><br></div>I think I'm currently leaning towards reinstating the concat primitives alongside the Copy{Str,Aw8}{Str,Aw8} primitives. 
The concat ones eventually get implemented in terms of the latter, but only once it's possible to do so without the extra copying.<br></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On 21 March 2017 at 09:53, <span dir="ltr"><<a href="mailto:Michael.Norrish@data61.csiro.au" target="_blank">Michael.Norrish@data61.csiro.<wbr>au</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">

<div bgcolor="white" link="blue" vlink="purple" lang="EN-GB">
<div class="m_-955857893339873405m_6199437871017552317WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Rather than having to initialise an empty string and then copy into it, why not provide something like a tabulate or unfold function that generates the string?
The function passed in would then be able to write directly into the string’s memory. I’m not sure how to set it up to write whole strings at a time (the helper function would have to return strings, which would almost certainly require a new allocation),
but if done a character at a time, the char values might get to stay in registers:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"> String.tabulate : int * (int -> char) -> string<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><u></u> <u></u></span></p>
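<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">To make the tabulate idea concrete, here is a minimal Python model (not CakeML, and not a claim about how the primitive would be implemented). The result buffer is allocated once and each character returned by the callback is written straight into it. The name <code>tabulate</code>, the use of <code>ValueError</code> for a negative length, and the assumption of 8-bit characters are illustrative choices only.</span></p>
<pre>
```python
from typing import Callable


def tabulate(n: int, f: Callable[[int], str]) -> str:
    """Model of a primitive with type int * (int -> char) -> string."""
    if n < 0:
        # Hypothetical error choice; the real primitive would pick its own.
        raise ValueError("negative length")
    buf = bytearray(n)  # the only buffer the model fills in
    for i in range(n):
        # f(i) must be a single 8-bit character; it is stored in place.
        buf[i] = ord(f(i))
    # Python needs a final conversion to str (which copies); a real
    # primitive would simply hand back the filled buffer as the string.
    return buf.decode("latin-1")
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">For example, <code>tabulate(3, lambda i: chr(ord("a") + i))</code> builds the string one character at a time without any intermediate strings.</span></p>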
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">If you had substrings (implemented as string * offset * length triples and sharing the underlying string), then something like<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"> String.unfold : 'a * ('a -> (substring * 'a) option) * int -> string<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">could be done, where the 'a is a generic “state” parameter (the list of strings to concat), and where the int is the maximum length of the result.
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">I think String.concat could be implemented with the latter without any redundant copying.<u></u><u></u></span></p>
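<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">A rough Python model of the unfold idea, and of concat written on top of it, might look like the sketch below. It is only a model under stated assumptions: substrings are (base, offset, length) triples, characters are 8-bit, <code>IndexError</code> stands in for Subscript, and the state used by <code>concat</code> is an index into the list rather than the remaining list itself. The names <code>unfold</code> and <code>concat</code> are hypothetical here.</span></p>
<pre>
```python
from typing import Callable, List, Optional, Tuple, TypeVar

S = TypeVar("S")
Substring = Tuple[str, int, int]  # (base string, offset, length)


def unfold(state: S,
           step: Callable[[S], Optional[Tuple[Substring, S]]],
           max_len: int) -> str:
    """Fill one buffer of max_len bytes from the substrings step produces."""
    buf = bytearray(max_len)  # single allocation, sized by the caller's bound
    pos = 0
    while True:
        result = step(state)
        if result is None:  # the generator is exhausted
            break
        (base, off, n), state = result
        # Reject substrings that fall outside their base or overflow the buffer.
        if off < 0 or n < 0 or off + n > len(base) or pos + n > max_len:
            raise IndexError("Subscript")
        # Each piece is copied exactly once, directly into the result buffer.
        buf[pos:pos + n] = base[off:off + n].encode("latin-1")
        pos += n
    return buf[:pos].decode("latin-1")


def concat(strings: List[str]) -> str:
    """Concatenate by unfolding over the list; the state is the next index."""
    def step(i: int) -> Optional[Tuple[Substring, int]]:
        if i == len(strings):
            return None
        s = strings[i]
        return (s, 0, len(s)), i + 1

    return unfold(0, step, sum(len(s) for s in strings))
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">In this model the only copying of character data is from each input string into the result buffer, which is the property wanted for concat; the conversions between <code>str</code> and bytes are an artefact of using Python.</span></p>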
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Michael<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><u></u> <u></u></span></p>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-family:Calibri;color:black">From: </span>
</b><span style="font-family:Calibri;color:black">Ramana Kumar <<a href="mailto:Ramana.Kumar@cl.cam.ac.uk" target="_blank">Ramana.Kumar@cl.cam.ac.uk</a>><br>
<b>Date: </b>Monday, 20 March 2017 at 16:29<br>
<b>To: </b>"<a href="mailto:developers@cakeml.org" target="_blank">developers@cakeml.org</a>" <<a href="mailto:developers@cakeml.org" target="_blank">developers@cakeml.org</a>><br>
<b>Subject: </b>Re: [CakeML-dev] New string/bytearray operations<u></u><u></u></span></p>
</div><div><div class="m_-955857893339873405h5">
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">I've been rethinking these primitives after the discussion at the last hangout, and have come up with a different set altogether. Can you see a simpler or more elegant approach than the one described below?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Here is the new approach I am considering:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">4 copying primitives in the source language going from string/bytearray to string/bytearray.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">The source comes with an offset and a length to copy.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">If the destination is a string, a new string is created. If the destination is a bytearray, it must be provided.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Concatenation (of lists of strings/arrays) and conversions between (whole) strings and (whole) bytearrays can be implemented in the basis library in terms of these primitives. And the primitives should be efficiently implementable in terms of a byte-based memcpy primitive further down. (There will need to be bounds checking in the source-level semantics (i.e., a Subscript exception can be raised), and this will sometimes be unfortunate (e.g., when the copy is obviously in bounds), but I don't think this is too costly.)<u></u><u></u></p>
</div>
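<div>
<p class="MsoNormal">As an illustration of the copying primitives just described, here is a small Python model (not the CakeML semantics). It assumes the bounds rule quoted from the SML Basis at the top of this thread, uses <code>IndexError</code> as a stand-in for Subscript, and the names <code>check_range</code>, <code>copy_to_bytearray</code> and <code>copy_to_new</code> are hypothetical.</p>
<pre>
```python
def check_range(length: int, start: int, n: int) -> None:
    """Raise unless n elements starting at start fit in the given length.

    Follows the quoted rule: fail if start < 0 or length < start + n.
    Note that start == length is accepted when n == 0.
    """
    if start < 0 or n < 0 or length < start + n:
        raise IndexError("Subscript")


def copy_to_bytearray(src: bytes, src_off: int, n: int,
                      dst: bytearray, di: int) -> None:
    """Bytearray destination: it must be provided and is updated in place."""
    check_range(len(src), src_off, n)
    check_range(len(dst), di, n)
    dst[di:di + n] = src[src_off:src_off + n]


def copy_to_new(src: bytes, src_off: int, n: int) -> bytes:
    """String-like destination: a fresh immutable value is created."""
    check_range(len(src), src_off, n)
    return bytes(src[src_off:src_off + n])
```
</pre>
<p class="MsoNormal">With this rule, copying an empty source to index 3 of a length-3 destination (<code>copy_to_bytearray(b"", 0, 0, bytearray(3), 3)</code>) succeeds, whereas a check that requires the index to be strictly less than the length would reject it.</p>
</div>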
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">On 14 March 2017 at 16:52, Ramana Kumar <<a href="mailto:Ramana.Kumar@cl.cam.ac.uk" target="_blank">Ramana.Kumar@cl.cam.ac.uk</a>> wrote:<u></u><u></u></p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<p class="MsoNormal">Hi all,<br>
<br>
I've started adding string/bytearray conversion and concatenation<br>
primitives (issues 244 and 245). Before getting too deep into updating<br>
the compiler etc., may I request a review of the semantics? Here they<br>
are:<br>
<br>
<a href="https://github.com/CakeML/cakeml/commit/67dd15bbd03f516be618ba72f1d56a2764209263" target="_blank">https://github.com/CakeML/cake<wbr>ml/commit/67dd15bbd03f516be618<wbr>ba72f1d56a2764209263</a><br>
<br>
I noticed that v_to_char_list might be better as vs_to_char_list, to<br>
be run after v_to_list (rather than duplicating its<br>
list-deconstruction functionality). But I leave such refactoring for<br>
another time.<br>
<br>
Cheers,<br>
Ramana<u></u><u></u></p>
</blockquote>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div></div></div>
</div>