On 08/06/2010 08:34 PM, Walter Bright wrote:
> I wrote these two trivial utilities for the purpose of canonicalizing
> source code before checkins and to deal with FreeBSD's inability to deal
> with CRLF line endings, and because I can never figure out the right
> settings for git to make it do the canonicalization.
>
> tolf - converts LF, CR, and CRLF line endings to LF.
>
> detab - converts all tabs to the correct number of spaces. Assumes tabs
> are 8 column tabs. Removes trailing whitespace from lines.
>
> Posted here just in case someone wonders what they are.
[snip]
Nice, though they don't account for multiline string literals.
A good exercise would be to rewrite these tools in idiomatic D2 and
assess the differences.
Andrei

Or improve your google-fu by finding some existing tools that do the job
right. :)
I'm pretty sure Uncrustify handles most of these issues, not to mention
it's a very nice source-code "prettifier/indenter". There's a front-end
called UniversalIndentGUI, which integrates about a dozen source-code
prettifiers (including Uncrustify) covering many languages. It
has various settings on the left and a toggleable *Live* preview mode that you
can view on the right.
I invite you guys to try it out sometime:
http://universalindent.sourceforge.net/
(+ you can save different settings which is neat when you're coding for
different projects that have different "code design & look" standards)

What does idiomatic D mean?

On Friday 06 August 2010 18:50:52 Andrei Alexandrescu wrote:
> On 08/06/2010 08:34 PM, Walter Bright wrote:
> > I wrote these two trivial utilities for the purpose of canonicalizing
> > source code before checkins and to deal with FreeBSD's inability to deal
> > with CRLF line endings, and because I can never figure out the right
> > settings for git to make it do the canonicalization.
> >
> > tolf - converts LF, CR, and CRLF line endings to LF.
> >
> > detab - converts all tabs to the correct number of spaces. Assumes tabs
> > are 8 column tabs. Removes trailing whitespace from lines.
> >
> > Posted here just in case someone wonders what they are.
>
> [snip]
>
> Nice, though they don't account for multiline string literals.
>
> A good exercise would be rewriting these tools in idiomatic D2 and
> assess the differences.
>
>
> Andrei
I didn't try to handle multiline string literals, but here are my more
idiomatic solutions:
detab:
/* Replace tabs with spaces, and remove trailing whitespace from lines.
 */
import std.array;
import std.conv;
import std.file;
import std.stdio;
import std.string;

void main(string[] args)
{
    // First argument is the tab size; the rest are the files to convert.
    const int tabSize = to!int(args[1]);
    foreach(f; args[2 .. $])
        removeTabs(tabSize, f);
}

void removeTabs(int tabSize, string fileName)
{
    auto file = File(fileName);
    string[] output;
    foreach(line; file.byLine())
    {
        // Expand tabs left to right; each one pads to the next tab stop.
        while(true)
        {
            const tab = line.indexOf('\t');
            if(tab == -1)
                break;
            const numSpaces = tabSize - tab % tabSize;
            // replicate (std.array) is the current name for the old std.string.repeat.
            line = line[0 .. tab] ~ replicate(" ", numSpaces) ~ line[tab + 1 .. $];
        }
        // byLine() reuses its buffer, so copy each finished line.
        output ~= line.stripRight().idup;
    }
    std.file.write(fileName, output.join("\n"));
}
-------------------------------------------
The three differences between mine and Walter's are that mine takes the tab size
as the first argument, it doesn't put a newline at the end of the file, and it
writes the file even if nothing changed (you could test for that, but when using
byLine(), it's a bit harder). Interestingly enough, from the few tests that I
ran, mine seems to be somewhat faster. I also happen to think that the code is
clearer (it's certainly shorter), though that might be up for debate.
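
To make the tab-stop arithmetic above concrete, here is a minimal sketch that isolates it in a pure function. The name expandTabs is hypothetical and not part of the posted program, and the sketch assumes one byte per column (plain ASCII text):

```d
import std.array : replicate;
import std.string : indexOf;

// Hypothetical helper, not part of the posted program: expands every tab in
// one line so that the text after it starts at the next multiple of tabSize.
// Assumes one byte per column (ASCII).
string expandTabs(string line, size_t tabSize)
{
    while (true)
    {
        const tab = line.indexOf('\t');
        if (tab == -1)
            break;

        // Columns left until the next tab stop: always between 1 and tabSize.
        const numSpaces = tabSize - tab % tabSize;
        line = line[0 .. tab] ~ replicate(" ", numSpaces) ~ line[tab + 1 .. $];
    }
    return line;
}

void main()
{
    // Tab at column 1 with 8-column stops: 7 spaces reach column 8.
    assert(expandTabs("a\tb", 8) == "a" ~ replicate(" ", 7) ~ "b");
    // Tab exactly on a tab stop still advances a full stop.
    assert(expandTabs("abcdefgh\tx", 8) == "abcdefgh" ~ replicate(" ", 8) ~ "x");
    // Later tabs are measured against the already expanded line.
    assert(expandTabs("ab\tcd\te", 4) == "ab  cd  e");
}
```

Because each tab is replaced before the next one is searched for, the index returned by indexOf is already the display column of that tab, which is why a single modulo is enough.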
-------------------------------------------
tolf:
/* Replace line endings with LF
 */
import std.file;
import std.string;

void main(string[] args)
{
    foreach(f; args[1 .. $])
        fixEndLines(f);
}

void fixEndLines(string fileName)
{
    auto fileStr = std.file.readText(fileName);
    auto result = fileStr.replace("\r\n", "\n").replace("\r", "\n");
    std.file.write(fileName, result);
}
-------------------------------------------
This version is ludicrously simple. And it was also faster than Walter's in the
few tests that I ran. In either case, I think that it is definitely clearer code.
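
Since this version holds the whole file in memory, skipping the write when nothing changed (the check mentioned for detab above) becomes a one-line comparison. The following is only a sketch of that idea; the helper name toLF is hypothetical and the change check is an addition, not something the posted program does:

```d
import std.array : replace;
import std.file : readText, write;

// Hypothetical helper: normalizes CRLF and lone CR line endings to LF.
// CRLF has to be replaced first; replacing "\r" first would turn every
// "\r\n" into "\n\n".
string toLF(string text)
{
    return text.replace("\r\n", "\n").replace("\r", "\n");
}

void main(string[] args)
{
    foreach (fileName; args[1 .. $])
    {
        auto original = readText(fileName);
        auto result = toLF(original);

        // Only touch the file when the contents actually differ, so that
        // already canonical files keep their modification time.
        if (result != original)
            write(fileName, result);
    }
}
```

The comparison costs one extra pass over the text, which is small next to the file I/O.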
I would have thought that being more idiomatic would have resulted in slower code
than what Walter did, but interestingly enough, both programs are faster with my
code. They might take more memory though. I'm not quite sure how to check that.
In any case, you wanted some idiomatic D2 solutions, so there you go.
- Jonathan M Davis

Jonathan M Davis:
> I would have thought that being more idiomatic would have resulted in slower code
> than what Walter did, but interestingly enough, both programs are faster with my
> code. They might take more memory though. I'm not quite sure how to check that.
> In any case, you wanted some idiomatic D2 solutions, so there you go.
Your code looks better.
My (probably controversial) opinion on this is that the idiomatic D solution for those text "scripts" is to use a scripting language, such as Python :-)
In this case a Python version is more readable, shorter, and probably faster too, because reading the lines of a _normal_ text file is faster in Python than in D (Python is more optimized for such purposes; I can show benchmarks on request).
On the other hand, D2 is in its debugging phase, so it's good to use it even for purposes it's not the best language for, to catch bugs and performance problems. So I think it's positive to write such scripts in D2, even if in a real-world setting I would want to write them in Python.
Bye,
bearophile