Re: [clisp-list] reading of CR/LF for charset:iso-8859-1

Matt Kaufmann <kaufmann@...> writes:
> Hi --
>
> I maintain an application that is built on top of Common Lisp, which
> expects iso-8859-1 for the character encoding. I'd like to set things
> up so that on a linux system, my application reads characters from a
> file exactly as they were written. But my attempt to do so failed,
> dropping a #\Return character, as illustrated by the log below. Is
> there something simple I can do to accomplish my goal, or else might
> that be the case in future CLISP releases? Note that I did see the
> following note at http://www.clisp.org/impnotes/clhs-newline.html:
>
> Justification. Unicode Newline Guidelines say: “Even if you know
> which characters represents NLF on your particular platform, on
> input and in interpretation, treat CR, LF, CRLF, and NEL the
> same. Only on output do you need to distinguish between them.”
>
> However, I'm hoping that since I'm using iso-8859-1 rather than a utf
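> encoding, maybe that justification doesn't need to apply.

No, it still applies: a character stream treats CR, LF, and CRLF alike on
input, whichever encoding you choose.
Since you want to read the codes themselves, such as 13 and 10, you should
open the file as a binary stream with an element type of (unsigned-byte 8):
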
[pjb@... :0.0 ~]$ clisp -ansi -norc -q
[1]> (deftype octet () '(unsigned-byte 8))
OCTET
[2]> (with-open-file (in #P"~/tmp/misc/wang.dos"
                         :element-type 'octet)
       (let ((buffer (make-array 256 :element-type 'octet)))
         (read-sequence buffer in)
         (search #(13 10) buffer)))
29
[3]> (quit)
[pjb@... :0.0 ~]$
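
If you need characters rather than octets afterwards, one possible approach is to decode the bytes yourself. The following is only a sketch: the function name read-file-latin-1 is made up for illustration, and it assumes that the implementation's character codes agree with ISO-8859-1 for codes 0 to 255 (true when CHAR-CODE follows Unicode, as in CLISP).

```lisp
;; Sketch, not a definitive implementation: read a file as raw octets and
;; decode it as ISO-8859-1 by hand, so that no newline conversion happens.
;; READ-FILE-LATIN-1 is a hypothetical helper name.
(defun read-file-latin-1 (pathname)
  "Return the contents of PATHNAME as a string, one character per byte."
  (with-open-file (in pathname :element-type '(unsigned-byte 8))
    (let* ((octets (make-array (file-length in)
                               :element-type '(unsigned-byte 8)))
           ;; READ-SEQUENCE returns the index of the first element not updated.
           (end (read-sequence octets in)))
      ;; ISO-8859-1 maps byte N to code point N, so CODE-CHAR is enough here,
      ;; assuming character codes 0-255 match Latin-1 in this implementation.
      (map 'string #'code-char (subseq octets 0 end)))))
```

With this, (position #\Return (read-file-latin-1 "some-file")) would report where the first CR sits, because the bytes never pass through a character stream.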
--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.
You can take the lisper out of the lisp job, but you can't take the lisp out
of the lisper (; -- antifuchs