This I/O system proposal is a fairly complete subset of Gambit's I/O
system. Some of the features are rather specialized
(e.g. object->string procedure and process-ports) and I'm willing to
drop them from the proposal if they are too controversial.
Marc
High-level I/O
==============
1. STREAM API
Here's how I define the stream API which is central to any I/O system:
1) The stream API is used for accessing sequential data. The stream is
first opened, then there is a sequence of read or write operations,
the end of the stream can be tested (e.g. with eof-object?), and
finally the stream is closed.
2) Read and write operations will block the process/thread until
the operation can complete.
3) The stream API can be extended with a "seek" operation to allow
random access to data. This only makes sense for some sources of
data, such as regular files.
4) Bidirectional streams are those that allow both read and write
operations. A bidirectional stream can be viewed as the fusion of
two independent streams, and it may make sense to close one
direction independently from the other.
5) Streams of one type of data can be used to represent streams of
another type of data by adopting encoding and decoding procedures.
For example, the Scheme procedures write and read implement the
encoding and decoding of a Scheme datum to/from a sequence of
characters. Similarly, a stream of characters can be represented
with a stream of octets by choosing a character encoding
(e.g. latin1, utf8, utf16le, etc).
Note that there are several different types of data that can be
accessed with the stream API. Some classical types are:
- regular files on the filesystem (as opened with stdio's "fopen")
- TCP sockets (as opened with Unix's "connect")
- subprocesses (as opened with Unix's "popen")
There are other types of data for which the stream API is appropriate:
- String ports. A read operation returns the next character. A
write operation adds a character to the string of characters
accumulated by the port. String ports can be generalized to other
types of streams: streams of octets ("u8vector" ports) and streams
of Scheme objects ("vector" ports). Streams of Scheme objects are
particularly useful as FIFOs (aka "pipes") to connect client
threads to a server thread (the elements of the FIFO are requests
for the server). A read operation on an empty FIFO will block
the thread until a client sends a request.
- Directories (as opened with Unix's "opendir"). A read operation
returns the next file name. Using the stream API avoids the
problem of a "directory->list" type API which would momentarily
consume a large amount of space for storing the list of file names
of a very big directory when you simply want to iterate over the
files in the directory.
- TCP ports for accepting connections (as with Unix's "accept").
A read operation accepts the next connection request and
returns a bidirectional port to interact with the client.
The read will block if no connection request is queued on
the IP port number associated with the TCP port.
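The directory case above can be sketched as follows. This is only an illustration: it assumes a procedure named open-directory (not defined in this part of the proposal) that takes a path and returns an object-port yielding one file name per read, and the helper name for-each-file-name is hypothetical.

```scheme
;; Sketch only: open-directory is assumed to return a port on which
;; each call to read yields the next file name of the directory.
(define (for-each-file-name proc dir)
  (let ((port (open-directory dir)))
    (let loop ()
      (let ((name (read port)))
        (if (eof-object? name)
            (close-port port)          ; end of stream: release the port
            (begin
              (proc name)              ; handle one name at a time
              (loop)))))))
```

A call such as (for-each-file-name (lambda (name) (write name) (newline)) ".") would then visit the names one by one without ever building the full list in memory.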
2. ENCODING
R5RS ports only support text. This allows streams of characters and
also streams of Scheme objects (by encoding the Scheme objects with
characters using their external representation). Unfortunately not
all Scheme objects have an external representation with write/read
invariance and the encoding of characters is not specified by R5RS.
R6RS should also support binary I/O.
Note that characters can be encoded with sequences of octets (using
any of the standard encodings: latin1, utf8, utf16le, ...), and that
objects can be encoded with sequences of characters (using the R5RS
write and read procedures). So this suggests that ports can be
organized in an inheritance hierarchy such that operations possible
for a certain class of port are also possible for the subclasses.
Here are the four abstract classes proposed:
1) An "object-port" (or simply a port) provides operations to read
and write Scheme data (i.e. any Scheme object) to/from the port.
It also provides operations to force output to occur, to change
the way threads block on the port, and to close the port. Note
that the class of objects for which write/read invariance is
guaranteed depends on the particular type of port.
2) A "character-port" provides all the operations of an object-port,
and also operations to read and write individual characters to/from
the port. When a Scheme object is written to a character-port, it
is converted into the sequence of characters that corresponds to
its external-representation. When reading a Scheme object, an
inverse conversion occurs. Note that some Scheme objects do not
have an external textual representation that can be read back.
3) An "octet-port" provides all the operations of a character-port,
and also operations to read and write individual octets to/from the
port. When a character is written to an octet-port, some encoding
of that character into a sequence of octets will occur (for
example, #\newline will be encoded as the 2 octets CR-LF when using
latin1 character encoding and cr-lf end-of-line encoding, and a
non-ASCII character will generate more than 1 octet when using utf8
character encoding). When reading a character, a similar decoding
occurs.
4) A "device-port" provides all the operations of an octet-port, and
also operations to control the operating system managed device
(file, network connection, terminal, etc) that is connected to the
port.
The inheritance hierarchy corresponds to this tree (the classes on
the spine on the left are all abstract port classes):
object-port____________________________________________
    |         \              \               \         \
    |     vector-port  directory-port  tcp-server-port  ...
    |
character-port_________
    |         \        \
    |     string-port  ...
    |
octet-port_____________
    |         \        \
    |    u8vector-port  ...
    |
device-port___________________________
              \              \         \
          file-port   tcp-client-port  ...
So the result of (open-input-file "foo.txt") would be a device-port
attached to the file "foo.txt". This port would allow file-specific
operations (such as "seek"), binary I/O (such as reading or writing an
octet or group of octets), character I/O (i.e. write-char and
read-char), and Scheme object I/O (i.e. write and read).
On the other hand, the result of (open-input-string "a 123") would be
a character-port allowing character and Scheme object I/O, but not
binary I/O. Analogously, the result of (open-input-vector '#(a 123))
would be an object-port allowing Scheme object I/O (with complete
write/read invariance), but not character or binary I/O.
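A small sketch of the difference between the two port classes just described. It uses only open-input-string and open-input-vector as named in the text; the variable names are illustrative.

```scheme
;; A string port is a character-port: both read-char and read apply.
(define sp (open-input-string "a 123"))
(define c (read-char sp))   ; the character #\a
(define n (read sp))        ; the number 123 (leading space is skipped)

;; A vector port is an object-port: only object I/O applies, and each
;; element of the vector is returned as-is by read.
(define vp (open-input-vector '#(a 123)))
(define x (read vp))        ; the symbol a
(define y (read vp))        ; the number 123
(define z (read vp))        ; the end-of-file object
```

Calling read-char on vp would not be meaningful, since a vector port carries Scheme objects rather than characters.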
3. PORT SETTINGS
Port settings are parameters specified when a port is created that
affect how I/O operations on that port behave (character encoding,
buffering, etc). Some port settings are only valid for specific port
classes whereas some others are valid for all ports. Port settings
that are not specified when a port is created will default to some
reasonable values. Keyword objects are used to name the settings to
be set. As a simple example, a device-port connected to the file
"foo" can be created using the call
(open-input-file "foo")
This will use default settings for the character encoding, buffering,
etc. If the utf8 character encoding is desired, then the port could be
opened using the call
(open-input-file (list path: "foo" char-encoding: 'utf8))
Here the argument of the procedure open-input-file has been replaced
by a "port settings list" which specifies the value of each port
setting that should not be set to the default value. Note that some
port settings have no useful default and it is therefore required to
specify a value for them, such as the "path:" in the case of the file
opening procedures. All port creation procedures (i.e. named
open-...) take a single argument that can either be a port settings
list or a value of a type that depends on the kind of port being
created (a path string for files, an IP port number for TCP servers,
etc).
4. OBJECT-PORTS
4.1 Object-port settings
The following is a list of port settings that are valid for all types
of ports.
* direction: ( input | output | input-output )
This setting controls the direction of the port. The symbol
input indicates a unidirectional input-port, the symbol output
indicates a unidirectional output-port, and the symbol
input-output indicates a bidirectional port. The default value
of this setting depends on the port creation procedure.
* buffering: ( #f | #t | line )
This setting controls the buffering of the port. To set each
direction separately the keywords input-buffering: and
output-buffering: must be used instead of buffering:. The
value #f selects unbuffered I/O, the value #t selects fully
buffered I/O, and the symbol line selects line buffered I/O (the
output buffer is drained when a #\newline character is written).
Line buffered I/O only applies to character-ports. The default
value of this setting depends on the port creation procedure.
4.2 Object-port operations
- [procedure] (input-port? OBJ)
- [procedure] (output-port? OBJ)
- [procedure] (port? OBJ)
The procedure input-port? returns #t when OBJ is a
unidirectional input-port or a bidirectional port and #f
otherwise.
The procedure output-port? returns #t when OBJ is a
unidirectional output-port or a bidirectional port and #f
otherwise.
The procedure port? returns #t when OBJ is a port (either
unidirectional or bidirectional) and #f otherwise.
- [procedure] (read [PORT])
This procedure reads and returns the next Scheme object from the
input-port PORT. The end-of-file object is returned when the end
of the stream is reached. If it is not specified, PORT defaults
to the current input-port.
- [procedure] (read-all [PORT [READER]])
This procedure repeatedly calls the procedure READER with PORT as
the sole argument and accumulates a list of each value returned up
to the end-of-file object. The procedure read-all returns the
accumulated list without the end-of-file object. If it is not
specified, PORT defaults to the current input-port. If it is not
specified, READER defaults to the procedure read.
For example:
> (call-with-input-string "3,2,1\ngo!" read-all)
(3 ,2 ,1 go!)
> (call-with-input-string "3,2,1\ngo!"
(lambda (p) (read-all p read-char)))
(#\3 #\, #\2 #\, #\1 #\newline #\g #\o #\!)
> (call-with-input-string "3,2,1\ngo!"
(lambda (p) (read-all p read-line)))
("3,2,1" "go!")
- [procedure] (write OBJ [PORT])
This procedure writes the Scheme object OBJ to the output-port PORT
and the value returned is unspecified. If it is not specified,
PORT defaults to the current output-port.
- [procedure] (newline [PORT])
This procedure writes an "object separator" to the output-port
PORT and the value returned is unspecified. The separator ensures
that the next Scheme object written with the write procedure will
not be confused with the latest object that was written. On
character-ports this is done by writing the character #\newline.
On ports where successive objects are implicitly distinct (such
as "vector ports") this procedure does nothing.
Regardless of the class of a port P and assuming that the external
textual representation of the object X is readable, the expression
(begin (write X P) (newline P)) will write to P a representation
of X that can be read back with the procedure read. If it is
not specified, PORT defaults to the current output-port.
- [procedure] (force-output [PORT])
The procedure force-output causes the output buffers of the
output-port PORT to be drained (i.e. the data is sent to its
destination). If PORT is not specified, the current output-port
is used.
For example:
> (define p (open-tcp-client
(list server-address: "www.iro.umontreal.ca"
port-number: 80)))
> (display "GET /\n" p)
> (force-output p)
> (read-line p)
"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\""
- [procedure] (close-input-port PORT)
- [procedure] (close-output-port PORT)
- [procedure] (close-port PORT)
The PORT argument of these procedures must be a unidirectional or
a bidirectional port. For all three procedures the value returned
is unspecified.
The procedure close-input-port closes the input-port side of
PORT, which must not be a unidirectional output-port.
The procedure close-output-port closes the output-port side of
PORT, which must not be a unidirectional input-port. The output
buffers are drained before PORT is closed.
The procedure close-port closes all sides of the PORT. Unless
PORT is a unidirectional input-port, the output buffers are
drained before PORT is closed.
For example:
> (define p (open-tcp-client
(list server-address: "www.iro.umontreal.ca"
port-number: 80)))
> (display "GET /\n" p)
> (close-output-port p)
> (read-line p)
"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\""
- [procedure] (input-port-timeout-set! PORT TIMEOUT [THUNK])
- [procedure] (output-port-timeout-set! PORT TIMEOUT [THUNK])
When a thread tries to perform an I/O operation on a port, the
requested operation may not be immediately possible and the thread
must wait. For example, the thread may be trying to read a line of
text from the console and the user has not typed anything yet, or
the thread may be trying to write to a network connection faster
than the network can handle. In such situations the thread
normally blocks until the operation becomes possible.
It is sometimes necessary to guarantee that the thread will not
block for too long, or will not block at all. For this purpose, a
"timeout" and a "timeout-thunk" are attached to each input-port and
each output-port. The timeout indicates the point in time beyond
which the thread should stop waiting on an input or output
operation, respectively. When the timeout is reached, the thread
calls the port's timeout-thunk. If the timeout-thunk returns #f
the thread abandons trying to perform the operation (in the case
of an input operation an end-of-file is read and in the case of
an output operation an exception is raised). Otherwise, the
thread will block again waiting for the operation to become
possible (note that if the port's timeout has not changed the
thread will immediately call the timeout-thunk again if the
operation is still not possible).
The procedure input-port-timeout-set! sets the timeout of the
input-port PORT to TIMEOUT and the timeout-thunk to THUNK. The
procedure output-port-timeout-set! sets the timeout of the
output-port PORT to TIMEOUT and the timeout-thunk to THUNK. If it
is not specified, the THUNK defaults to a thunk that returns #f.
The TIMEOUT is either a time object indicating an absolute point
in time (see SRFI 18), or it is a real number which indicates the
number of seconds relative to the moment the procedure is called.
For both procedures the value returned is unspecified.
When a port is created the timeout is set to infinity (+inf.).
This causes the thread to wait as long as needed for the operation
to become possible. Setting the timeout to a point in the past
(-inf.) will cause the thread to attempt the I/O operation and
never block (i.e. the timeout-thunk is called if the operation is
not immediately possible).
The following example shows how to cause the REPL to terminate
when the user does not enter an expression within the next 60
seconds.
> (input-port-timeout-set! (repl-input-port) 60)
>
*** EOF again to exit
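As a further illustration, here is a minimal sketch of a bounded-wait line read built on input-port-timeout-set!. The helper name try-read-line is hypothetical, and +inf.0 is used as the standard spelling of the infinite timeout written +inf. above. Note that with the default timeout-thunk a timeout is reported as end-of-file, so this helper cannot tell a timeout from a genuine end of stream.

```scheme
;; Sketch only: returns the next line of PORT, or #f if no line could
;; be read within SECONDS (or if the stream has really ended).
(define (try-read-line port seconds)
  (input-port-timeout-set! port seconds)     ; default thunk returns #f
  (let ((line (read-line port)))
    (input-port-timeout-set! port +inf.0)    ; wait indefinitely again
    (if (eof-object? line) #f line)))
```

A caller that must distinguish the two cases could instead pass its own THUNK that records the timeout in a variable before returning #f.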
5. CHARACTER-PORTS
5.1 Character-port settings
The following is a list of port settings that are valid for
character-ports.
* output-width: POSITIVE-INTEGER
This setting indicates the width of the character output-port in
number of characters. This information could be used by a
pretty-printing procedure. The default value of this setting is
80.
[[
If R6RS is extended in the future to support configuring the
reader and writer with "readtable" objects, a setting for readtable
could be added here:
* readtable: READTABLE
This setting determines the readtable attached to the
character-port. To set each direction separately the keywords
input-readtable: and output-readtable: must be used instead of
readtable:. Readtables control the external textual
representation of Scheme objects, that is the encoding of Scheme
objects using characters. The behavior of the read procedure
depends on the port's input-readtable and the behavior of the
procedures write, pretty-print, and related procedures is
affected by the port's output-readtable. The default value of this
setting is the value bound to the parameter object
current-readtable.
]]
5.2 Character-port operations
- [procedure] (input-port-line PORT)
- [procedure] (input-port-column PORT)
- [procedure] (output-port-line PORT)
- [procedure] (output-port-column PORT)
The current character location of a character input-port is the
location of the next character to read. The current character
location of a character output-port is the location of the next
character to write. Location is denoted by a line number (the
first line is line 1) and a column number, that is the location on
the current line (the first column is column 1). The procedures
input-port-line and input-port-column return the line location
and the column location respectively of the character input-port
PORT. The procedures output-port-line and output-port-column
return the line location and the column location respectively of
the character output-port PORT.
For example:
> (call-with-output-string
'()
(lambda (p)
(display "abc\n123def" p)
(write (list (output-port-line p) (output-port-column p))
p)))
"abc\n123def(2 7)"
- [procedure] (output-port-width PORT)
This procedure returns the width, in characters, of the character
output-port PORT. The value returned is the port's output-width
setting.
For example:
> (output-port-width (current-output-port))
80
- [procedure] (read-char [PORT])
This procedure reads the character input-port PORT and returns the
character at the current character location and advances the
current character location to the next character, unless the PORT
is already at end-of-file in which case read-char returns the
end-of-file object. If it is not specified, PORT defaults to the
current input-port.
- [procedure] (peek-char [PORT])
This procedure returns the same result as read-char but it does
not advance the current character location of the input-port PORT.
If it is not specified, PORT defaults to the current input-port.
- [procedure] (write-char CHAR [PORT])
This procedure writes the character CHAR to the character
output-port PORT and advances the current character location of
that output-port. The value returned is unspecified. If it is not
specified, PORT defaults to the current output-port.
- [procedure] (read-line [PORT [SEPARATOR [INCLUDE-SEPARATOR?]]])
This procedure reads characters from the character input-port PORT
until a specific SEPARATOR or the end-of-file is encountered and
returns a string containing the sequence of characters read. The
SEPARATOR is included at the end of the string only if it was the
last character read and INCLUDE-SEPARATOR? is not #f. The
SEPARATOR must be a character or #f (in which case all the
characters until the end-of-file are read). If it is not
specified, PORT defaults to the current input-port. If it is not
specified, SEPARATOR defaults to #\newline. If it is not
specified, INCLUDE-SEPARATOR? defaults to #f.
For example:
> (define (split sep)
(lambda (str)
(call-with-input-string
str
(lambda (p)
(read-all p (lambda (p) (read-line p sep)))))))
> ((split #\,) "a,b,c")
("a" "b" "c")
> (call-with-input-string "1,2,3\n4,5"
(lambda (p)
(map (split #\,)
(read-all p read-line))))
(("1" "2" "3") ("4" "5"))
- [procedure] (read-substring STRING START END [PORT])
- [procedure] (write-substring STRING START END [PORT])
These procedures support bulk character I/O. The part of the
string STRING starting at index START and ending just before
index END is used as a character buffer that will be the target
of read-substring or the source of write-substring. Up to
END-START characters will be transferred. The number of
characters transferred, possibly zero, is returned by these
procedures. Fewer characters will be read by read-substring if
an end-of-file is read, or a timeout occurs before all the
requested characters are transferred and the timeout thunk
returns #f (see the procedure input-port-timeout-set!). Fewer
characters will be written by write-substring if a timeout occurs
before all the requested characters are transferred and the
timeout thunk returns #f (see the procedure
output-port-timeout-set!). If it is not specified, PORT defaults
to the current input-port and current output-port respectively.
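The bulk procedures lend themselves to a chunked copy loop. The following is a sketch under the semantics described above (the count of characters transferred is returned, and zero is returned at end-of-file when no timeout is set); the name copy-chars and the buffer size are arbitrary choices.

```scheme
;; Sketch only: copy all characters from IN to OUT through a
;; fixed-size string buffer.
(define (copy-chars in out)
  (let* ((size 1024)
         (buf (make-string size)))
    (let loop ()
      (let ((n (read-substring buf 0 size in)))
        (if (> n 0)
            (begin
              (write-substring buf 0 n out)  ; write only what was read
              (loop)))))))
```

With a timeout set on either port, the loop would also stop early when a transfer returns fewer characters than requested, so a robust version would need to check for that.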
For example:
> (define s (make-string 10 #\x))
> (read-substring s 2 5)123456789
3
> 456789
> s
"xx123xxxxx"
6. OCTET-PORTS
6.1 Octet-port settings
The following is a list of port settings that are valid for
octet-ports.
* char-encoding: ENCODING
This setting controls the character encoding of the octet-port.
For bidirectional octet-ports, the character encoding for input
and output is set. To set each direction separately the keywords
input-char-encoding: and output-char-encoding: must be used
instead of char-encoding:. The default value of this setting is
operating system dependent. The following encodings are
supported:
latin1
LATIN1 character encoding. Each character is encoded by a
single octet. Only Unicode characters with a code in the
range 0 to 255 are allowed.
utf8
UTF8 character encoding. Each character is encoded by a
sequence of one to four octets.
utf16
Each character is encoded by 16 or 32 bits, i.e. two or four
octets. Each 16 bit chunk may be encoded using little-endian
encoding or big-endian encoding. If the port is an
input-port and the first two octets read are a BOM ("Byte
Order Mark" character with hexadecimal code FEFF) then the
BOM will be discarded and the endianness will be set
accordingly, otherwise the endianness depends on the
operating system. If the port is an output-port then a BOM
will be output at the beginning of the stream and the
endianness depends on the operating system.
utf16le
UTF16 character encoding with little-endian endianness.
Each character is encoded by 16 or 32 bits, i.e. two or four
octets. No BOM processing is done.
utf16be
UTF16 character encoding with big-endian endianness.
Each character is encoded by 16 or 32 bits, i.e. two or four
octets. No BOM processing is done.
Other encodings could be added, such as: ascii, ucs2, ucs2le,
ucs2be, ucs4, ucs4le, ucs4be, native, and ebcdic.
* eol-encoding: ENCODING
This setting controls the end-of-line encoding of the octet-port.
To set each direction separately the keywords input-eol-encoding:
and output-eol-encoding: must be used instead of eol-encoding:.
The default value of this setting is operating system dependent.
Note that for output-ports the end-of-line encoding is applied
before the character encoding, and for input-ports it is applied
after. The following encodings are supported:
lf For an output-port, writing a #\newline character outputs a
#\linefeed character to the stream (Unicode character code
10). For an input-port, a #\newline character is read when
a #\linefeed character is encountered on the stream. Note
that #\linefeed and #\newline are two names for the same
character, so this end-of-line encoding is actually the
identity function. Text files created by UNIX applications
typically use this end-of-line encoding.
cr For an output-port, writing a #\newline character outputs a
#\return character to the stream (Unicode character code
13). For an input-port, a #\newline character is read when
a #\linefeed character or a #\return character is
encountered on the stream. Text files created by Classic Mac
OS applications typically use this end-of-line encoding.
cr-lf For an output-port, writing a #\newline character outputs to
the stream a #\return character followed by a #\linefeed
character. For an input-port, a #\newline character is read
when a #\linefeed character or a #\return character is
encountered on the stream. Moreover, if this character is
immediately followed by the opposite character (#\linefeed
followed by #\return or #\return followed by #\linefeed)
then the second character is ignored. In other words, all
four possible end-of-line encodings are read as a single
#\newline character. Text files created by DOS and
Microsoft Windows applications typically use this
end-of-line encoding.
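The input side of the cr-lf rule can be modelled on a plain list of characters. This is only an illustrative sketch of the rule as stated above, not how a port would implement it; the names cr, lf, eol-char? and decode-cr-lf are hypothetical.

```scheme
(define cr (integer->char 13))   ; #\return
(define lf (integer->char 10))   ; #\linefeed, i.e. #\newline

(define (eol-char? c)
  (or (char=? c cr) (char=? c lf)))

;; Decode a list of characters following the cr-lf input rule:
;; CR, LF, CR-LF and LF-CR are each read as a single #\newline.
(define (decode-cr-lf chars)
  (let loop ((cs chars) (acc '()))
    (cond ((null? cs)
           (reverse acc))
          ((eol-char? (car cs))
           (let ((rest (cdr cs)))
             (if (and (pair? rest)
                      (eol-char? (car rest))
                      (not (char=? (car rest) (car cs))))
                 ;; the opposite character follows: ignore it
                 (loop (cdr rest) (cons #\newline acc))
                 (loop rest (cons #\newline acc)))))
          (else
           (loop (cdr cs) (cons (car cs) acc))))))
```

Two identical end-of-line characters in a row (for example LF LF) are not merged, so an empty line is preserved as two #\newline characters.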
6.2 Octet-port operations
When using a buffered octet-port, the read-u8 and read-subu8vector
procedures specified in this section must be called before any use of
the port in a character input operation (i.e. a call to the
procedures read, read-char, peek-char, etc) because otherwise the
character-stream and octet-stream may be out of sync due to the port
buffering.
- [procedure] (read-u8 [PORT])
This procedure reads the octet input-port PORT and returns the
octet at the current octet location and advances the current octet
location to the next octet, unless the PORT is already at
end-of-file in which case read-u8 returns the end-of-file
object. If it is not specified, PORT defaults to the current
input-port.
For example:
> (call-with-input-u8vector
'#u8(11 22 33 44)
(lambda (p)
(let ((a (read-u8 p))) (list a (read-u8 p)))))
(11 22)
> (call-with-input-u8vector '#u8() read-u8)
#!eof
- [procedure] (write-u8 N [PORT])
This procedure writes the octet N to the octet output-port PORT and
advances the current octet location of that output-port. The value
returned is unspecified. If it is not specified, PORT defaults to
the current output-port.
For example:
> (call-with-output-u8vector '() (lambda (p) (write-u8 33 p)))
#u8(33)
- [procedure] (read-subu8vector U8VECTOR START END [PORT])
- [procedure] (write-subu8vector U8VECTOR START END [PORT])
These procedures support bulk binary I/O. The part of the u8vector
U8VECTOR starting at index START and ending just before index END
is used as an octet buffer that will be the target of
read-subu8vector or the source of write-subu8vector. Up
to END-START octets will be transferred. The number of octets
transferred, possibly zero, is returned by these procedures.
Fewer octets will be read by read-subu8vector if an end-of-file
is read, or a timeout occurs before all the requested octets are
transferred and the timeout thunk returns #f (see the procedure
input-port-timeout-set!). Fewer octets will be written by
write-subu8vector if a timeout occurs before all the requested
octets are transferred and the timeout thunk returns #f (see the
procedure output-port-timeout-set!). If it is not specified,
PORT defaults to the current input-port and current output-port
respectively.
For example, assuming the console is using latin1 character
encoding:
> (define v (make-u8vector 10))
> (read-subu8vector v 2 5)123456789
3
> 456789
> v
#u8(0 0 49 50 51 0 0 0 0 0)
7. DEVICE-PORTS
7.1 Filesystem devices
- [procedure] (open-file PATH-OR-SETTINGS)
- [procedure] (open-input-file PATH-OR-SETTINGS)
- [procedure] (open-output-file PATH-OR-SETTINGS)
- [procedure] (call-with-input-file PATH-OR-SETTINGS PROC)
- [procedure] (call-with-output-file PATH-OR-SETTINGS PROC)
- [procedure] (with-input-from-file PATH-OR-SETTINGS THUNK)
- [procedure] (with-output-to-file PATH-OR-SETTINGS THUNK)
All of these procedures create a port to interface to an
octet-stream device (such as a file, console, serial port, named
pipe, etc)
whose name is given by a path of the filesystem. The direction:
setting will default to the value input for the procedures
open-input-file, call-with-input-file and
with-input-from-file, to the value output for the procedures
open-output-file, call-with-output-file and
with-output-to-file, and to the value input-output for the
procedure open-file. The procedures open-file,
open-input-file and open-output-file return the port that is
created. The procedures call-with-input-file and
call-with-output-file call the procedure PROC with the port as
single argument, and then return the value(s) of this call after
closing the port. The procedures with-input-from-file and
with-output-to-file dynamically bind the current input-port and
current output-port respectively to the port created for the
duration of a call to the procedure THUNK with no argument. The
value(s) of the call to THUNK are returned after closing the port.
The first argument of these procedures is either a string denoting
a filesystem path or a list of port settings which must contain a
path: setting. Here are the settings allowed in addition to the
generic settings of octet-ports:
* path: STRING
This setting indicates the location of the file in the
filesystem. There is no default value for this setting.
* append: ( #f | #t )
This setting controls whether output will be added to the end
of the file. This is useful for writing to log files that
might be open by more than one process. The default value of
this setting is #f.
* create: ( #f | #t | maybe )
This setting controls whether the file will be created when
it is opened. A setting of #f requires that the file exist
(otherwise an exception is raised). A setting of #t
requires that the file does not exist (otherwise an exception
is raised). A setting of maybe will create the file if it
does not exist. The default value of this setting is maybe
for output-ports and #f for input-ports and bidirectional
ports.
* permissions: 12-BIT-EXACT-INTEGER
This setting controls the UNIX permissions that will be
attached to the file if it is created. The default value of
this setting is #o666.
* truncate: ( #f | #t )
This setting controls whether the file will be truncated when
it is opened. For input-ports, the default value of this
setting is #f. For output-ports, the default value of this
setting is #t when the append: setting is #f, and #f
otherwise.
For example:
> (with-output-to-file
(list path: "nofile"
create: #f)
(lambda ()
(display "hello world!\n")))
*** ERROR IN (console)@1.1 -- No such file or directory
(with-output-to-file '(path: "nofile" create: #f)
'#<procedure #2>)
- [procedure] (input-port-u8-position PORT [POSITION [WHENCE]])
- [procedure] (output-port-u8-position PORT [POSITION [WHENCE]])
When called with a single argument these procedures return the
octet position where the next I/O operation would take place in
the file attached to the given PORT (relative to the beginning of
the file). When called with two or three arguments, the octet
position for subsequent I/O operations on the given PORT is
changed to POSITION, which must be an exact integer. When WHENCE
is omitted or is the symbol "start", the POSITION is relative to
the beginning of the file. When WHENCE is the symbol "current",
the POSITION is relative to the current octet position of the
file. When WHENCE is the symbol "end", the POSITION is relative
to the end of the file. The return value is the new octet
position. On most operating systems the octet position for
reading and writing of a given bidirectional port are the same.
When input-port-u8-position is called to change the octet
position of an input-port, all input buffers will be flushed so
that the next octet read will be the one at the given position.
When output-port-u8-position is called to change the octet
position of an output-port, there is an implicit call to
force-output before the position is changed.
For example:
> (define p ; p is an input-output-port
(open-file '(path: "test" char-encoding: latin1 create: maybe)))
> (list (input-port-u8-position p) (output-port-u8-position p))
(0 0)
> (display "abcdefghij\n" p)
> (list (input-port-u8-position p) (output-port-u8-position p))
(0 0)
> (force-output p)
> (list (input-port-u8-position p) (output-port-u8-position p))
(11 11)
> (input-port-u8-position p 2)
2
> (list (input-port-u8-position p) (output-port-u8-position p))
(2 2)
> (peek-char p)
#\c
> (list (input-port-u8-position p) (output-port-u8-position p))
(11 11)
> (output-port-u8-position p -7 'end)
4
> (list (input-port-u8-position p) (output-port-u8-position p))
(4 4)
> (write-char #\! p)
> (list (input-port-u8-position p) (output-port-u8-position p))
(4 4)
> (force-output p)
> (list (input-port-u8-position p) (output-port-u8-position p))
(5 5)
> (input-port-u8-position p 1)
1
> (read p)
bcd!fghij
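The WHENCE argument can also be used to measure a file. Here is a
minimal sketch, not part of the proposal: the name
file-size-in-octets is hypothetical, and the sketch assumes that
open-input-file accepts a path string and that WHENCE is passed as
a symbol, as described above.

```scheme
;; Hypothetical helper (not part of the proposal): size of a file in
;; octets, obtained by seeking to offset 0 relative to the end.
(define (file-size-in-octets path)
  (let* ((p (open-input-file path))
         ;; Changing the position returns the new octet position,
         ;; which here is the size of the file.
         (size (input-port-u8-position p 0 'end)))
    (close-input-port p)
    size))
```

For the file "test" left behind by the example above, this helper
would be expected to return 11.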
7.2 Process devices
- [procedure] (open-process PATH-OR-SETTINGS)
This procedure starts a new operating system process and returns
a port that allows communication with that process on its
standard input and standard output. The default value of the
direction: setting is input-output, i.e. the Scheme program can
write to the process' standard input and can read from the
process' standard output.
The first argument of this procedure is either a string denoting a
filesystem path of an executable program or a list of port settings
which must contain a path: setting. Here are the settings
allowed in addition to the generic settings of octet-ports:
* path: STRING
This setting indicates the location of the executable program
in the filesystem. There is no default value for this
setting.
* arguments: LIST-OF-STRINGS
This setting indicates the string arguments that are passed
to the program. The default value of this setting is the
empty list (i.e. no arguments).
* environment: LIST-OF-STRINGS
This setting indicates the set of environment variable
bindings that the process receives. Each element of the list
is a string of the form "VAR=VALUE", where VAR is the
name of the variable and VALUE is its binding. If
LIST-OF-STRINGS is #f, the process inherits the environment
variable bindings of the Scheme program. The default value
of this setting is #f.
* stderr-redirection: ( #f | #t )
This setting indicates how the standard error of the process
is redirected. A setting of #t will redirect the standard
error to the standard output (i.e. all output to standard
error can be read from the process-port). A setting of #f
will leave the standard error as-is, which typically results
in error messages being output to the console. The default
value of this setting is #f.
* pseudo-terminal: ( #f | #t )
This setting indicates what type of device will be bound to
the process' standard input and standard output. A setting
of #t will use a pseudo-terminal device (this is a device
that behaves like a tty device even though there is no real
terminal or user directly involved). A setting of #f will
use a pair of pipes. The difference is important for
programs which behave differently when they are used
interactively, for example shells. The default value of this
setting is #f.
For example:
> (define p (open-process (list path: "/bin/ls"
                                arguments: '("../examples"))))
> (read-line p)
"complex"
> (read-line p)
"README"
> (close-port p)
> (define p (open-process "/usr/bin/dc"))
> (display "2 100 ^ p\n" p)
> (force-output p)
> (read-line p)
"1267650600228229401496703205376"
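The settings above can be combined to capture everything a program
prints. The following is only a sketch: process-output-lines is a
hypothetical name, and it assumes that read-line returns the
end-of-file object once the process has closed its standard output.

```scheme
;; Hypothetical helper: run a program and return the lines it prints
;; on standard output and standard error as a list of strings.
(define (process-output-lines path args)
  (let ((p (open-process (list path: path
                               arguments: args
                               stderr-redirection: #t))))
    (let loop ((lines '()))
      (let ((line (read-line p)))
        (if (eof-object? line)
            (begin
              (close-port p)
              (reverse lines))      ; lines were accumulated in reverse
            (loop (cons line lines)))))))
```

With the directory used in the example above, a call such as
(process-output-lines "/bin/ls" '("../examples")) would be expected
to return the same names that read-line produced one at a time.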
7.3 Network devices
- [procedure] (open-tcp-client SETTINGS)
This procedure opens a network connection to a TCP/IP server and
returns a tcp-client-port (a subtype of device-port) that
represents this connection and allows communication with that
server. The default value of the direction: setting is
input-output, i.e. the Scheme program can send information to
the server and receive information from the server. The sending
direction can be "shutdown" using the close-output-port
procedure and the receiving direction can be "shutdown" using the
close-input-port procedure. The close-port procedure closes
both directions of the connection.
The first argument of this procedure is a list of port settings
which must contain a server-address: setting and a
port-number: setting. Here are the settings allowed in addition
to the generic settings of octet-ports:
* server-address: STRING-OR-U8VECTOR
This setting indicates the internet address of the server.
It can be a string denoting a host name, which will be
translated to an IP address by the host-info procedure, or
a 4 or 16 element u8vector which contains the 32-bit IPv4 or
128-bit IPv6 address respectively. There is no default value
for this setting.
* port-number: 16-BIT-EXACT-INTEGER
This setting indicates the IP port-number of the server to
connect to (e.g. 80 for the standard HTTP server, 23 for the
standard telnet server). There is no default value for this
setting.
* keep-alive: ( #f | #t )
This setting controls the use of the "keep alive" option on
the connection. The "keep alive" option will periodically
send control packets on otherwise idle network connections to
ensure that the server host is active and reachable. The
default value of this setting is #f.
* coalesce: ( #f | #t )
This setting controls the use of TCP's "Nagle algorithm" which
reduces the number of small packets by delaying their
transmission and coalescing them into larger packets. A
setting of #t will coalesce small packets into larger ones.
A setting of #f will transmit packets as soon as possible.
The default value of this setting is #f. Note that this
setting does not affect the buffering of the port.
Here is an example of the client-side code that opens a connection
to an HTTP server on port 8080 on the same computer (for the
server-side code see the example for the procedure
open-tcp-server):
> (define p (open-tcp-client
             (list server-address: '#u8(127 0 0 1)
                   port-number: 8080
                   eol-encoding: 'cr-lf)))
> p
#<input-output-port #2 (tcp-client #u8(127 0 0 1) 8080)>
> (display "GET / HTTP/1.1\n" p)
> (force-output p)
> (read-line p)
"<HTML>"
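The independent closing of the two directions can be sketched as
follows. The helper request-lines is hypothetical, and the sketch
assumes a server that closes the connection once it has replied, so
that read-line eventually returns the end-of-file object.

```scheme
;; Hypothetical helper: send one request line, shut down the sending
;; direction, then collect every line of the reply until end-of-file.
(define (request-lines address port-number request)
  (let ((p (open-tcp-client (list server-address: address
                                  port-number: port-number
                                  eol-encoding: 'cr-lf))))
    (display request p)
    (newline p)
    (force-output p)
    (close-output-port p)            ; nothing more to send
    (let loop ((lines '()))
      (let ((line (read-line p)))
        (if (eof-object? line)
            (begin
              (close-input-port p)   ; receiving direction is done too
              (reverse lines))
            (loop (cons line lines)))))))
```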
- [procedure] (open-tcp-server PORT-NUMBER-OR-SETTINGS)
This procedure sets up a socket to accept network connection
requests from clients and returns a tcp-server-port from which
network connections to clients are obtained. Tcp-server-ports are
a direct subtype of object-ports (i.e. they are not
character-ports) and are input-ports. Reading from a
tcp-server-port with the read procedure will block until a
network connection request is received from a client. The read
procedure will then return a tcp-client-port (a subtype of
device-port) that represents this connection and allows
communication with that client. Closing a tcp-server-port with
either the close-input-port or close-port procedures will
cause the network subsystem to stop accepting connections on that
socket.
The first argument of this procedure is an IP port-number (16-bit
nonnegative exact integer) or a list of port settings which must
contain a port-number: setting. Below is a list of the settings
allowed in addition to the settings keep-alive: and coalesce:
allowed by the open-tcp-client procedure and the generic
settings of octet-ports. The settings which are not listed below
apply to the tcp-client-port that is returned by read when a
connection is accepted and have the same meaning as if they were
used in a call to the open-tcp-client procedure.
* port-number: 16-BIT-EXACT-INTEGER
This setting indicates the IP port-number assigned to the
socket which accepts connection requests from clients.
There is no default value for this setting.
* backlog: POSITIVE-EXACT-INTEGER
This setting indicates the maximum number of connection
requests that can be waiting to be accepted by a call to
read (technically it is the value passed as the second
argument of the UNIX listen() function). The default value
of this setting is 128.
* reuse-address: ( #f | #t )
This setting controls whether it is possible to assign a
port-number that is currently active. Note that when a
server process terminates, the socket it was using to accept
connection requests does not become inactive immediately.
Instead it remains active for a few minutes to ensure clean
termination of the connections. A setting of #f will cause
an exception to be raised in that case. A setting of #t
will allow a port-number to be used even if it is active.
The default value of this setting is #t.
Here is an example of the server-side code that accepts
connections on port 8080 (for the client-side code see the example
for the procedure open-tcp-client):
> (define s (open-tcp-server (list port-number: 8080
                                   eol-encoding: 'cr-lf)))
> (define p (read s)) ; blocks until client connects
> p
#<input-output-port #2 (tcp-client 8080)>
> (read-line p)
"GET / HTTP/1.1"
> (display "<HTML>\n" p)
> (force-output p)
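A server normally accepts more than one connection. Here is a
sketch of an accept loop that hands each connection to its own
thread. The names serve and HANDLER are hypothetical; thread-start!
and make-thread are assumed to be available, as in the vector-port
examples of this proposal.

```scheme
;; Hypothetical accept loop: HANDLER is called with the tcp-client-port
;; of each accepted connection, in a fresh thread. This procedure does
;; not return.
(define (serve port-number handler)
  (let ((s (open-tcp-server (list port-number: port-number
                                  eol-encoding: 'cr-lf))))
    (let loop ()
      (let ((conn (read s)))         ; blocks until a client connects
        (thread-start!
         (make-thread
          (lambda ()
            (handler conn)
            (close-port conn))))     ; closes both directions
        (loop)))))
```

For instance, the server of the example above could be written as
(serve 8080 (lambda (conn) (read-line conn) (display "<HTML>\n" conn)
(force-output conn))).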
7.4 Directory-ports
- [procedure] (open-directory PATH-OR-SETTINGS)
This procedure opens a directory of the filesystem for reading its
entries and returns a directory-port from which the entries can be
enumerated. Directory-ports are a direct subtype of object-ports
(i.e. they are not character-ports) and are input-ports. Reading
from a directory-port with the read procedure returns the next
file name in the directory as a string. The end-of-file object is
returned when all the file names have been enumerated. Another
way to get the list of all files in a directory is the
directory-files procedure which returns a list of the files in
the directory. The advantage of using directory-ports is that it
allows iterating over the files in a directory in constant space,
which is useful when the number of files in the directory is
not known in advance and may be large. Note that the order in
which the names are returned is operating-system dependent.
The first argument of this procedure is either a string denoting a
filesystem path to a directory or a list of port settings which
must contain a path: setting. Here are the settings allowed in
addition to the generic settings of object-ports:
* path: STRING
This setting indicates the location of the directory in the
filesystem. There is no default value for this setting.
* ignore-hidden: ( #f | #t | dot-and-dot-dot )
This setting controls whether hidden files will be returned.
Under UNIX and Mac OS X hidden files are those that start
with a period (such as ., .., and .profile). Under
Microsoft Windows hidden files are the . and .. entries
and the files whose "hidden file" attribute is set. A
setting of #f will enumerate all the files. A setting of
#t will only enumerate the files that are not hidden. A
setting of dot-and-dot-dot will enumerate all the files
except for the . and .. hidden files. The default value
of this setting is #t.
For example:
> (let ((p (open-directory (list path: "../examples"
                                 ignore-hidden: #f))))
    (let loop ()
      (let ((fn (read p)))
        (if (string? fn)
            (begin
              (write fn)
              (newline)
              (loop)))))
    (close-input-port p))
"."
".."
"complex"
"README"
"simple"
> (define x (open-directory "../examples"))
> (read-all x)
("complex" "README" "simple")
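The constant-space property mentioned above can be illustrated with
a sketch that counts the entries of a directory without building a
list of names. The name count-directory-entries is hypothetical.

```scheme
;; Hypothetical helper: count the entries of a directory, excluding
;; only "." and "..", while holding one file name at a time.
(define (count-directory-entries path)
  (let ((p (open-directory (list path: path
                                 ignore-hidden: 'dot-and-dot-dot))))
    (let loop ((n 0))
      (let ((fn (read p)))
        (if (eof-object? fn)
            (begin
              (close-input-port p)
              n)
            (loop (+ n 1)))))))
```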
- [procedure] (directory-files [PATH-OR-SETTINGS])
This procedure returns the list of the files in a directory. The
argument PATH-OR-SETTINGS is either a string denoting a filesystem
path to a directory or a list of settings which must contain a
path: setting. If it is not specified, PATH-OR-SETTINGS
defaults to the current directory (the value bound to the
current-directory parameter object). Here are the settings
allowed:
* path: STRING
This setting indicates the location of the directory in the
filesystem. There is no default value for this setting.
* ignore-hidden: ( #f | #t | dot-and-dot-dot )
This setting controls whether hidden files will be returned.
Under UNIX and Mac OS X hidden files are those that start
with a period (such as ., .., and .profile). Under
Microsoft Windows hidden files are the . and .. entries
and the files whose "hidden file" attribute is set. A
setting of #f will enumerate all the files. A setting of
#t will only enumerate the files that are not hidden. A
setting of dot-and-dot-dot will enumerate all the files
except for the . and .. hidden files. The default value
of this setting is #t.
For example:
> (directory-files)
("complex" "README" "simple")
> (directory-files "../include")
("config.h" "config.h.in" "foo.h" "makefile" "makefile.in")
> (directory-files (list path: "../include" ignore-hidden: #f))
("." ".." "config.h" "config.h.in" "foo.h" "makefile" "makefile.in")
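The relationship between directory-files and directory-ports can be
sketched as follows. This is an illustration only, not the way an
implementation has to define directory-files; the name
directory-files-via-port is hypothetical, and the default for an
omitted argument is not handled.

```scheme
;; Hypothetical equivalent of directory-files for an explicit
;; PATH-OR-SETTINGS argument, built on open-directory and read-all.
(define (directory-files-via-port path-or-settings)
  (let* ((p (open-directory path-or-settings))
         (files (read-all p)))       ; list of all remaining entries
    (close-input-port p)
    files))
```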
8. VECTOR-PORTS
- [procedure] (open-vector [VECTOR-OR-SETTINGS])
- [procedure] (open-input-vector [VECTOR-OR-SETTINGS])
- [procedure] (open-output-vector [VECTOR-OR-SETTINGS])
- [procedure] (call-with-input-vector VECTOR-OR-SETTINGS PROC)
- [procedure] (call-with-output-vector VECTOR-OR-SETTINGS PROC)
- [procedure] (with-input-from-vector VECTOR-OR-SETTINGS THUNK)
- [procedure] (with-output-to-vector VECTOR-OR-SETTINGS THUNK)
Vector-ports represent streams of Scheme objects. They are a
direct subtype of object-ports (i.e. they are not
character-ports). All of these procedures create vector-ports
that are either unidirectional or bidirectional. The direction:
setting will default to the value input for the procedures
open-input-vector, call-with-input-vector and
with-input-from-vector, to the value output for the procedures
open-output-vector, call-with-output-vector and
with-output-to-vector, and to the value input-output for the
procedure open-vector. Bidirectional vector-ports behave like
FIFOs: data written to the port is added to the end of the stream
that is read. It is only when a bidirectional vector-port's
output-side is closed with a call to the close-output-port
procedure that the stream's end is known (when the stream's end is
reached, reading the port returns the end-of-file object).
The procedures open-vector, open-input-vector and
open-output-vector return the port that is created. The
procedures call-with-input-vector and call-with-output-vector
call the procedure PROC with the port as single argument, and then
return the value(s) of this call after closing the port. The
procedures with-input-from-vector and with-output-to-vector
dynamically bind the current input-port and current output-port
respectively to the port created for the duration of a call to the
procedure THUNK with no argument. The value(s) of the call to
THUNK are returned after closing the port.
The first argument of these procedures is either a vector of the
elements used to initialize the stream or a list of port settings.
If it is not specified, the argument of the open-vector,
open-input-vector, and open-output-vector procedures defaults
to an empty list of port settings. Here are the settings allowed
in addition to the generic settings of object-ports:
* init: VECTOR
This setting indicates the initial content of the stream.
The default value of this setting is an empty vector.
* permanent-close: ( #f | #t )
This setting controls whether a call to the procedure
close-output-port will close the output-side of a
bidirectional vector-port permanently or not. A permanently
closed bidirectional vector-port whose end-of-file has been
reached on the input-side will return the end-of-file object
for all subsequent calls to the read procedure. A
non-permanently closed bidirectional vector-port will return
to its opened state when its end-of-file is read. The
default value of this setting is #t.
For example:
> (define p (open-vector))
> (write 1 p)
> (write 2 p)
> (write 3 p)
> (read p)
1
> (read p)
2
> (close-output-port p)
> (read p)
3
> (read p)
#!eof
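The call-with-input-vector procedure can be used in the same way as
call-with-input-file. Here is a small sketch that adds up the
numbers read from a vector-port; the name sum-objects is
hypothetical.

```scheme
;; Hypothetical helper: read every object from a vector-port
;; initialized with VEC and return the sum of the objects.
(define (sum-objects vec)
  (call-with-input-vector
   vec
   (lambda (p)
     (let loop ((total 0))
       (let ((x (read p)))
         (if (eof-object? x)
             total                   ; returned by call-with-input-vector
             (loop (+ total x))))))))

(sum-objects '#(1 2 3 4))            ; evaluates to 10
```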
- [procedure] (open-vector-pipe [VECTOR-OR-SETTINGS1
[VECTOR-OR-SETTINGS2]])
The procedure open-vector-pipe creates two vector-ports and
returns these two ports. The two ports are interrelated as
follows: the first port's output-side is connected to the second
port's input-side and the first port's input-side is connected to
the second port's output-side. The value VECTOR-OR-SETTINGS1 is
used to set up the first vector-port and VECTOR-OR-SETTINGS2 is
used to set up the second vector-port. The same settings as for
open-vector are allowed. The default direction: setting is
input-output (i.e. a bidirectional port is created). If it is
not specified, VECTOR-OR-SETTINGS1 defaults to the empty list. If
it is not specified, VECTOR-OR-SETTINGS2 defaults to
VECTOR-OR-SETTINGS1 but with the init: setting set to the empty
vector and with the input and output settings exchanged (e.g. if
the first port is an input-port then the second port is an
output-port, if the first port's input-side is non-buffered then
the second port's output-side is non-buffered).
For example:
> (define (server op)
    (receive (c s) (open-vector-pipe) ; client-side and server-side ports
      (thread-start!
       (make-thread
        (lambda ()
          (let loop ()
            (let ((request (read s)))
              (if (not (eof-object? request))
                  (begin
                    (write (op request) s)
                    (newline s)
                    (force-output s)
                    (loop))))))))
      c))
> (define a (server (lambda (x) (expt 2 x))))
> (define b (server (lambda (x) (expt 10 x))))
> (write 100 a)
> (write 30 b)
> (read a)
1267650600228229401496703205376
> (read b)
1000000000000000000000000000000
- [procedure] (get-output-vector VECTOR-PORT)
The procedure get-output-vector takes an output vector-port or a
bidirectional vector-port as argument and removes all the objects
currently on the output-side, returning them in a vector. The port
remains open and subsequent output to the port and calls to the
procedure get-output-vector are possible.
For example:
> (define p (open-vector '#(1 2 3)))
> (write 4 p)
> (get-output-vector p)
#(1 2 3 4)
> (write 5 p)
> (write 6 p)
> (get-output-vector p)
#(5 6)
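An output vector-port is a convenient accumulator when the number
of results is not known in advance. Here is a sketch; the name
squares-up-to is hypothetical.

```scheme
;; Hypothetical helper: collect the squares of 1..N in a vector by
;; writing them to an output vector-port.
(define (squares-up-to n)
  (let ((p (open-output-vector)))
    (let loop ((i 1))
      (if (<= i n)
          (begin
            (write (* i i) p)
            (loop (+ i 1)))))
    (let ((result (get-output-vector p)))
      (close-port p)
      result)))

(squares-up-to 3)                    ; evaluates to #(1 4 9)
```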
9. STRING-PORTS
- [procedure] (open-string [STRING-OR-SETTINGS])
- [procedure] (open-input-string [STRING-OR-SETTINGS])
- [procedure] (open-output-string [STRING-OR-SETTINGS])
- [procedure] (call-with-input-string STRING-OR-SETTINGS PROC)
- [procedure] (call-with-output-string STRING-OR-SETTINGS PROC)
- [procedure] (with-input-from-string STRING-OR-SETTINGS THUNK)
- [procedure] (with-output-to-string STRING-OR-SETTINGS THUNK)
- [procedure] (open-string-pipe [STRING-OR-SETTINGS1
[STRING-OR-SETTINGS2]])
- [procedure] (get-output-string STRING-PORT)
String-ports represent streams of characters. They are a direct
subtype of character-ports. These procedures are the string-port
analog of the procedures specified in the vector-ports section.
Note that these procedures are a superset of the procedures
specified in the "Basic String Ports SRFI" (SRFI 6).
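Here is a sketch that uses only the part of this API that is also
in SRFI 6: characters are read from an input string-port,
transformed, and accumulated in an output string-port. The name
upcase-via-ports is hypothetical.

```scheme
;; Hypothetical helper: upcase a string by copying it, one character
;; at a time, from an input string-port to an output string-port.
(define (upcase-via-ports str)
  (let ((in (open-input-string str))
        (out (open-output-string)))
    (let loop ()
      (let ((c (read-char in)))
        (if (not (eof-object? c))
            (begin
              (write-char (char-upcase c) out)
              (loop)))))
    (get-output-string out)))

(upcase-via-ports "abc")             ; evaluates to "ABC"
```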
- [procedure] (object->string OBJ [N])
This procedure converts the object OBJ to its external
representation and returns it in a string. The parameter N
specifies the maximal width of the resulting string. If the
external representation is wider than N, the resulting string will
be truncated to N characters and the last 3 characters will be set
to periods.
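The effect of the parameter N can be sketched with a number whose
external representation is 31 characters wide. The variable names
are only for illustration.

```scheme
(define big (expt 2 100))            ; a 31-digit exact integer

;; Without N the full external representation is returned:
;; "1267650600228229401496703205376"
(define full (object->string big))

;; With N = 30 the result is 30 characters long and its last 3
;; characters are periods: "126765060022822940149670320..."
(define short (object->string big 30))
```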
10. U8VECTOR-PORTS
- [procedure] (open-u8vector [U8VECTOR-OR-SETTINGS])
- [procedure] (open-input-u8vector [U8VECTOR-OR-SETTINGS])
- [procedure] (open-output-u8vector [U8VECTOR-OR-SETTINGS])
- [procedure] (call-with-input-u8vector U8VECTOR-OR-SETTINGS PROC)
- [procedure] (call-with-output-u8vector U8VECTOR-OR-SETTINGS PROC)
- [procedure] (with-input-from-u8vector U8VECTOR-OR-SETTINGS THUNK)
- [procedure] (with-output-to-u8vector U8VECTOR-OR-SETTINGS THUNK)
- [procedure] (open-u8vector-pipe [U8VECTOR-OR-SETTINGS1
[U8VECTOR-OR-SETTINGS2]])
- [procedure] (get-output-u8vector U8VECTOR-PORT)
U8vector-ports represent streams of octets. They are a direct
subtype of octet-ports. These procedures are the u8vector-port
analog of the procedures specified in the vector-ports section.
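Here is a sketch of octet-level processing with u8vector-ports. It
assumes that the octet I/O procedures are named read-u8 and
write-u8 (they are not specified in this section), and the name
u8vector-complement is hypothetical.

```scheme
;; Hypothetical helper: return a new u8vector in which every octet B
;; of BYTES is replaced by 255 - B.
(define (u8vector-complement bytes)
  (let ((in (open-input-u8vector bytes))
        (out (open-output-u8vector)))
    (let loop ()
      (let ((b (read-u8 in)))        ; assumed octet-reading procedure
        (if (not (eof-object? b))
            (begin
              (write-u8 (- 255 b) out)
              (loop)))))
    (get-output-u8vector out)))

(u8vector-complement '#u8(0 1 255))  ; evaluates to #u8(255 254 0)
```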