Processes

A process is a native (operating-system-level) application or
program that runs separately from the current virtual machine.

Many programming languages have facilities to allow access to system
processes (commands). (For example Java has java.lang.Process
and java.lang.ProcessBuilder.)
These facilities let you send data to the standard input, extract the
resulting output, look at the return code, and sometimes even pipe
commands together. However, this is rarely as easy as it is using
the old Bourne shell; for example command substitution is awkward.
Kawa’s solution is based on these two ideas:

A “process expression” (typically a function call) evaluates to an
LProcess value, which provides access to a Unix-style
(or Windows) process.

In a context requiring a string (or a bytevector), an LProcess is
automatically converted to a string (or bytevector)
comprising the standard output from the process.

Creating a process

The most flexible way to start a process is with either the
run-process procedure or
the &`{command} syntax
for process literals.

What makes an LProcess interesting is that it is also
a blob, which is automatically
converted to a string (or bytevector) in a context that requires it.
The contents of the blob come from the standard output of the process.
The blob is evaluated lazily,
so data is only collected when requested.
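As a minimal sketch of this lazy conversion, the following assumes an echo command is available on the search path. The names p1, s1, and b1 are illustrative only, and the type-declared define is used here as one plausible "context that requires" a string or bytevector.

```scheme
;; Start a process. Its output is not collected at this point.
(define p1 &`{echo hello})

;; Declaring the variable as a string requests the conversion,
;; which collects the standard output of the process.
(define s1 ::string p1)

;; The same process value, this time requested as raw bytes.
(define b1 ::bytevector p1)
```

Both s1 and b1 are derived from the single process p1, so the command itself is only run once.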

When you type a command to a shell, its output goes to the console.
Similarly, in a REPL the output from the process
is copied to the console output, which can sometimes be optimized
by letting the process inherit its standard output from the Kawa process.

Substitution and tokenization

To substitute the variable or the result of an expression
in the command line use the usual syntax for quasi literals:
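A small sketch of such a substitution follows. The file name notes.txt and the variable filename are hypothetical, and wc is assumed to be available on the system.

```scheme
;; A value to be spliced into the command line.
(define filename "notes.txt")

;; &[...] substitutes the value of a variable ...
&`{wc -l &[filename]}

;; ... or the result of any expression.
&`{wc -l &[(string-append "notes" ".txt")]}
```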

Since a process is convertible to a string, we need no special
syntax for command substitution:

&`{echo The directory is: &[&`{pwd}]}

or equivalently:

&`{echo The directory is: &`{pwd}}

Things get more interesting when considering the interaction between
substitution and tokenization. This is not simple string
interpolation. For example, if an interpolated value contains a quote
character, we want to treat it as a literal quote, rather than a token
delimiter. This matches the behavior of traditional shells. There are
multiple cases, depending on whether the interpolation result is a
string or a vector/list, and depending on whether the interpolation is
inside quotes.

If the value is a string, and we’re not inside quotes, then all
non-whitespace characters (including quotes) are literal, but
whitespace still separates tokens:
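To illustrate the unquoted-string case, here is a sketch that follows the rule as stated above. The command name cmd and the variable v1 are placeholders, and the token list in the comment is derived from the rule rather than from a recorded run.

```scheme
;; The value contains spaces and a single quote.
(define v1 "a b'c ")

;; Unquoted substitution: the quote is literal, whitespace splits tokens.
&`{cmd x y&[v1]z}

;; By the rule above this should be tokenized like:
;;   (run-process ["cmd" "x" "ya" "b'c" "z"])
```

The leading y joins the first word of the value, and the trailing space in the value separates b'c from z.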

Having quoting be handled by the $construct$:sh
implementation automatically eliminates common code injection problems.

Smart tokenization only happens when using the quasi-literal forms such
as &`{command}.
You can of course use string templates with run-process:

(run-process &{echo The directory is: &`{pwd}})

However, in that case there is no smart tokenization: The template is
evaluated to a string, and then the resulting string is tokenized,
with no knowledge of where expressions were substituted.
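The difference can be sketched with a value that contains a quote character. The variable title is hypothetical, and the comments describe what the two rules above imply rather than output from an actual session.

```scheme
(define title "it's")

;; Quasi-literal form: the tokenizer knows the quote came from a
;; substituted value, so it is treated as a literal character.
&`{echo &[title]}

;; Template form: the template first becomes the plain string
;; "echo it's", and only then is that string tokenized. The quote
;; is now indistinguishable from one typed in the command itself,
;; so it may be taken as a token delimiter.
(run-process &{echo &[title]})
```

When a substituted value may contain quotes or other shell-significant characters, the quasi-literal form is therefore the safer choice.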

Input/output redirection

You can use various keyword arguments to specify standard input, output,
and error streams. For example to lower-case the text in in.txt,
writing the result to out.txt, you can do:

&`[in-from: "in.txt" out-to: "out.txt"]{tr A-Z a-z}

or:

(run-process in-from: "in.txt" out-to: "out.txt" "tr A-Z a-z")

A process-redirect-argument can be one of the following:

in: value

The value is evaluated, converted to a string (as if
using display), and copied to the input file of the process.
The following are equivalent:

&`[in: "text\n"]{command}
&`[in: &`{echo "text"}]{command}

You can pipe the output from command1 to the input
of command2 as follows:

&`[in: &`{command1}]{command2}

in-from: path

The process reads its input from the specified path, which
can be any value coercible to a filepath.

out-to: path

The process writes its output to the specified path.

err-to: path

Similarly for the error stream.

out-append-to: path

err-append-to: path

Similar to out-to and err-to, but append to the file
specified by path, instead of replacing it.

in-from: 'pipe

out-to: 'pipe

err-to: 'pipe

Does not set up redirection. Instead, the specified stream is available
using the methods getOutputStream, getInputStream,
or getErrorStream, respectively, on the resulting Process object,
just like Java’s ProcessBuilder.Redirect.PIPE.

in-from: 'inherit

out-to: 'inherit

err-to: 'inherit

Inherits the standard input, output, or error stream from the
current JVM process.

out-to: port

err-to: port

Redirects the standard output or error of the process to
the specified port.

out-to: 'current

err-to: 'current

Same as out-to: (current-output-port),
or err-to: (current-error-port), respectively.

in-from: port

in-from: 'current

Redirects standard input to read from the port
(or (current-input-port)). It is unspecified how much is read from
the port. (The implementation is to use a thread that reads from the
port, and sends it to the process, so it might read to the end of the port,
even if the process doesn’t read it all.)

err-to: ’out

Redirect the standard error of the process to be merged with the
standard output.

The default for the error stream (if neither err-to nor
err-append-to is specified) is equivalent to err-to: 'current.
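A few of the file-oriented options above can be combined as in the following sketch. The command make all and the file names build.out, build.err, and build.log are hypothetical placeholders.

```scheme
;; Replace build.out and build.err on every run.
&`[out-to: "build.out" err-to: "build.err"]{make all}

;; Append standard output to a log file, keeping earlier runs.
&`[out-append-to: "build.log"]{make all}

;; Merge standard error into standard output, then collect
;; the combined text by converting the process to a string.
(define all-output ::string &`[err-to: 'out]{make all})
```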

Note: Writing to a port is implemented by copying the output or error
stream of the process. This is done in a thread, which means we don’t have
any guarantees when the copying is finished. (In the future we might
change process-exit-wait (discussed later) to wait not only for the
process to finish, but also for these helper threads to finish.)

A here document is
a form of literal string, typically multi-line, and commonly used in
shells for the standard input of a process. You can use string literals or
string quasi-literals for this.
For example, this passes the string "line1\nline2\nline3\n" to
the standard input of command:

(run-process [in: &{
&|line1
&|line2
&|line3
}] "command")

Note the use of &| to mark the end of ignored indentation.

Pipe-lines

Piping the output of one process as the input of another
is in principle easy: just use the in:
process argument. However, writing a multi-stage pipe-line quickly gets ugly:

&`[in: &`[in: "My text\n"]{tr a-z A-Z}]{wc}

The convenience macro pipe-process makes this much nicer:

(pipe-process
"My text\n"
&`{tr a-z A-Z}
&`{wc})

Syntax: pipe-process input process*

All of the process expressions must be run-process forms,
or equivalent &`{command} forms.
The result of evaluating input becomes the input to the first
process; the output from the first process becomes
the input to the second process, and so on. The result of the
whole pipe-process expression is that of the last process.

Copying the output of one process to the input of the next is
optimized: it uses a copying loop in a separate thread. Thus you can
safely pipe long-running processes that produce huge output. This
isn’t quite as efficient as using an operating system pipe, but is
portable and works pretty well.

Setting the process environment

By default the new process inherits the system environment of the current
(JVM) process as returned by System.getenv(), but you can override it.
A process-environment-argument can be one of the following:

env-name: value

In the process environment, set the variable name to the
specified value. For example:

&`[env-CLASSPATH: ".:classes"]{java MyClass}

NAME: value

Same as using the env-NAME option above, but only if the
NAME is uppercase (i.e. if uppercasing NAME yields
the same string). For example the previous example could be written:

&`[CLASSPATH: ".:classes"]{java MyClass}

environment: env

The env is evaluated and must yield a HashMap.
This map is used as the system environment of the process.

Waiting for process exit

When a process finishes, it returns an integer exit code.
The code is traditionally 0 on successful completion,
while a non-zero code indicates some kind of failure or error.

Procedure: process-exit-wait process

The process expression must evaluate to a process
(any java.lang.Process object).
This procedure waits for the process to finish, and then returns the
exit code as an int.

(process-exit-wait (run-process "echo foo")) ⇒ 0

Procedure: process-exit-ok? process

Calls process-exit-wait, and then returns #true
if the process exited with code 0, and returns #false otherwise.

This is useful for emulating the way traditional shells do
control flow operations based on the exit code.
For example in sh you might write:

if grep Version Makefile >/dev/null
then echo found Version
else echo no Version
fi
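A Kawa counterpart of this sh fragment might be sketched as follows, assuming grep and echo are available and that process-exit-ok? yields a true value for exit code 0, as the sh analogy implies.

```scheme
;; Branch on the exit status of grep, as the sh "if" does.
(if (process-exit-ok? &`{grep Version Makefile})
    &`{echo found Version}
    &`{echo no Version})
```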

Strictly speaking these are not quite the same, since the Kawa
version silently throws away the output from grep
(because no-one has asked for it). To match the output from sh,
you can use out-to: 'inherit:
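A sketch of that variant, under the same assumptions about grep and echo being available:

```scheme
;; Let grep write its matches to the console of the Kawa process,
;; while still branching on its exit status.
(if (process-exit-ok? &`[out-to: 'inherit]{grep Version Makefile})
    &`{echo found Version}
    &`{echo no Version})
```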

Exiting the current process

Procedure: exit [code]

Exits the Kawa interpreter, and ends the Java session.
Returns the value of code to the operating system:
The code must be an integer, or one of the special
values #f (equivalent to -1) or
#t (equivalent to 0).
If code is not specified, zero is returned.
The code is a status code; by convention a non-zero
value indicates a non-standard (error) return.

Before exiting, finally-handlers (as in try-finally,
or the after procedure of dynamic-wind) are
executed, but only in the current thread, and only if
the current thread was started normally. (Specifically
if we’re inside an ExitCalled block with non-zero
nesting - see gnu.kawa.util.ExitCalled.)
Also, JVM shutdown hooks are executed - which includes
flushing buffers of output ports. (Specifically
Writer objects registered with the WriterManager.)

Procedure: emergency-exit [code]

Exits the Kawa interpreter, and ends the Java session.
Communicates an exit value in the same manner as exit.
Unlike exit, neither finally-handlers nor
shutdown hooks are executed.
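The difference between the two procedures can be sketched with dynamic-wind. The procedure names exit-with-cleanup and exit-without-cleanup are hypothetical, and the comments restate the descriptions above rather than a recorded run.

```scheme
;; With exit, the "after" thunk is expected to run before the JVM
;; stops, provided the current thread was started normally.
(define (exit-with-cleanup)
  (dynamic-wind
    (lambda () #f)                        ; before: nothing to set up
    (lambda () (exit 3))                  ; body: request exit code 3
    (lambda ()                            ; after: finally-style handler
      (display "cleaning up")
      (newline))))

;; With emergency-exit, the "after" thunk and the JVM shutdown
;; hooks are skipped, so buffered output may be lost.
(define (exit-without-cleanup)
  (dynamic-wind
    (lambda () #f)
    (lambda () (emergency-exit 3))
    (lambda ()
      (display "cleaning up")
      (newline))))
```

Calling either procedure ends the Java session, so only one of them can be tried per run.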

Deprecated functions

Procedure: make-process command envp

Creates a <java.lang.Process> object, using the specified
command and envp.
The command is converted to an array of Java strings
(that is, an object that has type <java.lang.String[]>).
It can be a Scheme vector or list (whose elements should be
Java strings or Scheme strings); a Java array of Java strings;
or a Scheme string. In the latter case, the command is converted
using command-parse.
The envp is the process environment; it should be either
a Java array of Java strings, or the special #!null value.

Except for the representation of envp, this is similar to:

(run-process environment: envp command)

Procedure: system command

Runs the specified command, and waits for it to finish.
Returns the return code from the command. The return code is an integer,
where 0 conventionally means successful completion.
The command can be any of the types handled by make-process.

Equivalent to:

(process-exit-wait (make-process command #!null))

Variable: command-parse

The value of this variable should be a one-argument procedure.
It is used to convert a command from a Scheme string to a Java
array of the constituent "words".
The default binding, on Unix-like systems, returns a new command to
invoke "/bin/sh" "-c" concatenated with the command string;
on non-Unix systems, it is bound to tokenize-string-to-string-array.

Procedure: tokenize-string-to-string-array command

Uses a java.util.StringTokenizer to parse the command string
into an array of words. This splits the command using spaces
to delimit words; there is no special processing for quotes or other
special characters.
(This is the same as what java.lang.Runtime.exec(String) does.)
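The lack of quote handling can be seen in a sketch like the following. The command string is a made-up example, and the word list in the comment is what the description above implies, not output from an actual session.

```scheme
;; Quotes are ordinary characters to this tokenizer.
(define words
  (tokenize-string-to-string-array "grep 'two words' notes.txt"))

;; Splitting only on spaces should give four words:
;;   grep   'two   words'   notes.txt
;; so the quoted phrase is not kept together as one argument.
```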