1 Introduction

-*- mode: text; coding: utf-8-unix; -*-
######################################################################
##
## Copyright (C) 2001,2000, 2003
## Department of Computer Science, University of Tromsø, Norway
##
## Filename: README
## Author: Frode Vatvedt Fjeld
## Created at: Wed Dec 8 15:35:53 1999
## Distribution: See the accompanying file COPYING.
##
## $Id: README,v 1.1.1.1 2004/01/13 11:13:13 ffjeld Exp $
##
######################################################################
Binary-types is a Common Lisp package for reading and writing binary
files. Binary-types provides macros that are used to declare the
mapping between lisp objects and some binary (i.e. octet-based)
representation.
Supported kinds of binary types include:
* Signed and unsigned integers of any octet-size, big-endian or
little-endian. Maps to lisp integers.
* Enumerated types based on any integer type. Maps to lisp symbols.
* Complex bit-field types based on any integer type. Sub-fields can
be numeric, enumerated, or bit-flags. Maps to lisp lists of symbols
and integers.
* Fixed-length and null-terminated strings. Maps to lisp strings.
* Compound records of other binary types. Maps to lisp DEFCLASS
classes or, when you prefer, DEFSTRUCT structs.
Typically, a complete binary record format/type can be specified in a
single (nested) declaration statement. Such compound records may then
be read and written with READ-BINARY and WRITE-BINARY.
Binary-types is *not* helpful in reading files with variable
bit-length code-words, such as most compressed file formats. It will
basically only work with file-formats based on 8-bit bytes
(octets). Also, at this time no floating-point types are supported out
of the box.
Binary types may now be declared with the DEFINE-BINARY-CLASS macro,
which has the same syntax (and semantics) as DEFCLASS, only there is
an additional slot-option (named :BINARY-TYPE) that declares that
slot's binary type. Note that the binary aspects of slots are *not*
inherited (the semantics of inheriting binary slots is unclear to me).
Another slot-option added by binary-types is :MAP-BINARY-WRITE, which
names a function (of two arguments) that is applied to the slot's
value and the name of the slot's binary-type in order to obtain the
value that is actually passed to WRITE-BINARY. Similarly,
:MAP-BINARY-READ takes a function that is to be applied to the binary
data and type-name when a record of that type is being read. A
slightly modified version of :map-binary-read is
:MAP-BINARY-READ-DELAYED, which will do essentially the same thing as
:map-binary-read, only the mapping will be "on-demand": A slot-unbound
method will be created for this purpose.
A variation of the :BINARY-TYPE slot-option is :BINARY-LISP-TYPE,
which does everything :BINARY-TYPE does, but also passes on a :TYPE
slot-option to DEFCLASS (or DEFSTRUCT). The type-spec is inferred
from the binary-type declaration. When using this mechanism, you
should be careful to always provide a legal value in the slot (as you
must always do when declaring slots' types). If you find this
confusing, just use :BINARY-TYPE.
Performance has not really been a concern for me while designing this
package. There's no obvious performance bottlenecks that I know of,
but keep in mind that all "binary" reads and writes are reduced to
individual 8-bit READ-BYTEs and WRITE-BYTEs. If you do identify
particular performance bottlenecks, let me know.
The included file "example.lisp" demonstrates how to use this
package. To give you a taste of what it looks like, the following
declarations are enough to read the header of an ELF executable file
with the form
(let ((*endian* :big-endian))
(read-binary 'elf-header stream)
;;; ELF basic type declarations
(define-unsigned word 4)
(define-signed sword 4)
(define-unsigned addr 4)
(define-unsigned off 4)
(define-unsigned half 2)
;;; ELF file header structure
(define-binary-class elf-header ()
((e-ident
:binary-type (define-binary-struct e-ident ()
(ei-magic nil :binary-type
(define-binary-struct ei-magic ()
(ei-mag0 0 :binary-type u8)
(ei-mag1 #\null :binary-type char8)
(ei-mag2 #\null :binary-type char8)
(ei-mag3 #\null :binary-type char8)))
(ei-class nil :binary-type
(define-enum ei-class (u8)
elf-class-none 0
elf-class-32 1
elf-class-64 2))
(ei-data nil :binary-type
(define-enum ei-data (u8)
elf-data-none 0
elf-data-2lsb 1
elf-data-2msb 2))
(ei-version 0 :binary-type u8)
(padding nil :binary-type 1)
(ei-name "" :binary-type
(define-null-terminated-string ei-name 8))))
(e-type
:binary-type (define-enum e-type (half)
et-none 0
et-rel 1
et-exec 2
et-dyn 3
et-core 4
et-loproc #xff00
et-hiproc #xffff))
(e-machine
:binary-type (define-enum e-machine (half)
em-none 0
em-m32 1
em-sparc 2
em-386 3
em-68k 4
em-88k 5
em-860 7
em-mips 8))
(e-version :binary-type word)
(e-entry :binary-type addr)
(e-phoff :binary-type off)
(e-shoff :binary-type off)
(e-flags :binary-type word)
(e-ehsize :binary-type half)
(e-phentsize :binary-type half)
(e-phnum :binary-type half)
(e-shentsize :binary-type half)
(e-shnum :binary-type half)
(e-shstrndx :binary-type half)))
For a second example, here's an approach to supporting floats:
(define-bitfield ieee754-single-float (u32)
(((:enum :byte (1 31))
positive 0
negative 1)
((:numeric exponent 8 23))
((:numeric significand 23 0))))
The postscript file "type-hierarchy.ps" shows the binary types
hierarchy. It is generated using psgraph and closer-mop, which may be
loaded via Quicklisp as shown below:
(ql:quickload "psgraph")
(ql:quickload "closer-mop")
(with-open-file (*standard-output* "type-hierarchy.ps"
:direction :output
:if-exists :supersede)
(psgraph:psgraph *standard-output* 'binary-types::binary-type
(lambda (p)
(mapcar #'class-name
(closer-mop:class-direct-subclasses
(find-class p))))
(lambda (s) (list (symbol-name s)))
t))

This is a thin wrapper around WITH-OPEN-FILE, that tries to set the
stream’s element-type to that required by READ-BINARY and WRITE-BINARY.
A run-time assertion on the stream’s actual element type is performed,
unless you disable this feature by setting the keyword option :check-stream
to nil.

Bind STREAM-VAR to an object that, when passed to READ-BINARY, provides
8-bit bytes from LIST-FORM, which must yield a list.
Binds *BINARY-READ-BYTE* appropriately. This macro will break if this
binding is shadowed.

Bind STREAM-VAR to an object that, when passed to READ-BINARY, provides
8-bit bytes from VECTOR-FORM, which must yield a vector.
Binds *BINARY-READ-BYTE* appropriately. This macro will break if this
binding is shadowed.

Inside BODY, calls to WRITE-BINARY with stream STREAM-VAR will
collect the individual 8-bit bytes in a list (of integers).
This list is returned by the form. (There is no way to get at
the return-value of BODY.)
This macro depends on the binding of *BINARY-WRITE-BYTE*, which should
not be shadowed.

Arrange for STREAM-VAR to collect octets in a vector.
VECTOR-OR-SIZE-FORM is either a form that evaluates to a vector, or an
integer in which case a new vector of that size is created. The vector’s
fill-pointer is used as the write-index. If ADJUSTABLE nil (or not provided),
an error will occur if the array is too small. Otherwise, the array will
be adjusted in size, using VECTOR-PUSH-EXTEND. If ADJUSTABLE is an integer,
that value will be passed as the EXTENSION argument to VECTOR-PUSH-EXTEND.
If VECTOR-OR-SIZE-FORM is an integer, the created vector is returned,
otherwise the value of BODY.

Read a string from STREAM, terminated by any member of the list TERMINATORS.
If SIZE is provided and non-nil, exactly SIZE octets are read, but the returned
string is still terminated by TERMINATORS. The string and the number of octets
read are returned.

From a list of BYTES sized FROM-SIZE bits, split each byte into bytes of size TO-SIZE,
according to *ENDIAN*. TO-SIZE must divide FROM-SIZE evenly. If this is not the case,
you might want to apply MERGE-BYTES to the list of BYTES first.

Takes a binary-type specifier (a symbol, integer, or define-xx form),
and returns three values: the binary-type’s name, the equivalent lisp type,
and any nested declaration that must be expanded separately.