Hash tables

A hashtable is a data structure that
associates keys with values.
The hashtable has no intrinsic order for the (key, value) associations
it contains, and
supports in-place modification as the primary means of setting the contents
of a hash table.
Any object can be used as a key, provided a hash function and a suitable
equivalence function are available.
A hash function is a procedure that
maps keys to exact integer objects.

The hashtable provides key lookup and destructive update in amortised
constant time, provided that a good hash function is used.
A hash function h is acceptable for an equivalence predicate e iff
(e obj1 obj2) implies
(= (h obj1) (h obj2)).
A hash function h is good for an equivalence predicate e if
it distributes the resulting hash values for non-equal objects
(by e) as uniformly as possible over the range of hash
values, especially in the case when some (non-equal) objects resemble
each other by e.g. having common subsequences. This definition is
vague but should be enough to assert that e.g. a constant function is
not a good hash function.
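
As a minimal sketch of the distinction, the constant function below is acceptable for string=? (equal keys trivially receive equal hash values) but not good, because every key collides. The names constant-hash, slow-table, and fast-table are hypothetical and used only for illustration.

```scheme
(import (rnrs hashtables))

;; Acceptable for any equivalence predicate, but not good:
;; every key hashes to the same value, so lookups degrade toward linear time.
(define (constant-hash obj) 0)

(define slow-table (make-hashtable constant-hash string=?))
(define fast-table (make-hashtable string-hash string=?))

;; Both tables give the same answers; only the performance differs.
(hashtable-set! slow-table "apple" 1)
(hashtable-set! fast-table "apple" 1)
(hashtable-ref slow-table "apple" #f)   ; => 1
(hashtable-ref fast-table "apple" #f)   ; => 1
```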

Kawa provides two complete sets of functions for hashtables:

The functions specified by R6RS have names starting with hashtable-

The functions specified by the older
SRFI-69 specification
have names starting with hash-table-

Both interfaces use the same underlying datatype, so it is possible
to mix and match from both sets.
That datatype implements java.util.Map.
Freshly-written code should probably use the R6RS functions.

R6RS hash tables

To use these hash table functions in your Kawa program you must first:

(import (rnrs hashtables))

This section uses the hashtable parameter name for arguments that
must be hashtables, and the key parameter name for arguments that
must be hashtable keys.

Procedure: make-eq-hashtable

Procedure: make-eq-hashtable k

Return a newly allocated mutable hashtable that accepts arbitrary
objects as keys, and compares those keys with eq?. If an
argument is given, the initial capacity of the hashtable is set to
approximately k elements.

Procedure: make-eqv-hashtable

Procedure: make-eqv-hashtable k

Return a newly allocated mutable hashtable that accepts arbitrary
objects as keys, and compares those keys with eqv?. If an
argument is given, the initial capacity of the hashtable is set to
approximately k elements.
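
A short sketch of both constructors follows. The table names and the capacity of 100 are arbitrary choices for illustration.

```scheme
(import (rnrs hashtables))

(define by-symbol (make-eq-hashtable))       ; keys compared with eq?
(define by-number (make-eqv-hashtable 100))  ; keys compared with eqv?, sized for about 100 entries

(hashtable-set! by-symbol 'red "#ff0000")
(hashtable-set! by-number 42 'answer)

(hashtable-ref by-symbol 'red #f)    ; => "#ff0000"
(hashtable-ref by-number 42 #f)      ; => answer
```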

Procedure: make-hashtable hash-function equiv

Procedure: make-hashtable hash-function equiv k

hash-function and equiv must be procedures.
hash-function should accept a key as an argument and should return
a non-negative exact integer object. equiv should accept two
keys as arguments and return a single value. Neither procedure should
mutate the hashtable returned by make-hashtable.

The make-hashtable procedure returns a newly allocated mutable
hashtable using hash-function as the hash function and equiv
as the equivalence function used to compare keys. If a third argument
is given, the initial capacity of the hashtable is set to approximately
k elements.

Both hash-function and equiv should behave like pure
functions on the domain of keys. For example, the string-hash
and string=? procedures are permissible only if all keys are
strings and the contents of those strings are never changed so long as
any of them continues to serve as a key in the hashtable. Furthermore,
any pair of keys for which equiv returns true should be hashed to
the same exact integer objects by hash-function.

Note: Hashtables are allowed to cache the results of calling the
hash function and equivalence function, so programs cannot rely on the
hash function being called for every lookup or update. Furthermore any
hashtable operation may call the hash function more than once.
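
As an illustration of pairing a hash function with its matching equivalence, the sketch below builds a table with case-insensitive string keys. The table name headers and the capacity of 16 are hypothetical.

```scheme
(import (rnrs hashtables))

;; string-ci-hash is paired with string-ci=? so that keys which
;; compare equal also hash to the same value.
(define headers (make-hashtable string-ci-hash string-ci=? 16))

(hashtable-set! headers "Content-Type" "text/plain")
(hashtable-ref headers "content-type" #f)   ; => "text/plain"
(hashtable-size headers)                    ; => 1
```

The string keys here are literals that are never mutated, which satisfies the purity requirement described above.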

Procedures

Procedure: hashtable? obj

Return #t if obj is a hashtable, #f otherwise.

Procedure: hashtable-size hashtable

Return the number of keys contained in hashtable as an exact
integer object.

Procedure: hashtable-ref hashtable key default

Return the value in hashtable associated with key. If
hashtable does not contain an association for key,
default is returned.

Procedure: hashtable-set! hashtable key obj

Change hashtable to associate key with obj, adding a
new association or replacing any existing association for key, and
return unspecified values.

Procedure: hashtable-delete! hashtable key

Remove any association for key within hashtable and return
unspecified values.

Procedure: hashtable-contains? hashtable key

Return #t if hashtable contains an association for key,
#f otherwise.
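
The basic procedures above can be exercised together as in the following sketch. The table name ages and its contents are made up for the example.

```scheme
(import (rnrs hashtables))

(define ages (make-eqv-hashtable))

(hashtable-set! ages 'alice 30)        ; add a new association
(hashtable-set! ages 'alice 31)        ; replace the existing one
(hashtable-set! ages 'bob 25)

(hashtable? ages)                      ; => #t
(hashtable-size ages)                  ; => 2
(hashtable-ref ages 'alice 0)          ; => 31
(hashtable-ref ages 'carol 0)          ; => 0, the default
(hashtable-contains? ages 'bob)        ; => #t

(hashtable-delete! ages 'bob)
(hashtable-contains? ages 'bob)        ; => #f
```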

Procedure: hashtable-update! hashtable key proc default

proc should accept one argument, should return a single value, and
should not mutate hashtable.

The hashtable-update! procedure applies proc to the value
in hashtable associated with key, or to default if
hashtable does not contain an association for key. The
hashtable is then changed to associate key with the value
returned by proc.

The behavior of hashtable-update! is equivalent to the following
code, but may be (and is in Kawa) implemented more efficiently in cases
where the implementation can avoid multiple lookups of the same key:
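
As a sketch of that equivalence, assuming the formulation in the R6RS standard libraries report (a hashtable-set! whose new value is proc applied to the result of hashtable-ref), the helper below reproduces the behaviour with two lookups. The name my-hashtable-update! is hypothetical and used only for illustration.

```scheme
(import (rnrs hashtables))

;; Reference behaviour of hashtable-update!, written with two lookups.
;; my-hashtable-update! is a hypothetical name used only for this sketch.
(define (my-hashtable-update! hashtable key proc default)
  (hashtable-set! hashtable key
                  (proc (hashtable-ref hashtable key default))))

(define counts (make-eq-hashtable))
(my-hashtable-update! counts 'a (lambda (n) (+ n 1)) 0)   ; 'a is absent: 0 becomes 1
(hashtable-update! counts 'a (lambda (n) (+ n 1)) 0)      ; 'a is present: 1 becomes 2
(hashtable-ref counts 'a #f)                              ; => 2
```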

Inspection

Procedure: hashtable-equivalence-function hashtable

Return the equivalence function used by hashtable to compare keys.
For hashtables created with make-eq-hashtable and
make-eqv-hashtable, returns eq? and eqv?
respectively.

Procedure: hashtable-hash-function hashtable

Return the hash function used by hashtable. For hashtables
created by make-eq-hashtable or make-eqv-hashtable,
#f is returned.

Procedure: hashtable-mutable? hashtable

Return #t if hashtable is mutable, otherwise #f.
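
The inspection procedures can be tried as in this sketch. The table names are hypothetical, and hashtable-equivalence-function is the R6RS name of the procedure that returns the equivalence function.

```scheme
(import (rnrs hashtables))

(define eq-table (make-eq-hashtable))
(define string-table (make-hashtable string-hash string=?))

(hashtable-equivalence-function eq-table)      ; the eq? procedure
(hashtable-hash-function eq-table)             ; => #f
(hashtable-hash-function string-table)         ; the hash function given to make-hashtable
(hashtable-mutable? eq-table)                  ; => #t
```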

Hash functions

The equal-hash, string-hash, and string-ci-hash
procedures of this section are acceptable as the hash functions of a
hashtable only if the keys on which they are called are not mutated
while they remain in use as keys in the hashtable.

Procedure: equal-hash obj

Return an integer hash value for obj, based on its structure and
current contents. This hash function is suitable for use with
equal? as an equivalence function.

Note: Like equal?, the equal-hash procedure must
always terminate, even if its arguments contain cycles.

Procedure: string-hash string

Return an integer hash value for string, based on its current
contents. This hash function is suitable for use with string=?
as an equivalence function.

Procedure: string-ci-hash string

Return an integer hash value for string based on its current
contents, ignoring case. This hash function is suitable for use with
string-ci=? as an equivalence function.

Procedure: symbol-hash symbol

Return an integer hash value for symbol.
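
The sketch below checks the property that makes each of these functions suitable for its equivalence: keys that compare equal produce equal hash values. The table name and keys are invented for the example.

```scheme
(import (rnrs hashtables))

;; Keys that are equal under the paired predicate must hash alike.
(= (equal-hash (list 1 2 3)) (equal-hash (list 1 2 3)))   ; => #t
(= (string-hash "kawa") (string-hash "kawa"))             ; => #t
(= (string-ci-hash "Kawa") (string-ci-hash "KAWA"))       ; => #t

;; equal-hash with equal? allows structured keys such as lists.
(define table (make-hashtable equal-hash equal?))
(hashtable-set! table (list 1 2) 'pair-key)
(hashtable-ref table (list 1 2) #f)                       ; => pair-key
```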

SRFI-69 hash tables

To use these hash table functions in your Kawa program you must first:

(require 'srfi-69)

or

(require 'hash-table)

or

(import (srfi :69 basic-hash-tables))

Type constructors and predicate

Procedure: make-hash-table [ equal? [ hash [ size-hint]]] → hash-table

Create a new hash table with no associations.
The equal? parameter is a predicate
that should accept two keys and return a boolean telling whether they
denote the same key value; it defaults to the equal? function.

The hash parameter is a hash function, and defaults to an
appropriate hash function
for the given equal? predicate (see the Hashing section).
However, an
acceptable default is not guaranteed to be given for any equivalence
predicate coarser than equal?, except for string-ci=?.
(The function hash is acceptable for equal?, so if you
use an equivalence coarser than equal?, other than string-ci=?,
you must always provide the hash function yourself.)
(An equivalence predicate c1 is coarser than an equivalence
predicate c2 iff there exist values x and y such
that (and (c1 x y) (not (c2 x y))).)

The size-hint parameter can be used to suggest an appropriate
initial size. This option is not part of the SRFI-69 specification
(though it is handled by the reference implementation), so specifying
that option might be unportable.

Procedure: alist->hash-table alist [ equal? [ hash [ size-hint]]] → hash-table

Takes an association list alist and creates a hash table
hash-table which maps the car of every element in
alist to the cdr of corresponding elements in
alist. The equal?, hash, and size-hint
parameters are interpreted as in make-hash-table. If some key
occurs multiple times in alist, the value in the first
association will take precedence over later ones. (Note: the choice of
using cdr (instead of cadr) for values tries to strike a
balance between the two approaches: using cadr would render this
procedure unusable for cdr alists, but not vice versa.)
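
A small sketch of building a table from an association list with alist->hash-table, including a repeated key. The name colors and its entries are invented for the example.

```scheme
(require 'srfi-69)

(define colors
  (alist->hash-table '((red . "#f00") (green . "#0f0") (red . "#ff0000"))))

;; The first association for a repeated key takes precedence.
(hash-table-ref/default colors 'red #f)     ; => "#f00"
(hash-table-ref/default colors 'green #f)   ; => "#0f0"
(hash-table-size colors)                    ; => 2
```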

Reflective queries

Procedure: hash-table-equivalence-function hash-table

Returns the equivalence predicate used for keys of hash-table.

Procedure: hash-table-hash-function hash-table

Returns the hash function used for keys of hash-table.

Dealing with single elements

Procedure: hash-table-ref hash-table key [ thunk ] → value

This procedure returns the value associated to key in
hash-table. If no value is associated to key and
thunk is given, it is called with no arguments and its value is
returned; if thunk is not given, an error is signalled. Given a
good hash function, this operation should have an (amortised) complexity
of O(1) with respect to the number of associations in hash-table.

Procedure: hash-table-ref/default hash-table key default → value

Evaluates to the same value as
(hash-table-ref hash-table key (lambda () default)). Given a good hash function, this
operation should have an (amortised) complexity of O(1) with respect
to the number of associations in hash-table.

Procedure: hash-table-set! hash-table key value → void

This procedure sets the value associated to key in
hash-table. The previous association (if any) is removed. Given
a good hash function, this operation should have an (amortised)
complexity of O(1) with respect to the number of associations in
hash-table.

Procedure: hash-table-delete! hash-table key → void

This procedure removes any association to key in
hash-table. It is not an error if no association for the
key exists; in this case, nothing is done. Given a good hash
function, this operation should have an (amortised) complexity of O(1)
with respect to the number of associations in hash-table.

Procedure: hash-table-exists? hash-table key → boolean

This predicate tells whether there is any association to key in
hash-table. Given a good hash function, this operation should
have an (amortised) complexity of O(1) with respect to the number of
associations in hash-table.
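
The single-element procedures can be combined as in the following sketch. The table name capitals and its contents are hypothetical.

```scheme
(require 'srfi-69)

(define capitals (make-hash-table string=?))

(hash-table-set! capitals "France" "Paris")
(hash-table-exists? capitals "France")                    ; => #t
(hash-table-ref capitals "France")                        ; => "Paris"

;; A missing key needs either a thunk or a default; otherwise an error is signalled.
(hash-table-ref capitals "Spain" (lambda () 'unknown))    ; => unknown
(hash-table-ref/default capitals "Spain" 'unknown)        ; => unknown

(hash-table-delete! capitals "France")
(hash-table-delete! capitals "France")                    ; not an error: nothing to remove
(hash-table-exists? capitals "France")                    ; => #f
```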

Procedure: hash-table-update! hash-table key function [ thunk ] → void

Semantically equivalent to, but may be implemented more efficiently than,
the following code:
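
As a sketch of that equivalence, assuming the formulation in the SRFI-69 text (a hash-table-set! whose new value is function applied to the result of hash-table-ref with thunk), the helper below reproduces the behaviour with two lookups. The name my-hash-table-update! and the parameter name table are hypothetical.

```scheme
(require 'srfi-69)

;; Reference behaviour of hash-table-update!, written with two lookups.
;; my-hash-table-update! is a hypothetical name used only for this sketch.
(define (my-hash-table-update! table key function thunk)
  (hash-table-set! table key
                   (function (hash-table-ref table key thunk))))

(define stock (make-hash-table))
(my-hash-table-update! stock 'apples (lambda (n) (+ n 5)) (lambda () 0))   ; absent: 0 becomes 5
(hash-table-update! stock 'apples (lambda (n) (+ n 5)) (lambda () 0))      ; present: 5 becomes 10
(hash-table-ref/default stock 'apples #f)                                  ; => 10
```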

Procedure: hash-table-update!/default hash-table key function default → void

Behaves as if it evaluates to
(hash-table-update! hash-table key function (lambda () default)).
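
A common use of the default-taking variant, hash-table-update!/default, is counting occurrences. The sketch below uses a made-up list of symbols and a hypothetical table name counts.

```scheme
(require 'srfi-69)

(define counts (make-hash-table))

(for-each
 (lambda (word)
   ;; Start from 0 when the word has not been seen yet.
   (hash-table-update!/default counts word (lambda (n) (+ n 1)) 0))
 '(a b a c a b))

(hash-table-ref/default counts 'a 0)   ; => 3
(hash-table-ref/default counts 'b 0)   ; => 2
(hash-table-ref/default counts 'z 0)   ; => 0
```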

Dealing with the whole contents

Procedure: hash-table-size hash-table → integer

Returns the number of associations in hash-table. This operation takes
constant time.

Procedure: hash-table-keys hash-table → list

Returns a list of keys in hash-table.
The order of the keys is unspecified.

Procedure: hash-table-values hash-table → list

Returns a list of values in hash-table. The order of the values is
unspecified, and is not guaranteed to match the order of keys in the
result of hash-table-keys.

Procedure: hash-table-walk hash-table proc → void

proc should be a function taking two arguments, a key and a
value. This procedure calls proc for each association in
hash-table, giving the key of the association as key and the
value of the association as value. The results of proc are
discarded. The order in which proc is called for the different
associations is unspecified.

Procedure: hash-table-fold hash-table f init-value → final-value

This procedure calls f for every association in hash-table
with three arguments: the key of the association key, the value of the
association value, and an accumulated value, val. The val
is init-value for the first invocation of f, and for
subsequent invocations of f, the return value of the previous
invocation of f. The value final-value returned by
hash-table-fold is the return value of the last invocation of
f. The order in which f is called for different
associations is unspecified.
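
The sketch below traverses a table with hash-table-fold and hash-table-walk. Because the traversal order is unspecified, it only computes results that do not depend on order. The names prices and seen are hypothetical.

```scheme
(require 'srfi-69)

(define prices (alist->hash-table '((tea . 3) (coffee . 4) (juice . 5))))

;; Sum all values; addition is commutative, so the order of calls does not matter.
(hash-table-fold prices (lambda (key value total) (+ value total)) 0)   ; => 12

;; Collect the keys visited by hash-table-walk; only their number is predictable.
(define seen '())
(hash-table-walk prices (lambda (key value) (set! seen (cons key seen))))
(length seen)                                                            ; => 3
```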

Procedure: hash-table->alist hash-table → alist

Returns an association list such that the car of each element
in alist is a key in hash-table and the corresponding
cdr of each element in alist is the value associated to
the key in hash-table. The order of the elements is unspecified.

The following should always produce a hash table with the same mappings
as a hash table h:
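
A sketch of such a round trip, assuming the formulation in the SRFI-69 text: the association list of h is passed back to alist->hash-table together with the equivalence predicate and hash function of h. The contents of h and the name h2 are invented for the example.

```scheme
(require 'srfi-69)

(define h (alist->hash-table '((a . 1) (b . 2))))

;; Rebuild h from its association list, reusing its own
;; equivalence predicate and hash function.
(define h2
  (alist->hash-table (hash-table->alist h)
                     (hash-table-equivalence-function h)
                     (hash-table-hash-function h)))

(hash-table-ref/default h2 'a #f)   ; => 1
(hash-table-ref/default h2 'b #f)   ; => 2
(hash-table-size h2)                ; => 2
```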

Procedure: hash-table-copy hash-table → hash-table

Returns a new hash table with the same equivalence predicate, hash
function and mappings as in hash-table.

Procedure: hash-table-merge! hash-table1 hash-table2 → hash-table

Adds all mappings in hash-table2 into hash-table1 and
returns the resulting hash table. This function may modify
hash-table1 destructively.
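
A sketch of merging two tables follows. The table names and contents are hypothetical. The example deliberately does not inspect the value for the key present in both tables, since the description above does not say which one survives.

```scheme
(require 'srfi-69)

(define defaults (alist->hash-table '((color . blue) (size . 10))))
(define overrides (alist->hash-table '((size . 12) (weight . bold))))

;; Use the return value: hash-table1 may be modified destructively.
(define merged (hash-table-merge! defaults overrides))

(hash-table-ref/default merged 'color #f)    ; => blue
(hash-table-ref/default merged 'weight #f)   ; => bold
(hash-table-size merged)                     ; => 3
```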

Hash functions

The Kawa implementation always calls these hash functions with a single
parameter, and expects the result to be within the entire
(32-bit signed) int range, for compatibility with
standard hashCode methods.

Procedure: hash object [ bound ] → integer

Produces a hash value for object in the range from 0 (inclusive) to
bound (exclusive).

If bound is not given, the Kawa implementation returns a value within
the range (- (expt 2 31)) (inclusive)
to (- (expt 2 31) 1) (inclusive), that is, the 32-bit signed int range.
It does this by calling the standard hashCode method,
and returning the result as is.
(If the object is the Java null value, 0 is returned.)
This hash function is acceptable for equal?.

Procedure: string-hash string [ bound ] → integer

The same as hash, except that the argument string must be a string.
(The Kawa implementation returns the same as the hash function.)

Procedure: string-ci-hash string [ bound ] → integer

The same as string-hash, except that the case of characters in
string does not affect the hash value produced.
(The Kawa implementation returns the same as the hash function
applied to the lower-cased string.)

Procedure: hash-by-identity object [ bound ] → integer

The same as hash, except that this function is only guaranteed
to be acceptable for eq?.
Kawa uses the identityHashCode method of java.lang.System.