Description

The dmdiff tool compares two files (FITS or ASCII) and
determines whether they contain the same data. The default
behavior is to compare the data values - e.g. columns in a
table and pixel values in an image - as well as the metadata
in the file - e.g. keyword values, units, and comments - but
options exist to restrict the items being compared. There are
multiple ways of comparing values - such as equality or by
using absolute or relative differences - that can be specified
for different values using the tolfile parameter.

Exit Status

Like the Unix commands `diff' and `cmp', dmdiff assigns
special meaning to its exit status. An exit status of 0 means
that no differences were found in the two input files. An
exit status of 1 means that either differences were found or
an error occurred. An exit status greater than one always
indicates that an error occurred. Note that if the verbose
parameter is set to 0, the tool will produce no output, but
the exit status will still reflect whether differences were
found in the input files. This feature can be useful in
scripts that automatically compare a large number of files.
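
As a rough illustration of how a script might act on these exit
codes, the Python sketch below maps a status to a label and runs
the tool quietly. The helper names (classify_exit_status and
run_quiet_diff) are hypothetical, and the sketch assumes that
dmdiff is on the PATH and that verbose=0 is given on the command
line in the usual parameter=value form.

```python
import subprocess


def classify_exit_status(code):
    """Map a dmdiff-style exit status to a coarse label."""
    if code == 0:
        return "same"
    if code == 1:
        # Status 1 is ambiguous: differences were found OR an error occurred.
        return "different-or-error"
    return "error"


def run_quiet_diff(file1, file2, tool="dmdiff"):
    """Run the comparison with no output and return the label for its status.

    Assumes the tool is installed; verbose=0 suppresses all output, so only
    the exit status carries information.
    """
    argv = [tool, file1, file2, "verbose=0"]
    return classify_exit_status(subprocess.run(argv).returncode)
```

Because a status of 1 can mean either a difference or an error, a
script that needs to tell the two apart may have to rerun the
comparison with a higher verbose setting and inspect the output.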

Current Limitations

There are a few limitations in the tool:

The number of columns in the two infiles must be the same.
If the number of rows is different, dmdiff will only compare
the lesser number of rows in the inputs (e.g. infile1=1000
rows, infile2=500 rows; the first 500 rows of each file are
compared).

Values with the same name in the input files must also have
the same datatype. This means that dmdiff cannot compare
columns or images of different types and will exit with an
error if there is a mismatch.

Examples

Example 1

unix% dmdiff file1.fits file2.fits

Compare all header and data values in the
default block of file1.fits and file2.fits.

Example 2

unix% dmdiff "file1.dat[t=100:][cols x,y]" "file2.dat[cols x,y]"

Here the comparison is limited to the X and Y columns in the two
files (in this case ASCII files, using the
ASCII kernel support),
and the data from the first file has an additional
filter (only those rows with t values of 100 or more).

Example 3

unix% dmdiff "file1.fits[EVENTS]" "file2.fits[EVENTS]"

Compare all the header and table values of the EVENTS block in file1.fits and file2.fits.

Detailed Parameter Descriptions

Parameter=infile1 (file required filetype=input stacks=no)

1st input file name

The first file to use. It can contain Data Model syntax. The file does not have to have the
same format as the file given by the infile2 parameter.

Parameter=infile2 (file required filetype=input stacks=no)

2nd input file name

The second file to use. It can contain Data Model syntax. The file does not have to have the
same format as the file given by the infile1 parameter.

Parameter=outfile (file not required stacks=no)

Output file name

Output file listing summary of differences found.
If the value is omitted or set to 'none',
'NONE', or 'stdout', output will go to the standard output device
(generally the terminal). If outfile is set to 'stderr',
output will go to the standard error device (also generally
displayed on the terminal). Finally, if a filename is
given, output will be written to that file. The clobber
parameter controls whether an existing file will be
overwritten.

Parameter=tolfile (file not required stacks=no)

Tolerance file name

This is an ASCII text file that governs how values
are compared. The file is case insensitive, with
one command per line; empty lines and lines
beginning with the '#' character are
ignored. The order of the commands does not matter,
and commands that do not match anything in the
input files are ignored.

There are multiple ways to compare numeric values,
as discussed below. To refer to an image, use the
block name of the image (use 'dmlist filename blocks'
to find this out). The same syntax is used to refer
to keyword values, rows of a column, or image pixel
values, so whether the command

EVAL=range(1)

refers to a keyword, column, or image, depends on the
input files.

A single value

Using "name=value" means that it is an error if the value in
either file does not equal the given value. The following
example requires all ccd_id values to be equal to 3
and state values to match the string "finished":

ccd_id=3

state=finished

A range of values

The Data Model range syntax - namely a=b:c,
with b or c optional - can be used to specify that
a must be within the range b to c (if b or c is omitted,
the remaining value acts as a lower or upper limit only). That is,

ccd_id=6:8

ccd_id=6:

ccd_id=:8

require that the ccd_id values be in the
range 6 to 8 (inclusive), greater than or equal to
6, and less than or equal to 8 respectively.

Note that there is no check that the values in
the two files equal each other, just that they
match the range filter.
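
The inclusive, per-file nature of this check can be sketched in
Python as follows. This is only an illustration of the semantics
described above, not dmdiff's own code, and the function name
in_range is hypothetical.

```python
def in_range(value, lo=None, hi=None):
    """Inclusive range test; a limit of None means that side is unbounded."""
    if lo is not None and value < lo:
        return False
    if hi is not None and value > hi:
        return False
    return True


# ccd_id=6:8 applied to one value from each file. Both pass, even though
# the two values differ, because each file is tested against the range
# independently of the other.
value_file1, value_file2 = 6, 8
both_pass = in_range(value_file1, 6, 8) and in_range(value_file2, 6, 8)
```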

An absolute difference

The range(limit) form (distinct from the range
syntax above) is used to check that the absolute
difference between the values in the two files is
within the given limit.
So

chipx=range(1)

events_image=range(1.0e-6)

mean that the chipx values can differ by no more
than 1, and the events_image values no
more than 1.0e-6.

A percentage difference

To express a relative difference, use
% and then the difference as a percentage
(calculated relative to the first file).
Note that the % character is written before
the limit, otherwise it will be taken as
a string comparison. The commands

chipx=%1

events_image=%0.01

mean that the chipx values can differ by 1% or less
and the events_image values by 0.01% or less.
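
To contrast the absolute and percentage checks, the Python sketch
below implements both as described above. It is an illustration
only: the function names are hypothetical, and it assumes the
limits are inclusive and that the percentage is taken relative to
the magnitude of the value from the first file.

```python
def within_absolute(first, second, limit):
    """True when the two values differ by no more than `limit`."""
    return abs(first - second) <= limit


def within_percent(first, second, percent):
    """True when the difference is at most `percent` % of the first value."""
    return abs(first - second) <= abs(first) * percent / 100.0
```

Because the percentage is calculated relative to the first file,
the check is not symmetric: under these assumptions
within_percent(1, 2, 60) is false while within_percent(2, 1, 60)
is true, so swapping infile1 and infile2 can change the result.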

File names

When comparing string values (either column values or
a keyword) that contain file names, the "ignorepath"
directive can be used to compare only the file name,
ignoring any preceding path components. That is, with the
command

INFILES=ignorepath

the values /path1/to/file1.dat and /data/file1.dat
would be considered equal when stored in either the INFILES
keyword or column.
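
The effect of ignorepath can be pictured with the short Python
sketch below, which compares only the final path component. The
function name same_file_name is hypothetical, and the sketch
assumes Unix-style paths separated by '/'.

```python
import posixpath


def same_file_name(first, second):
    """Compare two path strings by their final component only."""
    return posixpath.basename(first) == posixpath.basename(second)
```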

Ignoring a value

To ignore a keyword, column, or image,
use the ! character followed
by the name of the item. To ignore multiple
items, write each one on a separate line of the
file (preceded by the ! character).
You can also use the Data Model virtual file syntax
for the infile1 and infile2 parameters to select
(or hide) certain columns.
The following commands will ignore the keywords
DATE, CHECKSUM, and CREATOR:

!DATE
!CHECKSUM
!CREATOR
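
To summarize the command forms described for this parameter, the
Python sketch below classifies a single tolerance-file line. It is
not dmdiff's parser: the function name parse_rule, the returned
(name, kind, argument) tuple, and the choice to fold names to
lower case are all assumptions made for illustration.

```python
import re


def parse_rule(line):
    """Classify one tolerance-file line; return None for blank or '#' lines."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    if text.startswith("!"):
        return (text[1:].strip().lower(), "ignore", None)

    name, _, rhs = text.partition("=")
    name, rhs = name.strip().lower(), rhs.strip()

    if rhs.lower() == "ignorepath":
        return (name, "ignorepath", None)

    match = re.fullmatch(r"range\((.+)\)", rhs, flags=re.IGNORECASE)
    if match:
        return (name, "absolute", float(match.group(1)))

    if rhs.startswith("%"):
        return (name, "percent", float(rhs[1:]))

    if ":" in rhs:
        lo, hi = (part.strip() for part in rhs.split(":", 1))
        try:
            limits = (float(lo) if lo else None, float(hi) if hi else None)
            return (name, "range", limits)
        except ValueError:
            pass  # not numeric limits; treat as a plain value instead

    return (name, "value", rhs)
```

For instance, under these assumptions parse_rule("ccd_id=6:") gives
("ccd_id", "range", (6.0, None)) and parse_rule("!DATE") gives
("date", "ignore", None).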

Parameter=keys (boolean not required default=yes)

Check header keywords?

Determines whether or not the header keys will be
compared. See also the units and comments
parameters. The tolerance file - set by the tolfile
parameter - can be used to filter out certain keywords
and to control whether, when comparing file names,
the path component should be ignored.

Parameter=data (boolean not required default=yes)

Check table or image data?

Determines whether or not the data values - i.e. the
image pixels or rows of each column - will be
compared.

Parameter=subspaces (boolean not required default=yes)

Check subspaces?

Controls whether or not the
subspace record,
stored in the file by CIAO tools to
record the filters applied,
will be compared.

Parameter=units (boolean not required default=yes)

Check units?

Controls whether or not the units of keywords
and columns will be compared.

Parameter=comments (boolean not required default=yes)

Check comments?

Controls whether or not the comments of
columns and keywords
will be compared. This does not refer to the COMMENT
or HISTORY keywords, which are
not used when comparing files.

Parameter=wcs (boolean not required default=yes)

Check wcs?

Controls whether the WCS keywords are included in the comparison.

Parameter=missing (boolean not required default=yes)

Check for missing header keys?

Determines if missing header keys will be checked.

Parameter=error_on_value (boolean not required default=yes)

Return error when values are different?

Parameter=error_on_comment (boolean not required default=yes)

Return error when comments are different?

Parameter=error_on_unit (boolean not required default=yes)

Return error when units are different?

Parameter=error_on_range (boolean not required default=yes)

Return error when ranges are different?

Parameter=error_on_datatype (boolean not required default=yes)

Return error when datatypes are different?

Parameter=error_on_wcs (boolean not required default=yes)

Return error when wcs's are different?

Parameter=error_on_subspace (boolean not required default=yes)

Return error when subspaces are different?

Parameter=error_on_missing (boolean not required default=yes)

Return error when header key is missing?

Parameter=verbose (integer not required default=1 min=0 max=5)

Debug level

Verbosity level of the information displayed on the terminal
(including Data Model output). If verbose is set to 0, the tool will produce
no output, but its exit status will indicate whether differences
were found in the input files. See the section "Exit Status" above.

Parameter=clobber (boolean not required default=no)

Clobber existing file

Controls whether a file is overwritten,
or the tool errors out,
if the outfile parameter is
set to a file name and it exists.

An example tolerance file

The purpose of the tolerance file is to set parameters when
comparing the values of the input files. The tolerance file
is an ASCII file with one keyword rule per line; see the
description of the tolfile parameter for more
information on the syntax and semantics of the commands.

As an example, consider a tolerance file which indicates that
any TSTART value must lie between given minimum and maximum
limits (the checks are inclusive, but note that there is no
requirement that the TSTART values are the same in the two
files, just that they both lie within this range);
the CHIPX values must not differ by more than 50;
the CCD_ID values must be equal to 8;
the CHECKSUM, DATASUM, and DATE values are ignored (whether
a column or keyword);
the TELESCOP value must be 'CHANDRA';
and the BACKFILE values are compared ignoring any path
component.
A TIME rule in such a file is ignored if its line begins with
a '#' character. Note also that the names of the values
to be compared are case insensitive.
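
A tolerance file matching this description could be written out as
in the Python sketch below. The TSTART limits, the exact form of
the commented-out TIME rule, the file name example.tol, and the
helper write_tolfile are placeholders chosen for illustration;
only the rule forms follow the tolfile syntax described earlier.

```python
from pathlib import Path

# Placeholder rules: the TSTART limits and the TIME rule are illustrative.
RULES = """\
#time=range(1)
tstart=1000:2000
chipx=range(50)
ccd_id=8
!checksum
!datasum
!date
telescop=CHANDRA
backfile=ignorepath
"""


def write_tolfile(path="example.tol"):
    """Write the illustrative rules to `path` and return the path."""
    Path(path).write_text(RULES)
    return path
```

The resulting file could then be passed to the tool with, for
example, "dmdiff file1.fits file2.fits tolfile=example.tol".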

Changes in CIAO 4.10

Updated to support diffing masks in the subspace.

Tweak to warning messages when files have mismatched column names.

Fixed issues with the units and error_on_unit parameters when
dealing with images.

Bugs

Problem with percent sign (%) in strings

dmdiff will produce incorrect results if any of
the strings (comment, units, value) contain a "%" character,
e.g. "90%ecf", because the "%e" is parsed as a string
format specifier.

The tool does not recognize differences in vector component ranges.

Caveats

dmdiff does not compare region subspace(s).

dmdiff does not recognize string lengths as being different as long
as the strings themselves are the same.