dm

Synopsis

CIAO Data Model: syntax for filtering and binning files

Description

The CIAO Data Model (DM) is a versatile interface used by CIAO to
examine and manipulate standard format datafiles (e.g. FITS,
ASCII). The DM enables powerful filtering and binning of
datafiles. This document is an introduction to the DM syntax used
by the CIAO tools.

Table of Contents

1. DM Syntax and Virtual Files

2. Virtual Columns

3. Renaming and Reordering Columns

Related help files contain information and examples illustrating
the capabilities of the DM. A list of these files can also be
obtained from the CIAO command line with "about dm" or "ahelp -k dm".

1. DM Syntax and Virtual Files

The Data Model offers an easy and powerful means of filtering
data. The filtered file can be directly input to a tool without
writing it to disk first; this is known as a "virtual file." The
virtual file, which can also be referred to as a subspace, is
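simply a means of defining a subset of interest in the dataset.

A virtual file specification consists of the filename followed by
optional bracketed qualifiers, which are described below.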

filename: the input filename. All CIAO tools accept FITS file
input, and many also accept ASCII files. Some tools only work
on event files, while others require an input image. Refer to the
individual tool help files for any restrictions.

[block]: the extension of the file to use, e.g. the name of the
image or table. For FITS files, the block corresponds to an HDU
and may be identified by name ("[EVENTS]") or number ("[2]").
If the block is not specified, the first "interesting" block is
used (e.g. [EVENTS] for an event file). To view the blocks in a
file, use "dmlist file.fits blocks".

[filter]: the filter to apply to the data. It indicates, for
instance, which time period, energy range, or spatial region
to use (e.g. "[time=1522012:1522320,1522400:1522600]").
Refer to "ahelp dmfiltering" for a full discussion of filtering.

[binning]: the binning specification for creating an image from
an event file (e.g. "[bin x=10:100:1,y=1:100:1]").
Refer to "ahelp dmbinning" for a full discussion of binning.

[columns]: the names of the columns to include
("[cols time,energy]") or exclude ("[cols -phas]"). The syntax
"[cols !phas]" may also be used, but the "!" symbol needs to be written
as "\!" in the Unix shell, making the "-" syntax more convenient.

[option]: advanced options for the DM, such as specifying what
the NULL character should be or how much memory to allow a tool to
use. Refer to "ahelp dmopt" for a list of the available options.

[rename]: the name for the block in the output file. The
default behavior is for the output to have the same block name,
unless a file is binned to create an image; in that case,
"_IMAGE" is added to the block name. (For information on
renaming columns, refer to a later section in this file.)
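
As an illustration of how these components combine, the sketch below
assembles specification strings in Python. The helper build_vfile is
hypothetical and not part of CIAO; it only concatenates strings, and
appending the qualifiers in the order listed above is an assumption.
The example values are taken from the Examples section of this file.

```python
def build_vfile(filename, block=None, filt=None, binning=None, cols=None):
    """Compose a DM virtual file string (hypothetical helper, not a CIAO API)."""
    spec = filename
    if block is not None:
        spec += f"[{block}]"          # block by name or number
    if filt is not None:
        spec += f"[{filt}]"           # row filter, e.g. "pha=30:200"
    if binning is not None:
        spec += f"[bin {binning}]"    # binning specification
    if cols is not None:
        spec += f"[cols {cols}]"      # column selection
    return spec


filtered = build_vfile("evt.fits", block="events",
                       filt="pha=30:200,time=10:20,50:60")
image = build_vfile("acisf01843N002_evt2.fits", block="EVENTS",
                    binning="x=3200:4800:4,y=3200:4800:4")
print(filtered)
print(image)
```

The resulting strings would be passed, quoted, as the input file
argument of a CIAO tool.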

2. Virtual Columns

A file may contain virtual columns whose values are calculated by
applying a mathematical transform to an existing column. Virtual
columns - such as EQPOS(RA,DEC) - do not physically exist in the
event file; they are defined by the WCS information attached to
another column, e.g. SKY.

The transformation is listed in the output of
"dmlist evt2.fits cols".

For most applications, these columns may be used the same as
non-virtual columns in the file. It is possible to list, filter,
and bin on virtual columns.

However, filtering and binning do not work reliably on virtual
columns derived from non-monotonic coordinate transforms
(e.g. MSC(THETA,PHI), or EQPOS near the poles;
see "ahelp coords" for more information on these coordinate systems).

3. Renaming and Reordering Columns

It is possible to rename a column or change the order of the
columns within a file. Note that certain CIAO tools require
particular column names (e.g. time, energy), but none of the tools
make assumptions about the order of the columns within a file.

To rename a column, run dmcopy with the column syntax
"newname=oldname". Multiple columns may be renamed in the same
command.

For example, with the input "pi.fits[cols rate=count_rate]" and the
output pi_rate.fits, the "count_rate" column in pi.fits is renamed to
"rate", and "rate" will be the only column in the output file. With
the input "pi.fits[cols rate=count_rate,*]", the "*" operator indicates
that all other columns should be copied unchanged to the output file.

The columns will appear in the output file in the order in which
they are specified. So in the renaming case, "rate" will be the
first column in the output. The "cols" syntax can be used to
reorder columns without modifying them as well:

dmcopy "pi.fits[cols energy,time,pi,count_rate, *]" reorder.fits
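
The ordering rules described above can be modeled with a short Python
sketch. It is an illustrative model only, not DM code: the function
apply_cols and the extra column name "grade" are hypothetical, and the
handling of "*" (all columns not named elsewhere in the list, kept in
their original order) is an assumption based on the description above.

```python
def apply_cols(columns, spec):
    """Return the output column names for the body of a "[cols ...]" list."""
    items = [item.strip() for item in spec.split(",") if item.strip()]
    # Input columns that are named explicitly are not repeated by "*".
    named = {item.split("=", 1)[-1].strip() for item in items if item != "*"}
    out = []
    for item in items:
        if item == "*":
            out.extend(c for c in columns if c not in named)
        elif "=" in item:
            new, old = (part.strip() for part in item.split("=", 1))
            if old not in columns:
                raise KeyError(old)
            out.append(new)               # newname=oldname
        else:
            if item not in columns:
                raise KeyError(item)
            out.append(item)
    return out


cols = ["time", "energy", "pi", "count_rate", "grade"]
print(apply_cols(cols, "rate=count_rate"))
print(apply_cols(cols, "rate=count_rate,*"))
print(apply_cols(cols, "energy,time,pi,count_rate, *"))
```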

Note that for a vector column sky(x,y),

"evt.fits[cols x,y]"

will retain the information that (x,y) is a vector column
called "sky". Any of the following, however, will separate the
vector components and lose the vector-dependent coordinate systems
like RA and Dec:

Examples

Example 1

Example 2

acisf01843N002_evt2.fits[#row=1:4]

Select rows 1-4 from a FITS file.

Example 3

dmlist "evt.fits[events][pha=30:200,time=10:20,50:60]" data

Use the tool dmlist to print specific data values from the
file to the screen. A filter is applied to the "events" block
in the file evt.fits. The filter selects rows in the table
for which the value of the pha column is >= 30 and < 200,
and for which the time is either >= 10 and < 20 or >=
50 and < 60. Both the pha and time filters must be
satisfied for a row to pass the filter.
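
The way the ranges combine in the filter "[pha=30:200,time=10:20,50:60]"
can be sketched in Python: ranges on the same column are alternatives,
and the filters on different columns must all be satisfied. This is an
illustrative model with hypothetical function names and made-up rows,
not DM code. Treating each upper bound as exclusive follows the wording
of this example and is an assumption; see "ahelp dmfiltering" for the
exact handling of boundary values.

```python
def in_ranges(value, ranges):
    """True if value falls in any of the (low, high) ranges."""
    return any(low <= value < high for low, high in ranges)


def passes(row, filters):
    """True if every filtered column of the row falls in one of its ranges."""
    return all(in_ranges(row[col], ranges) for col, ranges in filters.items())


filters = {"pha": [(30, 200)], "time": [(10, 20), (50, 60)]}
rows = [
    {"pha": 45, "time": 12.0},    # both filters satisfied
    {"pha": 45, "time": 30.0},    # time outside both ranges
    {"pha": 250, "time": 55.0},   # pha outside its range
    {"pha": 100, "time": 55.0},   # both filters satisfied
]
selected = [row for row in rows if passes(row, filters)]
print(len(selected))
```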

Example 4

acisf01843N002_evt2.fits[EVENTS][bin x=3200:4800:4,y=3200:4800:4]

Bin an event file into an image with this input to the tool
dmcopy.

Example 5

acisf01843N002_evt2.fits[EVENTS][bin pi=1:1024:1]

Use this specification as input to the tool dmextract to bin
an event file into a PI spectrum.

Example 6

dmcopy "evt.fits[cols -status]" evt_new.fits

Remove the status column from evt.fits and write the result to
evt_new.fits.

Bugs

General

Linear transforms have an extra 0.5 bin shift applied.

When creating a file with a Linear WCS, the CXCDM
adjusts the transform parameters to force a half bin
offset. While mathematically consistent with the
values input, the result may be confusing, especially
when the transform is well known, e.g. °C to °F.

The WCS library that the DM uses has a problem
computing coordinate transforms that involve
the CAR transform.

You may get a seg fault if you try to create a very large
image. What constitutes "very large" depends on the data
type, but for long and float images, 8192x8192 pixels seems
to be the threshold.

(The image doesn't have to be square; it just needs to
have 8192^2 pixels.)

This condition may be met when the "update=no"
option is used. Normally, when you filter a dataset, the
data subspace (which describes the boundaries of each
column's data and therefore is the intersection of the
initial minima and maxima with any subsequent filters)
gets updated to reflect the filtering. However, when you
give the "update=no" option, you instruct the DM not to
update the subspace to reflect the current filter.
Therefore, the full ranges for x and y are used in the
binning, and you get an 8192x8192 image (and a seg fault,
for the reason described above).
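
The threshold can be restated as a pixel count. The Python sketch below
checks whether a proposed image size reaches that count and estimates
the size of the pixel array. The function names are hypothetical, and
the figure of 4 bytes per pixel for long and float images is an
assumption made for the estimate.

```python
THRESHOLD_PIXELS = 8192 * 8192   # pixel count quoted above


def reaches_threshold(width, height):
    """True if a width x height image has at least 8192^2 pixels."""
    return width * height >= THRESHOLD_PIXELS


def approx_bytes(width, height, bytes_per_pixel=4):
    """Rough size of the pixel array, assuming 4-byte pixels by default."""
    return width * height * bytes_per_pixel


print(reaches_threshold(8192, 8192))       # square image at the threshold
print(reaches_threshold(4096, 16384))      # non-square, same pixel count
print(reaches_threshold(4096, 4096))       # well below the threshold
print(approx_bytes(8192, 8192) // 2**20)   # size in MiB
```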

There are three issues with the generic use of TC*
keywords in FITS files read and written by DM tools.

The first issue: TC*n[A-Z]. The DM function that composes this keyword
does not strip off the 'P' before adding the number and letter to
the end.

Second issue: TCTY* keys are not recognized and therefore are not processed.

Third issue: the DM does not always retain the T*NAM information when
it stores information on the DM descriptor, resulting in T*TYP keywords
in the output instead.

EXTNAME is forced to be the same as HDUNAME on output

When a FITS file is copied, the EXTNAME is
forced on output to be the same as the HDUNAME.
Typical Chandra/CIAO files require these to be
the same; however, data from other missions
and projects may not have the same requirement.

When filtering on WCS columns, the range is taken
by converting the range of the parent columns and using the
results as the limits of the WCS columns. When the transform is
highly non-linear, e.g. the TAN-P transform used to
go from DETX,DETY to THETA,PHI, this can
lead to incorrect limits and incorrect filters. Users
who want to filter on WCS columns should give explicit
ranges and not rely on the computed min/maxes.

bad% dmcopy "evt.fits[theta=:1]"
good% dmcopy "evt.fits[theta=0:1]"

Creating a vector on-the-fly when region filtering

When region-filtering images, you can create a vector on
the fly from any two axes by using a filter like
"(#1,#3)=circle(...)". Although the image is
filtered correctly with a temporary vector, the region
filter isn't recorded in the subspace. Hence, tools that
use the filtered file don't know that pixels outside the
filter region are invalid. As a result, dmstat
reports no nulls in the filtered image (unless you
explicitly tell the DM to set pixels outside the filter to
null by using "opt null=...").

Applying a bit-filter expression to an integer column does
not work, nor does it cause an error.

Using incorrect syntax with the rectangle shape does not
fail when filtering.

For images without a physical coordinate system, the DM will
internally create one. This is not written to the output file,
which may lead to errors if the image size changes due to spatial
filtering, as the output file then has a different physical WCS
compared to the input file.