3. CQL table layout

In the database, channels that use the Channel Access control-system
support can be identified by having their control-system type set to
“channel_access”.
The Channel Access control-system support stores all samples in a single
table with the name channel_access_samples.
The columns of this table are described by
Table C.1, “Columns of table channels_access_samples”.

Table C.1. Columns of table channels_access_samples

Column name

Column type

Data type

Description

channel_data_id

Partition key

uuid

Channel data ID.

decimation_level

Partition Key

int

Decimation level (identified by the decimation period in
seconds). Zero indicates raw samples.

bucket_start_time

Partition Key

bigint

Start time of the sample bucket (in nanoseconds since epoch,
which is January 1st, 1970, 00:00:00
UTC).

sample_time

Clustering Key

bigint

Time stamp of the sample (in nanoseconds since epoch, which is
January 1st, 1970, 00:00:00 UTC).

a_char

Regular

frozen<channel_access_array_char>

Data for a sample of type DBR_CHAR with more
than one element.

a_double

Regular

frozen<channel_access_array_double>

Data for a sample of type DBR_DOUBLE with
more than one element.

a_enum

Regular

frozen<channel_access_array_enum>

Data for a sample of type DBR_ENUM with more
than one element.

a_float

Regular

frozen<channel_access_array_float>

Data for a sample of type DBR_FLOAT with more
than one element.

a_long

Regular

frozen<channel_access_array_long>

Data for a sample of type DBR_LONG with more
than one element.

a_short

Regular

frozen<channel_access_array_short>

Data for a sample of type DBR_SHORT with more
than one element.

a_string

Regular

frozen<channel_access_array_string>

Data for a sample of type DBR_STRING with
more than one element.

current_bucket_size

Static

int

Accumulated size (in bytes) of the samples that have been
written to the sample bucket so far.

disabled

Regular

boolean

Marker for a sample indicating that the channel was disabled at
that point in time.

disconnected

Regular

boolean

Marker for a sample indicating that the channel was disconnected
at that point in time.

gs_char

Regular

frozen<channel_access_aggregated_scalar_char>

Data for an aggregated sample that has been built from samples
of type DBR_CHAR, each having a single
element.

gs_double

Regular

frozen<channel_access_aggregated_scalar_double>

Data for an aggregated sample that has been built from samples
of type DBR_DOUBLE, each having a single
element.

gs_float

Regular

frozen<channel_access_aggregated_scalar_float>

Data for an aggregated sample that has been built from samples
of type DBR_FLOAT, each having a single
element.

gs_long

Regular

frozen<channel_access_aggregated_scalar_long>

Data for an aggregated sample that has been built from samples
of type DBR_LONG, each having a single
element.

gs_short

Regular

frozen<channel_access_aggregated_scalar_short>

Data for an aggregated sample that has been built from samples
of type DBR_SHORT, each having a single
element.

s_char

Regular

frozen<channel_access_scalar_char>

Data for a sample of type DBR_CHAR with a
single element.

s_double

Regular

frozen<channel_access_scalar_double>

Data for a sample of type DBR_DOUBLE with a
single element.

s_enum

Regular

frozen<channel_access_scalar_enum>

Data for a sample of type DBR_ENUM with a
single element.

s_float

Regular

frozen<channel_access_scalar_float>

Data for a sample of type DBR_FLOAT with a
single element.

s_long

Regular

frozen<channel_access_scalar_long>

Data for a sample of type DBR_LONG with a
single element.

s_short

Regular

frozen<channel_access_scalar_short>

Data for a sample of type DBR_SHORT with a
single element.

s_string

Regular

frozen<channel_access_scalar_string>

Data for a sample of type DBR_STRING with a
single element.

The channel_data_id,
decimation_level, and
bucket_start_time form a composite partition key that
identifies the sample bucket.
These parameters are passed to the control-system support by the
Cassandra PV Archiver server and are simply used “as-is”.

The sample_time is used as the clustering key.
This way, it is easily possible to select only those samples from a
sample bucket that have a time stamp within a specific interval.

The current_bucket_size is a static column because it
obviously is the same for the whole sample bucket.
This column is updated by the control-system support each time a sample
is added to the sample bucket.

All other columns are used for storing the sample’s data.
For each sample, exactly one of these columns has a non-null value.
The disabled and disconnected
columns are simple boolean columns.
If one of them is true, it means that the sample is
a marker of the corresponding type.
Each column that stores a regular (non-marker) sample uses a
user-defined type (UDT) that is only used by that column.

Note

The names of the data columns have intentionally been chosen to be
very short.
The reason for this is simple:
Due to how regular columns are internally handled by Cassandra, the
column name is serialized for each row.
When there are many rows, a long column name can contribute to the
total data size significantly.
Most of this overhead is compensated by the compression that is
applied to SSTables before storing them on disk.
However, the sample bucket size that is limited to about 100 MB is
measured before applying the compression.
For this reason, longer column names would significantly reduce the
number of samples that could be stored in each sample bucket.

User-defined types (UDTs) are used for the same reason:
When the various fields that are needed to store a sample would be
represented as separate columns, the overhead that is caused by the
meta-data for each column would increase the total data size
significantly.
Frozen UDTs, on the other hand, are as efficient as frozen tuples,
allowing for the space-efficient storage of sample data while having
human-readable names for their fields.

fraction of the period that is actually covered by the data in
the aggregated sample.
A value of 1.0 means that the data that was
used to calculate the aggregated sample actually covers the full
period that is supposed to be represented by the aggregated
sample.
A value of 0.5 means that the data that was
used to calculate the aggregated sample actually only covers
half of the period that is supposed to be covered by the
aggregated sample.

alarm status (the number is the status code that is used by the
Channel Access protocol to signal the corresponding alarm
status).

precision

smallint

display precision for floating point numbers.

units

text

engineering units.

labels

frozen<list<text>>

labels for enum states.

lower_warning_limit

depends on UDT

lower warning limit.

upper_warning_limit

depends on UDT

upper warning limit.

lower_alarm_limit

depends on UDT

lower alarm limit.

upper_alarm_limit

depends on UDT

upper alarm limit.

lower_display_limit

depends on UDT

lower display limit.

upper_display_limit

depends on UDT

upper display limit.

lower_control_limit

depends on UDT

lower control limit.

upper_control_limit

depends on UDT

upper control limit.

Not all of these fields are present in each UDT.
The value, alarm_severity, and
alarm_status fields are the only ones that are
present in all UDTs.
The std, min,
max, and covered_period_fraction
fields are only present in the
channel_access_aggregated_* UDTs.
The precision field is only present in UDTs
representing samples of a floating-point type.
The units, lower_warning_limit,
upper_warning_limit,
lower_alarm_limit,
upper_alarm_limit,
lower_display_limit,
upper_display_limit,
lower_control_limit, and
upper_control_limit fields are only present in UDTs
that represent samples of a numeric type.
The labels field is only present in the
channel_access_array_enum and
channel_access_scalar_enum UDTs.

The type of the value field depends on the type of
the sample that is represented by the UDT.
The same applies to the lower_warning_limit,
upper_warning_limit,
lower_alarm_limit,
upper_alarm_limit,
lower_display_limit,
upper_display_limit,
lower_control_limit, and
upper_control_limit fields.
The types used for those fields are listed in
Table C.3, “Type of UDT fields”.

Table C.3. Type of UDT fields

User-defined type

Value field type

Limit fields type

channel_access_aggregated_scalar_char

double

tinyint

channel_access_aggregated_scalar_double

double

double

channel_access_aggregated_scalar_float

double

float

channel_access_aggregated_scalar_long

double

int

channel_access_aggregated_scalar_short

double

smallint

channel_access_array_char

blob

tinyint

channel_access_array_double

blob

double

channel_access_array_enum

blob

n/a

channel_access_array_float

blob

float

channel_access_array_long

blob

int

channel_access_array_short

blob

smallint

channel_access_array_string

blob

n/a

channel_access_scalar_char

tinyint

tinyint

channel_access_scalar_double

double

double

channel_access_scalar_enum

smallint

n/a

channel_access_scalar_float

float

float

channel_access_scalar_long

int

int

channel_access_scalar_short

smallint

smallint

channel_access_scalar_string

text

n/a

For aggregated samples, the value field is always of
type double because it stores the mean of all source
samples.
The array types store the value elements in a blob.
The reason for this is that Cassandra’s list type
comes with an overhead that is significant when representing a large
number of elements as it is commonly encountered for Channel Access
channels that have array values.

Storing these arrays inside a blob is very efficient
because the size occupied by each element is not more than the element’s
actual size (e.g. two bytes for a each element of a
DBR_SHORT sample).
The numbers inside the blob are stored in big endian
format, so that when using Java, they can easily be converted back to
numbers by interpreting the ByteBuffer
representing the blob as a buffer of numbers (e.g. an
IntBuffer for samples of type
DBR_LONG).

For array samples of type DBR_STRING, the blob stores
40 bytes for each element.
These 40 bytes represent the raw value as it has been received from the
Channel Access server.