A Real time provider of messages (for example, the inbound
interface of a Web Service)

A process must contain at least one Reader, and may contain several, for
example when matching data from multiple sources.

Use

Readers are used at the beginning of a process to select the sources of
data that you intend to work with, and to select and reorder the data
attributes from each source as that process requires. For example, for a
specific process you may wish to select only the name and address fields
from a data source, and to reorder them for display throughout the
process.
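
The selection and reordering described above can be pictured with a short
sketch. This is only an illustration in Python, not how OEDQ implements a
Reader: the function read_attributes and the attribute names in the sample
rows are hypothetical.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def read_attributes(records: Iterable[Record], selected: List[str]) -> List[Record]:
    """Keep only the selected attributes, in the order given by `selected`."""
    projected = []
    for record in records:
        # Python dicts preserve insertion order, so building each new record
        # in `selected` order also fixes the order used for display.
        projected.append({name: record.get(name) for name in selected})
    return projected


# Hypothetical source rows; the attribute names are made up for illustration.
source = [
    {"id": 1, "address": "1 High St", "name": "Ann Lee", "phone": "555-0100"},
    {"id": 2, "address": "9 Low Rd", "name": "Bo Chan", "phone": None},
]

# Select only name and address, and show name first.
rows = read_attributes(source, ["name", "address"])
```

Every later step in the sketch would see only the name and address
attributes, in that order, which mirrors the effect of the Reader
configuration on the rest of a process.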

A Reader is automatically added to a process for you, since a process
must always have at least one Reader.

Configuration

Reader Source

Select the Type of data that you wish to read from the following options:

Staged data - that is, a snapshot of data, or the named
output of another process, in the OEDQ repository

Note: The snapshot does not have to exist in the repository. You may
intend to run the process in streaming mode, in which case the source
data is not copied into the repository.

Data Interface - that is, a configured source-independent interface
of a set of data attributes

Reference Data - that is, a set of reference data that exists in the OEDQ repository

Real time provider - that is, a direct connection to a real time source
of messages

Select the Source of data from the available sources of the selected type.

All the available attributes in the data appear in the left-hand pane.
Select those that you wish to work with in the process by using the arrow
buttons to select and de-select attributes:

selects the attributes selected in the left-hand pane as
inputs to the process

selects all available
attributes as inputs

de-selects the selected
inputs in the right-hand pane

de-selects all inputs

In the right-hand pane, the attributes that you have chosen to work with
may be re-ordered by drag-and-drop.

The order that you specify in the Reader
will be used to display results throughout the process.

Note: If you know that you will not work with all the attributes of a
given set of data, it is a good idea to exclude the unneeded attributes in
the Reader. This makes configuring your processors and browsing your
results much more straightforward, as only the attributes you are
interested in are displayed.

If you are configuring a process to work with several Readers, you can
also choose to change the Data Stream Name. The Data Stream Name provides
a way of referring to each data source where a process has multiple
streams, such as when matching across several sources.

Options

Option                 | Type                                       | Purpose        | Default Value
No Data Reference Data | Reference Data (No Data Handling Category) | See note below | None

Note on No Data Handling

An option is provided to normalize different types of No Data values
in the reader using a Reference Data map, so that the process based on
the reader will treat these values in a common way. This option is not
used by default, so that you can profile and understand the data in its
'pure' form, allowing you to identify the different types of No Data when profiling.

A system-level No Data Handling map is provided. If used, this will
normalize any empty Strings, or Strings that consist entirely of No Data
characters (that is, spaces, or other non-printing or control characters)
to Null values.

This is the same functionality available when snapshotting data or using the Normalize No Data processor.

If you want to change the No Data characters, you can use a different
No Data Handling map.
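
The normalization described in this note can be sketched as follows. This
is a minimal Python illustration, not the OEDQ implementation. It assumes
that "No Data characters" means whitespace plus any character in a Unicode
"C" (control or other non-printing) category. The function names and the
extra_no_data_chars parameter are hypothetical; the parameter stands in for
using a different No Data Handling map.

```python
import unicodedata
from typing import Optional


def is_no_data_char(ch: str) -> bool:
    # Assumption: whitespace and Unicode category C* (control, format,
    # unassigned, and so on) count as No Data characters.
    return ch.isspace() or unicodedata.category(ch).startswith("C")


def normalize_no_data(value: Optional[str], extra_no_data_chars: str = "") -> Optional[str]:
    """Return None for empty or all-No-Data strings, else the value unchanged."""
    if value is None:
        return None
    # all() is True for an empty string, so "" is also normalized to None.
    if all(ch in extra_no_data_chars or is_no_data_char(ch) for ch in value):
        return None
    return value


blank = normalize_no_data("  \t")          # only whitespace
kept = normalize_no_data(" Ann ")          # contains real data, left as-is
dash = normalize_no_data("-", extra_no_data_chars="-")  # custom map
```

Note that a value containing real data is returned untouched, including
its surrounding spaces: the sketch only replaces values that consist
entirely of No Data characters.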

Execution

The Reader is a necessary part of any process, whatever the purpose of
that process. Some processors are not suitable for certain types of
execution, however. For example, it is not possible to match and
consolidate data from numerous sources in a real time response process.
Even so, selecting a Real time Reader Source (as above) places no
restrictions on the processors that are available for configuration,
because the execution of a process is driven by how its Reader(s) and
Writer(s) are configured.

In general, OEDQ is designed for three modes of execution:

Batch execution, where a set of records in one
or more data sources is processed in batch.

Real time monitoring execution, where OEDQ
acts as a data quality probe for a data source, monitoring incoming records
for quality as they are created, but where no real time response to each
record is expected.

Real time response execution, where OEDQ processes records and passes
them back, along with extra data, on a real time response interface.

Each processor in the library is listed with the execution modes that
can be sensibly used with that processor.
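
As a rough way to picture how Reader and Writer configuration drives the
three modes listed above, consider the sketch below. The mapping it uses is
an assumption made for illustration, not a rule stated in this topic: a
real time Reader with a real time Writer is treated as real time response,
a real time Reader alone as real time monitoring, and anything else as
batch. The names ExecutionMode and infer_mode are hypothetical.

```python
from enum import Enum


class ExecutionMode(Enum):
    BATCH = "batch"
    REAL_TIME_MONITORING = "real time monitoring"
    REAL_TIME_RESPONSE = "real time response"


def infer_mode(reader_is_real_time: bool, writer_is_real_time: bool) -> ExecutionMode:
    """Guess the execution mode from Reader/Writer configuration (assumed mapping)."""
    if reader_is_real_time and writer_is_real_time:
        # Records come in and go back out on real time interfaces.
        return ExecutionMode.REAL_TIME_RESPONSE
    if reader_is_real_time:
        # Records are read as they arrive, but no response is sent back.
        return ExecutionMode.REAL_TIME_MONITORING
    # Otherwise a fixed set of records is processed as a batch.
    return ExecutionMode.BATCH


mode = infer_mode(reader_is_real_time=True, writer_is_real_time=False)
```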

Results Browsing

The results browser for a Reader displays all the records present in the underlying data store once a process has been run.

Output Filters

The Reader does not provide any output filters. All records are read
from the specified source and made available to the remainder of the process.

Example

The following example shows the records that are read from the example
Service Management data in the example Service Management Project.

In this case, the Reader was configured to read all the data attributes
from the source, without changing their order, and without using the default
No Data Handling map. No further processing has yet been defined: