2.3. Sequencing content

The current JBoss DNA release contains a sequencing framework that is designed to sequence data (typically files)
stored in a JCR repository to automatically extract meaningful and useful information. This additional information is then
saved back into the repository, where it can be accessed and used.

In other words, you can just upload various kinds of files into a JCR repository, and DNA automatically processes
those files to extract meaningful structured information. For example, load DDL files into the repository, and let
sequencers extract the structure and metadata for the database schema. Load Hibernate configuration files into the
repository, and let sequencers extract the schema and mapping information. Load Java source into the repository, and let
sequencers extract the class structure, JavaDoc, and annotations. Load a PNG, JPEG, or other image into the repository,
and let sequencers extract the metadata from the image and save it in the repository. The same applies to XSDs, WSDL,
WS policies, UML, MetaMatrix models, and other formats.

JBoss DNA sequencers sit on top of existing JCR repositories (including federated repositories): they extract
more useful information from what's already stored in the repository, and they use the existing JCR versioning system.
Each sequencer typically processes a single kind of file format or a single kind of content.
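The idea of one sequencer per kind of content, with the extracted information saved back next to the original file,
can be illustrated with a minimal sketch. Everything in it is hypothetical and is not the JBoss DNA or JCR API: the
Sequencer interface, the InMemoryRepository stand-in, the LineCountSequencer toy, and the "/extracted" child path are
assumptions made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; these types are NOT the JBoss DNA or JCR API. */
final class SequencingSketch {

    /** A sequencer handles one kind of content and returns the properties it extracted. */
    interface Sequencer {
        boolean accepts(String path);
        Map<String, String> sequence(byte[] content);
    }

    /** Stand-in for a repository: node path mapped to that node's properties. */
    static final class InMemoryRepository {
        final Map<String, Map<String, String>> nodes = new LinkedHashMap<>();
        private final List<Sequencer> sequencers = new ArrayList<>();

        void register(Sequencer sequencer) {
            sequencers.add(sequencer);
        }

        /** Stores the file, then saves the output of every matching sequencer under a child path. */
        void upload(String path, byte[] content) {
            Map<String, String> fileProperties = new LinkedHashMap<>();
            fileProperties.put("size", Integer.toString(content.length));
            nodes.put(path, fileProperties);
            for (Sequencer sequencer : sequencers) {
                if (sequencer.accepts(path)) {
                    nodes.computeIfAbsent(path + "/extracted", key -> new LinkedHashMap<>())
                         .putAll(sequencer.sequence(content));
                }
            }
        }
    }

    /** Toy sequencer (an assumption for this sketch) that records the line count of ".txt" files. */
    static final class LineCountSequencer implements Sequencer {
        @Override
        public boolean accepts(String path) {
            return path.endsWith(".txt");
        }

        @Override
        public Map<String, String> sequence(byte[] content) {
            String text = new String(content, StandardCharsets.UTF_8);
            // split() drops trailing empty strings, so a final newline does not add a line.
            int lines = text.isEmpty() ? 0 : text.split("\r?\n").length;
            Map<String, String> extracted = new LinkedHashMap<>();
            extracted.put("lineCount", Integer.toString(lines));
            return extracted;
        }
    }

    public static void main(String[] args) {
        InMemoryRepository repository = new InMemoryRepository();
        repository.register(new LineCountSequencer());
        repository.upload("/files/notes.txt", "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        repository.nodes.forEach((path, properties) -> System.out.println(path + " " + properties));
    }
}
```

In this sketch, uploading a file the toy sequencer does not accept simply stores the file with no extracted node,
which mirrors the way each sequencer is limited to its own kind of content.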

The following sequencers are included in JBoss DNA:

Image sequencer
- A sequencer that processes the binary content of an image file, extracts the metadata for the image, and then
writes that image metadata to the repository. It gets the file format, image resolution, number of bits per pixel
(and optionally number of images), comments and physical resolution from JPEG, GIF, BMP, PCX, PNG, IFF, RAS, PBM,
PGM, PPM, and PSD files. (This sequencer may be improved in the future to also extract EXIF metadata from JPEG
files; see DNA-26.)

MP3 sequencer
- A sequencer that processes the contents of an MP3 audio file, extracts the metadata for the file, and then
writes that audio metadata to the repository. It gets the title, author, album, year, and comment.
(This sequencer may be improved in the future to also extract other ID3 metadata from other audio file formats; see
DNA-26.)
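To make the kind of extraction performed by the MP3 sequencer more concrete, the following minimal sketch reads the
title, author, album, year, and comment from an ID3v1 tag, which occupies the last 128 bytes of a tagged MP3 file.
It is not the JBoss DNA implementation: the Id3v1Sketch class is hypothetical, and the choice of ID3v1 (rather than
ID3v2) is an assumption, since the text above does not say which tag version the sequencer reads.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of JBoss DNA: reads an ID3v1 tag from the bytes of an MP3 file. */
final class Id3v1Sketch {

    /** An ID3v1 tag is a fixed 128-byte block at the very end of the file. */
    private static final int TAG_LENGTH = 128;

    /** Returns the tag fields, or an empty map if the content carries no ID3v1 tag. */
    static Map<String, String> extract(byte[] mp3) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (mp3 == null || mp3.length < TAG_LENGTH) {
            return fields;
        }
        int start = mp3.length - TAG_LENGTH;
        // The block starts with the ASCII marker "TAG".
        if (mp3[start] != 'T' || mp3[start + 1] != 'A' || mp3[start + 2] != 'G') {
            return fields;
        }
        // Fixed-width fields follow the marker; the final byte (genre) is ignored here.
        fields.put("title", text(mp3, start + 3, 30));
        fields.put("author", text(mp3, start + 33, 30)); // ID3v1 calls this field "artist"
        fields.put("album", text(mp3, start + 63, 30));
        fields.put("year", text(mp3, start + 93, 4));
        fields.put("comment", text(mp3, start + 97, 30));
        return fields;
    }

    /** Decodes a fixed-width field, stopping at the first NUL byte and trimming space padding. */
    private static String text(byte[] data, int offset, int length) {
        int end = offset;
        while (end < offset + length && data[end] != 0) {
            end++;
        }
        return new String(data, offset, end - offset, StandardCharsets.ISO_8859_1).trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Id3v1Sketch <file.mp3>");
            return;
        }
        extract(Files.readAllBytes(Paths.get(args[0])))
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

A real sequencer would write these values to the repository as properties rather than print them; the map returned
here only stands in for that output.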

As the community develops additional sequencers, they will also be included in JBoss DNA. Some of those that have been
identified as being useful include:

Data Definition Language (DDL) Sequencer
- Process various dialects of DDL, including that from Oracle, SQL Server, MySQL, PostgreSQL, and others. May need
to be split up into a different sequencer for each dialect. (See DNA-26.)

MP3 and MP4 Sequencer
- Process MP3 and MP4 audio files to extract the name of the song, artist, album, track number, and other metadata.
(See DNA-30.)

The examples in this book go into more detail about how sequencers are managed and used, and Chapter 5 goes into
detail about how to write custom sequencers.