Succinct Data Structures (SuDS) research group

Group description

The research group studies a new subfield of data compression: data structure
compression. The new aspect compared to traditional compression is that
the compressed data (structure) needs to be represented so that access
to its internal parts is provided without uncompressing the whole
structure. As an example, consider a binary tree of n nodes. It is
possible to represent the tree succinctly using about 2n bits so that
the children and parent of any node can be accessed in constant time. A
standard pointer-based representation of a binary tree takes on the order
of n log n bits.
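
To make the binary tree example concrete, the sketch below shows one well-known
encoding, the level-order bit sequence: every node is written as 1 and every
missing child as 0, giving 2n + 1 bits for n nodes. Navigation then reduces to
rank and select queries on the bit sequence. This is only an illustration, not
the group's implementation: the names Node, encode_level_order and
SuccinctBinaryTree are hypothetical, and rank and select are answered here from
plain Python lists, which are not succinct. A real implementation would replace
them with constant-time rank/select directories that use o(n) extra bits.

```python
from collections import deque


class Node:
    """Ordinary pointer-based binary tree node, used only to build the example."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def encode_level_order(root):
    """Visit the tree in level order; emit 1 for a node, 0 for a missing child."""
    bits = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            bits.append(0)
        else:
            bits.append(1)
            queue.append(node.left)
            queue.append(node.right)
    return bits


class SuccinctBinaryTree:
    """Navigation over the bit sequence; nodes are named by 1-based positions."""

    def __init__(self, bits):
        self.bits = bits
        # Assumption for illustration: naive tables instead of o(n)-bit
        # rank/select directories.
        self.prefix = [0]  # prefix[x] = number of 1s among positions 1..x
        for b in bits:
            self.prefix.append(self.prefix[-1] + b)
        self.ones = [i + 1 for i, b in enumerate(bits) if b]

    def rank1(self, x):
        return self.prefix[x]

    def select1(self, k):
        return self.ones[k - 1]

    def _node_or_none(self, x):
        return x if 1 <= x <= len(self.bits) and self.bits[x - 1] == 1 else None

    def left(self, x):
        return self._node_or_none(2 * self.rank1(x))

    def right(self, x):
        return self._node_or_none(2 * self.rank1(x) + 1)

    def parent(self, x):
        return None if x == 1 else self.select1(x // 2)


# Example: a root whose left child has only a right child, plus a right leaf.
root = Node(left=Node(right=Node()), right=Node())
bits = encode_level_order(root)
tree = SuccinctBinaryTree(bits)
```

For this four-node tree the encoding has nine bits, and the node stored at
position 5 (the grandchild) reports position 2 as its parent, all without any
stored pointers.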

In addition to providing new algorithms and data structures in the field of study,
the group contributes by engineering open source implementations targeted to
applications such as sequence analysis in Bioinformatics, full-text search in
Database Systems, and retrieval of structured documents in Information Retrieval.

News

The group participates in the new Finnish Centre of Excellence in Cancer Genetics Research, led by Academy Professor Lauri Aaltonen, starting in 2012. The name of the group has changed to genome-scale algorithmics, which better describes our current focus.

Collaboration

The project collaborates with several researchers abroad. A close and
long-term collaboration is with Professor Gonzalo Navarro of the
University of Chile.

Software

Generalized compressed suffix array for indexing a multiple alignment of several reference genomes, or of a reference genome plus known variants. Note: The current construction algorithm requires a lot of resources for genome-scale inputs. We are working on a distributed construction algorithm to alleviate this issue.