Welcome on the webpage of PAutomaC!

The competition is over. In addition of the technical report describing the competition that was written at the beginning,
an article published in the proceedings of ICGI'12 is available. It contains a lot of details about the competition and the results.
In the same proceedings, one can find a short paper from the winning team of the competition. Two articles from
other participants to the competition can also be found here (team Hulden) and
here (team Kepler).

July 3rd: After a storm that postponed for 2 days the end of competition, PAutomaC is finished. The winner is the team of Chihiro Shibata and Ryo Yoshinaka, Congratulation to them! And thanks to all participants: this have been a great competition.

May 20th: Phase 2 is launched: The data of the real competition are available! The data sets of the training phase are still available but you cannot submit your results any more. However, the files containing the true probabilities (obtained with the target automata) are available.

March 20th: 16 new data sets are available!

March 8th: The website is fully operational, the first data set is available.

Why this competition?

Finite state automata (or machines) are well-known models for
characterizing the behaviour of systems or processes.
They have been used for several decades in computer and software
engineering to model the complex behaviours of electronic circuits and
softwares.
A nice feature of an automaton model is that it is easy to interpret,
but unfortunately in many applications the original design of a system is
unknown. That is why learning approaches has been used, for instance:

To modelize DNA or protein sequences in bioinformatics.

To find patterns underlying different sounds for speech processing.

To develop morphological or phonological rules for natural language processing.

To modelize unknown mechanical processes in Physics

To discover the exact environment of robots.

To detect Anomaly for detecting intrusions in computer security.

To do behavioural modelling of users in applications ranging from web systems to the automotive sector.

To discover the structure of music styles for music classification and generation.

...

In all such cases, an automaton model is learned from observations of the system, i.e., a finite set of strings.
As the data gathered from observations is usually unlabelled, the standard method of dealing with this situation is to assume a
probabilistic automaton model, i.e., a distribution over strings.
In such a model, different states can generate different symbols with different probabilities: the two main formalisms are Hidden Markov Models (HMM)
and Probabilistic Finite State Automata (PFA).

This is what this competition is about.

How is it working?

We automatically generated PFAs in a way that is described here.
We then generated two data sets from each PFA: one is the training set
and the second is the test set (where duplicate strings have been
removed). The idea is to train your learning algorithm on the first data
set in order to assign
probabilities to the strings in the test set.