Set up Tonic Suite

Prerequisites

Building Caffe

If you have already built the DjiNN service, Caffe should already be installed and you can skip this step.

Caffe is under active development, and some of the latest changes may break downstream projects, so we provide a snapshot version on our Downloads page that is verified to build with Tonic Suite. Caffe can be built against different libraries, as detailed in its installation instructions, which we recommend reading to become familiar with the process.

Image Processing Applications

IMC sends an image to the DjiNN service, which returns a prediction of what the image contains to the application. The neural network architecture used is AlexNet, and the network is trained on 1.4 million images from ImageNet.

DIG sends an image of a handwritten digit to the DjiNN service, which sends back a prediction of the most likely digit (between 0 and 9). The neural network architecture used for DIG is based on MNIST.

FACE sends an image of a face to the DjiNN service, which sends back a prediction of the identity of the face. The neural network architecture used for FACE is replicated from DeepFace, published by Facebook, and the network is trained on the PubFig83+LFW dataset.

Directory structure

The home directory of the image applications is ./tonic-suite/img. In that directory:

./data/ contains pretrained data for face alignment, and the list of classes for IMC and FACE.

./input/ contains the input images for the image applications.

./src/ contains all the source files.

Building the applications

After building Flandmark, run make from the home directory of image processing applications:

Changing "--djinn 0" to "--djinn 1" will use the DjiNN service to execute the forward pass. The number of entries in the "--input" file determines how many images are batched into one query for the forward pass (one image per line).
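The batching rule above can be checked before running an application. The following is a minimal sketch, not part of Tonic Suite: the function name batch_size and the example image paths are hypothetical, and skipping blank lines is an assumption about how the list is read.

```shell
# Count the non-empty lines of an input list file.
# Each non-empty line is assumed to name one image in the batched query.
batch_size() {
    grep -c -v '^[[:space:]]*$' "$1"
}

# Build a small example list (hypothetical paths, one image per line)
# and report how many images it would batch into one query.
list=$(mktemp)
printf '%s\n' input/a.jpg input/b.jpg input/c.jpg > "$list"
echo "batch size: $(batch_size "$list")"
rm -f "$list"
```

Running the same function on the file you pass to "--input" shows how many images one query will carry.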

Automatic Speech Recognition (ASR) Application

Tonic Suite contains an Automatic Speech Recognition (ASR) application, which takes in a user's audio file and generates the most likely transcript. The network architecture is adapted from Kaldi, a state-of-the-art speech recognition toolkit. The model is trained on VoxForge, an open-source, large-scale speech corpus.

Directory structure

The home directory of the speech application is ./tonic-suite/asr. In that directory:

Building the applications

Kaldi depends on an installed ATLAS library as well as a specific set of ATLAS headers. If your system does not have ATLAS installed, you can install it from the package manager. On Ubuntu, run:

$ sudo apt-get install libatlas-dev

You can also install ATLAS from source (CPU throttling needs to be off for this). We provide a script tonic-suite/asr/tools/install_atlas.sh to help you install ATLAS from source.

After installing ATLAS, download the set of ATLAS headers required by Kaldi.

configure will generate kaldi.mk, which includes all the relevant path information, including the OpenFst libraries and the ATLAS headers, which should reside under tools/. kaldi.mk also specifies the location of the ATLAS libraries on the system. Make sure these paths are set correctly before proceeding.
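One way to review those paths is to print only the lines of kaldi.mk that mention ATLAS or OpenFst. This is a minimal sketch: the function name show_kaldi_paths is hypothetical, the exact variable names inside kaldi.mk depend on your Kaldi snapshot, and the location of kaldi.mk depends on where configure was run.

```shell
# Print, with line numbers, every line of a kaldi.mk file that mentions
# ATLAS or OpenFst (case-insensitive), so the paths can be checked by eye.
show_kaldi_paths() {
    grep -n -i -E 'atlas|fst' "$1"
}

# Only run the check if a kaldi.mk exists in the current directory
# (assumption: you are in the directory where configure wrote it).
if [ -f kaldi.mk ]; then
    show_kaldi_paths kaldi.mk
fi
```

Each printed path can then be compared against the files that actually exist on the system.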

Running the applications

Changing "--djinn 0" to "--djinn 1" will use the DjiNN service to execute the forward pass. The number of entries in the "--input" file determines how many wav files are batched into one query for the forward pass (one wav file per line).
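Because every line of the input list names one wav file, a missing file affects the whole batched query. The sketch below checks a list before it is used; the function name check_list and the file name wav-list.txt are hypothetical, and it assumes paths in the list are resolved relative to the directory you run from.

```shell
# Report every entry in an input list that is not a readable file.
# Returns 0 if all entries exist, 1 if at least one is missing.
check_list() {
    missing=0
    while IFS= read -r path || [ -n "$path" ]; do
        # Skip blank lines (assumption: they are not batch entries).
        [ -z "$path" ] && continue
        if [ ! -r "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done < "$1"
    return "$missing"
}

# Only run the check if the hypothetical list file is present.
if [ -f wav-list.txt ]; then
    check_list wav-list.txt
fi
```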

Changing "--djinn 0" to "--djinn 1" will use the DjiNN service to execute the forward pass. The number of sentences in the "--input" file determines how many sentences are batched into one query for the forward pass (one sentence per line).

Citing Tonic Suite

If you use Tonic Suite in your research, please cite the official publication [1].