This document provides an overview of cpp_client,
a simple command-line utility shipped with BigARTM.

To run cpp_client you need to download input data (a text collection in bag-of-words format).
We recommend downloading the vocab and docword files from the links provided in the Downloads section of the tutorial.
Then you can use cpp_client as follows:

cpp_client -d docword.kos.txt -v vocab.kos.txt

You may append the following options to customize the resulting topic model:

-t or --num_topic sets the number of topics in the resulting topic model.

-i or --num_iters sets the number of iterative scans over the collection.

--num_inner_iters sets the number of updates of the Theta matrix performed on each iteration.

--reuse_theta enables caching of the Theta matrix and reuses the Theta matrix from
the previous iteration as the initial approximation on the next iteration. Without the
--reuse_theta switch, the default is to generate a random approximation of the Theta matrix on each iteration.

--tau_phi, --tau_theta and --tau_decor allow you to specify the weights of different regularizers.
Currently cpp_client does not allow you to customize regularizer weights per topic
or per iteration. This limitation applies only to cpp_client;
you can achieve this by using the BigARTM interface (either in Python or in C++).

--update_every is a parameter of the online algorithm.
When specified, the model will be updated every update_every documents.
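
As an illustration, the model options above can be combined in a single call. This is a minimal sketch: the numeric values are arbitrary placeholders rather than recommended settings, and it assumes that each option takes a space-separated value in the same way as -d and -v in the example above. The command is printed first so it can be checked before running.

```shell
# Placeholder values chosen for illustration only; the option names come from the list above.
NUM_TOPICS=16
NUM_ITERS=10
NUM_INNER_ITERS=5

# Assumption: options take space-separated values, as -d and -v do.
CMD="cpp_client -d docword.kos.txt -v vocab.kos.txt -t $NUM_TOPICS -i $NUM_ITERS --num_inner_iters $NUM_INNER_ITERS --reuse_theta"

# Print the assembled command for inspection.
echo "$CMD"

# Uncomment to run it once cpp_client and the data files are in place.
# $CMD
```

Regularizer weights (--tau_phi, --tau_theta, --tau_decor) and --update_every can presumably be appended in the same way.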

You may also apply the following optimizations, which should not change the resulting model:

-p allows you to specify the number of concurrent processors.
The recommended value is the number of logical cores on your machine.

--no_scores disables calculation and visualization of all scores.
This is a clean way of measuring the pure performance of BigARTM,
because at the moment some scores take an unnecessarily long time to calculate.

--disk_cache_folder has an effect only when used together with --reuse_theta.
This parameter allows you to specify a writable disk location where BigARTM can cache the Theta matrix
between iterations to avoid storing it in main memory.

--merger_queue_size limits the size of the merger queue. Decreasing the size of the queue might
reduce memory usage, but it can also decrease CPU utilization of the processors.
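
A performance-oriented run might look like the sketch below. It is not a tuned configuration: the cache path ./theta_cache is a hypothetical location, the core count is read with getconf (which reports online logical processors on Linux and macOS), and the space-separated value syntax for -p and --disk_cache_folder is assumed from the -d and -v example.

```shell
# Number of logical cores currently online, used as the value for -p.
CORES=$(getconf _NPROCESSORS_ONLN)

# Hypothetical writable folder for the Theta cache; it must exist before the run.
CACHE_DIR=./theta_cache

# Assumption: -p and --disk_cache_folder take space-separated values.
CMD="cpp_client -d docword.kos.txt -v vocab.kos.txt -p $CORES --no_scores --reuse_theta --disk_cache_folder $CACHE_DIR"

# Print the assembled command for inspection.
echo "$CMD"

# Uncomment to run it once cpp_client, the data files and the cache folder are in place.
# $CMD
```

--reuse_theta is included here because --disk_cache_folder only applies together with it.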