Before installing PyBossa you will need to configure some other
applications and libraries on your system. This page is a step-by-step
guide to installing all the packages and libraries that PyBossa requires,
using the latest Ubuntu Server Long Term Support version available at
the moment.

PostgreSQL is a powerful, open source object-relational database system.
It has more than 15 years of active development and a proven architecture that
has earned it a strong reputation for reliability, data integrity, and correctness.

PyBossa uses PostgreSQL as the main database for storing all its data, so
you need to install it first.
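
As a minimal sketch of the installation on Ubuntu, the steps could look like
the following. It assumes the stock postgresql and libpq-dev packages and a
hypothetical role and database both named pybossa; none of these names come
from this guide, so adapt them to your setup. The commands are wrapped in a
function so that nothing runs until you call it:

```shell
# Sketch only: install PostgreSQL and create a role and database for PyBossa.
# The role and database names default to "pybossa" (an assumption, not a requirement).
install_postgresql_for_pybossa() {
    db_user="${1:-pybossa}"
    db_name="${2:-pybossa}"

    # Server plus the client headers needed to build Python database drivers.
    sudo apt-get install postgresql libpq-dev || return 1

    # Create a role that may create databases (-d); -P prompts for its password.
    sudo -u postgres createuser -d -P "$db_user" || return 1

    # Create the database, owned by the new role.
    sudo -u postgres createdb -O "$db_user" "$db_name"
}
```

Calling install_postgresql_for_pybossa with no arguments uses the default
names; pass a role name and a database name to override them.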

We recommend installing PyBossa in a virtualenv, as it creates an
isolated Python environment, helping you to manage different dependencies and
versions without having to deal with root permissions on your server machine.

virtualenv creates an environment that has its own installation directories
and that doesn’t share libraries with other virtualenv environments (and
optionally doesn’t access the globally installed libraries either).

If you have root privileges you can also install the software at the system
level. However, this may lead to broken dependencies in the OS for all your
Python packages, so if possible, avoid it and use the virtualenv
solution.

Then, you are ready to download the code and install the required libraries for
running PyBossa.

Note

We recommend installing the required libraries in a virtual
environment with the command virtualenv (you can install the package
python-virtualenv). This lets you keep all the libraries for PyBossa
in one folder of your choice, so cleaning up the installation is as
simple as deleting that folder, without affecting your system.

If you decide to use a virtualenv, follow these steps (lines starting
with # are comments):
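
A minimal sketch of these steps is shown below. It assumes that you run it
from the root of the downloaded PyBossa source tree, that the tree contains a
requirements.txt file, and that the environment lives in a folder named env
(a hypothetical name). The steps are wrapped in a function so that nothing
runs until you call it:

```shell
# Sketch only: create a virtualenv and install the PyBossa requirements into it.
# Run from the root of the downloaded PyBossa source tree.
setup_pybossa_virtualenv() {
    # folder that will hold the environment (the default name is an assumption)
    env_dir="${1:-env}"

    # create the isolated environment
    virtualenv "$env_dir" || return 1

    # activate it for the current shell session
    . "$env_dir/bin/activate" || return 1

    # install the required libraries inside the environment
    pip install -r requirements.txt
}
```

Because the function activates the environment in the current shell, the
environment stays active after it returns; run deactivate to leave it.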

Vim is a very popular text editor on GNU/Linux systems, but it may be
difficult if you have never used it before. If you want a much simpler
editor for the configuration files, you can use the GNU Nano editor.

Since version v0.2.1, PyBossa uses Redis not only for caching objects and
speeding up the site, but also for rate-limiting API requests.

The latest Redis can be installed by downloading the package directly from
the official Redis site. Since Ubuntu 14.04 you can also use the distribution
package:

sudo apt-get install redis-server

Once you have downloaded and installed it, you will need to run two
instances:

Redis-server: as a master node, accepting read and write operations.

Redis-sentinel: as a sentinel node, to configure the master and slave Redis
nodes.

If you have installed the server via your distribution's package system, the
server will already be running. If not, check the official Redis
documentation to configure and run it. The default values should
be fine.

Note

Please make sure that you are running Redis version 2.6 or later.
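
One way to check this is to compare the version reported by
redis-server --version against 2.6. The helpers below are a sketch, not part
of PyBossa: the function names are made up for illustration, and the
comparison relies on the version sort (-V) of GNU sort, which is available
on Ubuntu.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on the version sort (-V) of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract "X.Y.Z" from text such as "Redis server v=2.8.4 sha=... bits=64".
parse_redis_version() {
    printf '%s\n' "$1" | sed -n 's/.*v=\([0-9][0-9.]*\).*/\1/p'
}

# Check the locally installed server, if any.
check_redis_version() {
    command -v redis-server >/dev/null 2>&1 || { echo "redis-server not found"; return 1; }
    installed="$(parse_redis_version "$(redis-server --version)")"
    if version_ge "$installed" 2.6; then
        echo "Redis $installed is supported"
    else
        echo "Redis $installed is too old, 2.6 or later is required"
        return 1
    fi
}
```

Running check_redis_version prints the detected version and returns a
non-zero status when Redis is missing or older than 2.6.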

Note

If you have installed the software from the source code, check the
contrib folder: it contains a specific folder for Redis with init.d start
scripts. You only have to copy the script to /etc/init.d/ and adapt it to
your needs.

Redis can be run in sentinel mode with the --sentinel argument, or through
its own command, redis-sentinel. This varies with your distribution and
version of Redis, so check its help page to learn how to run it.

In any case, you will need to run a sentinel node, as PyBossa uses it to
load-balance the queries and to configure the master and slaves
automatically.

In order to run PyBossa, you will need first to configure a Sentinel node.
Create a config file named sentinel.conf with something like this:
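
As a minimal sketch, the function below writes such a file. The directives
are standard Redis Sentinel settings, but the master name mymaster, the
address 127.0.0.1:6379, the quorum of 1 and the timeouts are illustrative
values chosen for a single-machine setup, so adjust them to your deployment:

```shell
# Sketch only: write a minimal Sentinel configuration to the given path.
# "mymaster", 127.0.0.1:6379 and the timeouts are illustrative values.
write_sentinel_conf() {
    conf_path="${1:-sentinel.conf}"
    cat > "$conf_path" <<'EOF'
# Monitor the master at 127.0.0.1:6379; a quorum of 1 sentinel is enough
# to mark it as failing (suitable for a single-sentinel setup only).
sentinel monitor mymaster 127.0.0.1 6379 1
# Consider the master down after 60 seconds without a valid reply.
sentinel down-after-milliseconds mymaster 60000
# Allow up to 3 minutes for a failover to complete.
sentinel failover-timeout mymaster 180000
# Reconfigure one replica at a time after a failover.
sentinel parallel-syncs mymaster 1
EOF
}
```

Calling write_sentinel_conf sentinel.conf creates the file in the current
folder, ready to be passed to redis-server in sentinel mode.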

In the contrib folder you will find a file named sentinel.conf that should
be enough to run the sentinel node. Thus, for running it:

redis-server contrib/sentinel.conf --sentinel

PyBossa comes with a cache system that is enabled by default. PyBossa uses
a Redis server to cache some objects like projects, statistics, etc. The
system uses the Sentinel feature of Redis, so you can have several
master/slave nodes configured with Sentinel, and your PyBossa server will use
them “automagically”.

Once you have started your master Redis-server to accept connections,
Sentinel will manage it and its slaves. If you add a slave, Sentinel will
find it and start using it for load-balancing queries in the PyBossa cache
system.

If you want to disable the cache, you can do so with an environment variable:

export PYBOSSA_REDIS_CACHE_DISABLED='1'

Then start the server, and nothing will be cached.

Note

Important: We highly recommend that you do not disable the cache, as it
boosts the performance of the server by caching SQL queries as well as page
views. If you have lots of projects with hundreds of tasks, you should keep
it enabled.

Note

Important: The Redis package in your Linux distribution may be outdated.
If this is the case, you will need to install Redis by hand, which is easy
and well documented on the official Redis site.

PyBossa uses the Python libraries RQ and RQScheduler to allow slow or
computationally-heavy tasks to be run in the background in an asynchronous way.

Some tasks run on a periodic schedule, like refreshing the cache and sending
notifications to users. Others, like sending mails, are created in real time
in response to events inside the PyBossa server, such as a user requesting a
password recovery email.

To allow all this, you will need two additional Python processes running in
the background: the worker and the scheduler. The scheduler creates the
periodic tasks, while other tasks are created dynamically. The worker
executes all of them.

To run the scheduler, just run the following command in a console:

rqscheduler --host IP-of-your-redis-master-node

Similarly, to get the tasks done by the worker, run:

python app_context_rqworker.py scheduled_jobs super high medium low

We also recommend using supervisor to run these processes more easily, with
a single command.
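
As a sketch of what this could look like, the function below writes a
supervisor configuration containing one program section per process. The
section names, the folder layout and the assumption that rqscheduler and
python live in the bin folder of your virtualenv are illustrative, not taken
from PyBossa:

```shell
# Sketch only: write a supervisor configuration for the scheduler and the worker.
# Arguments: PyBossa source folder, virtualenv folder, Redis master host, output file.
write_pybossa_supervisor_conf() {
    app_dir="$1"
    env_dir="$2"
    redis_host="$3"
    conf_path="$4"
    cat > "$conf_path" <<EOF
[program:rq-scheduler]
command=$env_dir/bin/rqscheduler --host $redis_host
directory=$app_dir
autostart=true
autorestart=true

[program:rq-worker]
command=$env_dir/bin/python app_context_rqworker.py scheduled_jobs super high medium low
directory=$app_dir
autostart=true
autorestart=true
EOF
}
```

On Ubuntu the generated file would typically be placed in
/etc/supervisor/conf.d/ and loaded with sudo supervisorctl reread followed by
sudo supervisorctl update.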

Note

While running the scheduler is optional (without it you will not get the
performance improvements of the periodic tasks, but you may not need them),
running the worker is mandatory for the normal functioning of the
PyBossa server, so make sure you run the command for it.

Sometimes, the PyBossa developers add a new column or table to the PyBossa
server, forcing you to carry out a migration of the database. PyBossa uses
Alembic for performing the migrations, so if your production server
needs to upgrade the DB structure to a new version, all you have to do is
apply the pending Alembic migrations.
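
A minimal sketch of that step follows. It assumes that you run it from the
PyBossa source folder, with the virtualenv active (if you use one) and an
alembic.ini file that points at your database; the function name is
hypothetical:

```shell
# Sketch only: bring the database schema up to the latest revision.
# Assumes it is run from the PyBossa source folder, with the virtualenv active
# and an alembic.ini that points at your database.
upgrade_pybossa_db() {
    if [ ! -f alembic.ini ]; then
        echo "alembic.ini not found in $(pwd)" >&2
        return 1
    fi
    # Apply every pending migration, in order.
    alembic upgrade head
}
```

It is a good idea to back up the database before calling upgrade_pybossa_db
on a production server.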

In versions prior to v0.2.3, HTML was supported as the default option for the
‘long_description’ field in projects. In new versions of PyBossa, Markdown has been
adopted as the default option. However, you can use HTML instead of Markdown
by modifying the default PyBossa theme or by using your own theme forked from
the default one.

If you have been using PyBossa for a while, you may have projects in your
database whose ‘long_description’ is in HTML format. Hence, if you are using the
default theme for PyBossa, you will no longer see them rendered as HTML and may
have some issues.

To avoid this, you can run a simple script that converts the
‘long_description’ field of every project in the DB from HTML to Markdown,
just by running the following commands:

The first command will install a Python package that will handle the HTML to
Markdown conversion, while the second one will convert your DB entries.

Note

As always, if you are using the virtualenv be sure to activate it before
running the pip install command.

Note

The latest version of PyBossa requires PostgreSQL >= 9.3 because it uses materialized
views for the dashboard. This feature is only available from PostgreSQL 9.3 onwards, so
please upgrade the DB as soon as possible. For more information about upgrading the
PostgreSQL database check this page.
