Ability to extend by writing and attaching scripts to the Python Django Web framework

Lightweight with simple deployment and installation

HOW IT WORKS

The logic behind Techu is pretty straightforward, as you can see in the flow diagram to the right.
Application code sends an HTTP request that corresponds to a specific action.
There are two major groups of operations: indexing and searching.
Indexing involves inserting or deleting a document from the index, or modifying attributes or text fields for a document.
Searching, on the other hand, involves performing full-text searches and retrieving highlighted excerpts.

In the current beta version, most request data are passed
via a single data parameter with a JSON-formatted value. There are some exceptions,
usually requests that handle Sphinx configurations,
but in the future all requests will follow this protocol for simplicity and uniformity.
Oh, and yes, now you can keep your Sphinx configurations in order by storing them in Techu's MySQL DB schema
(although this feature can also be bypassed). On each regeneration command, Techu will
automatically restart the corresponding searchd.
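The single data parameter described above can be sketched in Python as follows. This is a minimal sketch, not Techu's client code: build_techu_request is a hypothetical helper name, and it assumes the endpoint accepts an ordinary form-encoded POST body, as a curl --data-urlencode call would produce.

```python
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_techu_request(base_url, path, payload):
    """Build a POST request that carries `payload` as JSON in one `data` field.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # The whole payload becomes the JSON-formatted value of a single parameter.
    body = urlencode({"data": json.dumps(payload)}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to actually send it. Building the request separately from sending it makes the encoding easy to inspect.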

After the application dispatches a request, Nginx receives it and the Django Python Web framework processes the request.
As you can see in the diagram, indexing operations can be optionally queued for asynchronous execution.
In that case, a Redis key is returned as a response, and the request
is later converted to SphinxQL and sent to Sphinx by a script referred to as the applier
(we probably should find a better name for this).
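The queue-then-apply flow can be illustrated with a small sketch. It is not the actual applier: a deque and a dict stand in for the Redis list and keys, and the key format, the enqueue and apply_pending names, and the request layout are all assumptions made for illustration.

```python
import json
import uuid
from collections import deque

# In-memory stand-ins for Redis: a list used as a queue, plus keyed payloads.
queue = deque()
payloads = {}


def enqueue(action, index, document):
    """Store the request under a fresh key, queue the key, and return the key."""
    key = "techu:queue:" + uuid.uuid4().hex  # hypothetical key format
    payloads[key] = json.dumps(
        {"action": action, "index": index, "document": document}
    )
    queue.append(key)
    return key  # this is what the caller would receive as the response


def apply_pending(execute):
    """Pop queued keys in FIFO order and hand each stored request to `execute`."""
    applied = 0
    while queue:
        key = queue.popleft()
        request = json.loads(payloads.pop(key))
        execute(request)  # e.g. convert to SphinxQL and send to searchd
        applied += 1
    return applied
```

Here execute is any callable that takes the decoded request. Keeping it as a parameter separates the queue handling from the SphinxQL conversion.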

If no queueing is required, then the data are converted to a SphinxQL
statement (or statements, if you are batch inserting documents).
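As a rough illustration of that conversion, the sketch below turns a list of documents into one multi-row SphinxQL INSERT. The function name build_insert is hypothetical, and the %s placeholders assume a MySQLdb-style client that substitutes parameters on the client side. Techu's own conversion may differ.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name):
    """Reject index or field names that are not plain identifiers."""
    if not _IDENTIFIER.match(name):
        raise ValueError("unsafe identifier: %r" % (name,))
    return name


def build_insert(index, documents):
    """Return (sql, params) for one multi-row INSERT covering all documents."""
    if not documents:
        raise ValueError("at least one document is required")
    columns = sorted(documents[0])
    for doc in documents:
        if sorted(doc) != columns:
            raise ValueError("all documents must share the same fields")
    # One "(%s, %s, ...)" group per document; values travel separately.
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        _check_identifier(index),
        ", ".join(_check_identifier(col) for col in columns),
        ", ".join([row] * len(documents)),
    )
    params = [doc[col] for doc in documents for col in columns]
    return sql, params
```

Passing the values as parameters rather than formatting them into the statement leaves quoting to the database client.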
For a searching operation, either full-text search or highlighted excerpts,
the response can either originate directly from Redis (cache) or, if there is no cache entry, the attribute filters and the query
are converted to SphinxQL and the results are retrieved from Sphinx directly.
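The cache-first lookup can be sketched as below. This is an assumption-laden illustration: a dict stands in for Redis, the key format and function names are hypothetical, and run_query represents whatever sends the statement to searchd. Only equality filters are handled.

```python
import hashlib
import json
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
cache = {}  # stand-in for Redis


def cache_key(index, query, filters):
    """Derive a stable key from the search parameters."""
    raw = json.dumps(
        {"index": index, "query": query, "filters": filters}, sort_keys=True
    )
    return "techu:search:" + hashlib.sha1(raw.encode("utf-8")).hexdigest()


def build_select(index, query, filters):
    """Return (sql, params) for a full-text SELECT with attribute filters."""
    for name in [index] + list(filters):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    clauses, params = ["MATCH(%s)"], [query]
    for attr in sorted(filters):
        clauses.append("{} = %s".format(attr))
        params.append(filters[attr])
    sql = "SELECT * FROM {} WHERE {}".format(index, " AND ".join(clauses))
    return sql, params


def search(index, query, filters, run_query):
    """Serve from the cache when possible; otherwise query and store the rows."""
    key = cache_key(index, query, filters)
    if key in cache:
        return json.loads(cache[key])
    rows = run_query(*build_select(index, query, filters))
    cache[key] = json.dumps(rows)
    return rows
```

A second call with the same index, query, and filters is answered from the cache without calling run_query again.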

WHY REDIS?

Key-value storage, optionally persistent, with a very large value length limit (512 MB), capable of storing a lot of text.
Redis list and hash structures are key components of the caching and queueing sub-systems.

WHY SPHINXQL?

It is faster than the native Sphinx API.

WHY NGINX?

A fast web server that ensures high concurrency and low latency.

WHY THESE COMPONENTS OVERALL?

Every component is well-established software that excels in its area, and all of them are commonly found in most stacks.
We wouldn't like to reinvent the wheel, plus there is no need for you to learn or set up some exotic configuration!

INSTALLATION

UBUNTU PACKAGES

1. apt-get install python-setuptools build-essential
2. apt-get install mysql-server mysql-client
3. apt-get install redis-server
4. apt-get install nginx
5. apt-get install python-mysqldb python-flup
6. apt-get install git

PYTHON PACKAGES (REQUIRED)

1. easy_install redis
2. easy_install django
3. easy_install django_graceful

PYTHON PACKAGES (OPTIONAL)

1. easy_install hiredis
2. easy_install beautifulsoup4

SPHINX

1. wget http://sphinxsearch.com/files/sphinx-2.1.1-beta.tar.gz
2. tar -zxvf sphinx-2.1.1-beta.tar.gz
3. cd sphinx-2.1.1-beta
4. ./configure && make && make install

CONFIGURATION

1. git clone https://github.com/georgepsarakis/techu-search-server.git
2. vim /etc/nginx/sites-available/techu

Add the domain techu (or techu.local)
to your /etc/hosts file, pointing to your server's internal IP rather than localhost,
so that Techu can connect to Sphinx searchd with the MySQL protocol on a specific port.
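A quick way to verify the hosts entry is to resolve the name and confirm that it does not point at the loopback interface. The sketch below uses only the Python standard library. The function names are hypothetical and the default hostname techu is taken from the step above.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if `address` (an IP string) is a loopback address."""
    return ipaddress.ip_address(address).is_loopback


def check_hostname(hostname="techu"):
    """Resolve `hostname`; return (address, ok), where ok means 'not loopback'."""
    try:
        address = socket.gethostbyname(hostname)
    except socket.gaierror:
        # The name does not resolve at all, so the hosts entry is missing.
        return None, False
    return address, not is_loopback(address)
```

Calling check_hostname("techu") on the server should return the internal IP together with True once /etc/hosts is set up as described.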

curl -XPOST 'http://techu:81/indexer/insert/28/' --data-urlencode data='{
"body": "I have in my Symfony 2.1 RC app a simple Comment model (using Doctrine 2). Every comment has a user and a message.\nCurrently, the CommentBundle manages comments on articles. I'\''d like it to be more generic to be able to comment any kind of entity without copying code across different bundles dedicated to comments...\nFor this to work, I also need a way to reference any entity from the comment one. I think having two fields entity_type and entity_id can be a nice solution. However, I can'\''t get the object from these without mapping entity_type to classes manually and using the find method.\nSo how do I reference an entity from a comment ? And how can I create generic behavior working on several entities ?",
"user_id": 893390,
"title": "Generic comment system in Symfony2",
"last_activity_date": 1368868178,
"creation_date": 1346167729,
"score": 1,
"is_answer": 0,
"id": 12162609
}'