A Groonga package provides a C library (libgroonga) and a command line tool (groonga). This tutorial explains how to use the command line tool, which you can use to create and operate databases, start a server, connect to a server, and so on.

The first step to using Groonga is to create a new database. The following shows how to do it.

Form:

groonga -n DB_PATH

The -n option tells groonga to create a new database, and DB_PATH specifies the path of the new database. A database actually consists of a series of files: DB_PATH names the file that serves as the entry point to the new database and is also used as the path prefix for the other files. Note that database creation fails if DB_PATH points to an existing file, with an error such as db open failed (DB_PATH): syscall error 'DB_PATH' (File exists). The way to operate an existing database is described in the next chapter.

This command creates a new database and then enters interactive mode, in which Groonga prompts you to enter commands for operating that database. You can terminate this mode with Ctrl-d.

Execution example:

% groonga -n /tmp/groonga-databases/introduction.db

After this database creation, you can find a series of files in /tmp/groonga-databases.

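To operate an existing database, run groonga without the -n option, in the form groonga DB_PATH [COMMAND]. DB_PATH specifies the path of the target database. This command fails if the specified database does not exist.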

If COMMAND is specified, Groonga executes COMMAND and returns the result. Otherwise, Groonga starts in interactive mode that reads commands from the standard input and executes them one by one. This tutorial focuses on the interactive mode.

In most cases, a table has a primary key which must be specified with its data type and index type.

There are various data types such as integers, strings, etc. See also Data types for more details. The index type determines the search performance and the availability of prefix searches. The details will be described later.

Let's create a table. The following example creates a table with a primary key. The name parameter specifies the name of the table. The flags parameter specifies the index type for the primary key. The key_type parameter specifies the data type of the primary key.

When only a table name is specified with a table parameter, a select command returns up to the first 10 records in the table. [0] in the result shows the number of records in the table. The next array is a list of columns. ["_id","UInt32"] is a column of UInt32, named _id. ["_key","ShortText"] is a column of ShortText, named _key.

The above two columns, _id and _key, are required columns. The _id column stores IDs that are automatically allocated by Groonga. The _key column is associated with the primary key. You are not allowed to rename these columns.
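The relationship between _id and _key can be pictured with a small Python model. This is only a sketch, not how Groonga stores tables: the Table class and its methods are hypothetical, the second URL is made up for illustration, and the assumption that IDs start at 1 and grow by one per new key is stated in the code.

```python
class Table:
    """Toy table with a primary key; not Groonga's implementation."""

    def __init__(self):
        self._ids = {}   # maps _key -> _id
        self._keys = []  # index i holds the key of the record with _id == i + 1

    def add(self, key):
        """Return the _id for key, allocating the next ID only for a new key."""
        if key not in self._ids:
            self._keys.append(key)
            self._ids[key] = len(self._keys)  # assumption: IDs start at 1
        return self._ids[key]

    def records(self):
        """Return [_id, _key] pairs in ID order."""
        return [[i + 1, key] for i, key in enumerate(self._keys)]


site = Table()
site.add("http://example.org/")
site.add("http://example.net/")
site.add("http://example.org/")  # existing key: no new ID is allocated
print(site.records())
```

The caller never chooses an ID. It only supplies the key, which is why the two columns cannot be renamed or dropped in this model.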

Let's add a column. The following example adds a column to the Site table. The table parameter specifies the target table. The name parameter specifies the name of the column. The type parameter specifies the data type of the column.

Next, let's get a record having a specified key. The following example gets the record whose primary key is "http://example.org/". More precisely, the query parameter specifies a record whose _key column stores "http://example.org/".

Groonga uses an inverted index to provide fast full text search. So, the first step is to create a lexicon table which stores an inverted index, also known as postings lists. The primary key of this table is associated with a vocabulary made up of index terms and each record stores postings lists for one index term.

The following shows a command which creates a lexicon table named Terms. The data type of its primary key is ShortText.

The table_create command takes many parameters but you don't need to understand all of them. Please skip the next paragraph if you are not interested in how it works.

The TABLE_PAT_KEY flag tells Groonga to store index terms in a patricia trie. The default_tokenizer parameter specifies the method for tokenizing text. This example uses TokenBigram, a tokenizer of the kind generally called N-gram.
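To give a feel for N-gram tokenization with N = 2, the following Python sketch splits a string into overlapping two-character tokens and records where each one starts. It is plain character bigram splitting under simplifying assumptions and does not reproduce any special cases of Groonga's TokenBigram; the bigrams function is a hypothetical helper.

```python
def bigrams(text):
    """Split text into overlapping 2-character tokens, each with its start position."""
    if len(text) < 2:
        # A text shorter than two characters yields itself as the only token.
        return [(0, text)] if text else []
    return [(i, text[i:i + 2]) for i in range(len(text) - 1)]


print(bigrams("groonga"))
```

Because every pair of adjacent characters becomes an index term, a search can find a substring without knowing where word boundaries are.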

When you create an index column with column_create, the table parameter specifies the index table and the name parameter specifies the index column. The type parameter specifies the target table and the source parameter specifies the target column. The COLUMN_INDEX flag tells Groonga to create an index column, and the WITH_POSITION flag tells it to create a full inverted index, which contains the positions of each index term. This combination, COLUMN_INDEX|WITH_POSITION, is recommended for general use.
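The value of storing positions is easier to see in a small model. The Python sketch below builds a full inverted index that maps each term to a postings list of (record ID, position) pairs, then uses the positions to find terms that appear next to each other. It is an illustration only: terms are split on whitespace for brevity, the sample titles are invented, and build_index and search_phrase are hypothetical helpers rather than anything Groonga provides.

```python
from collections import defaultdict


def build_index(records):
    """Map each term to a postings list of (record_id, position) pairs."""
    index = defaultdict(list)
    for record_id, text in records.items():
        for position, term in enumerate(text.split()):
            index[term].append((record_id, position))
    return index


def search_phrase(index, phrase):
    """Return IDs of records in which the terms of phrase appear consecutively."""
    terms = phrase.split()
    if not terms:
        return []
    hits = set()
    for record_id, start in index.get(terms[0], []):
        # Each following term must sit exactly k positions after the first one.
        if all((record_id, start + k) in index.get(term, [])
               for k, term in enumerate(terms[1:], start=1)):
            hits.add(record_id)
    return sorted(hits)


titles = {1: "this is test", 2: "test this record", 3: "is this it"}
index = build_index(titles)
print(index["this"])
print(search_phrase(index, "this is"))
```

Without positions the index could only say that records 1 and 3 contain both "this" and "is". With positions it can tell that only record 1 contains them in that order.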

Note

You can create a lexicon table and index columns before/during/after loading records. If a target column already has records, Groonga creates an inverted index in a static manner. In contrast, if you load records into an already indexed column, Groonga updates the inverted index in a dynamic manner.

A query for full text search is specified with a query parameter. The following example searches for records whose "title" column contains "this". The '@' operator requests a full text search. Note that a lower case query matches upper case and capitalized terms in a record if NormalizerAuto was specified when creating the lexicon table.
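The effect of a normalizer on matching can be sketched in a few lines of Python. Lowercasing stands in here for normalization as a whole, which is an assumption made for brevity: NormalizerAuto is not claimed to work this way internally, and normalize and matches are hypothetical helpers with invented sample sentences.

```python
def normalize(text):
    """Stand-in normalizer: lowercasing only (an assumption, not NormalizerAuto)."""
    return text.lower()


def matches(query, text):
    """True if the normalized query equals one of the normalized words in text."""
    return normalize(query) in [normalize(word) for word in text.split()]


print(matches("this", "This is a pen"))
print(matches("this", "THIS IS A PEN"))
print(matches("this", "That is a pen"))
```

The key point is that the same normalization is applied to both the indexed text and the query, so differences in case disappear before any comparison takes place.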

A select command returns a part of its search result if offset and/or limit parameters are specified. These parameters are useful to paginate a search result, a widely-used interface which shows a search result on a page by page basis.

An offset parameter specifies the starting point and a limit parameter specifies the maximum number of records to be returned. If you need the first record in a search result, the offset parameter must be 0 or omitted.
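The way offset and limit carve a page out of a result can be sketched in Python with list slicing. This models only non-negative values, and the default limit of 10 mirrors the default number of records mentioned earlier in this tutorial; paginate and the sample record names are hypothetical.

```python
def paginate(records, offset=0, limit=10):
    """Return at most `limit` records, starting at zero-based index `offset`."""
    if offset < 0 or limit < 0:
        raise ValueError("this sketch only handles non-negative offset and limit")
    return records[offset:offset + limit]


results = [f"record-{i}" for i in range(1, 26)]  # 25 hypothetical hits
page_size = 10
for page in range(3):
    print(paginate(results, offset=page * page_size, limit=page_size))
```

Page n of a paginated interface corresponds to offset n * page_size with limit page_size, and the last page may simply contain fewer records than the limit.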

A select command sorts its result when used with a sort_keys parameter.

A sort_keys parameter specifies a column as the sorting criterion. A search result is arranged in ascending order of the column values. If you want to sort a search result in reverse order, add a leading hyphen ('-') to the column name in the parameter.

The following example shows records in the Site table in reverse order.

If you want to specify more than one column, separate the column names with commas (','). In that case, a search result is sorted by the values in the first column, and records that have the same value in the first column are then sorted by the values in the second column.
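The ordering rules for multiple sort keys, including the leading hyphen for reverse order, can be modeled in Python as follows. This is a sketch of the semantics described above and not Groonga's sorting code; sort_records and the sample records are hypothetical.

```python
def sort_records(records, sort_keys):
    """Sort dict records by a comma-separated key list; a '-' prefix means descending."""
    result = list(records)
    # Apply the keys from last to first. Python's sort is stable, so the
    # first key ends up deciding the order and later keys only break ties.
    for key in reversed([k.strip() for k in sort_keys.split(",")]):
        descending = key.startswith("-")
        column = key[1:] if descending else key
        result.sort(key=lambda record: record[column], reverse=descending)
    return result


records = [
    {"_id": 1, "title": "b"},
    {"_id": 2, "title": "a"},
    {"_id": 3, "title": "b"},
]
print([r["_id"] for r in sort_records(records, "-_id")])
print([r["_id"] for r in sort_records(records, "title,-_id")])
```

In the second call the records are ordered by title first, and the two records sharing the title "b" are then ordered by descending _id.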