By default, "make install" installs all files under "/usr/local" ("/usr/local/bin", "/usr/local/lib", etc.).
To install under a different prefix, pass "--prefix" to "configure", for instance "--prefix=$HOME":

$ ./configure --prefix=$HOME

Running

Database file generation

Prepare a database file by concatenating the gene expression files in CM format into a single file.
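Since the database file is described as a plain concatenation of CM files, one way to build it is with the standard "cat" utility. The following is a minimal sketch, not a command shipped with CELLBLAST: the function name "build_db" and the file names in the usage line are hypothetical, and it assumes the input files are already valid CM format.

```shell
# build_db OUTPUT INPUT... (hypothetical helper, not part of CELLBLAST)
# Concatenates the given CM-format files, in the order given, into OUTPUT.
build_db() {
    if [ "$#" -lt 2 ]; then
        echo "usage: build_db OUTPUT INPUT..." >&2
        return 1
    fi
    out=$1
    shift
    # OUTPUT must not be one of the INPUT files.
    cat -- "$@" > "$out"
}
```

For example, "build_db db.bin sample1.CM sample2.CM" would write both (hypothetical) sample files into "db.bin".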

Prepare a query file in CM format and run the CELLBLAST profile matcher as follows:

$ ./runGerMatcher db.bin query.CM > result.txt

Example Usage

The following is an example of using the CELLBLAST profile matcher.
The database file "HiSeqHsapiens.bin" and the query file "query.GSM1901473.TF_activity_protein_binding.CM" can be downloaded from CELLBLAST_Database. The query file contains the genes annotated with "MF: transcription factor activity, protein binding". Profile matching is performed using only the genes present in both the database and the query.

The search result (using CELLBLAST database version 1.0.1 in August 2018) is shown in the following table consisting of five columns:

· 1st column: Sample ID. Sample accession numbers (GSM) of the NCBI Gene Expression Omnibus (GEO) are used in the CELLBLAST database files.
· 2nd column: P-value of Fisher's Z-transformed rank correlation coefficient. The details of the derivation of the p-values are described in Document manuals.
· 3rd column: Spearman's rank correlation coefficient. The details of the derivation of the correlation coefficients are also described in Document manuals.
· 4th column: The number of genes used for the profile matching.
· 5th column: Header information of the CM format in the database file. In the CELLBLAST database files, this consists of the GEO sample accession number (GSM), the GEO platform accession number (GPL), the organism, and the SHOGoiN Cell IDs of the sample, delimited by "|".
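
The five-column layout above lends itself to simple post-processing with "awk". The following is a minimal sketch under stated assumptions: that the columns in "result.txt" are tab-separated, and that the "|"-delimited header lists its fields in the order given above (GSM, GPL, organism, Cell IDs). Neither assumption is confirmed here, so check them against your own output. The function name "filter_hits" is hypothetical.

```shell
# filter_hits RESULT_FILE MAX_PVALUE (hypothetical helper, not part of CELLBLAST)
# Prints sample ID, p-value, correlation coefficient and organism for every
# row whose p-value (2nd column) is at most MAX_PVALUE.
# Assumes tab-separated columns and a "GSM|GPL|organism|Cell IDs" header.
filter_hits() {
    awk -F '\t' -v max_p="$2" '
        NF >= 5 && ($2 + 0) <= (max_p + 0) {
            split($5, header, "|")   # header[3] is assumed to be the organism
            print $1 "\t" $2 "\t" $3 "\t" header[3]
        }
    ' "$1"
}
```

For example, "filter_hits result.txt 0.01" would keep only the rows with a p-value of 0.01 or less.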