On Mon, Nov 18, 2013 at 4:23 PM, Eric Kuyt <eric.ku...@wur.nl> wrote:
> Hi Peter, it turns out we only have a workbench licence, the clc_assembler
> packaged with the workbench is called ./clc_assembler_ilo
>
> which has the man page below, do you think this is the same binary as the
> clc-assembly-cell assembler?
>
> I will just try to link clc_assembler_ilo to my path and see what it does :)

Try creating a symlink named clc_assembler (or hacking my wrapper
to look for clc_assembler_ilo instead of clc_assembler), but yes, it
looks like the same tool under a different name (see below).

Interestingly yours is newer than ours, perhaps we need an update...

What about the clc_mapper and clc_cas_to_sam binaries which
are used in my clc_mapper.xml wrapper? Are they there under
different names too?

[I have no idea if the CLCbio workbench licence is intended to allow
you to run the clc_assembler at the command line as well - you may
need to double check that to be safe.]

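For the symlink route, something along these lines might do as a starting point. This is only a rough sketch: the helper name link_clc_tool and the paths in the example call are my own placeholders, not anything shipped with the workbench or with my wrapper, and the target directory needs to be on the PATH that Galaxy sees.

```shell
# Hypothetical helper (not part of any CLC or Galaxy tool):
#   link_clc_tool SRC_DIR SRC_NAME DEST_DIR DEST_NAME
# Exposes SRC_DIR/SRC_NAME under the name DEST_NAME inside DEST_DIR.
link_clc_tool() {
    src="$1/$2"
    dest="$3/$4"
    # Refuse to link something that is missing or not executable.
    if [ ! -x "$src" ]; then
        echo "not found or not executable: $src" >&2
        return 1
    fi
    # Create the destination directory if needed, then (re)create the symlink.
    mkdir -p "$3" && ln -sf "$src" "$dest"
}

# Example call with placeholder paths, adjust to your installation:
# link_clc_tool /path/to/workbench clc_assembler_ilo "$HOME/bin" clc_assembler
```

The same call could be repeated for clc_mapper and clc_cas_to_sam if they turn out to exist under other names too.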
Regards,
Peter
--
$ /mnt/apps/clcBio/clc-assembly-cell-4.1.0-linux_64/clc_assembler
No read files
usage: clc_assembler [options]
Assemble some reads and output contig sequences in fasta format.
Options:
-h / --help: Display this message
-q / --reads: The files following this option are read files. Fasta, fastq,
and sff formats are allowed. (may be used several times)
-i <file1> <file2> / --interleave <file1> <file2>: Interleave the sequences
in two files, alternating between the files when reading the
sequences. Only valid for read files. (may be used several times)
-o <file> / --output <file>: Give the output fasta file (required)
-f <file> / --feature_output <file>: Output scaffolding annotation in
GFF (default) or AGP format. The file suffix is used to determine the
output format. Use '.gff' for GFF format and '.agp' for AGP format.
-m <n> / --min-length <n>: Set the minimum contig length to output (default =
200)
-w <n> / --wordsize <n>: Set the word size for the de Bruijn graph (default
is automatic based on input data size)
-b <n> / --bubblesize <n>: Set the maximum bubble size for the de Bruijn graph
(default is 50)
--cpus <n>: Set the number of cpus to use.
-v / --verbose: Output various information while running.
-p <par> / --paired <par>: Set the paired read mode for the read files
following this option. (may be used several times)
par consists of four strings: <mode> [<dist_mode>] [<min_dist> <max_dist>]
mode is ff, fb, bf, bb and sets the relative orientation of read one and
two in a pair (f = forward, b = backward)
dist_mode is ss, se, es, ee and sets the place on read one and two to
measure the distance (s = start, e = end).
A typical use would be "-p fb ss 180 250" which means that the reads are
inverted and pointing towards each other. The distance includes both the
reads and the sequence between them. The distance may be between 180 and
250, both included.
It is also allowed to insert a "d" before the mode. This indicates that
the reads in the following file(s) should only be used for their paired end
information and not to build initial contigs. E.g. "-p d fb ss 180 250".
To explicitly say that the following reads are not paired, use "no" for
par, i.e. "-p no".
For paired end reads split in two files, use the -i option.
-e <file> / --estimatedistances <file> Estimate paired distances for all
paired reads and save the distance estimates in <file>. If it is not possible to
get an accurate distance estimate for a file, the original paired distance
is used.
-g <mode> / --fragmentmode <mode>: Set the mode for how reads are used to
create fragments. One mode is "ignore", which ignores the nucleotides
when building initial fragments. The other mode is "use", which uses
the nucleotides when building initial fragments. This is the default mode.
The mode applies to all read files following this option. The option may be
used repeatedly.
-n / --no-scaffolding: Pair info is used for contig creation, but no
scaffolding is performed.
Examples:
Assembly of a single file with reads:
clc_assembler -o contigs.fasta -q reads.fasta
Assembly of two interleaved files with paired end reads:
clc_assembler -o contigs.fasta -p fb ss 180 250 -q -i reads1.fq reads2.fq
Version: 4.10.86742