I noticed that for our new Illumina data (which generate Sanger format)
the FastQ Groomer output is identical to the Illumina FastQ input file.
I was hoping to go ahead and just use the raw FastQ files as input
(saving disk space) for computing quality statistics to look at box
plots, but the tool "Compute Quality Statistics"
appears to require that the data have been run through FastQ Groomer
first.
Is there a way to get around this, and is this a bug? I'm assuming this
is some sort of safety measure built into this tool?
-John

Thanks Ross, I don't see it under my local install. Are there any
pre-written scripts to integrate it with a local Galaxy instance?
I assume you are talking about this tool here:
http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
-John
________________________________________
To: John David Osborne
Cc: galaxy-user@bx.psu.edu
Subject: Re: [galaxy-user] FastQ Groomer and Compute Quality
Statistics
You can avoid the space/time overhead of grooming and get
comprehensive QC reports using the new wrapper for FastQC (under NGS:
QC) - it takes fastq of any flavour (and bam) groomed or not,
producing a superset of the compute quality stats output without the
need for an intermediate step. Highly recommended.
--
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;

Hi, John.
It's on main and test, i.e. the FastQC wrapper is distributed with the
current stable and central branches, so your local tool_conf.xml may be
out of date, since it's not automagically refreshed from the distro
.sample. If you do a diff of your local tool_conf.xml with the
current distributed sample, you should see the lines you need to add,
which point to rgenetics/rgFastQC.xml:
Thu,Jun 09 at 10:22am grep -i fastqc tool_conf.xml
<label text="FastQC: fastq/sam/bam" id="fastqcsambam"/>
<tool file="rgenetics/rgFastQC.xml"/>
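If a plain diff of the two files is too noisy, the comparison can be narrowed to just the tool entries. The following is only a rough sketch, not part of Galaxy: it assumes both files parse as XML with `<tool file="..."/>` elements, and the helper names, section names and the groomer path in the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def tool_files(xml_text):
    """Return the set of file= attributes of every <tool> element."""
    root = ET.fromstring(xml_text)
    return {t.get("file") for t in root.iter("tool") if t.get("file")}


def missing_tools(local_xml, sample_xml):
    """Tool files listed in the distributed sample but not in the local conf."""
    return sorted(tool_files(sample_xml) - tool_files(local_xml))


# Hypothetical, heavily shortened stand-ins for tool_conf.xml.sample
# and an out-of-date local tool_conf.xml.
SAMPLE = """<toolbox>
  <section name="NGS: QC and manipulation" id="ngs_qc">
    <label text="FastQC: fastq/sam/bam" id="fastqcsambam"/>
    <tool file="rgenetics/rgFastQC.xml"/>
    <tool file="fastq/fastq_groomer.xml"/>
  </section>
</toolbox>"""

LOCAL = """<toolbox>
  <section name="NGS: QC and manipulation" id="ngs_qc">
    <tool file="fastq/fastq_groomer.xml"/>
  </section>
</toolbox>"""

for path in missing_tools(LOCAL, SAMPLE):
    print("missing from local tool_conf.xml:", path)
```

For real files, read each one with `open(...).read()` and pass the text in. Note this only reports `<tool>` entries, so a `<label>` line like the one above still has to be copied over by hand.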
Like everything else, you'll want to install the jar locally so it can
be found by the cluster - the default location is
tool-data/shared/jars/FastQC so the tool can find the fastqc perl
script (yes, I know...but it's worth it!)
<command interpreter="python">
rgFastQC.py -i $input_file -d $html_file.files_path -o $html_file
-n "$out_prefix" -f $input_file.ext -e
${GALAXY_DATA_INDEX_DIR}/shared/jars/FastQC/fastqc
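Before restarting Galaxy it may be worth confirming that the fastqc script really sits where the -e argument above points. A minimal sketch, assuming the default location mentioned above relative to the Galaxy root; the function name and the example root path are hypothetical.

```python
import os


def check_fastqc(galaxy_root, rel="tool-data/shared/jars/FastQC/fastqc"):
    """Return (exists, executable) for the fastqc script under galaxy_root.

    The relative path is the default location mentioned above; adjust it
    if your install puts FastQC somewhere else.
    """
    path = os.path.join(galaxy_root, rel)
    exists = os.path.isfile(path)
    executable = exists and os.access(path, os.X_OK)
    return exists, executable


if __name__ == "__main__":
    # "/opt/galaxy" is a placeholder root, not a real default.
    exists, executable = check_fastqc("/opt/galaxy")
    print("fastqc found:", exists, "| executable:", executable)
```

On a cluster, the same check should be run from a compute node, since the path has to be visible there as well.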
I hope this helps?
--
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;

If you know your data is already in Sanger FASTQ format, you can say
this when uploading the data into Galaxy. Or, use the "pencil" icon to
edit the attributes and change the file type. This doesn't change the
file itself on disk.
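If you are not sure whether a file really is Sanger-encoded, a quick look at the quality lines can help. This is a heuristic sketch only, assuming plain four-line FASTQ records (no wrapped sequence lines); the function name is made up. Any quality character below ';' (ASCII 59) can only occur with the Sanger Phred+33 offset, while a file with no such character could be either encoding.

```python
def guess_quality_encoding(fastq_lines, max_records=10000):
    """Guess the FASTQ quality offset from the lowest quality character seen.

    Assumes four lines per record; the fourth line of each holds qualities.
    """
    lowest = None
    for i, line in enumerate(fastq_lines):
        if i // 4 >= max_records:
            break
        if i % 4 == 3:
            qual = line.rstrip("\r\n")
            if qual:
                m = min(ord(c) for c in qual)
                lowest = m if lowest is None else min(lowest, m)
    if lowest is None:
        return "unknown"
    if lowest < 59:
        # Characters this low do not exist in the +64 encodings.
        return "sanger"
    # High-quality Sanger data and +64 data overlap in this range.
    return "ambiguous"


if __name__ == "__main__":
    example = ["@read1\n", "ACGT\n", "+\n", "!5II\n"]
    print(guess_quality_encoding(example))
```

To run it on a real file, pass an open file handle, e.g. `guess_quality_encoding(open("reads.fastq"))`, where the file name is a placeholder.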
Peter

Hi guys,
We are trying to load Illumina data to our local Galaxy instance. The
files are between 700 MB and 2.2 GB. Files below 2 GB load in less than
5 minutes. Files larger than 2 GB don't upload at all. We installed
Galaxy locally because we thought loading files would be faster than on
the server version. Any suggestions for solving this problem would be
highly appreciated.
Tilahun