I recently received a sequencing run from a sequencing center. It is an Illumina paired-end MiSeq run, single-indexed. I usually get per-sample files post-demultiplexing, but this time I received the whole lane, with just four raw files, listed below:

I would usually just send this through split_libraries_fastq.py in QIIME, but I am not sure how to handle the indexing here. I would like to demultiplex these reads with minimal quality filtering. The ideal output would be two directories, forward and reverse, with one .fastq file per sample.

I have copied the output of head file.path for each file below, in case it is helpful. I realize the first two reads in each sequence file are probably junk.

You should ask your sequencing provider to do the right thing and reprocess the data so you get it in the format you are used to. It is bad customer service to dump non-demultiplexed data in customers' laps and expect them to demultiplex it themselves.

If you have no option but to do this yourself, then look at deML to get this done. sabre is another option.
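If you would rather roll your own, the core logic is small: read the R1, R2, and I1 files in lockstep and route each read pair by exact match of its index read against a barcode-to-sample table. A rough Python sketch, with no quality filtering; the barcode map, file paths, and output layout below are made up for illustration, so substitute your own index sheet:

```python
import gzip
import os

# Hypothetical barcode-to-sample map; replace with your own index sheet.
BARCODES = {"ACGTACGT": "sampleA", "TGCATGCA": "sampleB"}

def fastq_records(handle):
    """Yield (header, seq, plus, qual) tuples from a FASTQ handle."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        plus = handle.readline().rstrip()
        qual = handle.readline().rstrip()
        yield header, seq, plus, qual

def demultiplex(r1, r2, i1, writer):
    """Route each read pair by exact match of its index read.

    r1, r2, i1: open FASTQ handles in the same read order.
    writer(sample, rec1, rec2) is called once per pair; sample is
    None when the barcode is not in the table.
    """
    for rec1, rec2, idx in zip(fastq_records(r1), fastq_records(r2),
                               fastq_records(i1)):
        sample = BARCODES.get(idx[1])  # idx[1] is the barcode sequence
        writer(sample, rec1, rec2)

def run(r1_path, r2_path, i1_path, out_dir="demultiplexed"):
    """Open the three gzipped lane files and write per-sample FASTQs."""
    os.makedirs(os.path.join(out_dir, "forward"), exist_ok=True)
    os.makedirs(os.path.join(out_dir, "reverse"), exist_ok=True)
    handles = {}
    def writer(sample, rec1, rec2):
        name = sample or "unmatched"
        if name not in handles:
            handles[name] = (
                open(os.path.join(out_dir, "forward", name + ".fastq"), "w"),
                open(os.path.join(out_dir, "reverse", name + ".fastq"), "w"),
            )
        fwd, rev = handles[name]
        fwd.write("\n".join(rec1) + "\n")
        rev.write("\n".join(rec2) + "\n")
    with gzip.open(r1_path, "rt") as r1, gzip.open(r2_path, "rt") as r2, \
         gzip.open(i1_path, "rt") as i1:
        demultiplex(r1, r2, i1, writer)
    for fwd, rev in handles.values():
        fwd.close()
        rev.close()
```

Note that this keeps one forward/reverse file pair open per sample, so a run with many samples hits the same open-file-descriptor limit that deML warns about; real tools like deML and sabre also handle barcode mismatches, which this exact-match sketch does not.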

I'm with you on the "this is not cool" front. Unfortunately I have no recourse. If I ran the world, raw data would be passed along together with an extremely simple shell script (or similar) to demultiplex it into per-sample files, plus a readme on how to change the demultiplexing settings. Thanks for the links!

For some odd reason, it's actually easier to demultiplex with QIIME than to get QIIME to work with already-demultiplexed data. The QIIME users I know specifically ask for non-demultiplexed data. At least this was the case with QIIME 1.

deML -i index.txt -f Undetermined_S0_L001_R1_001.fastq.gz -r Undetermined_S0_L001_R2_001.fastq.gz -if1 Undetermined_S0_L001_I1_001.fastq.gz -o demultiplexed/
Conflicts for index1:
WARNING: deML has detected that you have 1024 open file descriptors out of a max. of 1024
therefore you have reached the maximum. If the information is correct, certain files might be empty
Either:
1) Use BAM as input/output
2) Check "ulimit -n" and put a higher number e.g. "ulimit -n 1024"
If you already at the limit, increase the system limits

I had to increase the limit by entering ulimit -n 4096. This was the maximum I could set on the login node of my cluster. I actually needed to go higher, which was possible once I set up an interactive session or ran the job via the batch system.
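For what it's worth, if you end up scripting any of this in Python, the soft open-file limit can also be raised from inside the process itself, up to whatever hard limit the node allows. This is the in-process equivalent of ulimit -n (the resource module is Unix-only):

```python
import resource

# Query the current soft and hard limits on open file descriptors,
# then raise the soft limit to the hard limit for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])
```

The hard limit itself still requires root (or your cluster's batch-system settings) to raise, which matches what I saw on the login node.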