Maybe I'm missing something, but when I import data before demuxing I always have to copy my original files to forward.fastq.gz and reverse.fastq.gz.
I sometimes have multiple files for a library, and in that case it would be handy if one could use a manifest file to specify the files to load. One reason would be to easily exclude files with bad-quality sequences.
qiime tools import --type MultiplexedPairedEndBarcodeInSequence --input-manifest ./manifest.tsv --output-path ./qiime_artifacts/xyz
where the manifest file would look something like
What do you do now instead when you have multiple multiplexed files?
We currently have to concatenate all forward files and all reverse files before feeding them to the import.
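For anyone using the same workaround, here is a minimal Python sketch of the concatenation step. It relies on the fact that gzip files joined byte for byte still form a valid multi-member gzip stream, so nothing needs to be decompressed. The function name `concatenate_gzip` is hypothetical and not part of QIIME 2.

```python
import shutil
from pathlib import Path


def concatenate_gzip(parts, destination):
    """Join gzip files byte for byte into one multi-member gzip file.

    `parts` is an ordered list of paths; `destination` is overwritten.
    Hypothetical helper for illustration, not a QIIME 2 function.
    """
    destination = Path(destination)
    with destination.open("wb") as out:
        for part in parts:
            # Raw copy: each input stays a separate gzip member in the output.
            with Path(part).open("rb") as src:
                shutil.copyfileobj(src, out)
    return destination
```

The forward and reverse lists have to be passed in the same order, otherwise the read pairs in forward.fastq.gz and reverse.fastq.gz no longer line up.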
For large projects it would be handy to have a manifest file that includes a library code as well, e.g.:
forward_reads reverse_reads libraryNumber
<path>/F123_1_1.fastq.gz <path>/F123_1_2.fastq.gz L1
<path>/F123_2_1.fastq.gz <path>/F123_2_2.fastq.gz L1
<path>/F123_3_1.fastq.gz <path>/F123_3_2.fastq.gz L2
As far as I'm aware, demuxing should be done on a 'per library' basis, given that barcodes might be shared among libraries. In the case below Sample_1 and Sample_3 will become mixed in the current setup.
Adding library awareness would make it possible to demux the samples correctly, based on the file manifest and the demultiplexing sample sheet, provided the sample sheet contains the library number:
#SampleID forwardBarcodeSequence reverseBarcodeSequence LibraryNumber
Sample_1 AACCAGAA AACCAGAA L1
Sample_2 AACCATGC AACCATGC L1
Sample_3 AACCAGAA AACCAGAA L2
Sample_4 AACCATGC AACCATGC L2
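To make the idea concrete, here is a rough Python sketch of the grouping a library-aware import would need. This is not existing QIIME 2 functionality; the function names are hypothetical, and it assumes both tables are tab-separated with exactly the column headers shown above.

```python
import csv
import io
from collections import defaultdict

# Assumption: both tables are tab-separated, with the headers from the post.
MANIFEST = (
    "forward_reads\treverse_reads\tlibraryNumber\n"
    "<path>/F123_1_1.fastq.gz\t<path>/F123_1_2.fastq.gz\tL1\n"
    "<path>/F123_2_1.fastq.gz\t<path>/F123_2_2.fastq.gz\tL1\n"
    "<path>/F123_3_1.fastq.gz\t<path>/F123_3_2.fastq.gz\tL2\n"
)
SAMPLE_SHEET = (
    "#SampleID\tforwardBarcodeSequence\treverseBarcodeSequence\tLibraryNumber\n"
    "Sample_1\tAACCAGAA\tAACCAGAA\tL1\n"
    "Sample_2\tAACCATGC\tAACCATGC\tL1\n"
    "Sample_3\tAACCAGAA\tAACCAGAA\tL2\n"
    "Sample_4\tAACCATGC\tAACCATGC\tL2\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def files_by_library(manifest_rows):
    """Map each library to its list of (forward, reverse) file pairs."""
    grouped = defaultdict(list)
    for row in manifest_rows:
        pair = (row["forward_reads"], row["reverse_reads"])
        grouped[row["libraryNumber"]].append(pair)
    return dict(grouped)


def barcodes_by_library(sheet_rows):
    """Map each library to {(forward barcode, reverse barcode): sample ID}."""
    grouped = defaultdict(dict)
    for row in sheet_rows:
        key = (row["forwardBarcodeSequence"], row["reverseBarcodeSequence"])
        library = grouped[row["LibraryNumber"]]
        if key in library:
            # The same barcode pair twice within one library is ambiguous.
            raise ValueError(f"duplicate barcodes {key} in {row['LibraryNumber']}")
        library[key] = row["#SampleID"]
    return dict(grouped)


def shared_across_libraries(barcode_map):
    """Return barcode pairs that occur in more than one library."""
    seen = defaultdict(list)
    for library, barcodes in barcode_map.items():
        for key, sample in barcodes.items():
            seen[key].append((library, sample))
    return {key: uses for key, uses in seen.items() if len(uses) > 1}


files = files_by_library(read_tsv(MANIFEST))
barcodes = barcodes_by_library(read_tsv(SAMPLE_SHEET))
shared = shared_across_libraries(barcodes)
```

With the tables above, `shared` lists both barcode pairs, because each is reused in L1 and L2. Those are exactly the samples that get mixed when all files are concatenated, and they stay separate if each library's files are demultiplexed only against that library's barcodes.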
A setup like this would make it possible to run an automated analysis in one go across many samples sequenced over a longer period of time.
Thanks for your great description!
Would you mind opening an issue on our GitHub, at qiime2/q2-metadata? That way this will stay on our radar for when we have time to implement it.
Thank you again!