Help with cutadapt demux-paired

@thermokarst I am using this method to demultiplex paired-end data. I have data for 4 conditions, each with a different number of replicates (4, 4, 6, 8). I have 4 multiplexed paired-end files, each of which needs to be demultiplexed into individual sample files. I have a metadata file that lists the samples and their barcodes. The barcode is 3 bases long and sits at the 5' end of each forward read. Barcodes are unique within a condition but are reused across conditions (not a problem, since I will demultiplex each group separately). I first imported the data as MultiplexedPairedEndBarcodeInSequence, then ran cutadapt demux-paired. The result was quite shocking: the samples have very different numbers of reads. For example, the sample for the first barcode sequence has the most reads (~2000), the counts drop with each subsequent barcode, and the last sample usually has only around 10-15 reads. I tried all the groups and the result is the same. I also counted the reads carrying each barcode myself, and those counts differ from what I get with this tool. Do you have any idea why this happens? I can provide details on the outputs if necessary.
Thanks for the help.

Tayyaba

Welcome to the forum!

Yes, demultiplexing each group separately is what I usually do when barcodes are not unique across the entire dataset and the data are split by index/run/condition.

I would try a tool outside of qiime2, for example Sabre. I noticed that cutadapt always produces one large file for a single barcode and that the distribution of reads among samples is skewed. I also counted barcodes in the raw reads, and the counts differed from the cutadapt output. So I switched to Sabre, since its per-sample read counts were very close to the barcode counts I got with a custom script. I demultiplex my FASTQ files first and then import them into qiime2 as demultiplexed reads.
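
For anyone who wants to run the same sanity check, here is a minimal sketch of that kind of barcode count. It is not the exact script mentioned above. It assumes standard 4-line FASTQ records (no wrapped sequence lines) and that the barcode is the first 3 bases of each forward read, as described in the question. The function names and the example file name are hypothetical.

```python
import gzip
from collections import Counter
from itertools import islice


def count_barcode_prefixes(lines, barcode_len=3):
    """Count the first `barcode_len` bases of every read.

    `lines` is any iterable of FASTQ lines. Assumes 4-line records,
    so the sequence is the 2nd line of each record.
    """
    counts = Counter()
    for seq in islice(lines, 1, None, 4):
        counts[seq.strip()[:barcode_len].upper()] += 1
    return counts


def count_barcodes_in_file(path, barcode_len=3):
    """Same count, reading a plain or gzipped FASTQ file (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_barcode_prefixes(handle, barcode_len)


if __name__ == "__main__":
    # Tiny in-memory example; replace with e.g.
    # count_barcodes_in_file("forward.fastq.gz") for real data.
    example = [
        "@read1", "ACGTTTGGA", "+", "IIIIIIIII",
        "@read2", "ACGAAACCT", "+", "IIIIIIIII",
        "@read3", "TTGCCCAAG", "+", "IIIIIIIII",
    ]
    for barcode, n in count_barcode_prefixes(example).most_common():
        print(barcode, n)
```

Comparing these per-barcode totals against the per-sample read counts from the demultiplexer shows quickly whether reads are being assigned to the wrong sample or dropped.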