Hello,
I have a question about the demultiplexing workflow I used in QIIME 2 with my sequencing data. I know similar questions have been asked before (e.g. in "Importing and Demultiplex process for 4 Fastq Files: R1, R2, Index1 and Index2" and in "Importing multiplexed paired-end data with separated barcode sequence files"), but I decided to post anyway, partly because other users might run into the same issue.
My situation:
I have soil/leaf/root microbiome sequencing data from an Illumina MiSeq. I prepared my libraries using standard primers and barcodes, and my paired-end reads (2x250) come in the following files:
I1
I2
R1
R2
Following the forum posts mentioned above, I decided to give extract_barcodes.py a try, which worked and produced a single barcodes.fastq file.
I imported this back into QIIME 2 as EMP paired-end sequences, with the following files in one folder:
-forward.fastq.gz
-reverse.fastq.gz
-barcodes.fastq.gz
The import worked, so I went on to demultiplexing and gave demux emp-paired a try, because why not.
... Surprisingly, it worked! Or at least I think so.
Now the thing I wonder is: HOW? I was literally just telling demux to pull the barcodes from my metadata .csv file, in which I concatenated my forward and reverse barcodes (or indices, meaning the D701 and D501 barcodes). How does this work in demux, given that I never specify the length of the barcodes it should look for (mine are sometimes longer than 8 bp)? How would it split the barcodes?
I thought maybe it is smart enough to pull this information from the (also merged!) barcodes.fastq.gz file that I fed in? Still, other parts of the concatenated barcodes might match my insert sequences somewhere else...
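To make my guess concrete, here is a minimal Python sketch of what I imagine might be happening. This is only my assumption, not QIIME 2's actual implementation: the function names, sample IDs and barcode sequences are all placeholders I made up for illustration. The idea is that if the whole barcode read (I1 + I2) is compared as one string against the concatenated barcode in the metadata, then no barcode length is needed and nothing has to be split, and the insert reads are never searched at all.

```python
# Hypothetical illustration only: NOT QIIME 2's actual code.
# Sample IDs and barcode sequences below are placeholders.

def build_lookup(metadata_rows):
    """Map each concatenated barcode from the metadata to its sample ID."""
    return {barcode: sample_id for sample_id, barcode in metadata_rows}


def assign_sample(i1_seq, i2_seq, lookup):
    """Return the sample whose metadata barcode equals the full I1+I2 string.

    The comparison uses the whole barcode read, so the barcode length is
    implicit. Returns None when there is no exact match.
    """
    return lookup.get(i1_seq + i2_seq)


# Metadata with forward and reverse barcodes concatenated, as in my .csv file.
metadata_rows = [
    ("soil_1", "ATTACTCG" + "TATAGCCT"),
    ("leaf_1", "TCCGGAGA" + "ATAGAGGC"),
]
lookup = build_lookup(metadata_rows)

# A cluster whose two index reads match one sample's pair is assigned to it.
print(assign_sample("ATTACTCG", "TATAGCCT", lookup))

# A mixed pair that is not listed in the metadata is left unassigned.
print(assign_sample("ATTACTCG", "ATAGAGGC", lookup))
```

If it really works like this, a partial match between half of a barcode and some insert sequence would not matter, because only the separate barcode read is ever looked up.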
I know that paired-end (PE) demultiplexing isn't implemented yet, but somehow the results I obtained look okay. I hope you'll be able to see this if I insert a screenshot of my demux.qzv here:
and my read counts are:
So basically I'm really confused, also because I don't come from a bioinformatics background at all. I apologize if this is complete gibberish, but I still hope someone can help me out.
I'd really love this to work with QIIME 2: all other options are much more complicated and, apart from the demultiplexing part, it is so straightforward! I ran all my other analyses of this data in QIIME 2, so in the end I learned a lot, even if someone now tells me that what I did was completely wrong.
Thanks a lot in advance for your help!

