Weird Sequence Length after Demultiplexing

Hi everyone! I am using qiime2-2021.4 on a conda environment. We use FASTQ files generated from a 16S run on the Illumina MiSeq as our initial input in QIIME2.

Lately we've been having an issue and I am trying to narrow down whether it is an instrument issue or a QIIME2 issue, though I tend to lean towards instrument. Nonetheless, I figured I would check.

Basically, after demultiplexing we are only seeing a forward sequence count of ~300 per sample, and we have absolutely no idea why. This happens even with a file that has ~200 MBP of data, which is our normal file size and would usually give a forward sequence count of perhaps ~20k reads per sample.

Here is the code that we are running:
qiime tools import \
  --type EMPSingleEndSequences \
  --input-path \
  --output-path

qiime demux emp-single \
  --i-seqs \
  --m-barcodes-file \
  --m-barcodes-column 'BarcodeSequence' \
  --o-per-sample-sequences demultiplexed.qza \
  --o-error-correction-details errorcorrection.qza

qiime demux summarize \
  --i-data demultiplexed.qza \
  --o-visualization demultiplexed.qzv

qiime tools view demultiplexed.qzv

Is there a possibility that this is a QIIME2 issue? Again, inclined to think it is not but just want to cover my bases.

Hey @ThatGuySam,

Thanks for reaching out! :qiime2:

My first guess here (without seeing your barcode/sequence data) would be that this high read loss is related to your barcodes. This doesn't necessarily mean there is anything wrong with the quality of your barcodes, but it could mean they need to be reverse complemented or that Golay error correction should be turned off (depending on whether they are Golay barcodes or not).
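If it helps to see what "reverse complemented" means in practice, here is a minimal shell sketch. It assumes `rev` and `tr` are available on your system, and the barcode shown is a made-up example rather than one from your run. Comparing the output against the barcode reads in your FASTQ file can hint at which orientation the instrument actually produced.

```shell
# Reverse complement a DNA barcode: reverse the string, then swap
# A<->T and C<->G (N is left as N). Upper and lower case both work.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTNacgtn' 'TGCANtgcan'
}

# Hypothetical 12 nt barcode, for illustration only.
bc="AGCCTTCGTCGC"
echo "as listed in metadata: $bc"
echo "reverse complement:    $(revcomp "$bc")"
```

If the reverse complemented form is the one that shows up in your barcode reads, that would point towards an orientation mismatch rather than an instrument problem.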

In short, I'd recommend reaching out to your sequence provider for more details on your barcodes - I would ask them what the correct orientation of your barcode sequences should be, and whether or not they are Golay barcodes. This should better inform the parameters you'll want to use when re-running demux emp-single, and will hopefully be what resolves this high read loss that you're seeing.
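As a rough sketch of what a re-run might look like, the helper below wraps `qiime demux emp-single` with two barcode-related options changed: `--p-rev-comp-mapping-barcodes` and `--p-no-golay-error-correction`. The function name, its arguments, and the output file names are placeholders I made up for illustration. Whether you need one, both, or neither of these options depends on what your sequence provider tells you, so it is probably worth changing one at a time.

```shell
# Hypothetical helper: re-run demultiplexing with barcode options changed.
#   $1 = imported EMPSingleEndSequences artifact (.qza)
#   $2 = sample metadata file that contains the barcodes
#   $3 = name of the barcode column (e.g. BarcodeSequence)
rerun_demux() {
  # --p-rev-comp-mapping-barcodes: reverse complement the barcodes listed
  #   in the metadata file before matching them against the reads.
  # --p-no-golay-error-correction: skip Golay error correction, which only
  #   applies to Golay barcodes.
  qiime demux emp-single \
    --i-seqs "$1" \
    --m-barcodes-file "$2" \
    --m-barcodes-column "$3" \
    --p-rev-comp-mapping-barcodes \
    --p-no-golay-error-correction \
    --o-per-sample-sequences demultiplexed-retry.qza \
    --o-error-correction-details errorcorrection-retry.qza
}
```

You would call it with your own files, for example `rerun_demux my-seqs.qza my-metadata.tsv BarcodeSequence` (file names here are placeholders), and then run `qiime demux summarize` on the new output to see whether the per-sample counts recover.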

Hope this helps! :nerd_face: :dna:

Cheers :lizard: