Help with demultiplexing Paired End Barcode In Sequence data

bpscherer · September 4, 2020, 1:58pm

Hi all,

I recently obtained 16S sequence data from ~280 samples for my dissertation.
I got the data from the sequencer in both demultiplexed and non-demultiplexed form.

I'm most familiar with data that is already demultiplexed, but my collaborator said I should start with non-demultiplexed data because QIIME2's demultiplexing is better than Illumina's.

I've taken the demultiplexed data through DADA2 and into some alpha diversity analysis with no problems. Of the 280 samples, only a handful had fewer than 100 reads, and only 1 dropped out completely after DADA2. From what I can tell, the dataset as a whole is sound and should be good enough to proceed with.

However, when I import and demultiplex the other (non-demultiplexed) version of the data, I end up with only 24 samples. This version of the data consists of two large files containing all of the forward and all of the reverse reads, respectively. I also have a barcodes file that I built from the information the sequencing center gave me.

Below is the code I have used to get this far, and I have also attached my demux.qzv. Thank you for any insight or ideas!

demux.qzv (300.8 KB)

qiime tools import \
  --type MultiplexedPairedEndBarcodeInSequence \
  --input-path fastq \
  --output-path multiplexed-seqs.qza

qiime cutadapt demux-paired \
  --i-seqs multiplexed-seqs.qza \
  --m-forward-barcodes-file barcodes.txt \
  --m-forward-barcodes-column forward \
  --m-reverse-barcodes-file barcodes.txt \
  --m-reverse-barcodes-column reverse \
  --o-per-sample-sequences demux.qza \
  --o-untrimmed-sequences untrimmed.qza
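
Since the barcodes file was put together by hand, one sanity check that may be worth running before anything else is to count how many distinct barcodes and barcode pairs it actually contains. The following is only a minimal sketch in plain Python, not part of QIIME 2: the function name summarize_barcodes is hypothetical, the file is assumed to be tab-separated with a sample ID in the first column, and the column names forward and reverse are taken from the command above. The example table is made up for illustration.

```python
import csv
import io
from collections import Counter


def summarize_barcodes(tsv_text, fwd_col="forward", rev_col="reverse"):
    """Count samples and distinct barcodes in a tab-separated barcode table.

    Hypothetical helper: assumes a header row, a sample ID in the first
    column, and barcode columns named as in the demux-paired command.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [
        row for row in reader
        # Skip comment/directive lines such as "#q2:types" if present.
        if not next(iter(row.values())).startswith("#")
    ]
    pairs = Counter((row[fwd_col], row[rev_col]) for row in rows)
    return {
        "samples": len(rows),
        "unique_forward": len({row[fwd_col] for row in rows}),
        "unique_reverse": len({row[rev_col] for row in rows}),
        "unique_pairs": len(pairs),
        # Pairs assigned to more than one sample cannot be told apart.
        "duplicated_pairs": sorted(p for p, n in pairs.items() if n > 1),
    }


# Made-up example: four samples, but only three distinct barcode pairs.
example = (
    "sample-id\tforward\treverse\n"
    "s1\tAAGG\tCCTT\n"
    "s2\tAAGG\tGGAA\n"
    "s3\tTTCC\tCCTT\n"
    "s4\tAAGG\tCCTT\n"
)
summary = summarize_barcodes(example)
print(summary)
```

For the real file, the call would be summarize_barcodes(open("barcodes.txt").read()). If the number of distinct pairs (or of distinct forward barcodes) comes out near 24 rather than ~280, the barcodes file would be the first place to look; if it comes out near 280, the cause probably lies elsewhere, for example in how the barcodes sit within the reads.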

thermokarst · September 9, 2020, 11:26pm

2 posts were merged into an existing topic: Where do the per-sample barcodes come from in sample-metadata.tsv?