Comparison of single-indexed and dual-indexed MiSeq libraries - has anyone noticed big diversity differences?

ctekellogg · September 11, 2019, 6:06am

I was actually also thinking of writing to @gmdouglas, since all of these amplicon pools were sequenced at the IMR, just to see if he or anyone there has run into this. I've pretty much followed the SOP you've written (Amplicon SOP v2 (qiime2 2019.7), on the LangilleLab/microbiome_helper wiki on GitHub), which is awesome. Thank you.

For example, here is what I did for one particular run.

Import reads:

mkdir reads_qza

qiime tools import \
   --type SampleData[PairedEndSequencesWithQuality] \
   --input-path demux/ \
   --output-path reads_qza/kellogg_16S_IMR4sequences.qza \
   --input-format CasavaOneEightSingleLanePerSampleDirFmt

How does the imported data look?

qiime demux summarize \
   --i-data reads_qza/kellogg_16S_IMR4sequences.qza \
   --o-visualization reads_qza/kellogg_16S_IMR4reads_untrimmed_summary.qzv

Trim primers:

qiime cutadapt trim-paired \
   --i-demultiplexed-sequences reads_qza/kellogg_16S_IMR4sequences.qza \
   --p-cores 4 \
   --p-front-f GTGYCAGCMGCCGCGGTAA \
   --p-front-r CCGYCAATTYMTTTRAGTTT \
   --p-discard-untrimmed \
   --p-no-indels \
   --o-trimmed-sequences reads_qza/kellogg_16S_IMR4reads_trimmed.qza

How does the data look after trimming (reads are 19-20 bp shorter, which is good news!):

qiime demux summarize \
   --i-data reads_qza/kellogg_16S_IMR4reads_trimmed.qza \
   --o-visualization reads_qza/kellogg_16S_IMR4reads_trimmed_summary.qzv
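
As a quick sanity check on the 19-20 bp figure, the primer lengths can be counted directly from the sequences passed to cutadapt. This is only a minimal sketch; the variable names are made up for illustration and are not part of any QIIME 2 command.

```shell
# Primer sequences copied from the cutadapt trim-paired command above.
fwd_primer="GTGYCAGCMGCCGCGGTAA"
rev_primer="CCGYCAATTYMTTTRAGTTT"

# ${#var} expands to the length of the string in POSIX shells.
fwd_len=${#fwd_primer}
rev_len=${#rev_primer}

echo "forward primer: ${fwd_len} bp"
echo "reverse primer: ${rev_len} bp"
```

The forward primer is 19 bp and the reverse primer is 20 bp, which matches the shortening seen in the trimmed summary.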

And then run DADA2 (I've tried a whole bunch of settings here to find a compromise between my old runs and these new ones; this is one of the many):

qiime dada2 denoise-paired \
   --i-demultiplexed-seqs reads_qza/kellogg_16S_IMR4reads_trimmed.qza \
   --p-trim-left-f 30 \
   --p-trunc-len-f 270 \
   --p-trim-left-r 30 \
   --p-trunc-len-r 210 \
   --p-max-ee-f 3 \
   --p-max-ee-r 5 \
   --p-n-threads 8 \
   --output-dir dada2_output
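
For anyone checking these truncation settings, here is a rough sketch of the overlap arithmetic they imply. The trim and truncation values are the ones in the command above, but `amplicon_len` is a hypothetical placeholder, not a measured value for this primer pair; it should be replaced with the real primer-free amplicon length. The sketch also assumes every read is at least as long as its truncation length.

```shell
# Values copied from the denoise-paired command above.
trunc_len_f=270
trunc_len_r=210
trim_left_f=30
trim_left_r=30

# ASSUMPTION: hypothetical primer-free amplicon length in bp.
# Replace with the real value for your amplicon.
amplicon_len=370

# The truncated reads span the amplicon from opposite ends, so the
# overlap is whatever their combined length exceeds the amplicon by.
overlap=$(( trunc_len_f + trunc_len_r - amplicon_len ))

# trim-left removes bases from the 5' end of each read, which shortens
# the merged sequence at both ends without changing the overlap.
merged_len=$(( amplicon_len - trim_left_f - trim_left_r ))

echo "expected overlap: ${overlap} bp"
echo "expected merged length: ${merged_len} bp"
```

With the placeholder length this gives a 110 bp overlap and a 310 bp merged sequence. As far as I know, the DADA2 R package's mergePairs requires a 12 bp overlap by default, so a result near or below that would suggest the truncation is too aggressive.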

Any thoughts would be much appreciated! This workflow seems to work fine for the old runs (though I import those differently, using qiime demux emp-paired), but it yields many more features for the dual-indexed libraries that were run a few weeks ago.