merged run vs individual runs taxonomic assignment discrepancies

I run QIIME 2 via a Jupyter notebook connected to a server.

I have merged 5 MiSeq runs using "feature-table merge" and "feature-table merge-seqs" to combine tables and rep seqs after running cutadapt:

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demux.qza \
  --p-cores 24 \
  --p-front-f GGACTACHVGGGTWTCTAAT \
  --p-front-r ACTCCTACGGGAGGCAGCAG \
  --p-error-rate 0.2 \
  --p-overlap 18 \
  --p-match-read-wildcards \
  --p-match-adapter-wildcards \
  --o-trimmed-sequences trimmed-demux.qza

and dada2:

qiime dada2 denoise-paired \
    --i-demultiplexed-seqs trimmed-demux.qza \
    --p-n-threads 6 \
    --p-trunc-len-f 275 \
    --p-trunc-len-r 250 \
    --p-max-ee-f 2 \
    --p-max-ee-r 5 \
    --p-n-reads-learn 1000000 \
    --o-representative-sequences rep-seqs.qza \
    --o-table table.qza \
    --o-denoising-stats stats-dada.qza

for each individual run.
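The merge step itself is not shown above. A minimal sketch of it, assuming each run's DADA2 outputs sit in directories named Run1 to Run5 (a hypothetical layout) and that the merged table is called Merge-1-2-3-4-5_table.qza (also a hypothetical name), might look like the following. The function wrapper and the QIIME variable are illustrative additions so the commands can be printed without being run.

```shell
# Sketch only: Run1..Run5 directories and the merged table name are assumptions.
# Set QIIME=echo for a dry run that prints the commands instead of executing them.
merge_runs() {
  # Combine the per-run feature tables into one table.
  "${QIIME:-qiime}" feature-table merge \
    --i-tables Run1/table.qza \
    --i-tables Run2/table.qza \
    --i-tables Run3/table.qza \
    --i-tables Run4/table.qza \
    --i-tables Run5/table.qza \
    --o-merged-table Merge-1-2-3-4-5_table.qza

  # Combine the per-run representative sequences.
  "${QIIME:-qiime}" feature-table merge-seqs \
    --i-data Run1/rep-seqs.qza \
    --i-data Run2/rep-seqs.qza \
    --i-data Run3/rep-seqs.qza \
    --i-data Run4/rep-seqs.qza \
    --i-data Run5/rep-seqs.qza \
    --o-merged-data Merge-1-2-3-4-5_rep-seqs.qza
}
```

Calling merge_runs executes both commands; QIIME=echo merge_runs only prints them.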

Merged data is then classified using the same parameters as for individual runs:

qiime feature-classifier classify-sklearn \
  --i-classifier v3v4-naive-bayes-classifier-020822.qza \
  --i-reads Merge-1-2-3-4-5_rep-seqs.qza \
  --o-classification Merge-1-2-3-4-5_taxonomy.qza \
  --p-n-jobs -1 \
  --p-pre-dispatch all \
  --verbose

This all seems reasonable, but the resulting taxonomic assignments differ from those of the individual runs for two of the five runs (the other three agree).
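For anyone wanting to pin down exactly which features disagree, one possible check is to export both taxonomy artifacts with "qiime tools export" and compare the resulting taxonomy.tsv files by feature ID. The helper below is a hypothetical sketch, not part of QIIME 2; it assumes tab-separated files with a header row and the columns Feature ID, Taxon, Confidence, and it assumes the same sequence carries the same feature ID in both files.

```shell
# taxonomy_diff RUN_TSV MERGED_TSV
# Hypothetical helper: prints "feature-id <TAB> run taxon <TAB> merged taxon"
# for every feature present in both files whose Taxon column differs.
taxonomy_diff() {
  awk -F '\t' '
    FNR == 1  { next }                      # skip the header row of each file
    NR == FNR { taxon[$1] = $2; next }      # first file: remember taxon per feature ID
    ($1 in taxon) && taxon[$1] != $2 {      # second file: report mismatches only
      print $1 "\t" taxon[$1] "\t" $2
    }
  ' "$1" "$2"
}
```

A call such as taxonomy_diff Run1-export/taxonomy.tsv merged-export/taxonomy.tsv (both paths hypothetical) lists only the features whose assignment changed; features unique to either file are ignored.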

This is what I am seeing:

All I had to do was this:

qiime feature-table merge-taxa \
  --i-data Run1/taxonomy.qza \
    Run2/taxonomy.qza \
    Run3/taxonomy.qza \
    Run4/taxonomy.qza \
    Run5/taxonomy.qza \
  --o-merged-data Run1-2-3-4-5_merged-taxonomy.qza

What I had done was follow this tutorial: https://docs.qiime2.org/2017.8/tutorials/fmt/#merging-denoised-data

With a simple suggestion from a GREAT former colleague, I got the result I needed.
