Dear QIIME2 community,
Situation: We added mock communities to our 360 freshwater lake samples.
Observation: With both DADA2 and Deblur, and for each primer set used, the 10 expected genera of the mock community are almost always the 10 most abundant observed genera. That is good. However, regardless of the method or primer set, we get a long tail of medium- to low-abundance, misclassified false positives. Compared to other mock community examples we found here in the forum, our tail of additional taxa is very large.
Questions:
- How should we interpret the excess taxa in our mock community results?
- What could be the cause? Contamination? A low amount of DNA?
- Should we filter the actual samples based on the mock results?
taxa-bar-plots-deblur-all-mock.qzv (429.3 KB)
Results of qiime quality-control evaluate-composition for each primer set, (1) 515 and (2) 799:
comparison-mock515.qzv (355.2 KB)
comparison-mock799.qzv (348.8 KB)
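For readers who want to reproduce this comparison, here is a minimal sketch of how such a visualization can be generated. It assumes the observed table is collapsed to genus level (level 6 in the 7-rank SILVA taxonomy) and converted to relative frequencies, and that the expected composition is already available as a FeatureTable[RelativeFrequency] artifact. The function name and all file names passed to it are placeholders, not the exact commands used for the attached files.

```shell
# Sketch: compare observed vs. expected mock composition at genus level.
# evaluate_mock is a hypothetical helper; all paths are supplied by the caller.
# Usage: evaluate_mock <table.qza> <taxonomy.qza> <expected-rel-freq.qza> <outdir>
evaluate_mock() {
  local table="$1" taxonomy="$2" expected="$3" outdir="$4"
  mkdir -p "$outdir" || return 1

  # Collapse features to genus level (level 6 of a 7-rank taxonomy).
  qiime taxa collapse \
    --i-table "$table" \
    --i-taxonomy "$taxonomy" \
    --p-level 6 \
    --o-collapsed-table "$outdir/table-l6.qza" || return 1

  # evaluate-composition takes relative frequencies, not raw counts.
  qiime feature-table relative-frequency \
    --i-table "$outdir/table-l6.qza" \
    --o-relative-frequency-table "$outdir/table-l6-rel.qza" || return 1

  # Compare the observed mock samples against the expected composition.
  qiime quality-control evaluate-composition \
    --i-expected-features "$expected" \
    --i-observed-features "$outdir/table-l6-rel.qza" \
    --o-visualization "$outdir/comparison.qzv"
}
```

For example, `evaluate_mock table-deblur-mock799.qza taxonomy-deblur-mock799.qza expected-mock.qza eval-mock799`, where `expected-mock.qza` is a hypothetical artifact with the 10 genera of the mock at equal relative frequency and with sample IDs matching the mock samples.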
We are thankful for any feedback we get!
Background information:
We added 4 replicates of mock communities to our samples. Some samples, including the mock communities, were sequenced with two different primer sets: (1) 515F-Y and 926R, and (2) 799F and 1193R.
As mock communities, we used the ATCC® MSA-1000™ mix, which contains an even abundance of 10 different bacterial genera.
Samples for each primer set were demultiplexed and denoised separately, with both DADA2 and Deblur. For each primer set, a classifier based on SILVA 132 (99% OTUs) was trained with qiime feature-classifier fit-classifier-naive-bayes.
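For context, a primer-specific classifier of this kind is usually built in two steps: the amplified region is first extracted from the reference sequences, then the naive Bayes classifier is fit on those reads. The sketch below shows that general pattern. It is an assumption about the workflow, not the exact commands we ran; the function name, file names, and primer sequences are all placeholders to be supplied by the caller.

```shell
# Sketch: train a naive Bayes classifier for one primer pair.
# train_primer_classifier is a hypothetical helper; all arguments are placeholders.
# Usage: train_primer_classifier <ref-seqs.qza> <ref-taxonomy.qza> <fwd-primer> <rev-primer> <prefix>
train_primer_classifier() {
  local ref_seqs="$1" ref_tax="$2" fwd="$3" rev="$4" prefix="$5"

  # Trim the reference sequences to the region amplified by this primer pair.
  qiime feature-classifier extract-reads \
    --i-sequences "$ref_seqs" \
    --p-f-primer "$fwd" \
    --p-r-primer "$rev" \
    --o-reads "${prefix}-ref-reads.qza" || return 1

  # Fit the classifier on the extracted region only.
  qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads "${prefix}-ref-reads.qza" \
    --i-reference-taxonomy "$ref_tax" \
    --o-classifier "${prefix}-classifier.qza"
}
```

Called once per primer set, this yields one classifier for 515F-Y/926R and one for 799F/1193R, so that each set of representative sequences is classified against the region it actually covers.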
Since the DADA2 and Deblur results are very similar, here are the steps using Deblur for denoising, shown for one primer set only (799F and 1193R):
qiime demux emp-paired \
--m-barcodes-file sample-metadata-mock799.tsv \
--m-barcodes-column BarcodeSequence \
--p-no-golay-error-correction \
--i-seqs emp-paired-end-sequencesRun5A.qza \
--o-per-sample-sequences demux-mock799.qza \
--o-error-correction-details demux-details-mock799.qza
qiime vsearch join-pairs \
--i-demultiplexed-seqs demux-mock799.qza \
--o-joined-sequences Taxonomy_deblur_mock/demux-joined-mock799.qza
qiime quality-filter q-score-joined \
--i-demux Taxonomy_deblur_mock/demux-joined-mock799.qza \
--o-filtered-sequences Taxonomy_deblur_mock/demux-joined-filtered-mock799.qza \
--o-filter-stats Taxonomy_deblur_mock/demux-joined-filter-stats-mock799.qza
qiime demux summarize \
--i-data Taxonomy_deblur_mock/demux-joined-filtered-mock799.qza \
--o-visualization Taxonomy_deblur_mock/demux-joined-filtered-mock799.qzv
demux-joined-filtered-mock799.qzv (302.9 KB)
qiime deblur denoise-16S \
--i-demultiplexed-seqs Taxonomy_deblur_mock/demux-joined-filtered-mock799.qza \
--p-trim-length 311 \
--p-sample-stats \
--p-no-hashed-feature-ids \
--p-jobs-to-start 12 \
--o-representative-sequences Taxonomy_deblur_mock/rep-seqs-deblur-mock799.qza \
--o-table Taxonomy_deblur_mock/table-deblur-mock799.qza \
--o-stats Taxonomy_deblur_mock/deblur-stats-deblur-mock799.qza
## classifier-specific.qza was trained specifically for 799F and 1193R
qiime feature-classifier classify-sklearn \
--i-classifier classifier-specific.qza \
--i-reads rep-seqs-deblur-mock799.qza \
--p-n-jobs 15 \
--o-classification taxonomy-deblur-mock799.qza
## the same was done for the samples sequenced with the 515F-Y and 926R primers
qiime feature-table merge-taxa \
--i-data taxonomy-deblur-mock515.qza \
--i-data taxonomy-deblur-mock799.qza \
--o-merged-data taxonomy-deblur-all-mock.qza
qiime feature-table merge \
--i-tables table-deblur-mock515.qza \
--i-tables table-deblur-mock799.qza \
--o-merged-table table-deblur-all-mock.qza
qiime taxa barplot \
--i-table table-deblur-all-mock.qza \
--i-taxonomy taxonomy-deblur-all-mock.qza \
--m-metadata-file sample-metadata-all-mock.tsv \
--o-visualization taxa-bar-plots-deblur-all-mock.qzv