Small number of OTUs reported in Qiime2 taxonomic classification

Mania · May 17, 2021, 10:49am

Hi to all,

I am running a metagenomics analysis on fecal samples using qiime2-2021.2 (installed locally via conda). After cutadapt trimming, DADA2 denoising, and taxonomic classification with a Naive Bayes classifier trained on Greengenes 13_8 99% OTUs full-length sequences (MD5: 03078d15b265f3d2d73ce97661e370b1) (Data resources - QIIME 2 2021.2.0 documentation),
the exported feature table contained very few OTUs: one per sample, and in only 6 of the 32 samples [taxonomic classification.tsv attached].

Below are the other things I have tried besides the analysis with my own trained classifier, followed by the commands I ran. I cannot work out what went wrong during the analysis, and I would be glad if you could enlighten me.
Thank you in advance!
I have already:
a) performed the analysis using the pre-trained classifiers (still few OTUs reported as output):

- Silva 132 99% OTUs full-length sequences (MD5: a02c3f7473fa4369bbc66158c799d39a)
- Greengenes 13_8 99% OTUs full-length sequences (MD5: 9a28a285305c2bfd2de46add7dd520d7)

b) chosen the --p-trunc-len-f 210 and --p-trunc-len-r 185 parameters based on the all-cutadapt.qzv output (attached)

c) trained the classifier and performed cutadapt trimming using both the adapter and primer sequences, as reported in https://support.illumina.com/documents/documentation/chemistry_documentation/16s/16s-metagenomic-library-prep-guide-15044223-b.pdf (p. 3):

"16S Amplicon PCR Forward Primer = 5'
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG
16S Amplicon PCR Reverse Primer = 5'
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC
The Illumina overhang adapter sequences to be added to locus-specific sequences are:
Forward overhang: 5' TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus-specific sequence]
Reverse overhang: 5' GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus-specific sequence]"

Commands
1) cutadapt trimming
qiime cutadapt trim-paired \
--i-demultiplexed-sequences output/demux_samples.qza \
--p-cores 3 \
--p-front-f CCTACGGGNGGCWGCAG \
--p-front-r GACTACHVGGGTATCTAATCC \
--p-error-rate 0.2 \
--o-trimmed-sequences output/all-cutadapt.qza \
--verbose > output/all_cutadapt_stats.txt

qiime demux summarize \
--i-data output/all-cutadapt.qza \
--o-visualization output/all-cutadapt.qzv

2) dada2 denoising
qiime dada2 denoise-paired \
--i-demultiplexed-seqs output/all-cutadapt.qza \
--p-trunc-len-f 210 \
--p-trunc-len-r 185 \
--p-n-threads 0 \
--o-table output/dada2_table.qza \
--o-representative-sequences output/dada2_rep_seq.qza \
--o-denoising-stats output/dada2_denoising-stats.qza

3) taxonomic classification
qiime feature-classifier classify-sklearn \
--i-classifier output/99OTUs_greengenes_trained_classifier.qza \
--i-reads output/dada2_rep_seq.qza \
--o-classification output/seq_taxonomy.qza

taxa-bar-plots.qzv (443.9 KB)
dada2_rep_seq.qzv (217.6 KB)
all-cutadapt.qzv (317.2 KB)
dada2_denoising-stats.qzv (1.2 MB)

jwdebelius · May 17, 2021, 3:28pm

Hi @Mania,

Welcome to the forum!

Looking at your denoising stats, it looks like your sequences failed to join with the current sequencing and truncation lengths. Depending on your sequencing run and primers, you may want to re-examine your DADA2 truncation lengths or use only the forward reads.
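
As a rough illustration of why the reads may not join, the arithmetic can be sketched as follows. The amplicon length of about 460 bp (taken here to include the locus-specific primers) and the minimum overlap of 12 nt are assumptions to check against your own library prep and DADA2 settings; the primer and truncation lengths are the ones from the post.

```shell
# Assumed values; adjust them for your own run.
amplicon_len=460    # approximate V3-V4 amplicon incl. primers (assumption)
fwd_primer_len=17   # length of CCTACGGGNGGCWGCAG
rev_primer_len=21   # length of GACTACHVGGGTATCTAATCC
trunc_f=210         # --p-trunc-len-f from the post
trunc_r=185         # --p-trunc-len-r from the post
min_overlap=12      # minimum overlap needed to merge (assumed DADA2 default)

# Length of the region left once cutadapt has removed both primers.
insert_len=$(( amplicon_len - fwd_primer_len - rev_primer_len ))
# Bases shared by the two truncated reads; a negative value means a gap.
overlap=$(( trunc_f + trunc_r - insert_len ))

echo "insert length: ${insert_len}"
echo "overlap: ${overlap}"
if [ "$overlap" -lt "$min_overlap" ]; then
  echo "reads cannot merge with these truncation lengths"
else
  echo "reads should be able to merge"
fi
```

Under these assumptions the two truncated reads cover fewer bases than the primer-free insert, so there is nothing for the merging step to align. Real amplicons vary in length, so some margin above the minimum is sensible.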

Best,
Justine

Mania · May 21, 2021, 10:16am

Thank you for the quick response @jwdebelius,

I’ll re-run the analysis using a different trim length, and check if the output is affected.
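
If longer truncation lengths are not supported by the read quality, the forward-reads-only route suggested above might look like the sketch below. The output file names are hypothetical, and the truncation length simply reuses the forward value from the original commands; as far as I know, denoise-single accepts paired-end input and uses only the forward reads, but please confirm this for your QIIME 2 version.

```shell
# Input from the cutadapt step; output names below are hypothetical.
in_seqs=output/all-cutadapt.qza
trunc_len=210   # reuses the forward truncation length from the post

# Only run when qiime and the input artifact are actually present.
if command -v qiime >/dev/null 2>&1 && [ -f "$in_seqs" ]; then
  qiime dada2 denoise-single \
    --i-demultiplexed-seqs "$in_seqs" \
    --p-trunc-len "$trunc_len" \
    --p-n-threads 0 \
    --o-table output/dada2_single_table.qza \
    --o-representative-sequences output/dada2_single_rep_seq.qza \
    --o-denoising-stats output/dada2_single_denoising-stats.qza
else
  echo "qiime or ${in_seqs} not available; nothing was run"
fi
```

The resulting representative sequences can then go through the same classify-sklearn step as before.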

Best,
Mania
