Hi all,
I am running a metagenomics analysis on fecal samples using qiime2-2021.2 (installed locally via conda). After cutadapt trimming, DADA2 denoising, and taxonomic classification with a Naive Bayes classifier trained on Greengenes 13_8 99% OTUs full-length sequences (MD5: 03078d15b265f3d2d73ce97661e370b1; see Data resources, QIIME 2 2021.2.0 documentation),
the exported feature table contained very few OTUs: one per sample, and OTUs are reported in only 6 of the 32 samples [taxonomic classification.tsv attached].
Below are the other things I have tried, besides running the analysis with my trained classifier, along with the commands I ran. I cannot work out what went wrong during the analysis, and I would be grateful for any pointers.
Thank you in advance!
I have already:
a) performed the analysis using pre-trained databases (still few OTUs reported as output):
- Silva 132 99% OTUs full-length sequences (MD5: a02c3f7473fa4369bbc66158c799d39a)
- Greengenes 13_8 99% OTUs full-length sequences (MD5: 9a28a285305c2bfd2de46add7dd520d7)
b) chosen the --p-trunc-len-f 210 and --p-trunc-len-r 185 parameters based on the all-cutadapt.qzv output (attached)
c) trained the classifier and performed cutadapt trimming using both the adapter and primer sequences, as reported in the Illumina guide https://support.illumina.com/documents/documentation/chemistry_documentation/16s/16s-metagenomic-library-prep-guide-15044223-b.pdf (p. 3):
"16S Amplicon PCR Forward Primer = 5'
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG
16S Amplicon PCR Reverse Primer = 5'
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC
The Illumina overhang adapter sequences to be added to locus-specific sequences are:
Forward overhang: 5' TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus-specific sequence]
Reverse overhang: 5' GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus-specific sequence]"
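The relationship between the full primers and the overhangs quoted from the Illumina guide can be checked with a small shell sketch. It strips each overhang from the start of the corresponding full primer, which leaves the locus-specific sequences given to --p-front-f and --p-front-r in the cutadapt command. The variable names are made up for illustration.

```shell
# Full primers and overhang adapters, copied from the Illumina guide quote.
fwd_full="TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
rev_full="GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"
fwd_overhang="TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
rev_overhang="GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# Remove the overhang prefix; what remains is the locus-specific primer.
fwd_locus="${fwd_full#"$fwd_overhang"}"
rev_locus="${rev_full#"$rev_overhang"}"

echo "forward locus-specific primer: $fwd_locus"
echo "reverse locus-specific primer: $rev_locus"
```

If an overhang did not match the start of its primer, the value would come back unchanged, which makes a copy-paste error easy to spot.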
Commands
1) cutadapt trimming
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences output/demux_samples.qza \
  --p-cores 3 \
  --p-front-f CCTACGGGNGGCWGCAG \
  --p-front-r GACTACHVGGGTATCTAATCC \
  --p-error-rate 0.2 \
  --o-trimmed-sequences output/all-cutadapt.qza \
  --verbose > output/all_cutadapt_stats.txt
qiime demux summarize \
  --i-data output/all-cutadapt.qza \
  --o-visualization output/all-cutadapt.qzv
2) dada2 denoising
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs output/all-cutadapt.qza \
  --p-trunc-len-f 210 \
  --p-trunc-len-r 185 \
  --p-n-threads 0 \
  --o-table output/dada2_table.qza \
  --o-representative-sequences output/dada2_rep_seq.qza \
  --o-denoising-stats output/dada2_denoising-stats.qza
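Whether truncating at 210 and 185 still lets the forward and reverse reads overlap depends on the amplicon length, which is not measured anywhere in this post. The sketch below is only a rough arithmetic check under the assumption of a 460 bp amplicon including primers (a placeholder value, to be replaced with the real one); amplicon_len and the other variable names are illustrative. As far as I know, DADA2 can only merge read pairs that overlap, so a result at or below zero would be worth a closer look.

```shell
# ASSUMPTION: amplicon length including primers; placeholder, not from my data.
amplicon_len=460

# Locus-specific primers removed by cutadapt, and the truncation lengths used.
fwd_primer="CCTACGGGNGGCWGCAG"
rev_primer="GACTACHVGGGTATCTAATCC"
trunc_f=210
trunc_r=185

# Length remaining between the primers once they have been trimmed off.
insert_len=$(( amplicon_len - ${#fwd_primer} - ${#rev_primer} ))

# Bases shared by the two truncated reads; a negative value means a gap.
overlap=$(( trunc_f + trunc_r - insert_len ))

echo "insert length after primer removal: $insert_len"
echo "overlap between truncated reads: $overlap"
```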
3) taxonomic classification
qiime feature-classifier classify-sklearn \
  --i-classifier output/99OTUs_greengenes_trained_classifier.qza \
  --i-reads output/dada2_rep_seq.qza \
  --o-classification output/seq_taxonomy.qza
taxa-bar-plots.qzv (443.9 KB)
dada2_rep_seq.qzv (217.6 KB)
all-cutadapt.qzv (317.2 KB)
dada2_denoising-stats.qzv (1.2 MB)