Hi everyone,
I have doubts about how to interpret my taxonomy files. To be clear, I'll break my pipeline into steps and attach the relevant files so you can point out any error I may have made along the way.
I work with low biomass samples, mosquito midguts from different localities.
1 - I demultiplexed my Illumina MiSeq sequences and denoised them with DADA2. I chose a single-end approach because of the poor quality along the length of the reverse reads.
2 - I found reads in my two negative controls, which indicates contamination. To keep it out of my final results, I decided to use an approach that removes cross-contamination.
I decontaminated my table.qza using microDecon in R and created table-decon.qza (the decontaminated table) to proceed with my analysis.
3 - I trained a classifier for the V3-V4 region; from looking through the forum I realized this corresponds to positions 341-805. The primers are from the Illumina tutorial: you can find the primers here
And I used the sequence and taxonomy qza files available here: here
Here is my code to retrain the classifier:
qiime feature-classifier extract-reads \
--i-sequences ./silva-138-SSURef-341f-805r-Seqs.qza \
--p-f-primer CCTACGGGNGGCWGCAG \
--p-r-primer GACTACHVGGGTATCTAATCC \
--p-trunc-len 450 \
--p-min-length 100 \
--p-max-length 600 \
--o-reads ./silva-138-SSURef-341f-805r.qza
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads ./silva-138-SSURef-341f-805r.qza \
--i-reference-taxonomy ./silva-138-341f-805r-consensus-taxonomy.qza \
--o-classifier ./silva-138-341f-805r_classifer.qza
Here is my code to generate the taxonomy file and the taxonomy barplots for the decontaminated table and the "original" table:
qiime feature-classifier classify-sklearn \
--i-classifier silva-138-341f-805r_classifer.qza \
--i-reads rep-seqs-single.qza \
--o-classification taxonomy.qza
qiime metadata tabulate \
--m-input-file taxonomy.qza \
--o-visualization taxonomy.qzv
qiime taxa barplot \
--i-table table-single-decon.qza \
--i-taxonomy taxonomy.qza \
--m-metadata-file /mnt/c/users/Joao/desktop/FASTQ_16S/Metadados.txt \
--o-visualization taxa-bar-plots_decon.qzv
qiime taxa barplot \
--i-table table-single.qza \
--i-taxonomy taxonomy.qza \
--m-metadata-file /mnt/c/users/Joao/desktop/FASTQ_16S/Metadados.txt \
--o-visualization taxa-bar-plots.qzv
Here are screenshots of my taxa barplots from the "decon" table and the "original" table. My question is: why are they so similar if I decontaminated my table? All my samples seem to have a homogeneous profile, which contradicts the literature showing that the microbiota of samples from different locations is heterogeneous.
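To put a number on how much microDecon actually changed each sample, I could compare per-sample read totals between the two tables. This is only a rough sketch: it assumes both tables were first exported to biom-style TSV (for example with qiime tools export followed by biom convert --to-tsv), and the function names per_sample_totals and compare_tables and the file names original.tsv and decon.tsv are made up for illustration.

```shell
# per_sample_totals FILE
# Print "sample<TAB>total reads" for a biom-style TSV
# (header line starts with "#OTU ID", other "#" lines are comments).
per_sample_totals() {
  awk -F'\t' '
    /^#OTU ID/ { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
    /^#/       { next }
               { for (i = 2; i <= n; i++) sum[i] += $i }
    END        { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], sum[i] }
  ' "$1"
}

# compare_tables ORIGINAL.tsv DECON.tsv
# Print "sample<TAB>original<TAB>decon<TAB>percent removed" for every
# sample present in both tables.
compare_tables() {
  a=$(mktemp)
  b=$(mktemp)
  per_sample_totals "$1" | LC_ALL=C sort > "$a"
  per_sample_totals "$2" | LC_ALL=C sort > "$b"
  LC_ALL=C join -t "$(printf '\t')" "$a" "$b" |
    awk -F'\t' '{
      pct = ($2 > 0) ? 100 * ($2 - $3) / $2 : 0
      printf "%s\t%d\t%d\t%.1f\n", $1, $2, $3, pct
    }'
  rm -f "$a" "$b"
}
```

With the hypothetical file names above, this would be run as compare_tables original.tsv decon.tsv. Since the barplot shows relative frequencies, a table that lost only a small or evenly spread fraction of its reads could look almost identical, so this percentage is probably the first thing to check.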
Why are there so many taxa that are not classified even to the order or family level?
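To see how widespread the shallow classifications are, I could tally how many features stop at each taxonomic depth. Again this is just a sketch under assumptions: it expects taxonomy.qza to have been exported to a taxonomy.tsv file with "Feature ID", "Taxon" and "Confidence" columns and semicolon-separated ranks in the Taxon column, and the function name taxonomy_depths is my own invention.

```shell
# taxonomy_depths FILE
# Print "depth<TAB>number of features" for depths 1 to 7, where depth is
# the number of semicolon-separated ranks in the Taxon column
# (1 = domain only, 7 = species). "Unassigned" counts as depth 1.
taxonomy_depths() {
  awk -F'\t' '
    NR == 1 || /^#/ { next }   # skip the header and any comment lines
    { depth = split($2, ranks, ";"); count[depth]++ }
    END { for (d = 1; d <= 7; d++) printf "%d\t%d\n", d, count[d] + 0 }
  ' "$1"
}
```

Running taxonomy_depths taxonomy.tsv would show whether most features stop at depth 2 or 3 (phylum or class), which would suggest a problem with the classifier or the reads rather than with the barplot itself.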
6 - I also compared these results with my alpha rarefaction and saw a difference in diversity between the groups that is not consistent with this homogeneity.
Here are my files in case you want to consult:
table-single.qzv (1.0 MB)
taxa-bar-plots.qzv (365.4 KB)
taxonomy.qzv (1.6 MB)
rep-seqs-single.qzv (922.5 KB)
alpha-rarefaction.qzv (461.4 KB)
table-single-decon.qzv (831.3 KB)
taxa-bar-plots_decon.qzv (363.5 KB)
alpha-rarefaction-decon.qzv (461.2 KB)