Hey guys,
I'm comparing the taxonomic classification output of kraken2, Greengenes (gg-2022-10-nb-classifier.qza) and SILVA (silva-138-99-nb-classifier.qza). I'm using QIIME 2 version 2023.5, and I ran all three on the same dataset.
My problem is this: at the family level, both kraken2 and SILVA identified f__Prevotellaceae and only Greengenes did not, yet at the genus level all three identified Prevotella. According to the NCBI taxonomy, anything classified as genus Prevotella should belong to family Prevotellaceae, but Greengenes places the genus under family Bacteroidaceae (f__Bacteroidaceae; g__Prevotella). How can that be? Is it wrong? I don't know whether I'm doing something wrong; the commands further down are what I used to compare the percentage of reads.
I have seen the other topic "greengenes taxonomy: bacteria belonging to same genus but in different family", and now I wonder whether this is something specific to Greengenes.
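To make the discrepancy easier to see, here is a minimal sketch that counts which parent rank each classifier puts directly above a given genus label. It assumes the taxonomy artifact has already been exported with `qiime tools export --input-path taxonomy.qza --output-path exported-taxonomy`, giving a tab-separated `taxonomy.tsv` with the columns Feature ID, Taxon and Confidence, and that ranks in the Taxon column are separated by "; ". The function name `lineages_for_genus` is made up for illustration, and the match on the genus label is exact, so suffixed labels would need their own call.

```shell
# Hypothetical helper: count the rank found directly above a genus label.
# $1: exported taxonomy.tsv (assumed columns: Feature ID, Taxon, Confidence)
# $2: genus label to look for, e.g. g__Prevotella (exact match)
lineages_for_genus() {
  awk -F'\t' -v genus="$2" '
    NR > 1 {                               # skip the header line
      n = split($2, ranks, /; */)          # split the lineage into ranks
      for (i = 2; i <= n; i++)
        if (ranks[i] == genus) count[ranks[i - 1]]++
    }
    END { for (parent in count) print count[parent] "\t" parent }
  ' "$1" | sort
}
```

Running `lineages_for_genus exported-taxonomy/taxonomy.tsv g__Prevotella` once per classifier output should show whether the family above the genus differs between the reference databases.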
Here are the commands I used:
qiime feature-classifier classify-sklearn \
  --i-classifier gg-2022-10-nb-classifier.qza \
  --i-reads rep-seqs-dada2.qza \
  --o-classification taxonomy.qza
qiime feature-table group \
  --i-table table-dada2.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-column Host_disease \
  --p-mode sum \
  --p-axis sample \
  --o-grouped-table grouped-table.qza
# collapsing the table at each taxonomic level:
# class
qiime taxa collapse \
  --i-table grouped-table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 3 \
  --o-collapsed-table collapsed-table-l3.qza
qiime feature-table relative-frequency \
  --i-table collapsed-table-l3.qza \
  --o-relative-frequency-table percentagetable-l3.qza
qiime metadata tabulate \
  --m-input-file percentagetable-l3.qza \
  --o-visualization percentagetable-l3.qzv
# order
qiime taxa collapse \
  --i-table grouped-table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 4 \
  --o-collapsed-table collapsed-table-l4.qza
qiime feature-table relative-frequency \
  --i-table collapsed-table-l4.qza \
  --o-relative-frequency-table percentagetable-l4.qza
qiime metadata tabulate \
  --m-input-file percentagetable-l4.qza \
  --o-visualization percentagetable-l4.qzv
# family
qiime taxa collapse \
  --i-table grouped-table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 5 \
  --o-collapsed-table collapsed-table-l5.qza
qiime feature-table relative-frequency \
  --i-table collapsed-table-l5.qza \
  --o-relative-frequency-table percentagetable-l5.qza
qiime metadata tabulate \
  --m-input-file percentagetable-l5.qza \
  --o-visualization percentagetable-l5.qzv
# genus
qiime taxa collapse \
  --i-table grouped-table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 6 \
  --o-collapsed-table collapsed-table-l6.qza
qiime feature-table relative-frequency \
  --i-table collapsed-table-l6.qza \
  --o-relative-frequency-table percentagetable-l6.qza
qiime metadata tabulate \
  --m-input-file percentagetable-l6.qza \
  --o-visualization percentagetable-l6.qzv
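The four per-level blocks above differ only in the level number, so they could be folded into one small shell function. This is just a sketch: `summarize_level` is a name invented here, and it assumes grouped-table.qza and taxonomy.qza are in the current directory with the same output file naming as above.

```shell
# Hypothetical helper: collapse, convert to relative frequency, and tabulate
# for one taxonomic level ($1), using the same file names as the commands above.
summarize_level() {
  level="$1"
  qiime taxa collapse \
    --i-table grouped-table.qza \
    --i-taxonomy taxonomy.qza \
    --p-level "$level" \
    --o-collapsed-table "collapsed-table-l${level}.qza" &&
  qiime feature-table relative-frequency \
    --i-table "collapsed-table-l${level}.qza" \
    --o-relative-frequency-table "percentagetable-l${level}.qza" &&
  qiime metadata tabulate \
    --m-input-file "percentagetable-l${level}.qza" \
    --o-visualization "percentagetable-l${level}.qzv"
}
```

Calling it as `for level in 3 4 5 6; do summarize_level "$level"; done` should reproduce the class, order, family and genus outputs; each step only runs if the previous one succeeded.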
Also, when running SILVA, this error appeared about 4 times, although QIIME 2 still generated its usual taxonomy output:
Message from syslogd ... nnot exec /etc/apcupsd/apccontrol changeme: No such file or directory qiime2
The command I used:
qiime feature-classifier classify-sklearn \
  --i-classifier silva-138-99-nb-classifier.qza \
  --i-reads rep-seqs-dada2.qza \
  --o-classification taxonomy.qza
What worries me is that, at the genus level, kraken2 and Greengenes both reached a genus (g__Enterobacter for kraken2, g__Enterobacter_B_683926 for Greengenes), whereas SILVA only classified to the family level (f__Enterobacteriaceae;__). Could that error somehow affect taxonomic classification at the genus level?
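To see which features SILVA left at the family level, and with what confidence, something like the sketch below might help. It assumes the SILVA taxonomy was exported with `qiime tools export` to a tab-separated `taxonomy.tsv` (columns Feature ID, Taxon, Confidence) and that a lineage cut short at the family simply ends with the family label. The function name `stopped_at_rank` is hypothetical.

```shell
# Hypothetical helper: list features whose lineage ends at a given label.
# $1: exported taxonomy.tsv (assumed columns: Feature ID, Taxon, Confidence)
# $2: label the lineage should end with, e.g. f__Enterobacteriaceae
stopped_at_rank() {
  awk -F'\t' -v label="$2" '
    NR > 1 {                               # skip the header line
      n = split($2, ranks, /; */)          # split the lineage into ranks
      if (ranks[n] == label) print $1 "\t" $3
    }
  ' "$1"
}
```

For example, `stopped_at_rank exported-taxonomy/taxonomy.tsv f__Enterobacteriaceae` would print the feature IDs and confidence values for the features in question, which can then be looked up in the kraken2 and Greengenes results.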
Thank you in advance!!