The two files are different. Look for "Ambiguous_taxa" labels in the consensus taxonomy and compare to taxonomy_7_levels
... the ambiguous taxa occur in the consensus taxonomy because there is no consensus at that level, but the full label is listed in taxonomy_7_levels because (presumably) it is the label of the cluster rep seq.
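To make that concrete, here is a made-up illustration (accession and labels invented, mid-level ranks elided) of how the same cluster might look in each file:

```
# consensus taxonomy: no agreement below family, so the lower ranks are flagged
AB123456  D_0__Bacteria;...;D_4__Lachnospiraceae;Ambiguous_taxa;Ambiguous_taxa
# taxonomy_7_levels: the full label of the cluster rep seq
AB123456  D_0__Bacteria;...;D_4__Lachnospiraceae;D_5__Blautia;D_6__uncultured bacterium
```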
"all levels" presents problems for really any classifier, since the taxonomy becomes a knotty mess (and this is not an issue specific to q2-feature-classifier either). The BLAST and VSEARCH-based classifiers will have the same issue unless if you use maxaccepts=1
, which is still going to grab the top hit (like classify-sklearn confidence=-1
) so is sub-optimal.
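If you do want to try that route, here is a minimal sketch of the VSEARCH-based classifier with that setting (file names are placeholders; double-check the parameter and output names against `qiime feature-classifier classify-consensus-vsearch --help` in your release):

```
qiime feature-classifier classify-consensus-vsearch \
  --i-query rep-seqs.qza \
  --i-reference-reads ref-seqs.qza \
  --i-reference-taxonomy ref-taxonomy.qza \
  --p-maxaccepts 1 \
  --o-classification taxonomy.qza
```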
The best solution is really to use the 7-level taxonomy if you can...
You need to import to QIIME 2 — see the feature classifier tutorial on qiime2.org for specific examples.
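A minimal sketch, assuming your reference database is a FASTA of sequences plus a tab-separated taxonomy file (file names are placeholders; on older releases the format flag is `--source-format` instead of `--input-format`):

```
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path ref-seqs.fasta \
  --output-path ref-seqs.qza

qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path ref-taxonomy.txt \
  --output-path ref-taxonomy.qza
```

The tutorial walks through the different input formats in more detail.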
Dear oh dear — mixed orientation is bad news, and not just for taxonomy analysis. dada2 is effectively going to duplicate all ASVs, because the reverse complement of any ASV is a new ASV. Make sense? That's bad news for all analyses, especially if the samples are stratified by orientation: if sample 1's reads are all forward and sample 2's are all reverse, the two samples will share no ASVs at all.
Yes — `classify-sklearn` looks at the first 100 or so seqs to decide the orientation, and classifies based on that. Your mixed orientations leave it confused.
You have a few solutions. Fortunately, it sounds like your samples are stratified by orientation (e.g., sample 1 is all in forward orientation and sample 2 is all reverse). So you could:
- [BEST] reverse-complement the reads that are in the reverse orientation, then proceed as usual (starting with dada2).
- classify your samples in two sets, split by read orientation, then merge the results (see the sketch just below)
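For the two-set option, a rough sketch (the split rep-seq artifacts are hypothetical; I'm assuming you can separate your reads or features by orientation yourself):

```
# classify each orientation group with the same classifier
qiime feature-classifier classify-sklearn \
  --i-classifier classifier.qza \
  --i-reads rep-seqs-forward.qza \
  --o-classification taxonomy-forward.qza

qiime feature-classifier classify-sklearn \
  --i-classifier classifier.qza \
  --i-reads rep-seqs-reverse.qza \
  --o-classification taxonomy-reverse.qza

# then combine the two taxonomy artifacts into one
qiime feature-table merge-taxa \
  --i-data taxonomy-forward.qza taxonomy-reverse.qza \
  --o-merged-data taxonomy.qza
```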
But perhaps I misunderstand and all samples are in mixed orientations, in which case use `classify-consensus-vsearch`, which can already handle mixed-orientation reads.
Yes! VSEARCH comes pre-installed with QIIME 2 and has a method to reverse read orientations. This will only be useful if your read orientation is stratified by sample, not if all samples are in mixed orientations.
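For example, a minimal sketch run on the raw reads before importing, only for the samples you know are in the reverse orientation (file names are placeholders):

```
vsearch --fastx_revcomp sample2_R1.fastq \
  --fastqout sample2_R1_reoriented.fastq
```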