Hi Nicholas,
I had a chance to work on this and am having an issue I can't seem to get around. I am trying to get this working on a dataset I have worked with before, prior to applying it to my real data. I processed this data in a similar manner years ago in mothur, so I would expect to see some differences, although not drastic ones. Fortunately, the data looks similar in terms of relative abundance of Staphylococcus, so I know the communities I previously analyzed do exist with the way I have processed that data. Probably not surprising, but always reassuring.
However, when I try to use my Staphylococcus reference sequences and taxonomy to classify my sequences, almost all of them come back as unassigned. I've tried changing the settings (--p-perc-identity, --p-maxaccepts, --p-query-cov) and see only minimal changes, so I think I'm just not changing the right settings.
I extracted rep-seqs and pulled out one of the representative sequences that matches a species of interest (the species that should account for ~90% of the sequences) and aligned it to the corresponding sequences in the reference alignment. It matches almost entirely, with no gaps. Only a few bases at the beginning don't match, and a couple hundred bases at the end are present in my reference sequences but not in my representative sequences:
So I don't understand why it isn't being classified when using classify-consensus-vsearch. I also tried BLAST, just in case this was some weird thing.
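A minimal sketch of how the match described above could be put into numbers, so it can be compared against the --p-perc-identity and --p-query-cov thresholds. Everything here is an assumption for illustration: the function name `alignment_stats` is hypothetical, the toy sequences are made up, and the identity calculation is a simplification (matches over columns where both sequences have a base), not necessarily the exact definition vsearch uses.

```python
from typing import Tuple


def alignment_stats(query_aln: str, ref_aln: str) -> Tuple[float, float]:
    """Return (identity, query_coverage) for two gapped, equal-length strings.

    identity       = matching columns / columns where both have a base
    query_coverage = columns where both have a base / ungapped query length
    Both are simplified definitions, assumed here for illustration only.
    """
    if len(query_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have the same length")

    query = query_aln.upper()
    ref = ref_aln.upper()

    # Ungapped length of the query (the representative sequence).
    query_len = sum(1 for base in query if base != "-")

    # Keep only columns where neither sequence has a gap.
    paired = [(q, r) for q, r in zip(query, ref) if q != "-" and r != "-"]
    if query_len == 0 or not paired:
        return 0.0, 0.0

    matches = sum(1 for q, r in paired if q == r)
    identity = matches / len(paired)
    coverage = len(paired) / query_len
    return identity, coverage


# Toy example (made-up sequences): the reference extends past the query,
# as in the alignment described above, and there is one mismatch.
print(alignment_stats("ACGTACGT----", "ACGAACGTTTGG"))
```

With a reference that is longer than the query, as described above, the coverage computed this way stays high because it is measured against the query length, not the reference length.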
The command I'm using is:
qiime feature-classifier classify-consensus-vsearch --i-query FB_staph_only_seqs.qza --i-reference-reads ../Staph_ref_seqs_V13.qza --i-reference-taxonomy ../Staph_species_taxonomy.qza --o-classification FB_V13_test_staph_filtered_tax.qza --verbose --p-perc-identity 0.65
As I said, I've tried various settings; this is just one example.
I did try trimming this down, though I don't think that would be the issue, since you can get good classifications for regions of 16S with the full classifiers and I am getting classifications for some taxa. When I trimmed the reference alignment I still had this issue. Here's what that taxa barplot looks like, which is similar to all the others from my various troubleshooting attempts:
The classifications I do have agree with my previous analysis, but obviously I'm missing a lot. Any idea where things might be going wrong?

