I’m interested in finding closely related species in a reference database for a group of ASVs (classified as Aliivibrio). What’s the best way to proceed? I know that a better way to answer the question is to sequence the full-length 16S rRNA gene or the whole genome, but I want to have a look at my 16S data first.
What I can think of:
1. Check the insertion tree generated from the fragment insertion (SEPP) in iTOL. However, the taxonomy.qza file generated by the qiime2-feature-classifier is not compatible with the insertion tree.
2. Build a de novo tree.
   - Download Aliivibrio reference sequences. Which reference database should I use? I used the SILVA 132 Ref NR database for the taxonomic classification. Maybe I should use SILVA 132 Ref?
   - Extract amplicon sequences from the Aliivibrio reference sequences using the universal primers.
   - Build a de novo tree using FastTree, RAxML or IQ-TREE.
Ideally, the de novo tree should be built using command line tools only.
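To make the amplicon-extraction step concrete, here is a minimal Python sketch of an in-silico PCR on a single reference sequence. It is only an illustration: the function names are hypothetical, the primers in the usage note (515F/806R) are placeholders for whichever universal primers were actually used, and it accepts only exact matches after expanding IUPAC degenerate bases. Dedicated tools such as `qiime feature-classifier extract-reads` or cutadapt also tolerate mismatches and would probably be the better choice in practice.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def normalize(seq):
    """Upper-case a sequence and convert RNA (U) to DNA (T)."""
    return seq.upper().replace("U", "T")


def primer_to_regex(primer):
    """Expand a degenerate primer into a regular expression."""
    return "".join(IUPAC[base] for base in normalize(primer))


def reverse_complement(seq):
    """Reverse-complement a sequence, including degenerate codes."""
    return normalize(seq).translate(COMPLEMENT)[::-1]


def extract_amplicon(seq, fwd_primer, rev_primer):
    """Return the region between the primers (primers trimmed),
    or None if either primer site is not found."""
    seq = normalize(seq)
    fwd = re.search(primer_to_regex(fwd_primer), seq)
    if fwd is None:
        return None
    # The reverse primer binds the opposite strand, so search for its
    # reverse complement downstream of the forward primer site.
    downstream = seq[fwd.end():]
    rev = re.search(primer_to_regex(reverse_complement(rev_primer)), downstream)
    if rev is None:
        return None
    return downstream[:rev.start()]
```

Called as `extract_amplicon(reference_seq, "GTGYCAGCMGCCGCGGTAA", "GGACTACNVGGGTWTCTAAT")`, it returns the trimmed region for one sequence; looping over a FASTA file and writing the results would give the input for the alignment and tree-building step.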
Any suggestions or comments are greatly appreciated!
First, why isn’t the insertion tree compatible with your taxonomy? Shouldn’t you just be able to concatenate your reference and feature taxonomy and use that to filter the tree? Did you classify with a different database than you used for fragment insertion?
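As a rough sketch of what I mean by concatenating the two taxonomies, something like the following would collect the tip IDs to keep. It assumes both taxonomies have been exported to tab-separated files with the ID in the first column and the taxonomy string in the second (a `Feature ID` header row is skipped if present); the function names are made up for illustration, and the actual tree filtering would still need to be done with a tree-aware tool.

```python
import csv


def read_taxonomy(handle):
    """Parse a tab-separated taxonomy table into {id: taxon}.

    Assumes column 1 is the ID and column 2 the taxonomy string;
    a 'Feature ID' header row and malformed rows are skipped.
    """
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0] == "Feature ID":
            continue
        table[row[0]] = row[1]
    return table


def tips_in_genus(genus, *tables):
    """Merge taxonomy tables and return the sorted IDs whose taxonomy
    string mentions the given genus."""
    merged = {}
    for table in tables:
        merged.update(table)
    return sorted(tip for tip, taxon in merged.items() if genus in taxon)
```

The resulting ID list (reference tips plus your ASVs) is what you would hand to whatever you use to prune the insertion tree down to the clade of interest.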
And, for your de novo tree, again, why not just prune the reference tree to just the clade you want? Like, why do amplicons instead if you’re using the reference? You lose information, but I’m not clear what the benefit is. If you’ve got specific ASVs, I can totally see a tree from those, but why not use the reference?
Sorry if these are difficult questions, I’m just trying to work through the logic in my head.
PS. Have you checked out Empress? It’s supposed to be super shiny for phylogenetic visualization, and I think it’s QIIME 2 compatible.