Many years ago, I used QIIME 1 with PyNAST to filter sequence sets prior to EPA placement with RAxML (now I would use epa-ng), and I'm looking for a practical way to do the same filtering today.
inputs: a set of ASVs that are pre-filtered based on interest, plus guide alignments and trees for specific clades, again selected based on interest
caveats: assume that we don't know a priori which ASVs fall within which clade, so I need a practical way to filter the set of ASVs prior to alignment and placement within each clade of interest. I'd like to do this for 8+ clades and two genes, so a smooth approach that isn't too hacky or labour intensive would be ideal
goals: given the input ASVs, align them to a template alignment based on similarity (i.e. sequences too dissimilar to the template are discarded during the alignment process) and use the resulting alignment to place the ASVs into a guide tree
Does anyone have a working method for doing this? Specifically: I don't know a good piece of software to use for discarding dissimilar ASVs when aligning to a template. Tree placement is trivial once I have the correct inputs for that step.
This is a reasonable approach, and it is basically what I ended up doing: blastn against the reference sequence set, then filter the ASVs going into alignment against the reference alignment based on their percent similarity in the BLAST results. I was hoping there was still a purpose-built tool for this that someone knew about, but this method worked okay.
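
For anyone landing here later, the filtering step can be sketched roughly as below. This is a minimal sketch, not the exact script I used: it assumes the BLAST results are in the standard 12-column tabular format (`-outfmt 6`), and the function names and both thresholds (`min_pident`, `min_aln_len`) are hypothetical placeholders you would tune per clade and gene.

```python
import csv
import io

# Assumed upstream step (file and database names are placeholders):
#   blastn -query asvs.fasta -db clade_refs -outfmt 6 -out hits.tsv
# Standard outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
#                            qstart qend sstart send evalue bitscore


def passing_queries(blast_tsv, min_pident=90.0, min_aln_len=100):
    """Return IDs of queries with at least one hit meeting both thresholds.

    The default thresholds are illustrative only, not recommendations.
    """
    keep = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, pident, length = row[0], float(row[2]), int(row[3])
        if pident >= min_pident and length >= min_aln_len:
            keep.add(qseqid)
    return keep


def filter_fasta(fasta_lines, keep_ids):
    """Yield the lines of FASTA records whose ID (first word of the header) is kept."""
    writing = False
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].split()
            writing = bool(header) and header[0] in keep_ids
        if writing:
            yield line


# Toy data: asv2 fails on percent identity, asv3 fails on alignment length.
blast = io.StringIO(
    "asv1\tref_A\t97.5\t250\t6\t0\t1\t250\t10\t259\t1e-120\t430\n"
    "asv2\tref_B\t81.0\t240\t45\t1\t1\t240\t5\t244\t1e-40\t160\n"
    "asv3\tref_A\t95.0\t60\t3\t0\t1\t60\t10\t69\t1e-20\t90\n"
)
fasta = [">asv1\n", "ACGT\n", ">asv2\n", "GGCC\n", ">asv3\n", "TTAA\n"]

keep = passing_queries(blast)
filtered = list(filter_fasta(fasta, keep))
print("".join(filtered), end="")
```

Running this once per clade (one reference set and one BLAST table each) gives a per-clade FASTA that can go straight into alignment against that clade's reference alignment and then into placement. An ASV can pass for more than one clade if the thresholds are loose, so it may be worth checking for overlaps.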