I'm analyzing AMF sequences using the QIIME 2 2021.8 release and the MaarjAM reference database. I ran vsearch cluster-features-closed-reference to cluster sequences at 97% similarity (the standard for AMF) and got a clustered and identified qza file as well as an unmatched qza file.

I would like to re-cluster the unmatched qza file at a lower similarity (90%) to see if we can get closer to an ID on some of these sequences. However, because the unmatched qza file contains only a portion of the feature IDs from the dada2 table, I cannot do this.

Is there a way to filter the feature-table based on sequences? It appears that the reverse is possible (filter sequences based on a large variety of inputs) but I have yet to figure out how to filter a feature table.

I think you are looking for feature-table filter-seqs! Just provide your .qza from DADA2 as the metadata file and it should only keep the features in your table that are also found in it.

Thanks for the reply. One more question: I got the sequence file to filter this way, but how do I filter the dada2 table? In order to recluster I need the dada2 table filtered as well. When I try to put in the table with the filtered sequences I get the error "Filtering with metadata and filtering with a table are mutually exclusive". Is there a way to create a feature table without going through dada2 again?


@Jennifer_Bell, for that you will need to use feature-table filter-features, with the unmatched sequences qza passed as the metadata file.
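In case it helps, here is a rough sketch of how the two steps could fit together: filter the dada2 table down to the feature IDs present in the unmatched sequences, then re-cluster at 90%. All file names and the wrapper function recluster_unmatched are placeholders I made up for illustration, so swap in your own artifacts. This assumes the unmatched sequences artifact can be read as metadata by filter-features, and I have not run it against your data.

```shell
#!/bin/sh
# Placeholder file names (assumptions, not taken from the original post).
TABLE=table-dada2.qza            # feature table produced by dada2
UNMATCHED_SEQS=unmatched-97.qza  # unmatched sequences from the 97% run
REFERENCE=maarjam-ref-seqs.qza   # imported MaarjAM reference sequences

# Hypothetical wrapper so both steps run together; call it from an
# activated QIIME 2 environment.
recluster_unmatched() {
    # Step 1: keep only the table features whose IDs appear in the
    # unmatched sequences (the sequences artifact is used as metadata).
    qiime feature-table filter-features \
        --i-table "$TABLE" \
        --m-metadata-file "$UNMATCHED_SEQS" \
        --o-filtered-table unmatched-table.qza || return 1

    # Step 2: closed-reference clustering of the unmatched features at 90%.
    qiime vsearch cluster-features-closed-reference \
        --i-table unmatched-table.qza \
        --i-sequences "$UNMATCHED_SEQS" \
        --i-reference-sequences "$REFERENCE" \
        --p-perc-identity 0.90 \
        --o-clustered-table table-cr-90.qza \
        --o-clustered-sequences rep-seqs-cr-90.qza \
        --o-unmatched-sequences unmatched-90.qza
}
```

After saving this to a file and sourcing it in a shell where QIIME 2 is active, running recluster_unmatched should produce the 90% clustered table and sequences, plus a new unmatched file for anything that still does not hit the reference.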

