Reference database for clustering

Hi @Rakaya,

RE 1:

Yep.

RE 2:

There is no need to cluster your reference sequences to match the clustering of your data, for reasons previously mentioned within these threads:

Additionally:

RE 3:

Simply add the parameter `--p-perc-identity 0.85` to that command. :warning: I would not recommend using 85%. It is given here only as an example of a value that drastically reduces the database size.
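For illustration only, here is a rough sketch of what that could look like. I am assuming the command in question is RESCRIPt's `qiime rescript dereplicate`; if you are running a different action, only the `--p-perc-identity` part carries over. The file names are hypothetical placeholders, so swap in your own artifacts.

```shell
# Example value only; going as low as 85% is not recommended.
PERC_IDENTITY=0.85

# Assumption: the command being modified is RESCRIPt's dereplicate action.
# All .qza file names below are hypothetical placeholders.
if command -v qiime >/dev/null 2>&1; then
  qiime rescript dereplicate \
    --i-sequences silva-seqs.qza \
    --i-taxa silva-tax.qza \
    --p-perc-identity "$PERC_IDENTITY" \
    --o-dereplicated-sequences silva-seqs-derep.qza \
    --o-dereplicated-taxa silva-tax-derep.qza
else
  # Nothing is run if QIIME 2 is not available in the current environment.
  echo "qiime not found on PATH; activate your QIIME 2 environment first" >&2
fi
```

Changing `PERC_IDENTITY` to something like `0.99` keeps the same command shape while clustering far less aggressively.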

RE 4:

Yes, you can classify your 85% clustered reads against the 99% SILVA database. Again, I'd avoid 85%. The content I linked for RE 2 applies here too.
