97% vs 94% OTU similarity: what it means with respect to the files in the Silva database

Dzana_Basic · January 16, 2018, 3:34pm

Hi all,

I'm analysing my data using the Silva reference database, and I'm just wondering: when I open the taxonomy and rep_set files for 94% and 97% OTU similarity and compare some taxa, I cannot find any differences. Does anyone know how these files are constructed? What do the sequences inside these files represent?

Nicholas_Bokulich · January 16, 2018, 4:50pm

Hi @Dzana_Basic,
These files contain the representative sequences (cluster centroids) for SILVA reference sequences clustered at different % identity thresholds (94% and 97%).

So the two files should contain different sets of sequence IDs and different sequence counts. Many of the same centroid sequences may be present in both files, however, which may explain the redundancies you have found.

Does this explain the similarities that you see, or are the files literally replicates of each other?
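
One quick way to check this is to compare the sequence IDs in the two rep_set FASTA files. The following is only a minimal sketch: the helper names `read_fasta` and `compare_rep_sets` are hypothetical, and the toy IDs and sequences are invented for illustration, not taken from SILVA. Clustering at a higher identity threshold generally yields more clusters, so one would expect the 97% file to hold more sequences than the 94% file, with some IDs shared between them.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA lines into {sequence_id: sequence}."""
    records = {}
    seq_id = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token of the header.
            seq_id = line[1:].split()[0]
            records[seq_id] = ""
        elif seq_id is not None:
            records[seq_id] += line
    return records


def compare_rep_sets(recs_a, recs_b):
    """Summarise how two sets of representative sequences overlap by ID."""
    ids_a, ids_b = set(recs_a), set(recs_b)
    return {
        "count_a": len(ids_a),
        "count_b": len(ids_b),
        "shared": len(ids_a & ids_b),
        "only_a": len(ids_a - ids_b),
        "only_b": len(ids_b - ids_a),
    }


# Toy stand-ins for the 94% and 97% rep_set files (invented data).
toy_94 = [">A", "ACGT", ">C", "GGCC"]
toy_97 = [">A", "ACGT", ">B", "ACGA", ">C", "GGCC"]

summary = compare_rep_sets(read_fasta(toy_94), read_fasta(toy_97))
print(summary)
```

With real data, an open file handle can be passed in place of the toy lists, for example `read_fasta(open(path))` with `path` pointing to wherever your rep_set file lives. If `only_a` and `only_b` are both zero and the counts match, the two files really do contain the same set of IDs; otherwise the overlap you noticed is just the shared centroids.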

system · February 16, 2018, 10:50pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.