How to check how many sequences are contained in the classifier file


Does anyone know how to find out how many sequences are in a classifier file once it's in .qza form? For example, in the file silva-138-99-nb-classifier.qza?

I know that I can check the number of sequences in Silva 138 on their website, but as far as I understand, this is not the same number that made it into silva-138-99-nb-classifier.qza, because there was an additional stage of filtering (by length etc.).

I hope someone enlightened can help? :slight_smile:
Thank you!

Hi @fgara,

This cannot be done because the raw sequences are not stored in the trained classifier file.

The only way to get this information is to look at the sequences that were used for training.

In this case you are in luck: the sequences used to train the classifiers are shared on the Q2 data-resources page where you downloaded the pre-trained classifier. So you can download the sequence artifact and check the count of sequences in that file.
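For counting the sequences once the artifact is downloaded, here is a minimal sketch in Python. It assumes the sequence artifact is an ordinary zip archive with the FASTA file stored under `<uuid>/data/`, which is how QIIME 2 artifacts are normally laid out. The function name `count_fasta_records` is just something I made up for illustration, not part of QIIME 2.

```python
import zipfile


def count_fasta_records(qza_path):
    """Count FASTA records inside a QIIME 2 sequence artifact (.qza).

    Assumes the artifact is a zip archive laid out as <uuid>/data/<files>.
    """
    with zipfile.ZipFile(qza_path) as archive:
        # Keep only FASTA files sitting directly in the top-level data/ dir.
        fasta_members = [
            name for name in archive.namelist()
            if len(name.split("/")) == 3
            and name.split("/")[1] == "data"
            and name.endswith(".fasta")
        ]
        if not fasta_members:
            raise ValueError(f"no FASTA file found under data/ in {qza_path}")

        count = 0
        for member in fasta_members:
            with archive.open(member) as handle:
                # Every FASTA record starts with a '>' header line.
                count += sum(1 for line in handle if line.startswith(b">"))
        return count
```

You would call it as `count_fasta_records("ref-seqs.qza")`, where `ref-seqs.qza` is a placeholder for whatever the downloaded sequence file is called. It will not work on the classifier .qza itself, since that one has no FASTA file inside.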

I hope that helps!
