The number of samples in which a certain species is present differs before and after filtering features with --p-min-frequency

Problem: The number of samples in which a certain species is present is not the same before and after filtering the features using --p-min-frequency.

Background: I filtered table.qza with a minimum frequency of 18, then filtered rep-seqs.qza based on the resulting filtered-table.qza,
and then assigned taxonomy using the pre-trained Greengenes classifier (gg-13-8-99-515-806-nb-classifier.qza).

Code for filtering table.qza:

qiime feature-table filter-features \
  --i-table table.qza \
  --p-min-frequency 18 \
  --o-filtered-table filtered-table.qza \
  --verbose

Problem:
I compared the number of samples in which a certain species at level 4 was present before and after filtering the features with a minimum frequency of 18.
I saw a discrepancy in the number of samples in which it was observed. For example:
the taxon k__Archaea;p__Crenarchaeota;c__MBGA;o__NRP-J was seen in 140/180 samples before filtering, but in only 135/180 samples after filtering.

Could you please help to figure what I am missing?

Thanks.

Hi! A feature table doesn’t really contain species yet; what it contains is OTUs or ASVs. When you collapse the table to a taxonomy level, you combine several OTUs into one taxon. Some of those OTUs are more abundant, some less. So if you filter out the less abundant OTUs, you can lose the taxon completely in the samples where it was represented only by those removed OTUs.
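To make that concrete, here is a minimal sketch in plain Python with a made-up three-sample table. It does not use QIIME 2 itself: the feature IDs, counts, and the helper functions `filter_min_frequency` and `samples_per_taxon` are all hypothetical, and the filter simply assumes that a feature is kept when its total count across all samples reaches the threshold.

```python
from collections import defaultdict

# Toy feature table: feature ID -> {sample ID: read count}. All values are invented.
table = {
    "asv1": {"S1": 40, "S2": 25, "S3": 0},   # abundant, total 65
    "asv2": {"S1": 0, "S2": 3, "S3": 5},     # rare, total 8
    "asv3": {"S1": 10, "S2": 0, "S3": 30},   # belongs to a different taxon
}

# Hypothetical level-4 assignments: asv1 and asv2 collapse into the same taxon.
taxonomy = {
    "asv1": "k__Archaea;p__Crenarchaeota;c__MBGA;o__NRP-J",
    "asv2": "k__Archaea;p__Crenarchaeota;c__MBGA;o__NRP-J",
    "asv3": "k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales",
}


def filter_min_frequency(table, min_frequency):
    """Keep features whose total count across all samples is >= min_frequency."""
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(counts.values()) >= min_frequency
    }


def samples_per_taxon(table, taxonomy):
    """Collapse features by taxon; return taxon -> set of samples with a count > 0."""
    present = defaultdict(set)
    for feature, counts in table.items():
        for sample, count in counts.items():
            if count > 0:
                present[taxonomy[feature]].add(sample)
    return dict(present)


taxon = "k__Archaea;p__Crenarchaeota;c__MBGA;o__NRP-J"
before = samples_per_taxon(table, taxonomy)
after = samples_per_taxon(filter_min_frequency(table, 18), taxonomy)

# Sample S3 contained this taxon only through the rare asv2, so it drops out.
print(len(before[taxon]), len(after[taxon]))  # prints "3 2"
```

In this toy case the taxon is present in 3 samples before filtering and 2 after, for the same reason your count went from 140 to 135: a few samples carried it only through features that were removed.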


Thanks Timanix for clarifying.
