Error with filter-features-conditionally on picrust2 outputs

vimh · August 9, 2023, 2:27pm

Hello everyone,
I'm not sure this is the best channel for my question, but maybe you can help me. I'm trying to use the filter-features-conditionally action on the ko-metagenome table output by PICRUSt2. I want to retain the most frequent KO numbers in order to draw a heatmap.

qiime feature-table filter-features-conditionally \
--i-table ko-table.qza \
--p-abundance 0.9 \
--p-prevalence 0 \
--o-filtered-table filtered-ko-table.qza

This is confusing to me because when I use the above command my table is not filtered at all: the resulting "filtered-ko-table.qza" contains exactly the same information as "ko-table.qza". On the other hand, when I use the command below, my table is empty; all the features are filtered out. I can see this with feature-table summarize, and also because in the first case the table is too large to draw as a heatmap, while in the second the plugin says that it can't draw anything from an empty table (obviously).

qiime feature-table filter-features-conditionally \
--i-table ko-table.qza \
--p-abundance 0.9 \
--p-prevalence 0.1 \
--o-filtered-table filtered-ko-table.qza

Maybe I don't really understand the plugin. I'm trying to retain the 10% most abundant features in my table, and I only need them to be very abundant in a single sample so that I can compare it with the others. I don't know whether PICRUSt2 output tables have a different format, or the plugin is not reading the table correctly, or what is happening here, but I definitely need your help.

Thank you very much,
Víctor

gregcaporaso · August 9, 2023, 4:05pm

Hi @vimh,
This action, as called in your second command, will retain all features that are present at a relative abundance of at least 90% in at least 10% of your samples, so it's working as expected (it's unlikely that any features meet those criteria). Your first command will retain all features that are present at a relative abundance of at least 90% in at least 0% of your samples, so that one should keep everything. Does that make sense?
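
To make the rule above concrete, here is a minimal pandas sketch of the behaviour as described in this post. It is an illustration only, not QIIME 2's actual implementation; the function name `conditional_filter` and the toy KO table are hypothetical, and the table is assumed to have samples as rows and features as columns.

```python
import pandas as pd


def conditional_filter(table: pd.DataFrame, abundance: float,
                       prevalence: float) -> pd.DataFrame:
    """Keep features reaching `abundance` (relative) in at least
    `prevalence` (fraction) of samples. Hypothetical helper."""
    # Convert raw counts to per-sample relative abundances (rows sum to 1).
    rel = table.div(table.sum(axis=1), axis=0)
    # Fraction of samples in which each feature meets the abundance threshold.
    frac_samples = (rel >= abundance).mean(axis=0)
    # Retain only the features that are prevalent enough.
    return table.loc[:, frac_samples >= prevalence]


# Toy table (made-up counts): 4 samples, 3 KOs, each sample sums to 100.
toy = pd.DataFrame(
    {"K00001": [95, 1, 2, 3],
     "K00002": [3, 50, 49, 48],
     "K00003": [2, 49, 49, 49]},
    index=["s1", "s2", "s3", "s4"],
)

# prevalence 0: every feature trivially passes, so nothing is removed.
print(list(conditional_filter(toy, 0.9, 0.0).columns))   # ['K00001', 'K00002', 'K00003']
# prevalence 0.25 (1 of 4 samples): only the KO that dominates s1 survives.
print(list(conditional_filter(toy, 0.9, 0.25).columns))  # ['K00001']
```

With a prevalence of 0.5 the same toy table comes back empty, which mirrors the empty result from the second command.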

I don't think we have an easy way to do exactly what you're looking for here. If you set --p-prevalence to one divided by your total number of samples, that should let you retain features based on abundance in one sample. You could then experiment with --p-abundance settings to try to keep the most abundant features.
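
As a rough starting point for that experimentation, the sketch below estimates candidate values for the two parameters from the table itself. It assumes the table is already available as a pandas.DataFrame with samples as rows and features as columns; `suggest_thresholds` and the toy data are hypothetical, and the quantile only approximates "the top 10%" when several features tie.

```python
import pandas as pd


def suggest_thresholds(table: pd.DataFrame, keep_fraction: float = 0.10):
    """Return (prevalence, abundance) candidates. Hypothetical helper."""
    n_samples = table.shape[0]
    # Per-sample relative abundances.
    rel = table.div(table.sum(axis=1), axis=0)
    # Highest relative abundance each feature reaches in any single sample.
    peak = rel.max(axis=0)
    # One sample out of n is enough for a feature to count as prevalent.
    prevalence = 1 / n_samples
    # Abundance cut-off that roughly the top `keep_fraction` of features exceed.
    abundance = peak.quantile(1 - keep_fraction)
    return prevalence, abundance


# Toy table (made-up counts): 4 samples, 10 KOs.
toy = pd.DataFrame(
    {f"K{i:05d}": [i, 1, 1, 1] for i in range(1, 11)},
    index=["s1", "s2", "s3", "s4"],
)

prevalence, abundance = suggest_thresholds(toy)
print(prevalence, abundance)
```

The printed values could then be tried as --p-prevalence and --p-abundance, adjusting the abundance up or down until the number of retained features looks right.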

Alternatively, if you're comfortable with Python programming, you can use the Python 3 API to load your feature table as a pandas.DataFrame and perform the filtering with pandas.
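
A minimal sketch of that pandas route follows. The helper `keep_top_features` and the toy table are hypothetical, and "most abundant" is taken here to mean the highest relative abundance reached in any single sample, which is one possible reading of the goal in this thread. The commented lines show how the table might be loaded and saved with the QIIME 2 Artifact API, assuming ko-table.qza is a FeatureTable[Frequency] artifact.

```python
import pandas as pd

# In a QIIME 2 environment the table could be loaded like this
# (samples as rows, features as columns):
#   from qiime2 import Artifact
#   table = Artifact.load("ko-table.qza").view(pd.DataFrame)


def keep_top_features(table: pd.DataFrame, fraction: float = 0.10) -> pd.DataFrame:
    """Keep the `fraction` of features with the highest peak relative
    abundance in any one sample. Hypothetical helper."""
    rel = table.div(table.sum(axis=1), axis=0)
    peak = rel.max(axis=0)
    # Always keep at least one feature.
    n_keep = max(1, int(round(fraction * table.shape[1])))
    top_ids = peak.nlargest(n_keep).index
    # Preserve the original column order of the retained features.
    return table.loc[:, table.columns.isin(top_ids)]


# Toy table (made-up counts): 3 samples, 5 KOs, each sample sums to 100.
toy = pd.DataFrame(
    {"K00001": [90, 5, 5],
     "K00002": [5, 80, 5],
     "K00003": [3, 5, 30],
     "K00004": [1, 5, 30],
     "K00005": [1, 5, 30]},
    index=["s1", "s2", "s3"],
)

filtered = keep_top_features(toy, fraction=0.4)
print(list(filtered.columns))  # ['K00001', 'K00002']

# The result could then be saved back as an artifact, for example:
#   Artifact.import_data("FeatureTable[Frequency]", filtered).save("filtered-ko-table.qza")
```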

I'll keep this post queued in case other developers have ideas on how to achieve this that I'm not thinking of.

vimh · August 10, 2023, 8:01am

Hello Greg,
Thank you so much for your help. I didn't fully understand how the plugin worked and was trying to make it fit my needs. I will try to do this outside QIIME, or calculate a threshold frequency in order to use filter-features with the --p-where parameter to filter out frequencies below the threshold.

Indeed, no feature is frequent enough to pass even a 1% filter. I used --p-abundance 0.001 and the table was filtered.

Thank you again, and sorry for taking up your time.

Víctor

gregcaporaso · August 10, 2023, 4:23pm

Sounds good - feel free to post if you run into additional questions!