Hello,
I would like to do the following:
Filter out features that have counts of 10 or fewer within a sample (rather than across samples). I posted about this previously, and one recommendation was to export my feature table and filter it in Python. I am wondering how best to do this.
My Python is basic, so I was wondering if I could get help with this. Does my script below make sense?
Hello!
That makes sense to me.
You should try it.
If you get errors while reading the table, try this slightly modified code:
import pandas as pd
# Skip the first row ("# Constructed from biom file") and use the feature IDs as the index to avoid format issues.
df = pd.read_csv("exported-table/feature-table.tsv", sep="\t", skiprows=1, index_col=0)
# Within each sample, set counts of 10 or less to 0.
df[df < 11] = 0
df.to_csv("filtered_table.tsv", sep="\t")
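For anyone who wants to check what this kind of filter does before running it on real data, here is a minimal sketch on a made-up table. The sample names, feature names, and the helper `zero_low_counts` are hypothetical, and dropping features that end up with no counts at all is an extra step I am assuming you may want; it is not part of the code above.

```python
import pandas as pd


def zero_low_counts(table: pd.DataFrame, min_count: int = 11) -> pd.DataFrame:
    """Return a copy where every count below min_count is set to 0, cell by cell."""
    # where() keeps cells that satisfy the condition and replaces the rest with 0
    return table.where(table >= min_count, 0)


# Made-up feature table: rows are features, columns are samples
toy = pd.DataFrame(
    {"sampleA": [25, 10, 3], "sampleB": [11, 40, 0]},
    index=pd.Index(["feat1", "feat2", "feat3"], name="#OTU ID"),
)

filtered = zero_low_counts(toy)

# Optional (an assumption, not in the original code): drop features that
# no longer have counts in any sample
filtered = filtered.loc[filtered.sum(axis=1) > 0]

print(filtered)
```

Because the comparison is done cell by cell, a feature can be zeroed in one sample and kept in another, which is the "within a sample" behaviour asked for.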
Thanks so much @timanix! Worked perfectly.
How can I convert filtered_table.tsv back to a feature table? I'm guessing I can do it in two steps:
1- first, convert it back to a BIOM file (not sure how to do this)
then
2- convert the BIOM file to a feature table with the script below:
Thank you!
I'm not sure which 'input format' to use for the qiime tools import command, though: BIOMV100Format or BIOMV210Format? I tried reading the BIOM file format docs, but I'm still not sure.
Any idea which one I should choose?
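As far as I know, BIOMV100Format is for BIOM 1.0 files (JSON) and BIOMV210Format is for BIOM 2.1 files (HDF5), so the choice depends on how the .biom file was written: `biom convert` with `--to-hdf5` gives the HDF5 kind, and `--to-json` gives the JSON kind. If you are unsure which kind a file is, a rough sketch like the one below can peek at its first bytes. The function names are hypothetical, and it assumes the HDF5 signature sits at the very start of the file, which is the usual case.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # the 8-byte signature that opens an HDF5 file


def guess_format_from_header(head: bytes) -> str:
    """Guess the QIIME 2 input format name from the first bytes of a .biom file."""
    if head.startswith(HDF5_SIGNATURE):
        return "BIOMV210Format"  # BIOM 2.1 is stored as HDF5
    if head.lstrip().startswith(b"{"):
        return "BIOMV100Format"  # BIOM 1.0 is stored as a JSON object
    raise ValueError("Not recognisable as BIOM JSON or BIOM HDF5")


def guess_format(biom_path: str) -> str:
    """Read the start of the file at biom_path and guess its format."""
    with open(biom_path, "rb") as handle:
        return guess_format_from_header(handle.read(64))
```

A plain TSV such as filtered_table.tsv raises the ValueError here, which is a reminder that it needs the BIOM conversion step first.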
If you're still looking for help with this, I've made a command line / Python toolkit that has this functionality, which you can easily install with conda in your QIIME environment!
Check out the Per Sample Filtering section, where you can set a single integer level to filter at within each sample, or input a .csv with unique filtering levels for each sample.
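To make the idea of unique filtering levels per sample concrete, here is a small pandas sketch. It is not the toolkit's implementation, just an illustration of the concept; the function `filter_per_sample`, the sample names, and the thresholds are all made up, and in practice the thresholds could be read from a .csv instead of typed in as a dict.

```python
import pandas as pd


def filter_per_sample(table: pd.DataFrame, min_counts: dict) -> pd.DataFrame:
    """Zero every count that is below the minimum set for its own sample."""
    # Line the thresholds up with the sample columns of the table
    thresholds = pd.Series(min_counts).reindex(table.columns)
    if thresholds.isna().any():
        missing = list(thresholds[thresholds.isna()].index)
        raise KeyError(f"No threshold given for samples: {missing}")
    # Compare each column against its own threshold, then zero the failures
    keep = table.ge(thresholds, axis="columns")
    return table.where(keep, 0)


# Made-up table and thresholds (hypothetical values)
toy = pd.DataFrame(
    {"sampleA": [25, 10, 3], "sampleB": [11, 4, 6]},
    index=["feat1", "feat2", "feat3"],
)
per_sample = filter_per_sample(toy, {"sampleA": 11, "sampleB": 5})
print(per_sample)
```

Passing the same number for every sample reduces this to the single-level filter discussed earlier in the thread.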