feature-table filter-features and filter-seqs removing a lot of reads?

emmlemore · April 12, 2023, 2:53pm

Hi all,

I am following this tutorial and am now at the 'filtering out contaminant ASVs' step, using filter-features and filter-seqs on my samples:

qiime feature-table filter-features \
  --i-table /home/qza_files/16S_table.qza \
  --p-min-frequency 68 \
  --o-filtered-table /home/qza_files/16S_filt_table.qza

I chose a minimum frequency of 68 based on my Mean Frequency, which was 67,851: multiplying that by 0.001 gives 67.851, which I rounded to 68. This means an ASV is kept only if it is seen at least 68 times in total across my samples.
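The arithmetic behind that threshold can be checked with a short sketch. The 67,851 and 0.001 values come from the post; the variable names are only illustrative.

```python
# Values taken from the post above; variable names are illustrative.
mean_frequency = 67_851   # "Mean Frequency" reported in the table summary
fraction = 0.001          # rule-of-thumb factor described in the post

raw_threshold = mean_frequency * fraction
min_frequency = round(raw_threshold)  # --p-min-frequency takes an integer

print(f"raw threshold: {raw_threshold:.3f}")
print(f"--p-min-frequency: {min_frequency}")
```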

However, after this filtering step, I went from over 25,000 features to 4,000 features. That seems like a lot to remove, and I don't know if that's normal. These are seawater + oil samples. Is this step necessary?

Thank you.

cherman2 · April 12, 2023, 4:52pm

Hello @emmlemore,
Looking at the tutorial you linked, your math is good; you just have a large number of low-frequency features. I don't think that is especially abnormal, and your command looks good to me.

This step is not strictly necessary. There are a lot of ways to go about filtering out contaminants. In general, selecting a minimum frequency threshold is a trade-off between minimizing the number of false-positive features (i.e., features that appear due to noise or sequencing errors) and retaining real signals in your data.

You could look into lowering your --p-min-frequency and also using the --p-min-samples parameter. This would allow you to select for features that have a minimum total frequency and appear in a minimum number of samples.
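To make the combined effect of the two parameters concrete, here is a minimal sketch in plain Python. It is not QIIME 2 code: the toy table, the feature IDs, and the filter_features helper are hypothetical, and it only mimics the idea that a feature must meet both the total-frequency and the sample-count minimums to be kept.

```python
# Hypothetical toy table: feature ID -> per-sample counts (not real data).
table = {
    "asv_a": {"s1": 500, "s2": 300, "s3": 0},
    "asv_b": {"s1": 70, "s2": 0, "s3": 0},
    "asv_c": {"s1": 10, "s2": 12, "s3": 9},
    "asv_d": {"s1": 2, "s2": 0, "s3": 0},
}


def filter_features(table, min_frequency=0, min_samples=0):
    """Keep features whose total count and sample prevalence both meet the minimums."""
    kept = {}
    for feature_id, counts in table.items():
        total = sum(counts.values())                           # frequency summed over samples
        n_samples = sum(1 for c in counts.values() if c > 0)   # samples where the feature occurs
        if total >= min_frequency and n_samples >= min_samples:
            kept[feature_id] = counts
    return kept


# Frequency-only filter, as in the original command.
strict = filter_features(table, min_frequency=68)
# Lower frequency cutoff, but require presence in at least two samples.
relaxed = filter_features(table, min_frequency=20, min_samples=2)

print(sorted(strict))
print(sorted(relaxed))
```

In this toy case the frequency-only filter keeps a feature seen in a single sample (asv_b), while the combined filter drops it and instead keeps a lower-abundance feature that shows up in every sample (asv_c).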

Another option is to use something like decontam if you have control samples. Here is a forum post where they talk about methods for removing contaminants. Discussion: methods for removing contaminants and cross-talk - #10 by lewisNU

Hope that helps!

emmlemore · April 12, 2023, 6:53pm

Hi @cherman2

Thank you for your reply. I played around with the --p-min-frequency parameter and ended up retaining ~7000 features, which is double the previous ~3500.
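When trying out several thresholds like this, it can help to look at how many reads survive as well as how many features. The sketch below uses made-up per-feature totals (not data from this thread) and a hypothetical summarize helper; real proportions will depend on the dataset.

```python
# Hypothetical total counts per feature (made-up numbers, not from this thread).
feature_totals = [12000, 5400, 900, 150, 68, 40, 12, 5, 3, 2, 1, 1]


def summarize(totals, min_frequency):
    """Return (number of features kept, fraction of all reads kept) for a cutoff."""
    kept = [t for t in totals if t >= min_frequency]
    return len(kept), sum(kept) / sum(totals)


for threshold in (1, 10, 34, 68):
    n_features, read_fraction = summarize(feature_totals, threshold)
    print(f"min-frequency {threshold:>3}: {n_features:>2} features, "
          f"{read_fraction:.1%} of reads kept")
```

In a table dominated by rare features, a cutoff can remove most of the features while removing only a small share of the reads, which is one way to judge whether a large drop in feature count is a concern.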
