Filtering low abundance OTUs following DADA2


I have a question regarding filtering feature tables. In QIIME 1 you could specify the fraction you wanted to filter on (i.e. to filter at 1% you would specify 0.01) via:

`--min_count_fraction`: Fraction of the total observation (sequence) count to apply as the minimum total observation count of an OTU for that OTU to be retained.

@Nicholas_Bokulich addressed low-abundance filtering in 2013, recommending: "For datasets where a mock community is not included for calibration, we recommend the conservative threshold of (c = 0.005%)", where c is the OTU abundance threshold.

I just read a recent thread where @Nicholas_Bokulich states that “the feature abundance filtering protocols recommended in the 2013 Nature Methods paper are not tested in conjunction with dada2 and are most likely unnecessary”.

I am now using QIIME 2, and after running DADA2 on my samples my feature table contains 3,159 features (total frequency 10,482,132). I then applied contingency-based filtering to remove singletons and ended up with 1,250 features (total frequency 10,304,300). I have read several papers in which people filter further at 0.005%, or even at 0.1%, and end up with feature tables containing 100–500 features.
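To illustrate what that contingency/singleton step does, here is a toy sketch in Python (the dict-of-lists table and ASV names are made up for illustration; the real filtering was done on the QIIME 2 feature table, not a plain dict):

```python
# Toy feature table: feature ID -> per-sample counts (made-up numbers).
table = {
    "ASV1": [10, 0, 5],  # present in 2 samples, total count 15
    "ASV2": [1, 0, 0],   # singleton: total count 1
    "ASV3": [0, 3, 0],   # present in 1 sample, total count 3
}

# Remove singletons, i.e. features whose total count across all samples is 1.
filtered = {fid: counts for fid, counts in table.items() if sum(counts) > 1}
print(sorted(filtered))  # ['ASV1', 'ASV3']
```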

If I filter at 0.005% my feature table contains 536 features (total frequency 10,252,159), and at 0.1% it contains 101 features (total frequency 9,265,493).
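Since QIIME 2's `qiime feature-table filter-features` takes an absolute count (`--p-min-frequency`) rather than a fraction, I converted the percentages into absolute per-feature counts like this (a minimal Python sketch; the total frequency is the one from my table above, and `min_frequency` is just a helper name I made up):

```python
import math

# Total frequency of my post-DADA2 feature table (from above).
total_frequency = 10_482_132

def min_frequency(total: int, fraction: float) -> int:
    """Smallest integer count a feature needs to reach `fraction` of the total."""
    return math.ceil(total * fraction)

print(min_frequency(total_frequency, 0.00005))  # 0.005% threshold -> 525
print(min_frequency(total_frequency, 0.001))    # 0.1% threshold  -> 10483
```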

What is considered to be acceptable? I would be grateful if someone could recommend an approach to use.

Thank you!


Hi @thextramile,
You have already read my advice that this type of filtering is not necessary following DADA2. That said, it is your choice if you still want to filter. I would not recommend filtering at a 0.1% abundance threshold, though: that is extremely stringent and would almost certainly skew your results, especially after running DADA2.
Good luck!


This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.