Hi! The filtering step is just an example; if you don't need to filter your samples, just skip it. Or you can filter by another parameter if your analysis needs it.
I like filtering my features before ANCOM, but not necessarily my samples. So, I tend to get rid of things that don't have a lot of counts or aren't present in many samples, because they tend to be "different" and add noise.
Hi... I think the filtering step should be presented right after the generation of the feature table and FeatureData summary. Since I am new to QIIME 2, I followed the tutorial and only learned about filtering just before ANCOM. Now I am finding it necessary to filter right after DADA2.
Please comment.
No, I do at least a two-part filtration. First, I drop anything with counts below my rarefaction depth because those are deemed "bad quality" samples for this study's definition of bad quality.
Then, I double-check my samples in PCoA space and may drop samples which do not cluster, period. I may also filter at this step to split or remove samples that aren't relevant to my current analysis. (For example, sometimes people will send samples about both and but only want to look at so then we filter.)
Third, before feature-based analysis, I filter my table to get rid of anything that doesn't reach a relative abundance of at least (1/rarefaction depth) in at least 10% of my communities. My suggestion here, since the joint filtering isn't implemented in QIIME (yet... it's on my list), is to first filter out any feature present in fewer than 10% of your samples and see where that gets you.
This decreases the overall number of features you test while discarding things that are likely either noise or underpowered.
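To make the difference between the two filters concrete, here is a minimal sketch in plain Python (not QIIME 2 code). It assumes the table is a simple dict of feature ID to per-sample counts; the function names `prevalence_filter` and `joint_filter`, the toy counts, and the depth of 500 are all made up for illustration. The joint filter is one possible reading of the rule described above: a feature stays only if its relative abundance is at least 1/depth in at least 10% of samples.

```python
# Toy feature table: feature ID -> {sample ID: count}. All values are made up.
samples = [f"S{i:02d}" for i in range(20)]
table = {
    "A": {s: 990 for s in samples},    # abundant everywhere
    "B": {s: 10 for s in samples},     # modest but present everywhere
    "C": {"S00": 1},                   # one count in 1 of 20 samples (5%)
    "D": {s: 1 for s in samples[:3]},  # one count in 3 of 20 samples (15%)
}


def sample_totals(table):
    """Total count per sample, summed over all features."""
    totals = {}
    for counts in table.values():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n
    return totals


def prevalence_filter(table, min_percent=10):
    """Keep features observed in at least min_percent of samples."""
    n_samples = len(sample_totals(table))
    kept = {}
    for feature, counts in table.items():
        n_present = sum(1 for n in counts.values() if n > 0)
        if n_present * 100 >= min_percent * n_samples:
            kept[feature] = counts
    return kept


def joint_filter(table, depth, min_percent=10):
    """Keep features with relative abundance >= 1/depth in >= min_percent of samples."""
    totals = sample_totals(table)
    kept = {}
    for feature, counts in table.items():
        # n * depth >= total is the same test as n / total >= 1 / depth,
        # written with integers to avoid floating-point edge cases.
        n_pass = sum(
            1 for s, n in counts.items() if n > 0 and n * depth >= totals[s]
        )
        if n_pass * 100 >= min_percent * len(totals):
            kept[feature] = counts
    return kept


print(sorted(prevalence_filter(table)))        # ['A', 'B', 'D']
print(sorted(joint_filter(table, depth=500)))  # ['A', 'B']
```

In this toy table, feature D survives the prevalence-only filter because it shows up in 15% of samples, but it never reaches 1/500 of any sample, so the joint filter drops it. The prevalence-only version is the looser of the two.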
@jwdebelius
I was able to drop 1 sample in the rarefaction analysis.
The minimum frequency per feature is 1, which I only learned just before ANCOM. Should I filter this data for low-abundance features and re-run all the steps?
I ask because I used the same table.qza file for all steps...
Hi!
In my opinion, you can either redo the steps, or remove low-abundance features just before ANCOM and use that filtered table only for ANCOM if you don't want to redo everything you already did. But I'd like to see what @jwdebelius will answer.
No, I pass a table where I haven't filtered features based on prevalence/abundance into rarefaction, and then filter my features before I do differential abundance.
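The order of operations can be sketched roughly like this, in plain Python rather than QIIME 2. The dict layout, the helper names `drop_shallow_samples` and `keep_prevalent`, the depth of 1000, and the toy counts are assumptions for illustration only. The point is that the feature-filtered table is a separate copy used for differential abundance, while the diversity steps keep every feature.

```python
# Toy table: feature ID -> {sample ID: count}. All values are made up.
raw = {
    "A": {"S1": 1000, "S2": 1400, "S3": 30},
    "B": {"S1": 200, "S3": 10},
    "C": {"S2": 100},
}


def drop_shallow_samples(table, depth):
    """Remove samples whose total count is below the rarefaction depth."""
    totals = {}
    for counts in table.values():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n
    keep = {s for s, total in totals.items() if total >= depth}
    return {
        feature: {s: n for s, n in counts.items() if s in keep}
        for feature, counts in table.items()
    }


def keep_prevalent(table, min_samples):
    """Keep features observed (count > 0) in at least min_samples samples."""
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(1 for n in counts.values() if n > 0) >= min_samples
    }


# Step 1: sample-level filtering only; this table feeds rarefaction/diversity.
diversity_table = drop_shallow_samples(raw, depth=1000)

# Step 2: a feature-filtered copy, used only for differential abundance.
ancom_table = keep_prevalent(diversity_table, min_samples=2)

print(sorted(diversity_table))  # ['A', 'B', 'C']
print(sorted(ancom_table))      # ['A']
```

Because the second table is derived from the first, nothing upstream has to be re-run; only the differential abundance step sees the reduced feature set.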
Thank you for providing details on how to filter our tables before ANCOM. I have currently filtered out the features present in fewer than 10% of my samples.
However, I was also following the ANCOM tutorial (Parkinson's mouse), and was wondering whether you could explain why p-min-frequency is 50.
This is from the pd-mice tutorial:
qiime feature-table filter-features \
  --i-table ./table_2k.qza \
  --p-min-frequency 50 \
  --p-min-samples 4 \
  --o-filtered-table ./table_2k_abund.qza
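For readers unsure what the two parameters do: `--p-min-frequency` is the minimum total count a feature must have across all samples, and `--p-min-samples` is the minimum number of samples it must be observed in. A feature has to pass both to be retained. Here is a plain-Python sketch of that logic (not the plugin's actual code); the function name, dict layout, and toy counts are made up for illustration.

```python
def filter_features(table, min_frequency=0, min_samples=0):
    """Keep features whose total count and sample prevalence both meet the minimums."""
    kept = {}
    for feature, counts in table.items():
        total = sum(counts.values())                        # summed over all samples
        observed = sum(1 for n in counts.values() if n > 0)  # samples with any counts
        if total >= min_frequency and observed >= min_samples:
            kept[feature] = counts
    return kept


# Toy table: feature ID -> {sample ID: count}. All values are made up.
table = {
    "asv1": {f"S{i}": 20 for i in range(5)},  # total 100, seen in 5 samples
    "asv2": {"S0": 40, "S1": 40},             # total 80, seen in only 2 samples
    "asv3": {f"S{i}": 5 for i in range(6)},   # total 30, seen in 6 samples
}

filtered = filter_features(table, min_frequency=50, min_samples=4)
print(sorted(filtered))  # ['asv1']
```

Here asv2 has plenty of counts but fails the sample minimum, and asv3 is widespread but fails the frequency minimum, so only asv1 is retained.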
I'm having difficulty deciding what my p-min-frequency should be. I also have a minimum frequency per feature of 1, and I know that filtering low-count ASVs should be done to limit FDR. Could you please explain what numbers should be taken into consideration when choosing our p-min-frequency before running ANCOM?
If you have run DADA2 or Deblur, there is no ASV in your original table which contains fewer than 10 counts. If you've since filtered your data (for example, to split experiments, discard samples, etc.) then this may no longer be true.
I was involved in writing the PD mice tutorial and, to be honest, I can't remember the exact logic behind that specific depth. (I double-checked my notes from that period, and... it's not recorded :/. Sometimes that's the way it goes in this field: you pick something and work with it.) The mean/total abundance and prevalence co-vary pretty closely, so most of the low-abundance features will be picked up in your first filtering.
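One way to check this on your own data is to take the table that survived the prevalence filter and see how many features a few candidate frequency cutoffs would still remove. The sketch below is plain Python with a made-up toy table; the helper name `features_below` and the candidate cutoffs (10, 50, 100) are arbitrary illustrations, not recommendations.

```python
def features_below(table, min_frequency):
    """Return the IDs of features whose total count is under min_frequency."""
    return sorted(
        feature
        for feature, counts in table.items()
        if sum(counts.values()) < min_frequency
    )


# Toy table that has already been through a prevalence filter.
# Feature ID -> {sample ID: count}. All values are made up.
prevalent = {
    "asv1": {"S1": 300, "S2": 250, "S3": 400},  # total 950
    "asv2": {"S1": 20, "S2": 15, "S3": 30},     # total 65
    "asv3": {"S1": 3, "S2": 2, "S3": 4},        # total 9
}

# How many more features would each candidate cutoff remove?
would_drop = {cutoff: features_below(prevalent, cutoff) for cutoff in (10, 50, 100)}
for cutoff, dropped in would_drop.items():
    print(cutoff, dropped)
```

If the lists are short or empty for the cutoffs you are considering, the prevalence filter has already done most of the work and the exact frequency value matters less.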
Thank you very much @jwdebelius for your clarification regarding the specific depth of our p-min-frequency. I will keep that strategy in mind when running the rest of my analysis.