@yanxianl I am simply stating that the feature abundance filtering protocols recommended in the 2013 Nature Methods paper that @stangedal mentioned have not been tested in conjunction with dada2 or deblur. Based on the results reported in the original dada2 and deblur papers, they are most likely unnecessary and may even conflict with those methods. That is not to say that dada2 or any other method is perfect: some level of abundance filtering (e.g., to remove singletons and other low-abundance features) may still be useful in some circumstances, but this has not been benchmarked, so I cannot recommend it.
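For anyone unsure what "abundance filtering" means in practice, here is a minimal sketch in plain Python. The table layout (feature ID mapped to per-sample counts), the feature names, and the `filter_low_abundance` helper are all made up for illustration; this is not how QIIME 2 stores or filters tables internally, and the threshold is not a recommendation.

```python
def filter_low_abundance(table, min_total=2):
    """Drop features whose total count across all samples is below min_total.

    `table` maps feature ID -> list of per-sample counts (a toy stand-in for
    a feature table). With min_total=2, singletons are removed.
    """
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(counts) >= min_total
    }


# Hypothetical toy table: four features observed across three samples.
table = {
    "ASV_1": [120, 98, 143],
    "ASV_2": [0, 15, 7],
    "ASV_3": [1, 0, 0],  # singleton: observed once in total
    "ASV_4": [0, 2, 1],
}

filtered = filter_low_abundance(table, min_total=2)
print(sorted(filtered))
```

In QIIME 2 itself, this kind of filter is available through the `filter-features` action of the `feature-table` plugin.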
Excellent, so in your runs you can use mock communities to tune this. An upcoming QIIME 2 release will include some new quality control methods that use mock communities. Stay tuned for more details.
I would not say that this “assures” the quality of results. Mock communities are not perfect and are prone to human error, but I would say that, within reason, the method/parameter combinations that maximize mock community accuracy will be best for that individual sequencing run. Those combinations may not generalize to other sequencing runs or bioinformatics methods; large-scale benchmarking studies are required before general recommendations can be made.
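As a rough sketch of what "maximizing mock community accuracy" could look like, the snippet below scores a few parameter settings by comparing the taxa each one reports against the known composition of the mock. Everything here is hypothetical: the taxa, the setting names, the observed sets, and the choice of F-measure as the accuracy score are assumptions for illustration, not results from any real run.

```python
def score_against_mock(observed, expected):
    """Return (precision, recall, f_measure) for observed vs. expected taxa."""
    observed, expected = set(observed), set(expected)
    true_pos = len(observed & expected)
    precision = true_pos / len(observed) if observed else 0.0
    recall = true_pos / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Known composition of a hypothetical mock community.
expected = {"Bacteroides", "Escherichia", "Lactobacillus", "Staphylococcus"}

# Made-up taxa recovered under three hypothetical filtering settings.
runs = {
    "no_filter": expected | {"Pseudomonas", "Bacillus"},
    "min_total_2": expected | {"Pseudomonas"},
    "min_total_50": {"Bacteroides", "Escherichia"},
}

scores = {name: score_against_mock(obs, expected) for name, obs in runs.items()}
best = max(scores, key=lambda name: scores[name][2])
print(best, scores[best])
```

The setting picked this way is only the best for this one mock community and run, which is the caveat above.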
Well, this all depends on the upstream processing methods that you used and on the biological question. Most beta diversity metrics are quite insensitive to low-abundance taxa (this is described in the 2013 Nature Methods paper mentioned above, though it is certainly not without exception), particularly as beta diversity calculations are performed on rarefied feature tables in QIIME 2. In general, though, I’d say don’t worry too much about the impacts on beta diversity if you are using UniFrac methods.
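To make the insensitivity point concrete, here is a toy calculation. It uses Bray-Curtis rather than UniFrac (Bray-Curtis needs no phylogenetic tree), the counts are made up, and no rarefaction is applied, so treat it as an illustration of the general idea only.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for three features; the third is a singleton in sample A.
sample_a = [100, 50, 1]
sample_b = [80, 70, 0]

with_rare = bray_curtis(sample_a, sample_b)
without_rare = bray_curtis(sample_a[:2], sample_b[:2])
print(round(with_rare, 4), round(without_rare, 4))
```

Dropping the singleton changes the distance by roughly 0.003 in this example. Presence/absence metrics would react more strongly to rare features, which is one of the exceptions mentioned above.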
For alpha diversity, this is much more difficult to assess, and I cannot make any absolute recommendations here. I can say that dada2 performs quite well without additional abundance filtering (as shown in the original paper, and also in my own experience).