Regarding the volcano plot, it looks like no F-statistic is being computed. When there are only two categories, only the difference in clr means is calculated, which is essentially a log fold change. A negative log fold change indicates a decrease (since log(x) < 0 for 0 < x < 1), whereas a positive log fold change indicates an increase (since log(x) > 0 for x > 1).
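To make the two-category case concrete, here is a minimal sketch of a clr mean difference in NumPy. It is not the plugin's actual code: the function names, the pseudocount of 1, the toy counts, and the choice to subtract the first group from the second are all assumptions made for illustration.

```python
import numpy as np

def clr(table):
    """Centered log-ratio transform of a samples x features matrix of positive values."""
    log_table = np.log(table)
    # Subtract each sample's mean log value (the log of its geometric mean).
    return log_table - log_table.mean(axis=1, keepdims=True)

def clr_mean_difference(table, groups, group_a, group_b):
    """Per-feature mean clr in group_b minus mean clr in group_a (hypothetical helper)."""
    groups = np.asarray(groups)
    transformed = clr(np.asarray(table, dtype=float))
    mean_a = transformed[groups == group_a].mean(axis=0)
    mean_b = transformed[groups == group_b].mean(axis=0)
    return mean_b - mean_a

# Toy counts (assumed data): feature 0 is more abundant in the "treated" samples.
counts = np.array([[10, 20, 30],
                   [12, 18, 30],
                   [40, 20, 30],
                   [44, 22, 30]]) + 1  # pseudocount of 1 (an assumption) avoids log(0)
groups = ["control", "control", "treated", "treated"]

diff = clr_mean_difference(counts, groups, "control", "treated")
print(diff)  # positive entries: relative increase in "treated"; negative: decrease
```

Because each clr-transformed sample sums to zero, the differences across features also sum to zero, so an increase in one feature shows up as a relative decrease in the others.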
The filtering recommendations are based on empirical evidence. We have seen wonky behavior when there are a lot of low-abundance features. It would be great to have insights from the original authors on this, though. In addition, ANCOM is very slow when the number of features is high, since its runtime scales quadratically with the number of features, so filtering definitely helps with runtime.
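As a rough illustration of the quadratic scaling, the sketch below counts the pairwise feature comparisons for a few table sizes. It assumes the cost is dominated by one test per pair of features; the helper name is hypothetical and this is not a benchmark of ANCOM itself.

```python
from math import comb

def n_feature_pairs(n_features):
    """Number of distinct feature pairs, n * (n - 1) / 2 (hypothetical helper)."""
    return comb(n_features, 2)

for n in (100, 1_000, 10_000):
    print(f"{n} features -> {n_feature_pairs(n)} pairs")
# 100 features -> 4950 pairs
# 1000 features -> 499500 pairs
# 10000 features -> 49995000 pairs
```

Cutting the feature count by a factor of 10 cuts the number of pairs by roughly a factor of 100, which is why filtering pays off so quickly.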
Yes, a Holm-Bonferroni correction is run by default within ANCOM (see code here for details). This can be disabled at your own risk.
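For reference, here is a minimal sketch of what a Holm-Bonferroni adjustment does to a set of p-values. It is a generic implementation of the step-down procedure, not the code ANCOM uses; the function name and the example p-values are assumptions for illustration.

```python
import numpy as np

def holm_bonferroni(pvalues):
    """Return Holm-Bonferroni adjusted p-values in the original order (hypothetical helper)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The k-th smallest p-value (k = 0, 1, ...) is multiplied by (m - k).
    scaled = p[order] * (m - np.arange(m))
    # Enforce monotonicity with a running maximum and cap at 1.
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # made-up p-values
adjusted = holm_bonferroni(raw)
print(adjusted)  # approximately [0.03, 0.06, 0.06, 0.02]
```

With a 0.05 threshold, the two smallest raw p-values remain significant after adjustment and the other two do not, which is the kind of protection against false positives that is lost when the correction is disabled.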