Hi @fgara and @timanix,
In general, rarefaction is avoided before differential abundance testing. I would avoid LEfSe because I tend to think of Kruskal-Wallis as prone to a high false positive rate. (I also hate their figures.) I feel like Weiss et al. and McMurdie and Holmes showed pretty clearly that rarefaction isn’t great.
My rule of thumb for rarefaction is that I (currently) apply it before doing diversity analysis (although if my sequencing depth variation is big, sometimes I’ll also throw a depth term into my model for richness), unless I’m doing Aitchison or DEICODE, which do their own normalization.
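For anyone unsure what rarefying before a diversity analysis actually does, here is a minimal sketch, assuming a samples-by-features count matrix held in NumPy. The `rarefy` helper, the toy table, and the depth of 100 are all made up for illustration; in practice the diversity pipeline handles this step for you.

```python
import numpy as np

def rarefy(counts, depth, seed=0):
    """Subsample each sample (row) to `depth` reads without replacement.

    Samples with fewer than `depth` reads are dropped.
    Returns the rarefied table and a boolean mask of the rows that were kept.
    """
    counts = np.asarray(counts, dtype=np.int64)
    rng = np.random.default_rng(seed)
    keep = counts.sum(axis=1) >= depth
    rarefied = np.array(
        [rng.multivariate_hypergeometric(row, depth) for row in counts[keep]]
    )
    return rarefied, keep

# Hypothetical toy table: 3 samples x 4 features with very different depths.
table = np.array([
    [50, 30, 20, 0],
    [500, 300, 150, 50],
    [5, 3, 1, 0],
])
rarefied, keep = rarefy(table, depth=100)

# Observed richness is now computed at a common depth for the kept samples.
richness = (rarefied > 0).sum(axis=1)
```

The third sample has only 9 reads, so it is dropped rather than subsampled. That trade-off (losing shallow samples versus comparing at equal depth) is the usual thing to weigh when picking a depth.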
For ANCOM, DEICODE, Songbird, PhILR, Phylofactor, and Gneiss, I don’t rarefy. Usually there’s a normalization either built directly into the method, or there’s a normalization step in the pipeline (most frequently some kind of log transform).
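As one illustration of the "some kind of log transform" idea, here is a rough sketch of a centered log-ratio (CLR) transform with a pseudocount. This is a generic example, not a description of what any of the tools listed above do internally; the `clr` function and the pseudocount of 1 are assumptions chosen for the example.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples-by-features count matrix.

    A pseudocount is added so zeros can be logged; each row is then
    centered by subtracting its mean log value.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    return log_x - log_x.mean(axis=1, keepdims=True)

# Hypothetical toy table: 2 samples x 3 features.
table = np.array([
    [10, 0, 90],
    [100, 5, 900],
])
transformed = clr(table)
```

Each row of the result sums to zero, so values are read relative to the sample's own geometric mean rather than to its sequencing depth. The choice of pseudocount does affect the result, which is part of why different methods handle zeros differently.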
The one other thing I’ll note is that, although it affects my composition, I often pre-filter my data before doing differential abundance. I assume that if a feature shows up in only one sample, I don’t have enough of a distribution to perform statistical testing, so I tend to drop low abundance/low prevalence stuff.
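A minimal sketch of that kind of pre-filter, assuming a samples-by-features count matrix in NumPy. The `filter_features` helper and both thresholds (`min_prevalence=2`, `min_total=10`) are hypothetical; sensible cutoffs depend on the study size and depth.

```python
import numpy as np

def filter_features(counts, min_prevalence=2, min_total=10):
    """Drop features seen in too few samples or with too few total reads.

    Returns the filtered table and a boolean mask over the original features.
    """
    counts = np.asarray(counts)
    prevalence = (counts > 0).sum(axis=0)  # number of samples containing each feature
    total = counts.sum(axis=0)             # total reads per feature across samples
    keep = (prevalence >= min_prevalence) & (total >= min_total)
    return counts[:, keep], keep

# Hypothetical toy table: 3 samples x 4 features.
table = np.array([
    [10, 0, 1, 40],
    [12, 0, 0, 35],
    [8, 25, 1, 0],
])
filtered, keep = filter_features(table)
```

Here the second feature is dropped because it appears in only one sample, and the third because it has only 2 reads in total. Filtering like this should happen before any normalization or log transform, since removing features changes each sample's composition.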