Hello again!
Technically, you don't have to rarefy for any diversity metric, but since diversity metrics are very sensitive to sequencing depth, we should normalize in some way.
The most common way is rarefying: subsampling frequencies from each sample without replacement so that the sum of frequencies in every sample equals a sampling depth we set. To choose that sampling depth, we typically plot the alpha rarefaction curves and inspect them to find a tradeoff between "most of the curves are plateauing" and "we don't lose too many samples". I'm aware there are other normalization methods apart from rarefying, but I recommend it since it is the most common procedure and the one I use in my research¹.
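In case it helps to see what "subsample without replacement to a fixed depth" means in practice, here is a minimal Python sketch. It is only an illustration of the idea, not the QIIME 2 implementation: the `rarefy` function, the dictionary layout of the table, and the toy counts are all made up for this example (it needs Python 3.9+ for the `counts` argument of `random.sample`).

```python
import random
from collections import Counter


def rarefy(table, depth, seed=0):
    """Subsample every sample to `depth` reads without replacement.

    `table` maps sample ID -> {feature ID: count} (hypothetical layout).
    Samples with fewer than `depth` reads are dropped, which is the
    price we pay for choosing a higher sampling depth.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample_id, counts in table.items():
        if sum(counts.values()) < depth:
            continue  # not enough reads: this sample is lost
        features = list(counts)
        # Draw `depth` individual reads; each read can be picked only once.
        drawn = rng.sample(features, depth, counts=[counts[f] for f in features])
        rarefied[sample_id] = dict(Counter(drawn))
    return rarefied


# Toy feature table (made-up counts, for illustration only)
table = {
    "S1": {"A": 50, "B": 30, "C": 20},
    "S2": {"A": 5, "B": 900, "C": 95},
    "S3": {"A": 10, "B": 2},
}
rarefied = rarefy(table, depth=100)
```

With a depth of 100, S1 and S2 are kept and both sum to 100 reads, while S3 (12 reads) is dropped. That is the tradeoff the rarefaction curves help you judge: a higher depth keeps more of each sample's diversity but discards more samples.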
Exactly! And that's why we rarefy prior to most diversity metrics.
Does this plot include all your samples? Here I would plot alpha rarefaction curves per sample instead. If you see something similar to this in such curves, that would mean that with 1.0e+07 reads you have already captured most of the diversity present in your samples (i.e., the curves start to plateau), so you could look for a suitable point to set the sampling depth and rarefy to that depth. Then you can generate the Bray-Curtis distance matrix with that table (Aitchison and the differential abundance method ANCOM-BC do not need rarefaction, but you can try them with the rarefied table anyway if you wish).
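To show why equal depths matter for Bray-Curtis, here is a small sketch of the dissimilarity itself, computed on counts. Again, this is just an illustration with made-up numbers and a hypothetical `bray_curtis` helper; in practice QIIME 2 computes the distance matrix for you.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples given as
    {feature ID: count} dictionaries (hypothetical layout)."""
    features = set(u) | set(v)
    # Sum of absolute count differences over all features...
    numerator = sum(abs(u.get(f, 0) - v.get(f, 0)) for f in features)
    # ...divided by the total number of reads in both samples.
    denominator = sum(u.values()) + sum(v.values())
    return numerator / denominator


# Two toy samples, both already at a depth of 100 reads (made-up counts)
s1 = {"A": 50, "B": 30, "C": 20}
s2 = {"A": 10, "B": 80, "C": 10}
distance = bray_curtis(s1, s2)  # (40 + 50 + 10) / 200 = 0.5
```

Because the formula works on raw counts, two samples with identical composition but different total reads would still get a non-zero distance, which is why we bring all samples to the same depth first.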
See ².
Best wishes,
Sergio
--
¹ Like everything in science (and life), rarefaction is not without controversy. In this regard, @jwdebelius shared a really interesting reference in a previous post:
https://academic.oup.com/bioinformatics/article/38/9/2389/6536959?login=false
² I did see that post (like all the others on the forum). I did not answer because I have only used the QIIME 2 implementation of ANCOM-BC and I'm not familiar with the others. Your post is already assigned to a member of the QIIME 2 team, so you should receive an answer soon. Please note that many people take their vacations in August, so it is normal for the answer to your question to be delayed a little. From the QIIME 2 Community Code of Conduct:
The moderators of this forum are the QIIME 2 developers. We strive to reply to questions within one working day (i.e., Monday through Friday, not including holidays). Remember that when you post a question to the QIIME 2 Forum, you're asking for someone's time to help you with software that they're giving to you for free. Please be patient while waiting for a reply, and don't cross-post your question.
And, in my case, I'm not even a moderator or a member of the QIIME 2 team.