Hi everyone,

I have been using QIIME 2 since 2017, have spent countless days reading the QIIME 2 forum, and highly appreciate the community support! Moreover, I am an advocate of Open Science and thus feel at home in the QIIME 2 community. I thought it might be the right time to get a QIIME 2 forum account and write my very first forum post.

I would like to introduce an algorithm to the QIIME 2 community that we developed for the normalisation of microbiome count data. Ever since the classic 2014 paper by McMurdie & Holmes, we have had statistical proof that rarefying (random subsampling without replacement) should never be used for library size normalisation (although many researchers still use it). It was great to see that QIIME 2 is not using random OTU picking without replacement!

Our new algorithm for the normalisation of microbiome count data is called ‘scaling with ranked subsampling’ (SRS). Here is a simplified illustration of how SRS works:
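
For those who prefer code to pictures, the idea can be sketched in a few lines of Python. Please treat this as an illustration under assumptions rather than the reference implementation (that is the R code linked further down): the function name `srs_sketch` is made up for this post, it handles a single sample only, and the ranking order (fractional part first, then integer part, then random choice among OTUs that are still tied) follows the description in the paper.

```python
import random
from itertools import groupby


def srs_sketch(counts, c_min, rng=None):
    """Illustrative SRS for ONE sample: normalise `counts` to a total of `c_min`.

    Hypothetical helper for this post, not the published R implementation.
    """
    rng = rng or random.Random()
    total = sum(counts)
    if total == 0 or c_min > total:
        raise ValueError("c_min must be between 0 and the sample's library size")

    # Step 1 (scaling): each count is scaled by c_min / total.
    # Integer arithmetic keeps the integer and fractional parts exact.
    integer = [(c * c_min) // total for c in counts]    # integer part
    remainder = [(c * c_min) % total for c in counts]   # fractional part * total

    # Step 2 (ranked subsampling): the counts still missing after flooring
    # go to the OTUs ranked highest by fractional part; ties are broken by
    # the integer part.
    result = list(integer)
    remaining = c_min - sum(integer)
    order = sorted(range(len(counts)), key=lambda i: (-remainder[i], -integer[i]))

    for _, group in groupby(order, key=lambda i: (remainder[i], integer[i])):
        if remaining == 0:
            break
        group = list(group)
        if len(group) <= remaining:
            chosen = group
        else:
            # Only here, when OTUs are tied on BOTH ranks and not all of
            # them can receive a count, is random subsampling needed.
            chosen = rng.sample(group, remaining)
        for i in chosen:
            result[i] += 1
        remaining -= len(chosen)

    return result


print(srs_sketch([50, 30, 15, 5], 10))  # -> [5, 3, 2, 0]
```

In the toy example the scaled counts are 5, 3, 1.5 and 0.5. The last two OTUs are tied on the fractional part, so the one with the larger integer part receives the single remaining count, and no random draw is involved.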

We normalised a bacterial 16S library to different library sizes using rarefying and SRS and compared the Bray-Curtis index of dissimilarity among 10,000 repeats. As expected from the random OTU picking of rarefying, the rarefied libraries differed between repeats, whereas SRS was highly reproducible:
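
To make the reproducibility point tangible, here is a small Python sketch of the rarefying side of such a comparison. It is only a toy illustration: the counts are made up (not the data from our study), only 50 repeats are drawn instead of 10,000, and the helper names `rarefy` and `bray_curtis` are hypothetical.

```python
import random


def rarefy(counts, depth, rng):
    """Random subsampling without replacement down to `depth` reads."""
    # Expand the count vector into individual reads labelled by OTU index.
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(reads, depth)
    out = [0] * len(counts)
    for otu in picked:
        out[otu] += 1
    return out


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


rng = random.Random(0)
sample = [500, 300, 150, 40, 10]  # made-up OTU counts for one library

# Rarefy the SAME library repeatedly and compare every repeat to the first.
repeats = [rarefy(sample, 100, rng) for _ in range(50)]
dissimilarities = [bray_curtis(repeats[0], r) for r in repeats[1:]]

print("largest Bray-Curtis dissimilarity between repeats:", max(dissimilarities))
```

Any dissimilarity above zero here is introduced purely by the random draw, since every repeat starts from the same library. A procedure that returns the same normalised counts on every run gives a dissimilarity of exactly zero between repeats.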

You can find all the details of SRS here: https://doi.org/10.7717/peerj.9593

An implementation of SRS in R is available here: https://doi.org/10.20387/BONARES-2657-1NP3

You will see that SRS also uses random subsampling. However, a complex combination of circumstances has to occur for random subsampling to be used, and this rarely happens. Additionally, if random subsampling is used, the abundance of the affected OTUs will vary by at most a single count, which is mathematically inevitable.

I am aware that our study is not as comprehensive as the study by Weiss et al. 2017. Nevertheless, I believe that SRS is a suitable and promising tool for the normalisation of microbiome count data. I would be happy to discuss different library size normalisation approaches and the potential advantages and disadvantages of SRS with the QIIME 2 community. Depending on your feedback, I may try to develop a QIIME 2 plugin for SRS.

Cheers and keep safe

Lukas