Hi Nate,
I’ve done a good deal of thinking about these topics.
First, you say that the number of sequences vary a good bit between samples. This generally does not reflect true biological information, but library prep/sequencing issues. If you follow a standard protocol you should be putting in the same amount of DNA for each sample. You should be getting roughly the same amount of sequences. The community composition of those sequences are the true biological information that may vary, and it should not be the sequence counts. Now of course it’s not perfect and variation is going to happen because of library prep/sequencing issues. I would use extreme caution in thinking that sequence counts is a reflection of true biological information.
Based on the Weiss et al. 2017 paper “Normalization and microbial differential abundance strategies depend upon data characteristics.” A rough guideline to follow is if the sequencing depth varies greater than 10x you should probably go with rarefying. I generally use CSS and have sequencing depth that varies around 10x in most of my raw libraries. When I do CSS that usually brings the variation in libraries to something more like 3x. The Weiss paper does not directly address this, but I suppose that if you use CSS and that can get you’re variation in sequencing depth to below 10x then you might be ok. I am working with a group compiling several datasets and in that case we needed to rarefy because of the large variation in depth across datasets.
I looked a bit into DESeq, but ended up going with metagenomeSeq and CSS because it seemed a little bit more straightforward. I usually do the CSS normalization in R, and only recently learned they had implemented it into QIIME 1. I can share code on CSS normalization in R if anyone has interest. The issue there is that something is going on with the compatibility of the .biom files between metagenomeSeq and phyloseq and then getting those biom files back into QIIME don’t work.