Unexpectedly high Shannon diversity values in QIIME 2

Sheylle_Green · October 6, 2025, 2:20pm

Hi all,

I’m running QIIME 2 v2024.10 installed via Conda and am seeing unexpectedly high Shannon diversity values in my human 16S data. I would like guidance on whether this is a technical issue in QIIME 2.

Steps I took:

Ran qiime diversity alpha-rarefaction to choose an appropriate sequencing depth.

qiime diversity alpha-rarefaction \
  --i-table table.qza \
  --i-phylogeny rooted-tree.qza \
  --p-max-depth 20000 \
  --m-metadata-file metadata.tsv \
  --o-visualization alpha-rarefaction.qzv

Used the selected depth to run core-metrics-phylogenetic:

qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table.qza \
  --p-sampling-depth 10000 \
  --m-metadata-file metadata.tsv \
  --output-dir core-metrics-results

Observation / Problem:

The Shannon index from QIIME 2 is higher than expected (range 6–9).
When I export the feature table and calculate Shannon diversity in R, values are as expected (below 6).
I tried relative abundance normalization in QIIME 2, but Shannon values remain high.

I would like to understand if this is expected behavior, a technical issue with my QIIME 2 commands, or a misunderstanding in the rarefaction process. Ideally, I want to calculate Shannon and other alpha/beta diversity metrics in QIIME 2 while keeping results consistent with what is observed outside QIIME.

System information:

QIIME 2 version: 2024.10
Installation: Conda

I’ve searched the forum and reviewed the QIIME 2 glossary, but haven’t found similar reports. Any advice on troubleshooting this or recommended approaches would be greatly appreciated.

colinbrislawn · October 6, 2025, 2:38pm

Good morning @Sheylle_Green,

Welcome to the forums! Thank you for posting all the commands you ran!

Yes, this is expected in older versions of QIIME 2 because it uses a different log base for the Shannon calculation.

Strangely, very few people notice this or ask about it, so there's not a lot of docs to find. I wrote this up a few years ago

That's so valid!

gregcaporaso · October 7, 2025, 5:41am

Hi @Sheylle_Green,
I'm following up on @colinbrislawn's reply, as we had a little internal discussion about this. We think that the reason you're seeing this difference between implementations of Shannon diversity across different tools is indeed the difference in the log base used in the calculation. We did not change this in QIIME 2 recently though, so results should be consistent across QIIME 2 versions. (We are planning a change, which is where the confusion came from, but that has not yet been implemented.)

In brief, the exact Shannon values may differ across tools, but the results should be highly correlated. There isn't a single correct choice for the base, so as long as you're comparing values computed with the same base, your comparisons are valid. For the moment, this means avoid comparing values generated with different tools (aside from confirming that they are correlated, if you'd like to do that).
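To illustrate why the values stay comparable: Shannon computed in base b equals the natural-log value divided by ln(b), so two bases differ only by a constant factor. Below is a minimal sketch, assuming one tool reports base 2 and the other the natural log (the thread does not state which bases are involved). The `shannon` helper and the counts are hypothetical and are not QIIME 2's implementation.

```python
import math

def shannon(counts, base=2.0):
    """Shannon entropy of a vector of feature counts, in the given log base."""
    total = sum(counts)
    # Zero counts are skipped: p * log(p) tends to 0 as p tends to 0.
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)

# Hypothetical counts for two samples (illustration only, not from the thread).
samples = {
    "sample_a": [500, 300, 150, 50],
    "sample_b": [250, 250, 250, 250],
}

for name, counts in samples.items():
    h_bits = shannon(counts, base=2.0)      # assumed base-2 convention
    h_nats = shannon(counts, base=math.e)   # assumed natural-log convention
    # The ratio equals ln(2) for every sample, whatever its composition.
    print(name, round(h_bits, 3), round(h_nats, 3), round(h_nats / h_bits, 4))
```

Because the ratio is the same for every sample, rankings and correlations are unchanged; only the scale differs. Under this assumption, a base-2 value of 9 corresponds to roughly 6.24 in natural-log units.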

Best,
Greg

Sheylle_Green · October 10, 2025, 7:42am

Hi @gregcaporaso, @colinbrislawn, and all,

Thank you very much for your helpful responses and for clarifying the issue regarding the log base differences in Shannon diversity calculations across tools — that makes sense.

I’ve reached a point where I’m unsure how to move forward. The researcher I’m working with prefers the lower Shannon values calculated in R, as these seem to better reflect their expectations for the dataset.

The challenge is that I also need to use all other alpha and beta diversity metrics (e.g., Faith’s PD, UniFrac) generated through the QIIME 2 pipeline to ensure consistency and reproducibility for the rest of the analysis. From my understanding (and as Greg mentioned), it wouldn’t be appropriate or academically correct to mix Shannon values generated in another tool with beta diversity metrics generated in QIIME 2.

Would anyone have suggestions on how to handle this situation?

Is there a recommended way to manually adjust the Shannon values calculated in QIIME 2 to match those from R (e.g., by applying a log base conversion)?
Or alternatively, is there a sensible, reproducible approach for calculating all alpha and beta diversity metrics outside of QIIME 2 if the Shannon values must come from R?

I’d be grateful for any advice or examples of how others have dealt with similar scenarios.

Thank you again for your time and guidance!

Best,
Sheylle

Nicholas_Bokulich · October 10, 2025, 8:32am

Hi @Sheylle_Green ,

That’s fine. I think this reflects @gregcaporaso’s advice as well that

As your collaborator and others in the field are more familiar with values generated with a different log base, it makes sense to keep this consistent when publishing in that field so that the results are not misinterpreted (as it is easy for readers to overlook the differences caused by log base differences).

No, I don’t think that @gregcaporaso meant that you should exclusively use QIIME 2 (but correct me if I am wrong, Greg). Rather, it would be best not to compare Shannon values generated with different tools.

So I think it should be fine to calculate Shannon with another tool of your choice if that is what you prefer. HOWEVER, the main issue is that if the other tool handles the data differently from QIIME 2 (e.g., rarefying or bootstrapping in a different way), then the methods reporting gets a bit murky.

For this reason, what you propose is probably the most transparent and straightforward: to apply a log base conversion prior to plotting.
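A log base conversion of exported values could look like the sketch below. It assumes the QIIME 2 values are in base 2 and the target is the natural log; confirm both bases for your tools before using it. The two-column TSV layout, the `shannon_entropy` header, and the sample IDs are hypothetical stand-ins for whatever your export contains.

```python
import csv
import io
import math

def convert_shannon(value, from_base=2.0, to_base=math.e):
    """Re-express a Shannon value computed in from_base in to_base."""
    return value * math.log(from_base) / math.log(to_base)

def convert_alpha_tsv(handle, from_base=2.0, to_base=math.e):
    """Read a two-column TSV (sample ID, Shannon) and return converted rows."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    converted = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        sample_id, value = row[0], float(row[1])
        converted.append((sample_id, convert_shannon(value, from_base, to_base)))
    return converted

# Hypothetical exported values (illustration only).
example = io.StringIO("id\tshannon_entropy\nS1\t9.0\nS2\t6.0\n")
for sample_id, h in convert_alpha_tsv(example):
    print(sample_id, round(h, 3))
```

Since the conversion is a single multiplication applied to values QIIME 2 already computed on its rarefied table, the rarefaction and other upstream handling stay identical, which keeps the methods reporting simple: state the tool, the original base, and the conversion applied.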

We are discussing exposing a Shannon log base parameter in QIIME 2 so that users can adjust it. We might manage to add this very soon (the next release comes out late this month), though it might need to wait until the release after that. You can track that issue here:

Sheylle_Green · October 10, 2025, 1:44pm

Okay great! I think I understand now.

Thanks so much!

gregcaporaso · October 10, 2025, 10:58pm

That all aligns with my thoughts - thanks for the help @Nicholas_Bokulich!

Good luck @Sheylle_Green, and thanks for getting in touch about this! As @Nicholas_Bokulich mentioned, we should have a better solution soon, though maybe not in time for you this time around. Next time though, if not!

system · November 11, 2025, 4:59am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.