q2 boots outputs super high alpha diversity values

Dear q2 community

I've been using q2-boots inside q2 amplicon 2025.10 and got extremely high alpha diversity results compared to the classical qiime core-metrics.
For example, these results come from core-metrics + diversity alpha-group-significance:
chao_with_core_metrics.tsv (816 Bytes)
shannon_with_core_metrics.tsv (1010 Bytes)

And these come from the same input data, but using q2-boots with the same sampling depth and these parameters: --p-n 10 --p-replacement --p-alpha-average-method median --p-beta-average-method medoid, followed by alpha-group-significance:
chao_with_boots.tsv (1001 Bytes)
shannon_with_boots.tsv (1014 Bytes)

For this type of microbiota, the expected range of diversity is the one obtained with q2 core-metrics.
Maybe there's something wrong with the average?
On the other hand, I must say that beta-diversity with q2 boots and core-metrics + beta significance + adonis are quite similar and make sense.

Thanks a lot for your time and help :slight_smile:

Hello Pau,

Would you be willing to upload the .qza output files from classical qiime core-metrics and q2-boots? They contain a lot of provenance information about how each command was run, so we can look for clues!

I'm guessing this has to do with --p-replacement... but let's check the files first.


I quickly checked on the defaults for core-metrics:

%> qiime diversity core-metrics

Usage: qiime diversity core-metrics [OPTIONS]

  Applies a collection of diversity metrics (non-phylogenetic) to a feature
  table.

Inputs:
  --i-table ARTIFACT FeatureTable[Frequency]
                          The feature table containing the samples over which
                          diversity metrics should be computed.     [required]
Parameters:
  --p-sampling-depth INTEGER
    Range(1, None)        The total frequency that each sample should be
                          rarefied to prior to computing diversity metrics.
                                                                    [required]
  --m-metadata-file METADATA...
    (multiple arguments   The sample metadata to use in the emperor plots.
     will be merged)                                                [required]
  --p-with-replacement / --p-no-with-replacement
                          Rarefy with replacement by sampling from the
                          multinomial distribution instead of rarefying
                          without replacement.                [default: False]

I think default: False means core-metrics samples WITHOUT replacement by default, whereas your q2-boots run used --p-replacement, which could explain the difference.

Try rerunning boots with --p-no-replacement (i.e., rarefaction) so it matches the defaults of qiime diversity core-metrics and report back!
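
To make the distinction concrete, here is a toy sketch in plain Python. It is not the QIIME 2 or q2-boots implementation, and the feature names and counts are made up. It draws the same number of reads from one sample with and without replacement:

```python
import random
from collections import Counter

def subsample(counts, depth, replace, rng):
    """Draw `depth` reads from a {feature: count} dict for one sample."""
    # Expand the counts into one entry per read, e.g. {"A": 2} -> ["A", "A"].
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    if replace:
        # With replacement: the same read can be drawn more than once.
        drawn = rng.choices(reads, k=depth)
    else:
        # Without replacement (rarefaction): each read is drawn at most once.
        drawn = rng.sample(reads, k=depth)
    return Counter(drawn)

rng = random.Random(0)
# Hypothetical counts for a single sample (1000 reads in total).
sample_counts = {"asv1": 500, "asv2": 300, "asv3": 150,
                 "asv4": 40, "asv5": 8, "asv6": 2}
depth = 500

no_repl = subsample(sample_counts, depth, replace=False, rng=rng)
with_repl = subsample(sample_counts, depth, replace=True, rng=rng)

for name, result in [("without replacement", no_repl),
                     ("with replacement", with_repl)]:
    print(name, "| observed features:", len(result),
          "| total reads:", sum(result.values()))
```

Both modes return exactly `depth` reads and can only return features that were already present in the sample. Without replacement, no feature can end up with more reads than it started with.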


Dear @colinbrislawn ,
Thanks for your quick answer!
I re-ran the q2-boots diversity analysis with --p-no-replacement and the results are the same. I used this command:

qiime boots kmer-diversity \
  --i-table table.qza \
  --i-sequences dada2-files/rep-seqs.qza \
  --m-metadata-file metadata.tsv \
  --p-sampling-depth 238020 \
  --p-n 10 \
  --p-no-replacement \
  --p-alpha-average-method median \
  --p-beta-average-method medoid \
  --p-alpha-metrics pielou_e \
  --p-alpha-metrics observed_features \
  --p-alpha-metrics shannon \
  --p-alpha-metrics chao1 \
  --p-alpha-metrics simpson \
  --p-beta-metrics braycurtis \
  --p-beta-metrics jaccard \
  --p-beta-metrics aitchison \
  --p-color-by grupo \
  --output-dir boots-kmer-diversity-2

Here are the two .qza files, from core-metrics and q2-boots:
chao1_boots.qza (5.6 MB)
chao1_core_metrics.qza (4.8 MB)

Thanks a lot for your help!

Hi @pau ,

You are observing very high alpha diversity because you are using the kmer-diversity action, which measures the diversity of kmers instead of ASVs (see here for an explanation).

We would expect kmer diversity to be very high relative to ASV diversity, as the ASVs are decomposed into many kmers prior to calculating diversity metrics with this action.
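
As a rough illustration of why the numbers grow, here is a minimal Python sketch. It is not how the kmer-diversity action is implemented; the sequences and the k-mer size are made up for the example. It counts distinct features before and after splitting sequences into overlapping kmers:

```python
def kmers(sequence, k):
    """Return all overlapping substrings of length k in `sequence`."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

# Three made-up "ASVs"; real ASVs are much longer than this.
asvs = {
    "asv1": "ACGTACGTGGCA",
    "asv2": "ACGTTCGTGGCA",
    "asv3": "TTGACCGTAGGC",
}
k = 4  # hypothetical k-mer size, chosen only for illustration

# Collect every distinct kmer seen across all sequences.
unique_kmers = set()
for seq in asvs.values():
    unique_kmers.update(kmers(seq, k))

print("ASV features:", len(asvs))
print("kmer features:", len(unique_kmers))
```

Even with only three short sequences, the number of distinct kmers is larger than the number of ASVs, so richness-type metrics computed on kmers are not comparable to the same metrics computed on ASVs.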

For "normal" ASV diversity metrics, you should use the core-metrics action that @colinbrislawn mentioned above.

Good luck!
