diversity alpha (faiths_pd) generating NA when other metrics do not and phylogeny is complete

Hello - I'm hoping someone can help clear this up for me because I've been banging my head against the wall for a while trying to figure this out. I can't find any similar issues, so it's likely something on my end, but I wanted to check.

I'm running amplicon-2024.10 on an Apple-chipped MacBook Pro.

I'm experiencing an issue where I'm getting NAs for certain samples when running Faith's PD, but values when running non-phylogenetic metrics such as observed_features and shannon. I'm not getting any error (!), including those related to features present in the table not being present in the phylogeny. I typically compare across metrics, and have never seen this before. I was able to reproduce this with a 16S tree/table and a shotgun tree/table. Thanks in advance for any insight.

Command run:
qiime diversity alpha-phylogenetic \
  --i-table /Users/shaffer-local/Mycelia/Academia/Projects/matrix/data/shotgun/matrix_shotgun_wolr2pe_gg2_biom_lbm_noControls_noSingletons_rar89K.qza \
  --i-phylogeny /Users/shaffer-local/Mycelia/Bioinformatics/databases/wol/wol_r2/wolr2_phylogeny.qza \
  --p-metric faith_pd \
  --o-alpha-diversity /Users/shaffer-local/Mycelia/Academia/Projects/matrix/data/shotgun/matrix_shotgun_wolr2pe_gg2_biom_lbm_noControls_noSingletons_rar89K_alpha_faithspd.qza

Hello Justin,

Can you post the data you used to replicate this? Perhaps one of the phylogeny devs can take a look and check for bugs.

I understand if some data cannot be shared. Only post what's right for you.

Hello. Thanks for your reply. I was able to replicate this with both the GreenGenes 2 phylogeny (16S) and the Web of Life Toolkit App (Woltka) phylogeny (metagenomics). Both trees are too big to post here, but I've linked them below.

GG2:
http://ftp.microbio.me/greengenes_release/current/

Woltka:

Update: This is not an issue with QIIME 2, but rather with an R package (qiime2R) that I discovered was dropping trailing zeros from sample IDs for only the Faith's PD vector. Thanks for your help. Closing.

OK, sorry for sending multiple updates, but I discovered that QIIME 2 itself is dropping the trailing zeros in the sample names for alpha-phylogenetic. I included a table you can verify this with, but you will have to obtain the phylogeny from the link, as it's too big to provide here.
matrix_16s_deblur_gg2_biom_silva_noMit_noChl_noUnassigned_noEuk_noDomain_noControls_hbm_noSingletons_rar20630.qza (398.6 KB)

The website that hosts my tree is not currently working - is there a way I can provide a larger file to you and/or can you make a dummy dataset with numeric sample names including trailing zeros to verify?

Thanks in advance,

Justin

This is great. Thank you Justin.

We'll investigate and report back!

Hi Lichen,
I was able to recreate this with very specific conditions.

It seems that when your Sample IDs look like floats, qiime diversity alpha-phylogenetic truncates them. For example, 1042.0000 is truncated to 1042.0.
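To illustrate the kind of truncation described here, the following is a minimal Python sketch. It is not the actual q2-diversity code path; it only assumes that somewhere an ID gets parsed as a number and then written back out as text. The function name `float_round_trip` is made up for this example.

```python
def float_round_trip(sample_id: str) -> str:
    """Return the ID as it looks after being parsed as a float and
    converted back to text. IDs that do not parse are left unchanged."""
    try:
        return str(float(sample_id))
    except ValueError:
        # Not float-like (for example, it contains a letter prefix).
        return sample_id


# Placeholder IDs: two float-like, one with a letter prefix.
for sample_id in ["1042.0000", "1042.5", "S1042.0000"]:
    print(sample_id, "->", float_round_trip(sample_id))
```

An ID whose text changes in the round trip no longer matches the ID in the feature table, which would explain why a downstream join shows NA for those samples only.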

I have opened an issue for this, if you want to follow along as we address this bug: "BUG: qiime diversity alpha-phylogenetic truncates Sample IDS that look like floats", Issue #376 in qiime2/q2-diversity on GitHub.

As for a workaround, I believe that if you rename your samples so that each ID contains at least one letter, they won't be treated as floats and therefore won't be truncated. For example, 1042.000 would become S1042.000.
You can rename samples in your feature-table using qiime feature-table rename-ids.
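If it helps, here is a rough Python sketch for building the old-to-new ID mapping file for the rename. The "S" prefix follows the example above; the column header `new-id`, the output filename, and the function names are assumptions for illustration, and the sample IDs are placeholders.

```python
import csv


def build_rename_rows(sample_ids, prefix="S"):
    """Pair each original ID with a prefixed ID that no longer parses as a float."""
    return [(sid, f"{prefix}{sid}") for sid in sample_ids]


def write_rename_map(rows, path):
    """Write a two-column TSV: original ID, then new ID."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        # "new-id" is a hypothetical column name chosen for this sketch.
        writer.writerow(["sample-id", "new-id"])
        writer.writerows(rows)


if __name__ == "__main__":
    ids = ["1042.0000", "1042.1000"]  # placeholder IDs, not from the real table
    write_rename_map(build_rename_rows(ids), "rename_map.tsv")
```

The resulting file could then be passed to rename-ids as the metadata file, with `new-id` as the metadata column. It is worth checking the sample IDs in the renamed table afterward, and remember to apply the same renaming to your sample metadata so the IDs still match.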

Hope this helps!
