metric distances with zeros - just one ASV on a table.qza

Manuela_Ramalho · February 26, 2020, 9:24pm

Hi,

I filtered my table.qza by Feature ID to keep only one ASV. The filtering itself seems to work perfectly, but when I compute the distance metrics from the filtered table (command "qiime diversity core-metrics-phylogenetic"), the distances are all zeros. Below is my script. I'm definitely doing something wrong, but I couldn't identify what it is. Can anyone help me? I would really appreciate it.
I'm using the qiime2-2019.10 version.

################################################
echo SampleID > samples-to-keep.tsv
echo d83d245c28916b37bceb3d7cc49a8ecd >> samples-to-keep.tsv

# Keep only the feature whose ID is listed in samples-to-keep.tsv
qiime feature-table filter-features \
  --i-table core-features_Cephalotes_05_3.qza \
  --m-metadata-file samples-to-keep.tsv \
  --o-filtered-table table_OTU5.qza

# Rarefy to 2000 reads per sample and compute the core diversity metrics
qiime diversity core-metrics-phylogenetic \
  --i-phylogeny insertion-roottree_sepp.qza \
  --i-table table_OTU5.qza \
  --p-sampling-depth 2000 \
  --m-metadata-file mappingfile_ManuED_Feb2020_2.txt \
  --output-dir core-features_OTU5

# Export the distance matrices (--output-path is created as a directory)
qiime tools export \
  --input-path core-features_OTU5/jaccard_distance_matrix.qza \
  --output-path core-features_OTU5/jaccard_distance_matrix.tsv

qiime tools export \
  --input-path core-features_OTU5/bray_curtis_distance_matrix.qza \
  --output-path core-features_OTU5/bray_curtis_distance_matrix.tsv

######################

Thanks in advance!

All the best,
Manu

Mehrbod_Estaki · February 26, 2020, 10:50pm

Hi @Manuela_Ramalho,
Is there a specific reason you are trying to retain only a single ASV in your table? This is a rather odd thing to do with microbiome data.

Zeros by themselves in Jaccard and Bray-Curtis distances are OK. Both distances are constrained between 0 and 1, with 0 meaning complete overlap in membership (Jaccard) or identical composition (Bray-Curtis, which also takes abundances into account) between 2 communities. When you only have 1 feature retained, Jaccard can only give you a score of 1 or 0. It's also possible that the feature you have retained doesn't exist in many of your samples, causing those samples to be excluded from the calculations entirely, in which case no distances can be calculated for them.
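To make that concrete, here is a minimal sketch in plain Python of the two distances applied to single-feature samples. The counts (120, 30, 0) are made up for illustration and are not taken from the table in this thread, and the functions are simplified textbook definitions rather than the code QIIME 2 runs internally.

```python
def jaccard(u, v):
    """Jaccard distance based on presence/absence of each feature."""
    present_u = {i for i, x in enumerate(u) if x > 0}
    present_v = {i for i, x in enumerate(v) if x > 0}
    union = present_u | present_v
    if not union:
        raise ValueError("both samples are empty; distance is undefined")
    return 1 - len(present_u & present_v) / len(union)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity based on feature counts."""
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("both samples are empty; distance is undefined")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical samples that contain a single feature (one ASV).
sample_a = [120]
sample_b = [30]
sample_c = [0]    # the ASV is absent from this sample

print(jaccard(sample_a, sample_b))      # 0.0: the ASV is present in both
print(jaccard(sample_a, sample_c))      # 1.0: present in one, absent in the other
print(bray_curtis(sample_a, sample_b))  # 0.6: counts differ, so not zero
print(bray_curtis(sample_a, sample_c))  # 1.0
```

With only one feature, Jaccard can only ever be 0 or 1, while Bray-Curtis drops to 0 only when the two counts are equal.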

Manuela_Ramalho · February 27, 2020, 2:49pm

Hi @Mehrbod_Estaki,

First of all, thank you very much for your help! I really appreciate this!

So distances between zero and one are expected, but in the distances generated so far there are only ZEROS (see distance-matrix.tsv).
distance-matrix.tsv (28.9 KB)

This ASV occurs in many of the samples, which is why I found the zeros strange (see table_OTU5.qzv).
table_OTU5.qzv (411.8 KB)
It has been identified as one of the ASVs that are part of the core.

Using the distances for this single bacterium across my samples, I would like to know if there is a correlation with the host phylogeny. Does that make sense? Do you have any other tips?

I'm really grateful for all the help!

All the best,
Manu

Mehrbod_Estaki · February 27, 2020, 6:36pm

Hi @Manuela_Ramalho,

You have a table that has a single feature shared among all of your samples. When you run core-metrics with a rarefaction depth of 2,000, every sample that remains in the distance-matrix calculation has 1 feature and a total count of 2,000. Therefore, all of your samples are identical in composition, hence the distance of zero between all of them.

I don't think this is the approach you want to take to answer your question (or any question, really). I'm not familiar with your dataset so I can't really offer a best answer, but maybe linear mixed effects (LME) in the longitudinal plugin can help: you can input your feature table with that 1 taxon (it needs to be transformed to relative abundance before your initial filtering of the table down to that 1 feature), and then run a regular LME on any of your metadata variables. Truth be told, I've never used this approach myself, so maybe @Nicholas_Bokulich can comment on whether this is kosher with the longitudinal plugin.
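Here is a rough sketch in plain Python of why the order of operations matters. The sample names and counts are invented for illustration (they are not from the attached table), and `rarefy` is a simplified stand-in for subsampling without replacement, not the actual QIIME 2 implementation.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample a count vector without replacement to exactly `depth` reads.

    Returns None when the sample has fewer than `depth` reads, mimicking
    how samples below the sampling depth are dropped.
    """
    if sum(counts) < depth:
        return None
    # One entry per read, labelled with the index of its feature.
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    rarefied = [0] * len(counts)
    for i in rng.sample(pool, depth):
        rarefied[i] += 1
    return rarefied


# Hypothetical full table: the first column is the ASV of interest,
# the second column lumps together every other ASV.
full_table = {
    "s1": [2500, 7500],
    "s2": [6000, 4000],
    "s3": [900, 9100],
}

rng = random.Random(0)

# Order used in this thread: filter to the single ASV, then rarefy to 2000.
single_asv = {s: [counts[0]] for s, counts in full_table.items()}
rarefied = {s: rarefy(counts, 2000, rng) for s, counts in single_asv.items()}
print(rarefied)  # {'s1': [2000], 's2': [2000], 's3': None}

# Alternative order: relative abundance on the full table, then keep the ASV.
rel_abundance = {s: counts[0] / sum(counts) for s, counts in full_table.items()}
print(rel_abundance)  # {'s1': 0.25, 's2': 0.6, 's3': 0.09}
```

After filtering and rarefying, every surviving sample is just `[2000]`, so any pairwise distance is 0. Computing relative abundance before filtering keeps the variation between samples (and keeps s3, which would otherwise be dropped).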

Nicholas_Bokulich · February 27, 2020, 7:00pm

Interesting suggestion to use LME! I think it depends on how you are measuring phylogenetic distance. If you have a continuous value, e.g., phylogenetic distance to some reference point, then yes this sounds like it should work. If you are trying to relate this to pairwise phylogenetic distance then LME would not work. Either way, you may want to consult a real statistician to decide if this is an appropriate test for your question.
