non-phylogenetic core metrics - distance matrix 1


I'm facing a problem with the PCoA plots from my non-phylogenetic core metrics analyses of ITS amplicon data. I want to compare two groups (healthy and patient samples), but for some reason I only see 8 of the 12 patient samples.

jaccard_emperor.qzv (950.5 KB)
bray_curtis_emperor.qzv (950.6 KB)
(look for "Collective")

It looks like the 4 missing samples are displayed together at the same position in the 3D space.

I had a look at the distance matrix, and the distances from these samples to all others are 1.
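For context, a distance of exactly 1 is what both Jaccard and Bray-Curtis return when two samples have no features in common. The sketch below shows this with made-up counts (the sample vectors are hypothetical, not taken from the attached tables), using plain-Python versions of the two standard formulas rather than QIIME 2 itself.

```python
# Hypothetical counts for three samples over six features (not the poster's data).
sample_a = [10, 5, 0, 0, 0, 0]
sample_b = [8, 0, 3, 0, 0, 0]   # shares one feature with sample_a
sample_c = [0, 0, 0, 7, 2, 4]   # shares no feature with sample_a or sample_b


def jaccard_distance(x, y):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    present_x = {i for i, count in enumerate(x) if count > 0}
    present_y = {i for i, count in enumerate(y) if count > 0}
    union = present_x | present_y
    if not union:
        return 0.0
    return 1 - len(present_x & present_y) / len(union)


def bray_curtis_distance(x, y):
    """Bray-Curtis dissimilarity: sum|x - y| / sum(x + y)."""
    total = sum(x) + sum(y)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(x, y)) / total


jaccard_ab = jaccard_distance(sample_a, sample_b)      # one shared feature
jaccard_ac = jaccard_distance(sample_a, sample_c)      # nothing shared
bray_ab = bray_curtis_distance(sample_a, sample_b)
bray_ac = bray_curtis_distance(sample_a, sample_c)

print(f"Jaccard a-b: {jaccard_ab:.3f}, a-c: {jaccard_ac:.3f}")
print(f"Bray-Curtis a-b: {bray_ab:.3f}, a-c: {bray_ac:.3f}")
```

So a row of 1s in the matrix would be consistent with those samples sharing no features with any other sample, which may be worth checking directly in the feature table.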

distance-matrix.tsv (7.4 KB)

I had a look at the feature table and the absolute counts obtained from the taxa barplots, and a lot of features had 0 counts in a lot of samples, so I tried to filter out low-abundance features. After trying a lot of parameters, I think I found the best ones for my dataset with the following command:

qiime feature-table filter-features-conditionally --i-table table_H_E0_NL.qza --p-abundance 0.01 --p-prevalence 0.03 --o-filtered-table table_H_E0_NL_pa0.01_pre0.03.qza

I was able to get rid of a lot of features (from 949 to 288) while the sampling depth stayed quite OK (from 1425 to 1405).
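As a rough illustration of what this kind of filter does, here is a minimal plain-Python sketch. It assumes the rule is "keep a feature if its relative abundance is at least the abundance threshold in at least the prevalence fraction of samples", which is my reading of the two parameters and not a copy of the plugin's code. The table, feature names, and thresholds are hypothetical.

```python
def filter_features_conditionally(table, abundance, prevalence):
    """Keep features whose relative abundance is >= `abundance`
    in at least `prevalence` (a fraction) of the samples.

    `table` maps feature id -> list of counts, one per sample, same order.
    """
    n_samples = len(next(iter(table.values())))
    # Per-sample totals, needed to turn counts into relative abundances.
    sample_totals = [
        sum(counts[j] for counts in table.values()) for j in range(n_samples)
    ]
    kept = {}
    for feature_id, counts in table.items():
        n_hits = sum(
            1
            for j, count in enumerate(counts)
            if sample_totals[j] > 0 and count / sample_totals[j] >= abundance
        )
        if n_hits >= prevalence * n_samples:
            kept[feature_id] = counts
    return kept


# Hypothetical table: 4 features x 4 samples, each sample totals 100 reads.
toy_table = {
    "ASV_common": [50, 40, 60, 30],
    "ASV_rare": [0, 1, 0, 0],       # never reaches 5 % in any sample
    "ASV_patchy": [50, 0, 0, 0],    # abundant, but in only 1 of 4 samples
    "ASV_frequent": [0, 59, 40, 70],
}

filtered = filter_features_conditionally(toy_table, abundance=0.05, prevalence=0.5)
print(sorted(filtered))
```

Note that a feature can be dropped for being rare everywhere or for being abundant in too few samples, so features unique to a single sample are exactly the ones this kind of filter tends to remove.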

Unfiltered data:
table_H_E0_NL.qzv (566.3 KB)
barplots_H_E0_NL_UNITE2022_single.qzv (489.1 KB)

Filtered data:
table_H_E0_NL_pa0.01_pre0.03.qzv (542.5 KB)
barplots_H_E0_NL_UNITE2022_single_pa0.01_pre0.03.qzv (465.2 KB)

The taxonomic composition, in terms of diversity, differs a lot between the groups.

After running core metrics on the filtered table, I gained only one sample on the plot; the 3 others were still at the same position in the 3D space.
jaccard_emperor.qzv (956.0 KB)
distance-matrix.tsv (7.3 KB)

I thought that maybe the metric was not the right choice, so, following this nice post (Alpha and Beta Diversity Explanations and Commands), I tried a bunch of metrics. With only one of them, "roger", were all 12 patient samples visible on the plot. Nevertheless, I don't like the distribution of the samples that much.
roger_plot_E0_HNL_NL.qzv (948.2 KB)

I used forward reads only for these analyses and denoised them with DADA2.

I don't know how to proceed. Do you have any ideas? I would really appreciate any suggestions :slight_smile:

(Sorry for the long post!)

Hi @CrZu,
Welcome to the QIIME 2 Forum!

For what it's worth, your PCoA plots look pretty reasonable to me. One thing you could try that might give you better resolution for the samples that are being placed on top of each other is to run qiime diversity beta-rarefaction and then load the resulting neighbor-joining tree in iTOL. You'll almost certainly see that those three samples have very short or zero-length branches between them, but all three labels will show up in the visualization. These neighbor-joining trees are a nice way to visualize beta diversity results when you have a relatively small number of samples.

Thank you very much for your answer @gregcaporaso!

I tried what you suggested with the following command:
qiime diversity beta-rarefaction --i-table table_E0_NL_pa_pre.qza --p-sampling-depth 1400 --m-metadata-file MF_allsamples_ITS.txt --p-metric jaccard --p-clustering-method nj --o-visualization beta_rare_E0_NL_pa_pre_jaccard_nj.qzv

and it looks like this:

There are 6 patient samples showing the same branch length of 0.5; these are the ones being placed on top of each other in the 3D plot.

Since I was still a bit confused, I filtered out the healthy group and analyzed the patient samples only. Now I can see all samples on the Jaccard plot, although the distances between most of the samples are still 1.

jaccard_emperor.qzv (967.7 KB)

distance-matrix.tsv (1.1 KB)

So, analysing the patient samples only, I realized that the distance values in the matrix are not the only thing that determines the PCoA plot: other calculations (eigenvectors) are also responsible for where the samples end up in the 3D plot. Thus, including the healthy group in the comparison causes these samples to be placed on top of each other, despite their dissimilarity.

Did I understand it right?

Hi @CrZu,
All the steps that you are describing make sense to me.

I think this is a good point. Sometimes samples look closer together than they really are because of their relative distance to other points when projected into the 3D plot.

It seems to me like you are understanding this correctly!
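To make the projection effect concrete, here is a small NumPy sketch of classical PCoA (Gower centering followed by an eigendecomposition) on a made-up distance matrix. Everything in it is an assumption for illustration: 10 "healthy" samples that are 0.4 apart, two patient subgroups of 4 samples that are 0.5 apart, and 4 "isolated" patients at distance 1 from every other sample, with distance 1 between groups. It is not the poster's data, and it is not how q2-diversity is implemented internally.

```python
import numpy as np


def build_distance_matrix(group_sizes, within_distances, n_isolated):
    """Block-structured distances: samples in the same group are `within`
    apart; every other pair (including all isolated samples) is 1 apart."""
    n = sum(group_sizes) + n_isolated
    dist = np.ones((n, n))
    start = 0
    for size, within in zip(group_sizes, within_distances):
        dist[start:start + size, start:start + size] = within
        start += size
    np.fill_diagonal(dist, 0.0)
    return dist


def pcoa(dist, n_axes=3):
    """Classical PCoA: Gower-center the squared distances, eigendecompose,
    and return coordinates on the `n_axes` axes with the largest eigenvalues."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1][:n_axes]
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))


def max_spread(coords):
    """Largest per-axis range of a set of points (0 means they coincide)."""
    return float((coords.max(axis=0) - coords.min(axis=0)).max())


# Healthy (10) + two patient subgroups (4 + 4) + 4 isolated patients (last rows).
all_dist = build_distance_matrix([10, 4, 4], [0.4, 0.5, 0.5], n_isolated=4)
spread_all = max_spread(pcoa(all_dist)[-4:])

# Same distances, but with the 10 healthy samples removed before ordination.
patient_dist = all_dist[10:, 10:]
spread_patients = max_spread(pcoa(patient_dist)[-4:])

print(f"spread of isolated samples, all samples:   {spread_all:.6f}")
print(f"spread of isolated samples, patients only: {spread_patients:.6f}")
```

In this toy setup the first three axes of the full ordination are used up by the differences between the larger groups, so the four isolated samples land on the same point even though they are at distance 1 from each other. Once the healthy group is removed, one of the first three axes separates them. The pairwise distances never change; only the set of axes shown does.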

Hope this helps!

Hi cherman2,

thank you very much for your response :slight_smile:
