Gneiss: Coordinates from the balances vs. metadata plot

Hi everybody and @mortonjt

Two things from my gneiss analysis that I would love to have your input on.

1. I wanted to check whether I am taking the right approach to testing my hypothesis and passing the right metadata variable to the appropriate command. Basically, I see variation in the microbial community along a soil depth profile, and I would like to use balances to check whether several geochemical parameters in my metadata correlate with it. So I chose gradient-clustering, because I want to assess how balances change along a gradient of a metadata category.

    qiime gneiss gradient-clustering \
      --i-table community_composition.qza \
      --m-gradient-file metadata.tsv \
      --m-gradient-category "SoilDepth" \
      --o-clustering SoilDepth_gc-hierarchy.qza

    qiime gneiss ilr-transform \
      --i-table community_composition.qza \
      --i-tree SoilDepth_gc-hierarchy.qza \
      --o-balances SoilDepth_gc-balances.qza

    qiime gneiss ols-regression \
      --p-formula "CO2+pH+phosphate" \
      --i-table SoilDepth_gc-balances.qza \
      --i-tree SoilDepth_gc-hierarchy.qza \
      --m-metadata-file metadata.tsv \
      --o-visualization regression-summary.qzv

To check which parameter best explains the variation along the depth profile, the sequences were clustered along soil depth, and then OLS regression was performed with CO2+pH+phosphate, for all of which I have data along the depth gradient. My questions are: a) was --m-gradient-category “SoilDepth” the right choice in the clustering step, b) should I include soil depth in --p-formula as a control, and c) in the regression summary for the OLS model, what does the “Intercept” represent (mse, Rsquared, R2diff)?

2. With the balance-taxonomy command I can obtain this nice balance vs. metadata plot. However, it seems that only a PDF of the plot can be downloaded. How would I obtain the coordinates of the data points from the figure?

    qiime gneiss balance-taxonomy \
      --i-table community_composition.qza \
      --i-tree SoilDepth_gc-hierarchy.qza \
      --i-taxonomy taxonomy-vsearch.qza \
      --p-taxa-level 3 \
      --p-balance-name 'y0' \
      --m-metadata-file metadata.tsv \
      --m-metadata-category pH \
      --o-visualization pH_balance-taxa.qzv

One last question: Could somebody explain what exactly the --p-balance-name option does?

Thanks for all your help!

cheers,
steffen

Just a heads up: @mortonjt mentioned he is travelling for the next two weeks, so responses will probably be delayed a bit. I’m afraid I don’t have the expertise to answer your questions, but if anyone else does, please chime in!

Thank you very much for letting me know, @ebolyen.
Question 1 in particular is a rather theoretical one about the approach using balances, so I would appreciate input from anybody.

Thanks for all your support!

steffen

@mortonjt Any feedback?

Hi @steff1088, sorry about the delay. The reason for having the balance-taxonomy command is to be able to summarize interesting balances that get pulled out of the regression. In this case, the topmost balance, y0, is the one you are interested in.
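For readers wondering what a single balance such as y0 measures: a minimal sketch, assuming the standard isometric log-ratio definition (a scaled log ratio of the geometric means of two groups of taxa). The function name `balance` and the toy abundances are made up for illustration, and which side of the tree split counts as numerator is an assumption here. This is not gneiss's own code.

```python
import math

def balance(numerator, denominator):
    """ILR balance between two groups of strictly positive abundances.

    Hypothetical helper: sqrt(r*s/(r+s)) * ln(gmean(numerator) / gmean(denominator)),
    where r and s are the number of taxa in each group.
    """
    r, s = len(numerator), len(denominator)
    # The log of a geometric mean is the mean of the logs.
    log_gmean_num = sum(math.log(x) for x in numerator) / r
    log_gmean_den = sum(math.log(x) for x in denominator) / s
    return math.sqrt(r * s / (r + s)) * (log_gmean_num - log_gmean_den)

# Toy abundances (made up): equal groups, then a fourfold shift towards the numerator.
b_equal = balance([1.0, 1.0], [1.0, 1.0])
b_shift = balance([4.0, 4.0], [1.0, 1.0])
```

A balance of zero means the two groups have the same geometric mean; positive values mean the numerator group is relatively more abundant.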

All of the coordinates are available within the output of the ilr-transform. In your case, they are contained in SoilDepth_gc-balances.qza. If you want to pull out the balances yourself to manipulate them, you can just unpack that qza. If you are using Python or R, you can load it as a dataframe. Below is the way to do so using the qiime2 API in Python:

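    import pandas as pd
    import qiime2

    # Load the ilr-transform output and view it as a DataFrame
    # (samples as rows, balances as columns)
    art = qiime2.Artifact.load('SoilDepth_gc-balances.qza')
    balances = art.view(pd.DataFrame)

    # Values of the topmost balance, one per sample
    balances['y0']
    ...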

From there, you should be able to pull out the actual data points for data manipulation downstream.
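To get the (metadata, balance) pairs that the plot shows, the balance column can be joined with the metadata on sample ID. Below is a minimal sketch using plain pandas; the helper name `balance_coordinates` and the toy values are made up, and in practice `balances` would come from the artifact and `metadata` from metadata.tsv (assumed to be indexed by sample ID).

```python
import pandas as pd

def balance_coordinates(balances, metadata, balance, column):
    """Pair one balance with one metadata column, sample by sample.

    Hypothetical helper: both frames are assumed to be indexed by sample ID.
    """
    # Inner join keeps only the samples present in both tables.
    coords = metadata[[column]].join(balances[[balance]], how="inner")
    # Drop samples with a missing metadata value or balance.
    return coords.dropna()

# Toy stand-ins (made-up values) for the real balances and metadata tables.
balances = pd.DataFrame({"y0": [0.5, -0.2, 1.1]}, index=["s1", "s2", "s3"])
metadata = pd.DataFrame({"pH": [6.8, 7.1, 5.9]}, index=["s1", "s2", "s4"])

coords = balance_coordinates(balances, metadata, "y0", "pH")
```

The resulting frame has one row per shared sample, with the metadata value as x and the balance as y, and can be written out with `coords.to_csv(...)` for plotting elsewhere.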

Great, thanks @mortonjt that clarified a lot.

I know my questions 1a-c) go beyond the technical application of gneiss, but they are important for me to clarify so that I use the analysis appropriately for my research question. Could you briefly comment on those as well, please?

Thank you very much!!

1a) Yes, you can use SoilDepth for your clustering - this is analysis specific.
1b) Yes, that is also OK.
1c) The intercept is the intercept for the linear regression. This can also be thought of as a reference community that provides a scaffold for building the model.
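To make the intercept concrete: in an ordinary least squares fit it is the predicted balance when every covariate in the formula is zero. A minimal sketch with NumPy follows, using a single covariate and made-up, noise-free values; `fit_ols` is a hypothetical helper and not how gneiss fits its models internally.

```python
import numpy as np

def fit_ols(covariates, response):
    """Least-squares fit of response ~ 1 + covariates.

    Hypothetical helper: returns the coefficients with the intercept first.
    """
    # Prepend a column of ones so that the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(response)), covariates])
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    return coef

# Made-up data: a balance that is exactly 2 + 3 * pH.
ph = np.array([5.0, 6.0, 7.0, 8.0])
y0 = 2.0 + 3.0 * ph

intercept, slope = fit_ols(ph, y0)
```

Here the fit recovers the intercept and slope used to build the toy data; with real data the intercept is the baseline around which the covariate effects are estimated.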
