Issues with running Gneiss, loading regression summary, and Firefox - view.qiime2.org compatibility

Alex_14262 · March 28, 2018, 1:43pm

Hi,

I still haven't found a way to look at two categories at a time to get the proportion plots for gneiss. If I filter the feature table so that only two categories remain, wouldn't that also affect the calculated balances (compared with the balances calculated for the complete feature table, where all categories are present)? That would mean running gneiss for each pair of categories within a level. Wouldn't this inflate the results because of multiple testing?

Thanks

mortonjt · March 28, 2018, 2:48pm

@Alex_14262, there may be some confusion between raw visualization and statistical tests. Constructing multiple plots should not affect the outcome of the statistical test, and you should be able to feed filtered tables and metadata directly into balance-taxonomy.

That being said, if you find yourself doing a substantial amount of manual manipulation of the metadata / tables to generate these sorts of plots, it may be worthwhile to think about building some scripts. It may be easier to export the feature tables and draw the proportion plots directly in Python (see the proportion_plot function).
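For reference, here is a minimal sketch of why dropping samples does not by itself change the balances of the samples that remain. A balance is a log ratio computed within each sample, so with a fixed tree (fixed numerator and denominator feature sets) the value for a sample depends only on that sample's own counts. The toy counts, sample names, group labels, and the `balance` helper are made up for illustration and are not part of gneiss. If the tree is rebuilt from the filtered table, the balances can of course differ.

```python
import math

def balance(counts, num, denom, pseudocount=1):
    """Balance (isometric log ratio) of one sample for a fixed split.

    counts: dict mapping feature ID -> count for a single sample.
    num, denom: feature IDs in the numerator and denominator.
    """
    r, s = len(num), len(denom)
    # The mean of the logs is the log of the geometric mean.
    log_gm_num = sum(math.log(counts[f] + pseudocount) for f in num) / r
    log_gm_den = sum(math.log(counts[f] + pseudocount) for f in denom) / s
    return math.sqrt(r * s / (r + s)) * (log_gm_num - log_gm_den)

# Hypothetical feature table: sample -> {feature: count}
table = {
    "s1": {"f1": 10, "f2": 0, "f3": 5, "f4": 20},
    "s2": {"f1": 3, "f2": 7, "f3": 0, "f4": 1},
    "s3": {"f1": 8, "f2": 2, "f3": 9, "f4": 4},
}
groups = {"s1": "A", "s2": "B", "s3": "C"}
num, denom = ["f1", "f2"], ["f3", "f4"]

# Balances computed on the complete table
all_balances = {s: balance(c, num, denom) for s, c in table.items()}

# Keep only the samples in groups A and B, then recompute
kept = {s: c for s, c in table.items() if groups[s] in ("A", "B")}
pair_balances = {s: balance(c, num, denom) for s, c in kept.items()}
```

With the split held fixed, `pair_balances` holds the same values as `all_balances` for `s1` and `s2`; only the set of samples changes.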

Alex_14262 · March 28, 2018, 2:55pm

Thanks @mortonjt!

The problem is that I have tried filtering the composition table (which is used in the qiime gneiss balance-taxonomy command), but I cannot use qiime feature-table filter-samples to filter on metadata categories, because the table is of type FeatureTable[Composition] rather than FeatureTable[Frequency]. That implies I need to filter it before adding a pseudocount, and so before the balances are computed. I would therefore need to rerun the whole analysis for each pair of categories. Please correct me if I am wrong and there is an alternative method.

Also, I don't have knowledge of Python, and the turnaround for my project is quite short, so I am limited to qiime.

mortonjt · March 28, 2018, 3:43pm

Yeah, that is not ideal. You shouldn't have to rerun the entire pipeline every time you want to regenerate some plots.

The Python code is not bad; this is basically what you would have to do:

import qiime2
import numpy as np
import pandas as pd
from gneiss.plot import proportion_plot
art = qiime2.Artifact.load('<your table name>')
table = art.view(pd.DataFrame)
metadata = pd.read_table('<your metadata>', index_col=0)
numerator = pd.read_table('<numerator.csv from balance-taxonomy>', index_col=0)
denominator = pd.read_table('<denominator.csv from balance-taxonomy>', index_col=0)
# filter out metadata according to pairs
category = '< your treatment covariate>'
group1 = '< your treatment group 1 >'
group2 = '< your treatment group 2 >'
g1 = metadata[category] == group1
g2 = metadata[category] == group2
metadata = metadata.loc[np.logical_or(g1, g2)]
table, metadata = table.align(metadata, axis=0, join='inner')
# now actually plot
proportion_plot(table, metadata, category, 
                left_group=group1, right_group=group2,
                num_features=numerator.index,
                denom_features=denominator.index)

That should basically be it. You may have to play around with matplotlib to get the figure sizing, colors, and whatnot right.
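To see what the pair-filtering step does on its own, here is a small sketch on made-up data that needs only pandas (no QIIME 2 or gneiss). The sample IDs, feature names, and the `treatment` column are hypothetical, and `isin` is used here as an equivalent alternative to the `np.logical_or` mask above.

```python
import pandas as pd

# Hypothetical table: samples as rows, features as columns
table = pd.DataFrame(
    {"f1": [10, 3, 8, 6], "f2": [1, 7, 2, 5]},
    index=["s1", "s2", "s3", "s4"],
)
# Hypothetical metadata; s5 has no row in the table
metadata = pd.DataFrame(
    {"treatment": ["A", "B", "C", "A"]},
    index=["s1", "s2", "s3", "s5"],
)

category, group1, group2 = "treatment", "A", "B"

# Keep only the samples that belong to one of the two groups
metadata = metadata.loc[metadata[category].isin([group1, group2])]

# Inner-align on sample IDs so both frames hold the same samples
table, metadata = table.align(metadata, axis=0, join="inner")
```

After this, both frames contain only `s1` and `s2`: `s3` is dropped because it is in group C, `s4` because it has no metadata, and `s5` because it is not in the table.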

ebolyen · March 28, 2018, 5:52pm

One note on this line:

metadata = pd.read_table('<your metadata>', index_col=0)

That can be done with

metadata = qiime2.Metadata.load('<your metadata>').to_dataframe()

which will give you the same metadata parsing as the rest of QIIME 2.

(The CSV files should still be read with pd.read_table, as above.)
