I'm using Gneiss to find relevant differences in microbial composition. I have been trying to understand the methodology, but I'm still a little confused about a few things.
When I inspect the balances, which is more important: the p-value (significance) or the number of the balance (does a higher number mean worse)? I attach a CSV file of the sorted p-values obtained from the correlation analysis. There are a few significant balances, and I don't know which ones I should keep (the ones with the lowest y number?).
Thanks a lot for the links, and sorry for not using the search bar. I carefully read all of those topics. They were very useful, and I think I now understand gneiss much better, but I came across an unexpected finding and I'm looking for help:
I plotted a balance (attached below) that looks potentially interesting according to the regression summary. In the proportion plot, you can see that Bifidobacterium is more represented in patients with a LOWER accumulated-score, but it is shown in the numerator half of the graph. Why? I inspected the numerator taxa CSV file and it did not appear there. Instead, it appeared in the denominator taxa CSV file. This seems contradictory to me.
Is Bifidobacterium part of the denominator half of the balance? If so, why is it plotted in the numerator? Is Bifidobacterium more abundant in patients with low accumulated-score values?
Looks like Bifidobacterium is in the denominator and it is more prevalent at lower accumulated-score values, so that part makes sense. The part that should not be happening is the boundary in the proportion plot, where it looks like Bifidobacterium is in the numerator when it is actually in the denominator.
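To make the numerator/denominator logic concrete, here is a minimal sketch of how a balance value responds when a denominator taxon becomes more abundant. It assumes the standard isometric log-ratio balance (a scaled log ratio of the geometric means of the two groups). The `balance` helper and all the proportions are hypothetical, for illustration only, and are not taken from the attached data or from the gneiss source code.

```python
import numpy as np


def balance(numerator, denominator):
    """Hypothetical helper: ILR-style balance between two groups of taxa.

    Computes sqrt(r*s / (r+s)) * ln(g(numerator) / g(denominator)),
    where g() is the geometric mean and r, s are the group sizes.
    """
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(denominator, dtype=float)
    r, s = num.size, den.size
    # The mean of the logs equals the log of the geometric mean.
    log_ratio = np.log(num).mean() - np.log(den).mean()
    return np.sqrt(r * s / (r + s)) * log_ratio


# Made-up proportions. The first denominator entry stands in for a taxon
# such as Bifidobacterium that is abundant in low-score patients.
low_score = balance(numerator=[0.10, 0.10], denominator=[0.60, 0.20])
high_score = balance(numerator=[0.30, 0.30], denominator=[0.20, 0.20])

# A more abundant denominator taxon pulls the balance down.
print(low_score < high_score)
```

Under these assumptions, a taxon in the denominator that is enriched at low accumulated-score gives a lower balance value in those patients, which matches the taxa CSV files rather than the side it is drawn on in the proportion plot.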
Not sure when, or if, I'll be able to get around to fixing it, and not sure if it is even worthwhile at this point, given that all of these tools are in the process of being superseded by upcoming interactive plots (see https://github.com/fedarko/rankratioviz). But PRs are always welcome.