PERMANOVA with picrust2 data

ja.morillo · February 10, 2020, 12:08pm

Dear colleagues,
We have two simple questions about the feasibility of using the q2-diversity plugin for statistical analysis of picrust2-inferred data:

We applied the q2-picrust2 tutorial ("q2 picrust2 Tutorial", picrust/picrust2 Wiki on GitHub) to a simple dataset (three groups of soil microbiomes, n=5 each). We then ran q2-diversity on the q2-picrust2 output to test for significant differences in "potential community functions" among the three treatments:
qiime diversity core-metrics \
  --i-table q2-picrust2_output/pathway_abundance.qza \
  --p-sampling-depth 7031892 \
  --m-metadata-file metadata.tsv \
  --output-dir core_metrics_out \
  --p-n-jobs 18

qiime diversity beta-group-significance \
  --i-distance-matrix core_metrics_out/jaccard_distance_matrix.qza \
  --m-metadata-file metadata.tsv \
  --m-metadata-column Treatment \
  --p-method permanova \
  --p-permutations 999 \
  --p-pairwise \
  --o-visualization diversity/Treatment-jaccard-sign.qzv \
  --verbose

We used a PERMANOVA test to check for differences in “inferred functional pathways” at the community level, using the Jaccard and Bray-Curtis indices. In principle I don’t see a problem from a conceptual point of view (this is also compositional data), but we would be very happy to read your opinion on this! If this is not a good idea, can you suggest an alternative for getting an “overall” result?
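
For readers comparing the two indices: a minimal sketch, in plain Python, of how Jaccard (presence/absence only) and Bray-Curtis (abundance-weighted) are computed between two samples. The pathway counts below are made-up toy values, not data from this study, and the function names are only illustrative.

```python
def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - shared / union."""
    present_u = {i for i, x in enumerate(u) if x > 0}
    present_v = {i for i, x in enumerate(v) if x > 0}
    union = present_u | present_v
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(present_u & present_v) / len(union)


def bray_curtis_distance(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / (sum(u) + sum(v))."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical pathway abundances for two samples (four pathways each).
sample_a = [120, 30, 0, 50]
sample_b = [60, 30, 10, 100]

print(jaccard_distance(sample_a, sample_b))      # 3 of 4 pathways shared -> 0.25
print(bray_curtis_distance(sample_a, sample_b))  # 120 / 400 -> 0.3
```

Because predicted pathways tend to be present in most samples, Jaccard only responds to the few pathways that are absent somewhere, whereas Bray-Curtis responds to abundance shifts. That may be part of why the two indices can disagree.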
PERMANOVA results indicated non-significant differences with the Bray-Curtis index. With the Jaccard index, although the global PERMANOVA test was not significant, the pairwise comparison gave a very interesting result for two of the treatments (p<0.05), which is meaningful for us. Is it then correct to use the pairwise comparisons when the global PERMANOVA test is not significant? (This is also important to know when analyzing ASVs.) See attached qzv.
Treatment-jaccard-sign.qzv (339.1 KB)
Thanks in advance for your inputs!

colinbrislawn · February 10, 2020, 5:01pm

Using Jaccard distances between pathways seems OK to me. PERMANOVA gets its p-value from permutations rather than from a parametric distribution, so it should be pretty flexible. My biggest worry here is about the quality of predictions from PICRUSt!
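
To make the permutation idea concrete, here is a minimal pure-Python sketch of a PERMANOVA-style test on a distance matrix. It follows the usual pseudo-F definition (among-group over within-group sums of squared distances) and is not the code q2-diversity runs; the toy points, labels, and function names are assumptions for illustration only.

```python
import random
from itertools import combinations


def pseudo_f(dm, labels):
    """Pseudo-F statistic from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dm[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same idea, computed per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        pairs = combinations(members, 2)
        ss_within += sum(dm[i][j] ** 2 for i, j in pairs) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dm, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dm, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign group labels at random
        if pseudo_f(dm, shuffled) >= observed:
            hits += 1
    # The +1 counts the observed labelling as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Toy example: six samples on a line, two well-separated groups of three.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
toy_dm = [[abs(a - b) for b in points] for a in points]
toy_labels = ["A", "A", "A", "B", "B", "B"]

f_stat, p_value = permanova(toy_dm, toy_labels)
print(f_stat, p_value)
```

Even with this extreme separation the p-value cannot get very small, because three samples per group allow only a handful of distinct label shuffles.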

EDIT: There are conflicting views on this:
"You can't have your cake & eat it by performing Test #1 & not rejecting the global null, yet still going on to perform Test #2: the Type I error rate is greater than 𝛼 for this procedure." link
vs
"With one exception, post tests are valid even if the overall ANOVA did not find a significant difference among means." link
"these post-hoc procedures were designed to control familywise error rates IN THE ABSENCE OF ANY SIGNIFICANT PRIOR OMNIBUS ANALYSIS" link
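
The familywise-error concern in these quotes can be made concrete with a small sketch. It assumes three independent pairwise tests (one per pair of treatments) at alpha = 0.05 and then applies a Benjamini-Hochberg adjustment, which is, as far as I know, the kind of correction behind the q-values in the pairwise table. The p-values are hypothetical, not taken from the attached qzv.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Three pairwise comparisons among three treatments.
fwer = familywise_error_rate(0.05, 3)
print(round(fwer, 4))  # 0.1426

# Hypothetical raw pairwise p-values, for illustration only.
raw_p = [0.04, 0.30, 0.60]
q_values = benjamini_hochberg(raw_p)
print([round(q, 2) for q in q_values])  # [0.12, 0.45, 0.6]
```

In this toy case a raw p of 0.04 no longer clears 0.05 after adjustment, so it is worth checking the corrected column, not just the raw pairwise p-value.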

I think we need a real statistician to advise.

Don't be too disappointed in this result. PICRUSt is not always perfect with its predictions. Perhaps with RNA-seq you would find more pathways that differ.

Colin

ja.morillo · February 11, 2020, 12:15pm

Once more, thanks a lot for your input, Colin!
Yes, we agree, PICRUSt is not perfect; we include this in the discussion and take the NSTI quality-control values into account as well. That's why we tried to focus on an "overall potential functionality" of the soils (using PERMANOVA), rather than on a specific and perhaps risky interpretation of individual "significant" pathways.
In this sense, what you found about PERMANOVA interpretation is interesting and frustrating at the same time. We will try to ask a statistician about this, but of course we would be glad to know what a "statistician-microbiome-q2-friend" thinks, if that hopefully happens in this forum. Actually, this also applies to any PERMANOVA analysis at the ASV and taxa level, which is a routine analysis for many of us.
cheers!!

colinbrislawn · February 11, 2020, 4:04pm

Good morning,

Awesome! I can tell you know your stuff!

Let's see if we can get a statistician to comment on this:

Is it then correct to use the pairwise comparisons when the global PERMANOVA test is not significant?

Let's see if @benjjneb @mortonjt @jwdebelius @Mehrbod_Estaki can point us in the right direction!

Colin

jwdebelius · February 11, 2020, 5:20pm

Not a statistician, but I'm going to comment anyway!

I would be wary, given that you only have 5 samples per group in three groups! To me, that's right on the edge of "not enough data to draw a conclusion" and the fact that your global result doesn't support the pairwise makes me kind of nervous here. But, others may have different opinions.
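
One way to see why n=5 sits on the edge: a rough sketch counting how many distinct ways the group labels can be assigned, which limits how small an exact permutation p-value can get. The function name is illustrative, and the group sizes are the ones from this thread (5 per group).

```python
from math import factorial


def distinct_labelings(group_sizes):
    """Number of distinct ways to assign group labels (multinomial coefficient)."""
    total = factorial(sum(group_sizes))
    for size in group_sizes:
        total //= factorial(size)
    return total


pairwise = distinct_labelings([5, 5])        # one pairwise comparison
global_test = distinct_labelings([5, 5, 5])  # all three treatments

print(pairwise)     # 252
print(global_test)  # 756756

# Swapping the two group names gives the same statistic, so at least
# 2 of the 252 labelings tie with the observed one in a pairwise test.
min_exact_p = 2 / pairwise
print(round(min_exact_p, 4))  # 0.0079
```

So a pairwise comparison of 5 vs 5 samples has only 252 possible labelings to draw from, and 999 random permutations mostly resample the same ones.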

Best,
Justine

Mehrbod_Estaki · February 11, 2020, 10:58pm

I'm on the same page as @jwdebelius on this one.

I think the PERMANOVA would be kosher here. I don't see any issues with it.

In typical datasets I don't even bother running a post-hoc test unless there is a significant global p. In fact, I know many statisticians who say that running a post-hoc test on a non-significant global model is wrong and doesn't make sense... but to me nothing is ever that black and white, so I guess it would really depend on the data and the question, on the pros and cons of the outcome, and on what other data you have to support those results.

n=5 is usually about where I draw the line too, the point where I ask myself whether I even need to show a stats test to discuss the results. Especially when the effect size is so small, as in your case, you may end up doing more damage by speculating about and discussing an outcome with very weak support. No matter how many disclaimers you give in the discussion, readers often forget those and only remember the reported results, which again, depending on the question being answered, may or may not be a big deal.
Remember, your discussion doesn't have to be "yes, difference" OR "no difference"; you can also choose "we don't know" or "not able to tell". PICRUSt results are really more about generating hypotheses anyway, so you can always leave it at that in your discussion too, for example: "we think one interesting pattern to pursue in future, based on our predicted results, is blah blah" (if you have extra data supporting that pathway, say from qPCR, western blot, ELISA ..., include it here). Then you can discuss what the implications would be if that were true. That's about as much as I would use this for.

ja.morillo · February 12, 2020, 11:34am

Thanks again for your interesting insights! Yes, we agree that n=5 is very low. This is a particular situation of this study, where we had to get samples from companies testing a specific agricultural technique, and even getting these 15 was hard. But we've learnt a lot from this discussion about PERMANOVA and it was very useful! We will report this data with care. Cheers!
