Thank you, @michael.shaffer, for developing the q2-SCNIC package. I did not know about this plugin before, and I am planning to run SparCC in Python.
I recently read some articles that use the SparCC method to explore the relationship between environmental factors and the microbial community.
Please refer to the materials and methods in the following paper: https://www.nature.com/articles/s41396-019-0417-9
I have several questions:
Based on the q2-SCNIC tutorial, the input file is an OTU table. I saw on the GitHub page that SCNIC can run between-table correlations (but this is not currently implemented for the SparCC method?). How, then, can I run correlations between environmental factors and the microbial community? Could you please give me some suggestions? What kind of data file do I need to prepare?
If correlations between an OTU table and environmental factors are not currently available in q2-SCNIC, can I run this kind of correlation in the original SparCC package? I looked for an example but could not find one on their page: https://bitbucket.org/yonatanf/sparcc/src/default/ (Could I add more columns to the OTU table to include the environmental factors, and then use the new table, containing both OTUs and environmental factors, to run SparCC? The final output would then include all the correlations, not just those between OTUs and environmental variables. Does this make sense if I want to follow what the paper I referred to above did?)
I have also seen some people use the SPIEC-EASI method for network analysis. I am trying to figure out what the differences between SparCC and SPIEC-EASI are, and I am still confused about these two methods. If I want to examine the correlations between the microbial community and environmental factors, which approach do you suggest? Could you please give me some advice?
Hi @Lei,
I would suggest you explore the information here regarding network analysis tools that are commonly used by researchers. CoNet is an excellent tool in Cytoscape that can help you study the associations between environmental factors and taxon abundances.
Thank you for sharing the information. I already knew about CoNet, and it is definitely a great tool. But the correlation measures in CoNet (e.g., Spearman, Pearson) can introduce bias when applied to compositional data.
Another reason I want to use q2-SCNIC is that this plugin provides ecological module (cluster) analysis, which makes the data easier to interpret.
I am still waiting for @michael.shaffer's reply. I hope he can give some more suggestions.
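The compositionality bias mentioned above can be illustrated with a small simulation. This is only a hedged sketch, not anything taken from CoNet or SparCC: it assumes three hypothetical taxa whose absolute abundances are independent, converts them to relative abundances, and compares the mean Pearson correlation before and after. The function name, the number of taxa, and the distribution parameters are all made up for the example.

```python
import numpy as np


def mean_offdiag_corr(x):
    """Mean Pearson correlation over all distinct column pairs of x."""
    corr = np.corrcoef(x, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)
    return corr[upper].mean()


rng = np.random.default_rng(0)

# Hypothetical absolute abundances: 2000 samples x 3 taxa, drawn independently,
# so the true correlation between any two taxa is zero.
absolute = rng.lognormal(mean=0.0, sigma=0.5, size=(2000, 3))

# Closure: divide each sample by its total to get relative abundances.
relative = absolute / absolute.sum(axis=1, keepdims=True)

raw_corr = mean_offdiag_corr(absolute)
closed_corr = mean_offdiag_corr(relative)

print(f"mean correlation, absolute abundances: {raw_corr:.3f}")
print(f"mean correlation, relative abundances: {closed_corr:.3f}")
```

Because each sample's relative abundances must sum to 1, an increase in one taxon forces the others down, so the second number comes out clearly negative even though the taxa were simulated as independent. The effect is strongest with few taxa, which is why only three are used here.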
Thank you.
Hi @Lei,
You are right, q2-SCNIC performs better on ASVs/OTUs because it can address the data compositionality issue. However, I am not sure whether one can look into the negative correlations between nodes using q2-SCNIC. Just so you know (please ignore this if you are already aware of it), there are two apps for finding clusters/modules within a network: CytoCluster and MCODE. Both are available for installation from the Cytoscape store.
Also, I am wondering if anyone has tried the "Between" functionality in the SCNIC package for finding associations between environmental factors and communities. @michael.shaffer would be the right person to give feedback on this.
I did not know that q2-SCNIC cannot handle negative correlations. Thank you for letting me know. I have one question about this: do you know under what conditions we need to consider negative correlations? I have read several papers that only consider positive correlations. Does the following sentence, which is from a paper, make sense to you?
I also did not know that Cytoscape has apps for finding modules and clusters. I am still preparing the environmental factors for my samples. I will definitely try all the methods and compare the results.
Hi @Lei,
Well, I am not sure what precisely the authors meant by the sentence (highlighted in yellow) in their article. However, negative correlations/mutual exclusion generally indicate competition, amensalism, alternative niche preference, etc. I would like to draw your attention to a recent article that nicely summarizes the pitfalls of different correlation metrics. I picked out some important points from this article for you:
"While we highlight the use of SparCC, it is worth noting that there are several other valid choices for network inference that can mitigate the issue of compositionality."
"Both classical correlation methods and more contemporary approaches like SparCC are susceptible to indirect associations."
"Whether inferring interspecies associations or species associations with environmental properties, indirect effects should be considered and accounted for to avoid reporting spurious, noninformative relationships."
Hi @bsen2018,
Thank you so much for your explanation and for sharing the newest article with me. The paper explains well the potentially biased results that different methods can cause. I understand them better now.
I have one more question. I contacted the author of the paper I mentioned in my previous post, who used SparCC for correlation analysis between OTUs and environmental factors. The author told me that they simply combined the OTUs and environmental factors into one input for the SparCC analysis and obtained the correlations from that. However, as I understand it, SparCC is designed for correlations among microbial taxa, and the example provided on the SparCC website uses only OTUs as input. What do you think about combining OTUs and environmental factors as input for a SparCC analysis? Do you think this approach makes sense?
Thank you for your help! I am looking forward to your reply.
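For reference, the combined input described by the author could be assembled roughly as follows. This is a minimal sketch using pandas, with hypothetical sample IDs, OTU names, and environmental variables. It only builds the table; it does not run SparCC, and it does not settle whether non-count columns such as pH fit the assumptions of the method. The rows-as-features orientation of the output is also an assumption and should be checked against whatever tool reads the file.

```python
import pandas as pd

# Hypothetical OTU counts: rows are samples, columns are OTUs.
otu = pd.DataFrame(
    {"OTU_1": [120, 30, 55], "OTU_2": [10, 200, 75], "OTU_3": [5, 0, 40]},
    index=["S1", "S2", "S3"],
)

# Hypothetical environmental measurements; sample order and coverage differ
# from the OTU table on purpose (S3 has no measurements, S4 has no counts).
env = pd.DataFrame(
    {"pH": [6.8, 7.4, 5.9], "temperature": [12.5, 18.0, 15.2]},
    index=["S2", "S1", "S4"],
)

# Align on sample ID and keep only samples present in both tables.
combined = otu.join(env, how="inner")

# Assumed orientation for the downstream tool: features as rows, samples as columns.
combined_by_feature = combined.T
combined_by_feature.index.name = "feature_id"

# Tab-separated text that could be saved to a file.
tsv_text = combined_by_feature.to_csv(sep="\t")
print(tsv_text)
```

Joining on the sample ID rather than pasting columns side by side avoids silently pairing counts from one sample with measurements from another when the two files are sorted differently.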