Preparing and importing custom diets and taxonomy files for q2-MICOM

Mehrbod_Estaki · June 10, 2021, 10:42am

Hi @cdiener,
Finally getting a chance to check out q2-micom and it's looking really cool so far! And thanks for the excellent documentation and tutorials!

I have a few quasi-related questions about the starting input files to use in MICOM. I couldn't find a clear answer to these anywhere in the documentation (both the q2 and stand-alone versions).

In the q2-micom tutorial, you use a pre-built western diet environment and mention that the VMH database can be used for other diets. However, I couldn't find how these other diets (or a custom one made on VMH) should be imported. For example, your prebuilt western_diet_gut.qza artifact has 4 columns:

flux	dilution	metabolite	reaction
0.0148986	0.1	fru_m	EX_fru_m
0.0148986	0.1	glc_m	EX_glc_m
0.0148986	0.1	gal_m	EX_gal_m

While the diets I download directly from VMH have only 2:

Reaction	Flux Value
EX_etoh[e]	0
EX_h2o[e]	158601.920147786
EX_caro[e]	0.001620496184756
EX_retinol[e]	3.10698212193613

What is the intermediary process exactly to go from one to the other?

The AGORA taxonomy appears, I think, to be in NCBI taxonomy format; however, the taxonomy you use in the q2-micom tutorial is based on SILVA. What magic is being used under the hood to map these two together? On that note, are other databases supported, namely Greengenes and GTDB?
In a situation where we have information about each individual's diet, we can produce a specific environment model for each individual, on the assumption that this will increase the accuracy of the flux estimates. Is there an easy way to run the growth simulations by providing a manifest file of sorts that would map each subject to their own diet .qza, rather than running them individually? If not, how would you recommend importing, running growth simulations, and merging a bunch of samples with their own unique diets?

I'm sure more questions will be heading your way soon

cdiener · June 10, 2021, 6:29pm

Hi Mehrbod,

So cool to see you using MICOM. Sorry that the docs did not address those questions well. Growth media and environmental conditions were a recent focus area, and some of the recent developments have not made it into the docs yet. The good news is that there are now pretty simple solutions to all of them.

This is a very common question we get. The VMH diets do need some processing to be usable with MICOM. Basically, the steps are:

convert units to mmol/gDW/h and adjust IDs
(for gut) dilute components usually absorbed in the small intestine
fill in components that are required for bacterial growth but absent in the growth medium
This is now documented in the new MICOM media repository (micom-dev/media on GitHub), which provides notebooks that go from the raw data to the QIIME 2 artifacts. There is already one VMH diet that was converted, and that might be a good starting point for converting the others. If you do so, we would love to get a PR with the new medium.
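For readers who want a feel for the first two steps before opening the notebooks, here is a minimal sketch using only the Python standard library. It is not the code from the media repository. It assumes the VMH values are per day (hence the division by 24), that the `flux` column holds the already diluted value, and that MICOM IDs follow the `EX_<metabolite>_m` pattern seen in the western diet table above. The input data, the `ABSORBED` set, and the function name `vmh_to_micom` are hypothetical. The third step (filling in components required for growth) needs the actual models and is not covered here.

```python
import csv
import io
import re

# Hypothetical example data in the two-column layout of a VMH diet export.
VMH_TSV = (
    "Reaction\tFlux Value\n"
    "EX_etoh[e]\t0\n"
    "EX_h2o[e]\t24.0\n"
    "EX_glc_D[e]\t48.0\n"
)

# ASSUMPTION: metabolites treated as absorbed in the small intestine.
# The real list has to come from curation, not from this sketch.
ABSORBED = {"glc_D_m"}


def vmh_to_micom(vmh_text, absorbed, hours=24.0, dilution=0.1):
    """Convert a VMH diet table into rows with MICOM-style columns."""
    rows = []
    for rec in csv.DictReader(io.StringIO(vmh_text), delimiter="\t"):
        # Adjust IDs: EX_glc_D[e] -> metabolite glc_D_m, reaction EX_glc_D_m.
        metabolite = re.sub(r"^EX_(.+)\[e\]$", r"\1_m", rec["Reaction"])
        # Dilute components that are assumed to be absorbed upstream.
        factor = dilution if metabolite in absorbed else 1.0
        # ASSUMPTION: input is per day, so divide by 24 to get per hour.
        flux = float(rec["Flux Value"]) / hours * factor
        if flux > 0:  # drop components that are not supplied at all
            rows.append({
                "flux": flux,
                "dilution": factor,
                "metabolite": metabolite,
                "reaction": "EX_" + metabolite,
            })
    return rows


medium = vmh_to_micom(VMH_TSV, ABSORBED)
for row in medium:
    print(row)
```

The resulting rows have the same four columns as the prebuilt western diet table, so they can be written out as a table and compared against the converted diet in the repository before importing anything into QIIME 2.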

Yes, good point. Initially, we just matched by names across those DBs, which usually does not perform well with SILVA. For 16S we now recommend using a classifier based on the RefSeq 16S sequences, which gives the best mapping in our experiments. Alex Carr from our lab built one for 515f-805r. If that is useful, I'll be happy to share it with you if you send your e-mail to cdiener(a)isbscience.org. GTDB has been requested before and I am working on that, but I still need to map all the AGORA strains from NCBI to GTDB. There is an issue tracking that. I originally hoped that AGORA 2 would come out before then, but that seems to be taking a bit longer.
Internally MICOM only uses per-sample media, so it's definitely possible. It's already implemented in the plugin even though it's not documented yet, since the analyses are a bit more challenging. Here is a gist that creates a per-sample medium. Just using that in qiime micom grow will apply the sample-specific media. If your media use different amounts of nutrients, I would correct for that in downstream analyses. For instance, if one medium supplies twice as many nutrients, it would lead to higher growth rates and fluxes in those samples.
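The gist itself is not reproduced in this thread, so the following is only a rough sketch of the idea: stack the individual media into one table and tag each row with the sample it belongs to. The column name `sample_id`, the sample names, the flux values, and the helper names are assumptions for illustration. The import into a QIIME 2 artifact is left out because the thread does not show it.

```python
import csv
import io


def per_sample_medium(media_by_sample):
    """Stack per-sample media into one list of rows tagged with the sample."""
    rows = []
    for sample_id, medium in media_by_sample.items():
        for row in medium:
            # ASSUMPTION: the sample is identified by a `sample_id` column.
            rows.append({**row, "sample_id": sample_id})
    return rows


def to_tsv(rows):
    """Serialize the combined medium as tab-separated text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical media for two subjects, same columns as a single medium.
media = {
    "subject_A": [
        {"flux": 0.2, "dilution": 0.1, "metabolite": "glc_D_m", "reaction": "EX_glc_D_m"},
        {"flux": 1.0, "dilution": 1.0, "metabolite": "h2o_m", "reaction": "EX_h2o_m"},
    ],
    "subject_B": [
        {"flux": 0.4, "dilution": 0.1, "metabolite": "glc_D_m", "reaction": "EX_glc_D_m"},
    ],
}

combined = per_sample_medium(media)
print(to_tsv(combined))
```

In this toy example subject_B receives twice the glucose of subject_A, which is exactly the kind of difference that would need to be accounted for when comparing growth rates and fluxes downstream.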

Mehrbod_Estaki · June 11, 2021, 8:57am

Thanks @cdiener! Those are wonderful additional resources; in fact, the example diet you linked there is more or less the diet I was going to try to curate! How convenient. However, now that I see the process, I may try to create custom per-person ones too.
I'll likely make a new thread to pick your brain about how you select some of those absorption dilution factors, as well as how you supplement or complete the growth media when the models don't actually grow. Not sure I quite understand that step in the notebook.

I'll certainly get in touch, that will be very useful! Also looking forward to the additional support for GTDB; it looks like the AGORA2 preprint is probably stuck in peer review.
Also, very cool that MICOM already has per-sample medium support. Super convenient that it's essentially just a concatenation.

Thanks again, will be in touch!
