Combining QIIME2 artifacts at the phylogenetic tree creation step for meta-analysis

Jean0521 · July 25, 2024, 3:25am

Hi all, still very much a beginner in using QIIME2. I'm doing a meta-analysis involving 6 studies, across which 3 different primer pairs were used. What I did was process each dataset separately up to the taxonomy assignment step. Before creating the phylogenetic tree, I merged the pertinent QIIME2 artifacts, then proceeded to create the tree. I used DADA2 for denoising.

Is this the right approach?

Thank you in advance.

SoilRotifer · July 25, 2024, 4:34pm

Hi @Jean0521, I'd be cautious about interpreting phylogenies and performing microbial community analysis from data that have been generated with different primers.

The reason is that each primer pair will have its own set of amplification biases, i.e. some primer pairs are better (or worse) at amplifying some taxa than others, which will skew your interpretation of which taxa are present and/or more or less abundant.

Also, you will not be able to construct a robust phylogeny if the sequenced regions do not overlap. You'd likely have to perform closed-reference OTU picking, or use Greengenes2 to map your reads to a reference sequence / phylogeny. Again, this will not remove primer biases and will lead to erroneous interpretation of your data.
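To illustrate what closed-reference mapping does to per-study tables, here is a minimal Python sketch. It is a toy and not QIIME 2 code: the ASV names, reference IDs, sample names, and the `collapse_to_reference` helper are all hypothetical, and the ASV-to-reference assignments are assumed to have already been produced by a closed-reference clustering step.

```python
from collections import defaultdict


def collapse_to_reference(table, asv_to_ref):
    """Sum ASV counts per sample under the reference ID each ASV matched.

    table:      {sample_id: {asv_id: count}}
    asv_to_ref: {asv_id: reference_id}; ASVs without a match are dropped,
                mirroring how unmatched sequences are set aside in
                closed-reference clustering.
    """
    collapsed = {}
    for sample, counts in table.items():
        per_ref = defaultdict(int)
        for asv, count in counts.items():
            ref = asv_to_ref.get(asv)
            if ref is not None:
                per_ref[ref] += count
        collapsed[sample] = dict(per_ref)
    return collapsed


# Hypothetical per-study tables from two different primer regions.
v4_table = {"studyA_s1": {"asv_v4_1": 10, "asv_v4_2": 5}}
v3v4_table = {"studyB_s1": {"asv_v3v4_1": 7, "asv_v3v4_9": 2}}

# Hypothetical closed-reference hits; asv_v3v4_9 matched nothing.
asv_to_ref = {
    "asv_v4_1": "ref_001",
    "asv_v4_2": "ref_002",
    "asv_v3v4_1": "ref_001",
}

# After collapsing, both studies share one feature ID space and can be merged.
merged = {
    **collapse_to_reference(v4_table, asv_to_ref),
    **collapse_to_reference(v3v4_table, asv_to_ref),
}
print(merged)
```

Both studies now share the feature ID `ref_001`, so the tables can be combined, but the counts under that ID still reflect how efficiently each primer pair amplified that taxon. The mapping changes the feature labels, not the underlying bias.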

I'd advise against merging the data in this way, especially if these samples are from different environments!

If they are from the same environments, then you can compare / contrast which primer pairs work best.

Mike_Stevenson · July 25, 2024, 4:54pm

Hi @Jean0521

While 3 primer pairs sounds like it could be messy, I think using a logical approach should allow you to gain some valuable outputs.

When you say each dataset was processed separately, does that mean each dataset = 1 primer pair? If so, running each dataset through DADA2 individually sounds good. Assigning taxonomy separately (by primer pair) is also good, as different pairs will target different regions, leading to variations in taxonomic assignment.

As @SoilRotifer mentioned, be cautious about merging these datasets, as the regions amplified by these different primer sets might not overlap. You won't be able to compare the results, because the phylogenetic relationships will have been inferred from different regions.

I wonder whether you could try to align the amplified regions from the different primer pairs prior to merging, though I'm not sure if this is the best approach.

Best of luck!

Jean0521 · July 29, 2024, 5:26am

Thank you so much, @Mike_Stevenson and @SoilRotifer! (And yes, 1 dataset = 1 primer pair.) I'm thinking of doing away with the phylogenetic tree and just using a different distance metric that won't require it (this will still work for my research question).

SoilRotifer · July 29, 2024, 6:47pm

This will not remove the primer bias issues we've discussed. The biases arise from the PCR amplification of the sequences, so they will occur with or without phylogeny-based methods.

If anything, the best you could do is a qualitative / richness comparison, i.e. presence or absence. That is, avoid evenness- / abundance-based metrics. You can try 'detecting' certain taxa... but again you run into the bias issue.
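As a rough illustration of a presence / absence comparison, here is a minimal Python sketch of the Jaccard distance, which ignores abundances and needs no tree. The taxon names, counts, and the `jaccard_distance` helper are made up for the example; it assumes the features being compared are taxon labels shared across datasets rather than region-specific ASV IDs.

```python
def jaccard_distance(counts_a, counts_b):
    """Jaccard distance between two samples using presence/absence only.

    counts_a, counts_b: {feature_id: count}. A feature is 'present' if its
    count is greater than zero; the actual abundance is ignored.
    """
    present_a = {f for f, c in counts_a.items() if c > 0}
    present_b = {f for f, c in counts_b.items() if c > 0}
    union = present_a | present_b
    if not union:
        # Two empty samples: treat them as identical.
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical genus-level counts for two samples.
sample_1 = {"Bacteroides": 120, "Prevotella": 3, "Blautia": 0}
sample_2 = {"Bacteroides": 4, "Prevotella": 0, "Faecalibacterium": 50}

# Richness is simply the number of features present in a sample.
richness_1 = sum(1 for c in sample_1.values() if c > 0)

d = jaccard_distance(sample_1, sample_2)
print(richness_1, round(d, 3))
```

Here the 120 vs. 4 difference in Bacteroides counts has no effect on the distance; only which taxa were detected matters. A taxon that one primer pair fails to amplify at all would still show up as absent, so the bias issue is reduced rather than removed.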

Either way, making inferences across these different primer sets is generally not advised, for the reasons outlined earlier.

-Mike

SoilRotifer · July 29, 2024, 7:27pm

I suggest you also read these papers about primer choice and bias. There are many more, but these are a few highlights:

Wasimuddin, Klaus Schlaeppi, Francesca Ronchi, Stephen L. Leib, Matthias Erb, and Alban Ramette. 2020. “Evaluation of Primer Pairs for Microbiome Profiling from Soils to Humans within the One Health Framework.” Molecular Ecology Resources, June. https://doi.org/10.1111/1755-0998.13215.
Soergel, David A. W., Neelendu Dey, Rob Knight, and Steven E. Brenner. 2012. “Selection of Primers for Optimal Taxonomic Classification of Environmental 16S rRNA Gene Sequences.” The ISME Journal. https://doi.org/10.1038/ismej.2011.208.
Darwish, Nadia, Jonathan Shao, Lori L. Schreier, and Monika Proszkowiec-Weglarz. 2021. “Choice of 16S Ribosomal RNA Primers Affects the Microbiome Analysis in Chicken Ceca.” Scientific Reports 11 (1): 11848.
Fadeev, Eduard, Magda G. Cardozo-Mino, Josephine Z. Rapp, Christina Bienhold, Ian Salter, Verena Salman-Carvalho, Massimiliano Molari, Halina E. Tegetmeyer, Pier Luigi Buttigieg, and Antje Boetius. 2021. “Comparison of Two 16S rRNA Primers (V3-V4 and V4-V5) for Studies of Arctic Microbial Communities.” Frontiers in Microbiology 12 (February): 637526.
Heidrich, Vitor, Lilian T. Inoue, Paula F. Asprino, Fabiana Bettoni, Antonio C. H. Mariotti, Diogo A. Bastos, Denis L. F. Jardim, Marco A. Arap, and Anamaria A. Camargo. 2022. “Choice of 16S Ribosomal RNA Primers Impacts Male Urinary Microbiota Profiling.” Frontiers in Cellular and Infection Microbiology 12.
