Genus-level gene trees

Hi all,

Is there a way to make genus-level 16S gene trees of the features?

Regards,
Tyler

Hello Tyler,

I’m not quite sure what you are asking.

Trees are usually made at the feature level, so if you have genus-level annotations for your features, I guess that makes the tree at the feature level too? :man_shrugging:

You could also merge your features at the genus level based on their assigned taxonomy, then build a tree from those features. Is that what you were thinking?
(How would you choose which sequence from each genus to use for tree building, I have no idea. :woman_shrugging: )

Hopefully we can refine this question a bit and get a better answer.

Colin


Hi Colin,

Instead of having one branch per feature/OTU, I would like one branch per bacterial genus that is represented in my table, such that features/OTUs would, technically, be collapsed.

A parallel example would be… instead of a microbial dendrogram representing every sample, it is represented at a higher order from the meta-data (e.g., body region).

Regards,
Tyler

Got it. That makes sense. :+1:

So this script will collapse (merge) the table:
https://docs.qiime2.org/2019.7/plugins/available/taxa/collapse/

But… you will also have to ‘merge’ the ASV sequences that are in the same genera, or just pick out one sequence to use. I’m still not sure what is the best way to do that :woman_shrugging:
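
To make the two steps concrete, here is a minimal Python sketch of the idea. It is not QIIME 2 code. It sums counts for features that share the same taxonomy string down to level 6 (genus), which is roughly what collapsing the table does, and then keeps the most abundant feature in each genus as its representative. The "most abundant" rule, the function names, and the toy data are all assumptions for illustration; it is only one possible way to pick a sequence.

```python
from collections import defaultdict


def genus_label(taxonomy, level=6):
    """Truncate a semicolon-delimited taxonomy string to its first `level` ranks."""
    ranks = [rank.strip() for rank in taxonomy.split(";")]
    return ";".join(ranks[:level])


def collapse_to_genus(table, taxonomy, level=6):
    """Sum per-sample counts of all features that share a genus label.

    table:    {feature_id: {sample_id: count}}
    taxonomy: {feature_id: taxonomy string}
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, counts in table.items():
        label = genus_label(taxonomy[feature_id], level)
        for sample_id, count in counts.items():
            collapsed[label][sample_id] += count
    return {label: dict(counts) for label, counts in collapsed.items()}


def pick_representatives(table, taxonomy, level=6):
    """For each genus, keep the feature ID with the highest total count (assumed rule)."""
    best = {}
    for feature_id, counts in table.items():
        label = genus_label(taxonomy[feature_id], level)
        total = sum(counts.values())
        if label not in best or total > best[label][1]:
            best[label] = (feature_id, total)
    return {label: feature_id for label, (feature_id, _) in best.items()}


# Hypothetical toy data: three ASVs, two of them in the same genus.
LACTO = ("k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
         "f__Lactobacillaceae; g__Lactobacillus")
BACTE = ("k__Bacteria; p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; "
         "f__Bacteroidaceae; g__Bacteroides")

table = {
    "asv1": {"S1": 10, "S2": 0},
    "asv2": {"S1": 5, "S2": 7},
    "asv3": {"S1": 1, "S2": 2},
}
taxonomy = {
    "asv1": LACTO + "; s__",
    "asv2": LACTO + "; s__reuteri",
    "asv3": BACTE,
}

genus_table = collapse_to_genus(table, taxonomy)
representatives = pick_representatives(table, taxonomy)
print(genus_table)
print(representatives)
```

The representative IDs could then be used to filter the sequences before building a tree, so that the tree has one tip per genus.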

Colin

Hi Colin,

That worked - thank you.

You are also correct, in that the “representative-sequences” need to be collapsed, as well. These, technically, are what is input to the gene tree. Any idea how to do so?

Alternatively, is there any way to collapse the rooted gene tree [the output of qiime phylogeny midpoint-root]?
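
As a rough illustration of what collapsing a tree could mean, the Python sketch below merges any clade whose tips all belong to one genus into a single tip. It is not a QIIME 2 feature: the nested-tuple tree, the `genus_of` mapping, and the function name are hypothetical, and branch lengths are ignored.

```python
def collapse_clades(tree, genus_of):
    """Return a copy of `tree` in which single-genus clades become one tip.

    tree:     a tip (feature ID as str) or a tuple of subtrees
    genus_of: {feature_id: genus name}
    """
    if isinstance(tree, str):
        # A tip is relabelled with its genus.
        return genus_of[tree]
    children = [collapse_clades(child, genus_of) for child in tree]
    # If every child collapsed to the same genus tip, merge them into one tip.
    if all(isinstance(child, str) for child in children) and len(set(children)) == 1:
        return children[0]
    return tuple(children)


# Hypothetical rooted tree with five ASV tips.
tree = ((("asv1", "asv2"), "asv3"), ("asv4", "asv5"))
genus_of = {
    "asv1": "Lactobacillus",
    "asv2": "Lactobacillus",
    "asv3": "Bacteroides",
    "asv4": "Prevotella",
    "asv5": "Prevotella",
}

collapsed = collapse_clades(tree, genus_of)
print(collapsed)
```

A genus whose ASVs do not form a single clade would still show up as more than one tip with this approach, so it only gives a clean one-branch-per-genus tree when each genus is monophyletic in the ASV tree.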

TC


I have a similar issue. I have performed all the alpha diversity analyses using a table collapsed to the genus level. Now I am being asked to move on to beta diversity analyses, and I want these to be done on the same genus-level data. However, creating the unifrac matrix requires a phylogeny, which I have only been successful at creating with the whole rep-seqs file.
Do you have any advice on how to create a genus-level unifrac matrix?


So, the R package phyloseq includes the functions tip_glom() and tax_glom() which glom/merge based on tree tips and taxonomy levels, respectively. This might be a good bet.
https://joey711.github.io/phyloseq/merge.html#merge_taxa


Before we go further, I guess I want to push back against merging your features at all.

Are we sure it’s needed, or even helpful?

Sure. People are more familiar with genera, so counting them makes more sense than counting ASVs.

Just to be consistent? Keep in mind that most beta diversity metrics are essentially percentages: a Jaccard distance of 0.4 means these samples are 40% different == 60% the same. Percentages are unitless, so 60% is 60% for ASVs, OTUs, and genera alike.
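
A small worked example of the Jaccard case, as a plain-Python sketch (the function name and the toy feature IDs are made up for illustration):

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets of observed features: 1 - |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty samples are treated as identical here (an assumption).
        return 0.0
    return 1 - len(a & b) / len(union)


# Two samples sharing 3 of the 5 features observed overall.
sample_1 = {"f1", "f2", "f3", "f4"}
sample_2 = {"f2", "f3", "f4", "f5"}

distance = jaccard_distance(sample_1, sample_2)
print(round(distance, 2))  # 3 shared out of 5 total, so about 0.4
```

The same function works whether the set members are ASV IDs, OTU IDs, or genus names. The value always falls between 0 and 1, although the number you get for a given pair of samples can change with the level you count at.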

While I make most of my graphs :bar_chart: at phylum, family, or genus level, all my beta diversity work is performed at the feature level without merging of any kind. I think that’s pretty common… but I would be interested in seeing how other people work with their data!

Colin


Hi Colin, thanks–I feel ok not collapsing to genus for the unifrac now.

There are three reasons I would collapse to genus generally:

  1. Consistency, as you mention. Both with my other analyses of this data and with other papers that deal with similar data with which our findings will be compared.
  2. The nature of the sequencing data, issue #1. These are single-end reads trimmed to 125 bp. I do not trust the species-level assignment of such short sequences, but the genus-level assignments are probably ok.
  3. The nature of the sequencing data, issue #2. Not this data but most of the other data I am working with has been created using a mix of five different primer pairs. The ASV numbers are a five-fold overestimate, but when collapsed to genus it’s fine.

Just my 2 cents here. I tend to agree with @colinbrislawn and my workflow is similar to his where I work at ASV level and use taxonomy in creating graphs.
Assigning taxonomy is totally optional as far as I am concerned. You can work with ASVs all the way through your analysis, and when you finally want to talk about a particular ASV, you can use its designated taxonomy to ‘guess’ what it is and what its functions are. This is the idea behind qiime2 and phyloseq as well: you keep those taxonomies separate and only call on them when you need to, rather than collapsing everything to a taxonomic assignment that may or may not be accurate at all, probably not…

Hi all,

I do work on the ASV-level for all of my analyses. When making a gene tree for all the ASVs, I didn’t want a couple hundred branches… but wanted all ASVs in a genus to be clumped together because the taxa I’m working with are completely new to the microbiome world. The R package would seem to fit the bill… would just need to figure out how to use R and, hence, why I was looking for a q2-related option.

TC

