Alignment/Tree Building and Filtering samples


I am running Q2-2018.6. I imported FeatureData[Sequence] and FeatureTable[Frequency] artifacts from USEARCH. That run contains multiple projects and I wish to split the samples by project. Should I do this before I build the MAFFT alignment and phylogenetic trees, creating a separate tree for each project? Alternatively, I could build one tree with all of the samples from all projects and then filter using a subsetted metadata file for core metrics or diversity analyses.

Hello Emily,

This is a tricky question. The basic answer is that the exact tree you build will depend on the microbes included in the MSA, so it is probably best to split up your samples, remove any OTUs that don’t appear in a cohort, then build one tree for each cohort.
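To make the "split, then drop absent OTUs" step concrete, here is a minimal sketch in plain Python. It is not QIIME 2 code: the table layout (a dict of feature IDs to per-sample counts), the function name `split_table_by_cohort`, and the toy sample and project names are all made up for illustration. It only shows the logic of keeping, for each cohort, the features that have a nonzero count in that cohort.

```python
from collections import defaultdict


def split_table_by_cohort(table, sample_to_cohort):
    """Split a feature table into one table per cohort.

    table: {feature_id: {sample_id: count}}            (hypothetical layout)
    sample_to_cohort: {sample_id: cohort_name}
    Returns {cohort_name: {feature_id: {sample_id: count}}}, keeping only
    features whose total count within that cohort is greater than zero.
    """
    per_cohort = defaultdict(dict)
    for feature_id, counts in table.items():
        # Group this feature's counts by the cohort of each sample.
        grouped = defaultdict(dict)
        for sample_id, count in counts.items():
            cohort = sample_to_cohort.get(sample_id)
            if cohort is not None:  # skip samples with no cohort assigned
                grouped[cohort][sample_id] = count
        # Keep the feature only in cohorts where it was actually observed.
        for cohort, cohort_counts in grouped.items():
            if sum(cohort_counts.values()) > 0:
                per_cohort[cohort][feature_id] = cohort_counts
    return dict(per_cohort)


# Toy data, invented for this example.
table = {
    "OTU_1": {"S1": 10, "S2": 0, "S3": 5},
    "OTU_2": {"S1": 0, "S2": 0, "S3": 7},
    "OTU_3": {"S1": 3, "S2": 4, "S3": 0},
}
sample_to_cohort = {"S1": "projectA", "S2": "projectA", "S3": "projectB"}

split = split_table_by_cohort(table, sample_to_cohort)
for cohort, sub_table in sorted(split.items()):
    print(cohort, sorted(sub_table))
```

Each per-cohort feature list is then what you would carry forward into that cohort's own alignment and tree. In QIIME 2 itself, the equivalent filtering is done with `qiime feature-table filter-samples` and `qiime feature-table filter-seqs` rather than by hand.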

However, the same is true of OTU clustering; mixing cohorts will affect the exact OTUs that appear in each one, so your cohorts will be a little bit mixed up no matter how you build the tree.

Short answer: either way would work OK… but 100% the best option :1st_place_medal: would be to start from the very beginning and process each cohort separately.

I know this is not the question you asked, but I wanted to provide as much detail as possible. Let me know if you have any other questions.


P.S. Did someone give you an old version of QIIME 2? You should try the new version! :gift:


Thanks @colinbrislawn! I am using 2019.1 (habit…forgot I updated :roll_eyes:). That answers my question perfectly. I ended up splitting as soon as possible and building multiple MSAs.

