Merging closed OTU and denovo OTU


I’m new to metagenomic analysis and I’m conducting an analysis on Crohn’s and healthy patients, both of which have only FASTA files for each sample.

I conducted my analysis up until de novo clustering and hit a roadblock after realizing that the primers mentioned in the FASTA files cover V1-V2 for the Crohn’s samples and V1-V3 for the healthy patients. I read up on OTU picking, and de novo clustering is recommended for overlapping regions; in my case that would mean de novo clustering for V1-V2 and closed-reference OTU picking for V1-V3.

My question, and I’m sorry if it is too basic: is it okay to combine these tables and seqs to move on to chimera and abundance filtering?


Hi @mallika,

This is a little bit challenging! First, I would check your read length/read overlap and make sure you used the same forward primer, at the very least. You’ll still get some primer bias between cases and controls, which is good to keep in mind.

You cannot mix and match OTU picking methods between the two datasets. A de novo/closed-reference combination will have mixed feature IDs, and since everything downstream runs off feature IDs, that will be a problem. I would trim both groups to the same length and then cluster both of them the same way, either de novo or closed reference. (I liked closed ref, but lots of people prefer other techniques.)
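A minimal sketch of the closed-reference route, assuming both groups have already been imported, trimmed to the same length, and dereplicated (those steps are not shown). All file names are placeholders, the 0.97 identity threshold is an assumption chosen to match a 97% reference set, and the QIIME variable is a hypothetical dry-run switch: by default each command is only printed, and setting QIIME=qiime runs it.

```shell
# Dry-run by default: prints each command. Use QIIME=qiime to execute.
QIIME="${QIIME:-echo qiime}"

# Cluster each group against the SAME reference so feature IDs are comparable.
for group in crohns healthy; do
  $QIIME vsearch cluster-features-closed-reference \
    --i-table "${group}-table.qza" \
    --i-sequences "${group}-seqs.qza" \
    --i-reference-sequences ref-97-otus.qza \
    --p-perc-identity 0.97 \
    --o-clustered-table "${group}-table-cr97.qza" \
    --o-clustered-sequences "${group}-seqs-cr97.qza" \
    --o-unmatched-sequences "${group}-unmatched.qza"
done

# Merge the two tables; features are now keyed by reference IDs in both.
$QIIME feature-table merge \
  --i-tables crohns-table-cr97.qza \
  --i-tables healthy-table-cr97.qza \
  --o-merged-table merged-table-cr97.qza
```

Because both tables are keyed by IDs from the same reference, a feature seen in both groups ends up as one row in the merged table rather than two unrelated ones.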



Thank you. I have used the closed-reference OTU clustering method on the Crohn’s and healthy patients separately and merged them afterwards, since they use different primers.

I do have another question though: for the taxonomic analysis, I do not have the primer sequences needed to run the extract-reads command and train a classifier on the otu_97 data. How can I proceed?

Hi @mallika,

You do not need to run taxonomic classification after closed-reference clustering. You can just import the database taxonomy, and those are your assignments.


Hello again.
I’m following the Moving Pictures tutorial from the phylogenetic tree step, using the merged .qza files of seqs and table from the Crohn’s and healthy patients. I’m at the taxonomy analysis step, which requires a classifier trained on the otu_97 FASTA file with primer sequences and other information, and I’m not sure how to go about it. You mentioned in your previous answer that I could import the database taxonomy. Would I be able to visualize it as in the Moving Pictures tutorial? I’m having trouble understanding how to import the database and the assignments. Could you point me to a resource, please? Thank you.

Hi @mallika,

When you cluster closed-reference OTUs, you get the taxonomy from the database. So, you’d import the taxonomy associated with your database reference set if it’s not already an artifact. For example, you might have to import the taxonomy for Greengenes 13_8 if you used that as your reference, but you might not need to import anything if you got your sequences/taxonomy via RESCRIPt.

If you’re working with a database not already imported, you can import it using the qiime tools import function:

qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path [path to taxonomy file] \
  --output-path [path to output taxonomy]

where [path to taxonomy file] and [path to output taxonomy] are the actual paths on your computer.
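Before importing, it can help to confirm that the taxonomy file really is headerless and tab-separated. The check below is only a sketch: it assumes a feature ID in the first column and a lineage string in the second, and it runs on a made-up two-line file rather than a real database file.

```shell
# Toy file standing in for a real taxonomy file (IDs and lineages are made up).
tax_file="$(mktemp)"
printf '1001\tk__Bacteria; p__Firmicutes\n1002\tk__Bacteria; p__Bacteroidetes\n' > "$tax_file"

# Count lines with fewer than two tab-separated columns.
bad_lines="$(awk -F'\t' 'NF < 2 { n++ } END { print n + 0 }' "$tax_file")"
echo "lines with fewer than two columns: $bad_lines"

# The first field of the first line should be a feature ID, not a column label.
first_field="$(head -n 1 "$tax_file" | cut -f 1)"
echo "first ID: $first_field"

rm -f "$tax_file"
```

If the first field turns out to be a column label instead of an ID, the file has a header and the headerless input format is probably not the right one.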

You should be able to import your phylogenetic tree the same way, except I think that’s type Phylogeny[Rooted].
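As a sketch, the tree import might look like the following, assuming the reference database ships a Newick tree. Both file names are placeholders, and the QIIME variable is a hypothetical dry-run switch that only prints the command unless you set QIIME=qiime.

```shell
# Dry-run by default: prints the command. Use QIIME=qiime to execute.
QIIME="${QIIME:-echo qiime}"

tree_nwk="reference-tree.nwk"   # placeholder: Newick tree from your reference database
tree_qza="rooted-tree.qza"      # placeholder: output artifact

$QIIME tools import \
  --type 'Phylogeny[Rooted]' \
  --input-path "$tree_nwk" \
  --output-path "$tree_qza"
```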

Then, you should be ready to continue with the Moving Pictures tutorial, picking up from the barplot and continuing through ANCOM.
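For the barplot step, a sketch with the merged table and the imported taxonomy could look like this. The file names are placeholders, and the QIIME variable is again a hypothetical dry-run switch (set QIIME=qiime to actually run the command).

```shell
# Dry-run by default: prints the command. Use QIIME=qiime to execute.
QIIME="${QIIME:-echo qiime}"

table="merged-table.qza"        # placeholder: merged closed-reference table
taxonomy="ref-taxonomy.qza"     # placeholder: imported database taxonomy
metadata="sample-metadata.tsv"  # placeholder: metadata covering both groups

$QIIME taxa barplot \
  --i-table "$table" \
  --i-taxonomy "$taxonomy" \
  --m-metadata-file "$metadata" \
  --o-visualization taxa-bar-plots.qzv
```

This works without training a classifier because the feature IDs in a closed-reference table are the reference IDs that the database taxonomy is keyed on.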



Thank you. That answered my question.

