Pipeline Question

Hi, I’m working on my first major QIIME run and I have a few questions about the best format for the pipeline. I’ve seen the examples online, but I was given an example that differs quite a bit, and I’d like some opinions on which approach is more likely to give a successful outcome.

Current Pipeline:

  1. Import paired-end demultiplexed sequences
  2. Denoise using dada2
  3. Make feature table - check for low read counts, determine # of reads for rarefying
  4. Run feature classifier to create taxonomy database
  5. Run Vsearch to cluster de novo at 99% identity
  6. Filter low-read samples out
  7. Rarefy samples and make new feature table
  8. Do all downstream analysis (diversity, heatplots, etc.).

I guess what I’m mostly wondering about is where to run vsearch. Should I run it before denoising (i.e. should step 5 move to step 2)?

Also, do you see any other issues with the pipeline as I currently have it?

Thanks so much for your help.
Alicia Reigel

Hi @reige012,
Is there a specific reason why you have to include vsearch at all? Is OTU clustering really needed for your project? I ask because you are already using dada2 to create ASVs, which are just higher-resolution analogues of OTUs. In most cases there is no longer any need for OTUs, so I would just remove that step altogether. Also, don’t forget about tree building if you want any phylogenetic insights.
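
To make the ASV/OTU relationship a bit more concrete, here is a toy sketch of greedy, abundance-sorted clustering in plain Python. It rests on strong assumptions: the sequences are equal-length and already aligned, identity is a simple per-position match fraction, and the sequences, counts, and helper names (`mutate`, `identity`, `greedy_cluster`) are all made up for illustration. It is not how vsearch computes identity or clusters real data.

```python
# Toy illustration only: hypothetical sequences and counts, not vsearch's algorithm.
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq, positions):
    """Return a copy of seq with a substitution at each given position."""
    chars = list(seq)
    for p in positions:
        chars[p] = SWAP[chars[p]]
    return "".join(chars)


def identity(a, b):
    """Fraction of matching positions; assumes equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(asv_counts, threshold):
    """Visit ASVs from most to least abundant; each one joins the first
    centroid it matches at >= threshold, otherwise it starts a new OTU."""
    otus = {}
    for seq, count in sorted(asv_counts.items(), key=lambda kv: -kv[1]):
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid] += count
                break
        else:
            otus[seq] = count
    return otus


base = "ACGT" * 25                       # 100 bp reference ASV
one_off = mutate(base, [50])             # 99% identical to base
three_off = mutate(base, [0, 1, 2])      # 97% identical to base
distant = mutate(base, range(10, 20))    # 90% identical to base

asvs = {base: 500, one_off: 120, three_off: 40, distant: 10}

otus_99 = greedy_cluster(asvs, 0.99)
otus_97 = greedy_cluster(asvs, 0.97)
print(len(asvs), len(otus_99), len(otus_97))  # 4 ASVs, 3 OTUs at 99%, 2 OTUs at 97%
```

In this toy case the four ASVs collapse into three groups at 99% and two at 97%, so clustering after dada2 mostly merges variants that denoising had already resolved.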


You do not need to rarefy manually either: rarefied tables should only be used as input for alpha/beta diversity methods, and rarefaction is automatically built into the core-metrics pipeline. Do not use rarefied tables for differential abundance methods like ANCOM.
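
If it helps to see what rarefying actually does to a table, here is a minimal sketch in plain Python. The sample names, feature names, counts, and the `rarefy` helper are hypothetical, and this is only the general idea (subsample every sample to the same depth without replacement, dropping samples that fall short), not QIIME 2's implementation.

```python
import random
from collections import Counter


def rarefy(table, depth, seed=0):
    """Subsample each sample to exactly `depth` reads without replacement.

    `table` maps sample ID -> {feature ID: read count}. Samples with fewer
    than `depth` reads are dropped, which is why the depth has to be chosen
    with the low-read samples in mind.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample, counts in table.items():
        if sum(counts.values()) < depth:
            continue  # too shallow: this sample is lost at this depth
        # Expand counts into one entry per read, then draw `depth` of them.
        pool = [feature for feature, n in counts.items() for _ in range(n)]
        rarefied[sample] = dict(Counter(rng.sample(pool, depth)))
    return rarefied


# Hypothetical feature table with very uneven sequencing depth.
table = {
    "sampleA": {"f1": 60, "f2": 40},
    "sampleB": {"f1": 10, "f2": 5, "f3": 985},
    "sampleC": {"f1": 3, "f2": 2},
}

even = rarefy(table, depth=50)
for sample, counts in even.items():
    print(sample, sum(counts.values()), counts)
```

Here `sampleC` has only 5 reads, so it disappears at a depth of 50, while the other two samples are cut down to 50 reads each. The original `table` is left untouched, which matches the advice above: keep the unrarefied table for methods like ANCOM.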


Thank you for your thoughts. These are both very useful comments.

I do have another question regarding these responses. Is there a way to essentially “clean” and trim the sequences without using dada2 so that I can then cluster into OTUs instead of ASVs? I’d like to try both OTUs (at 99% and 97%) and ASVs to see if there are any differences in the final outcome. Thanks!

See q2-quality-filter to perform QIIME 1-style quality filtering.
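
As a rough picture of what Phred-based quality filtering means, here is a simplified sketch in plain Python. The function names, the `min_quality=20` and `min_fraction=0.75` thresholds, and the example reads are assumptions chosen for illustration; the plugin's actual rules and defaults may differ, so check its documentation before relying on any numbers.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (assumed Phred+33) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def quality_trim(seq, quality, min_quality=20, min_fraction=0.75):
    """Truncate a read at the first base scoring below `min_quality`.

    Returns (sequence, quality) for the kept prefix, or None when the prefix
    is shorter than `min_fraction` of the original read length.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality strings must be the same length")
    keep = len(seq)
    for i, score in enumerate(phred_scores(quality)):
        if score < min_quality:
            keep = i
            break
    if keep < min_fraction * len(seq):
        return None  # too little high-quality sequence left; discard the read
    return seq[:keep], quality[:keep]


# 'I' decodes to Q40 and '#' to Q2 under Phred+33.
late_drop = quality_trim("ACGTACGTAC", "IIIIIIIII#")   # low quality only at the end
early_drop = quality_trim("ACGTACGTAC", "III#IIIIII")  # low quality near the start
print(late_drop)
print(early_drop)
```

The first read only loses its last base and is kept; the second is truncated to 3 bases, falls under the length cutoff, and is discarded. Reads filtered this way could then go into OTU clustering without passing through dada2.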

There will be vast differences. Unless you use a mock community (as the dada2 developers did to benchmark their method), there is no telling which method is “better” :man_shrugging:

An off-topic reply has been split into a new topic: Are there alternatives to rarefying for alpha diversity estimation?

Please keep replies on-topic in the future.
