Pipeline Question

reige012 · November 20, 2018, 5:24pm

Hi, I'm working on my first major QIIME run and I have a few questions about the best format for the pipeline. I've seen the examples online, but I was given an example that differs quite a bit, and I'd like some opinions on which approach is more likely to give a successful outcome.

Current Pipeline:

1. Import paired-end demultiplexed sequences
2. Denoise using dada2
3. Make feature table: check for low read counts, determine # of reads for rarefying
4. Run feature classifier to create taxonomy database
5. Run Vsearch to cluster de novo at 99% identity
6. Filter out low-read samples
7. Rarefy samples and make new feature table
8. Do all downstream analysis (diversity, heatplots, etc.)

I guess what I'm mostly wondering about is where to run vsearch. Should I run it before denoising (i.e. should step 5 move to step 2)?

Also, do you see any other issues with the pipeline as I currently have it?

Thanks so much for your help.
Alicia Reigel

Mehrbod_Estaki · November 20, 2018, 9:17pm

Hi @reige012,
Is there a specific reason why you have to include Vsearch at all? Is OTU clustering really needed for your project? I ask because you are already using dada2 to create ASVs, which are just higher-resolution analogues of OTUs. In most cases, there is no longer any need for OTUs. I would just remove that step altogether. Also, don't forget about tree building if you want any phylogenetic insights.
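To make the resolution point concrete, here is a toy sketch of what clustering does to a set of ASVs. It is not vsearch's algorithm (vsearch computes identity from alignments); it assumes equal-length sequences and a simple position-by-position identity, and the sequences, read counts, and the `hamming_identity` and `cluster_greedy` helpers are all invented for illustration.

```python
def hamming_identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity only handles equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_greedy(asv_counts, threshold):
    """Visit ASVs from most to least abundant; each one joins the first
    existing centroid it matches at >= threshold, or becomes a new centroid."""
    otus = {}  # centroid sequence -> summed read count
    for seq, count in sorted(asv_counts.items(), key=lambda kv: -kv[1]):
        for centroid in otus:
            if hamming_identity(seq, centroid) >= threshold:
                otus[centroid] += count
                break
        else:
            otus[seq] = count
    return otus


# Hypothetical 100 nt ASVs and read counts, invented for illustration.
base = "ACGT" * 25
one_off = "C" + base[1:]       # 1 mismatch vs base   -> 99% identity
three_off = "CAA" + base[3:]   # 3 mismatches vs base -> 97% identity
distant = "TGCA" * 25          # unrelated sequence
asv_counts = {base: 500, distant: 300, one_off: 40, three_off: 25}

for threshold in (1.0, 0.99, 0.97):
    otus = cluster_greedy(asv_counts, threshold)
    sizes = sorted(otus.values(), reverse=True)
    print(f"{threshold:.2f}: {len(otus)} clusters, read counts {sizes}")
```

With this toy data the four ASVs collapse to three clusters at 99% and two at 97%. Total reads are preserved, but the distinction between closely related variants is lost, which is the sense in which ASVs are the higher-resolution unit.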

Nicholas_Bokulich · November 20, 2018, 9:21pm

You do not need to rarefy either. Rarefied tables should only be used as input for alpha/beta diversity methods, and rarefaction is automatically built into the core-metrics pipeline. Do not use rarefied tables for differential abundance methods like ANCOM.
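For anyone unsure what rarefying actually does to a table, here is a minimal conceptual sketch. It is not how QIIME 2 implements it; the `rarefy` helper, the sample IDs, and the counts are made up for illustration. Each sample is subsampled without replacement to the same depth, and samples with fewer reads than that depth are dropped.

```python
import random
from collections import Counter


def rarefy(table, depth, seed=0):
    """Subsample every sample to `depth` reads without replacement.

    `table` maps sample ID -> {feature ID: read count}. Samples with fewer
    than `depth` total reads are left out of the result.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample_id, counts in table.items():
        if sum(counts.values()) < depth:
            continue  # too shallow: the sample is dropped
        # Expand counts into one entry per read, then draw `depth` of them.
        reads = [feature for feature, n in counts.items() for _ in range(n)]
        rarefied[sample_id] = dict(Counter(rng.sample(reads, depth)))
    return rarefied


# Hypothetical feature table, invented for illustration.
table = {
    "sample_a": {"asv1": 600, "asv2": 400},
    "sample_b": {"asv1": 50, "asv3": 30},
    "sample_c": {"asv2": 900, "asv3": 100, "asv4": 200},
}

even = rarefy(table, depth=500)
for sample_id, counts in even.items():
    print(sample_id, sum(counts.values()), counts)
```

Here `sample_b` (80 reads) falls below the depth of 500 and disappears, while the other two samples are cut down to exactly 500 reads each. Both effects, lost samples and discarded reads, are why the rarefied table is best kept to the diversity metrics and the unrarefied table used everywhere else.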

reige012 · November 26, 2018, 9:00pm

Thank you for your thoughts. These are both very useful comments.

reige012 · December 10, 2018, 4:33pm

I do have another question regarding these responses. Is there a way to essentially "clean" and trim the sequences without using dada2 so that I can then cluster into OTUs instead of ASVs? I'd like to do both OTUs (99% and 97%) and ASVs to see if there are any differences in the final outcome. Thanks!

Nicholas_Bokulich · December 10, 2018, 4:38pm

See q2-quality-filter to perform QIIME 1-style quality filtering.
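As a rough idea of what this style of filtering does, here is a simplified sketch. It is not the plugin's implementation: the `quality_filter` helper and its threshold values are illustrative assumptions, not the plugin's defaults. A read is truncated where a run of consecutive low-quality base calls begins, and is discarded if too little of it remains or if the remainder contains an ambiguous base.

```python
def quality_filter(seq, quals, min_quality=20, window=3, min_length_fraction=0.75):
    """Truncate a read at the start of the first run of more than `window`
    consecutive base calls below `min_quality`. Return None if the truncated
    read is shorter than `min_length_fraction` of the original or contains N.
    All threshold values here are illustrative, not plugin defaults."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality scores must be the same length")
    cut = len(seq)
    run = 0  # length of the current streak of low-quality calls
    for i, q in enumerate(quals):
        if q < min_quality:
            run += 1
            if run > window:
                cut = i - window  # index where the low-quality streak began
                break
        else:
            run = 0
    truncated = seq[:cut]
    if len(truncated) < min_length_fraction * len(seq) or "N" in truncated:
        return None
    return truncated


# Hypothetical 20 nt read with two made-up quality profiles.
read = "ACGTACGTACGTACGTACGT"
late_drop = [35] * 16 + [2] * 4    # quality collapses near the 3' end
early_drop = [35] * 10 + [2] * 10  # quality collapses halfway through

print(quality_filter(read, late_drop))   # kept, truncated to 16 nt
print(quality_filter(read, early_drop))  # discarded: only half the read survives
```

The reads that survive this kind of filter would then be dereplicated and clustered into OTUs, in place of the denoising step.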

There will be vast differences. Unless you use a mock community (as the dada2 developers did to benchmark their method), there is no telling which method is "better".

Nicholas_Bokulich · December 15, 2018, 1:40pm

An off-topic reply has been split into a new topic: Are there alternatives to rarefying for alpha diversity estimation?

Please keep replies on-topic in the future.

system · January 15, 2019, 7:40pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.