Vsearch chimera

stangedal · September 28, 2018, 11:11am

Hi there,

I have a question regarding VSEARCH and how it detects chimeras as implemented in Q2. In DADA2 we can choose whether detection should be limited to each sample (consensus), or whether all samples should be pooled before identifying chimeric sequences. So what does VSEARCH do? Per sample? Per run? Per what-ever-is-in-your-qza-file?

I just cannot seem to find the answer on my own.

Best,
Solveig

Nicholas_Bokulich · September 28, 2018, 1:26pm

Hi @stangedal!
Good questions. q2-vsearch has two different chimera filtering methods: reference-based (using vsearch’s uchime_ref method) and de novo (using vsearch’s uchime_denovo method).

For both of these, I believe chimera checking will occur on a what-ever-is-in-your-qza-file basis, since this is performed on sequences in a FeatureData[Sequence] artifact. A feature table is used as input to determine the frequency of each sequence, but as far as I can tell the sequences are still passed to vsearch all together.
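As a rough illustration of what "determine the frequency of each sequence" can mean here, the sketch below sums each feature's counts across every sample in a table and attaches the total to the sequence header as a vsearch-style `;size=N` abundance annotation. This is only a sketch of the idea, not the actual q2-vsearch code; the helper names (`pooled_sizes`, `fasta_with_sizes`) and the toy data are hypothetical.

```python
from collections import Counter


def pooled_sizes(table):
    """Sum each feature's counts over every sample in the table."""
    totals = Counter()
    for sample_counts in table.values():
        totals.update(sample_counts)  # adds counts feature by feature
    return totals


def fasta_with_sizes(sequences, totals):
    """Build FASTA text with ';size=N' abundance annotations in the headers."""
    lines = []
    for feature_id, seq in sequences.items():
        lines.append(f">{feature_id};size={totals[feature_id]}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


# Hypothetical toy data: two samples, three features.
table = {
    "sample1": {"featA": 90, "featB": 10},
    "sample2": {"featA": 10, "featB": 40, "featC": 5},
}
sequences = {"featA": "ACGTACGT", "featB": "TTGGCCAA", "featC": "ACGTCCAA"}

totals = pooled_sizes(table)
fasta = fasta_with_sizes(sequences, totals)
print(fasta)
```

Note that the sample boundaries disappear in the pooled totals: once the counts are summed, the downstream tool sees one abundance per sequence for the whole artifact.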

For more details on what vsearch does with those sequences, see the vsearch docs.

I hope that helps!

stangedal · September 28, 2018, 10:44pm

Thank you for the answer!

The reason I started wondering about this is that we have several runs, each going through DADA2 separately. If we wish to use vsearch for further chimera detection, I am wondering what we risk by choosing the effortless solution of merging all runs and then running vsearch once on the large dataset (vsearch’s uchime_denovo), rather than repeating this step for each run (I think we can end up having at least 30). Could this result in a high rate of false-positive chimeric sequences, do you think?

S

colinbrislawn · September 29, 2018, 12:05am

Hello Solveig,

The vsearch devs have a recommendation about this! They call it an 'Open Question' but I'm pretty sure everyone does it at the study level, which I think is the effortless solution you mention.

Merging runs might also reduce the false negative rate because the additional coverage in sparse OTUs will mean that the parent of a chimera will be in the database so the child chimera can be removed.
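To make that concrete, here is a toy model of the effect. It is not vsearch's algorithm: it only flags a query that is an exact prefix of one candidate parent joined to the suffix of another, and it only accepts parents that are at least `abskew` times as abundant as the query. The function name `is_chimera`, the data, and the default of 2.0 (chosen to mirror my understanding of vsearch's `--abskew` option) are assumptions for illustration.

```python
from collections import Counter


def is_chimera(query, query_size, candidates, abskew=2.0):
    """Toy check: is query a prefix of one parent plus the suffix of another?

    Only sequences at least `abskew` times as abundant as the query
    are considered as possible parents (an assumption of this sketch).
    """
    parents = [seq for seq, size in candidates.items()
               if seq != query and size >= abskew * query_size]
    for a in parents:
        for b in parents:
            if a == b:
                continue
            for cut in range(1, len(query)):
                if query[:cut] == a[:cut] and query[cut:] == b[cut:]:
                    return True
    return False


parent_a, parent_b = "ACGTACGT", "TTGGCCAA"
chimera = "ACGTCCAA"  # first half of parent_a + second half of parent_b

# Hypothetical abundances: parent_b is sparse in run 1, well covered in run 2.
run1 = {parent_a: 90, parent_b: 6, chimera: 5}
run2 = {parent_a: 10, parent_b: 40}
pooled = dict(Counter(run1) + Counter(run2))

flagged_per_run = is_chimera(chimera, run1[chimera], run1)
flagged_pooled = is_chimera(chimera, pooled[chimera], pooled)
print(flagged_per_run, flagged_pooled)
```

In run 1 alone, the second parent is too sparse to qualify, so the chimera slips through; after merging, the extra coverage from run 2 lets it qualify and the chimera is caught. The same pooling could in principle also change which sequences look like plausible parents, which is the false-positive side of the question.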

I'm not sure which is best... but it's a good question!

Colin
