q2-vsearch and vsearch


I am doing closed-reference OTU picking with the q2-vsearch plugin. Before the OTU picking step, I merged the paired-end sequences using "qiime vsearch join-pairs" and then did quality control using "qiime quality-filter q-score". Here are my questions:

  1. Is it OK to use the default parameters in vsearch join-pairs? After join-pairs with default parameters, followed by "quality-filter q-score" with the option "--p-min-quality 20", the interactive quality plot in the qzv file shows that most bases have a quality score >20 at the bottom of the box.
  2. I think the parameters in join-pairs are mostly used for quality filtering. For example, "--p-truncqual", "--p-minlen", "--p-maxns", and "--p-maxee" are actually options of the vsearch "--fastq_filter" command, which makes it a little confusing. Do you have any advice on choosing these parameters?
  3. Is "--p-min-quality 20" really necessary in the quality filter? One page suggests the default value "4" is OK (Deblur quality filtering — qiita 0.1.0-dev documentation). However, a higher threshold is recommended elsewhere (Alternative methods of read-joining in QIIME 2 — QIIME 2 2021.11.0 documentation). This is another thing that confuses me.

Thanks in advance !!

Hello @dong316,

Let's start here: download a copy of the vsearch manual from the release page to see what all the options do. The q2-vsearch plugin only exposes some of the options, so there's lots more info in the manual than you may need, but it's still the most comprehensive guide available.

Yes. The developers of both vsearch and QIIME 2 strive to provide reasonable defaults. However, some settings are good to customize based on your data.

The minovlen setting in the plugin (--fastq_minovlen in vsearch --fastq_mergepairs) depends on the length of the amplicon sequenced, the read length of the paired-end sequencing used, and the overlap you expect, so this one makes sense to change. The maxdiffs setting is 10 by default in the q2-vsearch plugin, and if you are expecting a lot of overlap, it may make sense to raise it so you don't lose too many reads.
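To put rough numbers on the overlap, here is a minimal Python sketch of the arithmetic. It assumes untrimmed reads that start at opposite ends of the amplicon; the function name and the example lengths are made up for illustration and are not part of vsearch or QIIME 2.

```python
def expected_overlap(fwd_len: int, rev_len: int, amplicon_len: int) -> int:
    """Estimate how many bases the forward and reverse reads share.

    Assumption: both reads are untrimmed and start at opposite ends
    of the amplicon.
    """
    if min(fwd_len, rev_len, amplicon_len) <= 0:
        raise ValueError("all lengths must be positive")
    # A read cannot cover more of the amplicon than the amplicon's own length.
    covered = min(fwd_len, amplicon_len) + min(rev_len, amplicon_len)
    # If the reads do not reach each other, there is no overlap at all.
    return max(0, covered - amplicon_len)


# Hypothetical run: 2x250 reads over a 460 bp amplicon.
overlap = expected_overlap(250, 250, 460)
print(f"expected overlap: {overlap} bp")
```

If the estimate comes out only a little above your minovlen, keep in mind that trimming and natural amplicon length variation can push real reads below it.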

Let's talk quality filtering!

:thinking: :face_with_monocle:

These questions are tricky because qiime vsearch join-pairs / vsearch --fastq_mergepairs provides options to remove reads at three stages in the pipeline!

  1. Before joining, with features like --fastq_minlen and --fastq_maxns
  2. During joining, with features like --fastq_minovlen, --fastq_maxdiffs, --fastq_nostagger
  3. After joining, with features like --fastq_minmergelen and --fastq_maxmergelen

How exactly you want to use all these features is up to you.

As an example, I don't do any filtering in the first step because I want all my reads, even the low quality ones, to make it to the joining step. Then I rely on the joining settings to remove the reads that do not join well by choosing stringent --fastq_maxee and --fastq_minovlen settings.
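For some intuition about what a maxee threshold means: expected errors is the sum of the per-base error probabilities implied by the Phred scores, where a score Q corresponds to p = 10^(-Q/10). So Q20 is a 1% chance of error per base, and Q4 is roughly 40%. Below is a small Python sketch of that sum. It assumes Phred+33 encoded quality strings, and the function name and example string are hypothetical, not part of vsearch.

```python
def expected_errors(quality: str, phred_offset: int = 33) -> float:
    """Sum the per-base error probabilities encoded in a FASTQ quality string.

    Assumption: qualities use the Phred+33 ASCII encoding by default.
    """
    total = 0.0
    for char in quality:
        q = ord(char) - phred_offset
        if q < 0:
            raise ValueError(f"character {char!r} is below the Phred offset")
        # Phred definition: error probability p = 10^(-Q/10).
        total += 10 ** (-q / 10)
    return total


# Hypothetical quality string: 'I' is Q40 and '5' is Q20 under Phred+33.
example_quality = "IIIIIIII55"
print(f"expected errors: {expected_errors(example_quality):.4f}")
```

A couple of low-quality bases dominate the sum, which is why a stringent maxee tends to remove reads with poor tails even when most of the read looks fine.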

Try some settings and see what works best on your positive controls! :test_tube: :bar_chart:

