Vsearch plugin error for closed-reference classification

Greetings friends!

I need some help. I'm running qiime2-2019.7 through conda and am having problems with the vsearch plugin. I have some paired-end samples of the 16S V3-V4 region (341f - 806r). I successfully trimmed primers/barcodes with the cutadapt plugin, denoised/dereplicated/merged with the dada2 plugin, trained the Greengenes 13_8 97% reference with my primers, and got taxonomy of good resolution (using q2-feature-classifier). However, when I used the vsearch plugin to produce closed-reference results compatible with FAPROTAX/BugBase, I ran into problems. Here is the command that is not working:

qiime vsearch cluster-features-closed-reference \
  --i-table table-dada2.qza \
  --i-sequences rep-seqs-dada2.qza \
  --i-reference-sequences gg_97_otus.qza \
  --p-perc-identity 0.80 \
  --o-clustered-table table-vsearch.qza \
  --o-clustered-sequences rep-seqs-vsearch.qza \
  --o-unmatched-sequences unmatched-vsearch.qza

it returned with:

Plugin error from vsearch:

No matches were identified to reference_sequences. This can happen if sequences are not homologous to reference_sequences, or if sequences are not in the same orientation as reference_sequences (i.e., if sequences are reverse complemented with respect to reference sequences). Sequence orientation can be adjusted with the strand parameter.

Which is really weird, since I used the exact same reference sequences to train my classifier and already got good results. I tried changing the reference sequences used here to the ones I trained for my primers and still got the same error. I tried lowering perc-identity from 0.99 to 0.95, 0.90, 0.85, etc., all the way down to 0.70, and yet the error persists. Can anyone tell me what the problem seems to be? I used to run the exact same procedure on other data with no problem. I suspect the low quality of this particular data (due to the long sequencing range of the primers, around 500 bp) is causing the problem, but lowering perc-identity is not helping. I also tried the 99%, 94% and 91% sets from the gg 16S database, but still no luck.

I uploaded the table and the sequence output from dada2 here: table-dada2.qza (157.0 KB), rep-seqs-dada2.qza (447.5 KB). The gg reference is too large to upload, but it is the standard gg 13_8 16S FASTA file, imported into q2 as follows:

qiime tools import \
  --input-path 97_otus.fasta \
  --output-path gg_97_otus.qza \
  --type 'FeatureData[Sequence]'

and was used before for the same purpose with no problem. Any help is greatly appreciated, thanks!!


LOL, what an idiot I am. The problem is fixed by adding the parameter
--p-strand both \
which was the recommendation in the error message. I didn't think of using it because I used the exact same workflow for my previous data and never ran into reverse-complemented sequences as I did this time. I decided to give it a shot and it worked! It would still be nice if anyone could give me some insight into why my data came out of dada2 reverse complemented. Is it an experimental problem or an analysis problem? I'm really new to this area. And does q2-feature-classifier automatically correct for reverse complements? I didn't need to change any defaults when running it.
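
For anyone who lands here with the same error, the fixed call is just the original command with the strand flag added. A rough sketch, reusing the file names from the first post; the wrapper function name `cluster_closed_ref` is made up for illustration, and the perc-identity value is simply carried over from the original command, not a recommendation:

```shell
# Sketch only: original command plus --p-strand both.
# cluster_closed_ref is a hypothetical wrapper name; file names are the ones
# from the first post, and 0.80 is just the value used there.
cluster_closed_ref() {
  qiime vsearch cluster-features-closed-reference \
    --i-table table-dada2.qza \
    --i-sequences rep-seqs-dada2.qza \
    --i-reference-sequences gg_97_otus.qza \
    --p-perc-identity 0.80 \
    --p-strand both \
    --o-clustered-table table-vsearch.qza \
    --o-clustered-sequences rep-seqs-vsearch.qza \
    --o-unmatched-sequences unmatched-vsearch.qza
}

# With the QIIME 2 conda environment active, run:
#   cluster_closed_ref
```

With `--p-strand both`, vsearch compares each query against the reference in both orientations, which is why matches show up even when the reads are reverse complemented.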


Hello @Marvin_Yeung,

I’m glad you got this working!

As far as I know, most QIIME 2 plugins do not automatically correct sequence orientation. This is why options like --p-strand both in the vsearch plugin are so helpful.

My best guess is that the reads were sequenced differently, and were reverse complemented to start with when you imported them. Illumina sequencing is directional, so a different sequencing protocol could have created reversed reads. :woman_shrugging:
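
To make "reverse complemented" concrete, here is a minimal sketch in plain shell. Nothing in it is QIIME-specific: the `revcomp` helper name and the toy sequence are made up for illustration, and it assumes the common `rev` and `tr` utilities are available.

```shell
# Hypothetical helper: reverse-complement the DNA sequence passed as $1.
# rev reverses the string; tr swaps each base for its complement.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

read_seq="AAGGCT"        # toy read, not real data
revcomp "$read_seq"      # prints AGCCTT
```

A read stored as AGCCTT will not align to a reference stored as AAGGCT when only the plus strand is searched, even though both describe the same stretch of DNA. That is the situation the error message is describing.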

There are a bunch of different classifiers that work in different ways. classify-sklearn gives you --p-read-orientation and classify-consensus-vsearch gives you --p-strand, and the defaults for both of these handle reads in either orientation, without changing the direction of the reads themselves.

Is that helpful? Let me know if you have more questions!

Yep that helps a bunch! Thanks for the detailed reply!!
