Problem training a classifier for paired-end data from MiSeq

Dear all,
I used QIIME 2 to run alpha and beta diversity analyses on my customer's data. I am now at the taxonomic analysis step, following the Moving Pictures tutorial, and I ran into a problem training a classifier for paired-end reads. The read length is 300 bp. The 16S amplicon PCR forward primer is F = TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG and the reverse primer is R = GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC. I ran the following command with Greengenes (16S rRNA) 13.8 as the reference:

qiime feature-classifier extract-reads --i-sequences 99_otus.qza --p-f-primer TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG --p-r-primer GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC --p-trunc-len 300 --o-reads ref-seqs.qza

The error message is:
Plugin error from feature-classifier:
No matches found
Debug info has been saved to /tmp/qiime2-q2cli-err-qxqpa75g.log.

I need your help to solve this problem so that I can continue my analysis. I wonder whether the primers are too long because they are the ones used for library preparation; I got them from Illumina. The primers in the tutorial on training feature classifiers are much shorter than the ones I used for my data. If I can't get this working, I may have to switch to QIIME 1.9. By the way, I really like the core_diversity_analysis.py script in QIIME 1: it produces all the results in one step, which saves me a lot of time as a biostatistician running a core service. I don't know whether QIIME 2 has a similar plugin.
One more question: I can't figure out which QIIME 2 plugin does the same job as pick_open_reference_otus.py in QIIME 1.9.1.
I'm looking forward to hearing from you.
Thanks in advance!

Xiaohong

Bingo. The primers you are using are the full Illumina adapter + linker + (possibly barcode +) PCR primer, which is far too long (since most of that is non-biological sequence that will not align to any of your reference sequences). You need to figure out which part is just the PCR primer and use that portion.

Good luck!

Dear Nicholas,
Thank you very much for the response.
Yes, you are right. I looked at the protocol and found the right primer pair: 33 bp for the forward primer (5’ to 3’) and 34 bp for the reverse primer (5’ to 3’). But I get the same error message ("No match found"). The command is:

qiime feature-classifier extract-reads --i-sequences 99_otus.qza --p-f-primer TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG --p-r-primer GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG --p-trunc-len 300 --o-reads ref-seqs.qza

The error message is:
Plugin error from feature-classifier:
No matches found

In addition, I also tried giving the reverse primer 3’ to 5’ instead of 5’ to 3’, and again got a plugin error from feature-classifier. Can I just give up on training a classifier and use gg-13-8-99-nb-classifier.qza, the classifier for the 99% OTU full-length sequences, for the taxonomic analysis? What do you think?
Have a good day!
Xiaohong

It looks like you removed the PCR primer sequence from your Nextera adapters and are now trying to trim with the adapters alone, which are not biological sequence. The part you removed, the PCR primer, is the part you should be using here. You should use:
CCTACGGGNGGCWGCAG
GACTACHVGGGTATCTAATCC
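
To make the relationship between the three sets of sequences in this thread explicit, here is a minimal sketch in POSIX shell. It assumes the 33 nt and 34 nt sequences from the second attempt are exactly the adapter portion at the 5’ end of each full oligo; the variable names are only illustrative. It strips that prefix and prints the resulting extract-reads call rather than running it, so it can be checked without QIIME 2 installed.

```shell
# Full library-prep oligos as posted in the first message (adapter + PCR primer).
FULL_F=TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG
FULL_R=GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC

# Non-biological adapter portions (the 33 nt and 34 nt sequences from the second attempt).
ADAPTER_F=TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG
ADAPTER_R=GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG

# Strip the adapter prefix; what remains is the locus-specific PCR primer.
F_PRIMER=${FULL_F#"$ADAPTER_F"}
R_PRIMER=${FULL_R#"$ADAPTER_R"}

echo "forward primer: $F_PRIMER"
echo "reverse primer: $R_PRIMER"

# Print the command instead of executing it; drop the leading "echo" to run it.
echo qiime feature-classifier extract-reads \
  --i-sequences 99_otus.qza \
  --p-f-primer "$F_PRIMER" \
  --p-r-primer "$R_PRIMER" \
  --p-trunc-len 300 \
  --o-reads ref-seqs.qza
```

The two primers printed should match the pair listed above; if they do not, the adapter prefixes were not copied correctly.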

Well, yes, that is another option. Training a classifier specific to your 16S rRNA subregion will increase accuracy a little, but not much. The full-length 16S classifiers still perform well for classifying subregions, so you can use one of those if you keep running into problems during training. Right now the issue is that the primer sequences you are supplying do not match any biological sequence, but once you get past that you could hit other barriers, such as memory constraints.
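
If you go the pre-trained route, the call would look roughly like the sketch below. The classifier file name is the one mentioned above; rep-seqs.qza and taxonomy.qza are assumed names borrowed from the Moving Pictures tutorial, so substitute your own artifacts. The guard simply skips the call when QIIME 2 or the input files are not present.

```shell
CLASSIFIER=gg-13-8-99-nb-classifier.qza   # pre-trained full-length classifier named in this thread
READS=rep-seqs.qza                        # assumed name of your representative sequences artifact
OUT=taxonomy.qza                          # assumed output name

# Only run the classification when qiime and both input files are available.
if command -v qiime >/dev/null 2>&1 && [ -f "$CLASSIFIER" ] && [ -f "$READS" ]; then
  qiime feature-classifier classify-sklearn \
    --i-classifier "$CLASSIFIER" \
    --i-reads "$READS" \
    --o-classification "$OUT"
else
  echo "Skipping: qiime, $CLASSIFIER, or $READS not found"
fi
```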

I hope that helps!

Hi Nicholas,
Thank you so much for taking the time to help me with this. I appreciate it. You saved my day! It works.
Have a wonderful weekend!
Xiaohong
