Hi!
I have a question about classifier training, especially about extracting reference reads.
(1) The feature-classifier tutorial (https://docs.qiime2.org/2017.9/tutorials/feature-classifier/) says: “We know from the Moving Pictures tutorial that the sequence reads that we’re trying to classify are 100-base single-end reads that were amplified with the 515F/806R primer pair.”
It then uses the following command:
qiime feature-classifier extract-reads \
  --i-sequences 85_otus.qza \
  --p-f-primer GTGCCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACHVGGGTWTCTAAT \
  --p-trunc-len 100 \
  --o-reads ref-seqs.qza
But as I understand it, the Moving Pictures tutorial truncates the sequences at 120 bases.
So, should I change “--p-trunc-len 100” to “--p-trunc-len 120”?
(2) My own data are 150 bp forward and reverse reads, as in the “Atacama soil” tutorial, and I did not trim or truncate any bases when running denoise-paired.
To be specific, I used the following command:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux.qza \
  --o-table table \
  --o-representative-sequences rep-seqs \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 150 \
  --p-trunc-len-r 150
If I want to train a classifier for these data using the 97% OTUs from the Greengenes database,
should I use the following command?
qiime feature-classifier extract-reads \
  --i-sequences 85_otus.qza \
  --p-f-primer GTGCCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACHVGGGTWTCTAAT \
  --p-trunc-len 100 \
  --o-reads ref-seqs.qza
(3) This is my last and (maybe) very basic question. According to my sequencing information, the forward and reverse primer sequences for 515F-806R are FWD: GTGYCAGCMGCCGCGGTAA and REV: GGACTACNVGGGTWTCTAAT, which contain letters such as Y, M, and W.
Can I just copy and paste these primer sequences into the --p-f-primer and --p-r-primer options of qiime feature-classifier extract-reads?
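As a sanity check on my side, and assuming those extra letters are meant as standard IUPAC nucleotide ambiguity codes, I ran a small shell test along these lines. The check_primer function is a hypothetical helper I wrote for this post, not part of QIIME 2, and it only confirms that the primers contain valid IUPAC characters. It does not tell me whether extract-reads handles them correctly, which is what I am really asking.

```shell
#!/bin/sh
# Hypothetical helper (not a QIIME 2 command): report whether a primer is
# made up only of the IUPAC nucleotide codes A C G T R Y S W K M B D H V N.
check_primer() {
    if printf '%s\n' "$1" | grep -Eq '^[ACGTRYSWKMBDHVN]+$'; then
        echo "$1: ok"
    else
        echo "$1: contains non-IUPAC characters"
    fi
}

check_primer GTGYCAGCMGCCGCGGTAA   # forward primer from my sequencing information
check_primer GGACTACNVGGGTWTCTAAT  # reverse primer from my sequencing information
```

Both primers pass this check, so at least they do not contain typos outside the IUPAC alphabet.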
I think I have too many questions… Thank you very much in advance for your time and kind help!