I am about to train my own classifier for the primer set 341F (CCTACGGGNGGCWGCAG) and 805R (GACTACHVGGGTATCTAATCC), and I have two questions. First, which database do you recommend: SILVA or Greengenes? Second, I used different truncation lengths for my forward and reverse reads during denoising. The tutorial says: “For classification of paired-end reads and untrimmed single-end reads, we recommend training a classifier on sequences that have been extracted at the appropriate primer sites, but are not trimmed”. I am a bit confused about what “…but are not trimmed” means here. Shouldn’t I use the trimmed rep-seqs for this?
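To make the question concrete, here is a toy Python sketch of how I picture "extracted at the primer sites, but not trimmed". It is not the QIIME 2 implementation, only my understanding of the idea. The function names (`primer_to_regex`, `reverse_complement`, `extract_amplicon`) and the toy reference sequence are made up for illustration, and dropping the primers from the output is my assumption. The point is that the region between the two primers is kept at its full length and never cut to a fixed number of bases.

```python
import re

# IUPAC degenerate nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[GC]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_to_regex(primer):
    """Compile a (possibly degenerate) primer into a regex pattern."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def reverse_complement(seq):
    """Reverse-complement a sequence, keeping IUPAC codes valid."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extract_amplicon(reference, fwd_primer, rev_primer):
    """Return the region between the primer sites, or None if either is missing.

    Assumption: primers are excluded from the result, and the result is
    NOT truncated to any fixed length.
    """
    reference = reference.upper()
    fwd = primer_to_regex(fwd_primer).search(reference)
    if fwd is None:
        return None
    # The reverse primer binds the opposite strand, so on the reference
    # strand we look for its reverse complement, downstream of the forward site.
    rev = primer_to_regex(reverse_complement(rev_primer)).search(reference, fwd.end())
    if rev is None:
        return None
    return reference[fwd.end():rev.start()]


FWD_341F = "CCTACGGGNGGCWGCAG"
REV_805R = "GACTACHVGGGTATCTAATCC"

# Toy reference (made up): flank + 341F site + insert + 805R site (rev-comp) + flank.
toy_insert = "ACGTACGTTTGACC"
toy_reference = (
    "TTTT" + "CCTACGGGAGGCAGCAG" + toy_insert + "GGATTAGATACCCTAGTAGTC" + "AAAA"
)

amplicon = extract_amplicon(toy_reference, FWD_341F, REV_805R)
print(amplicon)
```

If this picture is right, then the truncation lengths I chose during denoising would not be applied to the reference sequences at all, which is the part I would like to confirm.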
I appreciate your help in advance.