I am working on a project where we have Illumina 16S sequences (V4 and V1-V3) and also have cultured isolates.
Our goal is to determine where our cultured isolates fall in our sequencing data. To keep the taxonomy assignment consistent, we are trying to use a trained classifier.
I trained the classifier with the primers we used for Sanger sequencing, and I tried both --p-identity 0.9 and --p-identity 0.8 in the extract-reads step. I used the 7-level majority taxonomy from SILVA.
Regardless, the majority of reads are assigned only to D0_Bacteria, with no deeper taxonomic levels. When we BLAST these same sequences, or run them through the SINA aligner, they are assigned down to the genus or species level.
The same feature classifier parameters worked well when used with our Illumina data (different primer sets, though).
Our Sanger sequencing reads fall between 27F and 1492R (we sequenced in either the forward or the reverse direction and reverse complemented).
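To be clear about what "reverse complemented" means here, this is a minimal Python sketch of the operation. It assumes reads are plain DNA strings that may contain IUPAC ambiguity codes; `reverse_complement` is an illustrative helper name, not a QIIME 2 function, and the toy sequences are made up.

```python
# Minimal sketch: reverse complement a DNA read so that reads sequenced
# from the reverse primer end up in the same orientation as forward reads.
# Assumption: sequences are plain strings using IUPAC nucleotide codes.

# Complement table covering IUPAC codes (upper and lower case).
_IUPAC = "ACGTRYSWKMBDHVN"
_IUPAC_COMPLEMENT = "TGCAYRSWMKVHDBN"
_COMPLEMENT_TABLE = str.maketrans(
    _IUPAC + _IUPAC.lower(),
    _IUPAC_COMPLEMENT + _IUPAC_COMPLEMENT.lower(),
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT_TABLE)[::-1]


if __name__ == "__main__":
    read = "ATGC"  # toy sequence, not real data
    print(reverse_complement(read))
```

Applying the function twice returns the original read, which is a quick sanity check that orientation handling is not altering the sequences themselves.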
Any help on why our feature classifier may be behaving this way, or alternative suggestions on how to compare the taxonomy between these sequencing sets would be greatly appreciated!