Hi,
I am hoping someone with more experience with the UNITE database can help me. I am working with qiime2-2017.8. I have downloaded the 99% UNITE database and imported it into QIIME2, and I am trying to use it to assign taxonomy to my representative sequences.
As a first step, I use the ITS primer sequences to extract the region of interest and truncate it to 250 bp. I then train the naive Bayes classifier (qiime feature-classifier fit-classifier-naive-bayes) on the truncated reads. But when I try to use this classifier (qiime feature-classifier classify-sklearn, with my Deblur rep set and my newly built UNITE classifier) to assign taxonomy to fungal ITS sequences, every sequence is assigned to kingdom Fungi and a handful are also assigned to phylum Ascomycota, but no sequence is assigned to any rank below phylum. When I train the classifier without truncating first, the UNITE classifier does a much better job.
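To make the two variants concrete, here is a minimal sketch of the workflow described above, written as a small Python dry run that only assembles and prints the command lines (it does not call QIIME 2). The file names and primer strings are hypothetical placeholders, and `build_commands` is an illustrative helper, not part of QIIME 2. The truncation option is written as `--p-trunc-len`, which is an assumption: the option name may differ in your release, so check `qiime feature-classifier extract-reads --help`. The sketch also assumes that the "untruncated" variant skips the extraction step entirely.

```python
import shlex

# Hypothetical placeholders: substitute your own artifacts and primers.
REF_SEQS = "unite-99-seqs.qza"       # imported UNITE reference sequences
REF_TAX = "unite-99-taxonomy.qza"    # imported UNITE taxonomy
FWD_PRIMER = "FORWARD_PRIMER"        # ITS forward primer sequence
REV_PRIMER = "REVERSE_PRIMER"        # ITS reverse primer sequence
REP_SEQS = "rep-seqs.qza"            # Deblur representative sequences


def build_commands(extract_and_truncate=True, trunc_len=250):
    """Return the QIIME 2 command lines (as argument lists) for one variant."""
    commands = []
    training_reads = REF_SEQS
    if extract_and_truncate:
        training_reads = "unite-99-extracted.qza"
        commands.append([
            "qiime", "feature-classifier", "extract-reads",
            "--i-sequences", REF_SEQS,
            "--p-f-primer", FWD_PRIMER,
            "--p-r-primer", REV_PRIMER,
            # Assumption: option name may differ by release; check --help.
            "--p-trunc-len", str(trunc_len),
            "--o-reads", training_reads,
        ])
    # Train the naive Bayes classifier on the chosen reference reads.
    commands.append([
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", training_reads,
        "--i-reference-taxonomy", REF_TAX,
        "--o-classifier", "classifier.qza",
    ])
    # Classify the representative sequences with the trained classifier.
    commands.append([
        "qiime", "feature-classifier", "classify-sklearn",
        "--i-classifier", "classifier.qza",
        "--i-reads", REP_SEQS,
        "--o-classification", "taxonomy.qza",
    ])
    return commands


if __name__ == "__main__":
    for variant in (True, False):
        print(f"# extract_and_truncate={variant}")
        for cmd in build_commands(extract_and_truncate=variant):
            print(shlex.join(cmd))
```

The only difference between the two variants is which reference reads go into `fit-classifier-naive-bayes`: the extracted, 250 bp truncated reads (poor results for me) or the full-length UNITE sequences (much better results).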
I know the Werner paper mentioned on the features tutorial page deals with 16S data. Was I wrong to truncate the reads for ITS? If so, perhaps a note to this effect should be added to the training tutorial page…
Thanks for your input!