Question about taxonomy

Hi @liucong2018,
When I see a mixture of shallow assignments (kingdom level) and deep assignments (species level?), as you are seeing, I begin to suspect the query sequences rather than the database or classifier (those are usually at fault when all sequences are poorly classified).

See this post and this post for some other examples on the forum, and related advice.

I recommend looking at the unassigned query sequences:

  1. What is their length? If they are particularly short, that’s a very clear reason why they are receiving very shallow classifications.
  2. Try using NCBI BLAST to classify a handful of these unassigned sequences and see what their closest match is (make sure to exclude uncultured sequences).
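
For the length check, here is a minimal sketch in plain Python, assuming you have the query sequences available as a FASTA file. The function names (`fasta_lengths`, `short_sequences`) and the `min_length` cutoff are just placeholders I made up for illustration, not part of any tool, and you would need to choose a cutoff that makes sense for your amplicon.

```python
from io import StringIO


def fasta_lengths(handle):
    """Return {sequence_id: length} for a FASTA-formatted file handle."""
    lengths = {}
    seq_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the ID.
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            lengths[seq_id] = 0
        elif seq_id is not None:
            # Sequence line (may be wrapped over several lines).
            lengths[seq_id] += len(line)
    return lengths


def short_sequences(lengths, min_length):
    """Return the sorted IDs of sequences shorter than min_length."""
    return sorted(sid for sid, n in lengths.items() if n < min_length)


# Toy input for illustration only; replace with open("your-file.fasta").
example = StringIO(">asv1\nACGTACGTAC\nACGT\n>asv2 some description\nACG\n")
lengths = fasta_lengths(example)
print(lengths)
# min_length=10 is an arbitrary, hypothetical cutoff.
print(short_sequences(lengths, min_length=10))
```

You could then compare the IDs it flags against the IDs of the sequences that only received kingdom-level assignments, and pick a few of the remaining ones to run through BLAST.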

If these are in fact 18S sequences that receive good hits with NCBI BLAST and are of adequate length, then perhaps we should examine the classifier that you trained and the steps you used to train it. But I’d start with the two checks above.

I hope that helps!