No taxonomy assignment with pre-trained classifier

Hi all,
I'm working with 16S data from several different studies and have run into trouble with the taxonomy analysis. Most of the studies are V3-V4 amplicons, but some are V1-V2 and V4-V5, so I used the pre-trained SILVA classifier built from the Silva 138 99% OTUs full-length sequences (https://data.qiime2.org/2022.2/common/silva-138-99-nb-classifier.qza), downloaded from the Data Resources section of docs.qiime2.org. Below are the commands I used:

qiime vsearch cluster-features-closed-reference \
  --i-sequences merged_rep-seqs.qza \
  --i-table merged_table.qza \
  --i-reference-sequences ../db/SILVA/silva-138-99-seqs.qza \
  --p-perc-identity 0.97 \
  --p-threads 16 \
  --o-clustered-table otu_clustering_outcome/table-cr-97.qza \
  --o-clustered-sequences otu_clustering_outcome/rep-seqs-cr-97.qza \
  --o-unmatched-sequences otu_clustering_outcome/unmatched-seqs-cr-97.qza

qiime feature-classifier classify-sklearn \
  --i-reads otu_clustering_outcome/rep-seqs-cr-97.qza \
  --i-classifier silva-138-99-SSU-classifier.qza \
  --p-n-jobs 16 \
  --output-dir taxonomy_outcome

After running these and generating a taxa bar plot with the qiime taxa barplot command, I found that 2 of the studies I'm working on have almost no taxonomy assigned, as shown below:

The studies with almost no taxonomic assignment are V3-V4 amplicon data, but other studies with the same amplified region are assigned well (even the V1-V2 and V4-V5 studies are assigned well). I double-checked that all samples have enough reads after the denoising step (all have more than 700 reads).

Any idea what the problem might be? Or is there any way to get a clean assignment?

Hi @Jonathan,

Thanks for reaching out!

I checked in with another moderator on this, and it seems highly unusual. Is there any possibility that you clicked/un-clicked the boxes on the right to reveal/hide some groups?

Otherwise, since you used closed-reference OTU picking, there should be taxonomy assignments. If the features are retained via closed reference, then that is a good sign these are real sequences and would have an assignment.

Do you mind sending us your .qzv file so we can look through the provenance?

Thanks! :lizard:

Hi @lizgehret ! Thank you for your reply.

Nope, the picture in the post is the very first view I get when I open the .qzv file.

I also tried custom-trained classifiers, but they gave me the same assignment (I trained 3 classifiers for 3 different regions, V1-V2, V3-V4, and V4-V5, based on the primer sequences given in the papers).

I've also sent you a message with the tables from denoising (dada2 single) and closed-reference clustering :slight_smile:


Hi @Jonathan,

Thanks for sharing those files with me! Could you also provide me with the taxa barplot .qzv file? My colleague @cherman2 suggested that this could be caused by the specific metadata column used to re-label your taxa barplot. This will also help us to look at the full provenance for your pipeline (up to your final visualization). Thanks! :lizard:

Hi @lizgehret !

I've just sent you the files by message. As the full taxa barplot .qzv file is too big to send that way, I sent the taxonomy.qza file instead. I also sent the barplot .qzv files for the 2 problem studies. Thank you!

Hi @Jonathan,

Thank you! Unfortunately I do need to examine the taxa barplot in order to look at the associated sample IDs - but you have a couple of different options for how you can share this (if a direct file upload is too large):

  1. Upload your taxa barplot to Dropbox, copy the link to the file (as if you were going to share that link with someone), paste it into view.qiime2.org using the 'file on Dropbox' option shown here:

    You can then copy/paste the URL from QIIME 2 view into your response, and I'll be able to view it directly from that URL.

  2. You can upload your taxa barplot to Google Drive, and share the drive link with me directly.

Whichever is easier for you, both ways will work perfectly fine. Thanks! :lizard:


Hi, @lizgehret !

I've just sent you a message with a Google Drive link. Please check it and let me know if you can't access it!


Hi @Jonathan,

Thanks so much - file received! I am looking over your taxa barplot and checking in with a colleague of mine to see what thoughts they have on this. Thanks for your patience, I'll circle back shortly with next steps!


Hi @Jonathan,

Thanks for your patience here! I have a few notes from my colleague @SoilRotifer, which I'll summarize below:

It appears as though you have been playing around and trying many different things, rather than starting from a clean slate with your data. You should have merged your results after running all of the closed reference commands (different variable regions) prior to all of the other stuff. Also, there is no need for classification since you are using closed reference.
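One reading of that advice, as a rough sketch only: run closed-reference clustering separately for each variable region, then merge the clustered tables. The per-region input names (`<region>_rep-seqs.qza`, `<region>_table.qza`), the `cr_<region>` output directories, and the merged table name are hypothetical placeholders, not files from this thread. The clustering flags are the ones already used in the original post.

```shell
# Hypothetical per-region inputs: <region>_rep-seqs.qza and <region>_table.qza
REGIONS="V1-V2 V3-V4 V4-V5"
REF=../db/SILVA/silva-138-99-seqs.qza

# Collect one --i-tables flag per region for the merge step.
MERGE_ARGS=""
for region in $REGIONS; do
  MERGE_ARGS="$MERGE_ARGS --i-tables cr_${region}/table-cr-97.qza"
done

if command -v qiime >/dev/null 2>&1; then
  for region in $REGIONS; do
    mkdir -p "cr_${region}"
    qiime vsearch cluster-features-closed-reference \
      --i-sequences "${region}_rep-seqs.qza" \
      --i-table "${region}_table.qza" \
      --i-reference-sequences "$REF" \
      --p-perc-identity 0.97 \
      --p-threads 16 \
      --o-clustered-table "cr_${region}/table-cr-97.qza" \
      --o-clustered-sequences "cr_${region}/rep-seqs-cr-97.qza" \
      --o-unmatched-sequences "cr_${region}/unmatched-seqs-cr-97.qza"
  done

  # MERGE_ARGS is left unquoted on purpose so it splits into separate flags.
  qiime feature-table merge $MERGE_ARGS \
    --o-merged-table merged-table-cr-97.qza
else
  echo "qiime not found; activate your QIIME 2 environment first" >&2
fi
```

Because every region is clustered against the same reference, the merged table is keyed by reference sequence IDs, so features from different regions that hit the same reference sequence end up in the same row.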

It looks like you have also been in a similar discussion with @jwdebelius on this thread regarding taxonomic classification on closed reference OTU clustering, where she mentioned the following:

One of the beauties of closed reference clustering is that the taxonomy is already assigned and the tree is already built for you. You just need to import them into QIIME 2 and you're ready to go!

After reading through that discussion, it sounds like there was some initial confusion as to why you were attempting taxonomic classification - unless you have any further questions regarding the motivation behind that, I would say that you should follow Justine's recommendations there!
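As a rough sketch of what skipping the classification step could look like: after closed-reference clustering the feature IDs are reference sequence IDs, so the taxonomy that ships with the reference can be passed to the bar plot directly. This assumes the SILVA 138 taxonomy artifact that matches the reference sequences (silva-138-99-tax.qza on the Data Resources page) has been saved next to them. That path, the metadata file name, and the output name are placeholders, not files from this thread.

```shell
# Clustered table from the closed-reference command in the original post.
TABLE=otu_clustering_outcome/table-cr-97.qza
# Assumed location of the reference taxonomy; it must come from the same
# SILVA release as the reference sequences used for clustering.
TAXONOMY=../db/SILVA/silva-138-99-tax.qza
# Hypothetical metadata file and output names.
METADATA=metadata.tsv
OUTPUT=taxa-barplot-cr-97.qzv

if command -v qiime >/dev/null 2>&1; then
  # No classify-sklearn step: the reference taxonomy is used as-is.
  qiime taxa barplot \
    --i-table "$TABLE" \
    --i-taxonomy "$TAXONOMY" \
    --m-metadata-file "$METADATA" \
    --o-visualization "$OUTPUT"
else
  echo "qiime not found; activate your QIIME 2 environment first" >&2
fi
```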

Hope this helps,

Cheers :lizard:


Hi @lizgehret !

Thank you for your effort on this problem and for your comments!

I'll go through your notes and the thread you mentioned, and I'll ask again if I still have problems or another question :slight_smile:
Thank you again!

