Confidence Threshold for feature-classifier classify-sklearn

amin · March 16, 2021, 8:12pm

I trained a naive Bayes classifier and used feature-classifier classify-sklearn to assign taxonomy. Because of a persistent error, I switched from qiime/2.2020.2 to qiime/2.2019.04 for this step and added "--p-confidence -1 --p-read-orientation same" (as suggested in a post here). It worked well for me, but now I get a confidence of "-1". Could you please tell me what that means, and how I should report it in my publication? I am using a reference database for rpoB amplicon sequencing. Here is the pipeline I used:

qiime feature-classifier classify-sklearn --i-classifier classifier19.qza --i-reads dada2_11000.qza --o-classification dada2_rep_seqs_taxonomy_github.qza --p-confidence -1 --p-read-orientation same

qiime taxa filter-table \
  --i-table table_11000.qza \
  --i-taxonomy dada2_rep_seqs_taxonomy_github.qza \
  --p-exclude Unassigned \
  --o-filtered-table dada2_improved_table.qza

qiime taxa filter-seqs \
  --i-sequences rep-seqs.qza \
  --i-taxonomy dada2_rep_seqs_taxonomy_github.qza \
  --p-exclude Unassigned \
  --o-filtered-sequences dada2_improved_sequences.qza

qiime feature-classifier classify-sklearn --i-classifier classifier19.qza --i-reads dada2_improved_sequences.qza --o-classification dada2_rep_seqs_taxonomy_improved.qza --p-confidence -1 --p-read-orientation same

amin · March 17, 2021, 9:44pm

For your consideration, I am also attaching my rep seqs and trained classifier. I just noticed that I mistakenly wrote dada2_11000.qza instead of rep-seqs.qza in my first sklearn command here, but when I ran the code I used the correct file, and I am still getting the error. Please let me know if you need anything more. I really appreciate your assistance with this. rep-seqs.qza (2.0 MB) classifier19.qza (4.6 MB)

ChrisKeefe · March 18, 2021, 10:32pm

Looks like you dug that out of the deep archives, @amin! In current versions of QIIME 2, --p-confidence -1 has been replaced with the more descriptive --p-confidence disable. Whichever version you're using, it's generally preferable to fix the underlying issue with the taxonomy rather than passing this parameter, as it disables confidence calculation, causing the classifier:

In future, please share details about the error messages you encounter. Without them, it's hard to help. This post discusses one possible underlying cause at slightly greater length.

Depending on what errors you're running into, I'd very likely recommend you switch back to a contemporary version of QIIME 2. 2021.2 is shiny and new, and does a lot of nice things that a two-year-old version cannot.

Good luck!
Chris

amin · March 24, 2021, 2:36am

I am a novice in this field and no one near me can help in this area, so I really appreciate your time and assistance. I have been trying to find a solution for more than 10 days but am stuck at this step. Following your advice, I have switched to a contemporary version of QIIME 2, 2021.2. Now, when I try to train my classifier using the following command:
qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads fun_9.6_ref-fasta_21.qza --i-reference-taxonomy fun_9.6_ref-taxonomy_21.qza --o-classifier fun_classifier_21.qza

I get the following error:

Plugin error from feature-classifier: Found non-header line when attempting to read the 1st record:

For your examination, I am attaching my fasta and taxonomy files, which I obtained from GitHub (the rpoB_reference directory on the main branch of godlovexiaolin/Mycorrhizal-symbiosis-modulates-the-rhizosphere-microbiota-to-promote-rhizobia-legume-symbiosis). I generated the fun_9.6_ref-fasta_21.qza (26.0 KB) and fun_9.6_ref-taxonomy_21.qza (60.7 KB) files from rpoB_reference to train my classifier. I have also attached my representative sequences, rep-seqs.qza (2.0 MB), so that you can test the classifier. I guess the problem lies in either my fasta or my taxonomy file, but I don't have enough expertise to fix them. I would be very grateful if you could suggest a solution.

ChrisKeefe · March 24, 2021, 3:34am

When you first looked into this problem, did you see this fantastic post addressing the same error message?

In short, "that error is only raised if your first line (ignoring any blank lines) does not start with a >", and certain software may introduce "invisible" characters that precede that > in your fasta. @Oddant1 describes a diagnostic approach (hex editor) and the OP describes a straightforward solution (delete and replace the >) if you find your issue is the same.
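
If a hex editor isn't handy, the same check can be sketched in a few lines of Python. This is only a rough sketch: it assumes you still have the plain-text fasta (before importing it to a .qza), and the function names are made up for illustration. It shows which bytes sit in front of the first > on the first non-blank line, and returns a copy of the data without them:

```python
def leading_junk(data: bytes) -> bytes:
    """Return the bytes that precede '>' on the first non-blank line."""
    # Skip leading blank lines, then look only at the first remaining line.
    first_line = data.lstrip(b"\r\n").split(b"\n", 1)[0]
    idx = first_line.find(b">")
    if idx == -1:
        # The first line is not a header at all; do not guess at a fix.
        raise ValueError("first non-blank line contains no '>'")
    return first_line[:idx]


def strip_leading_junk(data: bytes) -> bytes:
    """Drop leading blank lines and any bytes in front of the first '>'."""
    body = data.lstrip(b"\r\n")
    return body[len(leading_junk(data)):]
```

To try it, read the file as bytes (for example with `open("my_reference.fasta", "rb").read()`, where the file name is a placeholder) and print `leading_junk(...)`. An empty result means the first header is clean; something like `b'\xef\xbb\xbf'` would be a UTF-8 byte order mark.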

Let me know how it goes!
CK

amin · March 24, 2021, 3:42pm

Thanks a lot for your prompt reply. At last, I successfully trained the classifier fun_classifier_21.qza (4.6 MB) in QIIME 2 2021.2, but when I test it with my representative sequences rep-seqs.qza (2.0 MB), it gives me the following error.

Command I used:
qiime feature-classifier classify-sklearn --i-classifier fun_classifier_21.qza --i-reads rep-seqs.qza --o-classification taxonomy.qza

Error:
  File "/usr/miniconda3/envs/qiime2/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 81, in _predict_chunk_with_conf
    classes = [cls for cls in classes if cls[0].pop(0) == level]
  File "/usr/miniconda3/envs/qiime2/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 81, in <listcomp>
    classes = [cls for cls in classes if cls[0].pop(0) == level]
IndexError: pop from empty list

Plugin error from feature-classifier:

pop from empty list

Potential Cause:
From the previous post, I suspect it is a formatting issue with the database. I would really appreciate your help with this.

ChrisKeefe · March 26, 2021, 9:50pm

Thanks for your patience, @amin. I've been under the weather. Often the fastest way to troubleshoot errors like this is to search this forum for the error message. I found a bunch of results, including this one, by searching "IndexError: pop from empty list".

Give that post a read - I suspect you'll need to fix your taxonomy, but that should steer you in the right direction.
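
One place to start when inspecting a taxonomy file is to count the ranks in each entry. The sketch below rests on assumptions that this thread does not confirm: that the plain-text taxonomy is a headerless, two-column, tab-separated file with ;-separated ranks, and that entries whose rank count differs from the rest are worth a closer look. The function names are made up for illustration.

```python
from collections import Counter


def rank_depths(lines, sep=";"):
    """Map each feature ID to the number of ranks in its taxonomy string."""
    depths = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            raise ValueError(f"expected two tab-separated columns: {line!r}")
        feature_id, taxon = parts[0], parts[1]
        # Count the pieces between separators, including empty ones.
        depths[feature_id] = len(taxon.split(sep))
    return depths


def odd_entries(lines, sep=";"):
    """Return the entries whose rank count differs from the most common one."""
    depths = rank_depths(lines, sep)
    if not depths:
        return {}
    usual = Counter(depths.values()).most_common(1)[0][0]
    return {fid: d for fid, d in depths.items() if d != usual}
```

Passing an open file handle (or any list of lines) to `odd_entries` gives back the feature IDs that stand out, along with their rank counts. If the file has a header row, it will show up in that list too and can be ignored.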

amin · March 30, 2021, 3:50am

Thanks @ChrisKeefe, I am trying to fix the taxonomy.

system · April 30, 2021, 9:50am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.