Fungi taxonomic analysis

Guys,

I used the command below to run taxonomic classification on fungal ITS data against the unite-ver7-99-classifier-01.12.2017.qza classifier using qiime2-2020.2:

qiime feature-classifier classify-sklearn \
  --i-classifier unite-ver7-99-classifier-01.12.2017.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza

However, I got the following debug log. Any help?

Traceback (most recent call last):
  File "/Users/elolimyahmed/miniconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
    results = action(**arguments)
  File "</Users/elolimyahmed/miniconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/decorator.py:decorator-gen-343>", line 2, in classify_sklearn
  File "/Users/elolimyahmed/miniconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
    spec.view_type, recorder)
  File "/Users/elolimyahmed/miniconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/result.py", line 289, in _view
    result = transformation(self._archiver.data_dir)
  File "/Users/elolimyahmed/miniconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/core/transform.py", line 70, in transformation
    new_view = transformer(view)
  File "/Users/elolimyahmed/miniconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/q2_feature_classifier/_taxonomic_classifier.py", line 64, in _1
    % (sklearn_version, sklearn.__version__))
ValueError: The scikit-learn version (0.19.1) used to generate this artifact does not match the current version of scikit-learn installed (0.22.1). Please retrain your classifier for your current deployment to prevent data-corruption errors.

Hi Ahmed,

I believe this should be an easy fix. The error is telling you that the pre-fitted classifier has to be one generated with the same version of QIIME 2 that you are running your analysis with: your classifier was trained with scikit-learn 0.19.1, while your 2020.2 environment has 0.22.1. So retry using the pre-fitted classifier from v. 2020.2, and you should be good.


I agree, this should fix the problem. If anyone else needs to track down an updated database from UNITE, like I did, you can find them here.

Thanks @Lichen for your quick response!

Thank you so much @Brightbeard for sharing this link. I downloaded the current fungi database, but it is in .gz format. When I extracted the .gz file, I got a 200 MB file without an extension. I think I need the .qza file, but where can I find it?

I tried to use the .gz file directly, but it did not work. I also added the .qza extension to the extracted file, but that did not work either. Any suggestions?

Any suggestions?

I have one (it is what I did), but it is an unfortunate pain in the ass. To remake the feature classifier, you can use the tools import command to import a .fasta file containing the reference sequences, each labeled with a reference ID, and a .txt file that maps each reference ID to its taxonomic assignment. The contents of both files can be found in the .gz file you downloaded. You'll need to unzip the file and keep only the information you need to make the two files.
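Renaming the extracted file to .qza cannot work, because a .qza is a zip archive that QIIME 2 writes itself when it imports data. To see what the download really contains, something like the sketch below may help. It assumes the UNITE download is a gzip-compressed tar archive, which this thread does not confirm; the function names and the file name in the usage comment are placeholders.

```shell
# List the contents of a gzip-compressed tar archive without extracting it.
# Assumption: the download is a .tar.gz/.tgz. If tar reports an error,
# the file is in some other format and this sketch does not apply.
list_unite_archive() {
    archive="$1"
    tar -tzf "$archive"
}

# Extract the archive into a directory of your choice (created if missing).
extract_unite_archive() {
    archive="$1"
    outdir="$2"
    mkdir -p "$outdir"
    tar -xzf "$archive" -C "$outdir"
}

# Usage (placeholder file name):
#   list_unite_archive unite-download.gz
#   extract_unite_archive unite-download.gz unite-files
```

Listing first shows whether the archive already holds separate sequence and taxonomy files, before anything gets renamed or edited by hand.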

The .fasta file should look like this:

>SH1140860.08FU_HF674537_reps_singleton
CATTACCGAATTGTCGACACGAGTTGTTGCTGGTCCCCAAACGGGGGCACGTGCACGCTCTGTTTGTACATCCATTCACACCTGTGCACCCCATGTAGTTCTGTGGTTTGGGGGACTCTGTCCTCTCGCCGTGGTTCTATATCTTTACACACGCTCTGTAATAAAGTCTCATGGAATGTATGCAGCGTTTAACGCAATACAATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGCTATTCCGAAGGGCATGCCTGTTTGAGTATCATGAACACCTCAACTCTCATGGTTCGCCGTGATGAGCTTGGACTTTGGGGGTCTTGCTGGCCTGCGGTCGGCTCCCTTCAAATGAATCAGCTTTCCAGTGTTTGGTGGCATCACGGGTGTGATAAATATCTACGCTTGTGGTTTCCGGAGGATCATTTCCGAATTGGTGGCACGAAGTGGTGGTTGGTCCCAAACGGGGGCAAGTGCCCGGTTTGGTTGTACCATCCATTACCCCTTGGCACCCCNAGGAGGTTTGGGGGTTGGGGGGATTCGTTCTTTTGCCGGGGTTTTATATTTTTACCCCCGGTTTGTAATAAAATTTCCAGGAAAGGAAGCAGGGTTTAAAGCCATTCCATTCCAATTTTAGCAAAGGATTTTTTGGGTTTTGGCTTGGAGAAGGAAGCAAGGAAAATGGGTAAGTAAAGGGAAATGCCGAAATCAAGGAATTCTTGGATTTTTGAACGCCCCCTGGGCCCCTTGGGTTTTTCGAAGGGCCAGCCTGTTTGAGGATTCAGAACCCCTTAAATTTCCAGGTTTGCCGGGGGGAGGCTGGGACTTGGGGGTTCTGGTGGCCTGCGGTTGGCTCCCTTCAAAAGAATTCACTTTCCCAGGTTTGGGGGCCTCCCGGGGGGGAAAAAAATTNACGGCTGGGGGTTTCCGCCAGGTAACCTTCAGTGATGGAGGTTCGCTGGGGCTCATAAATGTCTCTCCTCAGCGAAGACAG
>SH1140861.08FU_KF410664_reps_singleton
GATCATTACGAATTGTCAAAAGCGGGTTGTTGCTGGGTCTTCAAACGGGGGACATGTGCACGCTCTGTTTACACATCCACTCACACCTAGTGCACCCTCCGTAGTTCTATGGTCTCGGGGGACCCTGTCTTCCTGCGCGTGGTTCTACGTCTTTACACACACTCTGTAATAAAGTCTTATGGAGATTGTATGCCGCGTCTAACGCAATAGCAATACAACTTTCAGCAACTGGATCTCTTGGCTCTCGCATCGACTGAAGAACGCATGCGAAATGCGATAAGTAACTGTGACATTGCAGCAATTCACGTGAACTCATCGGAATCTCTCTGAACGCACCTTGCGCCCCTTGTGCTATTCCGAGGGGCGATGCCTGTTTGAGTATCATGAACACCTCAACTCTCATGGTTCGCCATGATGCAGCTTGGACTCTGGGGGTTTTGCTGGCCTGCTGGTCGGCTCCCCTCAAATGAATCAGCCTCCCAGTGTTTGGTAGGCATCACGGGTGTGATAAATATCTACGCTCGCGGTCGTCTGCCAGGTAACCTTTGGTGACAAAGGTTCGCTGGGAAGCTCACCAGATGTCTCTCCTCGGCGAGGACAGCTTTTTTTGAACCGTTCGATCTCAATCCAGGTAGGACTAACCCGTGAACTTTAAGCATATCAATTA
>SH1140862.08FU_HM100661_reps_singleton
CTGAGCTGTCGACACGAGCTGTTGCTGGTCCTCAAACAAGGGGGCATGTGCACGCTCTGTTCACACATCTACTCACAGGTGCACCGTCTGTAGTTTTATGGTCTGGGGGACACACCGTCTTCCTCCCGTGGCTCTACGTCTTTACACACACATCGTAGTTAAGTTTTATGGAATGTGCATCGCTTTTAACGTAATACAATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTTATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCCTTGGCTATTCCGAGGGGCATGCCTGTTTGAGTATCATGAACACCTCAACTCCTCATGTTTCCCGTGATGAGCTTGGACTTCTGGAGGTTTTGCTTACCTGCGGTCTCTCCTCTCAAACGCATCAGCTTGCCAGTGTTTGGTGGCATCACTGGTGAGATAACTATCTATGCTCGTGGCCGTCTGCCAGATAACCTTCAGCGATGGAGGTTTGCTTGAGCTCACAAAGGTCTTTCCACAGCCAAGACTGCTTTTTTAACTTTCGATCTCAAATCCCGTAGGACACCCGCTGAACCGTAGCTGACTAGCGCGCCTAA

And the .txt file should look like this (two tab-separated columns, reference ID then taxonomy, with no header line):

SH1140860.08FU_HF674537_reps_singleton	k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Thelephorales;f__Thelephoraceae;g__unidentified;s__unidentified
SH1140861.08FU_KF410664_reps_singleton	k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Thelephorales;f__Thelephoraceae;g__unidentified;s__unidentified
SH1140862.08FU_HM100661_reps_singleton	k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Thelephorales;f__Thelephoraceae;g__unidentified;s__unidentified
SH1140863.08FU_UDB004564_reps_singleton	k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Thelephorales;f__Thelephoraceae;g__unidentified;s__unidentified
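Before importing, it may be worth checking that every sequence ID in the .fasta file also has a line in the .txt file. A minimal sketch, assuming the taxonomy file is tab-separated with the ID in the first column and that each FASTA header is just ">" followed by the ID; the function name `missing_taxonomy_ids` is made up for illustration:

```shell
# Print FASTA sequence IDs that have no matching line in the taxonomy file.
# Assumptions: taxonomy is tab-separated with the ID in column 1, and each
# FASTA header line is ">" followed by the ID and nothing else.
missing_taxonomy_ids() {
    fasta="$1"
    taxonomy="$2"
    ids_fasta=$(mktemp)
    ids_tax=$(mktemp)
    # Strip the leading ">" from header lines; sort so comm can compare.
    grep '^>' "$fasta" | sed 's/^>//' | LC_ALL=C sort -u > "$ids_fasta"
    cut -f1 "$taxonomy" | LC_ALL=C sort -u > "$ids_tax"
    # comm -23 keeps only the lines unique to the first file.
    LC_ALL=C comm -23 "$ids_fasta" "$ids_tax"
    rm -f "$ids_fasta" "$ids_tax"
}

# Usage (placeholder file names):
#   missing_taxonomy_ids unite-refs.fasta unite-taxonomy.txt
```

Empty output means every sequence has a taxonomy line; any ID printed needs a matching line added or its sequence removed.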

Once you make those two files and import them as .qza files, you can use the feature-classifier fit-classifier-naive-bayes command to train a new classifier from the UNITE reference reads and taxonomy. Because it is trained in your current environment, it will be compatible with the new version of sklearn.
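The import and training steps could be sketched roughly as below. The function name `train_unite_classifier` and all file names are placeholders, the taxonomy file is assumed to be headerless and tab-separated, and the commands should be run inside the activated QIIME 2 environment so the classifier is trained with that environment's scikit-learn.

```shell
# Import reference sequences and taxonomy, then train a naive Bayes classifier.
# All file names are placeholders chosen for this sketch.
train_unite_classifier() {
    fasta="$1"      # reference sequences (FASTA)
    taxonomy="$2"   # ID<TAB>taxonomy, one per line, no header (assumed)
    prefix="$3"     # prefix for the .qza files written by this function

    qiime tools import \
        --type 'FeatureData[Sequence]' \
        --input-path "$fasta" \
        --output-path "${prefix}-seqs.qza" || return 1

    qiime tools import \
        --type 'FeatureData[Taxonomy]' \
        --input-format HeaderlessTSVTaxonomyFormat \
        --input-path "$taxonomy" \
        --output-path "${prefix}-tax.qza" || return 1

    qiime feature-classifier fit-classifier-naive-bayes \
        --i-reference-reads "${prefix}-seqs.qza" \
        --i-reference-taxonomy "${prefix}-tax.qza" \
        --o-classifier "${prefix}-classifier.qza"
}

# Usage (placeholder file names):
#   train_unite_classifier unite-refs.fasta unite-taxonomy.txt unite
```

The resulting classifier file would then be passed to classify-sklearn through --i-classifier in place of the 2017 one.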

There may be a better way to get a new classifier than creating your own from the files posted in the UNITE repository. Like I said at the start, this is what I did to work around the issue, but I've been known to make things harder than they should be. It seems to me that the new files posted on the UNITE data repository aren't formatted and divided for use with QIIME 2 the way the previous releases were. If this is the case, maybe I should reach out to them and see if they can create them properly, or offer to do it for them myself...

Hopefully this isn’t too much trouble as a solution!


Hi, did you try downgrading scikit-learn to version 0.19.1? It may fix your issue. Hope it works.

Hi @Selva_Sankar, any idea how I could do that?

Thanks @Selva_Sankar

try this,
pip install scikit-learn==0.19.1

ERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/f0/5e/1e1576587c5a9e8de6771806a4cccea8decd268c988453cf35ccbf892929/scikit_learn-0.19.1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

You may need to run pip install requests first?

can you try this please?
conda install scikit-learn==0.19


Thanks! How do I do that, please?

I did that, then ran

pip install scikit-learn==0.19.1

But the same error showed up

ERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/f0/5e/1e1576587c5a9e8de6771806a4cccea8decd268c988453cf35ccbf892929/scikit_learn-0.19.1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

Are you running this inside the qiime env?
Please try again after deactivating the env.
Use either pip install or conda install; it might work. Good luck!


Sadly, I’m hot garbage when it comes to debugging python errors… I wish I could help more with this :confused:

I deactivated the qiime env and ran conda, then pip, but got the same error:

ERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/f5/2c/5edf2488897cad4fb8c4ace86369833552615bf264460ae4ef6e1f258982/scikit-learn-0.19.1.tar.gz (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))

  1. pip uninstall scikit-learn
  2. pip install scikit-learn==0.19.1
    or conda install scikit-learn==0.19

:crossed_fingers: :crossed_fingers: :crossed_fingers: :crossed_fingers:


Thanks @Selva_Sankar
I successfully uninstalled scikit-learn, then installed scikit-learn==0.19.1.

But when I ran the code below

qiime feature-classifier classify-sklearn \
  --i-classifier unite-ver7-99-classifier-01.12.2017.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza

I got this error:

(1/2) Invalid value for "--i-classifier": 'unite-ver7-99-classifier-01.12.2017.qza' is not a QIIME 2 Artifact (.qza)
(2/2) Invalid value for "--i-reads": 'rep-seqs.qza' is not a QIIME 2 Artifact (.qza)