Issue downloading/unzipping 2020-02-02 Fungal UNITE database

I am attempting to train a fungal classifier using the most recent QIIME release of the UNITE database (version 8.2), found here: https://unite.ut.ee/repository.php

When I try to unzip the file after downloading it, I get gobbledygook: the only thing in the unzipped output is a text file called 0138B5D5EA2C77B8C2E5B910202FD3E60A9244FC31084E08DAD63E213A03BBFB

If I download and unzip the previous version (version 8.0, released on 2018-11-18), the correct files are present, including the .fasta files, the .txt files, and the readme document.

I would like to use the most recent release; however, I am not sure whether I am doing something wrong or whether there is an error in the file.

I am using QIIME 2 version 2019.10 on a Linux machine.

Any help is greatly appreciated.

Hi @SarahM, sorry to hear you’re running into a problem! I recategorized this to “Other Bioinformatics Tools” - the QIIME 2 team doesn’t develop or curate the UNITE database. If you’re experiencing issues, I suggest you contact the UNITE team. Keep us posted!

It is actually a .tar.gz file, so it can be extracted with tar:
tar -xvf 0138B5D5EA2C77B8C2E5B910202FD3E60A9244FC31084E08DAD63E213A03BBFB.gz
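
If you want to confirm this before extracting, one option is to look at the first two bytes of the download, since every gzip file starts with the bytes 1f 8b. A minimal Python sketch (the helper name `looks_like_gzip` is just an illustration, not part of QIIME 2 or UNITE):

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC

# Example call (replace the path with the name of your downloaded file):
# looks_like_gzip("0138B5D5EA2C77B8C2E5B910202FD3E60A9244FC31084E08DAD63E213A03BBFB.gz")
```

If this returns True, the file is gzip-compressed and `tar` should be able to read it even though the name gives no hint about the format.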


Hi Sarah,

I have this problem too. Try running this command (in the directory where the .gz file is stored): tar xvzf filename.gz

The output I got was:
(qiime2-2019.7) [email protected]:~/MGS00024/silva-database$ tar xvzf UNITE-ver8-2.gz
sh_qiime_release_04.02.2020/
sh_qiime_release_04.02.2020/sh_taxonomy_qiime_ver8_dynamic_04.02.2020.txt
sh_qiime_release_04.02.2020/developer/
sh_qiime_release_04.02.2020/sh_refs_qiime_ver8_dynamic_04.02.2020.fasta
sh_qiime_release_04.02.2020/sh_taxonomy_qiime_ver8_99_04.02.2020.txt
sh_qiime_release_04.02.2020/sh_taxonomy_qiime_ver8_97_04.02.2020.txt
sh_qiime_release_04.02.2020/QIIME_ITS_readme_04.02.2020.pdf
sh_qiime_release_04.02.2020/sh_refs_qiime_ver8_99_04.02.2020.fasta
sh_qiime_release_04.02.2020/sh_refs_qiime_ver8_97_04.02.2020.fasta
sh_qiime_release_04.02.2020/developer/sh_refs_qiime_ver8_99_04.02.2020_dev.fasta
sh_qiime_release_04.02.2020/developer/sh_refs_qiime_ver8_97_04.02.2020_dev.fasta
sh_qiime_release_04.02.2020/developer/sh_taxonomy_qiime_ver8_97_04.02.2020_dev.txt
sh_qiime_release_04.02.2020/developer/sh_taxonomy_qiime_ver8_dynamic_04.02.2020_dev.txt
sh_qiime_release_04.02.2020/developer/sh_refs_qiime_ver8_dynamic_04.02.2020_dev.fasta
sh_qiime_release_04.02.2020/developer/sh_taxonomy_qiime_ver8_99_04.02.2020_dev.txt

(I had renamed my downloaded 98AE96C6593FC9C52D1C46B96C2D9064291F4DBA625EF189FEC1CCAFCF4A1691.gz file to “UNITE-ver8-2.gz”)

The above files were written to a folder named “sh_qiime_release_04.02.2020” in the directory I was working in.
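
If you would rather do the same thing from Python, the standard-library `tarfile` module can list and extract the archive. This is only a sketch: `extract_archive` is a made-up helper name, and the file name in the example call is the one I renamed my download to.

```python
import tarfile


def extract_archive(archive_path, dest="."):
    """List the members of a tar archive, extract them into `dest`,
    and return the member names."""
    # Mode "r:*" lets tarfile detect gzip (or other) compression by itself.
    with tarfile.open(archive_path, mode="r:*") as archive:
        names = archive.getnames()
        # Only extract archives that come from a source you trust.
        archive.extractall(path=dest)
    return names

# Example call (assumes the renamed file is in the current directory):
# for name in extract_archive("UNITE-ver8-2.gz"):
#     print(name)
```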

Hope that helps


Hi @SarahM,
I have the same QIIME 2 version and I am struggling with memory errors while trying to train my ITS classifier. Would you be willing to send me the classifier you generated? I found some untrimmed fungal classifiers, but they were trained with older versions of scikit-learn.
Thank you!
Stay safe!
Melisa

Hi Melisa, I would be happy to share my classifier with you. I will likely train another classifier using the dynamic files this week, which I would also be happy to share. I would appreciate you letting me know how it works for you, as my mock community results are not looking good. I tried to attach the classifier to this reply, but I am not sure if it worked (I am not very familiar with the QIIME 2 forum functions). If it doesn’t work, I will make an upload link and reply again.

-Sarah


Thank you Sarah! No, I don’t think the files are attached… It would be great if you send me the dynamic classifier as well. Is your scikit-learn version also 0.21.2? This is the version that qiime says I have when the error comes up. Could you try with the upload link? Of course I will let you know its performance so you can compare with your data.
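
In case it helps with comparing versions, here is a small sketch for printing the scikit-learn version from inside the activated QIIME 2 environment. The function name `sklearn_version` is just for illustration; it only relies on the `__version__` attribute of the `sklearn` package.

```python
def sklearn_version():
    """Return the installed scikit-learn version, or None if it is missing."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


print(sklearn_version())
```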
Cheers!
Melisa

Hi Melisa, here is a Google Drive link; let me know if this works.
https://drive.google.com/file/d/1py4lkTib_lVMLEuT3Hv5Bx7SNgJhbjHQ/view?usp=sharing

Thanks,
Sarah

Thanks @SarahM!
I already sent you a request for permission to access the link. I will try the classifier and let you know.
Cheers!
Melisa

Hi @SarahM!

I downloaded the classifier but when I tried it, the commands gave me the following:

(1/1) Invalid value for "--i-classifier": dev_unite-ver8-dynamic-classifier-02.02.2019.qza is not a Qiime 2 Artifact (.qza)

When I tried to open the file with QIIME 2 View, it said: “Error: Corrupted Zip. Can’t find end of central directory”.

I wonder if maybe there was a problem with the file…
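
Since a .qza file is a zip archive, one thing that could be checked on both sides is whether the copy is still a readable zip and whether its size matches the original. My guess is that a message about the end of the central directory points to an incomplete or altered download, but that is an assumption. A rough sketch (`check_artifact` is a hypothetical helper, not a QIIME 2 function):

```python
import os
import zipfile


def check_artifact(path):
    """Return (size_in_bytes, is_zip) for the file at `path`."""
    size = os.path.getsize(path)
    # is_zipfile looks for the zip end-of-central-directory record,
    # which is missing when a file is truncated or is not a zip at all.
    is_zip = zipfile.is_zipfile(path)
    return size, is_zip

# Example call (file name taken from the error message above):
# print(check_artifact("dev_unite-ver8-dynamic-classifier-02.02.2019.qza"))
```

If the size differs from the file on the uploader's machine, or `is_zip` comes back False, downloading the file again would be the first thing to try.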
Thanks
Melisa