I am attempting to train a fungal classifier using the most recent UNITE QIIME 2 release (version 8.2), found here: UNITE - Resources
When I try to unzip the download I get gobbledygook: the only thing in the unzipped folder is a text file called 0138B5D5EA2C77B8C2E5B910202FD3E60A9244FC31084E08DAD63E213A03BBFB
If I download and unzip the version before that (version 8.0, released on 2018-11-18), the correct files are present, including the .fasta files, the .txt files, the readme doc, etc.
I would like to use the most recent release; however, I am not sure whether I am doing something wrong or whether there is an error in the file.
I am using QIIME 2 version 2019.10 on a Linux machine.
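One way to narrow this down is to check what kind of file the extracted item actually is, since a file with no extension can itself be another archive. The sketch below is only a guess at a diagnostic, not a confirmed fix for the 8.2 release: it uses Python's standard library to report whether a file looks like a zip, a tar (possibly compressed), or a plain gzip file. The function name `sniff_archive` is made up for illustration.

```python
import sys
import tarfile
import zipfile
from pathlib import Path


def sniff_archive(path):
    """Return a best-guess container type for `path`: zip, tar, gzip or unknown."""
    path = Path(path)
    if zipfile.is_zipfile(path):
        return "zip"
    # tarfile.is_tarfile also recognises gzip/bz2/xz-compressed tar archives.
    if tarfile.is_tarfile(path):
        return "tar"
    # A gzip stream that is not a tar archive still starts with bytes 1f 8b.
    with open(path, "rb") as fh:
        if fh.read(2) == b"\x1f\x8b":
            return "gzip"
    return "unknown"


if __name__ == "__main__":
    # Usage: python sniff_archive.py <path to the oddly named file>
    for name in sys.argv[1:]:
        print(name, "->", sniff_archive(name))
```

If this reports "tar" or "zip" for the long hexadecimal file name, the file may simply need a second extraction step; if it reports "unknown", the download itself is more likely at fault.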
Hi @SarahM, sorry to hear you're running into a problem! I recategorized this to "Other Bioinformatics Tools" - the QIIME 2 team doesn't develop or curate the UNITE database. If you're experiencing issues, I suggest you contact the UNITE team. Keep us posted!
Hi @SarahM,
I have the same QIIME 2 version and I am struggling with memory errors while trying to train my ITS classifier. Would you be willing to send me the classifier you generated? I have some untrimmed fungal classifiers, but they were trained with older versions of scikit-learn.
Thank you!
Stay safe!
Melisa
Hi Melisa, I would be happy to share my classifier with you. I will likely re-train another classifier using the dynamic files this week, which I would also be happy to share. I would appreciate you letting me know how it works for you, as my mock community results are not looking good. I tried to attach the classifier to this reply, but I am not sure if it worked (I am not very familiar with the QIIME 2 forum functions). If it didn't work, I will make an upload link and reply again.
Thank you, Sarah! No, I don't think the files are attached... It would be great if you could send me the dynamic classifier as well. Is your scikit-learn version also 0.21.2? This is the version that QIIME 2 reports when the error comes up. Could you try with the upload link? Of course, I will let you know how it performs so you can compare with your data.
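To compare versions on both machines before sharing a classifier, something like the following could be run inside the activated QIIME 2 environment. It is only a sketch: the helper names are hypothetical, and the expected value "0.21.2" is simply the version quoted in this thread.

```python
def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if it is missing."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


def same_version(installed, expected):
    """True only when scikit-learn is installed and matches `expected` exactly."""
    return installed is not None and installed == expected


if __name__ == "__main__":
    expected = "0.21.2"  # version mentioned in this thread, not a recommendation
    found = installed_sklearn_version()
    print("scikit-learn installed:", found)
    print("matches", expected, ":", same_version(found, expected))
```

If the two machines report different versions, a shared classifier may fail to load, and re-training locally would be the safer route.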
Cheers!
Melisa