For qiime2 2024.10, will a pre-trained nb silva classifier be available for 515F/806R? I only see silva for full length. We will be running a qiime2 workshop this spring, and find that using the pre-trained classifier is easiest.
Thanks!
It depends on the version of the database. For the most recent version, SILVA 138.2, I don't think there is a pre-trained nb classifier for 515F/806R. You will need to train your own classifier by first extracting the reads using your primer sequences, then training the nb classifier on the extracted reads. Both steps can easily be run in a single job.
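As a rough sketch of those two steps (not an official recipe): this assumes you already have the SILVA 138.2 reference sequences and taxonomy as QIIME 2 artifacts, and that your primers are the original EMP 515F/806R pair. The file names and the `run`/`DRY_RUN` wrapper are placeholders for illustration. By default the script only prints the commands; setting `DRY_RUN=0` inside an activated QIIME 2 environment runs them.

```shell
#!/bin/sh
# Sketch: extract the 515F/806R region and train a naive Bayes classifier.
set -eu

F_PRIMER="GTGCCAGCMGCCGCGGTAA"    # 515F (assumption: replace with your primer)
R_PRIMER="GGACTACHVGGGTWTCTAAT"   # 806R (assumption: replace with your primer)
REF_SEQS="silva-138.2-seqs.qza"   # placeholder name, FeatureData[Sequence]
REF_TAX="silva-138.2-tax.qza"     # placeholder name, FeatureData[Taxonomy]

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

train_v4_classifier() {
  # Step 1: trim the reference sequences to the amplified region.
  run qiime feature-classifier extract-reads \
    --i-sequences "$REF_SEQS" \
    --p-f-primer "$F_PRIMER" \
    --p-r-primer "$R_PRIMER" \
    --o-reads ref-seqs-v4.qza

  # Step 2: train the naive Bayes classifier on the extracted reads.
  run qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads ref-seqs-v4.qza \
    --i-reference-taxonomy "$REF_TAX" \
    --o-classifier silva-138.2-v4-nb-classifier.qza
}

train_v4_classifier
```

Training on the extracted region generally needs less memory than training on full-length sequences, but it can still be demanding, so on an HPC it is worth requesting memory generously for the job.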
Thanks for the info. We were hoping to use silva 138.2 for 515F/806R. Are there any plans to provide this classifier in the future? In the meantime, we will train our own.
Hi @valseitz ,
We decided last year to stop hosting the V4 classifiers and focus on the full-length classifiers to reduce the maintenance burden, because (a) the performance benefit is not so significant vs. full-length (other than lower RAM requirements); (b) this is only one of many primer sets used for 16S analysis; and (c) users who need classifiers trained on a specific subregion can extract that region and train their own. So I don't think the V4 classifiers will be added back.
The higher RAM required for using the full-length SILVA classifier is indeed a concern in workshops. What I have personally started using in workshops/teaching is to train NCBI RefSeqs 16S classifiers using RESCRIPt, because this is a much smaller classifier, requiring lower RAM so most students can also train it on their own laptop. But of course RefSeqs and SILVA have some design differences and will have different strengths in practice, so that's one caveat. Here's a tutorial about this if you are interested:
Very cool to hear that you will be running a QIIME 2 workshop! When/where? You would be very welcome to announce this in the Misc category if it is an open enrollment workshop.
I trained 138.2 V4 classifier for my project. If you still need it I can share it.
Hi @Nicholas_Bokulich - Thanks so much for the helpful reply and providing that tutorial. I will look into that for our workshop.
We host an annual 16S QIIME2-focused microbiome analysis workshop at Colorado State University (Jessica Metcalf Lab) during the summer. Currently, it's only open to CSU and CSU-affiliated members, but it would be fun to host an open workshop someday! Thanks again!
@timanix Thank you for the offer! For now I will try to train my own, but I may reach out if needed. Appreciate it!
Thank you for sharing that you have trained the SILVA 138.2 V4 classifier. I am currently unable to train the classifier due to limited memory availability, so it would be incredibly helpful if you could kindly provide it to me. My email address is naimur.fsh.du@gmail.com. Thank you in advance for your generosity.
In case someone else is looking for classifiers that are already trained on different regions or databases, I will share the link to the folder where I store them.
This folder can be updated by me, but the link is stable (at least for now).
For example, if you need the classifier for V4 region, silva 138.2 and qiime2 version 2024.10, then open folder q2_24-10/Silva138_2 and download files:
V34 - V3-V4
V12 - V1-V2
V4 - V4
arch - Archaea
Exact primers can be found in the Jupyter notebook files.
ATTENTION_1: The organisation of that directory is a little bit chaotic, since I never had a goal to train classifiers for all regions, databases, and Qiime2 versions. I just put there the classifiers that I used for my projects or trained for my colleagues.
ATTENTION_2: Classifiers are provided "as is" and I am not responsible for any errors I made during the training. Use them at your own risk.
Best,
Timur
Hi again @Nicholas_Bokulich !
In my attempt to train a v4 silva 138.2 classifier on qiime2-amplicon-2024.10, I ran into a scikit-bio error:
pkg_resources.ContextualVersionConflict: (scikit-bio 0.6.0 (/projects/lindsval@colostate.edu/software/anaconda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages), Requirement.parse('scikit-bio<0.6.0,>=0.5.0'), {'iow'})
I think I get this error because (from what I can find from this post) the current workaround for greengenes2 compatibility with qiime2 2024.10 in order to avoid downgrading scikit-bio is to run:
conda install "cython<1.0"
pip install q2-greengenes2 "scikit-bio>=0.6.0"
I found that if you don't strictly install scikit-bio>=0.6.0, it results in this error:
**Plugin error from diversity:**
**module 'skbio.diversity.alpha' has no attribute 'sobs'**
So it seems like if you want to classify using greengenes2 and proceed with core-metrics, you must have scikit-bio>=0.6.0, but in order to build a silva classifier for qiime2 2024.10 you need to have scikit-bio 0.5.9 or earlier. @timanix, did you run into any issues like this (that is, if you use both silva and gg2 for classification)?
Does this seem correct? Can you help me figure out a way in which I can use qiime2 2024.10 to generate a silva 138.2 v4 classifier and be able to run gg2 all the way through to core-metrics?
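For anyone hitting the same conflict, one way to see which installed package pins the older scikit-bio is to ask pip directly. This is only a diagnostic sketch, assuming the QIIME 2 conda environment is already activated; the package name `iow` is taken from the error message above, and the `run`/`DRY_RUN` wrapper is a placeholder for illustration (it prints the commands unless `DRY_RUN=0` is set).

```shell
#!/bin/sh
# Sketch: inspect a dependency conflict inside the active environment.
set -eu

PKG="scikit-bio"   # package named in the ContextualVersionConflict
PINNER="iow"       # package that requires scikit-bio<0.6.0 per the error

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

diagnose_conflict() {
  # List installed packages whose declared requirements are not satisfied.
  run python -m pip check
  # Show the installed version and which packages depend on it (Required-by).
  run python -m pip show "$PKG"
  # Show the installed version of the package that declares the pin.
  run python -m pip show "$PINNER"
}

diagnose_conflict
```

Note that `pip check` exits with a non-zero status when it finds a conflict, so with `set -e` and `DRY_RUN=0` the script stops after reporting it.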
thanks!
Hello!
Nope, I never encountered such issues, but probably just because I used GG2 a while ago in older versions of qiime2.
Instead of downgrading or upgrading the scikit-bio version in the qiime2 environment (which may break some things while fixing others), I would install several versions of qiime2. For example, for silva I would use the newer one, and for GG2 the last version that is compatible with it as is. In any case, if GG2 breaks something in your second environment, you can always switch back to the latest version that you used for Silva to run the other analyses after classification by GG2.
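A minimal sketch of working with two environments side by side, assuming both are already installed with conda. The environment names are hypothetical (the first matches the one in the error path above, the second is made up), and the `run`/`DRY_RUN` wrapper only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: check two QIIME 2 conda environments without activating either.
set -eu

SILVA_ENV="qiime2-amplicon-2024.10"   # environment used for SILVA work
GG2_ENV="qiime2-gg2"                  # hypothetical environment for GG2

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

report_versions() {
  for env in "$SILVA_ENV" "$GG2_ENV"; do
    # Installed plugins and QIIME 2 version in this environment.
    run conda run -n "$env" qiime info
    # scikit-bio version that this environment actually imports.
    run conda run -n "$env" python -c "import skbio; print(skbio.__version__)"
  done
}

report_versions
```

Any `qiime` command can be prefixed with `conda run -n <env>` in the same way, so the GG2 classification can run in one environment and the remaining steps in the other.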
Best,
Thanks @timanix ! I will definitely be able to implement this for my own research; however, we are trying to find a solution to run both classifications on the same version of qiime2 for a workshop we are hosting this summer. Having two versions of qiime2 might be a little confusing to first-timers, while creating double the work for our IT team, who is installing qiime2 as a module on our HPC. If there are any plans in the future to reconcile this issue, let me know!
Then I would go for a Qiime2 version that works with GG2. The latest version of the Silva database is 138.2, but it may not be available in older versions of Qiime2 and Rescript.
To avoid updating Rescript within Qiime2 environment (again, it may or may not break some dependencies), I would use Silva 138.1.
Does that combination work for your purposes? If yes, I would check Qiime2 version 2024.2 or 2023.9.
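If you go that route, fetching Silva 138.1 with Rescript could look roughly like the sketch below. The output file names are placeholders, and the usual Rescript curation steps (quality filtering, dereplication) are left out for brevity. As written, the script only prints the commands; set `DRY_RUN=0` in an activated Qiime2 environment that includes Rescript to run them.

```shell
#!/bin/sh
# Sketch: download SILVA reference data with RESCRIPt and convert RNA to DNA.
set -eu

SILVA_VERSION="138.1"   # version suggested above for older environments

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

fetch_silva() {
  # Download the NR99 sequences and taxonomy for the chosen release.
  run qiime rescript get-silva-data \
    --p-version "$SILVA_VERSION" \
    --p-target SSURef_NR99 \
    --o-silva-sequences "silva-${SILVA_VERSION}-rna-seqs.qza" \
    --o-silva-taxonomy "silva-${SILVA_VERSION}-tax.qza"

  # SILVA ships RNA sequences; convert them to DNA before training.
  run qiime rescript reverse-transcribe \
    --i-rna-sequences "silva-${SILVA_VERSION}-rna-seqs.qza" \
    --o-dna-sequences "silva-${SILVA_VERSION}-seqs.qza"
}

fetch_silva
```

The resulting DNA sequences and taxonomy can then be used for primer-based read extraction and classifier training.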