Taxonomic assignment of feature IDs to reads

Dear all,

I am really new to QIIME (2) and I was wondering if anyone is able to help me with the taxonomic assignment of my sequences.

I would like to assign taxonomy to my feature IDs (OTUs) based on their sequences. I was able to create an OTU table and to get a FASTA file with all the feature IDs and their corresponding sequences. The problem now is that it would take me forever to BLAST every single feature ID against NCBI, so I assume this can be done within QIIME 2.

I am aware that a taxonomic classifier is probably needed, but I am really not sure how to get such a classifier for COI sequences (for arthropods, to be exact). I was able to download a COI training set that has been used in earlier studies, which I think I can use for my sequences. The training set is a tar.gz file. I am just not sure how to proceed from here. Is it even possible to use this training set? How can I import it?

Again, I am very new to QIIME, so I apologize if I am not specific enough or am too vague. Please let me know if you have any questions.

Thanks in advance!

Margreet

Welcome @Margreet! (to QIIME 2 and the forum) :tada:

QIIME 2 has a variety of plugins and methods for predicting the taxonomic affiliations of sequences. q2-feature-classifier is the main plugin, with several methods to choose from. You can read the documentation at docs.qiime2.org for usage details (e.g., for training a naive Bayes classifier from a set of training sequences). In general, if you are brand-new to QIIME 2, I recommend checking out the tutorials and other documentation there. They give a useful introduction to the different plugins available (and explain what a "plugin" is, if you are not yet aware).
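
To give a rough idea of what the train-then-classify workflow with q2-feature-classifier can look like, here is a minimal sketch in Python. It only lists the contents of a downloaded tar.gz training set and builds the QIIME 2 command lines as text; it does not run them. All file names (`coi-ref.fasta`, `coi-ref-taxonomy.tsv`, `rep-seqs.qza`, and the output names) are hypothetical placeholders, and I am assuming the training set contains a FASTA file plus a headerless, tab-separated taxonomy file. Check what the archive actually holds, and check the docs for the options that fit your data.

```python
import shlex
import tarfile
from pathlib import Path


def list_archive(archive_path):
    """Return the member names of a .tar.gz, to see what the training set contains."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


def build_commands(ref_fasta, ref_taxonomy, query_seqs, workdir="."):
    """Build (but do not run) QIIME 2 commands as argument lists.

    ref_fasta    -- reference sequences in FASTA format (assumed)
    ref_taxonomy -- headerless TSV: sequence ID <tab> taxonomy string (assumed)
    query_seqs   -- your own sequences, already imported as a .qza artifact
    """
    out = Path(workdir)
    ref_seqs_qza = str(out / "ref-seqs.qza")
    ref_tax_qza = str(out / "ref-taxonomy.qza")
    classifier_qza = str(out / "coi-classifier.qza")
    taxonomy_qza = str(out / "taxonomy.qza")
    return [
        # 1. Import the reference sequences as a QIIME 2 artifact.
        ["qiime", "tools", "import",
         "--type", "FeatureData[Sequence]",
         "--input-path", ref_fasta,
         "--output-path", ref_seqs_qza],
        # 2. Import the matching reference taxonomy.
        ["qiime", "tools", "import",
         "--type", "FeatureData[Taxonomy]",
         "--input-format", "HeaderlessTSVTaxonomyFormat",
         "--input-path", ref_taxonomy,
         "--output-path", ref_tax_qza],
        # 3. Train a naive Bayes classifier on the reference data.
        ["qiime", "feature-classifier", "fit-classifier-naive-bayes",
         "--i-reference-reads", ref_seqs_qza,
         "--i-reference-taxonomy", ref_tax_qza,
         "--o-classifier", classifier_qza],
        # 4. Classify your own sequences with the trained classifier.
        ["qiime", "feature-classifier", "classify-sklearn",
         "--i-classifier", classifier_qza,
         "--i-reads", query_seqs,
         "--o-classification", taxonomy_qza],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them in a shell.
    for cmd in build_commands("coi-ref.fasta", "coi-ref-taxonomy.tsv", "rep-seqs.qza"):
        print(shlex.join(cmd))
```

If you already have a trained classifier artifact, only the last command is needed. Note that `classify-sklearn` expects your own sequences as a `FeatureData[Sequence]` artifact, so a plain FASTA file would have to be imported first, in the same way as the reference sequences in step 1.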

You are in luck: @devonorourke recently put together the following tutorial AND a pre-trained COI classifier. Scroll to the very bottom of the tutorial to get the classifier, the sequences and taxonomy used to train it (e.g., in case you want to re-train it or use a non-sklearn classifier), and other intermediate files:

Good luck!
