Taxonomy and Accession number of the sequence

Dear all,
I successfully used QIIME 2 to get the ASV table and taxonomy for my ITS sequences without any particular problems. However, I was wondering if there is a way to get, for each feature, the accession number of the UNITE reference sequence that matched that feature's sequence. In the taxonomy file I get only the feature ID, the taxonomic classification, and the confidence.

Thanks in advance and have a nice day,
Davide

Hi @Pikula,

Good question. The short answer is: not usually. The q2-feature-classifier methods determine the most probable lineage and/or a consensus among the top hits rather than identifying a single top hit, so there is usually no single accession ID associated with a given classification.

However, it is possible to configure the classify-consensus-* methods to report just the top hit instead of performing a consensus classification, by using the maxaccepts and/or maxhits parameters (see the documentation for details on each method). To see the reference sequence ID (the accession number) associated with these top hits, though, you would need to put the accession number in the taxonomy label of each reference sequence somehow, so that it appears in the classification results.
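As a rough sketch of the "put the accession number in the taxonomy label" idea: the snippet below takes the lines of a tab-separated reference taxonomy file (reference ID, then taxon string) and appends each reference ID to its own taxon label. The function name `append_id_to_taxonomy` and the `acc__` prefix are made up for illustration, and I am assuming the reference ID in the first column is (or contains) the accession you care about. You would still need to import the edited file as your reference taxonomy before classifying.

```python
def append_id_to_taxonomy(lines, prefix="acc__"):
    """Append each reference ID to its taxon label.

    `lines` is an iterable of tab-separated lines: reference ID, taxon string.
    `prefix` is a hypothetical rank-style tag, not a QIIME 2 or UNITE convention.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        ref_id, taxon = parts[0], parts[1]
        if ref_id == "Feature ID":
            out.append(line)  # keep a header row unchanged, if present
            continue
        # Drop any trailing ';' so we do not create an empty rank
        out.append(f"{ref_id}\t{taxon.rstrip(';')};{prefix}{ref_id}")
    return out


if __name__ == "__main__":
    example = [
        "Feature ID\tTaxon",
        "REF001\tk__Fungi;p__Ascomycota",  # made-up reference entry
    ]
    for row in append_id_to_taxonomy(example):
        print(row)
```

With labels like this, a top-hit-only classification should carry the reference ID through to the taxonomy results, whereas a consensus over several hits would typically truncate the label above that last level.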

Another option (especially if you are interested in the accession IDs for just a handful of sequences) would be to filter your FeatureData[Sequence] artifact to contain only the query sequences you want to match, then use qiime quality-control evaluate-seqs. That method is just running blastn under the hood and will show you the full BLAST report, so you can see the accession IDs for all potential matches. Or run blastn directly.
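If you go the "run blastn directly" route and ask for tabular output (`-outfmt 6`), picking the best reference hit per query is easy to script. Here is a minimal sketch, assuming the default 12-column layout (qseqid, sseqid, pident, ..., evalue, bitscore) and choosing the hit with the highest bit score; the function name `top_hits` and the IDs in the example are invented for illustration.

```python
def top_hits(blast_lines):
    """Return {query_id: (subject_id, percent_identity, bitscore)} for the best hit.

    Expects lines in blastn's default tabular layout (-outfmt 6, 12 columns).
    "Best" here means highest bit score; ties keep the first hit seen.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a default 12-column record
        query, subject = fields[0], fields[1]
        pident, bitscore = float(fields[2]), float(fields[11])
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, pident, bitscore)
    return best


if __name__ == "__main__":
    # Made-up records: two hits for asv1, one for asv2
    rows = [
        "asv1\tREF_A\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-90\t350.0",
        "asv1\tREF_B\t99.0\t200\t2\t0\t1\t200\t1\t200\t1e-95\t360.0",
        "asv2\tREF_C\t97.0\t180\t5\t0\t1\t180\t1\t180\t1e-80\t300.0",
    ]
    for query, (subject, pident, bitscore) in top_hits(rows).items():
        print(query, subject, pident, bitscore)
```

The subject ID column is whatever ID the reference sequences carry in the BLAST database, so for a UNITE reference this is where the accession would show up.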

Good luck!

Dear Nicholas,

Many thanks for the in-depth explanation. I will have a look and play around with the suggested parameters.
Thanks again.
