Seeking Advice on Analyzing Overlap in Taxonomic Classification Results

Dear QIIME2 Community,

I'm currently working on the analysis of my samples where both ITS1 and ITS2 regions have been amplified. I conducted separate analyses for each set of samples.

I used the qiime feature-classifier classify-sklearn method to classify my representative sequences. Now, I'm interested in understanding the level of taxonomic overlap between the ITS1 and ITS2 sets.

Specifically, I would like to know how many sequences have been classified as the same in both datasets. Could you please provide suggestions on how I can obtain this information? Ideally, I would like to generate a list of accession numbers for these shared sequences.

Thank you so much for your assistance and any suggestions you can offer.



Hi @Antani!

So, just to make sure I understand - you used the same taxonomic classifier on two sets of rep seqs? One with the ITS1 region amplified, and the other with ITS2?


Hi @lizgehret
Correct, same taxonomic classifier on both.
I used the UNITE ITS database v9.

Thank you so much for your help!


Hey @Antani,

So sorry for the delayed response here!

Unfortunately there isn't an action within QIIME 2 that will accomplish this. I think your best bet would be to try and compare your rep-seqs files in Python or R - returning a list of IDs for all matching values between the two files.
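As a rough starting point, a comparison along these lines could be sketched in Python. This is only a sketch under assumptions: it compares the taxonomy assignments rather than the sequences themselves, and it assumes each classification result was exported (for example with qiime tools export) to a tab-separated file with "Feature ID" and "Taxon" columns. The function names and file paths are hypothetical.

```python
import csv
from collections import defaultdict


def read_taxonomy(path):
    """Map each taxon string to the set of feature IDs assigned to it.

    Assumes a tab-separated file with 'Feature ID' and 'Taxon' columns.
    """
    taxon_to_ids = defaultdict(set)
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            taxon_to_ids[row["Taxon"].strip()].add(row["Feature ID"])
    return taxon_to_ids


def shared_taxa(its1_path, its2_path):
    """Return {taxon: (ITS1 feature IDs, ITS2 feature IDs)} for taxa in both files."""
    its1 = read_taxonomy(its1_path)
    its2 = read_taxonomy(its2_path)
    return {
        taxon: (sorted(its1[taxon]), sorted(its2[taxon]))
        for taxon in sorted(its1.keys() & its2.keys())
    }


# Hypothetical usage, with placeholder paths:
# for taxon, (its1_ids, its2_ids) in shared_taxa("its1/taxonomy.tsv", "its2/taxonomy.tsv").items():
#     print(taxon, its1_ids, its2_ids, sep="\t")
```

Only identical taxonomy strings count as a match here, so two features classified to different depths of the same lineage would not be reported as shared.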

Hope this helps! Cheers :lizard:


Hi @Antani,

Just following up on this - there are a couple of options within QIIME 2 that will get you some of what you want (and might get you close enough without having to write a custom Python/R script).

q2-quality-control has the evaluate-taxonomy action, which compares a pair of observed and expected taxonomic assignments to calculate precision, recall, and F-measure at each taxonomic level, up to the maximum level specified by the depth parameter. RESCRIPt also has an action with the same name that will produce a similar output.

These actions won't get you a list of IDs, but will provide you with general quantification of the amount of overlap between the two datasets.
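To illustrate what a per-level comparison means, here is a small Python sketch that truncates taxonomy strings to each depth and counts the taxa shared between two sets. It is not what the QIIME 2 actions compute (they report precision, recall, and F-measure); it is a simpler set-overlap count. The function names are hypothetical, and the default of 7 levels is an assumption based on semicolon-delimited kingdom-to-species strings.

```python
def truncate(taxon, depth):
    """Keep the first `depth` ranks of a semicolon-delimited taxonomy string.

    Strings with fewer than `depth` ranks are returned unchanged.
    """
    ranks = [rank.strip() for rank in taxon.split(";")]
    return ";".join(ranks[:depth])


def overlap_by_depth(its1_taxa, its2_taxa, max_depth=7):
    """Count shared and unique taxa between two collections at each depth."""
    summary = {}
    for depth in range(1, max_depth + 1):
        a = {truncate(t, depth) for t in its1_taxa}
        b = {truncate(t, depth) for t in its2_taxa}
        summary[depth] = {
            "shared": len(a & b),
            "its1_only": len(a - b),
            "its2_only": len(b - a),
        }
    return summary


# Toy example with made-up taxonomy strings:
its1 = ["k__Fungi;p__Ascomycota;c__A", "k__Fungi;p__Basidiomycota"]
its2 = ["k__Fungi;p__Ascomycota;c__B"]
for depth, counts in overlap_by_depth(its1, its2, max_depth=3).items():
    print(depth, counts)
```

In the toy example the two sets agree fully at the first level and diverge by the third, which is the kind of trend the per-level metrics are meant to summarize.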

Hope this helps! Cheers :lizard:


Hi @Antani ,

See also qiime rescript evaluate-classifications.

Good luck!

