Taxonomy IDs for Qiita/redbiom

Hello, I was following the clawback tutorial (Understanding q2-clawback weights learning) to build weighted classifiers for Annelida gut habitats (assuming that worm habitats might differ from mammalian habitats).

My idea was to start with `redbiom search metadata "where host_taxid==6340"` to get a list of sample IDs from Annelida. Since the output is empty, I am wondering whether taxids can be used in the same way as at NCBI (where txid 6340 specifies Annelida and all taxonomic groups within it), or whether taxids in redbiom are not stored hierarchically (only the specific taxid is stored).

Hi @arwqiime ,
I have recategorized your question as "other bioinformatics tools" as it is rather a question about how Qiita and/or redbiom handles taxids. This is a pre-clawback step so I also changed the title for clarity.

I am not 100% certain, but I think you are correct: Qiita and redbiom probably do not store hierarchical taxid info. Most likely only the specific taxon linked to that taxid is recorded, since I believe this is just stored in a sample metadata file in Qiita (though redbiom may handle this differently; I am not sure). You could search directly on Qiita for more specific taxids within Annelida to confirm (or maybe search via common names).

Provided you can find a workaround (maybe a list of specific taxids or sample IDs?) and pull up enough records, this does sound like a good approach. I would still recommend comparing against the uniform classifier to make sure the results look reasonable, as you would probably recover only a small number of records (if any).
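One way to build such a list of specific taxids is to expand the parent taxid into its descendants offline and then run one exact-match search per ID. The sketch below is only an illustration under stated assumptions: it parses text in the layout of NCBI's `nodes.dmp` (fields separated by `|`), the sample rows use made-up placeholder IDs (only 6340 comes from this thread), and the helper names `parse_children`, `descendant_taxids`, and `redbiom_queries` are hypothetical. The generated commands simply reuse the `where host_taxid==...` form from the original post; whether any of them returns samples depends on what depositors entered.

```python
from collections import defaultdict, deque

# Tiny stand-in for NCBI's nodes.dmp (tax_id | parent_tax_id | rank | ...).
# Only 6340 (Annelida) comes from the thread; all other IDs are placeholders.
NODES_DMP = """\
6340\t|\t1\t|\tphylum\t|
900001\t|\t6340\t|\tclass\t|
900002\t|\t900001\t|\tspecies\t|
900003\t|\t6340\t|\tspecies\t|
900004\t|\t1\t|\tphylum\t|
"""


def parse_children(nodes_text):
    """Map each parent taxid to the list of its direct child taxids."""
    children = defaultdict(list)
    for line in nodes_text.splitlines():
        if not line.strip():
            continue
        fields = [field.strip() for field in line.split("|")]
        tax_id, parent_id = int(fields[0]), int(fields[1])
        if tax_id != parent_id:  # the root node lists itself as its parent
            children[parent_id].append(tax_id)
    return children


def descendant_taxids(children, root):
    """Return the root taxid plus all of its descendants, sorted."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


def redbiom_queries(taxids):
    """Build one exact-match search command per taxid (same form as in the thread)."""
    return [f'redbiom search metadata "where host_taxid=={t}"' for t in taxids]


taxids = descendant_taxids(parse_children(NODES_DMP), 6340)
queries = redbiom_queries(taxids)
for query in queries:
    print(query)
```

With the real `nodes.dmp` from the NCBI taxonomy dump, the same functions would yield every taxid below Annelida; the sample IDs returned by the individual searches could then be pooled.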

Good luck!

Hello @arwqiime and @Nicholas_Bokulich,

@Nicholas_Bokulich, thank you for pinging me on this thread.

You are right: Qiita/redbiom stores only the host taxon_id and scientific_name (no hierarchical data), as provided by the user who deposited the data, so searching for something less specific could help. As a test, I searched for "worm" and found some samples, but they seem to be from other hosts that have the word "worm" somewhere in their metadata.

Anyway, @arwqiime, if you find public raw data that might be useful for your meta-analysis and you want it processed in Qiita (which will make it available to you and future users), please send an email to the qiita-help account with the PubMed ID or SRA/ENA study accessions. We normally help with this kind of effort.

Best and good luck!

Dear @antgonza
Thank you for your help with this. I agree that using a keyword like 'worm' could be misleading.
When browsing the Qiita data via the web portal, I realized that most of the 'gut' samples originate from the medical field (mouse, human, etc.), and only a few are from non-mammalian hosts (e.g. a great study of a few dozen butterfly species from Harvard). I have classified my samples with the 'standard' and weighted classifiers (Animal distal/proximal gut) but could not see big differences. I will have to look at the taxa that do change and find out whether these taxa have been found in guts by other techniques.
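To find the taxa that change between the two classifiers, one option is to compare the two taxonomy tables feature by feature. This is a minimal sketch, assuming each classification has been exported as a tab-separated table with `Feature ID` and `Taxon` columns; the feature IDs and taxon strings below are placeholders, and `read_taxonomy` and `changed_assignments` are hypothetical helper names.

```python
import csv
import io

# Placeholder exports of two classification results (tab-separated).
UNIFORM_TSV = (
    "Feature ID\tTaxon\tConfidence\n"
    "f1\tk__Bacteria; p__Firmicutes\t0.99\n"
    "f2\tk__Bacteria; p__Proteobacteria\t0.80\n"
    "f3\tk__Bacteria; p__Bacteroidetes\t0.95\n"
)
WEIGHTED_TSV = (
    "Feature ID\tTaxon\tConfidence\n"
    "f1\tk__Bacteria; p__Firmicutes\t0.99\n"
    "f2\tk__Bacteria; p__Actinobacteria\t0.91\n"
    "f3\tk__Bacteria; p__Bacteroidetes\t0.97\n"
)


def read_taxonomy(tsv_text):
    """Return a dict mapping feature ID to its assigned taxon string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["Feature ID"]: row["Taxon"] for row in reader}


def changed_assignments(uniform, weighted):
    """Return {feature ID: (uniform taxon, weighted taxon)} where the two differ."""
    shared = sorted(uniform.keys() & weighted.keys())
    return {
        fid: (uniform[fid], weighted[fid])
        for fid in shared
        if uniform[fid] != weighted[fid]
    }


changed = changed_assignments(
    read_taxonomy(UNIFORM_TSV), read_taxonomy(WEIGHTED_TSV)
)
for fid, (before, after) in changed.items():
    print(f"{fid}: {before} -> {after}")
```

Only features whose taxon string differs are reported, so changes in confidence alone are ignored in this sketch.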
Thank you also for the offer to send you PUBMED or SRA/ENA study accessions. I will come back if there are any.
Best regards
