Adapting feature IDs after dada2

vca007 · March 18, 2020, 5:53pm

Hi, I am using pyro-dada2 to form ASVs, but the resulting ASVs are assigned complex IDs. I was wondering if anyone has an idea of how to change them to more regular IDs, such as ASV_XX, that can then be matched with the sequences file. Thank you!

Nicholas_Bokulich · March 18, 2020, 6:39pm

Hi @vca007,
Those complex IDs contain useful info for comparison against other studies — before switching IDs, I recommend reading a bit more on what these represent:

But if you really want to swap the IDs, you can use qiime feature-table group to relabel the IDs to be whatever you want. Here is an example of relabeling to use the sequences themselves, but you can create any label you like as a tab-delimited "feature metadata" file with the original feature IDs in the left-hand column and the new id in the right-hand column:
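For readers who want a starting point, here is a minimal Python sketch of building such a tab-delimited mapping file. The function names, the `ASV_` prefix, the `new_id` column name, and the example IDs are all hypothetical choices for illustration; only the two-column layout (original IDs on the left, new IDs on the right) comes from the description above.

```python
import csv
import io


def build_id_map(feature_ids, prefix="ASV_"):
    """Map each original feature ID to a zero-padded sequential label."""
    width = len(str(len(feature_ids)))  # pad so labels sort correctly
    return {
        fid: f"{prefix}{i:0{width}d}"
        for i, fid in enumerate(feature_ids, start=1)
    }


def write_feature_metadata(id_map, handle, new_column="new_id"):
    """Write a two-column TSV: original ID on the left, new ID on the right."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # "feature-id" is assumed here as the ID column header.
    writer.writerow(["feature-id", new_column])
    for old_id, new_id in id_map.items():
        writer.writerow([old_id, new_id])


# Hypothetical original IDs, standing in for the hash-like ASV IDs.
original_ids = ["4b5eeb3003", "fe30ff0f71", "d29fe3c70a"]
id_map = build_id_map(original_ids)

buffer = io.StringIO()  # swap for open("id-map.tsv", "w", newline="") to write a file
write_feature_metadata(id_map, buffer)
print(buffer.getvalue())
```

The resulting file could then be passed to `qiime feature-table group` with `--p-axis feature`, `--m-metadata-file`, `--m-metadata-column new_id`, and `--p-mode sum`; check `qiime feature-table group --help` for the exact options in your QIIME 2 version. Because the new labels are unique, grouping should only rename features rather than merge them.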

I hope that helps!

vca007 · March 19, 2020, 1:33pm

Thanks Nicholas!

I understand why these IDs are useful when comparing to other studies. My current issue is just that I am doing some phylogenetic analyses with some specific sequences, and it becomes difficult to keep in mind which sequence corresponds to which distribution among my samples, whereas in the past, OTU_1 was easier to associate.

Thanks for the tip about feature-table group. I managed to change the FeatureTable[Frequency], but I am not sure I understand how to adapt the IDs in a FeatureData[Sequence] file. For the table, I created a metadata file with the list of original IDs in the first column and the new IDs in the second column.

Thank you for your fast response!

Nicholas_Bokulich · March 19, 2020, 3:19pm

Sorry @vca007, there is not an easy way to relabel IDs in FeatureData[Sequence], at least not without breaking provenance. The thing to do would be to export the data, relabel outside of QIIME 2, and then re-import.
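As a rough illustration of the "relabel outside of QIIME 2" step, here is a minimal Python sketch that rewrites FASTA headers using a two-column, tab-delimited mapping (original ID, new ID). The function names and example data are hypothetical. It assumes the sequences were first exported with `qiime tools export` (which should yield a FASTA file) and would afterwards be re-imported with `qiime tools import --type 'FeatureData[Sequence]'`.

```python
def read_id_map(lines):
    """Parse a tab-delimited mapping with a header row: old ID, new ID."""
    id_map = {}
    rows = iter(lines)
    next(rows, None)  # skip the header row
    for line in rows:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comment lines
        old_id, new_id = line.split("\t")[:2]
        id_map[old_id] = new_id
    return id_map


def rename_fasta(lines, id_map):
    """Yield FASTA lines with each header ID replaced by its new label."""
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            old_id = parts[0] if parts else ""
            if old_id not in id_map:
                # Fail loudly so the table and sequences cannot drift apart.
                raise KeyError(f"no new ID for sequence {old_id!r}")
            yield f">{id_map[old_id]}\n"
        else:
            yield line if line.endswith("\n") else line + "\n"


# Hypothetical data; in practice, iterate over open file handles instead.
mapping = ["feature-id\tnew_id\n", "4b5eeb3003\tASV_1\n", "fe30ff0f71\tASV_2\n"]
fasta = [">4b5eeb3003\n", "ACGTACGT\n", ">fe30ff0f71\n", "TTGACCAA\n"]

renamed = list(rename_fasta(fasta, read_id_map(mapping)))
print("".join(renamed), end="")
```

Reusing the same mapping file for both the table and the sequences keeps the two sets of IDs consistent. Raising an error on an unmapped ID is a design choice in this sketch, not a QIIME 2 requirement.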

vca007 · March 19, 2020, 5:16pm

Hi Nicholas, indeed it is probably the easiest. Thanks for all the support. I don't know if it is allowed to post it, but just for the others, I have been using the rename.fasta function of phylotools package in R. Thanks again!

Nicholas_Bokulich · March 19, 2020, 5:27pm

Of course! In fact, you are welcome (and encouraged!) to share the full workflow you are using to export to R, run rename.fasta, and re-import (if you are going back to QIIME 2). Other users will likely be interested in this (and to other readers: if you have other solutions using python or R or bash or [insert other method here], please don't hesitate to share).

Just to be clear: the QIIME 2 forum is intended for QIIME 2 support, but lots of topics in general microbiomics and external software tools are relevant to the QIIME 2 community, as is reflected in the general discussion, community tutorials, and other bioinformatics tools topic categories... e.g., we have tutorials describing how to export QIIME 2 data for specific analyses in R. So as long as everything is done in a spirit of camaraderie and support for the QIIME 2 community, it is welcome.

Furthermore, QIIME 2 is not necessarily intended to "do everything" and there are lots of niche analyses, as well as some operations that we just haven't had time to implement yet (like renaming FeatureData[Sequence] indices!), so information on these is welcome.
