I combined several biom files, sourced from 11 different studies on Qiita, into a single analysis in Qiita. However, both the biom file and the metadata seem to be problematic when I try to load them into R as a phyloseq object. When I try to merge the biom file with the metadata, not all samples are combined because some of the IDs are different. When I open the biom file in R, I see that some of the IDs have an X in front of their names. Since there are so many samples, it is difficult to figure out which ones are different. When I open the biom file in a text editor, though, there are no Xs. I think something goes wrong during the import, because I get this warning:
In a text editor, the biom file does not look like a normal biom file either:
I don’t think there is anything wrong with the biom files or the metadata produced by Qiita.
I believe your issue may come down to two problems.
I’m not familiar with the import_biom command, but it does look like it’s expecting a biom table with Greengenes taxonomy rankings. Are you trying to import an ASV table or an OTU table from Qiita? If you’re using ASVs (i.e. from the Deblur process), then you won’t be able to use this import method. You can simply convert the biom table to a .tsv file and import that into R/phyloseq.
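To illustrate what the converted table looks like to a script, here is a minimal Python sketch. It assumes the table was exported with something like `biom convert -i table.biom -o table.tsv --to-tsv` (the file names are placeholders), which typically writes a one-cell comment line followed by a header row starting with `#OTU ID`. The function name `read_feature_table` and the sample and feature IDs are made up for the example.

```python
def read_feature_table(tsv_text):
    """Parse a tab-separated feature table into (sample_ids, counts).

    counts maps feature ID -> {sample ID -> count}.
    Assumption: the first row with more than one cell is the header,
    and any single-cell rows before it are comments.
    """
    sample_ids = None
    counts = {}
    for line in tsv_text.splitlines():
        cells = line.split("\t")
        if len(cells) < 2:
            continue  # blank line or comment such as "# Constructed from biom file"
        if sample_ids is None:
            sample_ids = cells[1:]  # header row: "#OTU ID", then sample IDs
            continue
        feature_id = cells[0]
        values = [float(v) for v in cells[1:]]
        counts[feature_id] = dict(zip(sample_ids, values))
    return sample_ids, counts


# Hypothetical export with two samples and two features.
example = (
    "# Constructed from biom file\n"
    "#OTU ID\t10317.000001\t10317.000002\n"
    "ASV_a\t5.0\t0.0\n"
    "ASV_b\t2.0\t7.0\n"
)
sample_ids, counts = read_feature_table(example)
print(sample_ids)
print(counts["ASV_b"])
```

Sample IDs are kept as plain strings here, so an ID that starts with a digit stays exactly as it was written in the file.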
The other issue is that special characters in column names, like “#”, or duplicate column names can make R do weird things upon import, including adding those random "X"s to resolve the issue.
You’ll want to check those as well.
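As a way to find out which IDs differ without checking them by eye, here is a rough Python sketch. It relies on my understanding that R's `read.table`/`read.csv` run column names through `make.names()` when `check.names = TRUE` (the default), which replaces invalid characters with `.` and prepends `X` to names that do not start with a letter; passing `check.names = FALSE` should keep the names as they are. The function `r_style_name` below is only a simplified approximation of that behavior (it ignores reserved words and de-duplication), and all IDs are invented.

```python
import re


def r_style_name(sample_id):
    """Simplified approximation of R's make.names() (an assumption, not the full rule set)."""
    name = re.sub(r"[^A-Za-z0-9._]", ".", sample_id)
    # Valid names start with a letter, or with a dot that is not followed by a digit.
    if not re.match(r"[A-Za-z]|\.(?![0-9])", name):
        name = "X" + name
    return name


def compare_ids(table_ids, metadata_ids):
    """Sort metadata IDs into exact matches, R-renamed matches, and misses."""
    table = set(table_ids)
    matched, renamed, only_in_metadata = [], {}, []
    for sid in metadata_ids:
        if sid in table:
            matched.append(sid)
        elif r_style_name(sid) in table:
            renamed[r_style_name(sid)] = sid  # table ID -> original metadata ID
        else:
            only_in_metadata.append(sid)
    only_in_table = sorted(table - set(matched) - set(renamed))
    return {
        "matched": matched,
        "renamed": renamed,
        "only_in_metadata": only_in_metadata,
        "only_in_table": only_in_table,
    }


# Hypothetical IDs: column names as R shows them vs. IDs in the metadata file.
table_ids = ["X10317.000001", "X10317.000002", "ctrl.1", "X999.extra"]
metadata_ids = ["10317.000001", "10317.000002", "ctrl.1", "10317.000003"]

report = compare_ids(table_ids, metadata_ids)
for key, value in report.items():
    print(key, value)
```

The `renamed` mapping can then be used to restore the original IDs, and the two `only_in_*` lists show the samples that are genuinely missing from one side.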
Another great resource that might be useful for you: https://github.com/jbisanz/qiime2R
Keep us posted
Thank you for your response. Converting to tsv or using qza_to_phyloseq works. However, this file does not seem to contain taxonomy. I previously used a file from DADA2 that worked with the same script. How should I add taxonomy to the file from Qiita?
I think the problem comes from special characters, but I believe this is an issue with the biom file itself.
What are these characters added at the end?
I’m afraid you’ve lost me. I don’t know exactly what you’ve done, what code or method you used, or what you are now trying to do. Could you please clarify these points and provide the exact code you’re using?
Remember that when you produce a feature table from DADA2/Deblur, these are ASVs and don’t have taxonomy attached to them. This is done intentionally in QIIME 2 as well as other platforms like Phyloseq. The taxonomy stays separate in both platforms and is only called on when needed.
If you want your biom table to have taxonomy instead of ASV IDs (which is not something I would personally advise) you can use an approach like I describe here, just skip the OTU clustering part.
I’m in the process of figuring this out. I will post here once I resolve it so that others with the same issue can find the solution.
Thanks, that would be great!
Remember that in both QIIME 2 and phyloseq, by design, the taxonomy of your features is kept in a separate file and is only called on when taxonomy info is required. This means that if you are working with phyloseq, you’ll need to import a separate taxonomy file rather than having it as part of your biom table. Even though you CAN add taxonomy to your biom table, this isn’t the recommended approach.
Since you are operating from Qiita to QIIME 2/phyloseq, have a look at this tutorial here to see how you can create a taxonomy file in QIIME 2. Importing that into R/phyloseq is easy if you follow the links you posted earlier.
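To illustrate the idea of keeping taxonomy separate and only joining it when needed, here is a small Python sketch. It assumes a taxonomy table with `Feature ID` and `Taxon` columns, which is what an exported QIIME 2 taxonomy.tsv usually looks like, and a count table keyed by the same feature IDs. The function names, rank labels, and all values are made up for the example; in phyloseq the equivalent pieces would be the separate otu_table and tax_table components.

```python
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def read_taxonomy(tsv_text):
    """Parse a taxonomy TSV into {feature ID -> {rank -> label}}."""
    lines = [ln for ln in tsv_text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    id_col = header.index("Feature ID")
    taxon_col = header.index("Taxon")
    taxonomy = {}
    for line in lines[1:]:
        cells = line.split("\t")
        levels = [lvl.strip() for lvl in cells[taxon_col].split(";")]
        levels += [""] * (len(RANKS) - len(levels))  # pad ranks that were not assigned
        taxonomy[cells[id_col]] = dict(zip(RANKS, levels))
    return taxonomy


def collapse_to_rank(counts, taxonomy, rank):
    """Sum per-sample counts of all features that share a label at the given rank."""
    totals = {}
    for feature_id, per_sample in counts.items():
        label = taxonomy.get(feature_id, {}).get(rank) or "Unassigned"
        bucket = totals.setdefault(label, {})
        for sample_id, n in per_sample.items():
            bucket[sample_id] = bucket.get(sample_id, 0) + n
    return totals


# Hypothetical taxonomy file and count table; ASV_c has no taxonomy entry.
taxonomy_tsv = (
    "Feature ID\tTaxon\tConfidence\n"
    "ASV_a\tk__Bacteria; p__Firmicutes; c__Bacilli\t0.98\n"
    "ASV_b\tk__Bacteria; p__Bacteroidetes\t0.91\n"
)
counts = {
    "ASV_a": {"S1": 5, "S2": 0},
    "ASV_b": {"S1": 2, "S2": 7},
    "ASV_c": {"S1": 1, "S2": 1},
}

taxonomy = read_taxonomy(taxonomy_tsv)
phylum_totals = collapse_to_rank(counts, taxonomy, "Phylum")
print(phylum_totals)
```

The count table itself is never modified: features keep their ASV IDs, and the taxonomy is looked up by feature ID only at the point where a taxonomic summary is wanted.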