Import error arising from new BugBase metadata requirements

anamarija · November 21, 2019, 10:21am

I wanted to thank you for your tutorial (GitHub - mestaki/qiime2-to-BugBase), since I've been trying to use BugBase for some time.
I first tried this summer following their documentation, without success.
Today I followed your tutorial to the letter. It generated all the files quickly and without problems, but when I upload them to the BugBase web app I still get this message: "There was an error while parsing your data. Please ensure your OTU and mapping files follow the format explained in the documentation. OTU tables must be the json version of biom."
I'm at my wits' end.

I'm using qiime2-2019.4 on macOS Mojave.
This is what my feature-table-tax-biom1.biom looks like in a text editor:

This is what my metadata.txt looks like:

Any suggestions would be helpful...

Mehrbod_Estaki · November 21, 2019, 10:37am

Hi @anamarija,
Glad that little tutorial is finally getting some use.
The first thing that sticks out to me is the name of the first column in your metadata, #SampleID. There is a note in my tutorial regarding this:

You will have to manually rename the first column to sample-id as per BugBase’s requirement.

Can you rename that and try again?

anamarija · November 21, 2019, 11:01am

Hi, thanks for the extra fast reply!
I changed it, but it seems there is a problem with the biom file, because when I upload it without the metadata I still get the same error. I also tried with your data from the tutorial, but I have the same issue.
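
Since the error message insists on the JSON version of biom, one thing that can be ruled out quickly is whether the file being uploaded is JSON at all. BIOM 1.0 files are JSON text, while BIOM 2.x files are HDF5 binaries. The snippet below is only a minimal sketch of that check, not part of the tutorial or of BugBase; the function names are made up for illustration, and it assumes the HDF5 signature sits at the very start of the file, which is the usual case.

```python
import json

# Signature found at the start of an HDF5 file (assumed here to be at offset 0).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def biom_flavor(raw: bytes) -> str:
    """Classify raw file contents as 'hdf5', 'json', or 'unknown'."""
    if raw.startswith(HDF5_SIGNATURE):
        return "hdf5"
    try:
        parsed = json.loads(raw.decode("utf-8"))
    except ValueError:
        # Covers both UnicodeDecodeError and json.JSONDecodeError.
        return "unknown"
    # A JSON BIOM table is a single JSON object at the top level.
    return "json" if isinstance(parsed, dict) else "unknown"


def biom_flavor_of_file(path: str) -> str:
    """Hypothetical convenience wrapper that reads the file from disk."""
    with open(path, "rb") as handle:
        return biom_flavor(handle.read())
```

Calling `biom_flavor_of_file("feature-table-tax-biom1.biom")` would then report which flavor the file is. A result of "json" only says the file parses as JSON; it does not prove BugBase will accept it.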

anamarija · November 21, 2019, 11:05am

Could it be that the file is too big (328 KB)?

anamarija · November 21, 2019, 11:14am

Sorry for the multiple posts. I was finally successful with your data, but only if I don't add the mapping file (even with the "sample-id" change).

Mehrbod_Estaki · November 21, 2019, 11:36am

Hi @anamarija,
Thanks for the update. What did you do differently this time that made it work, albeit without the metadata?

It appears that the BugBase documentation has changed a bit since I last made that tutorial. In fact, the first header is now required to be #SampleID, not sample-id as my tutorial says.
The fact that this works for you only without the metadata file suggests that some special character or formatting issue in it does not meet their requirements. I would carefully check their requirements again. Take extra care looking for hidden characters or spaces if your metadata file was edited in something like Excel. According to their documentation, the mapping file must:

Be a tab-delimited text file
Have sample IDs in the first column
Have column headers in the first row
Have #SampleID as the first header
Contain only letters, numbers, underscores and hyphens
Not contain spaces, commas or quotes
Never contain confidential information
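
The requirements above can be turned into a rough pre-upload check. This is a minimal sketch under a few assumptions, not BugBase's own parser: the function name is hypothetical, and it assumes that the "only letters, numbers, underscores and hyphens" rule applies to every cell and header (other than the leading # of #SampleID), which the list does not spell out.

```python
import re

# Letters, numbers, underscores and hyphens only (assumed to apply to every cell).
ALLOWED = re.compile(r"[A-Za-z0-9_-]+")


def check_mapping_text(text: str) -> list[str]:
    """Return a list of human-readable problems found in a mapping file."""
    lines = text.splitlines()
    if not lines:
        return ["file is empty"]

    problems = []
    header = lines[0].split("\t")
    if header[0] != "#SampleID":
        # Also catches hidden characters such as a byte-order mark.
        problems.append(f"first header is {header[0]!r}, expected '#SampleID'")

    for line_no, line in enumerate(lines, start=1):
        cells = line.split("\t")
        if len(cells) != len(header):
            problems.append(
                f"line {line_no}: {len(cells)} columns, header has {len(header)}"
            )
        for col_no, cell in enumerate(cells, start=1):
            if line_no == 1 and col_no == 1:
                continue  # the first header is checked separately above
            if not ALLOWED.fullmatch(cell):
                problems.append(
                    f"line {line_no}, column {col_no}: unexpected value {cell!r}"
                )
    return problems
```

An empty list means none of these particular checks failed. Values are printed with `repr`, so stray spaces or invisible characters left behind by a spreadsheet program show up explicitly.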

I'll try and update my tutorial whenever I get some free time.

anamarija · November 21, 2019, 12:27pm

Hi, I did everything the same, that is what is confusing... The only difference is in the file size, but that shouldn't be an issue (their documentation says: < 15 mb (web application only)). I also tried analyzing only half of my data, and the result is the same.
Thanks anyway for your time.

anamarija · November 21, 2019, 12:43pm

Btw, I was successful with your data after I changed the first header in the mapping file to #SampleID and replaced spaces with underscores, e.g. left_palm, right_palm.
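
For larger mapping files, the same two manual changes (renaming the first header and replacing spaces with underscores) could be scripted. The following is a minimal sketch only; the function name is hypothetical, and stripping surrounding quotes is an extra assumption based on BugBase's "no quotes" rule rather than something done in this thread.

```python
import re


def sanitize_mapping_text(text: str) -> str:
    """Rename the first header to #SampleID and replace spaces with underscores."""
    rows = [line.split("\t") for line in text.splitlines()]
    if not rows:
        return text

    rows[0][0] = "#SampleID"  # header name BugBase currently asks for

    cleaned_lines = []
    for row in rows:
        new_row = []
        for cell in row:
            # Drop surrounding whitespace and quotes (assumption, see lead-in).
            cell = cell.strip().strip("\"'")
            # Collapse any run of internal whitespace into one underscore.
            cell = re.sub(r"\s+", "_", cell)
            new_row.append(cell)
        cleaned_lines.append("\t".join(new_row))
    return "\n".join(cleaned_lines) + "\n"
```

Note that two different values such as "left palm" and "left_palm" would collapse into the same string, so the output is worth a quick look before uploading.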

Mehrbod_Estaki · November 21, 2019, 1:19pm

Aha! That would explain it then. Those must be the newer requirements added to BugBase. I'll make sure to account for those in the updated version of my tutorial.

anamarija · November 21, 2019, 2:56pm

I finally found out what the problem was (although it shouldn't be)!
So, the only difference between your data and mine was that I picked OTUs at the 99% identity level, and yours were at 97%. So I went back, picked them at 97%, and it worked perfectly!
I really don't know why, but it must have something to do with 99_otus.fasta or 99_taxonomy, because I tried it 5 times... Anyway, I hope this will be helpful to somebody.

Mehrbod_Estaki · November 28, 2019, 10:43am

Hi @anamarija,
I just went over the tutorial again and was successful on some new data, clustering at 99% identity as you did. The only change I had to make was to my mapping file (as mentioned above, sample-id -> #SampleID). Did you by chance use the 99_otus.fasta reference database with the 97% taxonomy? That would certainly throw an error; you have to use matching OTUs and taxonomy. Which % identity you use for clustering shouldn't cause any errors.

anamarija · December 5, 2019, 11:20am

Hi @Mehrbod_Estaki, sorry for the delay in answering. I agree that it shouldn't cause errors, so I'll try 99% identity again soon to check whether the mistake is mine. I'll get back to you on that.

anamarija · December 10, 2019, 3:29pm

Hi, just wanted to say that I've run 99% identity in BugBase and it worked. I don't know what the issue was before, but now it works. So everything you did in your tutorial works. Thanks again.

Mehrbod_Estaki · December 10, 2019, 11:32pm

Thanks for the update @anamarija. I believe your issue was that in your first attempt you perhaps used the 99% OTUs file with the 97% taxonomy file. The mismatch there would certainly cause issues. You always want to make sure you use the corresponding OTUs and taxonomy files. Glad it's all cleared up.

anamarija · December 11, 2019, 12:34pm

Actually, I think the problem was that I accidentally used the 99% OTUs file from the rep_set_aligned folder, which has the same file name. When I did 97% I didn't make that mistake, so it worked.
Thanks again
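
Both mix-ups discussed here (an OTUs file that does not match the taxonomy file, and an aligned FASTA picked up in place of the unaligned one) can be spotted before running anything. The snippet below is a minimal sketch with hypothetical function names. It assumes FASTA headers of the form `>ID`, a tab-separated taxonomy file with the ID in the first column, and that `-` or `.` characters in sequence lines indicate an aligned file.

```python
def fasta_ids_and_gaps(fasta_text: str):
    """Collect sequence IDs and note whether any sequence line has gap characters."""
    ids = set()
    has_gaps = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])  # ID is the first token after '>'
        elif "-" in line or "." in line:
            has_gaps = True  # gap characters suggest an aligned FASTA
    return ids, has_gaps


def taxonomy_ids(taxonomy_text: str):
    """Collect IDs from the first tab-separated column of a taxonomy file."""
    return {
        line.split("\t")[0].strip()
        for line in taxonomy_text.splitlines()
        if line.strip()
    }


def check_reference_pair(fasta_text: str, taxonomy_text: str) -> dict:
    """Summarise how well a reference FASTA and a taxonomy file match."""
    ids, has_gaps = fasta_ids_and_gaps(fasta_text)
    tax = taxonomy_ids(taxonomy_text)
    return {
        "aligned": has_gaps,
        "missing_taxonomy": sorted(ids - tax),  # sequences with no taxonomy
        "unused_taxonomy": sorted(tax - ids),   # taxonomy with no sequence
    }
```

For a matching, unaligned pair one would expect `aligned` to be False and both lists to be empty. Reading the real files into strings with `open(path).read()` works for a quick check, although the full reference files are large.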

arlandan · September 29, 2020, 3:58pm

Thanks @anamarija for your post.

Hi @Mehrbod_Estaki,
thank you for sharing your protocol. It worked well for me. I have one more question. The OTU table that BugBase requires has to be picked against the Greengenes database (16S). The OTU table prepared your way contains the OTU ID (first column), the taxonomy (last column), and feature frequency data for each sample in the remaining columns. Do you think a biom (json) file with only the taxonomy column and the sample feature frequency data (like the following) would work? Thank you!

Best.

Mehrbod_Estaki · September 29, 2020, 7:22pm

Hi @arlandan,
I'm not sure to be honest. You could always try!
But basically, BugBase expects Greengenes taxonomic names in order to infer traits, and it specifically looks for the taxonomy column for that info. So if your biom table has that, I'm guessing it would be ok. Try it and if it doesn't work then we at least have a starting point for trouble-shooting.
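
As a starting point for that kind of troubleshooting, one can at least confirm that the JSON biom table carries a taxonomy entry for its rows. This is a minimal sketch with a hypothetical function name. It relies on the BIOM 1.0 JSON layout, where each entry in "rows" has an "id" and a "metadata" object, and assumes the taxonomy is stored under the "taxonomy" key. Whether BugBase accepts the table is a separate question.

```python
import json


def taxonomy_coverage(biom_json_text: str) -> tuple[int, int]:
    """Return (rows with a non-empty taxonomy entry, total rows)."""
    table = json.loads(biom_json_text)
    rows = table.get("rows", [])
    with_taxonomy = 0
    for row in rows:
        # "metadata" may be null in BIOM 1.0, so fall back to an empty dict.
        metadata = row.get("metadata") or {}
        if metadata.get("taxonomy"):
            with_taxonomy += 1
    return with_taxonomy, len(rows)
```

If the first number is zero, the table has no taxonomy metadata for BugBase to look up, whatever the other columns contain.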

Sorry I couldn't be of more help.

Mehrbod_Estaki · January 14, 2022, 8:40pm

An off-topic reply has been split into a new topic: Issues with the Qiime2BugBase worflow

Please keep replies on-topic in the future.

Mehrbod_Estaki · April 22, 2022, 5:51am

An off-topic reply has been split into a new topic: BugBase web not accepting Q2 sample metadata file

Please keep replies on-topic in the future.