I wanted to thank you for your tutorial (GitHub - mestaki/qiime2-to-BugBase), since I've been trying to use BugBase for some time.
I first tried this summer, following their documentation, without success.
Today I followed your tutorial to the letter. It generated all the files quickly and without problems, but when I upload them to the BugBase web app I still get this message: There was an error while parsing your data. Please ensure your OTU and mapping files follow the format explained in the documentation. OTU tables must be the json version of biom.
I'm at my wits' end.
I'm using qiime2-2019.4 on macOS Mojave.
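Since the error message insists on "the json version of biom", one quick sanity check is whether the uploaded file is JSON at all. The sketch below is only an illustration, not part of the tutorial or of BugBase: the function name biom_flavor is made up here, and it relies on two assumptions, namely that HDF5-based BIOM files begin with the standard 8-byte HDF5 signature and that a JSON BIOM table is a JSON object with "rows" and "columns" keys.

```python
import json

# Standard signature found in the first 8 bytes of an HDF5 file.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def biom_flavor(path):
    """Return 'hdf5', 'json', or 'unknown' for the file at `path` (hypothetical helper)."""
    with open(path, "rb") as handle:
        if handle.read(8) == HDF5_MAGIC:
            return "hdf5"
        handle.seek(0)
        try:
            table = json.loads(handle.read().decode("utf-8"))
        except ValueError:  # covers both bad UTF-8 and invalid JSON
            return "unknown"
    # Assumption: a JSON BIOM table is an object with "rows" and "columns".
    if isinstance(table, dict) and "rows" in table and "columns" in table:
        return "json"
    return "unknown"
```

If this reports "hdf5" for the file being uploaded, the table would still need converting to JSON (the biom command-line tool has a --to-json option for biom convert) before BugBase is likely to accept it.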
This is what my feature-table-tax-biom1.biom looks like in a text editor:
Hi @anamarija,
Glad that little tutorial is finally getting some use.
The first thing that sticks out to me is the name of the first column in your metadata, #SampleID. There is a note in my tutorial regarding this:
You will have to manually rename the first column to sample-id as per BugBase’s requirement.
Hi, thanks for the extra fast reply!
I changed it, but there seems to be a problem with the biom file, because when I upload it without the metadata I still get the same error. I also tried with your data from the tutorial, but I have the same issue.
Hi @anamarija,
Thanks for the update. What did you do different this time that it worked, albeit without the metadata?
It appears that the BugBase documentation has changed a bit since I made that tutorial. In fact, the first header is now required to be #SampleID, not sample-id as my tutorial says.
The fact that this works for you, but only without the metadata file, suggests that there is some special character or formatting issue in that file that does not meet their requirements. I would carefully check their requirements again. Take extra care looking for hidden characters or spaces if your metadata file was edited in something like Excel.
According to their documentation, the mapping file must:
Be a tab-delimited text file
Have sample IDs in the first column
Have column headers in the first row
Have #SampleID as the first header
Contain only letters, numbers, underscores and hyphens
Not contain spaces, commas or quotes
Never contain confidential information
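The mechanical parts of that list can be checked with a few lines of Python. This is a rough sketch rather than anything BugBase provides: the function name check_mapping_text is invented for illustration, and it assumes the character rule applies to every cell, with the leading # of #SampleID as the only exception. Their documentation may be more lenient than that.

```python
import re

# Assumption: every cell may contain only letters, numbers, underscores, hyphens.
ALLOWED = re.compile(r"[A-Za-z0-9_-]+")


def check_mapping_text(text):
    """Return a list of problems found in the text of a mapping file (hypothetical helper)."""
    problems = []
    if text.startswith("\ufeff"):
        # Hidden byte-order mark, often left behind by spreadsheet exports.
        problems.append("file starts with a byte-order mark")
        text = text[1:]
    lines = text.splitlines()
    if not lines:
        return ["file is empty"]
    header = lines[0].split("\t")
    if header[0] != "#SampleID":
        problems.append(f"first header is {header[0]!r}, expected '#SampleID'")
    for line_no, line in enumerate(lines, start=1):
        cells = line.split("\t")
        if len(cells) != len(header):
            problems.append(
                f"line {line_no}: {len(cells)} columns, header has {len(header)}"
            )
        for col_no, cell in enumerate(cells, start=1):
            if line_no == 1 and col_no == 1 and cell == "#SampleID":
                continue  # the required header is the one allowed use of '#'
            if not ALLOWED.fullmatch(cell):
                problems.append(
                    f"line {line_no}, column {col_no}: disallowed characters in {cell!r}"
                )
    return problems
```

Reading the mapping file as text and passing it to check_mapping_text should return an empty list for a file that meets the rules above; anything else points at the line and column to look at, which is handy for spotting stray spaces or quotes.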
I'll try and update my tutorial whenever I get some free time.
Hi, I did everything the same, which is what's confusing... The only difference is the file size, but that shouldn't be an issue (their documentation says: < 15 mb (web application only)). I also tried analyzing only half of my data, and the result is the same.
Thanks anyway for your time.
Aha! That would explain it then. Those must be the newer requirements added to BugBase. I'll make sure to account for those in the updated version of my tutorial.
I finally found out what the problem was (although it shouldn't be)!
So, the only difference between your data and mine was that I picked OTUs at the 99% identity level, and yours were at 97%. So I went back, picked them at 97%, and it worked perfectly!
I really don't know why, but it must have something to do with 99_otus.fasta or 99_taxonomy, because I tried it 5 times... Anyway, I hope this will be helpful to somebody.
Hi @anamarija,
I just went over the tutorial again and was successful on some new data, clustering at 99% identity as you did. The only change I had to make was to my mapping file (as mentioned above, sample-id -> #SampleID). Did you by chance use the 99_otus.fasta reference database with the 97% taxonomy? That would certainly throw an error; you have to use matching OTUs and taxonomy files. The % identity you use for clustering shouldn't cause any errors.
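For anyone wanting to rule out that kind of mismatch before running the whole pipeline, here is a minimal sketch that compares the IDs in a reference FASTA with the IDs in a taxonomy file. The function names are made up for illustration, and it assumes the taxonomy file is tab-separated with the OTU ID in the first column and that the FASTA headers start with that same ID.

```python
def fasta_ids(path):
    """Collect the first word of every '>' header line in a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def taxonomy_ids(path):
    """Collect the first tab-separated field of every non-comment line."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#"):
                ids.add(line.split("\t")[0])
    return ids


def compare_reference(fasta_path, taxonomy_path):
    """Count IDs that appear in only one of the two files (hypothetical helper)."""
    seqs = fasta_ids(fasta_path)
    tax = taxonomy_ids(taxonomy_path)
    return {
        "sequences_without_taxonomy": len(seqs - tax),
        "taxonomy_without_sequences": len(tax - seqs),
    }
```

A matching pair of files should give zero for both counts. Large non-zero counts would suggest the two files come from different identity levels.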
Hi @Mehrbod_Estaki, sorry for the delay in answering. I agree that it shouldn't cause errors, so I'll try 99% identity again soon to check whether the mistake is mine. I'll get back to you on that.
Hi, I just wanted to say that I've run 99% identity in BugBase and it worked. I don't know what the issue was before, but now it works. So everything you did in your tutorial works. Thanks again.
Thanks for the update @anamarija. I believe your issue was that in your first attempt you perhaps used the 99% OTUs file with the 97% taxonomy file. The mismatch there would certainly cause issues. You always want to make sure you use the corresponding OTUs and taxonomy files. Glad it's all cleared up.
Actually, I think the problem was that I accidentally used the 99% OTUs file from the rep_set_aligned folder... it has the same file name. When I did 97% I didn't make that mistake, so it worked.
Thanks again
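Because the aligned and unaligned reference files share a file name, a quick look at the sequences themselves can tell them apart. The sketch below is only a heuristic, with a made-up function name: it assumes that aligned sequences contain gap characters ("-" or ".") and that unaligned ones never do.

```python
def looks_aligned(path, max_records=100):
    """Return True if any of the first `max_records` sequences contains a gap character."""
    seen = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                seen += 1
                if seen > max_records:
                    break  # only sample the start of the file
            elif "-" in line or "." in line:
                return True  # gap characters suggest an alignment
    return False
```

If looks_aligned returns True for the file passed in as the reference OTUs, it is probably the copy from the rep_set_aligned folder rather than the unaligned one.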
Hi @Mehrbod_Estaki,
Thank you for sharing your protocol. It worked well for me. I have one more question. The OTU table that BugBase requires has to be picked against the Greengenes database (16S). The OTU table prepared your way contains the OTU IDs (first column) and taxonomy (last column), and the rest of the columns are feature frequency data for each sample. Do you think a biom (json) file with only the taxonomy column and the sample feature frequency data (like the following) would work? Thank you!
Hi @arlandan,
I'm not sure to be honest. You could always try!
But basically, BugBase expects Greengenes taxonomic names in order to infer traits, and it specifically looks for the taxonomy column for that info. So if your biom table has that, I'm guessing it would be OK. Try it, and if it doesn't work then we at least have a starting point for troubleshooting.
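As a starting point for that kind of troubleshooting, one can at least confirm that the JSON biom table carries a taxonomy entry for every row. This is a sketch under the assumption that the file follows the JSON BIOM layout, where each entry in "rows" has an "id" and a "metadata" object with a "taxonomy" key. The function name is invented here, and passing this check does not guarantee that BugBase will accept the table.

```python
import json


def rows_missing_taxonomy(path):
    """Return the IDs of rows in a JSON BIOM table that have no taxonomy metadata."""
    with open(path) as handle:
        table = json.load(handle)
    missing = []
    for row in table.get("rows", []):
        # "metadata" may be null in the JSON, so fall back to an empty dict.
        metadata = row.get("metadata") or {}
        if not metadata.get("taxonomy"):
            missing.append(row.get("id"))
    return missing
```

An empty list means every row has a taxonomy entry; any IDs returned are the rows worth inspecting first.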