Thanks for all the support we have received over the years. I have reached the stage where I need to upload my microbiota datasets (16S) to an online repository before submitting the manuscript. Based on your experience, what are the pros and cons of the available tools? I was told that SRA is tricky if we have many samples and need to split them into batches.
Thanks in advance
Marwa
Hi!
I only have experience with ENA, so I cannot say anything about other repositories. I just uploaded 892 paired samples, pair by pair, with the CLI and Python scripts. But I know that they also have a tool for Windows for bulk uploading.
Thanks for the details about SRA procedures, will look into it now.
How long might it take to get them public? I have over 100 samples for the first study and fewer than 50 for the other, and some samples have been used in both manuscripts. All samples were on one sequencing run.
Is it better to split them? Is it OK to have the same samples (18 samples) in both?
Each study has its own metadata, since we were answering a different question in each.
I would split them by papers and make them public before or after submitting a manuscript.
They are not, since it is just my own script that uploads samples one by one in a loop. I can share my notebook with the code I used if you are going to submit to ENA.
For that, you will need to:
1. Register a study (paper).
2. Register all samples from it (by uploading the corresponding table).
3. Get the accession numbers of the study and the samples.
4. Upload each sample using the accession numbers of the sample and the study.
Drop me a PM if you decide to upload there and I will send it.
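A minimal sketch of what such a per-sample loop could look like (this is not the notebook mentioned above). It assumes the upload is done with ENA's Webin-CLI, which takes one manifest file per paired-end sample. The function names, the accession numbers in the usage note, and the instrument and library values are placeholders to replace with your own.

```python
from pathlib import Path


def build_manifest(study, sample, run_name, fastq_r1, fastq_r2):
    """Return the text of one read-submission manifest for a paired-end sample.

    The field names follow the Webin-CLI reads manifest format; the
    instrument and library values are placeholders for a typical 16S
    amplicon run and must be adapted to the real experiment.
    """
    fields = [
        ("STUDY", study),            # study accession from step 3
        ("SAMPLE", sample),          # sample accession from step 3
        ("NAME", run_name),          # unique name for this submission
        ("INSTRUMENT", "Illumina MiSeq"),   # assumption, change as needed
        ("LIBRARY_SOURCE", "METAGENOMIC"),
        ("LIBRARY_SELECTION", "PCR"),
        ("LIBRARY_STRATEGY", "AMPLICON"),
        ("FASTQ", fastq_r1),
        ("FASTQ", fastq_r2),
    ]
    return "\n".join(f"{key}\t{value}" for key, value in fields) + "\n"


def write_manifests(study, samples, out_dir):
    """Write one manifest per sample and return the list of manifest paths.

    `samples` maps a sample accession to its (R1, R2) FASTQ file names.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for sample_acc, (r1, r2) in samples.items():
        path = out_dir / f"{sample_acc}.manifest.txt"
        path.write_text(build_manifest(study, sample_acc, f"{sample_acc}_16S", r1, r2))
        paths.append(path)
    return paths
```

For example, `write_manifests("PRJEB00000", {"ERS0000001": ("s1_R1.fastq.gz", "s1_R2.fastq.gz")}, "manifests")` writes a single manifest using made-up accessions. Each manifest would then be validated and submitted with the upload tool, one sample at a time.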
I recommend considering deposition into Qiita [ref]. It already houses hundreds of thousands of 16S and metagenomic samples, and provides an easy automated mechanism to deposit in ENA which satisfies journal data deposition requirements.
From your experience, how long does it take to get a project ID to share in my manuscript after submission, whether for ENA-EBI, SRA-NCBI, or Qiita? I submitted to ENA two weeks ago via the Windows file explorer option (see "Uploading Files To ENA" in the ENA Training Modules documentation) and got no information or update when I contacted the ENA team.
I could retract my files and look for another repository if it is faster. Please let me know how long it takes to get the ID (which I will add to my manuscript) after sending your files.
In ENA, one usually registers the project first (the project ID is issued immediately) and then uploads the sequences to that project. If you have uploaded your sequences, then I guess you should already have a project ID.
Thanks for all of your responses. I am not sure, but perhaps the way I submitted my sequences needs a follow-up from the ENA side to create a project ID. I created a BioProject ID instead to add to my manuscript, and I can upload the datasets later today. @colinbrislawn
I tried Qiita but I couldn't manage it. ENA didn't respond for a long time.
The one that responded, once I created my BioProject, was NCBI.
I am now trying to do the final step (BioSample attributes upload). I used the template file and filled it in with my metadata, but I am getting an error!
Have you encountered this error before? I emailed NCBI help and hope they will be able to help.
I have a unique ID for each sample, but some samples are replicates, so some rows will naturally be identical in the other columns.
Attached are a screenshot of the error and my biosample_attribute file (I tried two versions, with and without "1" in the name of the Excel file).
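The error text itself is not reproduced here, so the following is only a guess at the cause. BioSample generally expects every sample to differ from the others in at least one attribute besides its name, title, and description, and replicate rows that are otherwise identical can trip that check. A minimal sketch for finding such rows before uploading, assuming the sheet has been read into a list of dictionaries keyed by column name; the function name and the example columns are hypothetical.

```python
from collections import defaultdict

# Columns assumed not to count towards telling samples apart.
IGNORED = ("sample_name", "sample_title", "description")


def find_indistinguishable(rows, ignored=IGNORED):
    """Group sample names whose remaining attributes are all identical.

    `rows` is a list of dicts, one per sample. A leading '*' on a column
    name (used by some templates to mark mandatory fields) is stripped.
    Returns a list of groups, each holding two or more sample names.
    """
    groups = defaultdict(list)
    for row in rows:
        clean = {key.lstrip("*"): (value or "").strip() for key, value in row.items()}
        signature = tuple(sorted(
            (key, value) for key, value in clean.items() if key not in ignored
        ))
        groups[signature].append(clean["sample_name"])
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    example = [
        {"*sample_name": "S1", "host": "mouse", "isolation_source": "feces"},
        {"*sample_name": "S2", "host": "mouse", "isolation_source": "feces"},
        {"*sample_name": "S3", "host": "mouse", "isolation_source": "cecum"},
    ]
    for names in find_indistinguishable(example):
        print("Identical attributes:", ", ".join(names))
```

If replicates turn up in the same group, adding a column that tells them apart (for example a replicate number) is one way to make each row distinct.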