NCBI COI sequences download

Hi All!
I have COI metabarcoding data and I want to train a QIIME 2 classifier on NCBI COI sequences.
I tried RESCRIPt in qiime2-amplicon-2024.2, but it took too long (even when running on weekends or between 9 pm and 3 am ET). Because my internet connection is very slow, the process kept ending prematurely after running for hours, with nothing downloaded at all.
I also tried Entrez EDirect, but ran into the same issue.
My questions:

  • Is there an easy way to download an NCBI COI database (nucleotide sequences and taxonomy)?
  • Is there anywhere I can get a ready-to-use, up-to-date COI classifier for QIIME 2? BOLD or NCBI would be much appreciated, or both!

Thanks so very much in advance!

Hi @Khaled,
There are no ready-to-go classifiers, but I would recommend this resource: COI Workflow Parameter Considerations - #8 by SoilRotifer. It talks about chunking your data and merging the pieces to accommodate internet issues and large datasets!
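
A minimal sketch of the chunk-and-resume idea in plain Python, in case it helps on a slow connection. It is not the linked post's exact commands. It assumes you fetch FASTA records directly from the NCBI E-utilities `efetch` endpoint using a `WebEnv`/`query_key` pair obtained from an earlier `esearch` call with `usehistory=y`. The function names, chunk size, and retry settings are all illustrative, and a history session may expire on NCBI's side, so a long-interrupted run may need a fresh search.

```python
import http.client
import os
import time
import urllib.parse
import urllib.request

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def chunk_ranges(total, size):
    """Return (start, count) pairs covering `total` records in blocks of `size`."""
    return [(start, min(size, total - start)) for start in range(0, total, size)]


def fetch_chunk(web_env, query_key, start, count, timeout=120):
    """Fetch one block of FASTA records from a stored esearch result set.

    Assumes `web_env` and `query_key` came from esearch with usehistory=y.
    """
    params = urllib.parse.urlencode({
        "db": "nuccore", "WebEnv": web_env, "query_key": query_key,
        "retstart": start, "retmax": count,
        "rettype": "fasta", "retmode": "text",
    })
    with urllib.request.urlopen(f"{EFETCH}?{params}", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def download_resumable(total, size, out_dir, fetch, retries=5, wait=10):
    """Save each chunk to its own file and skip chunks saved by an earlier run.

    `fetch` is any callable taking (start, count) and returning FASTA text.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for start, count in chunk_ranges(total, size):
        path = os.path.join(out_dir, f"chunk_{start:09d}.fasta")
        paths.append(path)
        if os.path.exists(path):
            continue  # already downloaded; nothing to redo after a dropout
        for attempt in range(1, retries + 1):
            try:
                text = fetch(start, count)
                break
            except (OSError, http.client.HTTPException):
                if attempt == retries:
                    raise  # give up on this run; rerunning resumes from here
                time.sleep(wait * attempt)  # back off a little more each time
        tmp = path + ".part"
        with open(tmp, "w") as handle:
            handle.write(text)
        os.replace(tmp, path)  # only complete chunks receive the final name
    return paths
```

A call might look like `download_resumable(total, 500, "coi_chunks", lambda s, c: fetch_chunk(web_env, query_key, s, c))`, where `total`, `web_env`, and `query_key` come from your own `esearch` request. If the connection drops, rerunning the same call only fetches the chunks that are still missing, and the chunk files can be concatenated afterwards.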
Hope that helps!


Thank you so much, Chloe, for the answer. I tried the instructions in that thread. Great indeed! But the download still takes an extremely long time, and it usually does not continue after a connection interruption.
How long does this process usually take? I mean downloading these files from NCBI with RESCRIPt, say for metazoans alone?

Hi @Khaled,

Unfortunately, it varies depending on a lot of factors, so I am unable to tell you how long I would expect this to take.
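
One way to get a rough feel for the size of the job before starting is to ask NCBI how many records a query matches, then multiply by a per-chunk time you measure on your own connection. The sketch below assumes the E-utilities `esearch` endpoint with `retmax=0` and `retmode=json`. The function names are hypothetical, and the search term you pass in is your own choice.

```python
import json
import math
import urllib.parse
import urllib.request

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def count_url(term, db="nuccore"):
    """Build an esearch URL that returns only the hit count (retmax=0)."""
    params = urllib.parse.urlencode(
        {"db": db, "term": term, "retmax": 0, "retmode": "json"})
    return f"{ESEARCH}?{params}"


def parse_count(json_text):
    """Extract the number of matching records from an esearch JSON reply."""
    return int(json.loads(json_text)["esearchresult"]["count"])


def fetch_count(term, timeout=60):
    """Ask NCBI how many records match `term` (one small request)."""
    with urllib.request.urlopen(count_url(term), timeout=timeout) as resp:
        return parse_count(resp.read().decode("utf-8"))


def estimate_hours(n_records, chunk_size, seconds_per_chunk):
    """Back-of-envelope estimate: number of chunks times seconds per chunk."""
    n_chunks = math.ceil(n_records / chunk_size)
    return n_chunks * seconds_per_chunk / 3600
```

With purely illustrative numbers (not measurements), one million records in 500-record chunks at 5 seconds per chunk comes to `estimate_hours(1_000_000, 500, 5)`, or about 2.8 hours. Timing a handful of chunks on your own connection gives a more realistic `seconds_per_chunk`.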

Sorry to not have better news!