can BOLD references be used on invertebrates

Hi, Devon:

I have a naive question. Can I use your bold_anml_seqs.qza and bold_anml_taxa.qza directly for invertebrates? I assume that these two are already filtered and rescripted.

Thanks,

Jin

Hello! Can you please add this file (bold_anml_seqs.qza), or direct me to it in some way? Thank you!

Hi @jlli2000, AFAIK you should be able to use these for invertebrates. I would try classifying against the database and confirm. You can also try the approach outlined here to make your own reference database too. There are quite a few options for CO1 databases these days. :slight_smile:
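As a rough sketch of the "classify and confirm" step: once you have classified your reads and exported the resulting taxonomy table to TSV, you could tally the phylum-level labels to see whether your invertebrate groups actually show up. The column names (`Feature ID`, `Taxon`, `Confidence`) and the `p__` rank prefix below are assumptions about how the exported table and this database's taxonomy strings look, so adjust them to match your files. The helper name `phylum_counts` and the example rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def phylum_counts(taxonomy_tsv_text, phylum_prefix="p__"):
    """Count phylum-level labels in a taxonomy table given as TSV text.

    Assumes a header row with a 'Taxon' column and semicolon-separated
    ranks such as 'k__Animalia;p__Mollusca;...' (an assumed format).
    """
    reader = csv.DictReader(io.StringIO(taxonomy_tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        ranks = [rank.strip() for rank in row["Taxon"].split(";")]
        # Take the first rank carrying the phylum prefix and a non-empty name.
        phylum = next(
            (
                rank[len(phylum_prefix):]
                for rank in ranks
                if rank.startswith(phylum_prefix) and len(rank) > len(phylum_prefix)
            ),
            "Unassigned",
        )
        counts[phylum] += 1
    return counts


# Made-up rows, only to show the shape of the input.
example = (
    "Feature ID\tTaxon\tConfidence\n"
    "asv1\tk__Animalia;p__Mollusca;c__Gastropoda\t0.98\n"
    "asv2\tk__Animalia;p__Arthropoda;c__Insecta\t0.99\n"
    "asv3\tk__Animalia;p__Mollusca;c__Bivalvia\t0.91\n"
    "asv4\tUnassigned\t0.40\n"
)

for phylum, n in phylum_counts(example).most_common():
    print(phylum, n)
```

If most of your features come back unassigned at the phylum level, that would be a hint that the reference does not cover your taxa well.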

Hi @Francesco_Frisenna,
You can access the files from here.

For invertebrates? Yes, the sequences in the bold_anml_seqs files were built from a large collection of publicly available BOLD COI sequences classified as arthropods, chordates, and others. Admittedly, not all invertebrates are arthropods! Nevertheless, I'm guessing that I covered as many of the invert bases as was possible in BOLD at that time.

At the time, I split the work across three very similar R scripts to grab all this information. The three batches were:

  1. All COI BOLD arthropods
  2. All COI BOLD chordates
  3. Everything else that seemed to be useful (for my purposes) as an animal.

What were those "others"? These were the names I drew from in BOLD, and they would be in those bold_anml_seqs files:
'Acanthocephala', 'Acoelomorpha', 'Annelida', 'Brachiopoda',
'Bryozoa', 'Chaetognatha', 'Cnidaria', 'Ctenophora', 'Cycliophora',
'Echinodermata', 'Entoprocta', 'Gastrotricha', 'Gnathostomulida', 'Hemichordata',
'Kinorhyncha', 'Mollusca', 'Nematoda', 'Nematomorpha', 'Nemertea', 'Onychophora',
'Phoronida', 'Placozoa', 'Platyhelminthes', 'Porifera', 'Priapulida', 'Rhombozoa',
'Rotifera', 'Sipuncula', 'Tardigrada', 'Xenacoelomorpha'
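If you want a quick way to check whether a phylum you care about falls into one of these three batches, something like the following Python sketch would do it. This is not taken from the R scripts; the function name `batch_for` and the batch labels are hypothetical, and it only looks at the phylum name, so it says nothing about how many sequences BOLD actually held for that group.

```python
# Phylum names pulled for the "others" batch, copied from the list above.
OTHER_PHYLA = frozenset({
    "Acanthocephala", "Acoelomorpha", "Annelida", "Brachiopoda",
    "Bryozoa", "Chaetognatha", "Cnidaria", "Ctenophora", "Cycliophora",
    "Echinodermata", "Entoprocta", "Gastrotricha", "Gnathostomulida",
    "Hemichordata", "Kinorhyncha", "Mollusca", "Nematoda", "Nematomorpha",
    "Nemertea", "Onychophora", "Phoronida", "Placozoa", "Platyhelminthes",
    "Porifera", "Priapulida", "Rhombozoa", "Rotifera", "Sipuncula",
    "Tardigrada", "Xenacoelomorpha",
})


def batch_for(phylum):
    """Return which of the three batches a phylum name belongs to, or None.

    The labels 'arthropods', 'chordates', and 'others' are illustrative
    names for the three data pulls described above.
    """
    name = phylum.strip().capitalize()  # tolerate case and stray whitespace
    if name == "Arthropoda":
        return "arthropods"
    if name == "Chordata":
        return "chordates"
    if name in OTHER_PHYLA:
        return "others"
    return None


for query in ("Mollusca", "arthropoda", "Fungi"):
    print(query, "->", batch_for(query))
```

A `None` result just means the name was not among the groups requested from BOLD, which is worth knowing before you classify against the database.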

You can see a version of the R scripts I wrote to grab these three batches of data from the BOLD site here: tidybug/scripts/R_scripts at master · devonorourke/tidybug · GitHub. Check out the scripts titled bold_datapull_*.R

Hope this resolves your question.
