Hi, Devon:
I have a naive question. Can I use your bold_anml_seqs.qza and bold_anml_taxa.qza directly for invertebrates? I assume that these two are already filtered and rescripted.
Thanks,
Jin
Hello! Can you please add this file (bold_anml_seqs.qza) or direct me to it in some way? Thank you!
Hi @jlli2000, AFAIK you should be able to use these for invertebrates. I would try classifying against the database to confirm. You can also try the approach outlined here to make your own reference database. There are quite a few options for CO1 databases these days.
Hi @Francesco_Frisenna,
You can access the files from here.
For invertebrates? Yes, the sequences in the bold_anml_seqs files were built from a large collection of publicly available BOLD COI sequences classified to arthropods, chordates, and others. Admittedly, not all invertebrates are arthropods! Nevertheless, I'm guessing that I covered as many of the invert bases as was possible in BOLD at that time.
At the time, I split the work across three very similar R scripts to grab all this information. In the first two bunches were:
bold_anml_seqs files: 'Acanthocephala', 'Acoelomorpha', 'Annelida', 'Brachiopoda',
'Bryozoa', 'Chaetognatha', 'Cnidaria', 'Ctenophora', 'Cycliophora',
'Echinodermata', 'Entoprocta', 'Gastrotricha', 'Gnathostomulida', 'Hemichordata',
'Kinorhyncha', 'Mollusca', 'Nematoda', 'Nematomorpha', 'Nemertea', 'Onychophora',
'Phoronida', 'Placozoa', 'Platyhelminthes', 'Porifera', 'Priapulida', 'Rhombozoa',
'Rotifera', 'Sipuncula', 'Tardigrada', 'Xenacoelomorpha'
You can see a version of the R scripts I wrote to grab these three batches of data from the BOLD site here: tidybug/scripts/R_scripts at master · devonorourke/tidybug · GitHub. Check out the scripts titled bold_datapull_*.R
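If you don't use R, the batching idea (one request per phylum) can be roughly sketched in Python. This is only an illustration, not what the bold_datapull_*.R scripts do: the endpoint URL and its `taxon`/`format` query parameters are my assumption about the BOLD v4 public API, `build_bold_urls` is a made-up helper name, and the phylum list is just a small subset of the one above. The sketch only builds the request URLs; it does not download anything.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL of the BOLD v4 public "combined" (specimen + sequence)
# endpoint. Check the current BOLD API docs before relying on it.
BOLD_COMBINED_URL = "http://v4.boldsystems.org/index.php/API_Public/combined"

# A small subset of the phyla listed above, for illustration only.
PHYLA = ["Acanthocephala", "Annelida", "Mollusca", "Nematoda"]


def build_bold_urls(taxa, fmt="tsv"):
    """Hypothetical helper: map each taxon name to one request URL."""
    urls = {}
    for taxon in taxa:
        # ASSUMPTION: the endpoint accepts 'taxon' and 'format' parameters.
        query = urlencode({"taxon": taxon, "format": fmt})
        urls[taxon] = f"{BOLD_COMBINED_URL}?{query}"
    return urls


urls = build_bold_urls(PHYLA)
for taxon, url in urls.items():
    print(taxon, url)
```

Requesting one phylum at a time keeps each download small and makes it easy to retry a single failed batch.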
Hope this resolves your question.