How to manually import a BUSCO database

I’m running qiime2-moshpit-2025.7, installed as a conda env, on an isolated Red Hat Enterprise Linux system without internet access. I would like to generate a busco-db-bacteria.qza for MAG quality control, but using fetch as in the MOSHPIT tutorials isn’t an option:

mosh annotate fetch-busco-db \
  --p-lineages bacteria_odb12 \
  --o-db busco-db-bacteria.qza \
  --verbose

I have questions about manually importing the BUSCO database. Using another system that has an internet connection, I downloaded bacteria_odb12.2025-05-14.tar.gz from the "Index of /v5/data/lineages/" page.

I also downloaded the associated file_versions.tsv.

I extracted the tar.gz archive and copied the resulting bacteria_odb12 directory to the Linux system and added the .tsv. These are the files within that directory:

ancestral
ancestral_variants
dataset.cfg
file_versions.tsv
hmms
info
refseq_db.faa.gz
scores_cutoff

I tried manually importing using:

mosh tools import \
  --type ReferenceDB[BUSCO] \
  --input-path bacteria_odb12 \
  --output-path bacteria_odb12.qza \
  --input-format BuscoDatabaseDirFmt

But this error was returned:

There was a problem importing bacteria_odb12:

Unrecognized file (bacteria_odb12/ancestral) for BuscoDatabaseDirFmt.

It turns out that mosh tools import wasn’t recognizing any of the files.

Does this look like a simple mismatch regarding the type and input-format? Is there another way to manually import the database to generate the .qza?

Hi there!

After checking further with @misialq, we’d recommend avoiding manual import of the BUSCO directory. These lineage databases contain multiple internal files arranged in a very specific layout, and BuscoDatabaseDirFmt expects exactly the structure generated by mosh annotate fetch-busco-db. Even small differences in directory nesting or metadata can cause the import to fail, which is likely what’s happening here.

If possible, a safer approach would be:

  • Run mosh annotate fetch-busco-db --p-lineages bacteria_odb12 on a machine with internet access

  • Generate the busco-db-bacteria.qza artifact there

  • Transfer the resulting .qza file to your isolated RHEL system

Since you only need a single lineage, the artifact should be relatively small and easy to transfer.
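
In case it is useful, here is a rough shell sketch of those three steps. The fetch command is the one from your post. The function names and the checksum step are my own illustrative additions (not part of MOSHPIT), meant only to confirm the file arrived intact on the isolated system. How you move the files across (removable media, an internal file share, etc.) depends on your setup, so that part is left out.

```shell
#!/usr/bin/env bash
# Sketch of the "fetch elsewhere, then transfer" workflow.
# Function names and the checksum step are illustrative assumptions.

ARTIFACT="busco-db-bacteria.qza"

# Steps 1-2 (machine with internet access): fetch the lineage as a .qza.
fetch_db() {
    mosh annotate fetch-busco-db \
        --p-lineages bacteria_odb12 \
        --o-db "$ARTIFACT" \
        --verbose
}

# Record a SHA-256 checksum in <file>.sha256 next to the artifact.
write_checksum() {
    sha256sum "$1" > "$1.sha256"
}

# Step 3 (isolated system, after copying the .qza and .sha256 files over):
# returns non-zero if the artifact was altered or truncated in transit.
# Run it from the directory that holds both files, because the checksum
# file stores the path exactly as it was given to write_checksum.
verify_checksum() {
    sha256sum --check --quiet "$1.sha256"
}

# Connected machine:  fetch_db && write_checksum "$ARTIFACT"
# Isolated machine:   verify_checksum "$ARTIFACT"
```

After the check passes, something like mosh tools peek busco-db-bacteria.qza should report the artifact's type and format, assuming the standard QIIME 2 tools peek command is exposed through mosh in the same way as tools import.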

Hope that helps!

Cheers,

Paula

Thanks, Paula. I will use your suggested workaround.
