MaarjAM database

Hi QIIME2 community,

I’m currently working on the analysis of arbuscular mycorrhizal fungi (AMF) using the 18S rRNA gene region (primers AMV4.5NF and AMDGR). I have already completed denoising with DADA2 and obtained my ASV representative sequences. I would now like to assign taxonomy using the MaarjAM database, which is specific to AMF.

However, I’m having trouble locating a QIIME2-compatible version of the MaarjAM database. Specifically, I’m looking for:

  • A reference sequence FASTA file
  • A corresponding taxonomy mapping file
  • Or ideally, a pre-trained .qza classifier for use with qiime feature-classifier classify-sklearn

I’ve searched the MaarjAM website (http://maarjam.botany.ut.ee/) but couldn’t find clear download options for the full reference database in a format suitable for QIIME2.

Questions:

  1. Is there a QIIME2-compatible version of the MaarjAM database available for public use?
  2. Has anyone successfully trained a classifier using this database for AMF taxonomy assignment?
  3. If not, could someone kindly guide me through the process of converting the MaarjAM database into QIIME2 format?

Hi @Salma_Sarker

You can download the MaarjAM databases in QIIME2-compatible formats from here. After downloading, you'll need to import the sequences and taxonomy files into QIIME2, then train a classifier. Your commands will look something like this:

qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path maarjam_database_SSU_TYPE.qiime.fasta \
--output-path maarjam-seqs.qza

qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-format HeaderlessTSVTaxonomyFormat \
--input-path maarjam_database_SSU_TYPE.qiime.txt \
--output-path maarjam-taxonomy.qza

qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads maarjam-seqs.qza \
--i-reference-taxonomy maarjam-taxonomy.qza \
--o-classifier maarjam-classifier.qza

I tried using this database once for a test sequencing run, but ran into some issues with the primers, so I didn’t get very far. Still, I hope this is helpful for you!

all the best!

Hi @buzic

Thank you so much for your help.

I’ve followed your instructions and it seems everything worked well. I was able to assign taxonomy to about 28K features using the 18S rDNA QIIME release (2021) from MaarjAM. While no species-level assignments came through (only up to genus level), everything else looks perfect.

However, one thing that’s been bothering me is the speed—the entire taxonomy assignment step finished within a minute, which felt unusually fast. Below are the exact commands I used:

# Step 1: Import reference sequences and taxonomy
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path maarjam_database_SSU.qiime.fasta \
  --output-path maarjam_seqs.qza

qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-path maarjam_database_SSU.qiime.txt \
  --input-format HeaderlessTSVTaxonomyFormat \
  --output-path maarjam_taxonomy.qza

I then trained the classifier and used classify-sklearn to assign taxonomy.

Let me know if this quick run-time is expected or if I might have missed something.

This fast run time is possible if both the database and the test data are small or low in complexity, and this database is small! :pinching_hand:
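
If you want to confirm how small the reference is, counting the records in the downloaded FASTA is a quick check. Below is a minimal sketch using standard shell tools. The function names are hypothetical helpers, and it assumes the headerless ID<TAB>taxonomy layout implied by HeaderlessTSVTaxonomyFormat.

```shell
# Count FASTA records: every record starts with a '>' header line.
count_fasta_records() {
  grep -c '^>' "$1"
}

# Print reference IDs that have no row in the headerless, tab-separated taxonomy file.
# $1 = reference FASTA, $2 = taxonomy file (ID<TAB>taxonomy string)
ids_missing_taxonomy() {
  awk -F '\t' '
    FILENAME == ARGV[1] { seen[$1] = 1; next }   # first file given to awk: taxonomy IDs
    /^>/ {
      id = substr($0, 2)                         # drop the leading ">"
      sub(/[ \t].*$/, "", id)                    # keep only the ID token
      if (!(id in seen)) print id
    }
  ' "$2" "$1"
}

# Example calls (file names taken from the commands earlier in this thread):
#   count_fasta_records maarjam_database_SSU.qiime.fasta
#   ids_missing_taxonomy maarjam_database_SSU.qiime.fasta maarjam_database_SSU.qiime.txt
```

An empty result from the second helper means every reference sequence has a taxonomy entry, which rules out one common cause of a suspiciously quick run.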

Zooming out a little, every QIIME2 command should tell you if it fails. I also inspect the output files with view.qiime2.org to see more information about how the command ran.
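
As a rough sketch of that habit applied to the training and classification step: the helper below chains the commands with && so the run stops at the first failure, then tabulates the result into a .qzv that can be opened at view.qiime2.org. The function name and the file names (rep-seqs.qza, maarjam_classifier.qza, taxonomy.qza, taxonomy.qzv) are assumptions, so adjust them to your own paths.

```shell
# Hypothetical helper: train on the imported MaarjAM artifacts, classify the ASVs,
# and tabulate the taxonomy for viewing. Each step runs only if the previous one succeeded.
# $1 = ASV representative sequences artifact (file name is an assumption)
assign_maarjam_taxonomy() {
  rep_seqs="$1"

  qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads maarjam_seqs.qza \
    --i-reference-taxonomy maarjam_taxonomy.qza \
    --o-classifier maarjam_classifier.qza &&
  qiime feature-classifier classify-sklearn \
    --i-classifier maarjam_classifier.qza \
    --i-reads "$rep_seqs" \
    --o-classification taxonomy.qza &&
  qiime metadata tabulate \
    --m-input-file taxonomy.qza \
    --o-visualization taxonomy.qzv
}

# Example call:
#   assign_maarjam_taxonomy rep-seqs.qza || echo "a QIIME2 step failed" >&2
```

The resulting taxonomy.qzv shows the assigned taxon and confidence per feature, which is usually enough to tell a genuinely fast run from one that silently classified nothing.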