Suggestions for using nifH ARB database for taxonomy assignment in QIIME2

I am new to metagenomics data analysis.
Currently I am working on nifH amplicon data.
I have done quality filtering, read merging, denoising and ASV construction using the DADA2 denoising plugin in QIIME2.
Now I have to assign taxonomy to the ASVs. Currently there is no QIIME-compatible sequence and taxonomy reference database available for the nifH functional gene.

There's a nifH gene sequence database constructed by the jzehr lab, which is in .arb format. My question is how to use a database in ARB format to assign taxonomy. Is there any specific tool or method?

Has anyone used this database for taxonomic assignment?
Can anyone suggest or guide me on how to do taxonomic assignment for nifH amplicon data?
How can I construct a QIIME-compatible reference sequence and taxonomy database for the nifH gene?
Apologies if I have posted this in the wrong place. I have seen some existing topics related to this one, but couldn't find a specific solution. I am asking here to get suggestions and guidance from people who have already done this kind of work with nifH data.

Thanks in advance.

Hi @vkk_24!

Not one that we release on the QIIME 2 website, but others must be out there, since there have been a few forum users who report using nifH in QIIME 2.

I recommend contacting the makers of that database to ask them if they have any ideas. They would have the best advice on how to convert to fasta and extract taxonomy information. They might even be interested in making a Q2 compatible release…

Here is another forum topic that is pretty similar to yours and answers at least the second of these questions (and requires starting with fasta). It sounds like @EGvibrio was using a custom database. Nevertheless, perhaps @EGvibrio has some advice or has worked with the jzehr nifH database?:

I have not worked with nifH personally, but I know of some nifH mock communities on mockrobiota, where the contributor recommended another nifH database released by jzehr, so maybe this would be useful?:

Let’s see if @EGvibrio or others who have worked with nifH might have any advice!


Thanks @Nicholas_Bokulich,

Hi @vkk_24,

I was in the same boat as you. I have the fasta file, which you can use with QIIME2, and I can send it to you if needed. Because there was no taxonomy table, I built the table myself by BLASTing the sequences against NCBI. If you want, I can send you the code, or the qza file so you don’t have to bother with building the taxonomy yourself.
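
For anyone building the taxonomy table themselves, here is a minimal Python sketch (not the code offered above) of how a FASTA file and a set of lineages, e.g. taken from BLAST hits, could be turned into the headerless, two-column, tab-separated taxonomy file that QIIME 2 can import. The `k__`-style rank prefixes, the function names, and the `lineages` dictionary are illustrative assumptions, not part of any released nifH database.

```python
import io

# Rank prefixes in Greengenes style; this labelling scheme is an assumption.
RANKS = ["k", "p", "c", "o", "f", "g", "s"]


def format_lineage(names):
    """Join rank names into 'k__X; p__Y; ...', padding missing ranks with empty labels."""
    padded = list(names)[:len(RANKS)]
    padded += [""] * (len(RANKS) - len(padded))
    return "; ".join(f"{rank}__{name}" for rank, name in zip(RANKS, padded))


def read_fasta_ids(handle):
    """Yield the ID (first whitespace-delimited token after '>') of each FASTA record."""
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def write_taxonomy(fasta_handle, lineages, out_handle):
    """Write one 'ID<TAB>lineage' row per FASTA record; IDs without a lineage get 'Unassigned'."""
    for seq_id in read_fasta_ids(fasta_handle):
        names = lineages.get(seq_id)
        taxon = format_lineage(names) if names else "Unassigned"
        out_handle.write(f"{seq_id}\t{taxon}\n")


if __name__ == "__main__":
    # Toy input; real lineages would come from your own BLAST results.
    fasta = io.StringIO(">seq1 some description\nATGC\n>seq2\nGGCC\n")
    lineages = {"seq1": ["Bacteria", "Cyanobacteria"]}
    out = io.StringIO()
    write_taxonomy(fasta, lineages, out)
    print(out.getvalue(), end="")
```

A file in this layout should be importable with `qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat`, and the matching FASTA with `--type 'FeatureData[Sequence]'`. Both artifacts can then be used to train a classifier with `qiime feature-classifier fit-classifier-naive-bayes`.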


Hi @EGvibrio ,
Could you also send me a copy of the reference fasta file and the taxonomy table?
The code would be fine too. :joy:

Hi @EGvibrio
Could you also send me the qza file?
Thank you!

Hi, my name is Cinthya.
I am working on nifH amplicon data.
Now I have to assign taxonomy to the ASVs.
Could you also send me a copy of the reference fasta file and the taxonomy table, so that I can do the taxonomy assignment and follow the same format? Thank you very much!

Hi @cinthya_vieyra

Did you get the nifH reference sequence and taxonomy files? If not, I'll be happy to share them with you. I have created QIIME2 compatible sequence and taxonomy files.
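
For anyone assembling their own pair of files, it may be worth checking that the sequence IDs and taxonomy IDs match before importing. The following is a minimal Python sketch of such a check, assuming a plain FASTA file and a headerless two-column taxonomy TSV; the function names are hypothetical and this is not the procedure used to build the files mentioned above.

```python
import io


def fasta_ids(handle):
    """Collect sequence IDs (first token after '>') from a FASTA file handle."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids


def taxonomy_ids(handle):
    """Collect feature IDs from the first column of a headerless taxonomy TSV."""
    ids = set()
    for line in handle:
        line = line.rstrip("\n")
        if line:
            ids.add(line.split("\t")[0])
    return ids


def compare_ids(fasta_handle, taxonomy_handle):
    """Return (IDs missing from the taxonomy, IDs missing from the FASTA), both sorted."""
    seqs = fasta_ids(fasta_handle)
    taxa = taxonomy_ids(taxonomy_handle)
    return sorted(seqs - taxa), sorted(taxa - seqs)


if __name__ == "__main__":
    # Toy data: 'b' has no taxonomy row, 'c' has no sequence.
    fasta = io.StringIO(">a\nAT\n>b\nGC\n")
    taxonomy = io.StringIO("a\tk__Bacteria\nc\tk__Bacteria\n")
    no_taxonomy, no_sequence = compare_ids(fasta, taxonomy)
    print("Sequences without taxonomy:", no_taxonomy)
    print("Taxonomy rows without sequence:", no_sequence)
```

Both lists being empty means every reference sequence has exactly one matching ID in the taxonomy file and vice versa.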


Hi @vkk_24, I am working on nifH sequences now. Could you please also share the qza files? Many thanks!

Hi @vkk_24! I've encountered a similar problem with classifying nifH amplicon sequences. Would you be willing to share with me copies of the QIIME2 compatible files that you created? Or are they publicly available somewhere else? Thank you!