Amino acid sequence analyses


Using qiime2-2023.5, I ran the DADA2 denoising step on my FASTQ DNA seqs and got rep-seqs.qza and table.qza files. Now I am trying to convert the rep-seqs to amino acid (aa) seqs and use the aa seqs for further analyses (e.g., diversity and taxonomy). I successfully converted the DNA seqs to aa seqs in BioEdit, but I ran into a problem importing the aa seqs with the following command because the input file is not a DNAFASTAFormat file.

$ qiime tools import --input-path //rep-seqs-edit-aa.fasta --output-path //hgcA_rep-seqs-edit-aa.qza --type 'FeatureData[Sequence]'

Do you know how to import the aa seqs and use the imported file for further analyses in qiime2?


Hi @baehsung,

Thanks for reaching out! We don't currently support any analysis of protein sequences in QIIME 2, but we would love to hear exactly what you're trying to do in your analysis so that we can consider adding support in the future. For your awareness (and any developers on the forum who have interest in adding functionality in the future) - we do have the following semantic types in QIIME 2 (they just don't have any actions associated with them):


Cheers :lizard:


Thanks for the reply.

I'm analyzing DNA seqs from amplicons (~1.0 kbp) of a functional gene sequenced on the PacBio platform. As mentioned above, I started the analysis with the DADA2 denoising step and successfully finished the downstream analyses, obtaining several diversity and taxonomy results.

Unlike 16S, functional genes are more variable at the DNA level, so there is a growing trend toward using aa seqs to get more stable relationships between those genes. That is why I would like to use aa seqs, obtained by translating the rep-seqs from DADA2, in qiime2. I hope to hear about a qiime2 update that supports this soon.
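
In case it helps anyone who wants to script the translation step instead of using a GUI tool, here is a minimal Python sketch. It assumes the standard genetic code (NCBI translation table 1) and that every sequence is read from the same frame (frame 0 unless `frame` is set). The names `translate` and `translate_fasta` are made up for illustration and are not part of QIIME 2 or DADA2.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), laid out in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a nucleotide string; codons with ambiguous bases become 'X'."""
    seq = dna.upper().replace("U", "T")[frame:]
    # Walk the sequence codon by codon, ignoring a trailing partial codon.
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def translate_fasta(lines, frame=0):
    """Yield FASTA lines with each nucleotide record translated (hypothetical helper)."""
    header, chunks = None, []
    for line in list(lines) + [">"]:  # the sentinel flushes the last record
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header
                yield translate("".join(chunks), frame)
            header, chunks = line, []
        elif line:
            chunks.append(line)


if __name__ == "__main__":
    example = [">seq1", "ATGGCC", "TAA"]
    print("\n".join(translate_fasta(example)))
```

Amplicon reads do not necessarily start at a codon boundary, so the frame should be checked for each dataset; many internal stop codons (`*`) in the output usually mean the wrong frame or strand was used.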

Best regards,



Hi @baehsung,

Thanks for sharing those details! @SoilRotifer brought up a couple of options for further aa analysis that you may be interested in looking into:

  • q2-protein-pca provides phylogenetic analysis for protein sequences - it looks like it has been maintained at least up until a year ago, so it's worth testing out to see if this would be useful for you.
  • RESCRIPt is actively maintained and also provides some protein sequence analysis via the get-ncbi-data-protein action, so that would be worth looking into as well.

Hope this helps! Cheers :lizard:


Thanks for pointing these out. I'll look into them. Hee-Sung

