DADA2 per-sample FASTA output


I hope I am posting this in the right category:

I’ve written two Python 3 scripts.

Together, they convert a DADA2 feature table (one that uses the literal, unhashed ASV sequence as the feature ID) into “per biological sample” FASTA output.

The FASTA output is one multi-sequence file.
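For readers wondering what such a conversion might look like, here is a minimal Python sketch. It is not the two scripts described above, only an illustration under some assumptions: the feature table has been exported as tab-separated text with ASV sequences as row IDs and sample IDs as column headers, and each nonzero (sample, ASV) pair becomes one FASTA record. The function name `feature_table_to_fasta` and the header format `>sample_index;size=count` are hypothetical choices made for this example.

```python
import csv
import io


def feature_table_to_fasta(table_handle, fasta_handle):
    """Write one FASTA record per (sample, ASV) pair with a nonzero count.

    Assumes a tab-separated table: the first multi-column row is the header
    (ID column followed by sample IDs), and every later row starts with the
    literal ASV sequence followed by one count per sample.
    """
    # Rows with a single field (e.g. a one-line comment) are skipped.
    rows = (row for row in csv.reader(table_handle, delimiter="\t") if len(row) > 1)
    header = next(rows)
    sample_ids = header[1:]

    # Collect (sequence, count) pairs per sample so records are grouped by sample.
    per_sample = {sample_id: [] for sample_id in sample_ids}
    for row in rows:
        sequence = row[0]
        for sample_id, value in zip(sample_ids, row[1:]):
            count = int(float(value))  # tolerate counts written as "5.0"
            if count > 0:
                per_sample[sample_id].append((sequence, count))

    # Hypothetical header format: >sampleID_index;size=count
    written = 0
    for sample_id, records in per_sample.items():
        for index, (sequence, count) in enumerate(records, start=1):
            fasta_handle.write(f">{sample_id}_{index};size={count}\n{sequence}\n")
            written += 1
    return written


# Tiny made-up table, used only to show the shape of the input and output.
demo_table = io.StringIO(
    "#OTU ID\tsampleA\tsampleB\n"
    "ACGTACGT\t5\t0\n"
    "TTGCAAGC\t2\t7\n"
)
demo_fasta = io.StringIO()
n_records = feature_table_to_fasta(demo_table, demo_fasta)
print(demo_fasta.getvalue())
```

Writing each ASV once per sample with its count in the header keeps the file small. If a downstream tool expects one record per read instead, the inner loop could repeat each sequence `count` times.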

So far, it works on my dataset of about 200 samples, ~3.5-4 million reads, and ~4,000 unique ASVs.

Is this something useful for the QIIME2 community?

If so, I may consider either providing pseudocode, or actually debugging, automating, and merging the two scripts into a single user-friendly script.



Hi @Jasmine,
Just giving my 2 cents since nobody else has piped up…

A few forum users have asked about how to create outputs like this, so I expect that others will use and appreciate this if you share your script! I invite you to post any such contributions to the community contributions section of the forum.