Import --i-reference-taxonomy taxonomy.tsv to .qza

There are a couple of problems here. First, these forward and reverse sequence files can't actually be right: they are identical to one another. As a result, no sequences get merged, and you get an unintuitive error message, which should be improved on the plugin side.

The second problem is that the primers have not been removed: the V6 primer, which contains 3 ambiguous nucleotides (AAACTYAAAKGAATTGRCGG), is at the start of each read. Those ambiguous positions look like real variation to exact sequence variant methods like DADA2. Each of the 3 positions can be one of two bases, so every biological variant gets split 2 x 2 x 2 = 8 ways, which causes the long run times and the over-inflated diversity at the end.
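To see where the 8-way split comes from, here is a minimal Python sketch that expands the primer into every concrete sequence it can match. The `IUPAC` table and the `expand_primer` helper are my own illustration, not part of QIIME 2 or DADA2.

```python
from itertools import product

# Standard IUPAC nucleotide codes: each symbol maps to the bases it can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence a degenerate primer can represent."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


primer = "AAACTYAAAKGAATTGRCGG"
variants = expand_primer(primer)

# Y, K and R each stand for two bases, so there are 2 * 2 * 2 combinations.
print(len(primer), len(variants))  # prints: 20 8
for variant in variants:
    print(variant)
```

Any read that starts with one of these eight strings carries the same primer, but an exact-variant method has no way to know that and treats the differences as biological.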

So two things need to happen: (1) find the right reverse files, or use the forward reads alone, and (2) remove the primers, either before processing or, if using dada2, with the --p-trim-left [FWD_PRIMER_LENGTH] argument.
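If it helps, both points can be checked with a few lines of Python before re-running anything. This is only a sketch: `read_digest` and `reads_are_identical` are hypothetical helper names, and the file names in the usage comment are placeholders for your own files. It assumes the reads are plain or gzip-compressed FASTQ, and that the primer sits at the very start of every read, so trimming a fixed length removes it.

```python
import gzip
import hashlib


def read_digest(path):
    """SHA-256 of a file's content, decompressing first if the name ends in .gz."""
    opener = gzip.open if str(path).endswith(".gz") else open
    digest = hashlib.sha256()
    with opener(path, "rb") as handle:
        # Read in 1 MiB chunks so large FASTQ files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reads_are_identical(forward_path, reverse_path):
    """True if the forward and reverse files hold exactly the same bytes."""
    return read_digest(forward_path) == read_digest(reverse_path)


# The forward primer quoted above; its length is the value for --p-trim-left.
FWD_PRIMER = "AAACTYAAAKGAATTGRCGG"
FWD_PRIMER_LENGTH = len(FWD_PRIMER)
print(FWD_PRIMER_LENGTH)  # prints: 20

# Usage with placeholder file names (replace with your own):
# reads_are_identical("sample_R1.fastq.gz", "sample_R2.fastq.gz")
```

If the check returns True for your forward/reverse pair, the reverse file is just a copy of the forward one and cannot be used for merging.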
