Possible Analysis Pipeline for Ion Torrent 16S Metagenomics Kit Data in QIIME2?

Unclassified taxa can occur for a few reasons with the vsearch-based classifier:

  1. There really is no good match.
    1. This seems unlikely if the reference is region-specific, but it could occur if the reference sequences cover a jumble of different regions (e.g., different primers were used to sequence those reference sequences). It looks like you are using Greengenes? If so, that is probably not the case either.
    2. This can occur if --p-perc-identity is set too high, but the default is really low, so that's probably not the issue.
    3. This can also occur if reads are in the wrong orientation, but the default is to check both orientations, so that is not the issue here.
  2. The hits disagree at the basal taxonomic level (kingdom in the case of Greengenes), so no single label reaches the consensus threshold. In theory you may be hitting both bacteria and archaea, leading to "unclassified" (because reliable classification cannot be achieved at even the basal level).
    1. Since your query sequences (and possibly also the reference sequences) are in mixed orientations and you are searching in both directions, you could be picking up hits in both directions (only one of which would be valid!). I recommend increasing --p-perc-identity a little to see what happens. Also test the max-hits and top-hits-only settings to see how they affect your results. If you suddenly get much better results (without setting max-hits or max-accepts to 1!!!) then you've struck gold.
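
To make the second case concrete, here is a minimal Python sketch of consensus assignment over a set of hits. It is a simplified illustration, not the actual q2-feature-classifier code: the function name `consensus_taxonomy`, the 0.51 default threshold, and the "Unassigned" label are assumptions chosen for the example. It shows how an even split between bacterial and archaeal hits leaves nothing above the threshold at the kingdom level.

```python
from collections import Counter


def consensus_taxonomy(hits, min_consensus=0.51, unassignable_label="Unassigned"):
    """Return the deepest lineage prefix shared by at least `min_consensus` of hits.

    Hypothetical helper for illustration only; `min_consensus` is assumed
    to be greater than 0.5 so that at most one prefix can pass at each rank.
    """
    if not hits:
        return unassignable_label

    # Split each "k__X; p__Y; ..." string into a list of rank labels.
    lineages = [[rank.strip() for rank in hit.split(";")] for hit in hits]
    max_depth = max(len(lineage) for lineage in lineages)

    consensus = []
    for depth in range(max_depth):
        # Count full prefixes down to this rank; shorter lineages are skipped
        # here but still count in the denominator below.
        prefixes = Counter(
            tuple(lineage[: depth + 1]) for lineage in lineages if len(lineage) > depth
        )
        prefix, count = prefixes.most_common(1)[0]
        if count / len(lineages) < min_consensus:
            break  # no label is supported by enough hits at this rank
        consensus = list(prefix)

    return "; ".join(consensus) if consensus else unassignable_label


# One bacterial and one archaeal hit: 50% each, below the threshold at kingdom.
mixed = ["k__Bacteria; p__Firmicutes", "k__Archaea; p__Euryarchaeota"]
print(consensus_taxonomy(mixed))  # Unassigned

# Hits that agree at kingdom and mostly at phylum, but not at class.
bacterial = [
    "k__Bacteria; p__Firmicutes; c__Bacilli",
    "k__Bacteria; p__Firmicutes; c__Clostridia",
    "k__Bacteria; p__Proteobacteria",
]
print(consensus_taxonomy(bacterial))  # k__Bacteria; p__Firmicutes
```

In this toy model, a single spurious reverse-orientation hit to the wrong kingdom is enough to block classification when only two hits are retained, which is why tightening the identity threshold or restricting to top hits can change the outcome.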

Could you clarify? Are you using extract-reads on the reference sequences? If so, don't use trunc-len on those reads: since your query seqs are in mixed orientations, they can hit at either end of that read.

Yes, that works too, but based on @thermokarst's notes, it looks like the biom tables you have are already collapsed on taxonomy, so they may not be enormously helpful.

:partying_face:
Let's circle back to this — I think the prospect of building a community tutorial (especially one written by different research groups who are bringing different perspectives to the table) is really exciting.

Good luck to you both! Let us know if you have any more questions.
