Taxonomy Analysis - reference data and training a classifier


I would like to conduct a taxonomy analysis of 44 human stool samples. I followed the Moving Pictures tutorial with the Greengenes data and rep-seqs.qza, then imported my own mapping file and filtered-table.qza, and got a very long list of errors.

Then I saw that we should not use the Greengenes data and rep-seqs.qza for real samples. I am confused about which reference file I need to use, and whether I need to train a classifier for my dataset.

Thank you!

Hi @ihl216,

Seeing the actual error would be helpful, since there are dozens of things that could go wrong, and likely hundreds more we’ve never even seen before. If you could provide the log file generated alongside the error (or re-run with --verbose and paste the full output), that would be great.
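In case it helps with collecting that output, here is a minimal Python sketch that re-runs a command with `--verbose` appended and saves everything it prints (stdout and stderr together) to a text file you can paste from. The helper names `with_verbose` and `run_and_log` and the log file name are made up for illustration; only the `--verbose` flag comes from the advice above.

```python
import subprocess


def with_verbose(cmd):
    """Return a copy of the command with --verbose appended if it is missing."""
    return list(cmd) if "--verbose" in cmd else list(cmd) + ["--verbose"]


def run_and_log(cmd, log_path):
    """Run cmd, write its combined stdout/stderr to log_path, return the exit code."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so the log keeps the original order
        text=True,
    )
    with open(log_path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode


# Example (hypothetical): wrap whichever qiime command failed for you, e.g.
#   failing_cmd = ["qiime", ...]   # your original command, split into arguments
#   run_and_log(with_verbose(failing_cmd), "qiime-error.log")
```

A non-zero return value means the command failed; the log file then holds the full traceback to post.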

It is not true at all that you shouldn’t use Greengenes for real samples. It may be the case that you will get better resolution by tuning your classifier to your primer set, but we’d need to know what you are amplifying first. Where did you see this recommendation? It may be talking about something else.

Did you see that warning in this tutorial? If so, then yes, you should use a more complete reference dataset. However, training your own classifier may be unnecessary if you are using 16S V4 primers, as we’ve already trained classifiers for that particular amplicon (you can use Silva as well).
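If you are amplifying a different region, the usual route with the `feature-classifier` plugin is: extract the amplicon from the reference sequences with your primers, fit a naive Bayes classifier, then classify your own representative sequences. Below is a rough sketch that only assembles and prints those three commands without running them. All file names and the primer strings are placeholders (assumptions, not values from this thread), and the function name is made up; substitute your own reference files, primers, and the rep-seqs.qza produced from your own samples.

```python
import shlex


def classifier_training_commands(ref_seqs, ref_taxonomy, f_primer, r_primer, rep_seqs):
    """Return three QIIME 2 CLI commands as argument lists; nothing is executed."""
    # Step 1: trim the reference sequences to the region your primers amplify.
    extract = [
        "qiime", "feature-classifier", "extract-reads",
        "--i-sequences", ref_seqs,
        "--p-f-primer", f_primer,
        "--p-r-primer", r_primer,
        "--o-reads", "ref-seqs-trimmed.qza",  # placeholder output name
    ]
    # Step 2: train a naive Bayes classifier on the trimmed reference reads.
    fit = [
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", "ref-seqs-trimmed.qza",
        "--i-reference-taxonomy", ref_taxonomy,
        "--o-classifier", "classifier.qza",  # placeholder output name
    ]
    # Step 3: classify your OWN representative sequences (not the tutorial's).
    classify = [
        "qiime", "feature-classifier", "classify-sklearn",
        "--i-classifier", "classifier.qza",
        "--i-reads", rep_seqs,
        "--o-classification", "taxonomy.qza",  # placeholder output name
    ]
    return [extract, fit, classify]


# Placeholder inputs: replace every value with your own files and primers.
for cmd in classifier_training_commands(
    "ref-seqs.qza", "ref-taxonomy.qza",
    "FORWARD_PRIMER", "REVERSE_PRIMER",
    "rep-seqs.qza",
):
    print(shlex.join(cmd))
```

If you are using V4 primers with one of the pre-trained classifiers, only the third command should be needed, with `--i-classifier` pointing at the downloaded classifier file.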
