Taxonomic classifications (soil microbiome tutorial)

Hi,

I am working through the Atacama soil microbiome tutorial. It has all worked well and I have been able to generate good data plots etc.
I am now at the final stage, assigning the taxonomy and creating the interactive bar plots. There were no instructions about which classifier to use or which primers had been used in the sequencing. I therefore used the same parameters as in the ‘Moving Pictures’ tutorial; however, all that is picked out in the taxonomy plots is ‘bacteria’, with no additional detail. Should I be using a different classifier? Is there somewhere I can find the primer pairs used?
Please help!

Hi @lec49,

According to the original study, the same primer set (515f/806r) was used, so the same classifier used for the Moving Pictures tutorial is appropriate.

The problem you describe usually occurs when the wrong classifier or input data are used. For example, you can find several threads on this forum where users have a classifier for the wrong primer region, high levels of non-target contaminants, or low-quality input sequences. None of these should be the case for these tutorial data. A few thoughts/suggestions:

  1. What version of QIIME 2 are you running? Make sure you are running the latest version of QIIME 2 and of the classifier. The underlying packages that QIIME 2 uses for some analyses are constantly being updated, so if, for example, you are using an out-of-date version of QIIME 2 or an old classifier, you will get a warning that the scikit-learn versions do not match (which probably would not cause this particular issue, but might).
  2. Since you have tabulated your seqs according to the tutorial instructions, you could try BLASTing a couple of these (just click on the seqs in the visualizer to link to NCBI BLAST) to do a quick check and compare to the tutorial results just to make sure the sequences are hitting 16S rRNA. Since you are following the tutorial step-by-step there should not be any problems, but you never know...

If all else fails, please post the exact commands that you are using, any warnings/errors you receive, and the query sequences that you are using as input.

I hope that helps!

Hi Nicholas_Bokulich

Thanks for your reply.

I have checked that QIIME 2 and the classifier are the up-to-date versions, and no updates were needed.
I checked a few of the sequences from the rep-seqs.tsv file against BLAST and they were almost all detected as ‘16S rRNA from uncultured bacteria’. Does this suggest that it is the 16S region that has been amplified and sequenced, but that no classified bacteria are being detected?

Surely this shouldn’t be happening with a tutorial data set?

No, that's normal and just a characteristic of the nucleotide reference database that NCBI BLASTn uses by default. There are lots of "uncultured" sequences in there, so hitting "uncultured" is almost inevitable (even with tutorial data, or even known reference sequences!).

There is an option to exclude "uncultured" when performing the BLAST search, but that's not important here. The important point is that your sequences are matching 16S sequences (even if they are "uncultured"), so the query sequences that you are using are not the problem (and they should not be).

We're moving down the checklist:

  1. QIIME 2/dependencies are the correct versions
  2. Query sequences look good
  3. Pre-trained 515f/806r classifiers on the QIIME 2 website should work fine for the Atacama data

This all indicates that the data you are working with should work just fine. Please double-check that you are using the correct file paths for the correct classifier (download a new one just to be sure) and for the correct query sequences. You could try a different classification method (e.g., classify-consensus-blast) but this classifier really should work just fine with the Atacama data, so at this point I suspect the input files must have been mixed up at some point.
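One way to rule out mixed-up files is to look at the semantic type recorded inside each artifact. `qiime tools peek` reports this from the command line; the Python sketch below does a similar check by hand, assuming that a .qza file is a zip archive with a top-level `<uuid>/metadata.yaml` that contains a `type:` line. The function names `peek_semantic_type` and `check_inputs` are made up for this illustration and are not part of QIIME 2.

```python
import zipfile


def peek_semantic_type(path):
    """Return the 'type:' value from a QIIME 2 archive's top-level metadata.yaml."""
    with zipfile.ZipFile(path) as archive:
        # Assumption: the top-level metadata file lives at <uuid>/metadata.yaml.
        # Deeper metadata.yaml files (e.g. under provenance/) are skipped.
        candidates = [
            name for name in archive.namelist()
            if name.count("/") == 1 and name.endswith("/metadata.yaml")
        ]
        if not candidates:
            raise ValueError(f"{path} does not look like a QIIME 2 archive")
        text = archive.read(candidates[0]).decode("utf-8")
    for line in text.splitlines():
        if line.startswith("type:"):
            # Drop the key and any surrounding quotes.
            return line.split(":", 1)[1].strip().strip("'\"")
    raise ValueError(f"no 'type' entry found in {candidates[0]}")


def check_inputs(classifier_path, reads_path):
    """Report whether the two files have the types classify-sklearn expects."""
    classifier_type = peek_semantic_type(classifier_path)
    reads_type = peek_semantic_type(reads_path)
    return {
        "classifier_ok": classifier_type == "TaxonomicClassifier",
        "reads_ok": reads_type == "FeatureData[Sequence]",
        "classifier_type": classifier_type,
        "reads_type": reads_type,
    }
```

For example, `check_inputs("classifier.qza", "rep-seqs.qza")` (the file names are placeholders for your own paths) should report both checks as true if the two files are what the classification step expects. If either is false, the reported type shows which file was swapped in by mistake.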

So please give it another try and:

Could you please share your taxonomy plots here?
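
If it is easier than posting screenshots, a tally of how deep each assignment goes can show whether the taxonomy itself stops at kingdom level. The sketch below assumes the taxonomy has been exported to a tab-separated file with `Feature ID`, `Taxon` and `Confidence` columns (for example with `qiime tools export`), and that ranks are separated by semicolons with Greengenes-style prefixes such as `k__` and `p__`. The function names and the feature IDs in the example are invented for illustration.

```python
import csv
import io
from collections import Counter


def assignment_depth(taxon):
    """Count how many leading ranks in a taxonomy string carry an actual name."""
    if taxon.strip() == "Unassigned":
        return 0
    depth = 0
    for rank in taxon.split(";"):
        rank = rank.strip()
        # Placeholders such as "g__" have a prefix but no name after it.
        name = rank.split("__", 1)[1] if "__" in rank else rank
        if not name:
            break
        depth += 1
    return depth


def tally_depths(tsv_text):
    """Map assignment depth (1 = kingdom only) to the number of features."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(assignment_depth(row["Taxon"]) for row in reader)


# Tiny made-up example; real input would be the contents of the exported file.
example = (
    "Feature ID\tTaxon\tConfidence\n"
    "made-up-1\tk__Bacteria\t0.99\n"
    "made-up-2\tk__Bacteria; p__Proteobacteria; c__Gammaproteobacteria\t0.87\n"
)
print(sorted(tally_depths(example).items()))
```

If nearly every feature lands at depth 1, the classification step is where things go wrong. If most features reach depth 5 or more, the taxonomy is fine and the issue is more likely in how the bar plot is being built or viewed.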
