Based on taxa-bar-plots.qzv, I see that many of my taxa are classified only down to the domain level (see the graph below). This was unexpected. I thought it had something to do with the classifier, but from reading the QIIME2 forum it looks like I did everything correctly (see the scripts below). Do you know what the reason might be?
It looks like there may be an issue with your data rather than your scripts, possibly in either the database or the sequences you’re using. Can you explain a bit more of your upstream pipeline to help diagnose the problem? What region are you sequencing? What kind of sequencing, importing, and denoising/clustering did you use?
I targeted the V1-V2 region of 16S, and the samples were sequenced in 4 runs using MiSeq (2x300bp). First, I imported the multiplexed sequences. Next, I used “qiime cutadapt demux-paired” for demultiplexing, trimmed the adapters, ran “qiime dada2 denoise-paired”, and merged the denoised data. Below you can find the scripts which I used:
Thanks! That sounds fairly normal. Do you have reasonable sequencing depth?
Could you also try running the full-length Silva classifier on the resources page to see if it gives you what you expect? It’s not specific to a region, but it might help narrow down whether the problem is in your classifier or your data.
Do you have reasonable sequencing depth?
I think so.
Could you also try running the full length Silva classifier on the resources page to see if it gives you what you expect?
I got the same results as in the case of only_16S.
Based on the suggested posts and this one I tried to use a full-length classifier (by skipping qiime feature-classifier extract-reads) and I got the same results:
I also tried different databases, GG vs. SILVA, still the same strange results.
I tried SILVA_97 vs SILVA_99, still the same.
I tried a pre-trained classifier vs. a classifier I trained myself: the same results.
Some time ago I ran qiime1 with the same datasets and I didn't observe such results. That is why I thought it had something to do with the classification.
Just to check, I randomly chose one unclassified sequence and one Bacteria__ sequence and ran them through BLAST and SILVA_db, and there I do get a taxonomic classification.
E.g. Here are the BLAST results for Bacteria__:
and here for SILVA:
So I wonder why I do not see any assigned classification when using qiime2. Any idea?
Presumably you followed different steps in qiime1, so there are broader pipeline differences that could explain this.
Can you exclude uncultured hits from your BLAST results? Notably, those results show uncultured organisms, which are not particularly useful for diagnosing this.
Your SILVA results are a better indicator. This leads me to suggest a different approach:
How about you try classifying with classify-consensus-vsearch instead? This will mirror what you did with the SILVA webtool, but for all of your sequences. I suspect that your reads may be in mixed orientations (do you happen to know whether they are?). If so, the sklearn classifier gets confounded, because it is trained on the reads as they exist in the reference database, which usually occur in a single direction. The vsearch classifier, by contrast, works just fine for mixed-orientation reads.
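If you want a quick sanity check on orientation before re-running anything, here is a minimal sketch in plain Python. It is not part of QIIME 2: the function names are made up for illustration, the k-mer size of 8 is an arbitrary choice, and it assumes you have one reference 16S sequence in a known orientation (for example, one copied from the reference database) plus a handful of your representative sequences as strings. It simply counts how many k-mers each read shares with the reference as-is versus after reverse-complementing.

```python
# Rough orientation check: compare k-mer overlap with a reference in both
# orientations. All names here are hypothetical helpers, not a QIIME 2 API.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (other symbols are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq, upper-cased."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def guess_orientation(read: str, reference: str, k: int = 8) -> str:
    """Label a read 'forward', 'reverse', or 'unknown' relative to reference."""
    ref_kmers = kmers(reference, k)
    forward_hits = len(kmers(read, k) & ref_kmers)
    reverse_hits = len(kmers(reverse_complement(read), k) & ref_kmers)
    if forward_hits == reverse_hits:
        # Equal overlap (including zero in both directions): no call is made.
        return "unknown"
    return "forward" if forward_hits > reverse_hits else "reverse"


def orientation_counts(reads, reference: str, k: int = 8) -> dict:
    """Tally orientation labels over an iterable of read strings."""
    counts = {"forward": 0, "reverse": 0, "unknown": 0}
    for read in reads:
        counts[guess_orientation(read, reference, k)] += 1
    return counts
```

Calling `orientation_counts(your_reads, your_reference)` on a sample of sequences gives a rough tally. A substantial number in both the "forward" and "reverse" bins would be consistent with mixed orientations. Reads that are too divergent from the single reference will land in "unknown", so treat this as a hint rather than a diagnosis.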
Please give that a try and let me know how it goes!