Hello everyone!
Before anything else, this is my first post on the QIIME 2 Forum, so I apologize in advance if this is not the right category for my question. I'll try to provide as much information as possible.
With that out of the way, I'm a bit confused about the results from feature-classifier classify-consensus-blast, which leave a huge number of features 'Unassigned'. I'm working with paired-end, demultiplexed, primer-trimmed ITS1 sequences of endophytic fungi, generated on an Illumina MiSeq. These are the steps I've run so far, up to the taxonomy assignment via BLAST:
Importing sequence data.
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path manifest.tsv \
--input-format PairedEndFastqManifestPhred33V2 \
--output-path demux-paired-end.qza \
--verbose
DADA2 pipeline parameters. I truncated the sequences based on the QC plots.
qiime dada2 denoise-paired \
--i-demultiplexed-seqs demux-paired-end.qza \
--p-trunc-len-f 250 \
--p-trunc-len-r 120 \
--p-max-ee-f 0.5 \
--p-max-ee-r 0.5 \
--p-chimera-method consensus \
--p-n-threads 0 \
--o-table table.qza \
--o-representative-sequences rep-seqs.qza \
--o-denoising-stats denoising-stats.qza \
--verbose
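For reference, here is a quick sanity check on what these truncation lengths imply for read merging. This is only a sketch: the 12 nt minimum overlap is an assumption about DADA2's default merging behavior, not something set in the command above, and the variable names are made up for illustration.

```shell
# Truncation lengths taken from the denoise-paired command above.
trunc_f=250
trunc_r=120

# ASSUMPTION: DADA2 needs at least 12 nt of overlap to merge a read pair.
min_overlap=12

# Longest amplicon whose forward and reverse reads could still be merged.
max_merged=$(( trunc_f + trunc_r - min_overlap ))
echo "Longest mergeable amplicon (nt): $max_merged"
```

If that assumption holds, ITS1 amplicons longer than this value would fail to merge and would drop out at the merging step.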
This is the visualization of DADA2 stats.
BLAST
qiime feature-classifier classify-consensus-blast \
--i-query drimys/rep-seqs.qza \
--i-reference-reads unite-ver8-99-seqs-10.05.2021.qza \
--i-reference-taxonomy unite-ver8-99-tax-10.05.2021.qza \
--p-maxaccepts 1 \
--p-perc-identity 0.8 \
--p-query-cov 0.9 \
--o-classification blast.qza \
--verbose
I'm using the UNITE database (the latest QIIME 2 release I could find on their website). These are the results I'm getting from BLAST. Note that ca. 500 features are classified as 'Unassigned'.
So I'm a bit confused here. My pipeline steps seem fine, as they match other posts I've seen. I recognize that my DADA2 and BLAST parameters are stringent, but that is probably not the cause: even with the default feature-classifier classify-consensus-blast parameters, I still get a lot of 'Unassigned' features.
Any thoughts on what could be happening here? Are these features just sequencing errors or other artifacts, or could this have something to do with my pipeline steps? If the former, should I simply exclude these features from further analysis, or is there something else that can be done?
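In case the exact number is useful, the 'Unassigned' features can be tallied from the exported taxonomy table. This is a minimal sketch: the helper name count_unassigned and the export directory exported-taxonomy are hypothetical, and it assumes the exported taxonomy.tsv is tab-separated with a header row and the taxon string in the second column.

```shell
# Count features whose Taxon column is exactly "Unassigned".
# ASSUMPTION: tab-separated file, one header row, taxon in column 2.
count_unassigned() {
  awk -F'\t' 'NR > 1 && $2 == "Unassigned" { n++ } END { print n + 0 }' "$1"
}

# Hypothetical usage (the output directory name is an assumption):
#   qiime tools export --input-path blast.qza --output-path exported-taxonomy
#   count_unassigned exported-taxonomy/taxonomy.tsv
```

Comparing this count against the total number of features in rep-seqs.qza gives the unassigned fraction.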
I'm aware that there are some posts regarding the same problem, but none of them are related to ITS data.
Thank you for reading this far, and thanks in advance for your responses!
Bye!