Taxonomic analysis issue

From the rep-seqs file generated after DADA2, I can see that I have host DNA in addition to bacterial DNA. The region I amplified is V4-V6.
rep-seqsPE.qzv (269.2 KB)

I still proceeded with taxonomic analysis using SILVA 132, with a classifier that I had trained according to my primer region. In the end, I got a taxonomy table in which almost all features are either Unassigned or D_0__Bacteria. I went ahead because I assumed I could filter out the unwanted sequences afterwards using qiime taxa filter-seqs and qiime taxa filter-table.

I used the commands below to filter the unwanted taxa out of both my rep-seqs and my taxonomy table:
Command 1
qiime taxa filter-seqs \
  --i-sequences /test/rep-seqsPE.qza \
  --i-taxonomy /test/testsilva_97_v4-v6_taxonomy.qza \
  --p-exclude archaea,eukaryota,mitochondria,chloroplast,Unassigned, \
  --o-filtered-sequences /test/rep-seqs-bac.qza

For command 1 I get: Plugin error from taxa:
All features were filtered, resulting in an empty collection of feature sequences.
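
One thing worth checking in Command 1 is the trailing comma after `Unassigned`. The sketch below is not q2-taxa's code. It assumes that the exclude filter splits the value on commas and drops any feature whose taxonomy string contains one of the terms, ignoring case. Under that assumption, a trailing comma produces an empty term, and an empty string is a substring of every taxonomy string, so every feature is dropped. The function name `keep_feature` and the sample taxonomy strings are hypothetical.

```python
def keep_feature(taxon: str, exclude: str, delimiter: str = ",") -> bool:
    """Return True if `taxon` survives a substring-based exclude filter.

    Assumption (not taken from q2-taxa's source): the filter splits
    `exclude` on `delimiter` and removes a feature when any term occurs
    anywhere in its taxonomy string, ignoring case.
    """
    terms = exclude.split(delimiter)  # a trailing comma yields a final "" term
    taxon_lower = taxon.lower()
    return not any(term.lower() in taxon_lower for term in terms)


# Hypothetical taxonomy strings in the SILVA 132 style used in the post.
taxa = [
    "D_0__Bacteria",
    "D_0__Bacteria;D_1__Proteobacteria",
    "Unassigned",
]

with_trailing_comma = "archaea,eukaryota,mitochondria,chloroplast,Unassigned,"
without_trailing_comma = "archaea,eukaryota,mitochondria,chloroplast,Unassigned"

kept_with = [t for t in taxa if keep_feature(t, with_trailing_comma)]
kept_without = [t for t in taxa if keep_feature(t, without_trailing_comma)]

print(kept_with)     # nothing survives, because "" matches every string
print(kept_without)  # only the two Bacteria features survive
```

If this is what is happening, removing the trailing comma from `--p-exclude` in Command 1, so that it matches Command 2, may be enough to avoid the empty result.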

Command 2
qiime taxa filter-table \
  --i-table /test/tablePE.qza \
  --i-taxonomy /test/testsilva_97_v4-v6_taxonomy.qza \
  --p-exclude archaea,eukaryota,mitochondria,chloroplast,Unassigned \
  --o-filtered-table /test/table-bac.qza
Command 2, however, does produce a table, which looks like this:


Command for visualizing:
qiime metadata tabulate \
  --m-input-file /data/p281301/test/table-bac.qza \
  --o-visualization /data/p281301/test/table-bac.qzv
The taxonomy table I get consists only of Unassigned and Bacteria. But when I followed the links from the rep-seqs file to the NCBI database, I did spot a few bacterial genera (and species); this is not reflected in the taxonomy table. For example, if NCBI shows E. coli as a result, I would expect the taxonomy to go down to genus level, not just the domain D_0__Bacteria. Where am I going wrong?

Also to let you know, I have looked this up here, here, here, here.

Your first link is expired. Would like to take a look at it. Ben

edit: when I had this problem, the dada2 portion did not have enough overlap and ended up without partial reads.
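
Whether paired reads can overlap is simple arithmetic that can be checked before running DADA2. The sketch below assumes that the truncated forward and reverse reads together have to span the whole amplicon, and it uses 12 nt as the minimum overlap, which is the default `minOverlap` of DADA2's `mergePairs`. The amplicon and truncation lengths are hypothetical placeholders, not values from this thread.

```python
def merge_overlap(amplicon_len: int, trunc_len_f: int, trunc_len_r: int) -> int:
    """Expected overlap in nt between truncated forward and reverse reads.

    A negative value means the two reads do not meet at all.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(amplicon_len: int, trunc_len_f: int, trunc_len_r: int,
              min_overlap: int = 12) -> bool:
    """True if the expected overlap reaches `min_overlap` (assumed 12 nt)."""
    return merge_overlap(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


# Hypothetical (amplicon length, forward truncation, reverse truncation).
examples = {
    "shorter amplicon": (290, 240, 200),
    "longer amplicon": (580, 240, 200),
}

for name, (amplicon, trunc_f, trunc_r) in examples.items():
    overlap = merge_overlap(amplicon, trunc_f, trunc_r)
    print(f"{name}: overlap = {overlap} nt, mergeable = "
          f"{can_merge(amplicon, trunc_f, trunc_r)}")
```

A long amplicon such as V4-V6 leaves much less room for truncation than V4 alone, so it is worth plugging in the real amplicon length and truncation settings.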

Hey @ben, the first link is fine - here is a view link to it. You might’ve run into a situation where you downloaded and pre-cached the viz. Thanks!

PS - you can also just download the QZV and open directly in q2view.

Sorry, @ben was right: the link was not working, so I edited the post and uploaded the qzv file instead, but I did not mention that I had corrected it. Sorry for the confusion.

I performed taxonomic analysis on the single-end reads, which cover the V4 region. Earlier, when I did the taxonomic analysis (here) using paired-end reads for the same samples, almost all of the features were classified as either D_0__Bacteria or Unassigned. I assumed this might be because the reads had not merged successfully.

On repeating the same steps with the single-end reads, I did get family- and genus-level classification for a few features (but still very few). The remaining features are still Unassigned (these are from the host), while several features are still classified only as D_0__Bacteria. What does this mean? Can anything be done to improve the classification? The improved taxonomy file for the single-end reads is below.
taxonomySE.qzv (1.3 MB)
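
To see how deep the classifications go, it can help to tally the deepest rank assigned to each feature. This is a minimal sketch, assuming SILVA 132-style taxonomy strings with `D_0__` to `D_6__` prefixes separated by semicolons, as in the post, and the usual domain-to-species meaning of those prefixes. The example strings are illustrative and are not taken from the attached file.

```python
from collections import Counter

# Assumed meaning of the D_0__ ... D_6__ prefixes.
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def deepest_rank(taxon: str) -> str:
    """Name the deepest rank that carries an actual name in `taxon`."""
    if taxon.strip().lower() == "unassigned":
        return "unassigned"
    levels = [lvl.strip() for lvl in taxon.split(";") if lvl.strip()]
    # Skip empty placeholders such as "D_5__" that have a prefix but no name.
    named = [lvl for lvl in levels if lvl.split("__", 1)[-1]]
    if not named:
        return "unassigned"
    return RANKS[min(len(named), len(RANKS)) - 1]


# Illustrative taxonomy strings only.
example_taxa = [
    "Unassigned",
    "D_0__Bacteria",
    "D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;"
    "D_3__Enterobacteriales;D_4__Enterobacteriaceae;D_5__Escherichia-Shigella",
]

depth_counts = Counter(deepest_rank(t) for t in example_taxa)
print(depth_counts)
```

A tally dominated by "domain" and "unassigned" points to a problem upstream of filtering, such as the classifier or the reads themselves.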

Chances are you were using an inappropriate reference dataset. You can search the forum for other examples, but 99% of the time this happens when, e.g., someone has V1-3 reads and is using a V4 classifier, or has V4-6 reads and is using a V4 classifier. The classifier needs to be trained on reads ≥ the target region (so, e.g., full-length 16S is fine for classification of V4).
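
The "classifier region ≥ target region" rule can be written as an interval check. This is only a sketch: the coordinates are illustrative positions along the 16S gene (515 and 806 follow the names of the common 515F/806R V4 primers; the V4-V6 end position and the full-length span are hypothetical placeholders), and `classifier_covers` is a made-up helper, not part of QIIME 2.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end) position along the 16S gene


def classifier_covers(classifier_span: Span, amplicon_span: Span) -> bool:
    """True if the classifier's training region fully contains the amplicon."""
    c_start, c_end = classifier_span
    a_start, a_end = amplicon_span
    return c_start <= a_start and a_end <= c_end


# Illustrative coordinates only; substitute your own primer positions.
v4_span = (515, 806)
v4_v6_span = (515, 1100)    # hypothetical end position
full_length_span = (1, 1500)  # rough placeholder for full-length 16S

print(classifier_covers(v4_span, v4_v6_span))           # V4 classifier, V4-V6 reads
print(classifier_covers(full_length_span, v4_v6_span))  # full-length classifier
print(classifier_covers(v4_v6_span, v4_span))           # V4-V6 classifier, V4 reads
```

Only the first combination fails the check, which is the mismatch described above.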

One of a few things:

  1. the reads are too short
  2. the reads are not really bacterial but are more host; spot check a few with NCBI BLAST (exclude uncultured/unclassified/unknown) to confirm.
  3. same issue as #1; the classifier does not fully overlap your target region so you are getting bad classifications.
  4. your reference database has poor coverage of the target taxa?